Most AWS bills hide a quiet line item that engineers rarely audit, and EBS is the worst offender. Volumes outlive the instances they were attached to. Snapshots pile up forever because nobody actually owns the retention policy. And honestly? Half the fleet I look at still runs gp2, even though gp3 has been generally available since December 2020 and is roughly 20% cheaper for the same performance.
If you haven't done a focused EBS sweep in the last 12 months, you almost certainly have 30-50% savings sitting on the table.
So this guide is the practical playbook: how EBS pricing actually works in 2026, the four levers that move the bill (volume type, size, snapshots, orphans), and copy-paste scripts for AWS CLI, Terraform, and Lambda to automate every cleanup task. Let's dive in.
How EBS Pricing Works in 2026
EBS bills you on three independent axes — and that's exactly what makes naive cost tracking miss so much waste:
- Provisioned capacity (GB-month). You pay for the full volume size whether or not the filesystem inside is empty. A 500 GB volume with 50 GB used costs the same as one with 480 GB used. Painful, but that's the model.
- Provisioned performance (IOPS and throughput, on gp3, io1, io2, io2 Block Express).
- Snapshots ($0.05/GB-month for standard incremental snapshots in us-east-1; $0.0125/GB-month for the Snapshot Archive tier).
Approximate on-demand prices in us-east-1 (verify your region with aws pricing get-products — these have been stable, but they do vary):
| Volume type | Storage $/GB-month | Baseline performance | Best for |
|---|---|---|---|
| gp3 | $0.08 | 3,000 IOPS + 125 MB/s included; up to 16,000 IOPS / 1,000 MB/s | General-purpose default |
| gp2 | $0.10 | 3 IOPS/GB (burst to 3,000) | Legacy — migrate |
| io2 Block Express | $0.125 | Up to 256,000 IOPS, 4,000 MB/s, 99.999% durability | Mission-critical OLTP |
| st1 | $0.045 | Throughput-optimized HDD | Big data, log processing |
| sc1 | $0.015 | Cold HDD | Infrequently accessed |
Two facts that consistently surprise engineers when I walk them through their bill: gp3 with default performance is 20% cheaper than gp2 with no measurable downside for typical workloads, and snapshots are billed on the unique blocks they reference, not the apparent volume size. So a chain of 30 daily snapshots of a 500 GB volume rarely bills anywhere near the naive 15 TB-month (30 × 500 GB) — but it can easily bill 1-2 TB-month if writes are scattered across the disk.
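To make that concrete with an assumed churn rate: if roughly 2% of the volume (about 10 GB) changes each day, the 30-snapshot chain bills for the 500 GB baseline plus 29 × 10 GB of changed blocks — roughly 0.8 TB-month, or about $40/month at $0.05/GB, instead of the $750/month the naive 15 TB math suggests.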
Lever 1: Migrate gp2 to gp3 (the 20-Minute Win)
This is, hands down, the single highest-ROI EBS task you can do in 2026. gp3 is cheaper per GB, decouples IOPS and throughput from capacity, and supports the same instance types. Best part: the migration is online. No downtime, no detach, no maintenance window.
Step 1: Inventory all gp2 volumes
aws ec2 describe-volumes \
--filters Name=volume-type,Values=gp2 \
--query 'Volumes[].[VolumeId,Size,Iops,State,Tags[?Key==`Name`]|[0].Value]' \
--output table \
--region us-east-1
For multi-account orgs, run it through every account with aws sts assume-role, or use AWS Resource Explorer with a multi-account view (which honestly is a much nicer experience these days).
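A minimal assume-role sweep might look like this — the account IDs and role name are placeholders for whatever cross-account role your org actually uses:
#!/usr/bin/env bash
set -euo pipefail
for acct in 111111111111 222222222222; do
  # Assume the audit role in each account, then run the inventory with those credentials only
  creds=$(aws sts assume-role \
    --role-arn "arn:aws:iam::${acct}:role/OrganizationAccountAccessRole" \
    --role-session-name ebs-audit \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text)
  read -r key secret token <<< "$creds"
  echo "== Account $acct =="
  AWS_ACCESS_KEY_ID="$key" AWS_SECRET_ACCESS_KEY="$secret" AWS_SESSION_TOKEN="$token" \
    aws ec2 describe-volumes --filters Name=volume-type,Values=gp2 \
      --query 'Volumes[].[VolumeId,Size]' --output table --region us-east-1
done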
Step 2: Decide the target IOPS and throughput
The default gp3 configuration (3,000 IOPS, 125 MB/s) matches or exceeds the performance of any gp2 volume up to 1,000 GB. Above 1,000 GB, gp2's 3 IOPS/GB formula provisions anywhere from 3,000 to 16,000 IOPS (the cap is reached at 5,334 GB) — so check CloudWatch VolumeReadOps and VolumeWriteOps over the past 14 days and size against the observed peak (p99, not average). If actual IOPS stays below 3,000, default gp3 is fine.
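A rough way to check from the CLI — hourly sums over two weeks, divided by 3,600 to get average IOPS (the volume ID and dates are placeholders; shrink the period if you need to see short bursts):
# Average write IOPS per hour = Sum / 3600. Repeat with --metric-name VolumeReadOps and add the two.
aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS \
  --metric-name VolumeWriteOps \
  --dimensions Name=VolumeId,Value=vol-0abc123def456 \
  --start-time 2026-03-01T00:00:00Z \
  --end-time 2026-03-15T00:00:00Z \
  --period 3600 \
  --statistics Sum \
  --query 'Datapoints[].Sum' --output text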
Step 3: Modify the volume in place
aws ec2 modify-volume \
--volume-id vol-0abc123def456 \
--volume-type gp3 \
--iops 3000 \
--throughput 125
The modification triggers an "optimizing" state that runs in the background; the volume is fully usable throughout. One quirk to know about: you can only issue another modify-volume on the same volume after 6 hours, so plan accordingly.
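You can watch the progress of the background optimization if you're curious (the volume ID is a placeholder):
aws ec2 describe-volumes-modifications \
  --volume-ids vol-0abc123def456 \
  --query 'VolumesModifications[].[VolumeId,ModificationState,Progress]' \
  --output table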
Step 4: Bulk-migrate with a script
#!/usr/bin/env bash
set -euo pipefail
REGION="${1:-us-east-1}"
aws ec2 describe-volumes \
--filters Name=volume-type,Values=gp2 \
--query 'Volumes[].VolumeId' \
--output text \
--region "$REGION" | tr '\t' '\n' | while read -r vol; do
echo "Modifying $vol -> gp3"
aws ec2 modify-volume \
--volume-id "$vol" \
--volume-type gp3 \
--iops 3000 \
--throughput 125 \
--region "$REGION"
sleep 1 # avoid throttling
done
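Save it as something like gp2-to-gp3.sh (the name is arbitrary) and run it once per region: bash gp2-to-gp3.sh eu-west-1.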
Step 5: Lock it in with Terraform
resource "aws_ebs_volume" "data" {
availability_zone = "us-east-1a"
size = 500
type = "gp3"
iops = 3000
throughput = 125
encrypted = true
kms_key_id = aws_kms_key.ebs.arn
tags = {
Name = "app-data"
Environment = "prod"
CostCenter = "platform-eng"
}
}
And — this is the bit teams forget — add an SCP or AWS Config rule to block any new gp2 volumes from sneaking back in:
{
"Version": "2012-10-17",
"Statement": [{
"Sid": "DenyGp2Volumes",
"Effect": "Deny",
"Action": ["ec2:CreateVolume", "ec2:ModifyVolume"],
"Resource": "*",
"Condition": {
"StringEquals": { "ec2:VolumeType": "gp2" }
}
}]
}
Lever 2: Find and Delete Unattached "Orphan" Volumes
When an EC2 instance is terminated, EBS volumes only auto-delete if DeleteOnTermination=true. For data volumes, that flag is almost always false — so the volumes just sit there in available state forever, billing at full price. In any AWS account older than two years, expect 5-15% of total EBS spend to be sitting on truly orphaned volumes. I've seen accounts where it was 25%.
Identify orphans
aws ec2 describe-volumes \
--filters Name=status,Values=available \
--query 'Volumes[].[VolumeId,Size,VolumeType,CreateTime,Tags[?Key==`Name`]|[0].Value]' \
--output table \
--region us-east-1
Snapshot before deletion (the safe pattern)
Never delete an unattached volume blindly. Someone may detach a volume mid-debug-session and forget about it, and now you're the person who blew away their morning's work. The safe pattern: snapshot, tag the snapshot with the source volume metadata, then delete after a grace period — 14 to 30 days, pick one and write it down.
#!/usr/bin/env bash
set -euo pipefail
REGION="${1:-us-east-1}"
TODAY=$(date -u +%Y-%m-%d)
aws ec2 describe-volumes \
--filters Name=status,Values=available \
--query 'Volumes[].VolumeId' \
--output text \
--region "$REGION" | tr '\t' '\n' | while read -r vol; do
[ -z "$vol" ] && continue
snap=$(aws ec2 create-snapshot \
--volume-id "$vol" \
--description "pre-delete archive of $vol on $TODAY" \
--tag-specifications "ResourceType=snapshot,Tags=[{Key=PreDeleteArchive,Value=true},{Key=SourceVolumeId,Value=$vol},{Key=ArchivedOn,Value=$TODAY}]" \
--query SnapshotId --output text --region "$REGION")
echo "Snapshot $snap created for $vol"
done
Wait for the snapshots to reach completed, then delete the volumes — a sketch of that step follows below. Want to save even more? Archive the snapshots:
aws ec2 modify-snapshot-tier \
--snapshot-id snap-0123456789abcdef0 \
--storage-tier archive
Snapshot Archive drops the cost from $0.05/GB-month to $0.0125/GB-month — a 75% reduction — at the cost of a 24-72 hour restore time. Perfect for compliance archives. Not so great if you actually expect to ever restore in a hurry.
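And the wait-then-delete step itself is two calls per volume once the grace period has passed — the IDs here are placeholders:
aws ec2 wait snapshot-completed --snapshot-ids snap-0123456789abcdef0
aws ec2 delete-volume --volume-id vol-0abc123def456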
Automate with Lambda + EventBridge
Run this weekly:
import boto3
from datetime import datetime, timezone, timedelta
ec2 = boto3.client("ec2")
GRACE_DAYS = 14
def lambda_handler(event, context):
cutoff = datetime.now(timezone.utc) - timedelta(days=GRACE_DAYS)
paginator = ec2.get_paginator("describe_volumes")
candidates = []
for page in paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}]):
for vol in page["Volumes"]:
if vol["CreateTime"] < cutoff:
candidates.append({
"VolumeId": vol["VolumeId"],
"Size": vol["Size"],
"Type": vol["VolumeType"],
"AgeDays": (datetime.now(timezone.utc) - vol["CreateTime"]).days,
})
monthly_cost = sum(c["Size"] * 0.08 for c in candidates if c["Type"] == "gp3") + \
sum(c["Size"] * 0.10 for c in candidates if c["Type"] == "gp2")
print(f"Found {len(candidates)} orphan volumes wasting ~${monthly_cost:.2f}/month")
# Send to Slack or trigger downstream snapshot+delete pipeline
return {"candidates": candidates, "estimatedMonthlyWaste": monthly_cost}
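Wiring up the weekly trigger is three CLI calls. The rule name, function name, account ID, and region below are placeholders — swap in your own:
aws events put-rule \
  --name weekly-ebs-orphan-audit \
  --schedule-expression "rate(7 days)"
aws lambda add-permission \
  --function-name ebs-orphan-audit \
  --statement-id weekly-ebs-orphan-audit \
  --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:us-east-1:123456789012:rule/weekly-ebs-orphan-audit
aws events put-targets \
  --rule weekly-ebs-orphan-audit \
  --targets 'Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:ebs-orphan-audit'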
Lever 3: Snapshot Hygiene with Data Lifecycle Manager
Manual aws ec2 create-snapshot calls in cron jobs are exactly how snapshot graveyards happen. Use Amazon Data Lifecycle Manager (DLM) for everything — it handles creation, retention, cross-account copy, and deletion declaratively. Honestly, this should have been the default ten years ago.
A production-grade DLM policy in Terraform
resource "aws_dlm_lifecycle_policy" "daily_snapshots" {
description = "Daily snapshots, 7-day retention, then 180-day archive tier"
execution_role_arn = aws_iam_role.dlm.arn
state = "ENABLED"
policy_details {
resource_types = ["VOLUME"]
target_tags = {
Backup = "daily"
}
schedule {
name = "DailyRollingSnapshots"
create_rule {
interval = 24
interval_unit = "HOURS"
times = ["03:00"]
}
retain_rule {
count = 7
}
copy_tags = true
tags_to_add = {
SnapshotCreator = "DLM"
}
archive_rule {
  retain_rule {
    retention_archive_tier {
      interval      = 180
      interval_unit = "DAYS"
    }
  }
}
}
}
}
This single policy replaces three different patterns I see in real accounts on a weekly basis: cron-based create-snapshot with no cleanup, AWS Backup running the same schedule at higher cost, and per-application Lambdas that nobody maintains anymore (the last engineer who touched them left in 2023).
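One operational note: volumes are only picked up by this policy if they carry the Backup = daily tag from target_tags. Tagging an existing volume is a single call (the volume ID is a placeholder):
aws ec2 create-tags \
  --resources vol-0abc123def456 \
  --tags Key=Backup,Value=daily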
Find dangling snapshots from deleted volumes
aws ec2 describe-snapshots \
--owner-ids self \
--query 'Snapshots[?VolumeId!=`vol-ffffffff`].[SnapshotId,VolumeId,VolumeSize,StartTime]' \
--output table | head -50
# Cross-reference with existing volumes
existing=$(aws ec2 describe-volumes --query 'Volumes[].VolumeId' --output text)
aws ec2 describe-snapshots --owner-ids self \
--query 'Snapshots[].[SnapshotId,VolumeId,VolumeSize]' --output text | \
awk -v ev="$existing" '{
split(ev, arr, " ");
found=0;
for (i in arr) if (arr[i] == $2) found=1;
if (!found) print
}'
Snapshots whose source volume no longer exists are the easiest deletes — if the volume is gone, the data is already historical, and the snapshot has no operational value beyond an audit trail. Tier them to Snapshot Archive, or delete after a documented retention window. Either way, they don't need to keep costing you full price.
Enable EBS Snapshot Recycle Bin
Before any bulk snapshot delete, turn on the Recycle Bin so accidental deletes stay recoverable — retention can be set anywhere from one day to one year; the rule below keeps deleted snapshots for 14 days:
aws rbin create-rule \
--resource-type EBS_SNAPSHOT \
--retention-period RetentionPeriodValue=14,RetentionPeriodUnit=DAYS \
--description "14-day recycle bin for EBS snapshots" \
--tags '[{"Key":"Owner","Value":"finops"}]'
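If something does get deleted by mistake while the rule is active, recovery is two calls — the snapshot ID is a placeholder:
aws ec2 list-snapshots-in-recycle-bin
aws ec2 restore-snapshot-from-recycle-bin --snapshot-id snap-0123456789abcdef0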
Lever 4: Right-Size Volume Capacity and IOPS
Most data volumes are provisioned 2-3x larger than necessary because someone picked a number "to be safe" three years ago, and nobody's revisited it since. Unlike instance right-sizing, EBS shrinking isn't directly supported — you can grow a volume online any time you like, but going smaller means creating a new volume and copying the data over.
Find oversized volumes with CloudWatch agent
EBS itself doesn't report filesystem usage to CloudWatch. (Annoying, but there's a reason — EBS doesn't actually know what's inside the block device.) You'll need the CloudWatch unified agent or a similar exporter. Once it's reporting disk_used_percent:
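# Note: the dimension set must match your CloudWatch agent config exactly (some setups also emit fstype).
# date -v-30d is BSD/macOS syntax; on GNU/Linux use --date='30 days ago' instead.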
aws cloudwatch get-metric-statistics \
--namespace CWAgent \
--metric-name disk_used_percent \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 Name=path,Value=/ Name=device,Value=nvme0n1p1 \
--start-time $(date -u -v-30d +%Y-%m-%dT%H:%M:%S) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
--period 86400 \
--statistics Maximum
Volumes whose 30-day max usage stays under 40% are candidates for shrinking. Because a snapshot can't be restored to a smaller volume, the procedure is: snapshot for rollback, create a new smaller blank volume, attach it alongside, copy the data across at the filesystem level, then swap mount points and retire the old volume. For root volumes it's more involved — copy the root filesystem to a smaller volume, snapshot that, register a new AMI, relaunch — but still roughly a 30-minute task that permanently recovers the unused ~60% of capacity.
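A sketch of the data-volume version — the sizes, device names, and mount points below are placeholders, and the device name the OS sees depends on the instance type (Nitro instances expose NVMe names):
# 1. Safety snapshot of the old volume as a rollback point
aws ec2 create-snapshot --volume-id vol-0abc123def456 --description "pre-shrink safety copy"
# 2. New, smaller blank volume in the same AZ, attached alongside the old one
aws ec2 create-volume --availability-zone us-east-1a --size 200 --volume-type gp3
aws ec2 attach-volume --volume-id vol-NEW --instance-id i-0123456789abcdef0 --device /dev/sdf
# 3. On the instance: make a filesystem, mount, copy, then update /etc/fstab and swap mounts
sudo mkfs.ext4 /dev/nvme1n1
sudo mount /dev/nvme1n1 /mnt/new
sudo rsync -aHAX /data/ /mnt/new/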
Right-size IOPS on gp3
This is the trap most teams fall into right after migrating to gp3: they over-provision IOPS "just in case." Each additional IOPS above the included 3,000 costs $0.005/IOPS-month. A volume with 16,000 provisioned IOPS adds about $65/month per volume in extra IOPS charges alone. Multiply that across a fleet, and the math gets ugly fast. Audit:
aws ec2 describe-volumes \
--filters Name=volume-type,Values=gp3 \
--query 'Volumes[?Iops>`3000`].[VolumeId,Size,Iops,Throughput]' \
--output table
Then check actual usage in CloudWatch: sum VolumeReadOps and VolumeWriteOps per period, divide by the period length to get IOPS, and look at the p99 over 14 days (the same check as the gp2 sizing step in Lever 1). If the actual peak is below 50% of what's provisioned, dial back to either 3,000 (included for free) or actual + 30% headroom.
Lever 5: Detect Idle Volumes (Attached But Unused)
Orphans are unattached. Idle volumes are attached but receive almost no I/O — usually because the instance got repurposed at some point and the disk was forgotten. CloudWatch makes these easy enough to find:
aws cloudwatch get-metric-statistics \
--namespace AWS/EBS \
--metric-name VolumeReadOps \
--dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
--start-time $(date -u -v-14d +%Y-%m-%dT%H:%M:%S) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
--period 86400 \
--statistics Sum --query 'Datapoints[?Sum>`100`]' --output table
If that query comes back empty — not a single day above 100 read ops in 14 days — the volume is essentially idle. Either it's a write-only log target (check VolumeWriteOps before pulling the trigger), or it's dead weight. Snapshot, archive, detach, delete, as sketched below.
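The teardown, once a safety snapshot exists (the volume ID is a placeholder):
aws ec2 detach-volume --volume-id vol-0123456789abcdef0
aws ec2 wait volume-available --volume-ids vol-0123456789abcdef0
aws ec2 delete-volume --volume-id vol-0123456789abcdef0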
Putting It Together: A 30-Day EBS Cost Reduction Plan
- Week 1 — Run inventory scripts. Tag every volume with Owner, Application, and BackupPolicy. Set up the Recycle Bin.
- Week 2 — Bulk-migrate every gp2 volume to gp3 with default performance. Verify CloudWatch IOPS for outliers and tune up where needed. Add the SCP to block new gp2 volumes.
- Week 3 — Snapshot all unattached volumes, tag them, archive snapshots older than 30 days. Delete volumes after a 14-day grace window. Replace cron-based snapshot scripts with one DLM policy per backup tier.
- Week 4 — Identify oversized and idle volumes. Schedule shrink-via-snapshot operations during change windows. Stand up the weekly Lambda audit job and route findings to Slack or your FinOps queue.
For a typical 100-instance AWS account with no prior EBS hygiene, this sequence reliably delivers 30-50% off the EBS line item — often $5,000-$25,000/month depending on scale. And unlike Reserved Instance commitments, every action here is reversible. Worst case, you snapshot something, delete the volume, and recreate it next week. The blast radius is small, the upside is huge.
Frequently Asked Questions
Is gp3 always cheaper than gp2?
For storage at default performance, yes — gp3 is $0.08/GB-month versus gp2 at $0.10/GB-month. Provisioning more than the included 3,000 IOPS or 125 MB/s on gp3 costs extra, but even at matched performance gp3 usually still undercuts gp2, because gp2 can only buy IOPS by buying capacity. The case worth doing the math on is small volumes that leaned on gp2 burst credits for sustained spikes well above their baseline. In practice, 95% of gp2 volumes are cheaper as default gp3.
Will migrating gp2 to gp3 cause downtime or performance loss?
No. aws ec2 modify-volume changes the type online. The volume enters an "optimizing" state that runs in the background and remains fully readable and writable. Performance during optimization is at least as good as the source gp2 baseline. The only real constraint is that you can't run a second modify-volume on the same volume within 6 hours.
How do I shrink an EBS volume to a smaller size?
EBS doesn't support direct in-place shrinking (sadly), and you can't restore a snapshot to a volume smaller than the snapshot itself. The standard procedure: take a snapshot as a rollback point, create a new smaller blank volume with aws ec2 create-volume --size N, attach it to the same instance on a different device, copy the data across with rsync (or shrink the filesystem with resize2fs first and copy at the block level), swap the mount points, and detach the old volume. For root volumes, the path is to copy the root filesystem onto a smaller volume, snapshot that volume, register a new AMI from the snapshot, and relaunch.
What is EBS Snapshot Archive and when should I use it?
Snapshot Archive is a tier that costs $0.0125/GB-month — 75% cheaper than the standard $0.05/GB-month tier — but takes 24-72 hours to restore and has a 90-day minimum retention. Use it for compliance retention copies, monthly fulls older than 90 days, or any snapshot you keep for legal or audit reasons but expect never to actually restore. Don't use it for operational backups where RTO matters.
How do I prevent orphaned volumes when EC2 instances terminate?
Set DeleteOnTermination=true in the block device mapping at instance launch. For root volumes this is the default; for additional data volumes the default is false, which is exactly how orphans accumulate. In Terraform, set delete_on_termination = true on the relevant ebs_block_device blocks. Combine this with the AWS Config managed rule ec2-volume-inuse-check, which flags any volume that isn't attached to an instance, and review those findings on a schedule.
Can I use AWS Backup instead of Data Lifecycle Manager?
Yes — and for some requirements you should. For EBS alone it usually isn't worth it: backup storage is billed by the GB-month either way, and the features that justify AWS Backup (cross-Region copy, restore testing, Backup Audit Manager) each carry their own charges. Use AWS Backup when you need cross-service backup orchestration (RDS + EBS + DynamoDB in one plan), cross-account vault locking for compliance (SEC 17a-4, HIPAA), or formal restore testing. For pure EBS lifecycle automation, DLM is the simpler and usually cheaper tool.