The Hidden Drain: Why Zombie Resources Are Costing You Thousands
Every cloud environment accumulates waste over time. It's just a fact of life. Developers spin up EC2 instances for testing and forget to terminate them. Teams delete VMs but leave their disks, public IPs, and snapshots behind. Load balancers sit idle after backend services get decommissioned. These forgotten assets — known as zombie resources — silently drain your cloud budget month after month.
And the numbers are honestly staggering.
According to Flexera's 2025 State of the Cloud Report, organizations waste between 27% and 32% of their cloud spending on unused resources. Gartner puts the average even higher at 35%, rising to 55% in companies without a formal optimization strategy. For a company spending $100,000 per month on cloud infrastructure, that's $27,000 to $55,000 going to resources that deliver zero value. Every single month.
This guide gives you the exact CLI commands and scripts to hunt down zombie resources across AWS, Azure, and GCP — plus automation patterns to keep them from coming back.
What Counts as a Zombie Resource?
A zombie resource is any cloud asset that's provisioned and incurring charges but no longer serving a useful purpose. They generally fall into two buckets:
- Orphaned resources — assets that remain provisioned after their parent resource was deleted. Think: an EBS volume left behind after an EC2 instance was terminated, or a public IP that no longer points to anything.
- Idle resources — assets that are technically active but functionally useless. A running EC2 instance with near-zero CPU and network I/O for weeks, or a DynamoDB table with provisioned capacity but zero requests.
The most common zombie resource types across all three clouds include:
- Unattached storage volumes and disks
- Orphaned snapshots with no retention policy
- Unassociated static/elastic IPs
- Idle load balancers with no healthy targets
- Stopped or suspended VMs still incurring storage charges
- Unused NAT gateways and VPN connections
- Empty container registries and stale images
- Dormant databases with zero query activity
So, let's go hunt some zombies.
AWS: Finding Zombie Resources with the AWS CLI
AWS is probably the most common environment for zombie resource accumulation, partly because of how EBS volumes behave on instance termination. By default, terminating an EC2 instance only detaches (not deletes) its EBS volumes — unless you explicitly selected "Delete on Termination" at launch. In busy development environments where instances are frequently launched and terminated, this creates a steady drip of orphaned volumes that nobody notices until the bill arrives.
Find Unattached EBS Volumes
Unattached EBS volumes are the single most common zombie resource on AWS. They have a status of available, meaning they exist but aren't attached to any instance.
# List all unattached EBS volumes in a single region
aws ec2 describe-volumes \
--region us-east-1 \
--filters Name=status,Values=available \
--query "Volumes[].{VolumeId:VolumeId, SizeGB:Size, Type:VolumeType, Created:CreateTime, AZ:AvailabilityZone}" \
--output table
To scan across all AWS regions at once (which you definitely should — zombies love hiding in regions you forgot you were using):
# Scan ALL regions for unattached EBS volumes
for region in $(aws ec2 describe-regions --query "Regions[].RegionName" --output text); do
  echo "=== Region: $region ==="
  aws ec2 describe-volumes \
    --region "$region" \
    --filters Name=status,Values=available \
    --query "Volumes[].{VolumeId:VolumeId, SizeGB:Size, Type:VolumeType}" \
    --output table
done
Before deleting any volume, always create a snapshot first. Snapshots are significantly cheaper than keeping an unused volume running, and they're your safety net if someone comes asking about that data three weeks later:
# Snapshot a volume before deletion (much cheaper than keeping the volume)
aws ec2 create-snapshot \
--region us-east-1 \
--volume-id vol-0abcd1234abcd1234 \
--description "Backup before zombie cleanup"
# Then delete the volume
aws ec2 delete-volume \
--region us-east-1 \
--volume-id vol-0abcd1234abcd1234
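To see why the snapshot-first approach pays off, here's a quick back-of-the-envelope comparison in Python. The per-GB prices are illustrative us-east-1 list rates (assumptions, not a quote), and real snapshot costs are often lower still because snapshots are incremental and compressed:

```python
# Compare the monthly cost of keeping an idle gp3 volume vs. archiving it
# as a snapshot. Prices are illustrative list rates, not official quotes:
# gp3 storage ~$0.08/GB-month, standard snapshot storage ~$0.05/GB-month.
GP3_PER_GB = 0.08
SNAPSHOT_PER_GB = 0.05

def monthly_cost(size_gb: int, per_gb: float) -> float:
    """Monthly storage cost in dollars, rounded to cents."""
    return round(size_gb * per_gb, 2)

volume = monthly_cost(500, GP3_PER_GB)        # keeping the idle volume
snapshot = monthly_cost(500, SNAPSHOT_PER_GB)  # keeping only the safety snapshot
print(f"volume: ${volume}/mo, snapshot: ${snapshot}/mo, saved: ${volume - snapshot}/mo")
```

Even before compression kicks in, the snapshot wins, and you still have the data if anyone comes asking.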
Find Unassociated Elastic IPs
AWS charges for Elastic IPs that aren't associated with a running instance. And since February 2024, AWS also charges for all public IPv4 addresses at $0.005 per hour ($3.65/month), making orphaned EIPs even more costly than they used to be.
# List all Elastic IPs not associated with any resource
aws ec2 describe-addresses \
--query "Addresses[?AssociationId==null].{PublicIP:PublicIp, AllocationId:AllocationId, Domain:Domain}" \
--output table
# Export unused EIPs to JSON for review before cleanup
aws ec2 describe-addresses \
--query "Addresses[?AssociationId==null]" \
--output json > unused-eips-$(date +%Y%m%d).json
# Release an unused Elastic IP after review
aws ec2 release-address --allocation-id eipalloc-0abcdef1234567890
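If you want to put a dollar figure on that exported report, a few lines of Python will do it. The per-IP rate below is just the $0.005/hour figure from above applied to a 730-hour month; the function and sample data are mine, not part of any AWS tooling:

```python
# Cost out an exported list of unassociated Elastic IPs.
# $0.005/hour * 730 hours/month = $3.65/month per public IPv4 address.
MONTHLY_PER_IP = 0.005 * 730

def monthly_waste(addresses: list[dict]) -> float:
    # Every entry in the export is already unassociated, so each one bills.
    return round(len(addresses) * MONTHLY_PER_IP, 2)

sample = [{"PublicIp": "203.0.113.10"}, {"PublicIp": "203.0.113.11"}]
print(monthly_waste(sample))  # 7.3
```

Feed it the parsed contents of the JSON file from the export command and you've got a number to put in the cleanup ticket.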
Find Stopped EC2 Instances
Stopped EC2 instances don't incur compute charges, but their attached EBS volumes, associated Elastic IPs, and any Marketplace AMI subscriptions continue to bill. Long-stopped instances are almost always forgotten test environments that someone meant to clean up "next week."
# List all stopped EC2 instances across the account
aws ec2 describe-instances \
--filters Name=instance-state-name,Values=stopped \
--query "Reservations[].Instances[].{InstanceId:InstanceId, Type:InstanceType, Name:Tags[?Key=='Name']|[0].Value, LaunchTime:LaunchTime, State:State.Name}" \
--output table
Find Idle EC2 Instances Using CloudWatch
Running instances with near-zero CPU utilization over an extended period are prime zombie candidates. Generally, if average CPU stays below 5% for 14 or more days, that instance probably isn't doing anything useful:
# Check average CPU utilization for an instance over the last 14 days
# (date -v-14d is BSD/macOS syntax; on GNU/Linux use: date -u -d '14 days ago')
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-0abcdef1234567890 \
--start-time $(date -u -v-14d +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 86400 \
--statistics Average \
--output table
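CloudWatch hands you raw datapoints, not a verdict, so here's a minimal sketch of the decision logic. It assumes you've already parsed the daily Average values out of the response; the 5% and 14-day thresholds are the rule of thumb from above, not an AWS standard:

```python
# Flag an instance as a zombie candidate when every daily CPU average over
# the window stays under the idle threshold. Tune both numbers to taste.
IDLE_CPU_THRESHOLD = 5.0   # percent average CPU
MIN_WINDOW_DAYS = 14       # don't judge on less history than this

def is_idle(daily_averages: list[float]) -> bool:
    if len(daily_averages) < MIN_WINDOW_DAYS:
        return False  # not enough history to call it a zombie
    return max(daily_averages) < IDLE_CPU_THRESHOLD

print(is_idle([1.2, 0.8, 2.3] + [0.5] * 11))  # 14 quiet days -> True
print(is_idle([1.2, 45.0] + [0.5] * 12))      # one busy day  -> False
```

A stricter variant might look at network I/O too; CPU alone can miss instances that are busy shuffling data.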
Find Unused Load Balancers
# Find ALBs/NLBs with no target groups attached (a strong zombie signal)
for arn in $(aws elbv2 describe-load-balancers --query "LoadBalancers[].LoadBalancerArn" --output text); do
  tg_count=$(aws elbv2 describe-target-groups \
    --load-balancer-arn "$arn" \
    --query "length(TargetGroups)" --output text)
  if [ "$tg_count" == "0" ]; then
    echo "NO TARGET GROUPS: $arn"
  fi
done
Find Orphaned EBS Snapshots
# Find snapshots older than 90 days owned by your account
# (date -v-90d is BSD/macOS syntax; on GNU/Linux use: date -u -d '90 days ago')
aws ec2 describe-snapshots \
--owner-ids self \
--query "Snapshots[?StartTime<='$(date -u -v-90d +%Y-%m-%dT%H:%M:%SZ)'].{SnapshotId:SnapshotId, VolumeId:VolumeId, SizeGB:VolumeSize, StartTime:StartTime, Description:Description}" \
--output table
Azure: Finding Zombie Resources with the Azure CLI
Azure has its own patterns of waste accumulation. When VMs are deleted, their managed disks, network interfaces, public IPs, and network security groups often stay behind. Azure's resource group model helps contain resources, but it doesn't automatically clean up the orphans — that's on you.
Find Unattached Managed Disks
When a managed disk is attached to a VM, its managedBy property contains the VM's resource ID. When it's orphaned, this property is null.
# List all unattached managed disks
az disk list \
--query "[?managedBy==null].{Name:name, RG:resourceGroup, SizeGB:diskSizeGb, SKU:sku.name, Location:location, Created:timeCreated}" \
--output table
# Check when a specific disk was last detached
az disk show \
--name myOrphanedDisk \
--resource-group myResourceGroup \
--query "lastOwnershipUpdateTime"
Here's a handy script with a dry-run safety flag (I'd strongly recommend running the dry run first — trust me on this one):
# Set to 1 to delete, 0 to just list (dry run)
DELETE_DISKS=0

for id in $(az disk list --query "[?managedBy==null].[id]" -o tsv); do
  if [ "$DELETE_DISKS" == "1" ]; then
    echo "Deleting unattached disk: $id"
    az disk delete --ids "$id" --yes --no-wait
  else
    echo "ORPHANED: $id"
  fi
done
Find Orphaned Public IPs
Standard SKU static public IPs cost approximately $3.65/month each. Doesn't sound like much, right? But if you've got 50 orphaned IPs scattered across your subscriptions, that's over $180/month for absolutely nothing.
# List all public IPs not associated with any resource
az network public-ip list \
--query "[?ipConfiguration==null && natGateway==null].{Name:name, RG:resourceGroup, IP:ipAddress, SKU:sku.name, Allocation:publicIpAllocationMethod}" \
--output table
# Delete an orphaned public IP
az network public-ip delete \
--resource-group myResourceGroup \
--name myUnusedPublicIP
Find Empty App Service Plans
App Service plans incur charges whether or not they host any applications. An empty Standard S1 plan runs roughly $70/month — just sitting there doing nothing.
# Find App Service Plans hosting zero applications
az appservice plan list \
--query "[?numberOfSites==`0`].{Name:name, RG:resourceGroup, SKU:sku.name, Location:location}" \
--output table
# Delete an empty plan
az appservice plan delete \
--resource-group myResourceGroup \
--name myEmptyPlan \
--yes
Find Orphaned Network Interfaces
# List NICs not attached to any VM
az network nic list \
--query "[?virtualMachine==null].{Name:name, RG:resourceGroup, Location:location}" \
--output table
Azure Resource Graph KQL Queries
For large-scale environments with multiple subscriptions, Azure Resource Graph lets you run powerful cross-subscription queries using KQL. This is a game-changer if you're managing more than a handful of subscriptions:
// Find all unattached managed disks across all subscriptions
Resources
| where type == "microsoft.compute/disks"
| where properties.diskState == "Unattached"
| extend sizeGb = toint(properties.diskSizeGB)
| extend sku = tostring(sku.name)
| extend timeCreated = tostring(properties.timeCreated)
| join kind=leftouter (
resourcecontainers
| where type == "microsoft.resources/subscriptions"
| project subscriptionId, subscriptionName = name
) on subscriptionId
| project name, subscriptionName, resourceGroup, location, sizeGb, sku, timeCreated
| sort by sizeGb desc
GCP: Finding Zombie Resources with gcloud
GCP has similar waste patterns, with an additional nuance worth paying attention to: persistent disks are priced at $0.040/GB/month for pd-standard and $0.170/GB/month for pd-ssd. A single forgotten 500 GB SSD disk costs $85/month — that's over $1,000/year for storage nobody's using.
Find Unattached Persistent Disks
The NOT users:* filter identifies disks not attached to any VM instance:
# List all unattached persistent disks
gcloud compute disks list \
--filter="NOT users:*" \
--format="table(name, zone.basename(), sizeGb, type.basename(), status, lastDetachTimestamp)" \
--project=YOUR_PROJECT_ID
Find Terminated VM Instances
# List all terminated (stopped) instances
gcloud compute instances list \
--filter="status=TERMINATED" \
--format="table(name, zone.basename(), machineType.basename(), status)" \
--project=YOUR_PROJECT_ID
Find Old Snapshots Without a Retention Policy
Snapshots accumulate fast, especially if you've got automated snapshot schedules with no expiration configured. List them all and compare their creation dates against your retention threshold:
# List all snapshots sorted by creation time (oldest first)
gcloud compute snapshots list \
--format="table(name, diskSizeGb, creationTimestamp, storageBytes, status)" \
--sort-by=creationTimestamp \
--project=YOUR_PROJECT_ID
# Delete a specific old snapshot
gcloud compute snapshots delete SNAPSHOT_NAME \
--project=YOUR_PROJECT_ID \
--quiet
Find Unused Static External IPs
# List static IPs not in use
gcloud compute addresses list \
--filter="status=RESERVED" \
--format="table(name, address, region.basename(), status)" \
--project=YOUR_PROJECT_ID
Use GCP's Built-in Idle Resource Recommender
One thing GCP does really well is its native Recommender API. It'll proactively identify idle persistent disks, IPs, and custom images for you:
# Get idle persistent disk recommendations
gcloud recommender recommendations list \
--project=YOUR_PROJECT_ID \
--location=us-central1-a \
--recommender=google.compute.disk.IdleResourceRecommender \
--format="table(content.operationGroups[0].operations[0].resource, priority, description)"
# Get idle IP address recommendations
gcloud recommender recommendations list \
--project=YOUR_PROJECT_ID \
--location=us-central1 \
--recommender=google.compute.address.IdleResourceRecommender \
--format="table(content.operationGroups[0].operations[0].resource, priority, description)"
Automate the Hunt: Scheduled Cleanup Scripts
Finding zombies once is useful. But honestly, it's preventing them from accumulating that saves real money over time. Here are automation patterns for each cloud provider.
AWS: Lambda + EventBridge for Scheduled Cleanup
This Python Lambda function identifies unattached EBS volumes and sends a report via SNS. Schedule it with an EventBridge rule to run daily or weekly:
import boto3
import json
from datetime import datetime, timezone

# Approximate monthly storage cost per GB, by volume type
COST_PER_GB = {
    "gp3": 0.08, "gp2": 0.10, "io1": 0.125,
    "io2": 0.125, "st1": 0.045, "sc1": 0.015,
    "standard": 0.05
}

def lambda_handler(event, context):
    ec2 = boto3.client("ec2")
    sns = boto3.client("sns")

    # Find all unattached EBS volumes
    response = ec2.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )

    zombies = []
    total_cost = 0.0

    for vol in response["Volumes"]:
        size_gb = vol["Size"]
        vol_type = vol["VolumeType"]
        monthly_cost = size_gb * COST_PER_GB.get(vol_type, 0.10)
        total_cost += monthly_cost
        zombies.append({
            "VolumeId": vol["VolumeId"],
            "SizeGB": size_gb,
            "Type": vol_type,
            "MonthlyCost": f"${monthly_cost:.2f}",
            "Created": vol["CreateTime"].isoformat(),
            "AZ": vol["AvailabilityZone"]
        })

    if zombies:
        message = (
            f"Found {len(zombies)} unattached EBS volumes\n"
            f"Estimated monthly waste: ${total_cost:.2f}\n\n"
            + json.dumps(zombies, indent=2)
        )
        sns.publish(
            TopicArn="arn:aws:sns:us-east-1:123456789012:zombie-alerts",
            Subject=f"Zombie EBS Report - {datetime.now(timezone.utc).strftime('%Y-%m-%d')}",
            Message=message
        )

    return {"zombies_found": len(zombies), "estimated_monthly_waste": f"${total_cost:.2f}"}
Schedule it with an EventBridge cron rule to run every Monday morning:
# Create an EventBridge rule for weekly execution
aws events put-rule \
--name "weekly-zombie-scan" \
--schedule-expression "cron(0 8 ? * MON *)" \
--description "Weekly scan for zombie EBS volumes"
aws events put-targets \
--rule "weekly-zombie-scan" \
--targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:123456789012:function:zombie-scanner"
Azure: Automation Runbook for Disk Cleanup
Use an Azure Automation runbook to scan for orphaned disks on a weekly schedule:
# Schedule a weekly scan using Azure CLI
az automation schedule create \
--resource-group myAutomationRG \
--automation-account-name myAutomation \
--name "weekly-zombie-scan" \
--frequency Week \
--interval 1 \
--start-time "2026-03-10T08:00:00Z"
GCP: Cloud Functions + Cloud Scheduler
Deploy a Cloud Function that scans for unattached disks, creates snapshots for safety, then deletes them. Trigger it on a schedule with Cloud Scheduler:
# Create a Cloud Scheduler job for weekly execution
gcloud scheduler jobs create http zombie-disk-scanner \
--schedule="0 8 * * 1" \
--uri="https://REGION-PROJECT_ID.cloudfunctions.net/scan-zombie-disks" \
--http-method=POST \
--oidc-service-account-email=zombie-scanner@PROJECT_ID.iam.gserviceaccount.com \
--location=us-central1
Open-Source Tools for Multi-Cloud Zombie Hunting
If you're managing resources across multiple clouds (and let's be real, most mid-to-large companies are these days), these open-source tools provide cross-cloud visibility without vendor lock-in:
- CDOps Cloud Zombie Hunter — A CLI utility that scans AWS, Azure, and GCP for unattached EBS volumes, orphaned snapshots, idle instances, unused load balancers, and more. It uses read-only permissions, so it's safe to run without worrying about accidental deletions.
- Zombie Hunter (FinOps + ChatOps) — Scans multi-cloud infrastructure and reports findings straight to Slack. Supports configurable thresholds (snapshot age, idle days) and a dry_run mode for safe previews.
- Steampipe — Query multi-cloud resources using SQL. You can write queries like SELECT * FROM aws_ebs_volume WHERE state = 'available' to find zombies across providers. If you're comfortable with SQL, this one's fantastic.
- CloudQuery — Extracts cloud asset data into PostgreSQL, enabling SQL-based analysis and integration with BI tools for trend reporting.
Building a Zombie Prevention Framework
Catching zombies is reactive. Preventing them is what separates mature FinOps practices from periodic fire drills. Here's a framework that combines tagging, governance, and automation to keep your cloud environment clean.
1. Enforce Mandatory Tagging at Provisioning
Every resource should be tagged with at minimum: Owner, Project, Environment, and ExpirationDate. Use cloud-native policy engines to block untagged resource creation:
- AWS — Use AWS Organizations Tag Policies and Service Control Policies (SCPs) to require tags.
- Azure — Use Azure Policy with the Deny effect to block deployments that are missing mandatory tags.
- GCP — Use Organization Policy constraints combined with Terraform validation rules.
I can't stress this enough: tagging is the single most impactful thing you can do. Without it, you'll be playing detective every time you find an orphaned resource, trying to figure out who created it and whether it's safe to delete.
2. Schedule Non-Production Environment Shutdowns
Development, staging, and QA environments rarely need 24/7 uptime. If your team works standard business hours, that's 128 hours of idle time per week per instance. Scheduling shutdowns for nights and weekends alone can cut non-production compute costs by up to 65%.
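The arithmetic behind that 128-hour figure is simple enough to show (assuming an 8-hour day, 5-day week; the realized savings land below the raw idle fraction because teams usually keep a buffer around working hours):

```python
# Weekly idle hours for a non-production instance that only needs to be up
# during business hours (8-hour day, 5-day week assumed).
HOURS_PER_WEEK = 24 * 7   # 168
BUSINESS_HOURS = 8 * 5    # 40

idle_hours = HOURS_PER_WEEK - BUSINESS_HOURS
idle_fraction = idle_hours / HOURS_PER_WEEK
print(idle_hours)                          # 128
print(f"{idle_fraction:.0%} of the week")  # 76% of the week
```

So even after padding the schedule for early birds and late finishers, there's a lot of room between 76% theoretical idle time and the 65% savings ceiling.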
3. Set Expiration Dates on Temporary Resources
Tag temporary resources with an ExpirationDate and run a nightly Lambda or Cloud Function that deletes expired ones. This prevents test environments and proof-of-concept setups from becoming permanent budget items — which, if we're being honest, happens way more often than anyone likes to admit.
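Here's a minimal sketch of what that nightly sweep's core logic might look like. The ExpirationDate tag name and ISO date format are conventions assumed for illustration, not a provider standard, and the resource dicts stand in for whatever your cloud SDK returns:

```python
# Return the IDs of resources whose ExpirationDate tag is in the past.
# Untagged resources are deliberately left alone - never auto-delete
# something nobody marked as temporary.
from datetime import date

def expired(resources: list[dict], today: date) -> list[str]:
    out = []
    for r in resources:
        exp = r.get("tags", {}).get("ExpirationDate")
        if exp and date.fromisoformat(exp) < today:
            out.append(r["id"])
    return out

resources = [
    {"id": "i-aaa", "tags": {"ExpirationDate": "2026-01-31"}},
    {"id": "i-bbb", "tags": {"ExpirationDate": "2026-12-31"}},
    {"id": "i-ccc", "tags": {}},  # untagged: never auto-deleted
]
print(expired(resources, date(2026, 3, 1)))  # ['i-aaa']
```

Wire the output into your delete calls (or, better, a notification step first) and temporary really means temporary.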
4. Run Weekly Zombie Scans
Use the CLI commands and automation scripts from this guide to run weekly audits. Send reports to a Slack channel or email distribution list so resource owners get notified and have a window to claim resources before cleanup happens.
5. Track the Zombie Resource Percentage KPI
Measure zombie resource costs as a percentage of total cloud spend. Mature FinOps teams target less than 5%. Track this metric monthly and report it alongside other cloud cost KPIs. Nothing motivates cleanup quite like having a number attached to the waste that leadership can see.
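The KPI itself is one line of arithmetic; the figures below are illustrative, keyed to the $100,000/month example from earlier:

```python
# Zombie spend as a percentage of total cloud spend.
def zombie_pct(zombie_cost: float, total_cost: float) -> float:
    return round(100 * zombie_cost / total_cost, 1)

print(zombie_pct(4_500, 100_000))   # 4.5  -> inside the <5% target
print(zombie_pct(27_000, 100_000))  # 27.0 -> roughly the industry average
```

Plot it month over month and the trend line does the advocacy for you.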
Quick-Win Checklist: Your First Zombie Hunt
If you've never run a zombie audit before, start with these high-impact checks. They typically surface waste worth 10–20% of monthly spend:
- Unattached storage volumes — EBS volumes (AWS), managed disks (Azure), persistent disks (GCP)
- Orphaned static IPs — Elastic IPs (AWS), public IPs (Azure), static external IPs (GCP)
- Stopped or terminated instances — still incurring charges for attached storage and IPs
- Old snapshots — beyond your retention window with no deletion policy
- Idle load balancers — ALBs/NLBs with no healthy targets or zero traffic
- Empty App Service plans or unused Cloud Run services — provisioned but hosting nothing
Run the CLI commands from this guide, export the results to a spreadsheet, and calculate the monthly cost of each zombie. Prioritize cleanup by cost impact. Most teams find $5,000–$15,000 in annual savings from this first pass alone — and that's usually just scratching the surface.
Frequently Asked Questions
What is the difference between orphaned and zombie cloud resources?
Orphaned resources are cloud assets that remain provisioned after their parent resource was deleted — like an EBS volume left behind when an EC2 instance is terminated. Zombie resources are a broader category that includes orphaned resources plus any asset that's technically active but functionally unnecessary, like a running VM with near-zero CPU utilization. Both cost money without delivering value, and the fix is the same: identify, review, and either terminate or right-size them.
How much money can I save by cleaning up unused cloud resources?
Industry data consistently shows that organizations waste 27–35% of their cloud budget on unused or underutilized resources. A first-time zombie audit typically uncovers 10–20% of monthly spend that can be eliminated right away. For a company spending $50,000/month on cloud infrastructure, that translates to $5,000–$10,000 in immediate monthly savings. Ongoing weekly scans and automation typically achieve 20–30% cost reduction within six months.
How often should I scan for zombie resources?
At least weekly. Many mature FinOps teams run daily scans for high-cost resource types (compute instances, databases) and weekly scans for storage-related zombies (volumes, snapshots, IPs). The key is automation — manual audits don't happen frequently enough and miss resources created between reviews. Use EventBridge (AWS), Azure Automation (Azure), or Cloud Scheduler (GCP) to schedule scans and deliver reports automatically.
Is it safe to delete unattached cloud resources automatically?
Not without safeguards. An unattached disk might contain data that hasn't been migrated yet. A stopped instance could be intentionally paused for a maintenance window. Always implement a grace period workflow: tag newly discovered zombies with a review date, notify resource owners, and only auto-delete after a 7–14 day window if no one claims the resource. For critical environments, snapshot volumes before deletion so data can be recovered if needed.
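That grace-period workflow can be sketched as a small triage function. The claimed and zombie-review-by tag names and the 14-day window are conventions invented here for illustration; swap in whatever your tagging policy uses:

```python
# Triage a suspected zombie: start a grace period on first sighting,
# keep anything an owner has claimed, and only mark a resource for
# deletion once its review deadline has passed unclaimed.
from datetime import date, timedelta

GRACE_DAYS = 14

def triage(resource: dict, today: date) -> str:
    tags = resource.setdefault("tags", {})
    if tags.get("claimed") == "true":
        return "keep"                   # an owner vouched for it
    deadline = tags.get("zombie-review-by")
    if deadline is None:
        # First sighting: start the clock and notify the owner.
        tags["zombie-review-by"] = (today + timedelta(days=GRACE_DAYS)).isoformat()
        return "notify"
    if date.fromisoformat(deadline) < today:
        return "delete"                 # grace period expired, unclaimed
    return "wait"

r = {"id": "vol-123", "tags": {}}
print(triage(r, date(2026, 3, 1)))   # notify (deadline set to 2026-03-15)
print(triage(r, date(2026, 3, 10)))  # wait
print(triage(r, date(2026, 3, 20)))  # delete
```

The "delete" outcome should still snapshot first in critical environments, exactly as described above.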
What tools can detect zombie resources across multiple cloud providers?
Several open-source tools provide multi-cloud zombie detection. CDOps Cloud Zombie Hunter and Zombie Hunter both scan AWS, Azure, and GCP from a single CLI. Steampipe lets you query all three clouds using SQL. For enterprise needs, commercial platforms like CloudHealth, Spot by NetApp, and CAST AI offer automated multi-cloud zombie detection with policy-driven cleanup. Each cloud provider also has native tools — AWS Trusted Advisor and Compute Optimizer, Azure Advisor, and GCP Recommender — but those only cover their own platform.