AWS CloudWatch Cost Optimization Guide…

Updated: June 1, 2026

AWS CloudWatch cost optimization means cutting the four line items that actually drive the bill: Logs ingestion (DataProcessing-Bytes), Logs storage, custom metrics (PutMetricData), and Logs Insights query bytes scanned. The fix is routing low-value logs to the Infrequent Access class, deleting noisy custom metrics, shortening retention, and replacing per-event PutMetricData calls with Embedded Metric Format. In my last engagement, that exact combination dropped a $48,000/month CloudWatch bill to under $9,000 in six weeks, with zero loss of production observability. This guide walks the 2026 levers in priority order.

Logs ingestion at $0.50/GB is almost always the #1 CloudWatch line item. Kill DEBUG/INFO output from application and Lambda logs, and route VPC Flow and ALB logs to S3, before doing anything else.
The Infrequent Access log class (GA November 2023, expanded 2024) cuts ingestion to $0.25/GB and storage to roughly $0.008/GB-month. That's a 50-75% cut on cold logs with zero code change.
Custom metrics cost $0.30 per month for each unique dimension combination. A single high-cardinality dimension (like user_id) can produce 100,000+ metrics from one PutMetricData call site.
Embedded Metric Format (EMF) emits metrics inside log lines, so up to 100 metric values ride on one log event with no PutMetricData request charge. The extracted metrics still bill as custom metrics, so the saving comes from dropping API calls and keeping dimensions low-cardinality.
Logs Insights bills $0.005 per GB scanned. Narrow time ranges, use fields early, and put filters before stats to cut query cost 80%+.
Default "Never expire" retention on every log group is the silent killer. Terraform-enforced retention policies are the single highest-ROI control you can ship.

Why is my CloudWatch bill so high?

CloudWatch bills surprise teams because three independent failure modes compound at once. First, many AWS services ship logs to CloudWatch with no expiry and most teams never set a retention policy, meaning a single noisy Lambda from 2019 may still be accruing storage charges on 7 years of TRACE logs. Second, custom metric pricing is per-metric per-dimension-combination per month, so a developer who calls PutMetricData with user_id as a dimension can mint hundreds of thousands of metrics in a single afternoon. Third, the Logs Insights console makes it trivial to scan terabytes of data with a careless query. One engineer running fields @message | filter ... across 30 days of VPC Flow Logs can rack up four-figure scan charges in minutes.

In my triage of more than a dozen AWS accounts, the same rough split holds. Roughly 60% of CloudWatch spend is Logs ingestion, 20% is custom metrics, 10% is Logs Insights queries, and 10% is everything else (alarms, dashboards, contributor insights, synthetics). Optimization should follow that same order. If you haven't yet identified your top spenders, run the AWS Cost Explorer breakdown grouped by Usage Type and filter to the AmazonCloudWatch service. The line items map cleanly to the levers below. For broader investigations, see our AWS bill spike triage playbook.

CloudWatch pricing breakdown 2026

The 2026 us-east-1 prices below are the public list rates. Other regions can be 10-25% higher. Cross-reference your invoice's Usage Type column against this table to know exactly which lever applies.

Line item	Standard class	Infrequent Access class	Notes
Logs ingestion (per GB)	$0.50	$0.25	Top spend driver in 80% of accounts.
Logs storage (per GB-month)	$0.03	~$0.008	Compressed bytes; IA is ~73% cheaper.
Logs Insights (per GB scanned)	$0.005	$0.0075	IA queries are more expensive, so use sparingly.
Custom metrics (per metric-month)	$0.30 (first 10k)	n/a	Drops to $0.10 for the next 240k metrics.
API requests (PutMetricData)	$0.01 / 1,000 requests	n/a	EMF avoids this entirely.
Alarms (standard resolution)	$0.10 / alarm-month	n/a	$0.30 for high-resolution.
Dashboards	$3 / dashboard-month	n/a	First 3 are free.
Vended logs to S3	$0.25 / GB	n/a	For ALB, VPC Flow, CloudFront.

The full region-by-region matrix lives in the official CloudWatch pricing page. Bookmark it. AWS adjusts these rates quietly, and 2025 saw two changes to the Logs Insights scan rate alone.
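
To sanity-check an invoice against the table, the list rates can be turned into a small estimator. This is a minimal sketch: the RATES dictionary simply copies the us-east-1 figures quoted above, and the function name and parameters are illustrative rather than part of any AWS SDK.

```python
# List rates copied from the table above (us-east-1); adjust for your region.
RATES = {
    "STANDARD": {"ingest": 0.50, "storage": 0.03, "scan": 0.005},
    "INFREQUENT_ACCESS": {"ingest": 0.25, "storage": 0.008, "scan": 0.0075},
}


def estimate_monthly_logs_cost(ingested_gb, stored_gb, scanned_gb, log_class="STANDARD"):
    """Return the estimated monthly Logs cost in USD for one log class."""
    rate = RATES[log_class]
    return (
        ingested_gb * rate["ingest"]
        + stored_gb * rate["storage"]
        + scanned_gb * rate["scan"]
    )


if __name__ == "__main__":
    for log_class in RATES:
        cost = estimate_monthly_logs_cost(1000, 500, 200, log_class)
        print(f"{log_class}: ${cost:,.2f}/month")
```

Feed it the GB figures from Cost Explorer per usage type; a large gap between the estimate and the invoice usually means a line item outside Logs is doing the damage.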

How to reduce CloudWatch Logs ingestion cost

Ingestion is metered in uncompressed bytes at $0.50/GB, so within a log class the only way to lower it is to send fewer bytes. There's no compression discount, and the free tier covers only the first 5 GB/month. Honestly, this is the one section where doing the work pays back within a single billing cycle. The five highest-impact actions, ordered by ROI:

1. Filter at the source

Application logs are the easiest win. Set the production log level to WARN or ERROR for any service that ships more than 1 GB/day. A typical Node.js or Python service running at INFO emits 70-90% noise: request start, request end, healthchecks, cache hits. For Lambda, prefer structured logging libraries like AWS Lambda Powertools that respect the POWERTOOLS_LOG_LEVEL env var, so you can flip noise off without a redeploy.
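
For services that don't use Powertools, the same switch is a few lines of standard-library Python. A minimal sketch, assuming a hypothetical LOG_LEVEL environment variable (the name is my choice, not an AWS convention) and a default of WARNING:

```python
import logging
import os


def configure_logging(default_level="WARNING"):
    """Set the root logger level from the LOG_LEVEL environment variable."""
    level_name = os.environ.get("LOG_LEVEL", default_level).upper()
    level = getattr(logging, level_name, None)
    # Fall back to WARNING if the variable holds an unknown level name.
    if not isinstance(level, int):
        level = logging.WARNING
    logging.basicConfig(format="%(levelname)s %(name)s %(message)s")
    logging.getLogger().setLevel(level)
    return level


if __name__ == "__main__":
    configure_logging()
    logging.getLogger("checkout").info("dropped at WARNING")
    logging.getLogger("checkout").warning("still shipped")
```

Call it once at process start. Raising verbosity during an incident then becomes a configuration change instead of a code change.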

2. Route high-volume sources to S3 instead of CloudWatch

VPC Flow Logs and other vended service logs can each push hundreds of GB/day into CloudWatch when CloudWatch Logs is their configured destination. Reconfigure them to land directly in S3 (Vended logs at $0.25/GB, or free delivery to S3 for ALB/CloudFront) and query with Athena instead. I've cut customer Flow Log bills from $12k/month to $400 with this single change.

3. Sample healthcheck and chatty endpoints

If you absolutely need request logs at scale, sample them. Drop 95% of 2xx responses and keep 100% of 4xx/5xx. Many reverse proxies support this natively (Envoy access log filters, nginx map combined with the access_log if= parameter). ALB access logs cannot be sampled at the source, so handle those at the S3 and Athena layer instead.
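
When the proxy can't sample, the same rule can live in application code. The sketch below is one way to do it with a standard-library logging filter; the class name and the convention of passing the HTTP status through extra={"status": ...} are assumptions for illustration.

```python
import logging
import random


class StatusSamplingFilter(logging.Filter):
    """Keep every 4xx/5xx record and a random sample of everything else."""

    def __init__(self, success_sample_rate=0.05, rng=None):
        super().__init__()
        self.success_sample_rate = success_sample_rate
        self.rng = rng if rng is not None else random.Random()

    def filter(self, record):
        # `status` is expected via logger.info(..., extra={"status": 200}).
        status = getattr(record, "status", None)
        if status is None or status >= 400:
            return True  # never drop errors or records without a status
        return self.rng.random() < self.success_sample_rate


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    access_log = logging.getLogger("access")
    access_log.addFilter(StatusSamplingFilter())
    access_log.info("GET /health", extra={"status": 200})    # kept ~5% of the time
    access_log.info("POST /checkout", extra={"status": 502})  # always kept
```

Because sampling is random per record, multiply 2xx counts by 20 when you compute request rates from the sampled stream.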

4. Compress structured logs at the app layer

JSON is verbose. A 300-byte access log line often carries 40 bytes of actual signal. Use field-name shortening (method becomes m, status_code becomes s) for high-volume log streams. You keep query compatibility via aliasing in Logs Insights but cut ingestion 30-50%.
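
Field-name shortening is easy to bolt onto an existing JSON logger. A minimal sketch, with a hypothetical mapping (pick abbreviations your team can still read) and a helper to measure what the change actually saves on your own events:

```python
import json

# Hypothetical mapping from long field names to short ones.
SHORT_KEYS = {"method": "m", "status_code": "s", "duration_ms": "d", "path": "p"}


def shorten_keys(event, mapping=SHORT_KEYS):
    """Return a copy of event with long field names replaced by short ones."""
    return {mapping.get(key, key): value for key, value in event.items()}


def encoded_size(event):
    """Bytes the event occupies as compact UTF-8 JSON."""
    return len(json.dumps(event, separators=(",", ":")).encode("utf-8"))


if __name__ == "__main__":
    event = {"method": "GET", "status_code": 200, "duration_ms": 12, "path": "/cart"}
    short = shorten_keys(event)
    print(short, encoded_size(event), "->", encoded_size(short), "bytes")
```

Measure on real traffic before committing: the saving depends on how much of each line is keys versus values.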

5. Set a max log size limit on Container Insights

EKS Fluent Bit and ECS awslogs drivers by default forward everything stdout/stderr emit, including multi-megabyte stack traces from runaway loops. Cap individual log events at 64 KB at the agent layer to avoid a single bad deploy blowing up next month's bill.
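
If you'd rather not depend on agent configuration alone, the same cap can be applied in application code before a line ever reaches the agent. A minimal sketch, assuming UTF-8 output and a truncation marker of my own choosing:

```python
MAX_EVENT_BYTES = 64 * 1024  # the 64 KB cap suggested above


def truncate_event(message, max_bytes=MAX_EVENT_BYTES, marker="...[truncated]"):
    """Trim message so its UTF-8 encoding fits within max_bytes."""
    encoded = message.encode("utf-8")
    if len(encoded) <= max_bytes:
        return message
    # Reserve room for the marker; max_bytes is assumed to exceed its length.
    keep = max(0, max_bytes - len(marker.encode("utf-8")))
    # errors="ignore" drops a multi-byte character that the cut split in half.
    return encoded[:keep].decode("utf-8", errors="ignore") + marker


if __name__ == "__main__":
    runaway_trace = "Traceback...\n" * 100_000
    print(len(truncate_event(runaway_trace).encode("utf-8")), "bytes shipped")
```

The cut is made on bytes rather than characters because ingestion is billed on bytes.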

Infrequent Access log class explained

The CloudWatch Logs Infrequent Access (IA) class went GA in November 2023 and was extended through 2024 to support more APIs. It's the easiest 50%+ ingestion saving you can deploy this week. IA cuts ingestion to $0.25/GB and storage to roughly $0.008/GB-month, in exchange for four trade-offs: no subscription filters, no metric filters, no Live Tail, and Logs Insights queries that cost 50% more per GB scanned.

The decision rule I use: if a log group is read fewer than once per week by humans and isn't consumed by a subscription filter or metric filter, move it to IA. In practice that covers more than 70% of log groups in a mature account (audit trails, compliance logs, debug logs from non-customer-facing batch jobs, archived service logs).
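
The rule is simple enough to encode and run over an inventory of log groups. A minimal sketch: the function names and inputs are hypothetical, and the saving uses only the Standard and IA rates quoted in this guide.

```python
def should_move_to_ia(human_reads_per_week, has_subscription_filter, has_metric_filter):
    """Apply the decision rule above to one log group."""
    return (
        human_reads_per_week < 1
        and not has_subscription_filter
        and not has_metric_filter
    )


def monthly_ia_saving(ingested_gb, stored_gb):
    """Estimated monthly saving in USD from moving one group to IA."""
    ingest_saving = ingested_gb * (0.50 - 0.25)
    storage_saving = stored_gb * (0.03 - 0.008)
    return ingest_saving + storage_saving


if __name__ == "__main__":
    if should_move_to_ia(0.2, has_subscription_filter=False, has_metric_filter=False):
        print(f"Move it: saves about ${monthly_ia_saving(100, 1000):.2f}/month")
```

The read frequency has to come from your own knowledge of the team's habits; nothing in this sketch measures it for you.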

Convert an existing log group with the PutLogGroupClass API, or a single Terraform attribute:

resource "aws_cloudwatch_log_group" "batch_jobs" {
  name              = "/aws/batch/nightly-etl"
  retention_in_days = 30
  log_group_class   = "INFREQUENT_ACCESS"
  skip_destroy      = true
}

Conversion is one-way per log group. You can't switch back to Standard; the only route is to create a new Standard group. Audit before you flip. If a group has any metric filters or subscription filters defined, the conversion will fail. The official AWS docs on Log Classes list every API constraint.

How to cut custom metrics cost with EMF

Custom metrics cost $0.30 per metric-month for the first 10,000 metrics, then $0.10 for the next 240,000. The trap is the word metric: every unique combination of metric name + dimension values counts as one. If you call PutMetricData with dimensions {Service=checkout, Region=us-east-1, UserId=12345}, you've minted a new metric, and the next user_id mints another, and so on. Teams routinely blow past the first tier in days. I hit this exact bug shipping a checkout service early last year. We burned through $4k in custom metrics before catching it the next morning.

Two rules to enforce:

Never use high-cardinality values (user IDs, request IDs, session IDs) as metric dimensions. Use logs for that.
Replace per-event PutMetricData calls with Embedded Metric Format (EMF). EMF lets you emit a JSON log line that CloudWatch automatically converts to metrics, so you pay log ingestion ($0.50/GB) instead of per-API-call fees. The resulting metrics still bill at the custom metric rate, which is why the first rule matters just as much with EMF.
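
To see why the first rule matters, it helps to put numbers on it. The sketch below multiplies the distinct values of each dimension to get a worst-case metric count for one metric name, then prices it at the $0.30 first-tier rate. It ignores volume tiers and assumes every combination actually occurs, so treat the result as an upper bound; the function names are illustrative.

```python
from math import prod

PRICE_PER_METRIC_MONTH = 0.30  # first-tier rate quoted above; volume tiers ignored


def metric_count(distinct_values_per_dimension):
    """Worst-case number of billable metrics for one metric name."""
    return prod(distinct_values_per_dimension.values())


def worst_case_monthly_cost(distinct_values_per_dimension, metric_names=1):
    """Upper-bound monthly cost in USD if every combination is emitted."""
    count = metric_count(distinct_values_per_dimension) * metric_names
    return count * PRICE_PER_METRIC_MONTH


if __name__ == "__main__":
    safe = {"Service": 3, "Region": 2}
    risky = {**safe, "UserId": 100_000}
    print(metric_count(safe), metric_count(risky))
```

Run this in code review for any new dimension: if the count changes by orders of magnitude, the value belongs in a log field instead.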

A working EMF example in Python using Lambda Powertools:

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

metrics = Metrics(namespace="Checkout", service="payments")

def handler(event, context):
    metrics.add_metric(name="OrdersProcessed", unit=MetricUnit.Count, value=1)
    # EMF has no currency unit, so Count is used here as a neutral unit for the order total.
    metrics.add_metric(name="OrderValueUSD", unit=MetricUnit.Count, value=event["total"])
    # A dimension applies to every metric in this flush. Cardinality stays
    # bounded only because currency codes come from a small fixed set.
    metrics.add_dimension(name="Currency", value=event["currency"])
    metrics.flush_metrics()  # writes a single JSON log line
    return {"statusCode": 200}

Behind the scenes Powertools writes EMF JSON to stdout. CloudWatch detects the _aws envelope and extracts the metrics with no PutMetricData request. Crucially, you can push 100 metrics per log event with a single ingestion charge, which replaces 100 separate metric submissions. The extracted metrics are still billed as custom metrics, so the cardinality rule above continues to apply. The EMF specification documents the JSON shape if you'd rather skip the library.
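
For reference, here is roughly what skipping the library looks like. This is a minimal sketch of the EMF envelope as I understand the specification (one metric directive, one dimension set); the helper name and its argument shapes are my own, so check the specification before relying on edge cases such as multiple dimension sets.

```python
import json
import time


def emf_line(namespace, dimensions, metrics):
    """Build one EMF log line.

    dimensions: dict of dimension name -> value (keep these low-cardinality)
    metrics: dict of metric name -> (value, unit)
    """
    payload = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dimensions)],  # a single dimension set
                    "Metrics": [
                        {"Name": name, "Unit": unit}
                        for name, (_, unit) in metrics.items()
                    ],
                }
            ],
        }
    }
    payload.update(dimensions)  # dimension values live at the top level
    payload.update({name: value for name, (value, _) in metrics.items()})
    return json.dumps(payload)


if __name__ == "__main__":
    # In Lambda, printing to stdout is enough for the line to reach CloudWatch Logs.
    print(emf_line("Checkout", {"Currency": "USD"}, {"OrdersProcessed": (1, "Count")}))
```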

How to reduce Logs Insights query cost

Logs Insights charges $0.005 per GB scanned (Standard class) or $0.0075 (Infrequent Access class). Scan is computed before filtering, so a query that returns 1 row may still scan 200 GB.

Three optimization patterns that cut scan by 80% or more:

Tighten the time range first

The single biggest lever. A query over 7 days against a 50 GB/day log group scans 350 GB ($1.75); the same query over 1 hour scans roughly 2 GB ($0.01). Build dashboards with default 1-hour windows, not 24-hour.
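
The arithmetic behind those two figures, as a small helper you can reuse for your own log groups. It assumes a query scans the whole window at a uniform daily volume, which makes it a worst-case estimate; the function name is illustrative.

```python
def scan_cost_usd(gb_per_day, window_hours, rate_per_gb=0.005):
    """Return (scanned_gb, cost_usd) for one query that scans the full window."""
    scanned_gb = gb_per_day * window_hours / 24
    return scanned_gb, scanned_gb * rate_per_gb


if __name__ == "__main__":
    for label, hours in [("7 days", 7 * 24), ("1 hour", 1)]:
        gb, cost = scan_cost_usd(50, hours)
        print(f"{label}: {gb:.1f} GB scanned, ${cost:.2f}")
```

Multiply the result by the number of dashboard widgets and refreshes per day to see what a 24-hour default window really costs.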

Use `fields` projection before `stats`

Logs Insights scans every byte of every event when you reference @message. If you only need two fields, project them early. Insights can sometimes prune reads on structured JSON logs, but the effect varies by log format, so compare the bytes-scanned figure reported after each run before counting on the saving.

# Expensive: scans full @message body
fields @timestamp, @message
| filter @message like /ERROR/
| stats count() by bin(5m)

# Cheaper on JSON logs: only the named fields hit the scanner
fields @timestamp, level, service
| filter level = "ERROR"
| stats count() by service, bin(5m)

Save and parameterize queries instead of free-typing

The console's Run button is the enemy. Save canonical queries with bounded time ranges and require team members to use them. We added a Logs Insights query budget alarm: if scan volume exceeds 1 TB/day, an SNS topic pages the FinOps channel. That alone cut accidental scan spend 60% across our org.
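
The budget check itself is a few lines once you have per-query statistics. A minimal sketch: it assumes you have already collected one dictionary per completed query, shaped like the statistics object that GetQueryResults returns (with a bytesScanned field). How you collect them and how you publish the alarm are left out.

```python
TB = 1000 ** 4  # decimal terabyte; use 1024 ** 4 if you prefer binary units


def scan_budget_exceeded(query_statistics, daily_budget_bytes=1 * TB):
    """Return (total_bytes_scanned, exceeded) for one day's queries."""
    total = sum(stats.get("bytesScanned", 0) for stats in query_statistics)
    return total, total > daily_budget_bytes


if __name__ == "__main__":
    todays_queries = [{"bytesScanned": 6e11}, {"bytesScanned": 5e11}]
    total, exceeded = scan_budget_exceeded(todays_queries)
    print(f"{total / TB:.2f} TB scanned, page FinOps: {exceeded}")
```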

CloudWatch alarms and dashboards optimization

Alarms and dashboards rarely top the bill, but they accumulate. A typical AWS account collects 800-3,000 stale alarms over its lifetime (defunct services, deleted Lambdas, decommissioned ECS tasks), each costing $0.10/month. That's $960-$3,600/year for alarms nobody ever reads.

The cleanup script we run quarterly across every account:

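#!/bin/bash
# List alarms in INSUFFICIENT_DATA state for >30 days. Almost always stale,
# but review the list first: alarms on sparse metrics also sit in this state.
# Requires GNU date (on macOS, install coreutils and use gdate).
# Runs as a dry run unless DRY_RUN=0 is set.
set -euo pipefail
cutoff=$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ)
aws cloudwatch describe-alarms \
  --state-value INSUFFICIENT_DATA \
  --query "MetricAlarms[?StateUpdatedTimestamp<'${cutoff}'].AlarmName" \
  --output text \
  | tr '\t' '\n' \
  | while IFS= read -r name; do
      [ -n "$name" ] || continue
      if [ "${DRY_RUN:-1}" = "0" ]; then
        echo "Deleting stale alarm: $name"
        aws cloudwatch delete-alarms --alarm-names "$name"
      else
        echo "Would delete stale alarm: $name"
      fi
    done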
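
If you'd rather review candidates in Python before deleting anything, the same staleness test is a short pure function. This sketch assumes each alarm is a dictionary with the AlarmName, StateValue, and StateUpdatedTimestamp keys that DescribeAlarms returns, with the timestamp already parsed into a timezone-aware datetime; fetching the alarms is left to the caller.

```python
from datetime import datetime, timedelta, timezone


def stale_alarm_names(alarms, now=None, max_age_days=30):
    """Names of alarms stuck in INSUFFICIENT_DATA for longer than max_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        alarm["AlarmName"]
        for alarm in alarms
        if alarm["StateValue"] == "INSUFFICIENT_DATA"
        and alarm["StateUpdatedTimestamp"] < cutoff
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"AlarmName": "old-lambda-errors", "StateValue": "INSUFFICIENT_DATA",
         "StateUpdatedTimestamp": now - timedelta(days=120)},
        {"AlarmName": "checkout-5xx", "StateValue": "OK",
         "StateUpdatedTimestamp": now - timedelta(days=120)},
    ]
    print(stale_alarm_names(sample))
```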

Dashboards above the free 3-per-account threshold cost $3/month each. Audit them. A dashboard with no view in 90 days is dead weight. CloudTrail records GetDashboard calls as CloudWatch management events, so you can query CloudTrail Lake or Athena over your trail to identify dashboards nobody has opened.

Composite alarms (available since 2020) let you fold 20 single-metric alarms into one logical expression and notify only on the composite, which reduces alert fatigue. They are not a billing shortcut on their own: the composite bills $0.50/month and the child alarms it references keep billing, so the saving comes from deleting children that no longer need to exist. The composite alarm documentation covers the expression syntax.

Enforce log retention with Terraform

The default retention for any newly created CloudWatch log group is Never expire. That single default is responsible for most of the silent storage growth I see in audits. Fix it once at the infrastructure-as-code layer and the problem stops recurring.

Use AWS Config or, better, a Terraform module that wraps log group creation:

variable "default_retention_days" {
  description = "Default retention for log groups created by this module"
  type        = number
  default     = 30
  validation {
    condition     = contains([1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653], var.default_retention_days)
    error_message = "Retention must be a valid CloudWatch retention bucket."
  }
}

resource "aws_cloudwatch_log_group" "this" {
  for_each          = var.log_groups
  name              = each.value.name
  retention_in_days = lookup(each.value, "retention_days", var.default_retention_days)
  log_group_class   = lookup(each.value, "class", "STANDARD")
  kms_key_id        = var.kms_key_id
  tags              = merge(var.common_tags, lookup(each.value, "tags", {}))
}

For existing log groups, batch-apply with a one-liner:

aws logs describe-log-groups --query 'logGroups[?!retentionInDays].logGroupName' --output text \
  | tr '\t' '\n' \
  | xargs -I {} aws logs put-retention-policy --log-group-name {} --retention-in-days 30
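
Before running a blanket 30-day policy, it's worth listing what it would touch, largest first, because shortening retention deletes older events for good. A minimal sketch, assuming each log group is a dictionary with the logGroupName, storedBytes, and optional retentionInDays keys that DescribeLogGroups returns; the helper names are my own.

```python
# Retention values accepted by PutRetentionPolicy.
VALID_RETENTION_DAYS = {
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
}


def is_valid_retention(days):
    """True if days is a retention period CloudWatch Logs accepts."""
    return days in VALID_RETENTION_DAYS


def groups_without_retention(log_groups):
    """Names of log groups set to never expire, largest stored size first."""
    unbounded = [g for g in log_groups if "retentionInDays" not in g]
    unbounded.sort(key=lambda g: g.get("storedBytes", 0), reverse=True)
    return [g["logGroupName"] for g in unbounded]


if __name__ == "__main__":
    sample = [
        {"logGroupName": "/aws/lambda/legacy", "storedBytes": 9_000_000_000},
        {"logGroupName": "/aws/batch/nightly-etl", "storedBytes": 100, "retentionInDays": 30},
    ]
    print(groups_without_retention(sample))
```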

Pair this with Cloud Custodian or AWS Config rules so any future log group created out-of-band is auto-tagged and flagged. We cover the broader pattern in our Cloud Custodian cost automation guide, and the same idea applies to the wider observability cost optimization question if you're also running Datadog or Splunk alongside CloudWatch.

Frequently Asked Questions

How much does CloudWatch cost per GB of logs?

CloudWatch Logs ingestion is $0.50 per GB in the Standard class and $0.25 per GB in the Infrequent Access class (us-east-1, 2026). Storage is an additional $0.03/GB-month Standard or about $0.008/GB-month IA. Logs Insights query scans bill separately at $0.005/GB Standard or $0.0075/GB IA.

Is CloudWatch cheaper than Datadog?

For pure log ingestion, CloudWatch is typically 3-5x cheaper than Datadog at list price, but Datadog includes longer retention, faster search, and APM. For native AWS workloads with modest analytics needs, CloudWatch plus Athena is the cheapest stack. For multi-cloud or rich tracing, Datadog often wins on total value despite the higher unit price.

How do I find the most expensive CloudWatch log group?

Use the AWS CLI: aws logs describe-log-groups --query 'logGroups[*].[logGroupName,storedBytes]' --output text | sort -k2 -n -r | head -20. For ingestion volume (not just storage), query the IncomingBytes metric in the AWS/Logs namespace grouped by LogGroupName over 30 days.

Can I move existing logs to Infrequent Access?

You can change a log group's class with PutLogGroupClass, but the change is one-way and the group must not have metric filters or subscription filters defined. Existing log events stay in place; only the class flag changes, which immediately lowers storage and future ingestion rates.

Why are my custom metrics so expensive?

Almost always high-cardinality dimensions. Each unique combination of metric name and dimension values is a separate billable metric at $0.30/month. Using user IDs, request IDs, or trace IDs as dimensions can mint hundreds of thousands of metrics from a single PutMetricData line. Switch to Embedded Metric Format and remove high-cardinality dimensions.

What is the cheapest way to keep CloudWatch logs for compliance?

Set log group retention to 7-30 days for active query access, and configure a daily Export Task to S3 with a Glacier Instant Retrieval lifecycle policy. Storage drops to about $0.004/GB-month, you keep the logs queryable via Athena, and you satisfy most SOC 2 / ISO 27001 retention requirements at roughly 13% of Standard-class CloudWatch storage cost.
