AWS Cost Optimization: What I'd Audit First on a $50K Bill

Give me read access to a $50,000/month AWS account and I will tell you within a day where the first 20-30% is hiding, because on a mid-size bill it is almost always hiding in the same four places, in the same order: data transfer you can’t see in the console, instances sized for a load test that ran two years ago, on-demand pricing on a baseline that never moves, and storage rotting in the most expensive class AWS sells. None of this needs an architecture rewrite. AWS cost optimization, at least the first and biggest pass of it, is just the bill read in the right order by someone who knows where AWS buries the meter.

This is the order I work in. It is the same audit I run on every account I’m handed, and it is also what I offer as a service: if you want me to run it on yours, the end of the post says how. But you can run most of it yourself today, and you should, because nobody is going to care about your bill as much as you do.

A note before the recipe: I deal in ranges, not promises. The exact saving on your account depends on what you’ve built. What I can promise is that the mistakes below are common enough that the question is usually how much, not whether.

Hour zero: get the real bill, not the dashboard

Before touching a single resource, I want the granular data. The AWS console’s cost dashboard rounds, groups, and hides the things that matter. Two tools give you the truth.

Cost Explorer, with rightsizing recommendations turned on, is the fast view: group by service, then by usage type, and the bill stops being one big number and starts being a list of decisions. Resource-level and hourly granularity costs extra ($0.01 per 1,000 usage records per month), but for a single audit pass that adds up to pennies.

The Cost and Usage Report (CUR) is the ground truth — line-item, hourly, every charge AWS makes, delivered to your own S3 bucket. Generating it is free; you pay only the few cents of S3 storage. If you’re going to do this seriously, set up CUR (now delivered via AWS Data Exports) on day one. Everything below is a query against it.

The first thing I look at isn’t a resource. It’s the shape of the bill: what fraction is compute, what fraction is storage, what fraction is the line most people never read — data transfer.
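Reading the shape of the bill can be sketched in a few lines of Python. This is a minimal illustration, not a CUR parser: the sample rows are made up, the keys follow CUR column names as they appear when the report is queried, and the substring rules in `classify` are rough heuristics assumed for the example, so expect to tune them against the usage types in your own report.

```python
from collections import defaultdict

# Made-up sample rows. Keys follow CUR column names; values are illustrative only.
SAMPLE_ROWS = [
    {"line_item_usage_type": "BoxUsage:m5.2xlarge", "line_item_unblended_cost": 280.0},
    {"line_item_usage_type": "NatGateway-Bytes", "line_item_unblended_cost": 90.0},
    {"line_item_usage_type": "DataTransfer-Regional-Bytes", "line_item_unblended_cost": 30.0},
    {"line_item_usage_type": "TimedStorage-ByteHrs", "line_item_unblended_cost": 100.0},
]


def classify(usage_type):
    """Put a usage type into a coarse bucket. The substring rules are assumptions."""
    if "DataTransfer" in usage_type or "NatGateway-Bytes" in usage_type:
        return "data_transfer"
    if "TimedStorage" in usage_type or "EBS:" in usage_type:
        return "storage"
    if "BoxUsage" in usage_type:
        return "compute"
    return "other"


def bill_shape(rows):
    """Return each bucket's share of total unblended cost as a fraction of 1."""
    totals = defaultdict(float)
    for row in rows:
        bucket = classify(row["line_item_usage_type"])
        totals[bucket] += float(row["line_item_unblended_cost"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {bucket: cost / grand_total for bucket, cost in totals.items()}


shape = bill_shape(SAMPLE_ROWS)
for bucket, share in sorted(shape.items(), key=lambda kv: -kv[1]):
    print(f"{bucket:14s} {share:6.1%}")
```

On a real account the rows would come from the CUR files or a query result rather than a hard-coded list; the point is only that the first question is a group-by, not a resource hunt.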

First place I look: data transfer (the invisible 15%)

This is first because it’s the one nobody instruments, and on a networked workload it’s frequently the single most wasteful line. AWS charges for moving bytes, and the meter runs in places the console never surfaces.

NAT Gateway. This is the one I find money in most often. A NAT Gateway costs $0.045 per hour just to exist, plus $0.045 for every GB it processes (us-east-1). The hourly charge is trivial; the per-GB charge is where it hurts. If your private-subnet instances pull container images, packages, or — the classic — objects from S3 through the NAT Gateway, you are paying 4.5 cents a GB to route traffic that should be free. A VPC Gateway Endpoint for S3 and DynamoDB costs nothing and takes that traffic off the NAT path entirely. I check this on every account, and on data-heavy ones it’s often the biggest single line-item fix. (The full NAT teardown — both charges, every fix — is its own post: the NAT Gateway hidden tax.)
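The NAT arithmetic is easy to check by hand. The sketch below uses the two us-east-1 rates quoted above; the 730-hour month and the traffic volumes in the example are assumptions chosen for illustration, not measurements from any account.

```python
NAT_HOURLY_USD = 0.045   # per NAT Gateway hour (us-east-1)
NAT_PER_GB_USD = 0.045   # per GB processed (us-east-1)
HOURS_PER_MONTH = 730    # assumption: average month length


def nat_monthly_cost(gb_processed, hours=HOURS_PER_MONTH):
    """Monthly cost of one NAT Gateway: fixed hourly charge plus per-GB processing."""
    return NAT_HOURLY_USD * hours + NAT_PER_GB_USD * gb_processed


def gateway_endpoint_saving(gb_s3_and_dynamodb):
    """Processing charge avoided when S3/DynamoDB traffic moves to a free Gateway Endpoint."""
    return NAT_PER_GB_USD * gb_s3_and_dynamodb


# Hypothetical workload: 20 TB/month through the NAT, 15 TB of it bound for S3.
before = nat_monthly_cost(20_000)
after = nat_monthly_cost(20_000 - 15_000)
print(f"before: ${before:,.2f}  after: ${after:,.2f}  saved: ${before - after:,.2f}")
```

The hourly charge is the same in both cases; everything that moves is the per-GB term, which is why the endpoint fix scales with traffic.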

Cross-AZ traffic. AWS charges $0.01/GB in each direction for data crossing Availability Zones in the same region — $0.02 round trip. Spread a chatty app and its database across three AZs for “high availability” and you can pay a real tax on every query. Sometimes the HA is worth it. Often the chattiness is an accident of where things got scheduled, and pinning the hot path to one AZ (while keeping failover) cuts the line without cutting resilience.

Data transfer out to the internet. First 100 GB/month is free, aggregated across all services and regions; after that it’s $0.09/GB up to 10 TB. If you’re serving meaningful traffic straight off EC2 or an ALB, CloudFront in front often costs less per GB and offloads the origin — the egress math frequently pays for the CDN by itself.
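The internet egress line follows a simple tier rule that can be written down directly. This sketch covers only the band described above (100 GB free, then $0.09/GB up to 10 TB); treating 10 TB as 10,240 GB is an assumption, and volumes above that are rejected rather than guessed at because the higher tiers are not covered here.

```python
FREE_GB_PER_MONTH = 100
RATE_USD_PER_GB = 0.09
FIRST_TIER_CAP_GB = 10 * 1024  # assumption: 10 TB counted as 10,240 GB


def egress_monthly_cost(gb_out):
    """Monthly internet egress cost for volumes inside the first pricing tier."""
    if gb_out < 0:
        raise ValueError("gb_out must be non-negative")
    if gb_out > FIRST_TIER_CAP_GB:
        raise ValueError("above 10 TB/month the rate changes; not modelled in this sketch")
    billable_gb = max(0.0, gb_out - FREE_GB_PER_MONTH)
    return billable_gb * RATE_USD_PER_GB


# Hypothetical: 5 TB/month served straight off an ALB.
print(f"${egress_monthly_cost(5_000):,.2f} per month")
```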

The reason this is step one: data transfer is the only major cost that doesn’t show up as a resource you can point at. You have to read it out of the bill on purpose.

Second: rightsizing — the load test that never ended

Now the obvious one, done properly. Most over-provisioning isn’t malice; it’s an m5.2xlarge somebody picked for a launch-day spike that never came back, running at 8% CPU ever since.

AWS Compute Optimizer is free and does the heavy lifting. It analyses CloudWatch metrics and gives rightsizing recommendations for EC2, Auto Scaling groups, EBS volumes, Lambda, and ECS-on-Fargate (and, as of 2026, RDS and idle-resource recommendations for things like NAT Gateways too). Default lookback is 14 days; pay a small per-resource fee for Enhanced Infrastructure Metrics and it’ll look back ~93 days, which I’d do before resizing anything seasonal — you don’t want to shrink a box right before its busy month.

My discipline here: I trust the direction of the recommendation, not the exact target. Compute Optimizer is right that the box is too big; whether you drop one size or two depends on headroom you understand and it doesn’t. Rightsizing is also the one step you can get wrong — under-provision a latency-sensitive service and you’ve traded a cost problem for an outage. Move one size at a time, watch the metrics, repeat.
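The "one size at a time" rule can be expressed as a small helper. This is a sketch under a stated assumption: `SIZE_LADDER` is a generic ordering of EC2 size names written for this example, and not every instance family offers every rung, so the result still has to be checked against what the family actually sells.

```python
# Generic ordering of EC2 size names, smallest first. Assumption: written for
# illustration; individual families skip some of these rungs.
SIZE_LADDER = [
    "nano", "micro", "small", "medium", "large", "xlarge",
    "2xlarge", "4xlarge", "8xlarge", "12xlarge", "16xlarge", "24xlarge",
]


def one_size_down(instance_type):
    """Return the same family one rung smaller, or None if already at the bottom."""
    family, _, size = instance_type.partition(".")
    if size not in SIZE_LADDER:
        raise ValueError(f"unrecognised size in {instance_type!r}")
    index = SIZE_LADDER.index(size)
    if index == 0:
        return None
    return f"{family}.{SIZE_LADDER[index - 1]}"


print(one_size_down("m5.2xlarge"))
```

The helper deliberately moves a single step even when a recommendation suggests a bigger jump; the next step comes after the metrics from the first one are in.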

Two rightsizing moves that aren’t just “smaller”:

Move to gp3 EBS. gp2 is $0.10/GB-month; gp3 is $0.08/GB-month — 20% cheaper, by AWS’s own number — and gp3 includes 3,000 IOPS and 125 MB/s baseline free. For the overwhelming majority of volumes, gp3 is the correct default and gp2 is just an older, costlier setting nobody changed. io2 is for genuine high-IOPS workloads only.
Move to Graviton. AWS’s ARM64 chips run at materially lower cost than comparable x86: the current Graviton page cites up to 20% less cost (the older “up to 40% better price-performance” figure was the Graviton2-generation claim). If your stack is interpreted, runs on a portable runtime, or already cross-compiles cleanly (most Go, Java, Python, and Node services qualify), the migration is often a base-image change and a redeploy. On the observability platform I ran on EKS (Grafana LGTM, 6 TB/day for 15 departments and 200+ developers), moving the fleet to ARM Graviton2 under Karpenter took 40% off the compute line with no performance loss. The Java services ran cleanly on ARM64; the work was node-pool config and a redeploy, not a rewrite.
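For the gp3 move, the saving is a straight per-GB difference. The sketch below uses the two storage rates quoted above and a made-up list of volume sizes. It assumes every volume fits inside gp3's included 3,000 IOPS and 125 MB/s; volumes that need more would pay for provisioned IOPS or throughput on top, which is not modelled here.

```python
GP2_USD_PER_GB_MONTH = 0.10
GP3_USD_PER_GB_MONTH = 0.08


def gp3_monthly_saving(volume_sizes_gb):
    """Monthly storage saving from converting gp2 volumes to gp3 at baseline performance."""
    total_gb = sum(volume_sizes_gb)
    return total_gb * (GP2_USD_PER_GB_MONTH - GP3_USD_PER_GB_MONTH)


# Hypothetical fleet of gp2 volumes, sizes in GB.
volumes = [100, 500, 2_000]
saving = gp3_monthly_saving(volumes)
print(f"${saving:,.2f} per month on {sum(volumes):,} GB")
```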

Third: commitments — stop paying on-demand for a baseline that never moves

Rightsize first, commit second — never the other way round, or you’ll buy a commitment for capacity you’re about to delete.

Once the fleet is the right size, look at the baseline that runs 24/7. Paying on-demand for steady-state compute is leaving the largest predictable discount on the table.

Savings Plans are the current answer for compute. A Compute Savings Plan gives up to 66% off and — this is the point — applies across EC2, Fargate, and Lambda, across regions, instance families, and operating systems. You commit to a dollar-per-hour spend, not a specific instance, so it keeps applying as your fleet changes. An EC2 Instance Savings Plan goes deeper, up to 72%, in exchange for locking to an instance family in a region. Standard Reserved Instances also reach up to 72% but are rigid; for EC2, Savings Plans have largely superseded them.

The trap I see: people assume Savings Plans cover everything. Compute Savings Plans do not cover databases. For that, AWS launched a separate Database Savings Plans product (December 2025) covering RDS, Aurora, ElastiCache (Valkey), OpenSearch, DynamoDB and more — up to ~35%. Redshift remains Reserved-Node-only. So on a real bill you may need two or three commitment instruments, not one. Map them to the steady-state portion of each — not the peak, or you’ll over-commit and pay for unused commitment.

Rule of thumb I hold to: commit to the floor, pay on-demand (or Spot) for the spikes. Start with a 1-year, no-upfront Compute Savings Plan sized to your reliable baseline. You can always commit deeper once you trust the number. (Which instrument to buy — and why Savings Plans beat Reserved Instances for most teams — is the Reserved Instances vs Savings Plans decision guide.)
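"Commit to the floor" can be made concrete with a simplified model. In this sketch a Savings Plan commitment is a dollar-per-hour amount at discounted rates, and any hour whose usage costs less than the commitment wastes the difference. The 30% discount and the hourly series are hypothetical values picked for the example, and taking the minimum as the floor is one choice among several; a low percentile may be more robust if the history contains an outage hour.

```python
def baseline_floor(hourly_on_demand_usd):
    """The lowest hourly on-demand spend seen in the window."""
    return min(hourly_on_demand_usd)


def commitment_for(on_demand_usd_per_hour, discount):
    """Hourly commitment needed to cover a given on-demand spend at a given discount."""
    return on_demand_usd_per_hour * (1 - discount)


def unused_commitment(hourly_on_demand_usd, commitment_usd, discount):
    """Total commitment paid for but not used across the window."""
    wasted = 0.0
    for spend in hourly_on_demand_usd:
        cost_at_plan_rates = spend * (1 - discount)
        wasted += max(0.0, commitment_usd - cost_at_plan_rates)
    return wasted


# Hypothetical on-demand-equivalent compute spend per hour, with one spike.
hourly = [10.0, 12.0, 10.0, 30.0, 11.0]
DISCOUNT = 0.30  # hypothetical; the real rate depends on term and payment option

floor_commit = commitment_for(baseline_floor(hourly), DISCOUNT)
peak_commit = commitment_for(max(hourly), DISCOUNT)

print(f"floor commitment ${floor_commit:.2f}/h wastes ${unused_commitment(hourly, floor_commit, DISCOUNT):.2f}")
print(f"peak commitment  ${peak_commit:.2f}/h wastes ${unused_commitment(hourly, peak_commit, DISCOUNT):.2f}")
```

Sizing to the floor wastes nothing in this window; sizing to the spike pays for idle commitment in every other hour.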

Fourth: storage and the things nobody deleted

Storage is rarely the biggest line, but it’s the easiest free money, because most of it is waste nobody is defending.

S3 in the wrong class. S3 Standard is $0.023/GB-month, the most expensive of the classes discussed here, and most data sitting in it hasn’t been read in months. For unpredictable access, S3 Intelligent-Tiering moves objects between tiers automatically with no retrieval fees and no operational overhead (you pay $0.0025 per 1,000 objects/month for monitoring). For known-cold data, a lifecycle policy to Standard-IA ($0.0125), Glacier Instant Retrieval ($0.004), or Glacier Deep Archive (~$0.001) cuts the per-GB rate by anywhere from roughly 2x (Standard-IA) to more than 20x (Deep Archive). Standard at $0.023 vs Deep Archive at ~$0.001 is not a rounding difference; it’s a roughly 23x difference on the same bytes. (The whole S3 lever, covering classes, lifecycle, multipart cleanup, and retrieval fees, is the S3 cost optimization playbook.)
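The per-class ratios fall straight out of the listed prices. The sketch below compares storage cost only, using the us-east-1 rates from the sources list; the dictionary keys are labels chosen for this example, and retrieval fees, minimum storage durations, and lifecycle transition requests are deliberately left out, so treat the result as an upper bound on the saving.

```python
# USD per GB-month, storage only (us-east-1 rates as listed in the sources).
S3_USD_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER_IR": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}


def monthly_storage_cost(gb, storage_class):
    """Storage-only monthly cost for a given volume in a given class."""
    return gb * S3_USD_PER_GB_MONTH[storage_class]


def ratio_vs_standard(storage_class):
    """How many times cheaper per GB a class is than S3 Standard."""
    return S3_USD_PER_GB_MONTH["STANDARD"] / S3_USD_PER_GB_MONTH[storage_class]


# Hypothetical: 50 TB of data that has not been read in months.
cold_gb = 50_000
for storage_class in S3_USD_PER_GB_MONTH:
    cost = monthly_storage_cost(cold_gb, storage_class)
    print(f"{storage_class:13s} ${cost:>9,.2f}/month  ({ratio_vs_standard(storage_class):.1f}x cheaper than Standard)")
```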

The graveyard. Three things bill silently:

Unattached EBS volumes keep billing per GB-month after the instance they served is long gone.
Old EBS snapshots — $0.05/GB-month each — accumulate forever unless something deletes them.
Public IPv4 addresses. Since 1 February 2024, AWS charges $0.005/hour for every public IPv4 address, attached or not — about $3.60/month each. On an account with a sprawl of idle Elastic IPs and load balancers, this adds up to a line worth reading.
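A quick way to size the graveyard is to multiply each of the three items by its rate. The sketch below uses the snapshot and IPv4 rates quoted above and a 720-hour month to match the roughly $3.60 per address figure. The unattached-volume rate defaults to the gp3 price as an assumption (the real rate depends on volume type), and the inventory numbers are invented for the example.

```python
SNAPSHOT_USD_PER_GB_MONTH = 0.05
PUBLIC_IPV4_USD_PER_HOUR = 0.005
HOURS_PER_MONTH = 720               # 30-day month, matching the ~$3.60/address figure
DEFAULT_VOLUME_USD_PER_GB = 0.08    # assumption: unattached volumes priced as gp3


def graveyard_monthly_cost(unattached_gb, snapshot_gb, public_ipv4_count,
                           volume_usd_per_gb=DEFAULT_VOLUME_USD_PER_GB):
    """Break down the monthly cost of forgotten volumes, snapshots, and public IPv4 addresses."""
    return {
        "unattached_volumes": unattached_gb * volume_usd_per_gb,
        "snapshots": snapshot_gb * SNAPSHOT_USD_PER_GB_MONTH,
        "public_ipv4": public_ipv4_count * PUBLIC_IPV4_USD_PER_HOUR * HOURS_PER_MONTH,
    }


# Hypothetical inventory: 2 TB unattached, 5 TB of snapshots, 40 public addresses.
breakdown = graveyard_monthly_cost(2_000, 5_000, 40)
total = sum(breakdown.values())
for item, cost in breakdown.items():
    print(f"{item:19s} ${cost:,.2f}")
print(f"{'total':19s} ${total:,.2f}")
```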

AWS Trusted Advisor flags most of this — idle load balancers, underutilised EBS volumes, unassociated Elastic IPs — but the full cost-optimization check set requires a paid Support plan (Business Support and above). If you’re on Basic, Compute Optimizer plus a CUR query gets you most of the same answers for free.

The order is the recipe

The sequence matters more than any single fix:

Get the real bill — CUR + Cost Explorer, read the shape before touching anything.
Data transfer first — it’s invisible and often the biggest single waste (NAT Gateway, cross-AZ, egress).
Rightsize — Compute Optimizer for direction, gp3 and Graviton for structural wins, one size at a time.
Commit — Savings Plans on the rightsized baseline, never before.
Storage cleanup — S3 classes, then delete the graveyard.

Run it top to bottom and the early steps make the later ones cheaper: you don’t want to buy a 3-year commitment on an instance you’re about to delete, or rightsize a fleet before you’ve stopped it routing traffic that should be free through a paid NAT Gateway.

I’ve run this enough times that the pattern is boringly consistent. On the cost work I take on, the range I quote is 20-40% with no performance loss — and on a mid-size bill the first pass usually lands in that band without touching the architecture. The exact number is yours to discover; the order is the same on every account.

If you’d rather not run it yourself: this audit is my AWS cost-optimization offer. I read your bill, run this sequence against your CUR, and hand you a prioritised list — biggest, safest wins first, with the numbers attached. No architecture rewrite, no lock-in, no commitment to me beyond the audit. Reach me through rajesh.medampudi.com/work-with-me — a short email about what you’re dealing with is the best starting point.

Sources

(all checked 2026-06-18, us-east-1 unless noted)

https://aws.amazon.com/vpc/pricing/ — NAT Gateway $0.045/hr + $0.045/GB; VPC Gateway Endpoint for S3/DynamoDB is free
https://docs.aws.amazon.com/vpc/latest/userguide/nat-gateway-pricing.html — NAT Gateway pricing detail
https://aws.amazon.com/ec2/pricing/on-demand/ — DTO first 100GB/mo free then $0.09/GB to 10TB; cross-AZ $0.01/GB each direction
https://aws.amazon.com/ebs/pricing/ — gp2 $0.10, gp3 $0.08 (GB-mo); snapshots $0.05/GB-mo; unattached volumes still bill
https://aws.amazon.com/ebs/general-purpose/ — gp3 “20% less expensive than gp2”, 3000 IOPS + 125 MB/s baseline included
https://docs.aws.amazon.com/ebs/latest/userguide/ebs-volume-types.html — gp3 as default, io2 for high-IOPS
https://aws.amazon.com/s3/pricing/ — S3 Standard $0.023, Standard-IA $0.0125, Glacier Instant $0.004, Flexible $0.0036, Deep Archive ~$0.00099 (GB-mo); Intelligent-Tiering monitoring $0.0025/1000 objects
https://aws.amazon.com/s3/storage-classes/intelligent-tiering/ — Intelligent-Tiering auto-tiers, no retrieval fees; lifecycle transition request costs
https://aws.amazon.com/savingsplans/compute-pricing/ — Compute SP up to 66%, EC2 Instance SP up to 72%; covers EC2/Fargate/Lambda across regions/families
https://aws.amazon.com/savingsplans/database-pricing/ — Database Savings Plans (Dec 2025) cover RDS/Aurora/ElastiCache-Valkey/OpenSearch/DynamoDB up to ~35%
https://aws.amazon.com/blogs/aws/introducing-database-savings-plans-for-aws-databases/ — Database Savings Plans launch (2025-12-02); Redshift remains Reserved-Node-only
https://aws.amazon.com/ec2/graviton/ — current page: “up to 20% less cost” vs comparable x86; ARM64/Neoverse (40% figure was Graviton2-gen)
https://aws.amazon.com/compute-optimizer/pricing/ — free; covers EC2/ASG/EBS/Lambda/ECS-Fargate (+RDS, idle-resource 2026); 14-day default, ~93-day with paid Enhanced Infrastructure Metrics
https://aws.amazon.com/aws-cost-management/aws-cost-explorer/pricing/ — rightsizing recs; hourly/resource granularity $0.01 per 1,000 usage records/mo
https://docs.aws.amazon.com/cur/latest/userguide/what-is-cur.html — CUR is most granular billing data, free to generate (pay only S3 storage), delivered via Data Exports
https://aws.amazon.com/blogs/aws/new-aws-public-ipv4-address-charge-public-ip-insights/ — $0.005/hr per public IPv4 since 2024-02-01, attached or not
https://docs.aws.amazon.com/awssupport/latest/user/cost-optimization-checks.html — Trusted Advisor cost checks (idle LBs, underutilised EBS, unassociated EIPs)
https://aws.amazon.com/premiumsupport/technology/trusted-advisor/ — full cost-optimization checks require paid Support plan (Business+)