Lambda Cost Optimisation: Beyond Cutting Memory
Lambda is cheap until it isn't. Here's the full set of levers — memory tuning, ARM, packaging, provisioned concurrency — that take a runaway serverless bill back to sensible.
By Andrii Votiakov
Lambda starts cheap and stays cheap for most workloads. But on busy systems — APIs at hundreds of requests per second, batch processors, ETL fan-out — it can quickly become a four- or five-figure monthly line. The good news: it's also one of the easier services to optimise, because everything is exposed as a knob.
Quick answer
Lambda bills per request ($0.20 per million) and per GB-second of execution time. Cutting cost is mostly about cutting GB-seconds: lower memory where you can, raise it where it actually makes things faster, ship on ARM (Graviton2), strip cold-start fat, and use Provisioned Concurrency only where latency justifies it.
The five levers, in order of impact
1. Memory tuning (often counter-intuitive)
Lambda allocates CPU proportional to memory. Raising memory often makes a CPU-bound function run faster, which can make it cheaper. The only way to know is to measure.
Use AWS Lambda Power Tuning (open-source state machine, runs in your account). For each function:
- Runs the function at a configurable list of memory sizes (e.g. 128, 256, 512, 1024, 1769, 3008 MB; 1769 MB is where a function gets one full vCPU)
- Records duration and cost
- Recommends optimal setting
Typical findings:
- I/O-bound functions (DynamoDB calls, HTTP fetches): smallest memory wins
- CPU-bound (image processing, heavy JSON parsing): more memory is cheaper because duration drops faster than the per-millisecond price rises
- Most "default 512 MB" handlers are over-provisioned by 2-4x
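The trade-off is plain arithmetic once you have the numbers. A minimal sketch in Python (the duration rate is the x86 price at the time of writing, and the example timings are illustrative assumptions; Power Tuning gives you real ones):

```python
# Back-of-envelope check of the memory/duration trade-off. The rate below is
# the x86 Lambda duration price at the time of writing ($0.0000166667 per
# GB-second); verify against current AWS pricing before relying on it.
PRICE_PER_GB_S = 0.0000166667

def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Duration cost of a single invocation (the flat $0.20/M request fee
    is the same either way, so it is excluded)."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_S

# CPU-bound handler: CPU scales with memory, so doubling memory roughly
# halves the duration. Timings here are illustrative, not measured.
cost_512 = invocation_cost(512, 800)     # 512 MB at 800 ms
cost_1024 = invocation_cost(1024, 380)   # 1024 MB at 380 ms: faster AND cheaper
print(f"512 MB: ${cost_512:.8f}  1024 MB: ${cost_1024:.8f}")
```

For an I/O-bound handler the duration barely moves when memory doubles, so the same formula shows the smaller setting winning.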
2. Switch architecture to ARM (Graviton2)
One-line change in CloudFormation/CDK/Terraform: the CloudFormation property is Architectures: [arm64] (Terraform: architectures = ["arm64"]). It saves roughly 20% on every GB-second and works for Node.js, Python, Java, Go, and .NET out of the box. Some native dependencies need an ARM rebuild, so check before flipping image-processing or ML inference functions. If you're also moving EC2 and RDS to ARM, the Graviton migration guide has the full sequencing. And once you've right-sized Lambda alongside your broader serverless baseline, lock the floor in with Compute Savings Plans.
If you're using Node 20+, Python 3.10+, or Go 1.20+, you can flip to ARM today on most workloads.
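A minimal CloudFormation sketch (function name, bucket, and role are hypothetical; note the property is Architectures, a list):

```yaml
MyFunction:
  Type: AWS::Lambda::Function
  Properties:
    FunctionName: my-function      # hypothetical name
    Runtime: python3.12
    Handler: app.handler
    Architectures:
      - arm64                      # was x86_64; ~20% cheaper per GB-second
    Code:
      S3Bucket: my-deploy-bucket   # hypothetical bucket/key
      S3Key: app.zip
    Role: !GetAtt MyFunctionRole.Arn
```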
3. Cold-start fat
Cold starts are billed (init duration counts). And they make P99 latency worse, which means you'll be tempted to buy Provisioned Concurrency, which is expensive.
Cut cold-start fat:
- Strip dependencies. Bundle only what you use: esbuild with tree-shaking for Node; a minimal, pinned requirements file for Python (only what the handler actually imports).
- Layer carefully. Layers are nice for shared code but can balloon size and slow init.
- Use SnapStart (Java 11+, Python 3.12+, .NET 8+). Resumes from a pre-initialised snapshot and can cut cold starts by up to 10x. It's free for Java; Python and .NET incur cache and restore charges.
- Lazy-init expensive stuff. Don't open a database pool in the global init if only 30% of invocations need it.
4. Provisioned Concurrency only where it pays
PC pre-warms instances. You pay even when idle. Worth it when:
- Latency-sensitive (sub-100ms P99 required)
- Predictable peak traffic (set up a scheduled scale-up)
- Cold starts visibly hurt user experience
Not worth it for:
- Batch jobs
- Background processors with relaxed latency
- Anything called fewer than 100 times an hour
If you do need PC, scheduled scaling is way cheaper than always-on. Scale up before peak, down after.
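Rough numbers make the case. A sketch assuming the published x86 Provisioned Concurrency rate at the time of writing (verify against current pricing) and an illustrative 8-hour peak window:

```python
# Always-on vs scheduled Provisioned Concurrency, roughly. The rate is the
# x86 provisioned-concurrency standing charge at the time of writing
# ($0.0000041667 per GB-second); check current AWS pricing.
PC_PRICE_PER_GB_S = 0.0000041667

def pc_monthly_cost(instances: int, memory_mb: int, hours_per_day: float) -> float:
    """Standing charge for keeping N instances warm (duration billed on
    top when they actually run)."""
    gb = memory_mb / 1024
    seconds_per_month = hours_per_day * 3600 * 30
    return instances * gb * seconds_per_month * PC_PRICE_PER_GB_S

always_on = pc_monthly_cost(10, 1024, 24)  # warm 24/7
peak_only = pc_monthly_cost(10, 1024, 8)   # warm only during the 8-hour peak
print(f"always-on: ${always_on:.0f}/mo  scheduled: ${peak_only:.0f}/mo")
```

An 8-hour schedule cuts the standing charge to a third, before you even question whether 10 warm instances is the right number.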
5. Right-sized timeouts
Timeouts are sometimes set to the 15-minute maximum as a "safety" buffer, and that buffer becomes a bill bomb: a function with a bug looping until timeout costs you 15 full minutes of billed duration per invocation.
Set timeouts close to actual P99 + headroom. Use CloudWatch logs to find the actual P99 — usually well under 1 second for API-style handlers.
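One way to pick that number, sketched in Python (the nearest-rank percentile and the 2x headroom factor are my assumptions, not an AWS recommendation):

```python
import math

# Derive a timeout from observed behaviour: P99 of recent durations (in ms,
# e.g. scraped from CloudWatch REPORT lines) plus a headroom factor.
def p99(durations_ms: list) -> float:
    ordered = sorted(durations_ms)
    idx = max(0, round(0.99 * len(ordered)) - 1)  # nearest-rank percentile
    return ordered[idx]

def suggested_timeout_s(durations_ms: list, headroom: float = 2.0) -> int:
    """P99 times headroom, rounded up to whole seconds (Lambda timeouts
    are configured in seconds)."""
    return max(1, math.ceil(p99(durations_ms) * headroom / 1000))

sample = [120, 95, 180, 110, 3400, 130, 105, 98, 150, 140]
print(suggested_timeout_s(sample))  # a handful of seconds, not 900
```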
Less-obvious savings
Drop CloudWatch DEBUG logs in production
A Node Lambda calling console.debug inside a hot path can ship 50 KB per invocation to CloudWatch Logs. At 1M invocations/day that's 50 GB/day = $25/day = $750/month in Logs ingestion. From one Lambda. Strip debug logs in prod or use a pattern like if (process.env.LOG_LEVEL === 'debug').
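The Python equivalent of that gate, using the stdlib logging module (a sketch; LOG_LEVEL is an environment-variable convention here, not anything Lambda-specific):

```python
import logging
import os

# Gate verbosity on an environment variable so production defaults to INFO
# and debug payloads never reach CloudWatch Logs ingestion.
logger = logging.getLogger("handler")
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO").upper())

def handler(event, context):
    # Skipped entirely unless LOG_LEVEL=DEBUG. Using %-style arguments
    # (not an f-string) means the payload isn't even formatted when the
    # level filters it out.
    logger.debug("full event payload: %s", event)
    logger.info("processed request")
    return {"ok": True}
```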
Avoid Lambda for steady-state high-throughput workloads
Lambda is brilliant for spiky, event-driven workloads. For a service sustaining 50+ RPS around the clock, ECS Fargate or EKS often comes out cheaper at equivalent reliability. A detailed comparison of where each wins is in the Cloud Run vs Lambda vs Functions post. Rough rule: above ~100M invocations/month or sustained 50% concurrency, model both.
Tighten event-source batch sizes
For SQS, Kinesis, and DynamoDB Streams triggers, bigger BatchSize and MaximumBatchingWindowInSeconds values mean fewer invocations. An SQS-triggered Lambda processing one message at a time does the same work at 10x the invocation cost of batches of 10.
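The request-fee side of that is easy to model (a sketch; duration cost stays roughly flat, since the same messages get processed either way, so the per-invocation fee and overhead are what shrink):

```python
# Request-fee math for batching an SQS-triggered function.
# $0.20 per million requests is the standard Lambda rate.
REQUEST_PRICE = 0.20 / 1_000_000

def monthly_request_cost(messages: int, batch_size: int) -> float:
    """Request fees for processing a month of messages at a given batch size."""
    invocations = -(-messages // batch_size)  # ceiling division
    return invocations * REQUEST_PRICE

one_by_one = monthly_request_cost(300_000_000, 1)
batch_10 = monthly_request_cost(300_000_000, 10)
print(f"batch=1: ${one_by_one:.2f}/mo  batch=10: ${batch_10:.2f}/mo")
```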
Watch out for fan-out chains
A Lambda that triggers another Lambda that triggers another. Every level adds invocations and end-to-end latency. Step Functions or batched processing in a single function is usually cheaper.
Concurrency limits
Set reserved concurrency on functions that could run away. A poison-pill SQS message looping a function at 1,000 concurrent invocations with a 15-minute timeout will surprise you on the monthly bill.
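A rough sense of the blast radius (a sketch; the duration rate is the x86 price at the time of writing, and the retry behaviour depends on your queue's redrive policy):

```python
# Worst-case duration cost when a function has no concurrency cap:
# N instances all running to timeout in one retry wave.
PRICE_PER_GB_S = 0.0000166667  # x86 duration rate; verify current pricing

def runaway_wave_cost(concurrency: int, memory_mb: int, timeout_s: int) -> float:
    gb_seconds = concurrency * (memory_mb / 1024) * timeout_s
    return gb_seconds * PRICE_PER_GB_S

# 1,000 instances x 1 GB x 15-minute timeout. SQS redrives the messages,
# so waves repeat until maxReceiveCount sends them to a dead-letter queue
# (or nothing does, and the loop runs all weekend).
print(f"${runaway_wave_cost(1000, 1024, 900):.2f} per wave")
```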
Common Lambda bill bombs I've seen
- A scheduled function meant to run nightly instead fired every second because the EventBridge rule was misconfigured. Burned $3,200 in a week.
- Default 1024 MB on every function in a 200-function account. Power Tuning across the lot found 70% could drop to 256 MB. $2,800/month saved.
- DEBUG logging on a high-volume API Gateway-fronted Lambda. $1,100/month in Logs alone.
- Provisioned Concurrency = 50, on a function that peaked at 8 concurrent. $700/month for nothing.
Realistic numbers
Recent serverless-heavy client (~$6,400/month across Lambda and CloudWatch Logs). Savings found:
- Power Tuning across 47 functions: $1,400/month
- ARM switch on 31 of them: $700/month
- DEBUG log strip on 4 hot paths: $900/month
- Reduced Provisioned Concurrency on 3 functions: $500/month
- Tightened SQS batch sizes: $300/month
Final: $2,600/month, ~59% reduction.
If your Lambda bill is bigger than you think it should be, book a call. Most runaway serverless bills can be diagnosed in an hour.