Lambda Cost Optimisation: Beyond Cutting Memory
Lambda is cheap until it isn't. Here's the full set of levers — memory tuning, ARM, packaging, provisioned concurrency — that take a runaway serverless bill back to sensible.
By Andrii Votiakov
Lambda starts cheap and stays cheap for most workloads. But on busy systems — APIs at hundreds of requests per second, batch processors, ETL fan-out — it can quickly become a four- or five-figure monthly line. The good news: it's also one of the easier services to optimise, because everything is exposed as a knob.
Quick answer
Lambda bills per request ($0.20 per million) and per GB-second of execution time. Cutting cost is mostly about cutting GB-seconds: lower memory where you can, raise it where it actually makes things faster, ship on ARM (Graviton2), strip cold-start fat, and use Provisioned Concurrency only where latency justifies it.
The five levers, in order of impact
1. Memory tuning (often counter-intuitive)
Lambda allocates CPU proportional to memory. Raising memory often makes a CPU-bound function run faster, which can make it cheaper. The only way to know is to measure.
Use AWS Lambda Power Tuning (open-source state machine, runs in your account). For each function:
- Runs the function at a configurable list of memory sizes (e.g. 128, 256, 512, 1024, 1769, 3008 MB; 1769 MB is where a function gets one full vCPU)
- Records duration and cost
- Recommends optimal setting
Typical findings:
- I/O-bound functions (DynamoDB calls, HTTP fetches): smallest memory wins
- CPU-bound (image processing, heavy JSON parsing): more memory is cheaper because duration drops faster than the per-millisecond price rises
- Most "default 512 MB" handlers are over-provisioned by 2-4x
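The trade-off is plain arithmetic once you have the numbers. A minimal sketch in Python (the duration rate is the x86 price at the time of writing, and the example timings are illustrative assumptions; Power Tuning gives you real ones):

```python
# Back-of-envelope check of the memory/duration trade-off. The rate below is
# the x86 Lambda duration price at the time of writing ($0.0000166667 per
# GB-second); verify against current AWS pricing before relying on it.
PRICE_PER_GB_S = 0.0000166667

def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Duration cost of a single invocation (the flat $0.20/M request fee
    is the same either way, so it is excluded)."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_S

# CPU-bound handler: CPU scales with memory, so doubling memory roughly
# halves the duration. Timings here are illustrative, not measured.
cost_512 = invocation_cost(512, 800)     # 512 MB at 800 ms
cost_1024 = invocation_cost(1024, 380)   # 1024 MB at 380 ms: faster AND cheaper
print(f"512 MB: ${cost_512:.8f}  1024 MB: ${cost_1024:.8f}")
```

For an I/O-bound handler the duration barely moves when memory doubles, so the same formula shows the smaller setting winning.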
2. Switch architecture to ARM (Graviton2)
One-line change in CloudFormation/CDK/Terraform: the CloudFormation property is Architectures: [arm64] (Terraform: architectures = ["arm64"]). It saves roughly 20% on every GB-second and works for Node.js, Python, Java, Go, and .NET out of the box. Some native dependencies need an ARM rebuild, so check before flipping image-processing or ML inference functions. If you're also moving EC2 and RDS to ARM, the Graviton migration guide has the full sequencing. And once you've right-sized Lambda alongside your broader serverless baseline, lock the floor in with Compute Savings Plans.
If you're using Node 20+, Python 3.10+, or Go 1.20+, you can flip to ARM today on most workloads.
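A minimal CloudFormation sketch (function name, bucket, and role are hypothetical; note the property is Architectures, a list):

```yaml
MyFunction:
  Type: AWS::Lambda::Function
  Properties:
    FunctionName: my-function      # hypothetical name
    Runtime: python3.12
    Handler: app.handler
    Architectures:
      - arm64                      # was x86_64; ~20% cheaper per GB-second
    Code:
      S3Bucket: my-deploy-bucket   # hypothetical bucket/key
      S3Key: app.zip
    Role: !GetAtt MyFunctionRole.Arn
```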
3. Cold-start fat
Cold starts are billed (init duration counts). And they make P99 latency worse, which means you'll be tempted to buy Provisioned Concurrency, which is expensive.
Cut cold-start fat:
- Strip dependencies. Bundle only what you use: esbuild with tree-shaking for Node; a minimal, pinned requirements file for Python (only what the handler actually imports).
- Layer carefully. Layers are nice for shared code but can balloon size and slow init.
- Use SnapStart (Java 11+, Python 3.12+, .NET 8+). Resumes from a pre-initialised snapshot and can cut cold starts by up to 10x. It's free for Java; Python and .NET incur cache and restore charges.
- Lazy-init expensive stuff. Don't open a database pool in the global init if only 30% of invocations need it.
4. Provisioned Concurrency only where it pays
PC pre-warms instances. You pay even when idle. Worth it when:
- Latency-sensitive (sub-100ms P99 required)
- Predictable peak traffic (set up a scheduled scale-up)
- Cold starts visibly hurt user experience
Not worth it for:
- Batch jobs
- Background processors with relaxed latency
- Anything called fewer than 100 times an hour
If you do need PC, scheduled scaling is way cheaper than always-on. Scale up before peak, down after.
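Rough numbers make the case. A sketch assuming the published x86 Provisioned Concurrency rate at the time of writing (verify against current pricing) and an illustrative 8-hour peak window:

```python
# Always-on vs scheduled Provisioned Concurrency, roughly. The rate is the
# x86 provisioned-concurrency standing charge at the time of writing
# ($0.0000041667 per GB-second); check current AWS pricing.
PC_PRICE_PER_GB_S = 0.0000041667

def pc_monthly_cost(instances: int, memory_mb: int, hours_per_day: float) -> float:
    """Standing charge for keeping N instances warm (duration billed on
    top when they actually run)."""
    gb = memory_mb / 1024
    seconds_per_month = hours_per_day * 3600 * 30
    return instances * gb * seconds_per_month * PC_PRICE_PER_GB_S

always_on = pc_monthly_cost(10, 1024, 24)  # warm 24/7
peak_only = pc_monthly_cost(10, 1024, 8)   # warm only during the 8-hour peak
print(f"always-on: ${always_on:.0f}/mo  scheduled: ${peak_only:.0f}/mo")
```

An 8-hour schedule cuts the standing charge to a third, before you even question whether 10 warm instances is the right number.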
5. Right-sized timeouts
Timeouts are sometimes set to the 15-minute maximum as a "safety" buffer, and that buffer becomes a bill bomb: a function with a bug looping until timeout costs you 15 full minutes of billed duration per invocation.
Set timeouts close to actual P99 + headroom. Use CloudWatch logs to find the actual P99 — usually well under 1 second for API-style handlers.
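One way to pick that number, sketched in Python (the nearest-rank percentile and the 2x headroom factor are my assumptions, not an AWS recommendation):

```python
import math

# Derive a timeout from observed behaviour: P99 of recent durations (in ms,
# e.g. scraped from CloudWatch REPORT lines) plus a headroom factor.
def p99(durations_ms: list) -> float:
    ordered = sorted(durations_ms)
    idx = max(0, round(0.99 * len(ordered)) - 1)  # nearest-rank percentile
    return ordered[idx]

def suggested_timeout_s(durations_ms: list, headroom: float = 2.0) -> int:
    """P99 times headroom, rounded up to whole seconds (Lambda timeouts
    are configured in seconds)."""
    return max(1, math.ceil(p99(durations_ms) * headroom / 1000))

sample = [120, 95, 180, 110, 3400, 130, 105, 98, 150, 140]
print(suggested_timeout_s(sample))  # a handful of seconds, not 900
```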
Less-obvious savings
Drop CloudWatch DEBUG logs in production
A Node Lambda calling console.debug inside a hot path can ship 50 KB per invocation to CloudWatch Logs. At 1M invocations/day that's 50 GB/day = $25/day = $750/month in Logs ingestion. From one Lambda. Strip debug logs in prod or use a pattern like if (process.env.LOG_LEVEL === 'debug').
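The Python equivalent of that gate, using the stdlib logging module (a sketch; LOG_LEVEL is an environment-variable convention here, not anything Lambda-specific):

```python
import logging
import os

# Gate verbosity on an environment variable so production defaults to INFO
# and debug payloads never reach CloudWatch Logs ingestion.
logger = logging.getLogger("handler")
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO").upper())

def handler(event, context):
    # Skipped entirely unless LOG_LEVEL=DEBUG. Using %-style arguments
    # (not an f-string) means the payload isn't even formatted when the
    # level filters it out.
    logger.debug("full event payload: %s", event)
    logger.info("processed request")
    return {"ok": True}
```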
Avoid Lambda for steady-state high-throughput workloads
Lambda is brilliant for spiky, event-driven workloads. For a service sustaining 50+ RPS around the clock, ECS Fargate or EKS often comes out cheaper at equivalent reliability. A detailed comparison of where each wins is in the Cloud Run vs Lambda vs Functions post. Rough rule: above ~100M invocations/month or sustained 50% concurrency, model both.
Tighten event-source batch sizes
For SQS, Kinesis, and DynamoDB Streams triggers, bigger BatchSize and MaximumBatchingWindowInSeconds values mean fewer invocations. An SQS-triggered Lambda processing one message at a time does the same work at 10x the invocation cost of batches of 10.
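The request-fee side of that is easy to model (a sketch; duration cost stays roughly flat, since the same messages get processed either way, so the per-invocation fee and overhead are what shrink):

```python
# Request-fee math for batching an SQS-triggered function.
# $0.20 per million requests is the standard Lambda rate.
REQUEST_PRICE = 0.20 / 1_000_000

def monthly_request_cost(messages: int, batch_size: int) -> float:
    """Request fees for processing a month of messages at a given batch size."""
    invocations = -(-messages // batch_size)  # ceiling division
    return invocations * REQUEST_PRICE

one_by_one = monthly_request_cost(300_000_000, 1)
batch_10 = monthly_request_cost(300_000_000, 10)
print(f"batch=1: ${one_by_one:.2f}/mo  batch=10: ${batch_10:.2f}/mo")
```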
Watch out for fan-out chains
A Lambda that triggers another Lambda that triggers another. Every level adds invocations and end-to-end latency. Step Functions or batched processing in a single function is usually cheaper.
Concurrency limits
Set reserved concurrency on functions that could run away. A poison-pill SQS message looping a function at 1,000 concurrent invocations with a 15-minute timeout will surprise you on the monthly bill.
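A rough sense of the blast radius (a sketch; the duration rate is the x86 price at the time of writing, and the retry behaviour depends on your queue's redrive policy):

```python
# Worst-case duration cost when a function has no concurrency cap:
# N instances all running to timeout in one retry wave.
PRICE_PER_GB_S = 0.0000166667  # x86 duration rate; verify current pricing

def runaway_wave_cost(concurrency: int, memory_mb: int, timeout_s: int) -> float:
    gb_seconds = concurrency * (memory_mb / 1024) * timeout_s
    return gb_seconds * PRICE_PER_GB_S

# 1,000 instances x 1 GB x 15-minute timeout. SQS redrives the messages,
# so waves repeat until maxReceiveCount sends them to a dead-letter queue
# (or nothing does, and the loop runs all weekend).
print(f"${runaway_wave_cost(1000, 1024, 900):.2f} per wave")
```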
Common Lambda bill bombs I've seen
- A scheduled function meant to run nightly instead fired every second because the EventBridge rule was misconfigured. Burned $3,200 in a week.
- Default 1024 MB on every function in a 200-function account. Power Tuning across the lot found 70% could drop to 256 MB. $2,800/month saved.
- DEBUG logging on a high-volume API Gateway-fronted Lambda. $1,100/month in Logs alone.
- Provisioned Concurrency = 50, on a function that peaked at 8 concurrent. $700/month for nothing.
Realistic numbers
Recent serverless-heavy client (~$6,400/month across Lambda and CloudWatch Logs). Savings found:
- Power Tuning across 47 functions: $1,400/month
- ARM switch on 31 of them: $700/month
- DEBUG log strip on 4 hot paths: $900/month
- Reduced Provisioned Concurrency on 3 functions: $500/month
- Tightened SQS batch sizes: $300/month
Final: $2,600/month, ~59% reduction.
If your Lambda bill is bigger than you think it should be, book a call. Most runaway serverless bills can be diagnosed in an hour.