1.1 AWS Lambda
Key Takeaways
- Lambda's maximum function timeout is 900 seconds (15 minutes); ephemeral /tmp storage is configurable from 512 MB up to 10,240 MB (10 GB), and memory ranges 128 MB to 10,240 MB.
- Synchronous (RequestResponse) payloads are capped at 6 MB each for the request and the response; asynchronous (Event) payloads are capped at 256 KB.
- Reserved concurrency caps and isolates a function's concurrent executions; provisioned concurrency pre-warms execution environments to remove cold-start latency.
- A version is an immutable snapshot of code plus configuration; an alias is a movable pointer to a version that supports weighted (canary) traffic shifting.
- Asynchronous invocations are retried twice by default; exhausted events go to a dead-letter queue (DLQ) or an on-failure destination instead of being silently dropped.
Why Lambda Dominates the Exam
AWS Lambda is serverless compute that runs your code in response to events, billed per request and per GB-second of compute (duration is rounded up to the nearest 1 ms). It is the single most-tested service on the AWS Certified Developer - Associate (DVA-C02) exam, which carries 65 questions in 130 minutes with a scaled passing score of 720 out of 1,000. Development with AWS Services is 32% of the scored content, and Lambda anchors it. Most items are scenario-based: which limit did you hit, how does concurrency behave under a traffic spike, or which invocation model retries on your behalf.
Handler, Context, and the Execution Environment
Every function has a handler — the entry point Lambda calls once per event. The handler receives an event object (the payload) and a context object holding invocation metadata such as aws_request_id, function_name, and get_remaining_time_in_millis(). Code placed outside the handler (the initialization phase) runs once per execution environment, so create SDK clients, open database connections, and load config there to reuse them across warm invocations. A classic trap: opening a new database connection inside the handler exhausts connection pools under load.
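The init-versus-handler split can be sketched as follows, using only the standard library so it runs locally. `create_client` is a hypothetical stand-in for an expensive setup call such as constructing an SDK client or opening a database connection, and `FakeContext` is an illustrative substitute for the context object that Lambda passes in; neither is part of any AWS library.

```python
import time

INIT_COUNT = 0  # counts how many times the expensive setup runs


def create_client():
    """Hypothetical stand-in for expensive setup (SDK client, DB connection)."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"created_at": time.time()}


# Initialization phase: runs once per execution environment, not once per event.
CLIENT = create_client()


def handler(event, context):
    # Reuse the module-level client on every warm invocation.
    return {
        "request_id": context.aws_request_id,
        "remaining_ms": context.get_remaining_time_in_millis(),
        "client_created_at": CLIENT["created_at"],
        "echo": event,
    }


class FakeContext:
    """Illustrative local substitute for the Lambda context object."""

    function_name = "demo-function"

    def __init__(self, request_id, timeout_ms=3000):
        self.aws_request_id = request_id
        self._deadline = time.monotonic() + timeout_ms / 1000

    def get_remaining_time_in_millis(self):
        return max(0, int((self._deadline - time.monotonic()) * 1000))
```

Calling `handler` twice in the same process leaves `INIT_COUNT` at 1, which is the behavior you rely on across warm invocations. Moving `create_client()` inside `handler` would run the setup on every event.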
The Three Invocation Models
| Model | Caller waits? | Retries | Payload limit | Example source |
|---|---|---|---|---|
| Synchronous (RequestResponse) | Yes | Caller's job | 6 MB each (request, response) | API Gateway, ALB, SDK Invoke |
| Asynchronous (Event) | No | 2 retries (default) | 256 KB | S3, SNS, EventBridge |
| Poll-based (event source mapping) | N/A | Per source | Batch-based | SQS, Kinesis, DynamoDB Streams |
For poll-based sources, Lambda itself polls and forms batches, so you do not write the polling loop. With SQS, set the function's reserved concurrency and the batch size carefully. With Kinesis and DynamoDB Streams, ordering is preserved per shard, and a poison record can block the shard unless you configure BisectBatchOnFunctionError, a retry limit, or a failure destination.
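As a rough illustration of the stream settings mentioned above, the helper below builds the keyword arguments for Lambda's CreateEventSourceMapping API. The helper name `stream_mapping_params`, its defaults, and any ARNs you pass in are assumptions made for this sketch; the dictionary keys are the API's parameter names.

```python
def stream_mapping_params(function_name, stream_arn, failure_arn,
                          batch_size=100, max_retries=3):
    """Build keyword arguments for a Kinesis or DynamoDB Streams mapping."""
    return {
        "FunctionName": function_name,
        "EventSourceArn": stream_arn,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        # Split a failing batch in half and retry each half, which narrows
        # the failure down to the poison record.
        "BisectBatchOnFunctionError": True,
        # Give up after this many retries instead of blocking the shard.
        "MaximumRetryAttempts": max_retries,
        # Send details of the discarded batch to an SQS queue or SNS topic.
        "DestinationConfig": {"OnFailure": {"Destination": failure_arn}},
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("lambda").create_event_source_mapping(**params)`. The batch size and retry count here are placeholders, not recommendations.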
Configuration Building Blocks
Environment variables pass configuration and are encrypted at rest with an AWS KMS key. Layers package shared libraries or a custom runtime; a function attaches up to five layers, and the unzipped function plus all layers must stay under the 250 MB deployment quota. Need bigger artifacts? Package the function as a container image up to 10 GB. Versions are immutable snapshots of code and configuration ($LATEST is mutable); aliases are named, movable pointers to a version. Aliases support weighted routing — send 10% of traffic to a new version for a canary deployment, then shift to 100%.
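A canary shift with a weighted alias can be sketched as two calls to Lambda's UpdateAlias API: one that sends a fraction of traffic to the new version, and one that promotes it. The helper names, the function name `orders`, and the version numbers are hypothetical; the dictionary keys follow the UpdateAlias parameters.

```python
def canary_alias_params(function_name, alias, stable_version,
                        canary_version, canary_weight):
    """Route canary_weight (0.0 to 1.0) of traffic to canary_version."""
    if not 0.0 <= canary_weight <= 1.0:
        raise ValueError("canary_weight must be between 0.0 and 1.0")
    if "$LATEST" in (stable_version, canary_version):
        # Weighted routing works between published, immutable versions.
        raise ValueError("weighted aliases require published versions")
    return {
        "FunctionName": function_name,
        "Name": alias,
        "FunctionVersion": stable_version,
        "RoutingConfig": {
            "AdditionalVersionWeights": {canary_version: canary_weight}
        },
    }


def promote_alias_params(function_name, alias, new_version):
    """Point the alias fully at new_version and clear the weighted routing."""
    return {
        "FunctionName": function_name,
        "Name": alias,
        "FunctionVersion": new_version,
        "RoutingConfig": {"AdditionalVersionWeights": {}},
    }
```

For example, `canary_alias_params("orders", "prod", "1", "2", 0.10)` describes a 90/10 split between versions 1 and 2. Either dictionary could be passed to `boto3.client("lambda").update_alias(**params)`.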
Concurrency, Cold Starts, and Limits
Reserved concurrency sets a fixed ceiling for one function, both guaranteeing that capacity and fencing the function so a noisy neighbor cannot drain the account's default 1,000 concurrent-execution pool. Provisioned concurrency keeps a set number of environments pre-initialized and warm, eliminating cold-start latency for predictable traffic. A cold start is the time to download code, start the runtime, and run init code. Memorize these hard limits:
| Limit | Value |
|---|---|
| Max timeout | 900 s (15 min) |
| Memory | 128 MB - 10,240 MB |
| /tmp ephemeral storage | 512 MB - 10,240 MB |
| Sync payload (request and response) | 6 MB each |
| Async event payload | 256 KB |
| Layers per function | 5 |
| Unzipped package + layers | 250 MB |
| Container image size | 10 GB |
| Default account concurrency | 1,000 |
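One way to drill these numbers is to encode them as constants and test scenarios against them. The sketch below mirrors the table; the function names are invented for illustration and this is not an AWS validation routine.

```python
MB = 1024 * 1024
KB = 1024

MAX_TIMEOUT_S = 900
MEMORY_MB = (128, 10_240)
EPHEMERAL_MB = (512, 10_240)
MAX_LAYERS = 5
PAYLOAD_BYTES = {"RequestResponse": 6 * MB, "Event": 256 * KB}


def config_violations(timeout_s, memory_mb, ephemeral_mb, layer_count):
    """Return a list of the limits that a proposed configuration exceeds."""
    problems = []
    if timeout_s > MAX_TIMEOUT_S:
        problems.append("timeout")
    if not MEMORY_MB[0] <= memory_mb <= MEMORY_MB[1]:
        problems.append("memory")
    if not EPHEMERAL_MB[0] <= ephemeral_mb <= EPHEMERAL_MB[1]:
        problems.append("ephemeral storage")
    if layer_count > MAX_LAYERS:
        problems.append("layers")
    return problems


def payload_fits(size_bytes, invocation_type):
    """Check one payload (request or response) against its invocation model."""
    return size_bytes <= PAYLOAD_BYTES[invocation_type]


def tmp_file_fits(file_mb, ephemeral_mb):
    """Check a temporary file against the configured /tmp size."""
    return file_mb <= ephemeral_mb
```

A 4 MB synchronous response passes `payload_fits`, while a 2 GB (2,048 MB) temporary file fails `tmp_file_fits` at the 512 MB minimum and passes once ephemeral storage is configured to at least 2,048 MB.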
CPU scales with memory: allocating more memory proportionally raises vCPU, so a CPU-bound function can finish faster (and sometimes cheaper) at higher memory. AWS Lambda Power Tuning is the standard tool to find the cost/performance sweet spot.
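The "faster and sometimes cheaper" claim comes down to GB-second arithmetic, sketched below. The default prices are illustrative assumptions (per-GB-second and per-million-request rates vary by Region and architecture, so check current pricing), and the durations in the usage note are hypothetical measurements.

```python
def gb_seconds(memory_mb, duration_ms, invocations=1):
    """Billed compute: memory in GB times duration in seconds times invocations."""
    return (memory_mb / 1024) * (duration_ms / 1000) * invocations


def invocation_cost(memory_mb, duration_ms, invocations,
                    price_per_gb_second=0.0000166667,      # assumed rate
                    price_per_million_requests=0.20):      # assumed rate
    """Estimate cost as compute charge plus request charge."""
    compute = gb_seconds(memory_mb, duration_ms, invocations) * price_per_gb_second
    requests = (invocations / 1_000_000) * price_per_million_requests
    return compute + requests
```

Suppose a CPU-bound function takes 1,000 ms at 512 MB and 400 ms at 1,024 MB. That is 0.5 GB-seconds versus 0.4 GB-seconds per invocation, so `invocation_cost(1024, 400, n)` comes out lower than `invocation_cost(512, 1000, n)` even though the memory setting doubled. If doubling memory only cut the duration to 600 ms, the larger setting would cost more.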
Error Handling and DLQs
For asynchronous invocations, Lambda retries a failed event twice (three total attempts) with delays, then discards it unless you capture it. Configure a dead-letter queue (an SQS queue or SNS topic) or, preferably, an on-failure destination (which can also be EventBridge or another Lambda and includes richer context). For poll-based SQS, failed messages return to the queue and move to the source queue's DLQ after maxReceiveCount. Never assume an async failure surfaces to the original caller — it does not.
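The retry-then-capture flow for asynchronous invocations can be modeled locally as below. This is a simplified simulation for study purposes, not Lambda's implementation: it skips the delays between attempts, and `deliver_async`, `on_failure`, and the shape of the failure record are all invented for the sketch.

```python
def deliver_async(event, handler, on_failure, max_retries=2):
    """Try the handler once plus max_retries times; capture the event if all fail.

    Returns the number of attempts made. The caller never sees the handler's
    result or error, mirroring how an async invoker is not told about failures.
    """
    attempts = 0
    last_error = None
    for _ in range(1 + max_retries):
        attempts += 1
        try:
            handler(event)
            return attempts  # success: nothing is sent to the failure target
        except Exception as exc:
            last_error = exc
    # Retries exhausted: hand the event to the DLQ or on-failure stand-in.
    on_failure({
        "event": event,
        "attempts": attempts,
        "error": repr(last_error),
    })
    return attempts
```

With the default of two retries, a handler that always raises is attempted three times and the event then lands in `on_failure`. If no failure target were configured, the equivalent of that last step would simply drop the event.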
Networking and Observability
By default a function runs in an AWS-managed VPC with internet access. Attach it to your VPC only when it must reach private resources (an RDS database, an internal NLB), and remember that such a function then loses default internet access — give it a NAT gateway or VPC endpoints for AWS APIs. Lambda automatically emits logs to Amazon CloudWatch Logs (one log group per function), publishes metrics such as Invocations, Errors, Throttles, Duration, and ConcurrentExecutions, and integrates with AWS X-Ray for distributed tracing when active tracing is enabled.
A spike in the Throttles metric, not Errors, is the signal that you hit a concurrency limit and should raise reserved/account concurrency rather than debug code.
Common Exam Traps
- Confusing reserved (a ceiling that also reserves capacity) with provisioned (pre-warmed environments) concurrency — they solve different problems.
- Forgetting that the 6 MB synchronous limit applies to the request and the response individually (6 MB each), not only to the request.
- Assuming $LATEST is safe to point production aliases at; promote to an immutable version instead.
- Initializing clients inside the handler, defeating connection reuse across warm starts.
A function must return a 4 MB JSON payload directly to an API Gateway caller and occasionally writes a 2 GB temporary file during processing. Which statement correctly describes the relevant limits?
One team wants to eliminate cold-start latency for a function with steady, predictable traffic, while another team wants to cap a noisy function so it cannot consume all account concurrency. Which combination is correct?
An S3 ObjectCreated event triggers a Lambda function asynchronously, and the function intermittently throws an error. What happens to events that keep failing, and how should you capture them?