3.2 Lambda, Serverless Compute, and Container Services
Key Takeaways
- Lambda runs event-driven code with no servers, scales automatically per request, and bills per request plus GB-second of duration.
- Lambda hard limits: 15-minute (900s) max timeout, 128 MB-10,240 MB memory, 512 MB-10 GB ephemeral /tmp, 6 MB synchronous and 256 KB asynchronous payloads.
- ECS orchestrates Docker containers on EC2 or Fargate; EKS runs upstream-compatible Kubernetes on EC2 or Fargate.
- Fargate is serverless container compute - you declare vCPU and memory, AWS owns the host, and you pay per vCPU/GB-second.
- Decision rule: short event task -> Lambda; containers with least ops -> Fargate; containers needing host/GPU/daemon control -> ECS/EKS on EC2; existing Kubernetes -> EKS.
Quick Answer: Lambda = event-driven, <=15 min, zero servers, scales automatically with incoming requests up to the account concurrency quota. Fargate = serverless containers, no host to patch. ECS on EC2 = Docker with full host control. EKS = managed Kubernetes control plane. Pick on execution duration, control needs, and how much operational work you are willing to own.
AWS Lambda
AWS Lambda runs a function in response to an event inside a managed execution environment that AWS creates on demand, may reuse for later invocations, and eventually reclaims. You never provision capacity; concurrency scales with incoming events. Pricing is per request plus per GB-second of compute, where vCPU is allocated proportionally to the memory you set (1,769 MB ~= 1 full vCPU), so raising memory can finish CPU-bound work faster and sometimes cheaper.
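The pricing model above (per request plus per GB-second) can be sketched as a small estimator. The two rates below are illustrative assumptions rather than figures from this guide, the free tier is ignored, and the timings in the usage note are hypothetical; check current AWS pricing before relying on any number.

```python
# Illustrative rates (assumptions, not from this guide); verify against current AWS pricing.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per request
PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second of duration


def lambda_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate cost as a request charge plus GB-seconds of duration (free tier ignored)."""
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND


def approx_vcpus(memory_mb: int) -> float:
    """Rough vCPU share, using the 1,769 MB ~= 1 vCPU rule of thumb."""
    return memory_mb / 1769


# Hypothetical timings: doubling memory cuts a CPU-bound run from 800 ms to 350 ms.
small = lambda_cost(1_000_000, avg_duration_ms=800, memory_mb=512)
large = lambda_cost(1_000_000, avg_duration_ms=350, memory_mb=1024)
```

With these assumed timings the 1,024 MB configuration comes out cheaper, because duration falls by more than half while memory only doubles. If the speedup were less than 2x, the larger setting would cost more.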
Lambda Hard Limits (memorize these)
| Dimension | Limit |
|---|---|
| Max timeout | 15 minutes (900 seconds) |
| Memory | 128 MB to 10,240 MB |
| Ephemeral storage (/tmp) | 512 MB to 10,240 MB |
| Concurrent executions | 1,000 per Region default (soft, raise via quota) |
| Deployment package | 50 MB zipped / 250 MB unzipped, or 10 GB container image |
| Synchronous payload | 6 MB request and response |
| Asynchronous payload | 256 KB |
| Environment variables | 4 KB total |
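One way to internalize the table is to encode it as a pre-deployment check. The sketch below is a hypothetical helper, not an AWS API: `FunctionConfig`, `limit_violations`, and `payload_fits` are names invented for illustration, and the numbers are the limits listed above.

```python
from dataclasses import dataclass

MAX_TIMEOUT_S = 900
MEMORY_RANGE_MB = (128, 10_240)
EPHEMERAL_RANGE_MB = (512, 10_240)
SYNC_PAYLOAD_BYTES = 6 * 1024 * 1024
ASYNC_PAYLOAD_BYTES = 256 * 1024


@dataclass
class FunctionConfig:
    """Hypothetical container for the settings being checked."""
    timeout_s: int
    memory_mb: int
    ephemeral_mb: int = 512


def limit_violations(cfg: FunctionConfig) -> list[str]:
    """Return a human-readable message for each limit the config breaks."""
    problems = []
    if not 1 <= cfg.timeout_s <= MAX_TIMEOUT_S:
        problems.append(f"timeout {cfg.timeout_s}s exceeds the {MAX_TIMEOUT_S}s cap")
    if not MEMORY_RANGE_MB[0] <= cfg.memory_mb <= MEMORY_RANGE_MB[1]:
        problems.append(f"memory {cfg.memory_mb} MB is outside 128-10,240 MB")
    if not EPHEMERAL_RANGE_MB[0] <= cfg.ephemeral_mb <= EPHEMERAL_RANGE_MB[1]:
        problems.append(f"/tmp {cfg.ephemeral_mb} MB is outside 512-10,240 MB")
    return problems


def payload_fits(size_bytes: int, synchronous: bool) -> bool:
    """Check a payload against the synchronous or asynchronous cap."""
    cap = SYNC_PAYLOAD_BYTES if synchronous else ASYNC_PAYLOAD_BYTES
    return size_bytes <= cap
```

For example, `limit_violations(FunctionConfig(timeout_s=2400, memory_mb=1024))` reports only the timeout problem, which mirrors the "40-minute job" exam pattern.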
Lambda Invocation Models
| Source | Invocation | Notes |
|---|---|---|
| API Gateway / ALB | Synchronous | Caller waits for the response (6 MB cap) |
| S3 / SNS / EventBridge | Asynchronous | Internal queue, automatic retries, send failures to a DLQ |
| SQS / Kinesis / DynamoDB Streams | Event-source mapping (poll) | Lambda polls and invokes in batches |
Cold starts add latency on the first invoke of a new environment. Use Provisioned Concurrency to pre-warm latency-sensitive functions, and Reserved Concurrency to cap a function so it cannot starve other functions of the account pool. When work exceeds 15 minutes, hand off to Step Functions, Fargate, or AWS Batch instead of forcing it into Lambda.
Amazon ECS (Elastic Container Service)
ECS is AWS-native Docker orchestration. Core objects: a task definition (the blueprint - image, vCPU, memory, ports, env, IAM task role), a task (a running instance of that definition), a service (keeps a desired task count and registers with a load balancer), and a cluster (logical grouping).
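As a rough illustration of the blueprint idea, the sketch below builds a dictionary shaped like the input to the ECS RegisterTaskDefinition API for a Fargate task. The helper name, family, image, port, environment variable, and role ARN are all hypothetical placeholders. A real task usually also needs an execution role for pulling images and writing logs, which is omitted here.

```python
def fargate_task_definition(family: str, image: str, cpu: int, memory_mb: int,
                            task_role_arn: str) -> dict:
    """Build a minimal task definition payload (a sketch, not a complete definition)."""
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",       # the network mode Fargate tasks use
        "cpu": str(cpu),               # task-level CPU units, passed as a string (1024 = 1 vCPU)
        "memory": str(memory_mb),      # task-level memory in MiB, passed as a string
        "taskRoleArn": task_role_arn,  # IAM role assumed by the application code
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
                "environment": [{"name": "STAGE", "value": "dev"}],
            }
        ],
    }


# Hypothetical values for illustration only.
task_def = fargate_task_definition(
    family="orders-api",
    image="example.com/orders-api:1.0",
    cpu=256,
    memory_mb=512,
    task_role_arn="arn:aws:iam::123456789012:role/orders-task-role",
)
```

A service would then reference this definition by family and revision to keep the desired number of tasks running.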
| | EC2 launch type | Fargate launch type |
|---|---|---|
| Host management | You patch and scale EC2 | AWS owns the host |
| Scaling | You scale the EC2 capacity | Per task, automatic |
| Billing | EC2 instance price | Per vCPU + GB per second |
| Host access / daemons | Full (SSH, GPU, custom agents) | None |
| Best for | GPU, dense bin-packing, custom kernels | Simplicity, least ops |
Amazon EKS (Elastic Kubernetes Service)
EKS runs an AWS-managed, multi-AZ Kubernetes control plane; you bring worker capacity as EC2 managed node groups, self-managed nodes, or Fargate. Because it tracks upstream Kubernetes, existing manifests, Helm charts, and kubectl tooling work with little or no change. That makes EKS the right answer whenever a question mentions an existing Kubernetes investment or portability across clouds.
AWS Fargate
Fargate is the serverless data plane for both ECS and EKS. You declare vCPU and memory per task; AWS provisions, isolates, and patches the host. Each task runs in its own kernel boundary for strong isolation, and billing is per vCPU and GB per second. Fargate is the default "least operational overhead" container answer unless the workload needs host access, GPUs, privileged daemons, or very dense bin-packing.
Choosing the Right Compute
| Scenario | Best service |
|---|---|
| Event-driven glue, <=15 min | Lambda |
| Containers with minimal ops | ECS or EKS on Fargate |
| Containers needing host/GPU control | ECS or EKS on EC2 |
| Long-running stateful single node | EC2 |
| Large-scale batch | AWS Batch (EC2 or Fargate) |
| Existing Kubernetes migration | EKS |
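The table can be read as a decision procedure. The function below is one possible encoding, offered as a study aid. The parameter names and the order in which the rules are checked are assumptions made for illustration; real decisions weigh more factors than this.

```python
LAMBDA_MAX_MINUTES = 15


def choose_compute(*, max_runtime_min: float,
                   event_driven: bool = False,
                   containerized: bool = False,
                   needs_host_control: bool = False,
                   existing_kubernetes: bool = False,
                   large_batch: bool = False,
                   stateful_single_node: bool = False) -> str:
    """Map a scenario to the service suggested by the table above."""
    if existing_kubernetes:
        return "EKS"
    if large_batch:
        return "AWS Batch (EC2 or Fargate)"
    if containerized:
        return "ECS or EKS on EC2" if needs_host_control else "ECS or EKS on Fargate"
    if stateful_single_node:
        return "EC2"
    if event_driven and max_runtime_min <= LAMBDA_MAX_MINUTES:
        return "Lambda"
    # Fallback (an assumption): work too long for Lambda goes to serverless containers.
    return "ECS or EKS on Fargate"
```

For instance, `choose_compute(max_runtime_min=40, event_driven=True)` falls through to Fargate because the runtime exceeds the 15-minute cap.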
On the Exam: "Least operational overhead to run containers" -> Fargate. "Run code when an object lands in S3" -> Lambda. "Migrate a Kubernetes app keeping existing tooling" -> EKS. "Job runs 40 minutes" -> NOT Lambda (exceeds 15-minute cap).
Worked Scenario: Picking the Compute Tier
A team receives image uploads to an S3 bucket and must generate three thumbnail sizes per image, then notify a downstream system. Each transform takes about eight seconds. Because the work is event-driven, short, and stateless, Lambda triggered by an S3 event is the strongest answer: it scales to thousands of concurrent invocations automatically and bills only for the seconds it runs. If the same team also runs a long-lived API behind a load balancer that occasionally serves heavy 30-minute report generations, that report job exceeds Lambda's 900-second ceiling, so it belongs on Fargate (or AWS Batch) instead.
This split - Lambda for the fast event handler, Fargate for the long task - is exactly the trade-off the exam wants you to recognize.
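A minimal sketch of the fast event handler in this scenario might look like the following. It only parses the S3 event notification and plans the three thumbnail jobs; the actual resizing and the downstream notification are omitted. The thumbnail widths and the `thumbnails/` key prefix are hypothetical.

```python
from urllib.parse import unquote_plus

THUMBNAIL_WIDTHS = (128, 256, 512)  # hypothetical sizes, not from the scenario


def handler(event, context):
    """Lambda entry point: plan one thumbnail job per width for each uploaded object."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        for width in THUMBNAIL_WIDTHS:
            jobs.append({
                "bucket": bucket,
                "source_key": key,
                "width": width,
                "target_key": f"thumbnails/{width}/{key}",
            })
    # Resizing and notifying the downstream system would happen here.
    return {"planned": jobs}


# Local usage with a trimmed-down sample event (only the fields the handler reads).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads"}, "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
result = handler(sample_event, None)
```

Because the handler is a plain function, it can be exercised locally with a sample event before it is wired to a bucket notification.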
Common Traps to Avoid
- Forcing long jobs into Lambda. Any execution over 15 minutes cannot run on Lambda at all - more memory or Provisioned Concurrency does not help. Move it to Fargate, Step Functions, or AWS Batch.
- Choosing ECS-on-EC2 when the question says "least operational overhead." Managing EC2 hosts (patching, scaling, capacity) is operational overhead; Fargate removes it. Only pick EC2-backed when the workload needs host access, GPUs, privileged daemons, or dense bin-packing.
- Picking ECS when an existing Kubernetes investment is mentioned. Re-platforming working Kubernetes manifests to the ECS API is rework; EKS keeps them.
- Ignoring concurrency limits. A burst that exceeds the default quota of 1,000 concurrent executions is throttled. Raise the quota for more headroom, and use Reserved Concurrency to guarantee capacity for critical functions and to cap noisy ones.
- Cold-start latency. For latency-sensitive synchronous APIs, plain on-demand Lambda may add cold-start delay; Provisioned Concurrency pre-warms environments so that requests served by them avoid it.
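To judge whether the concurrency trap applies, a common steady-state approximation is that concurrency is roughly requests per second multiplied by average duration in seconds. The sketch below applies that approximation against the default quota. The function names are hypothetical, and real bursts can be spikier than a steady-state estimate suggests.

```python
import math

DEFAULT_CONCURRENCY_QUOTA = 1000  # default concurrent executions per Region


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Steady-state estimate: in-flight requests ~= arrival rate x average duration."""
    return math.ceil(requests_per_second * avg_duration_s)


def would_throttle(requests_per_second: float, avg_duration_s: float,
                   quota: int = DEFAULT_CONCURRENCY_QUOTA) -> bool:
    """True when the estimated concurrency exceeds the available quota."""
    return required_concurrency(requests_per_second, avg_duration_s) > quota
```

For example, 200 uploads per second at about eight seconds each needs roughly 1,600 concurrent executions, which exceeds the default quota, while 100 per second needs about 800 and fits.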
A data pipeline step regularly runs for about 40 minutes per execution and is currently implemented as a single AWS Lambda function that keeps timing out. What is the most appropriate redesign?
A team wants to run containerized microservices with the least possible infrastructure to patch, scale, or manage, while keeping standard Docker images. Which option fits best?
An organization already runs production workloads on Kubernetes on-premises and wants to migrate to AWS while reusing its existing manifests, Helm charts, and kubectl workflows. Which service minimizes rework?