3.2 Lambda, Serverless Compute, and Container Services

Key Takeaways

  • Lambda runs event-driven code with no servers, scales automatically per request, and bills per request plus GB-second of duration.
  • Lambda hard limits: 15-minute (900s) max timeout, 128 MB-10,240 MB memory, 512 MB-10 GB ephemeral /tmp, 6 MB synchronous and 256 KB asynchronous payloads.
  • ECS orchestrates Docker containers on EC2 or Fargate; EKS runs upstream-compatible Kubernetes on EC2 or Fargate.
  • Fargate is serverless container compute - you declare vCPU and memory, AWS owns the host, and you pay per vCPU/GB-second.
  • Decision rule: short event task -> Lambda; containers with least ops -> Fargate; containers needing host/GPU/daemon control -> ECS/EKS on EC2; existing Kubernetes -> EKS.

Last updated: June 2026

Quick Answer: Lambda = event-driven, <=15 min, no servers to manage, scales out automatically up to your concurrency quota. Fargate = serverless containers, no host to patch. ECS on EC2 = Docker with full host control. EKS = managed Kubernetes control plane. Pick based on execution duration, control needs, and how much operational work you are willing to own.

AWS Lambda

AWS Lambda runs a function in response to an event. It may keep the execution environment warm for reuse by later invocations, and it reclaims idle environments on its own schedule. You never provision capacity; concurrency scales with incoming events. Pricing is per request plus per GB-second of compute. vCPU is allocated in proportion to the memory you set (1,769 MB ~= 1 full vCPU), so raising memory can finish CPU-bound work faster, and sometimes cheaper when the shorter duration offsets the higher per-second rate.
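
The pricing model above can be sketched as a small estimator. This is a rough illustration, not a billing tool: the two price constants are assumptions (rates vary by Region and architecture), the function names are hypothetical, and the free tier is ignored.

```python
# Illustrative prices only (assumptions): check the current AWS Lambda
# pricing page for your Region and architecture before relying on them.
PRICE_PER_MILLION_REQUESTS = 0.20    # USD, assumed
PRICE_PER_GB_SECOND = 0.0000166667   # USD, assumed


def lambda_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate request + duration charges, ignoring any free tier."""
    gb_seconds = requests * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND


def approx_vcpus(memory_mb: int) -> float:
    """CPU share scales with memory; 1,769 MB is roughly one vCPU."""
    return memory_mb / 1769.0


# Hypothetical CPU-bound job: 4,000 ms at 512 MB versus 1,000 ms at 2,048 MB.
slow = lambda_cost(1_000_000, 4000, 512)
fast = lambda_cost(1_000_000, 1000, 2048)
```

In this hypothetical case both configurations consume the same number of GB-seconds, so the estimated cost is identical while the larger setting finishes four times sooner. If the speed-up were more than proportional to the memory increase, the larger setting would also be cheaper.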

Lambda Hard Limits (memorize these)

Dimension | Limit
Max timeout | 15 minutes (900 seconds)
Memory | 128 MB to 10,240 MB
Ephemeral storage (/tmp) | 512 MB to 10,240 MB
Concurrent executions | 1,000 per Region default (soft, raise via quota)
Deployment package | 50 MB zipped / 250 MB unzipped, or 10 GB container image
Synchronous payload | 6 MB request and response
Asynchronous payload | 256 KB
Environment variables | 4 KB total
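
One way to internalize the limits above is to encode a few of them as a pre-deployment check. The sketch below is illustrative only: `FunctionConfig`, `LIMITS`, and `violations` are hypothetical names, and the 1-second minimum timeout is an assumption not stated in the table.

```python
from dataclasses import dataclass


@dataclass
class FunctionConfig:
    """Hypothetical holder for the settings being checked."""
    timeout_s: int
    memory_mb: int
    ephemeral_mb: int = 512
    env_vars_bytes: int = 0


# (low, high) bounds taken from the limits table; the lower timeout
# bound of 1 second is an assumption.
LIMITS = {
    "timeout_s": (1, 900),
    "memory_mb": (128, 10_240),
    "ephemeral_mb": (512, 10_240),
    "env_vars_bytes": (0, 4 * 1024),
}


def violations(cfg: FunctionConfig) -> list:
    """Return one message per setting that falls outside its bounds."""
    problems = []
    for field, (low, high) in LIMITS.items():
        value = getattr(cfg, field)
        if not low <= value <= high:
            problems.append(f"{field}={value} outside {low}..{high}")
    return problems


ok = violations(FunctionConfig(timeout_s=60, memory_mb=512))
too_long = violations(FunctionConfig(timeout_s=2400, memory_mb=512))
```

Here `ok` is an empty list, while `too_long` flags the 40-minute (2,400-second) timeout, the same situation as the "job runs 40 minutes" exam cue.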

Lambda Invocation Models

Source | Invocation | Notes
API Gateway / ALB | Synchronous | Caller waits for the response (6 MB cap)
S3 / SNS / EventBridge | Asynchronous | Internal queue, automatic retries, send failures to a DLQ
SQS / Kinesis / DynamoDB Streams | Event-source mapping (poll) | Lambda polls and invokes in batches
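
To make the event-source mapping row concrete, here is a minimal sketch of a handler that receives an SQS batch. The `process` function and the `order_id` field are hypothetical. The return value uses the partial batch response shape, which only takes effect if `ReportBatchItemFailures` is enabled on the event source mapping (an assumption about your configuration).

```python
import json


def process(message: dict) -> None:
    """Hypothetical business logic; raises when the message is unusable."""
    if "order_id" not in message:
        raise ValueError("missing order_id")


def handler(event, context):
    """Process each SQS record and report only the ones that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only this message is retried, not the whole batch.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Local smoke test with a hand-built event (not a real SQS payload).
sample_event = {
    "Records": [
        {"messageId": "m-1", "body": json.dumps({"order_id": 7})},
        {"messageId": "m-2", "body": "not json"},
    ]
}
result = handler(sample_event, None)
```

With the sample event, only `m-2` is reported as failed, so the successfully processed message is not redelivered.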

Cold starts add latency on the first invoke of a new environment. Use Provisioned Concurrency to pre-warm latency-sensitive functions, and Reserved Concurrency to cap a function so it cannot starve other functions of the account pool. When work exceeds 15 minutes, hand off to Step Functions, Fargate, or AWS Batch instead of forcing it into Lambda.
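
When sizing Reserved or Provisioned Concurrency, a common back-of-the-envelope estimate is concurrency ~= requests per second x average duration in seconds. The sketch below applies that approximation; the function names are hypothetical, and steady-state traffic is assumed (bursts need extra headroom).

```python
import math

DEFAULT_REGION_LIMIT = 1000  # default concurrent executions per Region


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Steady-state estimate: in-flight requests = arrival rate x duration."""
    return math.ceil(requests_per_second * avg_duration_s)


def would_throttle(requests_per_second: float, avg_duration_s: float,
                   limit: int = DEFAULT_REGION_LIMIT) -> bool:
    """True when the estimate exceeds the available concurrency."""
    return required_concurrency(requests_per_second, avg_duration_s) > limit


api_need = required_concurrency(400, 0.25)  # fast API calls
thumbs_need = required_concurrency(150, 8)  # slower 8-second transforms
```

The fast API needs about 100 concurrent environments, while 150 eight-second jobs per second would need about 1,200, which exceeds the default quota and would throttle until the quota is raised.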

Amazon ECS (Elastic Container Service)

ECS is AWS-native Docker orchestration. Core objects: a task definition (the blueprint - image, vCPU, memory, ports, env, IAM task role), a task (a running instance of that definition), a service (keeps a desired task count and registers with a load balancer), and a cluster (logical grouping).
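
The task definition "blueprint" can be pictured as a plain dictionary shaped like the input to the ECS RegisterTaskDefinition API. This is a sketch under assumptions: the helper name, image URI, role ARN, and `STAGE` variable are all hypothetical, and a real Fargate task usually also needs an execution role for pulling images and writing logs.

```python
def fargate_task_definition(family: str, image: str, cpu: int, memory_mb: int,
                            container_port: int, task_role_arn: str) -> dict:
    """Build a minimal task definition payload (illustrative, not complete)."""
    return {
        "family": family,
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": str(cpu),           # task-level CPU units, passed as a string
        "memory": str(memory_mb),  # task-level memory in MiB, as a string
        "taskRoleArn": task_role_arn,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "essential": True,
                "portMappings": [
                    {"containerPort": container_port, "protocol": "tcp"}
                ],
                "environment": [{"name": "STAGE", "value": "dev"}],
            }
        ],
    }


# Hypothetical values for illustration only.
task_def = fargate_task_definition(
    family="web",
    image="example.com/web:1.0",
    cpu=256,
    memory_mb=512,
    container_port=8080,
    task_role_arn="arn:aws:iam::123456789012:role/example-task-role",
)
```

A task is then one running copy of this definition, and a service keeps the desired number of such tasks alive.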

Aspect | EC2 launch type | Fargate launch type
Host management | You patch and scale EC2 | AWS owns the host
Scaling | You scale the EC2 capacity | Per task, automatic
Billing | EC2 instance price | Per vCPU + GB per second
Host access / daemons | Full (SSH, GPU, custom agents) | None
Best for | GPU, dense bin-packing, custom kernels | Simplicity, least ops

Amazon EKS (Elastic Kubernetes Service)

EKS runs an AWS-managed, multi-AZ Kubernetes control plane; you bring worker capacity as EC2 managed node groups, self-managed nodes, or Fargate. Because it tracks upstream Kubernetes, existing manifests, Helm charts, and kubectl tooling generally carry over with little or no change. That makes EKS the right answer whenever a question mentions an existing Kubernetes investment or portability across clouds.

AWS Fargate

Fargate is the serverless data plane for both ECS and EKS. You declare vCPU and memory per task; AWS provisions, isolates, and patches the host. Each task runs in its own kernel boundary for strong isolation, and billing is per vCPU and GB per second. Fargate is the default "least operational overhead" container answer unless the workload needs host access, GPUs, privileged daemons, or very dense bin-packing.
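
Declaring vCPU and memory per task is not free-form: Fargate accepts only certain combinations. The sketch below checks a request against a partial table of combinations as commonly documented for sizes up to 4 vCPU. Treat the table as an assumption and confirm it against current AWS documentation; the helper names are hypothetical.

```python
def _mib_range(start: int, stop: int, step: int) -> set:
    """Inclusive set of memory sizes in MiB."""
    return set(range(start, stop + 1, step))


# Partial, assumed table: CPU units (1024 = 1 vCPU) -> allowed memory in MiB.
FARGATE_MEMORY_BY_CPU = {
    256: {512, 1024, 2048},
    512: _mib_range(1024, 4096, 1024),
    1024: _mib_range(2048, 8192, 1024),
    2048: _mib_range(4096, 16384, 1024),
    4096: _mib_range(8192, 30720, 1024),
}


def is_valid_fargate_size(cpu_units: int, memory_mib: int) -> bool:
    """True if the pair appears in the (partial) table above."""
    return memory_mib in FARGATE_MEMORY_BY_CPU.get(cpu_units, set())


small_ok = is_valid_fargate_size(256, 512)
mismatch = is_valid_fargate_size(256, 4096)
```

A quarter-vCPU task with 512 MiB passes, while pairing a quarter vCPU with 4 GiB does not, which is why the two values are chosen together rather than independently.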

Choosing the Right Compute

Scenario | Best service
Event-driven glue, <=15 min | Lambda
Containers with minimal ops | ECS or EKS on Fargate
Containers needing host/GPU control | ECS or EKS on EC2
Long-running stateful single node | EC2
Large-scale batch | AWS Batch (EC2 or Fargate)
Existing Kubernetes migration | EKS
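
The table above can be read as an ordered set of checks. The function below is one possible encoding, offered as a memory aid rather than an official decision procedure. The parameter names and the order of the checks are assumptions, and the final fallback simply lists the hand-off targets this section names for work that does not fit Lambda.

```python
LAMBDA_MAX_MINUTES = 15


def choose_compute(duration_min: float,
                   event_driven: bool = False,
                   containerized: bool = False,
                   needs_host_control: bool = False,
                   existing_kubernetes: bool = False,
                   stateful_single_node: bool = False,
                   large_scale_batch: bool = False) -> str:
    """Map scenario flags to the service suggested by the table."""
    if existing_kubernetes:
        return "EKS"
    if stateful_single_node:
        return "EC2"
    if large_scale_batch:
        return "AWS Batch"
    if containerized:
        if needs_host_control:
            return "ECS or EKS on EC2"
        return "ECS or EKS on Fargate"
    if event_driven and duration_min <= LAMBDA_MAX_MINUTES:
        return "Lambda"
    # Anything else, including event-driven work over 15 minutes.
    return "Fargate, Step Functions, or AWS Batch"


s3_trigger = choose_compute(duration_min=0.2, event_driven=True)
long_job = choose_compute(duration_min=40, event_driven=True)
```

The S3-triggered handler maps to Lambda, while the 40-minute job falls through to the hand-off targets because it exceeds the 15-minute cap.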

On the Exam: "Least operational overhead to run containers" -> Fargate. "Run code when an object lands in S3" -> Lambda. "Migrate a Kubernetes app keeping existing tooling" -> EKS. "Job runs 40 minutes" -> NOT Lambda (exceeds 15-minute cap).

Worked Scenario: Picking the Compute Tier

A team receives image uploads to an S3 bucket and must generate three thumbnail sizes per image, then notify a downstream system. Each transform takes about eight seconds. Because the work is event-driven, short, and stateless, Lambda triggered by an S3 event is the strongest answer: it scales out automatically with the upload rate (up to the account's concurrency quota) and bills only for the time it runs. If the same team also runs a long-lived API behind a load balancer that occasionally kicks off heavy 30-minute report generation jobs, each report job exceeds Lambda's 900-second ceiling, so it belongs on Fargate (or AWS Batch) instead.

This split - Lambda for the fast event handler, Fargate for the long task - is exactly the trade-off the exam wants you to recognize.
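
The Lambda half of this scenario might look like the sketch below. It is deliberately incomplete: the thumbnail widths, the output key layout, and the injectable `resize` callback are hypothetical, and the actual image work (for example with an imaging library) is left out. Object keys in S3 event notifications arrive URL-encoded, so the handler decodes them first.

```python
from urllib.parse import unquote_plus

THUMBNAIL_WIDTHS = (64, 256, 1024)  # hypothetical sizes in pixels


def thumbnail_keys(key: str) -> list:
    """Output keys for the three sizes, under a separate prefix."""
    return [f"thumbnails/{width}/{key}" for width in THUMBNAIL_WIDTHS]


def handler(event, context, resize=None):
    """Plan (and optionally perform) one resize per size per uploaded object."""
    planned = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        for width, out_key in zip(THUMBNAIL_WIDTHS, thumbnail_keys(key)):
            if resize is not None:
                resize(bucket, key, out_key, width)  # hypothetical callback
            planned.append(out_key)
    return {"planned": planned}


# Hand-built event for a local check (not a complete S3 notification).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads-bucket"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
result = handler(sample_event, None)
```

Writing outputs under a different prefix (or bucket) than the one that triggers the function helps avoid the function re-triggering itself on its own thumbnails.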

Common Traps to Avoid

  • Forcing long jobs into Lambda. Any execution over 15 minutes cannot run on Lambda at all - more memory or Provisioned Concurrency does not help. Move it to Fargate, Step Functions, or AWS Batch.
  • Choosing ECS-on-EC2 when the question says "least operational overhead." Managing EC2 hosts (patching, scaling, capacity) is operational overhead; Fargate removes it. Only pick EC2-backed when the workload needs host access, GPUs, privileged daemons, or dense bin-packing.
  • Picking ECS when an existing Kubernetes investment is mentioned. Re-platforming working Kubernetes manifests to the ECS API is rework; EKS keeps them.
  • Ignoring concurrency limits. A burst that exceeds the 1,000 default concurrent executions throttles unless you raise the quota or use Reserved Concurrency to protect critical functions.
  • Cold-start latency. For latency-sensitive synchronous APIs, plain on-demand Lambda may add cold-start delay; Provisioned Concurrency keeps a set number of environments initialized, so requests served within that provisioned capacity avoid the cold start.

Test Your Knowledge

A data pipeline step regularly runs for about 40 minutes per execution and is currently implemented as a single AWS Lambda function that keeps timing out. What is the most appropriate redesign?

Test Your Knowledge

A team wants to run containerized microservices with the least possible infrastructure to patch, scale, or manage, while keeping standard Docker images. Which option fits best?

Test Your Knowledge

An organization already runs production workloads on Kubernetes on-premises and wants to migrate to AWS while reusing its existing manifests, Helm charts, and kubectl workflows. Which service minimizes rework?
