2.1 Elastic Load Balancing (ELB) — ALB, NLB, and GWLB
Key Takeaways
- Application Load Balancer (ALB) operates at Layer 7 (HTTP/HTTPS) and supports path-based, host-based, header, query-string, and source-IP routing plus WebSocket and HTTP/2.
- Network Load Balancer (NLB) operates at Layer 4 (TCP/UDP/TLS), gives one static IP per Availability Zone, preserves the client source IP, and scales to millions of requests per second.
- Gateway Load Balancer (GWLB) operates at Layer 3, uses GENEVE encapsulation on port 6081, and transparently inserts third-party firewall/IDS/IPS appliances.
- Cross-zone load balancing is on by default and free on ALB but off by default on NLB, where it incurs inter-AZ data charges when enabled.
- Connection draining (deregistration delay, default 300s) lets in-flight requests finish before a target is removed.
Quick Answer: ALB = Layer 7 HTTP/HTTPS with content-based routing. NLB = Layer 4 TCP/UDP/TLS with static IPs and source-IP preservation. GWLB = Layer 3 transparent insertion of third-party security appliances. The SAA-C03 (Solutions Architect Associate) exam tests which load balancer fits a scenario more than how to configure one.
Why Resilient Architects Care About ELB
Elastic Load Balancing (ELB) is the front door for nearly every resilient design in Domain 2. It distributes incoming traffic across multiple targets, such as EC2 (Elastic Compute Cloud) instances, IP addresses, AWS Lambda functions, or ECS (Elastic Container Service) tasks, spread across two or more Availability Zones (AZs). When a target or an entire AZ fails, the load balancer stops routing to it within a few health-check intervals, so most users never notice the failure. ELB is a regional, managed, auto-scaling service: you address it by DNS name (or by static IP for NLB), never by a single instance IP.
Application Load Balancer (ALB)
The ALB terminates and inspects HTTP/HTTPS, so it can route on request content. Listener rules are evaluated in priority order and route matching requests to a target group.
| Routing condition | Example rule | Use case |
|---|---|---|
| Path-based | /api/* → API target group | Microservice splitting |
| Host-based | shop.example.com → Shop group | Multiple sites, one ALB |
| HTTP header | X-Platform: mobile → Mobile group | A/B testing |
| Query string | ?lang=de → DE group | Localization |
| Source IP | 10.0.0.0/8 → Internal group | Split internal vs. public |
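The priority-ordered evaluation described above can be sketched in plain Python. This is only an illustrative model, not the ALB's actual implementation: the `Request`, `Rule`, and `route` names are hypothetical, and `fnmatchcase` stands in for ALB path-pattern wildcards.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, Dict, List


@dataclass
class Request:
    host: str
    path: str
    headers: Dict[str, str]


@dataclass
class Rule:
    priority: int                       # lower number is evaluated first
    matches: Callable[[Request], bool]  # the routing condition
    target_group: str                   # where matching requests are forwarded


def route(rules: List[Rule], request: Request, default_target_group: str) -> str:
    """Return the target group of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(request):
            return rule.target_group
    # No rule matched: fall through to the listener's default action.
    return default_target_group


# Hypothetical rules mirroring rows of the table above.
rules = [
    Rule(20, lambda r: r.host == "shop.example.com", "shop"),
    Rule(10, lambda r: fnmatchcase(r.path, "/api/*"), "api"),
    Rule(30, lambda r: r.headers.get("X-Platform") == "mobile", "mobile"),
]
```

Because the path rule has the lowest priority number, a request for shop.example.com/api/orders goes to the API group even though the host rule would also match.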
ALB extras tested on the exam: native WebSocket and HTTP/2, SSL/TLS termination using ACM (AWS Certificate Manager) certs, SNI (Server Name Indication) for multiple certs on one listener, built-in Cognito/OIDC authentication, sticky sessions via the AWSALB cookie, and integration with AWS WAF (Web Application Firewall). An ALB is reached by DNS name and its node IP addresses change over time, so a client cannot whitelist a fixed IP, a frequent exam trap.
Network Load Balancer (NLB)
The NLB works at Layer 4. It does not read HTTP content; it forwards TCP, UDP, TLS, or TCP_UDP flows with ultra-low latency. Three properties drive exam answers:
- Static IP per AZ: one per enabled subnet, and you can attach an Elastic IP. This is the only ELB that gives a fixed IP clients can whitelist.
- Source-IP preservation: targets see the real client IP (ALB rewrites it; targets see the ALB's IP unless they read X-Forwarded-For).
- Scale and latency: millions of requests per second with single-digit-millisecond latency, ideal for gaming, IoT, and custom TCP protocols.
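To make the X-Forwarded-For point concrete, here is a minimal sketch of how a target behind an ALB might recover the client IP. It assumes the ALB appends the address of the connecting client to the end of the header (its default behavior), so the rightmost entry is the one the ALB itself observed. The function name and the `trusted_proxies` parameter are hypothetical.

```python
from ipaddress import ip_address
from typing import Optional


def client_ip(xff_header: Optional[str], peer_ip: str, trusted_proxies: int = 1) -> str:
    """Return the client IP recorded by the outermost trusted proxy.

    xff_header      -- raw X-Forwarded-For value, or None if absent
    peer_ip         -- address of the TCP peer (the ALB node, behind an ALB)
    trusted_proxies -- number of proxies you control in front of the target
    """
    entries = [e.strip() for e in (xff_header or "").split(",") if e.strip()]
    if trusted_proxies < 1 or len(entries) < trusted_proxies:
        return peer_ip
    # Entries to the left of this one were supplied by the client and can be spoofed.
    candidate = entries[-trusted_proxies]
    try:
        return str(ip_address(candidate))
    except ValueError:
        # Malformed entry: fall back to the connection's peer address.
        return peer_ip
```

Behind an NLB with client-IP preservation, none of this is needed: the peer address of the connection is already the client.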
A common pattern is NLB in front of an ALB: the NLB provides the static/Elastic IP an enterprise firewall whitelist requires, and the ALB does Layer 7 routing behind it.
Gateway Load Balancer (GWLB)
The GWLB is for inserting third-party virtual appliances (next-gen firewalls, IDS/IPS, deep-packet inspection) transparently into the traffic path. It pairs a Layer 3 gateway with a load-balancer fleet, using GENEVE encapsulation on UDP port 6081. Traffic flows in → GWLB → appliance → back to GWLB → original destination, all invisible to the application. Memorize "third-party security appliance + GENEVE + port 6081 → GWLB."
Side-by-Side
| Capability | ALB | NLB | GWLB |
|---|---|---|---|
| OSI layer | 7 | 4 | 3 |
| Static IP / EIP | No (DNS only) | Yes | Via endpoints |
| Lambda target | Yes | No | No |
| AWS WAF | Yes | No | No |
| Preserves client IP | No (header) | Yes | Yes |
| Typical use | Web apps, APIs | Extreme perf, static IP | Inline appliances |
Cross-Zone Load Balancing and Health Checks
With cross-zone load balancing, every node spreads traffic across all registered targets in all enabled AZs, not just its own AZ. ALB has it on by default and free; NLB has it off by default and charges for inter-AZ data when enabled. Health checks (healthy/unhealthy thresholds, interval, timeout) decide which targets receive traffic; an unhealthy target is drained and gets no traffic until it passes again. Connection draining (deregistration delay, default 300 seconds) lets in-flight requests complete before a target is removed during a deploy or scale-in.
On the exam, "clients dropped mid-request during deployment" points to tuning the deregistration delay.
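The effect of cross-zone load balancing is easiest to see with numbers. The sketch below assumes client traffic arrives evenly at the load balancer node in each enabled AZ, that every AZ has at least one target, and that all targets are healthy. The function name and AZ labels are illustrative.

```python
from typing import Dict


def per_target_share(targets_per_az: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Fraction of total traffic each single target receives, keyed by AZ."""
    az_count = len(targets_per_az)
    total_targets = sum(targets_per_az.values())
    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            # Every node spreads traffic over all targets in all AZs.
            shares[az] = 1 / total_targets
        else:
            # Each node only uses the targets in its own AZ.
            shares[az] = (1 / az_count) / count
    return shares


fleet = {"az-a": 2, "az-b": 8}
off = per_target_share(fleet, cross_zone=False)  # az-a targets: 25% each, az-b: 6.25% each
on = per_target_share(fleet, cross_zone=True)    # every target: 10%
```

With an uneven fleet, disabling cross-zone overloads the targets in the smaller AZ, which is why the setting matters for NLB designs.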
Target Groups, Listeners, and Common Exam Traps
ELB routing is built from three objects you should be able to name. A listener checks for connections on a protocol and port (e.g., HTTPS:443). Listener rules (ALB only) evaluate conditions in priority order and forward to a target group. A target group holds the registered targets, the target type (instance, IP, Lambda, or ALB), and the health-check settings. One ALB can host many listeners, each with many rules, fanning out to many target groups — that is what makes a single ALB able to front an entire microservice estate.
Health-Check Tuning
| Setting | Default | Effect |
|---|---|---|
| Healthy threshold | 5 (NLB 3) | Consecutive passes to mark a target in service |
| Unhealthy threshold | 2 | Consecutive failures to mark it out of service |
| Interval | 30s | Time between checks |
| Timeout | 5s | How long a single check may take |
| Success codes | 200 | Which HTTP codes count as healthy (ALB) |
Fast detection wants a low interval and low unhealthy threshold; stability wants the opposite — the exam often asks which to change so a flapping instance is removed faster.
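As a rough worked example of that trade-off, detection time can be approximated as interval times threshold. This is only an estimate: it ignores the per-check timeout and where in the interval the failure happens, and the function names are illustrative.

```python
def seconds_to_unhealthy(interval_s: int, unhealthy_threshold: int) -> int:
    """Approximate time for a failed target to be marked out of service."""
    return interval_s * unhealthy_threshold


def seconds_to_healthy(interval_s: int, healthy_threshold: int) -> int:
    """Approximate time for a recovered target to be marked in service."""
    return interval_s * healthy_threshold


default_detection = seconds_to_unhealthy(30, 2)  # defaults from the table: about 60 s
tuned_detection = seconds_to_unhealthy(10, 2)    # shorter interval: about 20 s
default_recovery = seconds_to_healthy(30, 5)     # defaults from the table: about 150 s
```

Lowering the interval from 30 s to 10 s cuts detection to roughly a third, at the cost of more health-check traffic and a higher chance of reacting to a transient blip.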
Frequently Tested Distinctions
- Static IP needed → NLB (never ALB). Lambda or WAF needed → ALB (never NLB).
- Preserve client IP at the target without parsing headers → NLB; with ALB the target must read the X-Forwarded-For header.
- TLS offload to free target CPU → terminate TLS at the ALB/NLB with an ACM certificate; end-to-end encryption → re-encrypt to targets or use NLB TCP pass-through.
- One static entry point + Layer 7 routing → NLB in front of ALB.
- Sticky sessions keep a user pinned to one target via the AWSALB cookie (duration-based) or an application cookie. This is useful for stateful sessions, but it can unbalance a fleet, so prefer externalizing session state to ElastiCache or DynamoDB where possible.
Keeping these mappings crisp lets you answer most ELB scenarios in seconds: read the hard constraint (static IP, Lambda, WAF, source-IP, GENEVE appliance), and the correct load balancer falls out immediately.
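The constraint-to-answer mapping can be written down as a small decision function. This is a study aid that encodes only the distinctions listed in this section, not an official selection procedure; the function and parameter names are hypothetical.

```python
def choose_load_balancer(
    *,
    protocol: str = "https",
    static_ip: bool = False,
    content_routing: bool = False,
    lambda_target: bool = False,
    waf: bool = False,
    inline_appliance: bool = False,
) -> str:
    """Map the hard constraints of a scenario to a load balancer choice."""
    if inline_appliance:
        # Third-party firewall/IDS/IPS inserted transparently (GENEVE, port 6081).
        return "GWLB"

    needs_layer7 = content_routing or lambda_target or waf
    if protocol.lower() not in {"http", "https"}:
        if needs_layer7:
            raise ValueError("Layer 7 features require HTTP or HTTPS")
        return "NLB"  # raw TCP/UDP/TLS flows

    if static_ip:
        # Fixed IPs come only from NLB; keep ALB behind it when Layer 7 is needed.
        return "NLB in front of ALB" if needs_layer7 else "NLB"
    return "ALB"
```

For example, `choose_load_balancer(static_ip=True)` returns "NLB", while adding `waf=True` returns "NLB in front of ALB".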
An enterprise client can only reach the application from an on-premises firewall that whitelists a small set of destination IP addresses. The application serves HTTPS traffic. Which load balancer best meets the requirement?
A security team must run a third-party next-generation firewall that inspects all packets entering a VPC, transparently, without changing application routing. Which service inserts the appliance into the traffic path?
During rolling deployments, some users report dropped HTTPS requests exactly when instances are removed from the ALB. What is the most direct fix?