2.3 Multi-AZ and Multi-Region Architectures

Key Takeaways

  • Multi-AZ protects against the failure of a single Availability Zone within one Region; Multi-Region protects against an entire Region outage and serves global users with low latency.
  • RDS Multi-AZ (single standby) uses synchronous replication and typically fails over in 60–120 seconds; an RDS Multi-AZ DB cluster (two readable standbys) typically fails over in about 35 seconds; Aurora keeps six copies of its data across three AZs.
  • Aurora Global Database replicates cross-Region with sub-second lag and promotes a secondary in under a minute; DynamoDB Global Tables are active-active multi-Region with last-writer-wins.
  • S3 and DynamoDB automatically replicate across at least three AZs in a Region; cross-Region replication (S3 CRR, RDS cross-Region replica) must be explicitly configured and is asynchronous.
  • Route 53 failover, latency-based, geolocation, geoproximity, and weighted routing policies steer traffic across Regions using health checks.

Last updated: June 2026

Quick Answer: Multi-AZ = high availability inside one Region (survives an AZ outage). Multi-Region = disaster recovery plus global performance (survives a Region outage). Choose Multi-AZ as the production baseline; add Multi-Region when RPO/RTO or data-sovereignty requirements demand it.

Multi-AZ: High Availability Within a Region

An Availability Zone (AZ) is one or more discrete data centers with independent power, cooling, and networking. Spreading resources across two or more AZs means a single-AZ outage does not take down the application. The exam expects you to know each service's built-in behavior.

Service | Multi-AZ behavior
EC2 + ALB | Place instances in 2+ AZs; ALB routes around the failed AZ
RDS Multi-AZ | Synchronous standby in another AZ; automatic failover in 60–120s
Aurora | Data replicated 6 ways across 3 AZs; up to 15 read replicas
ElastiCache (Redis) | Multi-AZ with automatic failover (Memcached has none)
S3 / DynamoDB / EFS | Automatically span 3+ AZs in the Region
NAT Gateway | One per AZ for true Multi-AZ resilience
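
The reasoning behind spreading across AZs can be sketched with a little arithmetic. The example below is illustrative only: it assumes AZ failures are independent, and the 99.5% per-AZ availability figure and the function name are made up for the example, not published AWS numbers.

```python
def combined_availability(per_az_availability: float, az_count: int) -> float:
    """Probability that at least one AZ is up, assuming independent failures."""
    if not 0.0 <= per_az_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if az_count < 1:
        raise ValueError("need at least one AZ")
    # The application is down only if every AZ is down at the same time.
    all_down = (1.0 - per_az_availability) ** az_count
    return 1.0 - all_down


if __name__ == "__main__":
    # Hypothetical per-AZ availability of 99.5%, chosen only for illustration.
    for n in (1, 2, 3):
        print(f"{n} AZ(s): {combined_availability(0.995, n):.7f}")
```

Under these assumptions, each extra AZ multiplies the chance of a total outage by the per-AZ failure probability, which is why two AZs is treated as the production minimum.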

RDS Failover Detail

Attribute | Multi-AZ instance | Multi-AZ DB cluster
Standbys | 1 (not readable) | 2 (readable)
Replication | Synchronous | Semi-synchronous
Failover | 60–120 seconds | ~35 seconds
Read scaling | No | Yes

On the Exam: "Automatic database failover with minimal downtime" → RDS Multi-AZ. "Failover and read scaling" → Aurora or the RDS Multi-AZ DB cluster.
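
The comparison above can be turned into a small selection helper. This is a sketch for study purposes only: the class, the function name, and the choice to use the upper failover figure from the table (120 and 35 seconds) are assumptions of the example, not an AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdsOption:
    name: str
    typical_failover_seconds: int  # upper figure from the table above
    readable_standbys: bool


# Values copied from the RDS failover table; treat them as typical, not guaranteed.
OPTIONS = [
    RdsOption("Multi-AZ instance", 120, False),
    RdsOption("Multi-AZ DB cluster", 35, True),
]


def pick_rds_option(max_failover_seconds: int, needs_read_scaling: bool) -> list[str]:
    """Return the names of options meeting both constraints, fastest failover first."""
    matches = [
        o for o in OPTIONS
        if o.typical_failover_seconds <= max_failover_seconds
        and (o.readable_standbys or not needs_read_scaling)
    ]
    matches.sort(key=lambda o: o.typical_failover_seconds)
    return [o.name for o in matches]


if __name__ == "__main__":
    print(pick_rds_option(max_failover_seconds=120, needs_read_scaling=False))
    print(pick_rds_option(max_failover_seconds=60, needs_read_scaling=True))
```

An empty result means neither RDS option in the table meets the stated constraints.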

Multi-Region: Disaster Recovery and Global Reach

Multi-Region is needed when you must survive a full Region outage, deliver sub-100 ms latency to users on other continents, or keep data inside a specific country for data sovereignty. Running a full second Region can roughly double infrastructure cost, so it is not the default.

Service | Multi-Region capability
S3 | Cross-Region Replication (CRR), asynchronous
RDS | Cross-Region read replica, asynchronous
Aurora | Global Database: sub-second cross-Region lag
DynamoDB | Global Tables: active-active, multi-Region writes
CloudFront | Global edge network by design
Route 53 | Global DNS with health checks and failover

Aurora Global Database vs. DynamoDB Global Tables

  • Aurora Global Database: one primary Region for writes, up to five secondary read-only Regions, sub-second replication lag, and promotion of a secondary in under a minute during a Regional disaster. Best when you need a relational engine with a tiny RPO.
  • DynamoDB Global Tables: active-active. Every Region accepts reads and writes, and changes typically replicate to the other Regions in under a second, with last-writer-wins conflict resolution. Best for globally distributed apps needing local low-latency writes everywhere.
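
Last-writer-wins is easiest to see with two Regions updating the same item at nearly the same time. The sketch below is a toy model, not DynamoDB's implementation: the `Write` class, the millisecond timestamps, and the tie-break on Region name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    region: str
    timestamp_ms: int
    value: str


def resolve_last_writer_wins(writes: list[Write]) -> Write:
    """Return the write with the latest timestamp.

    Ties are broken by Region name, an arbitrary rule chosen for this
    example so the result is deterministic.
    """
    if not writes:
        raise ValueError("no writes to resolve")
    return max(writes, key=lambda w: (w.timestamp_ms, w.region))


if __name__ == "__main__":
    conflicting = [
        Write("us-east-1", 1_000, "shipped"),
        Write("eu-west-1", 1_005, "cancelled"),
    ]
    winner = resolve_last_writer_wins(conflicting)
    print(f"{winner.region} wins with value {winner.value!r}")
```

The earlier write is silently discarded, which is the trade-off to keep in mind: active-active gives local write latency, but concurrent updates to the same item do not merge.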

Route 53 Routing Policies

Policy | Use case
Failover | Active-passive: send to secondary when primary health check fails
Latency-based | Route each user to the lowest-latency Region
Geolocation | Route by the user's continent/country (sovereignty, licensing)
Geoproximity | Route by distance with adjustable traffic bias
Weighted | Split by percentage (e.g., 90/10 canary)

On the Exam: "Global users see high latency" → latency-based routing + CloudFront. "Application must survive a Region outage" → failover routing with health checks to a standby Region. Pair Route 53 health checks with CloudWatch alarms to trigger automatic DNS failover. Remember S3 CRR and RDS cross-Region replicas are asynchronous, so they carry a larger RPO than Aurora Global Database's sub-second lag.
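
The failover, latency-based, and weighted policies can be modeled in a few lines to make their differences concrete. This is a toy simulation, not the Route 53 API: the `Endpoint` class, its fields, and the function names are hypothetical, and real Route 53 behavior when every endpoint is unhealthy is not modeled.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    region: str
    healthy: bool = True   # stands in for a health check result
    weight: int = 0        # used only by weighted routing
    latency_ms: float = 0.0  # measured from the user's location


def failover(primary: Endpoint, secondary: Endpoint) -> Endpoint:
    """Active-passive: use the primary while healthy, otherwise the secondary."""
    return primary if primary.healthy else secondary


def latency_based(endpoints: list[Endpoint]) -> Endpoint:
    """Pick the healthy endpoint with the lowest latency for this user."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoints")
    return min(healthy, key=lambda e: e.latency_ms)


def weighted(endpoints: list[Endpoint], rng=random) -> Endpoint:
    """Pick a healthy endpoint at random, in proportion to its weight."""
    healthy = [e for e in endpoints if e.healthy and e.weight > 0]
    if not healthy:
        raise RuntimeError("no healthy weighted endpoints")
    return rng.choices(healthy, weights=[e.weight for e in healthy], k=1)[0]


if __name__ == "__main__":
    primary = Endpoint("us-east-1", healthy=False)
    standby = Endpoint("us-west-2")
    print("failover ->", failover(primary, standby).region)

    canary = [Endpoint("us-east-1", weight=90), Endpoint("us-west-2", weight=10)]
    print("weighted ->", weighted(canary).region)
```

In all three functions an unhealthy endpoint is simply excluded, which mirrors why health checks are the common ingredient across the routing policies.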

Edge, Global Acceleration, and Subtle Distinctions

Two global services round out the Multi-Region picture. Amazon CloudFront caches content at hundreds of edge locations and is the answer for serving static and dynamic content with low latency worldwide; it also shields the origin and integrates with AWS WAF and Shield. AWS Global Accelerator gives you two static anycast IP addresses and routes user traffic over the AWS backbone to the nearest healthy Regional endpoint, with near-instant failover. That makes it ideal for non-HTTP, gaming, or VoIP workloads that need fast Regional failover, where CloudFront (an HTTP cache) does not fit.

Need | Service
Cache HTTP content globally | CloudFront
Static anycast IPs + fast Regional failover for TCP/UDP | Global Accelerator
DNS-level Region routing | Route 53

High-Stakes Exam Distinctions

  • Multi-AZ vs. read replica: Multi-AZ is for availability (synchronous standby, automatic failover, standby not readable on single-instance Multi-AZ). A read replica is for read scaling (asynchronous, readable, no automatic promotion unless you build it). Do not confuse them — a question asking for failover wants Multi-AZ; one asking to offload read queries wants a read replica.
  • S3 durability and AZ spread: S3 stores objects redundantly across at least three AZs in a Region (eleven nines of durability), but that is not Multi-Region — surviving a Region outage still requires Cross-Region Replication.
  • Single point of failure hunting: a single NAT Gateway, a single-AZ subnet, or an EC2 instance with no Auto Scaling group (ASG) are classic SPOFs the exam plants; the fix is to spread resources across AZs (including one NAT Gateway per AZ) and run instances in an ASG behind an ELB.
  • Sovereignty: when data must remain in a country, use Route 53 geolocation routing plus Regional resources in that country, not a global active-active design that could serve the data from elsewhere.
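
The SPOF hunt in the list above can be practiced mechanically. The sketch below assumes a hypothetical inventory format of `(resource kind, AZ)` pairs and one simple rule, that any resource kind present in only one AZ is a single point of failure; neither the format nor the function name comes from an AWS tool.

```python
from collections import defaultdict


def find_spofs(resources: list[tuple[str, str]]) -> list[str]:
    """Flag every resource kind that exists in exactly one AZ.

    `resources` is a list of (kind, availability_zone) pairs, a made-up
    inventory format used only for this example.
    """
    azs_by_kind: dict[str, set[str]] = defaultdict(set)
    for kind, az in resources:
        azs_by_kind[kind].add(az)

    findings = []
    for kind in sorted(azs_by_kind):
        azs = azs_by_kind[kind]
        if len(azs) == 1:
            findings.append(f"{kind} exists only in {next(iter(azs))}")
    return findings


if __name__ == "__main__":
    inventory = [
        ("ec2_instance", "us-east-1a"),
        ("ec2_instance", "us-east-1b"),
        ("nat_gateway", "us-east-1a"),  # only one NAT Gateway: a classic SPOF
    ]
    for finding in find_spofs(inventory):
        print(finding)
```

The same habit applies when reading an exam scenario: list each component, note which AZs it lives in, and look for anything that appears only once.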

The mental model for the whole section: AZ failure is handled inside a Region with Multi-AZ; Region failure and global latency are handled across Regions with replication (Aurora Global Database, DynamoDB Global Tables, S3 CRR) and traffic steering (Route 53, CloudFront, Global Accelerator).

Test Your Knowledge

A relational workload must survive a complete AWS Region outage with less than one second of potential data loss and recover in under a minute. Which solution fits best?

Test Your Knowledge

A global application must accept low-latency writes in North America, Europe, and Asia simultaneously, with each Region serving its local users. Which database meets this requirement?

Test Your Knowledge

An active-passive deployment must automatically send users to a standby Region only when the primary Region's endpoint becomes unhealthy. Which Route 53 routing policy should be used?
