3.1 EC2 Instance Types, Purchasing Options, and Placement Groups
Key Takeaways
- EC2 instance families map to workloads: General Purpose (M/T), Compute Optimized (C), Memory Optimized (R/X/z), Storage Optimized (I/D/H), and Accelerated Computing (P/G/Inf/Trn).
- On-Demand has no commitment; Reserved Instances and Savings Plans both save up to 72% for a 1- or 3-year commitment; Spot saves up to 90% but can be reclaimed with a 2-minute warning.
- Placement groups control physical placement: Cluster (low latency, single AZ), Spread (max 7 instances per AZ, anti-correlated failure), Partition (up to 7 partitions/AZ for HDFS/Cassandra).
- On the SAA-C03 exam, match the keyword: compute-intensive batch -> C-series, in-memory database -> R/X-series, high random IOPS local storage -> I-series, ML training -> P/Trn-series.
- Spot fits fault-tolerant, stateless, interruptible work (batch, CI/CD, rendering); never put production databases or stateful single-node apps on pure Spot.
Quick Answer: Match the workload to a family (General M/T, Compute C, Memory R/X, Storage I/D, Accelerated P/G). Use On-Demand for spiky/unknown demand, Reserved Instances or Savings Plans for steady 24/7 baseline (up to 72% off), and Spot for interruptible batch work (up to 90% off). Reach for a placement group only when the question states a latency, throughput, or hardware-isolation requirement.
Exam Context: Why Domain 3 Cares About Compute Choice
The AWS Certified Solutions Architect - Associate (SAA-C03) exam is 65 questions in 130 minutes, scored 100-1000 with a passing scaled score of 720, and costs 150 USD. Domain 3 (Design High-Performing Architectures) is 24% of the exam, so roughly 15-16 of the 65 questions hinge on picking the compute, storage, database, and networking option that meets a stated performance target at the lowest reasonable cost.
Compute questions almost always give you a workload profile (CPU-bound, memory-bound, bursty, interruptible) and ask for the best fit - the trap answers are technically functional but oversized, overpriced, or wrong for the access pattern.
EC2 Instance Naming Convention
Read an instance name left to right - e.g. m7i.xlarge:
- m = family (General Purpose)
- 7 = generation (higher is newer/faster/cheaper per unit)
- i = processor/capability suffix (i = Intel, g = AWS Graviton/ARM, a = AMD, d = local NVMe, n = enhanced networking)
- xlarge = size (vCPU and memory step)
Graviton (g) instances usually deliver the best price/performance for scale-out, recompilable workloads - watch for "reduce cost without losing performance" cues.
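The left-to-right reading rule above can be sketched as a small parser. This is only an illustration, not an AWS API: `parse_instance_type` and `SUFFIX_MEANINGS` are hypothetical names, the suffix meanings are the ones listed in this section rather than an exhaustive list, and the pattern assumes the simple family + generation + suffixes + "." + size shape, so names containing hyphens are out of scope.

```python
import re

# Suffix meanings as listed in this section (assumed, not exhaustive).
SUFFIX_MEANINGS = {
    "i": "Intel",
    "g": "AWS Graviton (ARM)",
    "a": "AMD",
    "d": "local NVMe storage",
    "n": "enhanced networking",
}

# family letters, generation digits, optional suffix letters, ".", size
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z]*)\.([a-z0-9]+)$")


def parse_instance_type(name):
    """Split a name such as 'm7i.xlarge' into its four parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type shape: {name!r}")
    family, generation, suffixes, size = match.groups()
    return {
        "family": family,
        "generation": int(generation),
        "suffixes": [SUFFIX_MEANINGS.get(s, "unknown") for s in suffixes],
        "size": size,
    }


print(parse_instance_type("m7i.xlarge"))
# {'family': 'm', 'generation': 7, 'suffixes': ['Intel'], 'size': 'xlarge'}
```

Multi-letter suffixes are read one letter at a time, so `x2idn.32xlarge` decodes as Intel, local NVMe, and enhanced networking.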
Instance Families
| Family | Category | Examples | Use Cases |
|---|---|---|---|
| T | General (burstable) | t3.micro, t4g.medium | Dev/test, low-traffic sites, microservices with idle periods |
| M | General (balanced) | m7i.large, m7g.xlarge | Web/app servers, small/mid databases, backend APIs |
| C | Compute optimized | c7g.xlarge, c7i.2xlarge | Batch, HPC, ad serving, ML inference, gaming servers |
| R | Memory optimized | r7g.2xlarge, r7i.4xlarge | In-memory caches, real-time analytics, mid SAP |
| X / z | Extreme memory | x2idn.32xlarge, z1d | SAP HANA, huge in-memory databases, EDA |
| I | Storage (NVMe SSD) | i4i.xlarge | NoSQL, high random IOPS, transactional logs |
| D / H | Storage (HDD dense) | d3.xlarge, h1.2xlarge | MapReduce, distributed file systems, log warehouses |
| P | GPU compute | p5.48xlarge | Deep-learning training, large-scale simulation |
| G | GPU graphics | g5.xlarge | Rendering, transcoding, game streaming, light ML |
| Inf / Trn | ML accelerators | inf2.xlarge, trn1.32xlarge | Cost-efficient ML inference / training on custom silicon |
On the Exam: "CPU-bound video transcoding" -> C-series. "In-memory caching of a large dataset" -> R-series. "Burstable web app idle most of the day" -> T-series. "Train a large neural network" -> P or Trn-series.
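As a study aid, the keyword cues above can be written as a lookup table. The mappings come from the table and exam cues in this section; the `FAMILY_BY_CUE` dictionary, the `family_for` helper, and the fallback to M-series when no cue matches are assumptions made for illustration.

```python
# Workload cue -> instance family, taken from the cues in this section.
FAMILY_BY_CUE = {
    "cpu-bound": "C",
    "in-memory": "R",
    "burstable": "T",
    "high random iops": "I",
    "ml training": "P or Trn",
}


def family_for(description):
    """Return the family for the first cue found in the description."""
    text = description.lower()
    for cue, family in FAMILY_BY_CUE.items():
        if cue in text:
            return family
    # Assumption: with no specific cue, default to balanced General Purpose.
    return "M"


print(family_for("CPU-bound video transcoding"))  # C
```

Real exam questions rarely use these exact phrases, so treat the table as a memory aid rather than a classifier.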
EC2 Purchasing Options
| Option | Commitment | Savings | Best For |
|---|---|---|---|
| On-Demand | None | Baseline | Spiky/unknown demand, short-lived, first month of a new app |
| Reserved Instance | 1 or 3 yr | Up to 72% | Steady-state, fixed family workloads (core DB tier) |
| Savings Plans | 1 or 3 yr ($/hr) | Up to 72% | Flexible across family/Region/OS, and Lambda/Fargate |
| Spot | None | Up to 90% | Fault-tolerant, stateless, interruptible jobs |
| Dedicated Host | On-Demand/Reserved | Varies | Per-socket/per-core BYOL licensing, compliance |
| Dedicated Instance | On-Demand | Small premium | Hardware not shared across accounts |
| Capacity Reservation | None | None (billed at the On-Demand rate) | Guarantee capacity in a specific AZ (e.g. DR drills) |
Reserved Instance flavors: Standard RI gives the deepest discount but locks the instance type; Convertible RI allows changing family/OS/tenancy for a smaller discount. Payment options, from largest to smallest discount: All Upfront, Partial Upfront, No Upfront.
Savings Plans flavors: a Compute Savings Plan is the most flexible (covers EC2 in any family and Region, plus Fargate and Lambda) and saves up to 66%; an EC2 Instance Savings Plan locks the family and Region but flexes size and OS, in exchange for the deepest discount (up to 72%).
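The choice between these options can be sketched as a short decision function. This is a simplification under stated assumptions: `choose_purchase_option` and its three yes/no parameters are hypothetical, and real decisions also weigh licensing, tenancy, and capacity guarantees.

```python
def choose_purchase_option(steady_state, interruptible, may_change_family_or_region):
    """Pick a purchasing option from three yes/no workload traits."""
    if interruptible:
        # Fault-tolerant work that can be reclaimed at short notice.
        return "Spot"
    if not steady_state:
        # Spiky or unknown demand: no commitment.
        return "On-Demand"
    if may_change_family_or_region:
        # Steady baseline that may evolve: keep the flexibility.
        return "Compute Savings Plan"
    # Steady baseline with a fixed family and Region.
    return "EC2 Instance Savings Plan or Standard RI"


print(choose_purchase_option(steady_state=True, interruptible=False,
                             may_change_family_or_region=True))
# Compute Savings Plan
```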
Spot details: AWS reclaims Spot capacity with a 2-minute interruption notice delivered via instance metadata or an EventBridge event. Use a Spot Fleet or EC2 Auto Scaling with mixed instances to spread across pools. Spot Block (fixed-duration) is deprecated - design for interruption instead.
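A minimal sketch of checking for the interruption notice from inside an instance follows. It assumes the instance metadata path `spot/instance-action` and the IMDSv2 token headers as commonly documented for EC2; the function names are hypothetical, and `fetch_instance_action` only works when run on an EC2 instance. The parsing step is separated out so it can be tried anywhere with a sample notice.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout=1.0):
    """Return the raw notice body, or None if no interruption is scheduled."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    action_req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(action_req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no notice has been issued yet
            return None
        raise


def seconds_until_interruption(body, now):
    """Parse the notice JSON and return the seconds left before the action."""
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return (when.replace(tzinfo=timezone.utc) - now).total_seconds()


# Sample notice body (illustrative values), checked two minutes before the action.
sample = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
now = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)
print(seconds_until_interruption(sample, now))  # 120.0
```

A worker would typically poll every few seconds and, once a notice appears, checkpoint and drain within the remaining time.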
Placement Groups
| Type | Layout | Use Case | Limit |
|---|---|---|---|
| Cluster | Same rack, single AZ | Lowest latency / highest per-flow throughput (HPC, tight MPI) | One AZ; loses redundancy |
| Spread | Distinct racks/hardware | Isolate a few critical instances from correlated failure | Max 7 instances per AZ |
| Partition | Logical partitions, separate racks | Rack-aware distributed stores (HDFS, HBase, Cassandra, Kafka) | Up to 7 partitions per AZ |
On the Exam: "Lowest network latency between nodes of an HPC cluster" -> Cluster. "Maximize availability for a small set of critical instances" -> Spread. "Distributed database that must survive a rack failure" -> Partition.
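The per-AZ limits in the table translate into simple capacity checks. The sketch below is illustrative only: the helper names and the example AZ names are hypothetical, and the limit of 7 is the one stated in the table above.

```python
SPREAD_MAX_INSTANCES_PER_AZ = 7      # from the table above
PARTITION_MAX_PARTITIONS_PER_AZ = 7  # from the table above


def max_spread_instances(az_count):
    """Largest Spread group that fits when using az_count Availability Zones."""
    return SPREAD_MAX_INSTANCES_PER_AZ * az_count


def fits_spread_group(instances_per_az):
    """True if every AZ stays within the per-AZ Spread limit."""
    return all(count <= SPREAD_MAX_INSTANCES_PER_AZ
               for count in instances_per_az.values())


print(max_spread_instances(3))                                # 21
print(fits_spread_group({"us-east-1a": 7, "us-east-1b": 5}))  # True
print(fits_spread_group({"us-east-1a": 8}))                   # False
```

If a question needs more isolated instances than this allows, that is usually the cue to consider a Partition group instead.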
Worked Cost Example
Suppose a web tier runs four m7i.large instances 24/7 for the whole year, plus extra capacity that scales up only during a three-hour nightly batch window. The cost-optimal architecture splits the workload by usage pattern rather than buying one model for everything. The four always-on instances are predictable steady-state demand, so commit them with a one- or three-year Compute Savings Plan (up to 66% off). Prefer the Compute Savings Plan over a Standard RI here: its discount ceiling is lower than the 72% of a Standard RI or EC2 Instance Savings Plan, but it still flexes across family, Region, and even Fargate if the fleet evolves.
The nightly burst capacity is interruptible batch work, so run it on Spot (up to 90% off) behind an Auto Scaling group with a mixed-instances policy that draws from several instance pools to reduce the chance of simultaneous interruption.
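The arithmetic behind this split can be sketched as follows. Every number in the block is a hypothetical placeholder chosen for illustration: the hourly rate is not a real m7i.large price, the discounts are assumed effective rates well below the "up to" ceilings, and the burst fleet size is invented because the scenario does not state one.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def annual_cost(instances, hours, hourly_rate, discount=0.0):
    """Cost of running `instances` for `hours` each at a discounted rate."""
    return instances * hours * hourly_rate * (1.0 - discount)


# Hypothetical placeholders, NOT real AWS prices or guaranteed discounts.
ON_DEMAND_RATE = 0.10         # USD per instance-hour (assumed)
SAVINGS_PLAN_DISCOUNT = 0.30  # assumed effective Savings Plan discount
SPOT_DISCOUNT = 0.70          # assumed average Spot discount
BURST_INSTANCES = 6           # assumed; the scenario gives no number
BURST_HOURS = 3 * 365         # three-hour window every night

all_on_demand = (
    annual_cost(4, HOURS_PER_YEAR, ON_DEMAND_RATE)
    + annual_cost(BURST_INSTANCES, BURST_HOURS, ON_DEMAND_RATE)
)
split_model = (
    annual_cost(4, HOURS_PER_YEAR, ON_DEMAND_RATE, SAVINGS_PLAN_DISCOUNT)
    + annual_cost(BURST_INSTANCES, BURST_HOURS, ON_DEMAND_RATE, SPOT_DISCOUNT)
)

print(f"All On-Demand:       ${all_on_demand:,.2f}")
print(f"Savings Plan + Spot: ${split_model:,.2f}")
```

Because the always-on instances account for most of the instance-hours, the commitment on the baseline tends to drive the total more than the Spot discount on the short nightly window.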
Common Traps to Avoid
- Oversizing the family. A bursty, mostly-idle web server does not need an M-series instance - a T-series burstable instance with CPU credits is cheaper. Read the utilization profile, not just the peak.
- Spot for stateful single nodes. Never recommend pure Spot for a primary database or any single-node stateful service; the 2-minute reclaim notice will eventually cost you data or availability.
- Confusing Cluster and Spread. Cluster optimizes for latency but sacrifices redundancy (one AZ, one rack); Spread optimizes for failure isolation but caps at 7 instances per AZ. The question's keyword - "latency" vs "availability" - decides the answer.
- Reserved Instance vs Savings Plan. When the workload may change instance family or move Regions, the flexible Savings Plan beats a locked Standard RI even though the headline discount looks similar.
Practice Questions
A genomics team runs a tightly coupled HPC simulation where nodes exchange large messages and demand the lowest possible inter-node network latency and highest throughput. Which placement strategy meets this requirement?
A nightly data-analytics job runs for about three hours, is fully checkpointed, and can restart from its last checkpoint if interrupted. Which purchasing option minimizes cost?
An application keeps a large dataset entirely in an in-memory Redis cache and is constrained by available RAM rather than CPU. Which EC2 instance family is the best fit?