4.1 Compute Cost Optimization — Right-Sizing, RIs, Savings Plans, and Spot
Key Takeaways
- Domain 4 is 20% of the SAA-C03 exam (65 questions, 130 minutes, 720/1000 to pass), so about 10 of the 50 scored questions test cost-optimized design.
- Right-size first (AWS Compute Optimizer), then auto-scale, then apply a pricing model — never buy commitments for an unmeasured workload.
- Reserved Instances and Savings Plans both save up to 72% on EC2; Savings Plans are the more flexible answer for mixed EC2/Lambda/Fargate fleets.
- Spot Instances save up to 90% but can be reclaimed with a 2-minute warning — only correct for fault-tolerant, stateless, or checkpointable work.
- Lambda eliminates idle compute charges for bursty or sporadic workloads where steady EC2 would sit unused; Fargate Spot discounts interruption-tolerant container tasks.
Why Domain 4 Matters
Design Cost-Optimized Architectures is 20% of the SAA-C03 exam. With 65 questions in 130 minutes (15 unscored, 50 scored) and a scaled passing score of 720/1000, roughly 10 scored items (20% of 50) live here. Questions rarely ask "what is cheapest" in the abstract. They describe a traffic shape (steady, spiky, interruptible, or part-time) and ask for the cheapest model that still meets the stated availability or latency requirements.
The Optimization Order (Memorize This)
Apply these in sequence — buying a commitment before right-sizing locks in waste:
1. Right-size: stop over-provisioning vCPU/RAM.
2. Auto Scale: never pay for idle capacity.
3. Pricing model: match commitment to the workload's predictability.
4. Architecture: move bursty work to serverless (Lambda, Fargate).
Right-Sizing Tools and Signals
| Tool | What it inspects |
|---|---|
| AWS Compute Optimizer | ML right-sizing for EC2, Auto Scaling groups, EBS, Lambda memory, Fargate |
| AWS Cost Explorer | RI/Savings Plan recommendations from billing + utilization |
| CloudWatch | CPU, network, disk; memory needs the CloudWatch agent |
| Trusted Advisor | Flags idle/underutilized EC2 (full checks need Business support or higher) |

| Signal | Right-sizing action |
|---|---|
| CPU steadily under 20% | Drop one instance size |
| Memory under 30% on R-series | Move to general-purpose M-series (or compute-optimized C-series if the workload is CPU-bound) |
| T-series CPU credits piling up | Instance is oversized for its bursty load |
Trap: CloudWatch does not report memory by default. If a question says "right-size based on memory," the answer involves installing the CloudWatch agent first.
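As a minimal sketch of what "installing the CloudWatch agent" involves, the Python below builds an agent configuration that publishes memory utilization. The `mem` section, the `mem_used_percent` measurement, and the `${aws:InstanceId}` dimension follow the agent's configuration schema for Linux; the function name and the 60-second interval are illustrative assumptions, not part of the original text.

```python
import json


def build_agent_config(interval_seconds: int = 60) -> dict:
    """Return a CloudWatch agent config that publishes memory utilization.

    build_agent_config is a hypothetical helper name; the interval is an
    assumed default, not a recommendation.
    """
    return {
        "metrics": {
            # Tag each data point with the instance that produced it.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                }
            },
        }
    }


if __name__ == "__main__":
    # Print JSON that could be saved as the agent's configuration file.
    print(json.dumps(build_agent_config(), indent=2))
```

Once the agent runs with a configuration like this, memory data appears in CloudWatch (by default under the `CWAgent` namespace) and can feed right-sizing decisions alongside the built-in CPU metrics.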
Pricing Model Comparison
| Model | Max discount | Commitment | Best for |
|---|---|---|---|
| On-Demand | 0% | None | Spiky, short-lived, unpredictable |
| Reserved Instance | Up to 72% | 1 or 3 yr | 24/7 steady workloads (databases) |
| Savings Plan | Up to 72% | 1 or 3 yr ($/hr) | Mixed EC2/Lambda/Fargate fleets |
| Spot | Up to 90% | None | Fault-tolerant batch, CI/CD, analytics |
| Dedicated Host | Varies | On-Demand/Reserved | BYOL socket-based licensing |
Reserved Instances vs. Savings Plans
- Standard RI: highest discount, locked to instance family and Region; can be modified to a different AZ or size within the family (size flexibility applies to regional Linux RIs with default tenancy).
- Convertible RI: slightly lower discount, exchangeable for a different family, OS, or tenancy.
- Compute Savings Plan: most flexible; covers EC2, Lambda, and Fargate across any family, size, OS, and Region.
- EC2 Instance Savings Plan: higher discount than a Compute Savings Plan but locked to one family and Region.
On the Exam: "Most flexible compute savings" → Compute Savings Plan. "Steady database, highest discount, won't change instance" → 3-year Standard RI, All Upfront.
Worked Example: Choosing a Commitment
Suppose a fleet runs a constant baseline of about 20 m6i instances plus daytime bursts. The cheapest layered design is: cover the always-on baseline with a Compute Savings Plan sized to that 20-instance floor, let Auto Scaling add On-Demand or Spot above the floor for daytime peaks, and never commit to the peak. Committing to the peak (a common wrong answer) pays the committed rate around the clock for capacity that is needed only a few hours a day. A second classic mistake is buying RIs before right-sizing: if Compute Optimizer later recommends a smaller family, a Standard RI is stranded, while a Convertible RI or Compute Savings Plan would have adapted.
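The arithmetic behind this example can be sketched in a few lines of Python. Every number here is an assumption chosen for illustration: a hypothetical On-Demand rate of $0.10 per instance-hour, a 30% commitment discount, a peak of 50 instances, and 8 burst hours per day. None of these figures come from AWS pricing.

```python
HOURS_PER_DAY = 24


def daily_cost_commit_to_baseline(baseline, peak, burst_hours,
                                  on_demand_rate, discount):
    """Commit to the always-on floor; pay On-Demand for the burst above it."""
    committed = baseline * HOURS_PER_DAY * on_demand_rate * (1 - discount)
    burst = (peak - baseline) * burst_hours * on_demand_rate
    return committed + burst


def daily_cost_commit_to_peak(peak, on_demand_rate, discount):
    """Commit to peak capacity; the discounted rate is billed all 24 hours."""
    return peak * HOURS_PER_DAY * on_demand_rate * (1 - discount)


def break_even_burst_hours(discount):
    """Daily hours above which committing to an instance beats On-Demand."""
    return HOURS_PER_DAY * (1 - discount)


if __name__ == "__main__":
    # Illustrative inputs only (see the assumptions stated above).
    layered = daily_cost_commit_to_baseline(20, 50, 8, 0.10, 0.30)
    peak = daily_cost_commit_to_peak(50, 0.10, 0.30)
    print(f"commit to baseline: ${layered:.2f}/day")
    print(f"commit to peak:     ${peak:.2f}/day")
```

Under these assumed inputs the layered design is cheaper. The `break_even_burst_hours` helper shows why: with a 30% discount, a commitment only pays off for capacity that runs more than about 16.8 hours a day, so an 8-hour burst belongs on On-Demand or Spot.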
Spot and Serverless
Spot capacity is reclaimed when AWS needs it back, with a 2-minute interruption notice delivered through Amazon EventBridge (formerly CloudWatch Events) and instance metadata. Spot is correct only for fault-tolerant, stateless, or checkpointable work: batch processing, big-data analytics, CI/CD runners, rendering, and containerized jobs that can restart. It is the wrong answer for a primary database, a stateful session server, or anything that cannot tolerate sudden loss.
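To make the instance-metadata path concrete, here is a hedged Python sketch of how a worker might poll for the notice and work out how long it has left to checkpoint. It assumes IMDSv2 (token-based) access and the `spot/instance-action` metadata path, which returns 404 until an interruption is scheduled; the function names and the 60-second token TTL are illustrative choices.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout: float = 2.0):
    """Return the raw interruption notice, or None if none is scheduled.

    Only works when run on an EC2 instance.
    """
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    notice_req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(notice_req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption scheduled yet
            return None
        raise


def seconds_until_interruption(notice_body: str, now: datetime) -> float:
    """Parse a notice such as {"action": "terminate", "time": "...Z"}."""
    notice = json.loads(notice_body)
    deadline = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()
```

A worker loop would call `fetch_instance_action()` every few seconds and, on a non-`None` result, flush its checkpoint before the deadline. Subscribing to the EventBridge event is the alternative when the reaction should happen outside the instance.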
Reduce interruption impact with three techniques the exam expects:
- Diversification — request many instance types across many Availability Zones so AWS can fulfill capacity from the deepest pools.
- EC2 Fleet / Spot Fleet — declare a target capacity and mix On-Demand baseline with Spot burst in one request.
- capacity-optimized allocation — let AWS place Spot in the pools least likely to be interrupted, rather than purely cheapest.
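These three techniques can be combined in one request. The sketch below uses an Auto Scaling group mixed instances policy rather than EC2 Fleet or Spot Fleet, purely because its request shape is compact; the key names match the `MixedInstancesPolicy` structure of the Auto Scaling API. The launch template name, the instance types, and the base capacity of 2 are hypothetical values.

```python
def build_mixed_instances_policy(launch_template_name, instance_types,
                                 on_demand_base):
    """Build a MixedInstancesPolicy dict: On-Demand floor, Spot above it."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # Diversification: several interchangeable instance types.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            # Always-on baseline served by On-Demand capacity.
            "OnDemandBaseCapacity": on_demand_base,
            # Everything above the baseline comes from Spot.
            "OnDemandPercentageAboveBaseCapacity": 0,
            # Prefer the pools least likely to be interrupted.
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }


if __name__ == "__main__":
    # "web-lt" and the instance types are placeholder values.
    policy = build_mixed_instances_policy(
        "web-lt", ["m6i.large", "m5.large", "m5a.large"], on_demand_base=2
    )
    print(policy["InstancesDistribution"])
```

A dict like this would be passed as the `MixedInstancesPolicy` argument when creating the group, together with subnets in several Availability Zones so that the diversification spans zones as well as instance types.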
For sporadic event-driven work, Lambda bills per request plus duration (memory allocated times run time, in GB-seconds, metered per millisecond), so you pay nothing while idle, unlike an always-on EC2 instance that bills 24/7 even at 1% CPU. Fargate removes server management for containers, and Fargate Spot offers discounts of up to 70% for interruption-tolerant tasks. A useful rule: as request volume climbs and becomes steady, the cost lines cross and committed EC2 (with RIs or a Savings Plan) eventually beats per-invocation Lambda. Very high, predictable throughput is therefore a signal to move back toward reserved EC2 or containers, not to scale Lambda indefinitely.
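The crossover can be estimated with a rough model. The sketch below treats the Lambda rates ($0.20 per million requests and $0.0000166667 per GB-second) and the EC2 figure (a hypothetical $0.10 per hour, 730 hours per month) as assumptions; actual prices vary by Region and architecture, and the free tier is ignored.

```python
def lambda_monthly_cost(requests, duration_s, memory_gb,
                        price_per_million_requests=0.20,
                        price_per_gb_second=0.0000166667):
    """Estimate monthly Lambda cost from request count and duration.

    The default prices are assumed, illustrative rates.
    """
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def ec2_monthly_cost(hourly_rate, instances=1, hours=730):
    """Always-on EC2 cost: billed every hour regardless of utilization."""
    return hourly_rate * instances * hours


def break_even_requests(ec2_cost, duration_s, memory_gb):
    """Monthly request count at which Lambda cost equals the EC2 cost."""
    per_request = lambda_monthly_cost(1, duration_s, memory_gb)
    return ec2_cost / per_request


if __name__ == "__main__":
    ec2 = ec2_monthly_cost(0.10)  # hypothetical hourly rate
    crossover = break_even_requests(ec2, duration_s=30, memory_gb=1.0)
    print(f"EC2: ${ec2:.2f}/month; Lambda is cheaper below "
          f"{crossover:,.0f} requests/month")
```

Below the crossover, idle-free billing wins; above it, the always-on instance is cheaper, provided one instance can actually absorb that volume. The model deliberately ignores concurrency limits and operational overhead, so it is a screening estimate rather than a sizing tool.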
Common Compute-Cost Traps
| Tempting wrong answer | Why it fails |
|---|---|
| Spot for a production database | 2-minute reclamation breaks durability/availability |
| Standard RI before right-sizing | Locks in oversized instances; a Standard RI cannot be exchanged for another family |
| Committing to peak capacity | Pays the committed rate 24/7 for capacity used a few hours/day |
| Lambda for steady high-volume compute | Per-invocation cost eventually exceeds reserved EC2 |
| Larger instance to fix bursty T-series CPU | Burst credits accruing means it is already oversized |
A company runs a transactional database 24/7 on an m5.2xlarge EC2 instance with steady utilization and no plans to change the instance type. Which pricing model gives the HIGHEST cost savings?
A platform team wants the deepest commitment-based discount that still applies automatically across EC2, AWS Lambda, and Fargate as their fleet mix changes month to month. Which option fits BEST?
An application resizes user-uploaded images. Each job runs about 30 seconds, and uploads arrive sporadically — anywhere from 0 to 1,000 per hour with long idle gaps. Which compute choice is MOST cost-effective?