A web application experiences predictable daily traffic spikes at 9 AM and traffic drops at 6 PM. Which scaling approach ensures instances are ready BEFORE the spike?

Scheduled scaling to add capacity at 8:45 AM. Scheduled scaling proactively adds capacity at a predetermined time, ensuring instances are ready before the traffic spike arrives. Reactive scaling (target tracking, step) only responds after demand increases, causing latency during ramp-up.

What happens when an EC2 instance in an Auto Scaling group fails its health check?

The instance is terminated and a new one is launched to replace it. Auto Scaling automatically terminates unhealthy instances and launches new replacements. This self-healing behavior is a key resilience feature. The new instance is registered with the load balancer automatically.

Which Auto Scaling policy type is the SIMPLEST to configure and recommended by AWS for most use cases?

Target tracking scaling. Target tracking scaling is the simplest to configure — you just specify a metric and a target value (e.g., keep average CPU at 50%). Auto Scaling automatically manages instance additions and removals to maintain that target.

An Auto Scaling group uses a Launch Template. The team needs to update the AMI for new instances while keeping existing instances running. What should they do?

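Create a new version of the Launch Template with the new AMI, then start an Instance Refresh. Launch Templates support versioning. Create a new version with the updated AMI and set it as the default; provided the Auto Scaling group references the $Default or $Latest version, instances launched from then on use the new AMI, while existing instances keep running unchanged. When the team is ready to roll the fleet forward, Instance Refresh gradually replaces the existing instances with new ones from the updated template, so the group as a whole stays in service.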

An application uses an Auto Scaling group with a minimum of 2, desired of 4, and maximum of 8 instances across 2 AZs. If one AZ completely fails, what happens?

Auto Scaling launches new instances in the remaining AZ to maintain desired capacity. Auto Scaling detects that instances in the failed AZ are unhealthy and launches replacement instances in the remaining healthy AZ to maintain the desired capacity of 4 instances. The load balancer automatically routes all traffic to the healthy AZ.
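The AZ-failure scenario above comes down to simple arithmetic, sketched below. The function name and the AZ names are hypothetical, and the even split is a simplifying assumption rather than a description of the service's actual placement logic.

```python
def distribute(desired: int, healthy_azs: list[str]) -> dict[str, int]:
    """Spread the desired capacity as evenly as possible over healthy AZs."""
    if not healthy_azs:
        raise ValueError("no healthy Availability Zone left to launch into")
    base, extra = divmod(desired, len(healthy_azs))
    # The first `extra` AZs each take one additional instance.
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(healthy_azs)}


# Normal operation: desired capacity of 4 across two AZs.
before = distribute(4, ["us-east-1a", "us-east-1b"])

# One AZ fails: the same desired capacity lands in the surviving AZ.
after = distribute(4, ["us-east-1a"])

print(before)
print(after)
```

The desired capacity stays at 4 in both cases; only the placement changes.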

EC2 Auto Scaling — Dynamic, Predictive, and Scheduled

Quick Answer: Auto Scaling maintains the right number of EC2 instances: it adds instances when demand increases, removes them when demand drops, and replaces unhealthy instances automatically. Use target tracking for simple metric-based scaling, predictive scaling for forecasted demand, and scheduled scaling for known patterns.

Auto Scaling Group (ASG) Fundamentals

An Auto Scaling group (ASG) is a collection of EC2 instances managed as a logical unit for scaling and management.

Key Parameters

Parameter	Description	Example
Minimum	Least number of instances to keep running	2
Desired	Target number of instances (Auto Scaling adjusts toward this)	4
Maximum	Most instances allowed	10
Launch Template	EC2 configuration (AMI, type, SG, user data, IAM role)	lt-0123456789
VPC/Subnets	Which AZs to deploy instances in	us-east-1a, 1b, 1c
Health Check	EC2 (instance status) or ELB (target health)	ELB
Cooldown	Wait period after scaling before next action	300 seconds
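The relationship between minimum, desired, and maximum can be made concrete with a small sketch. The class below is a hypothetical illustration, not part of any AWS SDK; it only encodes the rule that desired capacity must sit between the minimum and the maximum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsgCapacity:
    """Hypothetical holder for the three capacity settings of an ASG."""
    minimum: int
    desired: int
    maximum: int

    def __post_init__(self) -> None:
        # Enforce the ordering 0 <= minimum <= desired <= maximum.
        if not 0 <= self.minimum <= self.desired <= self.maximum:
            raise ValueError(
                "need 0 <= min <= desired <= max, got "
                f"{self.minimum}, {self.desired}, {self.maximum}"
            )

    def clamp(self, requested: int) -> int:
        """Bound a requested capacity to the [minimum, maximum] range."""
        return max(self.minimum, min(self.maximum, requested))


capacity = AsgCapacity(minimum=2, desired=4, maximum=10)
print(capacity.clamp(15))  # a request above the maximum is capped
print(capacity.clamp(1))   # a request below the minimum is raised
```

Whatever a scaling policy asks for, the resulting capacity never leaves the range set by the minimum and maximum.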

Self-Healing

If an instance fails health checks, Auto Scaling:

Marks the instance as unhealthy
Terminates the unhealthy instance
Launches a replacement instance
Registers the new instance with the load balancer

This happens automatically with no manual intervention — it is a key resilience feature.
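The four self-healing steps above can be sketched as a small in-memory simulation. Everything here (the Instance class, launch_instance, heal, the simulated IDs) is hypothetical and makes no AWS calls; it only mirrors the order of operations.

```python
from dataclasses import dataclass
from itertools import count

_next_id = count(1)  # counter for simulated instance IDs


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


def launch_instance() -> Instance:
    """Stand-in for launching an instance from the launch template."""
    return Instance(instance_id=f"i-sim{next(_next_id):04d}")


def heal(fleet: list[Instance], registered: set[str]) -> list[Instance]:
    """One reconciliation pass: replace every unhealthy instance."""
    result = []
    for inst in fleet:
        if inst.healthy:
            result.append(inst)
            continue
        # The instance is already marked unhealthy; terminate it,
        # which also removes it from the load balancer.
        registered.discard(inst.instance_id)
        # Launch a replacement and register it with the load balancer.
        replacement = launch_instance()
        registered.add(replacement.instance_id)
        result.append(replacement)
    return result


fleet = [launch_instance() for _ in range(3)]
registered = {inst.instance_id for inst in fleet}
fleet[1].healthy = False            # simulate a failed health check
failed_id = fleet[1].instance_id
fleet = heal(fleet, registered)
print(len(fleet), failed_id in registered)
```

The group size is the same before and after the pass, which is the point of self-healing: capacity is restored without anyone intervening.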

Scaling Policy Types

1. Target Tracking Scaling (Recommended)

The simplest policy type, and the one AWS recommends for most use cases. You define a target value for a metric, and Auto Scaling adds or removes capacity to keep the metric close to that target.

Example Target	Description
CPU utilization = 50%	Add instances when CPU > 50%, remove when < 50%
Request count per target = 1000	Keep ~1000 requests per instance via ALB
Average network in = 10 GB	Scale based on network throughput
Custom metric	Any CloudWatch metric you publish
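To build intuition for how a target value translates into capacity, here is a minimal sketch that assumes a simple proportional rule. This is an approximation chosen for illustration, not the service's actual algorithm, and the function name is hypothetical.

```python
import math


def estimate_capacity(
    current_instances: int,
    current_metric: float,
    target_metric: float,
    minimum: int,
    maximum: int,
) -> int:
    """Estimate the instance count that brings the average metric to target.

    Assumes load spreads evenly, so the per-instance metric scales
    inversely with the number of instances.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_instances * current_metric / target_metric)
    # Never go outside the group's minimum and maximum.
    return max(minimum, min(maximum, raw))


# 4 instances averaging 75% CPU against a 50% target.
print(estimate_capacity(4, 75.0, 50.0, minimum=2, maximum=10))
# 4 instances averaging 25% CPU against a 50% target.
print(estimate_capacity(4, 25.0, 50.0, minimum=2, maximum=10))
```

Under this assumption, running at 75% against a 50% target suggests growing from 4 to 6 instances, while running at 25% suggests shrinking to 2.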

2. Step Scaling

Scales by different amounts based on the size of the alarm breach:

CPU Range	Action
50-70%	Add 1 instance
70-90%	Add 2 instances
> 90%	Add 3 instances
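The table above maps directly onto a lookup, sketched below. Treating each lower bound as inclusive and each upper bound as exclusive is an assumption made here to resolve the overlapping edges in the table; the names are hypothetical.

```python
from typing import Optional

# (lower bound, upper bound, instances to add); None means no upper bound.
STEPS: list[tuple[float, Optional[float], int]] = [
    (50.0, 70.0, 1),
    (70.0, 90.0, 2),
    (90.0, None, 3),
]


def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for the observed CPU level."""
    for lower, upper, add in STEPS:
        if cpu_percent >= lower and (upper is None or cpu_percent < upper):
            return add
    return 0  # below the first step: the alarm is not breached


for cpu in (45.0, 60.0, 75.0, 95.0):
    print(cpu, step_adjustment(cpu))
```

The larger the breach, the larger the adjustment, which is what distinguishes step scaling from a single fixed adjustment.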

3. Simple Scaling

Legacy policy: after each scaling activity it waits for the cooldown period to finish before responding to further alarms. Not recommended for new implementations; use target tracking or step scaling instead.

4. Scheduled Scaling

Adjusts capacity at specific dates/times for known traffic patterns:

Scale up to 20 instances at 8 AM every weekday
Scale down to 4 instances at 8 PM every weekday
Scale up for anticipated Black Friday traffic
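The weekday examples above can be expressed as recurring schedules. The sketch below only builds parameter dictionaries; the keys follow the parameter names of boto3's put_scheduled_update_group_action, but no AWS call is made, and the group name, action names, and time zone are hypothetical.

```python
def weekday_action(
    group_name: str,
    action_name: str,
    hour: int,
    desired: int,
    time_zone: str = "America/New_York",  # hypothetical default
) -> dict:
    """Build parameters for a scheduled action that fires Monday to Friday."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        # Cron fields: minute hour day-of-month month day-of-week.
        "Recurrence": f"0 {hour} * * 1-5",
        "DesiredCapacity": desired,
        "TimeZone": time_zone,
    }


actions = [
    weekday_action("web-asg", "scale-up-morning", hour=8, desired=20),
    weekday_action("web-asg", "scale-down-evening", hour=20, desired=4),
]
for action in actions:
    print(action["ScheduledActionName"], action["Recurrence"])
```

The desired capacity in a scheduled action still has to fit within the group's minimum and maximum, so a scale-up to 20 assumes the maximum is at least 20.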

5. Predictive Scaling

Uses machine learning to analyze historical traffic patterns and automatically provisions capacity in advance of predicted demand.

Feature	Description
Forecast	ML model predicts traffic 48 hours in advance
Pre-scaling	Instances launched BEFORE demand arrives
Refresh	Forecast is updated every 6 hours with the latest CloudWatch data
Best for	Cyclical patterns (daily, weekly)

On the Exam: "The application experiences daily traffic spikes at the same time every day, and users experience slow response times during the ramp-up" → Predictive scaling or scheduled scaling (both pre-provision capacity).
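The idea of provisioning ahead of a recurring spike can be illustrated with a deliberately naive forecast: the average load seen at the same hour on previous days. This stands in for the ML model and is not how the service computes its forecast; the data, the per-instance capacity, and the function names are all made up for the example.

```python
import math
from statistics import mean


def forecast_hour(history: list[list[int]], hour: int) -> float:
    """Average the load observed at `hour` across previous days.

    `history` holds one list of 24 hourly request counts per day.
    """
    return mean(day[hour] for day in history)


def instances_needed(requests: float, per_instance: int) -> int:
    """Instances required if each one handles `per_instance` requests."""
    return math.ceil(requests / per_instance)


# Three days of quiet traffic with a spike at 09:00 each day.
history = [[100] * 24 for _ in range(3)]
for day, spike in zip(history, (9000, 10000, 11000)):
    day[9] = spike

expected = forecast_hour(history, 9)
needed = instances_needed(expected, per_instance=1000)
print(expected, needed)
```

With the forecast in hand, capacity can be launched before 09:00 instead of after the metric has already crossed its threshold, which is the difference between predictive and reactive scaling.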

Launch Templates vs. Launch Configurations

Feature	Launch Template	Launch Configuration
Status	Current, recommended	Legacy, not recommended
Versioning	Yes (multiple versions)	No (immutable)
Mixed instances	Yes (multiple instance types)	No (single type)
Spot + On-Demand	Yes (mixed allocation)	Limited
T2/T3 unlimited	Configurable	Limited

Scaling Cooldown

The cooldown period (default 300 seconds) prevents Auto Scaling from launching or terminating additional instances before the effects of the previous scaling activity are visible.

Scale-out cooldown — Wait before launching more instances
Scale-in cooldown — Wait before terminating more instances

Tip: If your instances take a long time to warm up, increase the cooldown period. Target tracking and step scaling policies do not rely on the cooldown period; they use the instance warmup time instead, so you do not manage a cooldown for them.
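The cooldown check itself is a time comparison, sketched below under the assumption that the only inputs are the time of the last scaling activity and the configured cooldown. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def in_cooldown(
    last_activity: datetime,
    now: datetime,
    cooldown_seconds: int = 300,
) -> bool:
    """Return True while a new scaling action should still be held back."""
    return now - last_activity < timedelta(seconds=cooldown_seconds)


last = datetime(2024, 1, 1, 9, 0, 0)
print(in_cooldown(last, datetime(2024, 1, 1, 9, 3, 0)))  # 180 s elapsed
print(in_cooldown(last, datetime(2024, 1, 1, 9, 5, 0)))  # 300 s elapsed
```

Raising cooldown_seconds for slow-starting instances widens the window in which further scaling is suppressed.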

Instance Refresh

Instance refresh updates instances in an ASG without downtime:

Set a minimum healthy percentage (e.g., 90%)
Auto Scaling replaces instances in batches
Each batch is launched, health-checked, and warmed up before the next batch starts

Use cases: Deploy new AMI, update launch template, apply new configuration.
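The effect of the minimum healthy percentage on batch size can be worked out with a short sketch. It assumes that at least one instance is always replaceable, so the refresh can make progress even at very high percentages; the function names and instance IDs are hypothetical.

```python
import math


def max_replaceable(group_size: int, min_healthy_percent: int) -> int:
    """How many instances may be out of service at the same time."""
    must_stay = math.ceil(group_size * min_healthy_percent / 100)
    # Assumption: always allow at least one replacement per batch.
    return max(1, group_size - must_stay)


def plan_batches(instance_ids: list[str], min_healthy_percent: int) -> list[list[str]]:
    """Split the group into consecutive replacement batches."""
    size = max_replaceable(len(instance_ids), min_healthy_percent)
    return [instance_ids[i:i + size] for i in range(0, len(instance_ids), size)]


ids = [f"i-{n}" for n in range(10)]
print(len(plan_batches(ids, 90)))  # 90% healthy: one instance per batch
print(len(plan_batches(ids, 50)))  # 50% healthy: five instances per batch
```

A higher minimum healthy percentage keeps more capacity in service during the refresh but makes it take more batches to finish.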

AWS Solutions Architect Associate

2.2 EC2 Auto Scaling — Dynamic, Predictive, and Scheduled

