6.3 Container Sizing, Scaling, and Environment Choices

Key Takeaways

  • Container sizing controls CPU, memory, cost, scheduling behavior, and reliability.
  • ACI sizing is set at deployment for the container group, while Container Apps scales by replicas within configured bounds.
  • Container Apps environments influence networking, ingress, observability, and application grouping.
  • Scaling decisions should match workload shape, not just image packaging.
  • Exam traps include confusing image storage scale, replica scale, and App Service plan scale.
Last updated: May 2026

Start with workload shape

A container image is only a packaging format. It does not decide the right Azure runtime by itself. Administrators select the runtime by asking how the workload starts, how long it runs, how it receives traffic, how it scales, how it is updated, and what network path it needs.

Use ACI for direct execution when the workload can be described as a container group with fixed resources and a simple endpoint or job lifecycle. Use Container Apps when the workload is an application that benefits from replicas, ingress rules, revisions, secrets, scale rules, and managed environment behavior. Use App Service for Containers when the workload is a web app and the organization wants App Service plan features, custom domains, certificates, slots, backups, and familiar web app configuration.

| Decision point | ACI | Container Apps | App Service for Containers |
| --- | --- | --- | --- |
| Primary unit | Container group | Container app revision and replicas | Web app in an App Service plan |
| Best for | Simple jobs, burst tasks, test tools | Microservices and event-driven apps | Web apps and APIs using App Service features |
| Scaling model | Create an appropriately sized container group; no rich app autoscale model | Replica-based autoscale with min and max replicas | Plan and app autoscale depending on tier |
| Deployment update behavior | Replace or recreate container group settings | New revisions for revision-scope changes | App configuration or slot-based deployment |
| Common exam keyword | Quick run, no orchestrator | Scale to zero, ingress, revisions | Slots, custom domain, TLS, backup |

Sizing containers

Container sizing means allocating CPU and memory. Under-size a container and it may fail, restart, or perform poorly. Over-size it and you waste money or reduce how densely workloads can be packed. The exam often gives symptoms such as restarts during peak load, out-of-memory behavior, or a batch job that takes too long. The administrator should check metrics, logs, and the configured CPU and memory before changing the image registry or DNS.

For ACI, CPU and memory are specified during container group creation. You can deploy with values like --cpu 2 --memory 4 (cores and GB, respectively), but changing those values later usually means updating or recreating the container group. This fits short-lived or simple workloads where a redeploy is acceptable.

az container create \
  --resource-group rg-compute \
  --name aci-transform \
  --image examacr104.azurecr.io/jobs/transform:v3 \
  --cpu 2 \
  --memory 4 \
  --restart-policy OnFailure

For Container Apps, each container has CPU and memory settings, and the app has replica scale settings. Do not confuse vertical allocation with horizontal scaling. Increasing CPU per replica can help each replica process more work. Increasing max replicas lets the app add more copies under scale conditions. Both can be needed, but they answer different problems.

az containerapp update \
  --name orders-api \
  --resource-group rg-compute \
  --cpu 1.0 \
  --memory 2Gi \
  --min-replicas 1 \
  --max-replicas 10

Scaling patterns

HTTP scale is usually about request volume or concurrency. Queue scale is about backlog. Scheduled or batch work may not need autoscale at all if it runs in ACI and exits. A long-running public API that must survive variable request traffic is usually a poor fit for a manually recreated ACI container group.
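Queue-driven scale is configured with a scale rule on the container app. As a sketch, the following Azure CLI command attaches an azure-queue scale rule so replicas are added when backlog grows and removed (down to zero) when the queue drains; the app name, queue name, threshold, and secret name are hypothetical examples:

```shell
# Scale a worker app on queue backlog rather than HTTP traffic.
# "orders-worker", "orders", and "queue-conn-secret" are illustrative names.
az containerapp update \
  --name orders-worker \
  --resource-group rg-compute \
  --min-replicas 0 \
  --max-replicas 5 \
  --scale-rule-name queue-backlog \
  --scale-rule-type azure-queue \
  --scale-rule-metadata "queueName=orders" "queueLength=20" \
  --scale-rule-auth "connection=queue-conn-secret"
```

Here queueLength=20 means roughly one replica per 20 pending messages, capped at 5 replicas; the rule authenticates to the storage queue through a secret already defined on the app.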

Container Apps can scale to zero when the minimum replica count is zero. This is attractive for event-driven workloads because cost can drop when idle. The tradeoff is cold start behavior. If the question says there must always be at least one warm instance, set minimum replicas to one or more. If the requirement is lowest idle cost and cold start is acceptable, minimum replicas of zero is a strong clue.

Maximum replicas are just as important. Without a reasonable maximum, unexpected traffic or poison messages can create cost or downstream pressure. An administrator might cap max replicas to protect a database, storage account, or API dependency. In exam scenarios, look for words like budget, downstream throttling, or maximum concurrency.

Environment and network choices

A Container Apps environment is more than a label. It groups apps for shared networking, observability, and environment-level configuration. Internal ingress means access is restricted compared with external ingress. If the app must only be reachable from a private application gateway, private network, or other internal service, the environment and ingress configuration become central.
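When privacy is the requirement, both the environment and the app's ingress can be locked down. As a sketch with hypothetical resource names and subnet ID, an environment can be created as internal-only, and an existing app's ingress can be switched to internal so it is reachable only from inside the environment's virtual network:

```shell
# Create an environment that is reachable only from its VNet.
# env-internal and $SUBNET_ID are illustrative placeholders.
az containerapp env create \
  --name env-internal \
  --resource-group rg-compute \
  --location eastus \
  --infrastructure-subnet-resource-id $SUBNET_ID \
  --internal-only true

# Restrict an app's ingress to internal callers on a given port.
az containerapp ingress enable \
  --name orders-api \
  --resource-group rg-compute \
  --type internal \
  --target-port 8080
```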

ACI can be deployed with public IP exposure or into a virtual network in supported patterns. If a question requires a simple private worker that connects to a database over private IP, ACI with VNet connectivity might be enough. If the same question also requires event scaling, revisions, and traffic splitting, Container Apps is the better fit.
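A private ACI worker of that kind can be sketched with the --vnet and --subnet flags on creation; the VNet, subnet, and image path below are illustrative:

```shell
# Deploy a container group into an existing subnet (no public IP).
# vnet-apps, snet-aci, and the image path are illustrative names.
az container create \
  --resource-group rg-compute \
  --name aci-private-worker \
  --image examacr104.azurecr.io/jobs/worker:v1 \
  --cpu 1 \
  --memory 2 \
  --vnet vnet-apps \
  --subnet snet-aci \
  --restart-policy Always
```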

Observability should be part of sizing. Use logs for process failures and startup errors. Use metrics for CPU, memory, replica count, restarts, and request behavior. If a container is constantly restarting, do not jump directly to scaling out. Read the logs first; an app crash, bad environment variable, failed secret reference, or wrong port can look like a capacity problem from far away.
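The log-first habit maps to two commands, reusing the resource names from the earlier examples:

```shell
# ACI: read the container's stdout/stderr before assuming a capacity problem
az container logs --resource-group rg-compute --name aci-transform

# Container Apps: stream console logs from running replicas
az containerapp logs show --resource-group rg-compute --name orders-api --follow
```

If the logs show a crash loop, a missing environment variable, or a bind to the wrong port, fix that before touching CPU, memory, or replica settings.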

Portal and CLI decision logic

Use the portal when the task is exploratory, one-off, or asks for a visible configuration path. For example, creating an ACI container from a known image for a lab is straightforward in the portal. Use CLI or Bicep when the task must be repeatable, parameterized, or applied across environments. AZ-104 questions may show CLI snippets and ask what they accomplish, so recognize flags for CPU, memory, ingress, min replicas, max replicas, registry, and restart policy.

Portal checks:

| Symptom | Where to look |
| --- | --- |
| Container cannot pull image | Registry settings, identity, ACR permissions, compute logs |
| Container starts but fails | Logs, environment variables, command, restart policy |
| Public app not reachable | Ingress mode, target port, DNS, firewall, network integration |
| Unexpected cost | CPU and memory settings, replica counts, idle minimum replicas |
| Old version still active | Image tag, active revision, traffic split, deployment pipeline |

Exam traps

Scaling ACR is not the same as scaling a container app. ACR tier or geo-replication can improve registry throughput and availability for image distribution, but it does not add runtime replicas. Scaling an App Service plan is not the same as setting Container Apps max replicas. App Service plan scale changes workers for apps in that plan. Container Apps scale changes replicas of a container app inside its managed environment.
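The contrast shows up directly in the CLI. As a sketch, the first command scales the registry (geo-replicating image storage, which requires the Premium ACR tier), while the second scales the runtime (raising the replica ceiling of a container app); the region and the replica cap are illustrative:

```shell
# Registry-side scale: add a geo-replica for faster image pulls in another region
az acr replication create --registry examacr104 --location westeurope

# Runtime-side scale: raise the replica ceiling for the running app
az containerapp update --name orders-api --resource-group rg-compute --max-replicas 20
```

Neither command substitutes for the other: the first affects image distribution, the second affects how many copies of the app can run.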

Another trap is assuming every container service supports every Kubernetes feature. Container Apps is Kubernetes-based behind the scenes, but administrators do not manage nodes or use arbitrary Kubernetes objects for AZ-104 tasks. ACI is simpler still. If the requirement says manage node pools, Kubernetes upgrades, or cluster-level networking, the answer is outside this chapter's platform choices and points toward AKS, which AZ-104 may mention only as contrast in container scenarios.

Test Your Knowledge

A Container App has slow response under load, but each replica is already CPU saturated. Which change most directly increases per-replica capacity?

Test Your Knowledge

A workload can tolerate cold starts and should have the lowest idle cost. Which Container Apps setting is most relevant?

Test Your Knowledge

Which statement correctly distinguishes ACR scale from runtime scale?
