6.3 Container Sizing, Scaling, and Environment Choices

Key Takeaways

  • Container sizing controls CPU, memory, cost, scheduling behavior, and reliability.
  • ACI sizing is set at deployment for the container group, while Container Apps scales by replicas within configured bounds.
  • Container Apps environments influence networking, ingress, observability, and application grouping.
  • Scaling decisions should match workload shape, not just image packaging.
  • Exam traps include confusing image storage scale, replica scale, and App Service plan scale.
Last updated: May 2026

Start with workload shape

A container image is only a packaging format. It does not decide the right Azure runtime by itself. Administrators select the runtime by asking how the workload starts, how long it runs, how it receives traffic, how it scales, how it is updated, and what network path it needs.

Use ACI for direct execution when the workload can be described as a container group with fixed resources and a simple endpoint or job lifecycle. Use Container Apps when the workload is an application that benefits from replicas, ingress rules, revisions, secrets, scale rules, and managed environment behavior. Use App Service for Containers when the workload is a web app and the organization wants App Service plan features, custom domains, certificates, slots, backups, and familiar web app configuration.

| Decision point | ACI | Container Apps | App Service for Containers |
| --- | --- | --- | --- |
| Primary unit | Container group | Container app revision and replicas | Web app in an App Service plan |
| Best for | Simple jobs, burst tasks, test tools | Microservices and event-driven apps | Web apps and APIs using App Service features |
| Scaling model | Create an appropriately sized container group; no rich app autoscale model | Replica-based autoscale with min and max replicas | Plan and app autoscale depending on tier |
| Deployment update behavior | Replace or recreate container group settings | New revisions for revision-scope changes | App configuration or slot-based deployment |
| Common exam keyword | Quick run, no orchestrator | Scale to zero, ingress, revisions | Slots, custom domain, TLS, backup |

Sizing containers

Container sizing means allocating CPU and memory. Under-size a container and it may fail, restart, or perform poorly. Over-size it and you waste money or reduce how densely workloads can be packed. The exam often gives symptoms such as restarts during peak load, out-of-memory behavior, or a batch job that takes too long. The administrator should check metrics, logs, and the configured CPU and memory before changing the image registry or DNS.

For ACI, CPU and memory are specified during container group creation. You can deploy with values like --cpu 2 --memory 4 (cores and GB, respectively), but changing those values later usually means updating or recreating the container group. This fits short-lived or simple workloads where a redeploy is acceptable.

az container create \
  --resource-group rg-compute \
  --name aci-transform \
  --image examacr104.azurecr.io/jobs/transform:v3 \
  --cpu 2 \
  --memory 4 \
  --restart-policy OnFailure

For Container Apps, each container has CPU and memory settings, and the app has replica scale settings. Do not confuse vertical allocation with horizontal scaling. Increasing CPU per replica can help each replica process more work. Increasing max replicas lets the app add more copies under scale conditions. Both can be needed, but they answer different problems.

az containerapp update \
  --name orders-api \
  --resource-group rg-compute \
  --cpu 1.0 \
  --memory 2Gi \
  --min-replicas 1 \
  --max-replicas 10

Scaling patterns

HTTP scale is usually about request volume or concurrency. Queue scale is about backlog. Scheduled or batch work may not need autoscale at all if it runs in ACI and exits. A long-running public API that must survive variable request traffic is usually a poor fit for a manually recreated ACI container group.
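Queue-driven scale is configured with a scale rule on the container app. As a sketch, the following Azure CLI command attaches an azure-queue scale rule so replicas are added when backlog grows and removed (down to zero) when the queue drains; the app name, queue name, threshold, and secret name are hypothetical examples:

```shell
# Scale a worker app on queue backlog rather than HTTP traffic.
# "orders-worker", "orders", and "queue-conn-secret" are illustrative names.
az containerapp update \
  --name orders-worker \
  --resource-group rg-compute \
  --min-replicas 0 \
  --max-replicas 5 \
  --scale-rule-name queue-backlog \
  --scale-rule-type azure-queue \
  --scale-rule-metadata "queueName=orders" "queueLength=20" \
  --scale-rule-auth "connection=queue-conn-secret"
```

Here queueLength=20 means roughly one replica per 20 pending messages, capped at 5 replicas; the rule authenticates to the storage queue through a secret already defined on the app.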

Container Apps can scale to zero when the minimum replica count is zero. This is attractive for event-driven workloads because cost can drop when idle. The tradeoff is cold start behavior. If the question says there must always be at least one warm instance, set minimum replicas to one or more. If the requirement is lowest idle cost and cold start is acceptable, minimum replicas of zero is a strong clue.

Maximum replicas are just as important. Without a reasonable maximum, unexpected traffic or poison messages can create cost or downstream pressure. An administrator might cap max replicas to protect a database, storage account, or API dependency. In exam scenarios, look for words like budget, downstream throttling, or maximum concurrency.

Environment and network choices

A Container Apps environment is more than a label. It groups apps for shared networking, observability, and environment-level configuration. Internal ingress means access is restricted compared with external ingress. If the app must only be reachable from a private application gateway, private network, or other internal service, the environment and ingress configuration become central.
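When privacy is the requirement, both the environment and the app's ingress can be locked down. As a sketch with hypothetical resource names and subnet ID, an environment can be created as internal-only, and an existing app's ingress can be switched to internal so it is reachable only from inside the environment's virtual network:

```shell
# Create an environment that is reachable only from its VNet.
# env-internal and $SUBNET_ID are illustrative placeholders.
az containerapp env create \
  --name env-internal \
  --resource-group rg-compute \
  --location eastus \
  --infrastructure-subnet-resource-id $SUBNET_ID \
  --internal-only true

# Restrict an app's ingress to internal callers on a given port.
az containerapp ingress enable \
  --name orders-api \
  --resource-group rg-compute \
  --type internal \
  --target-port 8080
```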

ACI can be deployed with public IP exposure or into a virtual network in supported patterns. If a question requires a simple private worker that connects to a database over private IP, ACI with VNet connectivity might be enough. If the same question also requires event scaling, revisions, and traffic splitting, Container Apps is the better fit.
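A private ACI worker of that kind can be sketched with the --vnet and --subnet flags on creation; the VNet, subnet, and image path below are illustrative:

```shell
# Deploy a container group into an existing subnet (no public IP).
# vnet-apps, snet-aci, and the image path are illustrative names.
az container create \
  --resource-group rg-compute \
  --name aci-private-worker \
  --image examacr104.azurecr.io/jobs/worker:v1 \
  --cpu 1 \
  --memory 2 \
  --vnet vnet-apps \
  --subnet snet-aci \
  --restart-policy Always
```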

Observability should be part of sizing. Use logs for process failures and startup errors. Use metrics for CPU, memory, replica count, restarts, and request behavior. If a container is constantly restarting, do not jump directly to scaling out. Read the logs first; an app crash, bad environment variable, failed secret reference, or wrong port can look like a capacity problem from far away.
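The log-first habit maps to two commands, reusing the resource names from the earlier examples:

```shell
# ACI: read the container's stdout/stderr before assuming a capacity problem
az container logs --resource-group rg-compute --name aci-transform

# Container Apps: stream console logs from running replicas
az containerapp logs show --resource-group rg-compute --name orders-api --follow
```

If the logs show a crash loop, a missing environment variable, or a bind to the wrong port, fix that before touching CPU, memory, or replica settings.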

Portal and CLI decision logic

Use the portal when the task is exploratory, one-off, or asks for a visible configuration path. For example, creating an ACI container from a known image for a lab is straightforward in the portal. Use CLI or Bicep when the task must be repeatable, parameterized, or applied across environments. AZ-104 questions may show CLI snippets and ask what they accomplish, so recognize flags for CPU, memory, ingress, min replicas, max replicas, registry, and restart policy.

Portal checks:

| Symptom | Where to look |
| --- | --- |
| Container cannot pull image | Registry settings, identity, ACR permissions, compute logs |
| Container starts but fails | Logs, environment variables, command, restart policy |
| Public app not reachable | Ingress mode, target port, DNS, firewall, network integration |
| Unexpected cost | CPU and memory settings, replica counts, idle minimum replicas |
| Old version still active | Image tag, active revision, traffic split, deployment pipeline |

Exam traps

Scaling ACR is not the same as scaling a container app. ACR tier or geo-replication can improve registry throughput and availability for image distribution, but it does not add runtime replicas. Scaling an App Service plan is not the same as setting Container Apps max replicas. App Service plan scale changes workers for apps in that plan. Container Apps scale changes replicas of a container app inside its managed environment.
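The contrast shows up directly in the CLI. As a sketch, the first command scales the registry (geo-replicating image storage, which requires the Premium ACR tier), while the second scales the runtime (raising the replica ceiling of a container app); the region and the replica cap are illustrative:

```shell
# Registry-side scale: add a geo-replica for faster image pulls in another region
az acr replication create --registry examacr104 --location westeurope

# Runtime-side scale: raise the replica ceiling for the running app
az containerapp update --name orders-api --resource-group rg-compute --max-replicas 20
```

Neither command substitutes for the other: the first affects image distribution, the second affects how many copies of the app can run.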

Another trap is assuming every container service supports every Kubernetes feature. Container Apps is Kubernetes-based behind the scenes, but administrators do not manage nodes or use arbitrary Kubernetes objects for AZ-104 tasks. ACI is simpler still. If the requirement says manage node pools, Kubernetes upgrades, or cluster-level networking, the answer is outside this chapter's platform choices and points toward AKS, which AZ-104 may mention only as contrast in container scenarios.

Test Your Knowledge

A Container App has slow response under load, but each replica is already CPU saturated. Which change most directly increases per-replica capacity?

Test Your Knowledge

A workload can tolerate cold starts and should have the lowest idle cost. Which Container Apps setting is most relevant?

Test Your Knowledge

Which statement correctly distinguishes ACR scale from runtime scale?
