5.7 VM Compute Case Lab

Key Takeaways

  • A strong compute design starts with requirements for availability, security, scale, management, backup, and cost before selecting VM features.
  • Bicep can express the repeatable build while operational services such as Backup, Monitor, Bastion, and autoscale complete the administrator workflow.
  • Troubleshooting should isolate deployment, identity, network, guest OS, application health, monitoring, and recovery layers.
  • Exam case studies reward reading constraints carefully and mapping each requirement to the minimum Azure feature that satisfies it.
Last updated: May 2026

Case Scenario

Contoso has a legacy line-of-business application that will move to Azure. The web tier is stateless and can run on Linux. The application tier is Windows-based and must stay on traditional VMs for now. The database runs on a separate managed database service, so application VMs should not store durable business data locally. Administrators require private management access, repeatable deployment, backup for the Windows application servers, monitoring alerts, and a way to scale the web tier during seasonal traffic.

The target region is East US. The business wants higher availability inside the region but is not yet ready for full multi-region active-active design. Security policy denies direct RDP or SSH from the internet. Cost matters, but the app supports customer transactions, so single-VM designs are not acceptable for production.

Requirements Matrix

| Requirement | Design decision | Reason |
|---|---|---|
| Repeatable deployment | Bicep modules | Avoid manual drift and support environments |
| Private admin access | Azure Bastion or VPN path, no VM public IPs | Meets no direct internet management rule |
| Stateless web scale | VM Scale Set across zones | Adds and replaces instances predictably |
| Windows app tier HA | Two or more VMs across zones or availability set | Reduces single host or zone failure risk |
| Durable data | Managed database and managed disks only for server state | Do not use temporary disk for business data |
| Backup | Recovery Services vault policy for app VMs | Supports restore and retention |
| Monitoring | Azure Monitor Agent, data collection rules, alerts | Operational visibility |
| Change safety | What-if, deployment names, staged rollout | Reduces deployment surprises |

Proposed Architecture

Use one resource group for production compute resources and, if the organization separates duties, a second for shared operations resources. If a hub or shared management network already exists, use it; otherwise, deploy a VNet with subnets for web, app, Bastion, and private endpoints where needed. Deploy the web tier as a VM Scale Set with instances distributed across zones behind a load balancer or application gateway. Deploy the Windows application tier as multiple VMs with no public IP addresses, placed across zones where the region and VM size support them.
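As a rough illustration of the network layout described above, the following is a minimal sketch of what a network module could look like. The VNet name, subnet names, address ranges, and API version are assumptions made for this example; only the two output names match what the main template in this lesson consumes.

```bicep
// modules/network.bicep (hypothetical sketch, not a complete module)
param location string
param environment string

resource vnet 'Microsoft.Network/virtualNetworks@2023-09-01' = {
  name: 'vnet-contoso-${environment}' // assumed naming convention
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.10.0.0/16' // assumed address space
      ]
    }
    subnets: [
      {
        name: 'snet-web'
        properties: {
          addressPrefix: '10.10.1.0/24'
        }
      }
      {
        name: 'snet-app'
        properties: {
          addressPrefix: '10.10.2.0/24'
        }
      }
      {
        // Azure Bastion requires this exact subnet name and a /26 or larger range.
        name: 'AzureBastionSubnet'
        properties: {
          addressPrefix: '10.10.0.0/26'
        }
      }
    ]
  }
}

// Outputs follow the order of the subnets array above.
output webSubnetId string = vnet.properties.subnets[0].id
output appSubnetId string = vnet.properties.subnets[1].id
```

A real module would also attach NSGs to the subnets and add a private endpoint subnet if the design calls for one.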

Use managed identities for VMs that need Azure resource access. Use Key Vault for secrets, but avoid placing secrets in template files. Protect the Windows application VMs with Azure Backup. Enable boot diagnostics and Azure Monitor Agent. Configure NSGs so only required app ports flow between tiers. Management traffic should arrive through Bastion or a private network path.
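One way to express "only required app ports flow between tiers" is an NSG for the app subnet. The sketch below is illustrative only: the address prefixes, the application port, the resource name, and the API version are all assumptions, and a production rule set would need review against the application's real traffic.

```bicep
// Hypothetical NSG for the app subnet; all prefixes and ports are assumptions.
param location string
param vnetPrefix string = '10.10.0.0/16'
param webSubnetPrefix string = '10.10.1.0/24'
param bastionSubnetPrefix string = '10.10.0.0/26'
param appPort string = '8443'

resource appNsg 'Microsoft.Network/networkSecurityGroups@2023-09-01' = {
  name: 'nsg-app'
  location: location
  properties: {
    securityRules: [
      {
        // Web tier may reach the application port only.
        name: 'Allow-Web-To-App'
        properties: {
          priority: 100
          direction: 'Inbound'
          access: 'Allow'
          protocol: 'Tcp'
          sourceAddressPrefix: webSubnetPrefix
          sourcePortRange: '*'
          destinationAddressPrefix: '*'
          destinationPortRange: appPort
        }
      }
      {
        // Management traffic arrives from the Bastion subnet, never from the internet.
        name: 'Allow-Bastion-Rdp'
        properties: {
          priority: 110
          direction: 'Inbound'
          access: 'Allow'
          protocol: 'Tcp'
          sourceAddressPrefix: bastionSubnetPrefix
          sourcePortRange: '*'
          destinationAddressPrefix: '*'
          destinationPortRange: '3389'
        }
      }
      {
        // Overrides the default AllowVnetInBound rule for everything else in the VNet.
        // This also blocks app-to-app traffic; add an allow rule if servers must talk to each other.
        name: 'Deny-Other-VNet-Inbound'
        properties: {
          priority: 4000
          direction: 'Inbound'
          access: 'Deny'
          protocol: '*'
          sourceAddressPrefix: vnetPrefix
          sourcePortRange: '*'
          destinationAddressPrefix: '*'
          destinationPortRange: '*'
        }
      }
    ]
  }
}
```

Inbound traffic from the internet is already denied by the default NSG rules, so the explicit deny here only narrows traffic that originates inside the VNet.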

Bicep Skeleton

The following skeleton is not a full production template, but it shows how an administrator structures the deployment into parameters, modules, and outputs:

targetScope = 'resourceGroup'

param location string = resourceGroup().location
param environment string = 'prod'
param adminUsername string
@secure()
param adminPassword string

module network './modules/network.bicep' = {
  name: 'network-${environment}'
  params: {
    location: location
    environment: environment
  }
}

module web './modules/web-vmss.bicep' = {
  name: 'web-vmss-${environment}'
  params: {
    location: location
    subnetId: network.outputs.webSubnetId
    instanceCount: 3
    vmSku: 'Standard_D2s_v5'
  }
}

module app './modules/windows-app-vms.bicep' = {
  name: 'app-vms-${environment}'
  params: {
    location: location
    subnetId: network.outputs.appSubnetId
    adminUsername: adminUsername
    adminPassword: adminPassword
    vmNames: [
      'vm-app-01'
      'vm-app-02'
    ]
  }
}

output webFrontendIp string = web.outputs.frontendIp

The real modules would define NSGs, load balancing, health probes, VM extensions, diagnostic settings, and backup registration where appropriate. The point is that the environment-specific choices are parameters, while the architecture is repeatable.
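To make the module idea more concrete, here is a minimal sketch of how the Windows application VM module might loop over the VM names and spread them across zones. The VM size, image, disk type, NIC naming pattern, and API versions are assumptions for illustration; extensions, diagnostic settings, and backup registration are left out.

```bicep
// modules/windows-app-vms.bicep (hypothetical sketch, not a complete module)
param location string
param subnetId string
param adminUsername string
@secure()
param adminPassword string
param vmNames array
param vmSize string = 'Standard_D2s_v5' // assumed size

// One NIC per VM with a private address only; no public IP is attached.
resource nics 'Microsoft.Network/networkInterfaces@2023-09-01' = [for name in vmNames: {
  name: '${name}-nic'
  location: location
  properties: {
    ipConfigurations: [
      {
        name: 'ipconfig1'
        properties: {
          privateIPAllocationMethod: 'Dynamic'
          subnet: {
            id: subnetId
          }
        }
      }
    ]
  }
}]

resource vms 'Microsoft.Compute/virtualMachines@2023-09-01' = [for (name, i) in vmNames: {
  name: name
  location: location
  // Round-robin across zones 1 to 3 so two VMs never share a zone.
  zones: [
    string((i % 3) + 1)
  ]
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    hardwareProfile: {
      vmSize: vmSize
    }
    osProfile: {
      computerName: name
      adminUsername: adminUsername
      adminPassword: adminPassword
    }
    storageProfile: {
      imageReference: {
        publisher: 'MicrosoftWindowsServer'
        offer: 'WindowsServer'
        sku: '2022-datacenter-azure-edition' // assumed image
        version: 'latest'
      }
      osDisk: {
        createOption: 'FromImage'
        managedDisk: {
          storageAccountType: 'Premium_LRS'
        }
      }
    }
    networkProfile: {
      networkInterfaces: [
        {
          id: nics[i].id
        }
      ]
    }
    diagnosticsProfile: {
      bootDiagnostics: {
        enabled: true // managed boot diagnostics, no storage account needed
      }
    }
  }
}]
```

The zone expression assumes the region offers three zones for the chosen size; where it does not, an availability set would be the fallback named in the requirements matrix.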

Build and Validate Workflow

Start with validation and what-if:

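az account set --subscription SUB-PROD
az group create --name rg-contoso-prod-compute --location eastus

# Validate the template and parameters without deploying anything
az deployment group validate \
  --resource-group rg-contoso-prod-compute \
  --template-file main.bicep \
  --parameters @prod.parameters.json

# Preview the changes the deployment would make
az deployment group what-if \
  --resource-group rg-contoso-prod-compute \
  --template-file main.bicep \
  --parameters @prod.parameters.json

# Deploy under an explicit name so the run is easy to find in deployment history
az deployment group create \
  --name contoso-prod-20260505 \
  --resource-group rg-contoso-prod-compute \
  --template-file main.bicep \
  --parameters @prod.parameters.json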

After deployment, verify instance and network state:

az vmss list-instances -g rg-contoso-prod-compute -n vmss-web-prod --output table
az vm list -g rg-contoso-prod-compute -d --output table
az network nic list-effective-nsg -g rg-contoso-prod-compute -n vm-app-01-nic
az monitor metrics list --resource RESOURCE_ID --metric "Percentage CPU"

Then enable or confirm backup:

az backup protection enable-for-vm \
  --resource-group rg-contoso-prod-ops \
  --vault-name rsv-contoso-prod \
  --vm /subscriptions/SUB-PROD/resourceGroups/rg-contoso-prod-compute/providers/Microsoft.Compute/virtualMachines/vm-app-01 \
  --policy-name AppServerDaily
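The command above assumes that a policy named AppServerDaily already exists in the vault. As a hedged sketch, such a policy could be declared in Bicep roughly as follows, deployed to the resource group that holds the vault. The schedule time, the 30-day retention, the instant restore range, and the API version are assumptions chosen for illustration, not values taken from the case.

```bicep
// Hypothetical daily backup policy; schedule and retention values are assumptions.
param vaultName string = 'rsv-contoso-prod'

resource vault 'Microsoft.RecoveryServices/vaults@2023-04-01' existing = {
  name: vaultName
}

resource appServerDaily 'Microsoft.RecoveryServices/vaults/backupPolicies@2023-04-01' = {
  parent: vault
  name: 'AppServerDaily'
  properties: {
    backupManagementType: 'AzureIaasVM'
    timeZone: 'UTC'
    instantRpRetentionRangeInDays: 2
    schedulePolicy: {
      schedulePolicyType: 'SimpleSchedulePolicy'
      scheduleRunFrequency: 'Daily'
      // Only the time of day matters here; the date portion is a placeholder.
      scheduleRunTimes: [
        '2026-01-01T02:00:00Z'
      ]
    }
    retentionPolicy: {
      retentionPolicyType: 'LongTermRetentionPolicy'
      dailySchedule: {
        retentionTimes: [
          '2026-01-01T02:00:00Z'
        ]
        retentionDuration: {
          count: 30
          durationType: 'Days'
        }
      }
    }
  }
}
```

Keeping the policy in the template means the retention rule is reviewed with the rest of the deployment instead of being edited by hand in the portal.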

Failure Injection Walkthrough

Failure 1: The web scale set deploys, but the site returns 502 through the application gateway. Check backend health. If probes fail, verify the probe path, port, NSG rules, guest firewall, and whether the web service started after cloud-init or extension execution. If instances are unhealthy because an extension failed, review extension status and logs before changing the load balancer.

Failure 2: The Windows app VM cannot be reached by RDP. Because policy denies public RDP, this is expected unless Bastion or a VPN path is configured. Check that the Bastion subnet is named AzureBastionSubnet, then check the Bastion public IP, VNet peering or route path, NSG rules, and the guest firewall. Do not add a public IP and open port 3389 unless the requirement changes.

Failure 3: Backup enablement fails for vm-app-02. Check vault region, RBAC, whether the VM is already protected, locks, policy, VM agent health, and backup extension status. Compare with vm-app-01, which is protected, to identify configuration drift.

Failure 4: Seasonal traffic arrives, but autoscale does not add web instances. Check autoscale min and max, metric rule duration, cooldown, CPU metric availability, and quota for the VM family in East US. If max count is 3 and current count is 3, the rule is working as configured but the design limit is too low.
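The settings named in Failure 4 map directly onto fields of an autoscale resource. The following sketch shows where minimum, maximum, rule duration, and cooldown live; the thresholds, counts, resource name, and API version are assumptions for illustration, and `vmssId` is a hypothetical parameter holding the scale set's resource ID.

```bicep
// Hypothetical autoscale settings for the web scale set; all numbers are assumptions.
param location string
param vmssId string

resource webAutoscale 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: 'autoscale-web'
  location: location
  properties: {
    enabled: true
    targetResourceUri: vmssId
    profiles: [
      {
        name: 'default'
        capacity: {
          minimum: '3'
          maximum: '9' // if this equals the current count, scale-out cannot happen
          default: '3'
        }
        rules: [
          {
            // Scale out when average CPU stays above 70% for 10 minutes.
            metricTrigger: {
              metricName: 'Percentage CPU'
              metricResourceUri: vmssId
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT10M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
          {
            // Scale in when average CPU stays below 30% for 10 minutes.
            metricTrigger: {
              metricName: 'Percentage CPU'
              metricResourceUri: vmssId
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT10M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 30
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT10M'
            }
          }
        ]
      }
    ]
  }
}
```

Even with a higher maximum, scale-out can still fail if the subscription's quota for the VM family in the region is exhausted, so quota remains a separate check.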

Design Tradeoffs

| Decision | Good answer | Weak answer |
|---|---|---|
| Management access | Bastion or private connectivity | Public RDP or SSH to every VM |
| Web scale | Scale set with health probes | Manual clone of one VM during outage |
| App availability | Multiple VMs across zones or set | One large VM only |
| Deployment | Bicep with parameters and what-if | Portal-only undocumented build |
| Recovery | Backup policy and restore test | Assume snapshots are enough |
| State | External database and durable disks | Store customer data on temporary disk |

Exam Case Study Method

Read the constraints first: security, region, availability, cost, and management model. Then map each requirement to the smallest feature that satisfies it. If a public IP is forbidden, eliminate answers that open inbound internet management. If stateless scale is required, prefer scale sets and autoscale. If the requirement is cross-region disaster recovery, availability zones are not enough. If the requirement is a repeatable deployment, prefer Bicep or ARM over portal steps.

Finally, validate operational completeness. A VM that deploys successfully but lacks backup, monitoring, secure access, and patch management is not finished. Azure administrators are tested on running the environment, not merely creating it.

Test Your Knowledge

In the Contoso case, why is a VM Scale Set a strong fit for the web tier?

Test Your Knowledge

Security policy denies direct RDP and SSH from the internet. Which design choice aligns with the requirement?

Test Your Knowledge

A what-if operation shows that a deployment will replace a production NIC. What should the administrator do next?
