2.1 Model Deployments and Playgrounds

Key Takeaways

Microsoft Foundry organizes AI app work around projects, models, agents, tools, evaluations, monitoring, and enterprise controls.
The model catalog is for choosing models by capability, modality, cost, latency, region, and deployment option; it is not the same thing as a callable endpoint.
A model deployment gives a selected model a deployment name and configuration so application code can send inference requests to it.
Deployments can include model version, capacity or provisioning type, content filtering configuration, and rate limiting details, depending on the model.
The model playground is the fastest place to test prompts, parameters, safety behavior, model comparisons, and generated code before building an app.

Last updated: June 2026

The Foundry Build Flow

Microsoft Foundry is the Azure platform surface for building AI apps with models, agents, tools, evaluations, tracing, monitoring, and governance in one place. For AI-901, the important pattern is not memorizing every portal blade. It is recognizing the order of work: organize the solution in a project, pick the right model, make that model available for inference, test behavior, and then integrate it into an application.

A Foundry project keeps related assets together: model deployments, data connections, prompts, indexes, agents, evaluations, and monitoring artifacts. A Foundry resource is the Azure resource layer that supplies the model and tool capabilities. Microsoft documentation has active new and classic portal experiences, so exam questions should be read for concepts rather than exact menu names.

Catalog, Deployment, Endpoint, Playground

Foundry item	What it means for AI-901	Do not confuse it with
Model catalog	The place to discover and compare models by provider, task, modality, region, price, and deployment type	A running endpoint
Model deployment	A named configuration that makes a selected model callable for inference	The model card itself
Inference endpoint	The API surface your app calls, often using the deployment name in the `model` parameter	The browser playground
Model playground	A controlled test space for prompts, parameters, tools, safety, comparison, and code export	Production monitoring
Instant model	A preview shortcut that lets supported models be called by name without a deployment	A replacement for all production deployments

The local AI-901 cheat sheet compresses this as Catalog | Deploy | Test | Code. That is a strong exam memory aid. The catalog helps you choose. The deployment lets you use. The playground helps you test. The SDK or API lets your application call the model.

What A Deployment Adds

A deployment gives a model a stable name and configuration. Microsoft documents deployment details such as model name, model version, capacity or provisioning type, content filtering configuration, and rate limiting configuration. The exact fields depend on the model and deployment type.

This distinction matters in scenarios. A team might browse a capable multimodal model in the catalog, but the app still cannot call it until the model is deployed or otherwise available through an approved access pattern. A Foundry resource can host many deployments, and billing is tied to inference performed on those deployments, not to merely reading a catalog entry.

Deployment choice is also where production constraints show up. If the app needs a specific region for data residency, reserved throughput, custom content filters, custom guardrails, endpoint-specific configuration, quota partitioning, or a fine-tuned model, a named deployment is usually the safer answer than a preview instant-model shortcut.

How To Choose A Model

Use the scenario, not the model name alone:

Identify the modality. Text chat, embeddings, image input, image generation, speech, and multimodal reasoning point to different model families or Foundry Tools.
Match capability to risk. Simple extraction may not need the largest reasoning model. Complex multi-step analysis may need a stronger model even if it costs more.
Check constraints. Region, latency, quota, data handling, content filters, and available deployment types can eliminate otherwise attractive choices.
Prefer grounding over guessing. If the answer depends on private or changing facts, plan for retrieval-augmented generation instead of relying only on the model's training data.
Prototype before coding. Use the playground to tune prompts, parameters, and safety before committing to SDK code.

Why The Playground Matters

The model playground is where a candidate can see the practical effect of prompt wording, system messages, temperature, maximum output length, and tools such as web search, file search, or code interpreter where available. It can also compare models side by side under synchronized inputs, which helps with price-to-performance decisions.

For AI-901, treat the playground as the bridge between concept and implementation. If a question says the team wants to test tone, response format, grounding, or safety behavior before writing code, the playground is the natural choice. If the question says a production app must call the model repeatedly, the answer shifts toward deployment names, endpoints, authentication, and SDK/API integration.

Test Your Knowledge

A developer has reviewed a model card in the Foundry catalog and decided it fits a customer-support prototype, but the app has no stable model name to use in requests and no inference configuration has been created. What should the developer do next for a deployment-based build?

Create a named model deployment with the needed configuration, then call that deployment from the app.

Assume the catalog page automatically creates an endpoint for every project.

Fine-tune the model before any prompt or playground testing is possible.

Move the project to Microsoft 365 because Foundry models cannot be called from code.

Test Your Knowledge

A team is prototyping with instant models, but the production version must pin behavior, use a custom content filter, and keep traffic in a specific supported region. Which approach best fits those requirements?

Use a named deployment because production configuration requirements are tied to deployment settings.

Use only instant models because they always provide the strongest enterprise controls.

Use the playground transcript as the production endpoint.

Avoid Foundry and train a foundation model from scratch.

Up Next

2.2 Prompts, RAG, and Evaluation

Continue learning

Microsoft Certified: Azure AI Fundamentals

Microsoft Certified: Azure AI Fundamentals (AI-901)

2.1 Model Deployments and Playgrounds

Key Takeaways

The Foundry Build Flow

Catalog, Deployment, Endpoint, Playground

What A Deployment Adds

How To Choose A Model

Why The Playground Matters

Microsoft Certified: Azure AI Fundamentals

1Chapter 1: AI-901 Format and Responsible AI

2Chapter 2: Microsoft Foundry, Models, and Agents

3Chapter 3: Azure AI Services, Vision, Language, and Extraction

4Chapter 4: AI-901 Scenario and Service Selection

5Chapter 5: Practice Labs, Common Traps, and Final Review

Microsoft Certified: Azure AI Fundamentals (AI-901)

2.1 Model Deployments and Playgrounds

Key Takeaways

The Foundry Build Flow

Catalog, Deployment, Endpoint, Playground

What A Deployment Adds

How To Choose A Model

Why The Playground Matters