Foundry Lab Sequence for AI-901
Key Takeaways
- Use labs to recognize the Foundry workflow: source check, project, model deployment, playground test, lightweight client call, agent test, and review.
- A model catalog entry is not enough for an app; a deployment makes the model callable and defines configuration such as content filtering and limits.
- Practice service selection across text, speech, vision, image generation, information extraction, search, and content safety instead of memorizing portal clicks.
- Content Understanding is the high-yield choice for schema-based extraction from documents, images, audio, and video, especially when confidence and grounding matter.
- Responsible AI belongs in every lab: define the user, likely harm, review path, transparency note, privacy boundary, and failure behavior.
Beginner Labs That Match the AI-901 Blueprint
AI-901 is a fundamentals exam, but Microsoft now describes candidates as needing conceptual knowledge plus foundational technical skills with Azure resources, Python syntax, and Foundry. Treat labs as small recognition drills: you are trying to see the moving parts, not become an AI engineer in one week. If you cannot access a paid Azure subscription, use Microsoft Learn exercises, portal screenshots, and code samples as a read-only lab. The goal is to answer workflow and service-selection questions without inventing features that Microsoft does not document.
Before the first lab, confirm the current AI-901 page and open the exam sandbox so the interface is familiar. This lowers wasted attention on exam day and keeps older AI-900 habits from drifting into review.
| Step | Lab target | What to do | AI-901 decision point |
|---|---|---|---|
| 1 | Source and scope | Read the exam page, study guide headings, and sandbox notes | Separate current AI-901 from older AI-900 habits |
| 2 | Foundry tour | Identify resource, project, model catalog, deployments, playground, tools, and agents | Know where a solution is planned and tested |
| 3 | Model deployment | Pick a chat-capable model, deploy it, and record the deployment name | Catalog lists options; deployment makes a model callable |
| 4 | Prompt trial | Test a system instruction and user prompt in the playground | Distinguish system behavior from user task input |
| 5 | Lightweight client | Read a short Python example that calls the endpoint with a deployment name | Recognize endpoint, credential, request, response, and error handling roles |
| 6 | Single agent | Build or trace a single-agent example with instructions and one tool | Agent uses instructions, tools, and state to complete a task |
| 7 | Text and speech | Try sentiment, key phrases, NER, speech to text, and text to speech examples | Match input and output modality to the right Foundry Tool |
| 8 | Vision and images | Compare visual prompt interpretation, OCR, image analysis, and image generation | Analyze an existing image versus create a new image |
| 9 | Extraction | Use Content Understanding examples for a form, receipt, call, or video | Extract fields into a schema with confidence and grounding |
| 10 | Search and review | Sketch a RAG flow with Azure AI Search, content safety, and human review | Search retrieves grounding; the model generates the answer |
The Foundry model lab should be simple. Pick a model because of capability, cost, latency, modality, or region, then deploy it before trying to call it. In exam language, chat models generate or reason over text, embedding models turn content into vectors for retrieval, multimodal models can interpret more than text, and image generation models create new visual output. Do not answer model-selection scenarios with the biggest model by default. Fundamentals questions often reward the least complex service that meets the requirement.
The prompt lab should force you to label the parts of a request. A system message sets behavior and boundaries. A user message carries the task. Parameters influence generation, but they do not add private knowledge. If a question needs current company policy, the safer pattern is retrieval-augmented generation with a search index or connected data, not asking the model to remember facts it was never given.
The agent lab should stay single-agent. AI-901 mentions creating and testing single-agent solutions in Foundry, so practice the idea that an agent can follow instructions and use tools. Do not overbuild multi-agent orchestration for this exam. The useful mental model is: a plain chat app replies from prompt and context; an agent can call an approved tool, inspect results, and continue toward a goal. Responsible design asks what the agent may do automatically, what needs confirmation, and where logs or traces would be reviewed.
For text and speech, keep the choices concrete. Use Azure AI Language for text analytics such as named entity recognition, key phrase extraction, sentiment, summarization, and PII detection. Use Azure AI Speech for speech to text, text to speech, and speech translation scenarios. If the scenario says a spoken prompt is handled by a deployed multimodal model, recognize the multimodal path; if it asks for a traditional transcription workflow, Speech is the cleaner answer.
For vision, extraction, and search, watch the output format. Azure AI Vision or a multimodal model can analyze image content. Image generation creates a new image. Content Understanding is stronger when the requirement is structured extraction from documents, images, audio, or video into a defined schema with confidence and source grounding. Azure AI Search is the retrieval layer for enterprise content and RAG, not a replacement for the generative model.
End each lab by writing the input, output, chosen service, and one responsible AI control such as human review, disclosure, privacy minimization, or content safety. Those notes are closer to AI-901 reasoning than memorizing portal clicks.
A learner finds a suitable chat model in the Foundry model catalog and immediately plans to call it from a Python app. What step is missing?
A beginner lab uses scanned invoices and needs vendor, date, line items, and total returned as structured fields with confidence and source grounding. Which Foundry-focused capability is the best fit?