Why a Skills-Measured Deep Dive for AI-103
Microsoft AI-103, Developing AI Apps and Agents on Azure, is the exam behind the Microsoft Certified: Azure AI Apps and Agents Developer Associate credential. It replaces AI-102, which Microsoft retires on June 30, 2026. If you already read our general AI-103 Exam Guide 2026, this is the companion: a domain-by-domain breakdown of the official AI-103 study guide with a practice-question strategy for each of the five objective areas.
The reason this matters is that AI-103 is not a vocabulary test. Microsoft's audience profile describes candidates as Azure AI engineers who build, manage, and deploy agents and AI solutions that take advantage of Microsoft Foundry, with Python experience and familiarity with general AI, generative AI, and Azure services. Scenario questions ask which Foundry service, which retrieval mode, which agent tool schema, or which safety control fits a constraint. The fastest way to get there is to practice by domain, not by reading a single stack of notes.
The Official AI-103 Skills Measured Snapshot
Microsoft lists the skills measured as of April 16, 2026 on the AI-103 study guide. There are five domains, and the two highest-weighted areas together account for roughly 55 to 65 percent of the exam.
| Domain | Weight | Question emphasis |
|---|---|---|
| Plan and manage an Azure AI solution | 25–30% | Foundry service choice, deployment, security, quotas, cost, monitoring, responsible AI guardrails |
| Implement generative AI and agentic solutions | 30–35% | RAG, LLM and multimodal consumption, tool-augmented flows, agent roles and tools, multi-agent orchestration, evaluation, tracing |
| Implement computer vision solutions | 10–15% | Text-to-image and text-to-video generation, editing, multimodal understanding, Content Understanding, prompt-injection defense |
| Implement text analysis solutions | 10–15% | Entity, topic, and summary extraction, structured JSON, sentiment and safety, Azure Translator, speech-to-text and text-to-speech |
| Implement information extraction solutions | 10–15% | Ingestion and indexing, semantic, hybrid, and vector search, custom skill enrichment, OCR-based RAG, Content Understanding analyzers |
A practical rule: if a topic appears in the 25–30% or 30–35% band, expect multiple scenario questions on it. The three 10–15% domains are smaller but each is technical enough that a thin review will cost you points.
Domain 1: Plan and Manage an Azure AI Solution (25–30%)
What the official outline tests
The AI-103 study guide splits this domain into four skill groups:
- Choose the appropriate Foundry services — pick the right model class (LLM, small language model, multimodal, Foundry Tools), the right service for grounding, vector search, agent workflows, or multimodal processing, the right retrieval and indexing method, and the right memory, tool, and knowledge integration services for agents.
- Set up AI solutions in Foundry — design Azure infrastructure for AI apps and agents, choose deployment options, configure model and agent deployments, and integrate Foundry projects with CI/CD pipelines.
- Manage, monitor, and secure AI systems — quotas, scaling, rate limits, and cost footprints; model drift, safety events, and grounding quality; data ingestion and search index health; managed identity, private networking, keyless credentials, and role policies.
- Implement responsible AI — safety filters, guardrails, risk detection, content moderation; evaluators, safety evaluations, and explanation tooling; trace logging, provenance metadata, and approval workflows; agent oversight modes, constraints, and tool-access controls.
Practice-question strategy
High-yield drill topics
- Managed identity vs. keyless credentials vs. API keys, and when each is appropriate.
- PTU vs. provisioned vs. global vs. batch deployment, and the cost or latency trade.
- Private endpoints, network isolation, and Foundry project-level RBAC.
- Quota, rate-limit, and scaling behavior for model and agent workloads.
- Safety filters, content moderation, and responsible AI evaluators as configuration choices, not afterthoughts.
- CI/CD for Foundry projects: what gets promoted, how agent definitions are versioned.
Domain 2: Implement Generative AI and Agentic Solutions (30–35%)
This is the largest domain and the one that most distinguishes AI-103 from the older AI-102 service-by-service structure.
What the official outline tests
- Build generative applications by using Foundry — deploy and consume LLMs, small models, code models, and multimodal models; implement retrieval-augmented generation; design workflows, tool-augmented flows, and multistep reasoning pipelines; evaluate models and apps for fabrications, relevance, quality, and safety; integrate generative workflows through Foundry SDKs and connectors; connect an application to a Foundry project.
- Build agents by using Foundry — define agent roles, goals, conversation-tracking, and tool schemas; integrate retrieval, function-calling, and conversation memory; wire up agent tools (APIs, knowledge stores, search, content understanding, custom functions); orchestrate multi-agent solutions; build autonomous or semiautonomous workflows with safeguards and approval flow controls; monitor and evaluate deployed agents and perform error analysis.
- Optimize and operationalize generative AI systems — prompt engineering and model parameter tuning; reflection, chain-of-thought evaluation, and self-critique loops; tracing, token analytics, safety signals, and latency observability; orchestrate multiple models, flows, or hybrid LLM-plus-rules engines.
Practice-question strategy
This domain is where hands-on labs pay off most. A conceptual reader can name RAG; a builder can diagnose why a grounded answer is empty, which retrieval mode returns the wrong chunks, or why an agent loop never exits. Pair every sub-skill with a small Foundry lab: a RAG app over Azure AI Search with vector and hybrid retrieval, an agent with at least two tools and a human approval gate, a multi-agent handoff, and an evaluation run that reports groundedness, relevance, and safety. Then drill scenario questions that describe a symptom (agent loops, tool returns the wrong shape, answers not grounded, unsafe output) and ask for the fix.
High-yield drill topics
- RAG ingestion: chunking, embeddings, integrated vectorization, and why hybrid plus semantic ranker usually beats pure vector search.
- Agent tool schemas and function calling: OpenAPI spec shape, parameter validation, and error returns.
- Conversation memory: per-thread short-term memory vs. long-term memory, and when to summarize.
- Multi-agent orchestration patterns: coordinator, handoff, and parallel fan-out, plus when each breaks.
- Approval gates and tool-access controls for high-impact actions (refunds, account changes, deletions).
- Evaluators: groundedness, relevance, coherence, fluency, safety, and what each one actually measures.
- Tracing and token analytics as debugging tools, not just observability decoration.
Domain 3: Implement Computer Vision Solutions (10–15%)
What the official outline tests
- Image- and video-generation solutions — text-to-image and text-to-video generation from prompts and reference media, inpainting and mask-based edits, prompt-driven modifications, video editing workflows, and platform generation and editing controls.
- Multimodal understanding workflows — visual context analysis with multimodal models, single and multi-image captions, visual question-answering, alt-text and accessibility-aligned descriptions, Azure Content Understanding for visual characteristics, video segment analysis, single-task and pro-mode Content Understanding pipelines, and object, component, or region identification.
- Responsible AI for multimodal content — filters for unsafe or disallowed visual content, indirect prompt-injection detection via embedded text in images, and visual policy rules such as watermarks, prohibited symbols, and brand usage enforcement.
Practice-question strategy
This domain is smaller but unusually trap-prone because candidates underestimate multimodal prompt injection. Drill two question families: (1) generation-vs-editing control selection, where the question describes a target output and you pick the right tool or control; and (2) safety and injection scenarios, where an image contains embedded text that hijacks the model. Build one lab that captions images and one that runs a Content Understanding pipeline over a video segment so you can speak to the configuration choices from memory.
High-yield drill topics
- When to use single-task vs. pro-mode Content Understanding.
- The difference between captions, alt-text, and extended image descriptions, and the accessibility alignment.
- Indirect prompt injection via embedded image text, and the mitigations Microsoft Foundry provides.
- Watermarking, prohibited-symbol detection, and brand-usage enforcement as visual policy rules.
Domain 4: Implement Text Analysis Solutions (10–15%)
What the official outline tests
- Language model text analysis — extract entities, topics, summaries, and structured JSON outputs with generative prompting and Foundry Tools; detect sentiment, tone, safety issues, and sensitive content; translate text with Azure Translator in Foundry Tools or LLM-powered translation flows; customize outputs for domain tasks like compliance summarization and domain-specific extraction.
- Speech solutions — speech-to-text and text-to-speech for agentic interactions, speech as an agent modality including custom speech models, multimodal reasoning from audio inputs, and speech translation with language models and Foundry Tools.
Practice-question strategy
The trap here is defaulting to an LLM for every text task. Microsoft tests when Azure Translator is the better choice over an LLM translation flow, when structured JSON output beats freeform summarization, and when speech is the right agent modality instead of text. Drill decision questions that pair a task (compliance summarization, real-time translation, agent voice interaction) with the right service combination. Build one speech-to-text-to-LLM-to-text-to-speech loop so the modality switches are concrete.
High-yield drill topics
- Structured JSON output vs. freeform summary, and the prompt patterns that produce reliable schema.
- Azure Translator vs. LLM translation: cost, latency, determinism, and domain customization.
- Sentiment, tone, safety, and sensitive-content detection as configurable signals.
- Custom speech models and when they are worth the training cost.
- Speech as an agent modality, including the handoff between speech-to-text, reasoning, and text-to-speech.
Domain 5: Implement Information Extraction Solutions (10–15%)
What the official outline tests
- Build retrieval and grounding pipelines — ingest and index documents, images, audio, and video; configure semantic, hybrid, and vector search for grounding; enrich with custom or built-in skills for text, images, and layout; configure RAG ingestion flow including OCR; connect retrieval pipelines to workflows and agent tools.
- Extract content from documents — multimodal pipelines combining OCR, layout analysis, and field extraction; clean grounded representations for agents and RAG with Content Understanding; structured or markdown output analyzers for downstream reasoning with Content Understanding.
Practice-question strategy
This domain overlaps with the generative AI domain's RAG coverage, but here the lens is extraction and indexing rather than generation. Drill questions that describe a source format (scanned PDF, form with tables, image-heavy contract, audio recording) and ask which pipeline produces clean structured output for downstream use. Build one end-to-end pipeline: ingest a PDF, run OCR and layout analysis, enrich with a custom skill, index for hybrid and vector search, and expose the index as an agent tool. That single lab answers most of the domain's scenario questions.
High-yield drill topics
- Semantic vs. hybrid vs. vector search, and when each is the right grounding strategy.
- Custom skill enrichment vs. built-in skills, and the enrichment pipeline order.
- OCR, layout analysis, and field extraction as a combined multimodal pipeline.
- Content Understanding analyzers that produce structured or markdown output for downstream reasoning.
- Connecting a retrieval pipeline directly to an agent tool vs. calling search inline.
Hands-On Labs vs. Conceptual Questions
AI-103 rewards a mix of two practice modes. Conceptual drilling builds the decision tables you need for service-choice questions. Hands-on labs build the symptom-to-cause recognition you need for scenario questions. If you only do one, you leave points on the table.
A useful split, weighted by domain weight:
| Domain | Conceptual drilling | Hands-on lab time |
|---|---|---|
| Plan and manage (25–30%) | Decision tables for identity, deployment, monitoring | One Foundry project with identity, RBAC, deployment, cost controls |
| Generative AI and agents (30–35%) | Evaluator definitions, tool schema rules | RAG app, agent with tools and approval gate, multi-agent handoff, evaluation run |
| Computer vision (10–15%) | Generation vs. editing controls, injection mitigations | Image caption + Content Understanding video pipeline |
| Text analysis (10–15%) | Translator vs. LLM, JSON output patterns | Speech-to-text-to-LLM-to-text-to-speech loop |
| Information extraction (10–15%) | Retrieval mode comparison, skill enrichment order | PDF-to-index pipeline with OCR, layout, custom skill, hybrid search |
If your time is short, spend lab hours on the generative AI and planning domains first. Those two domains carry more than half the exam and they are the ones where a conceptual-only candidate breaks down on scenario wording.
A Domain-Weighted AI-103 Practice Schedule
This eight-week schedule is built around the official weights, not around equal time per domain. Compress it if you already build Foundry apps at work; stretch it if you are coming from AI-900 or general software development.
| Week | Domain focus | Practice output |
|---|---|---|
| 1 | Plan and manage (25–30%) | 25 mixed questions; build a Foundry project with identity, deployment, cost controls |
| 2 | Plan and manage + responsible AI | 25 questions on safety, guardrails, monitoring, and oversight controls |
| 3 | Generative AI foundations (30–35%) | 30 questions on RAG, model choice, structured output, and evaluation |
| 4 | Agents and tools | 30 questions on tool schemas, function calling, memory, approval gates |
| 5 | Multi-agent and operationalization | 25 questions on orchestration, tracing, token analytics, error analysis |
| 6 | Computer vision (10–15%) | 15 questions; build a caption + Content Understanding lab |
| 7 | Text analysis + information extraction (10–15% each) | 30 questions; build a PDF-to-index pipeline and a speech loop |
| 8 | Timed mixed review | 50 mixed questions across all five domains, explanations after submit |
Tag every miss by failure type: concept, service boundary, syntax, sequence, or speed. Concept misses require documentation review. Service-boundary misses require a comparison table. Syntax misses require a short hands-on drill. Sequence misses require writing the order of operations. Speed misses require smaller timed sets. Rereading a chapter will not fix a lab-verification problem, so let the miss type drive the fix.
How to Use OpenExamPrep AI-103 Practice
- Diagnostic pass — 30 mixed questions to find your weak domains. Do not time this pass; read every explanation.
- Targeted pass — drill the two highest-weighted domains first (generative AI and agents, then planning and management). Tag misses by sub-skill, not just by domain.
- Scenario pass — work the agent, RAG, and information extraction scenario questions. For each, name the constraint, the layer, and the least-risky action before you commit.
- Timed pass — mixed review across all five domains, explanations only after submit, to rehearse pacing for the 120-minute exam.
Pair the practice with the general AI-103 Exam Guide 2026 for cost, format, retake, and an eight-week study plan. This skills deep dive is the practice companion; the exam guide is the overview.
Official Sources and Current Checks
- Microsoft AI-103 study guide (skills measured as of April 16, 2026): https://learn.microsoft.com/en-us/credentials/certifications/resources/study-guides/ai-103
- Certification page (Azure AI Apps and Agents Developer Associate): https://learn.microsoft.com/en-us/credentials/certifications/azure-ai-apps-and-agents-developer-associate/
- Microsoft exam scoring and score reports (700 pass score on a 1000-point scale): https://learn.microsoft.com/en-us/credentials/certifications/exam-scoring-reports
- Microsoft exam retake policy (24 hours after the first attempt, then varies): https://learn.microsoft.com/en-us/credentials/support/retake-policy
Microsoft updates certification exams periodically. The skills measured above are current as of April 16, 2026. Recheck the official study guide before you schedule, because AI certifications are in an active transition period and Microsoft updates the English exam first.
