Prompt Design, Model Selection, and Structured Outputs

Key Takeaways

  • AI-103 expects you to choose models by task requirements: reasoning depth, modality, latency, context size, cost, deployment type, and tool support.
  • A strong prompt defines role, task, context, constraints, output format, refusal behavior, and examples only when examples clarify the expected pattern.
  • Structured outputs are for reliable machine-readable JSON; function calling is for deciding whether an app should execute a tool or API.
  • For schema-bound outputs, require every field, set additionalProperties to false, validate the response, and handle schema or refusal failures.
  • Generation parameters such as temperature and max output tokens are operational controls, not substitutes for grounding, tool validation, or evaluations.
Last updated: June 2026

Why This Topic Matters

AI-103 is not a trivia exam about model names. Microsoft frames candidates as engineers who build, manage, and deploy Foundry AI apps and agents. That means you need to turn a business task into a model, prompt, parameter, and output contract that a production app can trust.

Model Selection Decision Grid

RequirementPreferExam clue
Deep planning, math, code review, multi-step analysisReasoning-capable large modelHard problem, high quality, latency acceptable
Short summarization, routing, classification, simple extractionSmall or mini modelHigh volume, low cost, low latency
Images, screenshots, diagrams, audio, or mixed inputsMultimodal modelUser provides non-text evidence
Similarity search and RAG indexingEmbedding modelNeed vectors, chunk retrieval, semantic match
Exact business schemaStructured outputsDownstream code needs valid JSON
External action or live dataTool or function callingNeed API, database, workflow, or search result

A safe exam answer usually chooses the smallest capability set that satisfies the task. Do not pick a reasoning model for a simple sentiment label if a smaller model can meet accuracy and latency targets. Do not fine-tune a model just to add current facts; use retrieval-augmented generation for facts and fine-tuning for durable behavior, tone, or domain-specific response patterns.

Prompt Anatomy

A production prompt should separate durable instructions from user input and retrieved content. Use system or developer instructions for role, policy, and format. Treat user text, tool output, documents, image OCR, and retrieved chunks as untrusted input. A practical prompt checklist is:

  1. Role: what the assistant is responsible for.
  2. Task: what the current request requires.
  3. Context: facts, retrieved chunks, or tool results to use.
  4. Constraints: what not to do, what to refuse, what to ask about.
  5. Format: table, bullets, JSON schema, or terse answer.
  6. Quality rule: cite sources, ask a clarifying question, or say when evidence is missing.

Use few-shot examples when the output pattern is subtle, such as mapping support tickets to a custom severity taxonomy. Avoid long example libraries that inflate token cost and hide the real user request. For deterministic extraction or routing, lower temperature and cap output length. For brainstorming, a higher temperature may be acceptable, but the exam will usually reward predictability in business apps.

Structured JSON Outputs

Structured outputs constrain the model to a JSON Schema subset so application code can parse and validate the response. They are most useful for extraction, routing, tool arguments, evaluation records, and UI state. Design the schema as an API contract: make fields required, use enums for closed choices, use arrays or objects only where the app expects them, and set additionalProperties to false so unexpected keys do not silently leak into automation.

Function calling is related but different. Structured output answers in a shape. Function calling lets the model request a tool call with arguments, while the host app executes the tool and returns the result. A model can propose the call, but your code remains responsible for validation, authorization, retries, user confirmation for high-impact actions, and final error handling.

Official Anchors

Test Your Knowledge

A claims intake app must extract claimType, policyNumber, dateOfLoss, and priority as valid JSON for a workflow engine. The answer must reject any extra fields. Which design best fits the requirement?

A
B
C
D