2.3 Foundry Agent Service
Key Takeaways
- A Foundry agent combines a model, instructions, and tools so it can reason over a request and take multi-step action.
- Foundry Agent Service is a managed platform for building, deploying, scaling, tracing, evaluating, publishing, and monitoring AI agents.
- The Responses API is the central entry point for Foundry models and tools in the current agent architecture.
- Agent runtime concepts include agents, conversations, and responses: the agent defines behavior, the conversation can preserve context, and the response is the generated output and tool activity.
- Agents need scoped tools, identity, RBAC, safety controls, tracing, and evaluation because tool calls can affect external systems.
From Chatbot To Agent
A basic chat app sends a prompt to a model and displays the response. An agent goes further: it uses a model, instructions, and tools to work toward a goal across one or more steps. In Foundry, agents can search files, use code interpreter, call functions or APIs, connect to Model Context Protocol (MCP) servers, use memory where supported, and work through managed runtime components.
For AI-901, the simplest definition is: agent = model + instructions + tools. The model provides reasoning and language generation. Instructions define the agent's role, limits, and expected behavior. Tools let the agent access data or take actions beyond text generation.
Agent Service At A Glance
| Component | What it does | Exam signal |
|---|---|---|
| Responses API | Provides a single entry point for model calls and tool orchestration | Use when code needs Foundry model and tool access |
| Agent runtime | Hosts and scales agents, manages lifecycle, conversations, and tool calls | Use when the agent should be managed by Foundry |
| Tools | Add capabilities such as file search, web search, code interpreter, custom functions, and MCP integrations | Use when the task requires data lookup or action |
| Conversations | Preserve interaction history across turns when needed | Use for multi-turn stateful work |
| Tracing and metrics | Show model calls, tool use, decisions, latency, and reliability | Use for debugging and production monitoring |
| Publishing | Creates stable managed access for users or other systems | Use after testing and evaluation |
Microsoft documentation now emphasizes the Responses API and newer Foundry terminology. Older docs and examples may mention assistants, threads, messages, or classic portal surfaces. The exam concept remains stable: agents need orchestration, tools, and observability, not just a single generated paragraph.
Agent Types And Build Paths
Foundry Agent Service supports multiple build styles. Prompt agents are configured through instructions, model selection, and tools, and are useful for rapid prototypes or internal assistants. Hosted agents are code-based agents that can be packaged and run under Foundry-managed hosting; Microsoft marks hosted agents as preview. Workflow agents can orchestrate steps, approvals, branching, or multiple agents and are also preview in current docs.
AI-901 questions are unlikely to ask for every preview nuance. Instead, expect scenarios such as: a user asks for meeting scheduling, the agent checks a calendar tool, proposes slots, and books the selected time. That is agentic because a tool call changes the outcome. A simple chatbot that only explains what scheduling means is not enough.
Agent Runtime Pattern
A Foundry agent workflow usually follows this process:
- Create the agent. Choose the model, write instructions, configure parameters, and attach allowed tools.
- Start or reuse a conversation. Preserve the turns needed for context, but avoid keeping unnecessary sensitive data.
- Generate a response. The agent interprets the user input, may call tools, may add items to the conversation, and produces output.
- Trace the run. Inspect which model calls, tool calls, and intermediate decisions occurred.
- Evaluate behavior. Check final task completion and step-by-step tool use.
- Publish and monitor. Promote stable versions, manage access, and watch production reliability.
This process explains why agents carry more risk than ordinary chat. If a tool can update a record, send a message, query private data, or run code, the agent needs least-privilege access and clear instructions.
Tools, Identity, And Safety
Tools should be scoped to the task. A support agent that only needs order status should not have broad write access to customer records. Foundry supports identity and access controls such as Microsoft Entra authentication, RBAC, managed identities, and On-Behalf-Of authentication for some tool scenarios. The exact setup depends on the tool, region, and preview status.
Safety controls matter at multiple points. User input can contain prompt attacks. Retrieved tool output can contain hostile instructions. Final responses can reveal sensitive data or unsupported claims. That is why agent designs combine instructions, guardrails, access control, tracing, and evaluation.
Evaluating Agents
Agent evaluation should test both the final result and the process. System-level checks ask whether the agent completed the task, resolved the user's intent, and stayed within instructions. Process-level checks ask whether it chose the right tool, passed correct parameters, used tool output correctly, and avoided unnecessary or failed calls.
On AI-901, choose an agent when the scenario requires multi-step behavior, tool use, or external action. Choose a basic model call or chat playground when the scenario only needs a one-turn answer, prompt experiment, or simple text generation.
A retailer wants an AI assistant that can look up an order, check the shipping carrier's API, and create a support-ticket note after the customer confirms the summary. Which Foundry concept best matches this design?
An agent usually gives a good final answer, but traces show it sometimes calls the refund API with the wrong customer identifier. Which evaluation focus is most directly relevant?