A claims operations app must answer policy questions from documents and also create a follow-up task in another system only when the situation requires it. Which architecture fits best?

A tool-calling agent with a retrieval tool and a workflow tool. Mixing knowledge retrieval with selective action-taking is the signature of an agent, because the model can decide when filing the task is warranted. A plain RAG chain answers questions but is not designed to decide when to call external systems, and a batch AI Function only transforms rows.

In Databricks agent development, what does wrapping an agent with the MLflow ResponsesAgent interface accomplish?

It standardizes how the agent is exposed so AI Playground, Agent Evaluation, and Databricks Apps can drive it uniformly. That consistent interface is what unlocks Playground testing, evaluation, and app deployment. It does not change serving throughput, remove tool definitions, or encrypt memory.

An agent repeatedly calls the same search tool without making progress toward an answer. Which change is most likely to fix the behavior?

Tighten the tool schemas and descriptions and add explicit stopping conditions. Repeated tool loops usually signal weak tool contracts or missing termination logic rather than insufficient model creativity. Clear tool descriptions, well-structured argument schemas, and max-iteration or stopping rules address the root cause, while a bigger model or more chunk overlap does not.

Tools, Agents & Function Calling — Free Study Guide 2026

From Chains to Agents

The Application Development domain is the exam's heaviest block at 30%, and the single most-tested judgment inside it is knowing when a problem needs an agent rather than a simpler chain. A chain is a fixed sequence of calls whose control flow is decided at design time: retrieve context, build a prompt, call the model, return the answer. An agent lets the model decide which tool or step to call next based on its reasoning over intermediate results. The exam repeatedly frames this as a trade-off: agents add flexibility but also add latency, cost, and harder evaluation, so you should never graduate to an agent unless a single retrieval pass genuinely cannot solve the task.

Dimension	RAG chain	Tool-calling agent
Control flow	Fixed, known at design time	Dynamic, model-decided
Best for	Retrieve-read-answer Q&A	Multi-step, conditional, actions
Latency	Lower, predictable	Higher, variable
Evaluation	Answer quality	Tool choice + args + final success
Example	Handbook chatbot with citations	Claims app that also files a ticket

When an agent beats a static chain: the workflow needs multiple steps, tool calls, conditional branching, iterative retrieval, or the ability to take an action in an external system. If a FAQ assistant has a strict retrieve-read-answer flow under a tight latency SLA, a chain is preferable because it is more predictable and lower latency. But once the app must answer policy questions, create access tickets, and check ticket status in external systems, retrieval alone is not enough and an agent or hybrid design fits.
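The fixed control flow of a chain described in this section can be sketched in a few lines. This is a minimal illustration, not a Databricks API: `retrieve`, `build_prompt`, `fake_model`, and the two-entry corpus are hypothetical stand-ins so the example runs without a vector index or a model endpoint.

```python
from typing import Callable, List


def retrieve(question: str) -> List[str]:
    # Hypothetical keyword lookup; a real app would query a vector index.
    corpus = {
        "vacation": "Employees accrue 1.5 vacation days per month.",
        "expenses": "Expense reports are due within 30 days.",
    }
    return [text for key, text in corpus.items() if key in question.lower()]


def build_prompt(question: str, context: List[str]) -> str:
    joined = "\n".join(context) if context else "(no context found)"
    return f"Context:\n{joined}\n\nQuestion: {question}"


def fake_model(prompt: str) -> str:
    # Assumption: echoes the first context line instead of calling an LLM.
    return prompt.splitlines()[1]


def rag_chain(question: str, model: Callable[[str], str] = fake_model) -> str:
    # Control flow is decided at design time: retrieve -> prompt -> model -> answer.
    context = retrieve(question)
    prompt = build_prompt(question, context)
    return model(prompt)
```

Nothing in `rag_chain` lets the model choose a different next step, which is why a chain stays predictable and why it cannot decide to file a ticket.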

Function / Tool Calling Mechanics

Tool calling (also called function calling) is the bridge between the model and live data or systems. You define a set of tools, each with a name, a natural-language description, and a typed argument schema, and the model emits a structured request to invoke one when it decides the task requires it. Your runtime executes the function, returns the result, and the model conditions its next output on that observation. The key division of responsibility for the exam: the model decides WHEN to call a tool; you decide WHICH tools exist and you validate the call schema. A support copilot that must fetch a customer's live order status from an internal API needs tool calling, not a larger context window: more context helps the model reason over text but gives it no live access to operational systems.
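The division of responsibility above can be sketched as a small runtime that owns the tool registry and validates each structured call before executing it. This is a sketch under stated assumptions: the registry layout, `get_order_status`, and the JSON call shape are hypothetical and do not reflect any specific provider's wire format.

```python
import json
from typing import Any, Dict


def get_order_status(order_id: str) -> Dict[str, Any]:
    # Hypothetical tool body; a real one would call the internal order API.
    return {"order_id": order_id, "status": "shipped"}


# You decide WHICH tools exist: name, description, typed argument schema.
TOOLS: Dict[str, Dict[str, Any]] = {
    "get_order_status": {
        "description": "Look up the live status of a customer order by its ID.",
        "parameters": {"order_id": str},  # argument name -> required type
        "function": get_order_status,
    }
}


def validate_call(call: Dict[str, Any]) -> None:
    # Reject unknown tools, missing or extra arguments, and wrong types.
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    schema = TOOLS[name]["parameters"]
    args = call.get("arguments", {})
    if set(args) != set(schema):
        raise ValueError(f"expected arguments {sorted(schema)}, got {sorted(args)}")
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            raise ValueError(f"argument {key!r} must be {expected_type.__name__}")


def execute_tool_call(raw_call: str) -> str:
    # The model decides WHEN to emit raw_call; the runtime validates and runs it.
    call = json.loads(raw_call)
    validate_call(call)
    result = TOOLS[call["name"]]["function"](**call["arguments"])
    return json.dumps(result)  # returned to the model as the observation
```

A call such as `execute_tool_call('{"name": "get_order_status", "arguments": {"order_id": "A-17"}}')` returns a JSON string the model can read, while a call with a numeric `order_id` is rejected before any code with side effects runs.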

ReAct-Style Reasoning

Most tool-using agents follow a ReAct ('Reason + Act') loop: the model produces a Thought, chooses an Action (a tool call), reads the Observation returned, and repeats until it has enough information to give a final answer. A research assistant that must dynamically decide whether to search documents, run a calculator, or query a database until it can answer is the canonical ReAct case. Because the loop is open-ended, you must bound it with stopping conditions: a maximum iteration count and clear termination logic, or the agent can spin. When an agent keeps calling the same search tool without making progress, the fix is rarely a bigger model; it is tighter tool schemas, clearer tool descriptions, and explicit stopping rules.
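A bounded version of this loop might look like the sketch below. The model is replaced by a scripted `policy` function so the example runs offline; `run_react`, `scripted_policy`, and the two toy tools are hypothetical names introduced only for illustration. The two stopping rules shown are an iteration cap and a check for a repeated identical tool call.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action, action_input)


def run_react(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 5,
) -> str:
    observations: List[str] = []
    seen_calls = set()
    for _ in range(max_iterations):
        thought, action, action_input = policy(observations)
        if action == "finish":
            return action_input  # final answer
        if (action, action_input) in seen_calls:
            # Same tool, same arguments again: the agent is not progressing.
            return "stopped: repeated identical tool call"
        seen_calls.add((action, action_input))
        observations.append(tools[action](action_input))
    return "stopped: iteration limit reached"


def scripted_policy(observations: List[str]) -> Step:
    # Stand-in for the model: search first, then calculate, then answer.
    if not observations:
        return ("I need the unit price.", "search", "unit price of widget")
    if len(observations) == 1:
        return ("Now compute the total for 3 units.", "calculator", "3 * 4")
    return ("I have the total.", "finish", f"Total is {observations[-1]}")


def calculator(expression: str) -> str:
    # Toy calculator that only handles "a * b" with integers.
    left, right = expression.split("*")
    return str(int(left) * int(right))


TOOLS = {"search": lambda query: "unit price is 4", "calculator": calculator}
```

With the scripted policy the loop ends in a final answer; a policy that keeps issuing the same search is cut off on its second identical call rather than running until the budget is exhausted.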

The Databricks Mosaic AI Agent Framework

Databricks provides platform components so you do not build agent plumbing from scratch:

MLflow ResponsesAgent is the standard interface for exposing an agent so Databricks tooling (AI Playground, Agent Evaluation, Databricks Apps) can drive it uniformly. Wrapping a custom agent in ResponsesAgent is what unlocks those downstream features; skipping it loses platform integration.
LangChain integration exposes Vector Search as a vector store; the typical pattern is DatabricksVectorSearch(...).as_retriever() to plug retrieval into chains and agents.
LangGraph models the agent as a graph of nodes and edges with explicit state, which you need for branching, loops, retries, or memory across turns. Reach for it when control flow is conditional or cyclic; a linear chain is simpler but less expressive.
Databricks SQL Agent is the right abstraction for natural-language, read-only Q&A over governed Unity Catalog tables, because it translates requests into governed SQL rather than retrieving documents.
MCP (Model Context Protocol) is a standard protocol for exposing tools, resources, and prompts to agents. Databricks can host a managed MCP endpoint so an agent reaches, say, a Vector Search index as a governed tool instead of through a bespoke connector.
Agent Bricks provides managed building blocks that handle tool wiring, memory, and evaluation hooks; the March 18, 2026 blueprint deepened its coverage of them.
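The graph idea in the LangGraph item above (nodes, edges, explicit state, and a conditional edge that can loop back) can be illustrated in plain Python. This sketch deliberately does not use the LangGraph API; `State`, `retrieve_node`, `route_after_retrieve`, and `run_graph` are hypothetical names, and the retrieval that succeeds on the second attempt is contrived to show a retry.

```python
from typing import Callable, Dict, TypedDict


class State(TypedDict):
    question: str
    attempts: int
    answer: str


def retrieve_node(state: State) -> State:
    # Hypothetical retrieval that only succeeds on the second attempt.
    attempts = state["attempts"] + 1
    answer = "policy text" if attempts >= 2 else ""
    return {"question": state["question"], "attempts": attempts, "answer": answer}


def route_after_retrieve(state: State) -> str:
    # Conditional edge: stop on success or after three attempts, else retry.
    if state["answer"] or state["attempts"] >= 3:
        return "END"
    return "retrieve"


NODES: Dict[str, Callable[[State], State]] = {"retrieve": retrieve_node}
EDGES: Dict[str, Callable[[State], str]] = {"retrieve": route_after_retrieve}


def run_graph(state: State, start: str = "retrieve") -> State:
    current = start
    while current != "END":
        state = NODES[current](state)      # run the node, producing new state
        current = EDGES[current](state)    # pick the next node from the state
    return state
```

A linear chain has no equivalent of `route_after_retrieve`, which is the piece that makes cyclic or branching control flow expressible.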

Persistent state / memory lets a multi-step agent remember intermediate facts across calls. Match the storage to the state's lifetime: in-memory for a single request, and a persistent datastore scoped to the conversation for per-session memory, so a confirmed address is remembered within one conversation but never leaks across users. Cross-session memory belongs in a Delta table or a key-value store.
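Scoping memory to the conversation can be sketched with a store keyed by user and conversation. The `SessionMemory` class below is a hypothetical in-process stand-in for a persistent datastore; the point is the key design, not the storage engine.

```python
from typing import Dict, Optional, Tuple


class SessionMemory:
    """Stand-in for a persistent store keyed by (user_id, conversation_id)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Dict[str, str]] = {}

    def remember(self, user_id: str, conversation_id: str, key: str, value: str) -> None:
        # Facts are written under the composite key, never under a global one.
        self._store.setdefault((user_id, conversation_id), {})[key] = value

    def recall(self, user_id: str, conversation_id: str, key: str) -> Optional[str]:
        # Returns None when this user and conversation never stored the fact.
        return self._store.get((user_id, conversation_id), {}).get(key)


memory = SessionMemory()
memory.remember("alice", "conv-1", "address", "12 Elm St")
```

Because every read and write carries both identifiers, a fact confirmed by one user in one conversation is invisible to other users and to that user's other conversations.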

Worked Example and Common Traps

Consider a claims operations app that must answer policy questions, retrieve supporting documents, and create a follow-up task in another system only when needed. This mixes knowledge retrieval with selective action, so a tool-calling agent with a retrieval tool and a workflow tool fits, while a plain RAG chain cannot decide when to call external systems. Two safety-flavored traps recur. First, a tool with real side effects, such as a refund API with financial impact, should require explicit confirmation or human-in-the-loop approval before it runs. Second, offline agent evaluation must capture tool choice, tool arguments, and final task success, not just final wording, because an agent can fail by picking the wrong tool or passing bad parameters even when the prose reads well. Together these rules explain the exam's bias: prefer the simplest architecture that meets the requirement, then add agentic power deliberately and with guardrails.
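Both traps can be made concrete in a short sketch: a gate that blocks side-effect tools until they are confirmed, and a scorer that grades a recorded trace on tool choice, arguments, and task success separately. The names `SIDE_EFFECT_TOOLS`, `approve_call`, `Trace`, and `score_trace` are hypothetical and are not part of Agent Evaluation or any Databricks API.

```python
from dataclasses import dataclass
from typing import Any, Dict

SIDE_EFFECT_TOOLS = {"issue_refund"}  # hypothetical tool with financial impact


def approve_call(tool: str, confirmed: bool) -> bool:
    # Side-effect tools run only after explicit confirmation; others run freely.
    return confirmed or tool not in SIDE_EFFECT_TOOLS


@dataclass
class Trace:
    tool: str
    arguments: Dict[str, Any]
    task_succeeded: bool


def score_trace(expected: Trace, actual: Trace) -> Dict[str, bool]:
    # Grade each dimension on its own so a fluent answer cannot hide a bad call.
    tool_ok = expected.tool == actual.tool
    return {
        "tool_choice": tool_ok,
        "tool_arguments": tool_ok and expected.arguments == actual.arguments,
        "task_success": actual.task_succeeded,
    }
```

Scoring the three dimensions separately surfaces cases where the right tool was called with the wrong claim ID, which a check on final wording alone would miss.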

Databricks Generative AI Engineer Associate Certification

Databricks Generative AI Engineer Associate

4.4 Tools, Agents & Function Calling

