In a retrieval-augmented chain, which component combines the user's question and the retrieved passages into the final input sent to the model?

A prompt template or prompt-building step. The prompt-building step assembles the user request and retrieved context into the model-ready input. An output parser runs later, after the model responds; the embedding model only turns text into vectors upstream; and a secret scope just stores credentials.

A contract-extraction workflow occasionally omits required JSON keys, breaking downstream code. Which change is most likely to stabilize the pipeline?

Use schema-constrained output with validation. A defined schema plus validation makes the output predictable and catches malformed responses before they reach downstream logic, which is far more reliable than hoping prompt wording keeps formatting consistent. A larger window, a friendlier tone, or extra documents do not enforce the required fields.

A prompt workflow always follows the same sequence: retrieve context, build a prompt, call the model, and return an answer. Which orchestration style fits best?

A chain. A chain suits fixed, predictable steps that need no dynamic tool selection or looping, and it is easier to test and optimize than an agent for a straightforward RAG flow. An autonomous agent adds unnecessary flexibility, cost, and evaluation difficulty when the path never changes.

Prompt Templates & Chains — Free Study Guide 2026

Prompt Templates: The Assembly Point

A prompt template is a reusable scaffold with placeholders that the application fills at run time. In a retrieval-augmented chain it is the prompt-building step that combines the user's question with the retrieved passages to form the final model-ready input. This is a frequently tested distinction: the component that merges question and context is the prompt template, not the embedding model (which turns text into vectors upstream), not a secret scope (which stores credentials), and not an output parser. The output parser runs later in the pipeline, after the model has already produced a response, to turn that text into a structured object.

Two layers of prompt matter. The system prompt sets persistent behavior — role, rules, output format, safety constraints — that should hold across turns. The user prompt carries the per-request instruction plus the retrieved context. Put stable instructions in the system prompt so they are not accidentally overridden by user input or by injected context.

Passing Retrieved Context Well

How you place context inside the template strongly affects grounding. The practices the exam rewards:

Delimit and position: place retrieved context close to the question with clear delimiters, and instruct the model to ground its answer in it. Dumping passages after the question with no delimiters, or shuffling them randomly each call, weakens the model's ability to use the evidence.
Ground and abstain: instruct the model to answer only from the provided context and to say it does not know when the answer is unsupported.
Carry source metadata: pass each chunk's source alongside its text and instruct the model to cite it, so answers include verifiable citations.
Inject user fields in the prompt: values like account tier or product SKU shape the answer only when they appear in the prompt the model sees; storing them only in index metadata helps retrieval but never reaches generation.

When retrieved passages overflow the context budget, rank and select the top passages (optionally summarizing) rather than concatenating everything or truncating by position.

Multi-Step Chains

A chain is a fixed sequence of calls whose control flow is known at design time. A prompt workflow that always runs retrieve context, build prompt, call model, return answer is a textbook chain — easier to reason about, test, and optimize than an autonomous agent, and the right choice when no dynamic tool selection or looping is required. Use an agent only when the path depends on intermediate results; do not graduate to one when a deterministic chain solves the problem.

Modern LangChain expresses chains compositionally, piping a prompt template into a model into an output parser (prompt | llm | parser), with a passthrough carrying the original question alongside the retrieved context. The mental model is a pipeline of typed steps, each independently testable. Separating retrieval, prompt assembly, generation, and post-processing into explicit components also lets you measure and swap each stage independently, which is why monolithic prompts are discouraged — a regression could come from any hidden stage with no clear boundary.

For tasks that fail in one shot, decompose the reasoning: ask the model to work step by step (chain-of-thought) or split the task into intent classification, retrieval, and answer generation so you can reserve an expensive model only for the step that needs it. When only a few labeled examples exist, few-shot prompting (showing input-output pairs) steers format and behavior most effectively without any training.

Output Parsers and Structured Responses

The output parser is the final stage that converts the model's raw text into something downstream code can use — a string, a JSON object, or a validated typed record. The dominant exam theme is structured output: when a downstream workflow expects every response to contain fixed fields (for example status, priority, and owner, or extracted invoice fields for SQL joins), ask the model for a fixed schema rather than free-form prose. A schema gives downstream code something predictable to validate and consume; it makes integration far more robust than parsing prose, though it does not guarantee zero hallucinations or always reduce token cost.

When an extraction pipeline intermittently omits required JSON keys, the fix is schema-constrained output with validation, which catches malformed responses before they break downstream logic — more reliable than hoping prompt wording alone keeps formatting consistent. Pair structured output with an offline JSON-parseable / schema-conformance metric so you measure how often responses actually validate.

Component	Role	When it runs
Prompt template	Assemble question + context + user fields	Before the model call
Chain	Sequence the fixed steps deterministically	Orchestrates the whole flow
Output parser	Convert response text to a structured object	After the model call

A Worked Example: Support-Ticket Router

Consider a chain that routes support tickets. The retriever pulls similar past tickets; the prompt template injects those examples with delimiters, states the routing rules, and lists the exact fields required; ChatDatabricks generates; and a JSON output parser returns {status, priority, owner}. Because a downstream queue expects those three machine-readable fields, the template demands a fixed schema and the parser validates it, so a malformed row is caught rather than silently corrupting the queue. If the model sometimes emits prose alongside the JSON, an explicit return only valid JSON instruction plus schema validation stabilizes it far better than rewording the tone. Each stage is measurable in isolation: retrieval relevance, schema-conformance rate, and routing accuracy are tracked separately so you can localize a regression instead of guessing which step failed.

Finally, treat prompts as governed assets: Databricks can version prompts in Unity Catalog, promote them with aliases, and let subject-matter experts edit them without code changes, so prompt iteration follows the same controlled release path as models and indexes.

Databricks Generative AI Engineer Associate Certification

Databricks Generative AI Engineer Associate

4.3 Prompt Templates & Chains

Key Takeaways

Prompt Templates: The Assembly Point

Passing Retrieved Context Well

Multi-Step Chains

Output Parsers and Structured Responses

A Worked Example: Support-Ticket Router

Databricks Generative AI Engineer Associate Certification

1Introduction & Exam Strategy

2Design Applications

3Data Preparation

4Application Development

5Assembling & Deploying Applications

6Governance, Evaluation & Monitoring

Databricks Generative AI Engineer Associate

4.3 Prompt Templates & Chains

Key Takeaways

Prompt Templates: The Assembly Point

Passing Retrieved Context Well

Multi-Step Chains

Output Parsers and Structured Responses

A Worked Example: Support-Ticket Router