In a RAG app, users complain that the model confidently answers questions that are not covered by the retrieved context. Which prompt update is best?

Instruct the model to answer only from the retrieved context and to say it does not know when the context lacks the answer. The failure is ungrounded generation, and the direct, low-cost fix is a grounding instruction telling the model to use only the retrieved context and to abstain when the answer is absent. Raising temperature worsens it, removing the system prompt removes constraints, and fine-tuning is heavy and does not force grounding.

Prompt outputs drift between runs for the same task, and the team wants more consistency. Which change usually helps first?

Lower the temperature toward 0. Run-to-run variation comes from sampling randomness, which temperature controls, so lowering temperature toward 0 makes outputs more deterministic. Few-shot examples address format, a larger model does not fix randomness, and chain-of-thought adds tokens without improving consistency.

You need an LLM to return a stable JSON object with fields issue_type, priority, and needs_human_review. Which prompt change is most effective?

Specify the exact JSON schema and include an example of the desired output. Reliable structured output comes from stating the exact schema and showing a concrete example so the model mirrors it. Higher temperature increases variability, prose reasoning pollutes the payload, and simply relocating the request does not enforce the shape.

Prompt Engineering Strategies — Free Study Guide 2026

Prompt Engineering for Reliable, Grounded Output

Prompt engineering is the cheapest lever in the GenAI toolbox: no training, no infrastructure, and instant iteration. The exam treats prompting as a design skill, matching a prompting technique to a requirement and knowing the failure modes each technique introduces. Databricks' AI Playground is the recommended place to iterate prompts quickly before wiring them into a chain or endpoint.

System prompts versus user prompts

Every chat request is built from roles. The system prompt sets durable behavior, including persona, rules, tone, output format, and grounding constraints, and it persists across the conversation. The user prompt carries the specific request or question for that turn. Put stable instructions such as 'You are an HR assistant; answer only from the provided context and cite the source' in the system prompt, and the variable question in the user prompt. A common design error is stuffing everything into one user message, which makes rules easy for the model to ignore and hard to reuse across requests.

Zero-shot, few-shot, and chain-of-thought

Zero-shot: instruction only, no examples. Best for simple, unambiguous tasks and lowest token cost.
Few-shot: include a handful of input-output examples in the prompt. Best when you need a specific format, label set, or style the model keeps missing; examples teach by demonstration without any training.
Chain-of-thought (CoT): ask the model to reason step by step before answering. Improves accuracy on multi-step reasoning and math, at the cost of more tokens and latency. For structured or classification tasks you usually do not want visible reasoning in the final payload.

Choosing among them is a trade-off: zero-shot is cheapest and fastest, few-shot buys reliability of format, and CoT buys reasoning accuracy but costs tokens. Match the technique to what is actually failing rather than applying the most elaborate option by default.

A worked example clarifies the escalation. Suppose a classifier must return one of three sentiment labels. Start zero-shot with a clear instruction; if the model occasionally invents a fourth label or wraps the answer in a sentence, add three few-shot examples showing exactly the label-only output you want, which usually fixes it without any code change. Only if the task also needed multi-step judgment, say weighing conflicting signals, would you add chain-of-thought - and even then you would keep the reasoning internal and return just the label. Escalate technique only as far as the observed failure requires.

Structured output

Many production steps must return machine-readable output, such as JSON or XML that downstream code parses. To make output stable, do three things: state the exact schema and field names, show a concrete example of the desired output, and instruct the model to return only that structure with no surrounding prose. To get a stable JSON object with fields like issue_type, priority, and needs_human_review, the most effective single change is to specify the schema explicitly and include an example, not to raise temperature or add more prose. The same holds for a strict XML envelope such as a summary tag followed by a tags tag: the most reliable prompt shows the exact template and a filled-in example so the model mirrors it. Where the platform supports it, structured-output or function-calling constraints enforce the schema even more reliably than instructions alone.

Prompt templates

A prompt template is a reusable scaffold with placeholders, for example a system template plus a user template that inserts the retrieved context and the question and instructs the model to answer only from that context and cite its sources. Templates are how RAG chains inject retrieved context at run time and how teams keep prompts version-controlled and testable. On the exam, the minimum RAG chain is a prompt template, the retrieved context, and the LLM - the template is what binds the question to the context.

Failure modes and mitigations

The exam loves 'users complain that X; which prompt change helps?' Learn these pairs:

Failure mode	Symptom	Primary mitigation
Hallucination / ungrounded answer	Confident answers outside the retrieved context	Instruct 'answer only from the provided context; if it is not there, say you do not know'; ground and cite
Format drift	Output shape varies; JSON breaks parsers	Specify schema plus an example; use structured output or function calling
Run-to-run inconsistency	Same input yields different answers	Lower temperature toward 0 for deterministic tasks
Verbosity / leaked reasoning	Extra prose around the payload	Ask for only the structure; keep CoT out of the returned object
Ignored rules	Model disregards stated constraints	Move rules into the system prompt; make them explicit and ordered
Prompt injection	Retrieved or user text overrides instructions	Separate instructions from data; add guardrails (Governance domain)

Two of these appear again and again. First, grounding: in a RAG app the fix for confident off-context answers is a prompt instruction to use only retrieved context and to abstain when the answer is not present - a prompt fix, not a model swap. Second, temperature: when outputs drift between runs and the team wants consistency, the first move is to lower the temperature, which reduces sampling randomness; leave few-shot examples and schema for format problems, not consistency problems. Distinguishing 'format is wrong' (schema and examples) from 'answer varies' (temperature) from 'answer is invented' (grounding) is exactly the discrimination the exam is testing.

Iterate empirically

Prompt engineering is empirical. Change one variable at a time, test against representative inputs in the AI Playground, and keep the prompt under version control so you can evaluate and roll back. This connects directly to the CI/CD-for-prompts theme that the March 18, 2026 blueprint update emphasizes.

Databricks Generative AI Engineer Associate Certification

Databricks Generative AI Engineer Associate

2.2 Prompt Engineering Strategies

Key Takeaways

Prompt Engineering for Reliable, Grounded Output

System prompts versus user prompts

Zero-shot, few-shot, and chain-of-thought

Structured output

Prompt templates

Failure modes and mitigations

Iterate empirically

Databricks Generative AI Engineer Associate Certification

1Introduction & Exam Strategy

2Design Applications

3Data Preparation

4Application Development

5Assembling & Deploying Applications

6Governance, Evaluation & Monitoring

Databricks Generative AI Engineer Associate

2.2 Prompt Engineering Strategies

Key Takeaways

Prompt Engineering for Reliable, Grounded Output

System prompts versus user prompts

Zero-shot, few-shot, and chain-of-thought

Structured output

Prompt templates

Failure modes and mitigations

Iterate empirically