3.2 Context and Reliability

Key Takeaways

  • Context windows are budgets; include the most relevant information first.
  • Prompt caching reduces cost and latency for repeated static context, provided the model supports it and the request meets the provider's caching rules.
  • Evals should test known success cases and known failure cases.
  • Retries, timeouts, guardrails, and fallbacks are production architecture controls.
Last updated: June 2026

Context Management

Every model call has a context budget. Put stable instructions and high-value context where the model can use them. Remove stale or irrelevant text. When facts can change, retrieve from an authoritative source instead of embedding old assumptions.
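As a rough illustration of treating the context window as a budget, the sketch below keeps the stable instructions first and then adds retrieved items in order of relevance until the budget is spent. The names ContextItem, estimate_tokens, and build_context are hypothetical, and the word-count token estimate is a stand-in assumption; a real system would count tokens with the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    relevance: float  # hypothetical score: higher means more useful for this request


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    # A real system would use the model's own tokenizer.
    return len(text.split())


def build_context(instructions: str, items: list[ContextItem], budget: int) -> str:
    """Keep stable instructions first, then add items by descending relevance."""
    used = estimate_tokens(instructions)
    if used > budget:
        raise ValueError("instructions alone exceed the context budget")
    parts = [instructions]
    for item in sorted(items, key=lambda i: i.relevance, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost > budget:
            continue  # skip what does not fit; a smaller item may still fit
        parts.append(item.text)
        used += cost
    return "\n\n".join(parts)
```

Sorting by relevance before packing means that when something has to be dropped, it is the lowest-value text rather than whatever happened to arrive last.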

Reliability Controls

Control           Use for
Eval set          Measure expected behavior
Regression test   Catch behavior drift
Retry             Handle transient failure
Timeout           Bound latency
Fallback          Degrade gracefully
Guardrail         Reduce unsafe behavior
Monitoring        Observe production outcomes
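Three of the controls in the table (retry, timeout, fallback) are often combined around a single call. The following is a minimal sketch under stated assumptions: call_with_controls is a hypothetical helper, primary is any callable that accepts a timeout in seconds and honors it, and only TimeoutError and ConnectionError are treated as transient.

```python
import time


def call_with_controls(primary, fallback, *, timeout_s=2.0, max_retries=2, backoff_s=0.1):
    """Try primary with a per-attempt timeout, retry transient failures, then fall back."""
    for attempt in range(max_retries + 1):
        try:
            # Assumption: the callable enforces the timeout itself,
            # the way most HTTP clients accept a timeout argument.
            return primary(timeout_s)
        except (TimeoutError, ConnectionError):
            if attempt < max_retries:
                # Exponential backoff between attempts: 1x, 2x, 4x, ...
                time.sleep(backoff_s * 2 ** attempt)
    # Every attempt failed: degrade gracefully instead of raising.
    return fallback()
```

Errors outside the transient set propagate immediately, since retrying a request that is invalid for a permanent reason only adds latency.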

Evaluation Habit

Do not evaluate only happy paths. Include difficult prompts, tool failures, invalid JSON, stale retrieval, ambiguous user requests, and safety-sensitive scenarios. A system that works only on demo inputs is not production-ready.
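One way to make this habit concrete is an eval set that pairs a success case with several known failure cases. The sketch below is illustrative only: parse_model_output is a hypothetical component that expects a JSON object with an "answer" key, and the cases are invented for the example.

```python
import json


def parse_model_output(raw: str) -> dict:
    """Hypothetical system under test: parse a reply that should be a JSON object with an "answer" key."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "invalid_json"}
    if not isinstance(data, dict) or "answer" not in data:
        return {"status": "error", "reason": "missing_answer"}
    return {"status": "ok", "answer": data["answer"]}


# Each case: (name, raw model output, expected result). Failure cases outnumber the happy path.
EVAL_CASES = [
    ("happy path", '{"answer": "42"}', {"status": "ok", "answer": "42"}),
    ("invalid JSON", '{"answer": ', {"status": "error", "reason": "invalid_json"}),
    ("wrong shape", '["42"]', {"status": "error", "reason": "missing_answer"}),
    ("missing field", '{"note": "n/a"}', {"status": "error", "reason": "missing_answer"}),
]


def run_evals(cases=EVAL_CASES):
    """Return the names of failing cases; an empty list means every case passed."""
    return [name for name, raw, expected in cases if parse_model_output(raw) != expected]
```

Rerunning the same cases after every prompt or model change turns the eval set into a regression test.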

Production Trade-off

Reliability is not free. More retrieval can improve factuality but add latency. Stronger models may improve reasoning but raise cost. More tools can expand capability but increase permission risk. Exam scenarios often ask you to balance those trade-offs instead of maximizing one metric blindly.
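Balancing these trade-offs can be framed as choosing the best option that still satisfies hard limits, rather than maximizing one metric. The sketch below assumes a hypothetical Config record and a pick_config helper; any figures passed to it are placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    quality: float        # e.g. eval pass rate between 0 and 1 (placeholder values)
    latency_ms: int       # typical response latency
    cost_per_call: float  # in whatever currency unit the team tracks


def pick_config(configs, max_latency_ms, max_cost):
    """Return the highest-quality config within both limits, or None if none qualifies."""
    feasible = [
        c for c in configs
        if c.latency_ms <= max_latency_ms and c.cost_per_call <= max_cost
    ]
    return max(feasible, key=lambda c: c.quality, default=None)
```

Treating latency and cost as constraints and quality as the objective is only one possible framing; a team could equally fix a quality floor and minimize cost.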

Reliability Habit

State the testable failure mode before choosing a control. That turns vague quality goals into observable behavior a team can measure.

Test Your Knowledge

Which reliability control is most directly used to catch behavior drift before deployment?

A
B
C
D