3.2 Context and Reliability
Key Takeaways
- Context windows are budgets; include the most relevant information first.
- Prompt caching can cut cost and latency for repeated static context, provided the model and the request structure support it.
- Evals should test known success cases and known failure cases.
- Retries, timeouts, guardrails, and fallbacks are production architecture controls.
Context Management
Every model call has a context budget. Put stable instructions and high-value context where the model can use them. Remove stale or irrelevant text. When facts can change, retrieve from an authoritative source instead of embedding old assumptions.
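One way to picture the budget idea is a small packing routine. The sketch below is illustrative only: `Snippet`, `estimate_tokens`, and `pack_context` are hypothetical names, and the word-count token estimate is a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    relevance: float  # hypothetical score; higher means more relevant


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def pack_context(instructions: str, snippets: list[Snippet], budget: int) -> list[str]:
    """Keep stable instructions first, then add snippets by relevance while they fit."""
    parts = [instructions]
    used = estimate_tokens(instructions)
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if used + cost > budget:
            continue  # does not fit; a smaller, less relevant snippet may still fit
        parts.append(snippet.text)
        used += cost
    return parts
```

Snippets that do not fit are dropped rather than truncated, which keeps each included passage intact. A real system might summarize or re-retrieve instead.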
Reliability Controls
| Control | Use for |
|---|---|
| Eval set | Measure expected behavior |
| Regression test | Catch behavior drift |
| Retry | Handle transient failure |
| Timeout | Bound latency |
| Fallback | Degrade gracefully |
| Guardrail | Reduce unsafe behavior |
| Monitoring | Observe production outcomes |
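Several of the controls in the table can be combined in one wrapper. The following is a minimal sketch, not a production implementation: `TransientError` and `call_with_controls` are hypothetical names, and it assumes the primary callable enforces its own deadline and raises `TimeoutError` when that deadline passes.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_controls(
    primary: Callable[[], str],
    fallback: Callable[[], str],
    retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry transient failures with exponential backoff, then degrade to a fallback."""
    for attempt in range(retries + 1):
        try:
            # Assumption: primary() bounds its own latency and raises TimeoutError.
            return primary()
        except (TransientError, TimeoutError):
            if attempt < retries:
                sleep(base_delay * (2 ** attempt))
    # All attempts failed with retryable errors: degrade gracefully.
    return fallback()
```

Only errors marked as transient are retried. Anything else propagates to the caller, so a permanent failure such as a bad request is not hidden behind repeated attempts.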
Evaluation Habit
Do not evaluate only happy paths. Include difficult prompts, tool failures, invalid JSON, stale retrieval, ambiguous user requests, and safety-sensitive scenarios. A system that works only on demo inputs is not production-ready.
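An eval set that follows this habit mixes success cases with known failure cases. The sketch below assumes a hypothetical `parse_model_output` function as the system under test and a hypothetical reply format with an `answer` field. The point is the shape of the case list, not the parser itself.

```python
import json


def parse_model_output(raw: str) -> dict:
    """Hypothetical system under test: parse a model reply, degrading on bad input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "invalid_json"}
    if not isinstance(data, dict) or "answer" not in data:
        return {"status": "error", "reason": "missing_answer"}
    return {"status": "ok", "answer": data["answer"]}


# Each case: (name, raw model reply, expected result).
EVAL_CASES = [
    ("happy path", '{"answer": "42"}', {"status": "ok", "answer": "42"}),
    ("invalid JSON", '{"answer": ', {"status": "error", "reason": "invalid_json"}),
    ("missing field", '{"note": "hi"}', {"status": "error", "reason": "missing_answer"}),
    ("non-object reply", "[1, 2]", {"status": "error", "reason": "missing_answer"}),
]


def run_evals(cases) -> list[str]:
    """Return the names of the cases whose actual result differs from the expected one."""
    return [name for name, raw, expected in cases if parse_model_output(raw) != expected]
```

Running `run_evals(EVAL_CASES)` after every prompt or model change turns the same list into a regression test: any name it returns marks behavior that has drifted.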
Production Trade-off
Reliability is not free. More retrieval can improve factuality but add latency. Stronger models may improve reasoning but raise cost. More tools can expand capability but increase permission risk. Exam scenarios often ask you to balance those trade-offs instead of maximizing one metric blindly.
Reliability Habit
State the testable failure mode before choosing a control. That turns vague quality goals into observable behavior a team can measure.
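As a rough illustration, a failure mode can be written down as data before any control is built. Everything here is hypothetical: the `FailureMode` class, the example description, and the 2% threshold are assumptions chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str  # what goes wrong, stated in observable terms
    control: str      # the control chosen to address it
    max_rate: float   # highest acceptable failure rate (assumed threshold)

    def is_violated(self, failures: int, total: int) -> bool:
        """Return True when the observed failure rate exceeds the threshold."""
        if total <= 0:
            raise ValueError("total must be positive")
        return failures / total > self.max_rate


# Hypothetical example: the threshold is illustrative, not a recommendation.
upstream_timeout = FailureMode(
    description="Model call exceeds its latency deadline",
    control="Timeout plus fallback",
    max_rate=0.02,
)
```

With the failure mode stated this way, monitoring data can be checked directly, for example `upstream_timeout.is_violated(failures=5, total=100)`.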
Which reliability control is most directly used to catch behavior drift before deployment?