4.7 Generative AI Foundations Case Lab
Key Takeaways
- A complete GenAI review connects the business goal, model capability, context design, inference settings, risk controls, and success metrics.
- The same idea can point to different AWS services depending on whether the team needs a packaged assistant, a custom Bedrock application, a managed AI service, or a non-AI workflow.
- Case analysis should identify hallucination, prompt injection, data leakage, context quality, cost, latency, and human-review requirements before launch.
- Official AWS practice resources should be used for exam readiness, while local study questions should remain practice checks rather than claims about live exam content.
Case scenario: internal policy assistant
A regional services company wants an internal assistant that helps employees find answers in HR, IT, travel, and security policies. Employees complain that search results are slow and inconsistent. Managers want faster answers, fewer support tickets, and a better onboarding experience. The assistant should answer in plain language, cite the source policy when possible, and tell users when the answer is not available. It should not make final approval decisions, reveal restricted documents, or provide legal advice.
This is a reasonable generative AI candidate because the work involves natural language, summarization, and retrieval across unstructured documents. It is not a pure model problem. The source documents must be current, permissioned, and owned. The application must decide how much conversation history to send, which chunks to retrieve, how to format answers, and what to do when sources conflict. The assistant needs controls because HR and security policies may contain sensitive information.
A packaged assistant such as Amazon Q Business may be worth evaluating if the company wants a managed enterprise assistant experience. A custom application using Amazon Bedrock may fit if the company needs more control over prompts, retrieval, user interface, workflows, or guardrails. A narrow managed AI service may fit subproblems, such as extracting text from uploaded documents with Amazon Textract. A non-AI search improvement may still be a valid baseline if the primary problem is messy metadata or outdated content.
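The service-selection reasoning above can be sketched as a small decision helper. This is purely illustrative: the keys in `needs` are made-up booleans, not an official rubric, and a real evaluation would weigh cost, data residency, and team skills as well.

```python
def suggest_service_path(needs: dict) -> str:
    """Map coarse requirements to the AWS service paths discussed above.

    The `needs` keys are illustrative flags, not an official AWS rubric.
    """
    if not needs.get("requires_generation", False):
        # If the real problem is messy metadata or stale content,
        # a non-AI search improvement is a valid baseline.
        return "non-AI search improvement"
    if needs.get("custom_prompts_or_workflows", False):
        # More control over prompts, retrieval, UI, and guardrails.
        return "custom application on Amazon Bedrock"
    if needs.get("narrow_subproblem_only", False):
        # e.g., extracting text from uploads with Amazon Textract.
        return "managed AI service for a subproblem"
    # Default: a managed enterprise assistant experience.
    return "packaged assistant such as Amazon Q Business"

print(suggest_service_path({"requires_generation": True}))
```

The ordering of the checks encodes the chapter's framing: rule out the non-AI baseline first, then ask how much control the team actually needs.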
Case workbook
| Review area | Case answer | Risk to check |
|---|---|---|
| Business goal | Reduce policy-search friction and support tickets | Success metric may be vague without baseline ticket data |
| Model capability | Text question answering and summarization | Model may answer beyond policy if not constrained |
| Context design | Retrieve approved policy chunks before generation | Stale, conflicting, or unauthorized content can mislead |
| Inference controls | Low temperature, capped output length, source-focused prompt | Overly long or creative answers can reduce trust |
| Hallucination plan | Refuse unsupported answers and cite sources | Users may still treat fluent text as official approval |
| Security plan | IAM, permission-aware retrieval, encryption, logging, and review | Sensitive HR or security documents could leak |
| Operations | Monitor feedback, failed answers, cost, and latency | Launch quality can degrade as policies change |
This workbook is the style of thinking the certification expects. You are not designing a transformer or tuning model weights. You are deciding whether the solution matches the business need and whether the AWS service path is sensible. You are also identifying which risks must be handled before broad use.
Walkthrough and study practice
Start by rewriting the use case in one sentence: Employees need faster, source-backed answers to internal policies, with human ownership of final decisions. That sentence clarifies scope. The assistant is not an autonomous policy approver. It is a retrieval and drafting aid. That distinction affects risk, service fit, and settings.
Next, check the source of truth. HR owns benefits policies, IT owns access procedures, travel owns reimbursement rules, and security owns incident reporting guidance. Each owner should approve what gets indexed and how often it refreshes. Sensitive documents should be reviewed before indexing, with attention to personal data and restricted procedures. If the documents live in Amazon S3, access design, encryption, logging, and sensitive-data discovery should be discussed with services such as IAM, AWS KMS, CloudTrail, CloudWatch, and Amazon Macie where appropriate.
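The owner-approval step above can be expressed as a simple pre-indexing gate: only documents that have a recognized owner, carry that owner's approval, and are not restricted should ever reach the retrieval index. A minimal sketch, with all field names assumed for illustration:

```python
# Hypothetical pre-indexing gate. Field names ("owner", "owner_approved",
# "restricted") are illustrative, not a real AWS schema.
APPROVED_OWNERS = {"HR", "IT", "Travel", "Security"}

def eligible_for_indexing(doc: dict) -> bool:
    """Admit a document only if its owner approved it and it is not restricted."""
    return (
        doc.get("owner") in APPROVED_OWNERS
        and doc.get("owner_approved", False)
        and not doc.get("restricted", False)
    )

docs = [
    {"id": "hr-001", "owner": "HR", "owner_approved": True, "restricted": False},
    {"id": "sec-007", "owner": "Security", "owner_approved": True, "restricted": True},
]
index_queue = [d["id"] for d in docs if eligible_for_indexing(d)]
print(index_queue)  # the restricted security document is excluded
```

Running this kind of gate on every refresh cycle, not just at launch, is what keeps the index aligned with owner approvals as policies change.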
Then decide how the model should behave. For this assistant, low temperature and concise output are likely better than creative variation. Retrieved passages should be presented as evidence, not as instructions that can override the system prompt. The assistant should cite sources, avoid unsupported claims, and route uncertain or high-impact questions to an owner or help desk. Guardrails for Amazon Bedrock can help enforce content boundaries in a Bedrock application, but they should be part of a wider design that includes retrieval permissions, prompt injection tests, and monitoring.
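The behavioral choices above (low temperature, capped output, evidence-as-context rather than evidence-as-instructions) can be sketched as the general shape of an Amazon Bedrock Converse API request body. The model ID is a placeholder, the numeric settings are starting points rather than recommendations, and a production system would add guardrail configuration and permission-filtered retrieval around this:

```python
# A hedged sketch of conservative inference settings, shaped like an
# Amazon Bedrock Converse request body. The model ID is a placeholder.
SYSTEM_PROMPT = (
    "Answer only from the provided policy excerpts. Cite the source "
    "document for each claim. If the excerpts do not contain the "
    "answer, say the answer is not available."
)

def build_request(question: str, retrieved_chunks: list[str]) -> dict:
    # Retrieved passages are framed as evidence, not as instructions,
    # which helps resist prompt injection hidden in documents.
    evidence = "\n\n".join(f"[policy excerpt] {c}" for c in retrieved_chunks)
    return {
        "modelId": "placeholder-model-id",
        "system": [{"text": SYSTEM_PROMPT}],
        "messages": [
            {
                "role": "user",
                "content": [{"text": f"{evidence}\n\nQuestion: {question}"}],
            }
        ],
        # Low temperature and a capped output keep answers focused.
        "inferenceConfig": {"temperature": 0.1, "maxTokens": 400},
    }

req = build_request("How do I report a lost badge?", ["Security Policy: ..."])
```

In a real application this dictionary would be passed to the Bedrock Runtime `converse` operation; here it is only built locally to show where each control lives.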
Finally, define measurements. Useful metrics include deflected support tickets, answer acceptance, source citation rate, unanswered question rate, user feedback, escalation volume, average latency, and token cost per answer. Review these metrics after pilot launch and after major policy updates. If the assistant often refuses valid questions, retrieval may be weak. If it answers unsupported questions too confidently, prompt design, context quality, or guardrails may need work. If cost is high, the team may need shorter prompts, smaller models, better retrieval filters, or clearer limits on conversation history.
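Several of the metrics above fall out of a simple rollup over interaction logs. A minimal sketch, with the log schema and token figures assumed for illustration:

```python
# Illustrative post-launch metric rollup. The log field names and token
# counts are assumptions for this sketch, not a real logging schema.
logs = [
    {"answered": True,  "cited_source": True,  "accepted": True,  "tokens": 350},
    {"answered": True,  "cited_source": False, "accepted": False, "tokens": 500},
    {"answered": False, "cited_source": False, "accepted": False, "tokens": 120},
]

answered = [e for e in logs if e["answered"]]

# Of the questions the assistant answered, how often did it cite a source?
citation_rate = sum(e["cited_source"] for e in answered) / len(answered)

# How often did it correctly decline instead of guessing?
unanswered_rate = sum(not e["answered"] for e in logs) / len(logs)

# Token usage per delivered answer feeds the cost-per-answer metric
# once multiplied by the model's actual per-token price.
tokens_per_answer = sum(e["tokens"] for e in logs) / len(answered)

print(citation_rate, round(unanswered_rate, 2), tokens_per_answer)
```

Tracking these numbers per policy area (HR vs. security, for example) makes it easier to see whether a weak citation rate comes from one owner's stale documents rather than the assistant as a whole.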
For study, pair this case-lab approach with official AWS Skill Builder resources and the official practice question set or practice exam when available. Treat practice questions as a way to test concepts, not as live exam content. The real skill is recognizing patterns: model capability, tokens and context, embeddings and retrieval, inference controls, hallucination, use-case fit, prompt injection, leakage, and governance. If you can explain those patterns in a business scenario, you are studying at the right level for this chapter.
Practice questions
- In the policy-assistant case, why is retrieval important?
- Which inference approach is the most sensible starting point for the internal policy assistant?
- Which metric would best help evaluate whether the assistant is useful after launch?