5.3 Model Selection: Capability, Latency, Cost, and Risk

Key Takeaways

  • Model selection balances capability, modality, context needs, latency, throughput, cost, safety, data handling, and operational support.
  • The largest or most capable model is not automatically the best choice if a smaller model meets the business quality bar at lower cost and latency.
  • AWS practitioners should distinguish Amazon Bedrock foundation models, Amazon Q applications, managed AI services, SageMaker AI paths, and non-AI alternatives.
  • Riskier use cases need stronger grounding, guardrails, evaluation, monitoring, and human review before model approval.
Last updated: May 2026

A Decision Framework For Model Choice

Model selection starts with the job to be done. A chatbot that answers employee policy questions, a document classifier, a sales email drafter, and an image moderation workflow have different requirements. Do not start by asking for the biggest model. Start by asking what the user needs, what data is allowed, how fast the response must be, how much variation is acceptable, and what happens if the output is wrong.

Amazon Bedrock gives teams access to foundation models through a managed service. Models differ in text capability, image support, embedding availability, context window, language support, latency, cost, safety behavior, and customization options. A practitioner does not need to memorize every provider detail, but should know that model choice affects both quality and operations.

Requirement | Model or service question | AWS direction
General text generation | Does the task need broad language capability? | Compare suitable Amazon Bedrock text models.
Semantic search or RAG | Do we need embeddings for retrieval? | Use an embedding model with a supported vector store.
Enterprise assistant | Is the main need a managed business assistant? | Consider Amazon Q where it fits the use case.
Document extraction | Is there a purpose-built managed service? | Consider Amazon Textract before a generic FM.
Sentiment or entity detection | Is standard NLP enough? | Consider Amazon Comprehend.
Custom ML prediction | Is this a classic ML problem with data and labels? | Consider SageMaker AI or SageMaker Canvas.
Deterministic rule | Must the result always follow exact rules? | Use workflow logic, queries, or rules instead of GenAI.
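The routing logic in the table above can be sketched as a simple lookup. This is an illustrative helper, not an AWS API; the function name `route_requirement` and the `ROUTES` keys are invented for the example, while the directions come from the table.

```python
# Hypothetical sketch: map a requirement category to the first AWS option to
# evaluate, per the table above. Not an AWS API; names are illustrative.
ROUTES = {
    "general_text_generation": "Compare suitable Amazon Bedrock text models",
    "semantic_search_or_rag": "Embedding model with a supported vector store",
    "enterprise_assistant": "Amazon Q",
    "document_extraction": "Amazon Textract",
    "sentiment_or_entities": "Amazon Comprehend",
    "custom_ml_prediction": "SageMaker AI or SageMaker Canvas",
    "deterministic_rule": "Workflow logic, queries, or rules (no GenAI)",
}

def route_requirement(requirement: str) -> str:
    """Return the first-look AWS direction for a requirement key."""
    return ROUTES.get(requirement, "Re-examine the use case before choosing a model")
```

The point of the lookup is the order of consideration: purpose-built services and ordinary automation are checked before reaching for a generic foundation model.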

Capability includes more than intelligence. A model must support the input and output modality: text, image, embeddings, or another format. It must support the needed language, context length, tool use pattern, and customization method. It should also be available in the intended AWS Region and meet the organization's data handling and compliance expectations.

Latency matters when the user is waiting. A customer chat response may need to arrive quickly. A nightly report summary can run more slowly. Larger prompts, larger models, longer outputs, and retrieval steps can increase response time. If a smaller model meets the quality bar, it may be the better production choice.

Cost is not only model price. It includes input tokens, output tokens, retrieval storage and search, orchestration, monitoring, evaluation, human review, and engineering support. A model that produces overly long outputs can be expensive even when the per-token rate looks acceptable. Prompt templates should limit output to what the workflow actually needs.
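The output-length effect is easy to see with per-token arithmetic. The prices below are invented placeholders, not quotes for any real model; the sketch only shows how output tokens, usually priced higher than input tokens, dominate the bill when responses run long.

```python
def estimate_request_cost(input_tokens: int, output_tokens: int,
                          input_price_per_1k: float,
                          output_price_per_1k: float) -> float:
    """Estimate per-request model cost from token counts and per-1K-token prices."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Same prompt, same (hypothetical) prices; only the response length differs.
concise = estimate_request_cost(1500, 200, 0.003, 0.015)   # 0.0075
verbose = estimate_request_cost(1500, 2000, 0.003, 0.015)  # 0.0345
```

A 10x longer output here makes the request more than four times as expensive, which is why prompt templates should constrain output length to what the workflow needs.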

Risk changes the approval standard. A low-risk internal brainstorming tool can tolerate more variation than a customer-facing agent answering medical, legal, financial, safety, or employment questions. Higher-risk scenarios need stronger grounding, Amazon Bedrock Guardrails where applicable, human review, logging, and clear escalation rules. Some scenarios should be rejected or redesigned rather than approved.
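The "approval standard scales with risk" idea can be expressed as a small policy check. The tier names and control lists below are a hypothetical policy sketch, not an AWS feature or a prescribed standard.

```python
# Hypothetical approval policy: each risk tier requires a superset of the
# controls below it. Tier names and control labels are illustrative.
REQUIRED_CONTROLS = {
    "low": {"evaluation_set", "monitoring"},
    "medium": {"evaluation_set", "monitoring", "grounding", "guardrails"},
    "high": {"evaluation_set", "monitoring", "grounding", "guardrails",
             "human_review", "logging", "escalation_rules"},
}

def approve(risk_tier: str, controls_in_place: set) -> bool:
    """Approve only when every control required for the tier is present."""
    required = REQUIRED_CONTROLS.get(risk_tier)
    if required is None:
        return False  # unknown tier: reject or redesign, never default-approve
    return required <= controls_in_place  # subset check
```

Rejecting unknown tiers by default mirrors the text: when a scenario does not fit an approved pattern, the right move is redesign, not quiet approval.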

Model selection workflow:

  1. State the use case, user audience, and decision impact.
  2. Identify data sources, sensitivity, and whether outputs are customer-facing.
  3. Decide whether a managed AI service, Amazon Q, Amazon Bedrock, SageMaker AI, or ordinary automation best fits.
  4. Create a small evaluation set with representative cases and edge cases.
  5. Test candidate models against quality, latency, cost, and safety criteria.
  6. Choose the simplest model and service design that meets the acceptance bar.
  7. Document fallback behavior, monitoring, and the review cadence.
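Steps 4 through 6 of the workflow above can be sketched as a selection function: score candidates against acceptance thresholds, then pick the simplest (here, cheapest) one that clears the bar. The candidate metrics and thresholds are invented for illustration; in practice they come from the evaluation set in step 4.

```python
# Illustrative candidate scores; in practice these come from measured
# evaluation runs, not hard-coded values.
CANDIDATES = [
    {"name": "small-model", "quality": 0.86, "p95_latency_ms": 400,
     "cost_per_1k_requests": 2.0},
    {"name": "large-model", "quality": 0.93, "p95_latency_ms": 1200,
     "cost_per_1k_requests": 9.0},
]

def select_model(candidates, min_quality, max_latency_ms):
    """Return the cheapest candidate meeting quality and latency bars, or None."""
    passing = [c for c in candidates
               if c["quality"] >= min_quality
               and c["p95_latency_ms"] <= max_latency_ms]
    return min(passing, key=lambda c: c["cost_per_1k_requests"], default=None)
```

With a 0.85 quality bar and an 800 ms latency budget, the smaller model wins despite the larger model's higher quality score; raise the bar beyond what any candidate meets and the function returns None, signaling a redesign rather than a forced choice.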

A common practitioner mistake is using a foundation model for every AI-related need. If the task is to extract tables from invoices, Amazon Textract may be a better first option. If the task is to translate text, Amazon Translate may be a better fit. If business users need no-code predictions from tabular data, SageMaker Canvas may be appropriate. If the task is an enterprise assistant with access to company systems, Amazon Q may fit better than building from scratch.

SageMaker AI matters when the organization needs to build, train, customize, evaluate, or deploy ML models beyond a simple managed API pattern. That path can be powerful, but it brings more responsibility for data preparation, lifecycle management, deployment, monitoring, and model governance. The target exam candidate is not expected to implement those pipelines, but should recognize when that path is heavier than the problem requires.

Good model selection produces a defendable decision. The answer should say why the selected model or service fits the data, task, latency, cost, and risk. It should also say what was not selected and why. That judgment is central to practitioner-level AWS AI work.

Test Your Knowledge

A team wants fast, low-cost internal summaries and a smaller model meets the measured quality bar. What is the best model-selection decision?

Test Your Knowledge

A workflow needs to extract structured text and tables from scanned forms. Which first AWS option should a practitioner consider before a generic foundation model?

Test Your Knowledge

Which factor most increases the approval requirements for a model used in production?
