2.2 Supervised, Unsupervised, and Reinforcement Learning
Key Takeaways
- Supervised learning uses labeled examples and is common for classification and regression business problems.
- Unsupervised learning looks for structure in unlabeled data, such as clusters, segments, or unusual patterns.
- Reinforcement learning learns from rewards and penalties, but it is less common in the everyday project-approval decisions a practitioner makes.
- Learning type affects data readiness, validation strategy, service fit, and whether a no-AI or analytics solution is more appropriate.
Learning Type Is a Data Contract
Supervised learning starts with examples that include the desired answer. A loan record might include whether the customer defaulted, a ticket might include its final category, and a property listing might include its sale price. The model learns the relationship between inputs and labels. If the labels are wrong, inconsistent, biased, or outdated, the model can learn the wrong lesson with high confidence.
Classification is supervised learning when the label is a category. Examples include approve or reject, fraud or not fraud, urgent or routine, and product category. Regression is supervised learning when the label is a number, such as expected delivery days, monthly demand, or claim cost. The practitioner does not need to build the algorithm, but should ask whether labeled history exists and whether the label reflects the current business process.
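The two supervised flavors can be illustrated with a minimal scikit-learn sketch. The toy order history, the `will_return` category label, and the `delivery_days` numeric label below are invented for illustration, not a real dataset:

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical labeled history: [order_value, past_returns] per order.
X = [[20, 0], [25, 1], [30, 0], [180, 4], [200, 5], [220, 6]]
will_return = [0, 0, 0, 1, 1, 1]    # category label -> classification
delivery_days = [2, 2, 3, 5, 6, 6]  # numeric label  -> regression

clf = LogisticRegression().fit(X, will_return)
reg = LinearRegression().fit(X, delivery_days)

category = clf.predict([[210, 5]])[0]  # a class: 0 or 1
estimate = reg.predict([[210, 5]])[0]  # a number: estimated delivery days
```

The same inputs support both tasks; only the label type changes, which is why the first question to ask is what kind of answer the business needs.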
Unsupervised learning starts without a target label. The model searches for useful structure, such as customer segments, similar documents, or unusual transaction patterns. This can be valuable for exploration and decision support, but it does not automatically tell the business what action to take. A cluster called group 3 has no business meaning until analysts interpret it and test whether acting on it improves outcomes.
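Clustering without labels can be sketched in a few lines. The two-dimensional points below are invented stand-ins for customer behavior features:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled records: two behavioral features per customer.
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_  # group ids with no business meaning yet
```

The output is only a set of group ids. Whether "group 0" means loyal customers or dormant accounts is something analysts must determine afterward.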
Reinforcement learning is different. An agent takes actions in an environment and learns from rewards or penalties. It is useful conceptually for optimization, robotics, game-like simulation, and sequential decisions, but most AWS Certified AI Practitioner business scenarios will not require choosing a reinforcement learning architecture. If the organization cannot safely simulate actions and measure rewards, this path is usually not a first choice.
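The reward loop can be shown with a toy two-action bandit in plain Python. The reward probabilities are invented, and real reinforcement learning problems add states and sequential decisions on top of this sketch:

```python
import random

random.seed(0)
true_reward_prob = [0.2, 0.8]  # hypothetical environment: action 1 is better
counts = [0, 0]
values = [0.0, 0.0]            # running estimate of each action's reward

for step in range(2000):
    # Epsilon-greedy: mostly exploit the best estimate, sometimes explore.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = 0 if values[0] >= values[1] else 1
    reward = 1.0 if random.random() < true_reward_prob[action] else 0.0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]  # incremental mean
```

Even this toy requires thousands of safe, measurable trials. If the organization cannot simulate actions and rewards like this, the reinforcement learning path is rarely a first choice.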
| Learning type | Data required | Common business use | AWS implication |
|---|---|---|---|
| Supervised classification | Labeled categories | Ticket routing, fraud flagging, sentiment classes | SageMaker Canvas, SageMaker AI, Amazon Comprehend for text categories |
| Supervised regression | Labeled numeric outcomes | Cost estimate, demand estimate, risk score | SageMaker Canvas or SageMaker AI when custom prediction is needed |
| Unsupervised learning | Unlabeled records | Segmentation, similarity, anomaly exploration | SageMaker AI, analytics tools, vector search, human interpretation |
| Reinforcement learning | Environment, actions, rewards | Sequential optimization under feedback | Specialized builder path, not a default practitioner service choice |
Service selection depends on whether the task is already solved by a managed service. If the business needs sentiment or key phrase extraction from text, Amazon Comprehend may be a faster path than building supervised NLP from scratch. If the business needs image labels, Amazon Rekognition may fit. If the task is a company-specific risk score, SageMaker Canvas can help business users experiment, while SageMaker AI supports a full builder workflow.
The validation strategy also changes. For supervised learning, compare predictions to known labels on data not used for learning. For unsupervised learning, review whether discovered groups are stable, explainable enough for the use case, and useful for an action. For reinforcement learning, evaluate behavior over time, including whether the reward signal creates unsafe incentives. A model can optimize the metric and still damage the business if the metric is poorly chosen.
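For the supervised case, the held-out comparison can be sketched like this. The synthetic data and the label rule are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # hypothetical ground-truth labels

# Hold out data the model never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
```

Scoring on `X_test` rather than `X_train` is the whole point: accuracy on data the model memorized says little about how it will behave on new records.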
A practitioner should be suspicious when a team proposes supervised learning but has no labels. For example, if support tickets were never categorized consistently, the project may need a labeling effort before modeling. Amazon SageMaker Ground Truth can support labeling workflows, but labeling still costs time and requires clear definitions. A no-code ML tool cannot magically create trustworthy labels from unclear operations.
A practitioner should also be suspicious when a team proposes unsupervised learning as if it will produce a final decision. Clustering customers may reveal groups, but pricing, outreach, and eligibility decisions need business rules, fairness review, and measurement. Unsupervised results are often the beginning of analysis, not the end of governance. Human review is especially important when segments affect access, pricing, or customer treatment.
Use this decision workflow before approving a learning approach:
- Identify the output: category, number, group, ranked item, action, or generated content.
- Identify the feedback: labels, no labels, rewards, or human review.
- Check data coverage across customer groups, seasons, regions, and edge cases.
- Select the simplest AWS service path that matches the task.
- Define what happens when confidence is low or the output conflicts with policy.
- Decide how monitoring, retraining, and retirement will work.
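The first two checklist steps, output and feedback, largely determine the learning type, and can be captured as a small triage helper. The category strings here are hypothetical conventions for this sketch, not an AWS API:

```python
def suggest_learning_type(output_type: str, feedback: str) -> str:
    """First-pass triage from desired output and available feedback."""
    if feedback == "labels":
        if output_type == "category":
            return "supervised classification"
        if output_type == "number":
            return "supervised regression"
    if feedback == "rewards":
        return "reinforcement learning (specialized path, rarely a first choice)"
    if feedback == "no labels":
        return "unsupervised exploration plus human interpretation"
    return "unclear: clarify output and feedback before approving"
```

The fall-through case matters as much as the happy paths: a project that cannot state its output and feedback is not ready for a service decision.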
Scenario: a retailer wants to predict whether an online order is likely to be returned. Historical orders include the final return status, so this is supervised classification. The team could explore SageMaker Canvas for a business-led prototype or SageMaker AI for a governed build. The approval discussion should include label quality, fairness, what action the score changes, and whether return policies must remain explainable to customers.
Scenario: a media company wants to group articles by theme but does not have a fixed taxonomy. This is closer to unsupervised discovery or embedding-based similarity. Amazon Bedrock embeddings with vector search, Amazon Comprehend features, or SageMaker exploration may help depending on the desired output. The business still needs editors to name the groups and decide how search, recommendation, or reporting will use them.
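Embedding-based similarity reduces to comparing vectors. The three-dimensional "article embeddings" below are invented for illustration; real embeddings from Amazon Bedrock models have hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: two sports articles and one finance article.
sports_a = np.array([0.9, 0.1, 0.0])
sports_b = np.array([0.8, 0.2, 0.1])
finance = np.array([0.1, 0.0, 0.9])

same_theme = cosine_similarity(sports_a, sports_b)
diff_theme = cosine_similarity(sports_a, finance)
```

A vector database applies this comparison at scale, but the arithmetic only surfaces similar items. Naming the themes and deciding how to use them remains an editorial decision.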
Review Questions
A historical dataset contains support ticket text and the final category assigned by agents. Which learning type best matches a model that predicts the category for new tickets?
A team has unlabeled customer behavior records and wants to discover natural segments for analysis. Which learning type is most appropriate to explore?
What is the main risk in proposing supervised learning when the organization has no reliable labels?