2.2 Supervised, Unsupervised, and Reinforcement Learning
Key Takeaways
- Supervised learning uses labeled examples and is common for classification and regression business problems.
- Unsupervised learning looks for structure in unlabeled data, such as clusters, segments, or unusual patterns.
- Reinforcement learning learns from rewards and penalties, but it is less common in the everyday project-approval decisions a practitioner makes.
- Learning type affects data readiness, validation strategy, service fit, and whether a no-AI or analytics solution is more appropriate.
Learning Type Is a Data Contract
Supervised learning starts with examples that include the desired answer. A loan record might include whether the customer defaulted, a ticket might include its final category, and a property listing might include its sale price. The model learns the relationship between inputs and labels. If the labels are wrong, inconsistent, biased, or outdated, the model can learn the wrong lesson with high confidence.
Classification is supervised learning when the label is a category. Examples include approve or reject, fraud or not fraud, urgent or routine, and product category. Regression is supervised learning when the label is a number, such as expected delivery days, monthly demand, or claim cost. The practitioner does not need to build the algorithm, but should ask whether labeled history exists and whether the label reflects the current business process.
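The two supervised flavors can be illustrated with a minimal scikit-learn sketch. The toy order history, the `will_return` category label, and the `delivery_days` numeric label below are invented for illustration, not a real dataset:

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical labeled history: [order_value, past_returns] per order.
X = [[20, 0], [25, 1], [30, 0], [180, 4], [200, 5], [220, 6]]
will_return = [0, 0, 0, 1, 1, 1]    # category label -> classification
delivery_days = [2, 2, 3, 5, 6, 6]  # numeric label  -> regression

clf = LogisticRegression().fit(X, will_return)
reg = LinearRegression().fit(X, delivery_days)

category = clf.predict([[210, 5]])[0]  # a class: 0 or 1
estimate = reg.predict([[210, 5]])[0]  # a number: estimated delivery days
```

The same inputs support both tasks; only the label type changes, which is why the first question to ask is what kind of answer the business needs.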
Unsupervised learning starts without a target label. The model searches for useful structure, such as customer segments, similar documents, or unusual transaction patterns. This can be valuable for exploration and decision support, but it does not automatically tell the business what action to take. A cluster called group 3 has no business meaning until analysts interpret it and test whether acting on it improves outcomes.
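Clustering without labels can be sketched in a few lines. The two-dimensional points below are invented stand-ins for customer behavior features:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled records: two behavioral features per customer.
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_  # group ids with no business meaning yet
```

The output is only a set of group ids. Whether "group 0" means loyal customers or dormant accounts is something analysts must determine afterward.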
Reinforcement learning is different. An agent takes actions in an environment and learns from rewards or penalties. It is useful conceptually for optimization, robotics, game-like simulation, and sequential decisions, but most AWS Certified AI Practitioner business scenarios will not require choosing a reinforcement learning architecture. If the organization cannot safely simulate actions and measure rewards, this path is usually not a first choice.
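The reward loop can be shown with a toy two-action bandit in plain Python. The reward probabilities are invented, and real reinforcement learning problems add states and sequential decisions on top of this sketch:

```python
import random

random.seed(0)
true_reward_prob = [0.2, 0.8]  # hypothetical environment: action 1 is better
counts = [0, 0]
values = [0.0, 0.0]            # running estimate of each action's reward

for step in range(2000):
    # Epsilon-greedy: mostly exploit the best estimate, sometimes explore.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = 0 if values[0] >= values[1] else 1
    reward = 1.0 if random.random() < true_reward_prob[action] else 0.0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]  # incremental mean
```

Even this toy requires thousands of safe, measurable trials. If the organization cannot simulate actions and rewards like this, the reinforcement learning path is rarely a first choice.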
| Learning type | Data required | Common business use | AWS implication |
|---|---|---|---|
| Supervised classification | Labeled categories | Ticket routing, fraud flagging, sentiment classes | SageMaker Canvas, SageMaker AI, Amazon Comprehend for text categories |
| Supervised regression | Labeled numeric outcomes | Cost estimate, demand estimate, risk score | SageMaker Canvas or SageMaker AI when custom prediction is needed |
| Unsupervised learning | Unlabeled records | Segmentation, similarity, anomaly exploration | SageMaker AI, analytics tools, vector search, human interpretation |
| Reinforcement learning | Environment, actions, rewards | Sequential optimization under feedback | Specialized builder path, not a default practitioner service choice |
Service selection depends on whether the task is already solved by a managed service. If the business needs sentiment or key phrase extraction from text, Amazon Comprehend may be a faster path than building supervised NLP from scratch. If the business needs image labels, Amazon Rekognition may fit. If the task is a company-specific risk score, SageMaker Canvas can help business users experiment, while SageMaker AI supports a full builder workflow.
The validation strategy also changes. For supervised learning, compare predictions to known labels on data not used for learning. For unsupervised learning, review whether discovered groups are stable, explainable enough for the use case, and useful for an action. For reinforcement learning, evaluate behavior over time, including whether the reward signal creates unsafe incentives. A model can optimize the metric and still damage the business if the metric is poorly chosen.
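For the supervised case, the held-out comparison can be sketched like this. The synthetic data and the label rule are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # hypothetical ground-truth labels

# Hold out data the model never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
```

Scoring on `X_test` rather than `X_train` is the whole point: accuracy on data the model memorized says little about how it will behave on new records.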
A practitioner should be suspicious when a team proposes supervised learning but has no labels. For example, if support tickets were never categorized consistently, the project may need a labeling effort before modeling. Amazon SageMaker Ground Truth can support labeling workflows, but labeling still costs time and requires clear definitions. A no-code ML tool cannot magically create trustworthy labels from unclear operations.
A practitioner should also be suspicious when a team proposes unsupervised learning as if it will produce a final decision. Clustering customers may reveal groups, but pricing, outreach, and eligibility decisions need business rules, fairness review, and measurement. Unsupervised results are often the beginning of analysis, not the end of governance. Human review is especially important when segments affect access, pricing, or customer treatment.
Use this decision workflow before approving a learning approach:
- Identify the output: category, number, group, ranked item, action, or generated content.
- Identify the feedback: labels, no labels, rewards, or human review.
- Check data coverage across customer groups, seasons, regions, and edge cases.
- Select the simplest AWS service path that matches the task.
- Define what happens when confidence is low or the output conflicts with policy.
- Decide how monitoring, retraining, and retirement will work.
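The first two checklist steps, output and feedback, largely determine the learning type, and can be captured as a small triage helper. The category strings here are hypothetical conventions for this sketch, not an AWS API:

```python
def suggest_learning_type(output_type: str, feedback: str) -> str:
    """First-pass triage from desired output and available feedback."""
    if feedback == "labels":
        if output_type == "category":
            return "supervised classification"
        if output_type == "number":
            return "supervised regression"
    if feedback == "rewards":
        return "reinforcement learning (specialized path, rarely a first choice)"
    if feedback == "no labels":
        return "unsupervised exploration plus human interpretation"
    return "unclear: clarify output and feedback before approving"
```

The fall-through case matters as much as the happy paths: a project that cannot state its output and feedback is not ready for a service decision.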
Scenario: a retailer wants to predict whether an online order is likely to be returned. Historical orders include the final return status, so this is supervised classification. The team could explore SageMaker Canvas for a business-led prototype or SageMaker AI for a governed build. The approval discussion should include label quality, fairness, what action the score changes, and whether return policies must remain explainable to customers.
Scenario: a media company wants to group articles by theme but does not have a fixed taxonomy. This is closer to unsupervised discovery or embedding-based similarity. Amazon Bedrock embeddings with vector search, Amazon Comprehend features, or SageMaker exploration may help depending on the desired output. The business still needs editors to name the groups and decide how search, recommendation, or reporting will use them.
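Embedding-based similarity reduces to comparing vectors. The three-dimensional "article embeddings" below are invented for illustration; real embeddings from Amazon Bedrock models have hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: two sports articles and one finance article.
sports_a = np.array([0.9, 0.1, 0.0])
sports_b = np.array([0.8, 0.2, 0.1])
finance = np.array([0.1, 0.0, 0.9])

same_theme = cosine_similarity(sports_a, sports_b)
diff_theme = cosine_similarity(sports_a, finance)
```

A vector database applies this comparison at scale, but the arithmetic only surfaces similar items. Naming the themes and deciding how to use them remains an editorial decision.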
Review Questions
A historical dataset contains support ticket text and the final category assigned by agents. Which learning type best matches a model that predicts the category for new tickets?
A team has unlabeled customer behavior records and wants to discover natural segments for analysis. Which learning type is most appropriate to explore?
What is the main risk in proposing supervised learning when the organization has no reliable labels?