2.6 Classification, Regression, Clustering, Forecasting, and Recommendation
Key Takeaways
- Classification predicts a category, regression predicts a number, clustering discovers groups, forecasting predicts future values over time, and recommendation ranks items or actions.
- The correct pattern depends on the output the business will consume, not on the tool name alone.
- Metrics should connect model performance to business value, cost, risk, and user feedback.
- AWS service selection ranges from managed AI services and Amazon Personalize to SageMaker Canvas, SageMaker AI, analytics tools, and plain rules that use no AI at all.
Name the Output First
Classification predicts a category. The output might be fraud or not fraud, urgent or routine, approved or rejected, positive or negative sentiment, or document type. Classification is often easy to explain to business users because the result is a label. The risk is that a label can hide uncertainty. A low-confidence classification may need review, especially when it affects money, access, safety, or customer treatment.
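The review idea can be made concrete with a small sketch, assuming the classifier returns a label plus a confidence score; the 0.85 threshold and the function name are hypothetical choices for illustration only.

```python
# Minimal sketch: route low-confidence classifications to human review.
# The 0.85 threshold and function name are hypothetical illustrations.
REVIEW_THRESHOLD = 0.85

def route_prediction(label: str, confidence: float) -> str:
    """Return an action for a single classified item."""
    if confidence >= REVIEW_THRESHOLD:
        return f"auto:{label}"          # act on the label automatically
    return f"human_review:{label}"      # low confidence: send to a reviewer

print(route_prediction("fraud", 0.97))  # auto:fraud
print(route_prediction("fraud", 0.61))  # human_review:fraud
```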
Regression predicts a number. The output might be expected demand, claim amount, delivery duration, churn probability, or risk score. Some scores are technically numeric but are used like classifications when a threshold triggers action. Practitioners should ask how thresholds are chosen, who can override them, and whether the numeric output is calibrated well enough for the decision.
Clustering discovers groups in data without a known target label. A marketing team might find behavior segments, or an operations team might group similar incidents. Clustering can reveal structure, but it does not automatically produce a decision. The team needs analysts and domain experts to name clusters, check for fairness issues, and decide whether action based on clusters is useful.
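A minimal sketch of the idea, assuming scikit-learn is available; the toy feature values and the choice of three clusters are hypothetical, and the point is that the output is unlabeled group numbers that analysts still need to interpret.

```python
# Minimal sketch: clustering produces unlabeled groups, not decisions.
# Toy data and k=3 are hypothetical; real work needs domain review of each cluster.
import numpy as np
from sklearn.cluster import KMeans

# Columns might be, for example, monthly spend and visits per month (illustrative only).
X = np.array([[20, 1], [25, 2], [200, 8], [220, 10], [90, 4], [95, 5]])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(labels)  # e.g. [0 0 2 2 1 1] -- integers only; someone still has to name them
```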
Forecasting predicts future values over time. The time dimension is essential. A forecast might estimate next month's demand, next week's call volume, or future infrastructure usage. Good forecasting needs historical data, seasonality awareness, event context, and validation that respects the timeline. A forecast should be compared with a simple baseline, because complex models do not always beat a well-understood planning rule.
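A minimal sketch of the baseline comparison, using made-up demand numbers and a naive "repeat last period" rule; the values are hypothetical and only illustrate why the comparison matters.

```python
# Minimal sketch: always compare forecast error against a naive baseline.
# The demand series and "model" forecasts below are made-up illustrations.
actual = [120, 135, 128, 140, 150, 145]          # hypothetical weekly demand
model  = [118, 130, 131, 138, 146, 149]          # hypothetical model forecasts
baseline = [125] + actual[:-1]                   # naive rule: predict last week's value

def mae(pred, truth):
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

print("model MAE   :", mae(model, actual))
print("baseline MAE:", mae(baseline, actual))
# If the model barely beats the naive rule, the extra complexity may not be worth it.
```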
Recommendation ranks items, content, products, offers, or next actions. The goal is usually to personalize or prioritize. Amazon Personalize is the managed service to consider for recommendation scenarios when user-item interaction data exists. A recommendation system can improve engagement, but it can also create stale suggestions, popularity bias, privacy concerns, or unfair exposure for new items.
| Pattern | Output | Example question | Possible AWS path | Common metric lens |
|---|---|---|---|---|
| Classification | Category | Which queue should handle this ticket? | Comprehend, Rekognition, SageMaker Canvas, SageMaker AI | Accuracy, precision, recall, F1, review rate |
| Regression | Number | What will this claim cost? | SageMaker Canvas or SageMaker AI | Error size, business cost of over- or underestimation |
| Clustering | Group | Which customers behave similarly? | SageMaker AI, analytics, embeddings and search | Stability, interpretability, action value |
| Forecasting | Future value | How much demand next week? | SageMaker options and analytics workflows | Forecast error versus baseline |
| Recommendation | Ranked list | Which product should appear next? | Amazon Personalize or custom ML | Clicks, conversion, relevance, diversity |
Metrics must match the business risk. Accuracy may be misleading when the important class is rare, such as fraud. Precision asks how many flagged items are truly relevant. Recall asks how many relevant items were found. F1 balances precision and recall. AUC can help compare ranking ability. For a practitioner, the key is not memorizing formulas but knowing that the wrong metric can approve the wrong solution.
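A minimal sketch of how accuracy can mislead on a rare class, assuming scikit-learn is available; the labels are hypothetical, with fraud as the rare positive class.

```python
# Minimal sketch: accuracy can look great on a rare class while recall is poor.
# The labels below are hypothetical: 1 = fraud, 0 = legitimate.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0]*95 + [1]*5              # 5% of transactions are fraud
y_pred = [0]*95 + [1, 0, 0, 0, 0]    # model catches only 1 of the 5 fraud cases

print("accuracy :", accuracy_score(y_true, y_pred))   # 0.96 -- looks impressive
print("precision:", precision_score(y_true, y_pred))  # 1.00 -- every flag was real fraud
print("recall   :", recall_score(y_true, y_pred))     # 0.20 -- most fraud was missed
print("f1       :", f1_score(y_true, y_pred))         # ~0.33 -- balances the two
```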
Business metrics matter too. A fraud model should reduce losses without creating unacceptable false declines. A recommendation model should improve conversion without damaging trust. A forecast should improve inventory decisions, not just produce a lower mathematical error. A document classifier should reduce handling time while maintaining quality. User feedback can reveal whether model outputs are helpful inside the actual workflow.
Service selection should start with the pattern and data. If the task is generic text classification or sentiment, Amazon Comprehend may fit. If the task is custom classification from tabular data and a business team wants to prototype, SageMaker Canvas may fit. If the organization needs a governed custom lifecycle, SageMaker AI is relevant. If the output is a ranked list of products for users, Amazon Personalize deserves consideration.
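A minimal sketch of the Comprehend path for generic sentiment, assuming AWS credentials and permissions are already configured; the Region and sample text are illustrative.

```python
# Minimal sketch: managed sentiment classification with Amazon Comprehend.
# Assumes AWS credentials/permissions are configured; the Region is illustrative.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

resp = comprehend.detect_sentiment(
    Text="The delivery was late and the package was damaged.",
    LanguageCode="en",
)
print(resp["Sentiment"])       # e.g. NEGATIVE
print(resp["SentimentScore"])  # confidence scores per sentiment label
```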
Generative AI can support these patterns but does not replace all of them. A foundation model can summarize why a customer might churn, generate an explanation for a human reviewer, or convert unstructured text into structured fields. However, a probabilistically generated paragraph is not the same as a validated classification model, a calibrated numeric forecast, or an auditable rule. Use Amazon Bedrock where generation or language reasoning is the value.
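A minimal sketch of that last use, assuming access to the Bedrock Converse API and a model enabled in the account; the model ID, Region, and prompt are illustrative only, and the generated output still needs validation before it drives a decision.

```python
# Minimal sketch: using a Bedrock foundation model to extract structured fields
# from free text. The model ID is one example; access and Region availability vary.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

prompt = (
    "Extract the claim amount and incident date from this note as JSON with keys "
    "'amount' and 'date': 'Customer reports hail damage on 2024-03-14, estimate $2,300.'"
)

resp = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(resp["output"]["message"]["content"][0]["text"])
# Generated fields still need validation before they feed a payment or audit workflow.
```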
Use this quick mapping workflow; a small sketch after the list shows the first step in code:
- Ask what the output looks like: category, number, group, future time value, or ranked list.
- Confirm what historical data supports that output.
- Choose a metric tied to the cost of wrong outputs.
- Decide whether a managed AI service, Amazon Personalize, SageMaker Canvas, SageMaker AI, or a rule is the simplest fit.
- Define human review and monitoring for high-risk or low-confidence results.
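A minimal sketch of the first step, mirroring the table above; the dictionary keys, function name, and suggested starting points are illustrative, not an official decision tool.

```python
# Minimal sketch: map the shape of the required output to a candidate pattern.
# The dictionary mirrors the table above; names and mapping are illustrative only.
CANDIDATES = {
    "category":     ("classification", "Comprehend / Rekognition / SageMaker Canvas"),
    "number":       ("regression", "SageMaker Canvas or SageMaker AI"),
    "group":        ("clustering", "SageMaker AI or analytics tooling"),
    "future value": ("forecasting", "SageMaker options and analytics workflows"),
    "ranked list":  ("recommendation", "Amazon Personalize or custom ML"),
}

def suggest(output_shape: str) -> str:
    pattern, services = CANDIDATES.get(output_shape, ("unknown", "revisit the requirement"))
    return f"pattern={pattern}; start with: {services}"

print(suggest("ranked list"))
# pattern=recommendation; start with: Amazon Personalize or custom ML
```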
Scenario: an insurer wants to estimate claim cost. That is regression if the output is a dollar estimate. If the workflow only needs low, medium, or high complexity, classification may be better. If the estimate triggers payment or denial, the risk profile changes and human review becomes important. The service choice could start with SageMaker Canvas for exploration, but production ownership may require the fuller SageMaker AI lifecycle and governance path.
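A minimal sketch of the banding idea, with hypothetical dollar thresholds and a simple rule that anything beyond the low band gets human review before payment or denial.

```python
# Minimal sketch: turning a regression estimate into the bands the workflow consumes.
# The $1,000 / $10,000 band edges and the review rule are hypothetical illustrations.
def triage_claim(estimated_cost: float) -> dict:
    if estimated_cost < 1_000:
        band = "low"
    elif estimated_cost < 10_000:
        band = "medium"
    else:
        band = "high"
    # Any estimate that could trigger payment or denial gets a human in the loop.
    return {"estimate": estimated_cost, "band": band, "needs_review": band != "low"}

print(triage_claim(4_250.0))
# {'estimate': 4250.0, 'band': 'medium', 'needs_review': True}
```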
Scenario: a streaming service wants to suggest the next video. That is recommendation, not simple classification. Amazon Personalize may fit if user-item interactions are available and permitted. A new catalog with little user behavior may need editorial rules or popularity lists first. The practitioner should ask how the team will handle new users, new items, inappropriate recommendations, and diversity of results.
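A minimal sketch of querying an existing Amazon Personalize campaign, assuming one has already been trained and deployed; the campaign ARN, user ID, and Region are placeholders.

```python
# Minimal sketch: requesting ranked items from an existing Personalize campaign.
# The campaign ARN and user ID are placeholders; a trained campaign must already exist.
import boto3

personalize_rt = boto3.client("personalize-runtime", region_name="us-east-1")

resp = personalize_rt.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/next-video",
    userId="user-42",
    numResults=5,
)
for item in resp["itemList"]:
    print(item["itemId"], item.get("score"))
# New users and new items still need a fallback such as popularity or editorial rules.
```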
A model predicts whether an email is spam, promotional, or personal. Which task pattern is this?
A retailer predicts the number of units that will sell next week for each store. Which task pattern is most directly involved?
A company wants to rank products for each shopper based on interaction history. Which AWS service is especially relevant to evaluate?