2.2 Regression Models
Key Takeaways
- Regression models predict continuous numerical values — house prices, temperatures, sales volumes, stock prices.
- Linear regression finds the straight-line relationship between features and the label; it is the simplest regression algorithm.
- Key evaluation metrics for regression: MAE (average error magnitude), RMSE (penalizes large errors), R-squared (proportion of variance explained, where 1.0 = perfect).
- R-squared (coefficient of determination) typically ranges from 0 to 1 (it can fall below 0 for models that fit worse than simply predicting the mean) — higher values indicate better model fit. An R-squared of 0.85 means the model explains 85% of the variance in the data.
- The AI-900 tests conceptual understanding of regression — you will not be asked to calculate metrics or write regression code.
Regression Models
Quick Answer: Regression models predict continuous numerical values like prices, temperatures, and quantities. Linear regression finds the straight-line relationship between features and the label. Key evaluation metrics are MAE (Mean Absolute Error), RMSE (Root Mean Squared Error), and R-squared (proportion of variance explained). Higher R-squared values indicate better model performance.
What Is Regression?
Regression is a supervised machine learning technique that predicts a continuous numerical value based on input features. The output is a number on a continuous scale, not a category.
How to Identify a Regression Problem
Ask yourself: Is the predicted output a number on a continuous scale?
- Predicting a house price ($450,000) → Regression (continuous number)
- Predicting if an email is spam → Classification (category)
- Predicting tomorrow's temperature (72.5°F) → Regression (continuous number)
- Grouping customers into segments with no predefined labels → Clustering (unsupervised grouping, not prediction)
Linear Regression
Linear regression is the simplest and most fundamental regression algorithm. It finds the straight-line relationship between features and the label.
Simple Linear Regression (One Feature)
With one feature, linear regression fits a straight line through the data:
Formula: y = mx + b
- y = predicted value (label)
- x = input value (feature)
- m = slope (how much y changes when x changes by 1)
- b = y-intercept (value of y when x = 0)
Example: Predicting house price (y) based on square footage (x)
- If m = 200 and b = 50,000
- A 1,500 sq ft house: y = 200(1,500) + 50,000 = $350,000
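The arithmetic above can be sketched in a few lines of Python. The slope and intercept are the example values from the text, not parameters fitted from real data:

```python
def predict_price(sqft: float, slope: float = 200.0, intercept: float = 50_000.0) -> float:
    """Simple linear regression prediction: y = m*x + b."""
    return slope * sqft + intercept

# A 1,500 sq ft house with m = 200, b = 50,000
print(predict_price(1_500))  # 200 * 1500 + 50000 = 350000.0
```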
Multiple Linear Regression (Multiple Features)
With multiple features, the model accounts for several input variables:
Formula: y = m₁x₁ + m₂x₂ + m₃x₃ + ... + b
Example: Predicting house price based on square footage (x₁), bedrooms (x₂), and age (x₃)
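A multiple-regression prediction is the same idea with one coefficient per feature. The coefficients below are made-up illustrative values (not fitted from data), just to show how the formula extends:

```python
def predict(features, coefficients, intercept):
    """Multiple linear regression: y = m1*x1 + m2*x2 + ... + b."""
    return sum(m * x for m, x in zip(coefficients, features)) + intercept

# Hypothetical coefficients: $150 per sq ft, +$10,000 per bedroom, -$1,000 per year of age
house = [1_500, 3, 20]                 # square footage, bedrooms, age (years)
coeffs = [150.0, 10_000.0, -1_000.0]
print(predict(house, coeffs, 50_000.0))  # 225000 + 30000 - 20000 + 50000 = 285000.0
```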
Regression Evaluation Metrics
After training a regression model, you evaluate its performance using these metrics:
Mean Absolute Error (MAE)
The average absolute difference between predicted and actual values.
- Interpretation: On average, how far off are the predictions?
- Example: MAE of $15,000 means predictions are off by $15,000 on average
- Lower is better
Root Mean Squared Error (RMSE)
The square root of the average squared differences between predicted and actual values.
- Interpretation: Similar to MAE but penalizes larger errors more heavily
- Example: RMSE of $20,000 for a model whose MAE is $15,000 — RMSE is never smaller than MAE, and the gap widens when a few predictions are badly off
- Lower is better
R-squared (Coefficient of Determination)
The proportion of variance in the label that the model explains.
- Range: 0 to 1 (can be negative for very poor models)
- Interpretation: How much of the variation in the output does the model explain?
- Example: R² = 0.85 means the model explains 85% of the variation in house prices
- Higher is better (1.0 = perfect, 0 = no better than predicting the average)
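All three metrics can be computed directly from their definitions. Here is a minimal sketch in plain Python, using a tiny made-up set of actual and predicted values:

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: squaring penalizes large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Proportion of variance in the actual values explained by the predictions."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 390.0]
print(mae(actual, predicted))                # 12.5
print(round(rmse(actual, predicted), 2))     # 13.23 (larger than MAE: the 20-unit error dominates)
print(round(r_squared(actual, predicted), 3))  # 0.986
```

Note how RMSE exceeds MAE on the same data: the single 20-unit error contributes four times as much to the squared sum as each 10-unit error.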
| Metric | What It Measures | Good Values | Direction |
|---|---|---|---|
| MAE | Average prediction error | Close to 0 | Lower is better |
| RMSE | Average error (penalizes large errors) | Close to 0 | Lower is better |
| R-squared | Proportion of variance explained | Close to 1.0 | Higher is better |
On the Exam: You will NOT be asked to calculate these metrics. You need to know what each metric means, how to interpret it, and which direction is "better." Common question: "An R-squared of 0.92 indicates that..." → the model explains 92% of the variance in the data.
Common Regression Use Cases
| Use Case | Features | Label |
|---|---|---|
| House price prediction | Size, bedrooms, location | Price ($) |
| Sales forecasting | Month, promotions, weather | Sales volume |
| Temperature prediction | Date, location, humidity | Temperature (°F) |
| Stock price prediction | Volume, market index, news | Stock price ($) |
| Delivery time estimation | Distance, traffic, weather | Delivery time (minutes) |
| Insurance premium pricing | Age, health history, coverage | Premium ($/month) |
| Energy consumption | Time, weather, building size | Energy (kWh) |
Regression vs. Classification: The Key Difference
| Aspect | Regression | Classification |
|---|---|---|
| Output type | Continuous number | Discrete category |
| Examples | $450,000, 72.5°F, 3.7 hours | Spam/Not Spam, Cat/Dog, Yes/No |
| Question answered | "How much?" or "How many?" | "Which category?" or "Is it X?" |
| Evaluation | MAE, RMSE, R-squared | Accuracy, Precision, Recall, F1 |
Review Questions
What does an R-squared value of 0.92 indicate about a regression model?
Which machine learning technique would you use to predict the temperature in a city tomorrow?
Which regression evaluation metric penalizes larger prediction errors more heavily?
A model predicts house prices. The MAE is $25,000 and the R-squared is 0.78. What does this tell you?