6.2 Model Size Reduction and Cardinality Management
Key Takeaways
- High-cardinality columns (many unique values) compress poorly and increase model size disproportionately.
- Remove unnecessary columns before loading — every column consumes memory even if unused in reports.
- Reduce row count by filtering historical data, aggregating to the needed granularity, or using incremental refresh.
- Integer keys compress better than text keys — create surrogate integer keys when possible.
- Disable Auto Date/Time to prevent hidden date tables from bloating the model.
Model Size Reduction and Cardinality Management
Quick Answer: Model size is driven by column count, row count, and cardinality (unique values per column). Remove unused columns, filter unnecessary rows, use integers instead of text for keys, reduce precision of decimals, and disable Auto Date/Time. High-cardinality text columns (GUIDs, free-text descriptions) are the biggest offenders.
Understanding Model Size
Power BI's VertiPaq engine compresses data using dictionary encoding, value encoding, and run-length encoding. The compression ratio depends on:
Factors Affecting Size
| Factor | Impact |
|---|---|
| Number of columns | Each column has overhead regardless of row count |
| Number of rows | More rows = more data to store |
| Cardinality | More unique values = worse compression |
| Data type | Text > Decimal > Integer > Boolean (most to least storage) |
| Value distribution | Skewed distributions compress better than uniform |
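To see why cardinality dominates, consider a minimal sketch of dictionary encoding in Python. This is a conceptual illustration only, not VertiPaq's actual implementation: each column stores its distinct values once in a dictionary, plus one small integer index per row, so a low-cardinality column of a million rows needs only a tiny dictionary.

```python
def dictionary_encode(values):
    """Encode a column as (dictionary of unique values, per-row index list)."""
    dictionary = []
    positions = {}          # value -> its index in the dictionary
    indexes = []
    for v in values:
        if v not in positions:
            positions[v] = len(dictionary)
            dictionary.append(v)
        indexes.append(positions[v])
    return dictionary, indexes

# Low cardinality: 1,000,000 rows but only 3 distinct values ->
# the dictionary stays tiny and the indexes compress well.
status = ["Open", "Closed", "Pending", "Open"] * 250_000
d, idx = dictionary_encode(status)
print(len(d))    # 3
```

With a high-cardinality column (every value unique, like a GUID), the dictionary grows as large as the column itself and the encoding saves nothing — which is exactly why such columns inflate model size.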
Understanding Cardinality
Cardinality = the number of distinct values in a column.
| Cardinality Level | Example | Compression |
|---|---|---|
| Very low (2-10) | Gender, Status, Boolean | Excellent |
| Low (10-1,000) | Category, Region, Department | Very Good |
| Medium (1,000-100,000) | City, Product Name | Good |
| High (100,000+) | Customer Email, Transaction ID | Poor |
| Very high (millions) | GUID, Free Text, Timestamps | Very Poor |
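Measuring cardinality is a one-liner once the data is loaded. A hypothetical pandas sketch (column names are illustrative only) — the same per-column distinct counts are what tools like DAX Studio's VertiPaq Analyzer report for a Power BI model:

```python
import pandas as pd

# Illustrative sample data with three cardinality levels.
df = pd.DataFrame({
    "Status": ["Open", "Closed"] * 500,              # very low cardinality
    "Region": ["North", "South", "East", "West"] * 250,  # low cardinality
    "TransactionID": range(1000),                    # unique per row: high
})

# Distinct values per column, highest first -- the top entries are the
# candidates for removal, truncation, or integer surrogate keys.
cardinality = df.nunique().sort_values(ascending=False)
print(cardinality)
```

Columns whose distinct count approaches the row count (here `TransactionID`) are the ones to scrutinize first.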
Column Optimization
Remove Unused Columns
Every column in the model consumes memory. Remove columns that are not used in:
- Reports (visuals, slicers, filters)
- Measures (DAX references)
- Relationships (key columns)
- Row-level security (filter expressions)
In Power Query: Right-click column → Remove
Or: Select needed columns → Right-click → Remove Other Columns
Best Practice: "Remove Other Columns" is more future-proof than "Remove Columns" — if the source adds new columns, they are automatically excluded.
Reduce Column Precision
- Use Date instead of Date/Time when time is not needed
- Round decimal values to fewer decimal places
- Truncate text columns to the maximum length actually needed
Replace High-Cardinality Text with Integers
Instead of using text-based keys for relationships:
| Before (Text Key) | After (Integer Key) |
|---|---|
| "PROD-ABC-12345" | 1 |
| "PROD-DEF-67890" | 2 |
| "PROD-GHI-11111" | 3 |
Create a surrogate integer key in Power Query:
Add Column → Index Column → From 1
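The same surrogate-key idea, sketched in pandas for illustration: assign a dense integer per distinct text key, keep the text key only in the small dimension table, and carry just the integer on the large fact table. In Power Query this corresponds to adding the index column to the dimension and merging it back onto the fact table.

```python
import pandas as pd

# Hypothetical fact table carrying a wide text key on every row.
fact = pd.DataFrame({"ProductKey": ["PROD-ABC-12345",
                                    "PROD-DEF-67890",
                                    "PROD-ABC-12345"]})

# factorize assigns one small integer per distinct text value.
codes, uniques = pd.factorize(fact["ProductKey"])
fact["ProductID"] = codes + 1          # start at 1, like "From 1"

# The text key survives only in the (much smaller) dimension table;
# the fact table can now drop ProductKey entirely.
dim_product = pd.DataFrame({"ProductID": range(1, len(uniques) + 1),
                            "ProductKey": uniques})
fact = fact.drop(columns="ProductKey")
```

The relationship now joins on a compact integer instead of a repeated 14-character string.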
Row Optimization
Filter Historical Data
If reports only need the last 3 years of data:
Table.SelectRows(Source, each [Date] >= Date.AddYears(Date.From(DateTime.LocalNow()), -3))
A relative filter like this keeps the window rolling; a hardcoded literal such as #date(2024, 1, 1) works but must be updated manually.
Aggregate to Needed Granularity
If reports never drill below the daily level but data is per-second:
Group By Date, Product → Sum of Amount, Count of Transactions
This can reduce millions of rows to thousands.
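The Group By step above can be sketched in pandas for illustration (the data and column names are hypothetical): a day of per-second transactions collapses to one row per date and product.

```python
import pandas as pd

# Per-second fact rows: 86,400 rows for a single day, two products.
tx = pd.DataFrame({
    "Timestamp": pd.date_range("2024-01-01", periods=86_400, freq="s"),
    "Product": ["A", "B"] * 43_200,
    "Amount": 1.0,
})

# Group By Date, Product -> Sum of Amount, Count of Transactions
daily = (tx.assign(Date=tx["Timestamp"].dt.date)
           .groupby(["Date", "Product"], as_index=False)
           .agg(Amount=("Amount", "sum"),
                Transactions=("Amount", "size")))
print(len(tx), "->", len(daily))   # 86400 -> 2
```

The totals are preserved; only the granularity below the reporting level is discarded.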
Use Aggregation Tables
For large models, create pre-aggregated summary tables:
- Load both detail and summary tables
- Configure aggregations in the model
- Power BI automatically routes high-level queries to the summary table
- Drill-through queries use the detail table
Disable Auto Date/Time
File → Options and settings → Options → Data Load → Time intelligence
→ Uncheck "Auto date/time for new files"
Auto Date/Time creates a hidden date table for every date column in the model. With 10 date columns, that is 10 hidden tables with full date hierarchies — significant overhead.
Instead: Create one explicit Date dimension table and use it for all date-based analysis.
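For illustration, here is the shape of such a date dimension built in pandas (in Power BI you would typically generate the equivalent table with DAX `CALENDAR`/`CALENDARAUTO` or in Power Query; the column names here are just a common convention):

```python
import pandas as pd

# One contiguous row per day covering the model's date range.
dates = pd.date_range("2022-01-01", "2024-12-31", freq="D")
dim_date = pd.DataFrame({
    "Date": dates,
    "Year": dates.year,
    "Month": dates.month,
    "MonthName": dates.strftime("%B"),
    "Quarter": dates.quarter,
})
```

One such table, marked as the model's date table and related to every date column, replaces all the hidden Auto Date/Time tables at a fraction of the cost.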
On the Exam
The PL-300 frequently tests:
- Identifying high-cardinality columns as model size problems
- Choosing strategies to reduce model size
- Understanding the impact of data types on compression
- Knowing when to aggregate data before loading
- Disabling Auto Date/Time and using an explicit date table
Which of the following columns would cause the WORST compression and largest model size increase?
Why should you disable the Auto Date/Time feature in Power BI?
A model has a detailed transactions table with 50 million rows at per-second granularity. Reports only show daily summaries. What is the best optimization approach?