4.5 Optimization & Refresh
Key Takeaways
- A clean star schema, low-cardinality columns, and removing unused columns are the highest-impact ways to shrink and speed up an Import or Direct Lake model.
- Aggregation tables pre-summarize large facts so visuals hit a small cached table and only fall through to the detailed table when needed.
- Incremental refresh partitions Import tables by a date range so only recent partitions refresh, drastically reducing refresh time and memory.
- Performance Analyzer in Power BI Desktop and DAX Studio identify slow visuals and slow DAX queries for targeted tuning.
- Direct Lake performance depends on framing, column count, and capacity guardrails; reducing model width and reframing appropriately keeps it in the fast path.
Optimization Is a Tested Skill, Not an Afterthought
The semantic-model domain explicitly covers DAX optimization, model optimization, and incremental refresh. Expect scenarios that describe a slow report or a model that exceeds memory and ask for the most effective remedy.
Model Size and Speed Fundamentals
The VertiPaq engine compresses columns; column cardinality (number of distinct values) is the main driver of model size and query cost. High-cardinality columns — long unique strings, free-text, high-precision datetimes — compress poorly.
Highest-impact levers:
- Use a star schema; avoid wide, snowflaked, or fact-embedded designs.
- Remove unused columns and unused tables entirely.
- Reduce cardinality: split datetime into date + time, round numbers that do not need precision.
- Replace calculated columns with measures where possible.
- Disable auto date/time and use a single marked date table.
Aggregations
An aggregation table stores a pre-summarized version of a large fact (for example daily totals by product). When a visual only needs the summary grain, the engine answers from the small aggregation table; when it needs detail, it transparently falls through to the detailed fact. Aggregations are especially valuable with DirectQuery/Direct Lake over very large facts because most dashboard queries never touch the billion-row detail table.
Incremental Refresh
Incremental refresh partitions an Import table by a date range using RangeStart/RangeEnd parameters. After the initial load, scheduled refreshes only reprocess recent partitions (for example the last few days), while historical partitions stay cached. Benefits: far shorter refresh windows, lower memory and capacity use, and optional real-time freshness on the latest partition (hybrid tables). It applies to Import-mode tables; Direct Lake stays current through framing instead. A wrong-answer trap is applying incremental refresh to a Direct Lake table.
| Symptom | Likely fix |
|---|---|
| Daily refresh takes hours, mostly reloading old data | Incremental refresh |
| Huge fact, dashboards only need summaries | Aggregation table |
| Model exceeds capacity memory | Reduce cardinality / remove columns / large format |
| One visual is slow, others fine | Profile that visual's DAX in Performance Analyzer / DAX Studio |
Diagnosing Slow Reports
Use Performance Analyzer in Power BI Desktop to record how long each visual takes and split DAX query time from visual rendering. For deep DAX tuning, DAX Studio captures the storage-engine vs formula-engine breakdown and server timings. The exam favors answers that measure first (profile the slow visual) over blanket changes like 'switch the whole model to DirectQuery.' For Direct Lake, keep models narrow and ensure reframing keeps the fast path; excessive fallback to DirectQuery is a performance red flag.
An Import semantic model has a 400-million-row sales fact. Scheduled refresh now runs over four hours each night even though only the last two days of data change. What is the most appropriate optimization?
Most dashboard visuals over a billion-row Direct Lake fact only show monthly totals by category, yet they are slow because every query scans the full detail. What is the best-targeted fix?