5.2 Last-Week Review Map
Key Takeaways
- The three DP-700 domains are each weighted 30-35 percent, so no area can be skipped in the final week.
- Implement and manage: OneLake is one logical data lake per tenant, workspace roles (Admin/Member/Contributor/Viewer) plus item and OneLake security govern access, and deployment pipelines copy metadata not data.
- Ingest and transform: pipelines orchestrate, Dataflows Gen2 do low-code Power Query shaping, notebooks run Spark code, and the medallion bronze/silver/gold pattern stages the data.
- Monitor and optimize: the Monitoring hub shows item runs, Delta OPTIMIZE plus V-Order improve read speed, and capacity throttling delays then rejects work as smoothing windows are exceeded.
- Use short mixed-domain question sets in the last week so you practice identifying the domain when no label tells you which one is being tested.
The final week is for consolidation, not new material. DP-700 has three domains, each weighted 30-35 percent, so points are spread evenly and you cannot afford a blind spot. Use this map to lock in the single highest-yield fact per area, then prove it with short mixed sets.
Domain 1 - Implement and manage an analytics solution (30-35%)
The spine here is the Fabric storage and governance model. OneLake is a single, logical, tenant-wide data lake that stores tabular data in Delta Parquet format; every workspace and item stores its data there, and shortcuts virtualize external data (ADLS Gen2, Amazon S3, Google Cloud Storage) without copying it. Access is layered: workspace roles set the coarse permission, and OneLake data access roles and item-level sharing refine it.
| Workspace role | What it can do |
|---|---|
| Admin | Full control, including workspace settings and access |
| Member | Create/edit items, share, publish the app |
| Contributor | Create/edit items, but cannot publish the app |
| Viewer | Read-only consumption; no item data access unless granted via OneLake security |
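The role table can be encoded as a small lookup for self-testing the layered model. This is a study sketch in Python, not a Fabric API: the names `ROLE_CAPABILITIES` and `can_read_onelake_files` are hypothetical, and it assumes that any role able to edit items can also read their data, while a Viewer needs an explicit OneLake data access role.

```python
# Hypothetical study aid: capabilities per workspace role, taken from the table above.
ROLE_CAPABILITIES = {
    "Admin": {"manage_workspace_settings": True, "edit_items": True, "publish_app": True},
    "Member": {"manage_workspace_settings": False, "edit_items": True, "publish_app": True},
    "Contributor": {"manage_workspace_settings": False, "edit_items": True, "publish_app": False},
    "Viewer": {"manage_workspace_settings": False, "edit_items": False, "publish_app": False},
}


def can_read_onelake_files(workspace_role: str, has_onelake_data_role: bool = False) -> bool:
    """Return True if the user can read lakehouse files under the layered model.

    Assumption: roles that can edit items can also read their data; a Viewer
    needs an additional OneLake data access role.
    """
    capabilities = ROLE_CAPABILITIES[workspace_role]  # raises KeyError for unknown roles
    return capabilities["edit_items"] or has_onelake_data_role
```

Calling `can_read_onelake_files("Viewer")` returns `False`, and passing `has_onelake_data_role=True` flips it to `True`, which is the coarse-then-refined pattern exam stems tend to probe.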
For application lifecycle management (ALM), remember two rules that get tested constantly. First, Git integration supports Azure DevOps Git and GitHub for branch-based source control. Second, deployment pipelines copy item definitions and metadata, not the underlying data, so a freshly deployed semantic model must be refreshed before it shows data, and data-source credentials stay stage-specific.
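A toy model makes the "metadata, not data" rule concrete. The sketch below is plain Python with hypothetical names (`Stage`, `deploy`, `refresh`); it calls no Fabric API and only mirrors the behavior described above.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """A pipeline stage holding item definitions and, separately, loaded data."""
    name: str
    definitions: dict = field(default_factory=dict)  # item name -> definition
    data: dict = field(default_factory=dict)         # item name -> loaded rows


def deploy(source: Stage, target: Stage) -> None:
    """Copy item definitions only; rows in the source stage are left behind."""
    for item, definition in source.definitions.items():
        target.definitions[item] = definition
        target.data.setdefault(item, [])  # the item now exists in target, with no data


def refresh(stage: Stage, item: str, rows: list) -> None:
    """Simulate a refresh that loads rows through the stage's own connection."""
    stage.data[item] = list(rows)


# Illustrative values only.
test_stage = Stage("Test", {"Sales model": "model definition"}, {"Sales model": [1, 2, 3]})
prod_stage = Stage("Production")
deploy(test_stage, prod_stage)
```

After `deploy`, the production stage has the model definition but an empty row list; only a call to `refresh` on that stage populates it.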
Domain 2 - Ingest and transform data (30-35%)
The most-tested distinction is which tool does which job. Pipelines orchestrate multi-step workflows (copy activity, scheduling, retries, event triggers). Dataflows Gen2 do low-code, Power Query (M) shaping for analysts. Notebooks run Spark code (PySpark, Spark SQL) for custom, distributed transformation. Eventstream handles low-code real-time ingestion and routing into a lakehouse or eventhouse, while KQL queries telemetry in an eventhouse.
Layer those tools onto the medallion architecture: bronze is raw landed data, silver is cleaned and conformed, and gold is the curated, denormalized star schema ready for Power BI. Loading patterns matter: use a full load for small dimensions, an incremental load when a reliable change timestamp exists, and mirroring for continuous low-latency replication of a source such as Azure SQL into OneLake without custom ETL. Each Spark streaming job needs its own checkpoint location so that it can resume safely after a restart.
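The incremental pattern relies on a watermark: remember the latest change timestamp already loaded and pick up only rows modified after it. Here is a minimal sketch in plain Python, with an in-memory dictionary standing in for the target table. The column names `id` and `modified_at` and the function name are assumptions for illustration; in Fabric the same idea would be expressed with a pipeline or a notebook rather than this code.

```python
from datetime import datetime


def incremental_load(source_rows, target, last_watermark):
    """Upsert only rows changed after last_watermark and return the new watermark."""
    new_watermark = last_watermark
    for row in source_rows:
        if row["modified_at"] > last_watermark:
            target[row["id"]] = row  # insert a new key or overwrite an existing one
            new_watermark = max(new_watermark, row["modified_at"])
    return new_watermark


# Illustrative data: only the row modified after the stored watermark is loaded.
source = [
    {"id": 1, "name": "A", "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "name": "B", "modified_at": datetime(2024, 1, 3)},
]
silver = {}
watermark = incremental_load(source, silver, datetime(2024, 1, 2))
```

If no row is newer than the stored watermark, the function returns the watermark unchanged, so a rerun is harmless. That is also why the pattern depends on the change timestamp being reliable.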
Domain 3 - Monitor and optimize an analytics solution (30-35%)
The Monitoring hub is the single place to see run status across pipelines, notebooks, dataflows, and other items; it is the answer to almost any 'where do I check the status' question. For storage performance, Delta OPTIMIZE compacts many small files into fewer large ones, V-Order applies a write-time sort and encoding to the Parquet files that speeds Power BI and SQL reads, and VACUUM removes stale, unreferenced files older than the retention window. For capacity, learn the throttling ladder driven by smoothing windows.
| Smoothing window | What happens when exceeded |
|---|---|
| 10-minute interactive | Interactive requests get a ~20-second delay |
| 60-minute interactive | New interactive requests are rejected; background still runs |
| 24-hour background | All new requests, interactive and background, are rejected |
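The ladder reads naturally as a threshold lookup. The Python sketch below assumes throttling is keyed on how many minutes of accumulated overage the capacity is carrying; the function name and the returned labels are illustrative, not a Fabric API, and the thresholds simply mirror the table above.

```python
def throttle_stage(overage_minutes: float) -> str:
    """Map minutes of accumulated overage to the behavior in the table above.

    Hypothetical study helper; the 10-minute, 60-minute, and 24-hour
    thresholds are the ones listed in the table.
    """
    if overage_minutes <= 10:
        return "no throttling"
    if overage_minutes <= 60:
        return "interactive delayed ~20s"
    if overage_minutes <= 24 * 60:
        return "interactive rejected, background runs"
    return "all requests rejected"
```

For example, `throttle_stage(90)` falls past the 60-minute threshold but inside 24 hours, so new interactive requests are rejected while background work continues.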
Last-week schedule
- Day 7-5: drill your weakest of the three domains
- Day 4-3: mixed timed sets across all three
- Day 2: review the rules from your error log, plus the OPTIMIZE/V-Order facts and the throttling table
- Day 1: logistics, light review, sleep
High-yield cross-domain facts worth memorizing cold
A handful of facts span the three domains and appear repeatedly. Drill these until they are reflex:
- Shortcuts vs mirroring vs copy: a shortcut points at data in place (internal OneLake or external ADLS Gen2, S3, GCS) with no copy; mirroring continuously replicates a supported source (such as Azure SQL or Cosmos DB) into OneLake; a copy activity physically moves data. Pick the one the requirement implies - 'no ETL, low latency, stays in sync' is mirroring; 'query external Delta without staging' is a shortcut.
- SQL analytics endpoint is read-only: a lakehouse exposes a read-only SQL analytics endpoint, while a warehouse supports full read-write T-SQL. If a stem needs inserts or updates via T-SQL, it needs a warehouse, not the lakehouse endpoint.
- Security layers: workspace roles are coarse; OneLake data access roles, row-level security (RLS), column-level security (CLS), and dynamic data masking (DDM) refine access. A Viewer who can list a lakehouse but cannot read its files needs a OneLake data access role; an analyst who sees masked values but must see the real ones needs the UNMASK permission.
- Endorsement: Promoted is self-service by item owners; Certified requires an authorized reviewer.
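The shortcut vs mirroring vs copy choice in the first bullet reduces to two questions: must the data be physically copied into OneLake, and must that copy stay in sync with the source? A minimal Python sketch of that decision follows; the function and parameter names are hypothetical, and it deliberately ignores real-world constraints such as which sources each feature supports.

```python
def choose_access_pattern(needs_copy_in_onelake: bool, must_stay_in_sync: bool) -> str:
    """Pick the pattern implied by a requirement, following the bullet above."""
    if not needs_copy_in_onelake:
        return "shortcut"       # data is read in place, no copy
    if must_stay_in_sync:
        return "mirroring"      # continuous replication into OneLake
    return "copy activity"      # data is physically moved by a pipeline
```

A stem such as 'query external Delta without staging' maps to `choose_access_pattern(False, False)`, which returns `"shortcut"`.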
Tool-to-language quick map
DP-700 expects fluency in SQL, PySpark, and KQL, and the language often tells you which tool the question is about.
| Language | Where it lives | Typical use |
|---|---|---|
| T-SQL | Warehouse, SQL analytics endpoint | Set-based transforms, RLS/CLS, DDM |
| PySpark / Spark SQL | Notebooks, Spark job definitions | Distributed code transforms, streaming |
| KQL | Eventhouse / KQL database | Querying real-time telemetry |
| Power Query (M) | Dataflow Gen2 | Low-code analyst shaping |
If you can name the tool, the language, and the single decisive fact for each domain without notes, the last week has done its job. Resist adding brand-new resources now: scattered review the night before a scaled-score exam lowers confidence more than it raises knowledge.
A semantic model is promoted to an empty production stage with a deployment pipeline. The model appears but has no data. What is the correct explanation?
You need a low-code transformation that an analyst can build using Power Query before loading curated tables. Which Fabric item is the best fit?
A capacity has exceeded its 60-minute interactive smoothing window. What behavior should you expect?