3.1 Ingest and transform data Overview

Key Takeaways

Ingest and transform data accounts for 30-35% of the DP-700 blueprint.
Study the domain as a set of job tasks, not as a list of definitions.
Questions often ask which action, control, data element, or workflow step is most appropriate.
Use domain weight and practice misses to decide how much review time this area needs.
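
The last takeaway can be turned into simple arithmetic. The sketch below is one possible approach, not an official formula: it splits a review budget in proportion to blueprint weight multiplied by your practice miss rate. The function name `allocate_review_hours`, the 20-hour budget, the miss rates, and the two placeholder domains are all assumptions made for illustration; only the 30-35% weight for this domain comes from the text above.

```python
def allocate_review_hours(total_hours, domains):
    """Split total_hours across domains in proportion to weight * miss_rate.

    domains maps a domain name to a (blueprint_weight, miss_rate) pair,
    both expressed as fractions between 0 and 1.
    """
    scores = {name: weight * miss for name, (weight, miss) in domains.items()}
    total_score = sum(scores.values())
    if total_score == 0:
        # No misses recorded anywhere: fall back to blueprint weight alone.
        scores = {name: weight for name, (weight, _) in domains.items()}
        total_score = sum(scores.values())
    return {
        name: round(total_hours * score / total_score, 1)
        for name, score in scores.items()
    }


# 0.325 is the midpoint of the 30-35% range stated for this domain.
# The other two entries and every miss rate are made-up placeholders.
domains = {
    "Ingest and transform data": (0.325, 0.40),
    "Placeholder domain A": (0.325, 0.20),
    "Placeholder domain B": (0.35, 0.10),
}

plan = allocate_review_hours(20, domains)
for name, hours in plan.items():
    print(f"{name}: {hours} h")
```

With these sample numbers, the domain with the highest miss rate receives the largest share of the budget even though the blueprint weights are similar.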

Last updated: May 2026

Ingest and transform data is a DP-700 blueprint domain that covers batch and streaming ingestion, loading patterns, shortcuts, mirroring, transformations with Power Query, PySpark, SQL, and KQL, and dimensional-model preparation.

Official baseline

Use the current official materials before relying on secondary summaries. Primary source: Microsoft Certified: Fabric Data Engineer Associate. Also compare the official content outline, candidate guide, and scheduling resources when policies affect eligibility, fees, timing, or retakes.

Study notes

Ingest and transform data is weighted at 30-35%. The official description is: batch and streaming ingestion, loading patterns, shortcuts, mirroring, transformations with Power Query, PySpark, SQL, and KQL, and dimensional-model preparation.

For test prep, convert the domain into actions. Ask: what document, data element, system control, report, code, policy, or communication step would a competent professional choose?

High-yield cue	How to use it
Loading patterns	Practice recognizing when the stem is testing loading patterns and what action follows.
Streaming processing	Practice recognizing when the stem is testing streaming processing and what action follows.
Batch ingestion	Practice recognizing when the stem is testing batch ingestion and what action follows.
Batch transformation	Practice recognizing when the stem is testing batch transformation and what action follows.
Streaming storage	Practice recognizing when the stem is testing streaming storage and what action follows.
Orchestration	Practice recognizing when the stem is testing orchestration and what action follows.

Do not study this domain only by rereading notes. Build small scenarios and ask what the role should do next. The exam is more likely to test a practical decision than a pure definition.
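
As one example of a small scenario for the loading-patterns cue, the sketch below contrasts a full reload with a watermark-based incremental load. It is a plain-Python stand-in, not Fabric or PySpark code: the row layout, the `modified` column, and the function names `full_load` and `incremental_load` are all assumptions chosen for illustration.

```python
from datetime import date


def full_load(source_rows):
    """Full load: the target becomes a fresh copy of every source row."""
    return list(source_rows)


def incremental_load(source_rows, target_rows, watermark):
    """Incremental load: append only rows modified after the watermark.

    Returns the new watermark, which is the latest 'modified' value loaded,
    or the old watermark when nothing new arrived.
    """
    new_rows = [row for row in source_rows if row["modified"] > watermark]
    target_rows.extend(new_rows)
    return max((row["modified"] for row in new_rows), default=watermark)


# Hypothetical source table with a last-modified column.
source = [
    {"id": 1, "modified": date(2026, 5, 1)},
    {"id": 2, "modified": date(2026, 5, 3)},
    {"id": 3, "modified": date(2026, 5, 5)},
]

target = [source[0]]            # row 1 was loaded in an earlier run
watermark = date(2026, 5, 1)    # latest 'modified' value already loaded

watermark = incremental_load(source, target, watermark)
loaded_ids = [row["id"] for row in target]

# A second run with no new source rows loads nothing and keeps the watermark.
watermark_after_rerun = incremental_load(source, target, watermark)
```

After working through a scenario like this, ask what should happen next: for instance, what the load should do if the stem says source rows can be updated or deleted, which a simple append cannot handle.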

Exam-ready mental model

For this section, reduce the material to a repeatable model: cue, authority, action, evidence, and risk. The cue tells you why the question is being asked. The authority is the rule, policy, standard, configuration behavior, official guideline, or operational constraint. The action is what the professional should do next. The evidence is the data point, document, log, calculation, or system state that supports the answer. The risk is what goes wrong if you choose the shortcut.

When reviewing, force yourself to state that model out loud for missed questions. If you can only remember a definition but cannot connect it to an action, the material is not yet exam-ready. If you can name the action but not the authority, you may choose an answer that sounds operationally convenient but violates the official process. If you can name the rule but not the evidence, you may overapply it to the wrong scenario.
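
The five-part model can be kept as a small review card. The sketch below is one possible way to do that, assuming a hypothetical `ReviewCard` class whose fields mirror cue, authority, action, evidence, and risk. The sample card text is made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ReviewCard:
    """One missed question reduced to cue, authority, action, evidence, risk."""
    cue: str = ""
    authority: str = ""
    action: str = ""
    evidence: str = ""
    risk: str = ""

    def missing_parts(self):
        """Return the names of the parts that are still blank, in model order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_exam_ready(self):
        """A card counts as exam-ready only when all five parts are filled in."""
        return not self.missing_parts()


# Made-up example: the action is known, but the authority, evidence,
# and risk have not been written down yet.
card = ReviewCard(
    cue="Stem says only new rows should be loaded each night",
    action="Choose an incremental load",
)

gaps = card.missing_parts()
ready = card.is_exam_ready()
```

A card that lists `authority` among its gaps corresponds to the case described above: you can name the action but cannot yet justify it against the official process.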

How this appears on the exam

The exam usually tests applied judgment. Read the stem for the role, the setting, the governing rule, and the immediate task. Then choose the answer that is most accurate, policy-aligned, and complete for that task. If an answer sounds familiar but ignores the specific cue in the stem, treat it as a distractor. If two answers seem possible, prefer the one that is more specific to the stated task and leaves the cleanest audit trail.

Error-log rule

After each missed question in this area, write one sentence that starts with "I missed this because". Good categories are a misread cue, not knowing the rule, a wrong sequence, a calculation error, an overgeneralized policy, or choosing the faster but less defensible action. Add a second sentence that starts with "Next time I will look for". That second sentence turns the miss into a concrete cue you can recognize later.
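
The error-log rule can be kept in a spreadsheet or on paper. For readers who prefer a script, the sketch below shows one possible structure, assuming hypothetical helpers named `add_miss` and `most_common_category`. The question IDs and sentence endings are invented for illustration.

```python
from collections import Counter

# Categories taken from the error-log rule above.
CATEGORIES = {
    "misread cue",
    "did not know rule",
    "wrong sequence",
    "calculation error",
    "overgeneralized policy",
    "chose faster but less defensible action",
}


def add_miss(log, question_id, category, because, look_for):
    """Append one two-sentence entry to the error log."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    log.append({
        "question_id": question_id,
        "category": category,
        "because": f"I missed this because {because}.",
        "next_time": f"Next time I will look for {look_for}.",
    })


def most_common_category(log):
    """Return the category that appears most often, or None for an empty log."""
    if not log:
        return None
    return Counter(entry["category"] for entry in log).most_common(1)[0][0]


# Invented sample entries.
error_log = []
add_miss(error_log, "q1", "misread cue",
         "I skipped the phrase 'without copying'",
         "wording that rules out moving the data")
add_miss(error_log, "q2", "misread cue",
         "I overlooked the words 'low-latency'",
         "latency requirements in the stem")
add_miss(error_log, "q3", "wrong sequence",
         "I transformed before checking the load pattern",
         "the step the stem says comes first")

top_category = most_common_category(error_log)
```

Reviewing the most frequent category every few sessions shows whether the misses come from reading the stem or from gaps in knowledge.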

Test Your Knowledge

You want Azure SQL data to stay continuously available in OneLake with low-latency replication and without building a custom ETL process. Which Fabric feature should you use?

Mirroring

Item endorsement

Query insights

Domain assignment

Test Your Knowledge

You need to query external Delta data in Fabric without copying or staging it into a new storage location. Which feature is the best fit?

Deployment pipeline

OneLake shortcut

Dynamic data masking

Semantic link
