Azure DP-700 Cheat Sheet

Quick Facts

Exam: DP-700
Credential: Fabric Data Engineer
Time: 100 min
Pass: 700/1000
Level: Intermediate
Product: Microsoft Fabric
Role: Data Engineer
Blueprint: April 20, 2026

Domain Balance

Manage, Ingest, Monitor split evenly

Manage: secure; Ingest: load; Monitor: tune

Pipeline vs Notebook

Pipeline

  • Orchestrates activities
  • Copy data
  • Schedules runs

Notebook

  • Transforms data
  • Spark session
  • Code execution

Coordinate vs compute

Security Picker

  1. Workspace administration: Admin (Control plane)
  2. Create content: Contributor (Workspace role)
  3. Read workspace only: Viewer (No write)
  4. Secure lake data: OneLake roles (Data plane)
  5. Filter rows: RLS (User context)
  6. Mask values: Dynamic masking (Sensitive data)

Workspace Settings

Spark pool: Workspace compute default
Starter pools: Faster session startup
High concurrency: Shared notebook session
Resource profile: Read/write tuning preset
Item compute: Per-item Spark sizing
Domain: Business grouping
OneLake cache: Shortcut read acceleration
Dataflow scale: Gen2 compute setting

Security Plane

Workspace acts; OneLake reads

Workspace: actions; Item: sharing; OneLake: data; RLS: rows

Workspace vs OneLake Roles

Workspace role

  • Control plane
  • Item visibility
  • Content actions

OneLake role

  • Data plane
  • Folder/table access
  • Viewer data access

Actions vs data

Lifecycle

Git integration: Branch-based versioning
Azure DevOps: Supported Git provider
GitHub: Supported Git provider
Database project: Warehouse schema source
Deployment pipeline: Dev-test-prod promotion
Deployment rules: Stage-specific bindings
Credentials: Not copied
Data refresh: Required after deploy
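
The lifecycle notes above (definitions promote, credentials and data do not, rules rebind per stage) can be pictured with a toy model. This is only a sketch in plain Python, not a Fabric API: `deploy`, the `rules` dictionary, and the item fields are hypothetical names chosen for illustration.

```python
def deploy(item_definition, stage, rules):
    """Toy promotion: copy metadata to a stage and apply that stage's rules.

    All names here are hypothetical; this does not call any Fabric API.
    """
    # Only the definition travels; data and credentials stay behind.
    promoted = {k: v for k, v in item_definition.items()
                if k not in ("data", "credentials")}
    # Deployment rules: stage-specific bindings override promoted values.
    promoted.update(rules.get(stage, {}))
    # The target stage has no data yet, so a refresh is still needed.
    promoted["needs_refresh"] = True
    return promoted


dev_item = {
    "name": "sales_model",
    "source": "dev-lakehouse",
    "data": [1, 2, 3],
    "credentials": "dev-secret",
}
rules = {
    "test": {"source": "test-lakehouse"},
    "prod": {"source": "prod-lakehouse"},
}
prod_item = deploy(dev_item, "prod", rules)
```

In this model `prod_item` carries the name and the prod binding but neither the dev data nor the dev credential, which is the behavior the cheat sheet asks you to remember.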

Security + Governance

Admin: Full workspace control
Member: Manage content/users
Contributor: Create/edit items
Viewer: Read workspace content
Item sharing: Item-level access
OneLake roles: Data-plane access
DefaultReader: ReadAll data access
Sensitivity label: Information protection
Endorsement: Trusted content signal
Audit logs: Activity trail

Access Controls

RLS: Row filter
CLS: Column filter
OLS: Object hiding
Folder security: Path-level access
File security: Object-level access
Dynamic masking: Sensitive value masking
Entra ID: Identity provider
Service principal: App identity
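
To make the difference between a row filter and value masking concrete, here is a minimal sketch in plain Python. It is an illustration of the two ideas only, not how Fabric implements them: the `User` fields, `mask_email`, and `query_orders` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    region: str        # hypothetical attribute the row filter checks
    can_unmask: bool   # hypothetical stand-in for an unmask permission


ROWS = [
    {"order_id": 1, "region": "EU", "email": "ana@example.com"},
    {"order_id": 2, "region": "US", "email": "bob@example.com"},
    {"order_id": 3, "region": "EU", "email": "cy@example.com"},
]


def mask_email(value: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = value.partition("@")
    return f"{local[:1]}***@{domain}"


def query_orders(user: User, rows=ROWS):
    # RLS: which rows come back depends on who is asking (user context).
    visible = [r for r in rows if r["region"] == user.region]
    # Dynamic masking: rows are still returned, but sensitive values are obscured.
    if user.can_unmask:
        return [dict(r) for r in visible]
    return [{**r, "email": mask_email(r["email"])} for r in visible]


eu_analyst = User("eu_analyst", "EU", can_unmask=False)
result = query_orders(eu_analyst)
```

RLS removes rows from the result; masking keeps the row and hides the value. The two are independent and can be combined, as the sketch does.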

Orchestration

Pipeline: Activity orchestration
Dataflow Gen2: Low-code transformation
Notebook: Code transformation
Schedule: Time-based trigger
Event trigger: Event-based run
Parameter: Runtime input
Expression: Dynamic value
Notebook activity: Pipeline-called code

Tool Stack

Flow, Pipe, Note, SQL, KQL

Flow: low-code; Pipe: orchestrate; Note: Spark; SQL/KQL: query

Lakehouse vs Warehouse

Lakehouse

  • Delta files
  • Spark friendly
  • Flexible schema

Warehouse

  • Relational SQL
  • T-SQL serving
  • Structured modeling

Files vs SQL

Transformation Picker

  1. Low-code shaping: Dataflow Gen2 (Power Query)
  2. Custom Spark logic: Notebook (PySpark)
  3. Warehouse SQL: T-SQL (Relational)
  4. Eventhouse query: KQL (Real-time)
  5. Move then transform: Pipeline (Activities)
  6. Reusable code job: Spark job (Batch)

Stores + Patterns

Lakehouse: Delta lake analytics
Warehouse: Relational SQL analytics
Eventhouse: Real-time KQL store
KQL database: Time-series analytics
Semantic model: BI consumption layer
Full load: Reload everything
Incremental load: Load changes only
Streaming load: Continuous event ingestion

Streaming Path

Events flow, land, query, window

Eventstream: flow; Eventhouse: land; KQL: query; Window: time

Dataflow vs Notebook

Dataflow Gen2

  • Low-code
  • Power Query
  • Many connectors

Notebook

  • Code-first
  • PySpark/Scala
  • Custom logic

Low-code vs code

Storage Picker

  1. Delta analytics: Lakehouse (Files/tables)
  2. SQL serving: Warehouse (Relational)
  3. Telemetry analytics: Eventhouse (KQL)
  4. External data: Shortcut (No copy)
  5. Operational replica: Mirroring (Copied changes)
  6. Report layer: Semantic model (BI)

Batch Tools

Copy activity: Move source data
Dataflow Gen2: Power Query shaping
Notebook: PySpark/custom code
T-SQL: Warehouse transformations
KQL: Eventhouse transformations
Shortcut: Virtual data reference
Mirroring: Near-real-time replica
Fast copy: Dataflow ingestion boost

Shortcut vs Mirroring

Shortcut

  • Virtual pointer
  • No copy
  • Target permissions

Mirroring

  • Replicated data
  • Source changes
  • Fabric copy

Reference vs replicate

Transforms + Modeling

Denormalize: Flatten for analytics
Fact table: Measures/events
Dimension table: Descriptive context
Star schema: Facts plus dimensions
Aggregate: Summarize groups
Surrogate key: Warehouse identifier
Late dimension: Fact arrives first
Inferred member: Temporary dimension row
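
The late-dimension and inferred-member entries above describe one pattern: when a fact references a dimension key that has not arrived yet, insert a placeholder dimension row so the fact can still get a surrogate key. A minimal sketch in plain Python, with hypothetical names (`load_facts`, `apply_dimension_update`, the `inferred` flag):

```python
UNKNOWN = "Unknown"


def load_facts(facts, dim_customer):
    """Attach surrogate keys to facts, creating inferred members as needed.

    dim_customer maps business key -> {"sk": int, "name": str, "inferred": bool}.
    """
    next_sk = max((d["sk"] for d in dim_customer.values()), default=0) + 1
    loaded = []
    for fact in facts:
        key = fact["customer_id"]
        if key not in dim_customer:
            # Fact arrived first: add a temporary (inferred) dimension row.
            dim_customer[key] = {"sk": next_sk, "name": UNKNOWN, "inferred": True}
            next_sk += 1
        loaded.append({"customer_sk": dim_customer[key]["sk"],
                       "amount": fact["amount"]})
    return loaded


def apply_dimension_update(dim_customer, customer_id, name):
    """When the real dimension record arrives, fill in the inferred row in place."""
    row = dim_customer[customer_id]
    row.update(name=name, inferred=False)  # the surrogate key does not change


dim = {"C1": {"sk": 1, "name": "Contoso", "inferred": False}}
facts = [{"customer_id": "C1", "amount": 10},
         {"customer_id": "C2", "amount": 5}]
loaded = load_facts(facts, dim)
apply_dimension_update(dim, "C2", "Fabrikam")
```

Because the surrogate key is assigned when the inferred row is created, facts loaded earlier keep pointing at the right dimension row after the real attributes arrive.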

Eventstream vs Eventhouse

Eventstream

  • Ingest events
  • Route/process
  • Stream pipeline

Eventhouse

  • Store events
  • KQL query
  • Real-time analytics

Flow vs store

Data Quality

Duplicate rows: Deduplicate by key
Missing data: Impute or reject
Late events: Handle watermark lag
Schema drift: Source shape changed
Checkpoint: Streaming resume state
Watermark: Late-data boundary
CDC: Change data capture
Idempotent load: Safe rerun
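
Two of the entries above, deduplicating by key and making a load safe to rerun, fit in a few lines. This is a sketch in plain Python under assumed column names (`id`, `updated_at`); in Fabric the same idea would more likely be expressed as a Delta or T-SQL merge.

```python
def deduplicate_by_key(rows, key="id", order_by="updated_at"):
    """Keep one row per key: the one with the greatest order_by value."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


def idempotent_upsert(target, batch, key="id"):
    """Merge a batch into target (a dict keyed by id).

    Replaying the same batch leaves target unchanged, which is what makes
    the load safe to rerun after a failure.
    """
    for row in deduplicate_by_key(batch, key=key):
        target[row[key]] = row
    return target


batch = [
    {"id": 1, "updated_at": 1, "value": "a"},
    {"id": 1, "updated_at": 2, "value": "b"},   # newer duplicate of id 1
    {"id": 2, "updated_at": 1, "value": "c"},
]
table = {}
idempotent_upsert(table, batch)
snapshot = dict(table)
idempotent_upsert(table, batch)   # rerun: no change
```

An append-only load would have doubled the rows on the second run; keying the write on `id` is what removes that risk.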

Full vs Incremental

Full load

  • Reload all
  • Simple logic
  • Longer window

Incremental

  • Changed rows
  • Needs watermark
  • Shorter window

All vs changes
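
The "needs watermark" point can be sketched as follows: store the highest change timestamp already loaded, and on the next run read only rows past it. Plain Python with assumed column names (`modified_at`, `id`); `incremental_load` is a hypothetical helper, not a Fabric function.

```python
def incremental_load(source_rows, target, watermark,
                     ts_col="modified_at", key="id"):
    """Load only rows changed after the stored watermark.

    Returns the new watermark to persist for the next run.
    """
    changed = [r for r in source_rows if r[ts_col] > watermark]
    for row in changed:
        target[row[key]] = row   # upsert by key so reruns are safe
    # Advance the watermark only as far as the data actually loaded.
    return max((r[ts_col] for r in changed), default=watermark)


source = [{"id": 1, "modified_at": 5}, {"id": 2, "modified_at": 10}]
target = {}
wm = incremental_load(source, target, watermark=0)       # first run loads both rows

source[0] = {"id": 1, "modified_at": 15}                 # row 1 changes
source.append({"id": 3, "modified_at": 12})              # row 3 is new
wm2 = incremental_load(source, target, watermark=wm)     # second run loads only the changes
```

A full load would skip the filter and rewrite every row each time: simpler logic, but a longer load window.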

Streaming Tools

Eventstream: Event routing/processing
Eventhouse: Real-time analytics store
Native table: KQL-owned storage
Shortcut table: Referenced OneLake data
Query acceleration: Shortcut query boost
Spark streaming: Micro-batch processing
Window function: Time-bucket computation
KQL: Streaming query language
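
A window function groups events into time buckets, and a watermark decides how late an event may arrive and still be counted. The sketch below is a simplified model in plain Python; real engines such as Spark Structured Streaming or KQL apply their own, more detailed rules. `tumbling_counts` and its parameters are hypothetical.

```python
from collections import defaultdict


def tumbling_counts(events, window_seconds=60, allowed_lateness=30):
    """Count events per fixed time bucket, dropping events behind the watermark.

    events: iterable of (event_time_seconds, payload) in arrival order.
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = float("-inf")
    for event_time, payload in events:
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - allowed_lateness   # late-data boundary
        if event_time < watermark:
            dropped.append((event_time, payload))       # too late to count
            continue
        bucket_start = (event_time // window_seconds) * window_seconds
        counts[bucket_start] += 1
    return dict(counts), dropped


events = [(5, "a"), (65, "b"), (50, "c"), (130, "d"), (70, "e")]
counts, dropped = tumbling_counts(events)
```

Here the event at second 50 arrives out of order but within the allowed lateness, so it is still counted in the first bucket; the event at second 70 arrives after the watermark has moved to 100 and is dropped.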

Tune Order

Prune, compact, cache, scale

Prune scans; Compact files; Cache hot; Scale compute

OPTIMIZE vs V-Order

OPTIMIZE

  • Compacts files
  • Delta maintenance
  • Write cleanup

V-Order

  • Read layout
  • Parquet optimization
  • Power BI speed

Compact vs layout
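
Compaction is about file counts: many small files are rewritten as fewer, larger ones so readers open fewer files. The sketch below models only that idea with a greedy grouping in plain Python. It is not the algorithm Delta's OPTIMIZE uses, and the 128 MB target is an arbitrary assumption for the example.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Greedily group small files into bins of at most roughly target_mb each.

    Illustrative only; the real engine chooses its own targets and strategy.
    """
    bins, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb):
        if current and current_size + size > target_mb:
            bins.append(current)           # close the current output file
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        bins.append(current)
    return bins


plan = plan_compaction([10, 20, 30, 40, 50, 100])
```

Six input files become three output files in this plan. V-Order is a separate concern: it changes how data is laid out inside each Parquet file rather than how many files there are.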

Troubleshooting Picker

  1. Pipeline failed: Run history (Activities)
  2. Dataflow failed: Refresh details (Mashup)
  3. Notebook failed: Spark logs (Driver)
  4. Shortcut fails: Credentials (Target access)
  5. Model stale: Refresh history (Semantic)
  6. Capacity saturated: Metrics app (Utilization)

Monitor + Alerts

Monitoring hub: Fabric run status
Pipeline run: Activity execution
Dataflow refresh: Gen2 execution
Notebook run: Spark execution
Semantic refresh: Model data update
Alerts: Condition notifications
Capacity metrics: Resource utilization
Audit events: User activity

Error Triage

Pipeline error: Activity failure
Dataflow error: Mashup/load issue
Notebook error: Spark/code failure
Eventhouse error: KQL store issue
Eventstream error: Event route issue
T-SQL error: SQL statement issue
Shortcut error: Target/credential issue
Refresh error: Semantic update failure

Performance Tuning

OPTIMIZE: Compact Delta files
V-Order: Parquet read optimization
Partitioning: Prune scanned data
Z-order: Cluster related values
Statistics: Query planning metadata
Indexes: Warehouse access paths
Materialization: Precomputed results
Query acceleration: KQL shortcut speed

Query + Spark

Predicate pushdown: Filter near source
Projection pruning: Read needed columns
Broadcast join: Small table join
Shuffle: Network data exchange
Skew: Uneven partition load
Caching: Reuse hot data
Concurrency: Parallel user pressure
Explain plan: Query execution shape
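
Three of these terms (predicate pushdown, projection pruning, broadcast join) can be shown together in a small single-process model. This is plain Python for intuition only; in Spark the optimizer applies these techniques for you, and `scan` and `broadcast_join` here are hypothetical helpers.

```python
def scan(rows, predicate=None, columns=None):
    """Read rows, applying the filter and column selection during the scan."""
    for row in rows:
        if predicate is None or predicate(row):        # predicate pushdown
            if columns:
                yield {c: row[c] for c in columns}     # projection pruning
            else:
                yield dict(row)


def broadcast_join(large_rows, small_rows, key):
    """Hash the small side once, then stream the large side past it.

    Every worker would hold its own copy of `lookup`, so the large side
    never has to be shuffled across the network.
    """
    lookup = {r[key]: r for r in small_rows}           # the "broadcast" copy
    for row in large_rows:
        match = lookup.get(row[key])
        if match is not None:
            yield {**row, **match}


sales = [
    {"sale_id": 1, "store_id": "A", "amount": 10, "note": "x"},
    {"sale_id": 2, "store_id": "B", "amount": 200, "note": "y"},
    {"sale_id": 3, "store_id": "A", "amount": 300, "note": "z"},
]
stores = [{"store_id": "A", "city": "Oslo"}, {"store_id": "B", "city": "Rome"}]

big_sales = scan(sales, predicate=lambda r: r["amount"] >= 100,
                 columns=["store_id", "amount"])
joined = list(broadcast_join(big_sales, stores, "store_id"))
```

Filtering and trimming columns before the join means less data reaches it, and broadcasting the small table avoids the shuffle that a regular join of two large tables would need.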

Common Traps

Workspace vs data access

Viewer sees item; OneLake role reads data

Deployment copies metadata

Definitions promote; data stays behind

Shortcut vs copy

Shortcut references target; mirroring replicates changes

Pipeline vs transform

Pipeline orchestrates; notebook transforms

Low-code vs custom

Dataflow uses Power Query; notebook uses code

Streaming resume

Checkpoint preserves state; watermark bounds lateness

Telemetry store

Eventstream routes events; Eventhouse stores events

Optimization scope

OPTIMIZE compacts files; V-Order improves reads

Last Minute

  1. Weights: three equal 30-35% bands
  2. Know SQL, PySpark, KQL
  3. Workspace roles govern actions
  4. OneLake roles govern data
  5. Deployment pipelines skip data
  6. Credentials remain stage-specific
  7. Shortcuts reference; mirroring copies
  8. Dataflow low-code; notebooks code
  9. Pipeline orchestrates; notebook computes
  10. Eventstream flows; Eventhouse stores
  11. Incremental loads need watermarks
  12. OPTIMIZE compacts Delta files