2.1 Data Roles and Workloads

Key Takeaways

Database administrators (DBAs) own availability, backups, security, and performance of a database system, not the design of analytical models.
Data engineers build and operate data pipelines, ingestion, transformation, and storage layers that feed analysts and scientists.
Data analysts produce reports and visualizations from cleaned data, typically using Power BI, SQL, and DAX.
Data scientists apply statistics and machine learning to build predictive and prescriptive models, often working in Azure Machine Learning or notebooks.
OLTP workloads process many small read/write transactions in near real time; OLAP workloads scan large historical datasets to support analysis and reporting.

Last updated: June 2026

Microsoft Learn frames the DP-900 exam around three ideas: the data, the people who work with it, and the workload that uses it. Before you can match an Azure service to a scenario, you have to recognize who is doing what.

The Four Core Data Roles

Microsoft groups data professionals into four reference roles. Each role has its own typical tasks, tools, and Microsoft certifications.

Role	Primary responsibility	Typical Azure tools	Microsoft cert path
Database Administrator (DBA)	Keep databases available, secure, performant, and recoverable	Azure SQL, SQL Server on Azure VMs, SQL Managed Instance	DP-300
Data Engineer	Build pipelines that ingest, transform, and store data at scale	Azure Data Factory, Synapse Analytics, Microsoft Fabric, Databricks	DP-700
Data Analyst	Model and visualize cleaned data so the business can decide	Power BI, Power BI Service, DAX, SQL	PL-300
Data Scientist	Train and operationalize statistical and machine-learning models	Azure Machine Learning, notebooks, Python/R	DP-100

Database Administrator

The DBA is responsible for the runtime health of a database. Day-to-day work includes installing and patching database engines, creating logins and role assignments, configuring backups and point-in-time restore, tuning indexes, watching for blocking and deadlocks, and meeting SLAs. On Azure, a DBA spends much less time on hardware or OS patching because services such as Azure SQL Database and Azure SQL Managed Instance are platform as a service (PaaS) — Microsoft owns the underlying infrastructure.

Data Engineer

The data engineer builds the plumbing. They design ingestion from operational sources, write transformation logic (T-SQL, PySpark, dataflows), and land data in formats that analysts and scientists can query. In Azure, this work usually happens in Azure Data Factory for orchestration, Azure Synapse Analytics or Microsoft Fabric for storage and compute, and Azure Databricks for large-scale Spark transformations.

Data Analyst

The data analyst turns curated data into answers. They author semantic models, write DAX measures, design dashboards, and tell stories with visuals. In the Microsoft stack, the analyst's home is Power BI Desktop for authoring and the Power BI Service (or Microsoft Fabric) for sharing.

Data Scientist

The data scientist applies statistics and machine learning to predict and prescribe. They explore data in notebooks, engineer features, train models, evaluate accuracy, and deploy models as endpoints. On Azure this lives in Azure Machine Learning and increasingly in Fabric Data Science.

Workload Types: OLTP vs OLAP

A workload is the kind of work a data system is doing. DP-900 cares about two main categories.

Characteristic	OLTP (Transactional)	OLAP (Analytical)
Purpose	Run the business in real time	Understand the business over time
Transactions	Many small reads and writes	Few large reads, mostly aggregations
Latency target	Milliseconds	Seconds to minutes
Schema	Normalized	Denormalized, often star schema
Typical user	Application	Analyst, data scientist, executive
Azure example	Azure SQL Database, Cosmos DB	Synapse dedicated SQL pool, Fabric warehouse

A point-of-sale system writing one sales row per scan is OLTP. The end-of-quarter report that scans 500 million sales rows to compute revenue by region is OLAP.

Batch vs Streaming Ingestion

Independently of OLTP vs OLAP, the way data arrives at a system is either batch or streaming.

Batch processing moves data in large groups on a schedule (every hour, every night). It is latency-tolerant but throughput-friendly. The classic Azure example is an ADF pipeline that copies the previous day's sales files into the warehouse at 2 AM.
Stream processing treats each event as it arrives, often within seconds. Stock ticks, IoT telemetry, and clickstreams are streaming. Azure offers Azure Stream Analytics, Event Hubs, and Microsoft Fabric Real-Time Intelligence for these patterns.

Many real systems combine both: streaming for the live dashboard, batch for the historical warehouse load. Recognizing which side of the line a scenario sits on is a recurring DP-900 question pattern.

How the Roles Hand Off to Each Other

The four roles form a pipeline, and DP-900 often tests the boundary between them. Raw operational data is owned by the DBA, who guarantees it is available and correct. The data engineer picks that data up, moves it into the analytics estate, and shapes it into trusted tables. The data analyst consumes those trusted tables to build reports, and the data scientist consumes them to train models.

A common trap question describes a task that sits on a boundary — "who designs the nightly pipeline?" (engineer, not DBA) or "who builds the Power BI semantic model?" (analyst, not engineer). Match the verb: administer and protect (DBA), ingest and transform (engineer), model and visualize (analyst), predict and prescribe (scientist).

The Five Analytics Maturity Levels

Microsoft Learn frames analytical value along a maturity ladder, and you should be able to order it. Each level answers a different question and is increasingly valuable and complex.

Level	Question it answers	Example
Descriptive	What happened?	Last quarter's total sales
Diagnostic	Why did it happen?	Sales fell because a region underperformed
Predictive	What will happen?	Forecast next quarter's demand
Prescriptive	What should we do?	Recommend reorder quantities
Cognitive	Learn and adapt automatically	A system that retrains itself on new data

Descriptive and diagnostic analytics are typically the analyst's territory; predictive, prescriptive, and cognitive analytics are the data scientist's. A question that asks "which type of analytics recommends an action?" is testing prescriptive; "forecasts a future value?" is predictive.

Where the Workloads Run

Mapping the role-and-workload framing onto Azure: OLTP runs on the Azure SQL family and Cosmos DB; OLAP runs on Synapse, Microsoft Fabric, and Databricks; batch ingestion is orchestrated by Data Factory; streaming flows through Event Hubs, IoT Hub, and Stream Analytics; and the analyst's reporting layer is Power BI. Recognizing both who does the work and which service hosts it is the combined skill the core-concepts domain rewards, because nearly every later exam question is a variation on "this person, doing this workload, should use which Azure service?"

Test Your Knowledge

A retail company needs someone to design ingestion pipelines that move POS data from 1,200 stores into a central analytics platform every night, transforming the data along the way. Which data role best fits this responsibility?

Database administrator

Data analyst

Data engineer

Data scientist

Test Your Knowledge

Which pair of characteristics most clearly identifies an OLTP workload rather than an OLAP workload?

Aggregations over years of history and a denormalized star schema

Many small, low-latency read/write transactions against a normalized schema

Once-daily batch loads into a fact table with bronze, silver, and gold zones

Long-running queries that scan billions of rows for executive dashboards

Up Next

2.2 Relational vs Non-Relational Data

Continue learning

Microsoft Azure Data Fundamentals

Azure DP-900

2.1 Data Roles and Workloads

Key Takeaways

The Four Core Data Roles

Database Administrator

Data Engineer

Data Analyst

Data Scientist

Workload Types: OLTP vs OLAP

Batch vs Streaming Ingestion

How the Roles Hand Off to Each Other

The Five Analytics Maturity Levels

Where the Workloads Run

Microsoft Azure Data Fundamentals

1Chapter 1: Introduction & Exam Overview

2Chapter 2: Core Data Concepts

3Chapter 3: Relational Data on Azure

4Chapter 4: Non-Relational Data on Azure

5Chapter 5: Analytics Workloads on Azure

Azure DP-900

2.1 Data Roles and Workloads

Key Takeaways

The Four Core Data Roles

Database Administrator

Data Engineer

Data Analyst

Data Scientist

Workload Types: OLTP vs OLAP

Batch vs Streaming Ingestion

How the Roles Hand Off to Each Other

The Five Analytics Maturity Levels

Where the Workloads Run