3.2 Discover Data: OneLake Catalog & Real-Time Hub

Key Takeaways

The OneLake catalog is the tenant-wide discovery surface for governed analytical items (lakehouses, warehouses, semantic models, KQL databases) with search, lineage, and endorsement filtering.
The Real-Time hub is the discovery and management surface specifically for streaming and event data: eventstreams, KQL data, Azure event sources, and Fabric events.
Endorsement has two main levels — Promoted (any contributor can apply) and Certified (only authorized reviewers) — and endorsed items surface higher and are visually flagged in discovery.
Discovering and reusing an existing certified semantic model or lakehouse prevents duplicate, ungoverned data sprawl and is the exam-preferred answer over rebuilding.
Sensitivity labels and endorsement are governance signals, not access control; they help users find trusted data but do not by themselves grant or deny permissions.

Last updated: June 2026

Discovery Before You Build

A recurring DP-600 theme: the correct engineering choice is often to find and reuse a governed asset rather than re-ingest the same data. The exam rewards answers that say "check the catalog / Real-Time hub first." Reusing a certified semantic model or an existing lakehouse table avoids duplicate storage, conflicting logic, and ungoverned shadow datasets. When a question offers "build a new lakehouse" against "search the catalog for an existing certified item," the discovery option almost always wins unless the data genuinely does not exist yet.

OneLake Catalog

The OneLake catalog is the tenant-wide place to discover and govern analytical data items you have access to:

Explore tab: search and browse lakehouses, warehouses, semantic models, KQL databases, and other items across workspaces you can see.
Govern tab: surfaces governance posture (sensitivity labels, endorsement, refresh status, and item insights) to help data owners improve trust.
Lineage: shows upstream sources and downstream dependents so you understand impact before changing or reusing an item.
Filters: narrow by item type, workspace, endorsement, and sensitivity label so you can quickly isolate, say, only Certified semantic models.

The catalog respects permissions: it surfaces items you are entitled to see, and endorsement raises an item's ranking but never overrides access control.
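The interplay between permissions, filters, and endorsement ranking can be sketched in a few lines of Python. This is an illustrative in-memory model, not the Fabric API: the CatalogItem class, the visible_to field, the discover function, and the numeric ranks are all assumptions made for the example.

```python
from dataclasses import dataclass

# Higher number = surfaced earlier in results (illustrative ordering only).
ENDORSEMENT_RANK = {"None": 0, "Promoted": 1, "Certified": 2}


@dataclass(frozen=True)
class CatalogItem:
    name: str
    item_type: str         # e.g. "SemanticModel", "Lakehouse"
    endorsement: str       # "None", "Promoted", or "Certified"
    visible_to: frozenset  # hypothetical set of users entitled to see the item


def discover(items, user, item_type=None, endorsement=None):
    """Return the items `user` may see, filtered, with endorsed items first."""
    # Permission check comes first: endorsement never widens visibility.
    visible = [i for i in items if user in i.visible_to]
    if item_type is not None:
        visible = [i for i in visible if i.item_type == item_type]
    if endorsement is not None:
        visible = [i for i in visible if i.endorsement == endorsement]
    # Sort by endorsement rank (descending), then by name for stable output.
    return sorted(visible, key=lambda i: (-ENDORSEMENT_RANK[i.endorsement], i.name))


items = [
    CatalogItem("Sales scratch", "SemanticModel", "None", frozenset({"ana"})),
    CatalogItem("Revenue", "SemanticModel", "Certified", frozenset({"ana", "ben"})),
    CatalogItem("Finance", "SemanticModel", "Certified", frozenset({"ben"})),
    CatalogItem("Orders", "Lakehouse", "Promoted", frozenset({"ana"})),
]

print([i.name for i in discover(items, "ana")])
print([i.name for i in discover(items, "ana", "SemanticModel", "Certified")])
```

In this sketch the certified "Finance" model never appears for "ana", because the permission filter runs before any ranking is applied.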

Real-Time Hub

The Real-Time hub is the dedicated discovery and management surface for streaming and event data — the equivalent of the catalog for data in motion:

Browse and connect to eventstreams, KQL databases/eventhouse data, Fabric events (workspace and job events), and Azure event sources (Event Hubs, IoT Hub, blob storage events).
Preview streams, create eventstreams, and route events to destinations such as an eventhouse or lakehouse.
It is where you go when the scenario mentions streaming, telemetry, IoT, sensors, or events and you need to find or wire up a real-time source.

Catalog vs. Real-Time Hub

Need	Use
Find an existing lakehouse, warehouse, or semantic model to reuse	OneLake catalog
Inspect lineage and endorsement before changing a model	OneLake catalog
Discover available event streams or telemetry sources	Real-Time hub
Connect to Azure Event Hubs / IoT Hub and route events	Real-Time hub
Find a certified dataset for self-service reporting	OneLake catalog (filter by Certified)
Monitor and subscribe to Fabric job/workspace events	Real-Time hub

Trap: a streaming or telemetry scenario that offers "search the OneLake catalog" as the discovery answer is steering you wrong — data in motion is the Real-Time hub's job, and governed analytical tables are the catalog's.
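As a memory aid, the routing rule in the table can be written as a small keyword check. This is only a rough heuristic under stated assumptions: the choose_surface function and the keyword list are invented for illustration and are not part of any Fabric tooling.

```python
import re

# Keywords drawn from the scenarios above; the list is illustrative, not exhaustive.
STREAMING_HINTS = {
    "stream", "streams", "streaming", "eventstream", "eventstreams",
    "telemetry", "iot", "sensor", "sensors", "event", "events",
}


def choose_surface(scenario: str) -> str:
    """Pick the discovery surface suggested by the wording of a scenario."""
    words = set(re.findall(r"[a-z]+", scenario.lower()))
    # Any data-in-motion keyword points to the Real-Time hub.
    if words & STREAMING_HINTS:
        return "Real-Time hub"
    # Everything else is treated as governed analytical data at rest.
    return "OneLake catalog"


print(choose_surface("Find a certified semantic model for self-service reporting"))
print(choose_surface("Connect to Azure Event Hubs / IoT Hub and route events"))
```

Real exam questions need more judgment than keyword matching, but the default direction is the same: data in motion goes to the Real-Time hub, data at rest to the catalog.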

Lineage and Impact Analysis

Before you reuse or modify a discovered item, DP-600 expects you to check lineage. Lineage shows the chain from sources through dataflows, lakehouses, warehouses, and semantic models out to reports. Two practical uses on the exam:

Impact analysis — before changing a shared semantic model, lineage shows which downstream reports and apps would break, so you can warn owners.
Root-cause tracing — when a report shows stale numbers, lineage walks upstream to the dataflow or lakehouse that did not refresh.

This is why "reuse the existing certified model" is safe: lineage proves what feeds it and what depends on it, so you are not building blind on duplicated, ungoverned data.
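Both uses of lineage amount to walking a dependency graph, downstream for impact analysis and upstream for root-cause tracing. The sketch below shows that idea with a breadth-first walk over a hand-written graph; the item names, the FEEDS dictionary, and the helper functions are hypothetical and do not come from Fabric.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the items in its list.
FEEDS = {
    "SQL source": ["Sales dataflow"],
    "Sales dataflow": ["Sales lakehouse"],
    "Sales lakehouse": ["Revenue model"],
    "Revenue model": ["Exec report", "Regional report"],
}


def reachable(graph, start):
    """Breadth-first walk returning every node reachable from `start`, in visit order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(graph):
    """Flip edge direction so the same walk can go upstream."""
    inverted = {}
    for src, targets in graph.items():
        for tgt in targets:
            inverted.setdefault(tgt, []).append(src)
    return inverted


# Impact analysis: what depends on the semantic model?
print(reachable(FEEDS, "Revenue model"))
# Root-cause tracing: what does a stale report depend on?
print(reachable(invert(FEEDS), "Exec report"))
```

The first call lists the reports that would be affected by a model change; the second walks back from a stale report to every upstream item that might have failed to refresh.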

Endorsement: Surfacing Trusted Data

Endorsement is how organizations signal which items are trustworthy. It directly affects discovery ranking and the badges users see, and DP-600 tests the boundary between endorsement and security.

Level	Who can apply	Meaning	Discovery effect
None	n/a	Unendorsed / personal	Lower visibility
Promoted	Any item contributor/owner	"This is ready to use"	Promoted badge, surfaced higher
Certified	Only users authorized by the tenant/domain admin	"Org-reviewed, authoritative"	Certified badge, surfaced highest
Master data (where enabled)	Authorized reviewers	Canonical reference data	Highlighted as canonical

Key exam distinctions you must keep straight:

Endorsement is a trust signal, not access control. A certified semantic model still requires permissions to actually consume; certification does not grant anyone access to the data.
Promoted is self-service; Certified is gated. Anyone who can edit an item can promote it; only reviewers authorized in the tenant or domain settings can certify it. That restriction is exactly what makes Certified meaningful — it implies a review happened.
Endorsement is not deployment and not a sensitivity label. Endorsement surfaces trust in discovery; deployment pipelines move content between Dev/Test/Prod stages; sensitivity labels classify content and can drive encryption/protection. The exam loves to blur these three in distractors.

Three Concepts the Exam Conflates

Concept	Purpose	Grants access?
Endorsement	Signals trust, ranks in discovery	No
Sensitivity label	Classifies data, can enforce protection	No (it can only restrict, for example through protection or DLP policies)
Workspace/item roles	Actual permissions to view/edit	Yes

When a scenario asks how to make a governed dataset easy for analysts to find and trust, the answer is to endorse it — Certify it if a formal review is required — so it ranks and badges appropriately in the OneLake catalog, not to copy it into every workspace. When the scenario asks how to let analysts actually open it, the answer is permissions, not endorsement.
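The separation between trust signals and permissions can be made concrete with a toy model in which the access check consults role assignments only. Everything here is an assumption for illustration: the Item class, the can_open function, and the sample label names are hypothetical, and the extra restrictions a sensitivity label can impose are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    endorsement: str = "None"                  # trust signal only
    sensitivity_label: str = "General"         # classification only
    roles: dict = field(default_factory=dict)  # user -> role: the actual permissions


def can_open(item: Item, user: str) -> bool:
    # Only role assignments are consulted; endorsement and label are ignored on purpose.
    return user in item.roles


model = Item(
    "Revenue model",
    endorsement="Certified",
    sensitivity_label="Confidential",
    roles={"reviewer": "Contributor"},
)

print(can_open(model, "analyst"))  # False: certified, but no role assigned
model.roles["analyst"] = "Viewer"
print(can_open(model, "analyst"))  # True: access came from the role, not the badge
```

Changing the endorsement or the label in this sketch never changes the result of can_open, which is the point the exam distractors try to blur.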

Who Configures Certification

Certification is gated on purpose. A tenant administrator (or a delegated domain administrator in a data-mesh setup) decides which security groups are allowed to certify items, and only those authorized users see the Certified option. This is a frequent exam detail: if a question asks why an item owner cannot certify their own dataset, the reason is that they are not on the authorized certifier list — promotion is open to any contributor, but certification is restricted by admin policy.

Master-data endorsement, where enabled, is similarly restricted to authorized reviewers and flags an item as the single canonical source for a reference entity such as a corporate product or chart-of-accounts list.
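The "who can apply which badge" rule can be sketched as a single check. This is a simplified model under stated assumptions: the can_apply function and the "data-stewards" group are hypothetical, the same authorized group is reused for Certified and Master data for brevity, and any further requirements for certifiers (such as edit rights on the item) are not modelled.

```python
# Hypothetical security groups that an admin has authorized to certify items.
AUTHORIZED_CERTIFIER_GROUPS = {"data-stewards"}


def can_apply(endorsement, can_edit_item, user_groups):
    """Return True if the user may apply the given endorsement level."""
    if endorsement == "Promoted":
        # Self-service: anyone who can edit the item may promote it.
        return can_edit_item
    if endorsement in ("Certified", "Master data"):
        # Gated: the user must belong to an admin-authorized group.
        return bool(AUTHORIZED_CERTIFIER_GROUPS & set(user_groups))
    raise ValueError(f"unknown endorsement level: {endorsement}")


owner_groups = {"sales-team"}
print(can_apply("Promoted", True, owner_groups))        # owner can promote
print(can_apply("Certified", True, owner_groups))       # owner cannot certify
print(can_apply("Certified", True, {"data-stewards"}))  # authorized reviewer can
```

The second call mirrors the exam scenario: the item owner can edit and promote the dataset but is not in an authorized group, so the Certified option is unavailable.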

Test Your Knowledge

A data analyst is about to build a new lakehouse and pipeline to calculate company-wide revenue. Before starting, what is the BEST first step according to Fabric data-governance practice?

Immediately create the lakehouse so work is not delayed

Search the OneLake catalog for an existing certified semantic model or lakehouse that already provides governed revenue data

Open the Real-Time hub to find a streaming revenue source

Request workspace Admin so the new lakehouse can be certified later

Test Your Knowledge

A reviewer applies the Certified endorsement to a semantic model so analysts trust it. An analyst still cannot open the model. What is the correct explanation?

Certification automatically grants read access, so the analyst must wait for propagation

Endorsement is a trust and discovery signal only; the analyst still needs item or workspace permissions to consume the model

Certified models can only be opened by the reviewer who certified them

The model must also be Promoted before Certified takes effect

Up Next

3.3 Choose the Right Store: Lakehouse vs Warehouse vs Eventhouse

Exam DP-600: Implementing Analytics Solutions Using Microsoft Fabric
