1.3 Microsoft Fabric Platform Fundamentals
Key Takeaways
- OneLake is the single, tenant-wide logical data lake. It stores tabular data in the open Delta/Parquet format: one copy of data, no silos, like OneDrive for data.
- Workspaces are collaboration containers that hold Fabric items and are the unit of capacity assignment, security roles, and Git/deployment lifecycle.
- Capacities are purchased pools of compute (Fabric SKUs, e.g., F2, F64) that power all workloads in their assigned workspaces.
- Fabric is a SaaS platform: lakehouse, warehouse, eventhouse, semantic model, notebook, pipeline, and report are all first-class items in one product.
- Shortcuts virtualize external or cross-workspace data into OneLake without copying it, enabling one logical lake across sources.
Fabric Is One SaaS Platform
Quick Answer: Microsoft Fabric is a software-as-a-service (SaaS) analytics platform. Every analytics artifact — lakehouse, warehouse, eventhouse, semantic model, notebook, data pipeline, and report — is a typed item that lives in a workspace, stores data in OneLake, and runs on a purchased capacity. There are no separate servers to provision.
Understanding this architecture is foundational because nearly every DP-600 question assumes you know where data physically lives and which item a task belongs in.
OneLake: One Copy of Data
OneLake is the single, tenant-wide logical data lake automatically provisioned with Fabric. Key properties tested on the exam:
- One per tenant. Like OneDrive for files, every organization gets exactly one OneLake. It removes data silos.
- Open format. Tabular data is stored in Delta/Parquet, so it is readable by Spark, T-SQL, KQL, and Direct Lake semantic models without conversion.
- Hierarchical addressing. Data is addressed by tenant > workspace > item > folder/table.
- Shortcuts. A shortcut is a virtual reference that surfaces data from external storage (Amazon S3, Azure Data Lake Storage) or from another workspace inside OneLake without copying it. This is how Fabric keeps a single logical lake across many physical sources.
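The hierarchical addressing described above can be sketched as a small path builder. This is a minimal illustration, assuming the ABFS-style OneLake URI form `abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<itemtype>/<path>`. The function name `onelake_path` and the workspace, lakehouse, and table names are hypothetical. The tenant does not appear in the URI because, as far as this sketch assumes, it is implied by the signed-in identity.

```python
# Hypothetical helper: composes an ABFS-style OneLake URI from the
# workspace > item > folder/table hierarchy. Names are illustrative only.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_path(workspace: str, item: str, item_type: str, *parts: str) -> str:
    """Return a URI of the form abfss://workspace@host/item.type/folder/table."""
    # Join the optional folder/table segments, ignoring stray slashes.
    relative = "/".join(p.strip("/") for p in parts if p)
    uri = f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}"
    return f"{uri}/{relative}" if relative else uri


# Example: a managed table named "orders" in a lakehouse (all names assumed).
orders_table = onelake_path(
    "SalesWorkspace", "SalesLakehouse", "Lakehouse", "Tables", "orders"
)
print(orders_table)
```

Because a shortcut appears as an ordinary folder or table under an item, a path to shortcut data would take the same shape as a path to data stored natively in OneLake.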
Workspaces: The Collaboration and Governance Unit
A workspace is the container that holds Fabric items and is the boundary for collaboration, security, capacity, and lifecycle:
- Items (lakehouse, warehouse, semantic model, report, etc.) are created inside a workspace.
- Workspace roles (Admin, Member, Contributor, Viewer) grant baseline access to its contents.
- A workspace is assigned to one capacity, which supplies its compute.
- Git integration and deployment pipelines operate at the workspace level — making the workspace the natural dev/test/prod boundary.
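The containment rules above can be pictured with a toy model. This is only a conceptual sketch, not a Fabric API: the `Workspace` class, its fields, and the sample names are hypothetical, and the role check simply mirrors the four role names listed above without modeling what each role is allowed to do.

```python
from dataclasses import dataclass, field

# The four workspace roles named in the text.
ROLES = ("Admin", "Member", "Contributor", "Viewer")


@dataclass
class Workspace:
    """Toy model: a workspace has one capacity, many items, and role grants."""
    name: str
    capacity: str  # exactly one capacity supplies the compute
    items: dict[str, str] = field(default_factory=dict)  # item name -> item type
    roles: dict[str, str] = field(default_factory=dict)  # user -> workspace role

    def add_item(self, name: str, item_type: str) -> None:
        self.items[name] = item_type

    def grant(self, user: str, role: str) -> None:
        if role not in ROLES:
            raise ValueError(f"unknown workspace role: {role}")
        self.roles[user] = role


# Separate dev and prod workspaces hold the same item definitions,
# which is the boundary that Git integration and deployment pipelines work across.
dev = Workspace("Sales-Dev", capacity="F2")
prod = Workspace("Sales-Prod", capacity="F64")
for ws in (dev, prod):
    ws.add_item("SalesLakehouse", "Lakehouse")
prod.grant("analyst@contoso.com", "Viewer")
```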
Capacities: The Compute Engine
A capacity is the pool of purchased compute that powers all workloads in its assigned workspaces. Capacities are sold as Fabric SKUs in the F series (for example, F2 at the small end and larger sizes such as F64). All Fabric engines (Spark, SQL, KQL, and semantic model query) draw from the same capacity, so capacity sizing affects every workload, not just one engine.
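As a rough sizing aid, the sketch below assumes that the number in an F SKU name equals its capacity units (CUs), so F2 provides 2 CUs and F64 provides 64. The helper name `capacity_units` is hypothetical, and the code only parses SKU names; it does not query Fabric.

```python
import re


def capacity_units(sku: str) -> int:
    """Return the capacity units implied by an F SKU name, e.g. 'F64' -> 64.

    Assumption: the numeric suffix of an F SKU is its CU count.
    """
    match = re.fullmatch(r"F(\d+)", sku.strip().upper())
    if match is None:
        raise ValueError(f"not an F-series SKU name: {sku!r}")
    return int(match.group(1))


# Compare the two SKUs mentioned in the text.
small, large = capacity_units("F2"), capacity_units("F64")
print(f"F64 offers {large // small}x the capacity units of F2")
```

Because every engine draws from this shared pool, the ratio applies to the total compute available to all workloads in the assigned workspaces, not to any single engine.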
Item Types You Must Recognize
| Item | Purpose | Primary Query Language |
|---|---|---|
| Lakehouse | Open Delta/Parquet file and table analytics, Spark-friendly | Spark; T-SQL (read-only, via the SQL analytics endpoint) |
| Warehouse | Relational T-SQL data warehousing, full read/write SQL | T-SQL |
| Eventhouse / KQL DB | Real-time and time-series telemetry analytics | KQL |
| Semantic model | BI consumption layer for Power BI reports | DAX |
| Data pipeline / Dataflow Gen2 | Ingestion and orchestration of data movement | Low-code / config |
| Notebook | Code-first Spark transformations | PySpark / Spark SQL |
An analytics engineer must make a large dataset stored in Amazon S3 available to a Fabric lakehouse for analysis without duplicating the data into OneLake storage. Which Fabric capability should they use?