1.3 Delta Lake Fundamentals
Key Takeaways
- A Delta table is Parquet data files plus a _delta_log directory of JSON commits that provide ACID transactions.
- Each transaction adds an ordered JSON commit recording add/remove file actions; every 10 commits a Parquet checkpoint speeds log reads.
- Time travel lets you query prior versions by VERSION AS OF or TIMESTAMP AS OF, useful for audits and rollbacks.
- Schema enforcement rejects writes that do not match the table schema; mergeSchema / overwriteSchema permit intentional evolution.
- OPTIMIZE compacts small files (with ZORDER for data skipping), VACUUM removes unreferenced files, and MERGE performs upserts.
Anatomy of a Delta Table
A Delta table is just two things in object storage: the actual data as Parquet files, and a _delta_log/ directory — the transaction log — that records every change. The log is what upgrades plain files into a reliable, ACID-compliant table.
Each write creates an ordered commit file, 00000000000000000000.json, ...0001.json, and so on. A commit does not rewrite data; it records actions: add (a new data file is part of the table), remove (a file is logically deleted), plus metadata and protocol entries. The current table state is the replay of all commits in order. Because readers determine which files belong to a version from the log — not from listing the directory — Delta gives:
- Atomicity: a commit is all-or-nothing; partial writes never appear.
- Consistency & Isolation: readers see a consistent snapshot; concurrent writers use optimistic concurrency and a commit conflict aborts the loser.
- Durability: committed data persists in cloud storage.
To avoid replaying thousands of JSONs, Delta writes a checkpoint (a Parquet summary of state) every 10 commits, so readers load the latest checkpoint plus any newer JSON commits.
Time Travel
Because old files stay referenced in the log, you can query prior versions:
SELECT * FROM sales VERSION AS OF 12;
SELECT * FROM sales TIMESTAMP AS OF '2026-06-01';
Use DESCRIBE HISTORY sales to see the version, timestamp, and operation for each commit. Time travel powers audits, reproducible reports, and rollbacks (RESTORE TABLE sales TO VERSION AS OF 12). Note: once VACUUM removes the underlying files, time travel to those older versions fails with file-not-found.
Schema Enforcement and Evolution
Delta tables have a defined schema, and schema enforcement (schema-on-write) rejects any write whose columns or types do not match — protecting downstream consumers from corruption. When a change is intentional, schema evolution lets it through:
| Goal | Mechanism |
|---|---|
| Add new columns from an append | .option("mergeSchema", "true") or spark.databricks.delta.schema.autoMerge.enabled |
| Replace the schema on overwrite | .option("overwriteSchema", "true") |
| Evolve schema during a MERGE | WITH SCHEMA EVOLUTION clause |
Without these options, a mismatched write fails fast rather than silently dropping or coercing data — a frequent exam point.
MERGE: Upserts in One Statement
MERGE combines insert, update, and delete against a target using a source and a match condition — the standard pattern for change-data-capture and Silver-layer upserts:
MERGE INTO customers t
USING updates s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
This updates millions of rows as simply as a database UPDATE, all within one atomic commit.
Table Maintenance: OPTIMIZE, ZORDER, VACUUM
Streaming and frequent small writes create the small-file problem, hurting read performance. Maintenance commands fix layout and reclaim space:
OPTIMIZE table— compacts many small Parquet files into fewer right-sized files (no data change).OPTIMIZE table ZORDER BY (col)— co-locates rows sharing column values so queries filtering oncolskip far more files (data skipping). Best for high-cardinality filter columns.- Liquid clustering (
CLUSTER BY) — the modern, stateful alternative to partitioning and ZORDER; it incrementally maintains layout as data evolves without recomputing the whole table. VACUUM table— permanently deletes data files no longer referenced by the log, older than the retention threshold (default 7 days). VACUUM frees storage but breaks time travel to versions whose files it removes.
A common trap: running VACUUM ... RETAIN 0 HOURS deletes files immediately and can corrupt readers/time travel; Databricks guards this with a safety check you must explicitly disable.
Partitioning, Clustering, and Data Skipping
Delta accelerates reads by skipping files it can prove are irrelevant. For each data file, Delta records min/max statistics per column in the log; a query with a WHERE filter consults these stats and reads only files whose ranges overlap the predicate. Three layout strategies improve this:
| Strategy | What it does | When to use |
|---|---|---|
| Partitioning | Splits data into directories by a low-cardinality column (e.g., date) | Large tables with a clear, low-cardinality filter |
| Z-ordering | Co-locates related values across multiple columns within files | High-cardinality filter columns; run via OPTIMIZE |
| Liquid clustering | Stateful, incremental clustering on chosen keys | Modern default; avoids over-partitioning and reclustering cost |
Over-partitioning on a high-cardinality column is a classic anti-pattern: it produces thousands of tiny files and hurts performance. Liquid clustering is now preferred because it adapts as data grows without the rigid directory layout of partitioning or the full-table recompute that Z-ordering can require.
Managed vs External Tables and Constraints
Delta tables come in two storage flavors under Unity Catalog. A managed table has both its metadata and its data files managed by Databricks in the catalog's storage location; dropping it deletes the data. An external table stores data at a path you specify (an external location); dropping it removes only the metadata, leaving the files in place. Delta also supports CHECK constraints and NOT NULL to reject invalid rows at write time, and generated columns computed from other columns.
Why Delta Beats Plain Parquet
Plain Parquet on a lake has no atomic commits, so a failed multi-file write leaves the table half-updated and readers can see partial results. Delta's transaction log makes every write atomic and isolated, adds time travel and schema enforcement, and enables efficient upserts, deletes, and updates that raw Parquet cannot express. That reliability is precisely why every managed table on Databricks defaults to Delta.
What does the _delta_log directory of a Delta table contain, and why is it essential?
A query needs the state of the orders table as it existed yesterday for an audit. Which feature provides this?
An append job tries to write a DataFrame that has an extra column not in the target Delta table, and no schema options are set. What happens?
Which maintenance command compacts many small Parquet files into fewer, larger files to improve read performance without changing the data?