1.5 Databricks SQL and the Photon Engine

Key Takeaways

  • Databricks SQL provides a warehouse-style experience: the SQL editor, queries, dashboards, alerts, and BI-tool connectivity over lakehouse tables.
  • SQL warehouses are the compute for Databricks SQL and come in Classic, Pro, and Serverless tiers with differing startup speed and features.
  • Serverless SQL warehouses start in seconds and use Intelligent Workload Management (IWM) for AI-driven autoscaling.
  • Photon is Databricks' native C++ vectorized engine that processes data in columnar batches to accelerate SQL and DataFrame workloads.
  • Photon is always on for serverless compute and SQL warehouses; on classic clusters it is enabled by default but can be toggled off.
Last updated: June 2026

Databricks SQL

Databricks SQL (DBSQL) brings the experience of a data warehouse to the lakehouse. It gives analysts and engineers a dedicated environment to run SQL against governed Delta tables without touching Spark notebooks. Its components include:

  • SQL editor — write and run queries with autocomplete and result visualizations.
  • Queries — saved, parameterized SQL that can be scheduled to refresh.
  • Dashboards — collections of visualizations built from queries, refreshed on a schedule.
  • Alerts — trigger notifications when a query result crosses a threshold.
  • BI connectivity — native connectors and JDBC/ODBC let Power BI, Tableau, and Looker query lakehouse tables directly.

All of this runs on Unity Catalog-governed data, so the same permissions and lineage apply whether data is accessed from SQL, a notebook, or a BI tool.
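As a rough illustration of the alert idea in the list above, an alert boils down to comparing a value from a query result against a threshold. The sketch below is plain Python, not the Databricks alerts API. The function `should_alert`, its operator strings, and the sample numbers are all hypothetical.

```python
import operator

# Hypothetical set of comparisons an alert condition might support.
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def should_alert(value: float, op: str, threshold: float) -> bool:
    """Return True when `value <op> threshold` holds, i.e. the alert fires."""
    try:
        compare = _OPERATORS[op]
    except KeyError:
        raise ValueError(f"unsupported operator: {op!r}") from None
    return compare(value, threshold)


# Example: a scheduled query returned 127 failed orders; alert when above 100.
failed_orders = 127
fired = should_alert(failed_orders, ">", 100)
```

In a real workspace the value would come from the latest run of a saved query, and the notification would go to a configured destination rather than a local variable.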

SQL Warehouses

A SQL warehouse is the compute that powers Databricks SQL — analogous to a cluster but tuned for SQL/BI concurrency. There are three tiers:

Warehouse type | Startup          | Notes
Classic        | Slower (minutes) | Runs in your cloud account; basic DBSQL features
Pro            | Moderate         | Adds advanced features; still in your account
Serverless     | Seconds (2-6 s)  | Compute managed by Databricks; instant scaling

Serverless SQL warehouses start in seconds and use Intelligent Workload Management (IWM) to autoscale across queries and concurrency automatically. They support all three DBSQL performance features — Photon, Predictive I/O, and IWM — and remove almost all infrastructure management. For exam purposes: serverless = fastest startup and least management; classic = slowest, runs in your own cloud subscription.
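To make the tier choice concrete, here is a minimal sketch that builds a request body for creating a warehouse. The field names (`warehouse_type`, `enable_serverless_compute`, `min_num_clusters`, `max_num_clusters`, `auto_stop_mins`) follow the SQL Warehouses REST API as commonly documented, where serverless is requested as a PRO warehouse with serverless compute turned on. Treat that mapping as an assumption to verify against the current API reference. The helper `warehouse_spec`, the warehouse name, and the sizing values are hypothetical.

```python
import json


def warehouse_spec(name: str, tier: str, cluster_size: str = "Small") -> dict:
    """Build a create-warehouse request body for the given tier (sketch only)."""
    tier = tier.lower()
    if tier not in {"classic", "pro", "serverless"}:
        raise ValueError(f"unknown tier: {tier!r}")
    return {
        "name": name,
        "cluster_size": cluster_size,
        # Assumption: serverless is expressed as PRO plus the serverless flag.
        "warehouse_type": "CLASSIC" if tier == "classic" else "PRO",
        "enable_serverless_compute": tier == "serverless",
        "min_num_clusters": 1,   # placeholder scaling bounds
        "max_num_clusters": 2,
        "auto_stop_mins": 10,    # placeholder idle timeout
    }


body = warehouse_spec("bi-dashboards", "serverless")
print(json.dumps(body, indent=2))
```

The sketch only builds the payload; sending it requires an authenticated call to your workspace, which is left out here.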

The Photon Engine

Photon is Databricks' native vectorized query engine, written in C++, that accelerates analytical workloads. Instead of executing through the JVM-based Spark SQL engine row by row, Photon processes data in columnar batches using vectorized, SIMD-friendly operations — dramatically speeding up scans, joins, aggregations, and writes.
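The difference between row-at-a-time and columnar-batch execution can be sketched in a few lines of Python. This is a toy model of the data layout only: pure Python gains no SIMD speedup, and none of the names below come from Photon itself.

```python
from array import array

# Toy table with two columns, stored row-wise and column-wise.
rows = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]
columns = {
    "qty": array("q", [1, 2, 3, 4]),
    "price": array("d", [10.0, 20.0, 30.0, 40.0]),
}


def revenue_row_at_a_time(rows):
    """Evaluate qty * price once per row, touching whole rows each time."""
    total = 0.0
    for qty, price in rows:
        total += qty * price
    return total


def revenue_columnar(columns, batch_size=2):
    """Evaluate the same expression over fixed-size batches of each column."""
    qty, price = columns["qty"], columns["price"]
    total = 0.0
    for start in range(0, len(qty), batch_size):
        q_batch = qty[start:start + batch_size]
        p_batch = price[start:start + batch_size]
        # In a native engine this inner loop is what maps onto SIMD instructions.
        total += sum(q * p for q, p in zip(q_batch, p_batch))
    return total


row_total = revenue_row_at_a_time(rows)
col_total = revenue_columnar(columns)
```

Both functions compute the same total. The columnar version reads only the two columns it needs, in contiguous typed arrays, which is the property a vectorized engine exploits.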

Key facts to remember:

  • Photon replaces the JVM execution engine for supported operations with a native C++ runtime; it is fully compatible with the Spark and DataFrame APIs, so no code changes are needed.
  • It accelerates SQL workloads, DataFrame API calls, ETL pipelines, and stateless streaming.
  • It is a drop-in accelerator: where an operation is not yet supported by Photon, execution falls back to standard Spark transparently.
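  • Photon can lower the total cost of a query or job even though Photon compute carries a higher DBU rate, because the work finishes sooner and consumes fewer compute-hours. The saving holds only when the speedup outweighs the higher rate.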
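The cost point in the last bullet is simple arithmetic, sketched below. Every number here (DBU rates, price per DBU, the 2x rate multiplier, the 2.5x speedup) is made up for illustration and is not taken from Databricks pricing.

```python
def job_cost(dbu_per_hour: float, hours: float, price_per_dbu: float) -> float:
    """Cost of one run = DBUs consumed per hour * hours * price per DBU."""
    return dbu_per_hour * hours * price_per_dbu


def photon_is_cheaper(rate_multiplier: float, speedup: float) -> bool:
    """Photon lowers cost only when the speedup exceeds the DBU-rate multiplier."""
    return speedup > rate_multiplier


# Hypothetical figures, for illustration only.
PRICE_PER_DBU = 0.15
standard_cost = job_cost(dbu_per_hour=10.0, hours=1.0, price_per_dbu=PRICE_PER_DBU)
# Assume Photon doubles the DBU rate but finishes the same job 2.5x faster.
photon_cost = job_cost(dbu_per_hour=20.0, hours=1.0 / 2.5, price_per_dbu=PRICE_PER_DBU)

print(f"standard: {standard_cost:.2f}, photon: {photon_cost:.2f}")
```

With these assumed numbers the Photon run costs less, because the run time shrinks by more than the rate grows. If the speedup were below the rate multiplier, the same formula would show Photon costing more.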

Where Photon Runs

Compute                                   | Photon status
Serverless compute                        | Always enabled
SQL warehouses                            | Always enabled
Serverless Lakeflow Declarative Pipelines | Always enabled
Classic all-purpose / jobs compute        | Enabled by default, toggleable

Because SQL warehouses always run Photon, Databricks SQL queries benefit from its acceleration automatically. On classic clusters, Photon is on by default but can be turned off — a useful exam distinction. Together, Databricks SQL + Photon give the lakehouse warehouse-class query performance directly on open Delta tables, eliminating the need to move data into a separate analytics warehouse.
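On classic compute, the toggle corresponds to a field in the cluster definition. The sketch below builds a cluster spec as a Python dict and sets `runtime_engine`, which in the Clusters API takes the values `PHOTON` or `STANDARD`. The helper `cluster_spec`, the cluster name, the runtime version string, and the node type are placeholders to replace with values valid in your workspace.

```python
import json


def cluster_spec(name: str, use_photon: bool = True) -> dict:
    """Build a classic cluster definition with Photon on or off (sketch only)."""
    return {
        "cluster_name": name,
        "spark_version": "15.4.x-scala2.12",  # placeholder runtime version
        "node_type_id": "i3.xlarge",          # placeholder node type
        "num_workers": 2,
        # The toggle: PHOTON enables the native engine, STANDARD disables it.
        "runtime_engine": "PHOTON" if use_photon else "STANDARD",
    }


with_photon = cluster_spec("etl-nightly")
without_photon = cluster_spec("etl-nightly-standard", use_photon=False)
print(json.dumps(with_photon, indent=2))
```

No equivalent switch appears in the sketch for SQL warehouses, since the section above states that Photon is always enabled there.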

Performance Features Behind Databricks SQL

Serverless SQL warehouses bundle three engine features that work together, and the exam expects you to recognize each by name:

  • Photon — the native C++ vectorized engine that executes the query plan in columnar batches.
  • Predictive I/O — uses machine learning to read only the data needed for a query, accelerating selective reads and point lookups (especially with deletion vectors) so the engine scans far fewer bytes.
  • Intelligent Workload Management (IWM) — AI-driven autoscaling that predicts query demand and adds or removes warehouse clusters to meet concurrency without manual tuning.

These combine so a dashboard hitting a serverless warehouse gets fast startup, automatic scaling under load, and Photon-accelerated execution with no configuration.
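To show what "adds or removes warehouse clusters to meet concurrency" means in practice, here is a deliberately naive, reactive scaling rule. It is not IWM: IWM is described above as predictive and AI-driven, and its model is not covered in this section. The function name, the per-cluster capacity of 10 queries, and the cluster bounds are hypothetical.

```python
import math


def clusters_needed(
    queued: int,
    running: int,
    queries_per_cluster: int = 10,  # hypothetical capacity per cluster
    min_clusters: int = 1,
    max_clusters: int = 4,
) -> int:
    """Return a cluster count that covers current demand, within the bounds."""
    demand = queued + running
    wanted = math.ceil(demand / queries_per_cluster) if demand else min_clusters
    # Clamp to the configured minimum and maximum.
    return max(min_clusters, min(max_clusters, wanted))


idle = clusters_needed(queued=0, running=0)
busy = clusters_needed(queued=5, running=20)
saturated = clusters_needed(queued=100, running=0)
```

A predictive system differs in that it would scale ahead of the load instead of reacting to the queue, but the bounds and the capacity-per-cluster idea are the same knobs.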

How DBSQL Fits the Lakehouse

Databricks SQL is the warehouse persona of the lakehouse. Analysts who only know SQL get a familiar editor, scheduled queries, dashboards, and alerts, and can connect BI tools through JDBC/ODBC. All of these read the same governed Delta tables that data engineers populate with notebooks and Lakeflow pipelines. Because Unity Catalog enforces permissions and lineage uniformly, a column an engineer masks is masked for the BI tool too; there is no separate warehouse copy with its own, drifting security model.

Exam-Critical Distinctions

  • SQL warehouse vs cluster: a SQL warehouse powers Databricks SQL and BI concurrency; a cluster runs notebooks and jobs. Both are compute, but a SQL warehouse executes only SQL, so Python or Scala notebook code needs a cluster.
  • Serverless vs classic: serverless starts in seconds and is managed by Databricks; classic runs in your own cloud account and starts in minutes.
  • Photon always-on vs toggleable: Photon is automatic on SQL warehouses, so DBSQL queries are accelerated whether or not the user knows Photon exists. Knowing where Photon is always-on versus toggleable is a frequently tested point.
  • Photon and code changes: Photon requires no code changes. It transparently accelerates the same SQL and DataFrame operations you already write and silently falls back to standard Spark for any operation it does not yet support, so enabling it is purely a performance and cost decision rather than a rewrite.
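The fallback behavior can be pictured as a per-operator dispatch. The sketch below is a conceptual model only: the operator names and the supported set are invented for illustration and do not reflect Photon's actual coverage or how it partitions a query plan.

```python
# Invented for illustration; not Photon's real operator coverage.
NATIVE_SUPPORTED = {"scan", "filter", "hash_join", "aggregate"}


def choose_engine(operator_name: str) -> str:
    """Pick the native engine when it supports the operator, else fall back."""
    return "photon" if operator_name in NATIVE_SUPPORTED else "spark"


def plan_engines(operators: list[str]) -> list[tuple[str, str]]:
    """Annotate each operator in a query plan with the engine that would run it."""
    return [(op, choose_engine(op)) for op in operators]


plan = ["scan", "filter", "custom_op", "aggregate"]
annotated = plan_engines(plan)
for op, engine in annotated:
    print(f"{op:10s} -> {engine}")
```

The user-facing point is that the query text is identical either way; only the engine chosen behind the scenes changes.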

Test Your Knowledge

  • What is Photon?
  • On which compute is Photon always enabled (not toggleable)?
  • A team needs a SQL warehouse that starts in seconds and autoscales automatically with minimal infrastructure management. Which tier fits best?