1.6 Databricks Runtime and Compute Configuration

Key Takeaways

  • The Databricks Runtime (DBR) bundles Apache Spark, Delta Lake, and optimized libraries; the LTS variant gives extended support.
  • All-purpose compute is long-lived for interactive development; job compute is created per run by the scheduler and terminated after.
  • Serverless compute is managed by Databricks, starts in seconds, always runs Photon, and scales automatically.
  • Autoscaling adds and removes workers between min and max bounds based on load; auto-termination shuts idle clusters to save cost.
  • Cluster policies constrain configuration (instance types, autoscaling, tags) to control cost and governance; DBU is the billing unit.

Last updated: June 2026

The Databricks Runtime

The Databricks Runtime (DBR) is the set of software installed on cluster machines. Each version bundles a specific Apache Spark build, Delta Lake, Java/Scala/Python, and a library of performance and connectivity optimizations — so you select a runtime version rather than assembling Spark yourself.

Key runtime variants:

Runtime | Use
Standard DBR | General data engineering with Spark + Delta
DBR LTS | Long-Term Support: extended patches and stability for production
DBR ML | Adds popular ML libraries (scikit-learn, XGBoost, etc.)
Photon-enabled DBR | Runs the Photon native engine for acceleration

Choose an LTS version for production pipelines that need stability over a long window; pick the latest standard release to access newer features.

Compute Types

Databricks compute falls into three categories the exam tests directly:

  • All-purpose compute: long-lived, interactive clusters for developing in notebooks, ad-hoc analysis, and collaboration. You create, restart, and terminate them manually; multiple users can share one.
  • Job compute (job clusters): created automatically by the scheduler when a Lakeflow Job task runs and terminated when the job finishes. It is cheaper for production because it exists only for the duration of the run and is billed at the lower Jobs compute DBU rate; it also isolates each job from the others.
  • Serverless compute: fully managed by Databricks. It starts in seconds from cached environments, scales automatically, and always runs Photon. There is nothing to size or terminate; Databricks handles the infrastructure.

A core exam rule: use all-purpose for interactive development and job compute for scheduled production work — running production jobs on always-on all-purpose clusters wastes money.
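As a rough illustration of that rule, the sketch below builds two task definitions in the general shape used by the Databricks Jobs API: one attaches to an existing all-purpose cluster through `existing_cluster_id`, the other requests a fresh job cluster through `new_cluster`. The cluster ID, notebook path, node type, runtime string, and helper names are placeholder assumptions, not values from this guide.

```python
# Hypothetical values throughout: cluster ID, notebook path, node type, runtime string.

def task_on_all_purpose(cluster_id: str, notebook_path: str) -> dict:
    """Task that attaches to an already-running all-purpose cluster."""
    return {
        "task_key": "nightly_load",
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path},
    }


def task_on_job_compute(notebook_path: str) -> dict:
    """Task that asks the scheduler for a cluster that lives only for the run."""
    return {
        "task_key": "nightly_load",
        "new_cluster": {
            "spark_version": "15.4.x-scala2.12",  # assumed LTS runtime string
            "node_type_id": "i3.xlarge",          # assumed AWS node type
            "num_workers": 2,
        },
        "notebook_task": {"notebook_path": notebook_path},
    }


def uses_job_compute(task: dict) -> bool:
    """True when the task provisions its own cluster instead of reusing one."""
    return "new_cluster" in task and "existing_cluster_id" not in task


interactive = task_on_all_purpose("0101-000000-abcdefgh", "/Workspace/etl/nightly_load")
production = task_on_job_compute("/Workspace/etl/nightly_load")
print(uses_job_compute(interactive), uses_job_compute(production))  # False True
```

A check like `uses_job_compute` could flag scheduled tasks that still point at an always-on all-purpose cluster.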

Configuring Clusters: Autoscaling, Termination, Policies

Classic clusters expose configuration that balances performance and cost:

  • Autoscaling — set a minimum and maximum number of worker nodes; Databricks adds workers when a job is resource-bound and removes them when load drops, so you pay for capacity only when needed.
  • Auto-termination — a cluster shuts down after a defined idle period (e.g., 30 minutes), preventing forgotten clusters from running up cost.
  • Cluster (compute) policies — admin-defined rule sets that constrain what users can configure: allowed instance types, max workers, mandatory autoscaling, required tags, and runtime version. Policies enforce cost controls and governance and simplify cluster creation for non-experts.
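The three settings above can be sketched together. The cluster spec uses field names from the Databricks Clusters API (`autoscale`, `autotermination_minutes`, `custom_tags`), and the policy uses the `allowlist`, `range`, and `fixed` rule types from the policy definition format. The checker itself is a simplified, hypothetical stand-in for the validation Databricks performs, and every concrete value (names, node types, limits, tags) is an assumption for illustration.

```python
def lookup(spec: dict, dotted_path: str):
    """Resolve a dotted path such as 'autoscale.max_workers' in a nested dict."""
    value = spec
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def violations(spec: dict, policy: dict) -> list:
    """Return rule violations. Simplified sketch, not Databricks' own validator."""
    problems = []
    for path, rule in policy.items():
        value = lookup(spec, path)
        kind = rule["type"]
        if kind == "fixed" and value != rule["value"]:
            problems.append(f"{path} must equal {rule['value']!r}")
        elif kind == "allowlist" and value not in rule["values"]:
            problems.append(f"{path} must be one of {rule['values']}")
        elif kind == "range":
            if value is None:
                problems.append(f"{path} is required")
            elif "minValue" in rule and value < rule["minValue"]:
                problems.append(f"{path} must be at least {rule['minValue']}")
            elif "maxValue" in rule and value > rule["maxValue"]:
                problems.append(f"{path} must be at most {rule['maxValue']}")
    return problems


# Hypothetical cluster spec: autoscaling bounds, idle shutdown, and a cost tag.
cluster_spec = {
    "cluster_name": "team-etl-dev",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "autotermination_minutes": 30,
    "custom_tags": {"cost_center": "data-eng"},
}

# Hypothetical policy: allowed node types, worker cap, idle window, required tag.
policy = {
    "node_type_id": {"type": "allowlist", "values": ["i3.xlarge", "i3.2xlarge"]},
    "autoscale.max_workers": {"type": "range", "maxValue": 10},
    "autotermination_minutes": {"type": "range", "minValue": 10, "maxValue": 60},
    "custom_tags.cost_center": {"type": "fixed", "value": "data-eng"},
}

print(violations(cluster_spec, policy))  # []

oversized = {**cluster_spec, "autoscale": {"min_workers": 2, "max_workers": 32}}
print(violations(oversized, policy))  # ['autoscale.max_workers must be at most 10']
```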

Driver and Workers

A cluster has one driver node (runs the Spark driver, coordinates tasks, holds notebook state) and one or more worker nodes (execute tasks in parallel). A single-node cluster runs driver and executor on one machine — fine for small or single-threaded work, not for large distributed jobs.

DBU: The Billing Unit

A Databricks Unit (DBU) is a normalized unit of processing capacity consumed per hour. Your bill is the DBUs consumed × the per-DBU rate (which varies by product, such as Jobs, All-Purpose, or DBSQL, and by tier and cloud), plus the underlying cloud VM cost on classic compute. Larger or more numerous workers consume more DBUs per hour; Photon-enabled compute has a higher DBU rate but often finishes faster, which can lower the total DBUs for a workload.
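A small worked example makes the arithmetic concrete. All numbers below are invented round figures chosen to keep the math easy to follow, not real Databricks or cloud prices, and `workload_cost` is a hypothetical helper.

```python
def workload_cost(dbu_per_hour: float, hours: float,
                  dollars_per_dbu: float, vm_dollars_per_hour: float) -> float:
    """Total cost = DBUs consumed x per-DBU rate + cloud VM cost for the same hours."""
    dbu_cost = dbu_per_hour * hours * dollars_per_dbu
    vm_cost = vm_dollars_per_hour * hours
    return dbu_cost + vm_cost


# Assumed scenario: the same job with and without Photon.
# Without Photon: 10 DBU/hour for 3 hours -> 30 DBUs.
standard = workload_cost(dbu_per_hour=10, hours=3,
                         dollars_per_dbu=0.25, vm_dollars_per_hour=2.0)

# With Photon: double the hourly DBU consumption, but done in 1 hour -> 20 DBUs.
photon = workload_cost(dbu_per_hour=20, hours=1,
                       dollars_per_dbu=0.25, vm_dollars_per_hour=2.0)

print(standard, photon)  # 13.5 7.0
```

Under these assumed numbers the Photon run uses fewer total DBUs and fewer VM hours. Whether that holds for a real workload depends on how much the job actually speeds up.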

Cost-Control Checklist

  • Run production on job compute, not all-purpose.
  • Enable autoscaling and auto-termination.
  • Apply cluster policies to cap size and enforce tags.
  • Consider serverless to remove idle and startup waste.

These levers are exactly what the associate exam expects you to apply when asked to make a workload reliable and cost-efficient.

Access Modes and Cluster Sizing

When you create classic compute you also choose an access mode, which determines isolation and Unity Catalog support. Single-user (dedicated) mode is assigned to one user and supports all languages including Scala; shared mode (labeled standard in current Databricks documentation) lets multiple users share a cluster with process isolation and full Unity Catalog governance. The legacy no-isolation shared mode lacks Unity Catalog enforcement and is being retired; new accounts created after December 18, 2025 do not get it. For governed multi-user work, shared (standard) access mode is the right answer.

Sizing involves two independent choices: the node type (CPU, memory, and whether the instance is memory- or compute-optimized) and the number of workers. Memory-heavy aggregations and joins benefit from memory-optimized nodes, while a small exploratory workload may run fine on a single-node cluster. Autoscaling then flexes worker count within bounds, so you size for the typical case and let scaling absorb spikes.
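The idea of flexing worker count within bounds can be shown with a toy model. This is not the algorithm Databricks uses to autoscale; the function name, the backlog-based heuristic, and the `tasks_per_worker` parameter are all assumptions made for illustration.

```python
import math


def target_workers(pending_tasks: int, tasks_per_worker: int,
                   min_workers: int, max_workers: int) -> int:
    """Toy model: workers needed for the backlog, clamped to the autoscaling bounds."""
    needed = math.ceil(pending_tasks / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))


# Autoscaling bounds of 2 to 8 workers, assuming each worker handles 4 tasks at once.
print(target_workers(0, 4, 2, 8))    # 2  idle: stays at the minimum
print(target_workers(20, 4, 2, 8))   # 5  typical load: sized to the backlog
print(target_workers(400, 4, 2, 8))  # 8  spike: capped at the maximum
```

The cap is what keeps a spike from turning into an unbounded bill, while the floor keeps a baseline of capacity ready for the typical case.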

Matching Compute to the Workload

The associate exam repeatedly poses 'which compute should you use' scenarios. Use this decision guide:

Scenario | Recommended compute
Interactive notebook development, shared exploration | All-purpose cluster
Scheduled production pipeline | Job compute (per-run, auto-terminated)
Fast startup, zero infra management, bursty work | Serverless compute
SQL analytics, dashboards, BI tools | SQL warehouse (serverless for speed)
Long-running, stability-critical production | LTS Databricks Runtime
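For practice, the compute rows of the decision guide can be encoded as a small function. The function name, its flags, and the order in which the checks are applied are assumptions made for this sketch; the guide itself does not rank the scenarios, and the runtime row (LTS) is left out because it is a runtime choice rather than a compute type.

```python
def recommend_compute(sql_analytics: bool = False,
                      needs_fast_start: bool = False,
                      interactive: bool = False) -> str:
    """Map workload traits to a compute type, following the decision guide rows."""
    if sql_analytics:
        return "SQL warehouse"
    if needs_fast_start:
        return "Serverless compute"
    if interactive:
        return "All-purpose cluster"
    # Default case: scheduled production work.
    return "Job compute"


print(recommend_compute(interactive=True))       # All-purpose cluster
print(recommend_compute())                       # Job compute
print(recommend_compute(needs_fast_start=True))  # Serverless compute
print(recommend_compute(sql_analytics=True))     # SQL warehouse
```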

The through-line is cost-and-reliability fit: never leave production on an always-on all-purpose cluster, always enable auto-termination on interactive clusters, and reach for serverless when you want to eliminate startup latency and sizing effort. Pairing the right runtime (LTS for stability) with the right compute type, governed by cluster policies and billed transparently in DBUs, is the practical competency this domain certifies.

When a scenario stresses both cost and reliability, the strongest answer almost always combines several levers at once — job compute for the run, autoscaling within sensible bounds, auto-termination for any interactive clusters, an LTS runtime for production stability, and a cluster policy to keep every team within those guardrails.

Test Your Knowledge

What does the Databricks Runtime (DBR) provide?

A scheduled production pipeline currently runs on an always-on all-purpose cluster. What is the recommended, more cost-effective compute?

An administrator wants to restrict which instance types and maximum worker counts users may select when creating clusters, and to require cost-tracking tags. What should they use?

With cluster autoscaling enabled between 2 and 8 workers, what happens during a resource-intensive stage of a job?
