1.6 Databricks Runtime and Compute Configuration

Key Takeaways

  • Databricks Runtime (DBR) includes Apache Spark, Delta Lake, and pre-installed libraries optimized for the Databricks platform.
  • Runtime versions follow a Long-Term Support (LTS) model — LTS versions receive security patches for an extended period.
  • Cluster autoscaling automatically adds or removes worker nodes based on workload demand, optimizing cost and performance.
  • Spot instances (preemptible VMs) offer significant cost savings (60-90%) but may be interrupted by the cloud provider.
  • Init scripts run custom code during cluster startup, useful for installing additional libraries or configuring environment settings.
Last updated: March 2026


Quick Answer: Databricks Runtime (DBR) bundles Spark, Delta Lake, and optimized libraries. Use LTS versions for production stability. Configure autoscaling for dynamic workloads, spot instances for cost savings, and init scripts for custom setup.

Databricks Runtime Versions

Runtime Type     | Includes                                                    | Best For
Standard Runtime | Spark, Delta Lake, Python, SQL, Scala, R                    | General data engineering
ML Runtime       | Standard + ML libraries (TensorFlow, PyTorch, scikit-learn) | Machine learning workloads
Photon Runtime   | Standard + Photon C++ engine                                | SQL-heavy and ETL workloads
GPU Runtime      | ML + GPU drivers and libraries                              | Deep learning, GPU compute

LTS (Long-Term Support)

  • LTS versions (e.g., 15.4 LTS) receive security updates and bug fixes for 2+ years
  • Recommended for production workloads where stability is critical
  • Non-LTS versions have shorter support periods
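As an illustration, pinning an LTS runtime comes down to the spark_version field of a cluster spec (as used by the Databricks Clusters REST API). This is a sketch only: the "15.4.x-scala2.12" version string follows the pattern Databricks uses for 15.4 LTS, and the node type is an assumption that varies by cloud.

```python
import json

# Sketch of a cluster spec pinned to an LTS runtime.
cluster_spec = {
    "cluster_name": "prod-etl",
    "spark_version": "15.4.x-scala2.12",  # LTS: security patches for 2+ years
    "node_type_id": "i3.xlarge",          # assumption: depends on your cloud
    "num_workers": 4,
}

print(json.dumps(cluster_spec, indent=2))
```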

Cluster Autoscaling

Cluster configuration:
  Min workers: 2
  Max workers: 10
  Autoscaling: enabled

How Autoscaling Works

  1. Cluster starts with the minimum number of workers
  2. As tasks queue up, workers are added up to the maximum
  3. When tasks complete and resources are idle, workers are removed
  4. Optimizes cost by matching resources to actual demand

Autoscaling Considerations

  • Scale-up time: Adding nodes takes 1-5 minutes (cloud VM provisioning)
  • Scale-down delay: Workers are removed after a configurable idle period
  • Minimum workers: Set to handle base workload without scaling
  • Maximum workers: Set to cap costs during peak processing
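The scaling behavior above boils down to clamping demand between the configured bounds. The toy function below illustrates that idea; it is not Databricks' actual autoscaling algorithm, and tasks_per_worker is a simplifying assumption:

```python
def target_workers(pending_tasks: int, tasks_per_worker: int,
                   min_workers: int, max_workers: int) -> int:
    """Clamp the worker count implied by queued tasks to the configured bounds."""
    needed = -(-pending_tasks // tasks_per_worker)  # ceiling division
    return max(min_workers, min(needed, max_workers))

# An idle cluster stays at the floor; heavy load is capped at the ceiling.
print(target_workers(0, 8, min_workers=2, max_workers=10))    # 2
print(target_workers(40, 8, min_workers=2, max_workers=10))   # 5
print(target_workers(200, 8, min_workers=2, max_workers=10))  # 10
```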

Spot Instances

Instance Type | Cost            | Interruption Risk                    | Best For
On-demand     | Full price      | None                                 | Production, time-sensitive jobs
Spot          | 60-90% discount | May be interrupted by cloud provider | Fault-tolerant batch processing

Best Practice: Mixed Instance Pool

  • Driver node: Always on-demand (interruption would fail the entire job)
  • Worker nodes: Use spot instances with on-demand fallback
  • Databricks automatically handles spot interruptions and reassigns tasks
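On AWS, for example, this mixed pool maps onto the cluster spec's aws_attributes block: first_on_demand keeps the first node (the driver) on-demand, and SPOT_WITH_FALLBACK falls back to on-demand when spot capacity is unavailable. The field names follow the Databricks Clusters API; the node type is an assumption:

```python
import json

spot_cluster = {
    "cluster_name": "batch-etl-spot",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",            # assumption: depends on your cloud
    "autoscale": {"min_workers": 2, "max_workers": 10},
    "aws_attributes": {
        "first_on_demand": 1,               # driver node stays on-demand
        "availability": "SPOT_WITH_FALLBACK",
    },
}
print(json.dumps(spot_cluster, indent=2))
```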

Init Scripts

Init scripts run during cluster startup:

#!/bin/bash
# Example: install a custom Python library on every node
pip install custom-library==1.2.3

# Example: set an environment variable
# (avoid hardcoding real secrets; prefer Databricks secret scopes)
export MY_API_KEY="secret-value"

  • Stored in DBFS, Unity Catalog volumes, or workspace files
  • Global init scripts: Run on every cluster in the workspace
  • Cluster-scoped init scripts: Run only on a specific cluster
  • Use for: custom library installation, driver configuration, environment setup
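A cluster-scoped init script is attached through the init_scripts list of the cluster spec. The structure below follows the Databricks Clusters API for scripts stored in a Unity Catalog volume; the volume path is a hypothetical example:

```python
import json

cluster_spec = {
    "cluster_name": "etl-with-init",
    "spark_version": "15.4.x-scala2.12",
    "num_workers": 2,
    # Cluster-scoped init script in a Unity Catalog volume
    # (the path below is a hypothetical example).
    "init_scripts": [
        {"volumes": {"destination": "/Volumes/main/ops/scripts/setup.sh"}}
    ],
}
print(json.dumps(cluster_spec, indent=2))
```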

On the Exam: Know that LTS runtimes are recommended for production, autoscaling optimizes cost by matching workers to demand, and spot instances offer significant savings but risk interruption. The driver node should always be on-demand.

Test Your Knowledge

Why should the driver node of a Databricks cluster always use on-demand instances rather than spot instances?

Test Your Knowledge

Which Databricks Runtime type should be used for a production ETL pipeline that requires long-term stability and support?
