4.2 Databricks Asset Bundles (Declarative Automation Bundles)

Key Takeaways

  • Databricks Asset Bundles (DABs), now called Declarative Automation Bundles, package a project's source code, configuration, and resource definitions for CI/CD deployment.
  • A bundle is defined by a databricks.yml file at the project root that specifies resources (jobs, pipelines), targets (environments), and configuration.
  • The Databricks CLI commands (bundle validate, bundle deploy, bundle run) manage the lifecycle of bundles.
  • Targets define environment-specific configurations (dev, staging, prod) within the same bundle definition.
  • Bundles replace manual UI-based deployment with version-controlled infrastructure as code.
Last updated: March 2026


Quick Answer: Databricks Asset Bundles (now Declarative Automation Bundles) are infrastructure-as-code definitions for Databricks projects. A databricks.yml file defines jobs, pipelines, and targets. The CLI validates, deploys, and runs bundles across dev, staging, and production environments.

What Are Bundles?

Bundles bring software engineering best practices to Databricks projects:

Practice                  How Bundles Enable It
Source control            Bundle YAML and source code stored in Git
Code review               Pull request review for infrastructure changes
CI/CD                     Automated deployment via CLI in CI/CD pipelines
Environment management    Targets define dev/staging/prod configurations
Reproducibility           Same bundle definition deploys consistently
Testing                   Validate bundle before deployment

Bundle Structure

my-project/
├── databricks.yml           # Main bundle configuration
├── resources/
│   ├── jobs.yml             # Job definitions
│   └── pipelines.yml        # Pipeline definitions
├── src/
│   ├── notebooks/
│   │   ├── ingest.py
│   │   ├── transform.sql
│   │   └── aggregate.py
│   └── libraries/
│       └── helpers.py
└── tests/
    └── test_transform.py
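The tests/ directory holds ordinary unit tests that run locally, outside any workspace. A minimal sketch of what tests/test_transform.py might contain, assuming the transform step factors its logic into a hypothetical pure function (the function and field names here are illustrative, not part of the bundle):

```python
# Hypothetical transform helper; in a real project this would live in
# src/libraries/helpers.py and be imported by the transform notebook.
def add_total_column(rows):
    """Add a 'total' field (price * quantity) to each row dict."""
    return [{**r, "total": r["price"] * r["quantity"]} for r in rows]

def test_add_total_column():
    rows = [{"price": 10.0, "quantity": 3}, {"price": 2.5, "quantity": 4}]
    result = add_total_column(rows)
    assert [r["total"] for r in result] == [30.0, 10.0]

if __name__ == "__main__":
    test_add_total_column()
    print("ok")
```

Because the helper takes and returns plain data, the test needs no Spark session or workspace connection, which keeps the CI feedback loop fast.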

databricks.yml Configuration

bundle:
  name: sales-pipeline

# Include additional configuration files
include:
  - resources/*.yml

# Target environments
targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://dev-workspace.cloud.databricks.com

  staging:
    workspace:
      host: https://staging-workspace.cloud.databricks.com

  prod:
    mode: production
    workspace:
      host: https://prod-workspace.cloud.databricks.com
    run_as:
      service_principal_name: prod-service-principal

Resource Configuration (resources/jobs.yml)

resources:
  jobs:
    daily_etl:
      name: "Daily ETL Pipeline"
      schedule:
        quartz_cron_expression: "0 0 2 * * ?"   # daily at 02:00
        timezone_id: "UTC"
      job_clusters:
        - job_cluster_key: etl_cluster
          new_cluster:
            spark_version: "15.4.x-scala2.12"
            node_type_id: "i3.xlarge"
            num_workers: 2
      tasks:
        # Notebook paths are relative to this file (resources/jobs.yml).
        - task_key: ingest
          job_cluster_key: etl_cluster
          notebook_task:
            notebook_path: ../src/notebooks/ingest.py

        - task_key: transform
          depends_on:
            - task_key: ingest
          job_cluster_key: etl_cluster
          notebook_task:
            notebook_path: ../src/notebooks/transform.sql

        - task_key: aggregate
          depends_on:
            - task_key: transform
          job_cluster_key: etl_cluster
          notebook_task:
            notebook_path: ../src/notebooks/aggregate.py
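Because the task graph is plain data, a small script can sanity-check it before deploy. A sketch (the task list is transcribed from the YAML above into a Python structure, so no YAML parser is needed; `databricks bundle validate` performs its own, more thorough checks):

```python
# Verify that every depends_on reference points at a defined task_key.
tasks = [
    {"task_key": "ingest", "depends_on": []},
    {"task_key": "transform", "depends_on": ["ingest"]},
    {"task_key": "aggregate", "depends_on": ["transform"]},
]

def undefined_dependencies(tasks):
    """Return dependency keys that do not match any defined task_key."""
    keys = {t["task_key"] for t in tasks}
    return [d for t in tasks for d in t["depends_on"] if d not in keys]

assert undefined_dependencies(tasks) == []
print("task graph OK")
```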

CLI Commands

Command                       Purpose
databricks bundle init        Create a new bundle from a template
databricks bundle validate    Check bundle syntax and references
databricks bundle deploy      Deploy bundle resources to a workspace
databricks bundle run         Run a specific job or pipeline from the bundle
databricks bundle destroy     Remove all resources created by the bundle

Workflow

# 1. Initialize a new project
databricks bundle init default-python

# 2. Validate the configuration
databricks bundle validate -t dev

# 3. Deploy to development
databricks bundle deploy -t dev

# 4. Run the job
databricks bundle run daily_etl -t dev

# 5. Deploy to production (after testing)
databricks bundle deploy -t prod

Targets (Environment Management)

Targets allow the same bundle to be deployed to different environments:

Target Property    Purpose                              Example
mode               development or production            mode: production
workspace.host     Target workspace URL                 https://prod.cloud.databricks.com
run_as             Service principal for production     run_as: {service_principal_name: sp-prod}
default            Default target for CLI commands      default: true

Development vs. Production Mode

Feature            Development Mode                    Production Mode
Resource naming    Prefixed with [dev username]        Exact name as defined
Permissions        Current user only                   Configured via run_as
Locking            No deployment lock                  Deployment lock to prevent conflicts
Cluster policy     Can use any cluster                 Should use job clusters
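The name-prefixing row can be illustrated with a short sketch. In development mode, deployed resources are prefixed with [dev username]; the function below imitates that behavior for illustration only and is not the CLI's implementation:

```python
def deployed_name(resource_name, mode, username=None):
    """Illustrate how development mode prefixes deployed resource names."""
    if mode == "development":
        return f"[dev {username}] {resource_name}"
    return resource_name  # production mode keeps the exact name

print(deployed_name("Daily ETL Pipeline", "development", "alice"))
# → [dev alice] Daily ETL Pipeline
print(deployed_name("Daily ETL Pipeline", "production"))
# → Daily ETL Pipeline
```

The prefix lets several developers deploy the same bundle to a shared dev workspace without their jobs colliding.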

CI/CD Integration

# Example: GitHub Actions CI/CD pipeline
name: Deploy Databricks Bundle
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      # CLI authentication; the secret name here is an assumption.
      DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate -t prod
      - run: databricks bundle deploy -t prod

On the Exam: Know the basic bundle structure (databricks.yml, resources, targets), the three main CLI commands (validate, deploy, run), and the difference between development and production modes. You do not need to memorize YAML syntax.

Test Your Knowledge

Which file must exist at the root of every Databricks Asset Bundle project?

What does the "databricks bundle validate" command do?

In a Databricks Asset Bundle, what is the purpose of "targets"?
C
D