2.8 Databricks Notebooks and Code Execution
Key Takeaways
- Magic commands switch a cell's language: %python, %sql, %scala, %r, plus %md for markdown, %sh for shell, and %fs for file-system commands.
- %run executes another notebook inline, sharing its variables and functions with the caller; dbutils.notebook.run instead runs the notebook in a separate scope and returns a string.
- dbutils provides utilities: dbutils.fs (file system), dbutils.widgets (parameters), dbutils.secrets (credentials), dbutils.notebook (orchestration).
- Widgets (dbutils.widgets.text/dropdown) parameterize notebooks and can be passed values from jobs.
- Notebooks attach to a cluster; the default language is set per notebook but any cell can override it with a magic command.
Magic Commands
Every Databricks notebook has a default language (Python, SQL, Scala, or R) chosen when it is created, but any individual cell can override it with a magic command as the first line:
| Magic | Effect |
|---|---|
| %python / %sql / %scala / %r | Run the cell in that language |
| %md | Render the cell as Markdown |
| %sh | Run a shell command on the driver |
| %fs | Run a Databricks file-system command (e.g. %fs ls /) |
| %run | Execute another notebook inline |
| %pip | Install a Python library scoped to the current notebook session |
So in a Python notebook you can drop into SQL for one cell by starting it with %sql followed by the query (SELECT ...), then return to Python in the next cell. A language magic must be the first thing in the cell and applies to the whole cell, so you cannot mix two languages in one cell. The %md magic is how documentation and headings are added between code cells.
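The rule above (a leading magic wins, otherwise the notebook default applies) can be modeled in a few lines. This is only a toy sketch to make the rule concrete: `cell_language`, `LANGUAGE_MAGICS`, and the sample cells are hypothetical names, and the code does not reflect how Databricks itself dispatches cells.

```python
# Toy model for illustration; it is not Databricks' implementation.
LANGUAGE_MAGICS = {"%python": "python", "%sql": "sql", "%scala": "scala", "%r": "r"}


def cell_language(cell_text, default_language):
    """Return the language a cell runs in: its leading magic, else the default."""
    tokens = cell_text.split(maxsplit=1)  # assumption: leading whitespace is ignored
    first_token = tokens[0] if tokens else ""
    return LANGUAGE_MAGICS.get(first_token, default_language)


# Three cells of a hypothetical notebook whose default language is Python.
cells = [
    "df = spark.range(3)",
    "%sql\nSELECT count(*) FROM range(3)",
    "print(df.count())",
]
languages = [cell_language(cell, "python") for cell in cells]
```

Only the middle cell switches language; the cells before and after it fall back to the notebook default.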
%run vs dbutils.notebook.run
There are two ways to invoke another notebook, and the difference is tested:
| Mechanism | Scope | Returns |
|---|---|---|
| %run ./helpers | Inline: variables/functions from the called notebook become available in the caller | Nothing (side effects in current scope) |
| dbutils.notebook.run('nb', 60, args) | Separate execution context | A single string value via dbutils.notebook.exit() |
Use %run to factor shared setup, functions, or configuration into a helper notebook and pull them into the current session — it behaves like an import that shares the namespace. Use dbutils.notebook.run for orchestration where the child runs independently (with its own scope), accepts parameters, has a timeout, and returns a result string. Because %run shares scope, a function defined in the helper is callable immediately after the %run cell.
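Because dbutils.notebook.run returns only one string, a common way to hand back structured results is to serialize them as JSON in the child and decode them in the parent. The sketch below assumes that convention; `run_child_notebook` and `exit_with_result` are hypothetical helper names, and `dbutils` is passed in explicitly so the functions can be exercised outside a notebook.

```python
import json


def run_child_notebook(dbutils, path, timeout_seconds, arguments):
    """Parent side: run a child notebook in its own context and decode its exit value."""
    # `arguments` maps widget names to string values in the child notebook.
    raw = dbutils.notebook.run(path, timeout_seconds, arguments)
    # Assumption: the child exits with a JSON document (see exit_with_result).
    return json.loads(raw)


def exit_with_result(dbutils, payload):
    """Child side: serialize a dict and return it as the single exit string."""
    dbutils.notebook.exit(json.dumps(payload))
```

In a notebook the parent would call something like `run_child_notebook(dbutils, "./child", 60, {"run_date": "2026-06-12"})`, where `./child` is a placeholder path. With %run no such plumbing is needed, since the helper's names simply appear in the caller's scope.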
dbutils and Widgets
dbutils is the Databricks utilities object that exposes several namespaces:
- dbutils.fs: file-system operations (ls, cp, mkdirs, rm, mount).
- dbutils.widgets: create input controls to parameterize a notebook.
- dbutils.secrets: fetch credentials from a secret scope without printing them.
- dbutils.notebook: run() and exit() for orchestration.
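As an illustration of the secrets namespace, the sketch below builds a dictionary of connection options whose credentials come from a secret scope rather than from literals in the notebook. The helper name `jdbc_options` and the scope and key names in the usage note are hypothetical; only `dbutils.secrets.get(scope=..., key=...)` is the actual utility call.

```python
def jdbc_options(dbutils, scope, user_key, password_key, url):
    """Assemble connection options, pulling credentials from a secret scope."""
    return {
        "url": url,
        # Values are fetched at run time, so no credential is stored in the notebook.
        "user": dbutils.secrets.get(scope=scope, key=user_key),
        "password": dbutils.secrets.get(scope=scope, key=password_key),
    }
```

In a notebook this might be called as `jdbc_options(dbutils, "my-scope", "db-user", "db-password", url)` and the result passed to a reader with `.options(**opts)`.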
Widgets turn a notebook into a parameterized, reusable asset:
dbutils.widgets.text('run_date', '2026-06-12')
dbutils.widgets.dropdown('env', 'dev', ['dev','prod'])
d = dbutils.widgets.get('run_date')
Widget types include text, dropdown, combobox, and multiselect. When a notebook runs as a job task, the job can supply widget values as parameters, so the same notebook serves multiple environments or dates. dbutils.widgets.get('name') reads the current value at execution time.
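Widget values always arrive as strings, so it can help to validate and convert them once, near the top of the notebook. The following is a minimal sketch built on the two widgets defined above; `read_job_params` and `ALLOWED_ENVS` are hypothetical names, and `dbutils` is passed in as an argument so the function can also be tested outside Databricks.

```python
from datetime import date

ALLOWED_ENVS = ("dev", "prod")  # mirrors the dropdown choices shown above


def read_job_params(dbutils):
    """Read the run_date and env widgets and convert them to checked values."""
    raw_date = dbutils.widgets.get("run_date")
    env = dbutils.widgets.get("env")
    # date.fromisoformat raises ValueError on input such as "12/06/2026".
    run_date = date.fromisoformat(raw_date)
    if env not in ALLOWED_ENVS:
        raise ValueError(f"unexpected env {env!r}; expected one of {ALLOWED_ENVS}")
    return {"run_date": run_date, "env": env}
```

Failing fast here means a job that passes a malformed date or an unknown environment stops before any data is read or written.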
Execution Model, dbutils.fs, and DBFS
A notebook runs against an attached cluster (all-purpose or job compute) or a SQL warehouse. Cells run in the order you execute them within a single shared session, so variables and imports persist from cell to cell within a language, and every language sees the same spark SparkSession. That state survives until you detach the notebook or restart the cluster. The spark and dbutils objects are predefined; you do not create them.
dbutils.fs wraps the Databricks File System (DBFS) and cloud storage with familiar operations, and the %fs magic is a shorthand for the same calls:
| Command | Action |
|---|---|
| dbutils.fs.ls(path) | List a directory |
| dbutils.fs.cp(src, dst, recurse=True) | Copy files |
| dbutils.fs.rm(path, recurse=True) | Delete files |
| dbutils.fs.mkdirs(path) | Create directories |
| dbutils.fs.head(path) | Preview a file's start |
Paths default to the DBFS root unless prefixed (dbfs:/, abfss://, s3://); in Unity Catalog environments, volumes (/Volumes/cat/schema/vol/...) are the governed way to reach files instead of legacy mounts. To list a module's commands, run dbutils.fs.help() or dbutils.widgets.help(); for a single command, pass its name, as in dbutils.fs.help('cp'). Putting these together (magic commands to mix languages, %run and dbutils.notebook.run to modularize, widgets to parameterize, and dbutils.fs and dbutils.secrets for I/O and credentials) gives you the toolkit the certification expects you to use to build clean, reusable data-engineering notebooks.
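dbutils.fs.ls lists a single directory level, so walking a whole tree takes a small recursive helper. The sketch below assumes each entry returned by ls exposes a `path` attribute and an `isDir()` method, as the FileInfo objects in Python notebooks do; `list_files` is a hypothetical helper name.

```python
def list_files(dbutils, path):
    """Yield the path of every file under `path`, descending into subdirectories."""
    for entry in dbutils.fs.ls(path):
        if entry.isDir():
            # Recurse into the subdirectory and pass its files through.
            yield from list_files(dbutils, entry.path)
        else:
            yield entry.path
```

Because it is a generator, the caller can stop early or wrap it in `list(...)` to collect every path; on very large trees each directory still costs one ls call.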
From Notebooks to Jobs and the temp-view Bridge
Notebooks become production pipelines when scheduled as Databricks Jobs (workflows). A job is a DAG of tasks, each typically running one notebook on job compute; tasks can depend on one another, pass parameters via widgets, and retry on failure. The same notebook that you developed interactively runs unchanged as a task, with widget values supplied by the job — this is the standard path from prototype to scheduled production.
A practical pattern when mixing languages is the temporary view bridge: compute a DataFrame in Python, register it with df.createOrReplaceTempView("stage"), then query it from a %sql cell — and conversely, capture a SQL result back into Python with spark.table("stage") or spark.sql("..."). Because all cells share one SparkSession, a temp view created in any cell is visible to every other cell regardless of language.
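The bridge described above can be sketched as a small Python helper. `stage_and_count` is a hypothetical name, and the function takes `spark` and a DataFrame as arguments only so that it is easy to test; in a notebook you would pass the predefined `spark` object.

```python
def stage_and_count(spark, df, view_name="stage"):
    """Register `df` as a temp view, then query it through SQL from Python."""
    # After this call, a %sql cell in the same notebook can SELECT from the view.
    df.createOrReplaceTempView(view_name)
    # Going the other way: spark.sql returns the SQL result as a DataFrame.
    return spark.sql(f"SELECT COUNT(*) AS n FROM {view_name}")
```

The view name is interpolated directly into the query, so this sketch assumes it comes from trusted code rather than from user input.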
| Magic / call | Typical use |
|---|---|
| %pip install pkg | Install a library for the session |
| %sh | Driver-local shell commands |
| dbutils.notebook.exit(value) | Return a string to an orchestrating notebook |
| dbutils.jobs.taskValues | Pass values between job tasks |
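As a sketch of the last row, one task can publish a small value and a downstream task can read it by naming the upstream task. The helper names and the `row_count` key are hypothetical; the `set(key=..., value=...)` and `get(taskKey=..., key=..., default=..., debugValue=...)` calls follow the dbutils.jobs.taskValues utility, where `debugValue` is what get returns when the notebook runs interactively rather than inside a job.

```python
def publish_row_count(dbutils, row_count):
    """Upstream task: record a value for later tasks in the same job run."""
    dbutils.jobs.taskValues.set(key="row_count", value=row_count)


def read_row_count(dbutils, upstream_task_key):
    """Downstream task: fetch the value published by the named upstream task."""
    return dbutils.jobs.taskValues.get(
        taskKey=upstream_task_key,
        key="row_count",
        default=0,      # used if the upstream task never set the key
        debugValue=0,   # used when running outside a job
    )
```

Task values suit small pieces of metadata such as counts or flags; larger results are better written to a table and read by the next task.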
This interoperability — Python and SQL freely exchanging data through views, modular notebooks wired into job DAGs, and parameters flowing in via widgets — is the backbone of how data engineers compose maintainable Databricks workloads. The key takeaway for the exam: one shared SparkSession per notebook means state and temp views persist across cells and languages, magic commands switch language per cell, and the unchanged notebook promotes cleanly from interactive development to a scheduled, parameterized job task.
You are in a Python notebook and want to run a single cell as SQL. What do you put as the first line of that cell?
Which statement about %run versus dbutils.notebook.run is correct?
Which dbutils namespace is used to create a text input that a job can pass a value into at runtime?