100+ Free Airflow Fundamentals Practice Questions

Pass your Astronomer Certification for Apache Airflow Fundamentals exam on the first try — instant access, no signup required.

✓ No registration ✓ No credit card ✓ No hidden fees ✓ Start practicing immediately
~70-80% Pass Rate
100+ Questions
100% Free

2026 Statistics

Key Facts: Airflow Fundamentals Exam

  • Exam Questions: 75 (Astronomer)
  • Exam Duration: 60 min (Astronomer)
  • Passing Score: 75% (Astronomer)
  • Exam Fee: Free (Astronomer)
  • Validity: 2 years (Astronomer)

75 questions in 60 minutes, passing score 75%, free to take. Covers Airflow architecture (webserver, scheduler, executor, metadata DB), DAGs and tasks, operators (PythonOperator, BashOperator, sensors), XCom, Variables, Connections, and monitoring. Certification valid for 2 years. Available online.

Sample Airflow Fundamentals Practice Questions

Try these sample questions to test your Airflow Fundamentals exam readiness. Each question includes a detailed explanation. Start the interactive quiz above for the full 100+ question experience with AI tutoring.

1. Which Airflow component is responsible for reading DAG files, determining when tasks are ready to run, and sending them to the executor?
A. Scheduler
B. Webserver
C. Executor
D. Metadata database
Explanation: The Airflow scheduler continuously reads DAG files from the dag_folder, evaluates task dependencies and schedule intervals, and sends ready tasks to the executor. The webserver provides the UI. The executor manages task dispatch to workers. The metadata database stores state.
2. What does the Airflow metadata database store?
A. DAG source code files
B. Task instance states, DAG run history, variables, and connections
C. Executor worker logs
D. Python operator return values only
Explanation: The metadata database (typically PostgreSQL or MySQL) stores DAG run records, task instance states, XCom values, Variables, Connections, user info, and other Airflow operational data. DAG source code lives in the dag_folder on the filesystem.
3. Which executor is appropriate for a single-machine Airflow deployment with moderate workloads and does NOT require a Celery or Kubernetes infrastructure?
A. SequentialExecutor
B. LocalExecutor
C. CeleryExecutor
D. KubernetesExecutor
Explanation: LocalExecutor runs tasks as subprocesses on the same machine as the scheduler, supporting parallel task execution without external infrastructure. SequentialExecutor runs one task at a time (not suitable for production). CeleryExecutor requires a message broker. KubernetesExecutor requires a Kubernetes cluster.
4. In the following DAG snippet, what is the task execution order?
```python
extract >> transform >> load
```
A. load → transform → extract
B. extract → transform → load
C. All three run in parallel
D. transform → extract → load
Explanation: The >> operator sets downstream dependencies. extract >> transform >> load means extract runs first, then transform (after extract succeeds), then load (after transform succeeds). This creates a linear pipeline.
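How a chain of >> calls produces a linear order can be seen with a toy model. The Task class and linear_order helper below are hypothetical and are not Airflow's implementation; they only mimic the idea that a >> b records b as downstream of a and returns b, so chains read left to right.

```python
class Task:
    """Toy stand-in for an Airflow task (hypothetical, not Airflow's class)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # a >> b: register b as downstream of a, then return b
        # so that a >> b >> c is evaluated as (a >> b) >> c.
        self.downstream.append(other)
        return other


def linear_order(first):
    """Walk a strictly linear chain starting at `first`."""
    order = [first.task_id]
    current = first
    while current.downstream:
        current = current.downstream[0]
        order.append(current.task_id)
    return order


extract, transform, load = Task("extract"), Task("transform"), Task("load")
extract >> transform >> load
chain = linear_order(extract)
print(" -> ".join(chain))  # extract -> transform -> load
```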
5. A DAG has start_date = datetime(2026, 1, 1) and schedule_interval='@daily'. It is deployed for the first time on 2026-01-05 with catchup=True. How many DAG runs will Airflow immediately trigger?
A. 1 (only the most recent missed run)
B. 4 (January 1, 2, 3, and 4)
C. 5 (January 1 through 5)
D. 0 (Airflow waits for the next scheduled run)
Explanation: With catchup=True, Airflow backfills all missed intervals from start_date to the current date. For @daily from Jan 1 to deployment on Jan 5, the missed intervals are Jan 1–2, Jan 2–3, Jan 3–4, and Jan 4–5, producing 4 DAG runs. Jan 5–6 has not elapsed yet.
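The interval count in this explanation can be checked with plain datetime arithmetic. This is a minimal sketch, not the scheduler's actual logic: the helper completed_daily_intervals is hypothetical, and the 09:00 deployment time is an assumption, since the question gives only the date.

```python
from datetime import datetime, timedelta


def completed_daily_intervals(start_date, now):
    """Return (interval_start, interval_end) pairs that have fully elapsed."""
    intervals = []
    interval_start = start_date
    one_day = timedelta(days=1)
    while interval_start + one_day <= now:
        intervals.append((interval_start, interval_start + one_day))
        interval_start += one_day
    return intervals


# Assumed deployment moment: 2026-01-05 at 09:00 (time of day is an assumption).
runs = completed_daily_intervals(datetime(2026, 1, 1), datetime(2026, 1, 5, 9, 0))
for start, end in runs:
    print(start.date(), "->", end.date())
print(len(runs))  # 4
```

Each pair is one data interval; a run is created only once its interval has ended, which is why the Jan 5–6 interval is not in the list.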
6. What is the difference between the execution_date and the actual run time of an Airflow task?
A. They are always the same; execution_date is when the task runs
B. execution_date is the logical start of the data interval (schedule period), not the actual run time
C. execution_date is the end of the data interval
D. execution_date is set by the executor when the task starts
Explanation: execution_date (also called logical_date in Airflow 2.2+) represents the start of the scheduling interval, a logical time marker for the data period being processed. The actual wall-clock run time is later, after the interval elapses. For a @daily DAG, the Jan 1 execution_date run actually starts on Jan 2 when the interval completes.
7. A PythonOperator task needs to pass a file path string to a downstream task. What is the recommended Airflow mechanism?
A. Write the value to a shared filesystem and hardcode the path
B. Use XCom to push the value and have the downstream task pull it
C. Store it in an Airflow Variable and delete it after the downstream task reads it
D. Pass it as a constructor argument to the downstream operator class
Explanation: XCom (cross-communication) is Airflow's built-in mechanism for passing small data between tasks. The upstream task pushes a value (via return or xcom_push), and the downstream task pulls it (via xcom_pull). This keeps task communication traceable and integrated with the metadata database.
8. How does a PythonOperator task push a value to XCom using the recommended modern approach?
A. Call context['ti'].xcom_push(key='result', value=my_value) explicitly
B. Return the value from the Python callable; Airflow automatically pushes return values to XCom
C. Write the value to an Airflow Variable with the task ID as the key
D. Call XCom.set(key='result', value=my_value) as a class method
Explanation: When a PythonOperator callable returns a value, Airflow automatically stores it in XCom under the key 'return_value'. This is the cleanest approach and works seamlessly with the TaskFlow API's @task decorator. Explicit xcom_push is also valid but more verbose.
9. What is the correct way to retrieve an XCom value pushed by a task with task_id='extract' in a downstream PythonOperator?
A. value = context['xcom']['extract']['return_value']
B. value = ti.xcom_pull(task_ids='extract')
C. value = Variable.get('extract')
D. value = XCom.get(task_id='extract')
Explanation: ti.xcom_pull(task_ids='extract') retrieves the XCom value pushed by the 'extract' task; the key defaults to 'return_value'. In Airflow 2.x the TaskInstance (ti) is passed to the callable automatically when it declares a ti parameter or accepts **kwargs; in Airflow 1.x this required provide_context=True. Passing op_kwargs={'ti': '{{ ti }}'} is not a substitute, because by default the template renders to a string rather than the TaskInstance object. In the TaskFlow API, XCom pull happens automatically via function arguments.
10. An Airflow Variable stores a database connection string that should differ between development and production environments. What is the recommended approach?
A. Hardcode the connection string in the DAG Python file
B. Use an Airflow Variable with the same key, setting different values per environment
C. Create separate DAG files for each environment
D. Store the value in XCom at DAG startup
Explanation: Airflow Variables are environment-specific key-value pairs stored in the metadata database. The DAG code references the key (e.g., Variable.get('db_connection')), and the value is set differently in each environment (dev/prod). This keeps DAG code environment-agnostic. Alternatively, Connections are preferable for actual credentials.
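One way to set a different value per environment is an environment variable: Airflow can read a Variable from a variable named AIRFLOW_VAR_<KEY>, with the key upper-cased. The sketch below only mimics that lookup order (environment first, then a stored value). get_variable, the host names, and the dictionaries are hypothetical, and this is not Airflow's own code.

```python
import os


def get_variable(key, metadata_store, environ=os.environ):
    """Toy lookup: environment variable first, then the stored value."""
    env_name = f"AIRFLOW_VAR_{key.upper()}"
    if env_name in environ:
        return environ[env_name]
    return metadata_store[key]


# Hypothetical per-environment settings; the DAG code uses the same key in both.
dev_env = {"AIRFLOW_VAR_DB_CONNECTION": "postgresql://dev-host/dev_db"}
prod_env = {"AIRFLOW_VAR_DB_CONNECTION": "postgresql://prod-host/prod_db"}
stored = {"db_connection": "postgresql://fallback-host/app_db"}

dev_value = get_variable("db_connection", stored, environ=dev_env)
prod_value = get_variable("db_connection", stored, environ=prod_env)
fallback_value = get_variable("db_connection", stored, environ={})
print(dev_value)
print(prod_value)
print(fallback_value)
```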

About the Airflow Fundamentals Exam

The Astronomer Certification for Apache Airflow Fundamentals validates knowledge of core Apache Airflow concepts including architecture components, DAG definition, task dependencies, operators, sensors, XCom, Variables, Connections, and the Airflow UI. It is the entry-level Astronomer certification for data engineers and data professionals working with workflow orchestration.

  • Questions: 75 scored questions
  • Time Limit: 60 minutes
  • Passing Score: 75% (56/75)
  • Exam Fee: Free (Astronomer)

Airflow Fundamentals Exam Content Outline

  • Core Airflow Architecture (20%): Webserver, scheduler, executor types (LocalExecutor, CeleryExecutor, KubernetesExecutor), metadata database, worker, and Airflow component interaction
  • DAGs and Tasks (25%): DAG definition with Python, task dependencies (>>/<< operators, set_upstream/set_downstream), execution dates, logical dates, schedule intervals, catchup, backfill
  • Operators and Sensors (25%): PythonOperator, BashOperator, EmailOperator, DummyOperator/EmptyOperator, sensors (FileSensor, HttpSensor), operator parameters (retries, retry_delay, depends_on_past)
  • XCom, Variables, and Connections (20%): XCom push (xcom_push / return value) and pull, Variables for runtime configuration, Connections and Hooks for external systems
  • Task Lifecycle and Monitoring (10%): Task states (queued, running, success, failed, skipped, up_for_retry), SLA miss callbacks, log access, Airflow UI Graph and Tree views, triggering DAGs

How to Pass the Airflow Fundamentals Exam

What You Need to Know

  • Passing score: 75% (56/75)
  • Exam length: 75 questions
  • Time limit: 60 minutes
  • Exam fee: Free

Keys to Passing

  • Complete 500+ practice questions
  • Score 80%+ consistently before scheduling
  • Focus on highest-weighted sections
  • Use our AI tutor for tough concepts

Airflow Fundamentals Study Tips from Top Performers

1. Understand the role of each Airflow component: webserver (UI), scheduler (orchestration), executor (task dispatch), metadata DB (state storage)
2. Practice writing DAGs with Python; know how to set task dependencies using >> and << operators
3. Know the difference between execution_date (logical date) and the actual date a task runs
4. Understand catchup=False vs catchup=True and when to use each
5. Learn XCom: push via return value or xcom_push, pull via xcom_pull with task_ids parameter
6. Know when to use Variables vs environment variables vs Airflow Connections
7. Understand task states and what causes each: success, failed, up_for_retry, skipped
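For the last tip, the relationship between the retries parameter and the up_for_retry, failed, and success states can be illustrated with a small simulation. This is a toy model under simplifying assumptions (no retry_delay, no queued state); run_with_retries and the two callables are hypothetical and do not reflect how Airflow tracks state internally.

```python
def run_with_retries(python_callable, retries):
    """Return the sequence of states a task passes through (toy model)."""
    states = []
    for attempt in range(retries + 1):
        states.append("running")
        try:
            python_callable()
        except Exception:
            # Attempts remain: wait for a retry; otherwise the task fails.
            states.append("up_for_retry" if attempt < retries else "failed")
        else:
            states.append("success")
            break
    return states


calls = {"count": 0}


def flaky():
    # Fails on the first two calls, succeeds on the third.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient error")


def always_fails():
    raise RuntimeError("permanent error")


states_ok = run_with_retries(flaky, retries=2)
states_bad = run_with_retries(always_fails, retries=1)
print(states_ok)
print(states_bad)
```

With retries=2 the flaky callable ends in success after two up_for_retry states; with retries=1 the always-failing callable ends in failed once its single retry is used up.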

Frequently Asked Questions

What is the Airflow scheduler responsible for?

The Airflow scheduler reads DAG files, determines which tasks are ready to run based on dependencies and schedule, and sends tasks to the executor. It continuously monitors DAG runs and task states to keep pipelines moving forward.

What is XCom in Apache Airflow?

XCom (cross-communication) is the mechanism for passing small amounts of data between tasks in the same DAG run. Tasks can push values via xcom_push() or by returning a value, and pull them via xcom_pull(). XCom data is stored in the metadata database and should only be used for small data like identifiers or status flags.

What is the difference between schedule_interval and catchup?

schedule_interval defines how often a DAG runs (e.g., @daily, @hourly, or a cron expression). catchup controls what happens to intervals that elapsed before the DAG was enabled: with catchup=True (the default in Airflow 2.x), Airflow creates a run for every missed interval since the start_date. With catchup=False, the scheduler creates a run only for the most recent interval, preventing large backlogs of historical runs.

What are Airflow Connections used for?

Connections store credentials and endpoint information for external systems (databases, cloud services, APIs). They are managed through the Airflow UI or environment variables and accessed in tasks via Hooks, which provide convenient interfaces to interact with connected services.
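When a Connection is supplied through an environment variable, Airflow expects a variable named AIRFLOW_CONN_<CONN_ID> whose value can be a URI. The sketch below shows roughly how such a URI breaks down into connection fields using only the standard library. parse_connection_uri, the host, and the credentials are hypothetical, and Airflow's own parser handles more cases (such as extra parameters) than this approximation.

```python
from urllib.parse import unquote, urlsplit


def parse_connection_uri(uri):
    """Split a connection URI into fields (approximation, not Airflow's parser)."""
    parts = urlsplit(uri)
    return {
        "conn_type": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "login": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        # The URI path maps to what Airflow calls the connection's schema.
        "schema": parts.path.lstrip("/") or None,
    }


# Hypothetical value for an AIRFLOW_CONN_ANALYTICS_DB environment variable.
conn = parse_connection_uri("postgres://etl_user:s3cret@db.example.com:5432/analytics")
for field, value in conn.items():
    print(field, "=", value)
```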