5.5 Lakehouse Federation

Key Takeaways

  • Lakehouse Federation enables querying external databases (PostgreSQL, MySQL, Snowflake, Redshift, BigQuery, SQL Server) directly from Databricks without data movement.
  • Connections define the authentication credentials for accessing external databases, and foreign catalogs expose external database objects in the Unity Catalog namespace.
  • Federated queries run through Unity Catalog governance, providing consistent access control and audit logging for external data.
  • Query optimization pushes predicates and projections down to the external database to minimize data transfer.
  • Lakehouse Federation requires Pro or Serverless SQL warehouses — it does not work with classic SQL warehouses or all-purpose clusters.
Last updated: March 2026

Lakehouse Federation

Quick Answer: Lakehouse Federation lets you query external databases (PostgreSQL, MySQL, Snowflake, Redshift, BigQuery) directly from Databricks SQL without moving data. Connections define credentials; foreign catalogs expose external tables in Unity Catalog for governed access.

What Is Lakehouse Federation?

Lakehouse Federation enables federated queries — querying data in external databases as if it were a Unity Catalog table, without ETL or data movement.

Supported External Sources

SourceConnection Type
PostgreSQLpostgresql
MySQLmysql
Microsoft SQL Serversqlserver
Azure Synapsesqldw
Amazon Redshiftredshift
Snowflakesnowflake
Google BigQuerybigquery
Other Databricks workspacesdatabricks

Setup Process

Step 1: Create a Connection

CREATE CONNECTION my_postgres_conn
TYPE postgresql
OPTIONS (
    host 'my-postgres-host.example.com',
    port '5432',
    user 'readonly_user',
    password secret('my-scope', 'pg-password')
);

Step 2: Create a Foreign Catalog

-- Map an external database to a Unity Catalog catalog
CREATE FOREIGN CATALOG postgres_data
USING CONNECTION my_postgres_conn
OPTIONS (database 'production_db');

Step 3: Query External Data

-- Query external tables using standard three-level namespace
SELECT * FROM postgres_data.public.customers
WHERE country = 'US';

-- Join external data with Lakehouse data
SELECT
    c.customer_name,
    o.total_amount
FROM postgres_data.public.customers c
JOIN prod.sales.orders o ON c.customer_id = o.customer_id;

Governance

Federated queries go through Unity Catalog governance:

FeatureSupport
Access controlGRANT/REVOKE on foreign catalogs, schemas, and tables
Audit loggingAll federated queries logged in system.access.audit
Data lineageLineage tracked for federated queries
Consistent namespaceExternal tables appear in the three-level namespace
-- Grant access to federated data
GRANT USE CATALOG ON CATALOG postgres_data TO `data-analysts`;
GRANT USE SCHEMA ON SCHEMA postgres_data.public TO `data-analysts`;
GRANT SELECT ON TABLE postgres_data.public.customers TO `data-analysts`;

Query Optimization

Lakehouse Federation optimizes federated queries by:

  1. Predicate pushdown: WHERE clauses are sent to the external database
  2. Projection pushdown: Only requested columns are fetched
  3. Aggregation pushdown: Simple aggregations (COUNT, SUM, AVG) pushed to the source
  4. Limit pushdown: LIMIT clauses applied at the source
-- This query pushes the WHERE and SELECT to PostgreSQL
-- Only matching rows and selected columns are transferred
SELECT customer_name, email
FROM postgres_data.public.customers
WHERE country = 'US' AND active = true;

Compute Requirements

Compute TypeFederation Support
Serverless SQL warehouseSupported
Pro SQL warehouseSupported
Classic SQL warehouseNOT supported
All-purpose clusterNOT supported
Job clusterNOT supported

On the Exam: Know that Lakehouse Federation requires Pro or Serverless SQL warehouses, that external data is accessed without copying, and that Unity Catalog governance (permissions, lineage, auditing) applies to federated queries just like local tables.

Test Your Knowledge

Which Databricks compute type supports Lakehouse Federation for querying external databases?

A
B
C
D
Test Your Knowledge

What does "predicate pushdown" mean in the context of Lakehouse Federation?

A
B
C
D
Test Your Knowledge

After creating a foreign catalog in Lakehouse Federation, how do you control who can access the external data?

A
B
C
D