4.3 Snowpipe, External Tables, and Storage Integrations

Key Takeaways

Snowpipe auto-ingest triggers loads from cloud event notifications (S3 SQS/SNS, Azure Event Grid, GCP Pub/Sub).
The Snowpipe REST API (insertFiles/insertReport/loadHistoryScan) drives loads programmatically with file-level dedup.
External tables expose files in a stage as a read-only table over a VARIANT VALUE column plus a metadata pseudo-column.
A storage integration stores an IAM identity so external stages need no inline keys.
External tables and Snowpipe both refresh via serverless compute and can use directory tables for metadata.

Last updated: June 2026

Snowpipe Continuous Ingestion

Snowpipe loads files within minutes of their arrival in a stage, using serverless compute instead of a user warehouse. A pipe is a first-class object that wraps a COPY INTO statement: CREATE PIPE my_pipe AUTO_INGEST = TRUE AS COPY INTO t FROM @ext_stage. There are two ways to trigger a pipe:

Auto-ingest — the cloud provider sends an event notification when a new file lands. On AWS S3 this is an SQS (or SNS) notification; on Azure it is Event Grid; on GCP it is Pub/Sub. The pipe's notification_channel (from SHOW PIPES) is the ARN you wire the bucket event to.
REST API — a client calls insertFiles with a list of filenames; insertReport and loadHistoryScan return load status. This is how the Kafka connector and custom apps drive Snowpipe.

Snowpipe performs file-level deduplication: it tracks loaded file names for ~14 days and will not reload a file with the same path unless FORCE semantics apply. Because Snowpipe is event-driven and micro-batch, it suits frequent small batches; for sub-second row-level streaming, Snowpipe Streaming writes rows directly via the Snowflake Ingest SDK without staging files.

Trait	Bulk COPY INTO	Snowpipe
Compute	Your warehouse	Serverless (Snowflake-managed)
Trigger	Manual / scheduled	Event notification or REST API
Default ON_ERROR	ABORT_STATEMENT	SKIP_FILE
Best for	Large scheduled batches	Continuous micro-batches
Billing	Warehouse credits	Per-GB serverless credits

External Tables and Storage Integrations

External tables

An external table lets you query files in place in an external stage without loading them into Snowflake. The data exposes through a built-in VALUE column (a VARIANT of each row/record) plus a METADATA$FILENAME pseudo-column. You define virtual columns as expressions over VALUE (for example VALUE:c1::number) and can partition the table by parts of the file path to prune scans. External tables are read-only: no inserts, updates, indexes, or clustering keys, because the data lives outside Snowflake.

They are commonly paired with a materialized view to accelerate repeated queries and with directory tables for file metadata. Auto-refresh of external-table partitions uses the same cloud-notification mechanism as Snowpipe and runs on serverless compute.

Storage integrations

A storage integration is an account-level object that stores a generated cloud IAM identity plus optional allowed/blocked storage locations. It lets external stages and pipes reach S3/Azure/GCS without embedding access keys in SQL. The setup pattern is: CREATE STORAGE INTEGRATION (Snowflake generates an IAM user/principal), grant that principal access in the cloud console, then reference the integration in CREATE STAGE ... STORAGE_INTEGRATION = my_int. Benefits over inline credentials:

No secrets in code — credentials never appear in stage definitions or query history.
Centralized control — a security admin manages cloud trust once; many stages reuse it.
Scoped access — STORAGE_ALLOWED_LOCATIONS restricts which buckets/paths are reachable.

Creating or altering a storage integration requires the ACCOUNTADMIN role or a role with the global CREATE INTEGRATION privilege.

Managing and Monitoring Pipes

Pipes have a lifecycle you can control with SQL. ALTER PIPE my_pipe SET PIPE_EXECUTION_PAUSED = TRUE pauses ingestion (useful during maintenance), and SHOW PIPES lists each pipe's pattern, notification channel, and pending-file count. To inspect Snowpipe activity, query the PIPE_USAGE_HISTORY table function for credits and bytes ingested over time, and COPY_HISTORY for per-file outcomes. If files were already in the bucket before a pipe was created, a one-time ALTER PIPE my_pipe REFRESH queues those existing files for load — auto-ingest only reacts to new notifications, so historical backfill needs this manual refresh.

Snowpipe vs. bulk vs. streaming

Choosing among ingestion paths is a frequent scenario. The rule of thumb:

Bulk COPY INTO for large, scheduled batch jobs where you control the warehouse and want maximum throughput.
Snowpipe (auto-ingest or REST) for files arriving continuously and unpredictably; latency is minutes and compute is serverless.
Snowpipe Streaming for row-by-row, sub-second latency with no intermediate files, driven by the Snowflake Ingest SDK or the Kafka connector in streaming mode.

Cost awareness

Because Snowpipe and external-table refresh run on serverless compute, they do not require a running warehouse, but they are not free — they draw serverless credits metered per GB (Snowpipe) or per refresh. Loading many tiny files inflates Snowpipe cost because overhead is amortized across less data, which is exactly why the 100-250 MB compressed sizing guidance applies to continuous loading too. Right-sizing files before they hit the stage is the single biggest lever on ingestion cost.

Privileges and ownership

To create a pipe a role needs the CREATE PIPE privilege on the schema plus USAGE on the database, schema, and stage, and INSERT/SELECT on the target table. Snowpipe runs under the pipe owner's privileges, so if that role loses access to the stage or table, ingestion silently stops — a frequent troubleshooting scenario. For external tables, the role needs CREATE EXTERNAL TABLE plus USAGE on the stage and, when a storage integration is involved, USAGE on the integration.

Recognizing which privilege is missing is often the difference between two plausible answers: if files exist but nothing loads, suspect the pipe owner's stage/table grants; if a CREATE STAGE fails against S3, suspect the storage-integration USAGE grant or the cloud-side trust policy. Tie each connectivity object back to its required grants and the scenario questions resolve cleanly.

Test Your Knowledge

On AWS, which mechanism does a Snowpipe pipe configured with AUTO_INGEST = TRUE use to learn that a new file has arrived in an S3 bucket?

A scheduled task that polls the bucket every minute

An S3 event notification delivered to the pipe's SQS queue

A direct database trigger on the target table

The JDBC driver pushing the file list to Snowflake

Test Your Knowledge

Which statement about Snowflake external tables is correct?

They are read-only and expose file rows through a VARIANT VALUE column

They physically load file data into Snowflake micro-partitions

They require a clustering key to be queried efficiently

They support INSERT and UPDATE like standard tables

Test Your Knowledge

What is the primary security advantage of using a storage integration for an external stage?

It compresses files automatically before loading

It doubles load throughput by parallelizing across regions

It avoids embedding cloud access keys in stage definitions by delegating to a Snowflake-managed IAM identity

It converts all files to Parquet during transit

Test Your Knowledge

Files were already sitting in an S3 bucket before a new auto-ingest pipe was created. What is required to load those pre-existing files?

Nothing; auto-ingest backfills all historical files automatically

Drop and recreate the target table

Switch the pipe to a running warehouse

Run a one-time ALTER PIPE ... REFRESH to queue the existing files

Up Next

4.4 Bulk Unloading and Semi-Structured Data

Continue learning

SnowPro Core Study Guide

SnowPro Core Certification

4.3 Snowpipe, External Tables, and Storage Integrations

Key Takeaways

Snowpipe Continuous Ingestion

External Tables and Storage Integrations

External tables

Storage integrations

Managing and Monitoring Pipes

Snowpipe vs. bulk vs. streaming

Cost awareness

Privileges and ownership

SnowPro Core Study Guide

1Chapter 1: SnowPro Core Orientation and Exam Strategy

2Chapter 2: Snowflake AI Data Cloud Features and Architecture

3Chapter 3: Account Management and Data Governance

4Chapter 4: Data Loading, Unloading, and Connectivity

5Chapter 5: Performance Optimization, Querying, and Transformation

6Chapter 6: Data Collaboration

7Chapter 7: Final Review and Test Day

SnowPro Core Certification

4.3 Snowpipe, External Tables, and Storage Integrations

Key Takeaways

Snowpipe Continuous Ingestion

External Tables and Storage Integrations

External tables

Storage integrations

Managing and Monitoring Pipes

Snowpipe vs. bulk vs. streaming

Cost awareness

Privileges and ownership