4.1 Data Loading, Unloading, and Connectivity Overview

Key Takeaways

This domain is 18% of the SnowPro Core COF-C03 exam and centers on COPY INTO, stages, Snowpipe, and unloading.
Bulk loading uses COPY INTO <table> against a stage; continuous loading uses Snowpipe pipes with serverless compute.
Snowflake supports six file formats for loading: CSV, JSON, Avro, ORC, Parquet, and XML.
Stages come in four flavors: user (@~), table (@%table), named internal, and external (S3/Azure/GCS).
COPY INTO maintains 64-day load metadata per table to prevent reloading the same files.

Last updated: June 2026

What This Domain Covers

Data Loading, Unloading, and Connectivity is worth 18% of the SnowPro Core (COF-C03) exam, making it one of the two heaviest domains alongside Account & Security. Every question here ties back to one of four pillars: bulk loading with COPY INTO <table>, continuous loading with Snowpipe, bulk unloading with COPY INTO <location>, and the connectors/drivers that applications use to reach Snowflake. You should be able to pick the right tool for an ingestion pattern, name the supported file formats, and explain how stages broker files between cloud storage and Snowflake tables.

Snowflake separates storage (where files sit), stages (the pointer/landing zone), and compute (the virtual warehouse that runs the load). A bulk COPY INTO consumes a user-specified warehouse; Snowpipe and external-table refresh use serverless compute that Snowflake manages and meters separately. Knowing which path uses your warehouse versus Snowflake-managed compute is a recurring exam distinction.

Supported file formats

Snowflake loads six formats. Memorize this list and which are structured versus semi-structured.

Format	Category	Notes
CSV (delimited)	Structured	Default `TYPE = CSV`; field/record delimiters, skip header
JSON	Semi-structured	Loads into a `VARIANT` column
Avro	Semi-structured	Binary, schema-embedded → VARIANT
ORC	Semi-structured	Columnar binary → VARIANT
Parquet	Semi-structured	Columnar binary → VARIANT; common for data lakes
XML	Semi-structured	Preview-era support; loads to VARIANT

Semi-structured formats land in a single VARIANT column unless you transform during load. Binary columnar formats (Parquet, ORC, Avro) are billed by Snowpipe on their observed size regardless of compression, while text formats (CSV, JSON, XML) are billed on uncompressed size.

Stages: The Four Types

A stage is a named location that references files for loading or unloading. There are two internal categories (Snowflake-managed storage) and one external category (your own cloud bucket).

User stage @~ — one per user, automatically provisioned, cannot be altered or dropped, not shareable. Good for files a single user loads into many tables.
Table stage @%mytable — one per table, automatically provisioned, references files destined for that one table. Cannot be altered/dropped independently; no transformation on load to a different table.
Named internal stage @my_stage — explicitly created with CREATE STAGE, the most flexible internal option; supports file formats, directory tables, and grants to roles.
External stage @my_ext_stage — points at Amazon S3, Azure Blob/ADLS Gen2, or Google Cloud Storage using a credential or, preferably, a storage integration.

You upload local files to an internal stage with the PUT command (via SnowSQL or a driver — PUT is not available in the Snowsight worksheet UI) and download from a stage with GET. LIST @stage shows staged files and metadata (name, size, MD5, last-modified); REMOVE deletes them.

Load metadata and idempotency

Snowflake records load metadata for each target table for 64 days. COPY INTO uses it to skip files already loaded successfully, making reloads idempotent. After 64 days, that history expires and a file could be reloaded. Use FORCE = TRUE to deliberately reload, or LOAD_UNCERTAIN_FILES for files whose status is ambiguous. This metadata is per-table, which is why the same file can be loaded into two different tables independently.

File Formats Objects and Directory Tables

Rather than repeating parsing options inside every COPY INTO, you can create a reusable file format object: CREATE FILE FORMAT my_csv TYPE = CSV FIELD_DELIMITER = '|' SKIP_HEADER = 1 NULL_IF = ('NULL',''). The format can then be referenced by name in COPY INTO, in stage definitions, or in external tables. CSV options you should recognize include FIELD_DELIMITER, RECORD_DELIMITER, SKIP_HEADER, FIELD_OPTIONALLY_ENCLOSED_BY, ESCAPE, DATE_FORMAT, ENCODING, and ERROR_ON_COLUMN_COUNT_MISMATCH.

For semi-structured formats, STRIP_OUTER_ARRAY (JSON) collapses a top-level array so each element becomes a row, and BINARY_FORMAT controls binary encoding.

A directory table is an implicit, queryable layer of file metadata over a stage (DIRECTORY = (ENABLE = TRUE)). It returns RELATIVE_PATH, SIZE, LAST_MODIFIED, MD5, and a scoped file URL, and underpins features like unstructured-file access and Streamlit data apps. Refresh it manually with ALTER STAGE ... REFRESH or automatically via cloud notifications.

Compression on load

Snowflake auto-detects common compression on staged files. By default COMPRESSION = AUTO, so a .gz, .bz2, .zstd, .deflate, or .snappy file is decompressed transparently during load. You only set COMPRESSION explicitly when auto-detection would guess wrong or when the file is uncompressed but named misleadingly. Because Snowflake stores all table data in its own compressed columnar micro-partitions, the staged file's compression affects only transfer and parse cost, not how data is ultimately stored.

Why this domain rewards a workflow mindset

Rather than memorizing options in isolation, picture the end-to-end path for a load: a file is produced upstream, lands in cloud storage or is pushed with PUT, a stage points at it, a file format describes how to parse it, a warehouse (or serverless Snowpipe) runs the COPY, load metadata records the result, and copy options decide what happens to errors and to the staged file afterward. Most exam stems describe one point on that path and ask for the next correct action. If you can locate the stem on this pipeline, the distractors that belong to a different stage of the workflow become easy to eliminate.

Build the pipeline picture first, then map each term — stage, file format, COPY, Snowpipe, external table, connector — onto it.

Test Your Knowledge

A developer needs to upload local CSV files from their laptop into a Snowflake internal stage. Which command should they use, and where can it run?

COPY INTO, run only from the Snowsight worksheet UI

PUT, run from SnowSQL or a driver (not from the Snowsight worksheet)

GET, run from any web browser session

INSERT, run from the Snowflake REST API

Test Your Knowledge

How long does Snowflake retain load metadata that prevents COPY INTO from reloading previously loaded files?

7 days

24 hours

64 days

Indefinitely, until the table is dropped

Test Your Knowledge

Which stage type is automatically provisioned, unique to a single user, and cannot be altered or dropped?

User stage (@~)

Named internal stage

External stage

Table stage (@%table)

Test Your Knowledge

Which of the following file formats can Snowflake load into a table?

Only CSV and JSON

Only structured formats such as CSV

CSV and Parquet only

CSV, JSON, Avro, ORC, Parquet, and XML

Up Next

4.2 Core Workflows and Decision Points

Continue learning

SnowPro Core Study Guide

SnowPro Core Certification

4.1 Data Loading, Unloading, and Connectivity Overview

Key Takeaways

What This Domain Covers

Supported file formats

Stages: The Four Types

Load metadata and idempotency

File Formats Objects and Directory Tables

Compression on load

Why this domain rewards a workflow mindset

SnowPro Core Study Guide

1Chapter 1: SnowPro Core Orientation and Exam Strategy

2Chapter 2: Snowflake AI Data Cloud Features and Architecture

3Chapter 3: Account Management and Data Governance

4Chapter 4: Data Loading, Unloading, and Connectivity

5Chapter 5: Performance Optimization, Querying, and Transformation

6Chapter 6: Data Collaboration

7Chapter 7: Final Review and Test Day

SnowPro Core Certification

4.1 Data Loading, Unloading, and Connectivity Overview

Key Takeaways

What This Domain Covers

Supported file formats

Stages: The Four Types

Load metadata and idempotency

File Formats Objects and Directory Tables

Compression on load

Why this domain rewards a workflow mindset