4.2 Azure Blob Storage

Key Takeaways

Azure Blob storage holds unstructured object data in a flat namespace of containers, each containing blobs addressed by name.
There are three blob types: block blobs (files and media, the default), append blobs (logging), and page blobs (random-access VM disks up to 8 TB).
Access tiers trade storage cost against access cost and latency: Hot (frequent), Cool (>=30 days), Cold (>=90 days), and Archive (offline, >=180 days, rehydration required).
Lifecycle management policies automatically move blobs between tiers or delete them based on age or last-access time.
Azure Data Lake Storage Gen2 is Blob storage with the hierarchical namespace enabled, adding real directories and POSIX-style ACLs for analytics.

Last updated: June 2026

Azure Blob Storage

Quick Answer: Azure Blob storage is Azure's massively scalable object store for unstructured data — images, video, backups, logs, and the raw files of a data lake. Data lives in containers (think top-level folders) as blobs (individual objects). You pick a blob type (block, append, or page) and an access tier (Hot, Cool, Cold, or Archive) to balance cost against retrieval speed.

"Blob" stands for Binary Large Object. Blob storage is schema-less — Azure does not look inside the object, which is what makes it the right home for anything without a fixed structure.

Containers and the Flat Namespace

A blob storage account organizes data into containers. By default the namespace is flat: a blob named 2026/01/sales.csv is a single object whose name happens to contain slashes — there is no real 2026 folder. Tools display this as a virtual folder tree, but the storage layer sees one name. (Enabling the hierarchical namespace, below, changes this.)
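To make the "virtual folder" idea concrete, here is a minimal sketch of how a tool could derive one level of a folder tree from nothing but flat blob names. The function name list_virtual_level and the sample names are hypothetical; this is plain Python that imitates prefix-and-delimiter listing, not a call to any Azure SDK.

```python
def list_virtual_level(blob_names, prefix="", delimiter="/"):
    """Return (virtual_folders, blobs) found directly under `prefix`."""
    folders, blobs = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the tool shows a folder, not a blob.
            folders.add(prefix + head + delimiter)
        else:
            blobs.append(name)
    return sorted(folders), sorted(blobs)


# Hypothetical blob names; the store only ever sees these flat strings.
names = ["2026/01/sales.csv", "2026/02/sales.csv", "readme.txt"]
top = list_virtual_level(names)
january = list_virtual_level(names, prefix="2026/01/")
```

The "2026/" folder appears only because the listing logic groups names by their shared prefix; no folder object was ever created.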

The Three Blob Types

Blob type	Optimized for	Example
Block blob	Files and streaming media uploaded in blocks	Documents, images, video, data-lake files
Append blob	Append-only writes	Log files, audit trails, telemetry appends
Page blob	Frequent random reads/writes up to 8 TB	Backing disks for Azure VMs

Block blob is the default and the one DP-900 cares about most. Append blobs only allow adding to the end. Page blobs back unmanaged VM disks.
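The phrase "uploaded in blocks" can be illustrated with a small sketch: a payload is cut into chunks, each chunk gets a block ID, and the ordered list of IDs is what a client would commit to form the blob. The helper name split_into_blocks and the tiny block size are assumptions for illustration; nothing here talks to Azure. Block IDs are base64 strings of equal length, which the zero-padded counter below provides.

```python
import base64


def split_into_blocks(data: bytes, block_size: int):
    """Split a payload into ordered (block_id, chunk) pairs."""
    blocks = []
    for index, start in enumerate(range(0, len(data), block_size)):
        # Zero-padding keeps every encoded ID the same length.
        block_id = base64.b64encode(f"{index:08d}".encode()).decode()
        blocks.append((block_id, data[start:start + block_size]))
    return blocks


payload = b"x" * 10
blocks = split_into_blocks(payload, block_size=4)

# The ordered ID list is what would be committed to assemble the blob.
commit_list = [block_id for block_id, _ in blocks]
reassembled = b"".join(chunk for _, chunk in blocks)
```

Because each block is staged independently, a client can upload blocks in parallel or retry a single failed block without resending the whole file.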

Access Tiers

The access tier sets how cheaply data is stored versus how much it costs (and how long it takes) to read it back. This is the single most-tested blob topic.

Tier	Storage cost	Access cost	Min. retention	Availability	Use case
Hot	Highest	Lowest	none	99.9%	Frequently accessed, active data
Cool	Lower	Higher	30 days	99%	Infrequently accessed, short-term backup
Cold	Lower still	Higher still	90 days	99%	Rarely accessed but still online
Archive	Lowest	Highest	180 days	Offline	Long-term retention, compliance

Key exam traps:

Hot, Cool, and Cold are online: data is readable immediately. Archive is offline: you must rehydrate a blob (move it to an online tier such as Hot or Cool) before reading it, which can take hours.
Cool, Cold, and Archive each carry an early-deletion charge if you delete the blob or move it to another tier before the minimum-retention period ends (30 / 90 / 180 days respectively).
Online tiers can be set on an individual blob or inherited from the account's default access tier. Archive can only be set on an individual blob, never as the account default.
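The early-deletion trap is easier to remember with numbers. The sketch below assumes the charge is prorated, so you are billed for the days remaining in the minimum-retention period; the function name early_deletion_days is hypothetical and actual pricing is not modeled.

```python
# Minimum-retention periods from the tier table, in days.
MIN_RETENTION_DAYS = {"Hot": 0, "Cool": 30, "Cold": 90, "Archive": 180}


def early_deletion_days(tier: str, days_stored: int) -> int:
    """Days of storage still billed if the blob leaves `tier` now.

    Assumption: the charge is prorated by the remaining retention days.
    """
    return max(0, MIN_RETENTION_DAYS[tier] - days_stored)


# A blob archived 45 days ago and deleted today.
archive_penalty = early_deletion_days("Archive", 45)
```

Under that assumption, deleting an archived blob after 45 days is billed as if it had stayed for the remaining 135 days.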

The decision rule: choose the warmest tier whose access pattern you actually have, because cooler tiers punish frequent reads and early deletion.
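The decision rule can be written as a short function. This is only a sketch: the name choose_tier and its parameters are hypothetical, and the thresholds simply reuse the minimum-retention periods from the table rather than any official guidance or real cost model.

```python
def choose_tier(days_between_reads: int, needs_immediate_read: bool, keep_days: int) -> str:
    """Pick the coolest tier that the access pattern can tolerate.

    Thresholds are illustrative and mirror the 30 / 90 / 180-day
    minimum-retention periods.
    """
    if not needs_immediate_read and keep_days >= 180 and days_between_reads >= 180:
        return "Archive"  # offline, so only when hours of latency are fine
    if keep_days >= 90 and days_between_reads >= 90:
        return "Cold"
    if keep_days >= 30 and days_between_reads >= 30:
        return "Cool"
    return "Hot"


# Seven-year audit files, almost never read, latency of hours acceptable.
audit_tier = choose_tier(days_between_reads=365, needs_immediate_read=False, keep_days=7 * 365)
```

Flipping needs_immediate_read to True for the same data yields Cold, which matches the rule that Archive is ruled out whenever an immediate read is required.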

Lifecycle Management

Manually re-tiering blobs does not scale. A lifecycle management policy is a JSON rule set on the account that automatically acts on blobs based on age or last-access time — for example, move to Cool after 30 days, to Archive after 90 days, delete after 7 years. This is how organizations keep storage bills down without writing code.
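As an illustration, the example policy above (Cool after 30 days, Archive after 90, delete after 7 years) might look like the following when built as a Python dictionary and serialized to JSON. The rule name and the prefix filter are hypothetical, and 7 years is approximated as 7 * 365 days; treat the layout as a sketch of the policy shape rather than a policy to deploy unchanged.

```python
import json

policy = {
    "rules": [
        {
            "enabled": True,
            "name": "age-out-raw-data",  # hypothetical rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["raw/"],  # hypothetical container prefix
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterModificationGreaterThan": 30},
                        "tierToArchive": {"daysAfterModificationGreaterThan": 90},
                        "delete": {"daysAfterModificationGreaterThan": 7 * 365},
                    }
                },
            },
        }
    ]
}

# The JSON text is what would be attached to the storage account.
policy_json = json.dumps(policy, indent=2)
```

Each action carries its own day threshold, so one rule can express the whole downward path of a blob.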

Azure Data Lake Storage Gen2

Azure Data Lake Storage Gen2 (ADLS Gen2) is not a separate service — it is Blob storage with the hierarchical namespace (HNS) feature switched on. HNS adds:

Real directories that can be created, renamed, and deleted atomically (a flat blob store has to copy every object to rename a "folder").
POSIX-style ACLs for fine-grained, folder-level permissions.
A driver (abfs://) optimized for analytics engines such as Spark, Synapse, and Databricks.
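The difference between renaming a virtual folder and renaming a real directory can be shown with a toy model. Both functions below are hypothetical in-memory stand-ins (a dict of flat names versus a nested dict), not Azure calls; they only count how many entries each approach has to touch.

```python
def rename_flat(blobs: dict, old_prefix: str, new_prefix: str) -> int:
    """Rename a virtual folder in a flat namespace; returns objects rewritten."""
    moved = 0
    for name in [n for n in blobs if n.startswith(old_prefix)]:
        # Every blob under the prefix gets a new name (copy plus delete).
        blobs[new_prefix + name[len(old_prefix):]] = blobs.pop(name)
        moved += 1
    return moved


def rename_hierarchical(tree: dict, old_name: str, new_name: str) -> int:
    """Rename a real directory: one entry is re-pointed, children untouched."""
    tree[new_name] = tree.pop(old_name)
    return 1


flat = {f"2026/01/part-{i}.csv": b"" for i in range(1000)}
tree = {"2026": {"01": {f"part-{i}.csv": b"" for i in range(1000)}}}

flat_ops = rename_flat(flat, "2026/", "archive-2026/")
tree_ops = rename_hierarchical(tree, "2026", "archive-2026")
```

In this model the flat rename touches all 1,000 objects while the hierarchical rename touches one directory entry, which is why analytics jobs that rename output folders benefit from HNS.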

Because ADLS Gen2 sits on the same platform as Blob storage, it inherits the same tiers, redundancy, and security. For analytics workloads — the data lake at the heart of the medallion architecture — HNS is enabled; for pure object storage (media, backups), it is left off.

Common Blob Scenarios

Static website hosting. A blob account can serve a static site directly from a $web container.
Backup and archive. Database and VM backups land in Cool or Archive tiers.
Data lake landing zone. Raw ingestion files arrive as block blobs in ADLS Gen2, ready for the bronze layer.
Media distribution. Images and video stream from Hot-tier block blobs, often behind a CDN.

Moving Data Into Blob Storage

DP-900 expects awareness of the common tools for getting data in and out:

AzCopy. A command-line utility optimized for high-throughput bulk copies.
Azure Storage Explorer. A free graphical client for browsing and transferring blobs.
Azure Data Factory. Copy activities move data on a schedule.
REST API and SDKs (.NET, Python, Java, JavaScript). Embed transfers in applications.
abfs:// driver. The analytics driver used by Spark and Synapse against ADLS Gen2.

Recognizing AzCopy and Storage Explorer by name is a frequent exam point.
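Whichever tool is used, the data is addressed by an endpoint. The sketch below builds the two address forms you are most likely to see: an HTTPS blob URL and an ABFS URI for analytics engines. The helper names and the account and container names are hypothetical, and the host suffixes assume the default public-cloud endpoints.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """HTTPS address of a blob on the default public-cloud Blob endpoint."""
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


def abfs_uri(account: str, filesystem: str, path: str, secure: bool = True) -> str:
    """ABFS address of the same data; abfss is the TLS variant of abfs."""
    scheme = "abfss" if secure else "abfs"
    return f"{scheme}://{filesystem}@{account}.dfs.core.windows.net/{path}"


# Hypothetical account and container names.
https_address = blob_url("contosodata", "raw", "2026/01/sales.csv")
spark_address = abfs_uri("contosodata", "raw", "2026/01/sales.csv")
```

Note that the container in the blob URL and the file system in the ABFS URI refer to the same thing, seen through two different endpoints.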

Tier Changes and Rehydration Mechanics

Changing a blob's tier among the online tiers (Hot, Cool, Cold) is a metadata-only operation and takes effect quickly. Moving into Archive is also quick, but moving out of Archive (rehydration) is the slow path: you choose standard priority (up to ~15 hours) or high priority (faster, costs more), and the blob is unreadable until rehydration completes. A common trap presents a need to read archived data "immediately". The correct response is that Archive cannot serve immediate reads, so data that needs low-latency access should not be in Archive in the first place.
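A small model can make the rehydration rules concrete. The sketch below builds the two tier-related headers that a tier-change request would carry and checks readability from the current tier. The function names are hypothetical, the allowed targets follow the text (Hot or Cool), and authentication and other required request headers are deliberately left out.

```python
ONLINE_TIERS = {"Hot", "Cool", "Cold"}
# Targets named in the text; newer service versions may accept more.
REHYDRATE_TARGETS = {"Hot", "Cool"}


def rehydrate_headers(target_tier: str, priority: str = "Standard") -> dict:
    """Tier-related headers for moving a blob out of Archive."""
    if target_tier not in REHYDRATE_TARGETS:
        raise ValueError("rehydration must target an online tier")
    if priority not in {"Standard", "High"}:
        raise ValueError("priority must be 'Standard' or 'High'")
    return {"x-ms-access-tier": target_tier, "x-ms-rehydrate-priority": priority}


def is_readable(access_tier: str) -> bool:
    """A blob is readable only once it is actually in an online tier."""
    return access_tier in ONLINE_TIERS


# While rehydration is pending, the blob still reports the Archive tier.
pending_blob = {"access_tier": "Archive", "archive_status": "rehydrate-pending-to-hot"}
urgent = rehydrate_headers("Hot", priority="High")
```

Requesting rehydration does not make the blob readable; readability changes only when the tier itself changes.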

Immutability, Soft Delete, and Versioning

Blob storage offers data-protection features the exam may touch on:

Soft delete retains deleted blobs (and containers) for a configurable window so accidental deletes can be recovered.
Versioning automatically keeps prior versions of a blob on each overwrite.
Immutable storage (WORM) applies time-based or legal-hold policies so blobs cannot be modified or deleted until the policy expires — used for regulatory compliance such as financial records.
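The logic of an immutability policy can be sketched as a single check. This is a toy model under stated assumptions: the function name can_delete is hypothetical, the retention clock is assumed to start at blob creation, and a legal hold is treated as blocking deletion regardless of age.

```python
from datetime import datetime, timedelta, timezone


def can_delete(created_at: datetime, retention_days: int, legal_hold: bool, now: datetime) -> bool:
    """True only when no legal hold applies and the retention window has passed."""
    if legal_hold:
        return False
    return now >= created_at + timedelta(days=retention_days)


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
mid_year = datetime(2026, 6, 1, tzinfo=timezone.utc)
next_year = datetime(2027, 1, 2, tzinfo=timezone.utc)

blocked_by_retention = can_delete(created, 365, legal_hold=False, now=mid_year)
allowed_after_expiry = can_delete(created, 365, legal_hold=False, now=next_year)
blocked_by_hold = can_delete(created, 365, legal_hold=True, now=next_year)
```

The two policy types are independent: a time-based policy expires on its own, while a legal hold stays in force until someone explicitly clears it.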

Putting Tiers and Lifecycle Together

A mature blob strategy combines tiers with a lifecycle policy: data lands in Hot while it is actively used, a policy moves it to Cool after 30 days of no access, then to Archive after 90 days, and finally deletes it after the legal retention period. The same policy can act on last-access time rather than creation time, so genuinely active data stays Hot while truly dormant data drifts cheaply downward — minimizing cost without manual intervention, which is the operational point the exam rewards.
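The combined strategy can be simulated in a few lines. The sketch below evaluates a blob's idle time against the thresholds described above; the name action_for is hypothetical, the legal retention period is assumed to be 7 years, and for simplicity deletion is also keyed to idle days even though a real policy might key it to age.

```python
# (idle-day threshold, action), ordered from most to least drastic.
RULES = [
    (7 * 365, "delete"),      # assumed legal retention period
    (90, "tierToArchive"),
    (30, "tierToCool"),
]


def action_for(days_since_last_access: int) -> str:
    """Return the action a last-access-based policy would apply."""
    for threshold, action in RULES:
        # Strictly greater than, mirroring "after N days".
        if days_since_last_access > threshold:
            return action
    return "keep"


active = action_for(10)
dormant = action_for(91)
```

A blob that is read again resets its idle count, so under this model active data never leaves Hot while untouched data moves down one rule at a time.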

Test Your Knowledge

A compliance team must retain audit files for seven years. The files will almost never be read, retrieval latency of several hours is acceptable, and storage cost must be minimized. Which Azure Blob access tier fits best?

Hot tier

Cool tier

Cold tier

Archive tier

Test Your Knowledge

What is the relationship between Azure Blob storage and Azure Data Lake Storage Gen2?

ADLS Gen2 is a completely separate service that does not use blob storage

ADLS Gen2 is Blob storage with the hierarchical namespace enabled, adding real directories and POSIX ACLs

ADLS Gen2 is a relational database optimized for analytics

ADLS Gen2 replaces Blob storage and cannot store unstructured files

Up Next

4.3 Azure Files, Queue, and Table Storage
