4.4 Azure Cosmos DB: Use Cases and Architecture

Key Takeaways

Azure Cosmos DB is a fully managed, multi-model, globally distributed NoSQL database offering single-digit-millisecond latency and up to 99.999% availability when multi-region writes are enabled.
Turnkey global distribution lets you add or remove read/write regions with a click; data is replicated automatically and clients are routed to the nearest region.
Throughput is provisioned (or autoscaled) in Request Units per second (RU/s); a Request Unit is a normalized currency for the CPU, memory, and I/O a database operation costs.
Data is sharded across logical partitions by a partition key; choosing a high-cardinality, evenly accessed partition key is critical to avoid hot partitions.
Cosmos DB is ideal for global, write-heavy, low-latency applications such as IoT telemetry, retail catalogs, gaming, and personalization, but it is not a replacement for relational OLTP needing complex joins.

Last updated: June 2026

Quick Answer: Azure Cosmos DB is a fully managed, multi-model, globally distributed NoSQL database. It guarantees single-digit-millisecond read/write latency, up to 99.999% availability with multi-region writes, elastic scale, and five tunable consistency levels. You provision performance in Request Units per second (RU/s) and scale by partitioning data on a chosen key.

Where Azure Table storage is a simple, cheap key-value store, Cosmos DB is the premium, planet-scale NoSQL platform. On DP-900 you must be able to identify use cases for Cosmos DB and describe its APIs (the next section).

What Makes Cosmos DB Different

Global distribution, turnkey. Add or remove Azure regions to a Cosmos DB account with a single setting. Data is replicated to every chosen region automatically, and the SDK routes each client to the nearest region for low latency.
Multi-region writes. Optionally make every region a write region (active-active), so users on different continents all write locally. Local writes are what keep write latency low for globally distributed apps, and they are the configuration behind the 99.999% availability figure.
Guaranteed SLAs. Microsoft offers financially backed SLAs on latency (single-digit ms at the 99th percentile), throughput, consistency, and availability — rare among databases.
Schema-agnostic and auto-indexed. Items have no enforced schema, and by default every property is indexed, so queries are fast without manual index design.
Elastic scale. Storage and throughput scale independently and virtually without limit.
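
The routing idea behind global distribution and multi-region writes can be sketched in plain Python. This is a toy model and not the Cosmos DB SDK: the region names, the latency figures, and the `pick_region` helper are all hypothetical, chosen only to show why a single write region forces some users to write across an ocean.

```python
from typing import Dict, Set


def pick_region(latency_ms: Dict[str, float], write_regions: Set[str], is_write: bool) -> str:
    """Return the lowest-latency region that is allowed to serve the request."""
    # Reads can go to any replicated region; writes only to write regions.
    allowed = write_regions if is_write else set(latency_ms)
    eligible = {region: ms for region, ms in latency_ms.items() if region in allowed}
    if not eligible:
        raise ValueError("no eligible region for this request")
    return min(eligible, key=eligible.get)


# Hypothetical round-trip latencies as seen by a client in Paris.
latency = {"West Europe": 12.0, "East US": 85.0, "Southeast Asia": 170.0}

single_write = {"East US"}   # one write region, reads everywhere
multi_write = set(latency)   # active-active: every region accepts writes

read_region = pick_region(latency, single_write, is_write=False)
remote_write_region = pick_region(latency, single_write, is_write=True)
local_write_region = pick_region(latency, multi_write, is_write=True)
```

In this sketch reads are always served from West Europe, but writes only become local once every region is a write region.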

Request Units (RU/s)

Cosmos DB abstracts CPU, memory, and IOPS into a single normalized currency: the Request Unit (RU). Every operation — a read, a write, a query — costs a measurable number of RUs. A point read of a 1 KB item costs roughly 1 RU; writes and queries cost more.

You provision throughput in RU/s, and there are three modes:

Mode	How it works	Best for
Provisioned (manual)	You set a fixed RU/s; billed for that capacity	Steady, predictable workloads
Autoscale	You set a max RU/s; Cosmos scales between 10% and 100% of it automatically	Variable or spiky workloads
Serverless	Pay per RU consumed, no minimum	Dev/test, intermittent, low-traffic apps

Requests that exceed your provisioned RU/s are rate-limited with HTTP 429 (Too Many Requests); the SDKs retry these automatically after a short back-off, at the cost of added latency. Sizing RU/s correctly is a core operational task.
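
A rough way to picture throttling is a per-second RU budget. The sketch below is a deliberately simplified model, not how the service actually meters requests: `run_second`, `seconds_to_drain`, and `autoscale_floor` are hypothetical helpers, and 200 stands in for any success status.

```python
from typing import List


def run_second(provisioned_ru: float, request_costs: List[float]) -> List[int]:
    """Return a status per request for one second of traffic."""
    remaining = provisioned_ru
    statuses = []
    for cost in request_costs:
        if cost <= remaining:
            remaining -= cost
            statuses.append(200)  # served within this second's budget
        else:
            statuses.append(429)  # budget exhausted: request is throttled
    return statuses


def seconds_to_drain(provisioned_ru: float, request_costs: List[float]) -> int:
    """Seconds needed to serve every request when throttled ones are retried."""
    if any(cost > provisioned_ru for cost in request_costs):
        raise ValueError("a single request costs more than the per-second budget")
    pending = list(request_costs)
    seconds = 0
    while pending:
        statuses = run_second(provisioned_ru, pending)
        # Throttled requests are retried in the next second, like an SDK retry.
        pending = [cost for cost, status in zip(pending, statuses) if status == 429]
        seconds += 1
    return seconds


def autoscale_floor(max_ru: float) -> float:
    """Lowest RU/s autoscale drops to: 10% of the configured maximum."""
    return max_ru / 10
```

Under this model, four 5 RU requests against a 10 RU/s budget take two seconds to drain because half are throttled and retried, while a 20 RU/s budget serves them all in one.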

Partitioning

Cosmos DB scales horizontally by spreading data across many physical partitions. You choose a partition key (for example /deviceId, /userId, or /category), and Cosmos DB hashes it to assign each item to a logical partition; many logical partitions map onto each physical partition.

The partition key choice is the most important design decision:

A good partition key has high cardinality (many distinct values) and spreads both storage and request volume evenly.
A bad key concentrates traffic on a few values, creating a hot partition that throttles while the rest of the database sits idle.
A logical partition has a 20 GB storage ceiling, so the key must also avoid unbounded growth on any single value.
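
The effect of key cardinality can be illustrated with a small simulation. This is a sketch under stated assumptions: Cosmos DB's real hash function and partition layout are internal, so SHA-256 with a modulo, the count of eight physical partitions, and the helper names are all stand-ins.

```python
import hashlib
from collections import Counter
from typing import List

PHYSICAL_PARTITIONS = 8  # hypothetical; the service manages this number itself


def partition_for(key_value: str, partitions: int = PHYSICAL_PARTITIONS) -> int:
    """Map a partition key value to a partition with a stable hash (stand-in)."""
    digest = hashlib.sha256(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def hottest_share(key_values: List[str]) -> float:
    """Fraction of all writes that land on the single busiest partition."""
    counts = Counter(partition_for(value) for value in key_values)
    return max(counts.values()) / len(key_values)


# The same 10,000 writes, keyed three different ways.
by_device = [f"device-{i}" for i in range(10_000)]               # high cardinality
by_region = [("eu", "us", "asia", "au")[i % 4] for i in range(10_000)]  # four values
by_constant = ["telemetry"] * 10_000                             # one value

shares = {
    "deviceId": hottest_share(by_device),
    "region": hottest_share(by_region),
    "constant": hottest_share(by_constant),
}
```

With these inputs the constant key puts every write on one partition, the region key can reach at most four of the eight, and the device key spreads the writes across all of them.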

When Cosmos DB Is the Right Answer

Use case	Why Cosmos DB fits
IoT and telemetry	Massive write throughput, time-series friendly, elastic scale
Retail product catalog	Flexible schema per product category, global low-latency reads
Gaming	Single-digit-ms latency for leaderboards and player state worldwide
Personalization / user profiles	Globally distributed reads close to each user
Web and mobile backends	Active-active writes, auto-indexing, elastic scale during launches

When It Is Not the Right Answer

Workloads needing complex multi-table joins, foreign keys, and full ACID across many tables → use the Azure SQL family.
Heavy ad-hoc analytical scans of historical data → use a warehouse or lakehouse (Synapse / Fabric). For analytics on Cosmos data without ETL, enable the analytical store and query it through Synapse Link, or mirror the data into Microsoft Fabric.
Simple, low-cost key-value needs without global reach → plain Table storage may be cheaper.

The one-line exam tell: a scenario that stresses global distribution, millisecond latency at scale, flexible schema, or massive write throughput points to Cosmos DB.

Resource Hierarchy

Cosmos DB organizes data in a clear hierarchy you should recognize: an account (the globally distributed top-level resource, tied to one API) contains databases, which contain containers (called collections, tables, or graphs depending on the API), which hold items (documents, rows, nodes/edges). Throughput (RU/s) can be provisioned at the database level (shared across its containers) or at the container level (dedicated). Dedicated container throughput gives predictable performance; shared database throughput is cheaper for many small containers.
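
The hierarchy and the two throughput options can be modelled with a few dataclasses. This is an illustrative data model only, not the SDK's object model: the class names, fields, the `throughput_source` helper, and the sample names such as `contoso-cosmos` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Container:
    name: str
    partition_key: str
    dedicated_ru: Optional[int] = None  # None: draw from the database's shared RU/s
    items: List[dict] = field(default_factory=list)


@dataclass
class Database:
    name: str
    shared_ru: Optional[int] = None  # None: no database-level throughput
    containers: Dict[str, Container] = field(default_factory=dict)


@dataclass
class Account:
    name: str
    api: str  # an account is tied to one API
    regions: List[str] = field(default_factory=list)
    databases: Dict[str, Database] = field(default_factory=dict)


def throughput_source(db: Database, container_name: str) -> str:
    """Describe where a container's RU/s comes from in this toy model."""
    container = db.containers[container_name]
    if container.dedicated_ru is not None:
        return f"dedicated {container.dedicated_ru} RU/s"
    if db.shared_ru is None:
        raise ValueError("no dedicated or shared throughput configured")
    sharing = sum(1 for c in db.containers.values() if c.dedicated_ru is None)
    return f"shared {db.shared_ru} RU/s across {sharing} container(s)"


shop = Database(name="shop", shared_ru=1000)
shop.containers["orders"] = Container("orders", "/customerId", dedicated_ru=4000)
shop.containers["settings"] = Container("settings", "/tenantId")
shop.containers["audit"] = Container("audit", "/tenantId")
account = Account("contoso-cosmos", api="NoSQL",
                  regions=["West Europe", "East US"], databases={"shop": shop})
```

Here the busy `orders` container gets its own predictable 4000 RU/s, while the two small containers split the database's shared 1000 RU/s.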

The Analytical Store and Synapse Link

Cosmos DB is an operational (OLTP-style) store, and running heavy analytical scans against it would consume RUs and slow the application. The analytical store solves this: when enabled on a container, Cosmos DB automatically keeps a column-oriented copy of the data, isolated from the transactional workload, with no RU cost to the operational side.

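Azure Synapse Link then queries that analytical store directly, giving HTAP (hybrid transactional/analytical processing) without ETL. Microsoft Fabric mirroring reaches the same goal by a different route: it continuously replicates the Cosmos DB data into OneLake for analytics. The classic exam scenario "run analytics on Cosmos DB data without affecting transactional performance" is answered by the analytical store with Synapse Link, or by Fabric mirroring.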
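
To see why a column-oriented copy suits analytics, consider this minimal sketch. It only illustrates the row-versus-column idea and says nothing about how the analytical store is implemented; the `to_columns` helper and the sample telemetry items are hypothetical.

```python
from typing import Dict, List


def to_columns(items: List[dict]) -> Dict[str, list]:
    """Pivot row-shaped items into one list per property (missing values are None)."""
    names = sorted({key for item in items for key in item})
    return {name: [item.get(name) for item in items] for name in names}


# Row-oriented items, as the transactional store holds them. The schema is
# not enforced, so the third item carries an extra property.
items = [
    {"deviceId": "d1", "temp": 20.5},
    {"deviceId": "d2", "temp": 22.5},
    {"deviceId": "d1", "temp": 21.0, "humidity": 40},
]

columns = to_columns(items)

# An aggregate touches only the one column it needs, not every whole item.
average_temp = sum(columns["temp"]) / len(columns["temp"])
```

An average over `temp` reads a single list here, which is the access pattern analytical scans favour; point reads and writes of whole items remain the job of the transactional store.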

Backups and Security

Cosmos DB takes automatic backups and supports both periodic and continuous backup (point-in-time restore). Security layers include Microsoft Entra ID RBAC, primary/secondary keys, resource tokens for fine-grained access, IP firewalls, private endpoints, and always-on encryption at rest and in transit. These mirror the storage-account security model and reinforce that Entra ID is the preferred, secret-free access method across Azure data services.

RU Sizing Intuition

A rough mental model the exam rewards: a 1 KB point read costs ~1 RU; a 1 KB write costs roughly 5 RU; queries cost more depending on how many items they scan and whether indexes are used. If an app does 100 reads and 20 writes per second on 1 KB items, a back-of-envelope estimate is about 100 + (20 x 5) = 200 RU/s, before query overhead. You do not compute exact RU charges on DP-900, but you should understand that writes cost more than reads, that unindexed or large queries cost the most, and that exceeding provisioned RU/s causes throttling (HTTP 429) which autoscale or higher provisioning relieves.
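
The back-of-envelope estimate above can be written as a tiny function. The per-operation costs are the approximate figures from the text for 1 KB items; the function name and the decision to ignore query overhead are assumptions of this sketch, and real charges vary with item size, indexing, and consistency level.

```python
READ_RU = 1   # approximate cost of a 1 KB point read (from the text)
WRITE_RU = 5  # approximate cost of a 1 KB write (from the text)


def estimate_ru_per_second(reads_per_sec: int, writes_per_sec: int) -> int:
    """Rough RU/s needed for 1 KB point reads and writes, before query overhead."""
    return reads_per_sec * READ_RU + writes_per_sec * WRITE_RU


needed = estimate_ru_per_second(reads_per_sec=100, writes_per_sec=20)

# Compare the estimate with a hypothetical provisioned value.
provisioned = 150
would_throttle = needed > provisioned
```

For the example workload the estimate is 200 RU/s, so a container provisioned at 150 RU/s would see throttling at that load.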

Test Your Knowledge

A gaming company is launching worldwide and needs player profile and leaderboard data that can be read AND written with single-digit-millisecond latency from any continent, with each region able to accept writes locally. Which Azure service best fits?

Azure SQL Database General Purpose

Azure Table storage with LRS

Azure Cosmos DB with multi-region writes

Azure Synapse dedicated SQL pool

Test Your Knowledge

A team designing a Cosmos DB container for IoT telemetry from 5 million devices wants to avoid throttling and hot partitions. Which partition key choice is best?

A constant value such as "telemetry" for every item

The device ID (/deviceId), which has high cardinality and spreads requests evenly

The current date (/date), so all of today's writes land together

The region name (/region), with only four possible values
