4.5 Azure Cosmos DB APIs
Key Takeaways
- Cosmos DB for NoSQL is the native, document-based API with the richest feature set and is Microsoft's recommended default for new projects.
- Cosmos DB for MongoDB is wire-protocol compatible with MongoDB, so existing MongoDB apps, drivers, and tools work with minimal change.
- Cosmos DB for Apache Cassandra exposes a wide-column store compatible with the Cassandra Query Language (CQL) and Cassandra drivers.
- Cosmos DB for Apache Gremlin is a graph database storing vertices and edges, queried with the Gremlin traversal language for relationship-heavy data.
- Cosmos DB for Table is the premium, globally distributed evolution of Azure Table storage, and Cosmos DB for PostgreSQL (Citus) is the relational distributed-PostgreSQL member of the family.
Quick Answer: Cosmos DB is multi-model. One global service exposes several APIs, each presenting a different data model and query surface. The five you must know are NoSQL (native document), MongoDB (document, MongoDB-compatible), Apache Cassandra (wide-column), Apache Gremlin (graph), and Table (key-value). A separate relational member, Cosmos DB for PostgreSQL, rounds out the family.
The exam's favorite Cosmos DB question is: given a scenario, which API do you choose? The answer almost always comes down to what data model the application already expects or what existing database you are migrating from.
The APIs at a Glance
| API | Data model | Query language | Choose it when... |
|---|---|---|---|
| Cosmos DB for NoSQL | Document (JSON) | SQL-like query syntax | You are starting a new project and want the richest, most up-to-date features |
| Cosmos DB for MongoDB | Document (BSON) | MongoDB query language | You are migrating an existing MongoDB app or want MongoDB drivers/tools |
| Cosmos DB for Apache Cassandra | Wide-column | Cassandra Query Language (CQL) | You are migrating a Cassandra workload or need a column-family model |
| Cosmos DB for Apache Gremlin | Graph (vertices + edges) | Gremlin traversal language | Your dominant queries traverse relationships (social, fraud, recommendations) |
| Cosmos DB for Table | Key-value | OData / Table SDK | You want a premium upgrade path from Azure Table storage |
Cosmos DB for NoSQL (the native API)
This is Cosmos DB's first-party API. Data is stored as JSON documents, and you query it with a familiar SQL-like dialect (SELECT * FROM c WHERE c.category = 'books'). It receives new features first, supports stored procedures, triggers, and user-defined functions in JavaScript, and is Microsoft's recommended default for greenfield development. If a question says "new application, no legacy database," NoSQL is the answer.
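The SQL-like query above can be sketched with the azure-cosmos Python SDK. This is a minimal sketch, not a definitive implementation: the helper name find_by_category is hypothetical, the container is assumed to be a ContainerProxy obtained elsewhere, and the items are assumed to carry a category property.

```python
def find_by_category(container, category):
    """Return all documents whose `category` property equals `category`.

    `container` is assumed to be an azure.cosmos ContainerProxy (or any
    object exposing a compatible `query_items` method).
    """
    query = "SELECT * FROM c WHERE c.category = @category"
    # The value travels as a named parameter instead of being
    # concatenated into the query text.
    parameters = [{"name": "@category", "value": category}]
    items = container.query_items(
        query=query,
        parameters=parameters,
        # Assumption: the container is not partitioned on /category,
        # so the query may have to fan out across partitions.
        enable_cross_partition_query=True,
    )
    return list(items)
```

Calling find_by_category(container, "books") would run the same filter as the inline example, with the category supplied as a parameter.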
Cosmos DB for MongoDB
This API implements the MongoDB wire protocol. Existing applications written against MongoDB (using the official MongoDB drivers, the mongo shell, Compass, and so on) can usually connect to Cosmos DB by changing little more than the connection string. It is the migration path for organizations that already run MongoDB but want a fully managed, globally distributed backend without rewriting code.
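To make the "change the connection string, keep the code" idea concrete, here is a minimal sketch. The environment variable name MONGODB_URI, the database name shop, and the collection name products are all hypothetical; the collection argument is assumed to behave like a pymongo Collection.

```python
import os

# Hypothetical environment variable name; the application reads its
# connection string from configuration instead of hard-coding a host.
URI_ENV_VAR = "MONGODB_URI"


def connection_uri(default="mongodb://localhost:27017"):
    """Return the configured connection string, or a local default."""
    return os.environ.get(URI_ENV_VAR, default)


def find_books(collection):
    """Query with an ordinary MongoDB filter document.

    Nothing in this function is specific to Cosmos DB.
    """
    return list(collection.find({"category": "books"}))


# Typical wiring (not executed here; needs pymongo and a reachable server):
#   from pymongo import MongoClient
#   client = MongoClient(connection_uri())
#   books = find_books(client["shop"]["products"])
```

Pointing MONGODB_URI at a self-hosted server or at a Cosmos DB for MongoDB account leaves find_books untouched, which is the migration story in miniature.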
Cosmos DB for Apache Cassandra
This API provides a wide-column (column-family) store compatible with CQL and Cassandra drivers. Teams running Apache Cassandra on their own clusters migrate here to shed operational burden while keeping their CQL schemas and tooling. The model suits time-series, IoT, and very wide, sparse rows.
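A toy model can show why the wide-column shape suits time-series and sparse rows. This is an illustrative in-memory sketch only, not how Cosmos DB or Cassandra store data; the readings table, its columns, and the WideColumnTable class are hypothetical.

```python
from collections import defaultdict

# CQL for a hypothetical time-series table: device_id is the partition key,
# ts is the clustering column that orders rows inside each partition.
CREATE_TABLE_CQL = """
CREATE TABLE readings (
    device_id text,
    ts timestamp,
    temperature double,
    humidity double,
    PRIMARY KEY ((device_id), ts)
)
"""


class WideColumnTable:
    """Toy in-memory model of a partitioned, clustered, sparse table."""

    def __init__(self):
        # partition key -> {clustering key -> {column name -> value}}
        self._partitions = defaultdict(dict)

    def insert(self, device_id, ts, **columns):
        # Only the columns actually supplied are stored, so rows are sparse.
        self._partitions[device_id].setdefault(ts, {}).update(columns)

    def read_partition(self, device_id):
        # Rows come back ordered by the clustering key within one partition.
        rows = self._partitions.get(device_id, {})
        return [(ts, rows[ts]) for ts in sorted(rows)]
```

Reading one device's partition returns its readings in time order, and a row that recorded only humidity simply has no temperature column.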
Cosmos DB for Apache Gremlin
This API turns Cosmos DB into a graph database. Data is modeled as vertices (entities, such as people or products) and edges (relationships, such as "friend-of" or "purchased"), and you query it with the Gremlin traversal language from the Apache TinkerPop project. Choose Gremlin when the relationships between entities are the main thing you query — social networks, fraud rings, recommendation engines, and knowledge graphs.
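The vertex-and-edge model can be sketched in plain Python to show what a friend-of-friend traversal does. The data and helper names below are hypothetical, and the Gremlin string is included only for comparison; it is not executed against any server.

```python
# A comparable Gremlin traversal, shown only for reference (not executed):
FRIENDS_OF_FRIENDS_GREMLIN = "g.V('alice').out('friend-of').out('friend-of').dedup()"

# Edges as (source vertex, label, target vertex) triples. Hypothetical data.
EDGES = [
    ("alice", "friend-of", "bob"),
    ("alice", "friend-of", "carol"),
    ("bob", "friend-of", "dave"),
    ("carol", "friend-of", "dave"),
    ("carol", "friend-of", "erin"),
    ("alice", "purchased", "laptop"),
]


def out(vertices, label, edges=EDGES):
    """Follow outgoing edges carrying `label` from each vertex in turn."""
    return [
        dst
        for v in vertices
        for (src, lbl, dst) in edges
        if src == v and lbl == label
    ]


def friends_of_friends(person, edges=EDGES):
    """Two-hop traversal with duplicates removed, keeping first-seen order."""
    two_hops = out(out([person], "friend-of", edges), "friend-of", edges)
    return list(dict.fromkeys(two_hops))
```

Each hop is one step along an edge, which is why relationship-heavy questions read naturally as traversals rather than as joins.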
Cosmos DB for Table
This API is the premium evolution of Azure Table storage. It speaks the same Table protocol, so existing Table-storage apps migrate with a connection-string change, but it adds turnkey global distribution, guaranteed low latency, automatic indexing of every property, and dedicated throughput. Choose it when a Table-storage workload outgrows the standard service's performance or needs global reach.
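The key-value shape of a Table entity can be illustrated with a small in-memory sketch. TableSketch is a hypothetical stand-in, not the real SDK; in the azure-data-tables package the comparable calls are TableClient.upsert_entity and TableClient.get_entity.

```python
class TableSketch:
    """Toy key-value table: each entity is addressed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._entities = {}

    def upsert(self, entity):
        # The composite key identifies the entity; every other property
        # is free-form.
        key = (entity["PartitionKey"], entity["RowKey"])
        self._entities[key] = dict(entity)

    def get(self, partition_key, row_key):
        # Point lookup by the composite key; returns None if absent.
        return self._entities.get((partition_key, row_key))
```

For example, upserting {"PartitionKey": "books", "RowKey": "001", "Title": "Dune"} and then calling get("books", "001") returns that entity by key alone.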
Cosmos DB for PostgreSQL
For completeness: the Cosmos DB family also includes Cosmos DB for PostgreSQL (built on the Citus extension). Unlike the five APIs above, this one is relational — covered in Chapter 3. Watch for an exam distractor that lists it as a NoSQL API; it is distributed PostgreSQL, not a document/graph/column store.
A Decision Shortcut
- New app, no legacy DB → NoSQL.
- "We already use MongoDB / Cassandra" → the matching MongoDB or Cassandra API.
- "We query relationships / a graph" → Gremlin.
- "Upgrade our Azure Table storage" → Table.
- "Distributed relational PostgreSQL" → Cosmos DB for PostgreSQL (relational, not NoSQL).
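The shortcut above amounts to a lookup table. A minimal sketch, assuming a small fixed vocabulary of scenario "tells" (the tell strings and the choose_api name are hypothetical, chosen only to mirror the list):

```python
# Scenario tell -> API, mirroring the shortcut list one entry per bullet.
SHORTCUT = {
    "new app": "Cosmos DB for NoSQL",
    "existing mongodb": "Cosmos DB for MongoDB",
    "existing cassandra": "Cosmos DB for Apache Cassandra",
    "graph": "Cosmos DB for Apache Gremlin",
    "azure table storage": "Cosmos DB for Table",
    "distributed postgresql": "Cosmos DB for PostgreSQL",
}


def choose_api(tell):
    """Return the API matching a scenario tell, ignoring case and padding."""
    key = tell.strip().lower()
    if key not in SHORTCUT:
        raise ValueError(f"no shortcut rule for: {tell!r}")
    return SHORTCUT[key]
```

The point of the sketch is that the decision has no branching logic beyond recognizing the tell: once the scenario is classified, the API follows directly.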
How an API Choice Shapes Everything Downstream
The API you select at account creation is fixed for the life of the account and determines the data model, the query language, the client drivers, and even some portal tooling. You cannot later flip a NoSQL account into a Gremlin account; you would create a new account and migrate. This permanence is why the exam stresses matching the API to the workload up front: the decision is not easily reversed.
Mapping NoSQL Families to Cosmos DB APIs
Chapter 2 introduced the four NoSQL families. Cosmos DB's APIs map onto them directly, which is a clean way to remember the lineup:
| NoSQL family (concept) | Cosmos DB API |
|---|---|
| Document | NoSQL (native) and MongoDB |
| Column-family / wide-column | Apache Cassandra |
| Graph | Apache Gremlin |
| Key-value | Table |
This is also why a single "multi-model" service can cover so many scenarios: each API exposes the same globally distributed engine through the lens of a different data model.
Compatibility Is About Migration, Not Magic
The MongoDB and Cassandra APIs implement the respective wire protocols and query languages, so existing apps and tools connect with little change. This is compatibility for migration, not a promise that every server-side feature of the original product is reproduced identically — some advanced or version-specific features may differ. For DP-900 the message is simpler: choose the compatible API to move an existing MongoDB or Cassandra workload onto a managed, globally distributed backend with minimal code change.
Quick Identification Drill
- "JSON documents, SQL-like queries, new app" → NoSQL.
- "We use the mongo shell / MongoDB drivers today" → MongoDB.
- "CQL keyspaces and tables / Cassandra cluster today" → Cassandra.
- "Vertices, edges, traversals, friend-of-friend" → Gremlin.
- "PartitionKey/RowKey, upgrade from Azure Table storage" → Table.
- "Distributed relational PostgreSQL with Citus" → Cosmos DB for PostgreSQL (relational; not one of the five NoSQL APIs).
Drilling these one-line tells turns the most common Cosmos DB exam question into a quick pattern match rather than a memory test.
An organization runs a large self-managed Apache Cassandra cluster and wants to move to a fully managed, globally distributed service while keeping its existing CQL queries and Cassandra drivers. Which Azure Cosmos DB API should it use?
A startup is building a brand-new application with no existing database and wants the richest feature set and the SQL-like document query experience that receives new Cosmos DB capabilities first. Which API is Microsoft's recommended choice?