2.3 Transactional Workloads (OLTP) and ACID
Key Takeaways
- OLTP workloads are characterized by many small, fast, frequent transactions with low latency requirements.
- ACID stands for Atomicity, Consistency, Isolation, and Durability — the four guarantees a relational engine makes about transactions.
- Two-phase commit (2PC) coordinates a single transaction across multiple resource managers using a prepare phase and a commit phase.
- SQL Server / Azure SQL support five isolation levels: Read Uncommitted, Read Committed (default), Repeatable Read, Serializable, and Snapshot.
- OLTP schemas are normalized (typically to 3NF) to eliminate redundancy and protect data integrity during high-volume writes.
Online Transactional Processing (OLTP) is the workload that runs the business in real time: order entry, account transfers, ticket reservations, inventory updates. The system has to be correct under heavy concurrent load, not just fast.
OLTP Characteristics
Microsoft's reference profile for OLTP looks like this:
- Many small transactions. Each unit of work touches a few rows.
- High concurrency. Hundreds or thousands of users may be writing at once.
- Low latency. Each transaction completes in milliseconds.
- Normalized schema. Data is split across many tables to avoid duplication.
- Read/write mix. Heavy mix of inserts, updates, and deletes alongside reads.
- ACID required. A failed transfer must not leave money missing.
Azure services that target OLTP include Azure SQL Database, Azure SQL Managed Instance, SQL Server on Azure VMs, Azure Database for PostgreSQL, Azure Database for MySQL, and Azure Cosmos DB (which supports ACID at the partition level).
ACID Properties
A transaction is a logical unit of work — for example, "debit account A by $100 and credit account B by $100." A relational engine guarantees four properties for transactions, collectively called ACID.
| Property | Guarantee | What goes wrong without it |
|---|---|---|
| Atomicity | Either all statements in the transaction commit, or none do. | The debit succeeds but the credit fails — money disappears. |
| Consistency | The database moves from one valid state to another, respecting all constraints. | A foreign-key reference is left pointing at a non-existent row. |
| Isolation | Concurrent transactions do not see each other's intermediate state. | Two users withdraw from the same account and both see the full balance. |
| Durability | Once committed, the change survives crashes and power loss. | The server reboots and the committed sale is lost. |
These guarantees are enforced through the transaction log (for atomicity and durability), constraints (for consistency), and locking or row-versioning (for isolation).
Two-Phase Commit (2PC)
When a single transaction has to update more than one resource — say, an Azure SQL table and a queue, or two databases — the engine uses two-phase commit (2PC) to coordinate.
- Prepare phase. The transaction coordinator asks every participant "can you commit?" Each participant writes its changes to its log, locks the rows, and answers yes or no.
- Commit phase. If every participant said yes, the coordinator tells them all to commit. If any said no, the coordinator tells them all to roll back.
The upside is atomic distributed transactions. The downside is latency and the blocking problem: if the coordinator fails between the two phases, participants stay locked until it recovers. Most modern cloud-native designs avoid 2PC in favor of eventually consistent patterns.
Isolation Levels
Isolation is the most nuanced of the four ACID properties. SQL Server, Azure SQL Database, and Azure SQL Managed Instance support five isolation levels, from least to most strict.
| Level | Dirty read | Non-repeatable read | Phantom read | Mechanism |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | No shared locks for reads |
| Read Committed (default) | Prevented | Possible | Possible | Shared locks released after read, or row versioning if RCSI is on |
| Repeatable Read | Prevented | Prevented | Possible | Shared locks held until end of transaction |
| Serializable | Prevented | Prevented | Prevented | Range locks held until end of transaction |
| Snapshot | Prevented | Prevented | Prevented | Row versioning — reads see a consistent snapshot from transaction start |
The Three Read Anomalies
- Dirty read — reading a row another transaction has modified but not yet committed.
- Non-repeatable read — reading the same row twice in one transaction and getting different values.
- Phantom read — running the same range query twice and getting different sets of rows.
Choosing a Level
- Read Committed is the default in SQL Server and is usually right for OLTP.
- Serializable gives the strongest guarantees but reduces concurrency.
- Snapshot isolation uses row versioning in
tempdbto give reads a consistent point-in-time view without taking shared locks, which is excellent for read-heavy reporting against an OLTP store.
Why OLTP Schemas Are Normalized
In OLTP, you want each fact stored exactly once so an update touches a single row. Storing a customer's address in the Orders table would mean every change to that address requires updating every order — opening the door to update anomalies and to a lot of unnecessary I/O. Normalizing to 3NF is the standard target for OLTP. Analytical schemas (covered in section 2.4) deliberately go the other way.
Concurrency Control Mechanisms
Underneath the isolation levels are two ways the engine prevents transactions from corrupting each other's view of the data. Pessimistic concurrency uses locks: a transaction acquires shared (read) or exclusive (write) locks and others wait. This is the classic SQL Server default.
Optimistic concurrency uses row versioning: instead of blocking readers, the engine keeps prior versions of rows in tempdb (or in the database for accelerated database recovery) so a reader sees a consistent snapshot while writers proceed. Read Committed Snapshot Isolation (RCSI) and Snapshot isolation are the row-versioning options, and they are why a reporting query can run against a busy OLTP database without blocking writers — a frequent exam scenario.
Locking Hazards
- Blocking happens when one transaction holds a lock another needs; the second simply waits. Brief blocking is normal; sustained blocking hurts throughput.
- Deadlock happens when two transactions each hold a lock the other needs, forming a cycle. SQL Server detects deadlocks and kills one transaction (the deadlock victim, usually the one cheapest to roll back), returning error 1205. The application must retry. Knowing that the engine automatically resolves deadlocks by choosing a victim is a testable detail.
ACID vs BASE
Relational OLTP engines favor ACID (strong, immediate consistency). Many distributed NoSQL systems favor BASE — Basically Available, Soft state, Eventually consistent — trading immediate consistency for availability and scale. Cosmos DB blurs the line: it provides ACID guarantees within a logical partition and lets you tune consistency globally (the five levels in Chapter 4). The exam contrast is: relational SQL = ACID by default; globally distributed NoSQL = often eventual/BASE, but Cosmos DB makes consistency a configurable choice.
A Worked Isolation Example
Consider two users hitting one account row. User A starts a transfer (reads balance 500, intends to withdraw 100). Before A commits, User B reads the same row. Under Read Uncommitted, B could read A's uncommitted new balance of 400 (a dirty read) — and if A rolls back, B acted on a value that never existed. Under Read Committed (the default), B waits or reads the last committed value of 500. Under Serializable, the engine also prevents any phantom rows from appearing in a range B is scanning. Walking up the levels trades concurrency for correctness, which is exactly the trade-off DP-900 wants you to articulate.
A bank transfer debits account A by $200 and credits account B by $200. The debit succeeds but the credit fails because account B is closed, and the database rolls both statements back. Which ACID property is this behavior demonstrating?
A reporting query against an OLTP database must never see uncommitted changes, but it also must not block writers and is allowed to see a consistent point-in-time view of the data. Which isolation level best matches this requirement on Azure SQL Database?