What Is the CAP Theorem?
The CAP theorem, proposed by computer scientist Eric Brewer in 2000 and later proved by Seth Gilbert and Nancy Lynch in 2002, states that a distributed data store can provide at most two out of three of the following guarantees simultaneously:
- Consistency (C): Every read receives the most recent write or an error. All nodes see the same data at the same time.
- Availability (A): Every request receives a non-error response, without the guarantee that it contains the most recent write. The system is always operational.
- Partition Tolerance (P): The system continues to operate despite network partitions (communication failures between nodes). Messages between nodes can be dropped or delayed.
Why Not All Three?
Imagine you have two database nodes, Node A and Node B, that replicate data between each other. A network partition occurs, and the nodes can no longer communicate. Now a client writes data to Node A. When another client reads from Node B, you have a choice:
- Return the old data (Choose A over C): Node B responds with stale data. The system is available but inconsistent.
- Return an error (Choose C over A): Node B refuses to serve the read until it can confirm it has the latest data from Node A. The system is consistent but not available.
You cannot have both: you cannot return the most recent data from Node B if Node B has not received it from Node A due to the partition. This is the fundamental trade-off.
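This choice can be sketched in a few lines of code. The snippet below is illustrative only (the NodeState type, the read function, and the CP/AP mode flag are invented for this example, not any particular database's API): during a partition, a node in CP mode refuses the read, while a node in AP mode answers from its possibly stale local copy.

```typescript
type Mode = 'CP' | 'AP';

// Hypothetical local state of Node B while it cannot reach Node A
interface NodeState {
  data: Map<string, string>; // local, possibly stale copy
  isPartitioned: boolean;    // true while Node A is unreachable
}

function read(node: NodeState, key: string, mode: Mode): string {
  if (node.isPartitioned && mode === 'CP') {
    // Choose consistency: refuse to answer rather than risk serving stale data
    throw new Error('Unavailable: cannot confirm the latest value during a partition');
  }
  // Choose availability: answer from the local copy, which may be stale
  return node.data.get(key) ?? '';
}
```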
Why Partition Tolerance Is Non-Negotiable
In any distributed system deployed across a network, network partitions will happen. Cables get cut, routers fail, cloud availability zones become unreachable. Since partition tolerance is a reality rather than a choice, the practical choice is between CP (consistency + partition tolerance) and AP (availability + partition tolerance).
A system that does not tolerate partitions — a CA system — would essentially be a non-distributed system running on a single node with no replication. While this is technically consistent and available, it defeats the purpose of distributed systems, which is to survive failures.
CP Systems: Consistency + Partition Tolerance
A CP system prioritizes consistency over availability. During a network partition, it will refuse to serve requests rather than risk returning stale data. When the partition heals, all nodes will have consistent data.
Examples of CP Systems
- MongoDB (default configuration): In a replica set, writes go to the primary node. If the primary is unreachable due to a partition, the system elects a new primary. During the election, writes are rejected (sacrificing availability for consistency).
- HBase: Built on HDFS with strong consistency guarantees. Uses ZooKeeper for coordination. Becomes unavailable for a region if its region server fails until failover completes.
- Redis Cluster: Favors consistency over availability when failures occur. If a master fails and none of its replicas can be promoted, the cluster stops accepting writes for the affected slots. (Note that replication is asynchronous, so a short window of acknowledged writes can still be lost during failover.)
- etcd / ZooKeeper: These coordination services use consensus protocols (Raft and ZAB respectively) that guarantee strong consistency. They sacrifice availability during partitions: nodes on the minority side cannot commit writes or serve up-to-date reads.
```typescript
// CP behavior illustrated in TypeScript-flavored pseudocode
interface VersionedValue {
  value: string;
  version: number; // monotonically increasing version or timestamp
}

interface DatabaseNode {
  write(key: string, value: string): Promise<void>;
  read(key: string): Promise<VersionedValue>;
}

class CPDatabase {
  private nodes: DatabaseNode[];
  private quorumSize: number;

  constructor(nodes: DatabaseNode[]) {
    this.nodes = nodes;
    this.quorumSize = Math.floor(nodes.length / 2) + 1; // majority quorum
  }

  async write(key: string, value: string): Promise<boolean> {
    // Must get acknowledgment from a majority of nodes (quorum)
    const results = await Promise.allSettled(
      this.nodes.map((node) => node.write(key, value))
    );
    const acks = results.filter((r) => r.status === 'fulfilled').length;

    if (acks >= this.quorumSize) {
      return true; // Write committed - consistency guaranteed
    }
    // Cannot reach quorum - reject the write (sacrifice availability)
    throw new Error('Write failed: cannot reach quorum');
  }

  async read(key: string): Promise<string> {
    // Must read from a majority to be sure we see the latest committed value
    const results = await Promise.allSettled(
      this.nodes.map((node) => node.read(key))
    );
    const responses = results
      .filter((r): r is PromiseFulfilledResult<VersionedValue> => r.status === 'fulfilled')
      .map((r) => r.value);

    if (responses.length >= this.quorumSize) {
      // Return the value with the highest version/timestamp
      return this.getLatestValue(responses);
    }
    throw new Error('Read failed: cannot reach quorum');
  }

  private getLatestValue(responses: VersionedValue[]): string {
    return responses.reduce((a, b) => (b.version > a.version ? b : a)).value;
  }
}
```
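The majority quorum is what makes this consistent: any two majorities of the same set of nodes overlap in at least one node, so every quorum read intersects the most recent quorum write and can observe its value. The price is exactly the CP trade-off: nodes on the minority side of a partition can never assemble a quorum, so they serve neither reads nor writes until the partition heals.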
AP Systems: Availability + Partition Tolerance
An AP system prioritizes availability over consistency. During a network partition, all nodes continue to accept reads and writes. When the partition heals, the system reconciles conflicting writes using techniques like last-write-wins, vector clocks, or CRDTs.
Examples of AP Systems
- Cassandra: Designed for availability. Every node can accept reads and writes. Uses tunable consistency levels — you can configure how many replicas must respond before a read or write is considered successful, allowing you to slide between AP and CP behavior per query.
- DynamoDB: Amazon's managed NoSQL database favors availability. In its default configuration, reads may return stale data (eventually consistent reads), but you can opt into strongly consistent reads at the cost of higher latency.
- CouchDB: Uses a multi-master replication model where all nodes can accept writes. Conflicts are detected and stored, and the application must resolve them.
- DNS: The Domain Name System is a classic AP system. DNS servers can serve stale records during partitions, prioritizing availability over consistency. Updates propagate eventually.
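For contrast with the CPDatabase sketch above, here is the AP counterpart in the same style of pseudocode (the APNode class and its last-write-wins merge are a simplified illustration, not any specific product): every node answers reads and writes from its local state, and conflicting versions are reconciled after the partition heals.

```typescript
// AP behavior illustrated in pseudocode: always respond, reconcile later
interface TimestampedValue {
  value: string;
  timestamp: number; // used for last-write-wins reconciliation
}

class APNode {
  private data = new Map<string, TimestampedValue>();

  // Writes are acknowledged immediately from local state (always available)
  write(key: string, value: string): void {
    this.data.set(key, { value, timestamp: Date.now() });
  }

  // Reads return the local copy, which may be stale during a partition
  read(key: string): string | undefined {
    return this.data.get(key)?.value;
  }

  // When the partition heals, merge with a peer using last-write-wins.
  // (Real systems may use vector clocks or CRDTs for safer reconciliation.)
  merge(peer: APNode): void {
    for (const [key, remote] of peer.data) {
      const local = this.data.get(key);
      if (!local || remote.timestamp > local.timestamp) {
        this.data.set(key, remote);
      }
    }
  }
}
```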
CP vs AP Systems Comparison
| Aspect | CP Systems | AP Systems |
|---|---|---|
| During partition | May reject requests | Always responds |
| Data freshness | Always current | May be stale |
| Write conflicts | Prevented | Must be resolved |
| Latency | Higher (waits for quorum) | Lower (local response) |
| Use case | Financial systems, inventory | Social media, shopping carts |
| Examples | MongoDB, HBase, etcd | Cassandra, DynamoDB, CouchDB |
PACELC Theorem
The CAP theorem only describes behavior during a network partition. But what about when the network is working normally? The PACELC theorem, proposed by Daniel Abadi, extends CAP to address this:
If there is a Partition (P), choose between Availability (A) and Consistency (C). Else (E), when the system is operating normally, choose between Latency (L) and Consistency (C).
This is important because even without partitions, there is a trade-off between consistency and latency. A system that requires a write to be replicated to all nodes before acknowledging it (strong consistency) will have higher latency than one that acknowledges immediately and replicates in the background (eventual consistency).
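A minimal sketch of that normal-operation trade-off (the Replica interface and both write functions are hypothetical, invented for this illustration): the strongly consistent path waits for every replica before acknowledging, so its latency is governed by the slowest replica; the eventually consistent path acknowledges immediately and replicates in the background.

```typescript
interface Replica {
  apply(key: string, value: string): Promise<void>; // one network round trip
}

// EC choice: acknowledge only after every replica has applied the write.
// Latency is at least one round trip to the slowest replica.
async function writeStronglyConsistent(
  replicas: Replica[], key: string, value: string
): Promise<void> {
  await Promise.all(replicas.map((r) => r.apply(key, value)));
}

// EL choice: acknowledge immediately and replicate in the background.
// Readers may briefly see the old value on some replicas.
function writeEventuallyConsistent(
  replicas: Replica[], key: string, value: string
): void {
  for (const r of replicas) {
    void r.apply(key, value); // fire and forget
  }
}
```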
PACELC Classifications
| System | During Partition (PAC) | Normal Operation (ELC) |
|---|---|---|
| Cassandra | PA (Available) | EL (Low latency) |
| DynamoDB | PA (Available) | EL (Low latency) |
| MongoDB | PC (Consistent) | EC (Consistent) |
| PostgreSQL | PC (Consistent) | EC (Consistent) |
| Cosmos DB | PA/PC (Tunable) | EL/EC (Tunable) |
How to Choose for Your System
The right choice depends entirely on your application's requirements.
Decision Framework
- Choose CP when correctness is critical: Financial transactions (bank transfers), inventory management (overselling is costly), leader election, distributed locks, and any system where stale data could cause harm.
- Choose AP when availability is paramount: Social media feeds (showing a slightly stale post is fine), shopping carts (merge conflicts later), analytics and logging (some data loss is acceptable), and any system where downtime directly impacts revenue.
- Use tunable consistency when possible: Many modern databases (Cassandra, DynamoDB, Cosmos DB) let you choose consistency on a per-query basis. Use strong consistency for critical operations and eventual consistency for everything else.
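DynamoDB is a concrete example of per-request tuning. The sketch below uses the AWS SDK for JavaScript v3; the region, table name, and key are placeholders for this illustration. The same GetItem call is eventually consistent by default and strongly consistent when ConsistentRead is set.

```typescript
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({ region: "us-east-1" }); // placeholder region

async function readOrder(orderId: string) {
  // Default: eventually consistent read - cheaper and faster, may be stale
  const eventual = await client.send(new GetItemCommand({
    TableName: "Orders",              // placeholder table name
    Key: { orderId: { S: orderId } },
  }));

  // Strongly consistent read - reflects all writes acknowledged before it started
  const strong = await client.send(new GetItemCommand({
    TableName: "Orders",
    Key: { orderId: { S: orderId } },
    ConsistentRead: true,
  }));

  return { eventual: eventual.Item, strong: strong.Item };
}
```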
Common Misconception
The CAP theorem does not mean you must sacrifice consistency or availability all the time. You sacrifice one only during network partitions, which are (hopefully) rare events. During normal operation, you can have both strong consistency and high availability. The PACELC theorem makes this distinction explicit: during normal operation, the trade-off is between latency and consistency, not availability and consistency.