Technical·November 2024

CAP Theorem and Real Systems

The CAP theorem says a distributed system can guarantee at most two of three properties: Consistency, Availability, and Partition tolerance.

Since network partitions happen in the real world and you can't opt out, the real tradeoff is: when a partition occurs, do you want the system to return potentially stale data (AP) or refuse to respond until it's sure it has fresh data (CP)?
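
As a rough illustration of that choice, here is a minimal sketch of a toy replica that is cut off from its peers. The `Replica` class, its `mode` field, and the `Unavailable` exception are hypothetical names invented for this example, not any real database's API.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


@dataclass
class Replica:
    # Hypothetical toy replica; `mode` is either "CP" or "AP".
    mode: str
    value: str = "v1"          # last value this replica has seen
    partitioned: bool = False  # True when cut off from the rest of the cluster

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # The replica cannot confirm its copy is current, so it refuses.
            raise Unavailable("cannot confirm latest value during partition")
        # An AP (or healthy) replica answers from its local copy,
        # which may be stale if a newer write landed elsewhere.
        return self.value


def demo() -> dict:
    cp = Replica(mode="CP", partitioned=True)
    ap = Replica(mode="AP", partitioned=True)
    # Assume the other side of the partition has since accepted "v2";
    # neither isolated replica has seen it.
    results = {"ap": ap.read()}
    try:
        results["cp"] = cp.read()
    except Unavailable:
        results["cp"] = "unavailable"
    return results
```

Under these assumptions the AP replica still answers with the old value, while the CP replica returns an error rather than risk serving stale data.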

CP systems (ZooKeeper, HBase): stay consistent during a partition, but some nodes may become unavailable. Better suited to coordination, locks, and leader election.

AP systems (Cassandra, CouchDB): stay available during a partition, but different nodes may temporarily see different data. Better suited to user-facing reads where eventual consistency is acceptable.

What CAP doesn't capture is latency. A CP system that responds in 100 ms is a different product from one that takes 10 seconds. PACELC is a more complete model: if there is a Partition, you choose between Availability and Consistency; Else, even in normal operation, you trade Latency against Consistency.
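
One way to see the latency side of this tradeoff is a simplified model in which a read is sent to every replica in parallel and completes once a required number of them have answered. The function name and the latency figures below are assumptions made up for illustration, not measurements.

```python
def read_latency_ms(replica_latencies_ms: list[float], required_acks: int) -> float:
    """Latency of a read that waits for the fastest `required_acks` replicas."""
    if not 1 <= required_acks <= len(replica_latencies_ms):
        raise ValueError("required_acks must be between 1 and the replica count")
    # Requests go out in parallel, so the read finishes when the
    # required_acks-th fastest replica has responded.
    return sorted(replica_latencies_ms)[required_acks - 1]


# Hypothetical cluster: two nearby replicas and one in a remote region.
latencies = [12.0, 15.0, 240.0]

one_replica = read_latency_ms(latencies, 1)   # weakest guarantee, fastest
majority = read_latency_ms(latencies, 2)      # waits for two replicas
all_replicas = read_latency_ms(latencies, 3)  # strongest guarantee, slowest
```

In this model, waiting for more replicas can only keep latency the same or increase it, and waiting for all of them makes every read as slow as the slowest replica, even when no partition exists.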

In practice, most systems are configurable: Cassandra, for example, lets you tune the consistency level per query. The interesting decision is not "pick one" but "pick the right tradeoff for each operation."
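
A common way to reason about per-operation tuning is the quorum overlap rule: with N replicas, a read that waits for R replicas and a write that waits for W replicas are guaranteed to share at least one replica when R + W > N. The sketch below is plain Python rather than the Cassandra driver API. The level names mirror Cassandra's ONE, QUORUM, and ALL, but the helper functions, the operation names in `POLICY`, and the replication factor of 3 are assumptions for illustration.

```python
REPLICATION_FACTOR = 3  # assumed number of replicas (N) per key


def acks_for(level: str, n: int) -> int:
    """Number of replicas that must respond at a given consistency level."""
    return {"ONE": 1, "QUORUM": n // 2 + 1, "ALL": n}[level]


def overlaps(read_level: str, write_level: str, n: int = REPLICATION_FACTOR) -> bool:
    """True when every read set intersects every write set (R + W > N)."""
    return acks_for(read_level, n) + acks_for(write_level, n) > n


# Hypothetical per-operation policy: (read level, write level).
POLICY = {
    "update_balance": ("QUORUM", "QUORUM"),  # correctness matters more than speed
    "render_timeline": ("ONE", "ONE"),       # a stale read is acceptable
}

report = {op: overlaps(r, w) for op, (r, w) in POLICY.items()}
```

Here `update_balance` pays for overlapping quorums, while `render_timeline` accepts possibly stale reads in exchange for lower latency and higher availability. The rule describes replica overlap only; real systems add details (failed partial writes, repair behavior) that this sketch ignores.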
