In response to You Do It Too: Forfeiting Partition Tolerance in Distributed Systems, I’d like to remind folks of a few things around CAP.

Partition intolerance does not mean that partitions cannot happen, it means partitions are not supported.

Specifically, partition-intolerant systems must sacrifice invariants when partitions occur. Which invariants? By Gilbert & Lynch, either the system allows nonlinearizable histories, or some requests to non-failing nodes cannot complete. Related proofs tell us that systems which preserve availability during partitions also cannot provide sequential consistency, serializability, repeatable read, cursor stability, or snapshot isolation.

CP, AP describe the behavior if a partition occurs.

CP and AP are upper bounds: systems can provide C or A during a partition, but might provide neither.

This obviously leaves room for an overlap between the two categories.

This is not obvious at all. CP describes systems which are both C and P; AP describes systems which are both A and P. The existence of an overlap between CP and AP implies there exists some system which is C, A, and P. The entire point of the CAP theorem is that such systems cannot exist. The existence of such a system, even theoretically, would disprove the theorem. You’ve got a paper to publish.

Many CA systems are not CP.

Every CA system is not CP.

Many CP systems are not CA.

Every CP system is not CA. Or the theorem’s wrong, and we’ve got several proofs to overturn!

Systems that belong to these two categories are only systems that stop working during the partition, but are consistent once the partition is fixed (trivially a webserver connected to a database). I personally prefer to call these systems ‘CA’…

Gilbert & Lynch’s proof is very clear about what constitutes availability: “For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response.” Systems which stop working during a partition trivially fail the theorem’s definition of availability by failing to make progress. You can’t call them CA.

Without data agreed upon, there is no real way out from this debate.

Luckily, we do have data: network partitions happen in LANs all the time. Claiming small clusters will save you is a hand-waving argument at best: there are plenty of cases of clusters as small as two nodes, connected by redundant physical switches, encountering network partitions.

“node failures, processes crashes and network partitions are partitions so you have to be partition tolerant”. This is not only false but also dangerous: it hides the fact that each of these faults could be tackled independently with a specific priority.

I’d like to reiterate that “network partitions” don’t only happen in the network. In a formal model of a distributed system, like that used in the Gilbert & Lynch proof, we refer to everything that transmits messages between processes as “the network”, and make idealizing assumptions about the processes themselves: e.g. they are always singlethreaded, they execute in bounded time, etc. Real software is fuzzier: our processes are usually not realtime, which means the network effectively extends within the node. Garbage collection, in particular, is a notorious cause of “network” partitions, because it delays messages.

More informally, I suspect that attempting to tackle network partitions as an independent type of fault is why so many databases fail their Jepsen tests. It’s easier to choose an algorithm which is safe in the general case of asynchronous networks, than to try and impose synchronous delivery via imperfect failure detectors and special-cases.


I want that last paragraph on a T-shirt.

Post a Comment

Comments are moderated. Links have nofollow. Seriously, spammers, give it a rest.

Please avoid writing anything here unless you're a computer. This is also a trap:

Supports Github-flavored Markdown, including [links](, *emphasis*, _underline_, `code`, and > blockquotes. Use ```clj on its own line to start an (e.g.) Clojure code block, and ``` to end the block.