Note: the original version of this article is available here.
Why was CockroachDB designed to be compatible with PostgreSQL?
Perhaps surprisingly, the answer to this question is currently (as of early May 2018) not documented publicly, neither on the CockroachDB web site, nor in the CockroachDB docs, nor in CockroachDB-related articles in third party sources.
I was lucky to sit next to Ben Darnell, CTO of Cockroach Labs, when Lakshmi Kannan, Product Manager, asked us the very same question. What follows is an extended rewording of our resulting discussion.
Disclaimer: this is a personal recollection without notes and after a week had passed. Any inaccuracy in the ideas, arguments, timelines, statements, facts or opinions recollected here is entirely mine.
CockroachDB could very well have been created with its own SQL dialect and/or network protocol. This is the way RethinkDB did it. Although Google's Spanner's SQL dialect is inspired by MySQL, it is not fully compatible with it either. Compatibility with anything is a choice, it is not inherently necessary when building a SQL RDBMS. Avoiding compatibility can even be a way to commercial success, by enabling vendor lock-in.
The primary reason to choose compatibility is to reduce the amount of work needed to attract users.
When designing a custom network protocol, the database provider must also provide client drivers. Considering the multitude of platform and programming languages in use today, a custom protocol yields a hard choice: either focus on a few platform/language combinations and only provide drivers for that, or divert a lot of engineering resources on client drivers. The former choice yields a smaller potential user base. The latter choice takes resources away from the implementation of a better RDBMS.
By adopting an established network protocol, especially one for which drivers exist on many platforms and for many programming languages, a new RDBMS makes itself immediately available to a larger ecosystem of existing client code.
This is why CockroachDB chose to be compatible with an established RDBMS.
There is one more thing to be said here about multiple layers of compatibility. I will come back to this in a moment.
There are plenty of well-known and widely used RDBMSs out there. After deciding a new project should be compatible with anything, how does one choose what to be compatible with?
Here, two first-order constraints came at play for CockroachDB.
The first is that CockroachDB was open source from the get-go. It was thus essential to choose compatibility with other RDBMSs that have open source drivers and applications. More specifically, it was legally necessary to choose compatibility with other RDBMSs whose open source drivers are legally allowed to connect to another product than what they were designed for. An open source driver for MSSQL, for example, is no good, because the MSSQL network protocol is likely patent-encumbered and no "MSSQL compatible" database can be built without onerous agreements with Microsoft (if at all). Substitute "MSSQL" by "DB2" or "Oracle" and the argument remains.
The second constraint is that CockroachDB aimed to appeal to open source developers. What do open source developers like? Open source client drivers are one thing, but the ecosystem around the technology matters much more. There must exist welcoming communities, public discussion forums, a free market of third party software, free and open learning resources, etc. In contrast, many proprietary RDBMSs restrict discussion around their tech to semi-private forums. They often have complex third party publishing restrictions. Training to learn and use the technology is usually restricted and expensive.
So for CockroachDB, proprietary RDBMS were out of the compatibility story from the get-go.
What remains? CockroachDB looked at:
Looking for long-established and widely-used open source RDBMS that are commonly known to serve as compatibility anchor for NewSQL databases, the search narrows down to just two: MySQL and PostgreSQL.
Truth be told, as Ben Darnell recollected, MySQL was an appealing choice initially.
Google's Spanner, developed with similar ideas as CockroachDB and with a similar audience in mind, chose for some amount of MySQL compatibility. Other recent NewSQL RDBMSs, like MemSQL, have opted for MySQL compatibility too. Although it was not yet known when CockroachDB started, TiDB, the closest thing to a competitor, chose for MySQL compatibility squarely. Ben Darnell, like the rest of the early team at Cockroach Labs, even had more personal experience with MySQL than with PostgreSQL.
So what went on?
Initially, CockroachDB was swinging on both sides and paddled with the idea of compatibility with MySQL, especially with Spanner in sight. What tipped the balance in PostgreSQL's favor was a combination of multiple factors.
There was initially a clear impression that PostgreSQL's documentation of its network protocol was clearer, more detailed and overall more supportive of a third party implementation than MySQL's documentation of its own protocol.
Meanwhile, the PostgreSQL License is compatible with CockroachDB's own Apache License, which enables reuse of (some of) PostgreSQL's own source code in CockroachDB unchanged. In contrast, MySQL (and its successor MariaDB) is released under the GNU GPL, which prevents direct reuse of MySQL code in CockroachDB.
Also, it slowly but surely became clear that MySQL had grown organically over time in a somewhat less principled manner than PostgreSQL. Throughout the documentation and source code, MySQL seemed to have more exceptions and special cases to care about. Proper SQL transactions came much later to MySQL than to PostgreSQL, and there remains significant cruft in the MySQL documentation and the MySQL ecosystem as a fallout of weak transaction isolation. As I understood the argument, the CockroachDB team did not like the perspective of sharing the same culture and ecosystem as one that usually only sees transaction isolation as a complicated, opt-in, "advanced" feature.
These factors alone caused CockroachDB to slowly but surely focus on PostgreSQL compatibility only. But as time went on, some other aspects came up and confirmed this was a good choice. More on this below.
CockroachDB currently claims compatibility with PostgreSQL's network protocol, and partial compatibility with its SQL dialect. What does this mean?
Protocols for communication between things on a network are typically layered.
On the Internet, for example, the IP layer is at a lower level, TCP higher than IP (IP can function independently from TCP, and TCP is built upon it), and HTTP is higher than TCP (TCP can function independently from HTTP, and HTTP is built upon it).
For RDBMs the communication is organized like this:
SQL RDBMs do not typically explain their communication protocols like this in their docs, but application developers see this structure clearly in their code: when a driver can connect successfully to a database (the network protocol) there is still work to do, because they can still get errors if the app sends invalid SQL punctuation (the SQL syntax). Even when the punctuation is fine, there can still be errors when the app uses SQL functions not currently supported on the server (the SQL semantics). These practical stumbles during the development process of apps are the visible artifacts of a layered communication protocol.
Today, CockroachDB already supports PostgreSQL's network protocol, called "pgwire" in the CockroachDB source code, very well. It also supports most of PostgreSQL's SQL syntax, by virtue of being able to reuse PostgreSQL's syntax parser virtually as-is (with extensions). The compatibility story is however currently more murky at the level of the dialect's semantics, simply because there is much more work to achieve adequate compatibility at that level.
Initially, the CockroachDB team's idea was that CockroachDB would be compatible with PostgreSQL's network protocol, so as to enable reuse of client drivers, but provide its own, potentially PostgreSQL-incompatible SQL syntax and/or dialect semantics. In particular, it was envisioned that CockroachDB would provide sufficiently many specific SQL extensions that a custom SQL dialect would be necessary (or at least warranted).
And so early (non-released) versions of CockroachDB had a hybrid MySQL/Spanner/PostgreSQL dialect, available to clients over the PostgreSQL network protocol.
The assumption underlying this strategy is that users wanted to reuse PostgreSQL drivers (which talk the network protocol) but would accept using CockroachDB-specific SQL queries with their drivers in exchange for CockroachDB-specific benefits.
As the team learned the hard way on the ramp-up to CockroachDB 1.0, most developers in the current open source ecosystems that CockroachDB wants to enter do not write their own SQL queries any more—as opposed to, e.g., ten or twenty years ago.
They will either copy-paste examples from third party documentation, for example Stack Overflow, or use abstract database frameworks, in particular ORM frameworks which generate SQL queries automatically from higher-level specifications. By and large, developers do not care much for SQL any more and want to reuse their frameworks. This was understood, but perhaps under-estimated, in the early phases of the CockroachDB project.
By early 2016, it became clear CockroachDB did not have just client drivers to make compatible to stimulate adoption; it also had to make the higher level frameworks compatible. This in turn meant that the initial idea to restrict compatibility to the network protocol was insufficient, and extended the compatibility mandate throughout CockroachDB's entire SQL layer. That, or invest engineering work to extend existing PostgreSQL frameworks to make them work with CockroachDB.
This is now well-understood. Both directions are being heavily invested in.
The priority is now to increase compatibility with PostgreSQL's
semantics out-of-the-box, for example by providing more and more of
PostgreSQL's built-in functions and operators, and exposing more
compatible data via the
introspection tables. This work is ongoing and future versions of
CockroachDB aim to become increasingly more compatible in this way.
Meanwhile, there exist a few features where CockroachDB will likely never be fully compatible with PostgreSQL due to a fundamental difference in database architecture. For example, PostgreSQL's FOR locking clauses cannot be implemented efficiently in CockroachDB at all, because the concurrency control primitives in CockroachDB are so different. When existing client frameworks for PostgreSQL require such features, Cockroach Labs will instead invest into providing custom CockroachDB-specific versions of the framework, either by building them in-house or supporting third parties to do the work. As CockroachDB becomes more popular, the developers of the frameworks might even choose to implement CockroachDB-specific alternatives themselves.
There are two aspects here.
Does the CockroachDB team want to compete with PostgreSQL?
At this point, no. There is currently no ambition to replace PostgreSQL where it excels and is most often deployed: vertically scalable, robust and reliable SQL on a single server, or with active/passive replication in a single datacenter. Instead, CockroachDB is targeting larger deployments than those where PostgreSQL is typically deployed.
How likely is it that CockroachDB will happen to displace PostgreSQL?
It's rather unlikely. The computing industry is still expanding and there is new business to be made both for small and large scale deployments. There is also a general principle that the longer a piece of software has been around, the longer it will remain around (see: COBOL, Emacs). PostgreSQL has been around sufficiently long that most newly minted software engineers this year will have retired before PostgreSQL stops expanding its user base.
As it turns out, making CockroachDB compatible with PostgreSQL is actually a lot of work. Would have it been any different if CockroachDB had chosen another route instead?
The first alternative, to provide a custom network protocol and SQL dialect, forcing the creation of a new application ecosystem "from scratch", cannot be seriously considered. CockroachDB's money-making business ultimately aims to sell products and services to "enterprise" users, who are notoriously and factually averse to change and young technology. Without compatibility with an established software ecosystem, there is a huge bootstrapping problem which the team had no experience in solving. The failure of the company behind RethinkDB was a chilling reminder that this is a critical risk.
Meanwhile, even without considering the money-making aspects, adopting compatibility has a huge benefit on the development process: it removes open questions from the design discussions. Open questions in the client/server interface of a networked software are a huge time sink and usually becomes source of costly disagreements between sub-teams or long-winded design iterations. By adopting compatibility with something, anything really, the CockroachDB team was able to utilize a definite source of answers for a large number of design decisions. This has accelerated the development by focusing efforts on more important matters, for example resilience and operational simplicity.
Another alternative route was to provide compatibility with MySQL instead. The MySQL protocol and dialect, albeit less well documented and less principled as discussed above, is arguably simpler to implement—especially for the semantics aspects beyond the network protocol. MySQL has fewer SQL data types, fewer built-in functions, fewer introspection facilities, fewer configuration options. All this may translate to less work for CockroachDB, as it does, for example, for TiDB.
At this time, the CockroachDB team is still pretty comfortable with the choice of PostgreSQL over MySQL. The arguments discussed above still hold, especially the code license and the programmer culture. In addition, over time, several other aspects in favor of PostgreSQL became clear.
For one, the acquisition of MySQL AB, the company behind MySQL, by Sun Microsystems (now Oracle) in 2008 has fragmented the MySQL ecosystem. MariaDB, while initially fully compatible, now provides its own feature set. There is no unified steering governance body behind MySQL's future any more. PostgreSQL can be considered more future-proof in that regard. Also, Oracle now owns the bulk of MySQL's intellectual property, and a company building a product too closely related to MySQL and appealing to a similar enterprise audience might land a bit too close to Oracle's anti-competition radar for comfort.
Separately, PostgreSQL's reference documentation is extremely clean, well-written, well-presented and developer-friendly, moreso than MySQL's. It is an effective complement to CockroachDB's own documentation, and users have reported that they are happy to use both together. PostgreSQL's documentation also has an excellent reputation, and this reputation leaks culturally towards CockroachDB, providing a slight reputation boost.
Finally, as years go by the specific compatibility choice ends up mattering less and less. By the time a project becomes well-known and enough users have adopted it in their tech stack, (social) network effects and the quality of provided support resources largely overshadow the initial ramp-up benefits of (tech) network or dialect compatibility. Hopefully, CockroachDB will find itself in this position soon enough.
So yes, in retrospect, probably still the right choice.
Copyright © 2018 Raphael ‘kena’ Poss. Permission is granted to distribute, reuse and modify this document according to the terms of the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/.