A few months ago, I was invited to present CockroachDB to a tech consulting office in Amsterdam. The audience was welcoming and receptive. They understood, appreciated, and lauded the “flagship” features of CockroachDB: distribution, scalability, high availability, operating simplicity.
Yet a question came up which I had not heard before: all these are features that solve known problems; now, what are the goodies?
The goodies, the asker clarified, are those features which:
- the user did not expect,
- are not present in other products, and
- are small-ish in nature so that a casual user can easily show them off to a peer.
Goodies enable users to brag about their product choice after the choice is made, without paying too much attention to the rational trade-offs that motivated the choice.
I paused, and recollected. What are CockroachDB’s goodies?
Obviously, the main CockroachDB documentation is unlikely to highlight features directly in this way: the documentation aims to treat all features as novel and useful, making no assumptions about what a particular reader may like more over another. Arguably, the doc site is also a marketing tool aiming to convince users who do not use CockroachDB yet, so it is bound to focus primarily on CockroachDB’s core features.
Finding “goodies” requires looking at the thing as if all its core features were already considered familiar and uninteresting, and contemplating what sticks out beyond that in an agreeable way.
Searching for a fancy feature suitable to elicit a “wow” reaction in demonstration booths, I quickly thought of the Node map: a graphical visualisation of the geographical distribution of CockroachDB nodes in the world.
Arguably, this feature is very enterprise-y (and incidentally limited to deployments with an “Enterprise license”), and perhaps of limited use when the database operates properly.
We can instead look at the layer underneath, another goodie of a more technical nature: the configuration of replication zones, which lets a user control which parts of which SQL tables are replicated on which (sub-)sets of cluster nodes.
The zone config language is a DSL (domain-specific language) which supports a constraint algebra against arbitrary attributes of the underlying data stores. It supports both positive (mandate) and negative (avoid) conjunctions (mandate/avoid compatibility with all properties) and disjunctions (mandate/avoid compatibility with either/or properties). Its constraint solver results in automatic migration events which move the data where it is constrained. It also interacts peacefully and constructively with the automatic load balancing that happens independently to increase performance: data is migrated within its constrained zone to bring it closer to where it is needed.
I described this as a solid and serious feature that is both practically essential and appealing to an audience of erudite hackers. My audience agreed.
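As a rough illustration of such constraints, the sketch below uses the SQL form of zone configuration found in more recent CockroachDB releases; older releases exposed the same settings through a separate command-line tool instead. The table name orders and the attributes region=eu-west, region=us-east and hdd are hypothetical: they would need to match the locality and store attributes your nodes were actually started with.

```sql
-- Hypothetical table and attributes, for illustration only.
-- "+" marks a required attribute, "-" a prohibited one.
ALTER TABLE orders CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '[+region=eu-west, -hdd]';

-- Per-replica form: require at least one replica in each listed region.
ALTER TABLE orders CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{"+region=eu-west": 1, "+region=us-east": 1}';
```

Once such a statement is accepted, the cluster is expected to move replicas in the background until the constraints hold; no manual data migration step is involved.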
Having contributed to some parts of the code base, I am aware of several more goodies which I contributed to, directly or indirectly.
For example, CockroachDB integrates a fancy tracing infrastructure which can extract detailed debugging information. The collection of traces can be enabled using a variety of mechanisms depending on the troubleshooting scenario. For example, one can request a detailed trace of all the processing done by CockroachDB on behalf of a single query, throughout all the abstraction layers inside CockroachDB and across all the nodes in the cluster that participated in the query’s execution. Many other tracing endpoints beyond SHOW TRACE are also available, including via the web browser. It’s also possible to trace all executions through particular files or functions in CockroachDB’s source code.
Given the commonly known arduousness of debugging large distributed systems, developers will likely find some appeal in this powerful tool. It has certainly improved the life of CockroachDB’s contributors already.
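To give a feel for what a SQL-level trace looks like, here is a minimal sketch using session tracing. The SHOW TRACE syntax has changed between CockroachDB releases, so treat the exact statements as an assumption to check against your version; the table orders and the filter string are hypothetical.

```sql
-- Record a trace for the statements run in this session.
SET tracing = on;
SELECT count(*) FROM orders;   -- hypothetical statement under investigation
SET tracing = off;

-- The recorded trace is tabular, so it can be filtered like any other table.
SELECT age, operation, message
  FROM [SHOW TRACE FOR SESSION]
 WHERE message LIKE '%orders%'
 ORDER BY age;
```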
Speaking of which, a fancy advantage of exposing tracing data within SQL is that one can then further use SQL queries to filter, transform and reduce particular details of traces. In fact, CockroachDB generalizes this principle: any internal data produced by CockroachDB that can be structured as a table should be available for further processing by SQL queries.
Here, I am not considering that CockroachDB, like other SQL databases, exposes the SQL logical schema via SQL tables (e.g. information_schema) which can be queried for introspection.
Instead, beyond that, any configuration or administration SQL statement can also be used as a “virtual table” to query from. For example, there exists a SHOW JOBS statement that lists the current background tasks in the cluster (e.g. asynchronous online schema changes, such as adding an index on a very large table); given that this produces tabular data, one can refine the output with e.g. SELECT finished - created FROM [SHOW JOBS] to determine the execution time of completed jobs. This enables users to design their own views on the current status of their cluster, without the need to request an extension in CockroachDB’s SQL syntax.
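The bracket syntax composes with the rest of SQL, so the one-liner above can grow into a small report. The following is a sketch only: it assumes the description, status, created and finished columns of SHOW JOBS, and that completed jobs are reported with the status 'succeeded'; other column names have varied between versions.

```sql
-- Five slowest completed background jobs, longest first.
SELECT description,
       finished - created AS duration
  FROM [SHOW JOBS]
 WHERE status = 'succeeded'
 ORDER BY duration DESC
 LIMIT 5;
```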
There is also a command-line SQL client (cockroach sql), analogous to the psql client of PostgreSQL. In fact, it’s so compatible with it that psql can connect to a CockroachDB cluster, and cockroach sql can connect to a PostgreSQL database.
Despite its smaller set of features compared to psql, cockroach sql contains its own goodies. For example, both psql and cockroach sql can present the user with guidance about the syntax and usage of a SQL statement using \h. cockroach sql can also present this help if the user presses ?? then the tab key while they are currently entering a query. This enables the use of contextual help without erasing the current entry, which is particularly convenient while experimenting. To ease experimentation further, cockroach sql also supports \hf (not known to psql), which is able to pull the documentation of individual SQL built-in functions.
On a related note, the cockroach executable program contains many other functions besides the main server function (start, etc.) and the SQL shell (sql). Some of them are gems of their own.
cockroach demo is a fantastic entry point for beginners, and for teachers constructing a SQL tutorial: in one fell swoop, it starts a RAM-only CockroachDB server and an interactive SQL shell, with no additional configuration needed. Type this command in, then you can start typing SQL immediately and work with CockroachDB. Lovers of sqlite tend to like this a lot. (I do too. It’s gorgeously helpful to try out new code during development.)
cockroach gen man will generate CockroachDB’s unix manual pages automatically, ready to read or install. Cockroach Labs distributes a single binary, to simplify the download process, but you can still install its documentation the Right Way, like for all your other beloved unix programs.
cockroach gen autocomplete generates auto-completion data for either Bash or Zsh. Avid users of the CockroachDB command line will surely appreciate this convenience, which is designed to accelerate operations and maintenance.
There is even an Easter egg hidden in cockroach gen somewhere, but I am not telling. Will you be able to spot the CockroachDB logo?
There is much I could write about CockroachDB’s unique technical features. Yet, at this point, I would like to shift this exposé and underline that CockroachDB’s own documentation site is itself quite a unique achievement. To a casual observer, “it’s just a documentation site for a technical project”. But for enthusiasts of documentation resources, there is much to love.
For example, the documentation presents content both from the angle of usage scenarios (e.g. “how to do this or that”) and as a reference manual (i.e. “what is everything I need to know about an aspect of the product”). There is content both for absolute beginners (e.g. “Getting started” guides) and technical audiences (e.g. an in-depth presentation of CockroachDB’s architecture). Cross-references are exhaustive and relevant, so that it is particularly easy to idly surf from one area to another, much like one can educate themselves by casually browsing Wikipedia. Each documentation page has a link in the top right where the reader can become an editor and propose improvements (even propose direct changes to the text). For a project as young as CockroachDB, the maturity of its documentation is remarkable. (Disclaimer: I have personally contributed to parts of it. I am very proud.)
To conclude, I would say it is rather easy to find ways to like (or even love) CockroachDB beyond the moment you decide that it is suitable for your purpose. Plenty of goodies indeed.
So what do you think? Did I miss something? Is any part unclear? Leave your comments below.