Despite its self-declared alpha status, gocql is a functional and stable CQL driver for Cassandra. Its well-factored codebase indicates that significant thought has gone into the design of gocql. It has been kept lean and clean, with a focus on maintainability as the feature set grows. The following is a list of the main features:
- Marshaling of all CQL types to Go types
- Automatic query statement preparation
- Support for logged, unlogged and counter batch operations
- Awareness of the Cassandra cluster
- Query Result paging
- Frame compression
- CAS operations (aka lightweight transactions)
Working At Low Levels
gocql focuses on the low-level mechanics of the CQL protocol. It covers all of the primitives required to interact correctly and reliably with a Cassandra cluster.
As the gocql applications I was writing began to grow in complexity, I found myself trying to reduce repetitive CQL code. In particular, I found it difficult to factor the binding of query results in a satisfactory way.
In addition, as I began to tweak and change the schema in Cassandra, the query parameter placeholders in my CQL began to get out of sync. These binding parameter mismatches only became apparent at run time, when the gocql driver would return an error. This problem got bigger as the application grew.
I needed some kind of higher level API that would decouple my application logic from the low level access provided by the gocql driver.
After my initial attempts to factor code into appropriate support libraries, it occurred to me that I really needed a mechanism to:
- Tell me at compile time when my Go code was out of sync with the current schema in Cassandra
- Reduce the amount of boilerplate code required to process query results
- Keep the power of gocql, in case I ran into any scenarios where I needed direct access to specific low level features
- Not do anything too clever or fancy or make any assumptions about the layout of the schema in Cassandra
- Generate CQL statements in a clear and deterministic way - there should be no hidden magic that leads to unexpected CQL being issued at runtime
This set of requirements led me to believe that I ultimately needed to build a code generation tool that could introspect a schema in Cassandra and produce Go source code to take care of the low level boilerplate binding code. In addition, I wanted the generated code to allow for meaningful compile time type checks of every column that was being accessed.
What I ended up building was a toolchain made up of two parts:
- A command line tool that gets pointed at a given keyspace on a specific instance of Cassandra. The output of the cqlc tool is Go source code to be included in an application source tree.
- A runtime API that follows a fluent builder pattern, so that hand-written Go code that uses the generated Go code actually looks and feels like a CQL statement.
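To make the first part of the toolchain more concrete, here is a minimal sketch of schema-driven code generation using only the Go standard library. It is not the cqlc generator: the `Table` and `Column` types, the `goTypes` mapping and the shape of the emitted code are all assumptions made for illustration, and the schema is supplied by hand rather than introspected from Cassandra.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"strings"
	"text/template"
)

// Column and Table are a hypothetical in-memory model of an introspected schema.
type Column struct {
	Name    string // CQL column name
	CQLType string // CQL type name, e.g. "bigint"
}

type Table struct {
	Name    string
	Columns []Column
}

// goTypes maps a few CQL types to Go types (an illustrative subset only).
var goTypes = map[string]string{
	"bigint":   "int64",
	"int":      "int32",
	"float":    "float32",
	"timeuuid": "[16]byte",
}

// tmpl emits one variable per table with one field per column.
var tmpl = template.Must(template.New("table").Funcs(template.FuncMap{
	"upper":  strings.ToUpper,
	"goType": func(cql string) string { return goTypes[cql] },
}).Parse(`package generated

// {{upper .Name}} holds the column names of the {{.Name}} table.
var {{upper .Name}} = struct {
{{range .Columns}}	{{upper .Name}} string // {{.CQLType}} -> {{goType .CQLType}}
{{end}}}{
{{range .Columns}}	{{upper .Name}}: "{{.Name}}",
{{end}}}
`))

// Generate renders gofmt-formatted Go source for one table.
func Generate(t Table) (string, error) {
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, t); err != nil {
		return "", err
	}
	// format.Source also rejects output that is not valid Go.
	src, err := format.Source(buf.Bytes())
	if err != nil {
		return "", err
	}
	return string(src), nil
}

func main() {
	events := Table{Name: "events", Columns: []Column{
		{"sensor", "bigint"}, {"timestamp", "timeuuid"},
		{"temperature", "float"}, {"pressure", "int"},
	}}
	src, err := Generate(events)
	if err != nil {
		panic(err)
	}
	fmt.Print(src)
}
```

Because the generated file is ordinary Go, renaming or dropping a column in the schema and regenerating turns every stale reference in the application into a compile error.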
We’re going to look at some basic examples of the cqlc fluent API using the following CQL column family definition:
CREATE TABLE events ( sensor bigint, timestamp timeuuid, temperature float, pressure int, PRIMARY KEY (sensor, timestamp) );
After pointing the cqlc command line tool at an instance of Cassandra with this schema loaded into it, static definitions of the events table will be generated in Go. You can then use this generated code in conjunction with the runtime API.
To get the pressure readings for all events from the sensor with id 100, you can use the following:
ctx.Select(EVENTS.PRESSURE). From(EVENTS). Where(EVENTS.SENSOR.Eq(100)). Fetch(session)
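The sketch below shows, under simplifying assumptions, how a fluent call chain like this can be backed by typed column handles. It is a self-contained toy, not the cqlc runtime: `Int64Column`, `Int32Column`, `Condition`, `SelectBuilder` and `Build()` are hypothetical names, and instead of executing against a session it simply returns the CQL text and its arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// Column is anything that can name itself in a SELECT list.
type Column interface{ ColumnName() string }

// Int64Column and Int32Column are hypothetical typed column handles, the kind
// of definition a generator could emit for bigint and int columns.
type Int64Column struct{ Name string }
type Int32Column struct{ Name string }

func (c Int64Column) ColumnName() string { return c.Name }
func (c Int32Column) ColumnName() string { return c.Name }

// Condition pairs one CQL predicate with the value bound to its placeholder.
type Condition struct {
	CQL   string
	Value interface{}
}

// Eq accepts only an int64, so passing a string here is a compile error.
func (c Int64Column) Eq(v int64) Condition {
	return Condition{CQL: c.Name + " = ?", Value: v}
}

// SelectBuilder accumulates the parts of a SELECT statement.
type SelectBuilder struct {
	cols  []string
	table string
	conds []Condition
}

func Select(cols ...Column) *SelectBuilder {
	b := &SelectBuilder{}
	for _, c := range cols {
		b.cols = append(b.cols, c.ColumnName())
	}
	return b
}

func (b *SelectBuilder) From(table string) *SelectBuilder {
	b.table = table
	return b
}

func (b *SelectBuilder) Where(conds ...Condition) *SelectBuilder {
	b.conds = append(b.conds, conds...)
	return b
}

// Build renders the CQL text and its arguments together, so the number of
// placeholders always matches the number of bound values.
func (b *SelectBuilder) Build() (string, []interface{}) {
	cql := "SELECT " + strings.Join(b.cols, ", ") + " FROM " + b.table
	var args []interface{}
	for i, c := range b.conds {
		if i == 0 {
			cql += " WHERE " + c.CQL
		} else {
			cql += " AND " + c.CQL
		}
		args = append(args, c.Value)
	}
	return cql, args
}

func main() {
	sensor := Int64Column{Name: "sensor"}
	pressure := Int32Column{Name: "pressure"}
	cql, args := Select(pressure).From("events").Where(sensor.Eq(100)).Build()
	fmt.Println(cql, args)
}
```

The point of the design is that each placeholder is created in the same step as the value bound to it, which removes the class of mismatch that otherwise only shows up at run time.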
In CQL, INSERT and UPDATE can be used interchangeably. cqlc exposes an API called Upsert(). Whether a CQL UPDATE or an INSERT is generated depends on the presence of a Where() component. For example, the following fluent API call will generate an INSERT:
ctx.Upsert(EVENTS). SetInt64(EVENTS.SENSOR, 100). SetTimeUUID(EVENTS.TIMESTAMP, uuid.TimeUUID()). SetFloat32(EVENTS.TEMPERATURE, 19.8). SetInt32(EVENTS.PRESSURE, 357). Exec(session)
If we modify the above statement to contain a Where() component, the API will generate an UPDATE:
ctx.Upsert(EVENTS). SetFloat32(EVENTS.TEMPERATURE, 19.8). SetInt32(EVENTS.PRESSURE, 357). Where( EVENTS.SENSOR.Eq(100), EVENTS.TIMESTAMP.Eq(uuid.TimeUUID())). Exec(session)
Both variants have the same effect on the database - they are functionally equivalent.
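As a rough illustration of that rule, the following sketch renders an INSERT when no Where() has been supplied and an UPDATE otherwise. The `Upsert` type here is a hypothetical stand-in written for this post, not the cqlc type of the same name; it takes plain string column names and returns the CQL text rather than executing it.

```go
package main

import (
	"fmt"
	"strings"
)

// binding is one column paired with the value bound to its placeholder.
type binding struct {
	col string
	val interface{}
}

// Upsert is a hypothetical builder, not the cqlc type of the same name.
type Upsert struct {
	table string
	sets  []binding
	where []binding
}

func NewUpsert(table string) *Upsert { return &Upsert{table: table} }

func (u *Upsert) Set(col string, val interface{}) *Upsert {
	u.sets = append(u.sets, binding{col, val})
	return u
}

func (u *Upsert) Where(col string, val interface{}) *Upsert {
	u.where = append(u.where, binding{col, val})
	return u
}

// Build renders an INSERT when no Where() was given and an UPDATE otherwise.
func (u *Upsert) Build() (string, []interface{}) {
	var args []interface{}
	if len(u.where) == 0 {
		cols := make([]string, len(u.sets))
		marks := make([]string, len(u.sets))
		for i, s := range u.sets {
			cols[i], marks[i] = s.col, "?"
			args = append(args, s.val)
		}
		return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
			u.table, strings.Join(cols, ", "), strings.Join(marks, ", ")), args
	}
	sets := make([]string, len(u.sets))
	for i, s := range u.sets {
		sets[i] = s.col + " = ?"
		args = append(args, s.val)
	}
	conds := make([]string, len(u.where))
	for i, w := range u.where {
		conds[i] = w.col + " = ?"
		args = append(args, w.val)
	}
	return fmt.Sprintf("UPDATE %s SET %s WHERE %s",
		u.table, strings.Join(sets, ", "), strings.Join(conds, " AND ")), args
}

func main() {
	insert, _ := NewUpsert("events").
		Set("sensor", int64(100)).
		Set("pressure", int32(357)).
		Build()
	update, _ := NewUpsert("events").
		Set("pressure", int32(357)).
		Where("sensor", int64(100)).
		Build()
	fmt.Println(insert)
	fmt.Println(update)
}
```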
You can either delete a single column or an entire row with the fluent API. If you specify columns in a Delete() component, only those column values will be removed:
ctx.Delete(EVENTS.PRESSURE). From(EVENTS). Where(EVENTS.SENSOR.Eq(100)). Exec(session)
If you want to delete the entire row, just modify the above statement and remove all arguments to Delete():
ctx.Delete(). From(EVENTS). Where(EVENTS.SENSOR.Eq(100)). Exec(session)
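The difference between the two forms comes down to whether a column list appears between DELETE and FROM. The helper below is a small sketch of that rendering rule only; `DeleteCQL` is a hypothetical function written for this example and is not part of cqlc or gocql.

```go
package main

import (
	"fmt"
	"strings"
)

// DeleteCQL renders a CQL DELETE statement. With a non-empty cols list only
// those columns are named for removal; with no columns the statement targets
// whole rows. Every WHERE column becomes an equality placeholder.
func DeleteCQL(table string, cols []string, whereCols []string) string {
	var b strings.Builder
	b.WriteString("DELETE ")
	if len(cols) > 0 {
		b.WriteString(strings.Join(cols, ", ") + " ")
	}
	b.WriteString("FROM " + table)
	for i, c := range whereCols {
		if i == 0 {
			b.WriteString(" WHERE ")
		} else {
			b.WriteString(" AND ")
		}
		b.WriteString(c + " = ?")
	}
	return b.String()
}

func main() {
	fmt.Println(DeleteCQL("events", []string{"pressure"}, []string{"sensor"}))
	fmt.Println(DeleteCQL("events", nil, []string{"sensor"}))
}
```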
The cqlc documentation contains examples of the fluent API that include:
- Counter column families
- Statement batching
- Compile time checks for query predicates
In addition, you can look at a complete running example.
Separation Of Concerns
cqlc does not replace gocql; instead, it builds on top of it and adds features that are not within the remit of a low level driver. cqlc does not abstract anything away from gocql, and the gocql session reference is available to the application at all times. Because of this, an application can use as little or as much of cqlc as it likes.
The following table highlights the different focuses for gocql and cqlc:
| Feature | gocql | cqlc |
|---|---|---|
| Result set binding | Manual | Automatic |
| Schema sync checks | No | Yes |
| Column type checks | No | Yes |
| Binding parameter checks | No | Yes |
| Partition column awareness¹ | No | Yes |
| Cluster column awareness² | No | Yes |
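One way to picture the column awareness rows is to give each kind of column its own Go type, so that the set of available predicates is decided by the compiler. The sketch below is an assumption about how such a scheme could look, not a description of cqlc's generated code: the type names, the choice of which predicates each kind exposes, and the `seq` clustering column are all hypothetical.

```go
package main

import "fmt"

// Predicate is one CQL condition together with its bound value.
type Predicate struct {
	CQL   string
	Value interface{}
}

// PartitionedInt64 models a partition key column: equality only.
type PartitionedInt64 struct{ Name string }

func (c PartitionedInt64) Eq(v int64) Predicate {
	return Predicate{CQL: c.Name + " = ?", Value: v}
}

// ClusteredInt64 models a clustering column: equality and a range predicate.
type ClusteredInt64 struct{ Name string }

func (c ClusteredInt64) Eq(v int64) Predicate {
	return Predicate{CQL: c.Name + " = ?", Value: v}
}

func (c ClusteredInt64) Gt(v int64) Predicate {
	return Predicate{CQL: c.Name + " > ?", Value: v}
}

// RegularInt32 models an ordinary column: it has no predicate methods at all.
type RegularInt32 struct{ Name string }

func main() {
	sensor := PartitionedInt64{Name: "sensor"}
	seq := ClusteredInt64{Name: "seq"} // hypothetical clustering column
	pressure := RegularInt32{Name: "pressure"}

	fmt.Println(sensor.Eq(100).CQL)
	fmt.Println(seq.Gt(5).CQL)
	fmt.Println(pressure.Name)

	// Both of the following would be rejected by the compiler, because the
	// methods do not exist on these types:
	//   sensor.Gt(100)
	//   pressure.Eq(357)
}
```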
Standing On The Shoulders Of Giants
The concept of cqlc is not unique; it merely borrows a bunch of ideas from other projects. I wanted to pay tribute to those who went before me and basically showed me the way.
The main architectural inspiration for cqlc came from jooq. jooq consists of a tool that produces Java code by introspecting the JDBC metadata of a SQL database. It also has a runtime API that allows you to construct SQL in a fluent and type safe way. Unlike a number of Java database products, it doesn’t hide any of the SQL away from the application. I wanted to emulate these properties in the solution I was building.
gocql provides the batteries for the runtime API - without gocql you can’t use cqlc at all. cqlc leverages the strengths of gocql.
megajson is a Go library for JSON that generates fast decoders and encoders for marshaling JSON to Go structs. This provided me with the necessary scaffolding to generate Go code from a given data model. megajson also provided a way to run tests against generated code.
cqlc differentiates between partitioned columns, clustered columns and regular columns. It will only generate equality predicates for partitioned columns. More information is available in the documentation section on predicates.