Riak Quick Start with Docker

April 21, 2014

In the Riak documentation, one of the first sections contains a quick start. The goal of the quick start is to get Riak on your workstation and then establish a five-node cluster (in under five minutes).

The quick start itself is based on the source build of Riak. The source contains a Makefile with a target labeled devrel. The devrel (or development release) automates the creation of 5 separate copies of Riak. After the devrel process is complete, you can start each copy of Riak and join each instance into a cluster.

In a world with Linux Containers (LXC) and Docker, is there a way we can leverage these technologies to make the Riak quick start process more streamlined for developers? At the same time, can the isolation and portability of containers ease the transition of clusters between environments for operators?

Below is a first pass at it.


Docker and LXC are key prerequisites for a container-based quick start. Luckily, Docker’s website has installation instructions for almost every flavor of Linux, as well as for Windows and Mac OS. Those instructions also cover the installation of LXC.

Note: Before executing any of the commands below, ensure that your DOCKER_HOST environment variable is set correctly. This is the host that is running Docker’s server component:

$ export DOCKER_HOST="tcp://"
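
As a quick sanity check before going further, something like the following could confirm that the variable is present. This is only a sketch: check_docker_host is a hypothetical helper name, not part of Docker or the docker-riak repository.

```shell
# Hypothetical helper: fail early if DOCKER_HOST is missing or empty.
check_docker_host() {
  if [ -z "${DOCKER_HOST:-}" ]; then
    echo "DOCKER_HOST is not set" >&2
    return 1
  fi
  echo "DOCKER_HOST=$DOCKER_HOST"
}

# Usage:
#   check_docker_host && docker version
```

Following the check with docker version is a reasonable next step, since that command reports both the client and the server version and complains if the server cannot be reached.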

Building a Riak Image

Since we’re working with Docker’s API instead of LXC directly, the process of building a Riak container begins with a Dockerfile. A Dockerfile contains the steps required to build a Docker image. From that image, we can spawn container instances.

To get the Dockerfile, simply clone the docker-riak repository. From there, use the Makefile's build target to build the image.
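
Roughly, those two steps look like the sketch below. build_riak_image is a hypothetical wrapper, and the repository URL is passed in as an argument rather than assumed here; only the docker-riak name and the build target come from the project.

```shell
# Hypothetical wrapper around the clone and build steps.
# $1 is the URL of the docker-riak repository (not filled in here).
build_riak_image() {
  repo_url="$1"
  git clone "$repo_url" docker-riak &&
    cd docker-riak &&
    make build
}

# Usage:
#   build_riak_image <docker-riak repository URL>
```

Because the function changes into the checkout, the later Makefile targets can be run from the same shell afterwards.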

After the image build completes, the new Riak image should appear in the output of docker images.

(See the entire Dockerfile contents here.)

Bring Up a Cluster

After the Docker image for Riak is created, the next step is to create a container out of it. But, since Riak is a distributed database, we don’t want to spin up just one Riak container – we want to spin up at least 5.

We also want the Riak containers to communicate with each other. Within Docker, this can be accomplished by linking containers. Linking connects one container to others by populating the target container’s environment with variables containing IP and port information of exposed endpoints associated with the source container.
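
To make that concrete, here is a rough sketch of how a script inside a linked container might read those variables. It assumes Docker's link naming scheme of <ALIAS>_PORT_<port>_TCP_ADDR and <ALIAS>_PORT_<port>_TCP_PORT, and the alias riak01 in the usage comment is only an example. linked_endpoint is a hypothetical helper, not something docker-riak ships.

```shell
# Hypothetical helper: print "address:port" for a linked container.
# $1 is the link alias, $2 is the exposed TCP port on the source container.
# Assumes the alias contains only letters, digits, and underscores.
linked_endpoint() {
  alias_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  prefix="${alias_upper}_PORT_${2}_TCP"
  # Indirect lookup of the two variables Docker populates for the link.
  eval "addr=\${${prefix}_ADDR:-}"
  eval "port=\${${prefix}_PORT:-}"
  if [ -z "$addr" ] || [ -z "$port" ]; then
    echo "no link variables found for $1 port $2" >&2
    return 1
  fi
  printf '%s:%s\n' "$addr" "$port"
}

# Usage (inside a container linked with the alias riak01):
#   linked_endpoint riak01 8098
```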

Instead of establishing all of the links between containers manually, each step is automated by another Makefile target labeled start-cluster. The start-cluster target is aware of three environment variables that can alter its behavior:

  • DOCKER_RIAK_CLUSTER_SIZE – The number of nodes in your Riak cluster (default: 5)
  • DOCKER_RIAK_AUTOMATIC_CLUSTERING – A flag to automatically cluster Riak (default: false)
  • DOCKER_RIAK_DEBUG – A flag to set -x on the cluster management scripts (default: false)

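To start a 5-node cluster that also joins its nodes together, invoke the start-cluster target with DOCKER_RIAK_AUTOMATIC_CLUSTERING enabled (the cluster size already defaults to 5).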
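
A minimal sketch of such an invocation follows. start_riak_cluster is a hypothetical wrapper, and using the value 1 to turn the clustering flag on is an assumption; the variable names and the start-cluster target are the ones listed above.

```shell
# Hypothetical wrapper: start a cluster of $1 nodes (default 5)
# with automatic clustering switched on.
start_riak_cluster() {
  size="${1:-5}"
  # Assumption: a value of 1 enables the DOCKER_RIAK_AUTOMATIC_CLUSTERING flag.
  DOCKER_RIAK_CLUSTER_SIZE="$size" \
  DOCKER_RIAK_AUTOMATIC_CLUSTERING=1 \
    make start-cluster
}

# Usage (from the docker-riak checkout):
#   start_riak_cluster 5
```

Since automatic clustering defaults to false, leaving the flag out would start the containers without joining them.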

Now, not only do we have 5 Docker containers running Riak, but those containers are also joined into a cluster!

Testing a Cluster

From outside the container, we can interact with Riak’s HTTP or Protocol Buffers interfaces. For testing, we’re going to use the HTTP interface.

The HTTP interface has an endpoint called /stats that emits Riak statistics. The test-cluster Makefile target hits a random container’s /stats endpoint and pretty-prints its output to the console.

The most interesting attributes for testing cluster membership are ring_members (the nodes that belong to the ring) and ring_ownership (the partitions owned by each node).

Together, these attributes let us know that this particular Riak node knows about all of the other Riak instances.
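
If you would rather check membership from a script than read the pretty-printed output, a crude sketch like the one below could pull the node names out of the /stats response. It assumes ring_members is emitted as a flat JSON array of strings and relies on plain text tools rather than a real JSON parser; ring_members here is a hypothetical helper function, and the host and port in the usage comment are placeholders.

```shell
# Hypothetical helper: read /stats JSON on stdin and print one
# ring member per line.
ring_members() {
  tr -d '\n' |
    sed -n 's/.*"ring_members"[[:space:]]*:[[:space:]]*\[\([^]]*\)\].*/\1/p' |
    tr ',' '\n' |
    sed -e 's/^[[:space:]]*"//' -e 's/"[[:space:]]*$//' |
    grep .
}

# Usage (host and port are placeholders for one node's HTTP endpoint):
#   curl -s "http://<docker host>:<mapped port>/stats" | ring_members
#   curl -s "http://<docker host>:<mapped port>/stats" | ring_members | grep -c .
```

For a healthy 5-node cluster, the count from the second command should match the cluster size.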

GETs and PUTs
We can also test GETs and PUTs to show that the nodes can accept reads and writes. At the same time, we can issue a PUT to one container and a GET from another to demonstrate that the nodes can communicate with each other.

Using docker ps, we can get a list of the running Riak containers. Note that the PORTS column includes multiple pairings for each container. We’re interested in the pairing associated with 8098 because that’s where Riak’s HTTP API endpoint is listening.
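
Picking the right pairing out of the PORTS column by eye gets tedious, so here is a small sketch that does it with sed. It assumes pairings are formatted as host-address:host-port->container-port/tcp and separated by commas; host_port_for is a hypothetical helper, and the 8087 pairing in the usage comment is illustrative.

```shell
# Hypothetical helper: given a PORTS string and a container port,
# print the host port it is mapped to.
host_port_for() {
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -n "s/.*:\([0-9][0-9]*\)->$2\/tcp.*/\1/p" | head -n 1
}

# Usage:
#   host_port_for "0.0.0.0:49159->8087/tcp, 0.0.0.0:49160->8098/tcp" 8098
```

Running docker port with a container name and 8098 should report the same mapping directly, without any parsing.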

For riak04 (or container ID fba2c4d85aac), port 8098 is mapped to 49160. Let’s issue a PUT to riak04 with an arbitrary key and some data.

Now, let’s read the same value from riak02 (or container ID 9cac9ef525a5). In this case, port 8098 is mapped to 49156.
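
The write and read can be sketched with curl against Riak's HTTP API. The helper names, the RIAK_HOST variable, and the bucket, key, and value below are all hypothetical; the ports 49160 and 49156 are the mappings mentioned above, and the /buckets/<bucket>/keys/<key> path is Riak's HTTP object endpoint.

```shell
# Hypothetical helpers for storing and fetching a value over HTTP.
# Arguments: host, port, bucket, key (plus the value for riak_put).
riak_url() {
  printf 'http://%s:%s/buckets/%s/keys/%s\n' "$1" "$2" "$3" "$4"
}

riak_put() {
  curl -s -X PUT -H 'Content-Type: text/plain' -d "$5" \
    "$(riak_url "$1" "$2" "$3" "$4")"
}

riak_get() {
  curl -s "$(riak_url "$1" "$2" "$3" "$4")"
}

# Usage (RIAK_HOST is the address of the machine running Docker):
#   riak_put "$RIAK_HOST" 49160 test hello 'world'   # write via riak04
#   riak_get "$RIAK_HOST" 49156 test hello           # read via riak02
```

If the GET against the second port returns the value written through the first, the two nodes are serving the same data.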

Looks like we have an operational Riak cluster!


Docker is an increasingly popular tool used to package applications (and their dependencies) into a virtual container that can run on any Linux server. Containers are almost as lightweight as a process, but with the isolation and portability of a virtual machine.

Running Riak within a Linux Container has advantages over a devrel in that it includes the Erlang distribution we test with and avoids the need to alter port bindings to prevent clashes. At the same time, it’s attractive to operators because a developer’s Riak cluster can be moved to another environment by simply saving and loading containers.

If you’re already playing around with Docker (or want to), give it a shot with Riak.

Hector Castro

Webmachine 1.10.0: never breaks eye contact

May 3, 2013

We recently tagged version 1.10.0 of Webmachine and, in addition to a slew of bug fixes, it includes some notable new features. Those features are the subject of today’s post; but first a bit of background on the driving force for these additions.

The development of Riak CS is great for dogfooding and bringing home some of the pain points in application development using Riak. The same is also true for Webmachine.

Webmachine has not received a great deal of attention recently because it had what Riak needed and, for the most part, it just worked. Besides, there was more pressing work to be done making Riak more awesome. With Riak CS that was not always the case: we needed things from Webmachine that either were not possible or did not work in a way that suited our needs. So we have been adding the features we needed, and we believe they will be of use and interest to the larger Webmachine community. Dogfooding FTW again!

We have now also created a 1.11.0 tag that includes an updated tag of mochiweb so that Webmachine can be built and used with Erlang R16.

New features for 1.10.0

Run multiple dispatch groups within a single application
Users can now specify multiple groups of dispatch rules that listen on different IP addresses and ports within the same Erlang application. Read about how to configure this here.

Event-based logging system
The server modules that previously handled Webmachine logging have been replaced with an event-based system. Log event handlers can be added and removed dynamically and custom log modules can be easily added and run in concert with any existing log handlers. More details about the new logging system are here.

Ability to specify a URL rewrite module
This feature is very similar to the mod_rewrite module for Apache httpd. A rewrite module specifies a set of rules for rewriting the URL and the rewritten URL is what is processed by the dispatch rules of Webmachine. Docs are here. The module used by Riak CS to rewrite S3 API requests can be found here.

Stream large response objects without buffering the entire object in memory
Streaming content has long been possible with Webmachine, but it was not suitable for use with large objects when not using multipart/mixed because Webmachine buffered all of the content in memory to determine the size in order to properly set the Content-Length header. This was important for Riak CS because it needed to stream back very large objects and the S3 API does not use multipart responses for this operation. Now streaming large content where the size can be determined in advance can be accomplished without having to pay the price of buffering everything in memory. More info on using this feature is here.

Ability to override the resource_module value in the request metadata
The impetus for this feature is more esoteric than the other features, so an example is probably the best description. Take the case where the Webmachine resource modules duplicate a lot of code in implementing the required callbacks to service requests. One way to address this is to move much of that common code to a single module and use that common module as the resource in all dispatch rules. The ModOpts for each dispatch rule are then used to specify a smaller set of callbacks for resource specialization. Overriding the resource_module value lets logging data reflect the specialized resource module and not the common module. We will provide further details about the motivations for this in a subsequent blog post focused on Riak CS. Documentation on how to configure this option can be found here.

Kelly Mclaughlin