At the most recent San Francisco Riak meetup, we had the pleasure of hosting Uber engineer and past OSCON speaker Jeff Wolski (GitHub) to discuss his more recent work.
You may know Uber as the popular on-demand car service, but they’re so much more than that. Uber is innovating at the intersection of lifestyle and logistics at a rapid pace. To do so, they build some of the most fascinating distributed architectures I know of.
A newer part of this ever-evolving distributed system is project Ringpop (also on GitHub). As Jeff puts it:
“Ringpop is an open-source Node.js library developed at Uber that brings application-layer sharding to many of their dispatching platform services.”
This additional abstraction layer, maintained through a consistent hashing ring familiar to any Riak enthusiast, provides a means by which Jeff can add dispatching services without service interruption.
To deliver this scalability while remaining stable, Ringpop keeps in mind that no distributed network is always reliable. Jeff dedicates a portion of his talk to exploring these complexities and how the SWIM gossip protocol is implemented to handle bad actors:
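To make the idea concrete, here is a minimal sketch of a consistent hash ring in Python. It is not Ringpop's code (Ringpop is a Node.js library): the `HashRing` class, the MD5-based hash, the number of virtual points per node, and the example addresses and keys are all assumptions chosen for illustration. It shows why a node can join without disruption: only the keys that land on the new node change owner.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # MD5 is only a stable, well-spread hash here (an assumption of this
    # sketch, not necessarily what Ringpop or Riak use).
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent hash ring with virtual points per node."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points placed on the ring per node
        self._points = []         # sorted hash positions on the ring
        self._owners = {}         # hash position -> owning node
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            if point in self._owners:
                continue  # skip the (extremely unlikely) position collision
            bisect.insort(self._points, point)
            self._owners[point] = node

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # The owner is the first point clockwise from the key's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


# Example addresses and keys are made up for illustration.
ring = HashRing(["10.0.0.1:3000", "10.0.0.2:3000", "10.0.0.3:3000"])
keys = [f"driver-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

new_node = "10.0.0.4:3000"
ring.add(new_node)
after = {k: ring.lookup(k) for k in keys}

# Keys whose owner changed after the new node joined.
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to the new node, and all other keys keep their previous owner, which is the property that lets services be added while the rest of the cluster keeps serving.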
“Service instances that behave erratically, slow or otherwise, wreak havoc on the rest of the cluster by causing frequent and persistent changes to the state of the ring and ultimately, inconsistent hash ring lookups. In this talk you’ll hear about Ringpop, its implementation, and how we’ve had to employ a flap damping technique to suss out these bad actors to achieve higher levels of reliability for our services.”
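The general shape of flap damping, as known from BGP route flap damping, can be sketched as follows. This is an illustrative assumption, not Ringpop's actual implementation: the `FlapDamper` class, the penalty value, both thresholds, and the half-life are hypothetical. Each state change adds a penalty to a member's score, the score decays exponentially over time, and a member is suppressed once its score crosses an upper limit and readmitted only after it decays below a lower one.

```python
class FlapDamper:
    """Hypothetical penalty-based flap damper (all parameters are assumptions)."""

    def __init__(self, penalty=1000.0, suppress_limit=2500.0,
                 reuse_limit=750.0, half_life=60.0):
        self.penalty = penalty                # added to the score on every flap
        self.suppress_limit = suppress_limit  # damp a member at or above this score
        self.reuse_limit = reuse_limit        # readmit a member at or below this score
        self.half_life = half_life            # seconds for a score to halve
        self._scores = {}                     # member -> (score, time of last update)
        self._damped = set()

    def _decayed(self, member, now):
        # Exponential decay of the stored score since its last update.
        score, last = self._scores.get(member, (0.0, now))
        return score * 0.5 ** ((now - last) / self.half_life)

    def record_flap(self, member, now):
        score = self._decayed(member, now) + self.penalty
        self._scores[member] = (score, now)
        if score >= self.suppress_limit:
            self._damped.add(member)

    def is_damped(self, member, now):
        # Readmit a damped member once its score has decayed far enough.
        if member in self._damped and self._decayed(member, now) <= self.reuse_limit:
            self._damped.discard(member)
        return member in self._damped


damper = FlapDamper()

# A member that changes state three times in twenty seconds gets damped.
for t in (0.0, 10.0, 20.0):
    damper.record_flap("flappy-node", now=t)

# A member with a single state change does not.
damper.record_flap("steady-node", now=0.0)

damped_soon_after = damper.is_damped("flappy-node", now=60.0)
damped_much_later = damper.is_damped("flappy-node", now=200.0)
steady_damped = damper.is_damped("steady-node", now=60.0)
```

Timestamps are passed in explicitly so the behavior is deterministic. The gap between the suppress and reuse limits is a deliberate design choice: it keeps a member from bouncing in and out of the damped set as its score hovers around a single threshold.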
The presentation, in the context of Ringpop, also shows how Uber relies on Riak for high availability. Riak acts as a persistent data store for a portion of new dispatching services as well as some functional extensions on top of Ringpop. For example, when objects that require persistence are generated, such as a new driver coming online with an associated mailbox of potential trips, their IDs are stored as keys within Riak.
These layers are used for further services, which rely on data stored in Riak, including:
- Stateful HTTP long-poll services
- Client/server sync services
- Rate limiters
- Geospatial services
It’s fascinating to see how Uber continues to scale as one of the most respected software companies today. Jeff lists the research that informs the design of Ringpop and will continue to be important to its development. What is often forgotten in our productivity-obsessed dev culture is the importance of practice. In one Q&A exchange, Jeff responds to a concern about building a consistent hash ring on top of Riak’s own consistent hashing by saying, “if it’s wrong I’ll go delete the repo right now.” His willingness to improve, even if it means deleting months’ worth of code, is refreshing to me.
There is a great deal of learning to be done if you’re looking to design a similar resilient set of services.
If you enjoy the video, you’ll love the documents mentioned. The Dynamo paper is close to our hearts at Basho since Riak is also based on its design principles. For further reading, check out BGP route flap damping, the SWIM protocol, and Uber’s code on tchannel.
Keep sharing, learning, building and re-building,
August 21, 2013
Riak is being used by companies of all shapes and sizes. Since it is open source, we don’t know exactly how many deployments of Riak there are, but our best guess puts it in the thousands. We love hearing about all the unique ways companies are using Riak. Sometimes, our awesome users and partners even write blog posts that showcase how they’re using Riak and why they decided to give it a try. Below are a few good ones that we came across:
Our partner, SoftLayer, has written about Riak a number of times. Recently, Marc Jones wrote about some popular use cases for Riak and Harold Hannon wrote a performance analysis about running Riak on bare metal.
You can learn more about how Basho partners with SoftLayer here.
Flyclops is currently in the process of launching with Riak, and they have chosen to blog about their experiences along the way. First, they wrote about their database evaluation process and why they chose Riak. Then, earlier this week, they wrote about building Riak on OmniOS.
Superfeedr created a Google Reader API replacement and chose to power it with Riak.
For more examples of how companies are using Riak, check out our Users Page.
July 15, 2013
Today, we are sending out our quarterly Riak Community Survey. The survey helps us better understand how you’re using Riak so that we can make more educated decisions about how to improve it in the future. We will also anonymize this data and share it with the community to provide a more holistic view of how Riak is being used.
To participate in this survey, simply click here to get started. All survey participants will receive Basho swag and a discount code for RICON West tickets. One lucky participant will be selected to receive a free RICON West ticket.
Thanks for participating in our survey and be sure to grab a RICON West ticket. Early bird prices end August 29th.