November 10, 2014
Many data needs are better served by data stores that are optimized for maximum availability and scalability — rather than optimized for consistency. For certain use cases, however, some elements of the data require strong consistency. With Riak 2.0, in addition to eventual consistency, there is now a way to enforce strong consistency when needed.
NOTE: Riak’s strong consistency feature is currently an open-source-only feature and is not yet commercially supported.
Behavioral Changes with Strong Consistency
Strongly consistent operations in Riak function much like eventually consistent operations at the application level. The core difference lies in the types of errors Riak will report to the client.
Each request to update an object (except for the initial creation) must include a context value reflecting the last time the application read it. Riak clients have always used this context value with version vectors; strong consistency simply mandates its use. Reading data from a strongly consistent Riak bucket, on the other hand, works exactly like an eventually consistent read.
If that value is not provided for an update operation to an existing object, Riak will reject it. This is because the database assumes that you have not seen the current value and may not know what you’re doing.
Similarly, if that context value is out of date, Riak will also reject update operations. The client must re-read the latest value and supply an update based on that new value, with the new context.
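The update rules above amount to a compare-and-set discipline. The following sketch uses a hypothetical in-memory `ConsistentStore` class (not the real Riak client API) to show how a strongly consistent bucket rejects blind or stale updates:

```python
# Illustrative sketch only: an in-memory stand-in for a strongly
# consistent bucket. Names (ConsistentStore, StaleContext) are invented
# for this example; they are not part of any Riak client library.

class StaleContext(Exception):
    """Raised when an update omits the context or supplies a stale one."""

class ConsistentStore:
    def __init__(self):
        self._data = {}   # key -> (context, value)
        self._clock = 0   # stand-in for a real causal context

    def get(self, key):
        """Return (context, value); the context must accompany any update."""
        return self._data[key]

    def put(self, key, value, context=None):
        if key in self._data:
            current_context, _ = self._data[key]
            if context != current_context:
                # Blind update (no context) or stale context: reject it.
                raise StaleContext("re-read the object and retry")
        self._clock += 1
        self._data[key] = (self._clock, value)
        return self._clock

store = ConsistentStore()
ctx = store.put("user:alice", "hash-v1")          # initial creation: no context needed
ctx, value = store.get("user:alice")              # read to obtain the context
new_ctx = store.put("user:alice", "hash-v2", context=ctx)  # update succeeds
```

A second writer still holding the old `ctx` would now get a `StaleContext` error and must re-read before retrying.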
If Riak cannot contact a majority of the servers responsible for the key, the request will fail. Ordinarily, Riak is happy to accept all operations in the interest of high availability, never dropping a write, even in the extreme case where only one server survives a data center outage.
Strong consistency also eliminates object siblings, as it is effectively impossible for the cluster to disagree on the value of an object.
When considering consistency models in an application, it is easy for the logic to quickly become daunting. This is especially true when designing a workflow that leverages both eventually and strongly consistent models. It is, therefore, easiest to begin with a simple use case.
Consider the workflow involved in storing and updating username and password data. In the case of a password update, it is necessary that — at any given time — there be exactly ONE result for a user’s password. Relatedly, it is important to ensure that an update of this value is fully atomic; otherwise, the user experience is substantially degraded. It would be possible to leverage Riak for all the eventually consistent elements of the application and use strong consistency just for the username and password.
To see how eventual and strong consistency can be combined to solve business problems, let’s take a not-so-hypothetical example from the energy industry.
Imagine you’re collecting massive amounts of geological data for analysis. Each batch of data must be processed by a single instance of your application. Since this processing can take hours, days, or even weeks to complete, it’s expensive if two applications handle the same batch.
Let’s walk through the sequence of events.
- Batch of data arrives for processing.
- The batch is stored in a large object store (like Riak CS) under a batch ID.
- The batch ID is added to a pending job list in Riak and stored as a set (one of the new Riak Data Types).
This is a classic example of eventual consistency and an illustration of the value of the new Riak Data Types introduced with Riak 2.0. Storing a new batch ID in your database should never fail, even if servers are offline. If multiple applications are adding batch IDs to the pending list at the same time, it’s perfectly reasonable for those lists to temporarily diverge, as long as they can be trivially merged later.
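The "trivially merged later" property is what makes this safe. A grow-only set (G-Set), the simplest of the CRDTs underlying data types like Riak's sets, shows why: merging is plain set union, so diverged replicas always converge to the same state regardless of merge order. (Riak's actual set type is richer, supporting removes via observed-remove semantics, but the merge idea is the same.)

```python
# Illustrative sketch: a grow-only set (G-Set). Merge is set union,
# which is commutative, associative, and idempotent, so replicas that
# diverge can always be reconciled deterministically.

class GSet:
    def __init__(self, items=()):
        self._items = set(items)

    def add(self, item):
        self._items.add(item)

    def merge(self, other):
        """Join two replicas; order of merging never matters."""
        return GSet(self._items | other._items)

    def value(self):
        return frozenset(self._items)

# Two app instances add batch IDs to temporarily diverged replicas...
replica_a = GSet(["batch-001"])
replica_b = GSet(["batch-001"])
replica_a.add("batch-002")
replica_b.add("batch-003")

# ...and both merge directions produce the same pending job list.
merged = replica_a.merge(replica_b)
```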
Let’s continue to see where strong consistency comes into play.
- A compute node becomes available to process the data.
- The compute node retrieves the pending job list and picks a batch ID.
- The compute node attempts to create a lock for that batch ID.
This is where strong consistency is required. This lock object should be created in a bucket that is managed by the new strong consistency subsystem in Riak 2.0. If another compute node grabs the same batch ID and tries to create its own lock object, Riak’s strong consistency logic will reject the second attempt, and that node will simply start over with a new ID.
To detect crashed jobs, the lock object should be created with basic job data, such as which compute node owns the processing job and what time it was started.
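The locking step above can be sketched as create-if-absent against a strongly consistent bucket. The `LockBucket` and `claim_next` names below are hypothetical stand-ins for this example, not the Riak client API; the point is the control flow a compute node would follow:

```python
# Sketch, assuming a strongly consistent bucket that atomically rejects
# a write when the key already exists. LockBucket simulates that bucket.
import time

class LockTaken(Exception):
    pass

class LockBucket:
    def __init__(self):
        self._locks = {}

    def acquire(self, batch_id, node_name):
        """Atomically create the lock object, or fail if it exists."""
        if batch_id in self._locks:
            raise LockTaken(batch_id)
        # Store basic job data so crashed jobs can be detected later.
        self._locks[batch_id] = {"owner": node_name,
                                 "started_at": time.time()}
        return self._locks[batch_id]

def claim_next(bucket, pending_ids, node_name):
    """Try batch IDs from the pending list until a lock is acquired."""
    for batch_id in pending_ids:
        try:
            bucket.acquire(batch_id, node_name)
            return batch_id
        except LockTaken:
            continue  # another node got there first; try the next ID
    return None
```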
- The compute node asks Riak to add the batch ID to a different set, a running job list.
- The compute node asks Riak to remove the batch ID from the pending list.
- The job runs.
- When completed, the compute node asks Riak to add the batch ID to a completed job list.
- Riak is asked to remove the batch ID from the running list.
- The compute node deletes the lock object (or updates it to reflect the completion of the processing job).
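The bookkeeping in the steps above moves each batch ID through three sets. A compact sketch (with illustrative names) also shows why the write ordering matters: add to the destination list before removing from the source list, so a crash between the two writes leaves the ID visible in both lists (recoverable) rather than in neither (lost).

```python
# Illustrative sketch of the pending -> running -> completed lifecycle.

class JobLists:
    def __init__(self, pending=()):
        self.pending = set(pending)
        self.running = set()
        self.completed = set()

    def start(self, batch_id):
        # Add to running BEFORE removing from pending: a crash between
        # these writes duplicates the ID instead of dropping it.
        self.running.add(batch_id)
        self.pending.discard(batch_id)

    def finish(self, batch_id):
        self.completed.add(batch_id)
        self.running.discard(batch_id)

jobs = JobLists(pending=["batch-001", "batch-002"])
jobs.start("batch-001")
jobs.finish("batch-001")
```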
Tradeoffs When Using Strong Consistency
- Blind updates will be rejected, so the client must read the existing value before supplying a new one (except in the case of entirely new keys).
- Write requests may be slightly slower due to coordination overhead.
- If a majority of the servers responsible for a piece of data are unavailable, write requests will fail. Read operations may fail depending on the freshness of the data that is still accessible.
- Secondary indexes (2i) are not yet supported.
- Multi-datacenter replication in Riak Enterprise is not yet supported.
To learn more about strong consistency, check out our documentation:
- Using Strong Consistency (for developers)
- Managing Strong Consistency (for operators)
- Strong Consistency (theory & concepts)
Strong Consistency is now available with Riak 2.0. Download Riak 2.0 on our Docs Page.
December 11, 2013
In the world of distributed systems, there are still a lot of unsolved problems and improvements to be made. This means that there is a lot of interesting research being done at top institutions around the world, with some of the brightest minds working on these problems. At RICON West, Basho’s developer conference, we brought in three doctoral researchers whose work on distributed systems has been vital to both Basho and the future of the industry.
Peter Bailis is a PhD student at UC Berkeley. His talk, “Bad As I Wanna Be: Coordination and Consistency in Distributed Databases,” goes into how to reason about the trade-offs between coordination, consistency, latency, and availability, with a focus on practical takeaways from recent research both at UC Berkeley and beyond. He also talks about reconciling “consistency” in NoSQL and ACID databases and explains why, even though you probably didn’t “beat the CAP Theorem,” you (and tomorrow’s database designs) may be on to something. His full talk is below.
Lindsey Kuper is a PhD candidate at Indiana University, who studies the foundations of deterministic parallel programming. At RICON, she spoke on “LVars: Lattice-Based Data Structures for Deterministic Parallelism,” which introduces LVars (data structures that enable deterministic parallel programming). LVars generalize the single-assignment variables often found in deterministic parallel languages to allow multiple assignments that are monotonically increasing with respect to a user-specified lattice of states. LVars maintain determinism by allowing only monotonic writes and “threshold” reads to and from shared data. Her talk looks at examples of programming in an LVar-based parallel language that is provably deterministic, and explores the connection between LVars and CRDTs. The complete talk is below.
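The LVar idea from the talk can be rendered as a toy: state only moves up a lattice (here, the max lattice on integers), writes are joins, and threshold reads reveal only whether a threshold was crossed. This is a simplification for illustration; in the real semantics a threshold read blocks until the threshold is reached, whereas this sketch returns `None` instead.

```python
# Toy LVar over the max lattice on non-negative integers.

class MaxLVar:
    def __init__(self):
        self._state = 0  # bottom of the lattice

    def put(self, value):
        """Monotonic write: join (max) the new value into the state."""
        self._state = max(self._state, value)

    def get_threshold(self, threshold):
        """Threshold read: report the threshold once crossed, else None.
        Because writes are monotonic, later writes can never change a
        'crossed' answer back to 'not crossed' -- the source of the
        determinism guarantee."""
        return threshold if self._state >= threshold else None
```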
Finally, we had Diego Ongaro, a PhD student at Stanford University, talk about “The Raft Consensus Algorithm.” His talk discusses Raft, a consensus algorithm designed for understandability and developed by Diego and Professor John Ousterhout. Raft is equivalent to Paxos in fault-tolerance and performance, but it’s designed to be as easy to understand as possible, while cleanly addressing all major pieces needed for practical systems. The hope is that Raft will make consensus available to a wider audience, and that this audience will be able to develop a greater variety of higher-quality consensus-based systems than are available today. You can learn more about Raft below.
To watch all of the sessions from RICON West 2013, visit the Basho Technologies Youtube Channel.
Santa Clara, CA – June 24, 2013 – Today, at the Apache CloudStack Collaboration Conference, John Burwell of Basho Technologies will be presenting twice during the Track 1 sessions. This Collaboration Conference brings together users, integrators, and contributors to share their knowledge and work together for future CloudStack releases. This is the second Collaboration Conference and is taking place in Santa Clara from June 23-25th.
The first of Burwell’s sessions is titled “Who the Heck are You? Integrating SSO into CloudStack,” and starts at 11:30am. This talk will provide a brief introduction to the single sign-on (SSO) authentication model and associated best practices. It will then discuss the benefits of integrating CloudStack with one or more SSO infrastructures. His second session, entitled “How to Run From a Zombie: CloudStack Distributed Process Management,” starts at 2:30pm. This talk will explore CloudStack’s distributed process management requirements and the challenges they present in the context of CAP theorem. These challenges will then be addressed through a distributed process model that emphasizes efficiency, fault tolerance, and operational transparency.
John Burwell is a Consulting Engineer at Basho Technologies, the makers of the open source Riak distributed key value database and open source Riak CS object store. He is also a committer to the Apache CloudStack project focused on storage architecture and security integration. His first CloudStack contribution, S3-backed Secondary Storage, will be included in the upcoming 4.1.0 release.
January 10, 2013
This is the first in a series of blog posts that discusses a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on January 24.
One of the biggest differences between Riak and relational systems is the focus on availability and how the underlying architecture deals with failure modes.
Most relational databases leverage a master/slave architecture to replicate data. This approach usually means the master coordinates all write operations, working with the slave nodes to update data. If the master node fails, the database will reject write operations until the failure is resolved – often involving failover or leader election – to maintain correctness. This can result in a window of write unavailability.
Conversely, Riak uses a masterless system with no single point of failure, meaning any node can serve read or write requests. If a node experiences an outage, other nodes can continue to accept read and write requests. Additionally, if a node fails or becomes unavailable to the rest of the cluster due to a network partition, a neighboring node will take over responsibilities for the unavailable node. Once the node becomes available again, the neighboring node hands back any updates it received through a process called “hinted handoff.” This is another way that Riak maintains availability and resilience, even in the face of serious failures.
Because Riak accepts reads and writes even when multiple nodes are unavailable, and uses an eventually consistent design to maintain availability, different replicas may, in rare cases, contain different versions of an object. This can occur if multiple clients update the same piece of data at the same moment, or if nodes are down or lagging. These conflicts arise only a small fraction of the time, but they are important to know about. Riak has a number of mechanisms for detecting and resolving these conflicts when they occur. For more on how Riak achieves availability and the tradeoffs involved, see our documentation on the subject.
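Conflict detection rests on version vectors: two versions of an object conflict exactly when neither vector dominates the other. A minimal sketch of that comparison (function names are illustrative, not the Riak client API):

```python
# Version vectors as dicts mapping node name -> update count.

def dominates(a, b):
    """True if vector a has seen at least every event that b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a, b):
    """Classify two versions of an object by their version vectors."""
    a_dom, b_dom = dominates(a, b), dominates(b, a)
    if a_dom and b_dom:
        return "equal"
    if a_dom:
        return "a-newer"       # a strictly supersedes b
    if b_dom:
        return "b-newer"
    return "concurrent"        # siblings: must be resolved by merge logic
```

Only the "concurrent" case produces siblings; the other cases let a replica silently discard the older version.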
For many use cases today, high availability and fault tolerance are critical to the user experience and the company’s revenue: unavailability cuts into revenue, damages user trust, and degrades the user experience. For use cases such as online retail, shopping carts, advertising, and social and mobile platforms (anything with critical availability needs), high availability is key, and Riak may be the right choice.