December 11, 2013
Distributed systems still have many unsolved problems and plenty of room for improvement, which is why so much interesting research is under way at top institutions around the world. At RICON West, Basho’s developer conference, we invited three PhD students and candidates to speak. Their work on distributed systems has been vital to both Basho and the future of the industry.
Peter Bailis is a PhD student at UC Berkeley. His talk, “Bad As I Wanna Be: Coordination and Consistency in Distributed Databases,” goes into how to reason about the trade-offs between coordination, consistency, latency, and availability, with a focus on practical takeaways from recent research both at UC Berkeley and beyond. He also talks about reconciling “consistency” in NoSQL and ACID databases and explains why, even though you probably didn’t “beat the CAP Theorem,” you (and tomorrow’s database designs) may be on to something. His full talk is below.
Lindsey Kuper is a PhD candidate at Indiana University who studies the foundations of deterministic parallel programming. At RICON, she spoke on “LVars: Lattice-Based Data Structures for Deterministic Parallelism,” which introduces LVars, data structures that enable deterministic parallel programming. LVars generalize the single-assignment variables often found in deterministic parallel languages to allow multiple assignments that are monotonically increasing with respect to a user-specified lattice of states. LVars maintain determinism by allowing only monotonic writes to, and “threshold” reads from, shared data. Her talk looks at examples of programming in an LVar-based parallel language that is provably deterministic, and explores the connection between LVars and CRDTs. The complete talk is below.
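The monotonic-write and threshold-read rules can be illustrated with a minimal Python sketch. It is not taken from the talk or from any LVar library: the class name `MaxLVar`, the choice of lattice (non-negative integers ordered by ≤, with `max` as the join), and the helper `run_once` are all assumptions made for illustration.

```python
import threading


class MaxLVar:
    """Toy LVar (hypothetical) over non-negative integers ordered by <=."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def put(self, new_value):
        # Monotonic write: the state becomes the least upper bound
        # (here, max) of the old and new values, so it never decreases.
        with self._cond:
            self._value = max(self._value, new_value)
            self._cond.notify_all()

    def get(self, threshold):
        # Threshold read: block until the state is at or above the
        # threshold, then return the threshold itself rather than the
        # exact current state, which could depend on thread scheduling.
        with self._cond:
            self._cond.wait_for(lambda: self._value >= threshold)
            return threshold


def run_once():
    # Three writers race; the reader's result does not depend on the order.
    lvar = MaxLVar()
    writers = [threading.Thread(target=lvar.put, args=(n,)) for n in (3, 7, 5)]
    for t in writers:
        t.start()
    observed = lvar.get(5)
    for t in writers:
        t.join()
    return observed
```

Because writes can only move the state up the lattice and the reader learns only that a threshold has been crossed, `run_once()` returns the same value however the writer threads are scheduled. Real LVars generalize this to arbitrary lattices and to sets of thresholds.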
Finally, Diego Ongaro, a PhD student at Stanford University, spoke on “The Raft Consensus Algorithm.” His talk discusses Raft, a consensus algorithm designed for understandability and developed by Diego and Professor John Ousterhout. Raft is equivalent to Paxos in fault tolerance and performance, but it is designed to be as easy to understand as possible while cleanly addressing all the major pieces needed for practical systems. The hope is that Raft will make consensus available to a wider audience, and that this audience will be able to develop a wider variety of higher-quality consensus-based systems than are available today. You can learn more about Raft below.
To watch all of the sessions from RICON West 2013, visit the Basho Technologies YouTube channel.
Santa Clara, CA – June 24, 2013 – Today, at the Apache CloudStack Collaboration Conference, John Burwell of Basho Technologies will present twice during the Track 1 sessions. The Collaboration Conference brings together users, integrators, and contributors to share their knowledge and work together on future CloudStack releases. This is the second Collaboration Conference, and it takes place in Santa Clara from June 23 to 25.
The first of Burwell’s sessions, titled “Who the Heck are You? Integrating SSO into CloudStack,” starts at 11:30am. This talk will provide a brief introduction to the single sign-on (SSO) authentication model and associated best practices. It will then discuss the benefits of integrating CloudStack with one or more SSO infrastructures. His second session, titled “How to Run From a Zombie: CloudStack Distributed Process Management,” starts at 2:30pm. This talk will explore CloudStack’s distributed process management requirements and the challenges they present in the context of the CAP theorem. These challenges will then be addressed through a distributed process model that emphasizes efficiency, fault tolerance, and operational transparency.
John Burwell is a Consulting Engineer at Basho Technologies, the makers of the open source Riak distributed key-value database and the open source Riak CS object store. He is also a committer to the Apache CloudStack project, where he focuses on storage architecture and security integration. His first CloudStack contribution, S3-backed Secondary Storage, will be included in the upcoming 4.1.0 release.
January 10, 2013
This is the first in a series of blog posts that gives a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on January 24.
One of the biggest differences between Riak and relational systems is the focus on availability and how the underlying architecture deals with failure modes.
Most relational databases leverage a master/slave architecture to replicate data. This approach usually means the master coordinates all write operations, working with the slave nodes to update data. If the master node fails, the database will reject write operations until the failure is resolved – often involving failover or leader election – to maintain correctness. This can result in a window of write unavailability.
Conversely, Riak uses a masterless system with no single point of failure, meaning any node can serve read or write requests. If a node experiences an outage, other nodes can continue to accept read and write requests. Additionally, if a node fails or becomes unavailable to the rest of the cluster due to a network partition, a neighboring node will take over responsibilities for the unavailable node. Once the unavailable node comes back, the neighboring node hands over any updates it accepted in the meantime through a process called “hinted handoff.” This is another way that Riak maintains availability and resilience despite serious failures.
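To make the idea concrete, here is a toy Python sketch of hinted handoff. It is a simplified illustration rather than Riak’s implementation: the `Node` class and the `write` and `handoff` functions are hypothetical names, and real systems also deal with replication factors, timeouts, and failure detection.

```python
class Node:
    """Hypothetical cluster node for illustration only."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}   # values this node owns
        self.hints = {}  # owner name -> {key: value} held on the owner's behalf


def write(key, value, owner, fallback):
    # If the owner is reachable, write to it directly. Otherwise the
    # neighboring node accepts the write and records a "hint" naming
    # the node the data really belongs to.
    if owner.up:
        owner.data[key] = value
    else:
        fallback.hints.setdefault(owner.name, {})[key] = value


def handoff(fallback, owner):
    # Once the owner is back, the neighbor passes along everything it
    # accepted on the owner's behalf and drops its hints.
    if owner.up:
        owner.data.update(fallback.hints.pop(owner.name, {}))
```

In this sketch, a write that arrives while the owner is down lands in the neighbor’s `hints`, and calling `handoff` after the owner recovers moves it into the owner’s `data`, so the write is accepted during the outage and still ends up where it belongs.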
Because Riak allows reads and writes even when multiple nodes are unavailable, and uses an eventually consistent design to maintain availability, in rare cases different replicas may contain different versions of an object. This can occur if multiple clients update the same piece of data at the same time, or if nodes are down or slow to respond. These conflicts occur only a small fraction of the time, but they are important to know about. Riak has a number of mechanisms for detecting and resolving these conflicts when they occur. For more on how Riak achieves availability and the tradeoffs involved, see our documentation on the subject.
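This post does not name a specific detection mechanism, so the following is an assumption for illustration: one widely used technique is to tag each version of an object with a version vector (a per-node update counter) and compare the vectors. The function names `descends` and `compare` and the replica names below are hypothetical.

```python
def descends(a, b):
    # Version vector `a` descends from `b` if it has seen at least as
    # many updates from every node that appears in `b`.
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    # Classify two versions of the same object by their version vectors.
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    # Neither history contains the other: the updates were concurrent.
    return "conflict"


# Two replicas each accepted an update the other has not seen.
replica_1 = {"node_a": 2, "node_b": 1}
replica_2 = {"node_a": 1, "node_b": 2}
status = compare(replica_1, replica_2)  # "conflict"
```

When one vector descends from the other, the newer version can safely replace the older one. Only the concurrent case needs resolution, whether by the application or by a policy such as keeping both versions for a later merge.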
For many use cases today, high availability and fault tolerance are critical to the user experience and the company’s revenue. Downtime costs revenue, damages user trust, and leads to a poor user experience. For use cases such as online retail, shopping carts, advertising, social and mobile platforms, or anything with critical data needs, high availability is key and Riak may be the right choice.