To accompany our recent Riak 1.4 announcement, we hosted a live “What’s New in Riak 1.4” webcast. While we had many attendees who asked some great questions, we realize not everyone was able to tune in. That’s why we’re providing the complete recording below.
The webcast is about 30 minutes. It provides a quick background of the basics of Riak and discusses what’s new in the Riak 1.4 release, including the addition of eventually consistent counters, improvements in secondary indexing, and Riak Control updates. The webcast also covers Riak Enterprise and the enhancements released with 1.4. Finally, it looks at how other companies are using Riak and what’s in store for the future Riak 2.0 release later this year.
For those of you who are up on your RICON history, you’ll remember that last year, Basho Hackers Russell Brown and Sean Cribbs gave a talk called “Data Structures in Riak” (video can be viewed here). Russell and Sean detailed the approach that Basho was taking to add eventually consistent data structures to Riak. The highlight of the presentation was a demonstration of incrementing and decrementing a counter using a sample app built with riak_dt. While a simple counter was being incremented, nodes were killed and network partitions were introduced; despite all that, the counts converged once the cluster healed.
It was one of the more memorable moments of the entire conference.
We believe developers can build robust applications utilizing a simple key/value API. GETs, PUTs, and DELETEs can work wonders when utilized correctly. But this doesn’t let you build everything on Riak, and we’ve seen a fair number of applications that outsource things – like data type operations – to other storage or caching systems. Especially when porting apps from Redis to Riak, we often hear that counters are one feature that Riak lacks. Basho is firmly in the “right-technology-for-the-right-job” camp, but we’re aggressively adding functionality that doesn’t break Riak’s design goals of availability and fault-tolerance.
As of the Riak 1.4 release, counters are officially supported. Specifically, Riak now provides a data type known as a PN-Counter, which can be both incremented and decremented. This is the first of a suite of data types we’re planning to add (more on this in a moment) that give developers the ability to build more complex functionality on top of data stored as keys and values.
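To make the idea of a PN-Counter concrete, here is a minimal sketch in Python. It is a toy model of the general technique (per-actor increment and decrement tallies, merged by taking the per-actor maximum), not the riak_dt implementation; the class name, method names, and actor ids are all made up for illustration.

```python
from collections import defaultdict


class PNCounter:
    """Toy PN-Counter: per-actor increment (P) and decrement (N) tallies."""

    def __init__(self):
        self.p = defaultdict(int)  # actor id -> total increments by that actor
        self.n = defaultdict(int)  # actor id -> total decrements by that actor

    def increment(self, actor, amount=1):
        self.p[actor] += amount

    def decrement(self, actor, amount=1):
        self.n[actor] += amount

    def value(self):
        # The visible count is all increments minus all decrements.
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        """Return a new counter holding the per-actor maximum of both replicas."""
        merged = PNCounter()
        for actor in set(self.p) | set(other.p):
            merged.p[actor] = max(self.p.get(actor, 0), other.p.get(actor, 0))
        for actor in set(self.n) | set(other.n):
            merged.n[actor] = max(self.n.get(actor, 0), other.n.get(actor, 0))
        return merged


# Two replicas accept writes independently during a partition...
a, b = PNCounter(), PNCounter()
a.increment("node_a", 3)
b.increment("node_b", 2)
b.decrement("node_b", 1)

# ...and converge to the same value once merged, in either order.
assert a.merge(b).value() == b.merge(a).value() == 4
```

Because each actor only ever grows its own tallies (amounts are assumed to be non-negative), merging is commutative and idempotent, which is what lets replicas converge without coordinating writes.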
Using counters, you can increment and decrement a count associated with a named object in a given bucket. This sounds easy, but in a system like Riak where writes aren’t serialized and all updates are asynchronous, determining the last actor in a series of updates to an object is non-trivial. Riak’s counters should be used (in their current state) to count things that can tolerate eventual consistency. With that in mind, here are a few apps and types of functionality that could be implemented with Riak’s Counters:
Facebook Like Button
YouTube Views and Likes
Hacker News Votes
Twitter Followers and Favorites
The thing to remember here is that these counts can tolerate slight, brief imprecision. When your follower count fluctuates between 1000 and 1010 before finally settling on 1009, Twitter continues to work as expected. Riak 2.0 will feature work that enables you to enforce consistency around designated buckets which will solve this problem (with the necessary tradeoffs). Until then, use counters in Riak for things that can tolerate eventual consistency.
Even with this caveat, counters are a huge addition to Riak and we’re excited to see the new suite of applications and functionality they make possible.
Usage & Getting Started
To make use of counters we’ve introduced new endpoints and request types for the HTTP and Protocol Buffers APIs, respectively.
The complete documentation for the HTTP interface is here. Here are the basics using curl:
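For illustration, the shape of these requests can also be sketched with Python's standard library. The path layout /buckets/<bucket>/counters/<key>, the port 8098, the plain-integer request body, and the helper names are assumptions of this sketch rather than something stated here, so check them against the HTTP documentation. The code only builds the requests; it does not send them.

```python
from urllib.parse import quote
from urllib.request import Request


def counter_url(bucket, key, host="127.0.0.1", port=8098):
    """Build the (assumed) counter endpoint URL for a bucket/key pair."""
    return "http://{}:{}/buckets/{}/counters/{}".format(
        host, port, quote(bucket, safe=""), quote(key, safe="")
    )


def increment_request(bucket, key, amount=1):
    """POST a signed integer; a negative amount is assumed to decrement."""
    body = str(int(amount)).encode("ascii")
    return Request(counter_url(bucket, key), data=body, method="POST")


def read_request(bucket, key):
    """GET is assumed to return the current (eventually consistent) count."""
    return Request(counter_url(bucket, key), method="GET")


# Hypothetical bucket and key names, for illustration only.
req = increment_request("likes", "video-42", 5)
print(req.get_method(), req.full_url, req.data)
```

To actually talk to a node, each request object could be passed to urllib.request.urlopen.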
Usage documentation for the Protocol Buffers interface is still in the works, but here’s the relevant message (as seen in riak_pb):
We’re working on implementing these in all of the Basho-supported client libraries. Keep an eye on these for details and timelines around availability. We currently support counters in the following libraries across the following protocols:
Python – HTTP and PB
Java – HTTP and PB
Erlang – PB
In addition to the docs and code, Basho Hacker Sam Elliot has started a Riak CRDT cookbook. The first section walks you through using counters in a few different ways, and even shows you how to simulate failure events. Take it for a spin and send Sam some feedback.
Future Data Types
In addition to counters, we have big plans for more data types in Riak. Sets and maps are on the short list, and the goal is to have these ready for Riak 2.0. Russell posted an extensive RFC on the Riak GitHub repo for those interested. Comments, critiques, and contributions are all encouraged.
With the release of Riak 1.4, we have made some important additions and changes to the Client API features, with a goal of strengthening the real-time, streaming, and timeout behaviors for clients. To take a deeper look at all of the changes in Riak 1.4, check out the release notes.
Protocol Buffers & Multiple Interface Binding
In previous versions of Riak, the interface binding for protocol buffers defaulted to 127.0.0.1 on port 8087, and the endpoint was limited to a single IP address and port. With Riak 1.4, a list of endpoints can be configured. This feature considerably extends the options for firewall rules and other network-level setup, giving those responsible for security more choice in which port ranges to close off or IP ranges to shift.
Clients can also bind to these new ranges. This gives clients the ability to use protocol buffers on more web-friendly port ranges, even utilizing protocol buffers in parallel with HTTP on port 80 if necessary. With this update, Riak now has closer feature parity between HTTP and Protocol Buffers.
Clients can now specify a timeout value in milliseconds. Client-specified timeouts can be used for fetch, store, and delete operations on objects, as well as for listing buckets or keys. This addition will be useful for asynchronous requests and pivotal for use with synchronous requests. For more on the client-specified timeouts, take a look at the relevant GitHub issue.
To explore response times and where timeout conditions can occur, check out the Basho Bench docs. There are examples for testing various scenarios and identifying bottlenecks that may need custom timeouts or performance improvements.
Bucket Properties for Protocol Buffers
Bucket properties can now be reset to default values, and all built-in properties can be configured via the protocol buffers API. This makes protocol buffers considerably more useful to clients and, again, brings it a step closer to feature parity with HTTP. For more information on setting and using these bucket properties, check out the bucket properties documentation.
List-Buckets Streaming: Real-time
Listing keys or buckets via a streaming request will now send bucket names to the client as they are received, which removes the need to wait for all nodes to respond to the request. This update helps with response time and timeouts from the client point of view. It also allows for the use of streaming features with Node.js, C#, Java, and other languages and frameworks that support real-time streaming data feeds.
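On the client side, consuming such a stream means handling results piece by piece instead of waiting for one complete response. The sketch below shows one way to do that in Python. It assumes the stream is a sequence of JSON objects of the form {"buckets": [...]}, which is an assumption of this example and not something specified here, and it uses simulated fragments in place of a real connection.

```python
import json


def iter_bucket_names(chunks):
    """Yield bucket names as soon as each complete JSON object arrives.

    `chunks` is any iterable of text fragments; fragment boundaries need not
    line up with JSON object boundaries.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer = (buffer + chunk).lstrip()
        while buffer:
            try:
                obj, end = decoder.raw_decode(buffer)
            except ValueError:
                break  # incomplete object: wait for more data
            buffer = buffer[end:].lstrip()
            for name in obj.get("buckets", []):
                yield name


# Simulated stream: the second object is split across two fragments.
fragments = ['{"buckets":["users"]}', '{"buckets":["twe', 'ets","logs"]}']
for name in iter_bucket_names(fragments):
    print(name)
```

Since the function is a generator, the caller can start acting on the first bucket name before the last node has responded.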
To get started, download Riak 1.4 on our docs site. Also, be sure to grab a ticket to RICON West before they sell out.
Last week, we announced the availability of Riak 1.4, which added a number of new features and updates. To provide more details about what was introduced with the latest release, we hosted a “What’s New in Riak 1.4” webcast.
For reference, we have posted the slides from this webcast below and the complete recording will be posted in the coming weeks. These slides provide background on Riak, introduce the new features and updates available with 1.4, and discuss some use cases and user stories. They also cover Riak Enterprise and what enhancements 1.4 made to Riak Enterprise’s replication.
Riak Control is a web-based administrative console for inspecting and manipulating Riak clusters.
Although Riak Control is maintained as a separate application, the necessary code for Control ships with Riak 1.1 and above and requires no additional installation steps. For details on setting up Riak Control, check out our docs.
Those are things you may already know about Control. Now let’s look at the changes in 1.4.
Cluster Management with Staging
The riak-admin command-line tool has offered staged clustering since Riak 1.2. Riak 1.4 brings that functionality to Control.
The new Cluster Management interface allows you to stage cluster node additions and removals. Once the changes have been reviewed, they can be committed to the cluster. Once they are committed, Control displays partition transfers and memory utilization changes as they occur.
Staged changes to the cluster:
Changes committed; transfers active:
Cluster stabilizes after changes:
Standalone Node Management Interface
Because the Cluster Management interface now operates on staged changes, actions that cannot be staged have been moved to the Node Management interface. Here, changes to individual nodes, such as stopping or marking them as down, can be applied.
With the introduction of Riak 1.4, Basho offers developers new ways to leverage secondary indexes (often referred to as 2i). This is a short review of what they are and what has been added.
Secondary indexes in Riak
Values in Riak are treated as opaque, although optional add-ons such as Riak Search and Yokozuna can index the contents.
With two of the supported storage backends (LevelDB and Memory), developers can add their own indexes for querying. These can be numeric or string values, matched as exact values or ranges, and can have as much or as little to do with the stored value as the developer wishes.
Primary key lookups will always be the fastest way to retrieve values from Riak, but 2i is a useful way to label and retrieve data.
What has changed with Riak 1.4?
Previously, results from 2i queries were presented as a comprehensive list of unordered keys. Depending on the size of the result set, this could be awkward (or impossible) for a client application to handle.
With 1.4, the following features have been added:
Pagination and streaming are available on request.
Results are now sorted: first by index value, then by keys.
If requested as part of a range query, the matched index value will be returned alongside each key.
Here is an example of a range query via HTTP. Pagination is specified via max_results=5 and the return of matched index values via return_terms=true.
In this case we are querying a small Twitter firehose data set; each tweet was added to Riak with nested hashtag values as indexes. The query is designed to match hashtags in the range ri (inclusive) to ru (exclusive).
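A client typically walks through a paginated result set by repeating the query until the server stops handing back a continuation token. The Python sketch below illustrates that loop. Only max_results and return_terms come from the description above; the URL path layout, the continuation parameter, the response shape, and the bucket and index names (tweets, hashtags_bin) are assumptions made for illustration, and a stand-in function replaces the real HTTP call so the example runs without a cluster.

```python
from urllib.parse import urlencode


def index_range_url(bucket, index, start, end, max_results, continuation=None,
                    base="http://127.0.0.1:8098"):
    """Build a paginated 2i range-query URL that also returns matched terms."""
    params = {"max_results": max_results, "return_terms": "true"}
    if continuation:
        params["continuation"] = continuation
    return "{}/buckets/{}/index/{}/{}/{}?{}".format(
        base, bucket, index, start, end, urlencode(params)
    )


def iter_pages(fetch_json, bucket, index, start, end, max_results=5):
    """Yield one page of results at a time until no continuation is returned.

    `fetch_json` is a caller-supplied function: URL in, decoded JSON dict out.
    """
    continuation = None
    while True:
        url = index_range_url(bucket, index, start, end, max_results, continuation)
        page = fetch_json(url)
        yield page.get("results", [])
        continuation = page.get("continuation")
        if not continuation:
            break


# Stand-in for a real HTTP call (hypothetical data, assumed response shape).
def fake_fetch(url):
    if "continuation=" not in url:
        return {"results": [{"rice": "key1"}, {"riot": "key2"}], "continuation": "abc"}
    return {"results": [{"rock": "key3"}]}


for page in iter_pages(fake_fetch, "tweets", "hashtags_bin", "ri", "ru", max_results=2):
    print(page)
```

Keeping the HTTP call behind fetch_json means the same loop works with any HTTP library, and each page stays small enough for the client to handle.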
Today, Basho Technologies announced the public availability of Riak 1.4.
The release includes new features and updates in addition to a substantive set of addressed issues. These updates include improvements to Secondary Indexes, simplified cluster management through Riak Control, reduced object storage overhead, and progress reporting for Hinted Handoff. Riak 1.4 also sets the stage for Basho’s upcoming major release, Riak 2.0, planned for Fall 2013.
In addition to these features and capabilities, Riak 1.4 includes eventually consistent, distributed counter functionality. Riak’s first distributed data type provides conflict resolution after a network partition and continues to advance Basho’s position of leadership within the distributed systems space.
This release encompasses both Riak and Riak Enterprise, which includes the multi-datacenter replication capability used by an increasing number of enterprise customers to address their critical data needs.
We are excited to announce the launch of Riak 1.4. With this release, we have added more functionality and addressed some common requests that we hear from customers. In addition, there are a few features available in technical preview that you can begin testing now and that will be fully rolled out in the 2.0 launch later this year.
Replication in Riak 1.4 supports SSL, NAT, and full sync scheduling
Availability of cascading real-time writes gives operators the choice as to whether or not all writes are replicated to all datacenters
Optional use of Active Anti-Entropy during replication in Riak 1.4 to significantly decrease data transfer times is available in Technical Preview
These updates improve the performance of Riak and provide greater functionality and management for both clusters and multiple datacenters. You can download Riak 1.4 at docs.basho.com/riak/latest/downloads.
For a full list of what’s in Riak 1.4, check out our code at Github.com/basho or review the release notes. To learn even more, join our live webcast, “What’s New in Riak 1.4” on July 12th and look for a series of more detailed blog posts over the coming weeks.
We will also be launching Riak CS 1.4 shortly. Keep an eye on our blog for more information.