August 1, 2013
To accompany our recent Riak 1.4 announcement, we hosted a live “What’s New in Riak 1.4” webcast. Many attendees asked great questions, but we realize not everyone was able to tune in. That’s why we’re providing the complete recording below.
The webcast is about 30 minutes. It provides a quick background of the basics of Riak and discusses what’s new in the Riak 1.4 release, including the addition of eventually consistent counters, improvements in secondary indexing, and Riak Control updates. The webcast also covers Riak Enterprise and the enhancements released with 1.4. Finally, it looks at how other companies are using Riak and what’s in store for the future Riak 2.0 release later this year.
July 29, 2013
For those of you who are up on your RICON history, you’ll remember that last year, Basho Hackers Russell Brown and Sean Cribbs gave a talk called “Data Structures in Riak” (video can be viewed here). Russell and Sean detailed the approach that Basho was taking to add eventually consistent data structures to Riak. The highlight of the presentation was a demonstration of incrementing and decrementing a counter using a sample app built with riak_dt. While a simple counter was being incremented, nodes were killed and network partitions were introduced; despite all that, the counts converged once the cluster healed.
It was one of the more memorable moments of the entire conference.
We believe developers can build robust applications using a simple key/value API. GETs, PUTs, and DELETEs can work wonders when used correctly. But this doesn’t let you build everything on Riak, and we’ve seen a fair number of applications that outsource things – like data type operations – to other storage or caching systems. Especially when porting apps from Redis to Riak, we often hear that counters are one feature that Riak lacks. Basho is firmly in the “right-technology-for-the-right-job” camp, but we’re aggressively adding functionality that doesn’t break Riak’s design goals of availability and fault-tolerance.
As of the Riak 1.4 release, counters are officially supported. Specifically, Riak now offers a data type known as a PN-Counter, which can be both incremented and decremented. This is the first of a suite of data types we’re planning to add (more on this in a moment) that give developers the ability to build more complex functionality on top of data stored as keys and values.
Using counters, you can increment and decrement a count associated with a named object in a given bucket. This sounds easy, but in a system like Riak where writes aren’t serialized and all updates are asynchronous, determining the last actor in a series of updates to an object is non-trivial. Riak’s counters should be used (in their current state) to count things that can tolerate eventual consistency. With that in mind, here are a few apps and types of functionality that could be implemented with Riak’s Counters:
- Facebook Like Button
- YouTube Views and Likes
- Hacker News Votes
- Twitter Followers and Favorites
The thing to remember here is that these counts can tolerate slight, brief imprecision. When your follower count fluctuates between 1000 and 1010 before finally settling on 1009, Twitter continues to work as expected. Riak 2.0 will feature work that lets you enforce consistency on designated buckets, which will solve this problem (with the necessary tradeoffs). Until then, use counters in Riak for things that can tolerate eventual consistency.
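To make the convergence behavior concrete, here is a minimal Python sketch of the PN-Counter idea. It is a toy model, not Riak's implementation (riak_dt is written in Erlang); the class name, method names, and actor names are illustrative assumptions. Each replica keeps per-actor totals of increments and decrements, and merging takes the per-actor maximum, so replicas that diverged during a partition agree again once they exchange state.

```python
class PNCounter:
    """Toy PN-Counter (hypothetical names; not the riak_dt code)."""

    def __init__(self):
        self.p = {}  # actor -> total increments seen from that actor
        self.n = {}  # actor -> total decrements seen from that actor

    def increment(self, actor, amount=1):
        self.p[actor] = self.p.get(actor, 0) + amount

    def decrement(self, actor, amount=1):
        self.n[actor] = self.n.get(actor, 0) + amount

    def value(self):
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        """Return a new counter holding the per-actor maximum of both replicas."""
        merged = PNCounter()
        for actor in set(self.p) | set(other.p):
            merged.p[actor] = max(self.p.get(actor, 0), other.p.get(actor, 0))
        for actor in set(self.n) | set(other.n):
            merged.n[actor] = max(self.n.get(actor, 0), other.n.get(actor, 0))
        return merged


# Two replicas accept writes independently, as if partitioned.
a, b = PNCounter(), PNCounter()
a.increment("node_a", 3)
b.increment("node_b", 2)
b.decrement("node_b", 1)

# After the partition heals, merging in either order gives the same count.
healed = a.merge(b)
print(healed.value())  # 3 + 2 - 1 = 4
```

Because the merge is commutative and idempotent, it does not matter which replica merges first or how often state is exchanged. Until a merge happens, each replica reports only its own partial view, which is the brief imprecision described above.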
Even with this caveat, counters are a huge addition to Riak and we’re excited to see the new suite of applications and functionality they make possible.
Usage & Getting Started
To make use of counters we’ve introduced new endpoints and request types for the HTTP and Protocol Buffers APIs, respectively.
The complete documentation for the HTTP interface is here. Here are the basics using curl:
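As a rough Python counterpart to those HTTP calls, the sketch below builds (but does not send) the requests. It assumes the counter endpoint has the form /buckets/<bucket>/counters/<key>, that a POST body carries a signed integer to add, and that the node listens for HTTP on 127.0.0.1:8098; the bucket and key names are hypothetical. The bucket is also assumed to need allow_mult set to true. Check the HTTP documentation before relying on any of these details.

```python
from urllib.parse import quote
from urllib.request import Request

RIAK_HTTP = "http://127.0.0.1:8098"  # assumed HTTP listener address


def counter_url(bucket, key, base=RIAK_HTTP):
    # Assumed endpoint shape: /buckets/<bucket>/counters/<key>
    return f"{base}/buckets/{quote(bucket, safe='')}/counters/{quote(key, safe='')}"


def update_request(bucket, key, amount):
    """Build a POST whose body is the signed amount to add to the counter."""
    body = str(int(amount)).encode("ascii")
    return Request(counter_url(bucket, key), data=body, method="POST")


def read_request(bucket, key):
    """Build a GET that reads the counter's current value."""
    return Request(counter_url(bucket, key), method="GET")


# Hypothetical bucket and key names, for illustration only.
incr = update_request("tweets", "followers", 1)
decr = update_request("tweets", "followers", -1)
read = read_request("tweets", "followers")

print(incr.get_method(), incr.full_url, incr.data)
print(read.get_method(), read.full_url)
```

Passing any of these objects to urllib.request.urlopen would send the request to a running node.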
Usage documentation for the Protocol Buffers interface is still in the works, but here’s the relevant message (as seen in riak_pb):
We’re working on implementing these in all Basho-supported client libraries. Keep an eye on these for details and timelines around availability. We currently support counters in the following libraries across the following protocols:
- Python – HTTP and PB
- Java – HTTP and PB
- Erlang – PB
In addition to the docs and code, Basho Hacker Sam Elliot has started a Riak CRDT cookbook. The first section walks you through using counters in a few different ways, and even shows you how to simulate failure events. Take it for a spin and send Sam some feedback.
Future Data Types
In addition to counters, we have big plans for more data types in Riak. Sets and maps are on the short list, and the goal is to have these ready for Riak 2.0. Russell posted an extensive RFC on the Riak GitHub repo for those interested. Comments, critiques, and contributions are all encouraged.
Related Work and Additional Reading
- A simple way to store a PN-Counter in a riak_object
- Release Notes on Counters
- CRDT paper from Shapiro et al. at INRIA
Enjoy and see you at RICON West in October.
July 22, 2013
With the release of Riak 1.4, we have made some important additions and changes to the Client API features, with a goal of strengthening the real-time, streaming, and timeout behaviors for clients. To take a deeper look at all of the changes in Riak 1.4, check out the release notes.
Protocol Buffers & Multiple Interface Binding
In previous versions of Riak, the interface binding for protocol buffers defaulted to 127.0.0.1 on port 8087, and the endpoint was limited to a single IP address and port. With Riak 1.4, a list of endpoints can be configured. This greatly extends the options for firewall security and other network-level setup, giving security teams more choice in which port ranges to close off or which IP ranges to shift.
Clients can also bind to these new ranges. This gives clients the ability to use protocol buffers on more web-friendly port ranges, even utilizing protocol buffers in parallel with HTTP on port 80 if necessary. With this update, Riak now has closer feature parity between HTTP and Protocol Buffers.
Clients can now specify timeout values in milliseconds. Client-specified timeouts can be used for object operations (fetch, store, and delete) and for listing buckets or keys. This addition will be useful for asynchronous requests and pivotal for use with synchronous requests. For more on the client-specified timeouts, take a look at the relevant GitHub issue.
To explore response times and where timeout conditions can occur, check out the Basho Bench docs. There are examples for testing various scenarios and identifying bottlenecks that may need custom timeouts or performance improvements.
Bucket Properties for Protocol Buffers
Bucket properties can now be reset to default values, and all built-in properties can be configured via the protocol buffers API. This makes protocol buffers considerably more useful to clients and, again, moves it closer to feature parity with HTTP. For more information on setting and using these bucket properties, check out the bucket properties documentation.
List-Buckets Streaming: Real-time
Listing keys or buckets via a streaming request will now send results to the client as they are received. This removes the need to wait for all nodes to respond to the request. This update helps with response time and timeouts from the client point of view. It also allows for the use of streaming features with Node.js, C#, Java, and other languages and frameworks that support real-time streaming data feeds.
July 17, 2013
Last week, we announced the availability of Riak 1.4, which added a number of new features and updates. To provide more details about what was introduced with the latest release, we hosted a “What’s New in Riak 1.4” webcast.
For reference, we have posted the slides from this webcast below and the complete recording will be posted in the coming weeks. These slides provide background on Riak, introduce the new features and updates available with 1.4, and discuss some use cases and user stories. They also cover Riak Enterprise and what enhancements 1.4 made to Riak Enterprise’s replication.
July 16, 2013
Riak Control is a web-based administrative console for inspecting and manipulating Riak clusters.
Although Riak Control is maintained as a separate application, the necessary code for Control ships with Riak 1.1 and above and requires no additional installation steps. For details on setting up Riak Control, check out our docs.
Those are things you may already know about Control. Now let’s look at the changes in 1.4.
Cluster Management with Staging
The riak-admin command-line tool has offered staged clustering since Riak 1.2. Riak 1.4 brings that functionality to Control.
The new Cluster Management interface allows you to stage cluster node additions and removals. Once the changes have been reviewed, they can be committed to the cluster. After being committed, Control displays partition transfers and memory utilization changes as they occur.
Staged changes to the cluster:
Changes committed; transfers active:
Cluster stabilizes after changes:
Standalone Node Management Interface
Because the Cluster Management interface now operates on staged changes, actions that cannot be staged have been moved to the Node Management interface. Here, changes to individual nodes, such as stopping or marking them as down, can be applied.
Contributing to Riak Control
Riak Control’s modular design means contributors do not need to understand every detail of its existing functionality. If you’re interested in contributing, we have outlined the process of setting up a development environment, as well as some basic rules for contribution.
Check out our Riak 1.4 announcement to learn what else is included in this release.
July 11, 2013
To learn more about what’s new in Riak 1.4, sign up for our webcast on July 12th at 11am PT/2pm ET
With the introduction of Riak 1.4, Basho offers developers new ways to leverage secondary indexes (often referred to as 2i). This is a short review of what they are and what has been added.
Secondary indexes in Riak
Values in Riak are treated as opaque, although optional add-ons such as Riak Search and Yokozuna can index the contents.
With two of the supported storage backends (LevelDB and Memory), developers can add their own indexes for querying. These can be numeric or string values, matched as exact values or ranges, and can have as much or as little to do with the stored value as the developer wishes.
Primary key lookups will always be the fastest way to retrieve values from Riak, but 2i is a useful way to label and retrieve data.
What has changed with Riak 1.4?
Previously, results from 2i queries were presented as a comprehensive list of unordered keys. Depending on the size of the result set, this could be awkward (or impossible) for a client application to handle.
With 1.4, the following features have been added:
- Pagination and streaming are available on request.
- Results are now sorted: first by index value, then by keys.
- If requested as part of a range query, the matched index value will be returned alongside each key.
Here is an example of a range query via HTTP. Pagination is specified via max_results=5, and the return of matched index values is requested in the same query string.
In this case we are querying a small Twitter firehose data set; each tweet was added to Riak with nested hashtag values as indexes. The query is designed to match hashtags in a range starting at ri (inclusive). A continuation value is necessary to retrieve the next page of results, and as expected the results are sorted by index value and key.
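The sorting and pagination behavior can be modelled in a few lines of Python. This is an in-memory simulation, not a Riak client call. The function name, the sample data, the inclusive upper bound "rk", and the way the continuation token is encoded are all assumptions made for illustration. A real client should treat the continuation value as opaque.

```python
import base64
import json


def query_range(entries, start, end, max_results, continuation=None):
    """Toy 2i range query over (index_value, key) pairs.

    Returns (page, continuation). The page is sorted by index value, then key.
    Both range bounds are treated as inclusive (an assumption of this sketch).
    """
    matches = sorted((v, k) for v, k in entries if start <= v <= end)
    if continuation is not None:
        # Resume strictly after the last (index_value, key) pair already returned.
        last = tuple(json.loads(base64.b64decode(continuation)))
        matches = [m for m in matches if m > last]
    page = matches[:max_results]
    more = len(matches) > max_results
    token = None
    if page and more:
        token = base64.b64encode(json.dumps(page[-1]).encode()).decode()
    return page, token


# Hypothetical (hashtag, tweet key) index entries.
entries = [
    ("rice", "t3"), ("riak", "t9"), ("riak", "t1"),
    ("ricon", "t4"), ("rio", "t2"), ("ruby", "t7"),
]

page1, token1 = query_range(entries, "ri", "rk", max_results=2)
page2, token2 = query_range(entries, "ri", "rk", max_results=2, continuation=token1)
page3, token3 = query_range(entries, "ri", "rk", max_results=2, continuation=token2)

for page in (page1, page2, page3):
    print(page)
```

Each call returns at most max_results pairs, and the final page comes back with no continuation. That is the signal a client would use to stop requesting further pages.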
Where to find more information?
Basho’s docs site has been updated for 1.4:
July 10, 2013
Today, Basho Technologies announced the public availability of Riak 1.4.
The release includes new features and updates in addition to a substantive set of addressed issues. These updates include improvements to Secondary Indexes, simplified cluster management through Riak Control, reduced object storage overhead, and progress reporting for Hinted Handoff. Riak 1.4 also sets the stage for Basho’s upcoming major release, Riak 2.0, planned for Fall 2013.
In addition to these features and capabilities, Riak 1.4 includes eventually consistent, distributed counter functionality. Riak’s first distributed data type provides conflict resolution after a network partition and continues to advance Basho’s position of leadership within the distributed systems space.
This release encompasses both Riak and Riak Enterprise, which includes the multi-datacenter replication capability used by an increasing number of enterprise customers to address their critical data needs.
A full list of the new features and updates available in the 1.4 release can be found on the Basho blog post, Basho Announces Availability of Riak 1.4.
July 10, 2013
We are excited to announce the launch of Riak 1.4. With this release, we have added more functionality and addressed some common requests that we hear from customers. In addition, a few features are available in technical preview; you can begin testing them now, and they will be fully rolled out in the 2.0 launch later this year.
The new features and updates in Riak 1.4 include:
- Secondary Indexing Improvements: Query results are now sorted and paginated, offering developers much richer semantics
- Introducing Counters in Riak: Counters, Riak’s first distributed data type, provide automatic conflict resolution after a network partition
- Simplified Cluster Management With Riak Control: New capabilities in Riak’s GUI-based administration tool improve the cluster management page for preparing and applying changes to the cluster
- Reduced Object Storage Overhead: Values and associated metadata are stored and transmitted using a more compact format, reducing disk and network overhead
- Handoff Progress Reporting: Makes operating the cluster, identifying and troubleshooting issues, and monitoring the cluster simpler
- Improved Backpressure: Riak responds with an overload message if a vnode has too many messages in queue
This 1.4 launch also adds quite a few performance enhancements to Riak Enterprise’s multi-datacenter replication that include:
- Replication in Riak 1.4 supports SSL, NAT, and full sync scheduling
- Availability of cascading real-time writes gives operators the choice as to whether or not all writes are replicated to all datacenters
- Optional use of Active Anti-Entropy during replication, which can significantly decrease data transfer times, is available in Technical Preview
These updates improve the performance of Riak and provide greater functionality and management for both clusters and multiple datacenters. You can download Riak 1.4 at docs.basho.com/riak/latest/downloads.
For a full list of what’s in Riak 1.4, check out our code at Github.com/basho or review the release notes. To learn even more, join our live webcast, “What’s New in Riak 1.4” on July 12th and look for a series of more detailed blog posts over the coming weeks.
We will also be launching Riak CS 1.4 shortly. Keep an eye on our blog for more information.