
Counters in Riak 1.4

July 29, 2013

For those of you who are up on your RICON history, you’ll remember that last year, Basho Hackers Russell Brown and Sean Cribbs gave a talk called “Data Structures in Riak” (video can be viewed here). Russell and Sean detailed the approach that Basho was taking to add eventually consistent data structures to Riak. The highlight of the presentation was a demonstration of a sample app built with riak_dt: while a simple counter was incremented and decremented, nodes were killed and network partitions were introduced, and despite all that, the counts converged once the cluster healed.

It was one of the more memorable moments of the entire conference.

We believe developers can build robust applications using a simple key/value API. GETs, PUTs, and DELETEs can work wonders when used correctly. But the key/value model alone doesn’t let you build everything on Riak, and we’ve seen a fair number of applications that outsource things – like data type operations – to other storage or caching systems. Especially when porting apps from Redis to Riak, we often hear that counters are one feature Riak lacks. Basho is firmly in the “right-technology-for-the-right-job” camp, but we’re aggressively adding functionality that doesn’t break Riak’s design goals of availability and fault-tolerance.

As of the Riak 1.4 release, counters are officially supported. Specifically, Riak now supports a data type known as a PN-Counter, which can be both incremented and decremented. This is the first of a suite of data types we’re planning to add (more on this in a moment) that give developers the ability to build more complex functionality on top of data stored as keys and values.
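For intuition, here is a tiny illustrative Ruby sketch of PN-Counter semantics (not the actual riak_dt implementation): each actor keeps grow-only totals of its increments and decrements, replicas merge by taking per-actor maximums, and the value is the difference of the two sums.

```ruby
# Illustrative PN-Counter: a pair of grow-only counters, tracked per actor.
class PNCounter
  def initialize
    @p = Hash.new(0) # actor => total increments
    @n = Hash.new(0) # actor => total decrements
  end

  def increment(actor, amount = 1)
    @p[actor] += amount
  end

  def decrement(actor, amount = 1)
    @n[actor] += amount
  end

  # Value is all increments minus all decrements.
  def value
    @p.values.inject(0, :+) - @n.values.inject(0, :+)
  end

  # Merging replicas takes the per-actor maximum of each grow-only count,
  # so the result is the same no matter the order replicas are merged in.
  def merge(other)
    other.p.each { |actor, count| @p[actor] = [@p[actor], count].max }
    other.n.each { |actor, count| @n[actor] = [@n[actor], count].max }
    self
  end

  protected

  attr_reader :p, :n
end

a = PNCounter.new         # replica on one node
b = PNCounter.new         # replica on another node
a.increment(:node1)       # concurrent, unserialized updates
b.increment(:node2, 2)
b.decrement(:node2)
a.merge(b).value          # => 2, however the cluster heals
```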

Use Cases

Using counters, you can increment and decrement a count associated with a named object in a given bucket. This sounds easy, but in a system like Riak where writes aren’t serialized and all updates are asynchronous, determining the last actor in a series of updates to an object is non-trivial. Riak’s counters should be used (in their current state) to count things that can tolerate eventual consistency. With that in mind, here are a few apps and types of functionality that could be implemented with Riak’s Counters:

  • Facebook Like Button
  • YouTube Views and Likes
  • Hacker News Votes
  • Twitter Followers and Favorites

The thing to remember here is that these counts can tolerate slight, brief imprecision. When your follower count fluctuates between 1000 and 1010 before finally settling on 1009, Twitter continues to work as expected. Riak 2.0 will feature work that enables you to enforce consistency around designated buckets, which will solve this problem (with the necessary tradeoffs). Until then, use counters in Riak for things that can tolerate eventual consistency.

Even with this caveat, counters are a huge addition to Riak and we’re excited to see the new suite of applications and functionality they make possible.

Usage & Getting Started

To make use of counters, we’ve introduced new endpoints and request types for the HTTP and Protocol Buffers APIs, respectively.

HTTP

The complete documentation for the HTTP interface is here. Here are the basics using CURL:
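A sketch against a local node on port 8098; the stats bucket and hits counter names below are illustrative. Note that counters require allow_mult=true on their bucket.

```bash
# Counters live in buckets with allow_mult enabled
curl -X PUT -H "Content-Type: application/json" \
  -d '{"props": {"allow_mult": true}}' \
  http://localhost:8098/buckets/stats/props

# Increment the counter by 1 (the request body is the amount)
curl -X POST http://localhost:8098/buckets/stats/counters/hits -d "1"

# Negative amounts decrement
curl -X POST http://localhost:8098/buckets/stats/counters/hits -d "-5"

# Read the current value
curl http://localhost:8098/buckets/stats/counters/hits
```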

That’s it.

PBC

Usage documentation for this is still in the works, but here’s the relevant message (as seen in riak_pb):
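The definitions below are approximate; consult riak_kv.proto in the riak_pb repository for the authoritative versions.

```protobuf
// Counter update: bucket/key plus a signed amount (approximate; see
// riak_kv.proto in riak_pb for the authoritative definitions)
message RpbCounterUpdateReq {
    required bytes bucket = 1;
    required bytes key = 2;
    optional uint32 w = 3;
    optional uint32 dw = 4;
    optional uint32 pw = 5;
    required sint64 amount = 6;
    optional bool returnvalue = 7;
}

message RpbCounterUpdateResp {
    optional sint64 value = 1;
}

// Counter fetch: bucket/key plus the usual read quorum options
message RpbCounterGetReq {
    required bytes bucket = 1;
    required bytes key = 2;
    optional uint32 r = 3;
    optional uint32 pr = 4;
    optional bool basic_quorum = 5;
    optional bool notfound_ok = 6;
}

message RpbCounterGetResp {
    optional sint64 value = 1;
}
```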

We’re working on implementing these in all of Basho’s supported client libraries; keep an eye on them for details and timelines around availability. We currently support counters in the following libraries across the following protocols:

  • Python – HTTP and PB
  • Java – HTTP and PB
  • Erlang – PB

In addition to the docs and code, Basho Hacker Sam Elliot has started a Riak CRDT cookbook. The first section walks you through using counters in a few different ways, and even shows you how to simulate failure events. Take it for a spin and send Sam some feedback.

Future Data Types

In addition to counters, we have big plans for more data types in Riak. Sets and maps are on the short list, and the goal is to have these ready for Riak 2.0. Russell posted an extensive RFC on the Riak GitHub repo for those interested. Comments, critiques, and contributions are all encouraged.


Enjoy and see you at RICON West in October.

Mark and The Basho Team

How Copious Uses Riak for Social Logins on its eCommerce Marketplace

January 28, 2013

Copious is a social commerce marketplace that makes it easy for people to buy and sell the things they love. Copious uses social data from Facebook and Twitter to customize the shopping and selling experience for each individual user around their interests, taste, and style. This connects buyers and sellers to each other in a unique way that also presents exciting technical challenges.

Copious Homepage

In the summer of 2012, Copious decided to switch to Riak from MongoDB for their social login functionality. They currently store all registered accounts in Riak as well as the tokens that make it possible for users to authenticate with Copious via their Facebook or Twitter accounts. Copious now stores hundreds of thousands of keys in Riak.

Copious found that Riak’s key/value data model was a natural fit for their authentication scheme and that it was easy to set up and run on Amazon Web Services. They run an HTTP service in front of Riak that presents data in the appropriate form and handles events between the front-end application and the backend data storage.
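A hypothetical sketch of the pattern (not Copious’s actual code or schema), using the Ruby riak-client: the token lives under a provider-qualified key, so authenticating a returning user is a single GET.

```ruby
require 'riak'

client = Riak::Client.new
tokens = client.bucket("social_tokens") # illustrative bucket name

# Store the OAuth credentials under a provider-qualified key
record = tokens.get_or_new("twitter:12345")
record.content_type = "application/json"
record.data = {
  "oauth_token"  => "...",
  "oauth_secret" => "...",
  "account_key"  => "accounts/987" # pointer back to the registered account
}
record.store

# Logging a returning user in is then a single fetch by key
tokens.get("twitter:12345").data
```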

“Operating in a cloud environment, our infrastructure must be resilient to failure. Sometimes machines or even entire availability zones just disappear, and Riak’s fault-tolerant design means we remain available despite these failures. Riak is one component we never worry about, and requires much less operational work than other datastores,” said Robert Zuber, co-founder of Copious.

Copious is one of the many companies now using a polyglot approach to their platform – using Riak alongside MySQL, Redis, SOLR, and other technologies depending on the problem they need to solve. They chose to use Riak for their social login functionality because of its operational simplicity, which allows them to easily scale up without sharding and provides the high availability required for a smooth user experience. Based on their success with Riak, Zuber has said that they are looking to move more data to Riak in the future.

You can sign up now on Copious to earn $10 off your first $20+ purchase, plus the added bonus of knowing that Riak is in the background helping the site do its magic. If you’re in San Francisco, make sure to attend our Riak meetup on Wednesday, February 13, where Copious will be presenting how they use Riak alongside other datastores.

To learn more about how Riak can benefit your eCommerce or retail platform, join our webcast, “Riak on Retail” on February 8th!

Basho

Ripple 0.9 Release

April 4, 2011

I’m proud to announce that version 0.9.0 of the Ripple family of gems (riak-client, ripple, riak-sessions) was released yesterday. This is a huge leap forward from the 0.8 series, the last release of which was in December. I’m going to highlight some of the best new features in this blog post.

HTTP+SSL Support

Adam Hunter did an awesome job implementing support for HTTPS, including navigating the various idiosyncrasies of HTTP client libraries. If you’ve got HTTPS turned on in Riak, or a reverse-proxy in front that provides SSL, it’s easy to set up.

```ruby
# Turn on SSL
client = Riak::Client.new(:ssl => true)

# Alternatively, provide the protocol
client = Riak::Client.new(:protocol => "https")

# Want to be a good SSL citizen? Use a client certificate.
# This can be used to authenticate clients automatically on the server-side.
client.ssl = {:pem_file => "/path/to/pemfile"}

# Use the CA chain for server verification
client.ssl = {:ca_path => "/path/to/ca_cert/dir"}

# All three of the above options will invoke "peer" verification.
# Use "none" verification only if you're lazy. This is the default
# if you don't specify a client certificate or CA.
client.ssl = {:verify_mode => "none"}
```

Adam also added HTTP Basic authentication for those who use it on their reverse-proxy servers. It can be set with the :basic_auth option/accessor as a string of “user:password”.
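For example:

```ruby
# Pass Basic credentials through to a reverse proxy in front of Riak
client = Riak::Client.new(:basic_auth => "user:password")
```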

Protocol Buffers

Riak has had a Protocol Buffers-based client API for a long time, but the state of Protocol Buffers support in Ruby was, until recently, very poor. Thanks to Blake Mizerany’s “Beefcake” library, it was really simple to add support in a cross-platform way. While the new transport isn’t insanely faster across the board, the decreased overhead for many operations can make a big difference in the long run. Check out these benchmarks (run on MRI 1.9.2, comparing against the Excon backend):

```
                               user     system      total        real
http  ping                 0.020000   0.010000   0.030000 (  0.084994)
pbc   ping                 0.000000   0.000000   0.000000 (  0.007313)
http  buckets              0.010000   0.000000   0.010000 (  0.894827)
pbc   buckets              0.000000   0.000000   0.000000 (  0.864926)
http  get_bucket           0.480000   0.020000   0.500000 (  1.075365)
pbc   get_bucket           0.170000   0.030000   0.200000 (  0.271493)
http  set_bucket           0.060000   0.000000   0.060000 (  0.660926)
pbc   set_bucket           0.030000   0.000000   0.030000 (  0.579500)
http  store_new            0.710000   0.040000   0.750000 (  2.443635)
pbc   store_new            0.630000   0.030000   0.660000 (  1.382278)
http  store_key            0.730000   0.040000   0.770000 (  2.779741)
pbc   store_key            0.580000   0.020000   0.600000 (  1.539332)
http  fetch_key            0.690000   0.030000   0.720000 (  2.014679)
pbc   fetch_key            0.410000   0.030000   0.440000 (  0.948865)
http  keys                 0.300000   0.090000   0.390000 ( 78.455719)
pbc   keys                 0.530000   0.020000   0.550000 (  0.828484)
http  key_stream           0.200000   0.010000   0.210000 (  0.689116)
pbc   key_stream           0.530000   0.010000   0.540000 (  0.833347)
```

Adding Protocol Buffers required a breaking change in the Riak::Client class, namely that the port setting/accessor was split into http_port and pb_port. If you still have this setting in a configuration file or your code, you will receive a deprecation warning.

```ruby
# Use Protocol Buffers! (default port is 8087)
client = Riak::Client.new(:protocol => "pbc", :pb_port => 8087)

# Use HTTP and Protocol Buffers in parallel, too! (Luwak and Search require HTTP)
client.store_file("bigpic.jpg", "image/jpeg", File.open("images/bigpic.jpg", "rb"))
```

**Warning**: Because some operations (namely get_bucket_props and set_bucket_props) are not semantically equivalent on both interfaces, you might run into some unexpected problems. I have been assured that these differences will be fixed soon.

MapReduce Improvements

Streaming MapReduce is now supported on both protocols. This lets you handle results as they are produced by MapReduce rather than waiting for all the results to be accumulated. Unlike the traditional mode, you will be passed a phase number in addition to the data from the phase. Like key-streaming, just give a block to the run method.

```ruby
# Make a MapReduce job like usual.
Riak::MapReduce.new(client).
  add("people", "sean").
  link(:tag => "friend").
  map("Riak.mapValuesJson", :keep => true).
  run do |phase, data| # Streaming!
    puts data.inspect
  end
```

MapReduce key-filters, which were available in the beta releases of 0.9, are a new feature in Riak 0.14 that lets you reduce the number of keys fed to your MapReduce query by filtering on the key names. Here’s an example:

```ruby
# Blockish builder syntax for key-filters
Riak::MapReduce.new(client).
  filter("posts") do
    tokenize "-", 1 # Split the key on dashes, take the first token
    string_to_int   # Convert the token to an integer
    eq 2011         # Only pass the ones from 2011
  end

# Same as above, without builder syntax
Riak::MapReduce.new(client).
  add("posts", [["tokenize", "-", 1],
                ["string_to_int"],
                ["eq", 2011]])
```

Ripple::Document Improvements

Ripple::Document models got a lot of small improvements (a brief example follows the list), including:

  • Callback ordering was fixed.
  • Documents can be serialized to JSON e.g. for API responses.
  • Client errors bubble up when saving a Document.
  • Several association proxy bugs were fixed.
  • The datetime serialization format defaults to ISO8601 but is also configurable.
  • Mass-attribute-assignment protection was added, including protecting :key by default.
  • Embedded documents can be compared for equality, which amounts to attribute equality when under the same parent document.
  • Documents can now have observer classes which can also be generated by the ripple:observer generator.
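To make a couple of those concrete, here is a minimal sketch of a model (names and options are illustrative):

```ruby
require 'ripple'

class Person
  include Ripple::Document
  property :name,      String, :presence => true
  property :joined_at, Time,   :default => proc { Time.now }
end

person = Person.new(:name => "Sean") # mass assignment; :key is protected
person.save                          # client errors now bubble up on failure
person.to_json                       # serialize, e.g. for an API response
```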

Testing Improvements

In order to make sure that the client layer is sufficiently independent of transport semantics and that the lower layers comply with the “unified” backend API, there is a new suite of integration tests for riak-client that covers operations that are supported by both transport mechanisms. This should make it much easier to implement new client backends in the future.

The Riak::TestServer was made faster and more reliable by a few changes to the Erlang bits that power it.

Onward to 1.0

Recently the committers and some members of the community joined me in discussing some key features that need to be in Ripple before it reaches “1.0” status. Some of them will be really incredible, and I’m anxious to get started on them:

  • Enhanced Document querying (scopes, indexing, lazy loading, etc)
  • User-defined sibling-resolution policies, with automatic retries
  • Enhanced Riak Search features
  • Platform-specific Protocol Buffers drivers (MRI C, JRuby, Rubinius C++)
  • Server templates for creating development clusters (extracted from Riak::TestServer)

To that end, I’ve created a 0.9-stable branch which will only receive bugfixes going forward. All new development for 1.0 will be done on master. We’re likely to break some legacy APIs, so we will try to add deprecation notices to the 0.9 series where possible.

Enjoy this latest release of Ripple!

Sean and contributors

Riak 0.10 is full of great stuff

April 23, 2010

give the people what they want

We’ve received a lot of feedback in the past few months about the ways that Riak already serves people well, and the ways that they wish it could do more for them. Our latest release is an example of our response to that feedback.

Protocol Buffers

Riak has always been accessible via a clean and easy-to-use HTTP interface. We made that choice because HTTP is unquestionably the most well-understood and well-deployed protocol for data transfer. This has paid off well by making it simple for people to use many languages to interact with Riak, to get good caching behavior, and so on. However, that interface is not optimized for maximum throughput. Each request needs to parse several unknown-length headers, for example, which imposes a bit of load when you’re pointing a firehose of data into your cluster.

For those who would rather give up some of the niceties of HTTP to get a bit more speed, we have added a new client-facing interface to Riak. That interface uses the “protocol buffers” encoding scheme originally created by Google. We are beginning to roll out client libraries with support for the new interface, starting with Python and Erlang and soon encompassing several other languages; you can expect them to trickle out over the next couple of weeks. Initial tests show throughput improving by a healthy multiple on some workloads when switching to the new interface. We are likely to release benchmarks demonstrating this soon. Give it a spin and let us know what you think.

Commit Hooks

A number of users (and a few key potential customers) have asked us how to either verify some aspects of their data (schemas, etc.) on the way into Riak, or take some action (on a secondary object or otherwise) as a result of having stored it. Basically, people seem to want stored procedures.

Okay, you can have them.

Much like with our map/reduce functionality, your own functions can be expressed in either Erlang or JavaScript. As with any database’s stored procedures, you should keep them as simple as possible, or else you might place undue load on the cluster when trying to perform a lot of writes.
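For example, a JavaScript pre-commit hook that validates incoming data might look something like this sketch (the MyHooks namespace is hypothetical, and the hook still needs to be registered in the bucket’s precommit property):

```javascript
// Reject any write whose value isn't valid JSON. Returning the object
// lets the write proceed; returning {"fail": reason} rejects it.
var MyHooks = {
  validateJson: function (object) {
    try {
      JSON.parse(object.values[0].data);
      return object;
    } catch (e) {
      return { fail: "value is not valid JSON" };
    }
  }
};
```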

Faster Key Listings

Listing all of the keys in a Riak bucket is fundamentally a bit more of a pain than any of the per-document operations, as it has to collect and coordinate responses from many nodes. However, it doesn’t need to be all that bad.

The behavior of list_keys in Riak 0.10 is much faster than in previous releases, due both to more efficient tracking of vnode coverage and also to a much faster bloom filter. The vnode coverage aspect also makes it much more tolerant of node outages than before.

If you do use bucket key listings in your application, you should always do so in streaming mode (“keys=stream” query param if via HTTP) as doing otherwise necessitates building the entire list in memory before sending it to the client.
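Via HTTP that looks something like this (bucket name and port are illustrative):

```bash
# Stream keys back in chunks instead of materializing the full list
curl 'http://localhost:8098/riak/mybucket?keys=stream&props=false'
```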

Cleanliness and Modularity

A lot of other work went into this release as well. The internal tracking of dependencies is much cleaner, for those of you building from source (instead of just grabbing a pre-built release). We have also broken apart the core Erlang application into two pieces. There will be more written on the reasons and benefits of that later, but for now the impact is that you probably need to make some minor changes to your configuration files when you upgrade.

All in all, we’re excited about this release and hope that you enjoy using it.

- Justin