August 8, 2010

Thank you to those who attended our Rails-oriented webinar yesterday. As before, we’re recapping the questions below for everyone’s benefit (in no particular order).

Q: When you have multiple application servers and Riak nodes, how do you handle “replication lag”?

Most web applications have some element of eventual consistency (or potential inconsistency) in them by their nature. Object and view caches sacrifice immediate consistency for gains in throughput and latency, and hopefully provide a better user experience. With Riak, you can achieve acceptable data freshness by “reading your writes”. That is, use the same read quorum as your write quorum and make sure that R+W is greater than N, so that every read quorum overlaps the most recent write quorum on at least one replica. For example, using R=W=DW=2 when N=3 will give a strong assurance of consistency.
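As a rough illustration of the R+W > N rule, here is a minimal sketch in plain Ruby. The helper name `quorums_overlap?` is hypothetical and is not part of Riak or Ripple; it only does the arithmetic.

```ruby
# Hypothetical helper (not a Riak or Ripple API): returns true when any read
# quorum of size r must share at least one replica with any write quorum of
# size w, given n replicas in total.
def quorums_overlap?(n:, r:, w:)
  unless [r, w].all? { |q| q.between?(1, n) }
    raise ArgumentError, "r and w must be between 1 and n"
  end
  r + w > n
end

quorums_overlap?(n: 3, r: 2, w: 2) # => true
quorums_overlap?(n: 3, r: 1, w: 1) # => false
```

With N=3, a read of 2 and a write of 2 cannot both avoid the same replica, which is why the read is likely to see the latest write.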

Q: I find myself doing def key; id; end. Is there any easier way to tell Ripple the key?

Currently there is not. However, I’ve found myself using this pattern frequently when I want a meaningful key that is also an attribute. There’s an issue on the tracker just for this feature. In the meantime, you could use two method aliases:

    class User
      include Ripple::Document
      property :email, String, :presence => true

      # This forces all attribute methods to be defined
      alias_method :key, :email
      alias_method :key=, :email=
    end

As long as your property is a string, this should work just fine.

Q: Any tips on how to handle pagination over MapReduce queries?

The challenge with pagination in Riak is that reduce phases are not guaranteed to run only once. Instead, they run in parallel as results from the previous phase arrive asynchronously, followed by a final reduce over the partial results. So in a sense, you have to treat every invocation of your reduce function as a “re-reduce”. We have plans to allow reduce phases to specify that they should be run only once, but for now you can work around this limitation.
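To see why this matters for pagination, here is a small local model in plain Ruby. It does not call Riak; `run_with_rereduce` is a hypothetical stand-in that applies a reduce function to partial batches and then once more to the combined partial results.

```ruby
# Local model only (not the Riak API): apply the given block to each batch of
# inputs, then apply it again to the concatenated partial results.
def run_with_rereduce(inputs, batch_size)
  partials = inputs.each_slice(batch_size).flat_map { |batch| yield(batch) }
  yield(partials)
end

values = [5, 3, 9, 1, 7, 2]

# Sorting tolerates re-reduce: sorting already-sorted batches again still
# yields the fully sorted list.
sorted = run_with_rereduce(values, 2) { |batch| batch.sort }
# sorted == [1, 2, 3, 5, 7, 9]

# Slicing does not: each batch is sliced before the final pass, so the final
# slice only sees leftovers instead of the full sorted set.
paged = run_with_rereduce(values, 3) { |batch| batch.sort[2, 2] || [] }
# paged == [], whereas the intended second page of two is [3, 5]
```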

Reduce phases are always run on the coordinating node, so if you put a reduce phase before the one where you want to perform pagination, you are pretty much guaranteed that the whole result set is going to be available in a single application of the final reduce. A typical combination would be a “sorting” phase followed by a “pagination” phase. Riak.reduceSort and Riak.reduceSlice are two built-in functions that could help accomplish this task.
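The sort-then-slice combination can be sketched locally in plain Ruby as follows. The method names are hypothetical stand-ins for the two phases, and the start/end slice bounds are an assumption about how Riak.reduceSlice reads its argument, so check the built-in before relying on it.

```ruby
# Hypothetical stand-in for a "sorting" reduce phase.
def sort_phase(values)
  values.sort
end

# Hypothetical stand-in for a "pagination" reduce phase. Assumes the bounds
# are a start index (inclusive) and an end index (exclusive).
def slice_phase(values, start, finish)
  values[start...finish] || []
end

def paginate(batches, page:, per_page:)
  # The sorting phase may run on partial batches and again on their combined
  # output; the result is the same either way.
  sorted = sort_phase(batches.flat_map { |batch| sort_phase(batch) })
  # The pagination phase then sees the whole sorted set in one application.
  start = (page - 1) * per_page
  slice_phase(sorted, start, start + per_page)
end

paginate([[5, 3, 9], [1, 7, 2]], page: 2, per_page: 2) # => [3, 5]
```

Because the slice is applied only after the full sorted set has been gathered, the page boundaries refer to positions in the complete result rather than in a partial batch.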

Sean and Grant