January 27, 2014
On the official Basho docs, we compare Riak to multiple other databases. We are currently working on updating these comparisons, but, in the meantime, we wanted to provide a more up-to-date comparison for one of the more common questions we’re asked: How does Riak compare to Cassandra?
Of all the widely deployed data storage technologies, Cassandra most closely resembles Riak. Cassandra and Riak have architectural roots in Amazon’s Dynamo, the system Amazon engineered to handle their highly available shopping cart service. Both Riak and Cassandra are masterless, highly available stores that persist replicas and handle failure scenarios through concepts such as hinted handoff and read-repair. However, there are certain key differences between the two that should be considered when evaluating them.
Amazon’s Dynamo utilized a Key/Value data model. Early in Cassandra’s development, a decision was made to diverge from keys and values toward a wide-row data model (similar to Google’s BigTable). This means Cassandra is a Key/Key/Value store, which includes the concept of Column Families that contain columns. With this model, Cassandra is able to handle high write volumes, typically by appending new data to a row. This also allows Cassandra to perform very efficient range queries, with the tradeoff being a more rigid data model since rows are “fixed” and non-sequential read operations often require several disk seeks.
On the other hand, Riak is a straight Key/Value store. We believe this offers the most flexibility for data storage. Riak’s schemaless design has zero restrictions on data type, so an object can be a JSON document at one moment and a JPEG at the next.
Like Cassandra, Riak also excels at high write volumes. Range queries can be a little more costly, though still achievable through Secondary Indexes. In addition, there are a number of data modeling tips and tricks for Riak that make it easy to expose access to data in ways that sometimes aren’t as obvious at first glance.
In Riak, multi-datacenter replication is achieved by connecting independent clusters, each of which owns its own hash ring. Operators have the ability to manage each cluster and select all or part of the data to replicate across a WAN. Multi-datacenter replication in Riak features two primary modes of operation: full sync and real-time. Data transmitted between clusters can be encrypted via OpenSSL out-of-the-box. Riak also allows for per-bucket replication for more granular control.
Cassandra achieves replication across WANs by splitting the hash ring across two or more clusters, which requires operators to manually define a NetworkTopologyStrategy, Replication Factor, a Replication Placement Strategy, and a Consistency Level for both local and cross data center requests.
Conflict Resolution and Object Versioning
Cassandra uses wall clock timestamps to establish ordering. The resolution strategy in this case is Last Write Wins (LWW), which means that data may be overwritten when there is write contention. The odds of data loss are magnified by (inevitable) server clock drift. More details on this can be found in the blog, “Clocks are Hard, or, Welcome to the Wonderful World of Distributed Systems.”
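To make the risk concrete, here is a minimal Python sketch of timestamp-based Last Write Wins. It is not Cassandra's implementation: the `Write` type, the `lww_merge` function, and the 200 ms drift figure are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp_ms: int  # wall-clock time reported by the server that took the write


def lww_merge(a: Write, b: Write) -> Write:
    """Keep whichever write carries the larger timestamp (ties go to a)."""
    return a if a.timestamp_ms >= b.timestamp_ms else b


# Hypothetical scenario: server B's clock runs 200 ms behind server A's.
DRIFT_MS = 200
true_time_ms = 1_000_000

# The first write goes through server A.
first = Write("cart=[book]", timestamp_ms=true_time_ms)
# 50 ms later in real time, a newer write goes through server B,
# but B stamps it with its lagging clock.
second = Write("cart=[book, lamp]", timestamp_ms=true_time_ms + 50 - DRIFT_MS)

winner = lww_merge(first, second)
print(winner.value)
```

Under these assumptions the older write wins, because the newer one was stamped by the slower clock, and the update that added the lamp is silently discarded.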
Riak uses a data structure called vector clocks to track the causal ordering of updates. This per-object ancestry allows Riak to identify and isolate conflicts without using system clocks.
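As a rough illustration of the idea, a vector clock can be modeled as a map from an actor to the number of updates seen from that actor. The sketch below is a simplified model, not Riak's internal representation; the function names and the `client_*` actor ids are hypothetical.

```python
from typing import Dict

VClock = Dict[str, int]  # actor id -> number of updates seen from that actor


def increment(clock: VClock, actor: str) -> VClock:
    """Return a copy of the clock with the actor's counter advanced by one."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new


def descends(a: VClock, b: VClock) -> bool:
    """True if a has seen every update recorded in b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a: VClock, b: VClock) -> bool:
    """True if neither clock descends from the other, i.e. a conflict."""
    return not descends(a, b) and not descends(b, a)


base = increment({}, "client_x")      # original object
left = increment(base, "client_y")    # one client updates it
right = increment(base, "client_z")   # another client updates the same version

print(descends(left, base))     # left is a successor of base
print(concurrent(left, right))  # left and right conflict
```

Because the comparison uses only per-actor counters, no server clock is involved in deciding whether two versions conflict.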
In the event of a concurrent update to a single key, or a network partition that leaves application servers writing to Riak on both sides of the split, Riak can be configured to keep all writes and expose them to the next reader of that key. In this case, choosing the right value happens at the application level, allowing developers to either apply business logic or some common function (e.g. merge union of values) to resolve the conflict. From there, that value can be written back to Riak for its key. This ensures that Riak never loses writes.
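The "merge union of values" strategy mentioned above might look like the following sketch. It assumes the application stores a set of cart items per key and has already fetched the conflicting versions; the `resolve_siblings` name is hypothetical and no Riak client API is shown.

```python
from typing import FrozenSet, Iterable


def resolve_siblings(siblings: Iterable[FrozenSet[str]]) -> FrozenSet[str]:
    """Merge conflicting versions of a set-valued object by taking their union."""
    merged: FrozenSet[str] = frozenset()
    for sibling in siblings:
        merged |= sibling
    return merged


# Two versions of the same cart, written on opposite sides of a partition.
siblings = [frozenset({"book", "lamp"}), frozenset({"book", "mug"})]
resolved = resolve_siblings(siblings)
print(sorted(resolved))
```

A union never drops an addition, but it also cannot express a removal, which is one reason the choice of resolution function is left to the application.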
Riak Data Types, first introduced in Riak 1.4 and expanded in the upcoming Riak 2.0, are designed to converge automatically. This means Riak will transparently manage the conflict resolution logic for concurrent writes to objects.
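To give a feel for what "converge automatically" means, here is a sketch of a grow-only counter, one of the simplest convergent data types. It is an illustrative model only, not Riak's implementation; the `GCounter` class and the node names are assumptions.

```python
from typing import Dict, Optional


class GCounter:
    """Grow-only counter: each node increments its own slot; merge takes the max."""

    def __init__(self, counts: Optional[Dict[str, int]] = None) -> None:
        self.counts: Dict[str, int] = dict(counts or {})

    def increment(self, node: str, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[node] = self.counts.get(node, 0) + amount

    def merge(self, other: "GCounter") -> "GCounter":
        """Combine two replicas by keeping the largest count seen per node."""
        nodes = self.counts.keys() | other.counts.keys()
        return GCounter(
            {n: max(self.counts.get(n, 0), other.counts.get(n, 0)) for n in nodes}
        )

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two replicas accept increments independently during a partition.
replica_a, replica_b = GCounter(), GCounter()
replica_a.increment("node_a", 3)
replica_b.increment("node_b", 2)

merged = replica_a.merge(replica_b)
print(merged.value)
```

Because the merge is commutative and idempotent, replicas reach the same value regardless of the order in which they exchange state, so no application-level conflict handling is needed for this type.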
In the event of server failures and network problems, Riak is designed to always accept read and write requests, even if the servers that are ordinarily responsible for that data are unavailable.
Cassandra will allow writes to (optionally) be stored on alternative servers, but will not allow that data to be retrieved. Only after the cluster is repaired and those writes are handed off to an appropriate replica server (with the potential data loss that timestamp-based conflict resolution implies, as discussed earlier) will the data that was written be available to readers.
Imagine a user working with a shopping cart when the application is unable to connect to the primary replicas. The user can re-add missing items to the cart but will never actually see the items show up in the cart (unless the application performs its own caching, which introduces more layers of complexity and points of failure).
When handling missing or divergent/stale data, Riak and Cassandra have many similarities. Both employ a passive mechanism where read operations trigger the repair of inconsistent replicas (known as read-repair). Both also use Active Anti-Entropy, which builds a modified Merkle tree to track changes or new inserts on a per hash-ring-partition basis. Since the hash rings contain overlapping keys, the trees are compared and any divergent or missing data is automatically repaired in the background. This can be incredibly effective at combating problems such as bitrot, since Active Anti-Entropy does not need to wait for a read operation.
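The hash tree comparison can be sketched in a few lines. This is a deliberately small two-level tree over in-memory dictionaries, not the structure either database uses; `SEGMENTS`, `build_tree`, and `divergent_keys` are hypothetical names, and a real tree would have many more leaves and levels.

```python
import hashlib
from typing import Dict, List, Set, Tuple

SEGMENTS = 8  # hypothetical leaf count, kept tiny for readability


def _h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def segment_of(key: str) -> int:
    """Assign a key to a leaf segment by hashing it."""
    return int(_h(key.encode()), 16) % SEGMENTS


def build_tree(replica: Dict[str, bytes]) -> Tuple[str, List[str]]:
    """Return (root hash, leaf hashes) for one replica's copy of a partition."""
    buckets: List[List[str]] = [[] for _ in range(SEGMENTS)]
    for key in sorted(replica):
        buckets[segment_of(key)].append(key + ":" + _h(replica[key]))
    leaves = [_h("|".join(bucket).encode()) for bucket in buckets]
    return _h("".join(leaves).encode()), leaves


def divergent_keys(a: Dict[str, bytes], b: Dict[str, bytes]) -> Set[str]:
    """Compare roots first, then leaves, and only then individual keys."""
    root_a, leaves_a = build_tree(a)
    root_b, leaves_b = build_tree(b)
    if root_a == root_b:
        return set()  # replicas agree; nothing to repair
    bad = {i for i in range(SEGMENTS) if leaves_a[i] != leaves_b[i]}
    candidates = {k for k in a.keys() | b.keys() if segment_of(k) in bad}
    return {k for k in candidates if a.get(k) != b.get(k)}


replica_a = {"k1": b"v1", "k2": b"v2", "k3": b"v3"}
replica_b = {"k1": b"v1", "k2": b"stale"}  # k2 is stale, k3 is missing
print(sorted(divergent_keys(replica_a, replica_b)))
```

The point of the tree is that matching roots let two replicas confirm agreement with a single comparison, and a mismatch narrows the search to a few segments instead of every key.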
The key difference in implementation is that Cassandra uses short-lived, in-memory hash trees that are built per Column Family and generated as snapshots during major data compactions. Riak’s trees are on-disk and persistent. Persistent trees are safer and more conducive to ensuring data integrity across much larger datasets (e.g. 1 billion keys could easily cost 8-16GB of RAM in Cassandra versus 8-16GB of disk in Riak).
Both Cassandra and Riak are eventually consistent, scalable databases that have strengths for specific use cases. Each has hundreds of thousands of hours of engineering invested and the commercial backing and support offered by their respective companies, DataStax and Basho. At Basho, we have labored to make Riak very robust and easy to operate at both large and small scale. For more information on how Riak is being used, visit our Riak Users page. For a look at what’s to come, download the Technical Preview of Riak 2.0.
December 30, 2013
2013 was a huge year for Basho Technologies and before we dive into 2014, we thought we’d take a moment to reflect on how far we’ve come.
2013 was the year of the Riak User. We love hearing about all the amazing ways companies across various industries are using Riak. This year, we were able to share dozens of exciting case studies. These include:
- Synacor’s TV Everywhere platform
- Enstratius (acquired by Dell)
- Best Buy
- Alert Logic
- Viggle (through OmniTI)
- Turner Broadcasting
- Hosted Graphite
- Gilt Groupe
- Praekelt Foundation
- National Health Service
- City Maps
- The Weather Company
For even more Riak Users, check out the Users Page.
We released Riak 1.3, Riak 1.4, and the Technical Preview of Riak 2.0 this year. These releases added such features as Active Anti-Entropy, revamped Riak Control, queryability improvements, Riak Data Types, and much more. Be on the lookout for the general release of Riak 2.0 early next year.
This year, we expanded RICON, Basho’s distributed systems conference, to both RICON East and RICON West. These were both sold out conferences that featured speakers from bitly, Comcast, Google, Netflix, Salesforce, The Weather Company, Turner Broadcasting, Twitter, and many more.
We drastically increased the number of Basho partners in 2013. For a full list of partners, check out the Partnerships Page. Some key ones to note include Tokyo Electron Device, SoftLayer, and Seagate.
Our amazing community team hosted over 200 meetups around the world this year. On top of that, they also attended dozens of industry events to spread the word about Basho. Keep an eye on the Events Page to see where we’ll be in 2014.
2013 was a busy year but, with some exciting announcements coming, we look forward to an even busier 2014. Happy New Year!
December 18, 2013
Downtime, planned or unplanned, is no longer an option. It can have a dramatic impact on revenue and lead to negative customer experiences and attrition. Luckily, distributed NoSQL databases (such as Basho Riak) are designed to provide high availability, even during network partition or server failure. This means there will never be an excuse for downtime again.
To help demonstrate the cost of downtime and how Riak can help, we have put together an infographic, “Down With Downtime.”
October 28, 2013
The technology community is extremely agile and fast-paced. It can turn on a dime to solve business problems as they arise. However, with this agility comes budding terminology that can often provide false categorizations. This can lead to confusion, especially when companies evaluate new technologies based on a surface understanding of these terms. The world of data is full of these terms, including the notorious “NoSQL” and “big data.”
As described in a previous post, NoSQL is a misleading term. This term represents a response to changing business priorities that require more flexible, resilient architectures (as opposed to the traditional, rigid systems that often happen to use SQL). However, within the NoSQL space, there are dozens of players that can be as different from one another as they are from any of the various SQL-speaking systems.
Big data is another term that, while fairly self-explanatory, has been overused to the point of dilution. One reason NoSQL databases have become necessary is their ability to scale easily to keep up with data growth. Simply storing a lot of data isn’t the solution though. Some data is more critical than other data (and should be accessible no matter what) and some data needs to be analyzed to provide business insights. When digging into a business, big data is too vague a term to describe both of these use cases.
The loose use of these terms (to highlight just a few) can lead to industry confusion. One area of confusion that we have experienced relates to Basho’s own distributed database, Riak, and the distributed processing system, Hadoop.
While these two systems are actually complementary, we are often asked “How is Riak different from Hadoop?”
To help explain this, it’s important to start with a basic understanding of both systems. Riak is a distributed database that is built for high availability, fault tolerance, and scalability. It is best used to store large amounts of critical data that applications and users need to constantly be able to access. Riak is built by Basho Technologies and can be used as an alternative to, or in conjunction with, relational databases (such as MySQL) or other “NoSQL” databases (such as MongoDB or Cassandra).
Hadoop is a framework that allows for the distributed parallel processing of large data sets across clusters of computers. It was originally based on the “MapReduce” system, which was invented by Google. Hadoop consists of two core parts: the underlying Hadoop Distributed File System (HDFS), which ensures stored data is always available to be analyzed, and MapReduce, which allows for scalable computation by dividing and running queries over multiple machines. Hadoop provides an inexpensive, scalable solution for bulk data processing and is mostly used as part of an overarching analytics strategy, not for primary “hot” data storage.
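For readers unfamiliar with the MapReduce model, the classic word count shows its three stages. The sketch below runs in a single Python process and does not use any Hadoop API; the function names are hypothetical, and in a real cluster the map and reduce steps would run in parallel across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Collapse each key's values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["riak stores data", "hadoop processes data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts["data"])
```

Each document can be mapped independently and each key can be reduced independently, which is what allows the work to be divided over multiple machines.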
One easy way to distinguish between the two is to look at some of the common use cases.
Riak Use Cases
Riak can be used by any application that needs to always have access to large amounts of critical data. Riak uses a key/value data model and is data-type agnostic, so operators can store any type of content in Riak. Due to the key/value model, certain industry use cases fit easily into Riak. These include:
- Gaming – storing player data, session data, etc
- Retail – underpinning shopping carts, product inventories, etc
- Mobile – social authentication, text and multimedia storage, global data locality, etc
- Advertising – serving ad content, session storage, mobile experiences, etc
- Healthcare – prescription or patient records, patient IDs, health data that must always be available across a network of providers, etc
For a full list of use cases, check out our Users Page.
Hadoop Use Cases
Hadoop is designed for situations where you need to store unmodeled data and run computationally intensive analytics over that data. The original use cases of both MapReduce and Hadoop were to produce indexes for distributed search engines at Google and Yahoo respectively. Any industry that needs to do large scale analytics to improve its business can use Hadoop. Some common examples include finance (build models to do accurate portfolio evaluations and risk analysis) and eCommerce (analyze shopping behavior to deliver product recommendations or better search results).
Riak and Hadoop are based on many of the same tenets, making their usage complementary for some companies. Many companies that utilize Riak today have created scripts, or processes, to pull data from Riak and push into other solutions (like Hadoop) for the purpose of historical archiving or future analysis. Recognizing this trend, Basho is exploring the creation of additional tools to simplify this process.
If you are interested in our thinking on these data export capabilities, please contact us.
Every tool has its value. Hadoop excels at being used by a relatively small subset of the business to answer big questions. Riak excels at being used by a very large number of users and powering critical data for businesses.
October 22, 2013
Today, Seagate has announced the availability of their Kinetic Open Storage platform, which simplifies data management, improves performance and scalability, all while lowering expenses. This fundamentally new architecture reduces costs by allowing applications to communicate directly with the storage system, eliminating the acquisition, deployment, and support costs of hyperscale storage infrastructures.
Basho has partnered with Seagate to help them develop this platform to provide interoperability and testing with Riak. Now, with the release of this platform, we want to make it easier for developers to test the Kinetic Open Storage platform with Riak. We have just released an alpha version of our eKinetic driver, which enables an Erlang-based high-performance socket connection to the drive. We have also released software to improve Riak backend compatibility by mapping a Riak backend to the drive library. Both are available for download at https://github.com/basho-labs/riak_kinetic.
Not only does deploying Riak on this platform drastically simplify the management of data through a straightforward socket-based network interface, this simplification also increases I/O efficiency by removing bottlenecks and optimizing cluster management, data replication, migration, and active multi-datacenter performance. Additionally, it is expected that users will realize up to a 50% decrease in the Total Cost of Operations through simplified operations alone. Users can also maximize storage density through reduced power and cooling costs and build out cloud datacenters for even more savings.
Seagate Principal Technologist, James Hughes, will be speaking about the Kinetic Open Storage platform and Riak at RICON, Basho’s distributed systems conference. His talk, “Device Based Innovation to Enable Scale-Out Storage” will take place on October 29th at 12pm in Track Two. Seagate is also a sponsor of RICON.
October 2, 2013
ooVoo has over 85 million users worldwide, with nearly 2.5 million users generating an average of 300 million minutes of video every day. These users also generate about 1,000 chat messages per second. With all of this activity, ooVoo adds nearly 40GB of data per day and now maintains tens of terabytes of data.
In 2012, ooVoo selected Riak to deploy new communication features and ensure no single point of failure in their always-available architecture. Today, Riak is used to support cloud-based chat history, rich interactive chat, group communication features, and an infinite retention policy.
In the webinar, ooVoo Senior Director and System Architect, Alex Fok, discusses their business requirements, architecture decisions, and business results from their deployment of Riak. The webinar is available here and can also be viewed below.
October 2, 2013
What Is Riak CS?
In May of this year, we posted the top 5 questions we heard from customers and our community about Riak CS; today we’ll take a deeper dive into the technical details, specifically the differences between Riak CS and Riak itself.
Riak CS as Compared to Riak
Both Riak CS and Riak are, at their core, places to store objects. Both are open source and both are designed to be used in a cluster of servers for availability and scalability.
The fundamental distinction between the two is simple: Riak CS can be used for storing very large objects, into the terabyte size range, while Riak is optimized for fast storage and retrieval of small objects (typically no more than a few megabytes).
There are subtle differences, however, that can be obscured by the similarities between the two.
Why Would I Use Riak CS?
Riak CS is used for a variety of reasons. Some examples:
- Private object storage services, for example for companies that want to store sensitive data behind their own firewalls.
- Large binary object storage as part of a voice or video service.
- An integrated component in an OpenStack cloud solution, storing and serving VM images on demand.
Tier 3, Yahoo! Japan, Datapipe, and Turner Broadcasting are just a few of the big names using Riak CS today.
What Does Riak CS Do That Riak Doesn’t?
Riak CS carves large objects into small chunks of data to be distributed throughout a Riak cluster and, when used with Riak CS Enterprise, synchronized with remote data centers.
Riak CS adds compatibility with Amazon’s S3 and OpenStack’s Swift APIs. These offer very different semantics than Riak, and the advanced search capabilities in Riak such as Secondary Indexes and full text search are not available using S3 or Swift clients.
We strongly advise against it, but it is possible to work with Riak’s standard APIs “under the hood” when deploying a Riak CS solution.
Work is actively underway to add a security model to Riak in the upcoming 2.0 release.
Buckets or Buckets?
Users of Riak CS store their objects in virtual containers (called buckets in Amazon S3 parlance, containers in OpenStack).
Riak also relies heavily on buckets for data storage and configuration but, despite the names, these buckets are not the same.
As an example of how this can cause confusion: the replication factor in Riak (the number of times a piece of data is stored in a cluster) is configurable per-bucket. Because Riak’s buckets do not underlie the user buckets in Riak CS, this feature cannot be used to create tiered services.
Riak is designed to maximize availability; the price paid for that is delayed consistency when the network is split and clients are writing to both sides of the cluster.
Creating user accounts in Riak CS, however, led to the need for a mechanism to maintain strong consistency. If two people attempt to create user accounts with the same username on either side of a network partition, both cannot be allowed to succeed, or else a conflict will occur that is very difficult to automatically recover from.
Furthermore, user buckets in S3 (and OpenStack APIs as implemented in Riak CS) reside in a global rather than a user-specific namespace, so bucket creation must also be handled carefully.
Riak CS introduced a service named Stanchion that is designed to handle these specific requests to avoid conflicts. Stanchion is a single process running on a single Riak server (thus introducing a single point of failure for user account and bucket creation requests).
While it is possible to deploy Stanchion using common system tools to make a daemon process run in a highly available manner, Basho recommends doing so carefully and testing it thoroughly. Since the only impact of failure is to prevent user and bucket creation, it may be preferable to monitor and alert on failure. If two copies of Stanchion are running due to a network partition, its strong consistency guarantees will be lost.
With strong consistency options targeted for Riak 2.0, expect to see some changes.
Basho offers multi-datacenter replication with its Enterprise software licenses, and Riak CS Enterprise takes full advantage of that feature. Data can be written to one or more clusters in multiple data centers and be synchronized automatically between them.
There are two types of synchronization: real-time, which occurs as objects are written, and full sync, which happens on a periodic basis to compare the full contents of each cluster for any changes to be merged.
One key difference is that Riak CS maintains manifest files to track the chunks it creates, and it is these manifests that are distributed between clusters during real-time sync. The individual chunks are not synchronized until a full sync replication occurs, or until someone requests the file from a remote cluster. The manifest is made active for someone to retrieve the chunks after the original upload to the source cluster is complete.
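The relationship between chunks and a manifest can be illustrated with a small sketch. It models storage as a plain dictionary and is not Riak CS code: `store_object`, `fetch_object`, the key format, and the 4-byte `CHUNK_SIZE` are assumptions picked so the example is easy to follow.

```python
import hashlib
from typing import Dict, List, TypedDict

CHUNK_SIZE = 4  # bytes; unrealistically small, for illustration only


class Manifest(TypedDict):
    name: str
    size: int
    sha256: str
    chunks: List[str]


def store_object(name: str, data: bytes, chunk_store: Dict[str, bytes]) -> Manifest:
    """Split data into fixed-size chunks and return a manifest describing them."""
    chunk_keys: List[str] = []
    for offset in range(0, len(data), CHUNK_SIZE):
        key = f"{name}:{offset // CHUNK_SIZE}"
        chunk_store[key] = data[offset:offset + CHUNK_SIZE]
        chunk_keys.append(key)
    return {
        "name": name,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "chunks": chunk_keys,
    }


def fetch_object(manifest: Manifest, chunk_store: Dict[str, bytes]) -> bytes:
    """Reassemble the object by reading its chunks in manifest order."""
    return b"".join(chunk_store[key] for key in manifest["chunks"])


chunk_store: Dict[str, bytes] = {}
payload = b"0123456789"
manifest = store_object("video.bin", payload, chunk_store)
print(manifest["chunks"])
print(fetch_object(manifest, chunk_store) == payload)
```

In this model the manifest is small and lists every chunk key, which suggests why shipping only the manifest in real time is cheap: a remote cluster that holds it knows exactly which chunks to request later.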
A common mistake while installing Riak CS is to configure it using information specific to Riak rather than Riak CS. As an example, per the Riak CS installation instructions the relevant backend data store must be configured to riak_cs_kv_multi_backend, which is forked from Riak’s riak_kv_multi_backend. Using the latter will cause problems.
Riak (CS) Control
Exposure to Internet
Exposing any database directly to the Internet is risky. Riak, currently lacking any concept of authentication, absolutely must not be accessible to untrusted networks.
Riak CS, however, is designed with Internet access in mind. It is still advisable to place a load balancer or proxy in front of a Riak CS cluster, for example to ease cluster maintenance/upgrades and to provide a central location to log and block potentially hostile access.
Riak CS servers will still have open Riak ports that must be protected from the Internet as you would any Riak servers.
Where to Next for Riak CS?
2013 has been a big year for Riak CS: it was released as open source in the spring, with OpenStack support added this summer. Still, there is much to do.
As mentioned above, improving or replacing Stanchion is a high priority.
We will continue to expand the API coverage for Riak CS. The next major targets are the copy object operations that Amazon S3 and OpenStack Swift offer.
Compression and more granular replication controls are also under consideration for future releases.
By building Riak CS atop the most robust open source distributed database in the world, we’ve created a very operationally friendly, powerful storage solution that can evolve to meet present and future needs. Feel free to give it a try if you aren’t already using it.
If you’re interested in hearing from the engineers who’ve made this software possible (and seeing just how far a highly available data storage solution can take you), join us October 29-30th for RICON West. RICON West is where Basho brings together industry and academia to discuss the rapidly expanding world of distributed systems, including Riak and Riak CS.
September 30, 2013
While the biggest event of October is Basho’s distributed systems conference, RICON West, we will still be traveling the world to attend many other events this month. Here’s a look at where you can find us during the weeks leading up to RICON.
Monktoberfest: Basho’s Director of Marketing, Tyler Hannan, will be speaking at Monktoberfest on “Medieval Art, Collective Intelligence, and Language Abuse – The Ethos of Distributed Systems.” Monktoberfest will take place in Portland, ME from Oct. 3-4.
Erlang Factory Lite: Basho will have speakers at both the Chicago event (Oct. 4th) and the Berlin event (Oct. 16th). Check out talks from Chris Meiklejohn and Steve Vinoski to learn more about Riak, Erlang, and distributed systems.
CloudConnect Chicago: Basho is a sponsor and exhibitor of CloudConnect Chicago, taking place Oct. 21-23. Basho engineer, John Burwell, will also be speaking about building private clouds with Apache CloudStack and Riak CS.
O’Reilly Strata: Basho will be exhibiting and speaking at the upcoming O’Reilly Strata conference in New York from Oct. 28-30. Stop by our booth and find out why we will all be using distributed systems in the future.
September 17, 2013
The Praekelt Foundation is a non-profit that builds open source, scalable mobile technologies and solutions to improve the health and well-being of people living in poverty. Their Vumi solution was created as a response to the rapid spread of mobile phones across Africa. Vumi allows for large scale mobile messaging using SMS and USSD, so no internet connectivity is required. Vumi uses Riak as a super reliable backend to store all the messages that are being processed and all responses. This data is all archived to allow for further analysis of trends by area and of which campaigns are the most successful.
The Vumi Network reaches hundreds of thousands of end users across many countries. It works with non-governmental organizations to set up campaigns and services for emerging markets. These include education (Wikipedia uses Vumi to allow end-users to search and retrieve information from Wikipedia over SMS/USSD), health (partnering with Johnson & Johnson, the MAMA campaign (Mobile Alliance for Maternal Action) allows pregnant women to receive health information over SMS based on pregnancy stage and HIV diagnosis), peaceful messaging (Sisi Ni Amani uses Vumi to prevent election violence in Kenya through grassroots engagement and tracking of early conflict warning signs), as well as many other utilities.
They had been using Postgres for years but, when it came to storing messages, they knew Postgres was only an interim solution. Since they needed a non-relational system, they started evaluating the key players in the NoSQL space. With MongoDB, they found the durability defaults needed for a performance boost were not adequate for their zero downtime needs; CouchDB did not give them the performance they needed; and Cassandra was too operationally intensive for their small team, while Riak offered better features. When they began testing Riak, Praekelt Foundation Chief Engineer, Simon de Haan, was able to get a three-node cluster up and running on his laptop in 20 minutes. This operational simplicity, the reliability of the system, the ability to seamlessly scale to entire populations, and the range of query options made Riak a clear choice to power Vumi.
“It blew my mind how easy it was to set up Riak and was a huge selling point for our small operations team,” said de Haan. “We also needed a reliable system with solid up-time guarantees. Riak has never gone down on us and continues to survive individual restarts. The whole thing just works.”
Two years after launching Riak, they run five nodes and push 1,000 messages each second. All messages are stored as JSON in Riak, which makes it easy for them to utilize Secondary Indexing and MapReduce when querying this data. With the introduction of pagination with Secondary Indexes and Eventually Consistent Counters in Riak 1.4, they have also been able to move a lot of data from Redis over to Riak to take advantage of these new features. Additionally, The Praekelt Foundation will expand their querying capabilities later this year when Riak Search gets a makeover in the Riak 2.0 release.
The Praekelt Foundation is currently evaluating Riak and Riak CS for some of their other technologies and Basho is proud to be a part of such a great cause. For more information on the Praekelt Foundation and Vumi, visit their site at www.praekeltfoundation.org/
September 12, 2013
Superfeedr provides a real-time API to any application that wants to produce (publishers) or consume (subscribers) data feeds without wasting resources or maintaining an expensive and changing infrastructure. It fetches and parses RSS or Atom feeds on behalf of its users and new entries are then pushed to subscribing applications using a webhook mechanism (PubSubHubbub) or XMPP. The Google Reader replacement is an example of a popular API built by Superfeedr that has backed up much of Google Reader.
Riak is used by Superfeedr to store the content from all feeds so users can retrieve past content (including the Google Reader API replacement), even if the feeds themselves may not include these entries anymore. This Riak datastore is referred to as “the cave.”
When Superfeedr first built “the cave” datastore, they opted for a cluster of large Redis instances (five servers with 8GB of memory each) due to Redis’s inherent speed. However, they realized that a more durable system was needed, and the need to manually shard feeds across the cluster made it difficult to scale beyond storing a couple of entries per feed. The scaling problem turned into an even larger issue because the average size of a stored entry was 2KB. By this point, they had nearly 1,000 items per feed and 50 million feeds, translating to over 93TB of data and quickly growing.
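The 93TB figure follows directly from the numbers above. The short calculation below assumes binary units (1KB = 1,024 bytes, 1TB = 1,024^4 bytes) and counts a single copy of each entry, before any replication.

```python
ENTRY_SIZE_BYTES = 2 * 1024   # average stored entry: 2KB
ENTRIES_PER_FEED = 1_000      # "nearly 1,000 items per feed"
FEEDS = 50_000_000            # 50 million feeds

total_bytes = ENTRY_SIZE_BYTES * ENTRIES_PER_FEED * FEEDS
total_tb = total_bytes / 1024**4  # bytes -> binary terabytes

print(f"{total_tb:.1f} TB")  # prints "93.1 TB"
```

For comparison, the original Redis cluster offered 5 x 8GB = 40GB of memory in total.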
They chose to move “the cave” to Riak due to its focus on availability (as delivering stale data was preferable to delivering no data) and ease-of-scale. According to Superfeedr Founder, Julien Genestoux, “Riak solves the scalability problem elegantly. Through consistent hashing, our data is automatically distributed across the cluster and we can easily add nodes as needed.” While Riak does have a lower read performance than Redis, this proved to not be a problem as they found it easy to put caches in front of Riak if they needed to serve content faster.
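Consistent hashing, mentioned in the quote above, can be sketched as a ring of hashed positions. This is a generic textbook version with virtual nodes, not Riak's ring implementation; the `HashRing` class, the node names, and the choice of 64 virtual nodes per server are assumptions.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Generic consistent-hash ring; each node owns several points on the ring."""

    def __init__(self, nodes: Iterable[str], vnodes_per_node: int = 64) -> None:
        self._vnodes = vnodes_per_node
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        """Place the node's virtual nodes at hashed positions on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's position to the next virtual node."""
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node1", "node2", "node3"])
keys = [f"feed-{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("node4")
after = {key: ring.node_for(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved, all to node4:",
      all(after[key] == "node4" for key in moved))
```

The property worth noticing is that adding a node only reassigns keys to the new node; keys never shuffle between the existing nodes, which is what makes growing the cluster cheap compared with manual sharding.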
Though Superfeedr found it easy to set up their Riak cluster, the default behavior for handling conflicts had to be adjusted for their use case. By working with Basho and the Riak community, they were able to find the right settings and optimize their conflict resolution algorithm. For more information on Riak’s configurable behaviors, check out our four-part blog series.
Superfeedr went into production in two phases: they started storing production data in the beginning of 2013 and began serving that data about two months later. During this period, Superfeedr was able to design their cluster infrastructure and thoroughly performance test it with actual production data.
Two types of objects are stored in Superfeedr’s Riak datastore: feeds and entries. Feeds are stored as a collection of internal feed ids, which correspond to the entries and include some meta-information, such as the title. Entries correspond to feed entries and are indexed by feedID-entryID, allowing them to store multiple entries for each feed. This indexing scheme allows entries to be retrieved, even if they lose track of the feed element, through a MapReduce job.
At write time, Superfeedr writes both the feed element and the entry element. When they query for a feed, they issue a MapReduce job to read both the feed element and the desired number of entry items. They also use a pagination mechanism to limit the resources consumed for each request, with an arbitrary limit of 50 entries.
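The feed and entry layout described above might be modeled roughly as follows. This is an interpretation, not Superfeedr's code: it uses a plain dictionary in place of Riak, direct lookups in place of the MapReduce job, and hypothetical function and field names. Only the feedID-entryID key scheme and the limit of 50 come from the description.

```python
from typing import Dict, List

PAGE_LIMIT = 50  # cap on entries returned per request


def entry_key(feed_id: str, entry_id: str) -> str:
    """Entries are keyed as feedID-entryID."""
    return f"{feed_id}-{entry_id}"


def write_entry(store: Dict[str, dict], feed_id: str, entry_id: str,
                title: str, body: str) -> None:
    """Write both the feed element and the entry element."""
    feed = store.setdefault(feed_id, {"title": title, "entry_ids": []})
    feed["entry_ids"].append(entry_id)
    store[entry_key(feed_id, entry_id)] = {"body": body}


def read_feed(store: Dict[str, dict], feed_id: str, count: int,
              offset: int = 0) -> List[dict]:
    """Return up to `count` entries for a feed, never more than PAGE_LIMIT."""
    limit = min(count, PAGE_LIMIT)
    entry_ids = store[feed_id]["entry_ids"][offset:offset + limit]
    return [store[entry_key(feed_id, entry_id)] for entry_id in entry_ids]


store: Dict[str, dict] = {}
for i in range(120):
    write_entry(store, "feed42", f"e{i}", "Example feed", f"body {i}")

page = read_feed(store, "feed42", count=100)
print(len(page))
```

Because each entry key embeds the feed id, the entries for a feed remain identifiable from their keys alone, even if the feed element itself is lost.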
Today, Superfeedr has served over 23 billion entries, with nearly one million more being published every hour. Their six-node Riak cluster (built on 16GB Linode slices) has allowed them to horizontally scale their cluster as their content and user base grows. “Riak is the right tool for us due to its scalability and always on availability,” said Genestoux. “We have refined it to fit our needs and can rest-assured that no data will ever be lost in our Riak ‘cave.’”
If you’re looking for a Google Reader replacement or interested in learning more about Superfeedr, check out their site: superfeedr.com/. For other examples of Riak in production, visit: basho.com/riak-users/