June 11, 2012
Q: What platforms are supported by Riak?
This question comes up quite frequently, and there have always been two answers to it. The first answer really answers the question of “what platforms can Riak run on?” The second one answers the question of “what platforms are packaged for and tested by Basho?”
*BSD and the Answer to the First Question
Riak has been built and run on FreeBSD by a handful of users for quite some time, from what we can tell. Those dedicated users had to jump through hoops, modifying many of Riak’s Makefiles as well as #ifdefs scattered throughout our storage backends. From a great-user-experience point of view, this is not ideal.
Early last year work was started by both the community and Basho to improve our codebase to support *BSD properly. Unfortunately, with our limited resources at the time, finishing the testing on *BSD got set aside in favor of new Riak features. This left a single missing piece for easy use of Riak on *BSD: LevelDB.
In December Basho engineer Andrew Thompson submitted a patch to Google to add DragonFlyBSD/NetBSD/OpenBSD support to LevelDB so we can use those fixes in eLevelDB. In March 2012, the patch was merged into LevelDB. That patch, along with the eventual merge of the “BSD Support” pull-request, finally added BSD to the answer of the first question, “what platforms can Riak run on?”
While great, this still leaves Riak *BSD users alone when it comes to the second question “what platforms are packaged for and tested by Basho?”
*BSD and the Answer to the Second Question
Whenever asked what platforms are officially supported by Basho the answer has always been to look at what packages were available for Riak. After the release of Riak 1.1, we decided we had the capacity to support another major platform with our next major release. Due to interest from the community and customers, FreeBSD was chosen. Now “Free” doesn’t satisfy all * in *BSD, but it is a good place to start!
Luckily the work to actually package FreeBSD was made easier by all the work mentioned above to have it build cleanly. As of this commit, packages for FreeBSD have been building with every commit on riak/master. So, stay tuned for the next major Riak release, when we’ll ship FreeBSD 9 packages as well as update the installation documentation. Now I’m on the hook for writing documentation I suppose!
$ gmake package RELEASE=1
June 7, 2012
We have been in close contact with the Server and Cloud Team at Microsoft over the past months. The addition of Linux and new language libraries, such as Java and Node.js, is indicative of a major new commitment for Microsoft’s Cloud.
Most importantly, we are excited that Riak will be available on Microsoft’s Cloud running on Linux in a month’s time.
June 4, 2012
I’m thrilled to announce that the v0.3 Riak Community Release Notes are now official. (For some history on the Community Release Notes, go here.) This installment covers what happened in the community from (approximately) May 4 through June 1. Some of the many stand-out accomplishments from this release:
- The team at Kiip gave a great talk on moving from MongoDB to Riak.
- Long-time Riak and Riak Search users and contributors Clipboard officially launched.
- The first Riak London Meetup was held.
- Mathias Meyer shipped a huge update to the Riak Handbook.
- Matt Ranney and the team at Voxer released their Node.js client for Riak.
We did a lot more. Take a few minutes to read up. Also, we’re already rolling with the 0.4 Release Notes (which will cover June 2 up through July 1). You’re encouraged to contribute to past, present, and future release notes, so don’t hold back.
Enjoy, and thanks for being a part of Riak.
Basho Technologies today announced the immediate availability of the second edition of Riak Handbook.
CAMBRIDGE, MA – June 1, 2012 – Basho Technologies today announced the immediate availability of the second edition of Riak Handbook. The significantly updated Riak Handbook includes more than 43 pages of new content covering many of the latest feature enhancements to Riak, Basho’s industry-leading, open-source, distributed database. Riak Handbook is authored by former Basho developer and advocate, Mathias Meyer.
Riak Handbook is a comprehensive, hands-on guide to Riak. The initial release of Riak Handbook focused on the driving forces behind Riak, including Amazon Dynamo, eventual consistency and CAP Theorem. Through a collection of examples and code, Mathias’ Riak Handbook explores the mechanics of Riak, such as storing and retrieving data, indexing, searching and querying data, and sheds a light on Riak in production. The updated handbook expands on previously covered key concepts and introduces new capabilities, including the following:
- An overview of Riak Control, a new Web-based operations management tool
- An entirely new section on deploying Erlang code in a Riak cluster
- Additional details on secondary indexes
- Insight into load balancing Riak nodes
- An introduction to network node planning
- An introduction to Riak CS, including Amazon S3 API compatibility
The updated Riak Handbook includes an entirely new section dedicated to popular use cases and is full of examples and code from real-time usage scenarios.
Mathias Meyer is an experienced software developer, consultant and coach from Berlin, Germany. He has worked with database technology leaders such as Sybase and Oracle. He entered into the world of NoSQL in 2008 and joined Basho Technologies in 2010.
About Basho Technologies
Basho Technologies is the leader in highly-available, distributed database technologies used to power scalable, data-intensive Web, mobile, and e-commerce applications and large cloud computing platforms. Basho customers, including fast-growing Web businesses and large Fortune 500 enterprises, use Riak to implement content delivery platforms and global session stores, to aggregate large amounts of data for logging, search, and analytics, to manage, store and stream unstructured data, and to build scalable cloud computing platforms.
Riak is available open source for download at http://wiki.basho.com/Riak.html. Riak EnterpriseDS is available with advanced replication, services and 24/7 support. Riak CS enables multi-tenant object storage with advanced reporting and an Amazon S3 compatible API. For more information visit www.basho.com or follow us on Twitter at www.twitter.com/basho.
May 24, 2012
Kiip has been using Riak in production for about three months, and Armon and Mitchell shared a lot of great information on their deployment: why they determined their initial, MongoDB-based system wouldn’t scale; their process for selecting Riak over various other database options; how, specifically, they are using Riak in production; what Riak isn’t suited for; some of the production issues they hit along the way; and much more.
The talk runs just over 30 minutes, and it’s well worth your time. Armon and Mitchell are smart, honest, articulate speakers who care deeply about the quality and long-term viability of the systems they put into production at Kiip. Also, if you like slides, they can be found here. Lastly, if you want to work with this team, Kiip is hiring.
May 22, 2012
A handful of the Basho team are descending on Europe, attending and speaking at various conferences and meetups, and we couldn’t be more excited to meet and mingle with the growing European Riak community.
GOTO Copenhagen began yesterday and runs through May 23. The GOTO conference series is a set of international events put on for, and by, software developers. This year’s theme is “Real Stories from Real People” and attendees can expect to learn about solving real-life problems from the real-life experiences of a number of leading experts and authors.
Put these talks in your calendar:
Basho will also have a booth on the exhibition floor. Be sure to stop by in between talks to chat with Ian or Tom about Riak, distributed systems or your favorite fancy cocktail.
GOTO is on a roll this year, with two European conferences scheduled back to back. GOTO Amsterdam is hosted at Beurs van Berlage. For a two-day conference, the list of speakers is quite impressive, so kudos to the GOTO Program Advisory Board for putting this one together.
In addition to a booth on the exhibition floor, Andy Gross will deliver the following talk that cannot be missed:
Be sure to corner [Andy](https://twitter.com/#!/argv0) or [Tom](https://twitter.com/#!/tsantero) in between sessions and ask them hard questions about Riak.
### NoSQL Matters
Basho’s [Tom Santero](https://twitter.com/#!/tsantero) will be attending [NoSQL Matters](http://www.nosql-matters.org/) set to take place in Cologne, Germany. This is a brand new conference, and we have very high expectations for success considering the caliber of [speakers](http://www.nosql-matters.org/speakers/) on deck.
If you’re in attendance, be sure not to miss these talks from members of the Riak community:
* [Designing for Concurrency with Riak](http://www.nosql-matters.org/agenda/) – Mathias Meyer
* [Theoretical Aspects of Distributed Systems, Playfully Illustrated](http://www.nosql-matters.org/agenda/) – Pavlo Baron
### Erlang User Conference
**May 28 – June 1**
Stockholm plays host to this year’s [Erlang User Conference](http://www.erlang-factory.com/conference/ErlangUserConference2012). The events put on by Erlang Solutions are usually exceptional, and Basho will be well represented this year.
The conference itself lasts for two days, Monday and Tuesday, followed by a day of tutorials on Wednesday and then wrapped up with two days of workshops on Thursday and Friday.
We’ll be delivering the following talks:
* [Sweden’s Next Top NoSQL Data Model](http://www.erlang-factory.com/conference/ErlangUserConference2012/speakers/IanPlosker) – Ian Plosker
* [Innovation: What Every Developer Absolutely Needs To Know](http://www.erlang-factory.com/conference/ErlangUserConference2012/speakers/SteveVinoski) – Steve Vinoski
Basho’s VP of Engineering, [Dizzy Smith](https://twitter.com/#!/dizzyd), will host a [tutorial](http://www.erlang-factory.com/conference/ErlangUserConference2012/speakers/DizzySmith) demonstrating [Rebar](https://github.com/basho/rebar), an open-source build-system for Erlang/OTP applications.
[Ian Plosker](https://twitter.com/#!/dstroyallmodels) will be running a two-day class on [Building distributed clusters with Riak](http://www.erlang-factory.com/conference/ErlangUserConference2012/university/RiakTraining). Everyone who attends will walk away with a very clear understanding of just why Riak is the best distributed database you will ever run in production.
### London Riak Meetup
Basho is pleased to announce that Ian Plosker will be hosting the [Inaugural London Riak Meetup](http://www.meetup.com/riak-london/events/62061262/).
This first meetup in London will feature a talk by Basho’s VP of Engineering, [Dizzy Smith](https://twitter.com/#!/dizzyd) and is hosted in Google’s new [co-working space](http://www.campuslondon.com/).
If you’re in or around London on May 30, missing this is not optional.
*Don’t forget to [RSVP](http://www.meetup.com/riak-london/events/62061262/).*
[EuRuKo](http://www.euruko2012.org/) is an annual Ruby conference hosted in Amsterdam. All attendees can expect a killer [venue](http://www.euruko2012.org/#venue), awesome [lineup of speakers](http://www.euruko2012.org/#speakers) and a fancy Boat Party sponsored by GitHub.
[Sean Cribbs](https://twitter.com/#!/seancribbs) will give a talk titled *A Case of Accidental Concurrency* – if you haven’t been lucky enough to hear Sean speak in person before, then you’re in for a real treat.
### Berlin Buzzwords
Last, but certainly not least, we’ll be at [Berlin Buzzwords](http://berlinbuzzwords.de/) for two days of [brilliant technologists](http://berlinbuzzwords.de/speakers), [hackathons](http://berlinbuzzwords.de/wiki/hackathons) and training.
The theme of this conference is “search”, “store” and “scale”, our natural habitat so to speak.
You’ll get to hear from the Basho team:
* [Germany’s Next Top Data Model](http://berlinbuzzwords.de/sessions/germanys-next-top-data-model) – Ian Plosker
* [Eventually Consistent Data Structures](http://berlinbuzzwords.de/sessions/eventually-consistent-data-structures) – Sean Cribbs
As well as the Riak community:
* [From Hand to Mouth](http://berlinbuzzwords.de/sessions/hand-mouth) – Pavlo Baron
May 14, 2012
I’m excited to announce that Matt Ranney and the team at Voxer just open-sourced their production Node.js client for Riak, node_riak. Voxer has been using this code in production for months now, and it has been battle-tested with Riak clusters that handle billions of requests daily and store on the order of hundreds of terabytes of raw data.
In addition to more code, Matt and his team will be working to add documentation and other resources to the repo as time allows. Interested parties are encouraged to fork, use, and contribute.
Enjoy, and thanks again to Matt and his team for releasing the library. Here’s the link to the repo once more: node_riak
Also, if you want to work with Riak, Node.js, and other technologies that power Voxer, check out their job listings. You won’t be disappointed.
May 9, 2012
Today I am excited to introduce a new piece of infrastructure to the Riak Community on which we’ve been working: Riak Community Release Notes.
Much like codebases grow and evolve, so does a community and its accomplishments. Why not present and chronicle the community in the same way you would a piece of code? The Riak Community Release Notes are an attempt to do just that.
Each month, we’ll tag and release a new “version” of the Riak Community. The most recent (and first official) release is v0.2. Each release will represent the evolution of the Community as demonstrated by our collective work and activity. For example:
- There were more than 20 blog posts about Riak in April
- Some of our users raised funding and welcomed new children to their family
- We released some (but not enough) new documentation.
My hope is that this will grow into a collaborative effort to track the trajectory of Riak and our user community. It looks somewhat like the Riak Recap, but I think it’ll extend and surpass it in a lot of ways. Most importantly, it’s an experiment, and I’m looking forward to how it evolves. Pull requests, feedback, and criticisms are welcomed.
Thanks for being a part of Riak.
April 26, 2012
At Basho we love Yammer. Besides making a product that we rely on internally, they are long-time Riak fans and advocates, and have built a large Riak cluster to power notifications for their entire user base. But not every use case is a fit for Riak. Running multiple databases in production is not uncommon, and skilled engineering teams like Yammer’s will always select the best tool for the job.
To that end, Ryan Kennedy, Yammer’s Director of Core Services, presented at BashoChats 003 about some of the impressive work that he and his colleagues are doing with Berkeley DB. He goes in depth on how they came to select BDB, what they added on top of Berkeley to ensure it could scale and satisfy their availability requirements, and what their data set and request profile look like in production. There’s a lot of worthwhile and valuable information in here. (Ryan’s slides are here if you’re interested in the PDF.)
Enjoy, and if you’re interested in speaking at a future BashoChats meetup, email me – email@example.com. Also, if you want to work with companies like Yammer, Twitter, Square, Simple, LinkedIn, and Basho building distributed systems, you should be at the next meetup. Keep an eye on the Meetup page for details.
April 26, 2012
Here at Basho we want to make sure that your Riak implementations are set up from the beginning to succeed. While you can use the Riak Fast Track to quickly set up a 3-node dev/test environment, we recommend that all production deployments use a minimum of 5 nodes, ensuring you benefit from the architectural principles that underpin Riak’s availability, fault-tolerance and scaling properties.
TL;DR: Deployments of five nodes or greater will provide a foundation for the best performance and growth as the cluster expands. Since Riak scales linearly with the addition of more nodes, users find improved performance, reliability, and throughput with larger clusters. Smaller deployments can compromise the fault-tolerance of the system: with a “sane” replication requirement for availability (we default to three copies), node failures in smaller clusters mean that replication requirements may not be met. This can result in degraded performance and risk of data loss. Additionally, clusters smaller than five nodes mean that with a sane replication requirement of 3, a high percentage (75-100% of the nodes) will need to respond to each request, putting undue load on the cluster that may degrade performance.
Let’s take a closer look at the scenarios of three- and four-node clusters.
Performance and Fault Tolerance Concerns in a 3-Node Cluster
To ensure that the cluster is always available to respond to read and write requests, Basho recommends a “sane default” for data replication: three copies of the data on three different nodes. The default configuration of Riak requires four nodes at minimum to ensure no single node holds more than one copy of any particular piece of data. (In future versions of Riak we’ll be able to guarantee that each replica is living on a separate physical node. At this point it’s almost at 100%, but we won’t tell you it’s guaranteed until it is.) While it is possible to change the settings to ensure that the three replicas are on distinct nodes in a three-node cluster, you still run into issues of replica placement during a node failure or network partition.
In the event of node failure or a network partition in a three-node cluster, the default requirement for replication remains three but there are only two nodes available to service requests. This will result in degraded performance and carries a risk of data loss.
Performance and Fault Tolerance Concerns in a 4-Node Cluster
With a requirement of three replicas, any one request for a particular piece of data from a 4-node cluster will require a response from 75 – 100% of the nodes in the cluster, which may result in degraded performance. In the event of node failure or a network partition in a 4-node cluster, you are back to the issues we outline above.
What if I want to change the replication default?
If using a different data replication number is right for your implementation, just make sure to use a cluster of N + 2 nodes, where N is the number of replicas, for the reasons outlined above.
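The N + 2 guideline can be read as simple arithmetic. The sketch below is illustrative only and does not call Riak. It assumes, following the discussion above, that the default configuration needs N + 1 nodes to keep replicas on distinct nodes, and that one extra node is wanted so this still holds after a single failure. The function names are hypothetical.

```python
def min_production_nodes(n_val: int = 3) -> int:
    """Smallest cluster size suggested by the N + 2 guideline."""
    if n_val < 1:
        raise ValueError("n_val must be at least 1")
    return n_val + 2


def meets_threshold_after_failure(total_nodes: int, n_val: int = 3, failed: int = 1) -> bool:
    """True if the surviving nodes still number at least N + 1.

    N + 1 is the node count the article cites for keeping replicas on
    distinct nodes under the default configuration (an assumption taken
    from the text, not something computed from Riak itself).
    """
    return total_nodes - failed >= n_val + 1


if __name__ == "__main__":
    for size in (3, 4, 5):
        ok = meets_threshold_after_failure(size)
        print(f"{size} nodes, one down: still meets N + 1 threshold = {ok}")
```

With the default of three replicas, only the five-node cluster passes this check after losing a node, which matches the recommendation above.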
Going With 5 Nodes
As you add nodes to a Riak cluster that starts with 5 nodes, the percentage of the cluster required to service each request goes down. Riak scales linearly and predictably from this point on. When a node is taken out of service or fails, the number of nodes remaining is large enough to protect you from data loss.
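To make the shrinking percentage concrete, here is a rough sketch that divides the replica count by the cluster size. It is a back-of-the-envelope model under the assumption that each request involves every node holding a replica of the key; the function name is hypothetical and nothing here queries a real cluster.

```python
def share_of_cluster_per_request(total_nodes: int, n_val: int = 3) -> float:
    """Fraction of the cluster's nodes that hold a replica of any one key."""
    if total_nodes < 1 or n_val < 1:
        raise ValueError("total_nodes and n_val must be at least 1")
    # A cluster smaller than n_val cannot involve more than all of its nodes.
    return min(n_val, total_nodes) / total_nodes


if __name__ == "__main__":
    for nodes in (3, 4, 5, 6, 10):
        share = share_of_cluster_per_request(nodes)
        print(f"{nodes:>2} nodes: {share:.0%} of the cluster per request")
```

With three replicas this gives 100% for three nodes, 75% for four, 60% for five, and 30% for ten.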
So do your development and testing with smaller clusters, but when it comes to production, start with five nodes.