May 22, 2012
A handful of the Basho team are descending on Europe, attending and speaking at various conferences and meetups, and we couldn’t be more excited to meet and mingle with the growing European Riak community.
GOTO Copenhagen began yesterday and runs through May 23. The GOTO conference series are international events put on for, and by, software developers. This year’s theme is “Real Stories from Real People” and attendees can expect to learn about solving real life problems form real life experiences from a number of leading experts and authors.
Put these talks in your calendar:
Basho will also have a booth on the exhibition floor. Be sure to stop by in between talks to chat with Ian or Tom about Riak, distributed systems or your favorite fancy cocktail.
GOTO is on a roll this year, with two European conferences scheduled back to back. GOTO Amsterdam is hosted at Beurs van Berlage. For a two day conferences, the list of speakers is quite impressive so kudos to the GOTO Program Advisory Board for putting this one together.
In addition to a booth on the exhibition floor, Andy Gross will deliver the following talk that cannot be missed:
Be sure to corner [Andy](https://twitter.com/#!/argv0) or [Tom](https://twitter.com/#!/tsantero) in between sessions and ask them hard questions about Riak.
### NoSQL Matters
Basho’s [Tom Santero](https://twitter.com/#!/tsantero) will be attending [NoSQL Matters](http://www.nosql-matters.org/) set to take place in Cologne, Germany. This is a brand new conference, and we have very high expectations for success considering the caliber of [speakers](http://www.nosql-matters.org/speakers/) on deck.
If you’re in attendance, be sure not to miss these talks from members of the Riak community:
* [Designing for Concurrency with Riak](http://www.nosql-matters.org/agenda/) – Mathias Meyer
* [Theoretical Aspects of Distributed Systems, Playfully Illustrated](http://www.nosql-matters.org/agenda/) – Pavlo Baron
### Erlang User Conference
**May 28 – June 1**
Stockholm plays host to this year’s [Erlang User Conference](http://www.erlang-factory.com/conference/ErlangUserConference2012). The events put on by Erlang Solutions are usually exceptional, and Basho will be well represented this year.
The conference itself last for two days, Monday and Tuesday, followed by a day of tutorials on Wednesday and then wrapped up with two days of workshops on Thursday and Friday.
We’ll be delivering the following talks:
* [Sweden’s Next Top NoSQL Data Model](http://www.erlang-factory.com/conference/ErlangUserConference2012/speakers/IanPlosker) – Ian Plosker
* [Innovation: What Every Developer Absolutely Needs To Know](http://www.erlang-factory.com/conference/ErlangUserConference2012/speakers/SteveVinoski) – Steve Vinoski
Basho’s VP of Engineering, [Dizzy Smith](https://twitter.com/#!/dizzyd), will host a [tutorial](http://www.erlang-factory.com/conference/ErlangUserConference2012/speakers/DizzySmith) demonstrating [Rebar](https://github.com/basho/rebar), an open-source build-system for Erlang/OTP applications.
[Ian Plosker](https://twitter.com/#!/dstroyallmodels) will be running a two day class on
[Building distributed clusters with Riak](http://www.erlang-factory.com/conference/ErlangUserConference2012/university/RiakTraining). Everyone who attends will walk away with a very clear understanding of just why Riak is the best distributed database you will ever run in production.
### London Riak Meetup
Basho is pleased to announce that Ian Plosker will be hosting the [Inaugural London Riak Meetup](http://www.meetup.com/riak-london/events/62061262/).
This first meetup in London will feature a talk by Basho’s VP of Engineering, [Dizzy Smith](https://twitter.com/#!/dizzyd) and is hosted in Google’s new [co-working space](http://www.campuslondon.com/).
If you’re in or around London on May 30, missing this is not optional.
*Don’t forget to [RSVP](http://www.meetup.com/riak-london/events/62061262/).*
[EuRuKo](http://www.euruko2012.org/) is an annual Ruby conference hosted in Amsterdam. All attendees can expect a killer [venue](http://www.euruko2012.org/#venue), awesome [lineup of speakers](http://www.euruko2012.org/#speakers) and a fancy Boat Party sponsored by GitHub.
[Sean Cribbs](https://twitter.com/#!/seancribbs) will give a talk titled *A Case of Accidental Concurrency* – if you haven’t been lucky enough to hear Sean speak in person before, than you’re in for a real treat.
### Berlin Buzzwords
Last, but certainly not least, we’ll be at [Berlin Buzzwords](http://berlinbuzzwords.de/) for two days of [brilliant technologists](http://berlinbuzzwords.de/speakers), [hackathons](http://berlinbuzzwords.de/wiki/hackathons) and training.
The theme of this conference is “search”, “store” and “scale”, our natural habitat so to speak.
You’ll get to hear form the Basho team:
* [Germany’s Next Top Data Model](http://berlinbuzzwords.de/sessions/germanys-next-top-data-model) – Ian Plosker
* [Eventually Consistent Data Structures](http://berlinbuzzwords.de/sessions/eventually-consistent-data-structures) – Sean Cribbs
As well as the Riak community:
* [From Hand to Mouth](http://berlinbuzzwords.de/sessions/hand-mouth) – Pavlo Baron
May 14, 2012
I’m excited to announce that Matt Ranney and the team at Voxer just open-sourced their production Node.js client for Riak, node_riak. Voxer has been using this code in production for months now, and it has been battle-tested with Riak clusters doing billions of requests daily and that are storing on the order of 100s of terabytes of raw data.
In addition to more code, Matt and his team will be working to add documentation and other resources to the repo as time allows. Interested parties are encouraged to fork, use, and contribute.
Enjoy, and thanks again to Matt and his team for releasing the library. Here’s the link to the repo once more: node_riak
Also, if you want to work with Riak, Node.js, and other technologies that power Voxer, check out their job listings. You won’t be disappointed.
May 9, 2012
Today I am excited to introduce a new piece of infrastructure to the Riak Community on which we’ve been working: Riak Community Release Notes.
Much like codebases grow and evolve, so does a community and its accomplishments. Why not present and chronicle the community in the same way you would a piece of code? The Riak Community Release Notes are an attempt to do just that.
Each month, we’ll tag and release a new “version” of the Riak Community. The most recent (and first official) release is v0.2. Each release will represent the evolution of the Community as demonstrated by our collective work and activity. For example:
- There were more than 20 blog posts about Riak in April
- Some of our users raised funding and welcomed new children to their family
- We released some (but not enough) new documentation.
My hope is that this will grow into a collaborative effort to track the trajectory of Riak and our user community. It looks somewhat like the Riak Recap, but I think it’ll extend and surpass it in a lot of ways. Most importantly, it’s an experiment, and I’m looking forward to how it evolves. Pull requests, feedback, and criticisms are welcomed.
Thanks for being a part of Riak.
April 26, 2012
At Basho we love Yammer. Besides making a product that we rely on internally, they are long-time Riak fans and advocates, and have built a large Riak cluster to power notifications for their entire user base. But not every use case is a fit for Riak. Running multiple databases in production is not uncommon, and skilled engineering teams like Yammer’s will always select the best tool for the job.
To that end, Ryan Kennedy, Yammer’s Director of Core Services, presented at BashoChats 003 about some of the impressive work that he and his colleagues are doing with Berkeley DB. He goes in depth on how they came to select BDB, what they added on top of Berkeley to ensure it could scale and satisfy their availability requirements, and what their data set and request profile look like in production. There’s a lot of worthwhile and valuable information in here. (Ryan’s slides are here if you’re interested in the PDF.
Enjoy, and if you’re interested in speaking at a future BashoChats meetup, email me – firstname.lastname@example.org. Also, if you want to work with companies like Yammer, Twitter, Square, Simple, LinkedIn, and Basho building distributed systems, you should be at the next meetup. Keep an eye on the Meetup page for details.
April 26, 2012
Here at Basho we want to make sure that your Riak implementations are set up from the beginning to succeed. While you can use the Riak Fast Track to quickly set up a 3-node dev/test environment, we recommend that all production deployments use a minimum of 5 nodes, ensuring you benefit from the architectural principles that underpin Riak’s availability, fault-tolerance and scaling properties.
TL;DR: Deployments of five nodes or greater will provide a foundation for the best performance and growth as the cluster expands. Since Riak scales linearly with the addition of more nodes, users find improved performance, reliability, and throughput with larger clusters. Smaller deployments can compromise the fault-tolerance of the system: with a “sane” replication requirement for availability (we default to three copies), node failures in smaller clusters mean that replication requirements may not be met. This can result in degraded performance and risk of data loss. Additionally, clusters smaller than five nodes mean that with a sane replication requirement of 3, a high percentage (75-100% of the nodes) will need to respond to each request, putting undue load on the cluster that may degrade performance.
Let’s take a closer look in the scenario of a three- and four-node cluster.
Performance and Fault Tolerance Concerns in a 3-Node Cluster
To ensure that the cluster is always available to respond to read and write requests, Basho recommends a “sane default” for data replication: three copies of the data on three different nodes. The default configuration of Riak requires four nodes at minimum to insure no single node holds more than one copy of any particular piece of data. (In future versions of Riak we’ll be able to guarantee that each replica is living on a separate physical node. At this point it’s almost at 100%, but we won’t tell you it’s guaranteed until it is.) While it is possible to change the settings to ensure that the three replicas are on distinct nodes in a three node cluster, you still run into issues of replica placement during a node failure or network partition.
In the event of node failure or a network partition in a three-node cluster, the default requirement for replication remains three but there are only two nodes available to service requests. This will result in degraded performance and carries a risk of data loss.
Performance and Fault Tolerance Concerns in a 4-Node Cluster
With a requirement of three replicas, any one request for a particular piece of data from a 4-node cluster will require a response from 75 – 100% of the nodes in the cluster, which may result in degraded performance. In the event of node failure or a network partition in a 4-node cluster, you are back to the issues we outline above.
What if I want to change the replication default?
If using a different data replication number is right for your implementation, just make sure to use a cluster of N +2 nodes where N is the number of replicas for the reasons outlined above.
Going With 5 Nodes
As you add nodes to a Riak cluster that starts with 5 nodes, the percentage of the cluster required to service each request goes down. Riak scales linearly and predictably from this point on. When a node is taken out of service or fails, the number of nodes remaining is large enough to protect you from data loss.
So do your development and testing with smaller clusters, but when it comes to production, start with five nodes.
April 18, 2012
Basho’s awesome designer John Newman spent some time putting together beautiful badges that will make great additions to your sites if you’re running Riak and want to tell the world.
We’ve also added them to the Riak Wiki. Now you can run the best open source database in the game and tell people about it.
Enjoy and thanks for being a part of Riak.
March 26, 2012
This is a big week for Basho.
The first three days of Erlang Factory are primarily workshops, and Daniel Reverri will be teaching a 3 day class on Building Distributed Clusters with Riak. All attendees will walk away with a clear understanding of exactly why Riak is the best distributed database you will ever run in production.>
The actual conference spans Thursday – Friday, and the talk lineup for this year’s event is exceptional. The Basho team will be well-represented. Put these talks on your calendar if you’re attending:
- Test-First Construction of Distributed Systems – Joseph Blomstedt
- Building Healthy Distributed Systems – Mark Phillips
- Building Cloud Storage Services with Riak – Andy Gross
Several members of the Riak Community are also on the schedule:
- Erlang for .NET Developers – OJ Reeves
- Rewriting GitHub Pages with Riak Core, Riak KV, and Webmachine – Jesse Newland
Basho Bash West
We’re really excited about all the success surrounding Riak in 2011 and we’re continuously building on that momentum as we move deeper into 2012. The number of Riak users and community members are growing exponentially so we decided to throw a party to celebrate. We’re calling it Basho Bash West 2012, and it’s co-sponsored by our friends at Joyent, Yammer and Voxer.
Come join us on Thursday, March 29th, at 6:30PM. We are renting out Roe, and you won’t be allowed to pay for anything. You’ll also be leaving with some limited edition Riak swag that will make you the envy of all your friends. Various members of the Basho team will be in attendance, along with hundreds of developers, executives, and technology enthusiasts from the Bay Area. Miss this at your peril.
You must RSVP to attend.
February 22, 2012
Riak Control is Basho’s new OSS, REST-driven, user-interface for Riak. The code has been available for a few months now, but it’s officially supported in Riak 1.1, so we wanted share some details on what it’s about and why you should be excited about it.
Lowering the Barrier for Entry
Once a Riak cluster is up and running we want it to be as hands-free to administer as possible. Things should “just work,” like plumbing. But we’ll be the first to admit that a new user’s initial welcoming with Riak isn’t always as pleasant as it should be.
Some steps are unavoidable: downloading, installing, and/or building from source, etc. But once the initial work is done, the experience should be as inviting as possible. Riak is a very powerful database with numerous options and commands. Riak Control allows you to easily manage/inspect your cluster while ignoring many of these until needed.
Empowering Riak Administrators
When we first sat down to decide what the more important features for any Riak interface should be, one theme stood out above all the others: cluster management. We wanted to give developers and administrators the ability to quickly build a cluster, inspect nodes, and diagnose the health of their cluster. And we wanted it to happen fast.
Riak is about large datasets and clusters replicating that dataset for maximum availability and persistence. We’re working hard to help companies that write many, many GBs per day to clusters containing 50+ nodes. Riak Control is a tool that brings issues and risks front-and-center. And it gives customers the ability to take action in real-time.
The Two-Minute Tour
Riak Control is currently broken up into nested levels of detail. Each page in Riak Control is designed to give you just as much information as you need, nothing more. As you navigate the UI, you’ll gradually be taken deeper into the rabbit hole.
The Snapshot is what you’ll see when you first fire up Riak Control. It should give you a warm-fuzzy feeling when everything is A-okay: an unmistakable, beautiful green check mark.
For times when things aren’t perfect, you will be presented with a list of concern areas. Each will have links to other pages of Riak Control where you can take a closer look at the problem.
The cluster page is where you can get a quick look at all of the nodes in your cluster and manage membership.
With a glance you can see which nodes are partitioned from the rest of the cluster or offline, which are leaving or joining the cluster, view partition ownership, monitor memory, and more. And with a click you can add nodes to the cluster, take nodes offline for maintenance, and leave the cluster.
One level deeper than the cluster view is the ring page. This is where you can see the health of each partition. Most of the time, your ring will be too large to really manage from the ring view. But with the filters you can immediately find which partitions are owned by which nodes, partitions whose primary nodes are unreachable, current handoffs, and more.
Riak Control is not standing still. Riak 1.1 includes Riak Control in its early stages so we can begin to gather feedback. We want to know what it does right and what it does wrong. Your feedback and ideas are encouraged. Additionally, we have a list of features and functionality slated for future releases. None of these are set in stone, but here is a list of what we have planned…
While Riak Control is – at its heart – a simple REST API, we’re working to modularize it in a way that allows you to write your own modules/plugins. We want to see Riak Control become a collection of pieces that all snap and work together, empowering you to manage your cluster in the way that best fits your needs.
Currently Riak Control uses a pull model to gather information about the cluster. While this isn’t a performance issue, we very much want to make it a push-system. As things happen to the cluster, the cluster should notify Riak Control of the changes, which in-turn will notify the user.
Clicking on a node name from anywhere should take you to a page giving details specifically about that node, similar to the data you would get from a
riak-admin status command.
Bucket & Object Inspection
While low-level object manipulation isn’t designed to be a primary feature of Riak Control, it is a very handy tool to have, and extremely valuable when initially setting up Riak for the first time. More importantly, Riak buckets will be available to create and inspect.
Riak Control will feature a powerful interface for creating MapReduce queries. You will be able to debug, save, load, and execute previously saved queries with ease.
Customer Support Tools
In addition to the general tools provided for manipulation of the cluster and data, we also are planning for improve monitoring tools.
- View the log files of individual nodes
- See graphs of load, memory latency, disk usage, etc.
- Coalesce and bundle data for support tickets
- File support tickets
Any Comments, Questions, or Feature Requests?
Anything you’d like to share or ask? Join the Riak-Users Mailing List and tell us what you think. The other option is to fork the code and make your opinions known with a pull request or by filing an issue. You can also find some formal documentation on the Riak Wiki.
Thanks for being a part of Riak.
February 07, 2012
Basho’s usage of Travis is mainly via Riak’s client libraries and we’ve been thrilled with it to date. Over the past few months, Sean Cribbs, Basho’s lead on Riak’s Ruby Client, has been using Travis. Reid Draper, another member of the Basho Team, has also started using it for builds of Sumo, a Riak Clojure Client that he and a few others are spearheading. There is also a handful of other Riak-related projects that are using Travis regularly.
Based on Sean and Reid’s experience with Travis and their opinion of the project’s usefulness and viability, it was an easy decision for us to donate money to support its ongoing development. Basho is also dedicated to the usage and proliferation of quality open source projects, and Travis is a great example of this. So we are proud to be a Silver Sponsor of Travis CI and are looking forward to watching the tool and its community grow.
January 31, 2012
The videos from last month’s San Francisco Riak Meetup are online and ready for consumption. The first features Julio Capote giving a short overview of the work he and Posterous are doing with Riak as a post cache. The second presentation was from Mark Phillips and it was all about Riak Control, the new Riak Admin Tool that will be fully supported in the forthcoming Riak 1.1 release.
Enjoy, and thanks for being a part of Riak.
Riak In Production At Posterous
This talk runs about 11 minutes. In it, Julio details the importance of the post cache at Posterous, what their initial solution to the problem was, and how they went about selecting Riak over MongoDB, MySQL, and Redis.
Preview of Riak Control
This talk runs just under 30 minutes. Mark starts with a history of the Riak Admin UI, details Basho’s motivations for writing and open-sourcing Riak Control, and then gives a live demo of the tool and talks about future enhancements.