January 30, 2013
Many teams run Riak in public cloud environments, either as a part of their infrastructure or as the foundation of it. Increasingly, we see enterprises and startups use a hybrid implementation that leverages both private infrastructure and public cloud services. This hybrid model is often used to address burst capacity issues, tenancy/location concerns, and simple proof-of-concept implementations prior to hardware acquisition.
Over the past few years, we have seen substantive adoption of Riak on Amazon Web Services. To that end, we are pleased that Basho has been approved as an Amazon Web Services Technology Partner. We look forward to highlighting interesting use cases, publishing detailed case studies of usage, and continuing to improve the usability and deployment speed of Riak on the AWS platform.
This post provides a high-level overview of your deployment options for using Riak on Amazon.
How Many Nodes?
Before we discuss the mechanics of implementation, it is important to consider the size of your deployment. One of the most frequent questions Basho is asked is, “How many nodes should I start with?”
If you have played with the Riak Fast Track, you are familiar with deploying three nodes on a single machine. However, for production deployments, we recommend that your cluster be set up with a minimum of five nodes. For more details on how this minimum ensures the performance and availability of your implementation, please read the post entitled: Why Your Riak Cluster Should Have At Least Five Nodes.
So, you have a minimum of five nodes and you’ve decided that leveraging a cloud provider is appropriate for your current business needs. Now, how do you get started?
Amazon Machine Image
At its simplest, the Riak Amazon Machine Image (AMI) is a pre-built machine image with Riak installed and configured for Amazon EC2 users.
Obtaining and configuring the image is a relatively straightforward process. However, since Riak needs the nodes in the cluster to communicate with each other, there is some manual setup involved.
First, provision the Riak AMI onto the server of your choice via the AWS marketplace.
Once the virtual machine is created, manually configure the EC2 security group to allow the Riak nodes to speak to each other. The details of this step can be found on our docs portal under Installing on AWS Marketplace. However, this is generally as simple as opening a few inbound ports and defining a “Custom TCP rule.”
At this point, the machines can be clustered together. When the individual virtual machines are provisioned and the security group is configured, simply SSH into each machine and use internal riak-admin tools to join the nodes to the cluster.
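The join step can be sketched as a small script that prints the commands to run on each machine. This is a minimal sketch, assuming the staged clustering commands (`riak-admin cluster join`, `plan`, and `commit`) available in Riak 1.2 and later; the node names and private IP addresses are hypothetical placeholders, and the helper function is invented for illustration.

```python
def cluster_join_commands(seed_node, other_nodes):
    """Return (node, command) pairs describing how to form a cluster.

    Every node except the seed stages a join to the seed node; the
    staged changes are then reviewed and committed once from the seed.
    """
    commands = []
    for node in other_nodes:
        commands.append((node, f"riak-admin cluster join {seed_node}"))
    # Review the staged changes, then commit them (run on one node only).
    commands.append((seed_node, "riak-admin cluster plan"))
    commands.append((seed_node, "riak-admin cluster commit"))
    return commands


# Hypothetical node names for a five-node cluster on private EC2 addresses.
nodes = [f"riak@10.0.0.{i}" for i in range(1, 6)]
plan = cluster_join_commands(nodes[0], nodes[1:])

for host, command in plan:
    print(f"[{host}] {command}")
```

The script only prints commands; you would still SSH into each machine and run them there, as described above.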
But what if you want to automate some of the configuration of your cluster? Or, what if you want the ability to set up a VPC-based stack that includes:
- a front-end load balancer,
- a cluster of application servers,
- a Riak-powered demo application,
- a back-end load balancer,
- and a cluster of Riak servers.
In that case, the Basho team has made available scripts that leverage AWS CloudFormation to build out your cluster in a scripted fashion.
Since this is a much different process than the previous method, it is well worth watching the introductory video (embedded below). In addition, the scripts in the cloudformation-riak repo can be thought of as “known good” templates. We accept pull requests, so happy forking!
As always, there is a manual option.
If you need to control the system configuration or are most comfortable with software that you have built and deployed yourself, there is always the option to install from package or source.
For a full list of supported operating systems, check out the Installing and Upgrading page of the doc portal. In addition, we have recently launched a new download page that includes the source for the OSS version of Riak.
Riak is easier to deploy than ever before. If you have feedback on present deployment alternatives, or recommendations on ways to make Riak support for cloud infrastructure easier, please drop us a note in the mailing list.
January 29, 2013
This is the first in a series of blog posts covering the benefits Riak offers to developers and operators of retail and eCommerce platforms. To learn more, join our “Retail on Riak” webcast on Friday, February 8th.
As retailers grow and have to store more and more data, traditional relational databases aren’t always the best option. Retailers want to scale easily, without the operational burden of manual sharding. Meanwhile, business requirements demand their data is always available for reads and writes. Riak is a highly available, low latency distributed database that is ideal for retailers who need to serve product data quickly and maintain “always on” shopping experiences. Riak is based on architectural principles from Amazon. Riak is designed for high availability and scale so retailers can always serve customers, even under failure conditions, and rapidly grow to meet peak loads.
Retailers of all sizes have chosen Riak to power parts of their business, including:
- Best Buy: Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part in the transformation push to re-platform Best Buy’s eCommerce platform. For more info, check out Best Buy’s talk from our 2012 developer conference, RICON.
- ideeli: ideeli is one of the fastest growing retailers with over 5 million members and more than 1,000 brand partners. They use Riak to serve HTML documents and user-specific products. ideeli chose Riak to power their event-based shopping experience due to Riak’s ability to serve users information at low latency and provide ease of use and scale to ideeli’s operations team. Check out the complete case study for more details.
- Copious: Copious is a social commerce marketplace that makes it easy for people to buy and sell the things they love. They currently store all registered accounts in Riak as well as the tokens that make it possible for users to authenticate with Copious via their Facebook or Twitter accounts. They chose to use Riak for their social login functionality because of its operational simplicity, which allows them to easily scale up without sharding and provides the high availability required for a smooth user experience. For more details, check out the complete Copious story on our blog.
For more information about the benefits of Riak for retailers and the retailers already using it, register for our “Retail on Riak” webcast on February 8th!
January 22, 2013
Traditionally, most retailers have used relational databases to manage their platforms and eCommerce sites. However, with the rapid growth of data and business requirements for high availability and scale, more retailers are looking at non-relational solutions like Riak.
Riak is a masterless, distributed database that provides retailers with high read and write availability, fault-tolerance and the ability to grow with low operational cost. Architectural, operational and development benefits for retailers include:
- “Always On” Shopping Experience: Based on architectural principles from Amazon, Riak is designed to favor data availability, even in the event of hardware failure or network partition. For retailers, failure to accept additions to a shopping cart, or serve product information quickly, has a direct and negative impact on revenue. Riak is architected to ensure the system can always accept writes and serve reads at low-latency.
- Resilient Infrastructure: At scale, hardware malfunction, network partition, and other failure modes are inevitable. Riak provides a number of mechanisms to ensure that retail infrastructure is resilient to failure. Data is replicated automatically within the cluster so nodes can go down but the system still responds to requests. This ensures read and write availability, even in serious failure conditions.
- Low-Latency Data Storage: Many retailers now operate online and mobile experiences with an API or data services platform. In order to provide a fast and available experience to end users, Riak is designed to serve predictable, low-latency requests as part of a service-oriented infrastructure and is accessible via HTTP API, protocol buffers, or Riak’s many client libraries.
- Scale to Peak Loads with Low Operational Cost: During major holidays and other periods of peak load, retailers may have to significantly increase their database capacity quickly. When new nodes are added, Riak automatically distributes data evenly to naturally prevent hot spots in the database, and yields a near-linear increase in performance and throughput when capacity is added.
- Global Data Locality and Redundancy: Riak Enterprise’s multi-site replication allows replication of data to multiple data centers, providing both a global data footprint and the ability to survive datacenter failure.
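The replication and even-distribution points in the list above can be illustrated with a toy ring. This is a rough sketch only: the ring size of 64, the replica count of 3, the round-robin partition assignment, and the node and key names are all assumptions made for illustration, and Riak's actual partition claim logic is more sophisticated than this.

```python
import hashlib
from collections import Counter

RING_SIZE = 64          # number of partitions (assumed for illustration)
N_VAL = 3               # replicas stored per object (assumed for illustration)
HASH_SPACE = 2 ** 160   # size of the SHA-1 output space


def claim(nodes):
    """Assign each partition to a node, round-robin (a simplification)."""
    return [nodes[i % len(nodes)] for i in range(RING_SIZE)]


def partition_for(key):
    """Hash a key onto the ring and return the index of its partition."""
    digest = int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")
    return digest * RING_SIZE // HASH_SPACE


def replicas_for(key, owners):
    """Return the nodes owning the key's partition and the next N_VAL - 1."""
    first = partition_for(key)
    return [owners[(first + i) % RING_SIZE] for i in range(N_VAL)]


def primary_counts(keys, owners):
    """Count how many keys have each node as their first replica."""
    return Counter(owners[partition_for(key)] for key in keys)


keys = [f"cart-{i}" for i in range(10_000)]          # hypothetical keys
five = claim([f"node{i}" for i in range(1, 6)])      # five-node cluster
six = claim([f"node{i}" for i in range(1, 7)])       # after adding a node

print("5 nodes:", sorted(primary_counts(keys, five).items()))
print("6 nodes:", sorted(primary_counts(keys, six).items()))
print("replicas for cart-42:", replicas_for("cart-42", five))
```

Because each key is written to several consecutive partitions owned by different nodes, any single node can go down while copies remain reachable, and adding a sixth node lowers the share of keys each node is responsible for.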
Top retailers using Riak include Best Buy and ideeli. Best Buy selected Riak as an integral part in the transformation push to re-platform its eCommerce platform. For more information about how Best Buy is using Riak, check out this video.
ideeli uses Riak to serve HTML documents and user-specific products. ideeli chose Riak to provide its highly available, event-based shopping experience – Riak gives them the ability to serve user information at low latency and provides ease of use and scale to ideeli’s operations team. For more information on ideeli’s use of Riak check out the complete case study.
Common use cases for Riak in the retail/eCommerce space include shopping carts (due to Riak’s “always-on” capabilities), product catalogs (Riak is well suited for the storage of rapidly growing content that needs to be served at low-latency), API platforms (Riak’s flexible, schemaless design allows for rapid application development), and mobile applications (Riak is ideal for powering mobile experiences across platforms due to its low-latency, always-available small object storage capabilities).
To help retailers evaluate and adopt Riak, we’ve published a technical overview: “Retail on Riak: A Technical Introduction.” It covers modeling applications for common use cases, switching from a relational architecture, querying, multi-site replication and more.
January 21, 2013
Mad Mimi is an email marketing service that allows users to create, send, and track email campaigns without using templates. With over 100,000 clients, Mad Mimi is storing a large amount of data that needs to be accessed quickly and easily.
In 2011, Mad Mimi realized that their data was growing beyond the capacity of their MySQL database. Rather than resharding the data, which would require an extensive operational effort, Mad Mimi decided to try Riak based on its ability to scale quickly and easily without manual sharding.
Mad Mimi now uses Riak to track email statistics, leveraging the secondary indexing feature to make retrieving data easier. Secondary indexing allows users to attach additional key/value data to Riak objects and query them by exact match or range value. Mad Mimi is currently running an eight-node cluster storing between three and five billion keys, adding between 10 and 20 million keys each day.
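To make the exact-match and range queries concrete, here is a toy in-memory model of secondary indexes. It is a sketch of the idea only, not the Riak client API: the class, the keys, and the index names are hypothetical (the names simply borrow the `_bin` and `_int` suffixes Riak uses for string and integer indexes), and nothing here reflects how Mad Mimi actually models its data.

```python
from bisect import insort
from collections import defaultdict


class TinyIndexedStore:
    """Toy key/value store with secondary indexes (not the Riak client API)."""

    def __init__(self):
        self.objects = {}                   # key -> value
        self.indexes = defaultdict(list)    # index name -> sorted (value, key) pairs

    def put(self, key, value, indexes):
        """Store an object and tag it with index entries.

        Simplification: overwriting a key does not remove its old entries.
        """
        self.objects[key] = value
        for name, index_value in indexes.items():
            insort(self.indexes[name], (index_value, key))

    def query_range(self, name, low, high):
        """Return keys whose index value lies in [low, high], in index order."""
        return [key for value, key in self.indexes[name] if low <= value <= high]

    def query_exact(self, name, index_value):
        """Return keys whose index value equals index_value."""
        return self.query_range(name, index_value, index_value)


store = TinyIndexedStore()
store.put("msg-1", {"opens": 3}, {"campaign_bin": "spring-sale", "sent_day_int": 20130114})
store.put("msg-2", {"opens": 0}, {"campaign_bin": "spring-sale", "sent_day_int": 20130116})
store.put("msg-3", {"opens": 7}, {"campaign_bin": "newsletter", "sent_day_int": 20130118})

print(store.query_exact("campaign_bin", "spring-sale"))        # ['msg-1', 'msg-2']
print(store.query_range("sent_day_int", 20130115, 20130120))   # ['msg-2', 'msg-3']
```

The queries return keys rather than values, so a caller would follow up with ordinary key lookups to fetch the objects themselves.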
Since launching with Riak, their cluster has never gone down and it is still as fast as ever. Based on this success, they hope to move all their email tracking statistics to Riak and eliminate MySQL entirely.
For more details on Mad Mimi’s experience with Riak, check out the case study, “Email Marketing Success with Mad Mimi and Riak.”
For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.
November 30, 2012
In September, we announced a partnership with Citrix CloudPlatform to provide integrated storage and compute capabilities. Basho and Citrix are working together on a combined platform that provides highly available cloud storage with multi-data center capabilities as part of the CloudPlatform solution for private, hybrid and public clouds.
With this month’s release of multi-datacenter features in Riak CS, we’re able to provide CloudPlatform users with highly available, multi-site cloud storage. We’re also working on authentication support for Citrix CloudPlatform in Riak CS for even more seamless integration.
“The developers of Riak have done a great job helping to extend Apache CloudStack, enabling users to use an S3-compatible object store for secondary storage,” said Citrix VP of Product Development and Apache CloudStack committer, Kevin Kluge. “We are also looking forward to having the option to use storage replication across zones as part of their Citrix CloudPlatform compatible Riak CS product.”
This weekend we’re at the CloudStack Collaboration Conference to talk to users about Riak CS and CloudStack, and we’ve got tons of free trials for Riak CS to pass out. If you’re at the Collaboration Conference, be sure to attend the technical overview presented by Basho Chief Architect Andy Gross, taking place at 10:45 AM PT in RM607. And make sure to track us down on Twitter if you’d like to talk more.
Several Basho team members will be presenting on distributed systems topics at QCon San Francisco.
SAN FRANCISCO, CA – November 7, 2012 – Attending QCon International Software Development Conference this week in San Francisco? We’d love to meet up and talk to you about Riak! You can catch us in the exhibitor’s hall all week, or at the welcome party taking place after the talks Wednesday, November 7 at Thirsty Bear. Additionally, several Basho team members will be presenting on distributed systems topics. Check out the talk synopses below. We hope to see you there.
Thursday, November 8
Riak and Dynamo, 5 Years Later
Andy Gross, Basho Chief Architect
October 2012 marks the five year anniversary of Amazon’s seminal Dynamo paper, which inspired most of the NoSQL databases that appeared shortly after its publication, including Riak. In this session, Andy will reflect on five years of involvement with Riak and distributed databases and discuss what went right, what went wrong, and what the next five years may hold for Riak as we outgrow our Dynamo roots.
Fear No More: Embrace Eventual Consistency
Sean Cribbs, Basho Software Engineer
A number of years ago, Eric Brewer, father of the CAP theorem, coined the term “BASE” for an architectural style of loosely coupled distributed systems, meaning “Basically Available, Soft-state, and Eventually-consistent”. Clearly he meant this as a counterpoint to the “ACID” properties of traditional database systems. BASE systems choose to remain available to operations, sacrificing strict synchronization. While developers are very comfortable with the convenience of ACID, eventual consistency can be frightening, unfamiliar territory.
This talk will dive into the design of eventually consistent systems, touching on theory and practice. We’ll see why EC doesn’t mean “inconsistent” but is actually a different kind of consistency, with different tradeoffs. These new skills should help developers know when to embrace eventually-consistent solutions instead of fearing them.
Friday, November 9
Dynamo: Theme and Variations
Shanley Kane, Basho Director of Product Management
The Dynamo paper, released by Amazon five years ago, laid out a set of technical “themes” for highly available, fault-tolerant distributed systems. Since then, numerous NoSQL products have been built on its core principles. These “variations,” along with recent advances in research, represent both a fascinating study in technical evolution and the forefront of the non-relational world. In this talk, we’ll cover the foundations of Dynamo (consistent hashing, vector clocks, hinted handoff, gossip protocol), advances in each area, and how querying and application development have changed as a result of them.
November 14, 2012
The video from last week’s BashoChats Meetup is ready for consumption. Rusty Sears was kind enough to join us for an overview of Stasis. Stasis is “a flexible transactional storage library that is geared toward high-performance applications and system developers.” Rusty worked on it when he was at UC Berkeley and is now doing related work as part of Microsoft’s Cloud and Information Services Lab.
The talk runs just over 30 minutes, and is well worth your time. You’ll soon realize why Eric Brewer mentioned Stasis in his RICON2012 keynote as the type of framework that will be important for the next generation of distributed systems.
Enjoy, and make sure to sign up for BashoChats. When we announce January’s speaker, you’ll be glad you did…
November 4, 2012
For those of you not familiar with Riak Core, it’s more-or-less the distributed systems infrastructure that makes up, well, the core of how Riak distributes data and scales. For some introductory reading (that’s not pure code), there’s an old but still valuable blog post on the Basho Blog that’s well worth your time.
Why a separate list? Because Core is a powerful library that can be (and is being) used to build applications distinct from the other OTP apps (kv, search, pipe, etc.) that make up Riak. I know of at least 10 companies that have Riak Core apps in production, and I’m sure there are many more just waiting to share their use cases with the world (hint hint…). Plenty of Riak issues are Core-related, and these should still be handled on the Riak Mailing List. However, as Core gets more use, there are questions, comments, and concerns that will be specific to Core, so a separate forum for these makes sense. There will be some overlap, too, and Basho will take responsibility for cross-posting when necessary.
We’ve long been convinced of the power of Core, but it has received less tooling (docs, tutorials, etc.) due to lack of engineering time. This is a great first step to helping put more community power and focus behind Core.
October 16, 2012
We are pleased to announce that Basho package repositories for Riak downloads are now available! Hopefully this makes installing Riak even easier. Here’s the summary of what’s currently available:
RHEL 5 and clones
RHEL 6 and clones
then install riak
Debian / Ubuntu:
get the signing key:
add the repository to your system:
then install Riak
This information will soon be added to the Riak Docs, but we couldn’t wait to share the good news.
Enjoy and thanks for being a part of Riak.
October 26, 2012
A few weeks back the Basho Team put on RICON2012. This was our first developer conference, and by nearly all accounts, we put on a good show. Here are a few comments from those who were at RICON:
— Sean Schade (@seanschade) October 13, 2012
— Evan Meagher (@evanm) October 11, 2012
— Dana Contreras (@DanaDanger) October 11, 2012
For more, you can browse the @basho favorites for the numerous tweets we managed to tag during RICON. A few blog posts also popped up with positive reviews (with at least one more on the way).
We’ve received more than a few inquiries asking about how we went about planning and executing RICON, so we wanted to publish something on it before too long. This post will cover (in very brief detail) the components of RICON we chose to focus on. We didn’t necessarily do anything new, and as you’ll notice a lot of the ideas were borrowed or modified.
Basho is an open source company, and our flagship project, Riak, has been out for more than three years now. During that time, we’ve built up a strong community, and today thousands of companies and organizations are using Riak in production. We discussed doing a pure Riak conference (and the Riak community is big enough to warrant such an event), but Basho’s ambitions are, quite frankly, a bit bigger than Riak, and we believe in the future of distributed systems. We also know that making distributed systems something every developer embraces and understands isn’t doable alone, so building a community around it is essential to its success.
So, sometime around the beginning of July, the decision was made to put on a conference. We announced it later that month, and got to work. We had just over three months.
Own The Venue
One of our first endeavors was searching for a venue. We knew we wanted to be in San Francisco, and wanted something intimate. Led by the efforts of Amber, we narrowed down the options to a handful and eventually settled on the W Hotel in SOMA. Their third floor has various meeting rooms, and capacity was about 300 people. We then learned that the second floor, which was primarily a bar, could be ours for both days, too, if we so desired. This would be a perfect spot for hacking and relaxing. What really sold us (aside from the amazing staff at the W) was the fact that we could own the venue for a few days. Our attendees wouldn’t be running through mazes to find tracks; or going off-site for lunch. We would be able to carve out space for RICON that would be largely untouched.
The immediate downside is that the W isn’t cheap, meaning even if we cut a deal with them for rooms, the rate would still be pretty steep. It ended up coming to just over $300/night with the RICON code. Certainly not a steal. We mitigated this with the following: ticket prices were kept low – the early bird was just $250 and the full price was just $100 more; also, San Francisco has a lot of hotels, so if you looked around there was cheaper lodging to be had.
Speaker Selection, Variety, and Composition
Keeping with the “distributed systems conference for developers” theme, we set out to find speakers that could cover the current and future state of the space. This wasn’t an easy task, but we assembled an impressive lineup of developers, engineers, executives, and academicians. One thing that should have been immediately apparent was that the focus wasn’t Riak. Of the three keynotes, two were dedicated to larger trends in distributed systems; we had talks about Postgres and Chef; “Scaling Cassandra” was one of the lightning talks.
We were also dedicated to showcasing female speakers, though we could have done better. Women were part of just under 20% of all the talks at the conference, which (anecdotally) is much higher than what most of us were used to seeing at developer events. Admittedly, we wanted this ratio to be higher, and at future RICONs we’ll push that much closer to 50/50.
On the top of our priority list was bullet-proof WiFi. There would be no complaints from RICON attendees about connectivity or bandwidth. We worked with the team at Unwired to get a 100MBit dedicated ethernet drop (which included running a line from the roof of the W down to their machine room – a tidy 31 floors). Meraki then came aboard as official WiFi sponsors and provided us with enough hardware to blanket both floors of the conference – more than 20,000 square feet – with reliable, fast internet. They wrote about it on their blog shortly after RICON concluded. Also, you can expect a full length post from Sean Carey, Seth Thomas and Ryan Carey on all the work they did to keep you connected (because it was downright awesome).
Thanks to Nimby, Artur, and the rest of the crew (and servers) at Fastly, we were able to stream the entire conference live. We did this both days, all day, and streamed out 1080p video at about 60 frames/second. By the end of the conference, the live stream had dropped only 2,086 of approximately 2,808,000 frames (more than 99.9% delivered), a testament to the quality of the streaming infrastructure available from Fastly. This was something that came together within the week preceding RICON, and we were very fortunate for it, as it enabled us to increase the impact of RICON by orders of magnitude. If you’re having a conference, stream it live, and use Fastly to do it. Please.
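As a quick sanity check on the streaming figures above, the arithmetic can be worked through in a few lines. This is just a back-of-the-envelope sketch that treats the quoted frame counts and the 60 frames/second rate as exact.

```python
dropped_frames = 2_086
total_frames = 2_808_000
frames_per_second = 60

# Share of frames that reached viewers.
delivered_pct = 100 * (total_frames - dropped_frames) / total_frames

# Total streaming time implied by the frame count at 60 frames/second.
stream_hours = total_frames / frames_per_second / 3600

print(f"{delivered_pct:.3f}% delivered over {stream_hours:.0f} hours")
# prints "99.926% delivered over 13 hours"
```

Thirteen hours is consistent with streaming two full conference days.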
Make It Accessible to Non-attendees
Live streaming was just one of various ways we made RICON accessible to those who weren’t able to attend. On the day of the event, we deployed a dedicated RICON Live site (which has since been deprecated) that included links to code and slide decks from RICON, tweets, and pun-riddled playlists for both days (in addition to the stream). From a traffic perspective, we had as many visitors connect to the RICON Live site during the conference as we did on basho.com during the entire previous month.
Making RICON accessible to others around the world was also very important to us as we are a distributed company. We believe very deeply in this approach to building companies, communities, and distributed systems, and we wanted RICON to reflect this.
Only Distribute What People Will Keep
Early on we decided to eliminate as many printed materials as possible. Conference bags and the paper products that go in them cost a lot of money and are very wasteful because not many people keep them. Instead, every attendee was given a customized hoodie wrapped in their conference pass. And we built the passes such that we could include the sponsors’ stickers therein.
Coffee (Preferably Ritual)
Good, readily available coffee is essential to keeping people excited and energized for two days of in-depth talks. We made sure that there would be freshly brewed coffee all day both days. Additionally, we brought in a two-person team from Ritual Coffee Roasters to serve espresso drinks in the hacker lounge for both days. Niley and the team at Trifork made this happen. This was a huge hit (not surprisingly) and I hope to see more events doing it as it’s not too pricey and makes for an easy-to-sell sponsorship. The cherry on the cake was that the brewed coffee in the hallways was also from Ritual.
It’s Not All About The Talks
The conference ran two days, with two tracks each day. There were 23 full length talks. That’s a tremendous amount of content. But we wanted to make sure that there was plenty of room to relax, hack, socialize, and get work done. To that end we made the entire 2nd floor of the hotel wide open and equipped it with power, food, coffee, and a few video games (including an NES, courtesy of the generous @cscotta).
Party Like It’s 1999
Night one featured a party sponsored by GitHub and Boundary. Much like with the venue, we wanted a space we could call our own, and we settled on 620 Jones. 620 Jones has the largest outdoor patio in San Francisco, and Amber had the idea to project the sponsor logos on the buildings that surrounded the patio.
We also opened the party up to non-attendees, which meant that significant others and those who weren’t able to attend but happened to be local could take part in the festivities. And, thanks to the sponsors, all the refreshments were free for the entire evening and we were able to feature a top-shelf selection.
Design What Matters; Don’t Over Do It
The micro-site for the conference was one of the first assets built for RICON. Pulling from our design elements at Basho, the team branded and formatted a simple page, highlighting speakers and playing off of our pre-existing logos, color palette and fonts. We also made sure the site was ready for mobile devices (as that’s what people use while they are walking from talk to talk).
From Day 1, the consensus was that we didn’t want to print sheets upon sheets of paper that would be discarded. The only paper printing done was isolated to the passes and involved a traditional die-cut press. We wanted to provide a practical handout (name tag, schedule etc.), along with a memento – a takeaway from the event that Sarah dreamt up, including an overlay of San Francisco, Twitter handles, an embossed Riak node, a Riak code snippet lining the inside, and the RICON Twitter hashtag.
Working with a small, intimate space, RICON design was intentionally understated, and integrated with the authentic elements of the Basho brand and the uniqueness that the W had to offer.
No Sales People And Thoughtful, Generous Sponsors
The majority of the Basho Sales team was at RICON, but you probably didn’t realize this. The culture at Basho is engineering-driven. As such, our sales people take pride in knowing just as much about the space as our users and customers do. They were at RICON but they weren’t trying to hard close anyone. Instead they were there to learn just like the rest of us.
Our amazing sponsors also helped our attendees focus by staying largely in the background and branding things like parties, lightning talks, coffee bars, and lanyards. We cut a lot of custom deals for sponsors (there was no official prospectus), and we worked with everyone who committed money and time to RICON to make their investment fit their needs.
Lightning talks are nothing new. They are usually a huge hit if you do them right. Make them informal, encourage crowd participation (in the form of light heckling and interactive Q & A), and make sure you’re serving refreshments for the duration. Tom ran the lightning talks from top to bottom and crushed it. All told there were eleven talks at about 5-7 minutes each. Nearly all RICON attendees were present for this session and enjoyed topics that ranged from “Scaling Cassandra” to “How to Demotivate Your Best Talent”, providing an excellent finale to RICON Day 1.
Don’t Skimp On The Food
The W Staff worked with us to put together custom menus for both days. The lunch was served on the 4th floor roof deck, and the seating was such that it encouraged interaction while eating. We were of the opinion that the food should be plentiful and exceed expectations for conference fare. Attendees were treated to buffet stations themed after San Francisco, and we served everything from tacos to vegan pasta to cheese and fruit plates to seared Ahi sliders.
Take Care Of Your Speakers
We tried to make it easy for the speakers to commit to being a part of RICON. We didn’t have a formal CFP (this year anyway) but instead opted to extend invites. Every speaker was of course given a free pass to RICON. If they were non-local, we paid their airfare to and from San Francisco. We also put them up at the W if they needed lodging.
Additionally, when each speaker registered, we had one person dedicated to walking them around the venue; they were given a personal tour that started with the track room they were slated to speak in, covered the entirety of the conference space, and ended with arrival at the dedicated Speaker’s Lounge.
There’s a lot we didn’t cover. This post is long-winded as it is. If you’ve got any specific questions, comments, or ideas on how we might have done things better, shoot an email to firstname.lastname@example.org. We would love to talk to you.
All told, just under 350 people registered for RICON and we sold out three times (and flirted with the fire code at the W in the process). This was a huge event for Basho as a company and for our personal growth, and nearly all of our team touched it in some way: some were speakers, some stayed late setting up the space, some hustled tickets, and others worked the Riak help desk during the event.
Most importantly, we were able to share our passion for distributed systems with developers the world over. We’re counting down the days until we get to do this again. See you at RICON2013.
Thanks for being a part of RICON.