Author Archives: Shanley

GDC Video: Riak at Rovio

May 6, 2013

At the Game Developers Conference this March, Basho Chief Architect Andy Gross and Rovio Entertainment Product Manager Timo Herttua co-presented a session on Riak at Rovio. Angry Birds developer Rovio is using Riak as the database supporting its new mobile gaming platform, including features such as payments, game state storage, and push notifications. The Croods game was the first to use this new platform and is now available on Android and iOS.

In this video, Andy and Timo discuss how Rovio uses Riak and the data storage requirements for today’s high-scale, high-performance gaming platforms. Topics covered include:

  • Why gaming companies are moving to NoSQL to meet changing business requirements
  • How to evaluate distributed systems and databases like Riak
  • Structuring applications for common gaming use cases
  • Setting up and optimizing Riak, with a look at how Rovio uses Riak on the Amazon cloud
  • Use cases for Riak at Rovio, including push notifications, user profiles, and game state
  • Lessons learned from running Riak in production

For more information on building gaming services with Riak, download the Riak on Gaming whitepaper. This video is also available on the GDC Vault. Enjoy!

Gaming with NoSQL from Basho Technologies on Vimeo.

Riak at Qeep, the Global Social Network

April 25, 2013

Qeep is a global social network, with more than 19 million registered users sending nearly five million messages each day. Qeep allows you to play games, chat with friends, send pictures, and more. They use Riak to store all user chat messages.


Qeep was founded in 2006. As they started to grow, they realized their single-instance relational database was not going to work for them anymore. Sharding was becoming a significant operational burden and high latency was preventing quick access to the users’ messages. Qeep needed a new solution that could better handle their significant growth.

They started evaluating a number of different NoSQL solutions. With over one billion keys to store and no complex querying requirements, a key/value store would provide straightforward access to user data. Ultimately, Qeep selected Riak due to its high availability, ease of scale, and predictable operational cost. Once chosen, they quickly migrated more than 1.8 billion entries of legacy data from their previous relational database installation to Riak.

Currently, Qeep uses Riak to store all user chat messages. These messages are stored as JSON objects and are accessed using the open source Riak Java Client, which offers high-performance access for fetching this data. Qeep uses additional structs to model aggregations of data such as the inbox, the outbox, and chats between two users. Qeep has made some changes to the Java client to meet their unique caching needs, which can be reviewed on GitHub.

Qeep has a 12-node cluster with 48GB of RAM per node, connected via a 1GB network. Qeep uses Riak’s LevelDB backend, which is ideal for deployments with a very large number of keys.

“We love the simplicity of administration and scaling that Riak offers,” said Ingo Rockel, Engineer at Blue Lion Mobile. “We have seen huge performance gains since switching to Riak, which has been the biggest plus for us, and has been the basis for planning future feature releases.”

For more information on Riak, sign up for our introductory webcast on Wednesday, May 1 or check out the Riak documentation.

Basho

Riak at Shopzilla

April 24, 2013

Will Gage of Shopzilla presented last week on their production Riak usage at the Santa Monica Java Users’ Group. Gage, a member of the Consumer Site Engineering team, shared details on how they built various user-facing services on Riak, why it was the right tool for the job, and when you might want to use it in production. Will’s talk starts at the 49-minute mark in the video embedded below, and it’s well worth your time. In addition to offering details on data modeling for their specific use cases, he also talks about service latencies for their production applications and how the Riak community played an important role in their decision.

Mark Phillips, Basho’s Director of Technical Evangelism, also presented. His talk starts at approximately the 1:20:00 point and is entitled Riak and the Power of Distributed Systems. An excellent complement to Will’s talk, this covers Riak’s architecture at a high level, how to access it as a developer, and then ends with a few use case discussions.

If you’re interested in more talks on Riak in production and the future of Riak, make sure to grab a ticket for RICON East, happening May 13-14 in New York City. This will be two days of talks, parties, and hacking dedicated to Riak, developers, and the future of distributed systems in production.

The Basho Team


Upcoming Basho Events – April

April 23, 2013

During the rest of April, Basho will be speaking and sponsoring events around the United States and internationally. If you want to meet up with a Basho team member at one of these events, contact us to set up a time, or send us a note on Twitter. Below are some of the highlights:

NY Tech Day: Basho will be exhibiting at NY Tech Day (April 25) in New York, a massive science fair where entrepreneurs can exhibit their startups to thousands of consumers, investors, first adopters, job seekers, major companies, press and media.

NoSQL Matters: Basho is sponsoring NoSQL Matters (April 26-27) in Cologne, Germany. Additionally, Basho engineers Sean Cribbs and Eric Redmond will be speaking about Riak technologies.

RailsConf: Basho will be attending RailsConf (April 29-May 2) in Portland. It is the largest gathering of Rails developers (and most of the time, Rubyists) in the world, drawing world-class developers and companies together to see the state of the art in Rails and web development.

Meetups: This month, we are hosting meetups in Atlanta at Atlanta Tech Village on April 23rd and in Portland on the 29th at NedSpace.

Sponsored Events: Basho will be sponsoring Railsberry in Krakow, Poland (April 23-24), GOTO Chicago in Chicago (April 23-24), and ChefConf in San Francisco (April 24-26).

We hope to see you at one of these events! For a full list of events this month and in upcoming months, visit our Events Page.

Basho

Riak CS: Building a Virtual Testing Environment

This blog post excerpts content from the Riak CS Fast Track. It walks through how to build a Riak CS environment using Vagrant and Chef.

This option for building a test environment uses a Vagrant project powered by Chef to bring up a local Riak CS cluster. Each node can run either Ubuntu 12.04 or CentOS 6.3 32-bit with 1536MB of RAM by default. If you want to tune the OS or node/memory count, you’ll have to edit the Vagrantfile directly.

Install Prerequisites

Download and install VirtualBox via the VirtualBox Downloads.
Download and install Vagrant via the Vagrant Installer.

NOTE: Please make sure to install Vagrant 1.1.0 or above.

Clone the Repository

To begin, clone the GitHub repository to your local machine and change into the cloned directory.

$ git clone https://github.com/basho/vagrant-riak-cs-cluster
$ cd vagrant-riak-cs-cluster

Launch Cluster

With VirtualBox and Vagrant installed, it’s time to actually launch our virtual environment. The command below will initiate the Vagrant project:

$ RIAK_CS_CREATE_ADMIN_USER=1 vagrant up

If you haven’t already downloaded the Ubuntu or CentOS Vagrant box, this step will download it.

Recording Admin User Credentials

In the Chef provisioning output you will see entries that look like:

[2013-03-27T11:59:12+00:00] INFO: Riak CS Key: 5N2STDSXNV-US8BWF1TH
[2013-03-27T11:59:12+00:00] INFO: Riak CS Secret: RF7WD0b3RjfMK2cTaPfLkpZGbPDaeALDtqHeMw==

Take note of these keys as they will be required in the testing step. In this case, those keys are:

Access key: 5N2STDSXNV-US8BWF1TH
Secret key: RF7WD0b3RjfMK2cTaPfLkpZGbPDaeALDtqHeMw==
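
If you would rather not copy the keys by hand, the two log lines can be pulled out of saved provisioning output with a few lines of Python. This is only a sketch: the function name and dictionary keys are made up for illustration, and it assumes the log lines keep the exact "Riak CS Key:" and "Riak CS Secret:" wording shown above.

```python
import re

# Matches the two Chef log lines shown above. The pattern assumes the
# "Riak CS Key:" / "Riak CS Secret:" wording does not change.
CREDENTIAL_LINE = re.compile(r"INFO: Riak CS (Key|Secret): (\S+)")


def extract_credentials(log_text):
    """Return a dict with 'access_key' and 'secret_key' found in log_text."""
    found = {}
    for line in log_text.splitlines():
        match = CREDENTIAL_LINE.search(line)
        if match:
            label, value = match.groups()
            name = "access_key" if label == "Key" else "secret_key"
            found[name] = value
    return found


# The sample lines are copied from the provisioning output above.
sample_output = """\
[2013-03-27T11:59:12+00:00] INFO: Riak CS Key: 5N2STDSXNV-US8BWF1TH
[2013-03-27T11:59:12+00:00] INFO: Riak CS Secret: RF7WD0b3RjfMK2cTaPfLkpZGbPDaeALDtqHeMw==
"""

credentials = extract_credentials(sample_output)
```

In practice you would pass in the text captured from your own vagrant up run instead of the sample lines.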

Next Steps

Congratulations, you have deployed a virtualized environment of Riak CS. You are ready to progress to Testing the Riak CS Installation in the Riak CS Fast Track.

Stopping Your Virtual Environment

When you are done testing, or just want to start again from scratch, you can end the current virtualized environment by typing:

$ vagrant destroy

NOTE: Executing this command will reset the environment to a clean state, removing any and all changes that you have made.

Congratulations on setting up your first Riak CS testing environment! Make sure to try the entire Riak CS Fast Track. Full documentation is available here.

Riak on Retail – Slides

February 12, 2013

Last week we gave a webcast on Riak for retail and eCommerce services. You can find the slides below. They cover use cases including shopping carts, product catalogs, mobile apps and API platforms, in addition to data modeling and querying, and an overview of Riak’s architecture and operations. We also share some user stories, including Copious, ideeli, Bump, Best Buy and OpenX. For more information, download our technical overview, Retail on Riak.

Riak Powers enStratus Cloud Management

January 15, 2013

enStratus is a cloud infrastructure management solution for deploying and managing enterprise-class applications. You can think of enStratus as the enterprise console to cloud computing – a unified solution for managing single or multi-cloud environments. enStratus uses Riak to store a combination of read-heavy and write-intensive data, including machine and state information, and data supporting analytics and audit control.

Previously, enStratus had relied on MySQL as its primary data store, but needed to provide a greater level of write availability and resilience to failure across multiple datacenters. Scaling writes in MySQL had become a bottleneck, and MySQL’s master/slave replication made master nodes a possible single point of failure.

First migrating customer and API data to Riak, enStratus successfully made the switch to Riak’s data model and eventually consistent approach, which favors availability over consistency in the event of node failure or network partition. “As I’ve looked at a number of problem domains from customers and our own systems, you see this pattern where a relational database has been used just because it’s the default… and the reality is that more of the world is eventually consistent than not,” said George Reese, CTO of enStratus.

At our developer conference RICON, we were lucky to have George speak about migrating from MySQL to Riak, enStratus’ “design for failure” architecture, and how their application is built. George also talks about challenges of moving to a non-relational system, including adjusting to the data model and migration approaches. You can view the video below, or read the full case study here.

Migrating from MySQL to Riak – George Reese, RICON2012 from Basho Technologies on Vimeo.

Want more info on moving from MySQL to Riak? Sign up for our webcast on Thursday, January 24 here or read our whitepaper on moving from relational to Riak.

Basho

Building Riak Clusters with AWS CloudFormation

January 15, 2013

Today we’re introducing an easier way to build Riak clusters on AWS using CloudFormation.

The project, cloudformation-riak, comes with three CloudFormation templates. These templates range from building a simple Riak cluster to building a VPC-based stack that includes: a front-end load balancer; a cluster of application servers with a Riak powered demo application; a backend load balancer; and a riak-cluster.

Head over to the cloudformation-riak repo to get started. We also put together a screencast (below) that shows things in action.

Enjoy.

James

Relational to Riak, Part 2: Operational Cost of Scaling

January 14, 2013

This is the second in a series of blog posts that discusses a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on Thursday, January 24.

One critical factor in deciding which database to use is its operational profile. Many customers today are dealing with rapid data growth, intense peak loads and the imperative to maintain economies of scale across a large platform. For these customers, how the database scales up and what impact that has on operations is a huge factor in business and technical decisions around what technology to use.

The cost of scale is one reason why many of our users and customers have picked Riak over a traditional relational system. From experience, users have discovered that scaling a relational system can be expensive, error-prone and lead to significant and disruptive operations projects. In this blog, we’ll take a look at how a relational database’s sharding approach differs from Riak’s consistent hashing approach and what that means for you as an operator.

Historically, relational databases were commonly found running in production on a single server. If capacity and availability needs require more than a single machine, relational databases address scale using a technique called sharding. Sharding breaks data into logical parts (such as alphabetically, numerically or by geographic region) that can be distributed across multiple machines. A simplified example is below.

Sharding

This approach can be problematic for several reasons. First, writing and maintaining sharding logic increases the overhead of operating and developing an application on the database. Significant growth of data or traffic typically means significant, often manual, resharding projects. Determining how to intelligently split the dataset without negatively impacting performance, operations, and development presents a substantial challenge, especially when dealing with “big data”, rapid scale, or peak loads. Further, rapidly growing applications frequently outpace an existing sharding scheme. When the data in a shard grows too large, the shard must again be split. While several “auto”-sharding technologies have emerged in recent years, these methods are often imprecise and manual intervention is standard practice. Finally, sharding can often lead to “hot spots” in the database (physical machines responsible for storing and serving a disproportionately high share of both data and requests), which can lead to unpredictable latency and degraded performance.
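
To make the hot spot problem concrete, here is a minimal Python sketch of alphabetical sharding. The shard names, letter ranges, and usernames are all hypothetical, and real sharding logic is usually more involved. The point is only that a hand-picked split gives no guarantee of an even load.

```python
from collections import Counter

# Hypothetical alphabetical sharding scheme: each shard owns an
# inclusive range of first letters.
SHARD_RANGES = {
    "shard-1": ("a", "h"),
    "shard-2": ("i", "q"),
    "shard-3": ("r", "z"),
}


def shard_for(username):
    """Pick the shard whose letter range contains the first character."""
    first = username[0].lower()
    for shard, (low, high) in SHARD_RANGES.items():
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard covers {username!r}")


def load_per_shard(usernames):
    """Count how many usernames land on each shard."""
    return Counter(shard_for(name) for name in usernames)


# Made-up usernames that happen to cluster early in the alphabet.
usernames = ["alice", "adam", "bob", "carol", "dave", "ingrid", "zoe"]
load = load_per_shard(usernames)
```

With these sample names, five of the seven users land on shard-1. Fixing that means choosing new ranges and moving data by hand, which is the kind of resharding project described above.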

To avoid sharding (and the associated expenses), data in Riak is distributed across nodes using consistent hashing. Consistent hashing ensures data is evenly distributed around the cluster and new nodes can be added with automatic, minimal reshuffling of data. This significantly decreases risky “hot spots” in the database and lowers the operational burden of scaling.

How does consistent hashing work? Riak stores data using a simple key/value scheme. These keys and values are stored in a namespace called a bucket. When you add new key/value pairs to a bucket in Riak, each object’s bucket and key combination is hashed. The resulting value maps onto a 160-bit integer space. You can think of this integer space as a ring used to figure out what data to put on which physical machines.

How? Riak divides the integer space into equally sized partitions (64 by default). Each partition owns a range of values on the ring and is responsible for all buckets and keys that, when hashed, fall into that range. Each partition is managed by a process called a virtual node (or “vnode”). Physical machines in the cluster evenly divide responsibility for vnodes. Each physical machine thus becomes responsible for all keys represented by its vnodes.
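
The mapping from a bucket and key to a partition and then to a machine can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Riak's implementation: SHA-1 is used because it produces a 160-bit value, the "bucket/key" byte encoding is a simplification that will not reproduce Riak's real placement, and the round-robin assignment of partitions to machines is a stand-in for how vnodes are actually claimed. All function and node names are hypothetical.

```python
import hashlib

RING_BITS = 160                               # size of the integer space
RING_SIZE = 64                                # number of partitions (the default)
PARTITION_WIDTH = 2**RING_BITS // RING_SIZE   # values covered by each partition


def ring_position(bucket, key):
    """Hash a bucket/key pair onto the 160-bit ring.

    Assumption: a plain "bucket/key" string encoding, chosen for
    readability. Riak encodes the pair differently before hashing.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_for(bucket, key):
    """Return the index (0..RING_SIZE-1) of the partition owning this key."""
    return ring_position(bucket, key) // PARTITION_WIDTH


def node_for(partition, nodes):
    """Map a partition to a physical machine.

    Assumption: simple round-robin ownership, used only for illustration.
    """
    return nodes[partition % len(nodes)]


nodes = ["node1", "node2", "node3", "node4"]
partition = partition_for("users", "alice")
owner = node_for(partition, nodes)
```

Because the hash depends only on the bucket and key, any node can compute where an object belongs without consulting a lookup table.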

Consistent Hashing

When nodes are added or removed, data is rebalanced automatically without any operator intervention. New machines assume ownership of some of the partitions and existing machines hand off relevant partitions and associated data until data ownership is equal amongst nodes. Riak also has an elegant approach to making cluster changes such as adding or removing nodes, allowing you to stage up the changes, view the impact on the cluster, and then choose to commit or abort the changes. Developers and operators don’t have to deal with the underlying complexity of what data lives where as all nodes can serve and route requests. By eliminating the manual requirements of sharding and much of the potential for “hot spots,” Riak provides a much simpler operational scenario for many users that lets them add and remove machines as needed, no matter how much they grow.
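
The handoff described above can be sketched as follows. This is not Riak's actual claim algorithm. It is a simplified model, with hypothetical function and node names, in which a new machine takes partitions from whichever existing machine currently owns the most, until ownership is roughly even. Keys never change partition here. Only the owner of some partitions changes.

```python
from collections import Counter


def initial_ownership(ring_size, nodes):
    """Assign partitions to nodes round-robin (an assumption for this sketch)."""
    return {p: nodes[p % len(nodes)] for p in range(ring_size)}


def add_node(ownership, new_node):
    """Return a new ownership map after new_node joins.

    The new node repeatedly takes one partition from the node that
    currently owns the most, until it holds its even share.
    """
    updated = dict(ownership)
    counts = Counter(updated.values())
    target = len(updated) // (len(counts) + 1)   # even share for the new node
    for _ in range(target):
        donor = max(counts, key=counts.get)      # busiest existing node
        partition = next(p for p, n in updated.items() if n == donor)
        updated[partition] = new_node
        counts[donor] -= 1
    return updated


def moved_partitions(before, after):
    """List the partitions whose owner changed."""
    return [p for p in before if before[p] != after[p]]


before = initial_ownership(64, ["node1", "node2", "node3", "node4"])
after = add_node(before, "node5")
moved = moved_partitions(before, after)
```

In this model, growing a 64-partition ring from four machines to five moves 12 partitions and leaves the other 52 where they were. That is the sense in which the reshuffling is minimal.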

Want more info on relational vs Riak approaches? Sign up for the webcast here or read our whitepaper on moving from relational to Riak.

Basho

An Introduction To Stasis With Rusty Sears

November 14, 2012

The video from last week’s BashoChats Meetup is ready for consumption. Rusty Sears was kind enough to join us for an overview of Stasis. Stasis is “a flexible transactional storage library that is geared toward high-performance applications and system developers.” Rusty worked on it when he was at UC Berkeley and is now doing related work as part of Microsoft’s Cloud and Information Services Lab.

The talk runs just over 30 minutes, and is well worth your time. You’ll soon realize why Eric Brewer mentioned Stasis in his RICON2012 keynote as the type of framework that will be important for the next generation of distributed systems.

Enjoy, and make sure to sign up for BashoChats. When we announce January’s speaker, you’ll be glad you did…

Mark