This is a cross post from compositecode.com written by Adron Hall, one of the Basho Technical Evangelists. In it he walks through one of the methods of setting up and configuring a cluster on AWS. Other options are enumerated in a post entitled “Riak on AWS – Deployment Options.”
March 14, 2013
I wanted to write up an intro to getting Riak installed on AWS. Even though the steps are absurdly simple and already available on the Basho Docs site, there are a few extra notes that can be very helpful at specific points during the process.
Start off by logging into AWS. At this point you can take two different paths that are almost identical. You can follow the path of using the pre-built AWS Marketplace image of Riak, or just start from scratch. The difference is a total of about two steps: installing Riak and opening some ports in a security group. I’m going to step through without using the prebuilt image in these instructions.
The first thing you’ll need is a security group with the correct permissions, so start by creating one.
NOTE: No, I didn’t mean to misspell Riak, but it’s in there now.
Before adding the ports, go to the security group details tab and copy the security group id. I’ve pointed it out in the image above.
Now add the following three rules, one for each of these ports: 4369, 8099, and 6000-7999. For the source of each rule, use the security group ID you just copied. Once you get all three added, the list should look like this (below). For each rule, click the Add Rule button, and remember to click Apply Rule Changes afterward. I often forget this because the screen on some of the machines I use only shows down to the bottom of the Add Rule button, so you’ll have to scroll down to find the Apply Rule Changes button.
Now add the standard port 22 for SSH. Next, set up the final two, 8087 and 8098, and we’re ready to move on to creating the virtual machines.
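For anyone who prefers a script to the console, the same six rules can be sketched with the AWS command line tool. This is a minimal sketch, not part of the original walkthrough: the group ID and the CIDR range are hypothetical placeholders, and restricting port 22 and the client ports to a CIDR range is my assumption, since the steps above don't name a source for them. The script only prints the commands so they can be reviewed before anything changes.

```shell
#!/bin/sh
# Hypothetical values: replace with your own security group ID and the
# address range that should reach SSH and the client ports.
SG_ID="${SG_ID:-sg-0123abcd}"
ADMIN_CIDR="${ADMIN_CIDR:-203.0.113.0/24}"

# Print the AWS CLI command for one ingress rule instead of running it.
# $1 = port or port range, $2 = "group" or "cidr", $3 = source value
ingress_rule() {
  case "$2" in
    group) src="--source-group $3" ;;
    cidr)  src="--cidr $3" ;;
    *)     echo "unknown source type: $2" >&2; return 1 ;;
  esac
  echo "aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol tcp --port $1 $src"
}

print_riak_rules() {
  # Intra-cluster ports: the source is the security group itself.
  for port in 4369 8099 6000-7999; do
    ingress_rule "$port" group "$SG_ID"
  done
  # SSH plus the two remaining ports (8087 and 8098).
  for port in 22 8087 8098; do
    ingress_rule "$port" cidr "$ADMIN_CIDR"
  done
}

print_riak_rules
```

Once the printed commands look right, piping the output to `sh` would apply them, assuming the AWS CLI is installed and configured with credentials.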
Server Virtual Machines
For creating virtual machines I just clicked on Launch Instance and used the classic wizard. From there you get a selection of images. I’ve used the AWS image to do this, but would actually suggest using a CentOS image of your choice or Red Hat Enterprise Linux (RHEL). Another great option is Ubuntu 12.04 LTS. Really though, use whatever Linux version or distro you like; there are 1-2 step instructions for installing Riak on almost every distro out there.
Next just launch a single instance. We’ll be able to launch duplicates of these further along in the process. I’ve selected a “Micro” here but I’m not intending to do anything with a remotely heavy load right now. At some point, I’ll upgrade this cluster to larger instances when I start putting it under a real load. I’ll have another blog entry to describe exactly how I do this too.
Continue again until you can select the security group that we created above.
Now keep hitting that continue button until you get to launch, and launch this thing. Once the instance is running, open your preferred SSH tool. The easiest way I’ve found for getting the most current private IP to connect to, along with the appropriate command, is to right click on the instance in the AWS Console and click on Connect. There you’ll find the command to connect via SSH.
Paste that into your SSH app and hit enter. You’ll see something akin to this:
$ cd Codez/working-content/
$ ssh -i riaktionz.pem root@ec2-54-245-201-97.us-west-2.compute.amazonaws.com
The authenticity of host 'ec2-54-245-201-97.us-west-2.compute.amazonaws.com (54.245.201.97)' can't be established.
RSA key fingerprint is 31:18:ac:1a:ac:fc:6e:6d:55:e8:8a:83:9a:8f:c7:5f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-54-245-201-97.us-west-2.compute.amazonaws.com,54.245.201.97' (RSA) to the list of known hosts.
Please login as the user "ubuntu" rather than the user "root".
Enter yes to continue connecting. For some images, like Ubuntu, you’ll have to make some tweaks to log in as “ubuntu” rather than “root”, and the same goes for the AWS image and others. I’ll leave it to you, dear reader, to get connected via ole’ SSH.
One of the other things that may take some tweaking and googling is figuring out the firewall setup on the various virtual machine images. For RHEL you’ll want to either turn off the firewall or open up the specific ports. Since the AWS firewall already restricts access, it isn’t particularly important for the OS to keep running its own firewall service. In this case, I’ve turned off the OS firewall and just rely on the AWS firewall. To turn off the RHEL firewall, execute the following commands.
$ service iptables save
$ service iptables stop
$ chkconfig iptables off
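If you would rather keep the OS firewall running and open only the specific ports, something like the following could work instead. Treat it as a sketch under assumptions: it assumes the stock RHEL iptables service and the same port list used for the security group, and it only prints the rules so you can check them first.

```shell
#!/bin/sh
# Ports from the security group setup; iptables writes ranges as low:high.
RIAK_PORTS="4369 8099 6000:7999 8087 8098"

# Print one ACCEPT rule per port, then the command that persists the rules.
print_iptables_rules() {
  for port in $RIAK_PORTS; do
    echo "iptables -I INPUT -p tcp --dport $port -j ACCEPT"
  done
  echo "service iptables save"
}

print_iptables_rules
```

Piping the output to `sudo sh` would apply the rules. Port 22 is left out on the assumption that the image already allows SSH.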
Now is a perfect time to start those other instances. Navigate into the AWS Console again and right click on the virtual machine instance you’ve created. On that menu select Launch More Like This.
Go through and check the configuration on each of these, make sure the firewall is turned off, etc. Then move on to the next step and install Riak and cluster them. So it’s time to get to the distributed, massively complex, extensive list of steps to install & cluster Riak. Ok, so that’s sarcasm.
Step 1: Install Riak
Install Riak on each of the instances.
# $package must already be set to the file name of the Basho release RPM for your distro
wget http://yum.basho.com/gpg/$package -O /tmp/$package &&
sudo rpm -ivh /tmp/$package
sudo yum install riak
NOTE: For other installation methods, such as directly downloading the RPM or installing on other Linux distributions, check out the documentation at http://docs.basho.com/riak/latest/tutorials/installation/Installing-on-RHEL-and-CentOS/.
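One detail worth checking after a package install: as far as I know, the packaged configuration names the node riak@127.0.0.1 and binds its listeners to the loopback address, which other nodes cannot reach. Below is a minimal sketch of a helper that swaps in the instance's private IP. The function name is hypothetical, the config paths in the comments are the RPM defaults as I understand them, and the change should be made before Riak is started for the first time.

```shell
#!/bin/sh
# Replace the default loopback address in a Riak config file.
# $1 = file to edit, $2 = address to use in place of 127.0.0.1
# A copy of the original file is kept next to it with a .bak suffix.
set_riak_address() {
  # The dots are escaped so only the literal default address is matched.
  sed -i.bak "s/127\.0\.0\.1/$2/g" "$1"
}

# Intended use on each node, as root (assumed default paths):
#   PRIVATE_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
#   set_riak_address /etc/riak/vm.args    "$PRIVATE_IP"   # node name
#   set_riak_address /etc/riak/app.config "$PRIVATE_IP"   # HTTP and PB listeners
#   riak start && riak ping                               # a healthy node answers "pong"
```

The metadata URL in the comment is the EC2 instance metadata service, which returns the private IPv4 address of the instance it is called from.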
Step 2: Setup the Cluster
On the first instance, get the IP. You won’t need to do anything to this instance, just keep the IP handy. Then move on to the second instance and run the cluster command.
sudo riak-admin cluster join riak@
Do this on each of the instances you’ve added, using the IP of that first node. When you’ve added them all, run the plan on that last instance (or really any of them). This displays what will take place when the cluster changes are committed.
sudo riak-admin cluster plan
If that all looks good, commit the plan.
sudo riak-admin cluster commit
Then check the status of the cluster.
sudo riak-admin member_status
That’s it; all done. You now have a Riak Cluster. For more operations to try out on your cluster, check out this list of basic API Operations.
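For repeat runs, the clustering steps above can be gathered into a small script. This is only a sketch: the IP address is a hypothetical placeholder for the first node's private IP, the function names are mine, and by default the script just echoes each command so nothing runs until DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical private IP of the first node. On EC2 it can be read on that
# node with: curl -s http://169.254.169.254/latest/meta-data/local-ipv4
FIRST_NODE_IP="${FIRST_NODE_IP:-10.0.0.5}"

# Echo commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run on every node except the first.
join_first_node() {
  run sudo riak-admin cluster join "riak@$FIRST_NODE_IP"
}

# Run on any one node after all the joins are staged.
review_plan() {
  run sudo riak-admin cluster plan
}

# Run on the same node once the plan looks right.
commit_plan() {
  run sudo riak-admin cluster commit &&
  run sudo riak-admin member_status
}

join_first_node
```

Keeping the plan and the commit in separate functions preserves the pause between them, so there is still a chance to read the plan before committing.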
March 13, 2013
For a complete overview, download the whitepaper, “Gaming on Riak: A Technical Introduction.” To see how other gaming companies are using Riak, visit us at the Game Developers Conference at Booth #202!
As discussed in our previous post, “Gaming on Riak: A Brief Overview and User Case Studies,” Riak can provide a number of advantages for gaming platforms. With content-agnostic storage, an HTTP API, many client libraries, and a simple key/value data model, Riak is a flexible data store that can be used for a variety of use cases in the gaming industry. This post looks at some common examples and how to start building them in Riak.
Player and Session Data: Riak can serve and store key player and session data with predictable low latency, and ensures it is available even in the event of node failure and network partition. This data may include user and profile information, game performance, statistics and rankings, and more. In Riak, all objects are stored on disk as binaries, providing flexible storage for many content types. Since Riak is schema-less, applications can evolve without changing an underlying schema, providing agility with growth and change.
Social Information: Riak can be used for social content such as social graph information, player profiles and relationships, social authentication accounts, and other types of social gaming data.
Content: Riak is often used to store text, documents, images, videos and other assets that power gaming experiences. This data often needs to be highly available and able to scale quickly to attract and keep users.
Global Data Locality: Gaming requires a low-latency experience, no matter where the players are located. Riak Enterprise’s multi-datacenter replication feature means data can be served to global users quickly.
Below are some common approaches to structuring gaming data with Riak’s key/value model:
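As a minimal sketch of the key/value approach, a player profile could be stored as a JSON document under a player ID through the HTTP API. The node address, the bucket name `players`, the key, and the JSON fields are all hypothetical, and the helper names are mine rather than anything Riak ships.

```shell
#!/bin/sh
# Hypothetical node address; 8098 is Riak's default HTTP port.
RIAK_URL="${RIAK_URL:-http://127.0.0.1:8098}"

# Build the URL of an object from its bucket ($1) and key ($2).
object_url() {
  echo "$RIAK_URL/buckets/$1/keys/$2"
}

# Store a JSON document ($3) under bucket ($1) and key ($2).
put_json() {
  curl -s -X PUT -H "Content-Type: application/json" \
    -d "$3" "$(object_url "$1" "$2")"
}

# Fetch an object back by bucket ($1) and key ($2).
get_object() {
  curl -s "$(object_url "$1" "$2")"
}

# Example, against a running node:
#   put_json players 6f1c2a '{"name":"ada","level":12}'
#   get_object players 6f1c2a
```

Because the value is opaque to Riak, the same calls work for other content types by changing the Content-Type header.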
Riak offers robust additional functionality on top of the fundamental key/value model. For more information on these options as well as how to implement them, their architecture, and their limitations, check out the documentation on searching and accessing data in Riak.
Riak Search is a distributed, full-text search engine. It provides support for various MIME types & analyzers, and robust querying including exact matches, wildcards, range queries, proximity searches, and more.
Possible Use Cases: Searching player and game information.
Secondary Indexing (2i) gives developers the ability, at write time, to tag an object stored in Riak with one or more queryable values. Indexes can be either integers or strings and can be queried by either exact matches or ranges of an index.
Possible Use Cases: Tagging player information with user relationships, general attributes, or other metadata.
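A minimal sketch of tagging and querying over HTTP follows, assuming a node at a hypothetical address and a storage backend that supports secondary indexes (to my knowledge LevelDB does, while the default Bitcask backend does not). Index names take a `_bin` suffix for strings and `_int` for integers; the bucket, index names, and values in the comments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical node address; 8098 is Riak's default HTTP port.
RIAK_URL="${RIAK_URL:-http://127.0.0.1:8098}"

# Store a JSON object and tag it with a string index at write time.
# $1 = bucket, $2 = key, $3 = index name, $4 = index value, $5 = JSON body
put_with_index() {
  curl -s -X PUT \
    -H "Content-Type: application/json" \
    -H "x-riak-index-${3}_bin: $4" \
    -d "$5" "$RIAK_URL/buckets/$1/keys/$2"
}

# URL for an exact-match query on a string index.
# $1 = bucket, $2 = index name, $3 = value
index_match_url() {
  echo "$RIAK_URL/buckets/$1/index/${2}_bin/$3"
}

# URL for a range query on an integer index.
# $1 = bucket, $2 = index name, $3 = range start, $4 = range end
index_range_url() {
  echo "$RIAK_URL/buckets/$1/index/${2}_int/$3/$4"
}

# Example, against a running node:
#   put_with_index players 6f1c2a guild red-dragons '{"name":"ada"}'
#   curl -s "$(index_match_url players guild red-dragons)"
#   curl -s "$(index_range_url players level 10 20)"
```

The queries return the matching keys, which can then be fetched individually.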
Possible Use Cases: Filtering game data by tag, counting items, and extracting links to related data.
To learn more about how your gaming platform can benefit from Riak, download “Gaming on Riak: A Technical Introduction.” For more information about Riak, sign up for our webcast on Thursday, March 14.
March 12, 2013
Riak provides low-latency, highly available storage to power gaming platforms and applications. Gaming companies use Riak to store player and game data, session and social information, and a variety of gaming content and events. This post offers a quick look at the advantages of Riak and some user case studies. Later this week, we’ll publish an in-depth look at the common gaming use cases and examples of data modeling.
For a complete overview, download the whitepaper, “Gaming on Riak: A Technical Introduction.”
Advantages of Riak
- Support for Rapid Growth: Built for operational ease-of-use, Riak yields a near-linear performance and throughput increase as capacity is added.
- Low-Latency Design: Riak is designed to store data and serve requests predictably and quickly, even during peak times.
- Flexible, Reliable Storage: Riak has a flexible data model with redundancy built-in, and a number of mechanisms to maintain availability even in the event of node failure or network partition. Riak is content-agnostic, providing flexibility for document, image, video, and other storage.
- Multi-Datacenter Replication: Riak Enterprise’s multi-datacenter replication provides disaster recovery and data locality.
Hibernum is a creator and developer of unique gaming experiences that combine the latest in social gaming, top quality visuals and animations, and cutting edge design. They switched from a relational database to Riak due to its high availability, ability to scale to peak loads, and predictable operational cost. Riak is used to store user game information for one of their most popular social games. For more information about how Hibernum uses Riak, check out the complete case study.
Kiip is a platform that lets brands provide rewards to mobile gamers for in-game achievements. Kiip replaced MongoDB with Riak in order to achieve low read/write latencies and horizontal scalability. Kiip uses Riak for session and device data. To learn more about Kiip’s experience selecting Riak, check out this video by two of their engineers.
To learn more about how your gaming platform can benefit from Riak, download “Gaming on Riak: A Technical Introduction.” For more information about Riak, sign up for our webcast on Thursday, March 14.
March 11, 2013
Nearly every day this month, we will be speaking at conferences, hosting meetups, and sponsoring events. For a full list of events, visit our Events Page. If you want to meet up with a Basho team member at one of these events, contact us to set up a time. Below are some of the highlights:
GigaOM Structure: Basho will be speaking at two different sessions at GigaOM Structure (March 20-21) in New York. Come hear Basho CTO, Justin Sheehy, and Technical Evangelist, Tom Santero, speak, stop by our booth, or attend our cocktail reception on March 20th.
Game Developer Conference 2013: Basho Chief Architect, Andy Gross, will be speaking at the Game Developer Conference at a session titled “Gaming on NoSQL: Building Available, Fast Services with Riak.” GDC will be held March 25-29 in San Francisco. Check out our session and booth to learn more about how gaming platforms can use Riak.
Meetups: This month, we are hosting a number of meetups all over the country. If you’re in Austin, come visit us at BlackLocus on March 11th. If you’re in Seattle, visit us on March 13th at Blue Box Group. If you’re in Chicago, visit us on March 14th at Braintree, and if you’re in Boston, check us out on March 27th at Basho’s Cambridge office. We’ll also be at Riot Games in LA on March 19th and in Portland on March 28th at NedSpace.
Sponsored Events: Basho will be sponsoring Erlang Factory 2013 in San Francisco (March 18-22), Clojure/West in Portland (March 18-20), Open Analytics Summit in Arlington, VA (March 25), and Monitorama in Boston (March 28-29).
Hope to see you soon!
March 7, 2013
We are excited to announce that we have a new board on our Pinterest all about getting started with Riak! We have included a variety of tools to help you start running Riak, learn about Riak’s design, and find Riak events near you. This board is also a great way to find key videos, blog posts, academic papers, and other content that can be useful when learning about Riak and distributed systems.
We will be constantly adding new material, so follow us to stay up-to-date on everything Riak. Check it out and let us know if there’s anything you’d like to see added!
March 5, 2013
Mobile platforms need to provide always available, low-latency experiences that can scale to millions of users and support highly concurrent access. Riak’s redundant and fault-tolerant design ensures mobile data can be served quickly and reliably, and Riak is run in production by many popular mobile applications. For a full overview, check out the whitepaper “Mobile on Riak: A Technical Introduction.” Below are a few key mobile use cases and basic approaches to modeling them in Riak:
User Data: Storing user accounts, profile information, and events is a common use case for Riak. Mobile apps often store this data in JSON documents, using a UUID or other identifier as the key. Data can be queried through Riak features such as secondary indexes, MapReduce, and full-text search.
Session Data: Since session IDs are commonly stored in cookies, or otherwise known at lookup time, they are a natural fit for Riak’s key/value model, and Riak can serve these requests with predictably low latency. Session data can also be encoded in many different ways and evolve without any administrative changes to schema.
Text & Multimedia Storage: Since Riak is content agnostic, mobile platforms can easily store a variety of different types of data, including audio, text, photos, video, etc. to power mobile experiences.
Social Authentication: Many mobile applications have users sign in via their Facebook or Twitter accounts. Riak’s key/value scheme makes it easy to store both registered accounts and the tokens that make it possible for users to authenticate with their social accounts.
Global Data Locality: Riak Enterprise’s multi-datacenter capabilities mean mobile data can be stored in physical proximity to users and served with low latency no matter where they happen to be.
Here is a chart with possible ways these applications and services can be modeled using Riak’s key/value design. Of course, your application should be structured in a way appropriate to its access and query patterns, among other factors – this is just to get you started. For more information on designing applications with Riak, check out our documentation.
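To make the session case concrete, here is a minimal sketch of a session lifecycle over the HTTP API, keyed by the session ID. The node address and the bucket name `sessions` are hypothetical, and the helpers are illustrative rather than part of any Riak client.

```shell
#!/bin/sh
# Hypothetical node address; 8098 is Riak's default HTTP port.
RIAK_URL="${RIAK_URL:-http://127.0.0.1:8098}"

# URL of the session object for a given session ID ($1).
session_url() {
  echo "$RIAK_URL/buckets/sessions/keys/$1"
}

# Save session data. $1 = session ID, $2 = content type, $3 = payload
save_session() {
  curl -s -X PUT -H "Content-Type: $2" --data-binary "$3" "$(session_url "$1")"
}

# Look a session up by the ID taken from the cookie ($1).
load_session() {
  curl -s "$(session_url "$1")"
}

# Remove the session ($1) on logout.
end_session() {
  curl -s -X DELETE "$(session_url "$1")"
}

# Example, against a running node:
#   save_session 9c1e77 application/json '{"user":"u-42","cart":[]}'
#   load_session 9c1e77
#   end_session 9c1e77
```

Passing the content type as an argument reflects the point above that session data can be encoded in different ways without any schema change.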
To learn more about how mobile platforms can use Riak for their data needs, check out the complete overview, “Mobile on Riak: A Technical Introduction.” For more details about Riak and the latest 1.3 release, sign up for our webcast on March 7th.
Copious is a social marketplace that, behind the scenes, blends data models at the storage level to fit specific use cases and features. Each of these comes with its own constraints and operational challenges. Activity streams, social context, product search, and purchase processing span the spectrum from absolute consistency requirements to “any response is better than no response.”
In this talk, Rob Zuber, Co-Founder at Copious, talks about why they put Riak into production and where it fits in their total data store strategy. Rob has a very healthy, humorous, and deep understanding of data stores and systems environments and it all shows in this presentation. This talk is well worth watching for anyone interested in how Riak fits social use cases and how to approach choosing the right tool for the job. He also has some one-of-a-kind insight into starting companies and building products with a small team that is not to be overlooked.
February 28, 2013
In the last post, we looked at how Riak Enterprise’s multi-datacenter replication can be configured for backups and data locality. In this post, we examine two other common implementations: availability zones and public cloud use cases. For more information on Riak Enterprise architecture and configuration, download the complete whitepaper.
Availability zones provide efficient multi-datacenter replication and data redundancy within a geographic region (such as a coast or a country). In this configuration, data is replicated within an availability zone’s series of datacenters. In the event that one of the datacenters experiences an outage or serious failure, data can still be served from other datacenters within the same region.
One approach to setting this up is to have a “primary” site in a region where all reads and writes for specific users, applications, or data sets are directed. This primary cluster can then be replicated to one or more proximal secondary clusters. In other approaches, data can be replicated in real-time from one cluster to both another datacenter and other cold backups maintained for emergency conditions. The right approach is highly dependent on the requirements of users, availability, expense of bandwidth, and other constraints.
Public Cloud Use Cases
Riak is designed to be easy to use and operate on public clouds, and is partnered with many of the leading cloud providers, including Amazon Web Services, Microsoft Azure, and Joyent. Hosted Riak is also available from Engine Yard and Riak packages can always be manually installed on any physical or virtual provider, even if a machine image isn’t explicitly supported.
There are several use cases for Riak Enterprise’s multi-datacenter replication in the public cloud. Many enterprises want to maintain a cold or hot backup of their cluster in a public cloud for business continuity in the event of a datacenter outage in their private infrastructure. For other customers, the public cloud can provide a more cost-effective way of meeting peak loads, rather than building out private infrastructure to accommodate them year-round. For example, many retailers and media providers need to offer increased capacity over the holiday season. Riak Enterprise is used to scale out capacity on public clouds over these periods, either with full-sync or real-time sync depending on the business needs.
Finally, some enterprises run certain applications or services entirely on public clouds. Riak Enterprise allows for redundancy and data locality across public cloud availability zones for this use case, ensuring optimal performance and resiliency.
February 26, 2013
Mobile platforms and applications pose unique infrastructure challenges for today’s companies. These applications require low-latency, always-available small object storage that can scale to millions or more users, and support highly concurrent access and traffic spikes.
Riak provides a number of benefits for these platforms, including:
- Low-Latency Data Storage: Riak is designed to serve predictable, low-latency requests to provide a fast, available experience to all users.
- Straightforward Data Model: Riak uses a simple key-value data model, which is ideal for storing and serving mobile content, user information, events, and session data. Riak is content agnostic, so there are no restrictions on content type.
- Accommodates Peak Loads Gracefully: To handle increasing user data and accommodate peak loads during events, Riak makes it easy to add additional capacity and scale out quickly. Riak automatically rebalances data when new nodes are added, while its consistent hashing methodology prevents hot spots in the database.
- Multi-Datacenter Replication: Riak Enterprise’s multi-datacenter replication allows mobile platforms to serve low-latency content to users all over the world by maintaining a global data footprint.
For a full overview, download our new whitepaper on building mobile services with Riak.
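The consistent hashing idea mentioned in the list above can be illustrated with a few lines of shell. This is a simplified sketch, not Riak's actual implementation: Riak hashes an Erlang-encoded bucket/key pair onto a 160-bit ring, so the partition numbers printed here will not match a real cluster. The ring size of 64 is Riak's default as far as I know, and the bucket and key names are made up.

```shell
#!/bin/sh
# Number of partitions the ring is divided into (assumed default).
RING_SIZE=64

# Hex SHA-1 digest of stdin, using whichever tool is installed.
sha1_hex() {
  if command -v sha1sum >/dev/null 2>&1; then
    sha1sum
  else
    shasum -a 1
  fi | cut -c1-40
}

# Map a bucket ($1) and key ($2) to a partition index in [0, RING_SIZE).
partition_for() {
  # The first 8 hex digits are the top 32 bits of the hash.
  hex=$(printf '%s/%s' "$1" "$2" | sha1_hex | cut -c1-8)
  # Scale the 32-bit value onto the ring: (value * RING_SIZE) / 2^32.
  echo $(( (0x$hex * RING_SIZE) >> 32 ))
}

for key in player-1 player-2 player-3; do
  echo "$key -> partition $(partition_for players "$key")"
done
```

Because the hash output is effectively uniform, keys spread across partitions without regard to their names, which is the property that keeps any one node from becoming a hot spot.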
Bump is a popular mobile app that makes it easy for users to share their contact information, photos, and other objects by simply “bumping” their smartphones. They use Riak to store user data and currently run 25 nodes of Riak storing about 3TB of data.
For more details about how Bump uses Riak and how they designed their application, check out Bump’s presentation at RICON2012, Basho’s 2012 developer conference. You can also read the complete case study for more information about why Bump chose Riak.
Voxer is a popular Walkie Talkie application for smartphones that allows users to send instant voice messages to one or more friends. They switched to Riak due to its fault-tolerance and ability to scale quickly and easily. They currently run more than 50 machines on Riak to support their huge growth and user base. For more details about how Voxer uses Riak, check out the complete case study and watch Matt Ranney’s talk at a Riak Meetup in San Francisco.
To learn more about how mobile platforms can use Riak for their data needs, check out the complete overview, “Mobile on Riak: A Technical Introduction.”
February 24, 2013
Recently, Basho engineer Eric Redmond published “A Little Riak Book.” This book is available free for download at littleriakbook.com and provides a great overview of Riak, including how to think about a distributed system compared to more traditional databases.
The book starts with a discussion on concepts. Since Riak is a distributed NoSQL database, it requires developers to approach problems differently than they would with a relational database. The concepts section describes the differences between various NoSQL systems, takes an in-depth look at Riak’s key/value data model, and describes how Riak is designed for high availability (as well as how it handles eventual consistency constraints). After laying the theoretical groundwork, the book walks developers through how to use Riak by explaining the different querying options and showing them how to tinker with settings to meet different use case needs. Finally, it covers the basic details that operators should know, such as how to set up a Riak cluster, configure values, use optional tools, and more.
After finishing the book, start playing around with Riak to see if it’s the right fit for your needs. You can download Riak on our Docs Page.