January 30, 2013
Many teams run Riak in public cloud environments, either as a part of their infrastructure or as the foundation of it. Increasingly, we see enterprises and startups use a hybrid implementation that leverages both private infrastructure and public cloud services. This hybrid model is often used to address burst capacity issues, tenancy/location concerns, and simple proof-of-concept implementations prior to hardware acquisition.
Over the past few years, we have seen substantive adoption of Riak on Amazon Web Services. To that end, we are pleased that Basho has been approved as an Amazon Web Services Technology Partner. We look forward to highlighting interesting use cases, publishing detailed case studies of usage, and continuing to improve the usability and deployment speed of Riak on the AWS platform.
This post provides a high-level overview of your deployment options for using Riak on Amazon.
How Many Nodes?
Before we discuss the mechanics of implementation, it is important to consider the size of your deployment. One of the most frequent questions Basho is asked is, “How many nodes should I start with?”
If you have played with the Riak Fast Track you are familiar with deploying three nodes on a single machine. However, for production deployments, we recommend that your cluster be setup with a minimum of five nodes. For more details on how this minimum ensures the performance and availability of your implementation, please read the post entitled: Why Your Riak Cluster Should Have At Least Five Nodes.
So, you have a minimum of five nodes and you’ve decided that leveraging a cloud provider is appropriate for your current business needs. Now, how do you get started?
Amazon Machine Image
At its simplest, an Amazon Machine Image (AMI) is a pre-built machine image and configuration of Riak for Amazon EC2 users.
Obtaining and configuring the image is a relatively straightforward process. However, since Riak needs the nodes in the cluster to communicate with each other, there is some manual setup involved.
First, provision the Riak AMI onto the server of your choice via the AWS marketplace.
Once the virtual machine is created, manually configure the EC2 security group to allow the Riak nodes to speak to each other. The details of this step can be found on our docs portal under Installing on AWS Marketplace. However, this is generally as simple as opening a few inbound ports and defining a “Custom TCP rule.”
At this point, the machines can be clustered together. When the individual virtual machines are provisioned and the security group is configured, simply SSH into each machine and use internal riak-admin tools to join the nodes to the cluster.
But what if you want to automate some of the configuration of your cluster? Or, what if you want the ability to setup a VPC-based stack that includes:
- a front-end load balancer,
- a cluster of application servers,
- a Riak powered demo application,
- a back-end load balancer,
- and a cluster of Riak servers.
In that case, the Basho team has made available scripts that leverage AWS CloudFormation to build out your cluster in a scripted fashion.
Since this is a much different process than the previous method, it is well worth watching the introductory video (embedded below). In addition, the scripts in the cloudformation-riak repo can be thought of as “known good” templates. We accept Pull Requests and happy forking!
As always, there is a manual option.
If you need to control the system configuration or are most comfortable with software that you have built and deployed yourself, there is always the option to install from package or source.
For a full list of supported operating systems, check out the Installing and Upgrading page of the doc portal. In addition, we have recently launched a new download page that includes the source for the OSS version of Riak.
And easier to deploy than ever before. If you have feedback on present deployment alternatives, or recommendations on ways to make Riak support for cloud infrastructure easier, please drop us a note in the mailing list.
January 29, 2013
This is the first in a series of blog posts covering the benefits Riak offers to developers and operators of retail and eCommerce platforms. To learn more, join our “Retail on Riak” webcast on Friday, February 8th.
As retailers grow and have to store more and more data, traditional relational databases aren’t always the best option. Retailers want to scale easily, without the operational burden of manual sharding. Meanwhile, business requirements demand their data is always available for reads and writes. Riak is a highly available, low latency distributed database that is ideal for retailers who need to serve product data quickly and maintain “always on” shopping experiences. Riak is based on architectural principles from Amazon. Riak is designed for high availability and scale so retailers can always serve customers, even under failure conditions, and rapidly grow to meet peak loads.
Retailers of all sizes have chosen Riak to power parts of their business, including:
- Best Buy: Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part in the transformation push to re-platform Best Buy’s eCommerce platform. For more info, check out Best Buy’s talk from our 2012 developer conference, RICON.
- ideal: ideel is one of the fastest growing retailers with over 5 million members and more than 1,000 brand partners. They use Riak to serve HTML documents and user-specific products. ideel chose Riak to power their event-based shopping experience due to Riak’s ability to serve users information at low latency and provide ease of use and scale to ideel’s operations team. Check out the complete case study for more details.
- Copious: Copious is a social commerce marketplace that makes it easy for people to buy and sell the things they love. They currently store all registered accounts in Riak as well as the tokens that make it possible for users to authenticate with Copious via their Facebook or Twitter accounts. They chose to use Riak for their social login functionality because of its operational simplicity, which allows them to easily scale up without sharding and provides the high availability required for a smooth user experience. For more details, check out the complete Copious story on our blog.
For more information about the benefits of Riak for retailers and the retailers already using it, register for our “Retail on Riak” webcast on February 8th!
January 28, 2013
Copious is a social commerce marketplace that makes it easy for people to buy and sell the things they love. Copious uses social data from Facebook and Twitter to customize the shopping and selling experience for each individual user around their interests, taste, and style. This connects buyers and sellers to each other in a unique way that also presents exciting technical challenges.
In the summer of 2012, Copious decided to switch to Riak from MongoDB for their social login functionality. They currently store all registered accounts in Riak as well as the tokens that make it possible for users to authenticate with Copious via their Facebook or Twitter accounts. Copious now stores hundreds of thousands of keys in Riak.
Copious found that Riak’s key/value data model was a natural fit for their authentication scheme and that it was easy to set up and run on Amazon Web Services. They run an HTTP service in front of Riak that presents data in the appropriate form and handles events between the front-end application and the backend data storage.
“Operating in a cloud environment, our infrastructure must be resilient to failure. Sometimes machines or even entire availability zones just disappear, and Riak’s fault-tolerant design means we remain available despite these failures. Riak is one component we never worry about, and requires much less operational work than other datastores,” said Robert Zuber, co-founder of Copious.
Copious is one of the many companies now using a polyglot approach to their platform – using Riak alongside MySQL, Redis, SOLR, and other technologies depending on the problem they need to solve. They chose to use Riak for their social login functionality because of its operational simplicity, which allows them to easily scale up without sharding and provides the high availability required for a smooth user experience. Based on their success with Riak, Zuber has said that they are looking to move more data to Riak in the future.
You can sign up now on Copious to earn $10 off your first $20+ purchase, plus the added bonus of knowing that Riak is in the background helping the site do its magic. If you’re in San Francisco, make sure to attend our Riak meetup on Wednesday, February 13, where Copious will be presenting how they use Riak alongside other datastores.
To learn more about how Riak can benefit your eCommerce or retail platform, join our webcast, “Riak on Retail” on February 8th!
January 25, 2013
Today we’re excited to introduce early access of Riak on Engine Yard! You can also learn more on the Engine Yard blog. With Riak on Engine Yard, you can deploy a Riak cluster as simply as defining some configuration values and clicking “Add Cluster.”
A common theme, in several of our recent blog posts, has been Basho’s key focus on ease of deployment. We excel in making highly available, low latency, distributed systems. Engine Yard’s strengths lie in providing a hardened and secure Platform as a Service where you can manage your entire platform while retaining control of the environment. In addition, Engine Yard is well known for its contributions to the Ruby, PHP, and Node.js communities. The introduction of Riak on Engine Yard further validates customer demand for reliable and easy to use cloud solutions.
If you were at Ricon2012, you were probably one of the many who attended a talk entitled “Riak in the Cloud.” If you were unable to attend, you missed an amazing session where Ines Sombra and Michael Broadhead from Engine Yard spoke about their experiences with Riak and deploying it in the cloud. It’s great to see the lessons of distributed systems that were discussed translated into reality.
We look forward to seeing what the Basho community builds using Riak on Engine Yard. Get started now with 500 hours for free on their platform.
January 24, 2013
Alert Logic, the industry-leading Security-as-a-Service provider and protector of customer infrastructure and data, uses Riak to help manage their massive amount of data collection needs and to support their rapid business growth.
Recently named a Leader in Emerging Managed Security Services by Forrester Research, Alert Logic helps companies defend against security threats and address compliance mandates, such as PCI and HIPAA. Alert Logic’s Security solutions include intrusion detection, web application security, log management and vulnerability assessment, coupled with 24×7 monitoring and expert guidance services. Alert Logic is used by dozens of the world’s largest hosting service providers.
With the help of Riak, Alert Logic collects and processes machine data and uses this information to perform real-time analytics, detect anomalies, ensure compliance and proactively respond to threats. Alert Logic introduced Riak in 2012 to support the development of a new analytics infrastructure, and ultimately replace an existing MySQL system that could not support the anticipated increase in workload.
The new analytics infrastructure performs statistical and correlation processing on all data collected from Alert Logic’s products – including log messages, network intrusion detection events, and NetFlow data – processing approximately 5 TB/day. All of this data is processed in real-time as it streams in from over 2,000 customers, 5,000 appliances, and hundreds of thousands of data sources on customer networks. The data grows more than 50% a year, outpacing revenue growth of 40%.
Today, Alert Logic’s analytics infrastructure, powered by Riak, achieves performance results of up to 35,000 operations/second across each node in the cluster – performance that eclipses the existing MySQL deployment by a large margin on single node performance. In real business terms, the initial deployment of the combination of Riak and the analytic infrastructure has allowed Alert Logic to process in real-time 7,500 reports, which previously took 12 hours of dedicated processing every night. In addition, Alert Logic’s expert security analysts’ benefited as well, by gaining increased functionality and efficiency.
Alert Logic uses Riak Enterprise advanced replication technology to deploy clusters that can handle different priority workloads. This frees up Alert Logic’s primary cluster to ensure it is always available to receive and write customer-specific analytic data, even during times requiring extreme scale. Other Riak clusters will provide data mining and reporting that are critical to Alert Logic’s solutions.
“Alert Logic depends on the reliable processing of massive amounts of machine data and turning that into actionable information,” said Paul Fisher, Director of Platform Services, at Alert Logic. “Our security operations center depends on this information for analysis to detect and respond to real-time security incidents that occur on our customers networks. We selected Riak for scalability and fault-tolerance, and it continues to be a vital component helping ensure that the Alert Logic Platform can scale to keep up with our rapid growth.”
Alert Logic plans to accelerate development of its real-time analytical capabilities in 2013, and expand the presence of Riak as a foundational technology throughout Alert Logic’s solutions. On deck next is the replacement of the largest existing MySQL workload at Alert Logic, which today sustains 9,000 queries per second, and peaks at over 20,000.
Basho plans to announce the inaugural Houston Riak meet-up featuring the Platform Services team at Alert Logic shortly. Stay tuned.
January 22, 2013
Traditionally, most retailers have used relational databases to manage their platforms and eCommerce sites. However, with the rapid growth of data and business requirements for high availability and scale, more retailers are looking at non-relational solutions like Riak.
Riak is a masterless, distributed database that provides retailers with high read and write availability, fault-tolerance and the ability to grow with low operational cost. Architectural, operational and development benefits for retailers include:
- “Always On” Shopping Experience: Based on architectural principles from Amazon, Riak is designed to favor data availability, even in the event of hardware failure or network partition. For retailers, failure to accept additions to a shopping cart, or serve product information quickly, has a direct and negative impact on revenue. Riak is architected to ensure the system can always accept writes and serve reads at low-latency.
- Resilient Infrastructure: At scale, hardware malfunction, network partition, and other failure modes are inevitable. Riak provides a number of mechanisms to ensure that retail infrastructure is resilient to failure. Data is replicated automatically within the cluster so nodes can go down but the system still responds to requests. This ensures read and write availability, even in serious failure conditions.
- Low-Latency Data Storage: Many retailers now operate online and mobile experiences with an API or data services platform. In order to provide a fast and available experience to end users, Riak is designed to serve predictable, low-latency requests as part of a service-oriented infrastructure and is accessible via HTTP API, protocol buffers, or Riak’s many client libraries.
- Scale to Peak Loads with Low Operational Cost: During major holidays and other periods of peak load, retailers may have to significantly increase their database capacity quickly. When new nodes are added, Riak automatically distributes data evenly to naturally prevent hot spots in the database, and yields a near-linear increase in performance and throughput when capacity is added.
- Global Data Locality and Redundancy: Riak Enterprise’s multi-site replication allows replication of data to multiple data centers, providing both a global data footprint and the ability to survive datacenter failure.
Top retailers using Riak include Best Buy and ideel. Best Buy selected Riak as an integral part in the transformation push to re-platform its eCommerce platform. For more information about how Best Buy is using Riak, check out this video.
ideel uses Riak to serve HTML documents and user-specific products. ideel chose Riak to provide its highly available, event-based shopping experience – Riak gives them the ability to serve user information at low latency and provides ease of use and scale to ideel’s operations team. For more information on ideel’s use of Riak check out the complete case study.
Common use cases for Riak in the retail/eCommerce space include shopping carts (due to Riak’s “always-on” capabilities), product catalogs (Riak is well suited for the storage of rapidly growing content that needs to be served at low-latency), API platforms (Riak’s flexible, schemaless design allows for rapid application development), and mobile applications (Riak is ideal for powering mobile experiences across platforms due to its low-latency, always-available small object storage capabilities).
To help retailers evaluate and adopt Riak, we’ve published a technical overview: “Retail on Riak: A Technical Introduction.” We discuss more in-depth information on modeling applications for common use cases, switching from a relational architecture, querying, multi-site replication and more.
January 22, 2013
Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part in the transformation push to re-platform Best Buy’s eCommerce platform. Riak’s architecture has helped Best Buy build and operate its new platform with Riak playing a key role.
At our developer conference, Ricon, we were lucky to have Joel Crabb, Director of Web Architecture at BestBuy.com, talk about how Riak fits into their pattern of innovation and moving from a traditional relational architecture.
To learn more about how retailers can use Riak for their eCommerce needs, check out the whitepaper, “Retail on Riak: A Technical Introduction.” For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.
January 21, 2013
Mad Mimi is an email marketing service that allows users to create, send, and track email campaigns without using templates. With over 100,000 clients, Mad Mimi is storing a large amount of data that needs to be accessed quickly and easily.
In 2011, Mad Mimi realized that their data was growing beyond the capacity of their MySQL database. Rather than resharding the data, which would require an extensive operational effort, Mad Mimi decided to try Riak based on its ability to scale quickly and easily without manual sharding.
Mad Mimi now uses Riak to track email statistics, leveraging the secondary indexing feature to make retrieving data easier. Secondary indexing allows users to attach additional key/value data to Riak objects and query them by exact match or range value. Mad Mimi is currently running an 8 node cluster storing between three and five billion keys, adding between 10-20 million keys each day.
Since launching with Riak, their cluster has never gone down and it is still as fast as ever. Based on this success, they hope to move all their email tracking statistics to Riak and eliminate MySQL entirely.
For more details on Mad Mimi’s experience with Riak, check out the case study, “Email Marketing Success with Mad Mimi and Riak.”
For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.
January 15, 2013
enStratus is a cloud infrastructure management solution for deploying and managing enterprise-class applications. You can think of enStratus as the enterprise console to cloud computing – a unified solution for managing single or multi-cloud environments. enStratus uses Riak to store a combination of read-heavy and write-intensive data, including machine and state information, and data supporting analytics and audit control.
Previously, enStratus had relied on MySQL as its primary data store, but needed to provide a greater level of write availability and resilience to failure across multiple datacenters. Scaling writes in MySQL had become a bottleneck, and MySQL’s master/slave replication made master nodes a possible single point of failure.
First migrating customer and API data to Riak, enStratus successfully made the switch to Riak’s data model and eventually consistent approach, which favors availability over consistency in the event of node failure or network partition. “As I’ve looked at a number of problem domains from customers and our own systems, you see this pattern where a relational database has been used just because it’s the default… and the reality is that more of the world is eventually consistent than not,” said George Reese, CTO of enStratus.
At our developer conference Ricon, we were lucky to have George speak about migrating from MySQL to Riak, enStratus’ “design for failure” architecture, and how their application is built. George also talks about challenges of moving to a non-relational system, including adjusting to the data model and migration approaches. You can view the video below, or read the full case study here.
Want more info on moving from MySQL to Riak? Sign up for our webcast on Thursday, January 24 here or read our whitepaper on moving from relational to Riak.
January 15, 2013
Today we’re introducing an easier way to build Riak clusters on AWS using CloudFormation.
The project, cloudformation-riak, comes with three CloudFormation templates. These templates range from building a simple Riak cluster to building a VPC-based stack that includes: a front-end load balancer; a cluster of application servers with a Riak powered demo application; a backend load balancer; and a riak-cluster.
Head over to the cloudformation-riak repo to get started. We also put together a screencast (below) that shows things in action.