January 25, 2013
Today we’re excited to introduce early access to Riak on Engine Yard! You can also learn more on the Engine Yard blog. With Riak on Engine Yard, deploying a Riak cluster is as simple as defining some configuration values and clicking “Add Cluster.”
A common theme in several of our recent blog posts has been Basho’s focus on ease of deployment. We excel in making highly available, low-latency distributed systems. Engine Yard’s strengths lie in providing a hardened and secure Platform as a Service where you can manage your entire platform while retaining control of the environment. In addition, Engine Yard is well known for its contributions to the Ruby, PHP, and Node.js communities. The introduction of Riak on Engine Yard further validates customer demand for reliable and easy-to-use cloud solutions.
If you were at Ricon2012, you were probably one of the many who attended a talk entitled “Riak in the Cloud.” If you were unable to attend, you missed an amazing session where Ines Sombra and Michael Broadhead from Engine Yard spoke about their experiences with Riak and deploying it in the cloud. It’s great to see the lessons of distributed systems that were discussed translated into reality.
We look forward to seeing what the Basho community builds using Riak on Engine Yard. Get started now with 500 hours for free on their platform.
January 24, 2013
Alert Logic, the industry-leading Security-as-a-Service provider and protector of customer infrastructure and data, uses Riak to help manage their massive amount of data collection needs and to support their rapid business growth.
Recently named a Leader in Emerging Managed Security Services by Forrester Research, Alert Logic helps companies defend against security threats and address compliance mandates, such as PCI and HIPAA. Alert Logic’s Security solutions include intrusion detection, web application security, log management and vulnerability assessment, coupled with 24×7 monitoring and expert guidance services. Alert Logic is used by dozens of the world’s largest hosting service providers.
With the help of Riak, Alert Logic collects and processes machine data and uses this information to perform real-time analytics, detect anomalies, ensure compliance and proactively respond to threats. Alert Logic introduced Riak in 2012 to support the development of a new analytics infrastructure, and ultimately replace an existing MySQL system that could not support the anticipated increase in workload.
The new analytics infrastructure performs statistical and correlation processing on all data collected from Alert Logic’s products – including log messages, network intrusion detection events, and NetFlow data – processing approximately 5 TB/day. All of this data is processed in real-time as it streams in from over 2,000 customers, 5,000 appliances, and hundreds of thousands of data sources on customer networks. The data grows more than 50% a year, outpacing revenue growth of 40%.
Today, Alert Logic’s analytics infrastructure, powered by Riak, achieves performance results of up to 35,000 operations/second across each node in the cluster, eclipsing the single-node performance of the existing MySQL deployment by a large margin. In real business terms, the initial deployment of Riak and the analytics infrastructure has allowed Alert Logic to process 7,500 reports in real time, work that previously took 12 hours of dedicated processing every night. Alert Logic’s expert security analysts also benefited, gaining increased functionality and efficiency.
Alert Logic uses Riak Enterprise advanced replication technology to deploy clusters that can handle different priority workloads. This frees up Alert Logic’s primary cluster to ensure it is always available to receive and write customer-specific analytic data, even during times requiring extreme scale. Other Riak clusters will provide data mining and reporting that are critical to Alert Logic’s solutions.
“Alert Logic depends on the reliable processing of massive amounts of machine data and turning that into actionable information,” said Paul Fisher, Director of Platform Services at Alert Logic. “Our security operations center depends on this information for analysis to detect and respond to real-time security incidents that occur on our customers’ networks. We selected Riak for scalability and fault-tolerance, and it continues to be a vital component helping ensure that the Alert Logic Platform can scale to keep up with our rapid growth.”
Alert Logic plans to accelerate development of its real-time analytical capabilities in 2013, and expand the presence of Riak as a foundational technology throughout Alert Logic’s solutions. On deck next is the replacement of the largest existing MySQL workload at Alert Logic, which today sustains 9,000 queries per second, and peaks at over 20,000.
Basho plans to announce the inaugural Houston Riak meet-up featuring the Platform Services team at Alert Logic shortly. Stay tuned.
January 22, 2013
Traditionally, most retailers have used relational databases to manage their platforms and eCommerce sites. However, with the rapid growth of data and business requirements for high availability and scale, more retailers are looking at non-relational solutions like Riak.
Riak is a masterless, distributed database that provides retailers with high read and write availability, fault-tolerance and the ability to grow with low operational cost. Architectural, operational and development benefits for retailers include:
- “Always On” Shopping Experience: Based on architectural principles from Amazon, Riak is designed to favor data availability, even in the event of hardware failure or network partition. For retailers, failure to accept additions to a shopping cart, or serve product information quickly, has a direct and negative impact on revenue. Riak is architected to ensure the system can always accept writes and serve reads at low-latency.
- Resilient Infrastructure: At scale, hardware malfunction, network partition, and other failure modes are inevitable. Riak provides a number of mechanisms to ensure that retail infrastructure is resilient to failure. Data is replicated automatically within the cluster so nodes can go down but the system still responds to requests. This ensures read and write availability, even in serious failure conditions.
- Low-Latency Data Storage: Many retailers now operate online and mobile experiences with an API or data services platform. In order to provide a fast and available experience to end users, Riak is designed to serve predictable, low-latency requests as part of a service-oriented infrastructure and is accessible via HTTP API, protocol buffers, or Riak’s many client libraries.
- Scale to Peak Loads with Low Operational Cost: During major holidays and other periods of peak load, retailers may have to significantly increase their database capacity quickly. When new nodes are added, Riak automatically distributes data evenly to naturally prevent hot spots in the database, and yields a near-linear increase in performance and throughput when capacity is added.
- Global Data Locality and Redundancy: Riak Enterprise’s multi-site replication allows replication of data to multiple data centers, providing both a global data footprint and the ability to survive datacenter failure.
Top retailers using Riak include Best Buy and ideel. Best Buy selected Riak as an integral part in the transformation push to re-platform its eCommerce platform. For more information about how Best Buy is using Riak, check out this video.
ideel uses Riak to serve HTML documents and user-specific products. ideel chose Riak to provide its highly available, event-based shopping experience – Riak gives them the ability to serve user information at low latency and provides ease of use and scale to ideel’s operations team. For more information on ideel’s use of Riak check out the complete case study.
Common use cases for Riak in the retail/eCommerce space include shopping carts (due to Riak’s “always-on” capabilities), product catalogs (Riak is well suited for the storage of rapidly growing content that needs to be served at low-latency), API platforms (Riak’s flexible, schemaless design allows for rapid application development), and mobile applications (Riak is ideal for powering mobile experiences across platforms due to its low-latency, always-available small object storage capabilities).
To help retailers evaluate and adopt Riak, we’ve published a technical overview: “Retail on Riak: A Technical Introduction.” We discuss more in-depth information on modeling applications for common use cases, switching from a relational architecture, querying, multi-site replication and more.
January 22, 2013
Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part of the transformation of Best Buy’s eCommerce platform, and its architecture has helped Best Buy build and operate the new platform.
At our developer conference, Ricon, we were lucky to have Joel Crabb, Director of Web Architecture at BestBuy.com, talk about how Riak fits into their pattern of innovation and moving from a traditional relational architecture.
To learn more about how retailers can use Riak for their eCommerce needs, check out the whitepaper, “Retail on Riak: A Technical Introduction.” For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.
January 21, 2013
Mad Mimi is an email marketing service that allows users to create, send, and track email campaigns without using templates. With over 100,000 clients, Mad Mimi is storing a large amount of data that needs to be accessed quickly and easily.
In 2011, Mad Mimi realized that their data was growing beyond the capacity of their MySQL database. Rather than resharding the data, which would require an extensive operational effort, Mad Mimi decided to try Riak based on its ability to scale quickly and easily without manual sharding.
Mad Mimi now uses Riak to track email statistics, leveraging the secondary indexing feature to make retrieving data easier. Secondary indexing allows users to attach additional key/value data to Riak objects and query them by exact match or range value. Mad Mimi is currently running an 8-node cluster storing between three and five billion keys, adding between 10 and 20 million keys each day.
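To make the idea of exact-match and range queries concrete, here is a minimal in-memory sketch. It is a toy model, not Riak’s implementation or client API: the `IndexedStore` class, its methods, and the index names `campaign_bin` and `sent_int` are all hypothetical, chosen only to echo the idea of tagging objects with extra index entries.

```python
from collections import defaultdict


class IndexedStore:
    """Toy key/value store with secondary indexes (illustrative only)."""

    def __init__(self):
        self.objects = {}
        # index name -> index value -> set of object keys
        self.indexes = defaultdict(lambda: defaultdict(set))

    def put(self, key, value, indexes=None):
        """Store an object and tag it with optional index entries.

        Simplification: overwriting a key does not remove old index entries.
        """
        self.objects[key] = value
        for name, index_value in (indexes or {}).items():
            self.indexes[name][index_value].add(key)

    def query_exact(self, name, value):
        """Return keys whose index entry equals the given value."""
        return sorted(self.indexes[name].get(value, set()))

    def query_range(self, name, low, high):
        """Return keys whose index entry lies in [low, high]."""
        return sorted(
            key
            for index_value, keys in self.indexes[name].items()
            if low <= index_value <= high
            for key in keys
        )


store = IndexedStore()
store.put("stat-1", {"opens": 3}, {"campaign_bin": "spring-sale", "sent_int": 20130110})
store.put("stat-2", {"opens": 7}, {"campaign_bin": "spring-sale", "sent_int": 20130115})
store.put("stat-3", {"opens": 1}, {"campaign_bin": "newsletter", "sent_int": 20130120})

print(store.query_exact("campaign_bin", "spring-sale"))    # ['stat-1', 'stat-2']
print(store.query_range("sent_int", 20130112, 20130121))   # ['stat-2', 'stat-3']
```

The point of the sketch is the access pattern: objects are still fetched by key, but the index lets an application find the right keys without scanning everything.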
Since launching with Riak, their cluster has never gone down and it is still as fast as ever. Based on this success, they hope to move all their email tracking statistics to Riak and eliminate MySQL entirely.
For more details on Mad Mimi’s experience with Riak, check out the case study, “Email Marketing Success with Mad Mimi and Riak.”
For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.
January 15, 2013
enStratus is a cloud infrastructure management solution for deploying and managing enterprise-class applications. You can think of enStratus as the enterprise console to cloud computing – a unified solution for managing single or multi-cloud environments. enStratus uses Riak to store a combination of read-heavy and write-intensive data, including machine and state information, and data supporting analytics and audit control.
Previously, enStratus had relied on MySQL as its primary data store, but needed to provide a greater level of write availability and resilience to failure across multiple datacenters. Scaling writes in MySQL had become a bottleneck, and MySQL’s master/slave replication made master nodes a possible single point of failure.
First migrating customer and API data to Riak, enStratus successfully made the switch to Riak’s data model and eventually consistent approach, which favors availability over consistency in the event of node failure or network partition. “As I’ve looked at a number of problem domains from customers and our own systems, you see this pattern where a relational database has been used just because it’s the default… and the reality is that more of the world is eventually consistent than not,” said George Reese, CTO of enStratus.
At our developer conference Ricon, we were lucky to have George speak about migrating from MySQL to Riak, enStratus’ “design for failure” architecture, and how their application is built. George also talks about challenges of moving to a non-relational system, including adjusting to the data model and migration approaches. You can view the video below, or read the full case study here.
Want more info on moving from MySQL to Riak? Sign up for our webcast on Thursday, January 24 here or read our whitepaper on moving from relational to Riak.
January 15, 2013
Today we’re introducing an easier way to build Riak clusters on AWS using CloudFormation.
The project, cloudformation-riak, comes with three CloudFormation templates. These templates range from building a simple Riak cluster to building a VPC-based stack that includes: a front-end load balancer; a cluster of application servers with a Riak powered demo application; a backend load balancer; and a riak-cluster.
Head over to the cloudformation-riak repo to get started. We also put together a screencast (below) that shows things in action.
January 14, 2013
This is the second in a series of blog posts that discusses a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on Thursday, January 24.
One critical factor in deciding which database to use is its operational profile. Many customers today are dealing with rapid data growth, intense peak loads and the imperative to maintain economies of scale across a large platform. For these customers, how the database scales up and what impact that has on operations is a huge factor in business and technical decisions around what technology to use.
The cost of scale is one reason why many of our users and customers have picked Riak over a traditional relational system. From experience, users have discovered that scaling a relational system can be expensive and error-prone, and can lead to significant and disruptive operations projects. In this post, we’ll take a look at how a relational database’s sharding approach differs from Riak’s consistent hashing approach and what that means for you as an operator.
Historically, relational databases were commonly found running in production on a single server. If capacity and availability needs require more than a single machine, relational databases address scale using a technique called sharding. Sharding breaks data into logical parts (such as alphabetically, numerically or by geographic region) that can be distributed across multiple machines. A simplified example is below.
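As a rough illustration of alphabetical sharding, the sketch below routes usernames to shards by first letter. The shard names, letter ranges, and sample usernames are made up for the example and do not describe any particular database.

```python
from collections import Counter

# Hypothetical shard map: each shard owns an inclusive range of first letters.
SHARD_RANGES = {
    "shard-1": ("a", "h"),
    "shard-2": ("i", "q"),
    "shard-3": ("r", "z"),
}


def shard_for(username: str) -> str:
    """Return the shard whose letter range contains the username's first letter."""
    first = username[0].lower()
    for shard, (low, high) in SHARD_RANGES.items():
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard covers {username!r}")


users = ["alice", "adam", "anna", "bob", "carol", "dave", "ian", "zoe"]
load = Counter(shard_for(user) for user in users)

print(load["shard-1"], load["shard-2"], load["shard-3"])  # 6 1 1
```

Even in this tiny sample, one shard ends up holding most of the data because real keys are rarely spread evenly across the chosen ranges, and fixing that means editing the shard map and moving data by hand.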
This approach can be problematic for several reasons. First, writing and maintaining sharding logic increases the overhead of operating and developing an application on the database. Significant growth of data or traffic typically means significant, often manual, resharding projects. Determining how to intelligently split the dataset without negatively impacting performance, operations, and development presents a substantial challenge, especially when dealing with “big data”, rapid scale, or peak loads. Further, rapidly growing applications frequently outpace an existing sharding scheme. When the data in a shard grows too large, the shard must again be split. While several “auto”-sharding technologies have emerged in recent years, these methods are often imprecise and manual intervention is standard practice. Finally, sharding can often lead to “hot spots” in the database – physical machines responsible for storing and serving a disproportionately high amount of both data and requests – which can lead to unpredictable latency and degraded performance.
To avoid sharding (and the associated expenses), data in Riak is distributed across nodes using consistent hashing. Consistent hashing ensures data is evenly distributed around the cluster and new nodes can be added with automatic, minimal reshuffling of data. This significantly decreases risky “hot spots” in the database and lowers the operational burden of scaling.
How does consistent hashing work? Riak stores data using a simple key/value scheme. These keys and values are stored in a namespace called a bucket. When you add new key/value pairs to a bucket in Riak, each object’s bucket and key combination is hashed. The resulting value maps onto a 160-bit integer space. You can think of this integer space as a ring used to figure out what data to put on which physical machines.
How? Riak divides the integer space into equally-sized partitions (default is 64). Each partition owns the given range of values on the ring, and is responsible for all buckets and keys that, when hashed, fall into that range. Each partition is managed by a process called a virtual node (or “vnode”). Physical machines in the cluster evenly divide responsibility for vnodes. Each physical machine thus becomes responsible for all keys represented by its vnodes.
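The ring described above can be sketched in a few lines. This is a simplified model rather than Riak’s actual code: it hashes a plain `bucket/key` string with SHA-1 (chosen because it produces 160-bit values), maps the result to a partition by integer division, and hands partitions to machines round-robin. The node names, bucket, and key are hypothetical.

```python
import hashlib
from collections import Counter

RING_SIZE = 2 ** 160                      # size of the 160-bit integer space
NUM_PARTITIONS = 64                       # default partition count mentioned above
PARTITION_SIZE = RING_SIZE // NUM_PARTITIONS


def ring_position(bucket: str, key: str) -> int:
    """Hash the bucket/key pair onto the 160-bit ring (simplified encoding)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_for(bucket: str, key: str) -> int:
    """Return the index of the equally sized partition that covers the hash."""
    return ring_position(bucket, key) // PARTITION_SIZE


def assign_vnodes(nodes):
    """Spread partitions (vnodes) across physical machines round-robin."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


nodes = ["node-a", "node-b", "node-c", "node-d"]
owners = assign_vnodes(nodes)

partition = partition_for("carts", "user-1234")
print(f"carts/user-1234 -> partition {partition} on {owners[partition]}")
print(Counter(owners.values()))  # each of the 4 nodes owns 16 partitions
```

Because the hash output is effectively uniform, keys spread evenly over the partitions regardless of how the keys themselves are named, which is what removes the need for a hand-written shard map.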
When nodes are added or removed, data is rebalanced automatically without any operator intervention. New machines assume ownership of some of the partitions and existing machines hand off relevant partitions and associated data until data ownership is equal among nodes. Riak also has an elegant approach to making cluster changes such as adding or removing nodes: you can stage the changes, view the impact on the cluster, and then choose to commit or abort them. Developers and operators don’t have to deal with the underlying complexity of what data lives where, as all nodes can serve and route requests. By eliminating the manual requirements of sharding and much of the potential for “hot spots,” Riak provides a much simpler operational scenario for many users that lets them add and remove machines as needed, no matter how much they grow.
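A small simulation shows why adding a machine moves only a fraction of the data. The claim logic here is a deliberately naive stand-in, not Riak’s real partition-claim algorithm: the new node simply takes partitions from existing owners until ownership is roughly even. Node names are hypothetical.

```python
from collections import Counter

NUM_PARTITIONS = 64


def initial_ownership(nodes):
    """Assign partitions to nodes round-robin."""
    return [nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)]


def add_node(ownership, new_node):
    """Return a new ownership list after new_node claims its fair share.

    Naive stand-in for a claim algorithm: walk the ring and take partitions
    from any owner that holds more than the target share.
    """
    ownership = list(ownership)
    counts = Counter(ownership)
    counts[new_node] = 0
    target = NUM_PARTITIONS // len(counts)
    for p, owner in enumerate(ownership):
        if counts[new_node] >= target:
            break
        if counts[owner] > target:
            ownership[p] = new_node
            counts[owner] -= 1
            counts[new_node] += 1
    return ownership


before = initial_ownership(["node-a", "node-b", "node-c", "node-d"])
after = add_node(before, "node-e")

moved = sum(1 for old, new in zip(before, after) if old != new)
print(f"{moved} of {NUM_PARTITIONS} partitions changed owner")  # 12 of 64
print(Counter(after))
```

Only the partitions handed to the new node move; the other 52 stay where they were. Computing `after` without applying it is also a loose analogy for staging a change and reviewing its impact before committing.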
January 10, 2013
This is the first in a series of blog posts that discusses a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on January 24.
One of the biggest differences between Riak and relational systems is the focus on availability and how the underlying architecture deals with failure modes.
Most relational databases leverage a master/slave architecture to replicate data. This approach usually means the master coordinates all write operations, working with the slave nodes to update data. If the master node fails, the database will reject write operations until the failure is resolved – often involving failover or leader election – to maintain correctness. This can result in a window of write unavailability.
Conversely, Riak uses a masterless system with no single point of failure, meaning any node can serve read or write requests. If a node experiences an outage, other nodes can continue to accept read and write requests. Additionally, if a node fails or becomes unavailable to the rest of the cluster due to a network partition, a neighboring node will take over responsibilities for the unavailable node. Once this node becomes available again, the neighboring node will pass along any updates through a process called “hinted handoff.” This is another way that Riak maintains availability and resilience even during serious failure.
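The handoff idea can be modeled with two toy nodes. This is an illustrative sketch only: the `Node` class and the `write` and `hand_off` functions are hypothetical and greatly simplify what a real cluster does.

```python
class Node:
    """Toy cluster member (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value for data this node owns
        self.hints = {}   # intended owner's name -> {key: value} held on its behalf


def write(owner, fallback, key, value):
    """Write to the owner, or park the write on the fallback if the owner is down."""
    if owner.up:
        owner.data[key] = value
        return owner.name
    fallback.hints.setdefault(owner.name, {})[key] = value
    return fallback.name


def hand_off(fallback, owner):
    """Deliver parked writes once the owner is reachable; return how many moved."""
    if not owner.up:
        return 0
    pending = fallback.hints.pop(owner.name, {})
    owner.data.update(pending)
    return len(pending)


node_a, node_b = Node("node-a"), Node("node-b")

node_a.up = False                                        # node-a fails
print(write(node_a, node_b, "cart:42", ["book"]))        # node-b accepts the write

node_a.up = True                                         # node-a recovers
print(hand_off(node_b, node_a))                          # 1
print(node_a.data)                                       # {'cart:42': ['book']}
```

The write succeeds while the owner is down, and the owner catches up afterwards, which is the availability property the paragraph above describes.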
Because Riak allows reads and writes even when multiple nodes are unavailable, and uses an eventually consistent design to maintain availability, in rare cases different replicas may contain different versions of an object. This can occur if multiple clients update the same piece of data at the same time, or if nodes are down or slow to respond. These conflicts affect only a small fraction of operations, but are important to know about. Riak has a number of mechanisms for detecting and resolving these conflicts when they occur. For more on how Riak achieves availability and the tradeoffs involved, see our documentation on the subject.
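One mechanism Riak uses for detecting such conflicts is the vector clock. The sketch below is a generic, minimal version of the technique, not Riak’s implementation; the function names and client identifiers are assumptions made for illustration.

```python
def increment(clock, actor):
    """Return a copy of the clock with the actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify the relationship between two versions of an object."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"


base = increment({}, "client-x")          # first write: {'client-x': 1}
v1 = increment(base, "client-x")          # client-x updates again
v2 = increment(base, "client-y")          # client-y updates the same base version

print(compare(v1, base))  # a is newer
print(compare(v1, v2))    # conflict
```

When neither version descends from the other, the updates were concurrent, and the two values have to be reconciled, either by the application or by a resolution policy.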
For many use cases today, high availability and fault tolerance are critical to the user experience and the company’s revenue. Unavailability has a negative impact on your revenue, damages user trust and leads to a poor user experience. For use cases such as online retail, shopping carts, advertising, social and mobile platforms or anything with critical data needs, high availability is key and Riak may be the right choice.
January 9, 2013
Synacor’s TV Everywhere platform enables cable, satellite, consumer electronics and telco companies to stream content and programming to any device, anytime. TV Everywhere also provides innovative search, discovery and recommendation solutions combined with deep social media integration.
Synacor TV Everywhere uses Riak as object storage for video clips, news stories and other content. Originally using a relational solution as their primary datastore, API response times had started to slow as they continued to add more assets. After evaluating several possible solutions, they chose to move to Riak due to its low latency and Synacor’s high availability requirements.
Riak Enterprise has been deployed in multiple Synacor datacenters and has improved the API response time significantly since its integration. Synacor now stores over 5 million assets with thousands being added daily. According to Michael Collins, Synacor’s Senior Director of Engineering, “Riak has never been the source of a bottleneck for us. It’s been great.”
For more details, check out the complete case study, “TV Everywhere with Synacor and Riak.”