Tag Archives: Riak

Riak for Retail and eCommerce Platforms

January 22, 2013

Traditionally, most retailers have used relational databases to manage their platforms and eCommerce sites. However, with the rapid growth of data and business requirements for high availability and scale, more retailers are looking at non-relational solutions like Riak.

Riak is a masterless, distributed database that provides retailers with high read and write availability, fault-tolerance and the ability to grow with low operational cost. Architectural, operational and development benefits for retailers include:

  • “Always On” Shopping Experience: Based on architectural principles from Amazon, Riak is designed to favor data availability, even in the event of hardware failure or network partition. For retailers, failure to accept additions to a shopping cart, or to serve product information quickly, has a direct and negative impact on revenue. Riak is architected so that the system can always accept writes and serve reads at low latency.
  • Resilient Infrastructure: At scale, hardware malfunction, network partition, and other failure modes are inevitable. Riak provides a number of mechanisms to ensure that retail infrastructure is resilient to failure. Data is replicated automatically within the cluster so nodes can go down but the system still responds to requests. This ensures read and write availability, even in serious failure conditions.
  • Low-Latency Data Storage: Many retailers now operate online and mobile experiences with an API or data services platform. To provide a fast and available experience to end users, Riak is designed to serve requests with predictable, low latency as part of a service-oriented infrastructure, and is accessible via its HTTP API, Protocol Buffers, or Riak’s many client libraries.
  • Scale to Peak Loads with Low Operational Cost: During major holidays and other periods of peak load, retailers may have to significantly increase their database capacity quickly. When new nodes are added, Riak automatically distributes data evenly to naturally prevent hot spots in the database, and yields a near-linear increase in performance and throughput when capacity is added.
  • Global Data Locality and Redundancy: Riak Enterprise’s multi-site replication allows replication of data to multiple data centers, providing both a global data footprint and the ability to survive datacenter failure.
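
As a rough illustration of the HTTP access mentioned in the list above, the sketch below builds (but does not send) requests that store and fetch a shopping cart. The host, the bucket name "carts", the key and the cart contents are all assumptions made up for this example; the /buckets/<bucket>/keys/<key> path follows Riak's HTTP API, and 8098 is the usual default HTTP port.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumed local node; adjust host and port for your own cluster.
RIAK_URL = "http://127.0.0.1:8098"

def object_url(bucket, key):
    """Build the URL of one object in Riak's HTTP API."""
    return f"{RIAK_URL}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"

def store_request(bucket, key, value):
    """Prepare (but do not send) a PUT that stores `value` as JSON."""
    body = json.dumps(value).encode("utf-8")
    return Request(object_url(bucket, key), data=body, method="PUT",
                   headers={"Content-Type": "application/json"})

def fetch_request(bucket, key):
    """Prepare (but do not send) a GET for the same object."""
    return Request(object_url(bucket, key), method="GET")

# Hypothetical cart for a hypothetical user.
cart = {"items": [{"sku": "tv-42", "qty": 1}]}
put = store_request("carts", "user-1001", cart)
get = fetch_request("carts", "user-1001")
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

Passing either request to urllib.request.urlopen would send it to a running node. In practice most applications would use one of the client libraries rather than raw HTTP.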

Top retailers using Riak include Best Buy and ideeli. Best Buy selected Riak as an integral part of its push to re-platform its eCommerce site. For more information about how Best Buy is using Riak, check out this video.

ideeli uses Riak to serve HTML documents and user-specific products. ideeli chose Riak to provide its highly available, event-based shopping experience – Riak gives them the ability to serve user information at low latency and provides ease of use and scale to ideeli’s operations team. For more information on ideeli’s use of Riak check out the complete case study.

Common use cases for Riak in the retail/eCommerce space include shopping carts (due to Riak’s “always-on” capabilities), product catalogs (Riak is well suited to storing rapidly growing content that needs to be served at low latency), API platforms (Riak’s flexible, schemaless design allows for rapid application development), and mobile applications (Riak is ideal for powering mobile experiences across platforms due to its low-latency, always-available small object storage capabilities).

To help retailers evaluate and adopt Riak, we’ve published a technical overview: “Retail on Riak: A Technical Introduction.” We discuss more in-depth information on modeling applications for common use cases, switching from a relational architecture, querying, multi-site replication and more.

Basho

A Focus on Innovation- Best Buy and Riak

January 22, 2013

Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part of Best Buy’s push to re-platform its eCommerce site, and Riak’s architecture has helped Best Buy build and operate the new platform.

At our developer conference, Ricon, we were lucky to have Joel Crabb, Director of Web Architecture at BestBuy.com, talk about how Riak fits into their pattern of innovation and moving from a traditional relational architecture.


Pattern of Innovation: Riak Usage at BestBuy.com – Joel Crabb, RICON2012 from Basho Technologies on Vimeo.

To learn more about how retailers can use Riak for their eCommerce needs, check out the whitepaper, “Retail on Riak: A Technical Introduction.” For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.

Basho

How Mad Mimi Uses Riak to Power Their Email Marketing Service

January 21, 2013

Mad Mimi is an email marketing service that allows users to create, send, and track email campaigns without using templates. With over 100,000 clients, Mad Mimi is storing a large amount of data that needs to be accessed quickly and easily.

In 2011, Mad Mimi realized that their data was growing beyond the capacity of their MySQL database. Rather than resharding the data, which would require an extensive operational effort, Mad Mimi decided to try Riak based on its ability to scale quickly and easily without manual sharding.

Mad Mimi now uses Riak to track email statistics, leveraging the secondary indexing feature to make retrieving data easier. Secondary indexing allows users to attach additional key/value data to Riak objects and query them by exact match or range value. Mad Mimi is currently running an 8-node cluster storing between three and five billion keys, adding between 10 and 20 million keys each day.
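
To make the exact-match and range queries concrete, here is a minimal sketch of how secondary indexes look over Riak's HTTP API: index entries are attached as x-riak-index-* headers when an object is stored, and queried through the bucket's /index/ path. The bucket name "email_stats", the index names and the values are invented for illustration and are not Mad Mimi's actual schema.

```python
from urllib.parse import quote

RIAK_URL = "http://127.0.0.1:8098"  # assumed local node

def index_headers(indexes):
    """Turn index entries into the headers sent along with a stored object.

    Index names end in _bin (string values) or _int (integer values).
    """
    return {f"x-riak-index-{name}": str(value) for name, value in indexes.items()}

def exact_match_url(bucket, index, value):
    """URL that lists keys whose index entry equals `value`."""
    return f"{RIAK_URL}/buckets/{quote(bucket)}/index/{quote(index)}/{quote(str(value))}"

def range_url(bucket, index, start, end):
    """URL that lists keys whose index entry falls between `start` and `end`."""
    return f"{exact_match_url(bucket, index, start)}/{quote(str(end))}"

# Hypothetical index entries for one email-open event.
headers = index_headers({"campaign_bin": "spring-sale", "sent_int": 20130121})
by_campaign = exact_match_url("email_stats", "campaign_bin", "spring-sale")
by_date = range_url("email_stats", "sent_int", 20130101, 20130121)
print(headers)
print(by_campaign)
print(by_date)
```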

Since launching with Riak, their cluster has never gone down and it is still as fast as ever. Based on this success, they hope to move all their email tracking statistics to Riak and eliminate MySQL entirely.

For more details on Mad Mimi’s experience with Riak, check out the case study, “Email Marketing Success with Mad Mimi and Riak.”

For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.

Basho

Riak Powers enStratus Cloud Management

January 15, 2013

enStratus is a cloud infrastructure management solution for deploying and managing enterprise-class applications. You can think of enStratus as the enterprise console to cloud computing – a unified solution for managing single or multi-cloud environments. enStratus uses Riak to store a combination of read-heavy and write-intensive data, including machine and state information, and data supporting analytics and audit control.

Previously, enStratus had relied on MySQL as its primary data store, but needed to provide a greater level of write availability and resilience to failure across multiple datacenters. Scaling writes in MySQL had become a bottleneck, and MySQL’s master/slave replication made master nodes a possible single point of failure.

First migrating customer and API data to Riak, enStratus successfully made the switch to Riak’s data model and eventually consistent approach, which favors availability over consistency in the event of node failure or network partition. “As I’ve looked at a number of problem domains from customers and our own systems, you see this pattern where a relational database has been used just because it’s the default… and the reality is that more of the world is eventually consistent than not,” said George Reese, CTO of enStratus.

At our developer conference Ricon, we were lucky to have George speak about migrating from MySQL to Riak, enStratus’ “design for failure” architecture, and how their application is built. George also talks about challenges of moving to a non-relational system, including adjusting to the data model and migration approaches. You can view the video below, or read the full case study here.

Migrating from MySQL to Riak – George Reese, RICON2012 from Basho Technologies on Vimeo.

Want more info on moving from MySQL to Riak? Sign up for our webcast on Thursday, January 24 here or read our whitepaper on moving from relational to Riak.

Basho

Building Riak Clusters with AWS CloudFormation

January 15, 2013

Today we’re introducing an easier way to build Riak clusters on AWS using CloudFormation.

The project, cloudformation-riak, comes with three CloudFormation templates. These templates range from building a simple Riak cluster to building a VPC-based stack that includes a front-end load balancer, a cluster of application servers with a Riak-powered demo application, a backend load balancer, and a Riak cluster.

Head over to the cloudformation-riak repo to get started. We also put together a screencast (below) that shows things in action.

Enjoy.

James

Relational to Riak, Part 2- Operational Cost of Scaling

January 14, 2013

This is the second in a series of blog posts that give a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on Thursday, January 24.

One critical factor in deciding which database to use is its operational profile. Many customers today are dealing with rapid data growth, intense peak loads and the imperative to maintain economies of scale across a large platform. For these customers, how the database scales up and what impact that has on operations is a huge factor in business and technical decisions around what technology to use.

The cost of scale is one reason why many of our users and customers have picked Riak over a traditional relational system. From experience, users have discovered that scaling a relational system can be expensive and error-prone, and can lead to significant and disruptive operations projects. In this post, we’ll take a look at how a relational database’s sharding approach differs from Riak’s consistent hashing approach and what that means for you as an operator.

Historically, relational databases were commonly run in production on a single server. When capacity and availability needs exceed what a single machine can provide, relational databases address scale using a technique called sharding. Sharding breaks data into logical parts (such as alphabetically, numerically or by geographic region) that can be distributed across multiple machines. A simplified example is below.

Sharding

This approach can be problematic for several reasons. First, writing and maintaining sharding logic increases the overhead of operating and developing an application on the database. Significant growth of data or traffic typically means significant, often manual, resharding projects. Determining how to split the dataset intelligently without hurting performance, operations, and development is a substantial challenge, especially when dealing with “big data”, rapid scale, or peak loads. Further, rapidly growing applications frequently outpace an existing sharding scheme: when the data in a shard grows too large, the shard must be split again. While several “auto”-sharding technologies have emerged in recent years, these methods are often imprecise and manual intervention is standard practice. Finally, sharding can often lead to “hot spots” in the database – physical machines responsible for storing and serving a disproportionately large share of both data and requests – which can lead to unpredictable latency and degraded performance.
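
The hot-spot problem is easy to reproduce with a toy example. The sketch below routes records to three shards by the first letter of a username; the shard ranges and the usernames are invented purely for illustration.

```python
from collections import Counter

# Hypothetical alphabetical scheme: three shards split by first letter.
SHARD_RANGES = [("a", "h"), ("i", "q"), ("r", "z")]

def shard_for(username):
    """Return the index of the shard whose letter range covers the username."""
    first = username[0].lower()
    for shard, (low, high) in enumerate(SHARD_RANGES):
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard covers {username!r}")

# Made-up usernames that happen to cluster at the start of the alphabet.
users = ["alice", "adam", "bob", "carol", "dave", "erin", "frank", "ivan", "sam"]
load = Counter(shard_for(u) for u in users)
print(load)  # shard 0 holds 7 of the 9 users
```

Fixing the skew means choosing new ranges and moving existing rows between machines, which is the manual resharding work described above. Note also that a username starting with a digit is not covered by any range, so the routing logic itself needs maintenance as the data changes.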

To avoid sharding (and the associated expenses), data in Riak is distributed across nodes using consistent hashing. Consistent hashing ensures data is evenly distributed around the cluster and new nodes can be added with automatic, minimal reshuffling of data. This significantly reduces the risk of “hot spots” in the database and lowers the operational burden of scaling.

How does consistent hashing work? Riak stores data using a simple key/value scheme. These keys and values are stored in a namespace called a bucket. When you add new key/value pairs to a bucket in Riak, each object’s bucket and key combination is hashed. The resulting value maps onto a 160-bit integer space. You can think of this integer space as a ring used to figure out what data to put on which physical machines.

How? Riak divides the integer space into equally sized partitions (the default is 64). Each partition owns a given range of values on the ring, and is responsible for all buckets and keys that, when hashed, fall into that range. Each partition is managed by a process called a virtual node (or “vnode”). Physical machines in the cluster evenly divide responsibility for vnodes. Each physical machine thus becomes responsible for all keys represented by its vnodes.
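
The mapping from bucket and key to partition to machine can be sketched in a few lines. This is a simplified model, not Riak's implementation: it hashes the string "bucket/key" with SHA-1 (chosen because it yields 160-bit values), whereas Riak hashes its own encoding of the pair, and its rule for which partition owns a given hash and which node claims a given partition differs in detail. The node names are hypothetical.

```python
import hashlib

RING_SIZE = 2 ** 160                 # size of the 160-bit integer space
NUM_PARTITIONS = 64                  # the default mentioned above
PARTITION_SIZE = RING_SIZE // NUM_PARTITIONS

def ring_position(bucket, key):
    """Hash a bucket/key pair onto the ring (simplified encoding)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def partition_for(bucket, key):
    """Index of the equally sized partition the hash falls into."""
    return ring_position(bucket, key) // PARTITION_SIZE

def assign_vnodes(nodes):
    """Deal partitions out round-robin so each node owns an equal share."""
    return [nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)]

nodes = ["node1", "node2", "node3", "node4"]   # hypothetical node names
owners = assign_vnodes(nodes)
partition = partition_for("carts", "user-1001")
print(f"partition {partition} is served by {owners[partition]}")
```

Because the hash spreads keys roughly uniformly over the ring, each of the four nodes ends up responsible for 16 partitions and about a quarter of the keys, without any application-level routing logic.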

Consistent Hashing

When nodes are added or removed, data is rebalanced automatically without any operator intervention. New machines assume ownership of some of the partitions, and existing machines hand off the relevant partitions and associated data until ownership is spread evenly among the nodes. Riak also has an elegant approach to making cluster changes such as adding or removing nodes: you can stage the changes, view their impact on the cluster, and then choose to commit or abort them. Developers and operators don’t have to deal with the underlying complexity of what data lives where, as all nodes can serve and route requests. By eliminating the manual requirements of sharding and much of the potential for “hot spots,” Riak provides a much simpler operational scenario for many users that lets them add and remove machines as needed, no matter how much they grow.
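
To show why adding a node only moves a small share of the data, here is a minimal sketch in which a new node takes partitions from the most heavily loaded existing nodes until it holds its fair share. This is an illustrative stand-in, not Riak's actual claim algorithm, and the node names are hypothetical.

```python
from collections import Counter

def add_node(owners, new_node):
    """Give `new_node` a fair share of partitions, moving as few as possible.

    `owners` is a list of node names indexed by partition number.
    Returns the new ownership list and the partitions that changed hands.
    """
    owners = list(owners)
    counts = Counter(owners)
    target = len(owners) // (len(counts) + 1)   # fair share for the new node
    moved = []
    while len(moved) < target:
        # Take one partition from whichever existing node owns the most.
        donor = max(sorted(counts), key=lambda n: counts[n])
        partition = next(i for i, o in enumerate(owners) if o == donor)
        owners[partition] = new_node
        counts[donor] -= 1
        moved.append(partition)
    return owners, moved

# 64 partitions spread over four nodes, 16 each.
before = [f"node{i % 4}" for i in range(64)]
after, moved = add_node(before, "node4")
print(len(moved), "of", len(before), "partitions moved")
print(Counter(after))
```

In this model only 12 of the 64 partitions change owner when a fifth node joins, and every other key stays where it was. A range-based sharding scheme, by contrast, typically needs its boundaries redrawn by hand.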

Want more info on relational vs Riak approaches? Sign up for the webcast here or read our whitepaper on moving from relational to Riak.

Basho

Relational to Riak, Part 1- High Availability

January 10, 2013

This is the first in a series of blog posts that give a high-level overview of the benefits and tradeoffs of Riak versus traditional relational databases. If this is relevant to your projects or applications, register for our “From Relational to Riak” webcast on January 24.

One of the biggest differences between Riak and relational systems is the focus on availability and how the underlying architecture deals with failure modes.

Most relational databases leverage a master/slave architecture to replicate data. This approach usually means the master coordinates all write operations, working with the slave nodes to update data. If the master node fails, the database will reject write operations until the failure is resolved – often involving failover or leader election – to maintain correctness. This can result in a window of write unavailability.

Master Slave Systems

In contrast, Riak uses a masterless system with no single point of failure, meaning any node can serve read or write requests. If a node experiences an outage, other nodes can continue to accept read and write requests. Additionally, if a node fails or becomes unavailable to the rest of the cluster due to a network partition, a neighboring node will take over responsibilities for the unavailable node. Once this node becomes available again, the neighboring node will hand over any updates it accepted in the meantime through a process called “hinted handoff.” This is another way that Riak maintains availability and resilience even under serious failure.
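
The idea behind hinted handoff can be modeled with two in-memory nodes. The sketch below is a toy simulation under simplifying assumptions (one owner, one neighbor, no replication, no network); the class, function and key names are hypothetical and do not correspond to Riak internals.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value for data this node owns
        self.hints = {}   # owner name -> {key: value} held on the owner's behalf

def write(owner, neighbor, key, value):
    """Write to the owner if it is up, otherwise park the write on the neighbor."""
    if owner.up:
        owner.data[key] = value
        return owner.name
    neighbor.hints.setdefault(owner.name, {})[key] = value
    return neighbor.name

def read(owner, neighbor, key):
    """Read from the owner, or from the neighbor's hints while the owner is down."""
    if owner.up:
        return owner.data.get(key)
    return neighbor.hints.get(owner.name, {}).get(key)

def hand_off(owner, neighbor):
    """Once the owner is back, the neighbor returns the writes it accepted."""
    owner.data.update(neighbor.hints.pop(owner.name, {}))

a, b = Node("node-a"), Node("node-b")
a.up = False                                   # node-a fails
accepted_by = write(a, b, "cart:1001", ["tv-42"])
during_outage = read(a, b, "cart:1001")
a.up = True                                    # node-a recovers
hand_off(a, b)
after_recovery = read(a, b, "cart:1001")
print(accepted_by, during_outage, after_recovery)
```

The write is accepted even though its owner is down, remains readable during the outage, and ends up on the owner once it recovers.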

Riak Masterless

Because Riak allows reads and writes even when multiple nodes are unavailable, and uses an eventually consistent design to maintain availability, different replicas may in rare cases contain different versions of an object. This can occur if multiple clients update the same piece of data at the same time, or if nodes are down or lagging. These conflicts occur only a small fraction of the time, but they are important to know about. Riak has a number of mechanisms for detecting and resolving these conflicts when they occur. For more on how Riak achieves availability and the tradeoffs involved, see our documentation on the subject.
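
One common way to detect such conflicts is the vector clock, which records how many updates each client has made to an object. The sketch below is a simplified illustration of the technique, not Riak's implementation; the client names are hypothetical, and the union-of-items merge is just one example of an application-level resolution strategy for a shopping cart.

```python
def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def compare(a, b):
    """Classify two versions by their vector clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"   # neither version has seen the other's update

def merge_carts(cart_a, cart_b):
    """Example resolution for a cart: keep the union of both item sets."""
    return sorted(set(cart_a) | set(cart_b))

base = {"client-x": 1}
from_x = {"client-x": 2}                    # client-x updated the base version
from_y = {"client-x": 1, "client-y": 1}     # client-y updated it concurrently

print(compare(from_x, base))     # a is newer
print(compare(from_x, from_y))   # conflict
print(merge_carts(["tv-42"], ["hdmi-cable"]))
```

When one version descends from the other, the older one can be discarded safely. Only truly concurrent updates need to be resolved, and how to resolve them depends on the data being stored.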

For many use cases today, high availability and fault tolerance are critical to the user experience and the company’s revenue. Unavailability has a negative impact on your revenue, damages user trust and leads to a poor user experience. For use cases such as online retail, shopping carts, advertising, social and mobile platforms or anything with critical data needs, high availability is key and Riak may be the right choice.

Sign up for the webcast here or read our whitepaper on moving from relational to Riak.

Basho

TV Everywhere With Synacor and Riak

January 9, 2013

Synacor’s TV Everywhere platform enables cable, satellite, consumer electronics and telco companies to stream content and programming to any device, anytime. TV Everywhere also provides innovative search, discovery and recommendation solutions combined with deep social media integration.

Synacor TV Everywhere uses Riak as object storage for video clips, news stories and other content. Originally using a relational solution as their primary datastore, API response times had started to slow as they continued to add more assets. After evaluating several possible solutions, they chose to move to Riak due to its low latency and Synacor’s high availability requirements.

Riak Enterprise has been deployed in multiple Synacor datacenters and has improved the API response time significantly since its integration. Synacor now stores over 5 million assets with thousands being added daily. According to Michael Collins, Synacor’s Senior Director of Engineering, “Riak has never been the source of a bottleneck for us. It’s been great.”

For more details, check out the complete case study, “TV Everywhere with Synacor and Riak”

Basho

Riak on VM Depot from MS Open Tech

January 9, 2013

Today, Microsoft Open Technologies, Inc. announced the public preview of VM Depot. Basho is pleased to participate in this launch. Starting today, you can quickly deploy a virtual machine image configured with open source Riak from VM Depot.

For more information on the VM Depot launch, check out the interoperability@Microsoft blog and follow @openatmicrosoft.

Ease of deployment is a common theme we hear from the community, and ensuring Riak is available on your platform of choice is part of how we support your deployment needs. Whether you are quickly prototyping an internal application in the enterprise, deploying a hybrid cloud solution, or relying solely on public cloud services, Riak is an excellent choice for solving your data-storage needs at scale.

If you need multi-datacenter replication and support, contact us to discuss Riak Enterprise on Azure.

Given that this is a public preview, installation documentation is forthcoming. When it is ready, and that will be soon, you can find it on our documentation portal. In the meantime, feel free to ask questions or provide feedback on the mailing list.

Self-Service Test Harness for Riak Cloud Storage

January 7, 2013

Riak Cloud Storage is simple, available cloud storage software built on top of Riak. It offers an S3 API, multi-tenancy and large object support for enterprises building public or private clouds. We want to make it easier to get started with Riak CS, so we’re now offering a self-service test harness. Visit riakcs.net to sign up; you can then explore the functionality, test API operations, and experiment with clients and development apps. With the self-service feature, you can start playing right away.

Note that the test harness is primarily for exploring Riak CS features – if you want to do load testing and performance benchmarking, you should sign up for a developer trial that will give you access to Riak CS packages you can install and test on your own hardware.

Interested in learning more about Riak CS? All of the docs are available online.

Basho Team