Tag Archives: whitepaper

Gaming on Riak: A Brief Overview and User Case Studies

March 12, 2013

Riak provides low-latency, highly available storage to power gaming platforms and applications. Gaming companies use Riak to store player and game data, session and social information, and a variety of gaming content and events. This post offers a quick look at the advantages of Riak and some user case studies. Later this week, we’ll publish an in-depth look at the common gaming use cases and examples of data modeling.

For a complete overview, download the whitepaper, “Gaming on Riak: A Technical Introduction.”

Advantages of Riak

  • Support for Rapid Growth: Built for operational ease-of-use, Riak yields a near-linear performance and throughput increase as capacity is added.
  • Low-Latency Design: Riak is designed to store data and serve requests predictably and quickly, even during peak times.
  • Flexible, Reliable Storage: Riak has a flexible data model with redundancy built-in, and a number of mechanisms to maintain availability even in the event of node failure or network partition. Riak is content-agnostic, providing flexibility for document, image, video, and other storage.
  • Multi-Datacenter Replication: Riak Enterprise’s multi-datacenter replication provides disaster recovery and data locality.

Case Studies

Hibernum is a creator and developer of unique gaming experiences that combine the latest in social gaming, top quality visuals and animations, and cutting edge design. They switched from a relational database to Riak due to its high availability, ability to scale to peak loads, and predictable operational cost. Riak is used to store user game information for one of their most popular social games. For more information about how Hibernum uses Riak, check out the complete case study.

Kiip is a platform that lets brands provide rewards to mobile gamers for in-game achievements. Kiip replaced MongoDB with Riak in order to achieve low read/write latencies and horizontal scalability. Kiip uses Riak for session and device data. To learn more about Kiip’s experience selecting Riak, check out this video by two of their engineers.

To learn more about how your gaming platform can benefit from Riak, download “Gaming on Riak: A Technical Introduction.” For more information about Riak, sign up for out webcast on Thursday, March 14.


Multi-Datacenter Replication: Backups and Data Locality

February 27, 2013

Multi-datacenter replication is a critical part of modern infrastructure, providing essential business benefits for enterprise applications, platforms and services. Riak Enterprise offers multi-datacenter replication so that data stored in Riak can be replicated to multiple sites. Over the next two posts, we will look at some common implementations, starting with configurations for backups and data locality. For more information on Riak Enterprise architecture and configuration, download the complete whitepaper.

Primary Cluster with Failover

One of the most common architectural patterns in multi-datacenter replication is maintaining a primary cluster that serves traffic and a backup cluster for emergency failover. This configuration can be an important component of compliance with regulatory requirements, ensuring business continuity and access to data even in serious failure modes.

In this configuration, a primary cluster serves as the production cluster from which all read and write operations are served. The backup cluster(s) is maintained in another datacenter. In the event of a datacenter outage or critical failure at the primary site, requests can be directed to the backup cluster either by changing DNS configuration or rules for routing via a load balancer.

For this use case, we recommend that your failover strategy be tested periodically so any potential issues can be resolved in advance of a crisis. It’s also beneficial to have your failover strategy fully defined upfront – know the conditions in which a failover mode will be invoked, decide how traffic will be directed to the backup, and document and test the failover strategy to ensure success.

Active-Active Cluster Configuration

To achieve data locality, when clients are served at low-latency by whatever datacenter is nearest to them, you can maintain two (or more) active, synced clusters that are both responsible for serving data to clients. An added benefit of this approach is that in the event of a datacenter failure where one of the clusters is hosted, all traffic can be served from the other, still-functional site for business continuity.

For data locality, requests can be load balanced across geographies, with geo-based client requests directed to the appropriate datacenter. For example, US-based requests can be served out of a US-based datacenter while EU-based requests can be served out of a regional site. For situations where not all data needs to be shared across all datacenters (or if certain data, such as user data, must only be stored in a specific geographic region to meet privacy regulations), Riak Enterprise’s multi-datacenter replication can be configured on a per-bucket basis so only shared assets, popular assets, etc. are replicated.

For a more in-depth look at common architectures and use cases for Riak Enterprise, download our technical overview. You can also sign up for our webcast on Thursday, March 7th.


Mobile on Riak: Overview and User Stories

February 26, 2013

Mobile platforms and applications pose unique infrastructure challenges for today’s companies. These applications require low-latency, always-available small object storage that can scale to millions or more users, and support highly concurrent access and traffic spikes.

Riak provides a number of benefits for these platforms, including:

  • Low-Latency Data Storage: Riak is designed to serve predictable, low-latency requests to provide a fast, available experience to all users.
  • Straightforward Data Model: Riak uses a simple key-value data model, which is ideal for storing and serving mobile content, user information, events, and session data. Riak is content agnostic, so there are no restrictions on content type.
  • Accommodates Peak Loads Gracefully: To handle increasing user data and accommodate peak loads during events, Riak makes it easy to add additional capacity and scale out quickly. Riak automatically rebalances data when new nodes are added, while its consistent hashing methodology prevents hot spots in the database.
  • Multi-Datacenter Replication: Riak Enterprise’s multi-datacenter replication allows mobile platforms to serve low-latency content to users all over world by maintaining a global data footprint.
  • For a full overview, download our new whitepaper on building mobile services with Riak

User Stories

Bump is a popular mobile app that makes it easy for users to share their contact information, photos, and other objects by simply “bumping” their smartphones. They use Riak to store user data and currently run 25 nodes of Riak storing about 3TB of data.

For more details about how Bump uses Riak and how they designed their application, check out Bump’s presentation at RICON2012, Basho’s 2012 developer conference. You can also read the complete case study for more information about why Bump chose Riak.

Voxer is a popular Walkie Talkie application for smartphones that allows users to send instant voice messages to one or more friends. They switched to Riak due to its fault-tolerance and ability to scale quickly and easily. They currently run more than 50 machines on Riak to support their huge growth and user base. For more details about how Voxer uses Riak, check out the complete case study and watch Matt Ranney’s talk at a Riak Meetup in San Francisco.

To learn more about how mobile platforms can use Riak for their data needs, check out the complete overview, “Mobile on Riak: A Technical Introduction.”


Introducing Riak 1.3: Active Anti-Entropy, a New Look for Riak Control, and Faster Multi-Datacenter Replication

February 21, 2013

Today we are excited to announce the latest version of Riak. Here is a summary of the major enhancements delivered in Riak 1.3:

  • Introduced Active Anti-Entropy. Riak now has active anti-entropy. In distributed systems, inconsistencies can arise between replicas due to failure modes, concurrent updates, and physical data loss or corruption. Pre-1.3 Riak already had several features for repairing this “entropy”, but they all required some form of user intervention. Riak 1.3 introduces automatic, self-healing properties that repair entropy on an ongoing basis.
  • Improved Riak Enterprise’s multi-datacenter replication performance. New advanced mode for multi-datacenter replication capabilities, with better performance, more TCP connections and easier configuration. Read more in this write up from GigaOM.
  • Improved graphical user experience. Riak Control, the user interface for managing and monitoring Riak, has a brand new look.
  • Expanded IPv6 support. IPv6 support in Riak now is supported by all interfaces.
  • Improved MapReduce. Riak MapReduce has improved back-pressure to reduce the risk of overwhelming endpoint processes during large tasks.
  • Simplified log management. Riak can now optionally send log messages to syslog.

Ready to get started or upgrade? Download the new release here, check out the official release notes, or read on for more details. Documentation for all products and releases is available on the documentation site. For an introduction to Riak and what’s new in Riak 1.3, sign up for our webcast on Thursday, March 7.

More on What’s in Riak 1.3

Active Anti-Entropy
A key feature of Riak is its ability to regenerate lost or corrupted data from replicated data stored on other nodes. Prior to this release, Riak provided two methods to repair data:

  • Read Repair: Riak compares the replies from all replicas during a read request, repairing any replica that is divergent or missing data. (K/V data only)
  • Repair Command via Riak Console: Introduced in Riak 1.2, the repair command enables users to trigger a repair of a specific partition. The partition is rebuilt based on a subset of data stored on adjacent nodes in the Riak ring. All data is rebuilt, not just missing or divergent data. (K/V and Search data)

Riak 1.3 introduces active anti-entropy, a continuous background process that compares and repairs any divergent, missing, or corrupted replicas (K/V data only). Unlike read repair, which is only triggered when data is read, the active anti-entropy system ensures the integrity of all data stored in Riak. This is particularly useful in clusters containing “cold data”: data that may not be read for long periods of time, potentially years. Furthermore, unlike the repair command, active anti-entropy is an automatic process, requiring no user intervention and is enabled by default in Riak 1.3.

Riak’s active anti-entropy feature is based on hash tree exchange, which enables differences between replicas to be determined with minimal exchange of information. Specifically, the amount of information exchanged in the process is proportional to the differences between two replicas, not the amount of data that they contain. Approximately the same amount of information is exchanged when there are 10 differing keys out of 1 million keys as when there are 10 differing keys out of 10 billion keys. This enables Riak to provide continuous data protection regardless of cluster size.

Additionally, Riak uses persistent, on-disk hash trees rather than purely in-memory trees, a key difference from similar implementations in other products. This allows Riak to maintain anti-entropy information for billions of keys with minimal additional memory usage, as well as allows Riak nodes to be restarted without losing any anti-entropy information. Furthermore, Riak maintains the hash trees in real time, updating the tree as new write requests come in. This reduces the time it takes Riak to detect and repair missing/divergent replicas. For added protection, Riak periodically (default: once a week) clears and regenerates all hash trees from the on-disk K/V data. This enables Riak to detect silent data corruption to the on-disk data arising from bad disks, faulty hardware components, etc.

New Look for Riak Control
Riak Control is a UI for managing and monitoring your Riak cluster. Riak Control lets you start and re-start Riak nodes, view a “health check” for your cluster, see all nodes and their current status, and have visibility into their partitions and services. Riak Control now has a brand new look and feel. Check out the Riak Control Github page to get up and running.

Expanded IPv6 Support
While Riak’s HTTP interface has always supported IPv6, not all of its interfaces have been as current. In Riak 1.3, the protocol buffers interfaces can now listen on IPv6 or IPv4 addresses. Riak handoff (which is responsible for data transfer when nodes are added or removed, and for handing off update responsibilities when nodes fail) also supports IPv6. It should also be noted that community member Tom Lanyon started the work on this feature. Thanks, Tom!

Improved Backpressure in Riak MapReduce
Riak has Javascript and Erlang MapReduce for performing aggregation and analytics tasks. Backpressure is an important aspect of the MapReduce system, keeping processes from being overwhelmed or memory consumption getting out of control. In Riak 1.3, tunable backpressure is extended to the MapReduce sink to prevent these types of problems at endpoint processes.

Riak Enterprise: Advanced Multi-Datacenter Replication Capabilities
With hundreds of companies using Riak Enterprise, a commercial extension of Riak, we’ve been lucky to work with many teams pushing the limits of multi-datacenter replication performance and resiliency. We’ve learned a lot and are excited to announce these capabilities are now available in advanced mode.

  • Previously, multi-datacenter replication had one TCP connection over which data was streamed from one cluster to another. This could create a performance bottleneck, especially when run on nodes constrained by per-instance bandwidth limits, such as in a cloud environment. In the new version of multi-datacenter replication, multiple concurrent TCP connections (approximately one per physical node) and processes are used between sites.
  • Configuration of multi-datacenter replication is easier. Use a shell command to name your clusters, then connect both clusters using a simple ip:port combination.
  • Better per-connection statistics for both full-sync and real-time modes.
  • New ability to tweak full-sync workers per node and per cluster, allowing customers to dial-in performance.

The new replication improvements are already used in production by customers and yielding significant performance improvements. For now, the new replication technology is available in advanced mode: it’s optional to turn on. It currently doesn’t have all of the features of the default mode – including SSL, NAT support and full-sync scheduling. Both default and advanced modes are available in the 1.3 release and function independently. In the future, “advanced mode” will become the default.

For more details about multi-datacenter replication, download our whitepaper, “Multi-Datacenter Replication: A Technical Overview.”


Riak for Advertising Services and Platforms

February 14, 2013

Advertisers need to provide highly available, low latency experiences to thousands of clients and partners and millions of users. They also need to serve large amounts of data all over the world and can experience significant traffic spikes. To meet these needs, more advertisers are considering distributed data solutions. This post looks at common use cases for Riak in the advertising space, and the stories of two existing advertising users. For a full technical overview, download our whitepaper on Riak for advertisers.

Top Use Cases for Riak in Advertising:

  • Serving Ad Content: Riak’s rapid storage and content agnosticism makes it ideal for storing ad content and handling influxes of ad traffic. For more information on serving ad content with Riak, check out our documentation.
  • Session Storage: This type of data is naturally a good fit for Riak’s key/value model. This data can also be encoded in many different ways and can evolve without any administrative changes to the schema. You can find more information on building a session store with Riak here.
  • Mobile: Riak is ideal for the low-latency, always-available small object storage needed to power mobile experiences across platforms.
  • Global Data Locality: Riak Enterprise’s multi-datacenter capabilities allow advertisers to maintain a global data footprint while providing an always-on, low-latency experience, anywhere in the world.

User Stories:
OpenX, the global leader in digital and mobile advertising technology, serves trillions of ads each year. They use Riak for handling user and trafficking data storage behind their data services API. Riak was selected due to its highly available, low-latency, redundant architecture. OpenX also uses Riak Enterprise’s multi-datacenter replication across several data centers. For more details about how OpenX uses Riak, check out the video of Anthony Molinaro, OpenX engineer, speaking at RICON2012, Basho’s 2012 developer conference.

Velti is a global marketing and advertising technology provider. Velti’s interactive subscriber services provide television broadcast audiences the ability to interact with programs using their mobile phone– voting on people or things, giving feedback, or participating in contests. They selected Riak because it is distributed, scalable, and highly available with the ability to handle large volumes of traffic. To minimize any potentially catastrophic outages, they also opted to build two geographically separated, mirrored sites using Riak Enterprise’s multi-datacenter replication feature. For more information on Velti’s use of Riak check out the complete case study.

To learn more about how advertisers can use Riak for their data needs, check out the complete overview, “Advertisers on Riak: A Technical Introduction,” or stay tuned for future blogs posts on data modeling and querying for advertising services built on Riak.