February 28, 2013
In the last post, we looked at how Riak Enterprise’s multi-datacenter replication can be configured for backups and data locality. In this post, we examine two other common implementations: availability zones and public cloud use cases. For more information on Riak Enterprise architecture and configuration, download the complete whitepaper.
Availability zones provide efficient multi-datacenter replication and data redundancy within a geographic region (such as a coast or a country). In this configuration, data is replicated within an availability zone’s series of datacenters. In the event that one of the datacenters experiences an outage or serious failure, data can still be served from other datacenters within the same region.
One approach to setting this up is to have a “primary” site in a region where all reads and writes for specific users, applications, or data sets are directed. This primary cluster can then be replicated to one or more proximal secondary clusters. In other approaches, data can be replicated in real-time from one cluster to both another datacenter and other cold backups maintained for emergency conditions. The right approach depends heavily on user requirements, availability needs, the cost of bandwidth, and other constraints.
Public Cloud Use Cases
Riak is designed to be easy to use and operate on public clouds, and is partnered with many of the leading cloud providers, including Amazon Web Services, Microsoft Azure, and Joyent. Hosted Riak is also available from Engine Yard and Riak packages can always be manually installed on any physical or virtual provider, even if a machine image isn’t explicitly supported.
There are several use cases for Riak Enterprise’s multi-datacenter replication in the public cloud. Many enterprises want to maintain a cold or hot backup of their cluster in a public cloud for business continuity in the event of a datacenter outage in their private infrastructure. For other customers, the public cloud can provide a more cost-effective way of meeting peak loads, rather than building out private infrastructure to accommodate them year-round. For example, many retailers and media providers need to offer increased capacity over the holiday season. Riak Enterprise is used to scale out capacity on public clouds over these periods, either with full-sync or real-time sync depending on the business needs.
Finally, some enterprises run certain applications or services entirely on public clouds. Riak Enterprise allows for redundancy and data locality across public cloud availability zones for this use case, ensuring optimal performance and resiliency.
February 26, 2013
Mobile platforms and applications pose unique infrastructure challenges for today’s companies. These applications require low-latency, always-available small object storage that can scale to millions or more users, and support highly concurrent access and traffic spikes.
Riak provides a number of benefits for these platforms, including:
- Low-Latency Data Storage: Riak is designed to serve predictable, low-latency requests to provide a fast, available experience to all users.
- Straightforward Data Model: Riak uses a simple key-value data model, which is ideal for storing and serving mobile content, user information, events, and session data. Riak is content agnostic, so there are no restrictions on content type.
- Accommodates Peak Loads Gracefully: To handle increasing user data and accommodate peak loads during events, Riak makes it easy to add additional capacity and scale out quickly. Riak automatically rebalances data when new nodes are added, while its consistent hashing methodology prevents hot spots in the database.
- Multi-Datacenter Replication: Riak Enterprise’s multi-datacenter replication allows mobile platforms to serve low-latency content to users all over the world by maintaining a global data footprint.
- For a full overview, download our new whitepaper on building mobile services with Riak
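The rebalancing behavior described in the list above can be illustrated with a small sketch. It is a simplified model rather than Riak's actual claim algorithm: the partition count, the round-robin assignment, and the function names (`partition_for`, `assign_partitions`, `add_node`) are assumptions made for illustration. It shows the general idea of consistent hashing: keys hash to a fixed set of partitions, and adding a node moves only some partitions to it instead of reshuffling every key.

```python
import hashlib

RING_SIZE = 2 ** 160      # size of the SHA-1 output space
NUM_PARTITIONS = 64       # assumed partition count for this sketch


def partition_for(bucket: str, key: str) -> int:
    """Hash bucket/key onto the ring and return the index of its partition."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * NUM_PARTITIONS // RING_SIZE


def assign_partitions(nodes):
    """Assign partitions to nodes round-robin (illustrative only)."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def add_node(owners, new_node):
    """Give the new node a fair share of partitions, taken from overloaded nodes.

    Partitions that are not handed over keep their current owner, so the
    keys stored in them do not move.
    """
    owners = dict(owners)
    nodes = set(owners.values()) | {new_node}
    target = NUM_PARTITIONS // len(nodes)
    counts = {node: 0 for node in nodes}
    for node in owners.values():
        counts[node] += 1
    for p in sorted(owners):
        if counts[new_node] >= target:
            break
        donor = owners[p]
        if counts[donor] > target:
            owners[p] = new_node
            counts[donor] -= 1
            counts[new_node] += 1
    return owners
```

In this model, growing a four-node cluster to five nodes transfers only the new node's share of partitions; every other partition, and every key in it, stays where it was.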
Bump is a popular mobile app that makes it easy for users to share their contact information, photos, and other objects by simply “bumping” their smartphones. They use Riak to store user data and currently run 25 nodes of Riak storing about 3TB of data.
For more details about how Bump uses Riak and how they designed their application, check out Bump’s presentation at RICON2012, Basho’s 2012 developer conference. You can also read the complete case study for more information about why Bump chose Riak.
Voxer is a popular Walkie Talkie application for smartphones that allows users to send instant voice messages to one or more friends. They switched to Riak due to its fault-tolerance and ability to scale quickly and easily. They currently run more than 50 machines on Riak to support their huge growth and user base. For more details about how Voxer uses Riak, check out the complete case study and watch Matt Ranney’s talk at a Riak Meetup in San Francisco.
To learn more about how mobile platforms can use Riak for their data needs, check out the complete overview, “Mobile on Riak: A Technical Introduction.”
February 24, 2013
Recently, Basho engineer Eric Redmond published “A Little Riak Book.” This book is available free for download at littleriakbook.com and provides a great overview of Riak, including how to think about a distributed system compared to more traditional databases.
The book starts with a discussion on concepts. Since Riak is a distributed NoSQL database, it requires developers to approach problems differently than they would with a relational database. The concepts section describes the differences between various NoSQL systems, takes an in-depth look at Riak’s key/value data model, and describes how Riak is designed for high availability (as well as how it handles eventual consistency constraints). After laying the theoretical groundwork, the book walks developers through how to use Riak by explaining the different querying options and showing them how to tinker with settings to meet different use case needs. Finally, it covers the basic details that operators should know, such as how to set up a Riak cluster, configure values, use optional tools, and more.
After finishing the book, start playing around with Riak to see if it’s the right fit for your needs. You can download Riak on our Docs Page.
February 21, 2013
Today we are excited to announce the latest version of Riak. Here is a summary of the major enhancements delivered in Riak 1.3:
- Introduced Active Anti-Entropy. In distributed systems, inconsistencies can arise between replicas due to failure modes, concurrent updates, and physical data loss or corruption. Pre-1.3 Riak already had several features for repairing this “entropy”, but they all required some form of user intervention. Riak 1.3 introduces automatic, self-healing properties that repair entropy on an ongoing basis.
- Improved Riak Enterprise’s multi-datacenter replication performance. New advanced mode for multi-datacenter replication capabilities, with better performance, more TCP connections and easier configuration. Read more in this write up from GigaOM.
- Improved graphical user experience. Riak Control, the user interface for managing and monitoring Riak, has a brand new look.
- Expanded IPv6 support. IPv6 is now supported by all of Riak’s interfaces.
- Improved MapReduce. Riak MapReduce has improved back-pressure to reduce the risk of overwhelming endpoint processes during large tasks.
- Simplified log management. Riak can now optionally send log messages to syslog.
Ready to get started or upgrade? Download the new release here, check out the official release notes, or read on for more details. Documentation for all products and releases is available on the documentation site. For an introduction to Riak and what’s new in Riak 1.3, sign up for our webcast on Thursday, March 7.
More on What’s in Riak 1.3
A key feature of Riak is its ability to regenerate lost or corrupted data from replicated data stored on other nodes. Prior to this release, Riak provided two methods to repair data:
- Read Repair: Riak compares the replies from all replicas during a read request, repairing any replica that is divergent or missing data. (K/V data only)
- Repair Command via Riak Console: Introduced in Riak 1.2, the repair command enables users to trigger a repair of a specific partition. The partition is rebuilt based on a subset of data stored on adjacent nodes in the Riak ring. All data is rebuilt, not just missing or divergent data. (K/V and Search data)
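The read repair method in the list above can be sketched as follows. This is a minimal model, not Riak's implementation: each replica is a plain dictionary, and an integer version number stands in for the causal metadata that a real system would use to decide which reply is newest. The function name `read_with_repair` is hypothetical.

```python
def read_with_repair(replicas, key):
    """Read `key` from every replica and repair stale or missing copies.

    Each replica is a dict mapping key -> (version, value). The integer
    version is an assumed stand-in for real causal metadata.
    """
    replies = [replica.get(key) for replica in replicas]
    present = [reply for reply in replies if reply is not None]
    if not present:
        return None  # no replica holds the key, so there is nothing to repair

    # Pick the reply with the highest version as the winner.
    newest = max(present, key=lambda reply: reply[0])

    # Write the winner back to any replica that was divergent or missing it.
    for replica, reply in zip(replicas, replies):
        if reply != newest:
            replica[key] = newest
    return newest[1]
```

Because the repair happens only as a side effect of the read, a key that is never read is never repaired by this mechanism.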
Riak 1.3 introduces active anti-entropy, a continuous background process that compares and repairs any divergent, missing, or corrupted replicas (K/V data only). Unlike read repair, which is only triggered when data is read, the active anti-entropy system ensures the integrity of all data stored in Riak. This is particularly useful in clusters containing “cold data”: data that may not be read for long periods of time, potentially years. Furthermore, unlike the repair command, active anti-entropy is an automatic process that requires no user intervention, and it is enabled by default in Riak 1.3.
Riak’s active anti-entropy feature is based on hash tree exchange, which enables differences between replicas to be determined with minimal exchange of information. Specifically, the amount of information exchanged in the process is proportional to the differences between two replicas, not the amount of data that they contain. Approximately the same amount of information is exchanged when there are 10 differing keys out of 1 million keys as when there are 10 differing keys out of 10 billion keys. This enables Riak to provide continuous data protection regardless of cluster size.
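The scaling property described above can be illustrated with a small hash tree sketch. It is an in-memory toy under stated assumptions, not Riak's on-disk implementation: the segment count, the use of SHA-1, and the names `build_tree`, `differing_segments`, and `differing_keys` are all chosen for illustration. Two replicas compare hashes from the root down and descend only into subtrees whose hashes differ, so the number of comparisons depends on the number of differing segments and the tree depth, not on the number of keys stored.

```python
import hashlib

SEGMENTS = 1024  # number of leaf segments; assumed, must be a power of two


def _h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def segment_of(key: str) -> int:
    """Map a key to one of the leaf segments."""
    return int.from_bytes(_h(key.encode("utf-8")), "big") % SEGMENTS


def build_tree(store: dict) -> list:
    """Return the tree as a list of levels, leaves first and root last."""
    segments = [[] for _ in range(SEGMENTS)]
    for key in sorted(store):
        segments[segment_of(key)].append(f"{key}={store[key]}".encode("utf-8"))
    level = [_h(b"\x00".join(items)) for items in segments]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_segments(tree_a: list, tree_b: list):
    """Walk both trees from the root, descending only into mismatched subtrees.

    Returns the differing leaf segments and the number of hash comparisons.
    """
    top = len(tree_a) - 1
    compared = 1
    frontier = [0] if tree_a[top][0] != tree_b[top][0] else []
    for depth in range(top - 1, -1, -1):
        next_frontier = []
        for index in frontier:
            for child in (2 * index, 2 * index + 1):
                compared += 1
                if tree_a[depth][child] != tree_b[depth][child]:
                    next_frontier.append(child)
        frontier = next_frontier
    return frontier, compared


def differing_keys(store_a: dict, store_b: dict):
    """Return the keys whose values differ, plus the comparison count."""
    segments, compared = differing_segments(build_tree(store_a), build_tree(store_b))
    wanted = set(segments)
    # A real system would index keys by segment instead of scanning them all.
    keys = {
        key
        for key in set(store_a) | set(store_b)
        if segment_of(key) in wanted and store_a.get(key) != store_b.get(key)
    }
    return keys, compared
```

In this sketch, two identical replicas are confirmed in sync with a single root comparison, and a handful of differing keys costs at most a few comparisons per tree level regardless of how many keys each replica holds.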
Additionally, Riak uses persistent, on-disk hash trees rather than purely in-memory trees, a key difference from similar implementations in other products. This allows Riak to maintain anti-entropy information for billions of keys with minimal additional memory usage, and it allows Riak nodes to be restarted without losing any anti-entropy information. Furthermore, Riak maintains the hash trees in real time, updating the tree as new write requests come in. This reduces the time it takes Riak to detect and repair missing or divergent replicas. For added protection, Riak periodically (default: once a week) clears and regenerates all hash trees from the on-disk K/V data. This enables Riak to detect silent corruption of the on-disk data arising from bad disks, faulty hardware components, etc.
New Look for Riak Control
Riak Control is a UI for managing and monitoring your Riak cluster. Riak Control lets you start and re-start Riak nodes, view a “health check” for your cluster, see all nodes and their current status, and have visibility into their partitions and services. Riak Control now has a brand new look and feel. Check out the Riak Control Github page to get up and running.
Expanded IPv6 Support
While Riak’s HTTP interface has always supported IPv6, not all of its interfaces have been as current. In Riak 1.3, the protocol buffers interfaces can now listen on IPv6 or IPv4 addresses. Riak handoff (which is responsible for data transfer when nodes are added or removed, and for handing off update responsibilities when nodes fail) also supports IPv6. It should also be noted that community member Tom Lanyon started the work on this feature. Thanks, Tom!
Improved Backpressure in Riak MapReduce
Riak Enterprise: Advanced Multi-Datacenter Replication Capabilities
With hundreds of companies using Riak Enterprise, a commercial extension of Riak, we’ve been lucky to work with many teams pushing the limits of multi-datacenter replication performance and resiliency. We’ve learned a lot and are excited to announce these capabilities are now available in advanced mode.
- Previously, multi-datacenter replication had one TCP connection over which data was streamed from one cluster to another. This could create a performance bottleneck, especially when run on nodes constrained by per-instance bandwidth limits, such as in a cloud environment. In the new version of multi-datacenter replication, multiple concurrent TCP connections (approximately one per physical node) and processes are used between sites.
- Configuration of multi-datacenter replication is easier. Use a shell command to name your clusters, then connect both clusters using a simple ip:port combination.
- Better per-connection statistics for both full-sync and real-time modes.
- New ability to tweak full-sync workers per node and per cluster, allowing customers to dial-in performance.
The new replication improvements are already used in production by customers and are yielding significant performance improvements. For now, the new replication technology is available in advanced mode, which is optional to turn on. It does not yet have all of the features of the default mode: SSL, NAT support, and full-sync scheduling are currently missing. Both default and advanced modes are available in the 1.3 release and function independently. In the future, “advanced mode” will become the default.
For more details about multi-datacenter replication, download our whitepaper, “Multi-Datacenter Replication: A Technical Overview.”
February 19, 2013
Hibernum is a creator and developer of unique gaming experiences that combine the latest in social gaming, top quality visuals and animations, and cutting edge design. They use Riak to store user game information for one of their most popular social games.
Currently, Hibernum’s Riak installation serves thousands of requests per second to more than a million monthly active users. User data is stored in Riak as JSON objects, and Hibernum uses Riak’s HTTP interface, a perfect fit for their Node.js-based application server. As the game grows in popularity, millions of new entries are generated and stored in Riak, as well as any updates or modifications that may occur during gameplay. Mario Lefebvre, IT Specialist at Hibernum, has said that Riak is “managing this load like a charm and is a stable and rock solid solution.”
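As a rough illustration of storing a JSON object over Riak's HTTP interface, the sketch below builds (but does not send) a `PUT` request. It is written in Python rather than Node.js and is not Hibernum's code. The bucket name, key, document fields, and the helper name `build_store_request` are hypothetical, and the base URL assumes a local node on Riak's usual HTTP port with the `/buckets/<bucket>/keys/<key>` path layout.

```python
import json
import urllib.request
from urllib.parse import quote


def build_store_request(base_url, bucket, key, document):
    """Build a PUT request that would store `document` as JSON under bucket/key."""
    url = f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical example values; nothing is sent over the network here.
request = build_store_request(
    "http://127.0.0.1:8098",
    "players",
    "user-42",
    {"level": 7, "coins": 120},
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running node. The request is only built here so that the example stays self-contained.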
Originally, Hibernum was using a relational database. However, they found that the manual sharding required to scale was operationally intensive and inefficient. They needed something that could better handle their significant growth and started looking for a cost-efficient solution that could support the large number of requests and scale easily. After testing multiple solutions, Riak was chosen for its high availability, ability to scale to peak loads, and predictable operational cost.
To learn more about how Hibernum uses Riak, check out the complete case study.
February 14, 2013
Advertisers need to provide highly available, low latency experiences to thousands of clients and partners and millions of users. They also need to serve large amounts of data all over the world and can experience significant traffic spikes. To meet these needs, more advertisers are considering distributed data solutions. This post looks at common use cases for Riak in the advertising space, and the stories of two existing advertising users. For a full technical overview, download our whitepaper on Riak for advertisers.
Top Use Cases for Riak in Advertising:
- Serving Ad Content: Riak’s rapid storage and content agnosticism makes it ideal for storing ad content and handling influxes of ad traffic. For more information on serving ad content with Riak, check out our documentation.
- Session Storage: This type of data is naturally a good fit for Riak’s key/value model. This data can also be encoded in many different ways and can evolve without any administrative changes to the schema. You can find more information on building a session store with Riak here.
- Mobile: Riak is ideal for the low-latency, always-available small object storage needed to power mobile experiences across platforms.
- Global Data Locality: Riak Enterprise’s multi-datacenter capabilities allow advertisers to maintain a global data footprint while providing an always-on, low-latency experience, anywhere in the world.
OpenX, the global leader in digital and mobile advertising technology, serves trillions of ads each year. They use Riak for handling user and trafficking data storage behind their data services API. Riak was selected due to its highly available, low-latency, redundant architecture. OpenX also uses Riak Enterprise’s multi-datacenter replication across several data centers. For more details about how OpenX uses Riak, check out the video of Anthony Molinaro, OpenX engineer, speaking at RICON2012, Basho’s 2012 developer conference.
Velti is a global marketing and advertising technology provider. Velti’s interactive subscriber services give television broadcast audiences the ability to interact with programs using their mobile phones: voting on people or things, giving feedback, or participating in contests. They selected Riak because it is distributed, scalable, and highly available, with the ability to handle large volumes of traffic. To minimize any potentially catastrophic outages, they also opted to build two geographically separated, mirrored sites using Riak Enterprise’s multi-datacenter replication feature. For more information on Velti’s use of Riak, check out the complete case study.
To learn more about how advertisers can use Riak for their data needs, check out the complete overview, “Advertisers on Riak: A Technical Introduction,” or stay tuned for future blog posts on data modeling and querying for advertising services built on Riak.
February 13, 2013
Bump, one of the most popular mobile apps of all time, makes it easy for users to share their contact information, photos and other objects by simply “bumping” their smartphones. Bump uses Riak to store user data including events, communications sent and received, handset information and tokens needed to authenticate using social networks.
Bump chose Riak for its operational ease-of-use, ability to scale writes, and availability under failure conditions.
“It’s a relief that we don’t need to spend time thinking about whether or not Riak is working,” said Will Moss, Server Engineer at Bump. “It does what it’s supposed to do; nodes can go down but Riak will still work. It’s great to be able to deal with node failures the next day instead of at 3am.”
Recently, Bump expanded their mobile app offerings and launched Flock, a photo-sharing app. For more information on how Flock uses Riak, including their data model, watch Bump’s presentation at RICON2012, Basho’s 2012 developer conference. Bump is now running 25 nodes on Riak and storing around 3TB of data.
You can also check out the complete case study.
February 12, 2013
OmniTI is a provider of web infrastructures and applications for companies that require scalable, high-performance, mission critical solutions. They specialize in providing complex, high-transaction, and large-volume data applications. One of their customers is Viggle, a mobile app that rewards people for checking into the television shows they are watching. Viggle has more than a million users, and its advertisers include Pepsi, Kraft, and Capital One.
OmniTI designed the server architecture for Viggle’s mobile app and also designed the internal APIs that connect Viggle’s multiple back-end services, creating a cohesive, fault-tolerant system. OmniTI chose Riak as an integral part of this system, providing high availability and low latency during peak times.
“For this architecture, near-zero downtime and sustaining high throughput with low latency are critically important,” said Theo Schlossnagle, CEO of OmniTI. “We needed Viggle’s key components to remain available and responsive under sudden floods of user traffic, which made Riak the perfect fit. Riak has not disappointed us and has performed exactly as needed for this application.”
For more information on how Viggle has used OmniTI to design and support their system, check out their full announcement.
February 11, 2013
We are excited to announce that Datapipe’s Stratosphere, a globally available, high-performance managed cloud computing platform, leverages Riak Cloud Storage (CS). Riak Cloud Storage provides Datapipe and its customers with highly available, low-latency, and S3-compatible storage.
Datapipe offers a single provider solution for managing and securing mission critical IT services, including cloud computing, infrastructure as a service, platform as a service, managed hosting, and colocation.
Stratosphere is Datapipe’s globally available managed cloud computing platform. With the launch of Riak CS to support cloud object storage, Datapipe customers can now access cloud object storage from any solution hosted with Datapipe and adjacent to existing solutions in any Datapipe data center. Stratosphere is designed for enterprise high I/O production environments and can also be used for development, testing and QA environments. Use cases include large-scale marketing campaigns, brand sites and analytics; applications with variable peak demand times and other dynamic workloads; and cloud disaster recovery and geographic redundancy.
Datapipe delivers services from the world’s most influential technical and financial markets including New York metro, Silicon Valley, London, Hong Kong and Shanghai.
Why Riak Cloud Storage at Datapipe?
Datapipe selected Riak Cloud Storage for its low-latency, highly available object storage, operational ease-of-use, and multi-site replication capabilities. After extensively testing solutions from a variety of vendors in the space, Datapipe selected Riak Cloud Storage for a few core reasons:
- Built on years of developing Riak, Riak CS is designed to provide simple, available, distributed cloud storage at any scale.
- Riak CS is compatible with major cloud object storage clients and applications with its S3-based API.
- Riak CS meets the high performance requirements of the Stratosphere cloud-computing platform.
“Riak CS provides the high-performance, distributed datastore we need to deliver a sound foundation for our cloud storage needs now and for many years into the future,” said Ed Laczynski, VP Cloud Strategy, Datapipe.
Be on the lookout for upcoming documentation about using Riak CS-backed functionality on Stratosphere at Datapipe. Riak CS is now available with Datapipe in a limited beta, with an upcoming full release.
For a developer trial of Riak CS, sign up here.
February 7, 2013
Basho and our community have a handful of events lined up for February 13th. We have official meetups/group hacks in at least seven cities in the US.
We hope to see you next week. If you can’t attend an official Meetup, throw a Riak hack or drink up in your city and email email@example.com to tell us about it.
Thanks for being a part of Riak.
- Speaker: Weston Jossey, Software Engineer, Tapjoy
- Talk Title: Huge Data Migrations to Riak Made Easy(er)
- Details and RSVP
- Speaker: Sean Cribbs, Software Engineer, Basho Technologies
- Talk Title: The Deep Riak
- Details and RSVP
New York City
- Speaker: Aaron Brown, Lead Systems Engineer, ideel
- Talk Title: Riak at ideel
- Details and RSVP
- Speaker: Adron Hall and You
- Talk Title: Riak Hack & Brew
- Details and RSVP
- Speaker: Robert Zuber, Co-Founder, Copious
- Talk Title: Riak in a Multi-Datastore Strategy at Copious
- Details and RSVP
- Speaker 1: Pavan Venkatesh, Technical Evangelist, Basho Technologies
- Talk Title 1: From Relational to Riak
- Speaker 2: Sajith Kizhakkiniyil, Software Infrastructure and Backend Architecture Support, Apollo Group
- Talk Title 2: Riak at Apollo
- Details and RSVP
- Speaker: Adron Hall and You
- Talk Title: Nerd Lunch and The Start of Seattle Riak
- Details and RSVP