Tag Archives: multi-data center

Top Five Questions About Riak CS

May 1, 2013

This post looks at five commonly asked questions about Riak CS – simple, available, open source storage built on top of Riak. For more information, please review our full documentation, or sign up for an intro to Riak CS webcast on Friday, May 10.

What is the relationship between Riak and Riak CS?

Riak CS is built on top of Riak, exposing higher-level storage functions including large object support, an S3-compatible API, multi-tenancy, and per-user storage and access statistics. Riak itself provides the replication, availability, fault-tolerance, and underlying storage functions for the Riak CS implementation. Riak and Riak CS should both be installed on every node in your cluster. While Riak and Riak CS could be run on separate virtual or physical nodes, running them on the same machine minimizes intra-cluster bandwidth usage and is the recommended approach. As with Riak, we advise a minimum 5-node cluster.

When an object is uploaded to Riak CS, it is broken up into smaller blocks, which are then streamed, stored, and replicated in the underlying cluster. A manifest is maintained for each object that records which blocks comprise it; on read, the manifest is used to retrieve all of the blocks and present the full object to the client. In addition to running Riak and Riak CS on each node, Stanchion, a request serializer, must be installed on at least one node in the cluster. This ensures that global entities, such as users and buckets, are unique in the system.
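
For illustration, here is a minimal sketch of checking that all three components are running on a node of a packaged install (the command names assume the standard Riak, Riak CS, and Stanchion packages):

bash
# Sketch: confirm each component is up on a node (assumes the standard packages)
riak ping        # the underlying Riak node
riak-cs ping     # the Riak CS service on the same machine
stanchion ping   # the request serializer, required on at least one node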

What use cases does Riak CS support that Riak doesn’t?

Riak CS has several features that are not provided in the standalone Riak database. One of the most obvious differences is the size of objects supported. Riak CS exposes large object support and includes multi-part upload, so you can upload an object as a series of parts, allowing individual objects into the terabyte range. In Riak, the data model is simply key/value; in Riak CS, the key/value model provides the underlying structure for higher-level storage semantics: users, buckets, and objects. The Riak CS interface is an S3-compatible HTTP API, allowing you to use existing S3 libraries and tools. In contrast, Riak exposes HTTP and protobufs APIs and offers many language-specific clients. Unlike Riak, Riak CS is multi-tenant, with the concept of “users” and per-user reporting on storage and access. This makes it a fit both for private cloud scenarios with multiple internal users and as a foundation for a public cloud storage offering.
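
As an illustration of the S3 compatibility, an existing tool such as s3cmd can be pointed at a Riak CS endpoint. The sketch below assumes your ~/.s3cfg already contains your Riak CS credentials and endpoint settings:

bash
# Sketch: everyday S3 operations against Riak CS via s3cmd
# (assumes ~/.s3cfg is configured with your Riak CS credentials and endpoint)
s3cmd mb s3://my-bucket                      # create a bucket
s3cmd put backup.tar.gz s3://my-bucket/      # upload an object
s3cmd ls s3://my-bucket                      # list objects in the bucket
s3cmd get s3://my-bucket/backup.tar.gz       # download it again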

How do multi-tenancy, authentication and reporting work?

Riak CS exposes an interface for user creation, disablement and credential management. Riak CS can be set so that only administrators can create new users. Administrators also have special privileges including being able to retrieve a list of all users in the system and query the user account information of any user. Once issued credentials, users are able to authenticate, create buckets, upload and download files, retrieve account information, obtain new credentials, or disable their account through the API. Riak CS supports the standard S3 authentication scheme, with support for header and query string authorization.
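
For example, a new user can be created with a single HTTP request to the Riak CS user resource. This is a sketch that assumes a local Riak CS node listening on port 8080 and that anonymous user creation is enabled; otherwise the request must be signed with the admin credentials:

bash
# Sketch: create a Riak CS user (hypothetical host, port, and email)
curl -X POST http://localhost:8080/riak-cs/user \
  -H "Content-Type: application/json" \
  -d '{"email": "alice@example.com", "name": "alice"}'
# The response includes the new user's key_id and key_secret credentials.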

Riak CS exposes storage, usage and network statistics that support use cases like accounting, subscription, billing or multi-group utilization for public or private clouds. Riak CS will report information on how much storage a user is consuming and the network operations related to access. This data is exposed via an HTTP interface and can be queried on the default timespan “now” or as a range from start time through end time. Access statistics are reported as bytes in and bytes out for both object and bucket operations. Reporting of this information can be scheduled for a set interval or manually triggered.
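
As a rough sketch, usage data can be pulled from the usage resource over HTTP. The key ID and timestamps below are hypothetical, and the exact path options are described in the Riak CS documentation:

bash
# Sketch: request access ("a") statistics as JSON ("j") for one user over a range
# (hypothetical key ID and timestamps; the request must be authorized as admin)
curl http://localhost:8080/riak-cs/usage/8NK4FH2SGKJJM8JIP2GU/aj/20130501T000000Z/20130502T000000Z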

What’s the difference between Riak CS and Riak CS Enterprise?

Riak CS Enterprise provides multi-datacenter replication on top of Riak CS. With multi-datacenter replication in Riak CS, global information for users, buckets, and manifests is streamed in real-time from a primary site to a secondary site, so global state is maintained across locations. Objects can then be replicated in either full-sync or real-time sync mode, and the secondary site replicates the object as in normal operations. Additional datacenters can be added to create availability zones or to provide additional data redundancy and locality. Riak CS Enterprise can also be configured for bi-directional replication, and it comes with 24/7, enterprise-level support. More information and pricing can be found here, and full technical information is available on our docs portal. Ready to get started? Sign up for a developer trial of Riak CS Enterprise.

What are your plans for integration of Riak CS with open source compute solutions?

Riak CS provides highly available, distributed storage, making it a natural fit for usage alongside compute solutions. We have partnered with Citrix to collaborate on the integration of Apache CloudStack and Riak CS to create a complete cloud software offering that combines compute and storage in an integrated platform. For more information on our partnership with CloudStack, check out this blog post with the latest update. API and authentication support for OpenStack is also in progress.

Ready to get started? You can download Riak CS here, and check out the Riak CS Fast Track for a hands-on getting started guide.

Introducing Riak 1.3: Active Anti-Entropy, a New Look for Riak Control, and Faster Multi-Datacenter Replication

February 21, 2013

Today we are excited to announce the latest version of Riak. Here is a summary of the major enhancements delivered in Riak 1.3:

  • Introduced Active Anti-Entropy. In distributed systems, inconsistencies can arise between replicas due to failure modes, concurrent updates, and physical data loss or corruption. Pre-1.3 Riak already had several features for repairing this “entropy”, but they all required some form of user intervention. Riak 1.3 introduces automatic, self-healing properties that repair entropy on an ongoing basis.
  • Improved Riak Enterprise’s multi-datacenter replication performance. A new advanced mode for multi-datacenter replication offers better performance, more TCP connections, and easier configuration. Read more in this write-up from GigaOM.
  • Improved graphical user experience. Riak Control, the user interface for managing and monitoring Riak, has a brand new look.
  • Expanded IPv6 support. IPv6 is now supported by all of Riak’s interfaces.
  • Improved MapReduce. Riak MapReduce has improved back-pressure to reduce the risk of overwhelming endpoint processes during large tasks.
  • Simplified log management. Riak can now optionally send log messages to syslog.

Ready to get started or upgrade? Download the new release here, check out the official release notes, or read on for more details. Documentation for all products and releases is available on the documentation site. For an introduction to Riak and what’s new in Riak 1.3, sign up for our webcast on Thursday, March 7.

More on What’s in Riak 1.3

Active Anti-Entropy
A key feature of Riak is its ability to regenerate lost or corrupted data from replicated data stored on other nodes. Prior to this release, Riak provided two methods to repair data:

  • Read Repair: Riak compares the replies from all replicas during a read request, repairing any replica that is divergent or missing data. (K/V data only)
  • Repair Command via Riak Console: Introduced in Riak 1.2, the repair command enables users to trigger a repair of a specific partition (see the sketch after this list). The partition is rebuilt based on a subset of data stored on adjacent nodes in the Riak ring. All data is rebuilt, not just missing or divergent data. (K/V and Search data)
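
As a sketch of the manual repair path, the repair is triggered from the Erlang console of a running node (the partition ID below is hypothetical):

bash
# Sketch: trigger a manual partition repair from the Riak console
# (hypothetical partition ID)
riak attach
# then, at the Erlang shell prompt:
#   riak_kv_vnode:repair(251195593916248939066258330623111144003363405824).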

Riak 1.3 introduces active anti-entropy, a continuous background process that compares and repairs any divergent, missing, or corrupted replicas (K/V data only). Unlike read repair, which is only triggered when data is read, the active anti-entropy system ensures the integrity of all data stored in Riak. This is particularly useful in clusters containing “cold data”: data that may not be read for long periods of time, potentially years. Furthermore, unlike the repair command, active anti-entropy is an automatic process that requires no user intervention and is enabled by default in Riak 1.3.

Riak’s active anti-entropy feature is based on hash tree exchange, which enables differences between replicas to be determined with minimal exchange of information. Specifically, the amount of information exchanged in the process is proportional to the differences between two replicas, not the amount of data that they contain. Approximately the same amount of information is exchanged when there are 10 differing keys out of 1 million keys as when there are 10 differing keys out of 10 billion keys. This enables Riak to provide continuous data protection regardless of cluster size.

Additionally, Riak uses persistent, on-disk hash trees rather than purely in-memory trees, a key difference from similar implementations in other products. This allows Riak to maintain anti-entropy information for billions of keys with minimal additional memory usage, as well as allows Riak nodes to be restarted without losing any anti-entropy information. Furthermore, Riak maintains the hash trees in real time, updating the tree as new write requests come in. This reduces the time it takes Riak to detect and repair missing/divergent replicas. For added protection, Riak periodically (default: once a week) clears and regenerates all hash trees from the on-disk K/V data. This enables Riak to detect silent data corruption to the on-disk data arising from bad disks, faulty hardware components, etc.

New Look for Riak Control
Riak Control is a UI for managing and monitoring your Riak cluster. Riak Control lets you start and restart Riak nodes, view a “health check” for your cluster, see all nodes and their current status, and gain visibility into their partitions and services. Riak Control now has a brand new look and feel. Check out the Riak Control GitHub page to get up and running.

Expanded IPv6 Support
While Riak’s HTTP interface has always supported IPv6, its other interfaces have not kept pace. In Riak 1.3, the protocol buffers interfaces can now listen on IPv6 or IPv4 addresses. Riak handoff (which is responsible for data transfer when nodes are added or removed, and for handing off update responsibilities when nodes fail) also supports IPv6. Community member Tom Lanyon started the work on this feature. Thanks, Tom!

Improved Backpressure in Riak MapReduce
Riak has JavaScript and Erlang MapReduce for performing aggregation and analytics tasks. Backpressure is an important aspect of the MapReduce system, keeping processes from being overwhelmed and memory consumption from getting out of control. In Riak 1.3, tunable backpressure is extended to the MapReduce sink to prevent these types of problems at endpoint processes.
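
As a quick illustration (not specific to the 1.3 changes), a JavaScript MapReduce job can be submitted over Riak’s HTTP interface. The sketch below assumes the default HTTP port, a bucket named "my_bucket", and uses the built-in Riak.mapValuesJson function:

bash
# Sketch: run a simple JavaScript MapReduce job over HTTP (default port 8098)
curl -X POST http://127.0.0.1:8098/mapred \
  -H "Content-Type: application/json" \
  -d '{"inputs": "my_bucket",
       "query": [{"map": {"language": "javascript", "name": "Riak.mapValuesJson"}}]}'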

Riak Enterprise: Advanced Multi-Datacenter Replication Capabilities
With hundreds of companies using Riak Enterprise, a commercial extension of Riak, we’ve been lucky to work with many teams pushing the limits of multi-datacenter replication performance and resiliency. We’ve learned a lot and are excited to announce these capabilities are now available in advanced mode.

  • Previously, multi-datacenter replication had one TCP connection over which data was streamed from one cluster to another. This could create a performance bottleneck, especially when run on nodes constrained by per-instance bandwidth limits, such as in a cloud environment. In the new version of multi-datacenter replication, multiple concurrent TCP connections (approximately one per physical node) and processes are used between sites.
  • Configuration of multi-datacenter replication is easier: use a shell command to name your clusters, then connect the clusters using a simple ip:port combination (see the sketch after this list).
  • Better per-connection statistics for both full-sync and real-time modes.
  • New ability to tune the number of full-sync workers per node and per cluster, allowing customers to dial in performance.
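
A rough sketch of the new configuration flow is shown below; the cluster names, addresses, and ports are hypothetical, and the exact riak-repl subcommands are covered in the Riak Enterprise documentation:

bash
# Sketch: name two clusters and connect them in advanced mode (hypothetical values)
# On a node in the source cluster:
riak-repl clustername newyork
# On a node in the sink cluster:
riak-repl clustername boston
# Back on the source cluster, connect to the sink by ip:port:
riak-repl connect 10.0.2.10:9080
# Enable and start realtime replication to the named sink:
riak-repl realtime enable boston
riak-repl realtime start boston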

The new replication improvements are already in production use by customers and are yielding significant performance gains. For now, the new replication technology is available as an optional advanced mode. It doesn’t yet have all of the features of the default mode, including SSL, NAT support, and full-sync scheduling. Both default and advanced modes are available in the 1.3 release and function independently. In the future, advanced mode will become the default.

For more details about multi-datacenter replication, download our whitepaper, “Multi-Datacenter Replication: A Technical Overview.”

Basho

TV Everywhere With Synacor and Riak

January 9, 2013

Synacor’s TV Everywhere platform enables cable, satellite, consumer electronics and telco companies to stream content and programming to any device, anytime. TV Everywhere also provides innovative search, discovery and recommendation solutions combined with deep social media integration.

Synacor TV Everywhere uses Riak as object storage for video clips, news stories, and other content. Synacor originally used a relational database as its primary datastore, but API response times had started to slow as more assets were added. After evaluating several possible solutions, the team chose to move to Riak for its low latency and its ability to meet Synacor’s high availability requirements.

Riak Enterprise has been deployed in multiple Synacor datacenters and has improved the API response time significantly since its integration. Synacor now stores over 5 million assets with thousands being added daily. According to Michael Collins, Synacor’s Senior Director of Engineering, “Riak has never been the source of a bottleneck for us. It’s been great.”

For more details, check out the complete case study, “TV Everywhere with Synacor and Riak”

Basho

Riak Enterprise at Velti

December 3, 2012

In 2009, mobile marketing and advertising technology provider Velti had a good (but challenging) problem on their hands. Their technology, which allows people to interact with their TV by voting, giving feedback, participating in contests, etc., had taken off. It had been adopted by nearly all of the TV broadcasters in the UK and three of the UK’s five mobile operators. As more customers began using their technology, Velti saw quick growth in (inherently spiky) traffic. Their 2003-era .NET and SQL Server platform was becoming a concern.

Because the team at Velti had been working with Erlang (the language Riak is written in), in 2010 they brought in Erlang Solutions to help them architect their next generation platform. Riak was chosen as the database, and an early version of Multi-Data Center replication in Riak Enterprise was used to build two geographically separated sites to guard against potential catastrophic outages.

Velti’s new mGage™ platform now runs on 18 servers across two data centers (nine nodes in each), with each server running both Erlang applications and Riak. We’re pleased to pass along reports that the platform is redundant, that queue behavior has significantly improved (especially for large queue populations), and that after Velti moved to Riak 1.2, they saw noticeable improvements in disk space utilization thanks to improvements in merge management.

Markus Kern, VP Technology at Velti, summarizes: “We operate a 24/7 service for over 140 customers. We cannot afford a single minute of downtime. Riak gives us the ability to meet and exceed our requirements for scale, data, durability, and availability.” Woot!

For more details on Velti’s experience, see our case study Highly Available Mobile Platform With Riak.

Basho Team

Announcing Riak Cloud Storage With Multi-Datacenter Replication

November 30, 2012

Riak Cloud Storage is an S3-compatible, multi-tenant storage platform built on Riak. It combines the availability and fault tolerance of Riak with the ability to store large objects, an S3-compatible API, user administration, and usage reporting. It can be used for public and private clouds or as reliable storage for applications. Today we’re announcing multi-datacenter replication support in Riak CS. Increasingly, global enterprises and apps require multi-site storage replication to achieve data locality, ensure availability in disaster scenarios, or maintain active backups, so we’re very excited to provide these features in the latest release of Riak CS.

You can read more about multi-datacenter replication for Riak CS in the public docs, or sign up for an upcoming webcast on Thursday, December 6, which gives a technical overview of Riak CS and discussion of new features. If you want something more hands on, get a developer trial of Riak CS to take it for a test drive.

Technical Details

Multi-datacenter replication in Riak CS provides two modes of object replication: full sync and real-time sync. Data is streamed over a TCP connection, and multi-datacenter replication in Riak CS has support for SSL so data can be securely replicated between sites.

In Riak CS, large objects are broken into blocks and streamed to the underlying Riak cluster on write, where they are replicated for high availability (3 replicas by default). A manifest for each object is maintained so that blocks can be retrieved from the cluster and the full object presented to clients. For multi-site replication in Riak CS, global information for users, buckets, and manifests is streamed in real-time from a primary site to a secondary site so that global state is maintained across locations. Objects can then be replicated in either full-sync or real-time sync mode.

In full-sync mode, objects are replicated from the primary site to a secondary site on a configurable interval (the default is 6 hours). Each cluster computes a hash for each key’s block value; key/block pairs are compared, and the primary site streams any missing blocks or updates to the secondary site.

Real-time sync is triggered when an update is sent from a client to a primary Riak CS implementation. Once replicated in the first location, the updates are streamed in real-time to the secondary site. But what happens if a client requests an object from the secondary cluster and not all of its blocks have been replicated to that cluster? With Riak multi-site replication, the secondary cluster will request any missing blocks from the primary cluster so that the client can be served.

Try It Out

We’ve got two ways for you to try out Riak CS software. First, we can give you access to a hosted version where you can upload files, test out the API, and try s3cmd or other clients against it. If you want to try Riak CS on your own hardware, we also have a developer trial that gives you access to the Riak CS code and a little bit of our help to get you up and running. So check out the docs and then sign up to start.

Basho Team

Multi-Data Center Replication in Riak Enterprise 1.2

August 8, 2012

The Replication team @Basho has been hard at work implementing new features for Multi-Data Center (MDC) Replication. These new features are the direct result of customer feedback, and are included in the release of Riak Enterprise 1.2. Riak Enterprise documentation is also now publicly available for the first time.

What is MDC Replication?

Replication is a tool available in Riak Enterprise that allows data to be copied between Riak clusters. Data can be copied on initial connection to a remote cluster, in realtime as a bucket is updated, or as a periodic full-synchronization. Although replication is uni-directional, remote clusters can be set up to replicate data back to a primary cluster, thus synchronizing bi-directionally.

These settings are all configurable alongside other Riak settings in app.config, and by using the Riak Enterprise command line tool riak-repl (found in your Riak Enterprise ./bin directory).
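
As a minimal sketch (with a hypothetical node name, addresses, and ports), uni-directional replication from a source cluster to a sink cluster is set up by adding a listener on the source and a site on the sink:

bash
# Sketch: replicate from cluster A to cluster B (hypothetical values)
# On a node in cluster A (the source), start a replication listener:
riak-repl add-listener riak@10.0.1.10 10.0.1.10 9010
# On a node in cluster B (the sink), add cluster A's listener as a "site":
riak-repl add-site 10.0.1.10 9010 cluster_a
# Repeat the pair in the opposite direction for bi-directional replication.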

What’s new?

SSL

As replicating sensitive data over the internet isn’t safe, we now provide encryption via OpenSSL out of the box. Certificates signed by a standard Certificate Authority (CA) such as Verisign are supported, as well as self-signed certs.

Certificate chains can be validated down to the CA, but both certificates must resolve to the same root CA. Additionally, you can configure the number of intermediate CAs allowed. Certificate common name whitelisting is also supported.

Enabling SSL is as easy as adding these four parameters to the riak_repl section of app.config:

erlang
{ssl_enabled, true},
{certfile, "/full/path/to/site1-cert.pem"},
{keyfile, "/full/path/to/site1-key.pem"},
{cacertdir, "/full/path/to/cacertsdir"}

Additional SSL configuration parameters are documented in the forthcoming Riak Enterprise Replication Operations Guide.

Per-bucket replication settings

Per-bucket replication allows for more granular control of exactly what and how things get replicated. Using this feature is as easy as setting a bucket property. Supported per-bucket replication schemes are: realtime only, full-sync only, both realtime + full-sync, and no replication.

For example, to entirely disable replication on a bucket titled “my_bucket”:

bash
curl -v -X PUT -H "Content-Type: application/json" -d '{"props":{"repl":false}}' http://127.0.0.1:8091/riak/my_bucket

The following example only replicates data during a full-sync (skipping real-time replication) on a bucket titled “my_bucket”:

bash
curl -v -X PUT -H "Content-Type: application/json" -d '{"props":{"repl":"fullsync"}}' http://127.0.0.1:8091/riak/my_bucket

These parameters are documented in the forthcoming Riak Enterprise Replication Operations Guide.

Extensive documentation updates

We are excited to be releasing new and improved Riak Enterprise documentation in v1.2. This documentation is now available publicly on the Riak wiki. Additional settings have been documented which allow for greater control of replication behavior.

Support for replication over NAT

It’s typical to see Network Address Translation (NAT) in an enterprise environment, so support has been added to make this easier for our customers to use. Combining SSL + replication over NAT should take care of securely copying data over the internet.

The new command:

bash
riak-repl add-nat-listener <nodename> <internal_ip> <internal_port> <nat_ip> <nat_port>

will allow the primary cluster (aka “listener”) to replicate data on both an internal IP/port and public IP/port.

Replication over NAT Example

Server A is the primary source of replicated data.

Server B and Server C would like to be clients of replicated data.

To configure this scenario:

Server A is set up with static NAT and configured with two IP addresses: 192.168.1.10 (internal) and 50.16.238.123 (public).

Server A replication will listen on:

  • the internal IP address 192.168.1.10, port 9010
  • the public IP address 50.16.238.123, port 9011

Server B is set up with a single public IP address: 50.16.238.200. Its replication will connect as a client to the public IP address 50.16.238.123, port 9011.

Server C is set up with a single internal IP address: 192.168.1.20. Its replication will connect as a client to the internal IP address 192.168.1.10, port 9010.

Configure a listener (replication server) on Server A:

bash
riak-repl add-nat-listener riak@192.168.1.10 192.168.1.10 9010 50.16.238.123 9011

Configure a site (replication client) on Server B:

bash
riak-repl add-site 50.16.238.123 9011 server_a_to_b

Configure a site (replication client) on Server C:

bash
riak-repl add-site 192.168.1.10 9010 server_a_to_c


To summarize, we hope that SSL, replication over NAT, per-bucket replication settings and updated documentation will allow for better control of Riak MDC Replication in your enterprise installation.

Thanks for reading!

Dave, Andrew, and Chris