San Francisco, CA – June 18, 2013 – On Thursday, July 20th, Basho’s Chief Architect Andy Gross will be speaking at Structure, GigaOm’s flagship conference on cloud computing and internet infrastructure, in San Francisco. His talk entitled, “The Free Lunch is Over Again,” will discuss the resurgence in interest in both theoretical and applied distributed systems, explore new areas of promising research, and provide practical advice for dealing with systems in our new distributed world. He will be speaking at 10:20am.
Additionally, Basho is also a sponsor of GigaOm Structure and will have a booth throughout the conference. Feel free to stop by with any questions about Riak or distributed systems. We are also announcing an exciting new partner tomorrow during the conference and will be available to answer any questions related to the partnership as well.
If you aren’t able to attend Structure this week, Basho employees are also speaking all over the world. Visit the Basho Events page to see where we’ll be next.
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy-to-operate at scale. Basho’s distributed database, Riak and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by over 25 percent of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com. Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
February 25, 2013
This post takes an in-depth look at Riak Enterprise’s new multi-datacenter replication capabilities, available in the recent 1.3 release. Riak Enterprise’s multi-datacenter replication now ships with “advanced mode,” which features some performance and resiliency improvements that we’ve developed by working with production customers:
- Instead of having only one TCP connection over which data is streamed from one cluster to another, this new version features multiple concurrent TCP connections (approximately one per physical node) and processes are used between sites. This prevents against possible performance bottlenecks, which can be especially common when run on nodes constrained by per-instance bandwidth limits (such as in a cloud environment).
- Easier configuration of multi-datacenter replication. Simply use a shell command to name your clusters, then connect both clusters using an ip:port combination.
- Better per-connection statistics for both full-sync and real-time modes.
- New ability to tweak full-sync workers per node and per cluster, which allows customers to dial-in performance.
Details of the advanced mode capabilities are below. For more about the multi-datacenter replication upgrades and our 1.3 release, check out this recent article from GigaOM. For full technical details, check out our documentation on multi-datacenter replication. For an examination of common architectures and use cases for Riak Enterprise, including datacenter failover, active-active cluster configurations, availability zones, and cloud bursting, download our technical overview.
The new advanced mode of Riak Enterprise’s multi-datacenter replication takes the best features from the past single channel communications model and scales it up to one-to-one connections between peer nodes of clusters. With concurrent channels and the ability to constrain the maximum connections per node and per cluster, the new multi-datacenter replication allows the full capacity of the network and cluster size to scale the performance to available resources.
It begins with a much easier configuration command language and environment, with natural objects as sources, sinks, and cluster names. For example, real-time and full-sync enable/disable, start/stop, and status all use these human friendly symbols. All of the connections go through a single port, reducing network administration to a single firewall port forwarding. Riak then manages the different protocols on this port. Connections are persistent, resilient to outages, and adapt to changes in cluster names and IP addresses automatically.
Two Sync Modes
Real-time synchronization between clusters uses a new queueing mechanism that ensures maximum performance and graceful shutdown of nodes. This guarantees that there is no loss of replication data during upgrades or scheduled maintenance. It also automatically balances the load across all nodes of both the source and sink clusters and requires no configuration.
Full-synchronization between clusters uses a scheduling algorithm to maximize the transfer rate of data between peer nodes of the two clusters. Partitions are synchronized in parallel so that the maximum number of keys can be updated concurrently with the minimum overlap, which minimizes load and contention on both the source and sink clusters. Full-sync supports concurrent syncs between multiple clusters and optimizes the load dynamically, ensuring nodes never exceed their configured connectivity. This allows clusters to synchronize at maximum efficiency, without impacting their runtime performance for connected clients as they make PUT and GET requests.
We are also planning to include Secure Sockets Layer and Network Address Translation support in this advanced mode of multi-datacenter replication – it is currently only available in default mode. Additionally, future improvements will take advantage of the Active Anti-Entropy (that was introduced in Riak 1.3) to make full-sync differencing even faster. Stay tuned for more updates!
To learn more about Riak 1.3 and the new advanced mode for multi-datacenter replication, sign up for our webcast on Thursday, March 7th.