Yesterday it was announced that Apple has acquired FoundationDB. As you may imagine, I have been asked to comment on what this means for the NoSQL database industry and for those who are investing heavily in retooling their traditional database infrastructures with new technologies to meet the availability, scalability, and fault-tolerance characteristics required by the massive influx of data.
NoSQL databases are an increasingly critical part of enterprises’ ability to derive real business value from the massive amounts of data that users, devices and online systems generate. They are also an important part of the developers’ toolkits when building applications for the Internet of Things, a major contributor to this ever-growing body of data. Apple is acutely aware of the importance of being able to reliably scale to meet the real-time data needs of today’s global applications. The news of Apple’s intent to acquire FoundationDB greatly amplifies these points to a growing number of IT and engineering leaders.
Part of the commentary around the announcement is a discussion of Open Source software, both as an underpinning for enterprise infrastructure and as a viable business model. I contributed to a detailed discussion of the latter in a recent Silicon Angle article entitled “NoSQL market frames larger debate: Can open source be profitable?”, noting that there is enormous opportunity for Open Source NoSQL companies if they can serve the specific needs of enterprise customers. We feel that we are doing so, and that our approximately 1:10 ratio of paying customers to Open Source users is an indicator of our solution’s value and the strength of our business. Our clear path to being cash-flow positive includes a measured, strategic investment in R&D, which is essential to ensuring Basho’s corporate viability for all customers who have already made multi-million dollar investments in their business-critical workloads.
Unlike others, the core underpinnings of Riak as a distributed, multi-model data persistence platform are, and will remain, Open Source. Basho builds premium, enterprise-grade features atop this distributed infrastructure, and these features help us attract a higher percentage of paying customers than others in the industry.
Acquisition and consolidation — whether done to enhance technical capabilities, secure talent, or expand a company’s customer base — are essential to the high technology arena. The NoSQL space will see more of this activity than most sectors in the coming year, given the amount of attention it has already received (PwC named NoSQL one of the “surprising digital bets for 2015”) and given the success of the HortonWorks IPO. Combine that buzz with the fact that a prominent database ranking tool lists more than 200 different database management systems, and we are certain to see more industry consolidation.
The decision to re-architect an existing enterprise data workload infrastructure is not one to be taken lightly. Basho’s commitment to Open Source, our commitment to long-term business viability, and our impressive list of customers making substantial investments, point to a bright future not only for our company but for those who choose Riak as a core underpinning of their persistence infrastructure. Apple’s acquisition of FoundationDB strongly validates the value of the solutions we offer and underscores the criticality of these technologies to companies that need to scale business-critical applications.
March 17, 2015
As a Solutions Architect for Basho, I’m often called upon by customers to explain Riak. Frequently this is in the context of a specific use-case they are deploying, or a problem they are attempting to solve. In each of these cases, an important component of the conversation is ensuring a base level of understanding of the design principles in Riak’s distributed architecture. It is this architecture, and these early design decisions that ensure that Riak “just works” and why companies choose Riak for their business critical applications.
This blog highlights key reasons why Riak “just works” from the perspectives of high availability, fault tolerance, data distribution, predictable performance, and operational simplicity. It also shows how Riak addresses data replication, detection of node failures, read latency, and rolling restarts due to upgrades, failures, or operating-system-related issues.
Uniform Data Distribution and Predictable Performance
If you have ever seen a presentation about Riak, you have seen the “ring diagram.” The image is fairly simple and one that we use to describe the data distribution, scalability, and performance characteristics of Riak. But the underlying implementation that this ring represents is an important characteristic of Riak’s architecture.
Riak employs a SHA-1-based consistent hashing algorithm that produces a uniform distribution of keys across a 160-bit space (the ring). This 160-bit space is divided into equal partitions called “vnodes.” These vnodes, in turn, are evenly distributed amongst the participating physical nodes in the cluster. Uniform distribution across the 160-bit space and an even allocation of vnodes amongst physical nodes ensure keys are uniformly distributed about the cluster. Participating nodes in a Riak cluster are homogeneous – meaning any node can service any request – and, due to the nature of consistent hashing, every node in the cluster knows where data should reside within the cluster.
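The mechanics above can be sketched in a few lines. This is an illustrative model, not Riak’s implementation: the partition count, the bucket/key encoding, and the function names are all assumptions made for the example; only the SHA-1 hash, the equal division of the 160-bit space, and the notion of a preference list of successive vnodes mirror the description above.

```python
import hashlib

RING_SIZE = 2 ** 160   # the 160-bit SHA-1 output space
NUM_PARTITIONS = 64    # illustrative partition (vnode) count

def partition_for(bucket: str, key: str) -> int:
    """Hash the bucket/key pair onto the 160-bit ring and return the
    index of the equal-width partition (vnode) it falls into."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode()).digest()
    point = int.from_bytes(digest, "big")          # position on the ring
    return point // (RING_SIZE // NUM_PARTITIONS)  # equal-width partitions

def preference_list(bucket: str, key: str, n_val: int = 3) -> list[int]:
    """The N successive partitions responsible for a key's replicas."""
    start = partition_for(bucket, key)
    return [(start + i) % NUM_PARTITIONS for i in range(n_val)]
```

Because the hash function is fixed and known to every node, any node can compute a key’s preference list locally, which is why any node can service any request.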
Riak’s default behavior is to seek equilibrium. Each node in a cluster is responsible for 1/nth of the data within the cluster and delivers 1/nth of the total performance, where n represents the total node count in the cluster. This architectural principle allows operators to make reasonable assumptions about Riak’s linear scalability. Often in sharded systems, access patterns that disproportionately access specific ranges of data will cause “hot spots” in a cluster, making predictable operations more difficult to maintain. Uniform data distribution about the cluster (and Hinted Handoff, described below) allows for continuous normal operations of an active Riak cluster while maintaining a predictable level of performance as any node is removed from the cluster for any reason. Even in resource-starved virtual environments, a Riak cluster will work to maintain its equilibrium and equal data distribution, such that operators may assemble larger numbers of slower individual virtual machines to achieve their desired performance profile in aggregate. Additionally, each physical node in a Riak cluster maintains its own performance statistics that are easily accessible and parsable, such that an advanced deployment would be able to wrap those statistics into a sick-node detection algorithm based on its own specific thresholds.
Furthering the predictable performance rationale, Riak uses vector clocks, specifically dotted version vectors in Riak 2.0, to internally reason about conflict resolution as it relates to multiple updates to the same object. Updates to any object are independent of updates to any other object, and thus there is no locking in Riak for any of its core operations – reads, writes, and deletes. There are no global or local locks on any tables or rows – those structures do not exist in Riak. Locks introduce nondeterministic delays in performance, yet they are often a necessary component of any absolutely consistent database. Riak, on the other hand, is an eventually consistent database (with the exception of Riak 2.0’s opt-in strong consistency feature, which trades availability for consistency). Additionally, because the base level of abstraction in Riak is the vnode, and data is replicated amongst a set of vnodes operating on distinct physical hardware, background processes such as file compaction are scheduled in such a way that they only affect a segment of the cluster at any given time. This rolling compaction ensures a high degree of availability with minimal performance penalty.
High Availability and Failure Recovery
The measure of a distributed system is not how well that system runs under optimal conditions in the general case, but rather how that system performs in the face of failure. An architect must ask herself how well her system will perform in the face of node failure and how well that system will recover from failure. Failures happen with increasing frequency as the size of your system grows. Riak implements a number of technologies that when combined ensure Riak excels at failure recovery. These technologies provide a baseline set of features that allow Riak to quickly recover from failure scenarios with minimal operator intervention. Not only will these built-in recovery mechanisms maintain eventual consistency within the database but they will also maintain synchronicity with features such as Solr’s full text indexing and Multi Datacenter Replication.
Hinted Handoff ensures data is replicated an appropriate number of times in spite of failure by allowing a node to take over responsibility for a vnode and then return that data to its original “owner.” Handoff from one vnode to another can happen on a temporary (in the case of failure) or permanent (in the case of cluster resizing) basis. In both cases, Riak handles this automatically while the cluster remains available. Because Riak is able to dynamically allocate vnode assignments continuously, the cluster can absorb the loss of any physical node(s) for write operations and ensure availability of data as long as one vnode of any replica set is still accessible. This allows Riak to maintain availability when a node is removed from the cluster for any reason – whether scheduled or not. Ultimately, an unscheduled failure or a scheduled upgrade of Riak or the operating system results in the removal of a node from the cluster – Riak’s core architecture and capabilities accommodate this behavior with minimal operator intervention.
Read Repair is a mechanism triggered on a successful read of a value when not all replicas agree on that value. This can happen in two cases: when one replica returns not found – meaning it doesn’t have a copy – and when one replica returns a value whose vector clock is an ancestor of that of the successful read. When this situation occurs, Riak forces the errant nodes to update the object’s value based on the value of the successful read.
Finally, Active Anti-Entropy (AAE) corrects inconsistent data continuously in the background. Where Read Repair corrects data at read time, AAE corrects all data, regardless of whether or not it is actively accessed, by running a background process that continuously checks for inconsistencies. AAE uses Merkle tree hash exchanges between vnodes to look for these inconsistencies. When a difference at the top of the tree is detected, Riak recursively checks the tree until it finds the exact values that differ between vnodes and then sends the smallest amount of data necessary to regain equilibrium.
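The hash-exchange idea can be sketched as follows. This is a toy model, not Riak’s AAE code: it assumes a power-of-two number of leaves and compares two complete trees directly, but it shows why comparing from the root downward touches only the divergent subtrees.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Build a Merkle tree bottom-up; each level hashes pairs of children.
    Assumes len(leaves) is a power of two. levels[-1][0] is the root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        lvl = levels[-1]
        levels.append([h(lvl[i] + lvl[i + 1]) for i in range(0, len(lvl), 2)])
    return levels

def diff(tree_a: list[list[bytes]], tree_b: list[list[bytes]]) -> list[int]:
    """Return indices of leaves that differ between the two trees,
    descending only into subtrees whose hashes disagree."""
    if tree_a[-1][0] == tree_b[-1][0]:
        return []                                   # roots match: replicas agree
    suspects = [0]                                  # start at the root
    for depth in range(len(tree_a) - 2, -1, -1):    # walk from root toward leaves
        suspects = [c for i in suspects for c in (2 * i, 2 * i + 1)
                    if tree_a[depth][c] != tree_b[depth][c]]
    return suspects
```

When the roots match, the exchange costs a single hash comparison; when they differ, only the path to the divergent leaves is examined, so the repair traffic stays proportional to the actual divergence rather than to the data set size.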
Architecture Makes a Difference
Riak’s core architecture and key technical features provide the building blocks valued in a highly distributed environment. Uniform data distribution, homogeneous nodes, hinted handoff, read repair and active anti entropy all play a role in providing the high availability, fault tolerance, predictable performance and operational simplicity developers and managers are looking for from their non-relational persisted data solutions.
We’ve seen Riak adopted across a wide variety of verticals and for a broad range of use cases. From gaming to retail to advertising to mobile, our customers begin by identifying a workload, or use case, where availability, scalability, and fault tolerance are critical. We then work closely with these customers to ensure Riak is an ideal fit for the architectural design and business requirements. This process often begins with a Tech Talk, where someone like me works with you, either onsite or remotely, to assess how Riak can help you solve your critical business requirements. You can sign up for a Tech Talk here.
February 1st, 2015
If you missed last week’s webinar Preparing for the Deluge of Unstructured Data you can still watch it on-demand. Dorothy Pults and I discuss the news emanating from the 2015 Consumer Electronics Show and highlight that the Internet of Things, connected devices, and the resulting explosion of unstructured data are front and center of growth trends in 2015. In particular, we covered:
- What is driving the growth in unstructured data
- The challenges associated with managing unstructured data
- How companies are capitalizing on the opportunities that unstructured data presents, to save money, time, and create new market opportunities
The webinar covers each of these topics in great detail and provides some insights on distributed systems.
Why Distributed Systems?
Companies like Facebook, Amazon, and Google have built huge distributed systems with strict requirements around scalability, fault tolerance, and global footprints. These same concepts must now be considered by companies of all sizes…from the Enterprise to the startup.
The reality is that everything works at small scale. Challenges arise as it becomes necessary to scale out, up and down, predictably and linearly. When assuming that failure and latency are part of the equation, it is necessary to choose a distributed database that enables horizontal scale. And, similarly, that it enables this scale on commodity hardware or the compute instance that your business has adopted in its architecture. This is particularly important when data governance is a key component of your design considerations.
Ultimately, the customer experience matters. When designing your distributed architecture, and choosing persistence solutions like Riak, ensure that there is a solution for the geographic distribution of data (like Riak Enterprise’s multi-datacenter replication capability) to provide low latency experiences for your customers, regardless of their physical location.
For more information on this topic, we have compiled a few resources to support your research and decision-making.
January 22, 2015
In speaking with Riak users, both open source and commercial, we are frequently told that Riak’s key/value model is more flexible and faster to develop against than a traditional relational database. Even though Riak is well suited for many applications, there are inevitable tradeoffs in terms of the query options and data types that are available. With a key/value model, there is no concept of columns or rows; therefore, Riak does not have join operations. Riak can be queried directly via HTTP or the Protocol Buffers API, or through various client libraries. However, there is no SQL or SQL-like language currently available.
Riak’s key/value data model does not preclude queryability. There are several powerful querying options including:
- Riak Search: Integration with Apache Solr provides full-text search and support for Solr’s client query APIs.
- Secondary Indexes: Secondary Indexes (2i) give developers the ability to tag an object stored in Riak with one or more query values. Indexes can be either integers or strings, and can be queried by either exact matches or ranges of values.
- MapReduce: Developers can leverage Riak MapReduce for tasks like filtering documents by tag, counting words in documents, and extracting links to related data.
For more information, check out the Riak documentation on Querying Data.
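As a concrete sketch of the Secondary Indexes option above, the helper below composes query URLs in the shape of Riak’s documented HTTP 2i endpoint. The host, bucket, and index names are illustrative, and 8098 is assumed to be the node’s HTTP port (Riak’s conventional default); consult the Querying Data documentation for the authoritative interface.

```python
def index_query_url(host: str, bucket: str, index: str, start, end=None) -> str:
    """Build a Riak HTTP secondary-index (2i) query URL.

    Exact match:  /buckets/<bucket>/index/<index>/<value>
    Range query:  /buckets/<bucket>/index/<index>/<start>/<end>

    By convention, integer indexes end in `_int` and string (binary)
    indexes end in `_bin`.
    """
    base = f"http://{host}:8098/buckets/{bucket}/index/{index}"
    return f"{base}/{start}" if end is None else f"{base}/{start}/{end}"
```

For example, `index_query_url("localhost", "users", "age_int", 21, 32)` yields a range query for all users tagged with ages 21 through 32, which could then be fetched with any HTTP client.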
The table below illustrates key/value mappings for common application types. Remember that values in Riak are opaque and stored on disk as binaries – JSON or XML documents, images, text, etc. The way data is organized in Riak should take into account the unique needs of the application, including access patterns such as read/write distribution, latency differences between various operations, use of Riak features (including MapReduce, Search, Secondary Indexes), and more.
| Application | Key | Value |
|---|---|---|
| Session | User/Session ID | Session Data |
| Advertising | Campaign ID | Ad Content |
| Sensor | Date, Date/Time | Sensor Updates |
| User Data | Login, Email, UUID | User Attributes |
| Content | Title, Integer | Text, JSON/XML/HTML Document, Images, etc. |
Consider, for example, one of the canonical use cases for Riak…storing user and session data. In a relational database, the “users” table is well known and, basically, provides a unique identifier per user, and then a series of identifying information about that user as individual columns such as:
- First name
- Last name
- Counter of Site Visits
- Paid Account Identifier
This data can then be used to correlate or count paid users, common interests, etc. via a series of SQL queries against the row/column structure of the users table.
Riak, in contrast, provides flexibility in how this data can be modeled based upon the application use case. It may be desirable to create a Users bucket, with the UserName (or Unique Identifier) as the key and a JSON object storing all user attributes as the value. Or, as we describe in Data Modeling with Riak Data Types, leverage the power of Riak Data Types by creating a map type for each user storing:
- first and last name strings in the register type,
- interests as a set,
- a counter for visits,
- and a flag for paid account identifier.
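One reason to prefer Riak Data Types for this model is that each field type carries its own automatic conflict resolution. The sketch below illustrates those per-field rules on two divergent replicas of the user map; it is a teaching model, not Riak’s implementation (the field names are from the example above, and the register is modeled as a simple (timestamp, value) pair):

```python
def merge_user_maps(a: dict, b: dict) -> dict:
    """Illustrative merge of two divergent replicas of the user map.

    Per-field CRDT-style rules: registers keep the causally newer write
    (modeled here as a (timestamp, value) pair), sets merge by union,
    counters converge on the sum of per-replica increments, and flags
    stay enabled once either replica enables them.
    """
    return {
        "name": max(a["name"], b["name"]),             # register: newer write wins
        "interests": a["interests"] | b["interests"],  # set: union
        "visits": a["visits"] + b["visits"],           # counter: increments sum
        "paid_account": a["paid_account"] or b["paid_account"],  # flag: enable wins
    }
```

Because each field merges deterministically, the application never has to write sibling-resolution code for these values; Riak converges them for you.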
One of the best ways to enable application interaction with objects (a key/value pair) in Riak is to provide structured bucket and key names for the objects. This approach often involves wrapping information about the object in the object’s location data itself.
For example, appending a timestamp, UUID, or geographic coordinate to a key’s name allows for fine-grained queryability via simple lookup to locate and retrieve a specific set of information. Leveraging the same naming mechanism as created for users (UniqueID as the key) enables, in a separate sessions bucket, storing the UUID appended with a timestamp as the key and the session data (in binary format) as the object. In this way, using the same UUID, I am able to obtain both user and session data, stored in different buckets and in different formats.
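A minimal sketch of that key-composition scheme, assuming a UUID per user and an ISO-style timestamp (the separator and timestamp format are illustrative choices, not a Riak convention):

```python
import datetime
import uuid

def session_key(user_id: uuid.UUID, ts: datetime.datetime) -> str:
    """Compose a sessions-bucket key from the user's UUID plus a
    timestamp, so session objects for one user share a common prefix."""
    return f"{user_id}:{ts.strftime('%Y%m%dT%H%M%S')}"
```

With this scheme, the same UUID addresses the user object in the users bucket directly, and any specific session via the composed key, with no secondary lookup structure required.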
For additional information, and more complex considerations such as modeling relationship and advanced social applications, see the Riak documentation on use cases and data modeling.
Resolving Data Conflicts
In any system that replicates data, conflicts can arise – e.g., if two clients update the same object at the exact same time or if not all updates have yet reached hardware that is experiencing lag. Riak is “eventually consistent” – while data is always available, not all replicas may have the most recent update at the same time, causing brief periods (generally on the order of milliseconds) of inconsistency while all state changes are synchronized.
However, Riak does provide features to detect and help resolve the statistically small number of incidents when data conflicts occur. When a read request is performed, Riak looks up all replicas for that object. By default, Riak will return the most updated version, determined by looking at the object’s vector clock. Vector clocks are metadata attached to each replica when it is created. They are extended each time a replica is updated to keep track of versions. Clients can also be allowed to resolve conflicts themselves.
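The ancestry test behind this comparison can be sketched with simple actor/counter maps. This is an illustration of the general vector-clock rule, not Riak’s internal representation (Riak 2.0 uses dotted version vectors, and clocks are opaque metadata from the client’s perspective):

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every event recorded in clock `b`.
    A clock maps actor-id -> update count; a missing actor counts as 0."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def resolve(replicas: list[tuple[dict, str]]) -> list[tuple[dict, str]]:
    """Discard any replica whose clock is strictly dominated by another's.
    More than one survivor means a genuine concurrent conflict."""
    return [
        (clock, value) for clock, value in replicas
        if not any(descends(other, clock) and other != clock
                   for other, _ in replicas)
    ]
```

When exactly one replica survives, Riak can return it as the most recent version; when multiple survive, the writes were concurrent, and resolution falls to Riak Data Types or to client-side logic.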
Further, when an outdated object is discovered as part of a read request, Riak will automatically update the out-of-sync replica to make it consistent. Read Repair, a self-healing property of the database, will even update a replica that returns a “not_found” in the event that a node loses it due to physical failure.
Riak also features “Active Anti-Entropy,” which is an automatic self-healing property that runs in the background. Rather than waiting for a read request to trigger a replica repair (as with Read Repair), Active Anti-Entropy constantly uses a hash tree exchange to compare replicas of objects and automatically repairs or updates any that are divergent, missing, or corrupt. This can be beneficial for large clusters storing “stale” data.
More information on vector clocks, dotted version vectors, and conflict resolution can be found in the online documentation in the section regarding Causal Context.
Multi-site replication is quickly becoming critical for many of today’s platforms and applications. Not only does replication across multiple clusters provide geographic data locality – the ability to serve global traffic at low-latencies – it can also be an integral part of a disaster recovery or backup strategy. Other teams may use multi-site replication to maintain secondary data stores, both for failover as well as for performing intensive computation without disrupting production load. Multi-site replication is included in Basho’s commercial extension to Riak, Riak Enterprise, which also includes 24/7 support.
Multi-site replication in Riak works differently than the typical approach seen in the relational world, multi-master replication. In Riak’s multi-datacenter replication, one cluster acts as a “primary cluster.” The primary cluster handles replication requests from one or more “secondary clusters” (generally located in datacenters in other regions or countries). If the datacenter with the primary cluster goes down, a secondary cluster can take over as the primary cluster. In this sense, Riak’s multi-datacenter capabilities are “masterless.”
In multi-datacenter replication, there are two primary modes of operation: full sync and real-time. In full sync mode, a complete synchronization occurs between primary and secondary cluster(s). In real-time mode, continual, incremental synchronization occurs – replication is triggered by new updates. Full sync is performed upon initial connection of a secondary cluster, and then periodically (by default, every 6 hours). Full sync is also triggered if the TCP connection between primary and secondary clusters is severed and then recovered.
Data transfer is unidirectional (primary->secondary). However, bidirectional synchronization can be achieved by configuring a pair of connections between clusters.
Full documentation for multi-datacenter replication in Riak Enterprise is available in the online documentation.
Modeling data in any non-relational solution requires a different way of thinking about the data itself. Rather than assuming that all data cleanly fits into a structure of rows and columns, the data domain can be overlaid on the core Key/Value store (Riak) in a variety of ways. There are, however, distinct tradeoffs and benefits to understand.
Relational Databases have:
- Foreign keys and constraints
- Sophisticated query planners
- Declarative query language (SQL)
Riak has:
- A Key/Value model where the value is any unstructured data
- More data redundancy that provides better availability
- Eventual consistency
- Simplified query capabilities
- Riak Search
What you will gain:
- More flexible, fluid designs
- More natural data representations
- Scaling without pain
- Reduced operational complexity
For more information on Data Modeling, or to chat with a member of the Basho team on the topic, please request a Tech Talk.
New executive team secured additional funding and drove strong bookings growth
BELLEVUE, Wash. – January 13, 2015 – Basho Technologies, the creator and developer of Riak®, the industry leading distributed NoSQL database, today announced record 2014 sales growth along with closure of a $25 million Series G funding round led by existing investor Georgetown Partners. The financing is being used to expand development and marketing activities.
The company achieved several critical milestones in 2014:
- Grew bookings 62 percent sequentially in Q3 and 116 percent sequentially in Q4
- Grew bookings 88 percent from second half 2013 to second half 2014
- Ended 2014 with 87 percent licensing, 13 percent professional services revenues
- Closed numerous multi-million dollar enterprise deals
- Shipped Riak 2.0
- Shipped Riak CS 1.5
- Replaced Oracle at National Health Service of UK
“The new Basho management team has made strong progress in positioning the company to capitalize on growth opportunities for solutions that enable enterprises to extract value from the massive amounts of data they generate,” said Chester Davenport, chairman of Basho Technologies and managing director of Georgetown Partners. “Riak and Riak CS software have extremely strong product roadmaps for 2015 and sales momentum is impressive. With Series G funding secured, I have confidence Basho will establish itself as a leading unstructured data solutions provider in 2015.”
In March, Basho announced Adam Wray, formerly CEO of Tier 3, as CEO and Dave McCrory, formerly of Warner Music Group and VMware, as CTO. The company also added executive leadership for product, engineering, finance and EMEA management.
Basho has been widely recognized for innovation in distributed systems since being founded in 2008 and Riak has been deployed by more than 30 percent of the Fortune 50. The company experienced a significant increase in enterprise adoption in 2014 in a variety of industries, including advertising, financial services, gaming, retail and healthcare, replacing Oracle at the United Kingdom’s National Health Service.
The Weather Company, which oversees popular brands such as The Weather Channel, weather.com, Weather Underground, Weather Central and WSI, initially selected Riak Enterprise for its Multi-Datacenter Replication capabilities and because the software remains extremely lightweight, easy to use, and simple to operate.
“The amount of data we collect from satellites, radars, forecast models, users and weather stations worldwide is over 20TB each day and growing quickly. This data helps us deliver the world’s most accurate weather forecast as well as deliver more severe weather alerts than anyone else, so it is absolutely mission critical and has to be available all of the time,” said Bryson Koehler, executive vice president and CITO for The Weather Company. “Riak Enterprise Software gives us the flexibility and reliability that we depend on to enable over 100,000 transactions a second with sub 20ms latency on a global basis.”
Tapjoy first deployed Riak software to guarantee performance and uptime, even with peak traffic. It found that Riak helped them keep costs down, decrease engineering complexity, and reduce operational effort due to its ease of use and general stability.
“Two years ago, we implemented Riak Enterprise Software due to its high availability, operational simplicity, and ability to scale,” said Wes Jossey, head of operations at Tapjoy. “When we began, our clusters typically moved around 40,000 operations per second at peak. Today, we now see peaks well over 250,000 operations per second, all while sustaining sub-millisecond response times and rock solid stability. Despite this massive change in growth, we still do not employ any full-time engineers to work on our Riak cluster. It’s really that easy to use.”
OpenX became a Basho customer in 2012 to address multi-datacenter replication and to consolidate the number of databases it was using to support its ad trafficking system. Riak Enterprise Software meets OpenX’s high availability and scale objectives, with its multi-datacenter replication supporting over a billion daily real-time ad requests from a global audience.
“Basho has established themselves as a key OpenX partner,” said Matt Davis, site reliability engineer at OpenX. “They have worked with us in true partnership fashion to keep up with our rapidly scaling business and have always addressed our concerns in a timely manner. As a supporter of both Riak software and the greater Erlang community, OpenX appreciates the strong engineering prowess at Basho.”
“Worldwide demand for NoSQL technologies is driving our growth and greatly expanding our large enterprise deployments,” said Wray. “As NoSQL moves into primetime, we’re seeing more enterprises seek solutions that address a broad range of unstructured data requirements and we expect this trend to increase rapidly. We set aggressive product and sales goals for 2014 and I couldn’t be happier with our achievements this year. We look forward to continuing this acceleration into 2015 and beyond.”
- Basho Website (http://basho.com)
- Basho Blog (http://basho.com/blog/)
- Riak® software (http://basho.com/riak/)
- Riak® CS software (http://basho.com/riak-cloud-storage/)
- Additional Resources (http://basho.com/resources/)
- Twitter: @Basho (https://twitter.com/basho)
- LinkedIn (https://www.linkedin.com/company/basho-technologies-inc)
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy-to-operate at scale. Basho’s distributed database, Riak®, the industry leading distributed NoSQL database, and Basho’s cloud storage software, Riak® CS, are used by fast growing Web businesses and by one third of the Fortune 50 to power their critical Web, mobile and social applications, and their public and private cloud platforms.
Riak is the registered trademark of Basho Technologies, Inc.
Riak software and Riak CS software are available open source. Riak Enterprise Software and Riak CS Enterprise Software offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com.
# # #
January 6, 2015
If you have read about Riak, or seen a member of the Basho team present, you have probably heard the phrase “Your data is opaque to Riak.” While this is not strictly true with the inclusion of distributed Data Types in Riak 2.0, it is a phrase that hints at the core structure of Riak itself.
Riak is a Key Value data store.
In a relational database, data is organized by tables that are separate and unique structures. Within these tables exist rows of data organized into columns. As such, interaction with the database is by retrieving or updating entire tables, individual rows, or a group of columns within a set of rows.
In contrast, Riak has a much simpler data model. An Object is both the largest and smallest element of data. As such, interaction with the database is by retrieving or modifying the entire object. There is no partial fetch or update of the data.
Keys in Riak are simply binary values (or strings) used to identify Objects. The Key/Value pair (or Object) is stored in a higher-level namespace called a Bucket. And, with Riak 2.0, there is an extra layer of abstraction known as Bucket Types.
This Key/Value/Bucket model enables broad flexibility in modeling the application’s data domain with Riak as the data store for persistence.
Another NoSQL model that many are familiar with is the document store. Unlike in the Key/Value model, the data store is aware of the structure of the objects it stores. These objects, or documents, are grouped into “collections” — analogous to relational “tables” — and the data store provides a query mechanism to search collections for objects with particular attributes. When the data being persisted is easily rendered as a JSON document, a document store can seem a natural fit. Some common use cases include product catalog data and content management systems.
The Basho Docs have a lengthy tutorial entitled Using Riak as a Document Store that walks you through the process of leveraging Riak as a document store for a CMS. There are many approaches to modeling, but the tutorial demonstrates the power of Riak 2.0 features by combining the maps data type and indexing that data with Riak Search.
When the data you are persisting can be represented as JSON, and you require the ability to query it, Riak 2.0 is an excellent solution for persisting and modeling document data. The flexibility of the Key/Value model, combined with the power of Riak Search and Riak Data Types, provides you with a highly scalable, highly available document store with rich, full-text query capabilities. In addition, the inclusion of the maps data type means that you don’t have to write complex client-side resolution logic when faced with network partitions: Riak Data Types handle that conflict resolution automatically.
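The document-store pattern above boils down to two operations: store a JSON document under a key, and query a collection by attribute. Here is a toy in-memory Python sketch of that pattern (the `put_doc`/`query` helpers are hypothetical stand-ins, not Riak Search or any client API):

```python
import json

# Toy document store: one "collection" of JSON documents that can be
# queried by field value, unlike a pure key/value store where objects
# can only be fetched whole, by key.
articles = {}  # collection: key -> serialized JSON document

def put_doc(key, doc):
    articles[key] = json.dumps(doc)

def query(field, value):
    # Scan the collection and return matching documents.
    return [json.loads(raw) for raw in articles.values()
            if json.loads(raw).get(field) == value]

put_doc("a1", {"title": "Intro to Riak", "author": "basho"})
put_doc("a2", {"title": "CRDTs", "author": "basho"})
put_doc("a3", {"title": "Unrelated", "author": "guest"})

print([d["title"] for d in query("author", "basho")])
```

In Riak 2.0, the query step would be served by Riak Search over indexed objects rather than a scan, but the shape of the interaction is the same.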
A scalable, available document store that is operationally simple may be reason enough to use Riak. But when you combine these characteristics with the multi-datacenter replication capabilities of Riak Enterprise, you have a solution that brings your data operations closer to the end user.
Scalable, available, operationally simple, and replicated. That’s the power of using Riak as a document store.
December 18, 2014
One of the interesting things about attending industry events, like AWS re:Invent, is identifying common trends that arise in conversations. Recent conversations point to a renewed interest in “enterprise ready replication” for NoSQL databases.
As business data continues to grow, an entirely new set of challenges arises around availability, scalability, and fault tolerance. While most NoSQL databases work at small scale, availability is often compromised as applications reach full production or peak capacity. Having the right replication functionality is key to ensuring that availability requirements are not compromised as your system grows.
“Replication” may mean different things based on context. In this case, we are referring to the movement of data in a database cluster — or across datacenters — for the purpose of redundancy or data locality. If your database experience began in an RDBMS context, then replication implies a specific contextual understanding of multi-master transactional deployment and, perhaps, shipping transaction logs between incremental backups in a hot/warm database configuration. In contrast, for those who began in the NoSQL era, the term may evoke images of replica-sets on a sharded infrastructure and the operational overhead associated therewith.
In a distributed NoSQL database, like Riak, the term replication is used to encompass two distinct concepts. First, intra-cluster replication for high availability and fault tolerance within the datacenter; and second, multi-datacenter replication for data locality and global availability. There is none of the complexity of log shipping or dealing with a sharded infrastructure.
Data replication is a core feature of Riak’s basic architecture. Riak was designed to operate as a clustered system containing multiple nodes (commodity servers or cloud instances). The replication implementation allows data to live on multiple machines at once, with a single write request, in case a node in the cluster goes down or is unavailable due to issues like network partitioning.
Intra-cluster replication is fundamental and automatic in Riak, so that your data is always available. All data stored in Riak is replicated to a number of nodes in the cluster according to a configurable parameter (n_val) set in a bucket's bucket type. With the default n_val setting of 3, there are always three copies of all data, and these copies will be on three different partitions/vnodes. A detailed explanation and analysis of this replication capability is available in the Riak documentation – Understanding replication by example.
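The placement of those n_val copies can be pictured as hashing a key onto a ring of partitions and claiming the next n_val consecutive positions. The following toy Python sketch mimics that idea only; the partition count, hashing, and `preference_list` helper are illustrative assumptions, not Riak's actual internals:

```python
import hashlib

NUM_PARTITIONS = 64  # ring size is a power of two; 64 here for illustration

def preference_list(key, n_val=3):
    """Toy replica placement: hash the key onto a ring of partitions,
    then claim the next n_val consecutive partitions for its copies."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    first = h % NUM_PARTITIONS
    return [(first + i) % NUM_PARTITIONS for i in range(n_val)]

replicas = preference_list("users/alice")
print(replicas)                       # three distinct partitions on the ring
assert len(set(replicas)) == 3        # copies never share a partition
```

Because the ring is far larger than n_val, the three partitions (and hence their vnodes) are always distinct, which is what gives the cluster its redundancy.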
In the case of intra-cluster replication, or what we would refer to simply as “replication”, data distribution ensures redundant data such that high availability is maintained in a failure state.
In contrast to intra-cluster replication, multi-datacenter replication is a critical part of modern application infrastructures. Riak Enterprise offers multi-datacenter replication so that data stored in Riak can be replicated to multiple sites (vs. multiple servers in the same site).
As we are all aware, understanding application latency (for an end user) begins with the realization that data can’t travel faster than the speed of light. So, inherently, as source information moves further from its point of consumption, latency is introduced. As such, there is a set amount of latency for a customer connecting to your application hosted in New York when they are accessing it from San Francisco. This latency profile increases, and becomes more complex, as the geographic distribution of your customer base increases.
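A quick back-of-the-envelope calculation makes the New York to San Francisco example concrete. Light in optical fiber travels at roughly two-thirds the speed of light in a vacuum, and real fiber paths are longer than the great-circle distance, so treat these figures as an illustrative floor:

```python
# Lower bound on network latency between New York and San Francisco,
# imposed purely by the speed of light in fiber (illustrative figures).
distance_km = 4130            # approximate NY-SF great-circle distance
fiber_speed_km_per_ms = 200   # ~200,000 km/s in optical fiber

one_way_ms = distance_km / fiber_speed_km_per_ms
round_trip_ms = 2 * one_way_ms
print(round(round_trip_ms, 1))  # ~41.3 ms round trip, before any server work
```

That is roughly 40 ms of unavoidable round-trip latency before the application does any work at all, which is exactly what placing a replica closer to the user eliminates.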
With multi-datacenter replication in Riak Enterprise, data can be replicated across locations and geographic areas, providing disaster recovery, data locality, compliance with regulatory requirements, and the ability to “burst” peak loads into public cloud infrastructure, among other benefits.
Riak’s multi-datacenter replication is masterless. One cluster acts as a primary, or source, cluster. The primary cluster handles replication requests from one or more secondary, or sink, clusters (generally located in datacenters in other regions or countries). If the datacenter with the primary cluster goes down, a secondary cluster can automatically take over as the primary cluster.
More architectural strategies for multi-datacenter implementations are covered in the Basho whitepaper entitled Riak Enterprise: Multi-Datacenter Replication – A Technical Overview & Use Cases or in the Basho Documentation section Multi-Datacenter Replication: v3 Architecture.
Replication inside a cluster is a core design tenet of Riak. It provides the availability and fault-tolerance characteristics — with low operational overhead — that many unstructured data workloads demand.
Multi-datacenter replication, while related, is an entirely different approach and architecture to enable the geographic distribution of data to solve for redundancy, geo-data locality, etc.
Replication is an important scalability feature of any database deployment. Ensuring that your NoSQL database replicates data in a way that is scalable, operationally simple and achieves your business objectives is key to your success.
For years, the press and industry analysts have been telling us that cloud is mainstream, but the reality is that enterprises must shift their workloads to the cloud in an orderly, low-risk manner. While there are many applications already built and running in the cloud, there are many new (or underutilized and, perhaps, misunderstood) technologies like Docker, Chef, and object storage that are changing the way cloud applications are implemented.
At RICON 2014, Basho worked with Citrix to host “Build a Cloud Day.” Build a Cloud Day sessions explore new technologies and show how to bring some order to the chaos of moving workloads to the cloud. Attendees learn the concepts and best practices for deploying a cloud computing environment using Apache CloudStack and other cloud infrastructure tools, including XenServer, Docker, Riak CS, Chef, Zenoss, Puppet and many others that automate server and network configuration for building highly available cloud computing environments.
Cloud Architecture: Virtualization, Orchestration and Storage
“Build a Cloud Day” started with an excellent presentation by Mark Hinkle. Many of us know him as @mrhinkle. Mark is the Senior Director of Open Source Solutions at Citrix Systems where he helps support the Apache CloudStack and Xen.org projects.
Mark has an excellent grasp of cloud computing and provides an overview of cloud computing architecture and the open source software that can be used to deploy and manage a cloud-computing environment. He looks at virtualization and containers and provides a brief description of Docker and how it is being used in today’s applications.
He also provides an overview of OpenStack. Mark closes the presentation with insights into how to deliver Platform-as-a-Service (PaaS) and what technologies can be used to complement this evolving cloud computing paradigm.
Software is Eating Infrastructure
Other presenters at “Build a Cloud Day” included Basho’s own John Burwell (@john_burwell). John is a Senior Software Engineer at Basho Technologies. He also serves as an Apache CloudStack PMC member and committer focused on storage architecture and security integration. John’s talk explores cloud design strategies to achieve high availability and reliability using commodity components and how to apply these strategies using Apache CloudStack and Riak CS.
By migrating reliability and scalability responsibilities up the stack from specialized hardware to software, cloud orchestration platforms such as Apache CloudStack (ACS) and object stores such as Riak CS increase the utilization and density of compute and storage resources by dynamically shifting workloads based on demand.
John describes two workloads predominately managed in cloud environments — traditional virtualization and cloud — and how to use Apache CloudStack to efficiently manage both simultaneously. He then explores storage design to support this dual workload model, including the use of Riak CS with Apache CloudStack to reduce infrastructure costs without sacrificing reliability.
Riak CS provides software-defined, fault-tolerant object storage uniquely built to handle a variety of unstructured and big data needs using commodity hardware.
Apache CloudStack, Apache Brooklyn and more…
There were many great presentations at “Build a Cloud Day” including:
- Primary Storage in CloudStack by Mike Tutkowski (Slides | Video)
- Introduction to Apache CloudStack by David Nalley (Slides | Video)
- Hypervisor Selection in the Cloud by Tim Mackey (Slides | Video)
- Cloud Application Blueprints with Apache Brooklyn by Alex Heneveld (Slides | Video). Alex also did a Riak-specific presentation at RICON 2014, Running Riak in a Docker Cloud using Apache Brooklyn.
You can find out more about RICON 2014 in our blog post: http://basho.com/wrapping-up-ricon-2014/.
The videos of the presentations at RICON 2014 can be found on our RICON Archive site. The Keynote by Peter Alvaro – Outwards from the Middle of the Maze is very popular.
November 10, 2014
Many data needs are better served by data stores that are optimized for maximum availability and scalability — rather than optimized for consistency. For certain use cases, there are elements to the data that require strong consistency. With Riak 2.0, in addition to eventual consistency, there is now a way to enforce strong consistency when needed.
NOTE: Riak’s strong consistency feature is currently an open-source-only feature and is not yet commercially supported.
Behavioral Changes with Strong Consistency
Strongly consistent operations in Riak function much like eventually consistent operations at the application level. The core difference lies in the types of errors Riak will report to the client.
Each request to update an object (except for the initial creation) must include a context value reflecting the last time the application read it. Riak clients have always followed this behavior with version vectors, and strong consistency also mandates its use. Similarly, reading data from a strongly consistent Riak bucket functions exactly like an eventually consistent read.
If that value is not provided for an update operation to an existing object, Riak will reject it. This is because the database assumes that you have not seen the current value and may not know what you’re doing.
Similarly, if that context value is out of date, Riak will also reject update operations. The client must re-read the latest value and supply an update based on that new value, with the new context.
If Riak cannot contact a majority of the servers responsible for the key, the request will fail. Ordinarily, Riak is happy to accept all operations in the interest of high availability and never dropping a write – even in the extreme case of only one server surviving data center outages.
Strong consistency also eliminates object siblings, as it is effectively impossible for the cluster to disagree on the value of an object.
When considering consistency models in an application, it is easy for the logic to quickly become daunting. This is especially true when designing a workflow that leverages both eventually and strongly consistent models. It is, therefore, easiest to begin with a simple use case.
Consider the workflow involved in storing and updating username and password data. In the case of a password update, it is necessary that — at any given time — there be exactly ONE result for a user’s password. Relatedly, it is important to ensure that an update of this value is fully atomic or user experience is substantially degraded. It would be possible to leverage Riak for all the eventually consistent elements of the application and leverage strong consistency for the username and password.
To see how eventual and strong consistency can be combined to solve business problems, let’s take a not-so-hypothetical example from the energy industry.
Imagine you’re collecting massive amounts of geological data for analysis. Each batch of data must be processed by a single instance of your application. Since this processing can take hours, days, or even weeks to complete, it’s expensive if two applications handle the same batch.
Let’s walk through the sequence of events.
- Batch of data arrives for processing.
- The batch is stored in a large object store (like Riak CS) under a batch ID.
- The batch ID is added to a pending job list in Riak and stored as a set (one of the new Riak Data Types).
This is a classic example of eventual consistency and an illustration of the value of the new Riak Data Types introduced with Riak 2.0. Storing a new batch ID in your database should never fail, even if servers are offline. If multiple applications are adding batch IDs to the pending list at the same time, it’s perfectly reasonable for those lists to temporarily diverge, as long as they can be trivially merged later.
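The "trivially merged later" property is what makes sets a good fit here: under concurrent adds, a set of batch IDs behaves like a grow-only set, and divergent replicas converge by union. This toy Python sketch models the idea, not Riak's CRDT wire format:

```python
# Two replicas of the pending job list diverge under concurrent adds...
replica_a = {"batch-1", "batch-2"}
replica_b = {"batch-1", "batch-3"}   # a concurrent add on another node

# ...and converge by union: the merge is commutative and idempotent,
# so the order in which replicas exchange state doesn't matter.
merged = replica_a | replica_b
print(sorted(merged))
assert merged == replica_b | replica_a   # same result either way
```

Riak's set Data Type also supports removals (which plain union cannot express), but for a pending list that only accumulates new batch IDs, union-style convergence is the whole story.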
Let’s continue to see where strong consistency comes into play.
- A compute node becomes available to process the data.
- The compute node retrieves the pending job list and picks a batch ID.
- The compute node attempts to create a lock for that batch ID.
This is where strong consistency is required. This lock object should be created in a bucket that is managed by the new strong consistency subsystem in Riak 2.0. If someone else also grabs that batch ID and tries to create another lock object, Riak’s strong consistency logic will reject this second attempt. This compute node will just start over by grabbing a new ID.
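The lock step relies on one guarantee: in a strongly consistent bucket, creating a key that already exists fails, so exactly one compute node can win. A toy Python sketch of that create-if-absent behavior (the `acquire_lock` helper is hypothetical, not the Riak API):

```python
# Toy model of a lock in a strongly consistent bucket: the first create
# of a key succeeds, and any later create of the same key is rejected.
locks = {}

def acquire_lock(batch_id, node):
    if batch_id in locks:
        return False                  # second create attempt is rejected
    locks[batch_id] = {"owner": node}  # lock object records its owner
    return True

print(acquire_lock("batch-42", "node-a"))  # True: first creator wins
print(acquire_lock("batch-42", "node-b"))  # False: lock already held
```

Under eventual consistency this pattern would be unsafe, since two nodes could both "create" the key on different replicas; the strong consistency subsystem is what turns the create into an atomic test-and-set.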
To detect crashed jobs, the lock object should be created with basic job data, such as which compute node owns the processing job and what time it was started.
- The compute node asks Riak to add the batch ID to a different set, a running job list.
- The compute node asks Riak to remove the batch ID from the pending list.
- The job runs.
- When completed, the compute node asks Riak to add the batch ID to a completed job list.
- Riak is asked to remove the batch ID from the running list.
- The compute node deletes the lock object (or updates it to reflect the completion of the processing job).
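The full sequence of list transitions above can be sketched with plain Python sets standing in for Riak's set Data Types and the strongly consistent lock bucket; the `process` function and set names are hypothetical illustrations:

```python
# Toy end-to-end walkthrough of the job workflow: pending -> running ->
# completed, guarded by a lock that only one node can create.
pending, running, completed, locks = {"batch-42"}, set(), set(), set()

def process(batch_id):
    if batch_id in locks:
        return False                 # another node already holds the lock
    locks.add(batch_id)              # strongly consistent create wins
    running.add(batch_id)            # add to the running job list...
    pending.discard(batch_id)        # ...then remove from the pending list
    # ... the long-running processing job itself would execute here ...
    completed.add(batch_id)          # record completion
    running.discard(batch_id)
    locks.discard(batch_id)          # delete the lock object
    return True

assert process("batch-42")
print(pending, running, completed)   # set() set() {'batch-42'}
```

Note the ordering: the batch ID is added to the running list before it is removed from pending, so a crash between the two steps leaves the job visible in both lists rather than in neither.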
Tradeoffs When Using Strong Consistency
- Blind updates will be rejected, so the client must read the existing value before supplying a new one (except in the case of entirely new keys).
- Write requests may be slightly slower due to coordination overhead.
- If a majority of the servers responsible for a piece of data are unavailable, write requests will fail. Read operations may fail depending on the freshness of the data that is still accessible.
- Secondary indexes (2i) are not yet supported.
- Multi-datacenter replication in Riak Enterprise is not yet supported.
For more details, see the Riak documentation:
- Using Strong Consistency (for developers)
- Managing Strong Consistency (for operators)
- Strong Consistency (theory & concepts)
Strong Consistency is now available with Riak 2.0. Download Riak 2.0 on our Docs Page.
By: Peter Coppola
We had the opportunity to stop by DATAVERSITY’s NoSQL Now! conference in San Jose last week. I was very impressed with the level of energy and the wide-ranging selection of sessions offered. According to Tony Shaw, CEO of DATAVERSITY (the conference’s organizer), registrations were up 15 percent from 2013.
The exhibition hall was packed and lively as attendees jostled between booths. DATAVERSITY did an outstanding job keeping the show floor tightly packed with exhibitors. The industry was well represented by Cloudera (I saw “Data is the new bacon” t-shirts), MarkLogic, MongoDB, Oracle, and EnterpriseDB – all present as major sponsors. Between conversations, I was able to nab a nifty, versatile screwdriver disguised as a pen from DataStax.
NoSQL Now sessions do rely heavily on sponsors, but with such a wide selection of tracks there’s bound to be a topic of interest at any given time slot. I had a choice of the following concurrent sessions at 4:15 p.m. on Wednesday:
- Internet of Things with MongoDB – MongoDB
- Out with MapReduce, In with Spark – DataStax
- Case Studies in Search and Semantics – MarkLogic
- Just the Right Weather for our Company: How We Chose Our Data Stores – The Weather Company
- NoSQL on ACID – EnterpriseDB
I attended The Weather Company’s session – not only was it the only non-vendor presentation, but the company is also a customer and big fan of Riak. The Weather Company manages five data centers that, in production, handle 25,000 requests per second and distribute 60 GB of data to each data center every 10 minutes. Surya Kangeyan Sivakumar took us through the journey of how The Weather Company selected its data store solutions and how it overcame the mindset of having to use its existing relational database solution just because the company had invested so much in it. Riak was selected, along with other NoSQL solutions, due to the speed and ease with which it could be stood up.
In 2015, Basho looks forward to being a more active participant in NoSQL Now.