Riak is an open-source, distributed database. Riak is architected for:
- Low-Latency: Riak is designed to store data and serve requests predictably and quickly, even during peak times.
- Availability: Riak replicates and retrieves data intelligently, making it available for read and write operations even in failure conditions.
- Fault-Tolerance: Riak is fault-tolerant so you can lose access to nodes due to network partition or hardware failure and never lose data.
- Operational Simplicity: Riak allows you to add machines to the cluster easily, without a large operational burden.
- Scalability: Riak automatically distributes data around the cluster and yields a near-linear performance increase as capacity is added.
For commercial and datacenter applications, Riak Enterprise adds Multi-Datacenter Replication, monitoring and 24×7 support.
Traditional database architectures were first developed in the late 1960s and early 1970s, and they were the default option for many pre-Internet applications. Riak is built to handle a range of challenges that applications face at scale, including tracking user sessions, storing fast-growing unstructured or connected-device data, and keeping globally distributed reads and writes fast.
User session data is used by an application as it interacts with the end user. Session data is typically passed back and forth between the end user's client (browser, phone, etc.) and the server, where it is stored until new session data arrives with changes from the user. This session data is often critical for keeping users engaged, delivering content or software downloads, and completing transactions or purchases. Riak is architected to handle this type of data. It is designed to never lose a write and to scale horizontally, so that even on peak days all of your users' actions are completed seamlessly.
Connected Device Data
Internet of Things (IoT) and web applications gather and host vast quantities of frequently generated data, often from thousands or millions of devices. This data can be time-series data updated at hour, minute, second, or even millisecond intervals. Riak is a data repository designed to scale in an unbounded and cost-effective manner so that it can retain this quickly generated, and often unstructured, data.
For more use cases and customer stories, browse the Industries section.
Developing on Riak
Riak uses a simple key/value model for object storage. Objects in Riak consist of a unique key and a value, stored in a flat namespace called a bucket. You can store anything you want in Riak: text, images, JSON/XML/HTML documents, user and session data, backups, log files, and more.
Riak Data Types
Riak integrates cutting-edge data types known as convergent replicated data types (CRDTs). These developer-friendly distributed data types track operations (create, read, update, and delete) in an eventually consistent environment, so that replicas updated independently converge on the same value. Riak data types include counters, sets, flags, registers, and maps.
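To make the idea of convergence concrete, here is a minimal sketch of a grow-only counter, one of the simplest convergent data types. It is an illustration of the general technique, not Riak's implementation: the class name `GCounter`, its methods, and the actor names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter (hypothetical sketch): one tally per replica ("actor")."""

    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, actor: str, amount: int = 1) -> None:
        # Each replica only ever raises its own entry.
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[actor] = self.counts.get(actor, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum of every replica's tally.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Element-wise maximum is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        actors = self.counts.keys() | other.counts.keys()
        return GCounter(
            {a: max(self.counts.get(a, 0), other.counts.get(a, 0)) for a in actors}
        )
```

If replica `node_a` records two increments and replica `node_b` records three while they cannot reach each other, merging in either order gives a counter whose value is 5.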
APIs and Client Libraries
Riak provides a straightforward RESTful API as well as a Protocol Buffers interface. Client libraries are available for many languages, including Java, Python, Perl, Erlang, Ruby, PHP, .NET, and many others.
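As a rough illustration of the RESTful interface, the sketch below builds (but does not send) HTTP requests for storing and fetching an object. It assumes a node listening on 127.0.0.1 port 8098 and the `/buckets/<bucket>/keys/<key>` URL layout; check both against the HTTP API documentation for your Riak version. The helper names `object_url`, `store_request`, and `fetch_request` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a local node with the HTTP listener on port 8098.
BASE_URL = "http://127.0.0.1:8098"


def object_url(bucket: str, key: str, base_url: str = BASE_URL) -> str:
    """Build the URL for one object, percent-encoding bucket and key."""
    return f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def store_request(
    bucket: str, key: str, body: bytes, content_type: str = "application/json"
) -> Request:
    """Prepare a PUT that would store `body` under bucket/key."""
    return Request(
        object_url(bucket, key),
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def fetch_request(bucket: str, key: str) -> Request:
    """Prepare a GET that would read the object back."""
    return Request(object_url(bucket, key), method="GET")
```

Passing either request to `urllib.request.urlopen` would send it to a running node. One of the client libraries listed above is usually the more convenient choice in application code.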
Search and Accessing Data
Riak has several additional features for querying data, including:
- Riak Search: Integration with Apache Solr adds robust query and search to Riak’s highly available distributed key/value store. Support for Solr’s client query APIs enables ease of integration with a wide variety of existing software and commercial solutions.
- Secondary Indexes: Tag objects stored in Riak with additional values and query by exact match or range.
- MapReduce: Non-key-based querying for large datasets.
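The secondary-index idea in the list above (tag an object, then query by exact match or range) can be sketched with a toy in-memory bucket. This is a conceptual model only and does not reflect how Riak stores or queries indexes; `IndexedBucket`, its methods, and the index names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


class IndexedBucket:
    """Toy key/value bucket with secondary indexes (hypothetical sketch)."""

    def __init__(self) -> None:
        self.objects: Dict[str, Any] = {}
        self._tags: Dict[str, Dict[str, Any]] = {}
        # index name -> index value -> set of object keys
        self._indexes = defaultdict(lambda: defaultdict(set))

    def put(self, key: str, value: Any, tags: Optional[Dict[str, Any]] = None) -> None:
        # Drop index entries left over from a previous version of the object.
        for name, old in self._tags.get(key, {}).items():
            self._indexes[name][old].discard(key)
        self.objects[key] = value
        self._tags[key] = dict(tags or {})
        for name, tag_value in self._tags[key].items():
            self._indexes[name][tag_value].add(key)

    def query_exact(self, index: str, value: Any) -> List[str]:
        """Keys whose tag on `index` equals `value`."""
        return sorted(self._indexes[index].get(value, set()))

    def query_range(self, index: str, low: Any, high: Any) -> List[str]:
        """Keys whose tag on `index` lies in the inclusive range [low, high]."""
        hits = set()
        for tag_value, keys in self._indexes[index].items():
            if low <= tag_value <= high:
                hits |= keys
        return sorted(hits)
```

For example, after tagging user objects with an `age` and a `city`, `query_exact("city", "paris")` returns the keys tagged with that city, and `query_range("age", 31, 45)` returns the keys whose age tag falls in that range.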
What is a Riak Node?
Each node in a Riak cluster is the same – containing a complete, independent copy of the Riak package. There is no “master.” This uniformity provides the basis for Riak’s fault-tolerance and scalability. Riak is written in Erlang, a language designed for massively scalable systems.
Data is distributed across nodes using consistent hashing. Consistent hashing ensures data is evenly distributed around the cluster and new nodes can be added automatically, with minimal reshuffling.
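The following is a minimal sketch of a generic consistent-hashing ring, included to show why adding a node causes little reshuffling. It is not Riak's actual partitioning code: the class `HashRing`, the `points_per_node` parameter, and the use of SHA-1 for ring positions here are illustrative assumptions.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def ring_position(label: str) -> int:
    """Map a string to a point on the ring (a 160-bit integer)."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


class HashRing:
    """Generic consistent-hashing ring (hypothetical sketch)."""

    def __init__(self, nodes: Iterable[str], points_per_node: int = 64) -> None:
        self.points_per_node = points_per_node
        self._positions: List[int] = []   # sorted ring positions
        self._owners: Dict[int, str] = {}  # position -> owning node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Several points per node spread each node's share around the ring.
        for i in range(self.points_per_node):
            position = ring_position(f"{node}#{i}")
            self._owners[position] = node
            bisect.insort(self._positions, position)

    def node_for(self, key: str) -> str:
        # The owner is the first point clockwise from the key's position,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_right(self._positions, ring_position(key))
        return self._owners[self._positions[index % len(self._positions)]]
```

When a fourth node joins a three-node ring, the only keys whose owner changes are the ones the new node takes over; every other key stays where it was. A naive `hash(key) % node_count` scheme, by contrast, would remap most keys.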
Riak automatically replicates data in the cluster (three replicas per object by default). Because each object lives on more than one node, you can lose access to nodes in the cluster due to failure conditions and still maintain read and write availability.
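One way to picture replica placement is to walk clockwise around the hash ring from the key's position and take the next three distinct nodes. The sketch below does exactly that under simplifying assumptions (one ring position per node, SHA-1 positions). The function names `replica_nodes` and `reachable_replicas` are hypothetical, and this is not Riak's placement algorithm.

```python
import hashlib
from typing import Iterable, List, Set


def _position(label: str) -> int:
    """Ring position of a node name or key (illustrative hash choice)."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


def replica_nodes(key: str, nodes: Iterable[str], n_replicas: int = 3) -> List[str]:
    """Return up to `n_replicas` distinct nodes, walking clockwise from the key."""
    ring = sorted(set(nodes), key=_position)
    if not ring:
        return []
    key_position = _position(key)
    # First node at or after the key's position; wrap to index 0 if none.
    start = next(
        (i for i, node in enumerate(ring) if _position(node) >= key_position), 0
    )
    count = min(n_replicas, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


def reachable_replicas(replicas: List[str], down: Set[str]) -> List[str]:
    """Replicas that can still serve requests when the `down` nodes are unreachable."""
    return [node for node in replicas if node not in down]
```

With three replicas on three different nodes, losing any single node still leaves two copies reachable, which is the intuition behind the availability claim above.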
When Nodes Fail
If a node fails or is partitioned from the rest of the cluster, a neighboring node will take over its storage operations. When the failed node returns, the updates received by the neighboring node are handed back to it. This ensures availability for writes or updates, and happens automatically.
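The takeover-and-return behavior described above is commonly called hinted handoff. Here is a toy simulation of the idea, assuming a simple ordered list of nodes where the "neighbor" is the next reachable node in the list. The `Node` and `Cluster` classes are hypothetical and greatly simplified compared with a real cluster.

```python
from typing import Any, Dict, List


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.data: Dict[str, Any] = {}
        # Writes held on behalf of unreachable owners: owner name -> {key: value}
        self.hints: Dict[str, Dict[str, Any]] = {}


class Cluster:
    """Toy hinted-handoff simulation (hypothetical sketch)."""

    def __init__(self, names: List[str]) -> None:
        self.nodes = [Node(name) for name in names]

    def by_name(self, name: str) -> Node:
        return next(node for node in self.nodes if node.name == name)

    def _fallback_for(self, owner: Node) -> Node:
        # The first reachable node after the owner acts as its stand-in.
        start = self.nodes.index(owner)
        for step in range(1, len(self.nodes)):
            candidate = self.nodes[(start + step) % len(self.nodes)]
            if candidate.up:
                return candidate
        raise RuntimeError("no reachable node can accept the write")

    def write(self, owner_name: str, key: str, value: Any) -> str:
        """Store a value; return the name of the node that accepted it."""
        owner = self.by_name(owner_name)
        if owner.up:
            owner.data[key] = value
            return owner.name
        fallback = self._fallback_for(owner)
        fallback.hints.setdefault(owner.name, {})[key] = value
        return fallback.name

    def recover(self, owner_name: str) -> None:
        """Bring a node back and hand its held writes over to it."""
        owner = self.by_name(owner_name)
        owner.up = True
        for node in self.nodes:
            owner.data.update(node.hints.pop(owner.name, {}))
```

In this model a write aimed at a failed node still succeeds because its neighbor accepts it, and calling `recover` moves the held writes back to the original owner.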
Riak is designed to make scaling up or down easy. Growing your cluster doesn't require increasing your operational staff. When you add or remove nodes, data is rebalanced automatically with no downtime. Developers don't need to deal with the underlying complexity of what data is where, because any node can accept and route requests.
Stats and DTrace Support
Basho takes pride in developing, releasing, and supporting open source projects, and in nurturing and building communities around them. Get connected with us on Twitter, LinkedIn, IRC, or Facebook. Make sure to sign up for the Riak users mailing list. Riak is one of our most popular pieces of code, but it's by no means the only one. If you want to browse all of Basho's code, visit our GitHub account.
RIAK PRODUCT PORTFOLIO
Basho offers support-only options for users of Riak open source. Riak Starter and Riak Basic provide Basho engineering and customer support for those not requiring the benefits of Riak Enterprise.