September 18, 2013
The other day you heard about a cool new object storage solution, Riak CS, with an Amazon S3-compatible API. You starred the repository on GitHub so that you could easily find it on another day when there’s more time to play.
That day is today.
(If you haven’t heard about the cool new object storage solution called Riak CS, today is your lucky day.)
You download and install the Riak and Riak CS packages for your operating system and dig into the configuration files. For Riak CS, the configuration files live in a file named
As you skim through the default settings and the comments that surround them, something stands out. The default for cs_root_host is set to s3.amazonaws.com. Before reading the comments, your mind begins to speculate, “Does Riak CS talk to S3? I thought this was meant to replace Amazon S3!”
Good news: Riak CS doesn’t talk to S3.
Instead, this configuration item makes it possible to direct Amazon S3 clients to your Riak CS installation, even if they weren’t designed to support an S3-compatible alternative.
Proxy Configuration for S3 Clients
Ideally, your client does support alternatives to S3. If so, skip to the “Direct Configuration for S3 Clients” section below. However, if you’re not so lucky, read on.
A proxy configuration allows S3 clients to communicate with Riak CS as if it were Amazon S3. When configuring these clients, you’ll need:
port of your Riak CS cluster, configured under your client’s proxy settings
- The Riak CS user credentials (
When requests from this client hit Riak CS, they are processed and returned to the client as if they were serviced by S3.
Note that in this scenario, URLs returned from Riak CS will contain s3.amazonaws.com. Also, several S3 clients only allow you to set one proxy per client. Both of these issues make things difficult if you’re trying to link users to objects stored in Riak CS, or if you want to interact with Riak CS and S3 simultaneously from the same client.
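To make the proxy idea concrete, here is a minimal sketch using only Python's standard library. The proxy address is a hypothetical placeholder, not a value from this post, and a real S3 client would also sign each request with your Riak CS credentials. The point is only that the request still names s3.amazonaws.com while the proxy setting decides where it actually goes.

```python
import urllib.request

# Hypothetical address and port of a Riak CS node; substitute your own.
RIAK_CS_PROXY = "http://riak-cs.example.internal:8080"


def build_proxied_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes plain-HTTP requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)


opener = build_proxied_opener(RIAK_CS_PROXY)

# The URL still points at s3.amazonaws.com; the proxy setting is what would
# deliver it to Riak CS. Nothing is sent over the network in this sketch.
request = urllib.request.Request("http://s3.amazonaws.com/")
```

Calling opener.open(request) would send the request through the proxy, which is the same thing an S3 client does once its proxy settings point at Riak CS.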
Direct Configuration for S3 Clients
A direct configuration requires that the client has support for interacting with an S3-compatible service. This boils down to a client that allows you to alter the endpoint of the storage service you want to use.
Examples of clients that allow you to do this:
There is no S3 trickery in this scenario. The client connects directly to Riak CS without any proxies. To make this work, the value for cs_root_host needs to change to the fully qualified domain name (FQDN) of your Riak CS cluster.
Also, since S3 uses a subdomain to identify buckets created within it, in the spirit of S3-compatibility, Riak CS does too. In order to make this work in your environment, you will need a wildcard DNS entry. This is typically hosted beneath a Riak CS-specific subdomain. If you use storage.example.com as your cluster name, you’ll need *.storage.example.com defined as a DNS entry with the appropriate IP address so the S3 buckets will resolve properly.
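As a rough illustration of why the wildcard entry matters, the sketch below builds the hostname a client would use for a bucket and checks it against a wildcard pattern. The function names and the bucket name are made up for this example, and the matching is a simplification of real DNS wildcard behavior.

```python
def bucket_hostname(bucket: str, root_host: str) -> str:
    """Build the subdomain-style hostname a client uses for a bucket."""
    return f"{bucket}.{root_host}"


def matches_wildcard(hostname: str, wildcard: str) -> bool:
    """Simplified check that hostname falls under a '*.suffix' DNS entry."""
    if not wildcard.startswith("*."):
        return hostname == wildcard
    suffix = wildcard[1:]  # e.g. ".storage.example.com"
    # Require at least one character in front of the suffix.
    return len(hostname) > len(suffix) and hostname.endswith(suffix)


host = bucket_hostname("photos", "storage.example.com")
covered = matches_wildcard(host, "*.storage.example.com")
```

Here host is photos.storage.example.com, which the wildcard entry covers. The bare storage.example.com name is not covered by the wildcard and needs its own DNS record.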
There are pros and cons to each approach. Proxy is easier to set up initially and works with a wider variety of clients. Direct requires a bit more technical expertise and works with a smaller number of clients, but allows you to rid your application of references to
Choose the one that makes the most sense for your use case. We’re just glad you chose Riak CS.
- Riak CS docs
- Riak CS code on GitHub
- Riak CS download and installation
- Riak CS configuration
- Riak mailing list for questions
August 26, 2013
Earlier this month, we announced the availability of Riak CS 1.4, which added a number of performance improvements, OpenStack integration, and simpler user management. To provide more details about what was introduced with the latest release, we also hosted a “What’s New in Riak CS 1.4” webcast.
This short webcast provides an overview of both Riak CS and Riak, and discusses what’s new in Riak CS 1.4. It also looks at the fundamental features and architecture of Riak CS, talks about the key partnerships, and discusses Riak CS Enterprise – the commercial extension of Riak CS.
You can watch the complete recording below.
You can also view the slides from this webcast here.
To get started with Riak CS, visit docs.basho.com/riakcs/latest/riakcs-downloads/ to download the latest release.
August 19, 2013
The OpenStack Summit takes place in Hong Kong from November 5-8th. It is a conference for developers, users, and administrators of OpenStack Cloud Software. Basho is a big supporter of the open source community and, with the added OpenStack integration available with Riak CS 1.4, we aim to make our open source cloud storage software as accessible as possible.
This OpenStack integration adds a lot of exciting possibilities to Riak CS. A few Basho engineers have submitted speaking proposals to OpenStack Summit about how the two technologies can work together.
We need your help though! Part of the presentation selection process involves community voting. You can vote for your favorites now through August 25th.
Here’s a look at our submissions. Please vote for any or all of them.
“Riak CS: Coexisting with Swift” – Casey Rosenthal
Vote Here: www.openstack.org/rate/Presentation/riakcs-coexisting-with-swift
Riak CS is an open source, fault-tolerant, large object storage platform. With Keystone integration and Swift-API compatibility made available in version 1.4, Riak CS can now serve as a drop-in replacement to Swift in many deployments. When would you want to choose one versus the other?
Explore the architecture underlying Riak CS, the problems that Riak CS is trying to solve, and how these goals contrast with the architecture of Swift. OpenStack integration is a key driver for Riak CS adoption and is now part of the core commitment of the Riak CS team to open source and enterprise users alike. Learn how Riak CS is coexisting with Swift in the OpenStack ecosystem to solve large object storage and scaling problems.
“Highly Scalable Global Keystone Token Storage using Riak” – Dean Proctor
Vote Here: www.openstack.org/rate/Presentation/highly-scalable-global-keystone-token-storage-using-riak
Concurrent requests to Keystone scale with your OpenStack deployment; however, simple methods for linearly scaling Keystone request capacity do not currently exist. This issue is compounded when you attempt to unify authentication across multi-datacenter installations.
Learn how the Riak key value store can be used to provide an operationally simple, linearly scalable Keystone service with the ability to sync globally in real time.
“Using Riak CS as a Backend for Glance” – James Martin
Vote Here: www.openstack.org/rate/Presentation/using-riakcs-as-a-backend-for-glance
Glance can use a number of different methods to store VM images and snapshots, including object stores. The image object store’s availability is critical to the functionality of OpenStack’s Nova service, and as time goes by it’s going to grow massively in scale; let’s not forget to mention how complex it can be to manage such a system. And for those interested in consistency across their OpenStack deployments, maintaining and replicating images can be a painful process. Learn how to use Riak CS as the storage backend for Glance and gain all the benefits of Riak – horizontal scalability, ease of administration, and dead-simple multi-datacenter replication.
August 15, 2013
With the launch of Riak CS 1.4, several members of the Basho team have been approached with the question “Why did you build Riak CS?”
When we open sourced Riak CS in March of 2013, the conversation focused on the importance of the community of developers with whom we engage, and participating with this community in a more open fashion.
However, understanding the history of a product can be just as important as understanding the logic behind our go-to-market strategy.
Put simply, Basho is a distributed systems company.
As a company that started with Riak, an open source distributed database, we had an immediate, targeted focus on high availability, fault-tolerance, and linear scalability. These core properties of our database implementation are, in actuality, consistent themes to consider when building any distributed system. And as Riak and Riak Enterprise gained traction in the market, several customers began to use their Riak implementation to store larger objects.
With this and other customer feedback in mind, we prototyped Riak CS, which offers all of the benefits of Riak, while also adding the features and functionality required to power large object storage in public or private clouds as well as providing reliable storage for applications and services.
As we built upon this initial prototype, both based on distributed systems themes and customer input, we added an S3-compatible API to Riak CS. This provided a solution for service providers that wanted to offer S3-compatible storage and for customers that wanted to adopt a hybrid-cloud approach to address data sovereignty or redundancy concerns. We also added OpenStack Object Storage API compatibility with the latest Riak CS 1.4 release. Riak CS can now easily interact with multiple IaaS providers, which helps expand our potential user base for both the open source and enterprise product.
However, regardless of feature decisions – either present or in the future – our commitment to providing robust, resilient distributed storage remains.
August 14, 2013
Next week, we will be hosting the “What’s New in Riak CS 1.4” webcast. Join us on Friday, August 23rd at 11am PT/2pm ET for a free webcast that will discuss the new features and updates announced with the latest release. You can sign up for this 30-minute webcast here.
In addition to looking at the 1.4 updates, this webcast will discuss the basics of Riak CS and Riak CS Enterprise, while also providing some common use cases and user stories.
New Release Adds OpenStack Integration, Simplifies Management, and Boosts Multi-Datacenter Replication Speed
August 13, 2013 – CAMBRIDGE, MA – Basho Technologies, the leader in distributed systems software, announced today the availability of Riak CS 1.4 and Riak CS 1.4 Enterprise. Riak CS 1.4 continues Basho’s commitment to provide cloud storage software that is simple to operate, highly available by design, and compatible with industry cloud standards. Riak CS is used by organizations worldwide to power their public and private clouds.
Riak CS 1.4 introduces formal integration with OpenStack, provides enhanced performance and manageability, includes community requests, and improves performance at scale. Riak CS 1.4 Enterprise significantly boosts the performance of multi-data center replication by allowing for concurrent channels, so the full capacity of the network and cluster size can scale the performance to available resources.
“Riak CS is seeing impressive market adoption, especially from service providers looking to increase their portfolio offering with large object storage,” said Greg Collins, president and CEO of Basho Technologies. “This release continues our commitment of providing simple and accessible cloud storage for a broad range of cloud computing platforms and use cases. With the addition of OpenStack integration and significant performance improvements, Riak CS 1.4 also appeals strongly to enterprises building their own object storage or adopting a hybrid-cloud deployment methodology.”
“Object storage is quickly becoming a foundational platform capability for cloud providers and large enterprises to meet the rapidly growing surge in demand to store more data,” said Simon Robinson, vice president of storage research at 451 Research. “Riak CS continues to see greater adoption in public and private clouds. Riak CS’s tighter integration with OpenStack is certain to be another catalyst for Basho. OpenStack users gain a very capable storage alternative to Swift, OpenStack’s object storage platform.”
“Yahoo! JAPAN has been using Riak CS for over a year to power our public cloud storage platform” said Shingo Saito, cloud product manager at Yahoo! JAPAN. “Riak CS is also used by LOHACO, for its on-line shopping platform, operated by ASKUL Corporation, Yahoo! JAPAN partner, and by some of the largest companies in Japan. We are excited to continue to partner with Basho and look forward to deploying Riak CS 1.4.”
“Redapt is excited to work with Basho to help customers address distributed object storage needs within OpenStack environments,” said David Cantu, co-founder and COO at Redapt. “Redapt’s mission is to enable leading service providers, enterprises, and web centric companies with the ability to achieve the numerous economic and operational benefits of private cloud computing. With the Riak 1.4 announcement, Basho is helping us deliver on that commitment for our customers with proven distributed cloud storage software that is now more finely tuned for integration with OpenStack.”
“Businesses have a range of object storage needs and our partnership with Basho helps us easily address even the most complex scenarios in our public cloud,” said Jared Wray, CTO of Tier 3. “Our global data center footprint enables businesses of all sizes to adopt object storage for a variety of use cases including: cloud-native and cross-device apps, backups and archives, and secure file transfer. The improved performance and simplified operations available with Riak CS 1.4 continue to help our customers simply scale to meet operational demand.”
Major Feature Additions of Riak CS 1.4 include:
- Built-in integration with OpenStack. Riak CS 1.4 introduces support for OpenStack’s Keystone authentication service and introduces compatibility with OpenStack Object Storage API.
- Improved performance of large bucket query operations. Secondary indexing pagination, introduced with Riak 1.4, allows for significant performance improvements of large bucket query requests.
- Simplified operational management. Improvements to the User API allow operators greater flexibility in managing Riak CS user information, while also improving the agility and responsiveness of Riak CS.
- Decreased bandwidth for object block retrieval. Changes to how Riak CS handles object block retrieval will decrease intracluster bandwidth by 67% and improve download performance.
Riak CS 1.4 Enterprise adds the following:
- Enhanced multi-site replication performance. Riak CS 1.4 Enterprise allows for concurrent channels of communication between clusters, which greatly enhances the capability for replication by taking advantage of all the network’s available resources.
Riak CS 1.4 is available for Debian, Ubuntu, FreeBSD, Mac, Red Hat Enterprise Linux, Fedora, SmartOS, and Solaris.
To view the latest technical documentation or to download Riak CS, visit docs.basho.com/riakcs/latest/.
To view a feature comparison with OpenStack Swift, visit docs.basho.com/riakcs/latest/references/appendices/comparisons/Riak-Compared-to-Swift/.
To view a feature comparison with EMC Atmos, visit docs.basho.com/riakcs/latest/references/appendices/comparisons/Riak-Compared-to-Atmos/.
To request a trial license of Riak CS Enterprise, sign up for a Riak CS Tech Talk at http://info.basho.com/SignUpRiakTechTalk.html.
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy to operate at scale. Basho’s distributed database, Riak, and Basho’s cloud storage software, Riak CS, are used by fast-growing Web businesses and by over 25 percent of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com. Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
August 13, 2013
The release of Riak CS 1.4, Basho’s open source cloud storage software, adds a number of performance improvements as well as OpenStack integration and simpler user management. Riak CS is being used by companies all over the world to build public and private clouds, and as reliable storage to power various applications.
One of the biggest additions with Riak CS 1.4 is the integration with OpenStack, broadening our relationship with the open source community. This integration supports OpenStack’s Keystone authentication service and the OpenStack Object Storage API, which allows OpenStack users the means to integrate Riak CS for object storage in an OpenStack deployment.
The Riak CS Users API provides an interface for user creation and management. This release also improves this API to give operators greater flexibility in managing user information. Additionally, this release benefits from ongoing refactoring and reorganization efforts aimed at improving the agility and responsiveness of Riak CS.
Riak CS 1.4 takes advantage of some changes made in Riak 1.4 to provide performance improvements to Riak CS users. First, Riak CS 1.4 features improved performance of listing the contents of large buckets by taking advantage of secondary index pagination in Riak. Riak CS 1.4 also leverages a new option for object block retrieval, which decreases intracluster bandwidth by 67%. This improves the download performance when handling many concurrent requests. These features can be independently enabled, but are disabled by default to accommodate users not using Riak 1.4 with Riak CS. See the documentation for more details.
Riak CS Enterprise is the commercial extension of Riak CS, which adds multi-datacenter replication and 24/7 support. The 1.4 release improves replication performance by increasing storage efficiencies and adding multiple TCP connections between clusters.
In addition to the features and upgrades listed above, many bugs were harmed in the making of this release. For a full list of what is included in Riak CS 1.4, check out our code at Github.com/basho or review the release notes. To learn even more, join our live webcast, “What’s New in Riak CS 1.4” on August 23rd.
July 23, 2013
This week is O’Reilly OSCON, a conference dedicated to all things open source. Basho is a sponsor, and Basho engineer Eric Redmond will be delivering a presentation entitled “Distributed Patterns In Action”.
Basho first open sourced Riak in 2009. It’s a decision that helped us grow our business, and become a leader in newer, agile enterprise environments. Our participation in the open source community benefits our culture, our development process, and our business.
In honor of OSCON, we thought it important to explore the commercial aspects of our open source decision.
The Business of Open Source
Open source is in the DNA of our company, with both Riak and Riak CS available under the Apache 2 license. (It is worth noting that these products are but a few of our open source contributions, which also include Webmachine and Lager.) To turn this great code into a business, we chose to stay true to our roots as a software company, instead of just selling services. The enterprise versions of Riak and Riak CS offer the entirety of our open source software, with the addition of multi-datacenter replication and monitoring capabilities.
The decision to sell licenses to the enterprise, rather than to rely just on services, makes Basho unique. It allows us to engage with our enterprise customers in the transformation of their application architecture. They can be confident in the software’s availability and in Basho’s commitments to support them – as customers. Enterprises need an alternative to traditional database vendors, but one that can still fit — in license structure, operational management, and process integration — into a traditional organization.
Our licensing model for Riak Enterprise and Riak CS Enterprise lets us balance agility with tradition. Our community helps us develop groundbreaking software, while the enterprise license helps corporate IT and Operations sleep at night.
Open source drives adoption (a concept discussed at length in Stephen O’Grady’s book The New Kingmakers). That means Riak is used across many different industries, powering thousands of applications. That commercial validation — our success in production deployments — is accelerated due to the open source availability.
We remain keenly aware, and tremendously appreciative, that our community (from the individuals to the large organizations) guides Riak and Riak CS updates, and has been crucial to the refinement and forward momentum of this software.
Basho’s success is open source’s success. Our strengths reside both in our team and in our community, as their combined efforts improve our technology and its utilization. We are excited to see what other open source showcases are in view at OSCON 2013.
June 19, 2013
Today, Tier 3 announced the availability of their global cloud object storage product, powered by Riak CS. You can find the entirety of the release in our News Section entitled “Tier 3 Launches Global Cloud Object Storage.”
In particular, we are keenly interested in the unique geographic footprint that Tier 3 maintains. In conversations with customers, press, and analysts, we frequently hear people discussing “geo-data locality.” This phrase is typically used to express a desire to address regulatory compliance or to improve the end-customer experience through low latency (in the case of mobile applications).
With the Tier 3 release, their geographic footprint — in addition to maximizing availability — leverages the inherent replication present in Riak CS to pre-determine the physical locations of specific data.
For geo-data locality, requests can be load balanced across geographies, with geo-based client requests directed to the appropriate datacenter. For example, US-based requests can be served out of a Tier 3 US-based datacenter, while EU-based requests can be served out of a Tier 3 European datacenter. For situations where not all data needs to be shared across all datacenters (or if certain data, such as user data, must only be stored in a specific geographic region to provide low-latency response and address privacy regulations), Riak CS Enterprise’s multi-datacenter replication can be configured on a per-bucket basis so only shared assets, popular assets, etc. are replicated.
May 1, 2013
This post looks at five commonly asked questions about Riak CS – simple, available, open source storage built on top of Riak. For more information, please review our full documentation, or sign up for an intro to Riak CS webcast on Friday, May 10.
What is the relationship between Riak and Riak CS?
Riak CS is built on top of Riak, exposing higher-level storage functions including large object support, an S3-compatible API, multi-tenancy, and per-user storage and access statistics. Riak itself provides the replication, availability, fault-tolerance, and underlying storage functions for the Riak CS implementation. Riak and Riak CS should both be installed on every node in your cluster. While Riak and Riak CS could be run on separate virtual or physical nodes, running them on the same machine minimizes intra-cluster bandwidth usage and is the recommended approach. As with Riak, we advise a minimum 5-node cluster.
When an object is uploaded to Riak CS, it is broken up into smaller chunks which are then streamed, stored, and replicated in the underlying cluster. A manifest is maintained for each object; it records which blocks make up the object and is used to retrieve all blocks and present them to the client on read. In addition to running Riak and Riak CS on each node, Stanchion, a request serializer, must be installed on at least one node in the cluster. This ensures that global entities, such as users and buckets, are unique in the system.
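The chunk-and-manifest idea can be sketched in a few lines of Python. This is an illustration only, not how Riak CS is implemented: the block size, the key format, and the plain dictionary standing in for the underlying cluster are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed block size, kept tiny for the demo; not a Riak CS default.
BLOCK_SIZE = 4


@dataclass
class Manifest:
    """Records which blocks make up one stored object."""
    object_key: str
    size: int
    block_keys: list = field(default_factory=list)


def store_object(key: str, data: bytes, block_store: dict,
                 block_size: int = BLOCK_SIZE) -> Manifest:
    """Split data into blocks, store each one, and return the manifest."""
    manifest = Manifest(object_key=key, size=len(data))
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_key = f"{key}:{index}"  # hypothetical key format
        block_store[block_key] = data[offset:offset + block_size]
        manifest.block_keys.append(block_key)
    return manifest


def read_object(manifest: Manifest, block_store: dict) -> bytes:
    """Fetch every block named in the manifest and reassemble the object."""
    return b"".join(block_store[k] for k in manifest.block_keys)


store = {}
manifest = store_object("photo.jpg", b"hello world", store)
restored = read_object(manifest, store)
```

The reader never needs to know how many blocks exist ahead of time; the manifest is the single source of truth for reassembly.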
What use cases does Riak CS support that Riak doesn’t?
Riak CS has several features that are not provided in the standalone Riak database. One of the most obvious differences is in the size of objects supported. Riak CS exposes large object support, and includes multi-part upload so you can upload objects as a series of parts. This allows you to upload single objects into the terabyte range. In Riak, the data model is simply key/value; in Riak CS, the key/value model provides the underlying structure for higher-level storage semantics – users, buckets and objects. The Riak CS interface is an S3-compatible HTTP API, allowing you to use existing S3 libraries and tools. In contrast, Riak exposes an HTTP and protobufs API and offers many language-specific clients. Unlike Riak, Riak CS is multi-tenant, with the concept of “users” and per-user reporting on storage and access. This makes it a fit both for private cloud scenarios with multiple internal users and as a foundation for a public cloud storage offering.
How do multi-tenancy, authentication and reporting work?
Riak CS exposes an interface for user creation, disablement and credential management. Riak CS can be set so that only administrators can create new users. Administrators also have special privileges including being able to retrieve a list of all users in the system and query the user account information of any user. Once issued credentials, users are able to authenticate, create buckets, upload and download files, retrieve account information, obtain new credentials, or disable their account through the API. Riak CS supports the standard S3 authentication scheme, with support for header and query string authorization.
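For readers who have not seen header authentication up close, here is a minimal sketch. It assumes the "standard S3 authentication scheme" means the HMAC-SHA1 signing used by the S3 REST API at the time (often called Signature Version 2). The credentials and resource are invented, and the canonicalization of x-amz-* headers is left out to keep the example short.

```python
import base64
import hashlib
import hmac


def sign(secret_key: str, string_to_sign: str) -> str:
    """Return the base64-encoded HMAC-SHA1 of string_to_sign."""
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key: str, secret_key: str, verb: str,
                         resource: str, date: str,
                         content_md5: str = "", content_type: str = "") -> str:
    """Build an 'AWS key:signature' header value (no x-amz-* headers)."""
    string_to_sign = "\n".join([verb, content_md5, content_type, date])
    string_to_sign += "\n" + resource
    return f"AWS {access_key}:{sign(secret_key, string_to_sign)}"


# Invented credentials and request details, for illustration only.
header = authorization_header(
    access_key="EXAMPLEACCESSKEY",
    secret_key="example-secret",
    verb="GET",
    resource="/photos/puppy.jpg",
    date="Wed, 01 May 2013 12:00:00 +0000",
)
```

The resulting value goes in the Authorization header of the request. Query string authorization carries the same kind of signature in URL parameters instead.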
Riak CS exposes storage, usage and network statistics that support use cases like accounting, subscription, billing or multi-group utilization for public or private clouds. Riak CS will report information on how much storage a user is consuming and the network operations related to access. This data is exposed via an HTTP interface and can be queried on the default timespan “now” or as a range from start time through end time. Access statistics are reported as bytes in and bytes out for both object and bucket operations. Reporting of this information can be scheduled for a set interval or manually triggered.
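To show what querying access statistics over a time range amounts to, here is a small sketch that sums bytes in and bytes out between a start and end time. The record layout and sample numbers are hypothetical and do not reflect the actual Riak CS response format.

```python
from datetime import datetime


def summarize_access(records: list, start: datetime, end: datetime) -> dict:
    """Sum bytes in/out for records with start <= time < end."""
    totals = {"bytes_in": 0, "bytes_out": 0}
    for record in records:
        if start <= record["time"] < end:
            totals["bytes_in"] += record["bytes_in"]
            totals["bytes_out"] += record["bytes_out"]
    return totals


# Hypothetical hourly samples for one user.
sample = [
    {"time": datetime(2013, 5, 1, 0), "bytes_in": 100, "bytes_out": 10},
    {"time": datetime(2013, 5, 1, 1), "bytes_in": 200, "bytes_out": 20},
    {"time": datetime(2013, 5, 1, 2), "bytes_in": 400, "bytes_out": 40},
]

totals = summarize_access(sample,
                          start=datetime(2013, 5, 1, 0),
                          end=datetime(2013, 5, 1, 2))
```

A billing or accounting job could run a calculation like this per user over each reporting interval.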
What’s the difference between Riak CS and Riak CS Enterprise?
Riak CS Enterprise provides multi-datacenter replication on top of Riak CS. For multi-datacenter replication in Riak CS, global information for users, bucket information and manifests are streamed in real-time from a primary implementation to a secondary site so global state is maintained across locations. Objects can then be replicated in either full sync or real-time sync mode. The secondary site will replicate the object as in normal operations. Additional datacenters can be added in order to create availability zones or provide additional data redundancy and locality. Riak CS Enterprise can also be configured for bi-directional replication. Riak CS Enterprise also comes with 24/7, enterprise-level support. More information and pricing can be found here, and full technical information is available on our docs portal. Ready to get started? Sign up for a developer trial of Riak CS Enterprise.
What are your plans for integration of Riak CS with open source compute solutions?
Riak CS provides highly available, distributed storage, making it a natural fit for usage alongside compute solutions. We have partnered with Citrix to collaborate on the integration of Apache CloudStack and Riak CS to create a complete cloud software offering that combines compute and storage in an integrated platform. For more information on our partnership with CloudStack, check out this blog post with the latest update. API and authentication support for OpenStack is also in progress.