May 6, 2013
The free Riak AMI available on the AWS Marketplace has been updated to the latest version, Riak 1.3.1.
In Riak 1.3, we introduced:
- Active Anti-Entropy
- Updates to Riak Control
- Expanded IPv6 support
- Improved MapReduce
- Simplified Log Management
Riak 1.3.1 includes all these features with some additional changes enumerated in the release notes.
For those of you currently using Riak on AWS, or interested in testing Riak on AWS, the AMI makes installation and configuration much easier. We see open source and Riak Enterprise users leverage AWS both as their primary infrastructure and to support hybrid implementations.
Installation instructions for the AMI are available in our docs.
May 1, 2013
This post looks at five commonly asked questions about Riak CS – simple, available, open source storage built on top of Riak. For more information, please review our full documentation, or sign up for an intro to Riak CS webcast on Friday, May 10.
What is the relationship between Riak and Riak CS?
Riak CS is built on top of Riak, exposing higher-level storage functions including large object support, an S3-compatible API, multi-tenancy, and per-user storage and access statistics. Riak itself provides the replication, availability, fault-tolerance, and underlying storage functions for the Riak CS implementation. Riak and Riak CS should both be installed on every node in your cluster. While Riak and Riak CS could be run on separate virtual or physical nodes, running them on the same machine minimizes intra-cluster bandwidth usage and is the recommended approach. As with Riak, we advise a minimum 5-node cluster.
When an object is uploaded to Riak CS, it is broken up into smaller blocks, which are then streamed, stored, and replicated in the underlying cluster. A manifest is maintained for each object; it records which blocks comprise the object and is used to retrieve all blocks and present them to the client on read. In addition to running Riak and Riak CS on each node, Stanchion, a request serializer, must be installed on at least one node in the cluster. This ensures that global entities, such as users and buckets, are unique in the system.
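The block-and-manifest idea described above can be sketched in a few lines. This is only an illustration, not how Riak CS is implemented: the block size, the key naming, the manifest fields, and the plain dictionary standing in for the Riak cluster are all assumptions made for the example.

```python
import uuid

# Hypothetical block size chosen for illustration only.
BLOCK_SIZE = 1024 * 1024  # 1 MiB


def store_object(store, bucket, key, data, block_size=BLOCK_SIZE):
    """Split data into blocks, store each block, and record a manifest."""
    object_id = uuid.uuid4().hex
    block_keys = []
    for offset in range(0, len(data), block_size):
        # Each block gets its own key derived from the object id and block index.
        block_key = f"{object_id}:{offset // block_size}"
        store[block_key] = data[offset:offset + block_size]
        block_keys.append(block_key)
    # The manifest lists, in order, the blocks that make up the object.
    manifest = {"bucket": bucket, "key": key, "size": len(data), "blocks": block_keys}
    store[f"manifest:{bucket}/{key}"] = manifest
    return manifest


def read_object(store, bucket, key):
    """Look up the manifest, then fetch and concatenate its blocks in order."""
    manifest = store[f"manifest:{bucket}/{key}"]
    return b"".join(store[block_key] for block_key in manifest["blocks"])


if __name__ == "__main__":
    cluster = {}  # stand-in for the underlying key/value store
    payload = bytes(range(256)) * 10000
    manifest = store_object(cluster, "backups", "db.dump", payload)
    print(len(manifest["blocks"]), read_object(cluster, "backups", "db.dump") == payload)
```

Because the client only ever asks for the bucket and key, the block layout stays an internal detail that the manifest resolves on read.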
What use cases does Riak CS support that Riak doesn’t?
Riak CS has several features that are not provided in the standalone Riak database. One of the most obvious differences is in the size of objects supported. Riak CS exposes large object support, and includes multi-part upload so you can upload objects as a series of parts. This allows you to upload single objects into the terabyte range. In Riak, the data model is simply key/value; in Riak CS, the key/value model provides the underlying structure for higher-level storage semantics – users, buckets and objects. The Riak CS interface is an S3-compatible HTTP API, allowing you to use existing S3 libraries and tools. In contrast, Riak exposes HTTP and protobufs APIs and offers many language-specific clients. Unlike Riak, Riak CS is multi-tenant, with the concept of “users” and per-user reporting on storage and access. This makes it a fit both for private cloud scenarios, with multiple internal users, and as a foundation for a public cloud storage offering.
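To make "a series of parts" concrete, here is a minimal sketch of how a client might plan the parts of a multipart upload before sending anything. The function name and the returned tuple layout are hypothetical; the 5 MB to 5 GB part-size bounds follow the limits quoted in the Riak CS 1.3 announcement, and the sketch assumes, as in the S3 convention, that only the final part may be smaller than the lower bound.

```python
MIN_PART_SIZE = 5 * 1024 ** 2  # 5 MB lower bound on the chosen part size
MAX_PART_SIZE = 5 * 1024 ** 3  # 5 GB upper bound on the chosen part size


def plan_parts(object_size, part_size):
    """Return a list of (part_number, offset, length) tuples covering the object."""
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part_size must be between 5 MB and 5 GB")
    parts = []
    offset = 0
    part_number = 1  # part numbers start at 1 in the S3 multipart convention
    while offset < object_size:
        # The final part carries whatever remains, so it may be shorter.
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


if __name__ == "__main__":
    for part in plan_parts(12 * 1024 ** 2, 5 * 1024 ** 2):
        print(part)
```

Each planned part can then be uploaded independently, which is what makes parallel uploads of very large objects practical.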
How do multi-tenancy, authentication and reporting work?
Riak CS exposes an interface for user creation, disablement and credential management. Riak CS can be set so that only administrators can create new users. Administrators also have special privileges including being able to retrieve a list of all users in the system and query the user account information of any user. Once issued credentials, users are able to authenticate, create buckets, upload and download files, retrieve account information, obtain new credentials, or disable their account through the API. Riak CS supports the standard S3 authentication scheme, with support for header and query string authorization.
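Header authorization in the standard S3 scheme can be sketched as follows, assuming the classic S3 signature version 2 layout (an HMAC-SHA1 over a canonical string, Base64 encoded). The function and parameter names are hypothetical, the credentials are placeholders, and which signature versions a given Riak CS release accepts should be checked against its documentation.

```python
import base64
import hashlib
import hmac
from email.utils import formatdate


def sign_request(access_key, secret_key, method, resource,
                 date=None, content_md5="", content_type="", amz_headers=None):
    """Build Date and Authorization headers in the S3 signature v2 style."""
    date = date or formatdate(usegmt=True)
    # x-amz-* headers are lowercased, sorted by name, and joined one per line.
    canonical_amz = "".join(
        f"{name.lower()}:{value.strip()}\n"
        for name, value in sorted((amz_headers or {}).items(),
                                  key=lambda item: item[0].lower())
    )
    string_to_sign = (
        f"{method}\n{content_md5}\n{content_type}\n{date}\n"
        f"{canonical_amz}{resource}"
    )
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return {"Date": date, "Authorization": f"AWS {access_key}:{signature}"}


if __name__ == "__main__":
    # Placeholder credentials and resource path, for illustration only.
    headers = sign_request("EXAMPLEKEY", "example-secret", "GET", "/my-bucket/photo.jpg")
    print(headers["Authorization"])
```

Query string authorization carries the same kind of signature in URL parameters instead of the `Authorization` header, which is useful for handing out time-limited links.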
Riak CS exposes storage, usage and network statistics that support use cases like accounting, subscription, billing or multi-group utilization for public or private clouds. Riak CS will report information on how much storage a user is consuming and the network operations related to access. This data is exposed via an HTTP interface and can be queried on the default timespan “now” or as a range from start time through end time. Access statistics are reported as bytes in and bytes out for both object and bucket operations. Reporting of this information can be scheduled for a set interval or manually triggered.
What’s the difference between Riak CS and Riak CS Enterprise?
Riak CS Enterprise provides multi-datacenter replication on top of Riak CS. For multi-datacenter replication in Riak CS, global information for users, bucket information and manifests are streamed in real-time from a primary implementation to a secondary site so global state is maintained across locations. Objects can then be replicated in either full sync or real-time sync mode. The secondary site will replicate the object as in normal operations. Additional datacenters can be added in order to create availability zones or provide additional data redundancy and locality. Riak CS Enterprise can also be configured for bi-directional replication. Riak CS Enterprise also comes with 24/7, enterprise-level support. More information and pricing can be found here, and full technical information is available on our docs portal. Ready to get started? Sign up for a developer trial of Riak CS Enterprise.
What are your plans for integration of Riak CS with open source compute solutions?
Riak CS provides highly available, distributed storage, making it a natural fit for usage alongside compute solutions. We have partnered with Citrix to collaborate on the integration of Apache CloudStack and Riak CS to create a complete cloud software offering that combines compute and storage in an integrated platform. For more information on our partnership with CloudStack, check out this blog post with the latest update. API and authentication support for OpenStack is also in progress.
April 3, 2013
As you might have heard, we recently open sourced Riak CS, cloud storage built on Riak. You can find all of the code on our GitHub account and download Riak CS here. To help you get started with Riak CS, here are some common use cases.
- Large Object Storage For Applications and Services: Riak CS is built for storing large objects of all types. It is content agnostic so you can store images, text, video, documents, database backups, software binaries, or other data types. Riak CS can store objects into the terabyte size range using the new multipart upload feature. When an object is uploaded, Riak CS breaks it into smaller blocks that are streamed, stored, and replicated in the underlying Riak cluster.
- On-Demand Internal Storage Capacity: Riak CS provides highly available storage for internal business units. Built on Riak, Riak CS has a masterless, redundant design that ensures availability and fault-tolerance. Use cases might include document storage or backing for internal applications.
- Storage Layer for Public Clouds/Cloud Services: Riak CS’ flexibility and scalability provide the ideal foundation for building public clouds or cloud services. Capacity can be added by installing Riak CS on a new physical node and joining it with the cluster. Riak automatically redistributes data and ownership so all nodes have equal responsibility, which prevents storage hot spots and decreases the operational burden of adding new nodes. Additionally, Riak CS is multi-tenant, a requirement of most public cloud services today.
- Amazon S3 Compatibility: Riak CS is S3-compatible, making it easy for your developers to be productive quickly. Riak CS can be used with existing S3 clients and libraries. The HTTP REST API supports service, bucket, and object-level operations to easily store and retrieve data. Riak CS makes sense for companies that are trying to provide internal, S3-like services or using a hybrid approach with some public and some private cloud storage.
- Disaster Recovery and Active Backups: Riak CS Enterprise extends Riak CS with multi-datacenter replication. By replicating data across datacenters using either real-time or full-sync, you can maintain redundant storage in case of disaster scenarios. Multi-datacenter replication can also be used to maintain active backups or create availability zones.
March 27, 2013
Basho is excited to share 451 Research’s report on Riak CS and last week’s code release to open source. In this report, analysts Simon Robinson and Matt Aslett state that they are “not surprised that Basho has opted to take this route for Riak CS, and the move should help grow its community – not to mention increase its funnel of prospects – at a critical time in the evolution of what remains a nascent market.”
You can download the complete 451 Research report here.
March 26, 2013
Last week, we announced that Riak CS is now open source. To complement this announcement, here’s a 15-minute overview of Riak CS and how it can be used.
Below is a video that goes over the basics of Riak CS, taken from the Citrix CloudPlatform partner program. You will learn the key building blocks of cloud services platforms and the enterprise requirements for cloud storage. You will also learn the properties, architecture, interfaces, and operations of Riak CS.
New York City, NY. – March 20, 2013 – Today at GigaOM Structure Data 2013 in New York City, Basho, the worldwide leader in distributed database and cloud storage software, announced that Riak CS (Cloud Storage) is now available open source, significantly expanding the ease-of-access to Basho’s software for developers, enterprise architects, and IT operations professionals seeking to build public or private storage clouds. Also today, Basho announced the general availability of Riak CS v1.3, the third release of Basho’s simple, available cloud storage software.
Riak CS is a multi-tenant, distributed, S3-compatible cloud storage platform that enables enterprises and service providers to launch public or private cloud services. Built on top of Riak, the world’s most advanced, open source, distributed database, Riak CS provides horizontal scale, extreme durability and low operational overhead in a distributed object storage system. Riak CS Enterprise adds Basho’s multi-datacenter replication technology and is backed by Basho’s 24×7 support and enterprise-class service-level commitments.
Riak CS Enterprise is used by great organizations worldwide including Datapipe, Deutsche Vermögensberatung (DVAG), IDC Frontier, Rovio, and Yahoo! JAPAN.
New Features in 1.3
- Multipart Upload. Riak CS v1.3 includes a new multipart upload capability that lets users store very large files by uploading parts in parallel.
- Enhanced Control for Multi-Tenant Environments. Riak CS v1.3 introduces object access control by source IP, enabling operators to restrict access to Riak CS buckets by IP address.
- Support for GET Range Requests. Riak CS users can now retrieve a range of bytes from a single object. This functionality is implemented in the “Range” request header of GET operations.
- Graphical Tool for Riak CS. Riak CS Control is a standalone web administration tool for user management.
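As an illustration of the GET range support listed above, the following sketch builds (but does not send) a request for the first kilobyte of an object using the standard HTTP `Range` header. The endpoint URL, bucket, and object name are hypothetical, and a real request would also need the authentication headers that Riak CS expects.

```python
import urllib.request


def range_request(url, first_byte, last_byte):
    """Build a GET request for an inclusive byte range of a single object."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("expected 0 <= first_byte <= last_byte")
    request = urllib.request.Request(url, method="GET")
    # Byte positions in the Range header are zero-based and inclusive.
    request.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return request


if __name__ == "__main__":
    # Hypothetical endpoint and object path, for illustration only.
    req = range_request("http://riak-cs.example.com/my-bucket/backup.tar", 0, 1023)
    print(req.get_method(), req.full_url, req.get_header("Range"))
```

Range requests are handy for resuming interrupted downloads or reading a header from a large file without fetching the whole object.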
Basho offers a hosted “sandbox” to test interfacing with a live implementation of Riak CS. The “sandbox” is available at https://www.riakcs.net/users/sign_in.
For more information on Riak CS Enterprise, and to request a Developers Trial License, click https://basho.com/riak-cloud-storage/.
Upcoming Riak CS Webinar and RICON EAST
Basho will host an “Introduction to Riak CS Webinar” on Tuesday, April 2. To participate in the webinar, sign up here.
Basho is hosting RICON EAST on May 13 – 14, 2013 in New York City, NY. RICON is Basho’s distributed systems conference by and for engineers, developers, scientists and architects. For ticket information on RICON East, visit http://ricon.io/east.html.
Greg Collins, president and CEO, Basho
“It has been almost one year since we first released Riak CS. In just 12 months, we have seen rapid adoption by global cloud operators, telecommunication providers and large enterprises. Over the past year, Riak CS has gained new advanced capabilities and has been battle-tested in many of our customers’ and partners’ labs. Our customers have deployed Riak CS as the object storage engine inside popular cloud computing platforms, including Apache CloudStack and OpenStack. Today, by open sourcing Riak CS, we are making it easier for users to experiment with and test Riak CS, to provide rapid product feedback, and to contribute to its future capabilities.”
Ash Yamanaka, general manager, IDC Frontier and
Shingo Saito, cloud product manager, Yahoo! JAPAN
“Basho, Yahoo! JAPAN and IDC Frontier, a member of Yahoo! JAPAN group, have a very strong and growing partnership. Today, Yahoo! JAPAN and IDC Frontier leverage Riak CS Enterprise to offer an S3-compatible public cloud storage service, as well as dedicated hosting options for our customers’ various applications. Yahoo! JAPAN and IDC Frontier are highly supportive of open source software and we view Basho’s announcement today as a positive move that will work to accelerate its ability to innovate and ultimately strengthen our cloud platform.”
Sameer Dholakia, group vice president and GM, Citrix Platforms Group, Citrix
“Basho clearly understands the market power of open source. Since Citrix and Basho started collaborating last year, we have seen strong enthusiasm among Citrix CloudPlatform users for Basho’s cloud object storage solution. Now, CloudStack users have easy access to Riak CS and can quickly deploy an object storage solution that features multi-tenancy and S3 compatibility. We believe that many Citrix CloudPlatform customers will also seek Riak CS Enterprise for its distributed data capabilities across multiple data centers.”
Ed Laczynski, vice president, Cloud Strategy and Architecture, Datapipe
“Datapipe is very supportive of Basho’s decision to open source portions of Riak CS. During the last six months, we have deployed Riak CS Enterprise in Datapipe’s 10gig Stratosphere cloud computing platform. Riak CS provides Datapipe and its customers with highly available, low-latency and S3-compatible storage. Datapipe’s customers will benefit as Basho’s community increasingly experiments, tests and contributes to Riak CS, ultimately speeding our access to more capabilities and higher performance.”
Simon Robinson, vice president of storage research, 451 Research
“The cloud storage market continues to accelerate as companies seek to build public and private storage clouds that mirror Amazon Web Services’ capabilities and economics. Basho, with Riak CS, already has a proven track record of successful customer public and private cloud deployments. Now, Basho is demonstrating it has confidence that the technical and business benefits of Riak CS can be accelerated even faster via the open source model.”
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is always available, fault-tolerant and easy-to-operate at scale. Basho’s distributed NoSQL database, Riak, and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by over 25% of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com.
Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
March 20, 2013
Riak CS (Cloud Storage) is simple, available cloud storage software built on Riak. Basho announced today that Riak CS is now open source under the Apache 2 license. Organizations and users can now access the source code on Github and download the latest packages from the downloads page. Also, today, we announced that Riak CS Enterprise is now available as commercial licensed software, featuring multi-datacenter replication technology and 24×7 Basho customer support.
We will be hosting an introductory webcast to Riak CS on Tuesday, April 2. Sign up here.
Riak CS can be used to build private or public clouds or as reliable, available storage behind applications and platforms. Riak CS Enterprise is currently used by large corporations including Datapipe, Deutsche Vermögensberatung (DVAG), IDC Frontier, Rovio, and Yahoo! JAPAN.
Basho is a distributed systems company dedicated to making software that is available, fault-tolerant, and easy to operate at scale. Twenty-five percent of the Fortune 50 and thousands of open source users large and small run our flagship open source database, Riak. Riak CS takes distributed systems principles derived from production Riak users and applies them to the problem of large scale storage. We are excited to share this code with the world.
Riak CS features:
- Highly available, fault-tolerant storage
- Large object support
- S3-compatible API and authentication
- Multi-tenancy and per-user reporting
- Simple operational model for adding capacity
- Robust stats for monitoring and metrics
For users requiring multi-datacenter replication and enterprise-level support, Riak CS Enterprise (a commercial extension of Riak CS) is available.
Today we are also announcing several new features, available now as part of the open source edition.
- Multipart upload. Upload very large files to Riak CS as a series of parts. Parts can be between 5MB and 5GB.
- Support for GET range queries. Retrieve a range of bytes from a single object. This functionality is implemented in the “Range” request header of GET operations.
- Per-bucket policies to restrict access to buckets based on source IP.
- Riak CS Control. Riak CS Control is a standalone web administration tool for user management available on Github.
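The per-bucket source IP restriction mentioned in the list above can be illustrated with a policy document. The sketch below assumes Riak CS accepts the S3 bucket policy JSON format with an `IpAddress` condition on `aws:SourceIp`; the helper name, statement id, bucket name, and address range are hypothetical, and the exact set of supported actions and conditions is not stated in this post.

```python
import ipaddress
import json


def build_source_ip_policy(bucket, allowed_cidr):
    """Return a JSON bucket policy allowing object reads only from allowed_cidr."""
    # Validate the CIDR block early; ip_network raises ValueError on bad input.
    network = ipaddress.ip_network(allowed_cidr)
    policy = {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Sid": "AllowReadsFromTrustedNetwork",  # hypothetical statement id
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": str(network)}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # 192.0.2.0/24 is a documentation-only address range used as a placeholder.
    print(build_source_ip_policy("my-bucket", "192.0.2.0/24"))
```

The resulting JSON would be attached to the bucket through the policy operation of an S3-compatible client.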
“Basho, Yahoo! JAPAN, and IDC Frontier, a member of Yahoo! JAPAN group, have a very strong and growing partnership. Today, Yahoo! JAPAN and IDC Frontier leverage Riak CS Enterprise to offer an S3-compatible public cloud storage service, as well as dedicated hosting options for our customers’ various applications. Yahoo! JAPAN and IDC Frontier are highly supportive of open source software and we view Basho’s announcement today as a positive move that will work to accelerate its ability to innovate and ultimately strengthen our cloud platform.”
– Ash Yamanaka, general manager, IDC Frontier and
– Shingo Saito, cloud product manager, Yahoo! JAPAN
“Basho clearly understands the market power of open source. Since Citrix and Basho started collaborating last year, we have seen strong enthusiasm among Citrix CloudPlatform users for Basho’s cloud object storage solution. It has also provided the Apache CloudStack community with easy access to Riak CS for multi-tenancy and S3 compatibility. With today’s announcement, Citrix CloudPlatform customers will continue to benefit from Riak CS Enterprise for its distributed data capabilities across multiple data centers.”
– Sameer Dholakia, group vice president and GM, Citrix Platforms Group, Citrix
“Over the last six months, we have deployed Riak CS Enterprise within Datapipe’s 10gig Stratosphere cloud computing platform. Riak CS provides our customers with highly available, low-latency, S3-compatible cloud object storage. Datapipe is very supportive of Basho’s decision to open source portions of Riak CS. As Basho’s open source community grows, experiments, tests and contributes to Riak CS, Datapipe clients will benefit from access to additional capabilities and higher performance.”
– Ed Laczynski, vice president, Cloud Strategy and Architecture, Datapipe
Please join us for an introductory technical webcast to Riak CS on April 2. You can also read a technical overview on our website and find full documentation here.
In the coming weeks and months, we look forward to helping new users get started with Riak CS and be successful running it in production. We’ll be expanding integration and partnerships with open source cloud computing platforms in order to provide integrated storage and compute to the marketplace. As always, we’ll be listening to feedback, engaging with the community, and accepting pull requests.
April 12, 2011
There has been a lot of buzz as of late around “riak_core” in various venues, so much so that we are having trouble producing enough resources and content to keep the community at bay (though we most-certainly have plans to). While we hustle to catch up, here is the rundown on what is currently available for those of you who want to learn about, look at, and play with riak_core.
(TL;DR – riak_core is the distributed systems framework that underpins Riak and is, in our opinion, what makes Riak the best and most-robust distributed datastore available today. If you want to see it in action, go download Riak and put it through its paces.)
If you know nothing about riak_core (or are in the mood for a refresher), start with the Introducing Riak Core blog post that appeared on the Basho Blog a while back. This will introduce you, at a very high-level, to what riak_core is and how it works.
Slides and Videos
There are varying degrees of overlap in each of these slides and videos, but they all address riak_core primarily.
- “Building Distributed Systems With Riak and Riak Core”
- Riak Core: An Erlang Distributed Systems Toolkit Slides and Video
- Masterless Distributed Computing with Riak Core Slides and Video
- Riak Core: Dynamo Building Blocks (PDF)
- Riak From The Inside
- Riak’s Distributed Storage Architecture
- riak_core repo on GitHub
- Basho Banjo – Sample application that uses Riak Core to play distributed music
- Try Try Try – Ryan Zezeski’s working blog that is taking an in depth look at various aspects of riak_core
- rebar_riak_core – Rebar templates for riak_core apps from the awesome team at Webster/Clay
Getting Involved With Riak and Riak Core
We are very much at the beginning of what Riak Core can be as a standalone platform for distributed applications, so if you want to get in at the ground floor of something that we feel is truly innovative and unparalleled, now is the time. The best way to join the conversation and to help with the development of Riak Core is to join the Riak Mailing list where you can start asking questions and sharing code.
Also, make sure to follow the Basho Team on Twitter as we spend way too much time talking about this stuff.
November 15, 2010
Oracle didn’t (and can’t) take away your open source software.
A few weeks ago Oracle caused a lot of confusion when they changed the makeup of the MySQL product line, including a “MySQL Classic Edition” version that does not cost money and does not include InnoDB. That combination in the product chart made many people wonder if InnoDB itself had ceased to be free in either the “free beer” or “free speech” sense. The people wondering and worrying included a few users of Innostore, the InnoDB-based storage engine that can be used with Riak.
Luckily, open source software doesn’t work that way.
Oracle didn’t really even try to do what some people thought; they just released a confusing product graph which they have since updated. The MySQL that most people think of first is MySQL Community Edition and it was not one of the editions mentioned in the chart that confused people. That version of MySQL, as well as all of the GPL components included in it such as InnoDB, remain free of cost and also available under the GPL.
This confusion eventually led to a public response from Oracle, so you can read it authoritatively if you like.
Even if someone wanted to, they couldn’t “take it back” in the way that some people feared. Existing software that has been legitimately available under an open source license such as GPL or Apache cannot retroactively be made unfree. The copyright owner might choose to not license future improvements as open source, but that which is already released in such a way cannot be undone. Oracle and Innobase aren’t currently putting new effort into Embedded InnoDB, but a new project has spun up to move it forward. If the HailDB project produces improvements of value, then future versions of Innostore may switch to using that engine instead of using the original Embedded InnoDB release.
November 11, 2010
We announced recently on the Riak Mailing List that Basho was switching to git and GitHub for development of Riak and all other Basho software. As stated in that linked email, we did this primarily for reasons pertaining to community involvement in the development of Riak. The explanation on the Mailing List was a bit terse, so we wanted to share some more details to ensure we answered all the questions related to the switch.
Riak was initially used as the underlying data store for an application Basho was selling several years ago and, at that time, its development was exclusively internal. The team used Mercurial for internal projects, so that was the de-facto DVCS choice for the source.
When we open-sourced Riak in August 2009, being Mercurial users, we chose to use BitBucket as our canonical repository. At the time we open-sourced it, we were less concerned with community involvement in the development process than we are now. Our primary reason for open-sourcing Riak was to get it into the hands of more developers faster.
Not long after this happened, the questions about why we weren’t on GitHub started to roll in. Our response was that we were a Mercurial shop and BitBucket was a natural extension of that. Sometime towards the beginning of May we started maintaining an official mirror of our code on GitHub. This mirror was our way of acknowledging that there is more than one way to develop software collaboratively and that we weren’t ignoring the heaps of developers who were dedicated GitHub users and preferred to look at and work with code on this platform.
GitHub has the concept of “Watchers” (analogous to “Followers” on BitBucket). We started accumulating Watchers once this GitHub mirror was in place. The number of Watchers is a useful, but not absolute, metric for measuring interest and activity in a project. Watchers bring a tremendous amount of attention to any given project through their use of the code and their promotion of it. They also, in the best case scenario, will enhance the code in a meaningful way by finding bugs and contributing patches.
This table shows the week-on-week growth of BitBucket Followers vs. GitHub Watchers since we put the official mirror in place:
|Metric|BitBucket Followers|GitHub Watchers|
|Number of Followers/Watchers at Time of Switch|97|145|
|Avg. Week on Week Growth (%)|0.74|7.2|
Since putting the official mirror in place, the number of Watchers on the GitHub repo for Riak has grown at a steady rate, averaging just over 7% week on week. This far outpaced the less than 1% growth in Followers on the canonical BitBucket repository for Riak.
With this information it was clear that Riak on GitHub as a mirror was bringing us more attention and driving more community growth than was our canonical repo on BitBucket. So, in the interest of community development, we decided that Riak needed to live on GitHub. What they have built is truly the most collaborative and simple-to-use development platform there is (at least one well-respected software analyst has even called it “the future of open source”). Though Mercurial was deeply ingrained in our development process, we were willing to tolerate the workflow hiccups that arose during the week or so it took to get used to git in exchange for the resulting increase in attention and community contributions.
The switch is already proving fruitful. In addition to the sharp influx in Watchers for Riak, we’ve already taken some excellent code contributions via GitHub. That said, there is much left to be written. And we would love for you to join us in building something legendary in Riak, whatever your distributed version control system and platform preference may be.