May 1, 2013
This post looks at five commonly asked questions about Riak CS – simple, available, open source storage built on top of Riak. For more information, please review our full documentation, or sign up for an intro to Riak CS webcast on Friday, May 10.
What is the relationship between Riak and Riak CS?
Riak CS is built on top of Riak, exposing higher-level storage functions including large object support, an S3-compatible API, multi-tenancy, and per-user storage and access statistics. Riak itself provides the replication, availability, fault-tolerance, and underlying storage functions for the Riak CS implementation. Riak and Riak CS should both be installed on every node in your cluster. While Riak and Riak CS could be run on separate virtual or physical nodes, running them on the same machine minimizes intra-cluster bandwidth usage and is the recommended approach. As with Riak, we advise a minimum 5-node cluster.
When an object is uploaded to Riak CS, it is broken up into smaller chunks, which are then streamed, stored, and replicated in the underlying cluster. A manifest is maintained for each object; it records which blocks comprise the object and is used to retrieve all blocks and present them to the client on read. In addition to running Riak and Riak CS on each node, Stanchion, a request serializer, must be installed on at least one node in the cluster. This ensures that global entities, such as users and buckets, are unique in the system.
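As a rough illustration of the chunk-and-manifest idea described above, here is a minimal Python sketch. It is not Riak CS's implementation: the in-memory dict standing in for the cluster, the 1 MiB block size, the key naming, and the function names `put_object` and `get_object` are all assumptions made for the example.

```python
import uuid

# Assumed block size for illustration only; the post does not state one.
BLOCK_SIZE = 1024 * 1024


def put_object(store, bucket, key, data, block_size=BLOCK_SIZE):
    """Split data into blocks, store each block, and record a manifest."""
    object_id = uuid.uuid4().hex  # hypothetical per-upload identifier
    block_keys = []
    for seq, offset in enumerate(range(0, len(data), block_size)):
        block_key = f"{object_id}:{seq}"
        store[block_key] = data[offset:offset + block_size]
        block_keys.append(block_key)
    # The manifest lists, in order, the blocks that make up the object.
    manifest = {"bucket": bucket, "key": key, "size": len(data), "blocks": block_keys}
    store[f"manifest:{bucket}/{key}"] = manifest
    return manifest


def get_object(store, bucket, key):
    """Look up the manifest, fetch every block in order, and reassemble."""
    manifest = store[f"manifest:{bucket}/{key}"]
    return b"".join(store[block_key] for block_key in manifest["blocks"])
```

Here `store` is just a Python dict; in a real deployment the blocks and the manifest would be written to the replicated Riak cluster instead.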
What use cases does Riak CS support that Riak doesn’t?
Riak CS has several features that are not provided in the standalone Riak database. One of the most obvious differences is in the size of objects supported. Riak CS exposes large object support, and includes multi-part upload so you can upload objects as a series of parts. This allows you to upload single objects into the terabyte range. In Riak, the data model is simply key/value; in Riak CS, the key/value model provides the underlying structure for higher-level storage semantics: users, buckets and objects. The Riak CS interface is an S3-compatible HTTP API, allowing you to use existing S3 libraries and tools. In contrast, Riak exposes an HTTP and protobufs API and offers many language-specific clients. Unlike Riak, Riak CS is multi-tenant, with the concept of “users” and per-user reporting on storage and access. This makes it a fit both for private cloud scenarios, with multiple internal users, and as a foundation for a public cloud storage offering.
How do multi-tenancy, authentication and reporting work?
Riak CS exposes an interface for user creation, disablement and credential management. Riak CS can be set so that only administrators can create new users. Administrators also have special privileges including being able to retrieve a list of all users in the system and query the user account information of any user. Once issued credentials, users are able to authenticate, create buckets, upload and download files, retrieve account information, obtain new credentials, or disable their account through the API. Riak CS supports the standard S3 authentication scheme, with support for header and query string authorization.
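To make the header and query string authorization mentioned above more concrete, here is a hedged Python sketch of S3-style request signing. It assumes the "standard S3 authentication scheme" refers to what AWS calls Signature Version 2 (HMAC-SHA1 over a string-to-sign), and it leaves out canonicalization of `x-amz-*` headers for brevity. The function names and the sample credentials are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode


def sign_v2(secret_key, verb, resource, date_or_expires, content_md5="", content_type=""):
    """Base64-encoded HMAC-SHA1 over a simplified S3 v2 string-to-sign.

    Simplification: no x-amz-* headers are included in the string-to-sign.
    """
    string_to_sign = "\n".join([verb, content_md5, content_type, date_or_expires, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key, secret_key, verb, resource, date):
    """Header authorization: the value sent in the Authorization header."""
    return f"AWS {access_key}:{sign_v2(secret_key, verb, resource, date)}"


def presigned_query(access_key, secret_key, verb, resource, expires):
    """Query string authorization: an expiry timestamp replaces the Date value."""
    signature = sign_v2(secret_key, verb, resource, str(expires))
    return urlencode({"AWSAccessKeyId": access_key,
                      "Expires": expires,
                      "Signature": signature})
```

In practice an existing S3 client library would do this signing for you; the sketch only shows why the same key pair can back both the header form and the query string form.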
Riak CS exposes storage, usage and network statistics that support use cases like accounting, subscription, billing or multi-group utilization for public or private clouds. Riak CS will report information on how much storage a user is consuming and the network operations related to access. This data is exposed via an HTTP interface and can be queried on the default timespan “now” or as a range from start time through end time. Access statistics are reported as bytes in and bytes out for both object and bucket operations. Reporting of this information can be scheduled for a set interval or manually triggered.
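As a sketch of how a billing or accounting job might consume access statistics like these, the Python below sums bytes in and bytes out per operation over a start/end range. The record layout (the `time`, `operation`, `bytes_in` and `bytes_out` fields) is a hypothetical stand-in, not the actual format returned by the Riak CS HTTP interface.

```python
from datetime import datetime, timezone


def summarize_access(records, start, end):
    """Total bytes in/out per operation for records with start <= time < end."""
    totals = {}
    for record in records:
        if start <= record["time"] < end:
            entry = totals.setdefault(record["operation"],
                                      {"bytes_in": 0, "bytes_out": 0})
            entry["bytes_in"] += record["bytes_in"]
            entry["bytes_out"] += record["bytes_out"]
    return totals


# Hypothetical sample records for one user.
sample = [
    {"time": datetime(2013, 5, 1, 10, tzinfo=timezone.utc),
     "operation": "object_get", "bytes_in": 0, "bytes_out": 500},
    {"time": datetime(2013, 5, 1, 11, tzinfo=timezone.utc),
     "operation": "object_put", "bytes_in": 2048, "bytes_out": 0},
    {"time": datetime(2013, 5, 2, 9, tzinfo=timezone.utc),
     "operation": "bucket_list", "bytes_in": 0, "bytes_out": 90},
]
```

Calling `summarize_access(sample, start, end)` with a one-day window covering May 1 would include the first two records and skip the third.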
What’s the difference between Riak CS and Riak CS Enterprise?
Riak CS Enterprise provides multi-datacenter replication on top of Riak CS. For multi-datacenter replication in Riak CS, global information for users, bucket information and manifests are streamed in real-time from a primary implementation to a secondary site so global state is maintained across locations. Objects can then be replicated in either full sync or real-time sync mode. The secondary site will replicate the object as in normal operations. Additional datacenters can be added in order to create availability zones or provide additional data redundancy and locality. Riak CS Enterprise can also be configured for bi-directional replication. Riak CS Enterprise also comes with 24/7, enterprise-level support. More information and pricing can be found here, and full technical information is available on our docs portal. Ready to get started? Sign up for a developer trial of Riak CS Enterprise.
What are your plans for integration of Riak CS with open source compute solutions?
Riak CS provides highly available, distributed storage, making it a natural fit for usage alongside compute solutions. We have partnered with Citrix to collaborate on the integration of Apache CloudStack and Riak CS to create a complete cloud software offering that combines compute and storage in an integrated platform. For more information on our partnership with CloudStack, check out this blog post with the latest update. API and authentication support for OpenStack is also in progress.
SoftLayer & Basho Partner for High-Performance, Scalable Riak In The Cloud
Turnkey Big Data Environments Available Across SoftLayer’s Global Infrastructure
Dallas, TX and Cambridge, MA — April 30, 2013 — Basho and SoftLayer Technologies today announced the availability of Riak and Riak Enterprise on SoftLayer’s global cloud platform. The integrated solution provides the availability, fault tolerance, operational simplicity, and scalability of Riak combined with the flexibility, performance, and agility of SoftLayer’s on-demand infrastructure.
SoftLayer and Basho have collaborated to make Riak—an open source, distributed database—deployment more accessible and flexible through a pay-as-you-go service model. The solution enables organizations to swiftly deploy scalable production-grade systems, accelerating the speed of deployment of big data applications and providing greater business agility. Organizations can design and deploy a complete solution set through SoftLayer’s Web-based Solution Designer, with ongoing management and provisioning available via the company’s portal, mobile apps and API.
Common use cases for Riak include fault-tolerant, low-latency storage for content, user and session information, mobile data, log files, JSON/XML documents, and more. For customers that require replication of clusters between multiple data centers, Riak Enterprise is available and also adds extended monitoring and 24×7 support. Customers use multi-datacenter replication, in two or more sites, to serve global traffic, maintain active backups, run secondary analytics clusters, or meet disaster recovery and regulatory compliance requirements.
“Customer demand for easy-to-deploy and manage big data cloud-based solutions continues to rise,” says Duke Skarda, CTO of SoftLayer. “We are seeing substantial adoption through joint customers, such as Bump. Our platform was built specifically to support the kind of Web-scale distributed applications that big data exemplifies, and our partnership with Basho is further validation of our commitment to deliver a complete suite of scalable, high-performance big data solutions.”
“Basho and SoftLayer have long catered to innovative developers building the next generation of web, social and mobile applications. Today, enterprise customers are demanding the same, an architecture that provides for zero-data loss and delivers zero-downtime,” says Bobby Patrick, executive vice president and CMO of Basho. “We believe distributed systems software, such as Riak, and distributed infrastructure is required to help customers truly achieve these ambitions. Basho is excited to partner with SoftLayer to help companies easily deploy applications that are truly distributed, scalable, and always available.”
Bump is one of the most popular mobile apps on the market today. The app makes it easy for users to share their contact information, photos, and other objects by simply “bumping” their smartphones. Bump uses Riak to store user data including events, communications sent and received, handset information and authentication tokens.
“Operational ease is key to our business success,” says Mark Smith, Operations Lead at Bump. “The combination of SoftLayer, who we already trust with our business and data, and Basho, who makes the database that we trust at scale, saves us time and effort and allows us to focus on our business, not our data infrastructure.”
Features & Benefits
- Web-based SoftLayer Solution Designer makes it easy to configure and deploy Riak environments on demand and at the click of a button.
- High performance and superior availability and scalability leveraging the broadest cloud infrastructure platform in the industry including dedicated bare metal servers and a broad range of storage options.
- Global private network allows for high-speed, secure replication between clusters.
- Optimized infrastructure and best-practice deployments based on joint insights, expertise, and experience from SoftLayer and Basho.
- Pay-as-you-go model provides the flexibility of monthly or annual billing and no long-term contracts.
Over 30 speakers from bitly, Comcast, The Weather Channel, Turner Broadcasting System, Harvard University, and more to discuss the future of distributed systems.
New York City, NY – April 8, 2013 – Basho, the worldwide leader in distributed database and cloud storage software, announced today the initial speaker line up for RICON East. RICON is Basho’s global conference series that is dedicated to distributed systems and is designed by and for engineers, developers, data scientists, and architects. RICON East is being held May 13-14 in New York City, NY. Basho expects to assemble hundreds of the industry’s most influential thinkers and practitioners devoted to deploying distributed systems technologies, including NoSQL solutions and Cloud Storage.
Dr. Margo L. Seltzer, Harvard University
Rich Hickey, Creator of Clojure, Datomic
Camille Fournier, Rent the Runway
Alex Payne, Breather
Hilary Mason, bitly
Theo Schlossnagle, OmniTI
Robert Treat, OmniTI
Neha Narula, Massachusetts Institute of Technology (MIT)
Neil Conway, UC Berkeley
Kyle Kingsbury, Factual
Ed Laczynski, Datapipe
Brian Akins, Turner Broadcasting System
Sathish Gaddipati, The Weather Channel
Michajlo Matijkiw, Comcast
Mark Wunsch, Gilt Groupe
Basho engineers will be featured prominently throughout RICON East. Basho speakers include: Andy Gross, Sean Cribbs, Matthew Von-Maszewski, Ryan Zezeski, Chris Tilt.
RICON East builds on Basho’s highly successful, sold-out RICON 2012 event held Fall 2012 in San Francisco. Presentations from RICON 2012 are available to view at www.ricon2012.com.
Tickets are available online at http://ricon.io/east.html. Student discount prices are available online. For other discounts, including discounts for large groups, contact Mark Phillips at email@example.com.
Initial sponsors of RICON East include Fastly, Meraki, Engine Yard, Github and NoSQLWeekly. For more information on sponsorship opportunities, contact Tom Santero at firstname.lastname@example.org.
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy-to-operate at scale. Basho’s distributed NoSQL database, Riak, and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by over 25% of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
March 25, 2013
All of us here at Basho would like to congratulate CloudStack on their graduation from incubation status to a Top-Level Apache project. This signifies that the Project’s community and products have been well governed under the Apache Software Foundation’s meritocratic process and principles. CloudStack joined the Apache Incubator back in April 2012 and has experienced large successes since – including Datapipe’s decision to build their public cloud on CloudStack.
If you’re not familiar with CloudStack, it is used to deliver Infrastructure-as-a-Service (IaaS) cloud computing in private, public, and hybrid cloud environments. It has been proven to be both stable and highly scalable, underpinning production clouds of more than 30,000 physical nodes in geo-distributed environments.
Basho has been partnered with CloudStack since September of last year, working together to build a combined platform for compute and storage resources. For more details about how Basho and CloudStack are working together, check out the full announcement.
The Apache Software Foundation is a non-profit that provides organizational, legal, and financial support for nearly 150 open source projects and initiatives. The pragmatic Apache License makes it easy for all users, commercial and individual, to deploy Apache products. Visit their site to learn more.
Congratulations again, CloudStack. We’re excited to see what comes next!
New York City, NY – March 20, 2013 – Today at GigaOM Structure Data 2013 in New York City, Basho, the worldwide leader in distributed database and cloud storage software, announced that Riak CS (Cloud Storage) is now available as open source, significantly expanding the ease of access to Basho’s software for developers, enterprise architects, and IT operations professionals seeking to build public or private storage clouds. Also today, Basho announced the general availability of Riak CS v1.3, the third release of Basho’s simple, available cloud storage software.
Riak CS is a multi-tenant, distributed, S3-compatible cloud storage platform that enables enterprises and service providers to launch public or private cloud services. Built on top of Riak, the world’s most advanced, open source, distributed database, Riak CS provides horizontal scale, extreme durability and low operational overhead in a distributed object storage system. Riak CS Enterprise adds Basho’s multi-datacenter replication technology and is backed by Basho’s 24×7 support and enterprise-class service-level commitments.
Riak CS Enterprise is used by great organizations worldwide including Datapipe, Deutsche Vermögensberatung (DVAG), IDC Frontier, Rovio, and Yahoo! JAPAN.
New Features in 1.3
- Multipart Upload. Riak CS v1.3 includes a new multipart upload capability that lets users store very large files by uploading parts in parallel.
- Enhanced Control for Multi-Tenant Environments. Riak CS v1.3 introduces object access control by source IP, enabling operators to restrict access to Riak CS buckets by IP address.
- Support for GET Range Requests. Riak CS users can now retrieve a range of bytes from a single object. This functionality is implemented in the “Range” request header of GET operations.
- Graphical Tool for Riak CS. Riak CS Control is a standalone web administration tool for user management.
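To illustrate the GET range request feature listed above, here is a small Python sketch of building a standard HTTP `Range` header and of the bytes such a range selects. Byte ranges in HTTP are inclusive at both ends. The helper names `range_header` and `select_bytes` are hypothetical, and the second function only mimics what a server would return.

```python
def range_header(first_byte, last_byte=None):
    """Build a Range header for an inclusive byte range (open-ended if last_byte is None)."""
    if first_byte < 0:
        raise ValueError("first_byte must be non-negative")
    if last_byte is None:
        return {"Range": f"bytes={first_byte}-"}
    if last_byte < first_byte:
        raise ValueError("last_byte must not precede first_byte")
    return {"Range": f"bytes={first_byte}-{last_byte}"}


def select_bytes(data, first_byte, last_byte=None):
    """Return the bytes a server would send back for the given inclusive range."""
    end = len(data) if last_byte is None else last_byte + 1
    return data[first_byte:end]
```

For example, `range_header(0, 1023)` asks for the first 1,024 bytes of an object, and `range_header(100)` asks for everything from byte 100 onward.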
Basho offers a hosted “sandbox” to test interfacing with a live implementation of Riak CS. The “sandbox” is available at https://www.riakcs.net/users/sign_in.
For more information on Riak CS Enterprise, and to request a Developers Trial License, click https://basho.com/riak-cloud-storage/.
Upcoming Riak CS Webinar and RICON EAST
Basho will host an “Introduction to Riak CS Webinar” on Tuesday, April 2. To participate in the webinar, sign up here.
Basho is hosting RICON EAST on May 13 – 14, 2013 in New York City, NY. RICON is Basho’s distributed systems conference by and for engineers, developers, scientists and architects. For ticket information on RICON East, visit http://ricon.io/east.html.
Greg Collins, president and CEO, Basho
“It has been almost one year since we first released Riak CS. In just 12 months, we have seen rapid adoption by global cloud operators, telecommunication providers and large enterprises. Over the past year, Riak CS has gained new advanced capabilities and has been battle-tested in many of our customers’ and partners’ labs. Our customers have deployed Riak CS as the object storage engine inside popular cloud computing platforms, including Apache CloudStack and OpenStack. Today, by open sourcing Riak CS, we are making it easier for users to experiment with and test Riak CS, to provide rapid product feedback, and to contribute to its future capabilities.”
Ash Yamanaka, general manager, IDC Frontier and
Shingo Saito, cloud product manager, Yahoo! JAPAN
“Basho, Yahoo! JAPAN and IDC Frontier, a member of Yahoo! JAPAN group, have a very strong and growing partnership. Today, Yahoo! JAPAN and IDC Frontier leverage Riak CS Enterprise to offer an S3-compatible public cloud storage service, as well as dedicated hosting options for our customers’ various applications. Yahoo! JAPAN and IDC Frontier are highly supportive of open source software and we view Basho’s announcement today as a positive move that will work to accelerate its ability to innovate and ultimately strengthen our cloud platform.”
Sameer Dholakia, group vice president and GM, Citrix Platforms Group, Citrix
“Basho clearly understands the market power of open source. Since Citrix and Basho started collaborating last year, we have seen strong enthusiasm among Citrix CloudPlatform users for Basho’s cloud object storage solution. Now, CloudStack users have easy access to Riak CS and can quickly deploy an object storage solution that features multi-tenancy and S3 compatibility. We believe that many Citrix CloudPlatform customers will also seek Riak CS Enterprise for its distributed data capabilities across multiple data centers.”
Ed Laczynski, vice president, Cloud Strategy and Architecture, Datapipe
“Datapipe is very supportive of Basho’s decision to open source portions of Riak CS. During the last six months, we have deployed Riak CS Enterprise in Datapipe’s 10gig Stratosphere cloud computing platform. Riak CS provides Datapipe and its customers with highly available, low-latency and S3-compatible storage. Datapipe’s customers will benefit as Basho’s community increasingly experiments, tests and contributes to Riak CS, ultimately speeding our access to more capabilities and higher performance.”
Simon Robinson, vice president of storage research, 451 Research
“The cloud storage market continues to accelerate as companies seek to build public and private storage clouds that mirror Amazon Web Services’ capabilities and economics. Basho, with Riak CS, already has a proven track record of successful customer public and private cloud deployments. Now, Basho is demonstrating it has confidence that the technical and business benefits of Riak CS can be accelerated even faster via the open source model.”
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is always available, fault-tolerant and easy-to-operate at scale. Basho’s distributed NoSQL database, Riak, and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by over 25% of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com.
Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
Basho Marketing Manager
March 20, 2013
Riak CS (Cloud Storage) is simple, available cloud storage software built on Riak. Basho announced today that Riak CS is now open source under the Apache 2 license. Organizations and users can now access the source code on Github and download the latest packages from the downloads page. Also, today, we announced that Riak CS Enterprise is now available as commercial licensed software, featuring multi-datacenter replication technology and 24×7 Basho customer support.
We will be hosting an introductory webcast to Riak CS on Tuesday, April 2. Sign up here.
Riak CS can be used to build private or public clouds or as reliable, available storage behind applications and platforms. Riak CS Enterprise is currently used by large corporations including Datapipe, Deutsche Vermögensberatung (DVAG), IDC Frontier, Rovio, and Yahoo! JAPAN.
Basho is a distributed systems company dedicated to making software that is available, fault-tolerant, and easy to operate at scale. Twenty-five percent of the Fortune 50 and thousands of open source users large and small run our flagship open source database, Riak. Riak CS takes distributed systems principles derived from production Riak users and applies them to the problem of large scale storage. We are excited to share this code with the world.
Riak CS features:
- Highly available, fault-tolerant storage
- Large object support
- S3-compatible API and authentication
- Multi-tenancy and per-user reporting
- Simple operational model for adding capacity
- Robust stats for monitoring and metrics
For users requiring multi-datacenter replication and enterprise-level support, Riak CS Enterprise (a commercial extension of Riak CS) is available.
Today we are also announcing several new features, available now as part of the open source edition.
- Multipart upload. Upload very large files to Riak CS as a series of parts. Parts can be between 5MB and 5GB.
- Support for GET range queries. Retrieve a range of bytes from a single object. This functionality is implemented in the “Range” request header of GET operations.
- Per-bucket policies to restrict access to buckets based on source IP.
- Riak CS Control. Riak CS Control is a standalone web administration tool for user management available on Github.
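The multipart upload feature in the list above can be pictured as a planning step that splits a large file into numbered parts. The Python below is a sketch under stated assumptions: "5MB" and "5GB" are read as binary units, the final part is allowed to be smaller than the minimum (as S3 itself permits), and the name `plan_parts` is hypothetical. It plans byte ranges only and makes no network calls.

```python
MB = 1024 * 1024
GB = 1024 * MB
MIN_PART = 5 * MB  # lower bound on part size, per the feature list above
MAX_PART = 5 * GB  # upper bound on part size, per the feature list above


def plan_parts(total_size, part_size):
    """Return (part_number, offset, length) tuples covering total_size bytes."""
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part_size must be between 5MB and 5GB")
    parts = []
    offset = 0
    number = 1  # part numbers start at 1, following S3 convention
    while offset < total_size:
        length = min(part_size, total_size - offset)  # last part may be short
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

Each tuple could then be handed to a separate worker, which is what makes parallel upload of the parts possible.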
“Basho, Yahoo! JAPAN, and IDC Frontier, a member of Yahoo! JAPAN group, have a very strong and growing partnership. Today, Yahoo! JAPAN and IDC Frontier leverage Riak CS Enterprise to offer an S3-compatible public cloud storage service, as well as dedicated hosting options for our customers’ various applications. Yahoo! JAPAN and IDC Frontier are highly supportive of open source software and we view Basho’s announcement today as a positive move that will work to accelerate its ability to innovate and ultimately strengthen our cloud platform.”
- Ash Yamanaka, general manager, IDC Frontier and
- Shingo Saito, cloud product manager, Yahoo! JAPAN
“Basho clearly understands the market power of open source. Since Citrix and Basho started collaborating last year, we have seen strong enthusiasm among Citrix CloudPlatform users for Basho’s cloud object storage solution. It has also provided the Apache CloudStack community with easy access to Riak CS for multi-tenancy and S3 compatibility. With today’s announcement, Citrix CloudPlatform customers will continue to benefit from Riak CS Enterprise for its distributed data capabilities across multiple data centers.”
- Sameer Dholakia, group vice president and GM, Citrix Platforms Group, Citrix
“Over the last six months, we have deployed Riak CS Enterprise within Datapipe’s 10gig Stratosphere cloud computing platform. Riak CS provides our customers with highly available, low-latency, S3-compatible cloud object storage. Datapipe is very supportive of Basho’s decision to open source portions of Riak CS. As Basho’s open source community grows, experiments, tests and contributes to Riak CS, Datapipe clients will benefit from access to additional capabilities and higher performance.”
- Ed Laczynski, vice president, Cloud Strategy and Architecture, Datapipe
Please join us for an introductory technical webcast to Riak CS on April 2. You can also read a technical overview on our website and find full documentation here.
In the coming weeks and months, we look forward to helping new users get started with Riak CS and be successful running it in production. We’ll be expanding integration and partnerships with open source cloud computing platforms in order to provide integrated storage and compute to the marketplace. As always, we’ll be listening to feedback, engaging with the community, and accepting pull requests.
March 11, 2013
Nearly each day this month, we will be speaking at conferences, hosting meetups, and sponsoring events. For a full list of events, visit our Events Page. If you want to meet up with a Basho team member at one of these events, contact us to set up a time. Below are some of the highlights:
GigaOM Structure: Basho will be speaking at two different sessions at GigaOM Structure (March 20-21) in New York. Come hear Basho CTO, Justin Sheehy, and Technical Evangelist, Tom Santero, speak, stop by our booth, or attend our cocktail reception on March 20th.
Game Developer Conference 2013: Basho Chief Architect, Andy Gross, will be speaking at the Game Developer Conference at a session titled “Gaming on NoSQL: Building Available, Fast Services with Riak.” GDC will be held March 25-29 in San Francisco. Check out our session and booth to learn more about how gaming platforms can use Riak.
Meetups: This month, we are hosting a number of meetups all over the country. If you’re in Austin, come visit us at BlackLocus on March 11th. If you’re in Seattle, visit us on March 13th at Blue Box Group. If you’re in Chicago, visit us on March 14th at Braintree, and if you’re in Boston, check us out on March 27th at Basho’s Cambridge office. We’ll also be at Riot Games in LA on March 19th and in Portland on March 28th at NedSpace.
Sponsored Events: Basho will be sponsoring Erlang Factory 2013 in San Francisco (March 18-22), Clojure/West in Portland (March 18-20), Open Analytics Summit in Arlington, VA (March 25), and Monitorama in Boston (March 28-29).
Hope to see you soon!
March 6, 2013
We had a great time and met a lot of interesting people. During the conference, our own Tyler Hannan, Director of Technical Marketing, was interviewed about Riak and the conference. Check out his interview below:
Missed us at Strata? Check out our Events Page to see where Basho will be next. We’d love to chat!
February 28, 2013
In the last post, we looked at how Riak Enterprise’s multi-datacenter replication can be configured for backups and data locality. In this post, we examine two other common implementations: availability zones and public cloud use cases. For more information on Riak Enterprise architecture and configuration, download the complete whitepaper.
Availability zones provide efficient multi-datacenter replication and data redundancy within a geographic region (such as a coast or a country). In this configuration, data is replicated within an availability zone’s series of datacenters. In the event that one of the datacenters experiences an outage or serious failure, data can still be served from other datacenters within the same region.
One approach to setting this up is to have a “primary” site in a region where all reads and writes for specific users, applications, or data sets are directed. This primary cluster can then be replicated to one or more proximal secondary clusters. In other approaches, data can be replicated in real-time from one cluster to both another datacenter and other cold backups maintained for emergency conditions. The right approach is highly dependent on the requirements of users, availability, expense of bandwidth, and other constraints.
Public Cloud Use Cases
Riak is designed to be easy to use and operate on public clouds, and is partnered with many of the leading cloud providers, including Amazon Web Services, Microsoft Azure, and Joyent. Hosted Riak is also available from Engine Yard and Riak packages can always be manually installed on any physical or virtual provider, even if a machine image isn’t explicitly supported.
There are several use cases for Riak Enterprise’s multi-datacenter replication in the public cloud. Many enterprises want to maintain a cold or hot backup of their cluster in a public cloud for business continuity in the event of a datacenter outage in their private infrastructure. For other customers, the public cloud can provide a more cost-effective way of meeting peak loads, rather than building out private infrastructure to accommodate them year-round. For example, many retailers and media providers need to offer increased capacity over the holiday season. Riak Enterprise is used to scale out capacity on public clouds over these periods, either with full-sync or real-time sync depending on the business needs.
Finally, some enterprises run certain applications or services entirely on public clouds. Riak Enterprise allows for redundancy and data locality across public cloud availability zones for this use case, ensuring optimal performance and resiliency.
February 15, 2013
On January 16, 2013, Citrix’s David Nalley and Basho’s John Burwell detailed the underlying architecture and customer benefits of CloudStack and Basho’s Riak CS software. In the video, David Nalley provides an overview of CloudStack’s architecture and offers a walkthrough of CloudStack Administration, including provisioning of new instances and a description of the CloudStack API.
John Burwell, also a Committer to Apache CloudStack, explains the role of Secondary Storage software to store immutable assets (templates, ISO images, snapshots). John details the difference between Object Storage and Block Storage, discussing benefits such as flexible meta-data definition and custom API access. Finally, John describes enhancements in the upcoming 4.1.0 release that leverage Riak CS to synchronize assets in secondary storage across zones/data centers, reducing operational costs and complexity for multi-zone CloudStack implementations.
Basho is actively working with the CloudStack community to design CloudStack’s next generation architecture to drive deeper integration of leading edge storage technologies such as Riak CS.
Earlier this week, Basho and Datapipe announced the availability of a new object storage service on Datapipe’s 10 Gig Stratosphere Cloud Platform. The S3-compatible object storage service is built on Citrix CloudStack and fully integrates Riak CS.