Tag Archives: S3

Top Five Questions About Riak CS

May 1, 2013

This post looks at five commonly asked questions about Riak CS – simple, available, open source storage built on top of Riak. For more information, please review our full documentation, or sign up for an intro to Riak CS webcast on Friday, May 10.

What is the relationship between Riak and Riak CS?

Riak CS is built on top of Riak, exposing higher-level storage functions including large object support, an S3-compatible API, multi-tenancy, and per-user storage and access statistics. Riak itself provides the replication, availability, fault-tolerance, and underlying storage functions for the Riak CS implementation. Riak and Riak CS should both be installed on every node in your cluster. While Riak and Riak CS could be run on separate virtual or physical nodes, running them on the same machine minimizes intra-cluster bandwidth usage and is the recommended approach. As with Riak, we advise a minimum 5-node cluster.

When an object is uploaded to Riak CS, it is broken into smaller blocks, which are then streamed, stored, and replicated in the underlying cluster. A manifest is maintained for each object; it records which blocks make up the object and is used to retrieve all of them and present them to the client on read. In addition to running Riak and Riak CS on each node, Stanchion, a request serializer, must be installed on at least one node in the cluster. This ensures that global entities, such as users and buckets, are unique in the system.
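The block-and-manifest idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Riak CS's actual implementation: the `Manifest` class, the block key naming scheme, the in-memory `block_store` dictionary, and the 1 MiB block size are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed block size for illustration only; the real value is a deployment setting.
BLOCK_SIZE = 1024 * 1024


@dataclass
class Manifest:
    """Hypothetical manifest: the ordered list of block keys for one object."""
    bucket: str
    key: str
    content_length: int
    block_keys: List[str] = field(default_factory=list)


def store_object(bucket: str, key: str, data: bytes,
                 block_store: Dict[str, bytes],
                 block_size: int = BLOCK_SIZE) -> Manifest:
    """Split data into fixed-size blocks, store each block, and return the manifest."""
    manifest = Manifest(bucket, key, len(data))
    for seq, offset in enumerate(range(0, len(data), block_size)):
        block_key = f"{bucket}/{key}/{seq}"  # illustrative naming scheme
        block_store[block_key] = data[offset:offset + block_size]
        manifest.block_keys.append(block_key)
    return manifest


def read_object(manifest: Manifest, block_store: Dict[str, bytes]) -> bytes:
    """Follow the manifest to fetch every block in order and reassemble the object."""
    return b"".join(block_store[k] for k in manifest.block_keys)
```

In a real cluster the dictionary would be replaced by the replicated Riak store, and blocks would be streamed rather than held in memory at once.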

What use cases does Riak CS support that Riak doesn’t?

Riak CS has several features that are not provided in the standalone Riak database. One of the most obvious differences is in the size of objects supported. Riak CS exposes large object support, and includes multi-part upload so you can upload objects as a series of parts. This allows you to store single objects into the terabyte size range. In Riak, the data model is simply key/value; in Riak CS, the key/value model provides the underlying structure for higher-level storage semantics: users, buckets and objects. The Riak CS interface is an S3-compatible HTTP API, allowing you to use existing S3 libraries and tools. In contrast, Riak exposes HTTP and Protocol Buffers APIs and offers many language-specific clients. Unlike Riak, Riak CS is multi-tenant, with the concept of “users” and per-user reporting on storage and access. This makes it a fit both for private cloud scenarios with multiple internal users and as a foundation for a public cloud storage offering.
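To make "a series of parts" concrete, here is a minimal sketch of how a client might plan a multi-part upload before sending anything. The function name `plan_parts` and the part size used in the usage note are assumptions for illustration; part numbers start at 1, following the S3 multipart convention.

```python
from typing import List, Tuple


def plan_parts(total_size: int, part_size: int) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part of an upload.

    Part numbers start at 1; the final part may be shorter than part_size.
    """
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    part_number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts
```

For example, `plan_parts(25, 10)` yields three parts of 10, 10, and 5 bytes. An existing S3 library would normally do this planning for you; the sketch only shows what happens underneath.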

How do multi-tenancy, authentication and reporting work?

Riak CS exposes an interface for user creation, disablement and credential management. Riak CS can be set so that only administrators can create new users. Administrators also have special privileges including being able to retrieve a list of all users in the system and query the user account information of any user. Once issued credentials, users are able to authenticate, create buckets, upload and download files, retrieve account information, obtain new credentials, or disable their account through the API. Riak CS supports the standard S3 authentication scheme, with support for header and query string authorization.
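For readers who have not seen the header form of the standard S3 scheme, the sketch below builds an `Authorization` header in the style of S3's original REST authentication (HMAC-SHA1 over a canonical string). It assumes Riak CS accepts the same string-to-sign layout as S3; the function names are ours, and the simplified handling of `x-amz-*` headers skips edge cases such as repeated headers. In practice an S3 client library does this for you.

```python
import base64
import hashlib
import hmac
from typing import Dict, Optional


def string_to_sign(method: str, resource: str, date: str,
                   content_md5: str = "", content_type: str = "",
                   amz_headers: Optional[Dict[str, str]] = None) -> str:
    """Build the canonical string: verb, MD5, type, date, x-amz-* headers, resource."""
    canonical_amz = "".join(
        f"{name}:{value}\n"
        for name, value in sorted(
            (n.lower(), v.strip()) for n, v in (amz_headers or {}).items()
        )
    )
    return f"{method}\n{content_md5}\n{content_type}\n{date}\n{canonical_amz}{resource}"


def authorization_header(access_key: str, secret_key: str, to_sign: str) -> str:
    """Sign the canonical string with HMAC-SHA1 and format the Authorization value."""
    digest = hmac.new(secret_key.encode("utf-8"),
                      to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"AWS {access_key}:{signature}"
```

The `date` argument must match the `Date` header sent with the request, since the server recomputes the same signature from what it receives.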

Riak CS exposes storage, usage and network statistics that support use cases like accounting, subscription, billing or multi-group utilization for public or private clouds. Riak CS will report information on how much storage a user is consuming and the network operations related to access. This data is exposed via an HTTP interface and can be queried on the default timespan “now” or as a range from start time through end time. Access statistics are reported as bytes in and bytes out for both object and bucket operations. Reporting of this information can be scheduled for a set interval or manually triggered.
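As a rough illustration of how such statistics might feed billing or accounting, the sketch below sums bytes in and bytes out for one user over a time range. The `AccessSample` record and its field names are hypothetical and do not reflect the format Riak CS actually returns; the half-open range (start included, end excluded) is also our choice.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AccessSample:
    """Hypothetical access record; field names are illustrative only."""
    user: str
    timestamp: datetime
    bytes_in: int
    bytes_out: int


def bytes_for_user(samples: Iterable[AccessSample], user: str,
                   start: datetime, end: datetime) -> Tuple[int, int]:
    """Sum (bytes_in, bytes_out) for one user where start <= timestamp < end."""
    total_in = 0
    total_out = 0
    for sample in samples:
        if sample.user == user and start <= sample.timestamp < end:
            total_in += sample.bytes_in
            total_out += sample.bytes_out
    return total_in, total_out
```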

What’s the difference between Riak CS and Riak CS Enterprise?

Riak CS Enterprise provides multi-datacenter replication on top of Riak CS. For multi-datacenter replication in Riak CS, global information for users, bucket information and manifests are streamed in real-time from a primary implementation to a secondary site so global state is maintained across locations. Objects can then be replicated in either full sync or real-time sync mode. The secondary site will replicate the object as in normal operations. Additional datacenters can be added in order to create availability zones or provide additional data redundancy and locality. Riak CS Enterprise can also be configured for bi-directional replication. Riak CS Enterprise also comes with 24/7, enterprise-level support. More information and pricing can be found here, and full technical information is available on our docs portal. Ready to get started? Sign up for a developer trial of Riak CS Enterprise.

What are your plans for integration of Riak CS with open source compute solutions?

Riak CS provides highly available, distributed storage, making it a natural fit for usage alongside compute solutions. We have partnered with Citrix to collaborate on the integration of Apache CloudStack and Riak CS to create a complete cloud software offering that combines compute and storage in an integrated platform. For more information on our partnership with CloudStack, check out this blog post with the latest update. API and authentication support for OpenStack is also in progress.

Ready to get started? You can download Riak CS here, and check out the Riak CS Fast Track for a hands-on getting started guide.

Getting Started with Riak CS and S3cmd

April 9, 2013

Riak CS (Cloud Storage) is simple, open source storage software built on top of Riak. s3cmd is a command-line tool for uploading, retrieving, and managing data via an Amazon S3 compatible API.

In this short screencast, we cover the process of installing and configuring s3cmd on a Debian- or Ubuntu-based system. Once it is installed, we’ll use s3cmd to manage buckets and files in Riak CS. You can view the entire screencast below.

Due to the small type used in the screencast, we recommend viewing this video at a high resolution.

You can also learn more about Riak CS here.

Basho

Use Cases for Riak CS

April 3, 2013

As you might have heard, we recently open sourced Riak CS, cloud storage built on Riak. You can find all of the code on our GitHub account and download Riak CS here. To help you get started with Riak CS, here are some common use cases.

  • Large Object Storage For Applications and Services: Riak CS is built for storing large objects of all types. It is content agnostic so you can store images, text, video, documents, database backups, software binaries, or other data types. Riak CS can store objects into the terabyte size range using the new multipart upload feature. When an object is uploaded, Riak CS breaks it into smaller blocks that are streamed, stored, and replicated in the underlying Riak cluster.
  • On-Demand Internal Storage Capacity: Riak CS provides highly available storage for internal business units. Built on Riak, Riak CS has a masterless, redundant design that ensures availability and fault-tolerance. Use cases might include document storage or backing for internal applications.
  • Storage Layer for Public Clouds/Cloud Services: Riak CS’ flexibility and scalability provide the ideal foundation for building public clouds or cloud services. Capacity can be added by installing Riak CS on a new physical node and joining it with the cluster. Riak automatically redistributes data and ownership so all nodes have equal responsibility, which prevents storage hot spots and decreases the operational burden of adding new nodes. Additionally, Riak CS is multi-tenant, a requirement of most public cloud services today.
  • Amazon S3 Compatibility: Riak CS is S3-compatible, making it easy for your developers to be productive quickly. Riak CS can be used with existing S3 clients and libraries. The HTTP REST API supports service, bucket, and object-level operations to easily store and retrieve data. Riak CS makes sense for companies that are trying to provide internal, S3-like services or using a hybrid approach with some public and some private cloud storage.
  • Disaster Recovery and Active Backups: Riak CS Enterprise extends Riak CS with multi-datacenter replication. By replicating data across datacenters using either real-time or full-sync replication, you can maintain redundant storage in case of disaster scenarios. Multi-datacenter replication can also be used to maintain active backups or create availability zones.

For more information about Riak CS, visit our site and download the technical overview.

Basho

Datapipe and Riak Cloud Storage

February 11, 2013

We are excited to announce that Datapipe’s Stratosphere, a globally available, high-performance managed cloud computing platform, leverages Riak Cloud Storage (CS). Riak Cloud Storage provides Datapipe and its customers with highly available, low-latency and S3-compatible storage.

Sign up here to get started with Datapipe’s 10 Gig Stratosphere cloud platform and earn a $500 credit.

Datapipe offers a single provider solution for managing and securing mission critical IT services, including cloud computing, infrastructure as a service, platform as a service, managed hosting, and colocation.

Stratosphere is Datapipe’s globally available managed cloud computing platform. With the launch of Riak CS to support cloud object storage, Datapipe customers can now access cloud object storage from any solution hosted with Datapipe and adjacent to existing solutions in any Datapipe data center. Stratosphere is designed for enterprise high I/O production environments and can also be used for development, testing and QA environments. Use cases include large-scale marketing campaigns, brand sites and analytics; applications with variable peak demand times and other dynamic workloads; and cloud disaster recovery and geographic redundancy.

Datapipe delivers services from the world’s most influential technical and financial markets including New York metro, Silicon Valley, London, Hong Kong and Shanghai.

Why Riak Cloud Storage at Datapipe?

After extensively testing solutions from a variety of vendors in the space, Datapipe selected Riak Cloud Storage for its low-latency, highly available object storage, operational ease of use, and multi-site replication capabilities. A few core reasons stood out:

  • Built on years of developing Riak, Riak CS is designed to provide simple, available, distributed cloud storage at any scale.
  • Riak CS is compatible with major cloud object storage clients and applications with its S3-based API.
  • Riak CS meets the high performance requirements of the Stratosphere cloud-computing platform.

“Riak CS provides the high-performance, distributed datastore we need to deliver a sound foundation for our cloud storage needs now and for many years into the future,” said Ed Laczynski, VP Cloud Strategy, Datapipe.

Be on the lookout for upcoming documentation about using Riak CS-backed functionality on Stratosphere at Datapipe. Riak CS is now available with Datapipe in a limited beta, with an upcoming full release.

For a developer trial of Riak CS, sign up here.

Basho

Self-Service Test Harness for Riak Cloud Storage

January 7, 2013

Riak Cloud Storage is simple, available cloud storage software built on top of Riak. It offers an S3 API, multi-tenancy and large object support for enterprises building public or private clouds. We want to make it easier to get started with Riak CS, so we’re now offering a self-service test harness. Visit riakcs.net to sign up - you can explore the functionality, test API operations, and experiment with clients and development apps. With the self-service feature, you can start playing right away.

Note that the test harness is primarily for exploring Riak CS features – if you want to do load testing and performance benchmarking, you should sign up for a developer trial that will give you access to Riak CS packages you can install and test on your own hardware.

Interested in learning more about Riak CS? All of the docs are available online.

Basho Team

Announcing Riak Cloud Storage With Multi-Datacenter Replication

November 30, 2012

Riak Cloud Storage is an S3-compatible, multi-tenant storage platform built on Riak. It combines the availability and fault tolerance of Riak with the ability to store large objects, an S3-compatible API, user administration and usage reporting. It can be used for public and private clouds or as reliable storage for applications. Today we’re announcing multi-datacenter replication support in Riak CS. Increasingly, global enterprises and apps require multi-site storage replication to achieve data locality, stay available in disaster scenarios, or maintain active backups, so we’re very excited to provide these features in the latest release of Riak CS.

You can read more about multi-datacenter replication for Riak CS in the public docs, or sign up for an upcoming webcast on Thursday, December 6, which gives a technical overview of Riak CS and discussion of new features. If you want something more hands on, get a developer trial of Riak CS to take it for a test drive.

Technical Details

Multi-datacenter replication in Riak CS provides two modes of object replication: full sync and real-time sync. Data is streamed over a TCP connection, and multi-datacenter replication in Riak CS has support for SSL so data can be securely replicated between sites.

In Riak CS, large objects are broken into blocks and streamed to the underlying Riak cluster on write, where they are replicated for high availability (3 replicas by default). A manifest for each object is maintained so that blocks can be retrieved from the cluster and the full object presented to clients. For multi-site replication in Riak CS, global information for users, bucket information and manifests are streamed in real-time from a primary implementation to a secondary site so global state is maintained across locations. Objects can then be replicated in either full sync or real-time sync mode.

In full sync, objects are replicated from a primary Riak CS implementation to a secondary site on a configurable interval – the default is 6 hours. In full-sync replication, each cluster computes a hash for each key’s block value. Key/block pairs are compared, and the primary site streams any missing blocks or updates needed to the secondary site.
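The comparison step in full sync can be sketched as follows. This is a simplified model, not the replication code itself: each site is represented by a plain dictionary of block key to block bytes, SHA-256 is an assumed choice of hash, and the function names are ours.

```python
import hashlib
from typing import Dict, List


def block_hashes(blocks: Dict[str, bytes]) -> Dict[str, str]:
    """Compute a hash for the value stored under each block key."""
    return {key: hashlib.sha256(value).hexdigest() for key, value in blocks.items()}


def blocks_to_stream(primary: Dict[str, bytes],
                     secondary: Dict[str, bytes]) -> List[str]:
    """Return keys that are missing on the secondary or whose hashes differ."""
    primary_hashes = block_hashes(primary)
    secondary_hashes = block_hashes(secondary)
    return sorted(key for key, digest in primary_hashes.items()
                  if secondary_hashes.get(key) != digest)


def full_sync(primary: Dict[str, bytes], secondary: Dict[str, bytes]) -> int:
    """Copy every missing or outdated block to the secondary; return how many were sent."""
    keys = blocks_to_stream(primary, secondary)
    for key in keys:
        secondary[key] = primary[key]
    return len(keys)
```

Comparing hashes rather than the blocks themselves means only the differing blocks need to cross the link between sites.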

Real-time sync is triggered when an update is sent from a client to a primary Riak CS implementation. Once replicated in the first location, the updates are streamed in real-time to the secondary site. But what happens if a client requests an object from the secondary cluster and not all of its blocks have been replicated to that cluster? With Riak multi-site replication, the secondary cluster will request any missing blocks from the primary cluster so that the client can be served.
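The fallback read described above might look like this in outline. Both clusters are modeled as dictionaries, and the function name is hypothetical; whether the secondary keeps a local copy of a block it fetched from the primary is an assumption made here for illustration.

```python
from typing import Dict, List


def read_from_secondary(block_keys: List[str],
                        secondary: Dict[str, bytes],
                        primary: Dict[str, bytes]) -> bytes:
    """Serve an object from the secondary, fetching missing blocks from the primary."""
    parts = []
    for key in block_keys:
        block = secondary.get(key)
        if block is None:
            # Block has not been replicated yet: request it from the primary.
            block = primary[key]
            # Assumption: keep a local copy so later reads are served locally.
            secondary[key] = block
        parts.append(block)
    return b"".join(parts)
```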

Try It Out

We’ve got two ways for you to try out Riak CS software. First, we can give you access to a hosted version where you can upload files, test out the API, and try s3cmd or other clients against it. If you want to try Riak CS on your own hardware, we also have a developer trial that gives you access to the Riak CS code and a little bit of our help to get you up and running. So check out the docs and then sign up to start.

Basho Team

Try Riak Cloud Storage

October 18, 2012

Riak CS is enterprise cloud storage built on top of Riak, offering S3-compatible, multi-tenant large object storage. It’s software for building public clouds, or reliable storage behind your applications. And of course, it’s designed to be fault-tolerant, highly available and easy to operate, just like Riak. Riak CS docs are available online, but we also wanted to provide an easy way for teams to get their hands on some code or a hosted version to try out.

We’ve got two ways to get you started. First, we’ve got hosted Riak CS for testing. It’s got a simple UI so you can upload files, test out the API, use s3cmd or another client against it, and generally work your way around. If you want to dig further into Riak CS, we also have a Riak CS developer trial. This gives you access to the Riak CS code (and a little bit of our help to get you up and running), so you can install it on your own machines, do all the testing you need in your own environment, and decide if it’s the right choice for you.

If either of these options sounds good to you, send us a little info about yourself and we’ll get you started.

Riak Cloud Storage – New Features and Product Update

August 21, 2012

Earlier this year we launched Riak CS – simple, available cloud storage built on Riak. We gave it an S3-compatible API, made it multi-tenant, and added per-user reporting on network and storage utilization. Riak CS provides the core features to build public or private clouds that are distributed, fault-tolerant and easy to scale.

Since then, we’ve rolled out a startup program to make Riak CS affordable for earlier-stage shops. And we’ve talked to a number of companies who wanted to license Riak CS based on the amount of storage they need, rather than the number of nodes, so now you can do that too. (Get in touch if you want a quote.) Riak CS docs are now also publicly available on our wiki.

Today we’re announcing some new features and additions to Riak CS, including more options for user admin, auth and object metadata, plus improved stats and troubleshooting. We’ll walk through the Riak CS architecture, operations and new functionality in an upcoming webcast; for the details now, read on.

New in Riak CS

Better User Administration

We’ve beefed up the Riak CS user API so admins can now list all users, issue new credentials to a user, or disable a user. We’ve also got new configuration options to restrict user creation to admins or let anyone create a user directly.

List All Objects in a Bucket Much Faster

User objects in Riak CS are stored in flat namespaces called “buckets”. We sped up one of the most common bucket operations: listing all of its objects. Riak CS now uses MapReduce to look up objects, yielding significant performance gains.

Arbitrary Metadata

One of our top customer requests. Now you can store some additional metadata with your Riak CS objects – whatever is most useful to you.
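In the S3 API, user-defined metadata travels as request headers prefixed with `x-amz-meta-`. Assuming Riak CS follows the same convention, a client could prepare the headers for an upload roughly like this; the function name and the validation rule are ours.

```python
from typing import Dict


def metadata_headers(metadata: Dict[str, object]) -> Dict[str, str]:
    """Turn a metadata mapping into S3-style x-amz-meta-* request headers."""
    headers = {}
    for name, value in metadata.items():
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid metadata name: {name!r}")
        headers[f"x-amz-meta-{name.lower()}"] = str(value)
    return headers
```

For example, `metadata_headers({"Owner": "alice"})` produces a single `x-amz-meta-owner` header with the value `alice`.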

New Basic Health Check

If you’ve ever set up Riak, you’ve probably used the basic health check to test Riak nodes. Now you can send an HTTP “ping” to Riak CS nodes as well to make sure they are responsive. The “ping” will also fail if the Riak CS node can’t reach the underlying Riak database (Riak CS nodes have a 1:1 mapping onto Riak nodes).
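A monitoring script could wrap the ping in a small helper like the one below. The default path `/riak-cs/ping` is an assumption, so check the documentation for your version and pass a different `path` if needed; the function name and timeout are ours.

```python
import http.client
import urllib.error
import urllib.request


def is_responsive(base_url: str, path: str = "/riak-cs/ping",
                  timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to the ping path answers with status 200.

    The default path is an assumption; adjust it to match your installation.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx response all count as unhealthy.
        return False
```

Because the ping also fails when the node cannot reach Riak, one check per Riak CS node covers both layers.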

Better Inspection and Troubleshooting

We have added DTrace probing in Riak CS and we are working on SmartOS packaging. We will also be working on packaging for other platforms that support DTrace. Once these packages are available it should help operators who use DTrace platforms troubleshoot any issues with Riak CS and have more visibility into their stack. We also now expose more stats information on the Riak CS runtime by HTTP request.

Improved S3 API Coverage

We also added support for query parameter authentication, increasing compliance with S3’s REST Authentication scheme. This means that Riak CS users can now authenticate through a request header or a URL-encoded query string parameter.
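Query parameter authentication is what makes time-limited, shareable URLs possible. The sketch below follows S3's original query string scheme, in which an expiry timestamp takes the place of the date in the signed string and the signature travels in the URL. It assumes Riak CS accepts the same parameter names as S3 (`AWSAccessKeyId`, `Expires`, `Signature`); the function name and example host are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode


def presigned_url(base_url: str, access_key: str, secret_key: str,
                  resource: str, expires: int, method: str = "GET") -> str:
    """Build a URL that carries its own signature and an expiry (Unix seconds)."""
    # Content-MD5 and Content-Type are left empty for a simple GET.
    to_sign = f"{method}\n\n\n{expires}\n{resource}"
    digest = hmac.new(secret_key.encode("utf-8"),
                      to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # urlencode percent-encodes the "+", "/" and "=" characters in the signature.
    query = urlencode({
        "AWSAccessKeyId": access_key,
        "Expires": str(expires),
        "Signature": signature,
    })
    return f"{base_url.rstrip('/')}{quote(resource)}?{query}"
```

A call such as `presigned_url("http://cs.example.test:8080", key, secret, "/bucket/file.txt", 1400000000)` returns a link that anyone can fetch until the expiry time, without holding the secret key.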

Stay tuned because we’re hard at work on the next Riak CS features and improvements. What do you want to see? We’d love to hear from you.

Intro to Riak CS Webcast

November 5, 2012

Earlier this year we launched Riak CS – simple, available cloud storage built on Riak. We gave it an S3-compatible API, made it multi-tenant, and added per-user reporting on network and storage utilization. Riak CS provides the core features to build public or private clouds that are distributed, fault-tolerant and easy to scale.

New to Riak CS? Join us this Wednesday (11am PT / 2pm ET) for an Intro to Riak CS webcast with Basho chief architect Andy Gross and director of product management Shanley Kane. In this 30 minute webcast, we’ll cover:

* Main features, including S3-compatibility, multi-tenancy, large object support and reporting
* Operations and interfaces
* Use cases in public/private clouds and applications
* Latest release and roadmap plans

Register for the webcast [here](http://info.basho.com/IntroToRiakCSNov7.html).

[Shanley](http://twitter.com/shanley)

[Andy](https://twitter.com/argv0)