March 5, 2014
This month features a wide range of developer events, major conferences, and various meetups. Here’s a look at some of the places we’ll be in March.
Erlang Factory: Erlang Factory SF brings together the rapidly growing community that uses Erlang in order to showcase the language and its various applications to today’s distributed environments. Come hear talks by Basho’s own Tom Santero, Joseph Blomstedt, Chris Meiklejohn, and Joe DeVivo. We will also have a table set up to answer any questions about Riak. Erlang Factory takes place March 6-7 in San Francisco.
Game Developer’s Conference: GDC is the largest gaming industry event and will take place March 17-21 in San Francisco. Eric Liaw (Co-Founder and Developer, Quark Games) and Seth Thomas (Technical Evangelist, Basho) will be speaking on “Riak for Gaming” on Thursday, March 20th at 11:30am. Basho will also be exhibiting, so stop by and grab a t-shirt.
Clojure/West: Clojure/West takes place March 24-26 in San Francisco. Basho engineer, Reid Draper, will be speaking on “Powerful Testing with simple-check” and the Basho team will be available to answer questions about Riak.
Meetups: On March 12th, Hector Castro (Basho Technical Evangelist) will talk about Riak CS at OpenStack Boston. Both Tom Santero (Basho Technical Evangelist) and Hector Castro will talk about Riak CS and the underlying architecture, Riak, at OpenStack Connecticut on March 18th. Tom Santero will present on Highly Available Applications in the Cloud on March 19th at OpenStack New York. Finally, Hector Castro will talk about Riak CS and Riak at OpenStack Philly on March 20th.
For a full list of where we’ll be, check out the Events Page.
December 30, 2013
2013 was a huge year for Basho Technologies and before we dive into 2014, we thought we’d take a moment to reflect on how far we’ve come.
2013 was the year of the Riak User. We love hearing about all the amazing ways companies across various industries are using Riak. This year, we were able to share dozens of exciting case studies. These include:
- Synacor’s TV Everywhere platform
- Enstratius (acquired by Dell)
- Best Buy
- Alert Logic
- Viggle (through OmniTI)
- Turner Broadcasting
- Hosted Graphite
- Gilt Groupe
- Praekelt Foundation
- National Health Service
- City Maps
- The Weather Company
For even more Riak Users, check out the Users Page.
We released Riak 1.3, Riak 1.4, and the Technical Preview of Riak 2.0 this year. These releases added such features as Active Anti-Entropy, revamped Riak Control, queryability improvements, Riak Data Types, and much more. Be on the lookout for the general release of Riak 2.0 early next year.
This year, we expanded RICON, Basho’s distributed systems conference, to both RICON East and RICON West. These were both sold out conferences that featured speakers from bitly, Comcast, Google, Netflix, Salesforce, The Weather Company, Turner Broadcasting, Twitter, and many more.
We drastically increased the number of Basho partners in 2013. For a full list of partners, check out the Partnerships Page. Some key ones to note include Tokyo Electron Device, SoftLayer, and Seagate.
Our amazing community team hosted over 200 meetups around the world this year. On top of that, they also attended dozens of industry events to spread the word about Basho. Keep an eye on the Events Page to see where we’ll be in 2014.
2013 was a busy year but, with some exciting announcements coming, we look forward to an even busier 2014. Happy New Year!
October 16, 2013
Basho is a proud sponsor and exhibitor of Cloud Connect Chicago. The Cloud Connect conferences provide CIOs, IT professionals, and developers with the necessary tools to navigate cloud technologies and deploy cloud solutions in their organizations. Cloud Connect Chicago takes place October 21-23rd.
Basho engineer and Apache CloudStack PPMC member, John Burwell, will also be speaking on “Building Complete Private Clouds with Apache CloudStack and Riak CS” as part of the CloudStack track. During this session, he will explain how the combination of the CloudStack Infrastructure as a Service platform and the Riak CS object store will allow you to establish operational agility to drive rapid innovation and embrace commodity infrastructure without sacrificing scalability or reliability. He will also explore more general cloud system architecture principles and best practices.
While Burwell’s talk will be focused on Riak CS and CloudStack, Riak CS also offers OpenStack integration. Stop by the Basho booth to learn more about Basho’s object storage solution, Riak CS, and how it can be used in conjunction with both CloudStack and OpenStack.
For more information on how enterprises can use private clouds for their business needs, check out Burwell’s blog post, “Learnings from Private Cloud Storage.”
October 2, 2013
What Is Riak CS?
In May of this year, we posted the top 5 questions we heard from customers and our community about Riak CS; today we’ll take a deeper dive into the technical details, specifically the differences between Riak CS and Riak itself.
Riak CS as Compared to Riak
Both Riak CS and Riak are, at their core, places to store objects. Both are open source and both are designed to be used in a cluster of servers for availability and scalability.
The fundamental distinction between the two is simple: Riak CS can be used for storing very large objects, into the terabyte size range, while Riak is optimized for fast storage and retrieval of small objects (typically no more than a few megabytes).
There are subtle differences, however, that can be obscured by the similarities between the two.
Why Would I Use Riak CS?
Riak CS is used for a variety of reasons. Some examples:
- Private object storage services, for example for companies that want to store sensitive data behind their own firewalls.
- Large binary object storage as part of a voice or video service.
- An integrated component in an OpenStack cloud solution, storing and serving VM images on demand.
Tier 3, Yahoo! Japan, Datapipe, and Turner Broadcasting are just a few of the big names using Riak CS today.
What Does Riak CS Do That Riak Doesn’t?
Riak CS carves large objects into small chunks of data to be distributed throughout a Riak cluster and, when used with Riak CS Enterprise, synchronized with remote data centers.
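To make the chunking idea concrete, here is a minimal sketch in Python. It is not Riak CS code: the 1 MiB block size, the key layout, the manifest fields, and the plain dict standing in for a Riak cluster are all assumptions made for illustration.

```python
import hashlib
import uuid

BLOCK_SIZE = 1024 * 1024  # assumed block size (1 MiB), for illustration only


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store_large_object(store: dict, bucket: str, key: str, data: bytes,
                       block_size: int = BLOCK_SIZE) -> dict:
    """Write each block under its own key, then write a manifest listing them.

    `store` is a plain dict standing in for a Riak cluster (hypothetical).
    """
    upload_id = uuid.uuid4().hex
    block_keys = []
    for seq, block in enumerate(split_into_blocks(data, block_size)):
        block_key = f"{upload_id}:{seq}"  # hypothetical key layout
        store[("blocks", block_key)] = block
        block_keys.append(block_key)
    manifest = {
        "bucket": bucket,
        "key": key,
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "blocks": block_keys,
    }
    store[("manifests", bucket, key)] = manifest
    return manifest


def read_large_object(store: dict, bucket: str, key: str) -> bytes:
    """Look up the manifest, then reassemble the object from its blocks."""
    manifest = store[("manifests", bucket, key)]
    return b"".join(store[("blocks", k)] for k in manifest["blocks"])
```

Because every block is an ordinary small value, the underlying database only ever handles objects in the size range it is optimized for.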
Riak CS adds compatibility with Amazon’s S3 and OpenStack’s Swift APIs. These offer very different semantics than Riak, and the advanced search capabilities in Riak such as Secondary Indexes and full text search are not available using S3 or Swift clients.
We strongly advise against it, but it is possible to work with Riak’s standard APIs “under the hood” when deploying a Riak CS solution.
Work is actively underway to add a security model to Riak in the upcoming 2.0 release.
Buckets or Buckets?
Users of Riak CS store their objects in virtual containers (called buckets in Amazon S3 parlance, containers in OpenStack).
Riak also relies heavily on buckets for data storage and configuration but, despite the names, these buckets are not the same.
As an example of how this can cause confusion: the replication factor in Riak (the number of times a piece of data is stored in a cluster) is configurable per-bucket. Because Riak’s buckets do not underlie the user buckets in Riak CS, this feature cannot be used to create tiered services.
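For readers who have not seen the per-bucket setting in plain Riak, the sketch below builds (but does not send) the kind of HTTP request that changes a bucket's replication factor. It assumes Riak's HTTP interface on its default port 8098 and the `/buckets/<bucket>/props` resource; the host and bucket names are placeholders, and the helper function itself is hypothetical. It applies to Riak buckets only, not to Riak CS user buckets.

```python
import json
import urllib.parse
import urllib.request


def build_set_n_val_request(host: str, bucket: str, n_val: int,
                            port: int = 8098) -> urllib.request.Request:
    """Build a PUT request that sets a Riak bucket's replication factor.

    The request is only constructed here; pass it to
    urllib.request.urlopen() to actually send it to a Riak node.
    """
    if n_val < 1:
        raise ValueError("n_val must be a positive integer")
    quoted_bucket = urllib.parse.quote(bucket, safe="")
    url = f"http://{host}:{port}/buckets/{quoted_bucket}/props"
    body = json.dumps({"props": {"n_val": n_val}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

For example, `build_set_n_val_request("127.0.0.1", "sessions", 5)` describes a request asking Riak to keep five copies of everything in the `sessions` bucket.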
Riak is designed to maximize availability; the price paid for that is delayed consistency when the network is split and clients are writing to both sides of the cluster.
Creating user accounts in Riak CS, however, led to the need for a mechanism to maintain strong consistency. If two people attempt to create user accounts with the same username on either side of a network partition, both cannot be allowed to succeed, or else a conflict will occur that is very difficult to automatically recover from.
Furthermore, user buckets in S3 (and OpenStack APIs as implemented in Riak CS) reside in a global rather than a user-specific namespace, so bucket creation must also be handled carefully.
Riak CS introduced a service named Stanchion that is designed to handle these specific requests to avoid conflicts. Stanchion is a single process running on a single Riak server (thus introducing a single point of failure for user account and bucket creation requests).
While it is possible to deploy Stanchion using common system tools to make a daemon process run in a highly available manner, Basho recommends doing so carefully and testing it thoroughly. Since the only impact of failure is to prevent user and bucket creation, it may be preferable to monitor and alert on failure. If two copies of Stanchion are running due to a network partition, its strong consistency guarantees will be lost.
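The toy below illustrates why a single serialization point avoids duplicate names, and why running two of them at once gives that guarantee up. It is an in-process sketch only; the `Serializer` class and its methods are hypothetical and do not reflect how Stanchion is actually implemented.

```python
import threading


class NameAlreadyTaken(Exception):
    """Raised when a requested user or bucket name already exists."""


class Serializer:
    """Toy stand-in for a single serialization point (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._names = set()

    def create(self, name: str) -> None:
        # Check-and-insert happens under one lock, so two requests for
        # the same name can never both succeed on the same serializer.
        with self._lock:
            if name in self._names:
                raise NameAlreadyTaken(name)
            self._names.add(name)


def try_create(serializer: Serializer, name: str) -> bool:
    """Return True if the name was created, False if it was already taken."""
    try:
        serializer.create(name)
        return True
    except NameAlreadyTaken:
        return False


# One serializer: the second request for the same name is rejected.
single = Serializer()
single_results = [try_create(single, "alice"), try_create(single, "alice")]

# Two serializers, as if each side of a partition ran its own copy:
# both requests succeed, which is exactly the conflict to avoid.
left, right = Serializer(), Serializer()
split_results = [try_create(left, "alice"), try_create(right, "alice")]
```

Here `single_results` is `[True, False]` while `split_results` is `[True, True]`, which is the lost guarantee described above.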
With strong consistency options targeted for Riak 2.0, expect to see some changes.
Basho offers multi-datacenter replication with its Enterprise software licenses, and Riak CS Enterprise takes full advantage of that feature. Data can be written to one or more clusters in multiple data centers and be synchronized automatically between them.
There are two types of synchronization: real-time, which occurs as objects are written, and full sync, which happens on a periodic basis to compare the full contents of each cluster for any changes to be merged.
One key difference is that Riak CS maintains manifest files to track the chunks it creates, and it is these manifests that are distributed between clusters during real-time sync. The individual chunks are not synchronized until a full sync replication occurs, or until someone requests the file from a remote cluster. The manifest is made active for someone to retrieve the chunks after the original upload to the source cluster is complete.
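A small simulation can make the ordering easier to follow. This is a sketch under simplifying assumptions: the `Cluster` class, the manifest fields, and the function names are all hypothetical, and real replication involves far more than copying dictionaries.

```python
class Cluster:
    """Toy cluster holding manifests and chunk lists keyed by object name."""

    def __init__(self, name: str):
        self.name = name
        self.manifests = {}
        self.blocks = {}


def upload(source: Cluster, key: str, blocks) -> None:
    """Store the chunks first, then mark the manifest active."""
    source.blocks[key] = list(blocks)
    source.manifests[key] = {"key": key, "block_count": len(blocks),
                             "state": "active"}


def realtime_sync(source: Cluster, sink: Cluster, key: str) -> None:
    """Real-time sync ships only the manifest, not the chunks."""
    sink.manifests[key] = dict(source.manifests[key])


def full_sync(source: Cluster, sink: Cluster) -> None:
    """Full sync compares both clusters and copies whatever is missing."""
    for key, manifest in source.manifests.items():
        sink.manifests.setdefault(key, dict(manifest))
        if key not in sink.blocks:
            sink.blocks[key] = list(source.blocks[key])


def fetch(sink: Cluster, source: Cluster, key: str) -> bytes:
    """Read from the sink, pulling chunks from the source on demand."""
    if key not in sink.manifests:
        raise KeyError(key)
    if key not in sink.blocks:
        sink.blocks[key] = list(source.blocks[key])
    return b"".join(sink.blocks[key])
```

After `upload` and `realtime_sync`, the remote cluster knows the object exists but holds none of its chunks; they arrive either through `full_sync` or the first `fetch`.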
A common mistake while installing Riak CS is to configure it using information specific to Riak rather than Riak CS. As an example, per the Riak CS installation instructions the relevant backend data store must be configured to riak_cs_kv_multi_backend, which is forked from Riak’s riak_kv_multi_backend. Using the latter will cause problems.
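A quick sanity check for this mistake might look like the sketch below. It assumes the backend is set through a `{storage_backend, ...}` entry in Riak's Erlang-syntax configuration file; the function name and return values are hypothetical, and the regular expression is a rough scan, not a real Erlang term parser.

```python
import re

EXPECTED_BACKEND = "riak_cs_kv_multi_backend"

# Matches entries such as {storage_backend, some_backend_name}
_STORAGE_BACKEND = re.compile(r"\{\s*storage_backend\s*,\s*([a-z_0-9]+)\s*\}")


def check_storage_backend(config_text: str) -> str:
    """Report whether the configured backend is the Riak CS one.

    Returns "ok", "missing", or "wrong: <name>".
    """
    # Drop Erlang comments (everything after '%') so that commented-out
    # settings are not mistaken for active ones.
    uncommented = "\n".join(line.split("%", 1)[0]
                            for line in config_text.splitlines())
    match = _STORAGE_BACKEND.search(uncommented)
    if match is None:
        return "missing"
    backend = match.group(1)
    return "ok" if backend == EXPECTED_BACKEND else f"wrong: {backend}"
```

Running it over the text of a node's configuration file before starting Riak would flag the Riak-only backend name early.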
Riak (CS) Control
Exposure to Internet
Exposing any database directly to the Internet is risky. Riak, currently lacking any concept of authentication, absolutely must not be accessible to untrusted networks.
Riak CS, however, is designed with Internet access in mind. It is still advisable to place a load balancer or proxy in front of a Riak CS cluster, for example to ease cluster maintenance/upgrades and to provide a central location to log and block potentially hostile access.
Riak CS servers will still have open Riak ports that must be protected from the Internet as you would any Riak servers.
Where to Next for Riak CS?
2013 has been a big year for Riak CS: it was released as open source in the spring, with OpenStack support added this summer. Still, there is much to do.
As mentioned above, improving or replacing Stanchion is a high priority.
We will continue to expand the API coverage for Riak CS. The next major targets are the copy object operations that Amazon S3 and OpenStack Swift offer.
Compression and more granular replication controls are also under consideration for future releases.
By building Riak CS atop the most robust open source distributed database in the world, we’ve created a very operationally friendly, powerful storage solution that can evolve to meet present and future needs. Feel free to give it a try if you aren’t already using it.
If you’re interested in hearing from the engineers who’ve made this software possible (and seeing just how far a highly available data storage solution can take you), join us October 29-30th for RICON West. RICON West is where Basho brings together industry and academia to discuss the rapidly expanding world of distributed systems, including Riak and Riak CS.
September 26, 2013
Big Data. eCommerce. Mobile. Suddenly, information technology has shifted from cost center to business opportunity. This opportunity favors fast movers with the ability to rapidly execute on emerging trends. Therefore, the length of traditional IT procurement cycles and provisioning processes has become a significant barrier to capitalizing on these opportunities. To increase their operational agility, some organizations are employing public infrastructure as a service (IaaS) or cloud providers (such as Amazon Web Services and Joyent) to rapidly provision compute and storage resources. However, technical incompatibilities, regulatory restrictions, cost at scale, and/or existing capital investments prevent many organizations from utilizing public cloud providers to achieve this operational agility. Private clouds allow these organizations to realize the value of public clouds with the flexibility to comply with their unique combination of business and technical requirements.
Fundamentally, a cloud (public or private) creates a composable infrastructure with the following capabilities:
- Resource Pooling: Presents compute, storage, and network resources through a unified set of vendor neutral abstractions and manages them based on service-level requirements.
- Rapid Elasticity: Optimizes resource allocation based on performance relative to service-level requirements.
- Self Service: Delegates management responsibilities for a subset of the infrastructure resources to end-users.
- Metering/Charge Back: Records resource utilization on a per customer basis to support usage billing.
Private clouds implement these characteristics by orchestrating infrastructure provisioning and management through the following services:
- Compute: Physical or virtual machines with a specified number of processing cores and RAM.
- Block Storage: Random access, read/write persistent storage capable of supporting disk partitioning and file systems.
- Object Storage: Write-once, read-many (WORM) oriented storage for large files (multiple gigabytes to terabytes in size) accessed through a key-value oriented interface.
- Network: Network topology definition and connectivity management between compute, block storage, and object storage services, as well as public networks such as the Internet.
Typically, these services are exposed via an HTTP API, as well as a web-based dashboard allowing end-users to simultaneously script complex workflows and visualize their infrastructure.
Superficially, private clouds appear to be traditional virtualization infrastructures with a web interface and HTTP API. While both models share a number of common components, cloud infrastructures achieve reliability by horizontally scaling commodity hardware instead of vertically scaling specialized hardware. The following table contrasts the storage strategies employed by the traditional virtualization and cloud models:
| Data Type | Traditional Virtualization | Cloud |
|---|---|---|
| Application Data | VM direct attached storage (e.g. NAS, SAN, etc) | Elastic database service (e.g. Riak) |
| Static Content | VM direct attached storage | Object Storage (e.g. Riak CS) |
| Templates | VM direct attached storage | Object Storage (e.g. Riak CS) |
| Backups | VM direct attached storage | Object Storage (e.g. Riak CS) |
Static content, templates, and backups typically represent the majority of a system’s storage consumption. Employing object storage to manage this data brings the following benefits to private cloud infrastructures:
- Reduced Hardware Costs: By replicating multiple copies of data across a cluster of services, object storage systems such as Riak CS guarantee data durability through software rather than hardware. This approach allows users to employ cheaper commodity hardware using ubiquitous SATA/SAS storage subsystems without sacrificing reliability.
- Horizontal Scalability: Since storage coordination and data replication occur in software, storage is expanded by simply adding new servers to the cluster.
- Operational Simplicity: Accessed via HTTP/HTTPS, object storage systems provide secure access to data using a simple, ubiquitous protocol. Unlike iSCSI and Fibre Channel solutions, this approach typically has little to no impact on network infrastructure designs.
The Apache CloudStack IaaS platform has supported Swift-based object storage since version 4.0.0 and S3-based object storage since version 4.1.0. With the 4.2.0 release, CloudStack supports S3 and Swift as native secondary storage devices, allowing the system to provision and back up VMs directly from an object store. When coupled with Riak CS Enterprise, Apache CloudStack-based clouds are able to replicate template and snapshot data across multiple data centers to meet off-site backup and disaster recovery requirements.
The OpenStack Object Storage API specifies the semantics of OpenStack’s object storage service, and Swift is the default implementation of this API. With the 1.4.0 release, Riak CS implements the OpenStack Object Storage API, allowing it to serve as a drop-in Swift replacement.
As organizations work to understand the opportunities created by information technology, private clouds have emerged as a key component of their strategies to increase operational agility. While private clouds can be constructed using traditional virtualization approaches, such designs will simply mask core infrastructure brittleness and high infrastructure costs. By embracing design principles such as object storage that underpin cloud infrastructure platforms, organizations can realize the promise of increased operational agility and cost savings.
August 26, 2013
Earlier this month, we announced the availability of Riak CS 1.4, which added a number of performance improvements, OpenStack integration, and simpler user management. To provide more details about what was introduced with the latest release, we also hosted a “What’s New in Riak CS 1.4” webcast.
This short webcast provides an overview of both Riak CS and Riak, and discusses what’s new in Riak CS 1.4. It also looks at the fundamental features and architecture of Riak CS, talks about the key partnerships, and discusses Riak CS Enterprise – the commercial extension of Riak CS.
You can watch the complete recording below.
You can also view the slides from this webcast here.
To get started with Riak CS, visit docs.basho.com/riakcs/latest/riakcs-downloads/ to download the latest release.
August 15, 2013
With the launch of Riak CS 1.4, several members of the Basho team have been approached with the question “Why did you build Riak CS?”
When we open sourced Riak CS in March of 2013, the conversation focused on the importance of the community of developers with whom we engage, and participating with this community in a more open fashion.
However, understanding the history of a product can be just as important as understanding the logic behind our go-to-market strategy.
Put simply, Basho is a distributed systems company.
As a company that started with Riak, an open source distributed database, we had an immediate, targeted focus on high availability, fault-tolerance, and linear scalability. These core properties of our database implementation are, in actuality, consistent themes to consider when building any distributed system. And as Riak and Riak Enterprise gained traction in the market, several customers began to use their Riak implementation to store larger objects.
With this and other customer feedback in mind, we prototyped Riak CS, which offers all of the benefits of Riak, while also adding the features and functionality required to power large object storage in public or private clouds as well as providing reliable storage for applications and services.
As we built upon this initial prototype, both based on distributed systems themes and customer input, we added an S3-compatible API to Riak CS. This provided a solution for service providers that wanted to offer S3-compatible storage and for customers that wanted to adopt a hybrid-cloud approach to address data sovereignty or redundancy concerns. We also added OpenStack Object Storage API compatibility with the latest Riak CS 1.4 release. Riak CS can now easily interact with multiple IaaS providers, which helps expand our potential user base for both the open source and enterprise product.
However, regardless of feature decisions – either present or in the future – our commitment to providing robust, resilient distributed storage remains.
New Release Adds OpenStack Integration, Simplifies Management, and Boosts Multi-Datacenter Replication Speed
August 13, 2013 – CAMBRIDGE, MA – Basho Technologies, the leader in distributed systems software, announced today the availability of Riak CS 1.4 and Riak CS 1.4 Enterprise. Riak CS 1.4 continues Basho’s commitment to provide cloud storage software that is simple to operate, highly available by design, and compatible with industry cloud standards. Riak CS is used by organizations worldwide to power their public and private clouds.
Riak CS 1.4 introduces formal integration with OpenStack, provides enhanced performance and manageability, includes community requests, and improves performance at scale. Riak CS 1.4 Enterprise significantly boosts the performance of multi-data center replication by allowing for concurrent channels, so replication performance can scale with the available network capacity and cluster size.
“Riak CS is seeing impressive market adoption, especially from service providers looking to increase their portfolio offering with large object storage,” said Greg Collins, president and CEO of Basho Technologies. “This release continues our commitment of providing simple and accessible cloud storage for a broad range of cloud computing platforms and use cases. With the addition of OpenStack integration and significant performance improvements, Riak CS 1.4 also appeals strongly to enterprises building their own object storage or adopting a hybrid-cloud deployment methodology.”
“Object storage is quickly becoming a foundational platform capability for cloud providers and large enterprises to meet the rapidly growing surge in demand to store more data,” said Simon Robinson, vice president of storage research at 451 Research. “Riak CS continues to see greater adoption in public and private clouds. Riak CS’s tighter integration with OpenStack is certain to be another catalyst for Basho. OpenStack users gain a very capable storage alternative to Swift, OpenStack’s object storage platform.”
“Yahoo! JAPAN has been using Riak CS for over a year to power our public cloud storage platform,” said Shingo Saito, cloud product manager at Yahoo! JAPAN. “Riak CS is also used by LOHACO, for its on-line shopping platform, operated by ASKUL Corporation, Yahoo! JAPAN partner, and by some of the largest companies in Japan. We are excited to continue to partner with Basho and look forward to deploying Riak CS 1.4.”
“Redapt is excited to work with Basho to help customers address distributed object storage needs within OpenStack environments,” said David Cantu, co-founder and COO at Redapt. “Redapt’s mission is to enable leading service providers, enterprises, and web centric companies with the ability to achieve the numerous economic and operational benefits of private cloud computing. With the Riak 1.4 announcement, Basho is helping us deliver on that commitment for our customers with proven distributed cloud storage software that is now more finely tuned for integration with OpenStack.”
“Businesses have a range of object storage needs and our partnership with Basho helps us easily address even the most complex scenarios in our public cloud,” said Jared Wray, CTO of Tier 3. “Our global data center footprint enables businesses of all sizes to adopt object storage for a variety of use cases including: cloud-native and cross-device apps, backups and archives, and secure file transfer. The improved performance and simplified operations available with Riak CS 1.4 continue to help our customers simply scale to meet operational demand.”
Major Feature Additions of Riak CS 1.4 include:
- Built-in integration with OpenStack. Riak CS 1.4 introduces support for OpenStack’s Keystone authentication service and introduces compatibility with OpenStack Object Storage API.
- Improved performance of large bucket query operations. Secondary indexing pagination, introduced with Riak 1.4, allows for significant performance improvements of large bucket query requests.
- Simplified operational management. Improvements to the User API allow operators greater flexibility in managing Riak CS user information, while also improving the agility and responsiveness of Riak CS.
- Decreased bandwidth for object block retrieval. Changes to how Riak CS handles object block retrieval will decrease intracluster bandwidth by 67% and improve download performance.
Riak CS 1.4 Enterprise adds the following:
- Enhanced multi-site replication performance. Riak CS 1.4 Enterprise allows for concurrent channels of communication between clusters, which greatly enhances the capability for replication by taking advantage of all the network’s available resources.
Riak CS 1.4 is available for Debian, Ubuntu, FreeBSD, Mac, Red Hat Enterprise Linux, Fedora, SmartOS, and Solaris.
To view the latest technical documentation or to download Riak CS, visit docs.basho.com/riakcs/latest/.
To view a feature comparison with OpenStack Swift, visit docs.basho.com/riakcs/latest/references/appendices/comparisons/Riak-Compared-to-Swift/.
To view a feature comparison with EMC Atmos, visit docs.basho.com/riakcs/latest/references/appendices/comparisons/Riak-Compared-to-Atmos/.
Prospective customers interested in a trial license of Riak CS Enterprise can request a Riak CS Tech Talk at http://info.basho.com/SignUpRiakTechTalk.html.
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy-to-operate at scale. Basho’s distributed database, Riak, and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by over 25 percent of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com. Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
August 13, 2013
The release of Riak CS 1.4, Basho’s open source cloud storage software, adds a number of performance improvements as well as OpenStack integration and simpler user management. Riak CS is being used by companies all over the world to build public and private clouds, and as reliable storage to power various applications.
One of the biggest additions with Riak CS 1.4 is the integration with OpenStack, broadening our relationship with the open source community. This integration supports OpenStack’s Keystone authentication service and the OpenStack Object Storage API, which allows OpenStack users the means to integrate Riak CS for object storage in an OpenStack deployment.
The Riak CS Users API provides an interface for user creation and management. This release also improves this API to give operators greater flexibility in managing user information. Additionally, this release benefits from ongoing refactoring and reorganization efforts aimed at improving the agility and responsiveness of Riak CS.
Riak CS 1.4 takes advantage of some changes made in Riak 1.4 to provide performance improvements to Riak CS users. First, Riak CS 1.4 features improved performance of listing the contents of large buckets by taking advantage of secondary index pagination in Riak. Riak CS 1.4 also leverages a new option for object block retrieval, which decreases intracluster bandwidth by 67%. This improves the download performance when handling many concurrent requests. These features can be independently enabled, but are disabled by default to accommodate users not using Riak 1.4 with Riak CS. See the documentation for more details.
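The post does not say how the 67% figure is derived. One plausible reading, offered here only as an assumption, is that each block is stored three times (Riak's default replication factor) and that a block read which previously pulled all three copies across the cluster now pulls just one. Under that assumption the arithmetic works out as follows; the function name is hypothetical.

```python
def bandwidth_reduction(copies_read_before: int, copies_read_after: int) -> float:
    """Fraction of intracluster read traffic saved per block fetch."""
    if copies_read_before <= 0 or copies_read_after < 0:
        raise ValueError("copy counts must be positive")
    return (copies_read_before - copies_read_after) / copies_read_before


# Assumed scenario: three copies read before, one copy read after.
saving = bandwidth_reduction(3, 1)
print(f"{saving:.0%}")  # prints 67%
```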
Riak CS Enterprise is the commercial extension of Riak CS, which adds multi-datacenter replication and 24/7 support. The 1.4 release improves replication performance by increasing storage efficiencies and adding multiple TCP connections between clusters.
In addition to the features and upgrades listed above, many bugs were harmed in the making of this release. For a full list of what is included in Riak CS 1.4, check out our code at Github.com/basho or review the release notes. To learn even more, join our live webcast, “What’s New in Riak CS 1.4” on August 23rd.
August 02, 2012
Recently, Shanley Kane, our director of product management, was nominated as an individual member candidate to the OpenStack Foundation Board of Directors. Following Shanley’s nomination, she decided to withdraw. There has been significant speculation regarding the events that led up to Shanley’s decision.
Shanley is currently working with OpenStack in its effort to review and resolve the situation. We have been encouraged by the way OpenStack has handled the matter, including their responsiveness from the very beginning. Basho supports OpenStack’s review process and cannot provide any additional details at this time.
We’re honored to have Shanley considered as an individual member of the OpenStack Board. We encourage our employees to be actively and personally involved in important industry-defining organizations. OpenStack and Basho share many mutual users. The community members of both OpenStack and Basho expect us to build great software, to interoperate as needed, and to advocate open source principles.