Thousands have watched and enjoyed Peter Alvaro’s engaging and informative RICON 2014 Keynote presentation. Alvaro is a PhD candidate at the University of California, Berkeley. His research interests lie at the intersection of databases, distributed systems, and programming languages. Alvaro’s style of delivery blends humor with deep technical detail and is especially informative for those interested in distributed systems.
In his presentation, Alvaro discusses 4 key ideas:
- Mourning the death of transactions
- What is so hard about distributed systems?
- Distributed consistency: managing asynchrony
- Fault-tolerance: progress despite failures
Alvaro starts his presentation by introducing us to Jim Gray and transactional systems. Many of you may know Gray’s work, and, sadly, that he was lost at sea in January 2007. His spirit and legacy are missed.
Alvaro provides insights into transactional systems and the top-down approach these systems traditionally used. He also points out that Eric Brewer, in his RICON 2012 keynote address, suggested that a bottom-up approach might be needed for today’s distributed systems.
Alvaro dives into why anyone would implement distributed systems and why developing distributed systems is hard, really hard. In a distributed system, it is necessary to manage two fundamental uncertainties or failure modes — asynchrony and partial failure. Alvaro uses a humorous metaphor of two clowns to demonstrate how, in the real world, asynchrony and partial failure can’t be dealt with separately, but must be looked at together.
From his humorous metaphor come some definitions:
Distributed consistency = managing asynchrony
Fault-tolerance = progress despite failures
Alvaro then provides details on distributed consistency: how consistency is handled when data is distributed. He starts with object-level consistency, introducing and defining CRDTs and explaining how these replicated data types help solve the distributed consistency challenge at the object level.
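To make the object-level idea concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is not taken from the talk or from Riak’s implementation; the class name `GCounter` and its methods are hypothetical names chosen for illustration. Each replica increments only its own slot, and replicas merge by taking the element-wise maximum, so merges can arrive in any order, or more than once, and the replicas still converge.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter (illustrative): one slot per replica."""

    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever updates its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # which is what makes message reordering and duplication harmless.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)

a.merge(b)
b.merge(a)
a.merge(b)  # a duplicate delivery changes nothing

print(a.value(), b.value())  # 5 5
```

Because the merge function tolerates reordering and duplication, no coordination is needed between replicas for this data type, which is the sense in which consistency is handled at the object level.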
But what happens as objects are in flight? There must also be flow-level consistency for data in motion. Language-level consistency can help with this problem. Alvaro makes the following key points:
Consistency is tolerance to asynchrony
Tip: Focus on data in motion, not at rest
Alvaro then moves from distributed consistency to fault tolerance. He discusses his most recent research, “lineage-driven fault injection.” He reminds us that we build systems of components and we verify these components to be fault tolerant.
However, when we put these components together it doesn’t guarantee end-to-end fault tolerance.
Alvaro talks about the challenges of the top-down approach to testing all components in a system and outlines the goal of lineage-driven fault injection (LDFI).
Alvaro then introduces us to Molly, a top-down fault injector.
He likens Molly’s approach to solving a maze by starting in the middle and working outward.
Alvaro provides detailed examples to show modeling programs using lineage so that fault tolerance can be analyzed. He then shows how the role of the adversary can be automated. He describes Molly in more detail as a prototype implementation of LDFI. Molly finds fault-tolerance violations quickly or guarantees that none exist. Alvaro provides some output using Molly and shows how lineage allows you to reason backwards from good outcomes.
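The idea of reasoning backwards from a good outcome can be illustrated with a toy sketch. This is not Molly’s implementation; the function `candidate_faults`, the message names, and the brute-force search are all assumptions made for illustration. Suppose the lineage of a successful run tells us several alternative sets of messages ("supports"), each of which is enough on its own to produce the good outcome. A combination of dropped messages is only worth injecting if it breaks every known support; anything less leaves at least one derivation intact.

```python
from itertools import combinations
from typing import FrozenSet, List


def candidate_faults(
    supports: List[FrozenSet[str]], max_faults: int
) -> List[FrozenSet[str]]:
    """Return minimal sets of dropped messages that break every support.

    `supports` is a hypothetical stand-in for lineage: each entry is a set
    of messages that together suffice to produce the good outcome.
    """
    events = sorted(set().union(*supports))
    found: List[FrozenSet[str]] = []
    for k in range(1, max_faults + 1):
        for combo in combinations(events, k):
            faults = frozenset(combo)
            # Skip supersets of an already-found candidate (not minimal).
            if any(prev <= faults for prev in found):
                continue
            # A candidate must intersect every support.
            if all(faults & support for support in supports):
                found.append(faults)
    return found


# The write reaches B either directly, or relayed through C.
supports = [
    frozenset({"A->B"}),
    frozenset({"A->C", "C->B"}),
]

print(candidate_faults(supports, max_faults=1))  # []
print(len(candidate_faults(supports, max_faults=2)))  # 2
```

In this toy model, no single dropped message can defeat both derivations, so a search bounded at one fault has nothing to try. With two faults there are two candidates, each pairing the direct message with one hop of the relay path. A real tool would then rerun the program under each candidate and either find a violation or learn new lineage; that loop is omitted here.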
Alvaro closes with a recap and explanation describing composition as the hardest problem of distributed systems.
Don’t miss this interesting and informative presentation.
Also, KDnuggets did a follow-up interview with Alvaro in which he expanded on some points made in his RICON 2014 Keynote speech. Here are links to the 2-part article:
For years, the press and industry analysts have been telling us that cloud is mainstream, but the reality is that enterprises must shift their workloads to the cloud in an orderly, low-risk manner. While there are many applications already built and running in the cloud, there are many new (or underutilized and, perhaps, misunderstood) technologies like Docker, Chef and object storage that are changing the way cloud applications are implemented.
At RICON 2014, Basho worked with Citrix to host “Build a Cloud Day.” Build a Cloud Day sessions explore new technologies and show how to bring some order to the chaos of moving workloads to the cloud. The attendees learn the concepts and best practices to deploy a cloud computing environment using Apache CloudStack and other cloud infrastructure tools, including those from XenServer, Docker, Riak CS, Chef, Zenoss, Puppet and many others that automate server and network configuration for building highly available cloud computing environments.
Cloud Architecture: Virtualization, Orchestration and Storage
“Build a Cloud Day” started with an excellent presentation by Mark Hinkle. Many of us know him as @mrhinkle. Mark is the Senior Director of Open Source Solutions at Citrix Systems where he helps support the Apache CloudStack and Xen.org projects.
Mark has an excellent grasp of cloud computing and provides an overview of cloud computing architecture and the open source software that can be used to deploy and manage a cloud-computing environment. He looks at virtualization and containers and provides a brief description of Docker and how it is being used in today’s applications.
He also provides an overview of OpenStack. Mark closes the presentation with insights into how to deliver Platform-as-a-Service (PaaS) and what technologies can be used to complement this evolving cloud computing paradigm.
Software is Eating Infrastructure
Other presenters at “Build a Cloud Day” included Basho’s own John Burwell (@john_burwell). John is a Senior Software Engineer at Basho Technologies. He also serves as an Apache CloudStack PMC member and committer focused on storage architecture and security integration. John’s talk explores cloud design strategies to achieve high availability and reliability using commodity components and how to apply these strategies using Apache CloudStack and Riak CS.
By migrating reliability and scalability responsibilities up the stack from specialized hardware to software, cloud orchestration platforms such as Apache CloudStack (ACS) and object stores such as Riak CS increase the utilization and density of compute and storage resources by dynamically shifting workloads based on demand.
John describes two workloads predominately managed in cloud environments — traditional virtualization and cloud — and how to use Apache CloudStack to efficiently manage both simultaneously. He then explores storage design to support this dual workload model, including the use of Riak CS with Apache CloudStack to reduce infrastructure costs without sacrificing reliability.
Riak CS provides software-defined, fault-tolerant object storage uniquely built to handle a variety of unstructured and big data needs using commodity hardware.
Apache CloudStack, Apache Brooklyn and more…
There were many great presentations at “Build a Cloud Day” including:
- Primary Storage in CloudStack by Mike Tutkowski (Slides | Video)
- Introduction to Apache CloudStack by David Nalley (Slides | Video)
- Hypervisor Selection in the Cloud by Tim Mackey (Slides | Video)
- Cloud Application Blueprints with Apache Brooklyn by Alex Heneveld (Slides | Video). Alex also did a Riak-specific presentation at RICON 2014, Running Riak in a Docker Cloud using Apache Brooklyn.
You can find out more about RICON 2014 in our blog post: http://basho.com/wrapping-up-ricon-2014/
The videos of the presentations at RICON 2014 can be found on our RICON Archive site. The Keynote by Peter Alvaro – Outwards from the Middle of the Maze is very popular.
At RICON 2014 Mac Devine, VP and CTO, IBM Cloud Services Division, provided an entertaining and informational keynote delving into the details of the Internet of Things.
Devine begins by explaining that distributed systems are at the convergence of big data, cloud and the Internet of Things (IoT). This is the “Perfect Storm”: it can disrupt large, established industries and create new opportunities and new industries that didn’t exist before.
There are all kinds of forces changing the market. Devine says, “We are carrying around more computing power in our pockets (waving mobile phone) than NASA had to put men on the moon. Companies can assemble solutions from other service providers and are putting more solutions together than ever before. Technology is moving at a faster and faster pace and the companies that can move quickly with cloud speed will win.”
Devine tells us that data is the new currency of business. The challenge is no longer to connect the data sources within the enterprise, but to connect data sources from a myriad of places — both inside and outside the enterprise. Getting insight into the available data is what determines the success of an enterprise.
“There is a litmus test for a successful cloud-delivered service today. It needs to be easy enough to consume, while having a very simple-to-use API that is self-managing in terms of how it scales. If someone else can then build something that you never envisioned, without you having to train them on the platform, then that is a successful platform.” This is how Devine views Riak, “Riak is very much that way.”
Devine then moves on to tell us that IoT is at the top of the Gartner Hype Cycle. The potential growth of connected devices is huge, and Devine tells us that if the data gives us greater insights to make better business decisions, and the costs of collecting that data are reasonable, we will see this data grow. The estimates are in the tens of billions.
Devine then introduces us to the Internet of Things Reference Model, which, like the OSI model, has seven layers.
7 – Collaboration and Process
6 – Application
5 – Data Abstraction
4 – Data Accumulation
3 – Edge Computing
2 – Connectivity
1 – Physical Devices and Controllers
At each layer you need to be able to transpose, transform and append data. This requires an architectural model. Devine introduces the SoftLayer Flow DataStream engine for streaming real-time analytics from the edge to the backend.
Devine explains that handling the Perfect Storm of data requires connectivity, security, scale and elasticity, data storage and retention, real-time and historical analytics, as well as extensibility and an ecosystem. He also introduces the IoT foundational service made up of security services, data services, analytic services, and SDN services.
“Basho Riak will be part of the data services. As applications move to distributed systems, Riak’s masterless architecture makes it a great fit for IoT. Riak’s fault tolerance will be critical for IoT.” Devine also goes into data locality and explains why Riak’s ability to meet global availability requirements and to scale linearly and predictably is critical for IoT.
Devine closes with a look at the ecosystem and how IoT is changing businesses. One of the top initiatives of enterprise companies is to drive innovation using IoT. For example, automotive companies are looking to IoT for key differentiation. Healthcare and life sciences are looking to IoT for data decision-making that can change lives.
You can view Mac Devine’s entire presentation at the RICON 2014 Archive site.
Thank you, Mac, for your valuable insights and entertaining session.
In just a few days we’re heading out to Vegas for RICON 2014! Keep reading for FAQs and everything else you need to know about RICON 2014.
Location and Dates
October 27-29, 2014
How do I register for RICON 2014?
Please register for RICON 2014 here.
What is the cancellation policy?
Tickets are non-refundable and non-transferable within seven days of the start of the event. No refunds are issued for any cancellations to RICON after October 20, 2014. Refunds will not be provided for registrants who do not attend the conference.
At the event
What is Build a Cloud Day?
Presented by Citrix, Build a Cloud Day is on October 27 and is designed to expose attendees to the concepts and best practices around deploying cloud computing infrastructure. Mark Hinkle, senior director, open source solutions at Citrix, will start the day, followed by speakers from CloudSoft, SolidFire and Basho. You can still register. The detailed schedule for Build a Cloud Day is here.
What are the registration hours and locations?
- Monday, October 27: 7-9 a.m. and 5-6 p.m. at registration desk
- Tuesday, October 28: 7 a.m.-6 p.m. at registration desk
- Wednesday, October 29: 7 a.m.-1 p.m. at registration desk
*Registration desk is located in the Ballroom Foyer, 2nd floor of the Palms conference center
What do I need to register?
Please bring a government issued ID.
What are the breakfast and lunch hours and locations?
Breakfast and lunch will be served in Grand Ballroom 1-3 each day.
- Breakfast: 7:30-8:30 a.m.
- Lunch: 12-1 p.m.
There are also two short breaks during the day from 10:15-10:30 a.m. and 2:45 p.m.-3:00 p.m.
Are there any receptions or after parties?
Of course! Please join us for our kick-off event on Monday, October 27 from 6:30-9:30 p.m. in the Kingpin Suite for appetizers, drinks and bowling hosted by AdRoll. It’s a great opportunity to meet other attendees and have some fun before the conference starts. The Kingpin Suite is located on the second floor of the Palms Fantasy Tower, on the 25th floor.
The official RICON after party will be on Tuesday, October 28 from 7-10 p.m. at the iconic MOON Nightclub. Co-hosted by Basho and Pivotal, the party will include appetizers, drinks, entertainment and the best view in Vegas! MOON is located on the top floor of the Palms Fantasy Tower and don’t forget to bring your conference badge – it’s your ticket to the party.
Tweet to @RICONconf and feel free to use #RICON in your tweets from the event. Post your pics at the event and keep an eye out to snap a photo with Basho man.
Are there special hotel rates?
There are discounted rates and special offers for RICON attendees, but with limited availability. You can book your hotel room here or email email@example.com for more information.
What time are standard check-in and check-out at the Palms?
Standard check-in is at 3 p.m. Standard check-out is at 11 a.m.
If you arrive before 3 p.m. and your room is not available, you can store your luggage at the bell desk. Please note that late check-out requests are subject to availability and occupancy on the date of departure and additional charges may apply.
Does the Palms offer a shuttle to and from the airport?
Yes, they have a shuttle service that runs direct from the Palms to the airport every hour on the hour from 6 a.m. to 6 p.m. with a cost of $8.
Does the Palms offer free parking?
Yes, they offer free self-parking and free valet.
Where are the designated smoking areas at the Palms?
Smoking is permitted throughout the casino floor, at both pools, on three floors of the Palms Ivory Tower and at ghostbar and Moon. The Fantasy Tower, and Palms Place Hotel and Spa are non-smoking.
Other Common Questions
What should I wear during RICON?
We’d love to see as many RICON t-shirts as possible. You are more than welcome to wear t-shirts or hoodies from past years. If you don’t have any old RICON swag, don’t worry: attire is casual, and jeans and a t-shirt are OK with us!
Whom should I contact if I have special dietary needs or allergies?
Please let a staff member know when you pick up your conference badge or send us an email – firstname.lastname@example.org.
What should I do if I lose or find an item at RICON?
Please bring any found items to the registration desk located in the Ballroom Foyer.
To report a missing item, visit the registration desk or send us an email at email@example.com and we will do our best to return it to you.
Does RICON have a code of conduct?
Of course, we want all of our attendees to feel welcome and safe. You can view our code of conduct in further detail here.
What if I can’t attend RICON but I want to watch the sessions?
We will be livestreaming our keynote presentations and all other sessions will be available on-demand after the event. Please register for the livestream here.
What if I still have questions about RICON?
For more information, visit ricon.io or contact firstname.lastname@example.org.
Quick RICON 2014 Details:
- Date: October 27-29, 2014
- Location: The Palms, Las Vegas
- Agenda: http://ricon.io/event-details/index.html#agenda
- Tickets: https://www.eventbrite.com/e/ricon-2014-tickets-12372024057
- Social: Tweet to @RICONconf
- More details: Visit ricon.io for more information
Distributed systems conference to feature provocative discussions and presentations on the future of distributed systems with industry thought leaders from HP and IBM
HERNDON, Va. – September 25, 2014 – Basho Technologies, the creator and developer of Riak, the industry leading distributed NoSQL database, today introduced keynote speakers and a preliminary speaker list for RICON 2014, the distributed systems conference hosted by Basho. This year’s event will feature keynote speakers: Mark Interrante, SVP of engineering, HP Helion; Marten Mickos, SVP and general manager of cloud, HP (previously CEO of Eucalyptus); and Mac Devine, director of cloud innovation and CTO, IBM Cloud Services Division.
“We are thrilled to have attracted such an influential lineup of speakers at this year’s RICON conference,” said Adam Wray, CEO of Basho. “Mark, Marten and Mac all have extensive experience in the open source, infrastructure software and cloud industries, so it will be interesting for attendees to hear their perspectives on distributed systems trends. This year’s RICON is shaping up to be the biggest yet and we hope our attendees gain valuable insights as they build out their scalable database initiatives.”
Interrante, Mickos and Devine have experience leading cloud and software enterprise companies in distributed systems and are sought-after speakers at industry events. Their keynote presentations will be introduced by Wray and Basho CTO, Dave McCrory. Speakers will cover topics such as the state of open source, big data, IoT and public and private clouds all within the context of distributed systems.
- Mark Interrante, SVP of engineering, HP Helion, has more than 14 years of VP-level product development experience with Internet companies ranging from startups to global Fortune 500s. Prior to joining HP in early 2014, he was SVP of product at Rackspace where he helped invent, design, build and operate the world’s largest open cloud.
- Mac Devine, director of cloud innovation and CTO, IBM Cloud Services Division, has more than 20 years of experience with networking and virtualization. Before joining IBM in 2008, he started the zCloud Innovation team and served as its chief architect. Devine is an IBM Master Inventor and a member of IBM’s prestigious Academy of Technology.
- Marten Mickos, SVP and general manager of cloud, HP, is a veteran of open source, infrastructure software and global businesses. He served as CEO of Eucalyptus (now in agreements to be acquired by HP) and will join HP as SVP and general manager of cloud. Previously, he was CEO of MySQL AB where he grew the company from a garage startup to the second largest open source software company in the world.
“RICON is a significant gathering of the distributed systems industry, bringing together luminaries from vendors to end users to academia,” said Devine. “I look forward to being a part of this year’s conference and interacting with other members of the cloud ecosystem at the show. My focus will be to get at the center of how cloud services will transform in the evolution of distributed systems.”
The preliminary speaker list is posted on the RICON site, ricon.io. Speaker sessions are expected from the following organizations: AdRoll, Alert Logic, Basho Technologies, Bitly, Braintree – a PayPal Company, Chartbeat, Cloudsoft, Comcast, Dataloop.io, Google, HashiCorp, MIT CSAIL, OpenX, Percona, Riot Games, SWI-Prolog, Two Sigma, UC Berkeley, Universidade Nova de Lisboa, Universidade do Minho, University of Kaiserslautern, UPMC/INRIA, Yahoo!. Below are highlights of a few of the speakers and their sessions.
- Peter Alvaro is a doctoral candidate at the University of California, Berkeley, where he is advised by Joseph M. Hellerstein. His principal research interests are databases, distributed systems and programming languages. His session will be: Outwards from the Middle of the Maze: Using Lineage to find Bugs at Scale.
- Armon Dadgar is CTO of HashiCorp, where he is focused on the real-world application of distributed systems to solve problems, specifically in the world of DevOps and application orchestration. His session will be: Consul: Service Oriented at Scale.
- Aysylu Greenberg works at Google. In her spare time, she works on open source projects in Clojure, ponders the design of systems that deal with inaccuracies, paints and sculpts. Her session will be: Benchmarking: You’re Doing It Wrong.
- Alex Heneveld is the CTO of Cloudsoft and brings 20 years of experience designing software solutions in the enterprise, start-up and academic sectors. His session will be: Using Clocker to Deploy and Manage Distributed Applications in a Docker Cloud, and is co-presented with Andrew Kennedy, a senior engineer for Cloudsoft.
- Neha Narula is a doctoral candidate at MIT, building fast, scalable multi-core and distributed systems. Her session will be: Multicore and Distributed Systems: Sharing Ideas to Build a Scalable Database.
- David Pick is a software engineer at Braintree, a PayPal company, and currently leads Braintree’s data engineering team to solve problems around data access and performance at scale. His session will be: Building a Real-Time Data Pipeline with Clojure and Kafka.
RICON will take place October 28-29 at the Palms Resort and Casino in Las Vegas. It will also feature a training day called Build a Cloud Day presented by Citrix Systems, Inc., on October 27, which is a program designed to expose attendees to the concepts and best practices around deploying cloud computing infrastructure. RICON immediately follows with two days of distributed systems keynote presentations, in-depth sessions, technical tracks, panel discussions, cocktail events and more. A discounted registration price is available for the next 48 hours with bit.ly/RICONPR.
- RICON registration (bit.ly/RICONPR)
- Sponsorships (io/ricon-sponsors/index.html)
- Build a Cloud Day (http://open.citrix.com/about-diy-cloud-computing/cloud-events/viewevent/291-build-a-cloud-day-vegas.html)
- Twitter: @RICONconf (com/riconconf)
- RICON 2014 website (ricon.io)
RICON is a distributed systems conference hosted by Basho. It brings together developers, industry leaders, partners, customers and community members to discuss all the ways they apply distributed systems and networks, and see innovations in the industry. RICON 2014 is being held October 28-29 at the Palms Resort and Casino in Las Vegas, NV. For more information, visit ricon.io.
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy-to-operate at scale. Basho’s distributed database, Riak, the industry leading distributed NoSQL database, and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by one third of the Fortune 50 to power their critical Web, mobile and social applications, and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com.
# # #
December 30, 2013
2013 was a huge year for Basho Technologies and before we dive into 2014, we thought we’d take a moment to reflect on how far we’ve come.
2013 was the year of the Riak User. We love hearing about all the amazing ways companies across various industries are using Riak. This year, we were able to share dozens of exciting case studies. These include:
- Synacor’s TV Everywhere platform
- Enstratius (acquired by Dell)
- Best Buy
- Alert Logic
- Viggle (through OmniTI)
- Turner Broadcasting
- Hosted Graphite
- Gilt Groupe
- Praekelt Foundation
- National Health Service
- City Maps
- The Weather Company
For even more Riak Users, check out the Users Page.
We released Riak 1.3, Riak 1.4, and the Technical Preview of Riak 2.0 this year. These releases added such features as Active Anti-Entropy, revamped Riak Control, queryability improvements, Riak Data Types, and much more. Be on the lookout for the general release of Riak 2.0 early next year.
This year, we expanded RICON, Basho’s distributed systems conference, to both RICON East and RICON West. These were both sold out conferences that featured speakers from bitly, Comcast, Google, Netflix, Salesforce, The Weather Company, Turner Broadcasting, Twitter, and many more.
We drastically increased the number of Basho partners in 2013. For a full list of partners, check out the Partnerships Page. Some key ones to note include Tokyo Electron Device, SoftLayer, and Seagate.
Our amazing community team hosted over 200 meetups around the world this year. On top of that, they also attended dozens of industry events to spread the word about Basho. Keep an eye on the Events Page to see where we’ll be in 2014.
2013 was a busy year but, with some exciting announcements coming, we look forward to an even busier 2014. Happy New Year!
November 5, 2013
RICON West, Basho’s distributed systems conference, took over San Francisco last week. RICON West was the largest RICON to date and was a huge success. We brought in speakers from both industry and academia to discuss the theory, practice, and importance of running distributed systems in production – as well as some predictions on what’s in store for the future.
For those that couldn’t attend RICON West, there was also a live stream available so you could tune in and watch all of the great talks. While we are editing the recorded talks, the video stream will remain available for you to watch any of the talks you missed or would like to rewatch. You can rewatch the video stream here.
If you attended RICON West or tuned in to the live stream, we would love your feedback! What did you love? How can we improve? Please take five minutes to fill out this quick survey.
We look forward to hearing from you!
October 22, 2013
Today, Seagate has announced the availability of their Kinetic Open Storage platform, which simplifies data management and improves performance and scalability, all while lowering expenses. This fundamentally new architecture reduces costs by allowing applications to communicate directly with the storage system, eliminating the acquisition, deployment, and support costs of hyperscale storage infrastructures.
Basho has partnered with Seagate to help them develop this platform to provide interoperability and testing with Riak. Now, with the release of this platform, we want to make it easier for developers to test the Kinetic Open Storage platform with Riak. We have just released an alpha version of our eKinetic driver, which enables an Erlang-based high-performance socket connection to the drive. We have also released software to improve Riak backend compatibility by mapping a Riak backend to the drive library. Both are available for download at https://github.com/basho-labs/riak_kinetic.
Not only does deploying Riak on this platform drastically simplify the management of data through a straightforward socket-based network interface, this simplification also increases I/O efficiency by removing bottlenecks and optimizing cluster management, data replication, migration, and active multi-datacenter performance. Additionally, it is expected that users will realize up to a 50% decrease in the Total Cost of Operations through simplified operations alone. Users can also maximize storage density through reduced power and cooling costs and build out cloud datacenters for even more savings.
Seagate Principal Technologist, James Hughes, will be speaking about the Kinetic Open Storage platform and Riak at RICON, Basho’s distributed systems conference. His talk, “Device Based Innovation to Enable Scale-Out Storage” will take place on October 29th at 12pm in Track Two. Seagate is also a sponsor of RICON.
Basho makes available alpha eKinetic (Erlang) driver and an integrated backend for Riak
CAMBRIDGE, MA – October 22, 2013 – Basho, an expert in distributed systems and cloud storage software, announced today that it has partnered with Seagate Technology (NASDAQ:STX) to help significantly advance the economics and performance potential of cloud architectures. For the past six months, Basho has worked with Seagate on the development of its Seagate Kinetic Open Storage platform, providing interoperability and testing with Riak, Basho’s distributed NoSQL database.
The Seagate Kinetic Open Storage platform eliminates the storage server tier of traditional data center architectures by enabling applications to speak directly to the storage system, thereby reducing expenses associated with the acquisition, deployment, and support of hyperscale storage infrastructures. The platform leverages Seagate’s expertise in hardware and software storage systems integrating an open source API and Ethernet connectivity with Seagate hard drive technology.
Basho developed Riak to offer businesses a highly-available, fault-tolerant, distributed database ensuring ultra-low-latency performance that is simple-to-operate – at any scale. Today, Riak is used by thousands of companies including over 30 percent of the Fortune 50. Basho’s partnership with Seagate aims to continue to improve on these ambitions, and customers deploying Riak on the Seagate Kinetic Open Storage platform will see the following benefits:
- An increase in I/O efficiency by removing bottlenecks and optimizing cluster management, data replication, migration, and active multi-data center performance
- An improvement in customer Total Cost of Operations (TCO) by up to 50 percent through simplified operations
- An additional cost savings by maximizing storage density through reduced power and cooling costs, and receiving potentially dramatic savings in cloud data center build outs
To assist developers seeking to test the Seagate Kinetic Open Storage platform, Basho is making available an eKinetic driver enabling an Erlang-based high-performance socket connection to the drive. Basho is also providing software that maps a Riak backend to the drive library. Both the eKinetic driver and Riak backend compatibility are available as alpha version software.
“Seagate is bringing device based innovation to the scale-out cloud market in a new and open way,” said Ali Fenn, Senior Director of Advanced Storage at Seagate Technology. “This is a fundamentally new architecture – integrating an open source key/value API and Ethernet connectivity into devices – that represents a vital leap forward in decreasing cloud architecture TCO while improving performance. We are very excited to work with Basho, a leader in distributed object storage software, to bring complete solutions to customers.”
“Basho’s distributed database relies on key/value stores directly attached to servers,” commented Jon Meredith, Senior Vice President of Engineering at Basho. “Seagate’s kinetic drive simplifies the management of key/value store, filesystem, logical volume manager, RAID controllers and actual devices by replacing them with a simple socket-based network interface. Freeing drives from server chassis enables independent scaling of capacity and throughput of a cloud architecture. We look forward to continuing to work with Seagate to offer customers significant performance and cost benefits when combining Riak on kinetic drive technology.”
Seagate at RICON West
James Hughes, Principal Technologist from Seagate, will be speaking at Basho’s distributed systems conference RICON West held in San Francisco October 29-30. His session is entitled Device Based Innovation to Enable Scale Out Storage. Seagate is a sponsor of RICON.
About Basho Technologies
Basho is a distributed systems company dedicated to making software that is highly available, fault-tolerant and easy-to-operate at scale. Basho’s distributed database, Riak, and Basho’s cloud storage software, Riak CS, are used by fast growing Web businesses and by over 30 percent of the Fortune 50 to power their critical Web, mobile and social applications and their public and private cloud platforms.
Riak and Riak CS are available open source. Riak Enterprise and Riak CS Enterprise offer enhanced multi-datacenter replication and 24×7 Basho support. For more information, visit basho.com. Basho is headquartered in Cambridge, Massachusetts and has offices in London, San Francisco, Tokyo and Washington DC.
October 2, 2013
What Is Riak CS?
In May of this year, we posted the top 5 questions we heard from customers and our community about Riak CS; today we’ll take a deeper dive into the technical details, specifically the differences between Riak CS and Riak itself.
Riak CS as Compared to Riak
Both Riak CS and Riak are, at their core, places to store objects. Both are open source and both are designed to be used in a cluster of servers for availability and scalability.
The fundamental distinction between the two is simple: Riak CS can be used for storing very large objects, into the terabyte size range, while Riak is optimized for fast storage and retrieval of small objects (typically no more than a few megabytes).
There are subtle differences, however, that can be obscured by the similarities between the two.
Why Would I Use Riak CS?
Riak CS is used for a variety of reasons. Some examples:
- Private object storage services, for example, for companies that want to store sensitive data behind their own firewalls.
- Large binary object storage as part of a voice or video service.
- An integrated component in an OpenStack cloud solution, storing and serving VM images on demand.
Tier 3, Yahoo! Japan, Datapipe, and Turner Broadcasting are just a few of the big names using Riak CS today.
What Does Riak CS Do That Riak Doesn’t?
Riak CS carves large objects into small chunks of data to be distributed throughout a Riak cluster and, when used with Riak CS Enterprise, synchronized with remote data centers.
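To make the chunking idea concrete, here is a minimal Python sketch of splitting a large object into fixed-size blocks and recording them in a manifest. It is an illustration only, not Riak CS’s actual implementation: the function names, the key naming scheme, the manifest fields, and the tiny chunk size are all assumptions chosen for readability.

```python
import hashlib
from typing import Dict, List, Tuple

# Hypothetical chunk size, kept tiny so the example is easy to follow.
CHUNK_SIZE = 4


def chunk_object(name: str, data: bytes,
                 chunk_size: int = CHUNK_SIZE) -> Tuple[dict, Dict[str, bytes]]:
    """Split data into fixed-size chunks and build a manifest listing them."""
    blocks: Dict[str, bytes] = {}
    keys: List[str] = []
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        key = f"{name}:{seq}"  # hypothetical key scheme: object name plus sequence
        blocks[key] = data[offset:offset + chunk_size]
        keys.append(key)
    manifest = {
        "name": name,
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),  # lets a reader verify reassembly
        "chunks": keys,                        # ordered list of chunk keys
    }
    return manifest, blocks


def reassemble(manifest: dict, blocks: Dict[str, bytes]) -> bytes:
    """Rebuild the original object by reading chunks in manifest order."""
    return b"".join(blocks[key] for key in manifest["chunks"])


manifest, blocks = chunk_object("video.bin", b"abcdefghij")
restored = reassemble(manifest, blocks)
```

Each entry in `blocks` is small enough to be stored as an ordinary object, which is the property that lets a store optimized for small values hold a very large file.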
Riak CS adds compatibility with Amazon’s S3 and OpenStack’s Swift APIs. These offer very different semantics than Riak, and the advanced search capabilities in Riak such as Secondary Indexes and full text search are not available using S3 or Swift clients.
We strongly advise against it, but it is possible to work with Riak’s standard APIs “under the hood” when deploying a Riak CS solution.
Work is actively underway to add a security model to Riak in the upcoming 2.0 release.
Buckets or Buckets?
Users of Riak CS store their objects in virtual containers (called buckets in Amazon S3 parlance, containers in OpenStack).
Riak also relies heavily on buckets for data storage and configuration but, despite the names, these buckets are not the same.
As an example of how this can cause confusion: the replication factor in Riak (the number of times a piece of data is stored in a cluster) is configurable per-bucket. Because Riak’s buckets do not underlie the user buckets in Riak CS, this feature cannot be used to create tiered services.
Riak is designed to maximize availability; the price paid for that is delayed consistency when the network is split and clients are writing to both sides of the cluster.
Creating user accounts in Riak CS, however, led to the need for a mechanism to maintain strong consistency. If two people attempt to create user accounts with the same username on either side of a network partition, both cannot be allowed to succeed, or else a conflict will occur that is very difficult to automatically recover from.
Furthermore, user buckets in S3 (and OpenStack APIs as implemented in Riak CS) reside in a global rather than a user-specific namespace, so bucket creation must also be handled carefully.
Riak CS introduced a service named Stanchion that is designed to handle these specific requests to avoid conflicts. Stanchion is a single process running on a single Riak server (thus introducing a single point of failure for user account and bucket creation requests).
While it is possible to deploy Stanchion using common system tools to make a daemon process run in a highly available manner, Basho recommends doing so carefully and testing it thoroughly. Since the only impact of failure is to prevent user and bucket creation, it may be preferable to monitor and alert on failure. If two copies of Stanchion are running due to a network partition, its strong consistency guarantees will be lost.
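The reasoning behind a single serializing process can be sketched in a few lines of Python. This is not Stanchion’s code or API; the class and method names are hypothetical, and an in-memory set stands in for the cluster. The sketch only shows why funnelling creation requests through one place prevents duplicates, and why a second copy running on the other side of a partition defeats that guarantee.

```python
import threading


class NameSerializer:
    """Hypothetical stand-in for a single process that serializes
    user and bucket creation so that names stay globally unique."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._users = set()
        self._buckets = {}  # bucket name -> owner; one global namespace

    def create_user(self, username: str) -> bool:
        # Check and insert happen under one lock, so two concurrent
        # requests for the same username cannot both succeed.
        with self._lock:
            if username in self._users:
                return False
            self._users.add(username)
            return True

    def create_bucket(self, bucket: str, owner: str) -> bool:
        with self._lock:
            if bucket in self._buckets:
                return False
            self._buckets[bucket] = owner
            return True


# One serializer: the second request for the same name is rejected.
single = NameSerializer()
first = single.create_user("alice")
second = single.create_user("alice")

# Two serializers, as if one ran on each side of a partition:
# both accept the same name, so uniqueness is no longer guaranteed.
side_a, side_b = NameSerializer(), NameSerializer()
conflict = side_a.create_user("alice") and side_b.create_user("alice")
```

The trade-off is the one described above: uniqueness holds only while exactly one instance handles these requests, which is also what makes that instance a single point of failure.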
With strong consistency options targeted for Riak 2.0, expect to see some changes.
Basho offers multi-datacenter replication with its Enterprise software licenses, and Riak CS Enterprise takes full advantage of that feature. Data can be written to one or more clusters in multiple data centers and be synchronized automatically between them.
There are two types of synchronization: real-time, which occurs as objects are written, and full sync, which happens on a periodic basis to compare the full contents of each cluster for any changes to be merged.
One key difference is that Riak CS maintains manifest files to track the chunks it creates, and it is these manifests that are distributed between clusters during real-time sync. The individual chunks are not synchronized until a full sync replication occurs, or until someone requests the file from a remote cluster. The manifest is made active for someone to retrieve the chunks after the original upload to the source cluster is complete.
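The interplay between manifests, real-time sync, and full sync can be modeled with a small Python sketch. Everything here is an assumption made for illustration (the `Cluster` class, the function names, the key scheme, and the in-memory dictionaries); it models the behavior described above and is not how Riak CS Enterprise is implemented.

```python
from typing import Dict, List, Optional


class Cluster:
    """Hypothetical in-memory model of one cluster's stored data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.manifests: Dict[str, List[str]] = {}  # object name -> chunk keys
        self.chunks: Dict[str, bytes] = {}         # chunk key -> chunk bytes


def upload(source: Cluster, sink: Cluster, name: str, data: bytes,
           chunk_size: int = 4) -> None:
    """Store chunks on the source, then replicate only the manifest."""
    keys = []
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        key = f"{name}:{seq}"
        source.chunks[key] = data[offset:offset + chunk_size]
        keys.append(key)
    source.manifests[name] = keys
    # Real-time sync: the manifest travels once the upload is complete;
    # the chunks themselves stay behind for now.
    sink.manifests[name] = list(keys)


def full_sync(source: Cluster, sink: Cluster) -> None:
    """Periodic comparison: copy anything the sink is missing."""
    for name, keys in source.manifests.items():
        sink.manifests.setdefault(name, list(keys))
    for key, block in source.chunks.items():
        sink.chunks.setdefault(key, block)


def read(cluster: Cluster, name: str, remote: Optional[Cluster] = None) -> bytes:
    """Read an object, pulling missing chunks from the remote on demand."""
    parts = []
    for key in cluster.manifests[name]:
        if key not in cluster.chunks:
            if remote is None:
                raise KeyError(f"chunk {key} not yet replicated")
            cluster.chunks[key] = remote.chunks[key]
        parts.append(cluster.chunks[key])
    return b"".join(parts)


east, west = Cluster("east"), Cluster("west")
upload(east, west, "a.bin", b"abcdefghij")
chunks_before_read = len(west.chunks)          # manifest only, no chunks yet
fetched = read(west, "a.bin", remote=east)     # request triggers chunk fetch

upload(east, west, "b.bin", b"0123456789")
full_sync(east, west)                          # periodic sync fills the gaps
synced = read(west, "b.bin")
```

In this model a remote reader sees the object as soon as the manifest arrives, while the bulk data moves either on first request or at the next full sync.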
A common mistake while installing Riak CS is to configure it using information specific to Riak rather than Riak CS. As an example, per the Riak CS installation instructions, the relevant backend data store must be configured to riak_cs_kv_multi_backend, which is forked from Riak’s riak_kv_multi_backend. Using the latter will cause problems.
Riak (CS) Control
Exposure to Internet
Exposing any database directly to the Internet is risky. Riak, currently lacking any concept of authentication, absolutely must not be accessible to untrusted networks.
Riak CS, however, is designed with Internet access in mind. It is still advisable to place a load balancer or proxy in front of a Riak CS cluster, for example to ease cluster maintenance and upgrades and to provide a central location to log and block potentially hostile access.
Riak CS servers will still have open Riak ports that must be protected from the Internet as you would any Riak servers.
Where to Next for Riak CS?
2013 has been a big year for Riak CS: it was released as open source in the spring, with OpenStack support added this summer. Still, there is much to do.
As mentioned above, improving or replacing Stanchion is a high priority.
We will continue to expand the API coverage for Riak CS. The next major targets are the copy object operations that Amazon S3 and OpenStack Swift offer.
Compression and more granular replication controls are also under consideration for future releases.
By building Riak CS atop the most robust open source distributed database in the world, we’ve created a very operationally friendly, powerful storage solution that can evolve to meet present and future needs. Feel free to give it a try if you aren’t already using it.
If you’re interested in hearing from the engineers who’ve made this software possible (and seeing just how far a highly available data storage solution can take you), join us October 29-30th for RICON West. RICON West is where Basho brings together industry and academia to discuss the rapidly expanding world of distributed systems, including Riak and Riak CS.