February 28, 2013
In the last post, we looked at how Riak Enterprise’s multi-datacenter replication can be configured for backups and data locality. In this post, we examine two other common implementations: availability zones and public cloud use cases. For more information on Riak Enterprise architecture and configuration, download the complete whitepaper.
Availability zones provide efficient multi-datacenter replication and data redundancy within a geographic region (such as a coast or a country). In this configuration, data is replicated within an availability zone’s series of datacenters. In the event that one of the datacenters experiences an outage or serious failure, data can still be served from other datacenters within the same region.
One approach to setting this up is to have a “primary” site in a region where all reads and writes for specific users, applications, or data sets are directed. This primary cluster can then be replicated to one or more proximal secondary clusters. In other approaches, data can be replicated in real-time from one cluster to both another datacenter and other cold backups maintained for emergency conditions. The right approach is highly dependent on the requirements of users, availability, expense of bandwidth, and other constraints.
Public Cloud Use Cases
Riak is designed to be easy to use and operate on public clouds, and is partnered with many of the leading cloud providers, including Amazon Web Services, Microsoft Azure, and Joyent. Hosted Riak is also available from Engine Yard, and Riak packages can always be manually installed on any physical or virtual provider, even if a machine image isn’t explicitly supported.
There are several use cases for Riak Enterprise’s multi-datacenter replication in the public cloud. Many enterprises want to maintain a cold or hot backup of their cluster in a public cloud for business continuity in the event of a datacenter outage in their private infrastructure. For other customers, the public cloud can provide a more cost-effective way of meeting peak loads, rather than building out private infrastructure to accommodate them year-round. For example, many retailers and media providers need to offer increased capacity over the holiday season. Riak Enterprise is used to scale out capacity on public clouds over these periods, either with full-sync or real-time sync depending on the business needs.
Finally, some enterprises run certain applications or services entirely on public clouds. Riak Enterprise allows for redundancy and data locality across public cloud availability zones for this use case, ensuring optimal performance and resiliency.
February 15, 2013
On January 16, 2013, Citrix’s David Nalley and Basho’s John Burwell detailed the underlying architecture and customer benefits of CloudStack and Basho’s Riak CS software. In the video, David Nalley provides an overview of CloudStack’s architecture and offers a walkthrough of CloudStack Administration, including provisioning of new instances and a description of the CloudStack API.
John Burwell, also a Committer to Apache CloudStack, explains the role of Secondary Storage software to store immutable assets (templates, ISO images, snapshots). John details the difference between Object Storage and Block Storage, discussing benefits such as flexible meta-data definition and custom API access. Finally, John describes enhancements in the upcoming 4.1.0 release that leverage Riak CS to synchronize assets in secondary storage across zones/data centers, reducing operational costs and complexity for multi-zone CloudStack implementations.
Basho is actively working with the CloudStack community to design CloudStack’s next generation architecture to drive deeper integration of leading edge storage technologies such as Riak CS.
Earlier this week, Basho and Datapipe announced the availability of a new object storage service on Datapipe’s 10 Gig Stratosphere Cloud Platform. The S3-compatible object storage service is built on Citrix CloudStack and fully integrates Riak CS.
February 7, 2013
Basho and our community have a handful of events lined up for February 13th. We have official meetups/group hacks in at least seven cities in the US.
We hope to see you next week. If you can’t attend an official Meetup, throw a Riak hack or drink up in your city and email firstname.lastname@example.org to tell us about it.
Thanks for being a part of Riak.
- Speaker: Weston Jossey, Software Engineer, Tapjoy
- Talk Title: Huge Data Migrations to Riak Made Easy(er)
- Details and RSVP
- Speaker: Sean Cribbs, Software Engineer, Basho Technologies
- Talk Title: The Deep Riak
- Details and RSVP
New York City
- Speaker: Aaron Brown, Lead Systems Engineer, ideel
- Talk Title: Riak at ideel
- Details and RSVP
- Speaker: Adron Hall and You
- Talk Title: Riak Hack & Brew
- Details and RSVP
- Speaker: Robert Zuber, Co-Founder, Copious
- Talk Title: Riak in a Multi-Datastore Strategy at Copious
- Details and RSVP
- Speaker 1: Pavan Venkatesh, Technical Evangelist, Basho Technologies
- Talk Title 1: From Relational to Riak
- Speaker 2: Sajith Kizhakkiniyil, Software Infrastructure and Backend Architecture Support, Apollo Group
- Talk Title 2: Riak at Apollo
- Details and RSVP
- Speaker: Adron Hall and You
- Talk Title: Nerd Lunch and The Start of Seattle Riak
- Details and RSVP
TED to Leverage Deep Relationships with Enterprise Companies to Accelerate Adoption of Riak Throughout Japan
CAMBRIDGE, MA and YOKOHAMA, JAPAN – February 7, 2013 – Basho Technologies, Inc. and Tokyo Electron Device Limited (TED) announced a strategic partnership and distribution agreement under which TED will resell Basho products throughout Japan and has become a strategic equity investor in Basho. Basho Technologies specializes in distributed systems technologies and is the creator of Riak, the industry leading distributed database and cloud storage software. TED provides world-class products and solutions that deliver competitive advantages to its customers. The strategic partnership enables Basho and TED to capitalize on the comprehensive resources of TED to open up new opportunities for Basho in the Japanese market. As part of the partnership, TED will build and maintain dedicated sales support and post-sales support resources specifically around Riak, Riak CS and future new products from Basho.
“Basho is very excited to enter into a long-term strategic partnership with TED,” said Sam Takagi, general manager of Basho Japan and Asia Pacific. “TED is highly regarded throughout Japan for its expertise in storage infrastructure, data backup and protection, and data warehouse design and operations. Riak’s strengths around high-availability, scalability and predictability are highly complementary to TED’s expertise and will provide an important new and innovative database and storage solution for TED’s customers. With Riak, Japanese businesses can meet demanding Internet, social and mobile requirements, as well as build highly-competitive public clouds and secure, high-performance private clouds.”
“Riak’s inherent distributed data capabilities offer a unique solution for companies building next generation applications, and cloud computing platforms that require high scalability, no downtime, and low cost operations,” said Vic Amano, VP & GM CN Business of Tokyo Electron Device Limited. “Our highly complementary expertise in data storage and our large and established network of commercial and industrial customers position us well to quickly speed adoption of Basho’s technology throughout Japan. The strategic nature of our partnership with Basho allows us to collaborate on future customer requirements and on product directions, allowing TED to maintain a competitive advantage for the next generation of database and storage technologies.”
“Through this strategic partnership, Basho is further building its global presence and particularly in the important and large market of Japan,” said Greg Collins, Basho’s president and CEO. “Our partnership with TED further underscores Basho’s commitment to the Japanese market. We are committed to building strong local capabilities and leveraging partnerships that have strong business networks and local-market expertise. TED is a terrific match for Basho. We look forward to working with TED and its customers for many years to come.”
Today’s announcement follows Basho’s recent opening of its Tokyo Office, officially launched on September 27, 2012.
About Tokyo Electron Device (TED) CN Business:
Tokyo Electron Device (TED) is a technical trading firm with a “trading business” function that provides semiconductor products and business solutions as well as a “development business” function that performs commissioned designing and the development of own-brand products. The Computer Network (CN) Business Section handles a wide range of storage systems, network-related equipment, and middleware products and provides them as part of its business solutions in the era of cloud computing. It has marketing functions in Japan and overseas to pick up on trends in the world’s advanced technologies ahead of others in order to offer products and services that cover processes that span everything from implementation to support.
For more information, visit: http://cn.teldevice.co.jp/english/.
About Basho Technologies
Basho Technologies is the leader in highly-available, distributed database technologies used to power scalable, data-intensive Web, mobile, and e-commerce applications and large cloud computing platforms. Basho customers, including fast-growing Internet-based businesses and large Fortune 500 enterprises, use the company’s flagship product, Riak, to deliver and manage digital media and unstructured data, to implement multi-device user activity and session stores, to aggregate large amounts of data for logging, search and analytics, and to build scalable cloud storage platforms. The company is based in Cambridge, Massachusetts and operates regional offices in London, San Francisco, Tokyo and Washington DC.
Basho Technologies Media Contact:
Bobby Patrick Chief Marketing Officer, Basho Technologies
Tokyo Electron Device Media Contact:
Yoichiro Hotta, Yoko Fukui Corporate Communications Department, Tokyo Electron Device Limited
Contact form: https://www.teldevice.co.jp/eng/contact_form_news.html
For inquiries regarding Basho Technologies and Riak: Tokyo Electron Device Limited
Tsuyoshi Yoshi Tanaka, 1-510-624-3463
CN Business Contact form: http://cn.teldevice.co.jp/company/tea/form.html
January 30, 2013
Many teams run Riak in public cloud environments, either as a part of their infrastructure or as the foundation of it. Increasingly, we see enterprises and startups use a hybrid implementation that leverages both private infrastructure and public cloud services. This hybrid model is often used to address burst capacity issues, tenancy/location concerns, and simple proof-of-concept implementations prior to hardware acquisition.
Over the past few years, we have seen substantive adoption of Riak on Amazon Web Services. To that end, we are pleased that Basho has been approved as an Amazon Web Services Technology Partner. We look forward to highlighting interesting use cases, publishing detailed case studies of usage, and continuing to improve the usability and deployment speed of Riak on the AWS platform.
This post provides a high-level overview of your deployment options for using Riak on Amazon.
How Many Nodes?
Before we discuss the mechanics of implementation, it is important to consider the size of your deployment. One of the most frequent questions Basho is asked is, “How many nodes should I start with?”
If you have played with the Riak Fast Track, you are familiar with deploying three nodes on a single machine. However, for production deployments, we recommend that your cluster be set up with a minimum of five nodes. For more details on how this minimum ensures the performance and availability of your implementation, please read the post entitled: Why Your Riak Cluster Should Have At Least Five Nodes.
So, you have a minimum of five nodes and you’ve decided that leveraging a cloud provider is appropriate for your current business needs. Now, how do you get started?
Amazon Machine Image
At its simplest, an Amazon Machine Image (AMI) is a pre-built machine image and configuration of Riak for Amazon EC2 users.
Obtaining and configuring the image is a relatively straightforward process. However, since Riak needs the nodes in the cluster to communicate with each other, there is some manual setup involved.
First, provision the Riak AMI onto the server of your choice via the AWS marketplace.
Once the virtual machine is created, manually configure the EC2 security group to allow the Riak nodes to speak to each other. The details of this step can be found on our docs portal under Installing on AWS Marketplace. However, this is generally as simple as opening a few inbound ports and defining a “Custom TCP rule.”
At this point, the machines can be clustered together. When the individual virtual machines are provisioned and the security group is configured, simply SSH into each machine and use internal riak-admin tools to join the nodes to the cluster.
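The join step can be sketched as a small planning helper. This is a minimal sketch, assuming a Riak release that ships the staged `riak-admin cluster join`, `plan`, and `commit` commands; the node names and IP addresses are hypothetical, and the helper only builds the command lists to run over SSH rather than executing anything.

```python
def staged_join_plan(seed, joiners):
    """Build (host, argv) steps that join `joiners` to the cluster of `seed`.

    Node names are assumed to look like "riak@10.0.1.10" (hypothetical
    addresses); the host to SSH into is taken from the part after "@".
    """
    steps = []
    for node in joiners:
        host = node.split("@", 1)[1]
        # Run on each joining node: stage a join to the seed node.
        steps.append((host, ["riak-admin", "cluster", "join", seed]))
    seed_host = seed.split("@", 1)[1]
    # Run once on any cluster member: review, then commit the staged changes.
    steps.append((seed_host, ["riak-admin", "cluster", "plan"]))
    steps.append((seed_host, ["riak-admin", "cluster", "commit"]))
    return steps


# Hypothetical five-node cluster: one seed plus four joining nodes.
plan = staged_join_plan(
    "riak@10.0.1.10",
    ["riak@10.0.1.%d" % n for n in range(11, 15)],
)
for host, argv in plan:
    print(host, " ".join(argv))
```

Keeping the plan as data makes it easy to review before anything is run, which mirrors the review-then-commit shape of the staged commands themselves.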
But what if you want to automate some of the configuration of your cluster? Or, what if you want the ability to set up a VPC-based stack that includes:
- a front-end load balancer,
- a cluster of application servers,
- a Riak-powered demo application,
- a back-end load balancer,
- and a cluster of Riak servers.
In that case, the Basho team has made available scripts that leverage AWS CloudFormation to build out your cluster in a scripted fashion.
Since this is a much different process than the previous method, it is well worth watching the introductory video (embedded below). In addition, the scripts in the cloudformation-riak repo can be thought of as “known good” templates. We accept pull requests. Happy forking!
As always, there is a manual option.
If you need to control the system configuration or are most comfortable with software that you have built and deployed yourself, there is always the option to install from package or source.
For a full list of supported operating systems, check out the Installing and Upgrading page of the doc portal. In addition, we have recently launched a new download page that includes the source for the OSS version of Riak.
Riak is easier to deploy than ever before. If you have feedback on present deployment alternatives, or recommendations on ways to make Riak support for cloud infrastructure easier, please drop us a note in the mailing list.
January 29, 2013
This is the first in a series of blog posts covering the benefits Riak offers to developers and operators of retail and eCommerce platforms. To learn more, join our “Retail on Riak” webcast on Friday, February 8th.
As retailers grow and have to store more and more data, traditional relational databases aren’t always the best option. Retailers want to scale easily, without the operational burden of manual sharding. Meanwhile, business requirements demand their data is always available for reads and writes. Riak is a highly available, low latency distributed database that is ideal for retailers who need to serve product data quickly and maintain “always on” shopping experiences. Riak is based on architectural principles from Amazon. Riak is designed for high availability and scale so retailers can always serve customers, even under failure conditions, and rapidly grow to meet peak loads.
Retailers of all sizes have chosen Riak to power parts of their business, including:
- Best Buy: Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part in the transformation push to re-platform Best Buy’s eCommerce platform. For more info, check out Best Buy’s talk from our 2012 developer conference, RICON.
- ideel: ideel is one of the fastest growing retailers with over 5 million members and more than 1,000 brand partners. They use Riak to serve HTML documents and user-specific products. ideel chose Riak to power their event-based shopping experience due to Riak’s ability to serve users information at low latency and provide ease of use and scale to ideel’s operations team. Check out the complete case study for more details.
- Copious: Copious is a social commerce marketplace that makes it easy for people to buy and sell the things they love. They currently store all registered accounts in Riak as well as the tokens that make it possible for users to authenticate with Copious via their Facebook or Twitter accounts. They chose to use Riak for their social login functionality because of its operational simplicity, which allows them to easily scale up without sharding and provides the high availability required for a smooth user experience. For more details, check out the complete Copious story on our blog.
For more information about the benefits of Riak for retailers and the retailers already using it, register for our “Retail on Riak” webcast on February 8th!
January 22, 2013
Traditionally, most retailers have used relational databases to manage their platforms and eCommerce sites. However, with the rapid growth of data and business requirements for high availability and scale, more retailers are looking at non-relational solutions like Riak.
Riak is a masterless, distributed database that provides retailers with high read and write availability, fault-tolerance and the ability to grow with low operational cost. Architectural, operational and development benefits for retailers include:
- “Always On” Shopping Experience: Based on architectural principles from Amazon, Riak is designed to favor data availability, even in the event of hardware failure or network partition. For retailers, failure to accept additions to a shopping cart, or serve product information quickly, has a direct and negative impact on revenue. Riak is architected to ensure the system can always accept writes and serve reads at low-latency.
- Resilient Infrastructure: At scale, hardware malfunction, network partition, and other failure modes are inevitable. Riak provides a number of mechanisms to ensure that retail infrastructure is resilient to failure. Data is replicated automatically within the cluster so nodes can go down but the system still responds to requests. This ensures read and write availability, even in serious failure conditions.
- Low-Latency Data Storage: Many retailers now operate online and mobile experiences with an API or data services platform. In order to provide a fast and available experience to end users, Riak is designed to serve predictable, low-latency requests as part of a service-oriented infrastructure and is accessible via HTTP API, protocol buffers, or Riak’s many client libraries.
- Scale to Peak Loads with Low Operational Cost: During major holidays and other periods of peak load, retailers may have to significantly increase their database capacity quickly. When new nodes are added, Riak automatically distributes data evenly to naturally prevent hot spots in the database, and yields a near-linear increase in performance and throughput when capacity is added.
- Global Data Locality and Redundancy: Riak Enterprise’s multi-site replication allows replication of data to multiple data centers, providing both a global data footprint and the ability to survive datacenter failure.
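The even redistribution described in the peak-load item above can be illustrated with a toy partition ring. This is a simplified sketch, not Riak's actual claim algorithm: the ring size, node names, and the `add_node` rebalancing rule are all assumptions made for illustration. It only shows the general idea that keys hash to fixed partitions and a new node takes over a fair share of partitions from the existing owners.

```python
import hashlib
from collections import Counter

RING_SIZE = 64  # number of fixed partitions (hypothetical value)


def assign_partitions(nodes, ring_size=RING_SIZE):
    """Initial ownership: hand partitions to nodes in round-robin order."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def partition_for(key, ring_size=RING_SIZE):
    """Map a key onto a partition by slicing the 160-bit SHA-1 space evenly."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return (int.from_bytes(digest, "big") * ring_size) >> 160


def add_node(ring, new_node):
    """Let `new_node` claim partitions from over-full owners until balanced."""
    ring = list(ring)
    target = len(ring) // (len(set(ring)) + 1)  # fair share after the join
    counts = Counter(ring)
    for i, owner in enumerate(ring):
        if counts[new_node] >= target:
            break
        if counts[owner] > target:
            ring[i] = new_node
            counts[owner] -= 1
            counts[new_node] += 1
    return ring


ring = assign_partitions(["node-a", "node-b", "node-c", "node-d"])
bigger = add_node(ring, "node-e")
print(Counter(bigger))
print("owner of cart:42 ->", bigger[partition_for("cart:42")])
```

Only the partitions handed to the new node change owner, so most keys stay where they are. A production system would also spread a node's partitions around the ring so that replicas of one key land on different machines, which this sketch ignores.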
Top retailers using Riak include Best Buy and ideel. Best Buy selected Riak as an integral part in the transformation push to re-platform its eCommerce platform. For more information about how Best Buy is using Riak, check out this video.
ideel uses Riak to serve HTML documents and user-specific products. ideel chose Riak to provide its highly available, event-based shopping experience – Riak gives them the ability to serve user information at low latency and provides ease of use and scale to ideel’s operations team. For more information on ideel’s use of Riak check out the complete case study.
Common use cases for Riak in the retail/eCommerce space include shopping carts (due to Riak’s “always-on” capabilities), product catalogs (Riak is well suited for the storage of rapidly growing content that needs to be served at low-latency), API platforms (Riak’s flexible, schemaless design allows for rapid application development), and mobile applications (Riak is ideal for powering mobile experiences across platforms due to its low-latency, always-available small object storage capabilities).
To help retailers evaluate and adopt Riak, we’ve published a technical overview: “Retail on Riak: A Technical Introduction.” We discuss more in-depth information on modeling applications for common use cases, switching from a relational architecture, querying, multi-site replication and more.
January 21, 2013
Mad Mimi is an email marketing service that allows users to create, send, and track email campaigns without using templates. With over 100,000 clients, Mad Mimi is storing a large amount of data that needs to be accessed quickly and easily.
In 2011, Mad Mimi realized that their data was growing beyond the capacity of their MySQL database. Rather than resharding the data, which would require an extensive operational effort, Mad Mimi decided to try Riak based on its ability to scale quickly and easily without manual sharding.
Mad Mimi now uses Riak to track email statistics, leveraging the secondary indexing feature to make retrieving data easier. Secondary indexing allows users to attach additional key/value data to Riak objects and query them by exact match or range value. Mad Mimi is currently running an eight-node cluster storing between three and five billion keys, adding between 10 and 20 million keys each day.
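To make the two query styles concrete, here is a toy in-memory model of a secondary index. It is only a sketch of the concept, not Riak's API or storage layout; the class name, the object keys, and the date-style integer values are hypothetical.

```python
from bisect import bisect_left, bisect_right


class SecondaryIndex:
    """Toy index mapping an extra value (e.g. a send date) to object keys."""

    def __init__(self):
        # Parallel lists kept sorted by index value.
        self._values = []
        self._keys = []

    def add(self, key, value):
        """Tag the object stored under `key` with an index `value`."""
        pos = bisect_right(self._values, value)
        self._values.insert(pos, value)
        self._keys.insert(pos, key)

    def exact(self, value):
        """Keys whose index value equals `value`."""
        return self.range(value, value)

    def range(self, start, end):
        """Keys whose index value lies in the inclusive range [start, end]."""
        lo = bisect_left(self._values, start)
        hi = bisect_right(self._values, end)
        return self._keys[lo:hi]


sent_on = SecondaryIndex()
sent_on.add("stat:1", 20130101)
sent_on.add("stat:2", 20130115)
sent_on.add("stat:3", 20130120)
print(sent_on.exact(20130115))            # exact match
print(sent_on.range(20130110, 20130131))  # range query
```

The point of the model is that the primary key stays opaque while the index value carries the attribute you want to search on, so lookups no longer require knowing every key in advance.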
Since launching with Riak, their cluster has never gone down and it is still as fast as ever. Based on this success, they hope to move all their email tracking statistics to Riak and eliminate MySQL entirely.
For more details on Mad Mimi’s experience with Riak, check out the case study, “Email Marketing Success with Mad Mimi and Riak.”
For more information on moving from a relational database to Riak, sign up for our webcast this Thursday, covering advantages, tradeoffs and development considerations.
November 30, 2012
In September, we announced a partnership with Citrix CloudPlatform to provide integrated storage and compute capabilities. Basho and Citrix are working together on a combined platform that provides highly available cloud storage with multi-data center capabilities as part of the CloudPlatform solution for private, hybrid and public clouds.
With this month’s release of multi-datacenter features in Riak CS, we’re able to provide CloudPlatform users with highly available, multi-site cloud storage. We’re also working on authentication support for Citrix CloudPlatform in Riak CS for even more seamless integration.
“The developers of Riak have done a great job helping to extend Apache CloudStack, enabling users to use an S3-compatible object store for secondary storage,” said Citrix VP of Product Development and Apache CloudStack committer, Kevin Kluge. “We are also looking forward to having the option to use storage replication across zones as part of their Citrix CloudPlatform compatible Riak CS product.”
This weekend we’re at the CloudStack Collaboration Conference to talk to users about Riak CS and CloudStack, and we’ve got tons of free trials for Riak CS to pass out. If you’re at the Collaboration Conference, be sure to attend the technical overview presented by Basho Chief Architect Andy Gross, taking place at 10:45 AM PT in RM607. And make sure to track us down on Twitter if you’d like to talk more.
Several Basho team members will be presenting on distributed systems topics at QCon San Francisco.
SAN FRANCISCO, CA – November 7, 2012 – Attending QCon International Software Development Conference this week in San Francisco? We’d love to meet up and talk to you about Riak! You can catch us in the exhibitor’s hall all week, or at the welcome party taking place after the talks Wednesday, November 7 at Thirsty Bear. Additionally, several Basho team members will be presenting on distributed systems topics. Check out the talk synopses below. We hope to see you there.
Thursday, November 8
Riak and Dynamo, 5 Years Later
Andy Gross, Basho Chief Architect
October 2012 marks the five year anniversary of Amazon’s seminal Dynamo paper, which inspired most of the NoSQL databases that appeared shortly after its publication, including Riak. In this session, Andy will reflect on five years of involvement with Riak and distributed databases and discuss what went right, what went wrong, and what the next five years may hold for Riak as we outgrow our Dynamo roots.
Fear No More: Embrace Eventual Consistency
Sean Cribbs, Basho Software Engineer
A number of years ago, Eric Brewer, father of the CAP theorem, coined the term “BASE” for an architectural style of loosely-coupled distributed systems, meaning “Basically Available, Soft-state, and Eventually-consistent”. Clearly he meant this as a counterpoint to the “ACID” properties of traditional database systems. BASE systems choose to remain available to operations, sacrificing strict synchronization. While developers are very comfortable with the convenience of ACID, eventual consistency can be frightening, unfamiliar territory.
This talk will dive into the design of eventually consistent systems, touching on theory and practice. We’ll see why EC doesn’t mean “inconsistent” but is actually a different kind of consistency, with different tradeoffs. These new skills should help developers know when to embrace eventually-consistent solutions instead of fearing them.
Friday, November 9
Dynamo: Theme and Variations
Shanley Kane, Basho Director of Product Management
The Dynamo paper, released by Amazon five years ago, laid out a set of technical “themes” for highly available, fault-tolerant distributed systems. Since then, numerous NoSQL products have been built on its core principles. These “variations,” along with recent advances in research, represent both a fascinating study in technical evolution and the forefront of the non-relational world. In this talk, we’ll cover the foundations of Dynamo – consistent hashing, vector clocks, hinted handoff, gossip protocol – advances in each area, and how querying and application development have changed as a result of them.
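For readers new to one of the foundations named above, vector clocks, the following is a generic textbook-style sketch. It is not taken from the talk and is not Riak's implementation; the actor names are hypothetical. It shows how per-actor counters let a system tell whether one version of an object supersedes another or whether the two were written concurrently.

```python
def increment(clock, actor):
    """Return a copy of `clock` with `actor`'s counter advanced by one."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new


def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify `a` relative to `b`: equal, after, before, or concurrent."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"


def merge(a, b):
    """Combine two clocks by taking the highest counter seen per actor."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0)) for actor in set(a) | set(b)}


v1 = increment({}, "client-x")    # first write
v2 = increment(v1, "client-x")    # client-x updates again
v3 = increment(v1, "client-y")    # client-y updates from v1 in parallel
print(compare(v2, v1))            # v2 supersedes v1
print(compare(v2, v3))            # neither supersedes the other
print(compare(merge(v2, v3), v2)) # a merged clock supersedes both
```

When two versions compare as concurrent, neither can be safely discarded, so the application (or a merge rule) has to reconcile them.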