Tag Archives: Engine Yard

RICON East: Day Two

May 14, 2013

We hope you all enjoyed the first day of RICON East. There were some great talks yesterday and we’re looking forward to even more today. As a reminder, all of the talks are being live streamed here, just in case you weren’t able to get your ticket in time.

We wanted to give a quick shout out to all of the great sponsors of RICON East this year. This conference would not be possible without them. A big thank you to Fastly, Meraki, Engine Yard, SoftLayer, NoSQL Weekly, OmniTI, Erlang Solutions, GitHub, and GoFactory for all of your help!

We also just announced the date and location for RICON West, happening October 29-30th at the St. Regis Hotel in San Francisco. Full conference details are available at the official RICON West site, and we’ve already got some great speakers announced. Tickets are available for the early bird price of $299 for a limited time. We hope to see you in October!

Basho

Top Five Questions About Riak

April 17, 2013

This post looks at five commonly asked questions about Riak. For more questions and answers, check out our Riak FAQ.

What hardware should I use with Riak?

Riak is designed to be run on commodity hardware and is run in production on a variety of different server types on both private and public infrastructure. However, there are several key considerations when choosing the right infrastructure for your Riak deployment.

RAM is one of the most important factors: available RAM directly affects which Riak backend you should use (see the backend question below) and is also required for complex MapReduce queries. In terms of disk space, Riak automatically replicates data according to a configurable n_val. A bucket-level property that defaults to 3, n_val determines how many copies of each object will be stored, and provides the inherent redundancy underlying Riak’s fault-tolerance and high availability. Your hardware choice should take into account how many objects you plan to store and the replication factor. However, Riak is designed for horizontal scale and lets you easily add capacity by joining additional nodes to your cluster. Additional factors that might affect your choice of hardware include IO capacity, especially for heavy write loads, and intra-cluster bandwidth. For more on capacity planning, check out our documentation on cluster capacity planning.
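As a rough illustration of how n_val feeds into disk sizing, here is a back-of-the-envelope sketch in Python. The workload numbers and function names are made up for the example, and the estimate ignores per-object metadata, backend overhead, and headroom for compaction, so treat it as a lower bound rather than a sizing recommendation.

```python
import math


def raw_storage_bytes(num_objects: int, avg_object_bytes: int, n_val: int = 3) -> int:
    """Bytes stored cluster-wide when every object is kept n_val times."""
    return num_objects * avg_object_bytes * n_val


def nodes_needed(total_bytes: int, usable_bytes_per_node: int) -> int:
    """Smallest node count whose combined usable disk holds total_bytes."""
    return math.ceil(total_bytes / usable_bytes_per_node)


# Hypothetical workload: 50 million objects of about 2 KiB each, n_val = 3.
total = raw_storage_bytes(50_000_000, 2048, n_val=3)
print(total)                                # 307200000000 bytes
print(nodes_needed(total, 100 * 10**9))     # 4 nodes at 100 GB usable each
```

Raising n_val or the object count scales the total linearly, which is why the replication factor belongs in the hardware estimate from the start.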

Riak is explicitly supported on several cloud infrastructure providers. Basho provides free Riak AMIs for use on AWS. We recommend using large, extra large, and cluster compute instance types on Amazon EC2 for optimal performance. Learn more in our documentation on performance tuning for AWS. Engine Yard provides hosted Riak solutions, and we also offer virtual machine images for the Microsoft VM Depot.

What backend is best for my application?

Riak offers several different storage backends to support use cases with different operational profiles. Bitcask and LevelDB are the most commonly used backends.

Bitcask was developed in-house at Basho to offer extremely fast read/write performance and high throughput. Bitcask is the default storage engine for Riak and ships with it. Bitcask uses an in-memory hash-table of all keys you write to Riak, which points directly to the on-disk location of the value. The direct lookup from memory means Bitcask never uses more than one disk seek to read data. Writes are also very fast with Bitcask’s write-once, append-only design. Bitcask also offers benefits like easier backups and fast crash recovery. The inherent limitation is that your system must have enough memory to contain your entire keyspace, with room for a few other operational components. However, unless you have an extremely large number of keys, Bitcask fits many datasets. Visit our documentation for more details on Bitcask, and use the Bitcask Capacity Calculator to assist you with sizing your cluster.
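To make the design concrete, here is a toy Python sketch of the idea described above: an append-only data file plus an in-memory table mapping each key to the on-disk location of its latest value. This is not Bitcask's actual code or file format. The class and method names are invented for illustration, and real Bitcask also handles checksums, timestamps, file merging, and recovery, all of which are left out.

```python
import os
import tempfile
from typing import Dict, Optional, Tuple


class ToyBitcask:
    """Append-only data file plus an in-memory key directory (illustrative only)."""

    def __init__(self, path: str) -> None:
        # "w+b" starts a fresh file; this toy does not rebuild state on restart.
        self._file = open(path, "w+b")
        self._keydir: Dict[bytes, Tuple[int, int]] = {}  # key -> (offset, length)

    def put(self, key: bytes, value: bytes) -> None:
        # Always write at the end of the file; old values are never overwritten.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(value)
        self._file.flush()
        self._keydir[key] = (offset, len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        # The memory lookup gives the exact location, so one seek reads the value.
        entry = self._keydir.get(key)
        if entry is None:
            return None
        offset, length = entry
        self._file.seek(offset)
        return self._file.read(length)

    def close(self) -> None:
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    store = ToyBitcask(os.path.join(tmp, "data.log"))
    store.put(b"user:1", b"alice")
    store.put(b"user:1", b"alice-v2")  # the old bytes stay on disk; the keydir moves on
    print(store.get(b"user:1"))        # b'alice-v2'
    store.close()
```

The sketch also shows where the memory limitation comes from: the key directory holds one entry per key, regardless of how large the values on disk are.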

LevelDB is an open-source, on-disk key-value store from Google. Basho maintains a version of LevelDB tuned specifically for Riak. LevelDB doesn’t have Bitcask’s memory constraints around keyspace size, and thus is ideal for deployments with a very large number of keys. In addition, LevelDB uses Google Snappy data compression, which is particularly efficient for text data like raw text, Base64, JSON, HTML, etc. To use LevelDB with Riak, you must change the storage backend variable in the app.config file. You can find more details on LevelDB here.

Riak also offers a Memory storage backend that does not persist data and is used simply for testing or small amounts of transient state. You can also run multiple backends within a single Riak instance, which is useful if you want to use different backends for different Riak buckets or use a different storage configuration for some buckets. For in-depth information on Riak’s storage backends, see our documentation on choosing a backend.

How do I model data using Riak’s key/value design?

Riak uses a key/value design to store data. Key/value pairs comprise objects, which are stored in buckets. Buckets are flat namespaces with some configurable properties, such as the replication factor. One frequent question we get is how to build applications using the key/value scheme. The unique needs of your application should be taken into account when structuring it, but here are some common approaches to typical use cases. Note that Riak is content-agnostic, so values can be any content type.

| Data Type | Key | Value |
| --- | --- | --- |
| Session | User/Session ID | Session Data |
| Content | Title, Integer | Document, Image, Post, Video, Text, JSON/HTML, etc. |
| Advertising | Campaign ID | Ad Content |
| Logs | Date | Log File |
| Sensor | Date, Date/Time | Sensor Updates |
| User Data | Login, Email, UUID | User Attributes |
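
A few of these patterns can be sketched in Python. The snippet below uses a plain dictionary as a stand-in for a cluster and is not a Riak client. The bucket names, key formats, and helper functions are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Optional

# Stand-in for a cluster: bucket name -> {key -> value}. Not a Riak client.
buckets: dict = defaultdict(dict)


def put(bucket: str, key: str, value: bytes) -> None:
    buckets[bucket][key] = value


def get(bucket: str, key: str) -> Optional[bytes]:
    return buckets[bucket].get(key)


def log_key(day: date) -> str:
    """Logs keyed by date, e.g. '2013-04-17'."""
    return day.isoformat()


def sensor_key(reading_time: datetime) -> str:
    """Sensor updates keyed by date/time, e.g. '2013-04-17T09:30:00'."""
    return reading_time.strftime("%Y-%m-%dT%H:%M:%S")


# Values are opaque bytes: JSON, plain text, or anything else.
put("sessions", "user42/5f3a9c", b'{"cart": [17, 23]}')
put("logs", log_key(date(2013, 4, 17)), b"GET /index.html 200\n")
put("sensor_7", sensor_key(datetime(2013, 4, 17, 9, 30, 0)), b"21.5")
put("users", "alice@example.com", b'{"name": "Alice"}')

print(get("logs", "2013-04-17"))  # b'GET /index.html 200\n'
```

The common thread is that the key is something the application can compute at read time (a session ID, a date, an email address), so no query is needed to find the object.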

For more comprehensive information on building applications with Riak’s key/value design, view the use cases section of our documentation.

What other options, besides strict key/value access, are there for querying Riak?

Most operations in Riak read and write key/value pairs. However, Riak exposes several other features for searching and accessing data: MapReduce, full-text search, and secondary indexing.

MapReduce provides non-primary-key-based querying that divides work across the Riak distributed database. It is useful for tasks such as filtering by tags, counting words, extracting links, analyzing log files, and aggregation tasks. Riak provides both JavaScript and Erlang MapReduce support. Jobs written in Erlang are generally more performant. You can find more details about Riak MapReduce here.
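The shape of a MapReduce job can be illustrated with a word count in plain Python. This sketch does not use Riak's MapReduce API (real jobs are written in JavaScript or Erlang). The partition layout and function names are assumptions meant only to show how a map step runs next to the data and a reduce step combines the partial results.

```python
from collections import Counter
from functools import reduce


def map_phase(value: str) -> Counter:
    """Map: turn one stored document into word counts."""
    return Counter(value.lower().split())


def reduce_phase(left: Counter, right: Counter) -> Counter:
    """Reduce: merge two sets of counts."""
    return left + right


# Hypothetical documents, grouped by the partition that holds them.
partitions = [
    ["riak is a database", "riak scales out"],
    ["a database stores data"],
]

# Each partition maps and reduces its own documents first...
partials = [
    reduce(reduce_phase, (map_phase(doc) for doc in docs), Counter())
    for docs in partitions
]
# ...and the partial results are then reduced into a final answer.
totals = reduce(reduce_phase, partials, Counter())

print(totals["riak"], totals["database"])  # 2 2
```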

Riak also provides Riak Search, a full-text search engine that indexes documents on write and provides an easy, robust query language and Solr-like API. Riak Search is ideal for indexing content like posts, user bios, articles, and other documents, as well as indexing JSON data. For more information, see the documentation on Riak Search.

Secondary indexing allows you to tag objects in Riak with one or more queryable values. These “tags” can then be queried by exact or range value for integers and strings. Secondary indexing is great for simple tagging and searching Riak objects for additional attributes. Check out more details here.
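
The tagging model can be sketched with a small in-memory index in Python. This is an illustration of the concept rather than Riak's implementation or client API. The class is hypothetical, and the index names follow Riak's convention of a `_bin` suffix for string indexes and `_int` for integer indexes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

IndexValue = Union[int, str]


class ToyIndex:
    """In-memory stand-in for secondary indexes (illustrative only)."""

    def __init__(self) -> None:
        # index name -> list of (indexed value, object key)
        self._entries: Dict[str, List[Tuple[IndexValue, str]]] = defaultdict(list)

    def tag(self, key: str, index: str, value: IndexValue) -> None:
        self._entries[index].append((value, key))

    def exact(self, index: str, value: IndexValue) -> List[str]:
        return sorted(k for v, k in self._entries[index] if v == value)

    def between(self, index: str, low: IndexValue, high: IndexValue) -> List[str]:
        # Both ends of the range are included in this sketch.
        return sorted(k for v, k in self._entries[index] if low <= v <= high)


idx = ToyIndex()
idx.tag("user1", "email_bin", "alice@example.com")
idx.tag("user1", "age_int", 31)
idx.tag("user2", "age_int", 45)
idx.tag("user3", "age_int", 27)

print(idx.exact("email_bin", "alice@example.com"))  # ['user1']
print(idx.between("age_int", 30, 50))               # ['user1', 'user2']
```

Queries return object keys, which the application then fetches with ordinary key/value reads.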

How does Riak differ from other databases?

We often get asked how Riak is different from other databases and other technologies. While an in-depth analysis is outside the scope of this post, the resources below should point you in the right direction.

Riak is often used by applications and companies with a primary background in relational databases, such as MySQL. Most people who move from a relational database to Riak cite a few reasons. For one, Riak’s masterless, fault-tolerant, read/write available design makes it a better fit for data that must be highly available and resilient to failure scenarios. Second, Riak’s operational profile and use of consistent hashing mean data is automatically redistributed as you add machines, avoiding hot spots in the database and manual resharding efforts. Riak is also chosen over relational databases for the multi-datacenter capabilities provided in Riak Enterprise. A more detailed look at the difference between Riak and traditional databases and how to make the switch can be found in this whitepaper, From Relational to Riak.
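
For readers new to consistent hashing, here is a simplified Python sketch of the idea. It is not Riak's ring code. Hashing the string "bucket/key" and handing out partitions round-robin are simplifying assumptions, and the function names are invented for the example.

```python
import hashlib
from typing import List

RING_TOP = 2 ** 160  # size of the SHA-1 output space


def partition_for(bucket: str, key: str, num_partitions: int) -> int:
    """Hash a bucket/key pair onto the ring and return its partition index."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * num_partitions // RING_TOP


def assign_round_robin(nodes: List[str], num_partitions: int) -> List[str]:
    """owners[i] is the node responsible for partition i (simplified claim)."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def preference_list(bucket: str, key: str, owners: List[str], n_val: int = 3) -> List[str]:
    """Nodes holding the n_val replicas: owners of consecutive partitions."""
    start = partition_for(bucket, key, len(owners))
    return [owners[(start + i) % len(owners)] for i in range(n_val)]


owners = assign_round_robin(["node1", "node2", "node3", "node4"], 64)
print(partition_for("users", "alice@example.com", 64))
print(preference_list("users", "alice@example.com", owners))
```

The point to notice is that a key's partition depends only on its hash and the number of partitions. Adding a machine changes which node owns some partitions, but the application never has to re-key or reshard its data.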

A more detailed look at the technical differences between Riak and other NoSQL databases can be found in the comparisons section of our documentation, which covers databases such as MongoDB, Couchbase, Neo4j, Cassandra, and others.

Ready to get started? You can download Riak here. For more in-depth information about Riak, we also offer Riak Workshops in New York and San Francisco. Learn more here.

Basho

Getting Started with Riak in the Cloud

April 10, 2013

Earlier this year, we announced that hosted Riak is now available on the Engine Yard platform. Ines Sombra, Lead Data Engineer at Engine Yard, has put together a talk to help you get started using Riak in cloud environments. The talk introduces Riak’s overall architecture and some common use cases, and goes over questions to consider when choosing a database. It also discusses what you need to know to run Riak in the cloud and how it differs from a traditional hardware installation.

You can view the full talk below:

For more information about Riak on Engine Yard, check out our blog post or get started now with 500 hours free on their platform.

Basho

Riak on Engine Yard

January 25, 2013

Today we’re excited to introduce early access of Riak on Engine Yard! You can also learn more on the Engine Yard blog. With Riak on Engine Yard, you can deploy a Riak cluster as simply as defining some configuration values and clicking “Add Cluster.”

A common theme in several of our recent blog posts has been Basho’s key focus on ease of deployment. We excel in making highly available, low latency, distributed systems. Engine Yard’s strengths lie in providing a hardened and secure Platform as a Service where you can manage your entire platform while retaining control of the environment. In addition, Engine Yard is well known for its contributions to the Ruby, PHP, and Node.js communities. The introduction of Riak on Engine Yard further validates customer demand for reliable and easy to use cloud solutions.

If you were at RICON 2012, you were probably one of the many who attended a talk entitled “Riak in the Cloud.” If you were unable to attend, you missed an amazing session where Ines Sombra and Michael Broadhead from Engine Yard spoke about their experiences with Riak and deploying it in the cloud. It’s great to see the lessons of distributed systems that were discussed translated into reality.

We look forward to seeing what the Basho community builds using Riak on Engine Yard. Get started now with 500 hours for free on their platform.

Basho

Lineup and Location for September Riak Meetup

September 9, 2010

At long last we have all the details ironed out for the upcoming September Riak Meetup in San Francisco. The crew here in SF is quite excited about this month’s event, and here’s why:

Date: Thursday, Sept. 23rd

Time: 7-9

Location: Engine Yard Offices, located at 500 Third Street, Suite 510

Schedule:

  • 7:15 – Riak Basics

    After the first meetup, one of the attendees remarked, “Good, but looking for some basics and some hands on demo as well.” Admittedly, this is something we could have addressed a bit better. So at the beginning of this meetup (as well as all meetups moving forward) we are going to devote at least 15 minutes to discuss Riak basics. There are no stupid questions. Ask away.

  • 7:30 – Riak vs Git: NOSQL Battle Royale

    Presenter: Rick Olson, GitHub

    This talk will compare and contrast Riak and Git on their merits as key/value stores, and look at how the two can work together.

  • 8:00 – From Riak to RabbitMQ

    Presenter: Andy Gross, Basho Technologies

    This will cover using Riak to publish to RabbitMQ using post-commit hooks and gen_bunny.

  • 8:30 – General Riak/Distributed Systems Conversation and Networking

Note: There is only seating for 50, so you’ll want to get there on time to secure a seat.

Basho will be providing food (pizza) and refreshments (beer, soda, etc.). And for those of you who can’t join us next Thursday, I will also be filming the talks with the goal of posting them online if everything goes to plan.

You can RSVP on the Riak Meetup Page. So go do it. Now!

Hope to see you there.

Mark