Tag Archives: Erlang

statebox, an eventually consistent data model for Erlang (and Riak)

May 13, 2011

This was originally posted by Bob Ippolito on May 9th on the Mochi Media Labs Blog. If you’re planning to comment, please do so on the original post.

A few weeks ago, when I was on call at work, I was chasing down a bug in friendwad [1] and I realized that we had made a big mistake. The data model was broken: it could only work correctly with transactions, but we were using Riak. The original prototype was built with Mnesia, which would’ve been able to satisfy this constraint, but when it was refactored for an eventually consistent data model it just wasn’t correct anymore. Given just a little bit of concurrency, such as a popular user, it would produce inconsistent data. Soon after this discovery, I found another service built on the same invalid premise, and I also realized that a general solution to this problem would allow us to migrate several applications from Mnesia to Riak.

When you choose an eventually consistent data store you’re prioritizing availability and partition tolerance over consistency, but this doesn’t mean your application has to be inconsistent. What it does mean is that you have to move your conflict resolution from writes to reads. Riak does almost all of the hard work for you [2], but if it’s not acceptable to discard some writes then you will have to set allow_mult to true on your bucket(s) and handle siblings [3] from your application. In some cases, this might be trivial. For example, if you have a set and only support adding to that set, then a merge operation is just the union of those two sets.
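The set-union case is worth making concrete. A quick sketch in Python (used here for brevity; the idea is language-agnostic):

```python
def merge_siblings(siblings):
    """Resolve conflicting set-valued siblings by taking their union.

    Union is commutative, associative, and idempotent, so the result is
    the same no matter how many siblings exist or in what order the
    store hands them back.
    """
    merged = set()
    for value in siblings:
        merged |= value
    return merged

# Two clients added different members concurrently, producing siblings:
print(sorted(merge_siblings([{"bob"}, {"charlie"}])))  # → ['bob', 'charlie']
```

Any merge function with those three properties can resolve siblings safely; union is simply the easiest example.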

statebox is my solution to this problem. It bundles a value with repeatable operations [4] and provides a means to automatically resolve conflicts. Usage of statebox feels much more declarative than imperative: instead of modifying the values yourself, you provide statebox with a list of operations and it will apply them to create a new statebox. This is necessary because it may apply an operation again at a later time, when resolving a conflict between siblings on read.
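To make that concrete, here is a toy model of the idea in Python. The names mirror statebox’s, but this is an illustrative sketch, not the Erlang library’s API: a statebox pairs a value with a timestamped queue of repeatable operations, and merging siblings replays the union of their queues, oldest first, onto one of the values.

```python
class StateBox:
    """Toy statebox: a value plus a queue of (timestamp, op) pairs.

    Each op must be repeatable (idempotent): op(op(v)) == op(v), so it
    is always safe to re-apply it during conflict resolution.
    """
    def __init__(self, value, queue=()):
        self.value = value
        self.queue = list(queue)

    def modify(self, timestamp, op):
        """Apply op now and remember it for later replays."""
        return StateBox(op(self.value), self.queue + [(timestamp, op)])

    @staticmethod
    def merge(siblings):
        """Resolve siblings by replaying every queued op in timestamp order."""
        ops = sorted({entry for s in siblings for entry in s.queue},
                     key=lambda entry: entry[0])
        value = siblings[0].value  # re-applying its own ops is harmless
        for _, op in ops:
            value = op(value)
        return StateBox(value, ops)

def f_union(key, members):
    """Repeatable op: union members into the set stored under key."""
    members = frozenset(members)
    def op(d):
        d = dict(d)
        d[key] = d.get(key, frozenset()) | members
        return d
    return op

empty = StateBox({})
ab = empty.modify(3, f_union("following", ["bob"]))
ba = empty.modify(4, f_union("followers", ["bob"]))
ac = StateBox.merge([ab, ba]).modify(6, f_union("following", ["charlie"]))
print(sorted(ac.value["following"]))  # → ['bob', 'charlie']
```

Because every op is idempotent, replaying an operation that one sibling already applied cannot corrupt the merged value.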

Design goals (and non-goals):

  • The intended use case is for data structures such as dictionaries and sets
  • Direct support for counters is not required
  • Applications must be able to control the growth of a statebox so that it does not grow indefinitely over time
  • The implementation need not support platforms other than Erlang and the data does not need to be portable to nodes that do not share code
  • It should be easy to use with Riak, but not be dependent on it (clear separation of concerns)
  • Must be comprehensively tested; mistakes at this level are very expensive
  • It is ok to require that the servers’ clocks are synchronized with NTP (but the implementation should tolerate timestamps that are slightly in the future or past)

Here’s what typical statebox usage looks like for a trivial application (note: Riak metadata is not merged [5]). In this case we are storing an orddict in our statebox, and this orddict has the keys following and followers.

-export([add_friend/2, get_friends/1]).

-define(BUCKET, <<"friends">>).
-define(STATEBOX_MAX_QUEUE, 16). %% Cap on max event queue of statebox
-define(STATEBOX_EXPIRE_MS, 300000). %% Expire events older than 5 minutes
-define(RIAK_HOST, "").
-define(RIAK_PORT, 8087).

-type user_id() :: atom().
-type orddict(T) :: [T].
-type ordsets(T) :: [T].
-type friend_pair() :: {followers, ordsets(user_id())} |
                       {following, ordsets(user_id())}.

-spec add_friend(user_id(), user_id()) -> ok.
add_friend(FollowerId, FolloweeId) ->
    statebox_riak:apply_bucket_ops(
        ?BUCKET,
        [{[friend_id_to_key(FollowerId)],
          statebox_orddict:f_union(following, [FolloweeId])},
         {[friend_id_to_key(FolloweeId)],
          statebox_orddict:f_union(followers, [FollowerId])}],
        connect()).

-spec get_friends(user_id()) -> [] | orddict(friend_pair()).
get_friends(Id) ->
    statebox_riak:get_value(?BUCKET, friend_id_to_key(Id), connect()).

%% Internal API

connect() ->
    {ok, Pid} = riakc_pb_client:start_link(?RIAK_HOST, ?RIAK_PORT),
    connect(Pid).

connect(Pid) ->
    statebox_riak:new([{riakc_pb_client, Pid},
                       {max_queue, ?STATEBOX_MAX_QUEUE},
                       {expire_ms, ?STATEBOX_EXPIRE_MS},
                       {from_values, fun statebox_orddict:from_values/1}]).

friend_id_to_key(FriendId) when is_atom(FriendId) ->
    %% NOTE: You shouldn't use atoms for this purpose, but it makes the
    %% example easier to read!
    atom_to_binary(FriendId, utf8).

To show how this works a bit more clearly, we’ll use the following sequence of operations:

add_friend(alice, bob), %% AB
add_friend(bob, alice), %% BA
add_friend(alice, charlie). %% AC

Each of these add_friend calls can be broken up into four separate atomic operations, demonstrated in this pseudocode:

%% add_friend(alice, bob)
Alice = get(alice),
put(update(Alice, following, [bob])),
Bob = get(bob),
put(update(Bob, followers, [alice])).


Realistically, these operations may happen with some concurrency and cause conflicts. For demonstration purposes we will have AB happen concurrently with BA, and the conflict will be resolved during AC. For simplicity, I’ll only show the operations that modify the key for alice:

AB = get(alice), %% AB (Timestamp: 1)
BA = get(alice), %% BA (Timestamp: 2)
put(update(AB, following, [bob])), %% AB (Timestamp: 3)
put(update(BA, followers, [bob])), %% BA (Timestamp: 4)
AC = get(alice), %% AC (Timestamp: 5)
put(update(AC, following, [charlie])). %% AC (Timestamp: 6)

Timestamp 1:

There is no data for alice in Riak yet, so
statebox_riak:from_values([]) is called and we get a statebox
with an empty orddict.

Value = [],
Queue = [].

Timestamp 2:

There is no data for alice in Riak yet, so
statebox_riak:from_values([]) is called and we get a statebox
with an empty orddict.

Value = [],
Queue = [].

Timestamp 3:

Put the updated AB statebox to Riak with the updated value.

Value = [{following, [bob]}],
Queue = [{3, {fun op_union/2, following, [bob]}}].

Timestamp 4:

Put the updated BA statebox to Riak with the updated value. Note
that this will be a sibling of the value stored by AB.

Value = [{followers, [bob]}],
Queue = [{4, {fun op_union/2, followers, [bob]}}].

Timestamp 5:

Uh oh, there are two stateboxes in Riak now… so
statebox_riak:from_values([AB, BA]) is called. This will apply
all of the operations from both of the event queues to one of the
current values and we will get a single statebox as a result.

Value = [{followers, [bob]},
         {following, [bob]}],
Queue = [{3, {fun op_union/2, following, [bob]}},
         {4, {fun op_union/2, followers, [bob]}}].

Timestamp 6:

Put the updated AC statebox to Riak. This will resolve the siblings
created when BA's put at Timestamp 4 conflicted with AB's put at
Timestamp 3.

Value = [{followers, [bob]},
         {following, [bob, charlie]}],
Queue = [{3, {fun op_union/2, following, [bob]}},
         {4, {fun op_union/2, followers, [bob]}},
         {6, {fun op_union/2, following, [charlie]}}].
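The final state above can be reproduced mechanically. A small Python sketch (a stand-in for the Erlang internals, with ops modeled as (timestamp, key, members) union entries):

```python
def apply_ops(value, ops):
    """Fold a timestamp-ordered queue of union ops over an orddict-like dict."""
    for _, key, members in sorted(ops):
        value = dict(value)
        value[key] = sorted(set(value.get(key, [])) | set(members))
    return value

# The merged event queue at Timestamp 5, plus AC's op at Timestamp 6:
queue = [
    (3, "following", ["bob"]),      # from AB's sibling
    (4, "followers", ["bob"]),      # from BA's sibling
    (6, "following", ["charlie"]),  # added by AC
]
print(apply_ops({}, queue))
# → {'following': ['bob', 'charlie'], 'followers': ['bob']}
```

Replaying from an empty value yields exactly the Value shown at Timestamp 6, which is why re-applying queued operations is a safe way to resolve siblings.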

Well, that’s about it! alice is following both bob and charlie despite the concurrency. No locks were harmed during this experiment, and we’ve arrived at eventual consistency by using statebox_riak, statebox, and Riak without having to write any conflict resolution code of our own.


And if you’re at all interested in getting paid to do stuff like this, Mochi is hiring.


[1] friendwad manages our social graph for Mochi Social and MochiGames.
It is also evidence that naming things is a hard problem in
computer science.
[2] See Basho’s articles on Why Vector Clocks are Easy and
Why Vector Clocks are Hard.
[3] When multiple writes happen to the same place and they have
branching history, you’ll get multiple values back on read.
These are called siblings in Riak.
[4] An operation F is repeatable if and only if F(V) = F(F(V)).
You could also call this an idempotent unary operation.
[5] The default conflict resolution algorithm in statebox_riak
chooses metadata from one sibling arbitrarily. If you use
metadata, you’ll need to come up with a clever way to merge it
(such as putting it in the statebox and specifying a custom
resolve_metadatas in your call to statebox_riak:new/1).

Two new Erlang/OTP applications added to Riak products – 'riak_err' and 'cluster_info'

November 17, 2010

The next release of Riak will include two new Erlang/OTP applications: riak_err and cluster_info. The riak_err application will improve Riak’s runtime robustness by strictly limiting the amount of RAM that is used while processing event log messages. The cluster_info application will assist troubleshooting by automatically gathering lots of environment, configuration, and runtime statistics data into a single file.

Wait a second, what are OTP applications?

The Erlang virtual machine provides most of the services that an operating system like Linux or Windows provides: memory management, file system management, TCP/IP services, event management, and the ability to run multiple applications. Most modern operating systems allow you to run a Web browser, word processor, spreadsheet, instant messaging app, and many others. And if your email GUI app crashes, your other applications continue running without interference.

Likewise, the Erlang virtual machine supports running multiple applications. Here’s the list of applications that are running within a single Riak node — we ask the Erlang CLI to list them for us.

(riak@> application:which_applications().
[{cluster_info,"Cluster info/postmortem app","0.01"},
{skerl,"Skein hash function NIF","0.1"},
{riak_kv,"Riak Key/Value Store","0.13.0"},
{riak_core,"Riak Core","0.13.0"},
{luke,"Map/Reduce Framework","0.2.2"},
{mochiweb,"MochiMedia Web Server","1.7.1"},
{erlang_js,"Interface between BEAM and JS","0.4.1"},
{runtime_tools,"RUNTIME_TOOLS version 1","1.8.3"},
{crypto,"CRYPTO version 1","1.6.4"},
{os_mon,"CPO CXC 138 46","2.2.5"},
{riak_err,"Custom error handler","0.1.0"},
{sasl,"SASL CXC 138 11","2.1.9"},
{stdlib,"ERTS CXC 138 10","1.16.5"},
{kernel,"ERTS CXC 138 10","2.13.5"}]

Yes, that’s 17 different applications running inside a single node. For each item in the list, we’re told the application’s name, a human-readable name, and that application’s version number. Some of the names like ERTS CXC 138 10 are names assigned by Ericsson.

Each application is, in turn, a collection of one or more processes that provide some kind of computation service. Most of these processes are arranged in a “supervisor tree”, which makes the task of managing faults (e.g., if a worker process crashes, what do you do?) extremely easy. Here is the process tree for the kernel application.

And here is the process tree for the riak_kv application.

The riak_err application

See the GitHub README for riak_err for more details.

The Erlang/OTP runtime provides a useful mechanism for managing all of the info, error, and warning events that an application might generate. However, the default handler uses some not-so-smart methods for making human-friendly message strings.

The big problem is that the representation used internally by the virtual machine is a linked list, one list element per character, to store the string. On a 64-bit machine, that’s 16 bytes of RAM per character. Furthermore, if the message contains non-printable data (i.e., not ASCII or Latin-1 characters), the data will be formatted into numeric representation. The string “Bummer” would be formatted just like that, Bummer. But if each character in that string had the constant 140 added to it, the 6-byte string would be formatted as the 23-byte string 206,257,249,249,241,254 instead (an increase of about 4x). And, in rare but annoying cases, there’s some additional expansion on top of all of that.
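A quick back-of-the-envelope check of that expansion, sketched in Python (the io_lib-style formatting is approximated here for illustration):

```python
def formatted_size(s, shift=0):
    """Characters needed to print a string roughly the way Erlang's
    default handler would: printable ASCII stays as characters; anything
    else becomes a comma-separated list of integer codepoints."""
    codes = [ord(c) + shift for c in s]
    if all(32 <= c <= 126 for c in codes):       # printable: emit as-is
        return len(codes)
    return len(",".join(str(c) for c in codes))  # e.g. "206,257,..."

print(formatted_size("Bummer"))             # → 6
print(formatted_size("Bummer", shift=140))  # → 23

# On a 64-bit VM, holding the string as a linked list costs 16 bytes per
# character, so a 1 MB message is ~16 MB in RAM before formatting, and a
# ~4x numeric expansion of the output pushes the total toward 64 MB.
```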

The default error handler can take a one megabyte error message and use over 1 megabyte * 16 * 4 = 64 megabytes of RAM. Why should error messages be so large? Depending on the nature of a user’s input (e.g. a large block of data from a Web client), the process’s internal state, and other factors, error messages can be much, much bigger than 1MB. And it’s really not helpful to consume half a gigabyte of RAM (or more) just to format one such message. When a system is under very heavy load and tries to format dozens of such messages, the entire virtual machine can run out of RAM and crash.

The riak_err OTP application replaces about 90% of the default Erlang/OTP info/error/warning event handling mechanism. The replacement handler places strict limits on the maximum size of a formatted message. So, if you want to limit the maximum length of an error message to 64 kilobytes, you can. The result is that it’s now much more difficult to get Riak to crash due to error message handling. It makes us happy, and we believe you’ll be happier, too.
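The effect of such a cap can be sketched in Python (simplified and hypothetical: riak_err's real truncating formatter bounds output while it is being built, rather than building the oversized string first):

```python
def format_capped(fmt, args, max_len=64 * 1024):
    """Format a log message but cap the result at max_len characters.

    Simplified: this still materializes the full string before cutting
    it. A production truncating formatter stops producing output once
    the budget is spent, which is what actually keeps RAM bounded.
    """
    msg = fmt % tuple(args)
    if len(msg) > max_len:
        msg = msg[:max_len] + "...(truncated)"
    return msg

huge = "x" * (1024 * 1024)  # a 1 MB argument from some unlucky request
print(len(format_capped("error: %s", [huge])))  # ~64 KB instead of ~1 MB
```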

Licensing for the riak_err application

The riak_err application was written by Basho Technologies, Inc. and is licensed under the Apache License, version 2.0. We’re very interested in bug reports and fixes: our mailbox is open 24 hours per day for GitHub comments and pull requests.

The cluster_info application

The cluster_info application is included in the packaging for the Hibari key-value store, which is also written in Erlang. It provides a flexible and easily-extendible way to dump the state of a cluster of Erlang nodes.

Some of the information that the application gathers includes:

  • Date & time
  • Statistics on all Erlang processes on the node
  • Network connection details to all other Erlang nodes
  • Top CPU- and memory-hogging processes
  • Processes with large mailboxes
  • Internal memory allocator statistics
  • ETS table information
  • The names & versions of each code module loaded into the node

The app can also automatically gather all of this data from all nodes and write it into a single file. It’s about as easy as can be to take a snapshot of all nodes in a cluster. It will be a valuable tool for Basho’s support and development teams to diagnose problems in a cluster, as a tool to aid capacity planning, and merely to answer a curious question like, “What’s really going on in there?”

Over time, Basho will be adding more Riak-specific info-gathering functions. If you’ve got feature suggestions, (e.g., dump stats on how many times you woke up last night, click here to send all this data to Basho’s support team via HTTP or SMTP), we’d like to hear them. Or, if you’re writing the next game-changing app in Erlang, just go ahead and hack the code to fit your needs.

Licensing for the cluster_info application

The cluster_info application was written by Gemini Mobile Technologies, Inc. and is licensed under the Apache License, version 2.0.


Erlang Factory London Recap

June 14, 2010

This was originally posted by @rklophaus on his blog, rklophaus.com.

Erlang Factory London gathers Erlang pioneers from across the world—Berlin to Boston, Krakow to Cordoba, and San Francisco to Shanghai—for a two-day conference of innovative Erlang development.

The summaries below are just a small sampling of the talks at Erlang Factory London. There were three tracks running back-to-back for two days, and I often couldn’t decide which of the three to attend. Slides and videos will be released by Erlang Solutions, and can be found under individual track pages on the Erlang Factory website.

Day 1 – June 10, 2010

Opening Session

Francesco Cesarini (Chief Strategy Officer, Erlang Solutions Ltd.) began the conference with a warm welcome and a quick review of progress made by Erlang-based companies in the last year.

Some highlights:

The History of the Erlang Virtual Machine – Joe Armstrong, Robert Virding

Joe Armstrong and Robert Virding gave a colorful, back-and-forth history of Erlang’s birth and early years. A few notable milestones and achievements:

  • Joe’s early work on reduction machines. Robert’s complete rewrite of Joe’s work. Joe’s complete rewrite of Robert’s work. (etc.)
  • How Erlang was almost based on Smalltalk rather than Prolog
  • The quest to make Erlang 80 times faster
  • Experiments with different memory management and garbage collection schemes
  • The train set used to demonstrate Erlang, now in Robert’s basement
  • The addition of linked processes, distribution, OTP, and bit syntax

It’s easy to take a language like Erlang for granted and assume that its builders followed some well-known, pre-ordained path. Hearing Erlang’s history from two of its main creators provided an excellent reminder that building software is both an art and a science, uncertain and exciting like any creative process.

Riak from the Inside – Justin Sheehy

Justin Sheehy (CTO of Basho Technologies) opened his talk by introducing Riak, “a scalable, highly-available, networked, open-source key/value store.” He then very quickly announced that he wasn’t there to talk about using Riak; he was there to talk about how Riak was built using Erlang and OTP.

There are eight distinct layers involved in reading/writing Riak data:

  • The Client Application using Riak
  • The client-side HTTP API or Protocol Buffers API that talks to the Riak cluster
  • The server-side Riak Client containing the combined backing code for both APIs
  • The Dynamo Model FSMs that interact with nodes using Dynamo-style quorum behavior and conflict resolution
  • Riak Core, which provides the fundamental distribution of the system (not covered in the talk)
  • The VNode Master that runs on every physical node, and coordinates incoming interaction with individual VNodes
  • Individual VNodes (Virtual Nodes) which are treated as lightweight local abstractions over K/V storage
  • The swappable Storage Engine that persists data to disk

During his talk, Justin discussed each layer’s responsibilities and interactions with the layers above and below it.

Justin’s main point was that carefully managed complexity in the middle layers allows for simplicity at the edge layers. The top three layers present a simple key/value interface, and the bottom two layers implement a simple key/value store. The middle layers (the FSMs, Riak Core, and the VNode Master) work together to provide scalability, replication, and so on. Erlang makes this possible, and was chosen because it provides a platform that evolves in useful and relatively predictable ways (predictable evolution is a good thing; surprising evolution is bad).

Mnesia for the CAPper – Ulf Wiger

Ulf Wiger (CTO of Erlang Solutions) discussed where Mnesia might fit into the changing world of databases, given the new focus on “NoSQL” solutions. Ulf gave a quick introduction to ACID properties, Brewer’s CAP theorem, and the history of Mnesia, and then dove into a feature level description/comparison of Mnesia with other databases:

  • Deployed commercially for over 10 years
  • Performance comparable to the current top performers in the clustered SQL space
  • Scalable to 50 nodes
  • Distributed transactions with loose time limits (in other words, appropriate for transactions across remote clusters)
  • Built-in support for sharding (fragments)
  • Incremental backup

The downsides are:

  • Erlang only interface
  • Tables limited to 2GB
  • Deadlock prevention scales poorly
  • Network partitions are not handled automatically; tables must be recombined manually

Ulf and others have done work to get around some of these limitations. Ulf showed code for an extension to Mnesia that automatically merges tables after they have split, using vector clocks.

Riak Search – Rusty Klophaus

I presented Riak Search, a distributed indexing and full-text search engine built on (and complementary to) Riak.

Part one covered the main reason for building Riak search: clients have built applications that eventually need to find data by value, not just by key. This is difficult, if not impossible, in a key/value store.

Part two described the shape of the final solution we set out to create. The goal of Riak Search is to support the Lucene interface, with Lucene syntax support and Solr endpoints, but with the operations story of Riak. This means that Riak Search will scale easily by adding new machines, and will continue to run after machine failure.

Part three was an introduction to Inverted Indexing, which is the heart of all search systems, as well as the difference between Document-Partitioning and Term-Partitioning, which forms the ongoing battle in the distributed search field. Part three continued with a deep-dive into parsing, planning, and executing the search query in Erlang.
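The structures named here are compact enough to sketch. A toy inverted index and term-partitioning scheme in Python (illustrative only, not Riak Search's implementation):

```python
from collections import defaultdict

def inverted_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def term_partition(index, num_nodes):
    """Term-partitioning: each node owns a subset of the terms, so a
    query only touches the nodes owning its terms. (Document-partitioning
    would instead give each node a subset of the documents and fan every
    query out to all nodes.)"""
    nodes = [dict() for _ in range(num_nodes)]
    for term, ids in index.items():
        nodes[hash(term) % num_nodes][term] = ids
    return nodes

docs = {1: "riak is a key value store",
        2: "riak search finds data by value"}
idx = inverted_index(docs)
print(sorted(idx["value"]))  # → [1, 2]
```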

Slides: http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010

Building a Scalable E-commerce Framework – Michael Nordström and Daniel Widgren

Michael Nordström and Daniel Widgren presented an Erlang-based e-commerce framework on behalf of their project team from Uppsala University (Christian Rennerskog, Shahzad Gul, Nicklas Nordenmark,
Manzoor Ahmad Mubashir, Mikael Nordström, Kim Honkaniemi, Tanvir Ahmad, Yujuan Zou, and Daniel Widgren) and their industrial partner, Klarna AB.

The application uses a “LERN stack” (Linux, Erlang, Riak, Nitrogen), to provide a reusable web shop that can be quickly set up by clients, customized via templates and themes, and extended via plugins to support different payment providers.

The project is currently undergoing a rewrite to update to the latest versions of Riak and Nitrogen.

GitHub: http://github.com/mino4071/CookieCart-2.0

Twitter: @Cookie_Cart

Clash of the Titans: Erlang Clusters and Google App Engine – Panos Papadopoulos, Jon Vlachoyiannis, Nikos Kakavoulis

Panos, Jon, and Nikos took turns describing the technical evolution of their startup, SocialCaddy, and why they were forced to move away from the Google App Engine. SocialCaddy is a tool that mines your online profiles for important events and changes, and tells you about them. For example, if a friend gets engaged, SocialCaddy will tell you about it, and assist you in sending a congratulatory note.

Google App Engine imposes a 30 second limit on requests. As SocialCaddy processed larger and larger social graphs, they bumped into this limit, which made GAE unusable as a platform. In response, the team developed Erlust, which allows you to submit jobs (written in any language) to a cluster. An Erlang application coordinates the jobs, and each job should read from a queue, process messages, and write to another queue.
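The job contract described here (read a message from one queue, process it, write the result to another) is easy to sketch. A minimal Python stand-in, with hypothetical names, not Erlust's actual API:

```python
from queue import Queue

def run_job(process, inbox, outbox):
    """Drain the inbox, apply process() to each message, emit to outbox.

    Because jobs only ever talk to queues, they can be written in any
    language; the coordinator just needs to move messages around.
    """
    while not inbox.empty():
        outbox.put(process(inbox.get()))

inbox, outbox = Queue(), Queue()
inbox.put("friend bob changed status to engaged")
run_job(str.upper, inbox, outbox)
print(outbox.get())  # → FRIEND BOB CHANGED STATUS TO ENGAGED
```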

Using Open-Source Trifork QuickCheck to test Erjang – Kresten Krab Thorup

Kresten Krab Thorup (CTO of Trifork) stirred up dust when he originally announced his intention to build a version of Erlang that ran on the JVM. Since then, he has made astounding progress. Erjang turns Erlang .beam files into Java .class files, now supporting a broad enough feature set to run Mnesia over distributed Erlang. Kresten claimed performance matching (or at times exceeding) that of the Erlang VM.

Erjang is still a work in progress; there are many BIFs that still need to be ported. But if a prototype exists to prove viability, then this prototype was certainly a success. One slide showed the spawn_link function reimplemented in roughly 15 lines of simple Java code.

For the second half of his talk, Kresten showed off Triq (short for Trifork QuickCheck), a scaled-down, open-source, QuickCheck-inspired testing framework that he built in order to test Erjang. Triq supports basic generators (called domains), value picking, and shrinking. Kresten showed examples of using Triq to validate that Erjang performs binary operations with exactly the same results as Erlang.

More information about Erjang here: http://wiki.github.com/krestenkrab/erjang/

Day 2 – June 11, 2010

Efene: A Programming Language for the Erlang VM – Mariano Guerra

Mariano Guerra presented Efene, a new language that is translated into Erlang source code. Efene is intended to coax into the world of Erlang those developers who might otherwise be intimidated by Erlang’s Prolog-inspired syntax. We’ve heard about a number of other projects that compile to Erlang byte-code (such as Reia and Lisp-Flavored Erlang), but Efene takes a different approach: the language is parsed and translated using Leex and Yecc into standard Erlang code, which is then compiled as normal. By doing this, Mariano leaves most of the heavy lifting of optimization to the existing Erlang compiler.

Efene actually supports two different syntax flavors, one with curly brackets and one without, leading to a syntax that feels vaguely like JavaScript or Python, respectively. (The syntax without curly brackets is called Ifene, for “Indented Efene”, and is otherwise identical to Efene.)

In some places, Efene syntax is a bit more verbose than Erlang’s; this is deliberate, to make the language more readable (“if” and “case” statements have more structure in Efene than in Erlang). In other places, Efene requires less typing; multi-claused function definitions, for example, don’t require you to repeat the function name.

Code samples and more information: http://marianoguerra.com.ar/efene

Erlang in Embedded Systems – Gustav Simonsson, Henrik Nordh, Fredrik Andersson, Fabian Bergstrom, Niclas Axelsson and Christofer Ferm

Gustav, Henrik, Fredrik, Fabian, Niclas, and Christofer (Uppsala University), in cooperation with Erlang Solutions, worked on a project to shrink the Erlang VM (plus the Linux platform on which it runs) down to the smallest possible footprint for use on Gumstix and BeagleBoard hardware.

The team experimented with OpenEmbedded and Angstrom, using BusyBox, uClibc, and stripped .beam files to further decrease the footprint. During the presentation, they played a video showing how to install Erlang on a Gumstix single-board computer in 5 minutes using their work.

More information about Embedded Erlang here: http://embedded-erlang.org

Zotonic: Easy Content Management with Erlang’s Performance and Flexibility – Marc Worrell

Marc Worrell (WhatWebWhat) breaks CMSs down into three generations:

  • 1st Generation – Static text and images
  • 2nd Generation – Database- and template-driven systems (covers current CMS systems)
  • 3rd Generation – Highly interactive, real-time, personalized data exchanges and frameworks

Zotonic is aimed squarely at the third generation: it turns a CMS into a living, breathing thing, where modules on a page talk to each other and to other sessions via comet, and where the system can be easily extended, blurring the line between CMS and application framework.

This interactivity is what motivated Marc to write the system in Erlang; at one point he compared the data flowing through the system to a telephone exchange. Zotonic uses Webmachine, Mochiweb, ErlyDTL, and a number of other Erlang libraries, with data stored in PostgreSQL. (Marc also mentioned Nitrogen as an early inspiration for Zotonic; parts of Zotonic are based on Nitrogen code, though much has been rewritten.)

The data model is physically simple, with emergent functionality. A site is developed in terms of objects (called pages) interlinked with other objects. In other words, from a data perspective, adding an image to a web page is the same as linking from a page to a subpage, or tagging a page with an author. Marc gave a live demo of Zotonic’s ability to easily add and change menu structures, modify content, and add and re-order images. Almost everything can be customized using ErlyDTL templates. Very polished stuff.

Marc then introduced his goal of “Elastic Zotonic”, a Zotonic that can scale in a distributed, fault-tolerant, “buzzword-compliant” way, which will involve changes to the datastore and some of the architecture.

Marc is now working with Maximonster to develop an education-oriented social network on top of Zotonic.

More information: http://zotonic.com

Closing Session

Francesco (CSO, Erlang Solutions Ltd.) thanked the sponsors, presenters, and audience. He then gave a big special thanks to Frank Knight and Joanna Włodarczyk, who both worked tirelessly to organize the conference and make everything go smoothly.

Final Thoughts

Erlang is gaining momentum in the industry as a platform that enables you to solve distributed, massively concurrent problems. People aren’t flocking directly to Erlang itself; they are instead flocking to projects built in Erlang, such as RabbitMQ, ejabberd, CouchDB, and of course, Riak. At the same time, other languages are adopting some of the key features that make Erlang special, including a message-passing architecture and lightweight threads.

Enormous opportunity as the relational database model begins to fail under the strain of Big Data and the Internet

CAMBRIDGE, MA – March 30 – Erlang Solutions, Ltd., the leading provider of Erlang services, and Basho Technologies, the makers of Riak, a high-availability data store written in Erlang, today announced they have entered into a multi-faceted partnership to deliver scalable, fault-tolerant applications built on Riak to the global market. Erlang/OTP, an Ericsson open source technology, has found application in a new generation of fault-tolerant systems at companies like Facebook and Amazon.

“Erlang Solutions constantly seeks out new technologies and services to bring to its clients,” said Marcus Taylor, CEO of Erlang Solutions. “With Basho, we add not just the Riak software, but a partner to help us build an ecosystem of Erlang-based high-availability applications.”

The partnership has three major components: 1) both companies develop and support training and certification, 2) Erlang Solutions architects, load tests, and builds Riak-based applications for clients, and 3) Erlang Solutions provides Basho clients with training, systems design, prototyping, and development services.

“Erlang Solutions is globally recognized as the business thought leaders in the Erlang community,” said Earl Galleher, Basho Technologies Chairman and CEO. “Our current and future customers now have access to a new tier of professional services and we help Erlang Solutions push Erlang further into the mainstream market.”

With this partnership, Erlang Solutions now represents Basho Technologies in Europe for services and distribution. The teams will focus on high-growth markets like mobile telephony, social media, and e-commerce applications, where current relational database management system (RDBMS) solutions are struggling to keep up.

“Look at the explosive growth of SMS IM traffic,” said Francesco Cesarini, founder of Erlang Solutions, “and the cost to scale traditional infrastructure. Basho’s Riak helps clients contain these costs while increasing reliability. An ecosystem of high-availability solutions is forming and the relationship between Erlang Solutions and Basho Technologies will soon expand to include other partners and richer solutions.”

About Erlang Solutions Ltd

Founded in 1999, Erlang Solutions Ltd. (www.erlang-solutions.com) is an international company specialised in the open source language Erlang and its middleware, OTP. Erlang Solutions solves all your Erlang needs: training, certification, consulting, contracting, system development, support, and conferences. Erlang Solutions’ expert and certified consultants are the most experienced anywhere, with many having used Erlang since its early days. With offices in the UK, Sweden, and Poland and clients on six continents, Erlang Solutions is available for short- and long-term opportunities world-wide.

About Basho Technologies

Basho Technologies, Inc., founded in January 2008 by a core group of software architects, engineers, and executive leadership from Akamai Technologies, Inc. (Nasdaq: AKAM), is headquartered in Cambridge, Massachusetts. Basho produces Riak, a distributed data store that combines extreme fault tolerance, rapid scalability, and ease of use. Designed from the ground up to work with applications that run on the Internet and mobile networks, Riak is particularly well-suited for users of cloud infrastructure such as Amazon’s AWS and Joyent’s Smart platform and is available in both an open source and a paid commercial version. Current customers of Riak include Comcast Corporation, MIG-CAN, and Mochi Media.

Media Contacts
Earl Galleher
CEO, Basho Technologies, Inc.

Riak in Production – A Distributed Event Registration System Written in Erlang

March 20, 2010

Riak, at its core, is an open source project. So, we love the opportunity to hear from our users and find out where and how they are using Riak in their applications. It is for that reason that we were excited to hear from Chris Villalobos. He recently put a Distributed Event Registration application into production at his church in Gainesville, Florida, and after hearing a bit about it, we asked him to write a short piece about it for the Basho Blog.

Use Case and Prototyping

As a way of going paperless at our church, I was tasked with creating an event registration system, accessible via touchscreen kiosk, SMS, and our website, that members could use to sign up for various events. As I wanted to learn a new language and had dabbled in Erlang (specifically Mochiweb) for another small application, I decided to try to do the whole thing in Erlang. But how to do it, on a two-month timeline, was quite the challenge.

The initial idea was to have each kiosk independently hold pieces of the database, so that if something happened to a server or a kiosk, the data would still be available. I also wanted to use Erlang’s fault tolerance and distributed processing to help make sure that the various frontends were constantly running and online. And, as I wanted to stay as close to pure Erlang as possible, I decided early against a SQL database. I tried Mnesia, but I wasn’t happy with the results: using QLC as an interface, I ran into interesting issues whenever I took down a master node. (I was also facing a time crunch, so playing with it extensively wasn’t really an option.)

It just so happened that Basho released Riak 0.8 the morning I got fed up with it. So I thought about how I could use a key/value store. I liked how the Riak API made it simple to get data in and out of the database, how I could use map-reduce functionality to create any reports I needed and how the distribution of data worked out. Most importantly, no matter what nodes I knocked out while the cluster was running, everything just continued to click. I found my datastore.

During the initial prototyping stages for the kiosk, I envisioned a simple key/value store using a data model that looked something like this:

{key1, {Title, Icon, Background Image, Description, [signup_options]}},
{key2, {…}}

This design would enable me to present the user with a list of options when the kiosk started up. I found that by using Riak, this was simple to implement. I also appreciated that Riak was great at getting out of the way: I didn’t have to think about how it was going to work, I just knew that it would. (The primary issue I kept running into when I thought about future problems was sibling entries. If two users on two kiosks submit information at the same time for the same entry (potentially an issue as the number of kiosks grows), that would result in sibling entries because of the way user data is stored:

<>, <>, [user data]

But since the reports check for siblings when they are generated, this problem became a non-issue.)
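With allow_mult enabled, a read-time merge like the one described above can be a few lines of Erlang. This is only a sketch against the native Riak client API of the era (`riak:local_client/0`, `riak_object:get_values/1`); the module and function names are my own, and exact client calls may differ between versions.

```erlang
-module(signup_merge).
-export([merged_signups/2, merge_values/1]).

%% Fetch a signup entry and resolve any siblings at read time.
merged_signups(Bucket, Key) ->
    {ok, Client} = riak:local_client(),
    {ok, Obj} = Client:get(Bucket, Key, 2),
    merge_values(riak_object:get_values(Obj)).

%% Pure merge step: union of all sibling value lists,
%% with duplicates removed.
merge_values(SiblingValues) ->
    lists:usort(lists:append(SiblingValues)).
```

Because the merge step is a pure function, it can be applied again at any later read without changing the result.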

High Level Architecture

The kiosk is live and running now with very few kinks (mostly hardware) and everything is in pure Erlang. At a high level, the application architecture looks like this:

Each Touchscreen Kiosk:

  • wxErlang
  • Riak node

Web-Based Management/SMS Processing Layer:

  • Nitrogen Framework speaking to Riak for Kiosk Configuration/Reporting
  • Nitrogen/Mochiweb processing SMS messages from SMS aggregator

Periodic Email Sender:

  • Vagabond’s gen_smtp client in an eternal receive loop that sends the email report every 24 hours
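That email loop can be sketched in a few lines using `receive ... after`. Here `send_report/0` is purely a placeholder standing in for the real MapReduce-and-gen_smtp code:

```erlang
-module(daily_mailer).
-export([start/0, loop/0]).

-define(DAY_MS, 24 * 60 * 60 * 1000).

start() ->
    spawn(?MODULE, loop, []).

%% The "eternal receive" loop: sleep for 24 hours, send the
%% report, and recurse. A 'stop' message ends the loop.
loop() ->
    receive
        stop -> ok
    after ?DAY_MS ->
        send_report(),
        loop()
    end.

%% Placeholder for the real work: run the report MapReduce
%% and hand the formatted result to gen_smtp.
send_report() ->
    ok.
```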

In Production

Currently, we are running four Riak nodes (writing out to the filesystem backend) in addition to the three kiosks themselves. I also run Riak nodes on various Linux servers of mine, so I can use the spare CPU cycles to distribute MapReduce functions and store information redundantly.

By using Riak, I was able to keep the database lean and mean with creative uses of keys. Every asset for the kiosk is stored within Riak, including images. These are pulled only whenever a kiosk is started up or whenever an asset is created, updated, or removed (using message passing). If an image isn’t present on a local kiosk, it is pulled from the database and then stored locally. Also, all images and panels (such as the on-screen keyboard) are stored in memory to make things faster.
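The pull-on-miss behaviour for assets might look something like the following sketch. The bucket name, cache path, and client calls are illustrative, not the production code:

```erlang
-module(asset_cache).
-export([fetch/1]).

%% Return a local path for an asset, pulling it from Riak
%% only when it is not already cached on the kiosk's disk.
fetch(AssetKey) ->
    LocalPath = filename:join("assets", binary_to_list(AssetKey)),
    case filelib:is_regular(LocalPath) of
        true ->
            {ok, LocalPath};
        false ->
            {ok, Client} = riak:local_client(),
            {ok, Obj} = Client:get(<<"assets">>, AssetKey, 2),
            ok = filelib:ensure_dir(LocalPath),
            ok = file:write_file(LocalPath, riak_object:get_value(Obj)),
            {ok, LocalPath}
    end.
```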

All SMS messages are stored within an SMS bucket. Every 24 hours, the buckets are checked with a mapred_bucket job to see if there are any new messages since the last time the function ran. The results are formatted within the MapReduce function and emailed out using the gen_smtp client. As assets are removed from the system, their current data is written to a serialized text file and then removed from the database.
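A daily job along those lines might be sketched as below. The `sms_mr:map_new_since/3` map function is hypothetical, and the phase spec follows the native client’s `{map, {modfun, Mod, Fun}, Arg, Keep}` form; a running Riak node is assumed.

```erlang
%% Sketch: select SMS messages newer than a cutoff via MapReduce.
%% sms_mr:map_new_since/3 is a hypothetical map function that
%% keeps only objects written after the Since timestamp.
fetch_new_messages(Since) ->
    {ok, Client} = riak:local_client(),
    Query = [{map, {modfun, sms_mr, map_new_since}, Since, true}],
    {ok, Results} = Client:mapred_bucket(<<"sms">>, Query),
    Results.
```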

As I bring more kiosks into operation, the distributed MapReduce feature is becoming more valuable. Since I typically run reports during off hours, the kiosks aren’t overloaded by the extra processing. So far I have been able to roll out a new kiosk within two hours of receiving the hardware, most of which is spent installing and configuring the touchscreen. The system is also becoming more and more vital to how we interface with people, giving members multiple ways of contacting us at their convenience. I am planning to expand how I use the system, especially around code distribution. For example, with the Innostore interface, I might store the beam files inside Riak and send them to the kiosks using Erlang commands. (Version control inside Riak, anyone?)

What’s Next?

I have ambitious plans for the system, especially on the kiosk side. As this is a very beta version of the software, it is only currently in production in our little community. That said, I hope to open source it and put it on github/bitbucket/etc. as soon as I pretty up all the interfaces.

I’d say probably the best thing about this whole project is getting to know the people inside the Erlang community, especially the Basho people and the #erlang regulars on IRC. Anytime I had a problem, someone was there willing to work through it with me. Since I am essentially new to Erlang, it really helped to have a strong sense of community. Thank you to all the folks at Basho for giving me a platform to show what Erlang can do in everyday, out of the way places.

Chris Villalobos

Using Innostore with Riak

February 22, 2010

Innostore is an Erlang application that provides an API for storing and retrieving key/value data using the InnoDB storage system. This storage system is the same one used by MySQL for reliable, transactional data storage. It’s a proven, fast system and perfect for use with Riak if you have a large amount of data to store. Let’s take a look at how you can use Innostore as a backend for Riak.

(Note: I assume that you have successfully built an instance of Riak for your platform. If you built Riak from source in ~/riak, then set $RIAK to ~/riak/rel/riak.)

We first get started by grabbing a stable release of Innostore. You’ll need to download the source for a release from: https://github.com/basho/innostore

Looking in the “Tags & snapshots” section, you should download the source for the highest available RELEASE_* tag. In my case, RELEASE_4 is the most recent release, so I’ll grab the bz2 file associated with it.

Once I have the source code, it’s time to unpack it and build:

$ tar -xjf innostore-RELEASE_4.tar.bz2

$ cd innostore

$ make

Depending on the speed of the machine you are building on, this may take a few minutes to complete. At the end, you should see a series of unit tests run, with the output ending:

All 7 tests passed.

100222 7:43:58 InnoDB: Shutdown completed; log sequence number 90283

Cover analysis: /Users/dizzyd/src/public/innostore/.eunit/index.html

Now that we have successfully built Innostore, it’s time to install it into the Riak distribution:

$ ./rebar install target=$RIAK/lib

If you look in the $RIAK/lib directory now, you should see the innostore-4 directory alongside a bunch of .ez files and other directories which compose the Riak release.

Now, we need to tell Riak to use the Innostore driver as a backend. Make sure Riak is not running. Edit $RIAK/etc/app.config, setting the value for “storage_backend” as follows:

{storage_backend, innostore_riak},

In addition, append the configuration for the Innostore application after the SASL section:

{sasl, [ ....

]}, %% <-- make sure you add a comma here!!

{innostore, [

{data_home_dir, "data/innodb"}, %% Where data files go

{log_group_home_dir, "data/innodb"}, %% Where log files go

{buffer_pool_size, 2147483648} %% 2G in-memory buffer in bytes

]}
You may need to adjust data_home_dir and log_group_home_dir to point to where you want the inno data and log files stored. If possible, make sure that the data and log directories are on separate disks; this can yield much better performance.

Once you’ve completed the changes to $RIAK/etc/app.config, you’re ready to start Riak:

$ $RIAK/bin/riak console

As it starts up, you should see messages from Inno that end with something like:

100220 16:36:58 InnoDB: highest supported file format is Barracuda.

100220 16:36:58 Embedded InnoDB started; log sequence number 45764

That’s it! You’re ready to start using Riak for storing truly massive amounts of data.


Dave Smith

Basho Podcast Three – An Introduction To Innostore

February 2, 2010

You may remember that Basho recently open-sourced Innostore, our standalone Erlang application that provides a simple interface to embedded InnoDB…

In this podcast, Dave “Dizzy” Smith and Justin Sheehy discuss the release of Innostore, why we built it, how we use it in Riak, and why it might be useful for other Erlang projects. The discussion focuses on the stability and predictability of InnoDB, especially under load and as compared with other storage backends like DETS.

And of course, go download Innostore when you are done with the podcast.



If you are having problems getting the podcast to play, click here to play in new window or right click to download the podcast.

Innostore — connecting Erlang to InnoDB

January 26, 2010

Riak has pluggable storage engines, and so we’re always on the lookout for better ways that users can store their data locally. Recent experiences with some Basho customers managing some large datasets led us to believe that InnoDB might work out very well for them.

To answer that question and fill that need, Innostore was written. It is a standalone Erlang application that provides a simple interface to Embedded InnoDB. So far its performance has been quite good, though InnoDB (with or without the Innostore API) is highly dependent on tuning the local configuration to match the local hardware. Luckily, Dizzy — the author of Innostore — has some heavy-duty experience doing that kind of tuning and as a result we’ve been able to help people meet their performance goals using Innostore.
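Used standalone, the Innostore API is small. From memory of the project’s README, a session looks roughly like the following; the exact function signatures may differ between releases, so treat this as a sketch rather than a reference:

```erlang
%% Open a keystore and round-trip a key/value pair.
{ok, Port} = innostore:connect(),
{ok, Store} = innostore:open_keystore(test, Port),
ok = innostore:put(Store, <<"key">>, <<"value">>),
{ok, <<"value">>} = innostore:get(Store, <<"key">>).
```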


Basho Podcast Two – An Introduction to erlang_js

January 19, 2010

Check out the new Basho podcast, featuring Kevin Smith and Bryan Fink, that discusses erlang_js, a simple and easy-to-use binding between Erlang and JavaScript. It is packaged as an OTP application so developers can easily embed JavaScript inside their own applications.
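A minimal erlang_js session looks roughly like this, from memory of the project’s README (the driver and eval calls are my recollection of the API and may vary by version):

```erlang
%% Start a JavaScript VM and evaluate an expression from Erlang.
{ok, JS} = js_driver:new(),
{ok, 4} = js:eval(JS, <<"2 + 2;">>).
```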

Once you are done with the podcast, go download erlang_js.


Mark Phillips

Right click here to download the Podcast

A Quick Note on Rebar

January 10, 2010

As many of you Erlang and Riak fans know, Dave Smith has been hard at work on Rebar. For those of you who don’t know, Rebar is a truly cool packaging and build tool for Erlang applications. Dave took a break from coding this morning to post a few words on his blog Gradual Epiphany about why he was inspired to write Rebar and what it means for building and deploying applications. Check it out. It’s a great read.

Also, if you haven’t had a chance to join the Rebar mailing list, you can do so here.