Erlang Factory London Recap

June 14, 2010

This was originally posted by @rklophaus on his blog, rklophaus.com.

Erlang Factory London gathers Erlang pioneers from across the world—Berlin to Boston, Krakow to Cordoba, and San Francisco to Shanghai—for a two-day conference of innovative Erlang development.

The summaries below are just a small sampling of the talks at Erlang Factory London. There were three tracks running back-to-back for two days, and I often couldn’t decide which of the three to attend. Slides and videos will be released by Erlang Solutions, and can be found under individual track pages on the Erlang Factory website.

Day 1 – June 10, 2010

Opening Session

Francesco Cessarini (Chief Strategy Officer, Erlang Solutions Ltd.), began the conference with a warm welcome and a quick review of progress made by Erlang-based companies in the last year.

Some highlights:

LShift (RabbitMQ, built in Erlang) acquired by VMWare
Couchio formed and funded to develop and support CouchDB
MochiMedia (Flash game ad network using Erlang stack), acquired by Shanda Games
Riak received two rounds of funding to develop and support Riak
Klarna received a round of investment from Sequia Capital, and Michael Moritz previous board member of Google, Yahoo, and Paypal, joined the Klarna board.

The History of the Erlang Virtual Machine – Joe Armstrong, Robert Virding

Joe Armstrong and Robert Virding gave a colorful, back-and-forth history of the Erlang’s birth and early years. A few notable milestones and achievements:

Joe’s early work on reduction machines. Robert’s complete rewrite of Joe’s work. Joe’s complete rewrite of Robert’s work. (etc.)
How Erlang was almost based on Smalltalk rather than Prolog
The quest to make Erlang 1.0x 80 times faster
Experiments with different memory management and garbage collection schemes
The train set used demonstrate Erlang, now in Robert’s basement
The addition of linked processes, distribution, OTP, and bit syntax

It’s easy to take a language like Erlang for granted and assume that its builders followed some well-known, pre-ordained path. Hearing Erlang’s history from two of its main creators provided an excellent reminder that building software is both an art and a science, uncertain and exciting like any creative process.

Riak from the Inside – Justin Sheehy

Justin Sheehy (CTO of Riak Technologies) opened his talk by introducing Riak, “a scalable, highly-available, networked, open-source key/value store.” He then very quickly announced that he wasn’t there to talk about using Riak, he was there to talk about how Riak was built using Erlang and OTP

There are eight distinct layers involved in reading/writing Riak data:

The Client Application using Riak
The client-side HTTP API or Protocol Buffers API that talks to the Riak cluster
The server-side Riak Client containing the combined backing code for both APIs
The Dynamo Model FSMs that interact with nodes using Dynamo style quorum behavior and conflict resolution
Riak Core provides the fundamental distribution of the system (not covered in the talk)
The VNode Master that runs on every physical node, and coordinates incoming interaction with individual VNodes
Individual VNodes (Virtual Nodes) which are treated as lightweight local abstractions over K/V storage
The swappable Storage Engine that persists data to disk

During his talk, Justin discussed each layer’s responsibilities and interactions with the layers above and below it.

Justin’s main point is that carefully managed complexity in the middle layers allows for simplicity at the edge layers. The top three layers present a simple key/value interface, and the bottom two layers implement a simple key/value store. The middle layers (FSMs, Riak Core, and VNode Master) work together to provide scalability, replication, etc. Erlang makes this possible, and was chosen because it provides a platform that evolves in useful and relatively-predictable ways (this is a good thing, a surprising evolution is bad).

Mnesia for the CAPper – Ulf Wiger

Ulf Wiger (CTO of Erlang Solutions) discussed where Mnesia might fit into the changing world of databases, given the new focus on “NoSQL” solutions. Ulf gave a quick introduction to ACID properties, Brewer’s CAP theorem, and the history of Mnesia, and then dove into a feature level description/comparison of Mnesia with other databases:

Deployed commercially for over 10 years
Comparable performance to current top performers clustered SQL space
Scalable to 50 nodes
Distributed transactions with loose time limits (in other words, appropriate for transactions across remote clusters)
Built-in support for sharding (fragments)
Incremental backup

The downsides are:

Erlang only interface
Tables limited to 2GB
Deadlock prevention scales poorly
Network partitions are not automatically handled, must recombine tables automatically

Ulf and others have done work to get around some of these limitations. Ulf showed code for an extension to Mnesia that automatically merges tables after they have split, using vector clocks.

Riak Search – Rusty Klophaus

I presented Riak Search, a distributed indexing and full-text search engine built on (and complementary to) Riak.

Part one covered the main reason for building Riak search: clients have built applications that eventually need to find data by value, not just by key. This is difficult, if not impossible, in a key/value store.

Part two described the shape of the final solution we set out to create. The goal of Riak Search is to support the Lucene interface, with Lucene syntax support and Solr endpoints, but with the operations story of Riak. This means that Riak Search will scale easily by adding new machines, and will continue to run after machine failure.

Part three was an introduction to Inverted Indexing, which is the heart of all search systems, as well as the difference between Document-Partitioning and Term-Partitioning, which forms the ongoing battle in the distributed search field. Part three continued with a deep-dive into parsing, planning, and executing the search query on Erlang.

Slides: http://www.slideshare.net/rklophaus/riak-search-erlang-factory-london-2010

Building a Scalable E-commerce Framework – Michael Nordström and Daniel Widgren

Michael Nordström and Daniel Widgren presented an Erlang-based e-commerce framework on behalf of their project team from Uppsala University (Christian Rennerskog, Shahzad Gul, Nicklas Nordenmark,
Manzoor Ahmad Mubashir, Mikael Nordström, Kim Honkaniemi, Tanvir Ahmad, Yujuan Zou, and Daniel Widgren) and their industrial partner, Klarna AB.

The application uses a “LERN stack” (Linux, Erlang, Riak, Nitrogen), to provide a reusable web shop that can be quickly set up by clients, customized via templates and themes, and extended via plugins to support different payment providers.

The project is currently going a rewrite to update to the latest versions of Riak and Nitrogen.

GitHub: http://github.com/mino4071/CookieCart-2.0

Twitter: @Cookie_Cart

Clash of the Titans: Erlang Clusters and Google App Engine – Panos Papadopoulos, Jon Vlachoyiannis, Nikos Kakavoulis

Panos, Jon, and Nikos took turns describing the technical evolution of their startup, SocialCaddy, and why they were forced to move away from the Google App Engine. SocialCaddy is a tool that mines your online profiles for important events and changes, and tells you about them. For example, if a friend gets engaged, SocialCaddy will tell you about it, and assist you in sending a congratulatory note.

Google App Engine imposes a 30 second limit on requests. As SocialCaddy processed larger and larger social graphs, they bumped into this limit, which made GAE unusable as a platform. In response, the team developed Erlust, which allows you to submit jobs (written in any language) to a cluster. An Erlang application coordinates the jobs, and each job should read from a queue, process messages, and write to another queue.

Using Open-Source Trifork QuickCheck to test Erjang – Kresten Krab Thorup

Kresten Krab Thorup (CTO of Trifork) stirred up dust when he originally announced his intention to build a version of Erlang that ran on the JVM. Since then, he has made astounding progress. Erjang turns Erlang .beam files into Java .class files, now supporting a broad enough feature set to run Mnesia over distributed Erlang. Kresten claimed performance matching (or at times exceeding) that of the Erlang VM.

Erjang is still a work in progress, there are many BIFs that still need to be ported, but if a prototype exists to prove viability, then this prototype was certainly a success. One slide showed the code for the spawn_link function reimplemented in Java in ~15 lines of simple Java code.

For the second half of his talk, Kresten showed off Triq (short for Trifork Quickcheck), a scaled-down, open-source QuickCheck inspired testing framework that he built in order to test Erjang. Triq supports basic generators (called domains), value picking, and shrinking. Kresten showed examples of using Triq to validate that Erjang performs binary operations with the exact same results as Erlang.

More information about Erjang here: http://wiki.github.com/krestenkrab/erjang/

Day 2 – June 11, 2010

Efene: A Programming Language for the Erlang VM – Mariano Guerra

Mariano Guerra presented Efene, a new language that is translated into Erlang source code. Efene is intended to help coax developers into the world of Erlang who might otherwise be intimidated by the Prolog-inspired syntax of Erlang. We’ve heard about a number of other projects compiling into Erlang byte-code (such as Reia and Lisp-Flavored Erlang), but Efene takes a different approach in that the language is parsed and translated using Leex and Yecc into standard Erlang code, which is then compiled as normal. By doing this, Mariano manages to leave most of the heavy lifting of optimizations to the existing Erlang compiler.

Efene actually supports two different syntax flavors, one with curly brackets, the other without, leading to a syntax that feels vaguely like Javascript or Python, respectively. (The syntax without curly brackets is called Ifene, for “Indented Efene”, and is otherwise identical to Efene.)

In some places, Efene syntax is a bit more verbose than Erlang. This is done to make the language more readable than Erlang. (“if” and “case” statements have more structure in Efene than Erlang.) In other places, Efene requires less typing, multi-claused function definitions don’t require you to repeat the function name, for example.

Code samples and more information: http://marianoguerra.com.ar/efene

Erlang in Embedded Systems – Gustav Simonsson, Henrik Nordh, Fredrik Andersson, Fabian Bergstrom, Niclas Axelsson and Christofer Ferm

Gustav, Henrik, Fredrik, Fabian, Niclas, and Christofer (Uppsala University), in cooperation with Erlang Solutions, worked on a project to shrink the Erlang VM (plus the Linux platform on which it runs) down to the smallest possible footprint for use on Gumstix and BeagleBoard hardware.

The team experimented with OpenEmbedded and Angstrom, using BusyBox, uClibc, and stripped .beam files to further decrease the footprint. During the presentation, they played a video showing how to install Erlang on a Gumstix single-board computer in 5 minutes using their work.

More information about Embedded Erlang here: http://embedded-erlang.org

Zotonic: Easy Content Management with Erlang’s Performance and Flexibility – Marc Worrell

Marc Worrell (WhatWebWhat) breaks CMSs into:

1st Generation – Static text and images
2nd Generation – Database- and template-driven systems (covers current CMS systems)
3rd Generation – Highly interactive, real-time, personalized data exchanges and frameworks

Zotonic is aimed squarely at the third generation, Zotonic turns a CMS into a living, breathing thing, where modules on a page talk to each other and other sessions via comet, and the system can be easily extended, blurring the line between CMS and application framework.

This interactivity is what motivated Marc to write the system in Erlang; at one point he compared the data flowing through the system to a telephone exchange. Zotonic uses Webmachine, Mochiweb, ErlyDTL, and a number of other Erlang libraries, with data in PostgreSQL. (Marc also mentioned Nitrogen as an early inspiration for Zotonic, parts of Zotonic are based on Nitrogen code, though much has been rewritten.)

The data model is physically simple, with emergent functionality. A site is developed in terms of objects (called pages) interlinked with other objects. In other words, from a data perspective, adding an image to a web page is the same as linking from a page to a subpage, or tagging a page with an author. Mark gave a live demo of Zotonic’s ability to easily add and change menu structures, modify content, and add and re-order images. Almost everything can be customized using ErlyDTL templates. Very polished stuff.

Marc then introduced his goal of “Elastic Zotonic”, a Zotonic that can scale in a distributed, fault-tolerant, “buzzword-compliant” way, which will involve changes to the datastore and some of the architecture.

Marc is now working with Maximonster to develop an education-oriented social network on top of Zotonic.

More information: http://zotonic.com

Closing Session

Francesco (CSO, Erlang Solutions, Ltd.) thanked the sponsors, presenters, and audience. Frank then gave a big special thanks to Frank Knight and Joanna Włodarczyk, who both worked tirelessly to organize the conference and make everything go smoothly.

Final Thoughts

Erlang is gaining momentum in the industry as a platform that enables you to solve distributed, massively concurrent problems. People aren’t flocking directly to Erlang itself, they are instead flocking to projects built in Erlang, such as RabbitMQ, ejabberd, CouchDB, and of course, Riak. At the same time, other languages are adopting some of the key features that make Erlang special, including a message-passing architecture and lightweight threads.

Riak Blog