Tag Archives: webmachine

Basho and Open Source

July 23, 2013

This week is O’Reilly OSCON, a conference dedicated to all things open source. Basho is a sponsor and Basho engineer, Eric Redmond, will be delivering a presentation entitled “Distributed Patterns In Action”.

Basho first open sourced Riak in 2009. It’s a decision that helped us grow our business, and become a leader in newer, agile enterprise environments. Our participation in the open source community benefits our culture, our development process, and our business.

In honor of OSCON, we thought it important to explore the commercial aspects of our open source decision.

The Business of Open Source

Open source is in the DNA of our company, with both Riak and Riak CS available under the Apache 2 license. (It is worth noting that these products are but a few of our open source contributions, which also include Webmachine and Lager.) To turn this great code into a business, we chose to stay true to our roots as a software company, instead of just selling services. The enterprise versions of Riak and Riak CS offer the entirety of our open source software, with the addition of multi-datacenter replication and monitoring capabilities.

The decision to sell licenses to the enterprise, rather than to rely just on services, makes Basho unique. It allows us to engage with our enterprise customers in the transformation of their application architecture. They can be confident in the software’s availability and in Basho’s commitments to support them – as customers. Enterprises need an alternative to traditional database vendors, but one that can still fit — in license structure, operational management, and process integration — into a traditional organization.

Our licensing model for Riak Enterprise and Riak CS Enterprise lets us balance agility with tradition. Our community helps us develop groundbreaking software, while the enterprise license helps corporate IT and Operations sleep at night.

Open source drives adoption (a concept discussed at length in Stephen O’Grady’s book The New Kingmakers). That means Riak is used across many different industries, powering thousands of applications. That commercial validation — our success in production deployments — is accelerated due to the open source availability.

We remain keenly aware, and tremendously appreciative, that our community (from the individuals to the large organizations) guides Riak and Riak CS updates, and has been crucial to the refinement and forward momentum of this software.

Basho’s success is open source’s success. Our strengths reside both in our team and in our community, as their combined efforts improve our technology and its utilization. We are excited to see what other open source showcases are in view at OSCON 2013.

Greg Collins

Riak 1.4: Riak Control

July 16, 2013

Riak Control is a web-based administrative console for inspecting and manipulating Riak clusters.

Although Riak Control is maintained as a separate application, the necessary code for Control ships with Riak 1.1 and above and requires no additional installation steps. For details on setting up Riak Control, check out our docs.
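As a reminder of the general shape, enabling Control is an app.config change along these lines. This is an illustrative sketch only; the authoritative keys (and the TLS requirements) are in the linked docs.

%% Illustrative app.config fragment -- consult the docs for exact keys.
{riak_control, [
    {enabled, true},
    {auth, userlist},
    {userlist, [{"admin", "password"}]}
]}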

Those are things you may already know about Control. Now let’s look at the changes in 1.4.

Cluster Management with Staging

The riak-admin command-line tool has offered staged clustering since Riak 1.2. Riak 1.4 brings that functionality to Control.

The new Cluster Management interface allows you to stage cluster node additions and removals. Once the changes have been reviewed, they can be committed to the cluster. After being committed, Control displays partition transfers and memory utilization changes as they occur.

Staged changes to the cluster:

Staged Changes

Changes committed; transfers active:

Changes Committed

Cluster stabilizes after changes:

Cluster Stabilized

Standalone Node Management Interface

Because the Cluster Management interface now operates on staged changes, actions that cannot be staged have been moved to the Node Management interface. Here, changes to individual nodes, such as stopping or marking them as down, can be applied.

Node Management

Contributing to Riak Control

Riak Control’s user interface is built using Ember.js, and for persistence, Ember Data. The backend is written in Erlang using Webmachine.

Riak Control’s modular design prevents users from having to understand every detail of its existing functionality to contribute. If you’re interested in contributing, we have outlined the process of setting up a development environment, as well as some basic rules for contribution.

Check out our Riak 1.4 announcement to learn what else is included in this release.

Hector Castro

Basho to Speak About Riak at Lambda Jam

Chicago, IL – July 8, 2013 – Throughout the Lambda Jam Conference this week, Basho will deliver two presentations on various aspects of Riak, as well as host a workshop on Webmachine. Lambda Jam is a conference for functional programmers and features a mix of sessions and workshops. It takes place in Chicago from July 8-10.

John Daily (Technical Evangelist at Basho), will be presenting first on “Distributed Programming with Riak Core and Pipe.” During his talk, he will dive into how Riak Core and Riak Pipe can be used, both within and beyond Basho. His talk begins at 9am on Tuesday, July 9th.

On July 10th at 9:50am, Basho Architect, Steve Vinoski, will be speaking on “Addressing Network Congestion in Riak Clusters.” In this talk, he will discuss an experimental approach to alleviating network congestion effects, such as timeouts and throughput collapse for Riak clusters under extreme load. In addition to exploring network scalability issues, this talk shows how Erlang can seamlessly integrate with non-FP languages.

Finally, Sean Cribbs and Chris Meiklejohn (Software Engineers at Basho) will be hosting a workshop entitled, “Functional Web Applications with Webmachine.” This workshop will provide guidance for understanding and getting started with Webmachine. It will then gradually expose richer HTTP features, while building out an application that is used by browsers and API clients alike. Their workshop begins at 1pm on July 10th.

To see where else Basho will be speaking, please visit our Basho Events Page.

Webmachine 1.10.0: never breaks eye contact

May 3, 2013

We recently tagged version 1.10.0 of Webmachine and, in addition to a slew of bug fixes, it includes some notable new features. Those features are the subject of today’s post; but first a bit of background on the driving force for these additions.

The development of Riak CS is great for dogfooding and bringing home some of the pain points in application development using Riak. The same is also true for Webmachine.

Webmachine has not received a great deal of attention recently because it already had what Riak needed and, for the most part, it has just worked. Riak CS, however, needed things from Webmachine that either were not possible or did not work in a way that suited our needs. Previously, there was always more pressing work to be done making Riak more awesome; with Riak CS, that was no longer the case. So we have been adding the new features we needed, and we believe these features will be of use and interest to the larger Webmachine community. Dogfooding FTW again!

We have now also created a 1.11.0 tag that includes an updated tag of mochiweb so that Webmachine can be built and used with Erlang R16.

New features for 1.10.0

Run multiple dispatch groups within a single application
Users can now specify multiple groups of dispatch rules that listen on different IP addresses and ports within the same Erlang application. Read about how to configure this here.
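A hedged sketch of the idea, starting two listeners with independent dispatch lists; the exact option names should be checked against the linked docs, and the resource module names here are illustrative:

%% Hedged sketch: two listener groups with independent dispatch lists.
%% Option names are approximate; admin_resource/public_resource are
%% illustrative.
AdminConfig  = [{name, admin_web},
                {ip, "127.0.0.1"}, {port, 8001},
                {dispatch, [{["admin", '*'], admin_resource, []}]}],
PublicConfig = [{name, public_web},
                {ip, "0.0.0.0"}, {port, 8000},
                {dispatch, [{['*'], public_resource, []}]}],
{ok, _} = webmachine_mochiweb:start(AdminConfig),
{ok, _} = webmachine_mochiweb:start(PublicConfig).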

Event-based logging system
The server modules that previously handled Webmachine logging have been replaced with an event-based system. Log event handlers can be added and removed dynamically and custom log modules can be easily added and run in concert with any existing log handlers. More details about the new logging system are here.
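For instance, attaching and detaching a handler at runtime looks roughly like the following sketch; treat the handler module name and arguments as illustrative and see the linked docs for the real API.

%% Hedged sketch: add and remove log handlers dynamically.
webmachine_log:add_handler(webmachine_access_log_handler,
                           ["/var/log/webmachine"]),
webmachine_log:delete_handler(webmachine_access_log_handler).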

Ability to specify a URL rewrite module
This feature is very similar to the mod_rewrite module for Apache httpd. A rewrite module specifies a set of rules for rewriting the URL and the rewritten URL is what is processed by the dispatch rules of Webmachine. Docs are here. The module used by Riak CS to rewrite S3 API requests can be found here.
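Stripped to its essence, a rewrite module is a function that maps an incoming URL to the URL the dispatch rules should see. The sketch below is illustrative only; the real callback arity and argument order are defined in the docs, with riak_cs_s3_rewrite as the production example.

-module(demo_rewrite).
-export([rewrite/5]).

%% Hedged sketch of a rewrite callback: prefix every path with /api/v1.
%% Check the Webmachine docs for the actual callback contract.
rewrite(_Method, _Scheme, _Vsn, Headers, Path) ->
    {Headers, "/api/v1" ++ Path}.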

Stream large response objects without buffering the entire object in memory
Streaming content has long been possible with Webmachine, but it was not suitable for large objects outside of multipart/mixed responses, because Webmachine buffered all of the content in memory to determine its size and properly set the Content-Length header. This mattered for Riak CS, which needs to stream back very large objects, while the S3 API does not use multipart responses for this operation. Now, when the size of the content can be determined in advance, large content can be streamed without paying the price of buffering everything in memory. More info on using this feature is here.
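A body-producing callback using this feature might look roughly like the sketch below, where object_size/1 and read_chunk/2 are assumed helpers and the known-length stream tuple shape should be verified against the linked docs.

%% Hedged sketch: return a known-length stream so Content-Length can be
%% set up front without buffering the whole object in memory.
produce_body(RD, Ctx) ->
    {{known_length_stream, object_size(Ctx), stream_from(Ctx, 0)},
     RD, Ctx}.

stream_from(Ctx, Offset) ->
    case read_chunk(Ctx, Offset) of            %% assumed helper
        eof           -> {<<>>, done};
        {Chunk, Next} -> {Chunk, fun() -> stream_from(Ctx, Next) end}
    end.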

Ability to override the resource_module value in the request metadata
The impetus for this feature is more esoteric than for the other features, so an example is probably the best description. Take the case where the Webmachine resource modules duplicate a lot of code in implementing the required callbacks to service requests. One way to address this is to move much of that common code into a single module and use that common module as the resource in all dispatch rules, with the ModOpts for each dispatch rule specifying a smaller set of callbacks for resource specialization. Overriding the resource_module metadata value then ensures that logging data reflects the specialized resource module and not the common module. We will provide further details about the motivations for this in a subsequent blog post focused on Riak CS. Documentation on how to configure this option can be found here.
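In dispatch terms, the pattern looks roughly like this; the {resource_module, Mod} option name should be checked against the linked docs, and the module names are illustrative.

%% Hedged sketch: several routes share one common resource module, but
%% each logs under its specialized module name via resource_module.
{["cw_pump", id, param], common_resource,
 [{resource_module, cw_pump_resource}]}.
{["chiller", id, param], common_resource,
 [{resource_module, chiller_resource}]}.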

Kelly McLaughlin

Webmachine in the Data Center

May 19, 2010

While Riak is Basho’s most-heavily developed and widely distributed piece of open source software, we also hack on a whole host of other projects that are components of Riak but also have myriad standalone uses. Webmachine is one of those projects.

We recently heard from Caleb Tennis, a member of The Data Cave team who was tasked with building out and simplifying operations in their 80,000 square foot data center. Caleb filled us in on how he and his team are using Webmachine in their day to day operations to iron out the complexities that come with running such a massive facility. After some urging on our part, he was gracious enough to put together this illustrative blog post.



Community Manager

A Data Center from the Ground Up

Building a new data center from the ground up is a daunting task. While most of us are familiar with the intricacies of the end product (servers, networking gear, and cabling), there’s a whole backside supporting infrastructure that also must be carefully thought out, planned, and maintained. Needless to say, the facilities side of a data center can be extremely complex.

Having built and maintained complex facilities before, I already had both experience and a library of software in my tool belt that I had written to help manage the infrastructure of these facilities. However, I recognized that if I were to use the legacy software, some of which was over 10 years old, it would require considerable work to fit it to my current needs. And, in the intervening period, many other software projects and methodologies had matured to a state where it made sense to at least consider a completely different approach.

The crux of such a project is that it involves communication with many different pieces of equipment throughout the facility, each of which has its own protocols and specifications. Thus, the overall goal of this project is to abstract the communication behind the scenes and present a consistent, clear interface to the user so that the entire process is easier.

Take for example the act of turning on a pump. There are a number of pumps located throughout the facility that need to be turned on and off dynamically. To the end user, a simple “on/off” style control is what they are looking for. However, the actual process of turning that pump on is more complicated. The manufacturer for the pump controller has a specific way of receiving commands. Sometimes this is a proprietary serial protocol, but other times this uses open standard protocols like Modbus, Fieldnet, or Devicenet.

In the past, we had achieved this goal using a combination of open source libraries, commercial software, and in-house software. (Think along the lines of something like Facebook’s Thrift, where you define the interface and let the backend implementation be handled behind the scenes; in our case, the majority of the backend was written in C++.)

“This is what led us to examine Erlang…”

But as we were looking at the re-implementation of these ideas for our data center, we took a moment to re-examine them. The implementation we had, for the most part, was stateless, meaning that as systems would GET and SET information throughout the facility, they did so without prior knowledge and without attempting to cache the state of any of the infrastructure. This is a good thing, conceptually, but is also difficult in that congestion on the communication networks can occur if too many things need access to the same data frequently. It also suffered from the same flaws as many other projects: it was big and monolithic; changes to the API were not always easy; and, most of all, upgrading the code meant stops and restarts, so upgrading was done infrequently. As we thought about the same type of implementation in our data center, it became clear that stops and restarts in general were not acceptable at all.

This is what led us to examine Erlang, with its promise of hot code upgrades, distributed goodness, and fault-tolerance. In addition, I had been wanting to learn Erlang for a while but never really had an excuse to sit down and focus on it. This was my excuse.

While thinking about how this type of system would be implemented in Erlang, I began by writing a Modbus driver for Erlang, as a large portion of the equipment we interact with uses Modbus as part of its communications protocol. I published the fruits of these labors to GitHub (http://github.com/ctennis/erlang-modbus), in the hopes that it might inspire others to follow the same path. The library itself is a little rough (it was my first Erlang project) but it served as the catalyst for thinking about not only how to design this system in Erlang, but also how to write Erlang code in general.

Finding Webmachine

While working on this library, I kept thinking about the overall stateless design, and thought that perhaps a RESTful interface may be appropriate. Using REST (and HTTP) as the way to interface with the backend would simplify the frontend design greatly, as there are myriad tools already available for client side REST handling. This would eliminate the need to write a complicated API and have a complicated client interface for it. This is also when I found Webmachine.

Of course there are a number of different ways this implementation could have been achieved, Erlang or not. But the initial appeal of Webmachine was that it used all of the baked in aspects of HTTP, like the error and status codes, and made it easy to use URLs to disseminate data in an application. It is also very lightweight, and the dispatching is easy to configure.

Like all code, the end result was the product of several iterations and design changes, and may still be refactored or rewritten as we learn more about how we use the code and how it fits into the overall infrastructure picture.

Webmachine in Action

Let’s look at how we ultimately ended up using Webmachine to communicate with devices in our data center…

For the devices in the facility that communicate via modbus, we created a modbus_register_resource in Webmachine that handles that interfacing. For our chilled water pumps (First Floor, West, or “1w”), the URL dispatching looks like this:

{["cw_pump", "1w", parameter], modbus_register_resource,
 [cw_pump, {cw_pump_1w, tcp, "", 502, 1}]}.

This correlates to the URL: http://address:8000/cw_pump/1w/PARAMETER

So we can formulate URIs something like this:

http://address:8000/cw_pump/1w/motor_speed or http://address:8000/cw_pump/1w/is_active

And by virtue of the fact that our content type is text:

content_types_provided(RD, Ctx) ->
    {[{"text/plain", to_text}], RD, Ctx}.

We use HTTP GETs to retrieve the desired result, as text. The process is diagrammed below:

This is what it looks like from the command line:

user@host:~# curl http://localhost:8000/cw_pump/1w/motor_speed


It even goes further. If the underlying piece of equipment is not available (maybe it’s powered off), we use Webmachine to send back HTTP error codes to the requesting client. This whole process is much easier than writing it in C++, compiling, distributing, and doing all of the requisite exception handling, especially across the network. Essentially, what had been developed and refined over the past 10 years as our in-house communications system was basically redone from scratch with Erlang and Webmachine in a matter of weeks.
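One natural way to surface an unreachable device as an HTTP error is to fail Webmachine’s service_available callback, which makes Webmachine reply with a 503. In this sketch, modbus_connect/1 is an assumed helper wrapping erlang-modbus, and #ctx{} is the resource’s assumed state record.

%% Hedged sketch: an unreachable pump controller becomes an HTTP 503.
service_available(RD, Ctx) ->
    case modbus_connect(Ctx) of          %% assumed helper
        {ok, Conn} -> {true, RD, Ctx#ctx{conn = Conn}};
        {error, _} -> {false, RD, Ctx}   %% Webmachine replies 503
    end.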

For updating or changing values, we use HTTP POSTs to change values that are writable. For example, we can change the motor speed like this:

user@host:~# curl -X POST http://localhost:8000/cw_pump/1w/motor_speed?value=50

user@host:~# curl http://localhost:8000/cw_pump/1w/motor_speed


But we don’t stop here. While using Webmachine to communicate directly with devices is nice, there also needs to exist an infrastructure that is more user friendly and accessible for viewing and changing these values. In the past, we did this with client software, written in Windows, that communicated with the backend processes and presented a pretty overview of what was happening. Aside from the issue of having to write the software in Windows, we also had to maintain multiple copies of it actively running as a failover in case one of them went down. Additionally, we had to support people running the software remotely, from home for example, via a VPN, but still had to be able to communicate with all of the backend systems. We felt a different approach was needed.

What was this new approach? Webmachine is its own integrated webserver, so this gave us just about everything we needed to host the client side of the software within the same infrastructure. By integrating jQuery and jQTouch into some static webpages, we built an entire web based control system directly within Webmachine, making it completely controllable via mobile phone. Here is a screenshot:

What’s Next?

Our program is still a work in progress, as we are still learning all about the various ways the infrastructure works as well as how we can best interact with it from both Webmachine and the user perspective. We are very happy with the progress made thus far, and feel quite confident about the capabilities that we will be able to achieve with Webmachine as the front end to our very complex infrastructure.


Caleb Tennis is President of The Data Cave, a full service, Tier IV compliant data center located in Columbus, Indiana.

Collecta Selects Basho’s Riak EnterpriseDS to Scale Real Time Search Business

Collecta Collaborates with Basho to develop Basho’s Riak Search Product

CAMBRIDGE, MA – Dec. 15 – Basho Technologies, Inc., a provider of commercial and open source distributed, highly scalable data store software and analytics tools, today announced that Collecta has selected Basho’s Riak Search as it continues to build its real-time search engine for the entire Web, not just social media.

Collecta, which already uses Webmachine, Basho’s open-source RESTful application server, provided the initial requirements and critical input into the design and validation of Riak’s new search and indexing features. Riak Search is currently in testing with Collecta, as well as a handful of other clients, and Basho will make it widely available on January 15th.

“Collecta really provided the spark of motivation for us. Once they determined Riak EnterpriseDS met their stability and performance needs for document storage and retrieval, we collaborated with them to implement a search and indexing system for Riak,” said Earl Galleher, Chairman and CEO of Basho Technologies, Inc.

“Traditional indexing systems are not designed for the needs of real-time search. Using Riak, we will get much better performance and save money on infrastructure,” said Jack Moffitt, Collecta’s CTO. “We already knew Basho made great software because we used and trusted Webmachine, but Riak Search takes it to the next level.”

In reference to the Search features, Mr. Moffitt said, “After an initial exchange of ideas about how to build a better indexing system on distributed databases like Riak, the Basho team went off and built a prototype. The results were so promising that we knew the combination of Collecta’s domain expertise and Basho’s data engine expertise will make for a perfect relationship.”

“Collecta immediately grasped how an indexing solution on a truly distributed, scalable data store would change the game for search,” said Justin Sheehy, Basho’s CTO. “Riak allows users to add and remove capacity as traffic spikes and ebbs or data sets grow. With no special master nodes, distributed MapReduce, and other Riak technologies, Collecta can double their analytical power and storage by doubling the number of low-cost nodes in a cluster.”

About Basho Technologies

Basho Technologies, Inc., founded in January 2008 by a core group of software architects, engineers and executive leadership from Akamai Technologies, Inc. (Nasdaq: AKAM) is headquartered in Cambridge, Massachusetts. Basho produces Riak, a distributed data store that combines high availability, easily-scalable capacity and throughput, and ease of use. Riak’s high availability data store means that applications built using Riak remain both read and write available under almost any operational conditions and without requiring intervention. Available in both an open source and a paid commercial version, Riak provides unprecedented performance and availability to web, mobile, and enterprise applications.

About Collecta

Collecta represents a new way to experience search, in real time. Collecta is the Web’s most powerful streaming real-time search engine, posting matching stories, blogs, photos, and comments as they happen. By aggregating content in real time, Collecta offers a new and more comprehensive view of what’s going on in the world right now. For more information, visit http://Collecta.com.

Media Contacts
Earl Galleher
CEO, Basho Technologies, Inc.

Collecta Chooses Riak Search

December 15, 2009

Another big announcement for the team here at Basho: Collecta, which makes a truly cool real-time streaming search engine, has chosen to use Riak Search. They are longtime Webmachine users and when they learned about Riak, they partnered with us to define Riak Search and validate the prototype.

Look for a blog post later in the day from Justin Sheehy on what it was like to work with Collecta. (Hint: it was awesome!)

Stay tuned.