
What's New In Riak's Python Client?

August 4, 2011

There’s a whole slew of new and noteworthy features in today’s release of the Python client for Riak, and I thought it’d be a good idea for us to sit down and look at a bunch of them so I can add more detail to what is already in the release notes.

Test Server

Ripple has had an in-memory test server for a while now, and I thought the Python client should have something similar too. By the way, a lot of the features here draw heavy inspiration from Ripple in general, so credit where credit is due.

The basic idea is that instead of using a locally installed Riak instance with file system storage, you use one that stores data in memory. This is not only faster than storing everything on disk, it also makes it much easier to wipe all the data and start from scratch without having to restart the service. In short, this is a neat way to integrate Riak into your test suite.

All the test server requires is a local Riak installation. It uses that installation’s libraries and copies a few of its files to build a second Riak installation in a temporary directory. Let’s look at an example:

```python
from riak.test_server import TestServer

server = TestServer()
server.prepare()
server.start()
```

This will start a Riak instance in the background, with the Python part interacting with it through the Erlang console. That allows you to do things like wiping all data to have a minty fresh and empty Riak installation for the next test run:

```python
server.recycle()
```

The TestServer class has a default location where it looks for a Riak installation, but the path could be anywhere you put a Riak build you made from an official release archive. Just point it to that Riak installation’s bin directory, and you’re good to go:

```python
server = TestServer(bin_dir="/usr/local/riak/0.14.2/bin")
```

You can also override the default settings used to generate the app.config file for the in-memory Riak instance. Just specify a keyword pointing to a dictionary for every section in the app.config like so:

```python
server = TestServer(riak_core={"web_port": 8080})
```

By default the test server listens on ports 9000 (HTTP) and 9001 (Protocol buffers), so make sure you adapt your test code accordingly.

Using Riak Search’s Solr-compatible HTTP Interface

One of the nice things about Riak Search is its Solr-compatible HTTP interface. Until now, you were only able to use Riak Search through MapReduce. New in release 1.3 of the Python client is support for directly indexing and querying documents using Riak Search’s HTTP interface.

The upside is that you can use Riak Search with a Python app as a scalable full-text search without having to store data in Riak KV for it to be indexed.

The interface is as simple as it is straightforward. We’ve added a new method to the RiakClient class called solr() that returns a small façade object. That in turn allows you to interact with the Solr interface, e.g. to add documents to the index:

```python
client.solr().add("superheroes",
                  {"id": "hulk",
                   "name": "hulk",
                   "skill": "Hulksmash!"})
```

You just specify an index and a document, which must contain a key-value pair for the id, and that’s it.

The beauty of using the Solr interface is that you can use all the available parameters for sorting, limiting result sets and setting default fields to query on, without having to do that with a set of reduce functions.

```python
client.solr().query("superheroes", "hulk", df="name")
```

Be sure to check our documentation for the full set of supported parameters. Just pass in a set of keyword arguments for all valid parameters.
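To make the mapping from keyword arguments to Solr parameters concrete, here is a rough sketch of the kind of request URL such a query corresponds to. It assumes Riak Search’s `/solr/<index>/select` endpoint and a node listening on `127.0.0.1:8098` (Riak’s default HTTP port; the test server above uses 9000 instead). `build_solr_query_url` is a hypothetical helper for illustration only, not a function of the Python client.

```python
from urllib.parse import quote, urlencode


def build_solr_query_url(index, query, host="127.0.0.1", port=8098, **params):
    """Build a Solr-style select URL (hypothetical helper, illustration only)."""
    # The query string itself is sent as the "q" parameter; every other
    # keyword argument (df, rows, sort, ...) becomes its own parameter.
    all_params = {"q": query}
    all_params.update(params)
    # Sort the parameters so the resulting URL is deterministic.
    query_string = urlencode(sorted(all_params.items()))
    return "http://%s:%d/solr/%s/select?%s" % (
        host, port, quote(index), query_string)


url = build_solr_query_url("superheroes", "hulk", df="name", rows=10)
print(url)
```

The client builds and sends the request for you, so a helper like this is mainly useful for understanding what goes over the wire or for debugging with a tool like curl.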

Something else that’s new on the search integration front is the ability to programmatically enable and disable indexing on a bucket by installing or removing the relevant pre-commit hook.

```python
bucket = client.bucket("superheroes")
if not bucket.search_enabled():
    bucket.enable_search()
```

Storing Large Files With Luwak

When building Riagi, the application showcased in the recent webinar, I missed Luwak support in the Python client. Luwak is Riak’s way of storing large files, chunked into smaller bits and stored across your Riak cluster. So we added it. The API consists of three simple functions, store_file, get_file, and delete_file.

```python
client.store_file("hulk", "hulk.jpg")

client.get_file("hulk")

client.delete_file("hulk")
```

Connection Caching for Protocol Buffers and HTTP

Thanks to the fine folks at Formspring, the Python client now sports easier ways to reuse protocol buffer and even HTTP connections, and to make their use more efficient. All of them are useful if you’re doing lots of requests or want to reuse connections across several requests, e.g. in the context of a single web request.

Here’s a summary of the new transports added in the new release. All of them accept the same parameters as the original transport classes for HTTP and PBC:

  • riak.transports.pbc.RiakPbcCachedTransport
    A cache that reuses a set of protocol buffer connections. You can set an upper bound on the number of connections kept in the cache by specifying a maxsize attribute when creating the object.
  • riak.transports.http.RiakHttpReuseTransport
    This transport reuses HTTP connections more efficiently by setting SO_REUSEADDR on the underlying TCP socket. That allows the TCP stack to reuse connections before the TIME_WAIT state has passed.
  • riak.transports.http.RiakHttpPoolTransport
    Use the urllib3 connection pool to pool connections to the same host.
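The idea behind a bounded connection cache can be sketched in a few lines. This is not the implementation of `RiakPbcCachedTransport`: `BoundedConnectionCache`, `acquire`, `release`, and `FakeConnection` are hypothetical names, and the `connect` callable stands in for whatever opens a real connection. The sketch only illustrates what a `maxsize` limit means: idle connections are kept for reuse up to that limit, and surplus ones are closed.

```python
from collections import deque


class BoundedConnectionCache(object):
    """Keeps at most `maxsize` idle connections around for reuse."""

    def __init__(self, connect, maxsize=2):
        self._connect = connect    # callable that opens a new connection
        self._maxsize = maxsize
        self._idle = deque()

    def acquire(self):
        # Prefer an idle connection; open a new one only when none is cached.
        if self._idle:
            return self._idle.pop()
        return self._connect()

    def release(self, conn):
        # Keep the connection if there is room, otherwise close it.
        if len(self._idle) < self._maxsize:
            self._idle.append(conn)
        else:
            conn.close()

    def __len__(self):
        return len(self._idle)


class FakeConnection(object):
    """Stand-in for a real socket so the sketch runs without a server."""

    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.closed = False

    def close(self):
        self.closed = True


cache = BoundedConnectionCache(FakeConnection, maxsize=1)
first = cache.acquire()    # nothing cached yet: opens a connection
cache.release(first)       # room in the cache: kept for reuse
second = cache.acquire()   # the cached connection is handed out again
extra = cache.acquire()    # cache is empty now: opens another connection
cache.release(second)      # kept, the cache is now full
cache.release(extra)       # no room left: closed instead of cached
```

A small `maxsize` keeps the number of open sockets predictable, at the cost of reconnecting more often under bursts of concurrent requests.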

We’re always looking for contributors to the Python client, so keep those pull requests coming!

Mathias

Follow Up To Riak And Python Webinar

August 3, 2011

Thanks to everyone who attended yesterday’s webinar on Riak and Python. If you missed the webinar, we’ve got you covered. Find a screencast of the webinar below, or check it out directly on Vimeo. Sorry that it’s missing the questions I answered at the end. I recapped the questions in written form below to make up for it.

We made the slides available for your viewing pleasure as well. But most importantly, have a look at our Python library for Riak, which has gotten a lot of feature love lately, and at the source code for Riagi, the sample Django application. It utilizes Riak for session and file storage, and Riak Search for storing user and image metadata.

Thanks to dotCloud for providing the hosting for Riagi. A little birdie told us they’re working on supporting Riak as part of their official stack soon.

The Python client for Riak wouldn’t exist without the community, so we want to say thank you for your contributions, and we’re always ready for more pull requests!

Keep an eye out for a new release of the Python client this week, including several of the new features shown in the webinar!

The two questions asked at the end were:

  • Is there a Scala client for Riak? Links relevant to the answer in the video: Outdated Scala Library, a more recent fork, and our official Java client.
  • Is Protocol Buffers more efficient than using HTTP? The answer is detailed in the webinar video.