December 28, 2010
Those of you who are familiar with the Riak Wiki may have been surprised by what you saw at wiki.basho.com starting yesterday. That’s because we just gave it a facelift. More importantly, the team at Basho has spent a fair amount of time converting the Riak Wiki to use Gollum. Gollum is a “simple, Git-powered wiki with a sweet API and local frontend” written and maintained by the awesome team at GitHub.
Those of you who are accustomed to the “old wiki” will find the same great content with a revamped design, something that more closely resembles the look and feel of basho.com and the Riak Function Contrib.
Arguably the best feature of using Gollum is that contributing to the Riak Wiki is now dead simple, and is no different than contributing code to any repo on GitHub. Riak users are, after all, developers. Why not let them work with a wiki the same way they work with Riak code itself?
This is one of several website enhancements you’ll be seeing from the team at Basho over the coming months, so stay tuned.
December 16, 2010
The community contributions to Riak have been increasing at an exciting rate. The pull requests are starting to roll in, and I wanted to take a moment and recognize several of the many contributions we’ve received over the past months and weeks (and days).
Anyone who uses Riak with Ruby knows about Ripple. This is Basho’s canonical Ruby driver and its development has been spearheaded by Basho hacker Sean Cribbs. Not long after Sean started developing this code, he saw a significant influx of Rubyists who were interested in using Riak with Ruby and wanted to lend a hand in the driver’s development. Sean was happy to oblige and, as a result, there are now 15 developers in addition to Sean who have contributed to Ripple in a significant way. Special recognition should also be given to Duff Omelia and Adam Hunter, who have made significant contributions to the code and use it in production.
Francisco Treacy and the team at Widescript made it known many months ago that they were looking into Riak to power part of their application. They, along with several other community members, were experimenting with Riak and Node.js. There were a few Node clients for Riak, but they were primarily experimental and immature. Basho had plans to write one, but development time was stretched and a Node client was several months off.
So, they rolled their own. Francisco, along with Alexander Sicular, James Sadler, Jakub Stastny, and Rick Olson developed and released riak-js. Since its release, it has picked up a ton of users and is being used in applications all over the place. (We liked it so much we even decided to build an app on it… more on this later.)
Thanks, guys, for the node client and helping to kickstart the Riak+Node.js community.
Riak Support in Spring Data
VMware’s Spring Data project is an ambitious one, and it has huge implications for the proliferation of new database technologies in application stacks everywhere. VMware made it known that Riak was slated for integration, needing only someone to take the time to write the code to connect the two. Jon Brisbin took up the task and never looked back.
Jon’s Twitter stream is essentially a running narrative of how his work on Riak developed and, as you can see, it took about a month to build support for Riak into the Grails framework, the culmination of which was the 1.0.0.M1 release of Riak support in Spring Data.
So, if you’re using Riak with Spring Data, you have Jon Brisbin to thank for the code that made it possible. Thanks, Jon.
I met Daniel Lindsley at StrangeLoop in October. Rusty Klophaus and I were helping him debug a somewhat punishing benchmarking test he was running against a three-node Riak cluster on his laptop (during a Cassandra talk) using Basho’s Python client. About a month later Daniel wrote a fantastic blog post called Getting Started With Riak & Python. Though his impressions of Riak were positive on the whole, one of the main pain points for Daniel was that the Python library had poor documentation. At the time, this was true. Though the library was quite mature as far as functionality goes, the docs had been neglected. I got in touch with Daniel, thanked him for the post, and let him know we were working on the docs. He mentioned he would take a stab at updating them if he had a free moment. Shortly thereafter Daniel sent over a huge pull request. He rewrote all of the Python documentation! And it’s beautiful. Check it out here.
Thanks to Daniel and the rest of the team at Pragmatic Badger, we have robust Python documentation. Thanks for the contribution.
Want to contribute to Riak? There is still much code to be written and the Riak community is a great place to work and play. Download the code, join us on IRC, or take a look at GitHub repos to get started.
October 1, 2012
New to Riak? Join us this Thursday for an intro to Riak webcast with Mark Phillips, Basho director of community, and Shanley Kane, director of product management. In this 30 minute talk we’ll cover the basics of:
- Good and bad use cases for Riak
- Some user stories of note
- Riak’s architecture: consistent hashing, hinted handoff, replication, gossip protocol and more
- APIs and client libraries
- Features for searching and aggregating data: Riak Search, Secondary Indexes and Map Reduce
- What’s new in the latest version of Riak
Register for the webcast here.
**October 03, 2012**
[PyConZA](http://za.pycon.org/) starts this Thursday in Cape Town, South Africa, and Basho is proud to be [a sponsor](http://za.pycon.org/sponsors_basho.html). As our blurb on the PyConZA site states:
Basho is proud to sponsor PyConZA because we believe in the power and future of the South African tech community. We look forward to the valuable, lasting technologies that will be coming out of Cape Town and the surrounding areas in the coming years.
You may remember that back in August we [put out the call](http://basho.com/blog/technical/2012/08/23/Riak-ambassadors-needed-for-PyCon-ZA/) for Riak Community Ambassadors to help us out with PyConZA. As hard as it is to miss out on a chance to go to The Mother City, it wasn’t feasible for us to make it with [RICON](http://ricon2012.com) happening next week. I’m happy to report that a few fans of durability have stepped forward as Ambassadors to make sure Riak is fully represented. If you’re lucky enough to be going to PyConZA (it’s sold out), be on the lookout for the following two characters:
### Joshua Maserow
### Mike Jones
In addition to taking in the talks, Mike and Joshua will be on hand to answer your Riak questions (or tell you where to look for the answers if they don’t have them). There will also be some production Riak users in the crowd, so you won’t have to look too far if you want to get up to speed on why you should be running Riak.
Enjoy PyConZA. And thanks again to Mike and Joshua for volunteering to represent Riak in our absence.
**October 22, 2012**
If you missed our last ‘Intro to Riak’ webcast, not to fear, we’re doing another. This Thursday (11am PT / 2pm ET), join Shanley, Basho’s director of product marketing, and Mark, director of community, for a 30 minute webcast introducing Riak’s architecture, use cases, user stories, operations and data model.
Register for the webcast [here](http://info.basho.com/IntroToRiakOct25.html).
**January 10, 2013**
Join us on Thursday, January 17 at 11 am PT / 2 pm ET for an intro to Riak webcast with Shanley Kane, director of product management and Mark Phillips, director of community management.
In 30 minutes, we’ll cover:
* Good and bad use cases for Riak
* User stories of note, including social, content, advertising and session storage
* Riak’s architecture and mechanisms for remaining highly available in failure conditions
* APIs, data model and client libraries
* Features for querying and searching data
* What’s new in the latest version of Riak and what’s next
[Sign up for the webcast here](http://info.basho.com/IntroToRiakJan17.html).
**January 02, 2013**
New to Riak? Thinking about using Riak instead of a relational database? Join Basho chief architect Andy Gross and director of product management Shanley Kane for an intro this Thursday (11am PT/2pm ET). In about 30 minutes, we’ll cover the basics of:
* Scalability benefits of Riak, including an examination of limitations around master/slave architectures and sharding, and what Riak does differently
* A look at the operational aspects of Riak and where they differ from relational approaches
* Riak’s data model and benefits for developers, as well as the tradeoffs and limitations of a key/value approach
* Migration considerations, including where to start when migrating existing apps
* Riak’s eventually consistent design
* Multi-site replication options in Riak
Register for the webcast [here](http://info.basho.com/RelationalToRiakJan3.html).
December 8, 2010
Thank you to all who attended the webinar yesterday. The turnout was great, and the questions at the end were very thoughtful. Since I didn’t get to answer very many during the session, I’ve answered all of the questions below, in no particular order.
Q: Can you touch on upcoming filtering of keys prior to map reduce? Will it essentially replace the need for one to explicitly name the bucket/key in a M/R job? Does it require a bucket list-keys operation?
Key filters, in the upcoming 0.14 release, will allow you to logically select a population of keys from a bucket before running them through MapReduce. This will be faster than a full-bucket map since it only loads the objects you’re really interested in (the ones that pass the filter). It’s a great way to make use of meaningful keys that have structure to them. So yes, it does require a list-keys operation, but it doesn’t replace the need to be explicit about which keys to select; there are still many useful queries that can be done when the keys are known ahead of time.
For more information on key-filters, see Kevin’s presentation on the upcoming MapReduce enhancements.
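As a concrete illustration, here is a sketch of a MapReduce job that uses key filters against the HTTP interface. The bucket name and key scheme (`logs`, keys like `basho-20101228`) are hypothetical, and running the query requires a live node; the built-in `Riak.mapValuesJson` map function and the `tokenize`/`eq` filter predicates are part of Riak itself.

```python
import json

# Build a MapReduce job (Riak 0.14+) that filters keys before mapping.
# Assumed key scheme: "<name>-<date>", e.g. "basho-20101228".
job = {
    "inputs": {
        "bucket": "logs",
        "key_filters": [
            ["tokenize", "-", 1],  # split each key on "-" and take the first token
            ["eq", "basho"],       # keep only keys whose first token is "basho"
        ],
    },
    "query": [
        # Built-in JavaScript map function: return each matching
        # object's value, parsed as JSON.
        {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}}
    ],
}

body = json.dumps(job)
# POST `body` to http://localhost:8098/mapred with
# Content-Type: application/json on a running node.
```

Only the keys that survive both filters are loaded from disk and handed to the map phase, which is where the speedup over a full-bucket map comes from.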
Q: How can you validate that you’ve reached a good/valid KV model when migrating a relational model?
The best way is to try out some models. The thing about schema design for Riak that turns your process on its head is that you design for optimizing queries, not for optimizing the data model. If your queries are efficient (single-key lookup as much as possible), you’ve probably reached a good model, but also weigh things like payload size, cost of updating, and difficulty manipulating the data in your application. If your design makes it substantially harder to build your application than a relational design, Riak may not be the right fit.
Q: Are there any “gotchas” when thinking of a bucket as we are used to thinking of a table?
Like tables, buckets can be used to group similar data together. However, buckets don’t automatically enforce data structure (columns with specified types, referential integrity) the way relational tables do; that part is still up to your application. You can, however, add precommit hooks to buckets to perform any data validation that you’d rather not handle in your application code.
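To make that concrete, a precommit hook is attached by updating the bucket’s properties. The sketch below only builds the JSON payload; the Erlang module and function names (`my_validators`, `validate_json`) are hypothetical placeholders for a module you would compile and put on the node’s code path yourself.

```python
import json

# Sketch: attach a precommit hook to a bucket via its properties.
# "my_validators" / "validate_json" are hypothetical names for your
# own Erlang module and function.
props = {
    "props": {
        "precommit": [
            {"mod": "my_validators", "fun": "validate_json"}
        ]
    }
}

body = json.dumps(props)
# PUT `body` to http://localhost:8098/riak/<bucket> with
# Content-Type: application/json to apply the hook to that bucket.
```

Once set, every write to the bucket passes through the hook, which can reject objects that fail validation before they are stored.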
Q: How would you create a ‘manual index’ in Riak? Doesn’t that need to always find unique keys?
One basic way to structure a manually-created index in Riak is to have a bucket specifically for the index. Keys in this bucket correspond to the exact value you are indexing (for fuzzy or incomplete values, use Riak Search). The objects stored at those keys have links or lists of keys that refer to the original object(s). You can then find the original simply by following the link or using MapReduce to extract and find the related keys.
The example I gave in the webinar Q&A was indexing users by email. To create the index, I would use a bucket named `users_by_email`. If I wanted to look up my own user object by email, I’d try to fetch the object at `firstname.lastname@example.org`, then follow the link in it (something like `riaktag="indexed"`) to find the actual data.
Whether those index values need to be unique is up to your application to design and enforce. For example, the index could be storing links to blog posts that have specific tags, in which case the index need not be unique.
To create the index, you’ll either have to perform multiple writes from your application (one for the data, one for the index), or add a commit hook to create and modify it for you.
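The pattern above boils down to two writes and a link. This sketch builds the HTTP paths and the Link header involved; the bucket and key names are illustrative, and actually sending the requests requires a running node. The `</riak/bucket/key>; riaktag="..."` Link header format and the `/_,tag,1` link-walking URL segment come from Riak’s HTTP interface.

```python
# Sketch of the two writes behind a manual email index.
user_bucket, user_key = "users", "seancribbs"          # illustrative names
index_bucket, email = "users_by_email", "first.last@example.org"

# Write 1: the user object itself.
user_url = "/riak/%s/%s" % (user_bucket, user_key)

# Write 2: the index object, keyed on the email address, whose
# Link header points back at the user object with riaktag "indexed".
index_url = "/riak/%s/%s" % (index_bucket, email)
link_header = '</riak/%s/%s>; riaktag="indexed"' % (user_bucket, user_key)

# Lookup by email later: fetch the index object and follow its link,
# or use the link-walking URL directly (walk links tagged "indexed",
# one step, returning the linked objects).
walk_url = "/riak/%s/%s/_,indexed,1" % (index_bucket, email)
```

If you add a commit hook instead, the second write happens automatically inside Riak whenever a user object is stored, keeping the index in sync without extra application code.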
Q: Can you compare/contrast buckets w/ Cassandra column families?
Cassandra has a very different data model from Riak, and you’ll want to consult their experts for a second opinion, but here’s what I know. Column families are a way to group related columns that you will always want to retrieve together, and they are something you design up-front (changes require restarting the cluster to take effect). It’s the closest thing to a relational table that Cassandra has.
Although you do use buckets to group similar data items, Riak’s buckets, in contrast:
- Don’t understand or enforce any internal structure of the values,
- Don’t need to be created or designed ahead of time, but pop into existence when you first use them, and
- Don’t require a restart to be used.
Q: How would part sharing be achieved? (this is a reference to the example given in the webinar, Radiant CMS)
Radiant shares content parts only when specified by the template language, and always by inheritance from ancestor pages. So if the layout contained `<r:content part="sidebar" inherit="true" />`, and the currently rendering page doesn’t have that content part, Radiant will look up the hierarchy until it finds it. This is one example of why it’s so important to have an efficient way to traverse the site hierarchy, and why I presented so many options.
Q: What is the max number of links an object can have for Link Walking?
There’s no cut-and-dried answer for this. Theoretically, you are limited only by storage space (disk and RAM) and the ability to retrieve the object from the desired interface. In practice, this means that the default HTTP interface limits you to around 100,000 links on a single object (based on previous discussions of the limits of HTTP packets and header lengths). Still, this is not going to be reasonable to deal with in your application. In some applications we’ve seen links on the order of hundreds per object negatively impact link-walking performance. If you need that many, you’ll be better off exploring other designs.
Again, thanks for attending! Look for our next webinar coming in about a month.
— Sean, Developer Advocate
December 2, 2010
A short while ago I made it known on the Riak Mailing list that the Basho Dev team was working on getting “more resources out the door to help the community better use Riak.” Today we are pleased to announce that Riak Function Contrib, one of these resources, is now live and awaiting your usage and code contributions.
What is Riak Function Contrib?
Riak Function Contrib is a home for MapReduce, Pre- and Post-Commit, and “Other” Functions or pieces of code that developers are using or might have a need for in their applications or testing. Put another way, it’s a Riak-specific function repository and library. Riak developers are using a lot of functions in a ton of different ways. So, we built Function Contrib to promote efficient development of Riak apps and to encourage a deeper level of community interaction through the use of this code.
How do I use it?
There are two primary ways to make use of Riak Function Contrib:
- Find a function – If, for instance, you needed a Pre-Commit Hook to validate a JSON document before you store it in Riak, you could use or adapt an existing function to your needs. No need to write it yourself!
- Contribute a function – Riak Function Contrib is far from being a complete library. That’s where you come in. If you have a function, script, or some other piece of code that you think may be beneficial to someone using Riak (or Riak Search for that matter), we want it. Head over to the Riak Function Contrib Repo on GitHub and check out the README for details.
In the meantime, the Basho Dev team will continue polishing up the site and GitHub repo to make it easier to use and contribute to.
We are excited about this. We hope you are, too. As usual, thanks for being a part of Riak.
More to come…
December 1, 2010
Moving applications to Riak involves a number of changes from the status quo of RDBMS systems, one of which is taking greater control over your schema design. You’ll have questions like: How do you structure data when you don’t have tables and foreign keys? When should you denormalize, add links, or create MapReduce queries? Where will Riak be a natural fit and where will it be challenging?
We invite you to join us for a free webinar on Tuesday, December 7 at 2:00PM Eastern Time to talk about Schema Design for Riak. We’ll discuss:
- Freeing yourself of the architectural constraints of the “relational” mindset
- Gaining a fuller understanding of your existing schema and its queries
- Strategies and patterns for structuring your data in Riak
- Tradeoffs of various solutions
We’ll address the above topics and more as we design a new Riak-powered schema for a web application currently powered by MySQL. The presentation will last 30 to 45 minutes, with time for questions at the end.
If you missed the previous version of this webinar in July, here’s your chance to see it! We’ll also use a different example this time, so even if you attended last time, you’ll probably learn something new.
Fill in the form below if you want to get started building applications on top of Riak!
Sorry, registration is closed! Video of the presentation will be posted on Vimeo after the webinar has ended.