November 21, 2013

Tapjoy is a mobile advertising and monetization platform that allows end users to select personalized advertisements that they can engage with in exchange for rewards. Tapjoy is available on over one billion devices to users all over the world. Riak has been the cornerstone of their data management strategy for the past year.

Tapjoy’s global growth required the company to consider scalability. Their original infrastructure was built on SimpleDB; however, with billions of requests coming in on an average day, they started to experience latency problems, as well as limits on the size and location of the data being stored. With their growth straining their data store, they wanted to find a new solution that would guarantee performance and uptime, even at peak traffic.

“Tapjoy can’t have downtime ever, planned or unplanned,” stated Wes Jossey, Systems Engineer at Tapjoy. “If Tapjoy goes down, end users can’t interact with our platform and they leave, which is unacceptable to us and to our partners.” Due to Tapjoy’s high requirements for availability, scalability, and data redundancy, there were really only a few players in the space to choose from.

The Tapjoy team found that DynamoDB didn’t have all of the features they needed (especially Secondary Indexes, at that time); HBase wasn’t the right fit for their use case; and Cassandra was deemed too operationally intensive for their small team, based on information provided by third parties who had been using Cassandra in production for years. With Riak, the Tapjoy team estimates that they have been able to keep costs down, decrease engineering complexity, and reduce operational effort due to its ease of use and general stability.

With Riak, Tapjoy is able to meet its high availability mandate and the stringent low-latency requirements imposed by the real-time aspects of its platform, with requests as quick as 750 microseconds. Tapjoy stores 48TB of data in Riak and performs hundreds of thousands of reads and writes per second against its clusters.

Their current clusters are replicated across multiple data centers to allow for failover in the event of unexpected downtime in one of their main facilities. Tapjoy opted to become a Riak Enterprise customer not only to meet this replication requirement, but also because of the excellent customer support the Basho team is able to provide. “We rarely have issues with Riak, so I don’t get paged,” said Jossey. “Riak is a critical piece of our business, and it’s a huge relief that it just works.”

Tapjoy leverages many open source tools and cloud-based technologies to achieve the team’s “Get stuff done” philosophy. Their stack includes:

  • JavaScript, jQuery, Backbone.js, D3.js
  • Ruby on Rails, Java, Objective-C
  • Amazon Web Services
  • Chef, Ironfan, Sensu, RabbitMQ
  • Riak, MySQL, Couchbase, PostgreSQL, ZooKeeper
  • Hadoop, HBase, Vertica