March 30, 2015
This is the first post in a series of blog posts, entitled Riak Customer Stories, where we will look at common use cases for Riak and their applicability in specific verticals. Our first customer stories will focus on how Riak is helping Gaming companies achieve massive scalability.
Online gaming continues to grow in popularity, whether for huge gaming communities like Riot Games’ League of Legends or gaming sites like bet365, one of the world’s leading online gambling groups. This growth is forcing changes to existing infrastructure in order to keep up with demand and innovation. Traditional relational databases can’t meet the requirements for massive scalability, speed, and fault tolerance
Innovation is critical to retain long-term customer loyalty and is changing the way gamers play online. These changes include the move away from single bets on an event to in-game betting on an ever-increasing range of metrics. The advent of regional gaming competitions, like the League of Legends World Championship with an annual grand prize of $1 million, show just how far gaming has come.
Gaming on Riak
Companies who build games or betting sites use Riak in three key ways:
- Player Data – Riak provides low-latency, highly available data storage for key player data, including user and profile information, game performance, statistics and rankings, and more. Riak also provides many different tools for querying and indexing this data, such as a full-text search engine and secondary indexing.
- Session Storage – Riak is used to store and serve session data with predictable low-latency, which is necessary for game play. Riak imposes no restrictions on the type of content stored (since all objects are stored on disk as binaries), so session data can be encoded in many ways and can evolve without administrative changes to schemas.
- Global Data Locality – While gaming, players require a low-latency experience, regardless of their physical location. Interrupted or slow game play can lead to poor user experience and player abandonment. Riak Enterprise’s multi-datacenter capabilities allow game data to be physically close to players and for fast response times regardless of player location.
- Social Information – Riak is built for very fast data storage. Due to its inherent design and Riak’s simple key/value data model, Riak is ideal for storing and serving social content such as social graph information, player profiles, player relationships, social authentication accounts, and other types of social gaming data.
By using Riak, companies have achieved global availability, massive scalability, while still maintaining operational simplicity These benefits are derived from the core architectural decisions made in the design of Riak.
By design Riak is masterless. Each node in a Riak cluster is the same, containing a complete and independent copy of the Riak package. There is no “master” or coordinating node. This uniformity provides the basis for Riak’s fault-tolerance and scalability. When this is coupled with an even distribution of data around the cluster via consistent hashing, there is a significant decrease in risky “hot spots” in the database while lowering the operational burden associated with manually sharding data. In addition, new nodes can easily be added with automatic, minimal redistribution of data.
This distribution of data in a masterless system is supplemented with a process of “hinted handoff”. Hinted handoff lets Riak cleanly handle node failure. If a node fails, a neighboring node takes over its storage operations. When the failed node returns, any updates received by the neighboring node are handed back to it. This ensures availability for writes and updates and happens automatically.These are discussed in greater detail in a blog post entitled Why Riak Just Works.
Modeling Gaming Applications in Riak
The table below illustrates key/value mappings for common application types. Remember that values in Riak are opaque and stored on disk as binaries – JSON or XML documents, images, text, etc. Riak has a “schemaless” design. Objects are comprised of key/value pairs, which are stored in flat namespaces called “buckets.” The way data is organized in Riak should take into account the unique needs of the application, including access patterns such as read/write distribution, latency differences between various operations, use of Riak features (including MapReduce, Search, Secondary Indexes), and more.
Here are some common approaches to structuring gaming data with Riak’s key/value design:
|Player Data||Login, email, UUID||Player Attributes (often stored as a JSON document); Player Rewards and Stats|
|Social Data||Login, email, UUID||Player Profiles, Social Graph Information, Facebook/Twitter Tokens|
|Session Information||User/Session ID||Session Data|
|Image or Video Content||Content Name, ID, or Integer||.JPG, .PNG, .GIF or other image format; .MOV, .MPG, .MP4 or other video file format|
Gaming Customer Stories
In a recent webinar, Dan Macklin, Head of Research and Development at bet365, provided an overview of their decision making process in choosing Riak. As one of the world’s leading online gambling groups, with over 18 million customers in two hundred countries, bet365 has a unique perspective on making an informed, strategic decision when designing an always available application architecture.
In this webinar, Dan discussed:
- bet365’s journey to Riak
- The evaluation and technical challenges being addressed
- The triumphs of migrating to Riak
- Advice for anyone evaluating their database requirements
bet365 was faced with a massive scale issue. Their existing SQL, relational database solution was simply unable to keep up with the demand placed on it by their infrastructure without needed to incur the complexity and cost of sharding. The lack of scalability was causing undue stress on their infrastructured leading to a loss of availability. Of particular interest, for those sharing a similar decision making process, is that Dan discusses not only their search for a solution but their decision making process that ultimately identified Riak
The session is available for replay here.
At RICON 2014, Basho’s distributed systems conference for developers, Michal Ptaszek gave a session entitled Let’s Chat About Chat. This session provided detailed insight into how Riot Games built their League of Legends chat system with Riak to handle 70 million players.
In League of Legends, just as in any competitive team game, communication is essential to success. Therefore, when building Chat for the game we had to make sure that the new service would be absolutely rock solid in every respect. This includes not only guaranteed message delivery and consistent presence propagation across the system, but also maintenance of the created social network graph.
In this talk I would like to present how we achieved linear scalability for Chat, improved its overall fault tolerance, and got ready for the new features we wanted to ship. I will also discuss in detail why we migrated our data from MySQL to Riak and how we used CRDTs to deal with conflicting object updates.
As is thematic in gaming use cases, database scalability was a primary consideration and was an architectural consideration from the start. Riot Games started their application modeling with MySQL –a relational database– but hit multiple performance, reliability, and scalability issues. As an example, it simply was not possible to update the database schema fast enough to track changes made in code.
In addition, Riot Games leverages the multi-datacenter capabilities off Riak Enterprise to export persistent data to a secondary Riak cluster. Costly ETL queries, like social graph queries, are run on the secondary cluster without interrupting the primary cluster. This design pattern is often referred to as a “Secondary Analytics Cluster”.
Some statistics that highlight the immense scale that Riot deals with:
- 67 million unique players every month (not counting other services using chat)
- 27 million daily players
- 7.5 million concurrent players
- 1 billion events routed per server, per day, only using 20-30 percent of available CPU and RAM
- 11K messages per second
- A few hundred chat servers are deployed around the world. Managed by 3 people
- 99% uptime
To learn more about Riak in the Gaming and Gambling industry, there are several useful resources to begin your research and design your deployment.
- Riak Solution for Gaming – This Solution Brief discusses using Riak for a variety of gaming and gambling use cases.
- Riak Tech Talk – Our experienced team can help develop your use case, answer questions, and make sure you are successful at every step from development to production. We can arrange either in-person or virtual meetings, depending on availability and location.
- Why bet365 chose Riak – Get a better understanding of how to make informed strategic decisions directly from someone who has taken the journey. Dan Macklin, Head of Research and Development at bet365 will show you how. His story about choosing Riak will captivate anyone that needs to ensure their data is always available.
February 23, 2015
Over the last week, for a variety of reasons, the topic of security in the NoSQL space has become a prominent news item. Chief among these reasons was the announcement of a popular NoSQL database having multiple instances exposed to the public internet. From the headlines you might think that NoSQL solutions have inherent security problems. In fact, in some cases, the discussion is positioned intentionally as a relational vs. NoSQL issue. The reality is that NoSQL is not more or less secure than a traditional RDBMS.
The Security of any component of the technology stack is both the responsibility of the vendor providing the technology and those that are deploying it. How many routers are running with the default administrative password still set? Similarly, exposing any database, regardless of type, to the public internet without taking appropriate security precautions, including user authentication and authorization, is a “bad idea.” A base level of network security is an absolute requirement when deploying any data persistence utility. For Riak this can include:
- Appropriate physical security (including policies about root access)
- Securing the epmd listener port, handoff_port listener port, and the range ports specified in the riak.conf
- Defining users and optionally, groups (using Riak Security in Riak 2.0)
- Defining an authentication source for each user
- Granting necessary permissions to each user (and/or group)
- Checking Erlang MapReduce code for invocations of Riak modules other than riak_kv_mapreduce
- Ensuring your client software passes authentication information with each request, supports HTTPS or encrypted Protocol Buffers traffic
If you enable Riak security without having an established functioning SSL connection, all request to Riak will fail because Riak security (when enabled) requires a secure SSL connection. You will need to generate SSL certificates, enable SSL, and establish a certification configuration on each node.
The security discussion does not, however, end at the network. In fact, for those who are familiar with the Open Systems Interconnection model (OSI), a 7 layer conceptual model that characterizes and standardizes the internal functions of a communication system by partitioning it into abstraction layers, (ISO 7498-1) there is a corresponding security architecture reference (ISO 7498-2)…and that is just for the network. It is necessary to take adopt a comprehensive approach to security at every layer of the application stack…including the database.
The process of securing a database, which is only a component of the application stack, requires striking a fine balance. Basho has worked with large enterprise customers to ensure that Riak’s security architecture meets the needs of their application deployments and balances the effort required with the security, or compliance, requirements demanded by some of the worlds largest deployments.
NoSQL vs. Relational Security
As enterprises continue to adopt NoSQL more broadly, the question of security will continue to be raised. The reality is simple, it is necessary to evaluate the security of the database you are exploring in the same way that you would evaluate its scalability or availability characteristics. There is nothing inherent to the NoSQL market that makes it less, or more, secure that relational databases. It is true that some relational database, by aegis of their age and maturation, have more expansive security tooling available. However, when adopting a holistic, risk-based approach to security NoSQL solutions — like Riak — are as secure as required.
Security and Compliance
A compliance checklist (be it HIPAA or PCI) details, in varying specificity, the security requirements to achieve compliance. This checklist is subsequently verified through an audit by an independent entity…as well as ongoing internal audits.
So can I use NoSQL in compliant environments?
Without question, Yes. The difficulty of achieving compliance will depend on how the database is configured, what controls it provides for authentication and authorization, and many other elements of your application stack (including physical security of the datacenter, etc). Basho customers have deployed Riak in highly regulated environments and achieved their compliance requirements.
I would encourage you, however, to realize that compliance is an event. The process of securing your application, database, datacenter, etc. is an ongoing exercise. Many, particularly those in the payments industry, refer to this as a “risk-based” approach to security vs. a “compliance-based” approach.
Security and Riak
In nearly all commercial deployments of Riak, Riak is deployed on a trusted network and unauthorized access is restricted by firewall routing rules. This is expected, this is necessary and is sufficient for many use cases (when included as part of a holistic security posture including locking down ports, reasonable policies regarding root access, etc.). Some applications need an additional layer of security to meet business or regulatory compliance requirements.
To that end, in Riak 2.0, the security store changed substantially. While you should — without question — apply network layer security on top of Riak and the systems that Riak runs upon, there are now security features built into Riak that protect Riak itself, not just its network. This includes authentication (the process of identifying a user) and authorization (verifying whether the authenticated user has access to perform the requested operation). Riak’s new security features were explicitly modeled after user- and role-based systems like PostgreSQL. This means that the basic architecture of Riak Security should be familiar to most.
In Riak, administrators can selectively control access to a wide variety of Riak functionality. Riak Security allows you to both authorize users to perform specific tasks (from standard read/write/delete operations to search queries to managing bucket types and more) and to authenticate users and clients using a variety of security mechanisms. In other words, Riak operators can now verify who a connecting client is and determine what that client is allowed to do (if anything). In addition, Riak Security in 2.0 provides four options for security sources:
- trust — Any user accessing Riak from a specified IP may perform the permitted operations
- password — Authenticate with username and password (works essentially like basic auth)
- pam — Authenticate using a pluggable authentication module (PAM)
- certificate – Authenticate using client-side certificates
More detail on the Riak 2.0 Security capabilities are presented in the Security section of the documentation, in particular the section entitled Authentication and Authorization.
With a NoSQL system that provides authentication and authorization, and a properly secured network, you have progressed a long way in reducing the risk profile of your system. The application layer, of course, must still be considered.
Relational databases are still a part of the technology stack for many companies; others are innovating and incorporating NoSQL solutions either as a replacement for or alongside existing relational databases. As a result they have simplified their deployments, enhanced their availability, and reduced their costs.
Join us for this webinar where we will look at the differences between relational databases and NoSQL databases like Riak. We will look at why companies choose Riak over a relational database. We will analyze the decision points you should consider when choosing between relational and NoSQL databases and we will look at specific use cases, review data modeling and query options.
This Webinar is being held in two time slots:
- Wednesday, March 4, 2015 8:00-9:00 AM PST (4:00-5:00 PM GMT)
- Wednesday, March 4, 2015 12:00-1:00 PM PST (3:00-4:00 PM EST)
February 1st, 2015
If you missed last week’s webinar Preparing for the Deluge of Unstructured Data you can still watch it on-demand. Dorothy Pults and I discuss the news emanating from the 2015 Consumer Electronics show and highlight that the Internet of Thing, connected devices, and the resulting explosion of unstructured data are front and center of growth trends in 2015. In particular, we covered the topics of:
- What is driving the growth in unstructured data
- The challenges associated with managing unstructured data
- How companies are capitalizing on the opportunities that unstructured data presents, to save money, time, and create new market opportunities
The webinar covers each of these topic in great details and provides some insights on distributed systems.
Why Distributed Systems?
Companies like Facebook, Amazon, and Google have built huge distributed systems with strict requirements around scalability, fault tolerance, and global footprints. These same concepts must now be considered by companies of all sizes…from the Enterprise to the startup.
The reality is that everything works at small scale. Challenges arise as it becomes necessary to scale out, up and down, predictably and linearly. When assuming that failure and latency are part of the equation, it is necessary to choose a distributed database that enables horizontal scale. And, similarly, that it enables this scale on commodity hardware or the compute instance that your business has adopted in its architecture. This is particularly important when data governance is a key component of your design considerations.
Ultimately, the customer experience matters. When designing your distributed architecture, and choosing persistence solutions like Riak, ensure that there is a solution for the geographic distribution of data (like Riak Enterprise’s multi-datacenter replication capability) to provide low latency experiences for your customers, regardless of their physical location.
For more information on this topic space, we have compiled a few resources to enable your education and decision-making.
December 23, 2013
A few weeks ago, we hosted a webinar with 451 Research entitled, “Beyond NoSQL – Distributed Databases in Production.” This webinar featured Matt Aslett (Research Director at 451 Research), Bobby Patrick (EVP and CMO at Basho Technologies), and Wes Jossey (Systems Engineer at Tapjoy).
During this one-hour webinar, we discuss the history of NoSQL, the current NoSQL landscape, and then dive into Basho’s Riak. Wes Jossey also presents a case study from Riak User, Tapjoy, about how they use Riak as the cornerstone of their data management strategy. Finally, we wrap up with a look at what’s to come with Riak 2.0.
If you weren’t able to attend this webinar (or would like to rewatch it), the recording is now available. Simply register here to receive a link to watch the recording: info.basho.com/BeyondNoSQL_Recorded.html
December 9, 2013
Tomorrow (December 10th) at 10am PT/1pm ET, we will be hosting a live webinar, “Beyond NoSQL – Distributed Databases in Production.” This webinar will feature Matt Aslett (Research Director at 451 Research), Bobby Patrick (EVP and CMO at Basho Technologies), and Wes Jossey (Systems Engineer at Tapjoy). There are still seats available, and you can register here for more details.
This webinar will talk about the history of NoSQL and what issues NoSQL aimed to solve in regard to relational systems. It will then look at the current NoSQL landscape and architecture trends. From there, the webinar will focus on Basho’s Riak, a distributed NoSQL database, and some of its key features and use cases. Tapjoy, the mobile performance-based advertising platform (and Riak user) will discuss how they use Riak to provide reliable data locality to their customers and why they selected Riak to be the cornerstone of their data management strategy. Finally, it will wrap up with a look at what’s to come with Riak 2.0 and have a live question and answer session.
Be sure and register now for “Beyond NoSQL – Distributed Databases in Production.”
November 27, 2013
Join Basho and 451 Research on Tuesday, December 10th at 10am PT for a live webinar, “Beyond NoSQL – Distributed Databases in Production.”
This webinar will feature Matt Aslett, Research Director at 451 Research, and Bobby Patrick, EVP and CMO at Basho Technologies. This webinar will set the stage with NoSQL trends and adoption across various industries. It will then discuss some of the key benefits of distributed NoSQL systems and explore how systems like Riak are evolving.
Wes Jossey, Systems Engineer at Tapjoy, will also be joining the webinar to discuss how Tapjoy uses distributed databases to provide reliable data locality to their customers through multi-datacenter replication.
Register here for the free “Beyond NoSQL – Distributed Databases in Production” webinar.
March 12, 2013
Riak provides low-latency, highly available storage to power gaming platforms and applications. Gaming companies use Riak to store player and game data, session and social information, and a variety of gaming content and events. This post offers a quick look at the advantages of Riak and some user case studies. Later this week, we’ll publish an in-depth look at the common gaming use cases and examples of data modeling.
For a complete overview, download the whitepaper, “Gaming on Riak: A Technical Introduction.”
Advantages of Riak
- Support for Rapid Growth: Built for operational ease-of-use, Riak yields a near-linear performance and throughput increase as capacity is added.
- Low-Latency Design: Riak is designed to store data and serve requests predictably and quickly, even during peak times.
- Flexible, Reliable Storage: Riak has a flexible data model with redundancy built-in, and a number of mechanisms to maintain availability even in the event of node failure or network partition. Riak is content-agnostic, providing flexibility for document, image, video, and other storage.
- Multi-Datacenter Replication: Riak Enterprise’s multi-datacenter replication provides disaster recovery and data locality.
Hibernum is a creator and developer of unique gaming experiences that combine the latest in social gaming, top quality visuals and animations, and cutting edge design. They switched from a relational database to Riak due to its high availability, ability to scale to peak loads, and predictable operational cost. Riak is used to store user game information for one of their most popular social games. For more information about how Hibernum uses Riak, check out the complete case study.
Kiip is a platform that lets brands provide rewards to mobile gamers for in-game achievements. Kiip replaced MongoDB with Riak in order to achieve low read/write latencies and horizontal scalability. Kiip uses Riak for session and device data. To learn more about Kiip’s experience selecting Riak, check out this video by two of their engineers.
To learn more about how your gaming platform can benefit from Riak, download “Gaming on Riak: A Technical Introduction.” For more information about Riak, sign up for out webcast on Thursday, March 14.
January 31, 2013
This is the second in a series of blog posts covering Riak for retail and eCommerce platforms. To learn more, join our “Retail on Riak” webcast on Friday, February 8th or download the “Riak for Retail” whitepaper.
In our last post, we looked at three Riak users in the eCommerce/retail space. In this post, we will look at some common use cases for Riak and how to start building them with Riak’s key/value model and querying features.
- Shopping Carts: Riak’s focus on availability makes it attractive to retailers offering shopping carts and other “buy now” functionality. If the shopping cart is unavailable, loses product additions, or responds slowly to users, it has a direct impact on revenue and user trust.
- Product Catalogs: Retailers need to store anywhere from thousands to tens of thousands of inventory items and associated information – such as photos, descriptions, prices, and category information. Riak’s flexible, fast storage makes it a good fit for this type of data.
- User Information: As mobile, web, and multi-channel shopping become more social and personalized, retailers have to manage increasing amounts of user information. Riak scales efficiently to meet increased data and traffic needs and ensures user data is always available for online shopping experiences.
- Session Data: Riak provides a highly reliable platform for session storage. User/session IDs are usually stored in cookies or otherwise known at lookup time, a good fit for Riak’s key/value mode.
In Riak, objects are comprised of key/value pairs, which are stored in flat namespaces called “buckets.” Riak is content-type agnostic, and stores all objects on disk as binaries, giving retailers lots of flexibility to store anything they want. Here are some common approaches to modeling the data and services discussed above in Riak:
Riak provides several features for querying data:
Riak Search: Riak Search is a distributed, full-text search engine. It provides support for various MIME types & analyzers, and robust querying.
Possible Use Cases: Searching product information or product descriptions.
Secondary Indexing: Secondary Indexing (2i) gives developers the ability, at write time, to tag an object stored in Riak with one or more values, queryable by exact matches or ranges of an index.
Possible Use Cases: Tagging products with categories, special promotion identifiers, date ranges, price or other metadata.
Possible Use Cases: Filtering product information by tag, counting items, and extracting links to related products.
Check out our docs for more information on building applications and services with Riak.
January 29, 2013
This is the first in a series of blog posts covering the benefits Riak offers to developers and operators of retail and eCommerce platforms. To learn more, join our “Retail on Riak” webcast on Friday, February 8th.
As retailers grow and have to store more and more data, traditional relational databases aren’t always the best option. Retailers want to scale easily, without the operational burden of manual sharding. Meanwhile, business requirements demand their data is always available for reads and writes. Riak is a highly available, low latency distributed database that is ideal for retailers who need to serve product data quickly and maintain “always on” shopping experiences. Riak is based on architectural principles from Amazon. Riak is designed for high availability and scale so retailers can always serve customers, even under failure conditions, and rapidly grow to meet peak loads.
Retailers of all sizes have chosen Riak to power parts of their business, including:
- Best Buy: Best Buy is North America’s top specialty retailer of consumer electronics, personal computers, entertainment software, and appliances. Riak has been an integral part in the transformation push to re-platform Best Buy’s eCommerce platform. For more info, check out Best Buy’s talk from our 2012 developer conference, RICON.
- ideal: ideel is one of the fastest growing retailers with over 5 million members and more than 1,000 brand partners. They use Riak to serve HTML documents and user-specific products. ideel chose Riak to power their event-based shopping experience due to Riak’s ability to serve users information at low latency and provide ease of use and scale to ideel’s operations team. Check out the complete case study for more details.
- Copious: Copious is a social commerce marketplace that makes it easy for people to buy and sell the things they love. They currently store all registered accounts in Riak as well as the tokens that make it possible for users to authenticate with Copious via their Facebook or Twitter accounts. They chose to use Riak for their social login functionality because of its operational simplicity, which allows them to easily scale up without sharding and provides the high availability required for a smooth user experience. For more details, check out the complete Copious story on our blog.
For more information about the benefits of Riak for retailers and the retailers already using it, register for our “Retail on Riak” webcast on February 8th!
April 15, 2011
Thanks to all who attended Thursday’s webinar on Riak Operations. If you couldn’t make it you can find a screencast of the webinar below. You can also check out the slides directly.
This webinar took users through many of the different operational aspects of running a production Riak cluster. Topics covered include: basic Riak configuration, monitoring, performance, backups, and much more.