By Adam Wray, CEO
In January this year we announced several significant milestones on our journey to becoming the world’s leading developer of database solutions for unstructured data. I’m pleased to tell you that one quarter into the New Year, we haven’t lost a single step, and there’s much more on the way.
Last week we announced that we have continued our bookings growth, with a 65% bookings increase from Q1 2014 to Q1 2015. Our licensing-to-services revenue ratio remains a very healthy 9:1, and innovative improvements to Riak continue unabated. But what’s most exciting to me is the core driver of our success: we have a strong product that meets the needs of a growing number of organizations that are faced with the challenge of unlocking the value of unstructured data.
The single most important dynamic separating strategic IT from simply functional IT is making the most of the data that we generate and store – deriving true business value from the ocean of unrelated data points being generated by a rapidly expanding range of applications. NoSQL databases, given their ability to scale quickly and to enable data retrieval quickly and reliably from completely unstructured data sets, are key to enabling the extraction of that value. But we don’t make it happen alone. Our ever-expanding partner ecosystem is critical to our success, and a few of them have truly taken great strides in the past few months to move adoption of Riak forward:
- Cloudsoft recently released tested, optimized Riak blueprints to help devops teams deploy applications faster and easier across a variety of clouds including Amazon Web Services (AWS) and IBM Softlayer.
- Our longtime partner Erlang Solutions developed WombatOAM, an operations and maintenance framework which provides full visibility into the state of Riak clusters.
- One of our most innovative customers, Tapjoy, has built a queuing system on top of Riak. They plan to open source this tool and work with Basho on future projects that leverage our highly-available distributed systems foundation.
In addition to these new partner-contributed capabilities, we also introduced a number of technical improvements to both Riak and Riak CS, our object storage solution, that greatly enhance not only performance, but also ease of deployment, scale and usability. Significantly, Riak 2.1 introduces the concept of “write once” buckets: buckets whose entries are intended to be written exactly once, and never updated or over-written. This development improves write speeds by up to 2x in some scenarios. You will find more about these new releases in a blog published last week by our Vice President of Product, Peter Coppola, available here.
We are extremely proud of our product, our people, our partners and our prospects for the future. The work that we’ve done and the foundation we’ve built, in both business and technical terms, have us perfectly poised for the next level of growth.
It’s going to be an exciting year! In the coming months we’ll unveil some very significant product developments that the team at Basho is very excited about. Thanks for your interest and I sincerely look forward to discussing these new developments with you in a blog post in the very near future.
March 30, 2015
This is the first post in a series of blog posts, entitled Riak Customer Stories, where we will look at common use cases for Riak and their applicability in specific verticals. Our first customer stories will focus on how Riak is helping Gaming companies achieve massive scalability.
Online gaming continues to grow in popularity, whether for huge gaming communities like Riot Games’ League of Legends or gaming sites like bet365, one of the world’s leading online gambling groups. This growth is forcing changes to existing infrastructure in order to keep up with demand and innovation. Traditional relational databases can’t meet the requirements for massive scalability, speed, and fault tolerance.
Innovation is critical to retain long-term customer loyalty and is changing the way gamers play online. These changes include the move away from single bets on an event to in-game betting on an ever-increasing range of metrics. The advent of regional gaming competitions, like the League of Legends World Championship with an annual grand prize of $1 million, shows just how far gaming has come.
Gaming on Riak
Companies who build games or betting sites use Riak in four key ways:
- Player Data – Riak provides low-latency, highly available data storage for key player data, including user and profile information, game performance, statistics and rankings, and more. Riak also provides many different tools for querying and indexing this data, such as a full-text search engine and secondary indexing.
- Session Storage – Riak is used to store and serve session data with predictable low-latency, which is necessary for game play. Riak imposes no restrictions on the type of content stored (since all objects are stored on disk as binaries), so session data can be encoded in many ways and can evolve without administrative changes to schemas.
- Global Data Locality – While gaming, players require a low-latency experience, regardless of their physical location. Interrupted or slow game play can lead to poor user experience and player abandonment. Riak Enterprise’s multi-datacenter capabilities allow game data to be stored physically close to players, providing fast response times regardless of player location.
- Social Information – Riak is built for very fast data storage. Due to its inherent design and Riak’s simple key/value data model, Riak is ideal for storing and serving social content such as social graph information, player profiles, player relationships, social authentication accounts, and other types of social gaming data.
By using Riak, companies have achieved global availability and massive scalability while still maintaining operational simplicity. These benefits are derived from the core architectural decisions made in the design of Riak.
By design Riak is masterless. Each node in a Riak cluster is the same, containing a complete and independent copy of the Riak package. There is no “master” or coordinating node. This uniformity provides the basis for Riak’s fault-tolerance and scalability. When this is coupled with an even distribution of data around the cluster via consistent hashing, there is a significant decrease in risky “hot spots” in the database while lowering the operational burden associated with manually sharding data. In addition, new nodes can easily be added with automatic, minimal redistribution of data.
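To make the idea of even distribution via consistent hashing more concrete, here is a minimal sketch in Python. It is a simplification and not Riak's actual ring or claim implementation: the 64-partition ring size, the `bucket/key` hash input, the round-robin assignment, and the names `partition_for`, `claim_partitions`, and `preference_list` are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

RING_SIZE = 64          # number of equal partitions (assumed for this sketch)
KEYSPACE = 2 ** 160     # SHA-1 digests are 160-bit integers


def partition_for(bucket: str, key: str, ring_size: int = RING_SIZE) -> int:
    """Hash bucket/key onto the ring and return the index of the owning partition."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * ring_size // KEYSPACE


def claim_partitions(nodes: list[str], ring_size: int = RING_SIZE) -> dict[int, str]:
    """Assign partitions to nodes round-robin so every node owns an equal share."""
    return {p: nodes[p % len(nodes)] for p in range(ring_size)}


def preference_list(partition: int, owners: dict[int, str], n_val: int = 3) -> list[str]:
    """Nodes holding the replicas: the owning partition plus the next ones on the ring."""
    ring_size = len(owners)
    return [owners[(partition + i) % ring_size] for i in range(n_val)]


owners = claim_partitions(["node1", "node2", "node3", "node4"])

# Count how many of 10,000 hypothetical player keys land on each node.
load = Counter(owners[partition_for("players", f"player-{i}")] for i in range(10_000))
replicas = preference_list(partition_for("players", "player-42"), owners)
print(load)
print(replicas)
```

Because every key is placed by its hash rather than by a hand-maintained shard map, no node is special, and the per-node counts printed above should come out roughly equal.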
This distribution of data in a masterless system is supplemented with a process of “hinted handoff”. Hinted handoff lets Riak cleanly handle node failure. If a node fails, a neighboring node takes over its storage operations. When the failed node returns, any updates received by the neighboring node are handed back to it. This ensures availability for writes and updates and happens automatically. These mechanisms are discussed in greater detail in a blog post entitled Why Riak Just Works.
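The hinted handoff flow described above can be sketched in a few lines of Python. This is an illustrative model under simplifying assumptions (one primary, one fallback, in-memory dicts); the `Node` class and the `write` and `handoff` functions are hypothetical names, not part of Riak.

```python
class Node:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data: dict[str, str] = {}               # writes this node owns
        self.hints: dict[str, dict[str, str]] = {}   # writes held on behalf of other nodes


def write(key: str, value: str, primary: Node, fallback: Node) -> Node:
    """Store on the primary if it is up; otherwise park the write on the fallback as a hint."""
    if primary.up:
        primary.data[key] = value
        return primary
    fallback.hints.setdefault(primary.name, {})[key] = value
    return fallback


def handoff(fallback: Node, recovered: Node) -> int:
    """Hand hinted writes back to the recovered node and return how many were moved."""
    pending = fallback.hints.pop(recovered.name, {})
    recovered.data.update(pending)
    return len(pending)


a, b = Node("a"), Node("b")
a.up = False
accepted_by = write("session:1", "v1", primary=a, fallback=b)  # still accepted while "a" is down
a.up = True
moved = handoff(b, a)                                          # "a" receives what it missed
```

The write succeeds even while the owning node is unavailable, which is the availability property the paragraph above describes.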
Modeling Gaming Applications in Riak
The table below illustrates key/value mappings for common application types. Remember that values in Riak are opaque and stored on disk as binaries – JSON or XML documents, images, text, etc. Riak has a “schemaless” design. Objects are comprised of key/value pairs, which are stored in flat namespaces called “buckets.” The way data is organized in Riak should take into account the unique needs of the application, including access patterns such as read/write distribution, latency differences between various operations, use of Riak features (including MapReduce, Search, Secondary Indexes), and more.
Here are some common approaches to structuring gaming data with Riak’s key/value design:
| Application Type | Key | Value |
|---|---|---|
| Player Data | Login, email, UUID | Player Attributes (often stored as a JSON document); Player Rewards and Stats |
| Social Data | Login, email, UUID | Player Profiles, Social Graph Information, Facebook/Twitter Tokens |
| Session Information | User/Session ID | Session Data |
| Image or Video Content | Content Name, ID, or Integer | .JPG, .PNG, .GIF or other image format; .MOV, .MPG, .MP4 or other video file format |
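As a rough illustration of the mappings in the table, the sketch below models buckets as plain Python dicts holding opaque bytes. It stands in for a real Riak client; the `put` and `get` helpers, the bucket names, and the sample player fields are assumptions made for this example.

```python
import json
import uuid

# Buckets modeled as dicts mapping key -> opaque bytes (a stand-in for a Riak client).
buckets: dict[str, dict[str, bytes]] = {"players": {}, "sessions": {}}


def put(bucket: str, key: str, value: bytes) -> None:
    buckets[bucket][key] = value


def get(bucket: str, key: str) -> bytes:
    return buckets[bucket][key]


player_id = str(uuid.uuid4())

# Player data: UUID as the key, attributes encoded as a JSON document.
put("players", player_id, json.dumps({"login": "summoner1", "rank": "gold", "wins": 12}).encode())

# Session information: a session ID as the key; any encoding works because values are opaque.
session_id = f"{player_id}:session"
put("sessions", session_id, b"\x00\x01binary-session-state")

profile = json.loads(get("players", player_id))
```

Since the store never inspects the value, the session encoding can change over time without any schema migration.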
Gaming Customer Stories
In a recent webinar, Dan Macklin, Head of Research and Development at bet365, provided an overview of their decision making process in choosing Riak. As one of the world’s leading online gambling groups, with over 18 million customers in two hundred countries, bet365 has a unique perspective on making an informed, strategic decision when designing an always available application architecture.
In this webinar, Dan discussed:
- bet365’s journey to Riak
- The evaluation and technical challenges being addressed
- The triumphs of migrating to Riak
- Advice for anyone evaluating their database requirements
bet365 was faced with a massive scale issue. Their existing relational SQL database solution was simply unable to keep up with the demand placed on it without incurring the complexity and cost of sharding. The lack of scalability was causing undue stress on their infrastructure, leading to a loss of availability. Of particular interest, for those facing a similar decision, is that Dan discusses not only their search for a solution but also the decision making process that ultimately identified Riak.
The session is available for replay here.
At RICON 2014, Basho’s distributed systems conference for developers, Michal Ptaszek gave a session entitled Let’s Chat About Chat. This session provided detailed insight into how Riot Games built their League of Legends chat system with Riak to handle 70 million players.
In League of Legends, just as in any competitive team game, communication is essential to success. Therefore, when building Chat for the game we had to make sure that the new service would be absolutely rock solid in every respect. This includes not only guaranteed message delivery and consistent presence propagation across the system, but also maintenance of the created social network graph.
In this talk I would like to present how we achieved linear scalability for Chat, improved its overall fault tolerance, and got ready for the new features we wanted to ship. I will also discuss in detail why we migrated our data from MySQL to Riak and how we used CRDTs to deal with conflicting object updates.
As is thematic in gaming use cases, database scalability was a primary architectural consideration from the start. Riot Games started their application modeling with MySQL, a relational database, but hit multiple performance, reliability, and scalability issues. As an example, it simply was not possible to update the database schema fast enough to track changes made in code.
In addition, Riot Games leverages the multi-datacenter capabilities of Riak Enterprise to export persistent data to a secondary Riak cluster. Costly ETL queries, like social graph queries, are run on the secondary cluster without interrupting the primary cluster. This design pattern is often referred to as a “Secondary Analytics Cluster”.
Some statistics that highlight the immense scale that Riot deals with:
- 67 million unique players every month (not counting other services using chat)
- 27 million daily players
- 7.5 million concurrent players
- 1 billion events routed per server, per day, only using 20-30 percent of available CPU and RAM
- 11K messages per second
- A few hundred chat servers deployed around the world, managed by 3 people
- 99% uptime
To learn more about Riak in the Gaming and Gambling industry, there are several useful resources to begin your research and design your deployment.
- Riak Solution for Gaming – This Solution Brief discusses using Riak for a variety of gaming and gambling use cases.
- Riak Tech Talk – Our experienced team can help develop your use case, answer questions, and make sure you are successful at every step from development to production. We can arrange either in-person or virtual meetings, depending on availability and location.
- Why bet365 chose Riak – Get a better understanding of how to make informed strategic decisions directly from someone who has taken the journey. Dan Macklin, Head of Research and Development at bet365 will show you how. His story about choosing Riak will captivate anyone that needs to ensure their data is always available.
Yesterday it was announced that Apple has acquired FoundationDB. As you may imagine, I have been asked to comment on what this means for the NoSQL database industry and for those who are investing heavily in retooling their traditional database infrastructures with new technologies to meet the availability, scalability, and fault-tolerance characteristics required by the massive influx of data.
NoSQL databases are an increasingly critical part of enterprises’ ability to derive real business value from the massive amounts of data that users, devices and online systems generate. They are also an important part of the developers’ toolkits when building applications for the Internet of Things, a major contributor to this ever-growing body of data. Apple is acutely aware of the importance of being able to reliably scale to meet the real-time data needs of today’s global applications. The news of Apple’s intent to acquire FoundationDB greatly amplifies these points to a growing number of IT and engineering leaders.
Part of the commentary around the announcement is the discussion of Open Source software both as an underpinning for enterprise infrastructure and as a viable business model. I contributed to a detailed discussion about the latter in a recent article on Silicon Angle entitled NoSQL market frames larger debate: Can open source be profitable?, noting that there is enormous opportunity for Open Source NoSQL companies if they can serve the specific needs of enterprise customers. We feel that we are doing so, and that our approximately 1:10 ratio of paying customers to Open Source users is an indicator of our solution’s value and the strength of our business. Our clear path to being cash-flow-positive includes a measured, strategic investment in R&D, which is essential to ensuring Basho’s corporate viability for all customers who have already made multi-million dollar investments in their business critical workloads.
Unlike others, the core underpinnings of Riak as a distributed, multi-model data persistence platform are, and will remain, Open Source. Basho builds premium, enterprise-grade features atop this distributed infrastructure, and these features help us attract a higher percentage of paying customers than others in the industry.
Acquisition and consolidation, whether done to enhance technical capabilities, secure talent, or expand a company’s customer base, are essential to the high technology arena. The NoSQL space will be the focus of more of this activity than most in the coming year, given the amount of attention it has already received, with PwC naming NoSQL as one of the “surprising digital bets for 2015”, and given the success of the Hortonworks IPO. Combine that buzz with the fact that a prominent database ranking tool lists more than 200 different database management systems, and we are certain to see more industry consolidation.
The decision to re-architect an existing enterprise data workload infrastructure is not one to be taken lightly. Basho’s commitment to Open Source, our commitment to long-term business viability, and our impressive list of customers making substantial investments, point to a bright future not only for our company but for those who choose Riak as a core underpinning of their persistence infrastructure. Apple’s acquisition of FoundationDB strongly validates the value of the solutions we offer and underscores the criticality of these technologies to companies that need to scale business-critical applications.
February 23, 2015
Over the last week, for a variety of reasons, the topic of security in the NoSQL space has become a prominent news item. Chief among these reasons was the announcement of a popular NoSQL database having multiple instances exposed to the public internet. From the headlines you might think that NoSQL solutions have inherent security problems. In fact, in some cases, the discussion is positioned intentionally as a relational vs. NoSQL issue. The reality is that NoSQL is not more or less secure than a traditional RDBMS.
The security of any component of the technology stack is the responsibility of both the vendor providing the technology and those that are deploying it. How many routers are running with the default administrative password still set? Similarly, exposing any database, regardless of type, to the public internet without taking appropriate security precautions, including user authentication and authorization, is a “bad idea.” A base level of network security is an absolute requirement when deploying any data persistence utility. For Riak this can include:
- Appropriate physical security (including policies about root access)
- Securing the epmd listener port, handoff_port listener port, and the range of ports specified in the riak.conf
- Defining users and optionally, groups (using Riak Security in Riak 2.0)
- Defining an authentication source for each user
- Granting necessary permissions to each user (and/or group)
- Checking Erlang MapReduce code for invocations of Riak modules other than riak_kv_mapreduce
- Ensuring your client software passes authentication information with each request and supports HTTPS or encrypted Protocol Buffers traffic
If you enable Riak security without having an established, functioning SSL connection, all requests to Riak will fail because Riak security (when enabled) requires a secure SSL connection. You will need to generate SSL certificates, enable SSL, and establish a certificate configuration on each node.
The security discussion does not, however, end at the network. In fact, for those who are familiar with the Open Systems Interconnection (OSI) model (ISO 7498-1), a 7 layer conceptual model that characterizes and standardizes the internal functions of a communication system by partitioning it into abstraction layers, there is a corresponding security architecture reference (ISO 7498-2), and that is just for the network. It is necessary to adopt a comprehensive approach to security at every layer of the application stack, including the database.
The process of securing a database, which is only a component of the application stack, requires striking a fine balance. Basho has worked with large enterprise customers to ensure that Riak’s security architecture meets the needs of their application deployments and balances the effort required with the security, or compliance, requirements demanded by some of the world’s largest deployments.
NoSQL vs. Relational Security
As enterprises continue to adopt NoSQL more broadly, the question of security will continue to be raised. The reality is simple: it is necessary to evaluate the security of the database you are exploring in the same way that you would evaluate its scalability or availability characteristics. There is nothing inherent to the NoSQL market that makes it less, or more, secure than relational databases. It is true that some relational databases, by virtue of their age and maturation, have more expansive security tooling available. However, when adopting a holistic, risk-based approach to security, NoSQL solutions like Riak are as secure as required.
Security and Compliance
A compliance checklist (be it HIPAA or PCI) details, in varying specificity, the security requirements to achieve compliance. This checklist is subsequently verified through an audit by an independent entity…as well as ongoing internal audits.
So can I use NoSQL in compliant environments?
Without question, Yes. The difficulty of achieving compliance will depend on how the database is configured, what controls it provides for authentication and authorization, and many other elements of your application stack (including physical security of the datacenter, etc). Basho customers have deployed Riak in highly regulated environments and achieved their compliance requirements.
I would encourage you, however, to realize that compliance is an event. The process of securing your application, database, datacenter, etc. is an ongoing exercise. Many, particularly those in the payments industry, refer to this as a “risk-based” approach to security vs. a “compliance-based” approach.
Security and Riak
In nearly all commercial deployments of Riak, Riak is deployed on a trusted network and unauthorized access is restricted by firewall routing rules. This is expected, this is necessary and is sufficient for many use cases (when included as part of a holistic security posture including locking down ports, reasonable policies regarding root access, etc.). Some applications need an additional layer of security to meet business or regulatory compliance requirements.
To that end, in Riak 2.0, the security story changed substantially. While you should, without question, apply network layer security on top of Riak and the systems that Riak runs upon, there are now security features built into Riak that protect Riak itself, not just its network. This includes authentication (the process of identifying a user) and authorization (verifying whether the authenticated user has access to perform the requested operation). Riak’s new security features were explicitly modeled after user- and role-based systems like PostgreSQL. This means that the basic architecture of Riak Security should be familiar to most.
In Riak, administrators can selectively control access to a wide variety of Riak functionality. Riak Security allows you to both authorize users to perform specific tasks (from standard read/write/delete operations to search queries to managing bucket types and more) and to authenticate users and clients using a variety of security mechanisms. In other words, Riak operators can now verify who a connecting client is and determine what that client is allowed to do (if anything). In addition, Riak Security in 2.0 provides four options for security sources:
- trust — Any user accessing Riak from a specified IP may perform the permitted operations
- password — Authenticate with username and password (works essentially like basic auth)
- pam — Authenticate using a pluggable authentication module (PAM)
- certificate – Authenticate using client-side certificates
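The split between authentication (who is connecting) and authorization (what they may do) can be sketched as follows. This is a simplified model and not Riak's implementation: it covers only the trust and password sources, applies the first matching source, and the `User`, `Source`, `authenticate`, and `authorize` names are hypothetical. The permission strings `riak_kv.get` and `riak_kv.put` are used here purely as illustrative labels.

```python
import hmac
import ipaddress
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    password: str = ""
    permissions: set[str] = field(default_factory=set)   # e.g. "riak_kv.get"


@dataclass
class Source:
    cidr: str   # network the rule applies to
    kind: str   # "trust" or "password" in this sketch


def authenticate(user: User, client_ip: str, password: str, sources: list[Source]) -> bool:
    """Apply the first source whose network contains the client address."""
    ip = ipaddress.ip_address(client_ip)
    for source in sources:
        if ip in ipaddress.ip_network(source.cidr):
            if source.kind == "trust":
                return True
            if source.kind == "password":
                # Constant-time comparison avoids leaking how much of the password matched.
                return hmac.compare_digest(password.encode(), user.password.encode())
    return False   # no matching source: reject


def authorize(user: User, permission: str) -> bool:
    """A user may perform an operation only if the permission was explicitly granted."""
    return permission in user.permissions


sources = [Source("127.0.0.1/32", "trust"), Source("10.0.0.0/8", "password")]
alice = User("alice", password="s3cret", permissions={"riak_kv.get"})
```

With this setup, a local connection is trusted outright, a connection from the `10.0.0.0/8` network must present the password, and anything else is rejected before authorization is even considered.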
More detail on the Riak 2.0 Security capabilities is presented in the Security section of the documentation, in particular the section entitled Authentication and Authorization.
With a NoSQL system that provides authentication and authorization, and a properly secured network, you have progressed a long way in reducing the risk profile of your system. The application layer, of course, must still be considered.
Relational databases are still a part of the technology stack for many companies; others are innovating and incorporating NoSQL solutions either as a replacement for or alongside existing relational databases. As a result they have simplified their deployments, enhanced their availability, and reduced their costs.
Join us for this webinar where we will look at the differences between relational databases and NoSQL databases like Riak. We will look at why companies choose Riak over a relational database. We will analyze the decision points you should consider when choosing between relational and NoSQL databases and we will look at specific use cases, review data modeling and query options.
This Webinar is being held in two time slots:
- Wednesday, March 4, 2015 8:00-9:00 AM PST (4:00-5:00 PM GMT)
- Wednesday, March 4, 2015 12:00-1:00 PM PST (3:00-4:00 PM EST)
January 22, 2015
In speaking with Riak users, both open source and commercial, we are frequently told that Riak’s key/value model is more flexible and faster to develop against than a traditional relational database. Even though Riak is well suited for many applications, there are inevitable tradeoffs in terms of query options and data types that are available. With a key/value model, there is no concept of columns or rows; therefore, Riak does not have join operations. Riak can be queried directly via HTTP, via the Protocol Buffers API, or through various client libraries. However, there is no SQL or SQL-like language that is currently available.
Riak’s key/value data model does not preclude queryability. There are several powerful querying options including:
- Riak Search: Integration with Apache Solr provides full-text search and support for Solr’s client query APIs.
- Secondary Indexes: Secondary Indexes (2i) give developers the ability to tag an object stored in Riak with one or more query values. Indexes can be either integers or strings, and can be queried by either exact matches or ranges of values.
- MapReduce: Developers can leverage Riak MapReduce for tasks like filtering documents by tag, counting words in documents, and extracting links to related data.
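To show what tagging objects with secondary indexes and querying by exact match or range looks like in practice, here is a small in-memory sketch. It is not the Riak client API: the `IndexedBucket` class and its methods are hypothetical, and the `_bin` and `_int` suffixes simply mark string and integer indexes in this example.

```python
from collections import defaultdict


class IndexedBucket:
    """In-memory stand-in for a bucket with secondary indexes (not the Riak client API)."""

    def __init__(self):
        self.objects: dict[str, bytes] = {}
        # index name -> list of (index value, object key)
        self.indexes: dict[str, list[tuple]] = defaultdict(list)

    def put(self, key: str, value: bytes, indexes: dict) -> None:
        self.objects[key] = value
        # Drop index entries from any earlier version of this object before re-tagging it.
        for entries in self.indexes.values():
            entries[:] = [(v, k) for v, k in entries if k != key]
        for name, index_value in indexes.items():
            self.indexes[name].append((index_value, key))

    def query_exact(self, name: str, value) -> list[str]:
        return sorted(k for v, k in self.indexes[name] if v == value)

    def query_range(self, name: str, low, high) -> list[str]:
        return sorted(k for v, k in self.indexes[name] if low <= v <= high)


players = IndexedBucket()
players.put("p1", b"{}", {"region_bin": "euw", "level_int": 30})
players.put("p2", b"{}", {"region_bin": "na", "level_int": 12})
players.put("p3", b"{}", {"region_bin": "euw", "level_int": 18})

in_euw = players.query_exact("region_bin", "euw")
mid_level = players.query_range("level_int", 10, 20)
```

Both queries return object keys rather than values, so a follow-up fetch by key retrieves the objects themselves.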
For more information, check out the Riak documentation on Querying Data.
The table below illustrates key/value mappings for common application types. Remember that values in Riak are opaque and stored on disk as binaries – JSON or XML documents, images, text, etc. The way data is organized in Riak should take into account the unique needs of the application, including access patterns such as read/write distribution, latency differences between various operations, use of Riak features (including MapReduce, Search, Secondary Indexes), and more.
| Application Type | Key | Value |
|---|---|---|
| Session | User/Session ID | Session Data |
| Advertising | Campaign ID | Ad Content |
| Sensor | Date, Date/Time | Sensor Updates |
| User Data | Login, eMail, UUID | User Attributes |
| Content | Title, Integer | Text, JSON/XML/HTML Document, Images, etc. |
Consider, for example, one of the canonical use cases for Riak: storing user and session data. In a relational database, the “users” table is well known and, basically, provides a unique identifier per user, followed by a series of identifying attributes about that user as individual columns, such as:
- First name
- Last name
- Counter of Site Visits
- Paid Account Identifier
This data can then be used to correlate or count paid users, common interests, etc. via a series of SQL queries against the row/column structure of the users table.
Riak, in contrast, provides flexibility in how this data can be modeled based upon the application use case. It may be desirable to create a Users bucket, with the UserName (or Unique Identifier) as the key and a JSON object storing all user attributes as the value. Or, as we describe in Data Modeling with Riak Data Types, leverage the power of Riak Data Types by creating a map type for each user storing:
- first and last name strings in the register type,
- interests as a set,
- a counter for visits,
- and a flag for paid account identifier.
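The per-user map described above can be approximated with a small mergeable structure. The sketch below is a deliberately simplified model and not how Riak Data Types are implemented: the set and flag only grow (no removal or disabling), registers resolve by a caller-supplied timestamp, and `UserMap` and `merge` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class UserMap:
    # Registers: (timestamp, value); the later timestamp wins on merge.
    first_name: tuple[int, str] = (0, "")
    last_name: tuple[int, str] = (0, "")
    interests: set[str] = field(default_factory=set)       # set: merged by union
    visits: dict[str, int] = field(default_factory=dict)   # counter: per-actor increments
    paid_account: bool = False                              # flag: stays enabled once set

    def visit(self, actor: str) -> None:
        self.visits[actor] = self.visits.get(actor, 0) + 1

    def total_visits(self) -> int:
        return sum(self.visits.values())


def merge(a: UserMap, b: UserMap) -> UserMap:
    """Combine two replicas; the result is the same whichever order they are merged in."""
    actors = a.visits.keys() | b.visits.keys()
    return UserMap(
        first_name=max(a.first_name, b.first_name),
        last_name=max(a.last_name, b.last_name),
        interests=a.interests | b.interests,
        visits={actor: max(a.visits.get(actor, 0), b.visits.get(actor, 0)) for actor in actors},
        paid_account=a.paid_account or b.paid_account,
    )


replica1 = UserMap(first_name=(1, "Ada"), last_name=(1, "Lovelace"))
replica1.interests.add("chess")
replica1.visit("node1")

replica2 = UserMap(first_name=(1, "Ada"), last_name=(2, "King"))
replica2.interests.add("math")
replica2.visit("node2")
replica2.visit("node2")
replica2.paid_account = True

merged = merge(replica1, replica2)
```

The point of the design is that concurrent updates on different replicas converge to one value without the application writing its own conflict resolution.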
One of the best ways to enable application interaction with objects (a key/value pair) in Riak is to provide structured bucket and key names for the objects. This approach often involves wrapping information about the object in the object’s location data itself.
For example, appending a timestamp, UUID, or geographical coordinate to a key’s name allows for fine grained queryability via simple lookup to locate and retrieve a specific set of information. Leveraging the same naming mechanism as created for users (UniqueID as the key) enables, in a separate sessions bucket, storing the UUID appended with a timestamp as the key and the session data (in binary format) as the object. In this way, using the same UUID, I am able to obtain both user and session data stored in different buckets and in different formats.
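A short sketch of this naming scheme, using plain dicts in place of buckets: the `_` separator, the `session_key` helper, and the sample values are assumptions made for illustration, not a convention that Riak prescribes.

```python
import json
import uuid

users: dict[str, bytes] = {}      # stand-in for a "users" bucket
sessions: dict[str, bytes] = {}   # stand-in for a "sessions" bucket


def session_key(user_id: str, timestamp: int) -> str:
    """Build a session key by appending a timestamp to the user's UUID."""
    return f"{user_id}_{timestamp}"


user_id = str(uuid.uuid4())
login_time = 1420070400   # example Unix timestamp

# User data is stored as JSON under the UUID; session data is stored as raw bytes
# under a key derived from the same UUID.
users[user_id] = json.dumps({"first_name": "Ada", "last_name": "Lovelace"}).encode()
sessions[session_key(user_id, login_time)] = b"\x01\x02session-bytes"

# Knowing only the UUID and the timestamp is enough to look up both objects directly.
user = json.loads(users[user_id])
session = sessions[session_key(user_id, login_time)]
```

Because the key can be computed from information the application already has, both reads are simple lookups with no query step.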
For additional information, and more complex considerations such as modeling relationships and advanced social applications, see the Riak documentation on use cases and data modeling.
Resolving Data Conflicts
In any system that replicates data, conflicts can arise – e.g., if two clients update the same object at the exact same time or if not all updates have yet reached hardware that is experiencing lag. Riak is “eventually consistent” – while data is always available, not all replicas may have the most recent update at the same time, causing brief periods (generally on the order of milliseconds) of inconsistency while all state changes are synchronized.
However, Riak does provide features to detect and help resolve the statistically small number of incidents when data conflicts occur. When a read request is performed, Riak looks up all replicas for that object. By default, Riak will return the most updated version, determined by looking at the object’s vector clock. Vector clocks are metadata attached to each replica when it is created. They are extended each time a replica is updated to keep track of versions. Clients can also be allowed to resolve conflicts themselves.
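The way a vector clock distinguishes a newer version from a genuine conflict can be sketched as below. This is a textbook model rather than Riak's internal representation; the actor names and the `increment`, `descends`, and `compare` helpers are hypothetical.

```python
VectorClock = dict[str, int]   # actor id -> number of updates seen from that actor


def increment(clock: VectorClock, actor: str) -> VectorClock:
    """Return a copy of the clock extended with one more update by the given actor."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if a has seen every update that b has seen (a is equal to or newer than b)."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"   # neither version has seen the other's update


v1 = increment({}, "client_x")
v2 = increment(v1, "client_y")   # client_y updated after reading v1
v3 = increment(v1, "client_z")   # client_z also updated after reading v1, concurrently
```

Here `v2` and `v3` both descend from `v1` but not from each other, which is exactly the case where the database or the client has to resolve the conflict.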
Further, when an outdated object is discovered as part of a read request, Riak will automatically update the out-of-sync replica to make it consistent. Read Repair, a self-healing property of the database, will even update a replica that returns a “not_found” in the event that a node loses it due to physical failure.
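Read repair can be illustrated with a few replicas held in memory. As a simplifying assumption, this sketch orders versions with a plain integer instead of a vector clock, and `read_with_repair` is a hypothetical name; a missing key stands in for a replica that returns “not_found”.

```python
from typing import Optional

Replica = dict[str, tuple[int, bytes]]   # key -> (version, value)


def read_with_repair(key: str, replicas: list[Replica]) -> Optional[bytes]:
    """Return the newest value for key and bring stale or missing replicas up to date."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda item: item[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest   # repair the out-of-sync copy as a side effect of the read
    return newest[1]


r1: Replica = {"k": (2, b"new")}
r2: Replica = {"k": (1, b"old")}   # stale copy
r3: Replica = {}                   # copy lost, e.g. after a disk failure

value = read_with_repair("k", [r1, r2, r3])
```

After the single read, all three replicas hold the same version again.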
Riak also features “Active Anti-Entropy,” which is an automatic self-healing property that runs in the background. Rather than waiting for a read request to trigger a replica repair (as with Read Repair), Active Anti-Entropy constantly uses a hash tree exchange to compare replicas of objects and automatically repairs or updates any that are divergent, missing, or corrupt. This can be beneficial for large clusters storing “stale” data.
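The hash tree exchange can be approximated with a two-level tree: one hash per segment of keys and one root hash over the segments. The sketch below is a simplification under assumed parameters (16 segments, SHA-1, in-memory dicts), and the function names are hypothetical; it only shows why comparing hashes is cheaper than comparing every object.

```python
import hashlib

SEGMENTS = 16   # number of leaf segments (arbitrary for this sketch)


def _h(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def segment_of(key: str) -> int:
    """Place a key in a segment based on the hash of its name."""
    return int(_h(key.encode()), 16) % SEGMENTS


def build_tree(replica: dict[str, bytes]) -> tuple[str, list[str]]:
    """Return (root hash, per-segment hashes) for a replica's contents."""
    segments: list[list[str]] = [[] for _ in range(SEGMENTS)]
    for key in sorted(replica):
        segments[segment_of(key)].append(f"{key}:{_h(replica[key])}")
    leaves = [_h("|".join(entries).encode()) for entries in segments]
    return _h("".join(leaves).encode()), leaves


def divergent_keys(a: dict[str, bytes], b: dict[str, bytes]) -> set[str]:
    root_a, leaves_a = build_tree(a)
    root_b, leaves_b = build_tree(b)
    if root_a == root_b:
        return set()   # identical replicas: a single comparison is enough
    differing = {i for i in range(SEGMENTS) if leaves_a[i] != leaves_b[i]}
    # Only keys in segments whose hashes differ need to be compared one by one.
    candidates = {k for k in a.keys() | b.keys() if segment_of(k) in differing}
    return {k for k in candidates if a.get(k) != b.get(k)}


replica_a = {"k1": b"1", "k2": b"2", "k3": b"3"}
replica_b = {"k1": b"1", "k2": b"stale"}   # k2 diverged, k3 is missing

to_repair = divergent_keys(replica_a, replica_b)
```

When two replicas agree, the comparison stops at the root hash; only divergent segments are examined key by key.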
More information on vector clocks, dotted version vectors, and conflict resolution can be found in the online documentation in the section regarding Causal Context.
Multi-site replication is quickly becoming critical for many of today’s platforms and applications. Not only does replication across multiple clusters provide geographic data locality – the ability to serve global traffic at low-latencies – it can also be an integral part of a disaster recovery or backup strategy. Other teams may use multi-site replication to maintain secondary data stores, both for failover as well as for performing intensive computation without disrupting production load. Multi-site replication is included in Basho’s commercial extension to Riak, Riak Enterprise, which also includes 24/7 support.
Multi-site replication in Riak works differently than the typical approach seen in the relational world, multi-master replication. In Riak’s multi-datacenter replication, one cluster acts as a “primary cluster.” The primary cluster handles replication requests from one or more “secondary clusters” (generally located in datacenters in other regions or countries). If the datacenter with the primary cluster goes down, a secondary cluster can take over as the primary cluster. In this sense, Riak’s multi-datacenter capabilities are “masterless.”
In multi-datacenter replication, there are two primary modes of operation: full sync and real-time. In full sync mode, a complete synchronization occurs between primary and secondary cluster(s). In real-time mode, continual, incremental synchronization occurs – replication is triggered by new updates. Full sync is performed upon initial connection of a secondary cluster, and then periodically (by default, every 6 hours). Full sync is also triggered if the TCP connection between primary and secondary clusters is severed and then recovered.
Data transfer is unidirectional (primary->secondary). However, bidirectional synchronization can be achieved by configuring a pair of connections between clusters.
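The difference between the two modes can be sketched with two in-memory clusters. This is an illustrative model, not Riak Enterprise's replication protocol: the `Cluster` class, its `sinks` list, and the `full_sync` function are hypothetical names, and conflict handling is left out entirely.

```python
class Cluster:
    def __init__(self, name: str):
        self.name = name
        self.data: dict[str, bytes] = {}
        self.sinks: list["Cluster"] = []   # clusters receiving real-time updates from this one

    def put(self, key: str, value: bytes) -> None:
        self.data[key] = value
        for sink in self.sinks:            # real-time: each new update is pushed as it happens
            sink.data[key] = value


def full_sync(source: Cluster, sink: Cluster) -> int:
    """Copy every object the sink is missing or holds a different value for; return the count."""
    changed = {k: v for k, v in source.data.items() if sink.data.get(k) != v}
    sink.data.update(changed)
    return len(changed)


primary = Cluster("primary")
secondary = Cluster("secondary")

primary.put("a", b"1")                   # written before the secondary is connected
copied = full_sync(primary, secondary)   # initial connection: complete synchronization
primary.sinks.append(secondary)          # from now on, updates flow in real time
primary.put("b", b"2")
```

Data flows in one direction only in this sketch, from primary to secondary, mirroring the unidirectional transfer described above.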
Full documentation for multi-datacenter replication in Riak Enterprise is available in the online documentation.
Modeling data in any non-relational solution requires a different way of thinking about the data itself. Rather than assuming that all data fits cleanly into a structure of rows and columns, the data domain can be overlaid on the core Key/Value store (Riak) in a variety of ways. There are, however, distinct tradeoffs and benefits to understand.
Relational Databases have:
- Foreign keys and constraints
- Sophisticated query planners
- Declarative query language (SQL)
Riak has:
- A Key/Value model where the value is any unstructured data
- More data redundancy that provides better availability
- Eventual consistency
- Simplified query capabilities
- Riak Search
What you will gain:
- More flexible, fluid designs
- More natural data representations
- Scaling without pain
- Reduced operational complexity
For more information on Data Modeling, or to chat with a member of the Basho team on the topic, please request a Tech Talk.
January 6, 2015
If you have read about Riak, or seen a member of the Basho team present, you have probably heard the phrase “Your data is opaque to Riak.” While this is not strictly true now that Riak 2.0 includes distributed Data Types, the phrase hints at the core structure of Riak itself.
Riak is a Key Value data store.
In a relational database, data is organized by tables that are separate and unique structures. Within these tables exist rows of data organized into columns. As such, interaction with the database is by retrieving or updating entire tables, individual rows, or a group of columns within a set of rows.
In contrast, Riak has a much simpler data model. An Object is both the largest and smallest element of data. As such, interaction with the database is by retrieving or modifying the entire object. There is no partial fetch or update of the data.
Keys in Riak are simply binary values (or strings) that are used to identify Objects. The Key/Value pair (or Object) is stored in a higher-level namespace called a Bucket. And, with Riak 2.0, there is an extra layer of abstraction known as Bucket Types.
This Key/Value/Bucket model enables broad flexibility in modeling the application’s data domain with Riak as the data store for persistence.
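As a rough illustration of this model, the Python sketch below treats the store as a dictionary addressed by bucket type, bucket, and key, holding opaque bytes. It is not the Riak client API; `KeyValueStore` and its `put` and `get` methods are hypothetical names, and the bucket and key values are made up for the example.

```python
import json


class KeyValueStore:
    """Toy Key/Value store: objects are opaque bytes addressed by
    (bucket type, bucket, key)."""

    def __init__(self):
        self._objects = {}

    def put(self, bucket_type, bucket, key, value):
        # The whole object is replaced; there is no partial update.
        self._objects[(bucket_type, bucket, key)] = value

    def get(self, bucket_type, bucket, key):
        # The whole object is returned, or None if the key is absent.
        return self._objects.get((bucket_type, bucket, key))


store = KeyValueStore()
profile = {"email": "alice@example.com", "visits": 1}
store.put("default", "users", "alice", json.dumps(profile).encode())

# Changing one field means fetching, modifying, and storing the entire object.
fetched = json.loads(store.get("default", "users", "alice"))
fetched["visits"] += 1
store.put("default", "users", "alice", json.dumps(fetched).encode())
```

The store never inspects the bytes it holds, which is the sense in which the data is opaque; the application decides how to serialize and interpret each object.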
Another NoSQL model that many are familiar with is the document store. Unlike in the Key/Value model, the data store is aware of the structure of the objects stored. These objects, or documents, are grouped into “collections” (analogous to relational “tables”), and the datastore provides a query mechanism to search collections for objects with particular attributes. When the data that is being persisted is easily rendered as a JSON document, a document store can seem a natural fit. Some common use cases include product catalog data and content management systems.
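The idea of querying a collection by attribute can be sketched in a few lines of Python. This is a generic illustration rather than any particular document store's query language; the `find` helper and the sample product data are hypothetical.

```python
def find(collection, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == wanted for field, wanted in criteria.items())
    ]


# A "collection" of JSON-like documents, here a small product catalog.
products = [
    {"sku": "A100", "category": "books", "price": 12},
    {"sku": "B200", "category": "games", "price": 40},
    {"sku": "A101", "category": "books", "price": 30},
]

# The store can answer this only because it understands document structure.
book_skus = [doc["sku"] for doc in find(products, category="books")]
```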
The Basho Docs have a lengthy tutorial entitled Using Riak as a Document Store that walks you through the process of leveraging Riak as a document store for a CMS. There are many approaches to modeling, but the tutorial demonstrates the power of Riak 2.0 features by combining the maps data type and indexing that data with Riak Search.
When the data you are persisting can be represented as JSON, and you require the ability to query the data, Riak 2.0 is an excellent solution for persisting and modeling document data. The flexibility of the Key/Value model, combined with the power of Riak Search and Riak Data Types, provides you with a highly scalable, highly available document store with rich, full-text query capabilities. In addition, the inclusion of the maps data type means that you don’t have to write complex client-side resolution logic when faced with network partitions. Riak Data Types handle that conflict resolution automatically.
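To give a feel for how a data type can resolve conflicts without client-side logic, here is a minimal Python sketch of a grow-only counter, one of the simplest convergent data types. It is not Riak's maps implementation, which is considerably more sophisticated; the function names and replica labels are hypothetical.

```python
def increment(counter, replica, amount=1):
    """Return a copy of the counter with this replica's count increased."""
    updated = dict(counter)
    updated[replica] = updated.get(replica, 0) + amount
    return updated


def merge(a, b):
    """Combine two divergent copies by taking the per-replica maximum."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def value(counter):
    """The counter's value is the sum of all per-replica counts."""
    return sum(counter.values())


base = increment({}, "node_a")

# During a network partition, each side accepts writes independently.
side_a = increment(base, "node_a", 2)
side_b = increment(base, "node_b", 5)

# When the partition heals, merging needs no application-specific logic.
merged = merge(side_a, side_b)
```

Because the merge takes a per-replica maximum, it gives the same result regardless of the order in which copies are combined or how often a merge is repeated, so the two sides of a partition converge on their own.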
A scalable, available document store that is operationally simple may seem compelling enough to use Riak. But when you combine the characteristics of Riak with the multi-datacenter replication capabilities of Riak Enterprise, now you have a solution that enables you to bring your data operations closer to the end user.
Scalable, available, operationally simple, and replicated. That’s the power of using Riak as a document store.
December 30, 2014
At Basho, we are proud of our documentation. All design, updates, and edits are done with our community top of mind and we encourage community participation. Given the pace at which our documentarian expert, Luc Perkins, is updating the content, it can be easy to fall behind in reading new and updated materials. So we have a holiday gift to help you out.
Below is our Top 10 suggested New Year’s reading list.
#10 – A Migrating from an SQL Database to Riak tutorial can help prepare you as you embrace a new style of development and persistence.
#7 – Strong consistency has gone from having light documentation to being one of our best-documented open-source features. Strong Consistency docs are spread across the following:
#6 – We now have client-side security docs! There’s an introductory doc that walks you through how client security works in Riak, as well as client-specific docs for Java, Ruby, Python, and Erlang.
#5 – A new Erlang VM Tuning doc. This is still a work in progress. As we said at the beginning, we really encourage community involvement. What tuning have you done to optimize your Erlang environment?
In addition to the above, there is new documentation on the topics below.
Drum roll please….
#1 – Riak 2.0 – if you missed this you missed a lot.
We want to thank everyone in the community who participates in making the Basho documentation the most useful set of materials possible. Remember: to submit issues is human, to submit PRs is divine.
Happy New Year!
New, enhanced database and growing number of customers highlight strong year for the company
LONDON, UK. – November 20, 2014 – Basho, the creator and developer of Riak, the industry-leading distributed NoSQL database, has seen a surge in deployment and a growing customer base in EMEA as a result of the launch of Riak 2.0, the significantly enhanced version of its flagship platform.
2014 has seen significant successes for Basho, from the release of Riak 2.0 to news that Basho technology is powering Spine 2, the electronic backbone of the NHS. Basho has also seen strong growth in its EMEA customer base, with the company working with businesses such as bet365, one of the world’s leading online gambling groups, StatPro, the cloud-based portfolio analysis service, and EE, the largest mobile operator in the United Kingdom, to address their critical unstructured data needs. Basho has increased its number of customers in EMEA by 38 percent year on year, and these customer wins have contributed to revenue growth from Q2 to Q3 in 2014, which was up 90 percent.
“Our decision to implement Riak was purely strategic. After a stringent evaluation process we decided that Basho’s flexible, scalable database was best-suited to our needs,” said Martin Davies, Chief Executive Officer, Technology at bet365. “Given the huge amount of data we process on a daily basis – from customer details to betting odds – it was imperative that we had a platform to support this. We selected Riak, and have not been disappointed with the results.”
The gaming industry is becoming increasingly complex, with customers no longer satisfied with betting on a limited selection of outcomes. Now, gaming companies must offer more than the traditional betting options. For example, during football matches, it is no longer enough to offer odds on the scorer or the full-time result. Instead, consumers are eager to bet on everything from the number of yellow cards to corners and the amount of injury time. Offering and processing these options requires a huge amount of data-crunching. Given the vast number of metrics involved, from betting odds and bets placed to the final action on each account, such businesses require a lightning-fast database to support the deluge and prevent system crashes.
Basho’s growing stature in the gaming sector has been matched by its recent success in the telecommunications space. An increasing number of telco companies like EE are using Riak to replace existing systems and provide fault-tolerance and scalability for the future. Riak’s strength in the industry is further highlighted by the market trend towards reducing the burden of managing complex hardware environments by providing a consolidated virtualized orchestration platform to replace much of the traditional hardware.
These recent deals highlight a strong year for Basho, while the reseller partnership with Nordicmind and its upcoming Riak Nordic Roadshow demonstrate its growing success in EMEA. Success in the region is further reflected in the appointment of Emmanuel Marchal as Managing Director EMEA, who will be leading enterprise focus in EMEA, as well as the continued work with companies such as Deutsche Vermögensberatung (DVAG), Germany’s largest stand-alone financial services distributor. The financial advisors of DVAG support over 6 million customers in all questions concerning financial planning, insurance and finances.
“We knew that with the release of Riak 2.0, 2014 would be a massive year for the company,” said Adam Wray, President and Chief Executive Officer at Basho. “However, the growth in deployment and the continued success of Riak was more significant than we expected – with customers responding in kind. This year alone we have made strides in several sectors, including telco, financial, gaming and healthcare, where we have helped complete a project with the NHS that could potentially save lives. Couple this with our growing number of partners, and we can happily say that Basho is going from strength-to-strength.”
By: Peter Coppola
We had the opportunity to stop by DATAVERSITY’S NoSQL Now! conference in San Jose last week. I was very impressed with the level of energy and the wide-ranging selection of sessions offered. According to Tony Shaw, the CEO of DATAVERSITY, the organizer of NoSQL Now, registrations were up 15 percent from 2013.
The exhibition hall was packed and lively as attendees jostled between booths. DATAVERSITY did an outstanding job keeping the show floor tightly packed with exhibitors. The industry was well represented by Cloudera (saw “Data is the new bacon” t-shirts), MarkLogic, MongoDB, Oracle and EnterpriseDB – all present as major sponsors. Between conversations, I was able to nab a nifty, versatile screwdriver disguised as a pen from DataStax.
NoSQL Now sessions do rely heavily on sponsors, but with such a wide selection of tracks there’s bound to be a topic of interest at any given time slot. I had a choice of the following concurrent sessions at 4:15 p.m. on Wednesday:
- Internet of Things with MongoDB – MongoDB
- Out with MapReduce, In with Spark – DataStax
- Case Studies in Search and Semantics – MarkLogic
- Just the Right Weather for our Company: How We Chose Our Data Stores – The Weather Company
- NoSQL on ACID – EnterpriseDB
I attended The Weather Company’s session – not only was it the only non-vendor presenter, but the company is also a customer and big fan of Riak. The Weather Company manages five data centers that in production handle 25,000 requests per second and distribute 60 GB of data to each data center every 10 minutes. Surya Kangeyan Sivakumar took us through the journey of how The Weather Company selected its data store solutions and how it overcame the mindset of having to use its existing relational database solution just because the company had invested so much in it. Riak was selected, along with other NoSQL solutions, due to the speed and ease with which it could be stood up.
In 2015 Basho looks forward to being a more active participant in NoSQL Now.
By: Jeremy Hill
Business Intelligence makes it possible for organizations to make sense of the vast amount of customer, manufacturing and competitive information they have available in order to make smarter and better informed decisions. In turn, this enables organizations to become more responsive to customer needs, increase efficiencies in manufacturing processes, and respond to significant events quickly.
Historically, the data that drives business intelligence has been stored in structured formats in a data warehouse, such as customer information on how much is spent. However, this approach misses out on the value of semi-structured and unstructured data, like the details from a customer call or a customer tweet.
With such information missing, a complete view of the customer or business can be limited. The consequence is that an inability to gain knowledge and measure customer information means businesses can fall behind, especially in a competitive market.
Business Intelligence needs NoSQL
Having access to all types of relevant customer information – structured, semi-structured and unstructured – is an essential requirement for business intelligence (BI) to help enterprises get ahead of the competition. Unlike structured, relational data warehouses, NoSQL databases make this possible with improved availability, scalability and fast response times. NoSQL databases are ideal for BI and data warehousing not only because of the diverse types of information they can handle, but also because they are able to deliver data at the very time it is needed.
Enabling real-time analytics
NoSQL keeps up with transactions as they happen, enabling real-time analytics. E-commerce transactions, for example, benefit from a NoSQL database because it can support a decision about what to do next when a buyer doesn’t complete a purchase. Instead of waiting 24 hours or longer for the data to move through a traditional data warehouse system, with a NoSQL system a feed goes straight from a transaction through a connector to a NoSQL database. A sales analytics process can act on that intelligence at that very minute, engaging the customer and understanding the behavior in real time, helping secure the purchase and preventing the loss of a customer transaction.
A recently announced Basho partner, Caserta Concepts, a technology consulting firm specializing in big data analytics, data warehousing and business intelligence, works with CIOs to deliver analytics solutions that support business goals. It uses Riak and Riak CS to accommodate unique client requirements across a broad range of data types – structured, semi-structured and unstructured – and provides continuous availability to keep critical line-of-business applications going around the clock. Caserta’s practice illustrates the viability for NoSQL in the database revolution to take on the volume, variety and velocity of data dynamics of today’s web-scale applications.
Intelligence for IoT transactions
With the vast amounts of information from Internet of Things (IoT) technologies, more business intelligence needs and use cases are emerging. Consider oil and gas organizations providing annual service contracts for boilers – analytics tells the business that anything beyond the second call-out (or truck roll) wipes out the profit on the contract. In the connected world, NoSQL enables the next level of intelligence, which allows organizations to collect information so that, in the event of failure, they are able to determine which parts are needed in advance, eliminating the need for multiple visits. Gathering intelligence from this data also allows organizations to perform preemptive maintenance during the annual inspections to lower the frequency of unplanned, costly site visits.
With NoSQL, BI and data warehousing can become quicker and much more efficient. It allows organizations to react to events more quickly, increase customer attention, streamline the supply chain, predict customer behavior at the point it matters and predict future service calls. At the rise of big, unstructured data, NoSQL presents enormous opportunity for the future of business intelligence.