February 17, 2015
According to TechTarget, a common definition of “High Availability” is:
“In information technology, high availability refers to a system or component that is continuously operational for a desirably long length of time. Availability can be measured relative to ‘100% operational’ or ‘never failing.’”
The reality is that this phrase has become semantically overloaded by its inclusion in marketing copy across a disparate set of technologies. Much like “Big Data”, perspectives on availability vary based on industry and customer expectation.
For many of today’s applications and platforms, high availability has a direct impact on revenue. A few examples include: cloud services, online retail, shopping carts, gaming and betting, and advertising. Further, lack of availability can damage user trust and result in a poor user experience for many social media and chat applications, websites, and mobile applications. Riak provides the high availability needed for your critical applications.
Availability – By the Numbers
As we highlighted in an infographic entitled Down with Downtime, more than 95% of businesses with 1,000+ employees estimate that they lose more than $100,000 for every hour of downtime. For more than 1 in 2 large businesses, the cost of downtime amounts to more than $300,000 per hour. At $300,000 per hour, that works out to about $83 per second. At the upper end of the spectrum (in financial services), it can amount to $1,800 per second of downtime.
This fiscal impact has resulted in availability being measured as a percentage calculation of uptime in a given year. This percentage is often referred to as the “number of 9s” of availability. For example, “one nine” of availability equates to 90% uptime in a year. Similarly, “five nines” (the standard that was set by consulting firms on enterprise projects) equates to 99.999% availability in a year. While that percentage is often referenced, the practical reality is that it means there can be no more than 6.05 seconds of unplanned downtime per week.
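The arithmetic behind the "number of 9s" can be checked with a few lines of Python. This is a minimal sketch: the function and constant names are chosen here for illustration, and the year is assumed to be a 365-day year.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60      # 604,800
SECONDS_PER_YEAR = 365 * 24 * 60 * 60    # 31,536,000 (assumes a non-leap year)


def availability(nines: int) -> float:
    """Return the availability fraction for a number of nines, e.g. 3 -> 0.999."""
    return 1 - 10 ** -nines


def downtime_budget(nines: int, period_seconds: int) -> float:
    """Return the maximum downtime, in seconds, allowed within the period."""
    return period_seconds * 10 ** -nines


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        per_week = downtime_budget(n, SECONDS_PER_WEEK)
        per_year = downtime_budget(n, SECONDS_PER_YEAR)
        print(f"{n} nines ({availability(n):.5%}): "
              f"{per_week:.6g} s/week, {per_year:.6g} s/year")
```

For five nines, the weekly budget is 604,800 × 0.00001 = 6.048 seconds, which is the figure quoted above after rounding.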
Availability – A Feature or A Benefit?
Often, when describing Riak, I begin by explaining the benefits of Riak (availability, scalability, fault tolerance, operational simplicity) and then discuss, in detail, the properties that these benefits are derived from. Availability is not something that can be added to a system (be it a distributed database or otherwise); rather, it is an outcome of the core architectural decisions that were made in the development of the product.
Consider, for example, the AXD 301 ATM switch. It reportedly delivers “nine nines” (99.9999999%) of availability or better to customers. This is a staggering number that allows NO MORE than about 0.6 milliseconds of downtime per week (604,800 seconds multiplied by 0.000000001). Interestingly, it shares a common foundation with Riak: both are developed in Erlang.
This raises the question: “How does Riak achieve high availability?” Or, perhaps better stated, “What are the architectural decisions made in Riak that enable high availability?”
Availability – An Architectural Decision
Riak is a masterless system designed for high availability, even in the event of hardware failures or network partitions. Any server (termed a “node” in Riak) can serve any incoming request, and all data is replicated across multiple nodes. If a node experiences an outage, other nodes will continue to service read and write requests. Further, if a node becomes unavailable to the rest of the cluster, a neighboring node will take over the responsibilities of the missing node. The neighboring node will pass new or updated data (termed “objects”) back to the original node once it rejoins the cluster. This process is called “hinted handoff”, and it ensures that read and write availability is maintained automatically, minimizing your operational burden when nodes fail or come back online.
More information about the architectural decisions involved in Riak’s design is available in our documentation. In particular, the Concepts – Clusters section is deeply illustrative.
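To make the hinted handoff idea concrete, here is a toy in-memory sketch in Python. It is not Riak's implementation or client API: the class names (Node, Cluster), the two-replica placement, the byte-sum key hashing, and the method names are all assumptions made for illustration only.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # objects this node is responsible for
        self.hints = {}   # intended owner name -> {key: value} held on its behalf


class Cluster:
    def __init__(self, names, replicas=2):
        self.nodes = [Node(n) for n in names]
        self.replicas = replicas

    def _node(self, name):
        return next(n for n in self.nodes if n.name == name)

    def _preference_list(self, key):
        # Toy placement: a byte sum picks the first owner, the next node is the replica.
        start = sum(key.encode()) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.replicas)]

    def _fallback_for(self, owner):
        # The next live node after the unavailable owner stands in for it.
        i = self.nodes.index(owner)
        for step in range(1, len(self.nodes)):
            candidate = self.nodes[(i + step) % len(self.nodes)]
            if candidate.up:
                return candidate
        raise RuntimeError("no live nodes in the cluster")

    def put(self, key, value):
        for owner in self._preference_list(key):
            if owner.up:
                owner.data[key] = value
            else:
                # Owner is down: a neighbor accepts the write and keeps a hint.
                stand_in = self._fallback_for(owner)
                stand_in.hints.setdefault(owner.name, {})[key] = value

    def get(self, key):
        for owner in self._preference_list(key):
            if owner.up and key in owner.data:
                return owner.data[key]
            if not owner.up:
                hinted = self._fallback_for(owner).hints.get(owner.name, {})
                if key in hinted:
                    return hinted[key]
        return None

    def fail(self, name):
        self._node(name).up = False

    def rejoin(self, name):
        # Hinted handoff: neighbors pass held objects back to the returning node.
        node = self._node(name)
        node.up = True
        for other in self.nodes:
            node.data.update(other.hints.pop(name, {}))


if __name__ == "__main__":
    cluster = Cluster(["node1", "node2", "node3"])
    cluster.put("meter-42", "reading-1")
    primary = cluster._preference_list("meter-42")[0]
    cluster.fail(primary.name)
    cluster.put("meter-42", "reading-2")   # still accepted during the outage
    print(cluster.get("meter-42"))
    cluster.rejoin(primary.name)
    print(primary.data["meter-42"])
```

In this sketch, a write that arrives while an owner is down is stored on a live neighbor together with a note of who it was meant for, and `rejoin` delivers those held objects to the returning node. Conflict resolution, partitions, and persistent storage are deliberately left out.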
Availability – A Use Case
Consider, for example, the implementation of Riak at Temetra. Temetra has thousands of users and millions of meters that create billions of data points. The massive influx of data that was being generated quickly became difficult to manage with the company’s legacy SQL database. When considering how this structured database could be overhauled, Temetra conducted evaluations with Cassandra and Hadoop but ultimately chose Riak for its high availability and its relatively self-maintaining, easy-to-deploy infrastructure. It is essential that the data collected from the meters is always available, as it is relied on to determine correct billing for Temetra’s customers.
Availability – A Summary
The reality is that a database, even a distributed, masterless, multi-model platform like Riak, is only one component of the application stack. Understanding your availability requirements requires deep knowledge of the entirety of the deployment environment. “High Availability” cannot be retrofitted into a system. Rather, it requires conscious effort in the early stages to ensure that customer requirements are met and that downtime does not result in lost customers and lost revenue.
October 21, 2013
Ireland-based utility meter management company Temetra has developed a first-of-its-kind wireless meter reading system that lowers the overall price of utilities by providing customers with highly accurate readings. To support this system, Temetra needed a scalable and reliable solution to access and store the growing volumes of critical data created by readings, which, for the average household, can number up to 400 each year. After reviewing Cassandra and Hadoop, Temetra chose Basho’s open source distributed database, Riak, to optimize efficiency and deliver a nimble and affordable service to customers.
Simplifying the Data Collection Process
Temetra offers a comprehensive data collection infrastructure that provides homes and businesses in the UK, Ireland, and Australia with intelligent metering for utility usage. This means that instead of manually checking meters periodically throughout each year, Temetra works from a wireless network that automatically collects and analyses usage data. Readings are collected simply by driving past the meter, whereas traditionally meter readers had to read the meter index and copy it down by hand. This new method allows the company to better predict usage and deliver more accurate results and pricing, saving Temetra time and the customer money.
The wireless system can collect 300-400 reads per year for the average household, as opposed to the normal rate of two reads. As many as 35,000 reads per year are now collected by fixed networks for larger consumers such as hotels and hospitals. This approach has been fundamental to Temetra’s competitive differentiation; however, with such a high volume of data, Temetra faced the challenge of finding a simple, scalable solution to store and access its data easily. “We needed a reliable solution that would allow us to support more and more meters on fixed networks,” said Paul Barry, Temetra’s Managing Director. “Our relational SQL database just could not cope with the quickly rising levels of revenue-critical data.”
Billions of Data Points
Temetra has thousands of users and millions of meters that create billions of data points. The massive influx of data that was being generated quickly became difficult to manage with the company’s legacy SQL database. When considering how this structured database could be overhauled, Temetra conducted evaluations with Cassandra and Hadoop but ultimately chose Riak for its high availability and its relatively self-maintaining, easy-to-deploy infrastructure. It is essential that the data collected from the meters is always available, as it is relied on to determine correct billing for Temetra’s customers. Barry stressed this point: “As a small company managing a lot of revenue-critical data, it is really important for us to have a reliable and easily accessible database. For example, during our development and testing phase, a Riak node went down for a day and it was only through monitoring that we spotted it. The ability to lose a node and not affect our service in any way is a huge advantage for us.”
The move from a relational database to the non-relational Riak was a big step for Temetra. The shift required an adjustment to treat the database as a low-maintenance, high-performance, high-availability key-value store. For Temetra, the biggest change was denormalizing the data or, in other words, allowing several copies to be stored. Riak provided the best performance by allowing the company to store data in different ways. For example, data comes in from meters as a single data point that needs to be loaded and turned into compute data. In Riak, Temetra is able to store the meter data in multiple ways, allowing it to come out pre-calculated in the quick and ready form of consumption data. Once they were comfortable with Riak’s replication technology, Temetra was able to load its data from the legacy file store and use Riak as a limitless data store across its multiple sites.
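The idea of storing the same meter data in more than one form can be sketched with a plain Python dictionary standing in for a key-value store. Temetra's actual data model is not described here, so the key layout (`reading/...`, `consumption/...`, `last_index/...`) and the function names below are hypothetical and serve only to illustrate write-time denormalization.

```python
from datetime import date


def record_reading(store, meter_id, day, index):
    """Store a raw meter index and, at write time, a precomputed consumption value."""
    # Copy 1: the raw reading, keyed by meter and day.
    store[f"reading/{meter_id}/{day.isoformat()}"] = index

    # Copy 2: consumption since the previous reading, computed once on write
    # so that reads need no calculation.
    last_key = f"last_index/{meter_id}"
    previous = store.get(last_key)
    if previous is not None:
        store[f"consumption/{meter_id}/{day.isoformat()}"] = index - previous
    store[last_key] = index


def consumption_for(store, meter_id, day):
    """Fetch the precomputed consumption with a single key lookup."""
    return store.get(f"consumption/{meter_id}/{day.isoformat()}")


if __name__ == "__main__":
    store = {}  # stand-in for a key-value store (assumption, not a Riak client)
    record_reading(store, "meter-1", date(2013, 10, 1), 1000)
    record_reading(store, "meter-1", date(2013, 10, 2), 1012)
    print(consumption_for(store, "meter-1", date(2013, 10, 2)))
```

The trade-off is the usual one for denormalized designs: each write does a little more work and stores more copies, and in exchange each read becomes a single key lookup.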
The Benefits of Riak
Temetra is a growing company with 3.5 million meters currently collecting revenue-critical data, so Riak’s ability to easily support additional nodes is a key benefit as the company continues to scale up. With Riak, Temetra can continue to expand without large additional costs or hardware, and it is able to bring another server online within 20 minutes, allowing the business to prepare for big, new customers very quickly. This flexibility reflects and underpins Temetra’s own fast-growing, innovative nature.
Another benefit is the pricing. Temetra’s competitors operate in SQL mode and are unable to scale as easily or as quickly. Working with Riak has allowed Temetra to break away from those limitations, which is reflected in the pricing for customers, giving the company a significant competitive edge. Additionally, Riak was very easy for Temetra to introduce. As a small company, Temetra doesn’t want a large administration staff dedicated to looking after the IT infrastructure; therefore, Riak’s relatively self-maintaining nature, alongside Basho’s expert support, was a definite advantage.
“Most of our competitors still operate in SQL mode,” noted Barry. “By working with a distributed database that, from a flexibility and resiliency perspective, puts us more in line with the way Google or Twitter work, we can disrupt the way that data is stored traditionally to scale faster and easier. This is reflected in our prices and our ability to rapidly introduce new functionality. I think our customers definitely see that benefit.”
March 21, 2013
Temetra is the most widely used meter management system for water in Ireland. They provide utility companies with a cloud service to store and manage meter data for utilities such as water, gas and electricity.
Each time one of Temetra’s utility customers reads a meter, that reading is stored by Temetra and made available to the utility. As Temetra found success at home and overseas, the amount and variety of data it handled grew enormously: from the meter readings themselves, through meter location information and utility staff work schedules, to audit logs.
With the switch to smart meters, where one meter can now provide 35,000 readings per year instead of just four, Temetra needed to improve the scalability and availability of their platform. They chose Riak as their primary data store and use it to collect and process all meter data.
Paul Barry, Temetra’s CEO, told the London Riak meet-up why they chose Riak and how they went from their initial evaluation to deploying it in production. Check out his interesting talk here.