Power up your analytics.

Modern Big Data applications need to process data in real time to reveal patterns, trends, and associations. These applications need the speed to process large datasets and a computational engine that can keep up with the workloads. You can power up your application with Apache Spark.

Riak TS with the Apache Spark connector moves data from Riak TS to Spark for in-memory analysis, and the results can be stored back in Riak TS. Persisting these results to Riak TS provides flexibility for future data processing.

Riak TS with the Apache Spark connector powers up your operational analytics.


[Video: Riak TS with Spark demo]

Note: before watching this video, we suggest you watch the Introduction to Riak TS video. The video above builds on that case study, which shows how the data was inserted into Riak TS and how to run SQL queries on it.

 

OPERATIONAL ANALYTICS ARE COMPLEX

We know that operational analytics add value for the business, but how do you do these analytics at scale? Riak TS uses the power of Apache Spark to enhance real-time analytics of time series data. Spark supports both batch and streaming analysis, meaning you can use a single framework for your batch processing as well as your real-time analytics.

The Spark connector allows you to expose data stored in Riak TS as Spark Resilient Distributed Datasets (RDDs) or DataFrames, and to write data from Spark RDDs or DataFrames back into Riak TS. Because Riak TS supports a subset of the SQL language, Spark SQL can be leveraged as well.

Spark Connector features:

  • Expose data in Riak TS tables as Spark RDDs or Spark DataFrames
  • Leverage Riak TS SQL range queries using Spark SQL
  • Construct an RDD from a given set of keys
  • Parallel full-table reads into multiple partitions
  • Save Spark DataFrames in Riak TS
  • Save an RDD in Riak TS
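
To make the idea of range queries and parallel partitioned reads more concrete, here is a minimal Python sketch. It is not the connector's actual code or API. It only illustrates one way a time range could be split into non-overlapping sub-ranges, with one SQL range query per sub-range, so that each query could be handed to a separate Spark partition. The function names, the table name sensor_readings, and the column name time are all hypothetical.

```python
from typing import List, Tuple


def split_time_range(start_ms: int, end_ms: int, partitions: int) -> List[Tuple[int, int]]:
    """Split [start_ms, end_ms) into contiguous, non-overlapping sub-ranges.

    Hypothetical helper for illustration only; not part of any Riak or Spark library.
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    if end_ms <= start_ms:
        raise ValueError("end_ms must be greater than start_ms")
    span = end_ms - start_ms
    # Never create more sub-ranges than there are milliseconds to cover.
    partitions = min(partitions, span)
    # Integer arithmetic keeps the boundaries exact and the ranges gap-free.
    bounds = [start_ms + (span * i) // partitions for i in range(partitions + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table: str, time_column: str,
                  start_ms: int, end_ms: int, partitions: int) -> List[str]:
    """Build one SQL range query per sub-range.

    Assumption: a real table may need further key columns in the WHERE clause;
    they are omitted here to keep the sketch short.
    """
    return [
        f"SELECT * FROM {table} WHERE {time_column} >= {lo} AND {time_column} < {hi}"
        for lo, hi in split_time_range(start_ms, end_ms, partitions)
    ]


if __name__ == "__main__":
    # Hypothetical table and column names.
    for query in range_queries("sensor_readings", "time", 0, 100, 4):
        print(query)
```

Each sub-range is half-open (inclusive start, exclusive end), so no row is read twice when the per-partition results are combined.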

BENEFITS OF SPARK CONNECTOR

When managing millions of devices, you need to quickly analyze events and process data to make real-time decisions. Riak TS with Apache Spark makes it easier to perform complex, fast analysis of your IoT and time series data.

Make real-time decisions
Whether you’re examining the past or forecasting the future, you want advanced analytics for your IoT or time series application. Riak TS with the Apache Spark connector lets you perform fast, large-scale data analysis to make better real-time decisions.

Faster time to market
When your IoT or time series application requires complex analysis, you don’t want your developers doing all the work. With Riak TS and the Apache Spark connector, a lot of the heavy lifting is done for you. Your developers get a broad set of APIs for writing complex aggregations, and testing is simpler. This means you can do more complex processing with less effort, so you can complete your applications faster and get to market sooner.

Increase performance and scale
With time series data sets growing exponentially, you need a solution that not only analyzes data sets fast but also scales easily on demand. Riak TS with the Apache Spark connector provides high-performance analytics and linear scale using commodity hardware.

“In today’s Internet of Everything, people, processes, data and things are all connected and generating tremendous amounts of data. Sensors in particular generate mountains of time series data that is often stored at the network edges and gathered in the cloud. The integration of NoSQL and Spark will play a vital role in analyzing this data to identify patterns and generate insights, and the introduction of Riak TS makes analyzing this data simple, effective, and scalable.”

– Ken Owens, chief technology officer for Cisco Intercloud Services

  1. RESILIENCY
  2. SCALABILITY
  3. OPERATIONAL SIMPLICITY
  4. DATA CO-LOCATION
  5. SQL COMMANDS
  6. SQL RANGE QUERIES
  7. AGGREGATIONS
  8. GLOBAL OBJECT EXPIRATION
  9. APACHE SPARK CONNECTOR
  10. APIS/CLIENT LIBRARIES
  11. MULTI-CLUSTER REPLICATION
  12. APACHE MESOS FRAMEWORK