Solving the Challenges of IoT Analytics

July 12, 2018

Jon Bock

Excitement about the opportunities created by the internet of things (IoT) has been around for some time. A survey conducted by Cisco in 2017 illustrates some of the reasons for that excitement: over 73% of organizations said they were using IoT data to improve their business, in ways ranging from improving customer experience to delivering operational and product excellence.

However, to date IoT hype has run far ahead of reality. In fact, that same survey revealed that almost 75% of IoT projects fail. What’s the problem? Among the top challenges cited in the survey were time to completion, limited internal expertise, data quality, integration across teams, and budget overruns.

Technology Barriers to IoT Analytics

A significant driver of those challenges: technology that was never designed for the demands of IoT analytics. Stitching together and force-fitting available technology into an IoT analytics use case demands more expertise and patience, and introduces more complexity, than most enterprises can tolerate, contributing heavily to the high failure rate of IoT projects. Among the challenges:

  • The complexity of integrating multiple technologies, and of designing around the limitations in functionality, performance, and scalability that become prominent in IoT scenarios, forces multiple iterations that delay and extend projects while driving up costs.
  • That patchwork of technologies also creates poorly connected data silos that make it difficult to get integrated insights across analytics and across teams.
  • Traditional approaches to data integration struggle to keep up with the volume and velocity of sensor and device data that must be processed and analyzed immediately.

That’s nowhere more evident than in the technology powering the data pipeline and data processing for IoT analytics. That technology, critical to any IoT analytics project, has typically required stitching together a laundry list of components to handle data collection, transfer, cleansing, transformation, processing, event storage, serving, and more. Not only does this patchwork make it difficult to deploy and integrate solutions to support IoT analytics, it also makes operating and supporting them burdensome and failure-prone.

Figure 1. Traditional IoT solution architecture

Enabling a New Approach to IoT Analytics

It’s clear that a new approach is needed to break the cycle of failed IoT projects, and that approach needs to start with technology solutions better suited to the demands of IoT analytics.

Thankfully, new technologies have emerged that make it possible to build better solutions. At Streamlio we’ve worked to deliver a solution, built on proven open source technology, that provides a simpler, more capable data pipeline for IoT analytics.

The approach we’ve taken to delivering that solution is based on the following:

  • Consolidate and unify the data pipeline. The technology we provide brings together in a single solution the multiple components and capabilities needed to build data pipelines and process data for IoT analytics. Unifying capabilities for connecting, processing, and storing IoT data in a single solution dramatically reduces the cost and complexity of both deployment and operation.
  • Design for performance and scalability. With a unique architecture, the technology at the core of the Streamlio solution can keep up with the ever-growing volume and velocity of IoT data in a single deployment, without requiring multiple islands of technology and without burdensome complexity. That ensures that timely data and analytics can be delivered to automated systems and to operators.
  • Make processing data-driven. Unlike traditional technologies, which are burdened by a legacy of batch-oriented processing concepts, the solution we provide is data-driven throughout: it executes stream-native processing of data as it arrives, delivering immediate data and analytics for alerting, monitoring, and other IoT applications that must stay up to date rather than wait for the next batch processing interval.

Figure 2. Simplified IoT solution architecture

Using this solution, IoT projects benefit from a significant reduction in complexity that leads to faster implementations, reduced risk, and a unified environment. Rather than stitching together multiple incomplete solutions, with all the complexity and fragility that creates, teams can deploy a unified data fabric for moving, processing, and storing streaming IoT data that can operate near the edge, in the cloud, and in the datacenter. Because of its performance and scalability, that unified environment makes it much easier for teams to work together while reducing costs and ensuring a consistent view of data. The stream-native processing capabilities in this fabric make it possible to filter, aggregate, transform, and analyze data immediately as it arrives, removing the delays in making up-to-date data and analytics available to dashboards, operators, and applications.
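
To make that concrete, here is a minimal sketch of what stream-native processing can look like, written against the Pulsar Functions Python SDK (Apache Pulsar is the open source technology behind the Streamlio solution, as discussed below). The topic name, message format, and alert threshold are illustrative assumptions, not details from any real deployment:

    # Minimal sketch of stream-native processing with the Pulsar Functions
    # Python SDK. Topic name, message format, and threshold are hypothetical.
    from pulsar import Function

    class TemperatureAlert(Function):
        """Inspect each sensor reading the moment it arrives and emit an
        alert immediately, with no batch interval in the path."""

        def process(self, input, context):
            # Each message in this sketch is a plain "sensor_id,value" string.
            sensor_id, value = input.split(",")
            if float(value) > 90.0:  # hypothetical alert threshold
                # Route alerts to a separate topic consumed by dashboards.
                context.publish("persistent://public/default/alerts",
                                "%s overheating: %s" % (sensor_id, value))
            return input  # pass the reading through to the output topic

A function like this can be deployed onto a running cluster with the pulsar-admin CLI, for example (file, class, and topic names again being placeholders): pulsar-admin functions create --py temperature_alert.py --classname temperature_alert.TemperatureAlert --inputs persistent://public/default/sensor-events --output persistent://public/default/readings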

Having a better technology foundation for IoT data processing enables organizations to realize the benefits of IoT analytics more successfully in a wide range of use cases: failure monitoring, supply and demand management, network optimization, and many more.

Example: IoT Analytics in the Energy Sector

Streamlio has worked with multiple energy companies looking to improve the performance and reliability of the systems and infrastructure used to produce and deliver electricity. Power generation and distribution is a mission-critical activity, and even a small failure can set off a cascading sequence of events that renders a power grid unstable.

In these scenarios, it is essential to have technology that can not only stream relevant data as quickly as possible but also analyze it immediately. Most technologies cannot deliver on this requirement: their constrained performance and scalability cap the number of data points they can process in a given amount of time, which limits the fidelity and granularity of insight into whether a failure is about to occur.

One energy company needed these insights to alert engineers to problems in its generation and distribution systems using real-time sensor data. Data is collected throughout the generation and distribution network, with each sensor generating tens of thousands of operational data points per second. The result is hundreds of millions of event messages that must be processed and stored in a back-end data store, then rapidly aggregated and forwarded to front-end monitoring systems.

When this company initially tried to do this using Apache Kafka together with Spark Streaming, they experienced substantial latency. Given their low-latency requirements, they needed a different solution.

They turned instead to Apache Pulsar, the technology powering the Streamlio solution. Not only did Pulsar allow them to meet their latency requirements, it was also far easier for their developers to use and for their operations team to deploy and manage. All in all, going from proof of concept to production for this large-scale infrastructure took approximately six weeks.
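
As a rough illustration of that developer experience, here is a minimal sketch of publishing sensor events with the Apache Pulsar Python client (installed via pip install pulsar-client). The service URL, topic name, and message format are placeholder assumptions, not details of this company’s deployment:

    # Hypothetical sensor-event publisher using the Pulsar Python client.
    import pulsar

    client = pulsar.Client("pulsar://broker.example.com:6650")  # placeholder URL
    producer = client.create_producer("persistent://public/default/sensor-events")

    def emit(sensor_id, value, ts_ms):
        # Asynchronous sends keep per-message overhead low at tens of
        # thousands of data points per second; the callback surfaces failures.
        producer.send_async(
            ("%s,%s" % (sensor_id, value)).encode("utf-8"),
            callback=lambda result, msg_id: None,
            event_timestamp=ts_ms,
        )

    emit("sensor-0017", 88.4, 1531180800000)  # example reading
    producer.flush()  # drain pending sends before shutting down
    client.close()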

As a next step, they plan to incorporate additional sources of IoT sensor data. Using the stream-native processing in Pulsar, they will be able to perform event processing directly on the data flows, calculating diverse aggregations on data as it arrives. That will support more responsive dashboards and allow them to open those analytics to a much broader set of users and applications, enabling even greater efficiency and reliability.
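
A hedged sketch of what one such aggregation might look like as a Pulsar Function, again in Python: the per-sensor state here is deliberately simplified to in-memory dictionaries (a production function would more likely use Pulsar’s state APIs or an external store), and all names and topics are illustrative assumptions.

    # Hypothetical running-average aggregation as a Pulsar Function.
    import json
    from collections import defaultdict
    from pulsar import Function

    class RunningAverage(Function):
        """Maintain a running average per sensor and publish an update on
        every arriving event, so dashboards never wait for a batch."""

        def __init__(self):
            self.counts = defaultdict(int)    # events seen per sensor
            self.totals = defaultdict(float)  # sum of readings per sensor

        def process(self, input, context):
            sensor_id, value = input.split(",")
            self.counts[sensor_id] += 1
            self.totals[sensor_id] += float(value)
            avg = self.totals[sensor_id] / self.counts[sensor_id]
            # Publish the updated aggregate for dashboards and applications.
            context.publish("persistent://public/default/sensor-averages",
                            json.dumps({"sensor": sensor_id, "avg": avg}))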