IoT Analytics Across Edge and Cloud Platforms
The Internet of Things (IoT) is seeing the rapid deployment of sensing, control and communication infrastructure across various application domains. These range from utility infrastructure such as smart power and transport, to consumer devices like Fitbit and Nest. This growing ability to observe and control devices in real time to efficiently manage public or lifestyle services is motivating the need for responsive analytics, the software platforms to coordinate them, and the computing resources to execute them. While device and communication technologies were at the vanguard of IoT, the next wave of IoT innovation will be driven by data analytics and computing. To this end, distributed analytics platforms that can utilize heterogeneous computing resources, at the edge and in the Cloud, are becoming essential.
Contemporary IoT software architectures are typically Cloud-centric or Device-centric (see Figure 1). In the former, data from a sensor is streamed to a Cloud data center, where analytics and decision making happen using current and historical data from this and other sensors, and control signals are transmitted back to the actuators at the edge. This is the predominant interaction model at present. In a Device-centric model, proprietary logic within the device operates on the sensed observations and makes local decisions, with the firmware updated occasionally. Examples of this include stand-alone consumer devices. However, both of these are narrow architectural views.
Edge Devices as First-class Computing Platforms
An emerging model of distributed analytics is one that spans devices at the edge of the network and Cloud resources seamlessly, leveraging the relative merits of each. There are several motivations for this.
- Data sources and control sinks at the Edge. Most observation sources that drive the analytics lie at the edge of the network. These include physical sensors that measure the infrastructure, mobile devices for participatory sensing, and gateway devices that interact with local sensors and actuators. Similarly, the controls for managing the infrastructure are also at the edge. There are of course exceptions to this, such as social streams, historical data, and web service controls hosted in the Cloud.
- Network constraints between Edge and Cloud. Moving data from the edge to a remote data center incurs network latency of hundreds of milliseconds, which is unacceptable for interactive or mission-critical applications. Similarly, urban surveillance applications generate data volumes so large that moving them entirely to the Cloud is bandwidth-prohibitive. The network connectivity may also be intermittent, causing a loss of functionality whenever the link to the Cloud is down.
- Resource Costs. Cloud resources are charged on a pay-as-you-go model. As a result, compute, storage and bandwidth add to the operational costs that the application provider pays for, or the user is charged for. Hence, there is a trade-off between the revenue or value earned by the IoT application and its running costs on the Cloud.
Advances in device and processor technologies have ensured that contemporary sensor and gateway devices have non-trivial computing power. E.g., the Raspberry Pi, which is popular as an IoT gateway, has a 64-bit ARM processor with 4 cores, each of which offers about a third of the computing power of an Intel Xeon processor core on a Cloud Virtual Machine (VM). Further, these captive devices are part of the one-time capital expenditure and their maintenance is ensured. Lastly, they are co-located with the sensors and actuators, either on-board or in the local network. Consequently, there are significant benefits if edge devices can be leveraged as active platforms for analytics, alongside Cloud resources.
Figure 1: Different interaction models and roles for edge and Cloud resources.
Gaps in Edge+Cloud Computing Platform
While there is a growing trend toward utilizing edge devices as first-class computing platforms, several gaps exist. This has parallels with the transition of mobile telephony from feature phones to smartphones: while feature phones had started to gain considerable computing power, it was the advent of software platforms for app development (iOS, Android) and Cloud-based services that enabled that transformation. We are at a similar cusp now.
The key limitation to using edge devices effectively is the lack of a platform ecosystem that allows generic and distributed applications to be designed, deployed and executed on them. This can be at the infrastructure layer, whereby a light-weight “container” with pre-defined applications can be spawned on the edge devices, similar to VMs in an Infrastructure-as-a-Service (IaaS) Cloud. This should allow container resources to be acquired transparently and on-demand, and applications to be deployed within them. The infrastructure would also offer access to on-board sensors and controllers, and linkages with Cloud services. We are seeing such solutions from VMware Liota and Eclipse Kura for gateway device management, but more is required to manage many distributed devices rather than a single device.
Alternatively, a Platform-as-a-Service (PaaS) offering would make it easy to define, deploy and manage distributed IoT applications across edge and Cloud. A simple model would be sandboxed “apps” running on a single device, much like on a smartphone. E.g., Cloud offerings like Microsoft’s Azure IoT Gateway and Amazon’s AWS Greengrass provide limited support for defining event analytics on the edge, coupled with scalable stream processing in their Clouds, and Apache Edgent, supported by IBM, is a similar open source offering. But the need is for more general-purpose application platforms, and for their distributed execution across multiple edge devices and Cloud VMs.
Recently, we have been leveraging Apache NiFi, a light-weight dataflow execution engine, to compose and execute generic IoT applications on Pi-class devices. We have extended NiFi to operate in a distributed model, cooperatively across multiple edge devices or between edge devices and Cloud VMs. This is well suited for streaming execution of micro-batch datasets, and can be coupled with other specialized application platforms as well. E.g., one of our applications classifies vehicles from video streams using a TensorFlow deep neural network encapsulated within a NiFi dataflow executing across multiple Pis. This allows local analytics of video data streams close to the camera source, with the flexibility of using the same deployment in the Cloud too, say, when the edge is constrained. Another emerging edge-centric platform, based on Node.js, is Node-RED. The potential to design many novel applications exists if such application platforms for the edge become popular.
Edge devices need to be complemented with Cloud resources to help with coordination, and also to off-load computing when the edge is overloaded, is draining its battery, or needs access to large datasets. An edge-only solution, such as Peer-to-Peer (P2P) computing, poses unnecessary complexity given the ready availability of Cloud resources. Fog computing, where accelerated servers or mini-clusters are available close to the edge (e.g., on every city block), will also become feasible within city infrastructure deployments.
Security, privacy and trust are additional concerns, but one can make arguments both for and against the use of edge and/or Cloud resources. Edge devices may be fully controlled by the end-user or utility, but their location in public spaces makes them physically accessible, while public Clouds with multi-tenancy may pose limitations for highly sensitive data. The choice depends on the individual IoT application.
Lastly, we need to consider the reliability and scalability of edge devices. Using edge platforms for generic applications makes them less resilient than embedded applications, and the application should be robust to such faults. Connectivity with the Cloud may also be intermittent, and mobility of the edge devices raises additional issues. These will all have to be carefully considered before edge devices take off as ubiquitous computing platforms on which to design the next wave of innovative IoT applications.
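To illustrate the idea of a dataflow of stages running over micro-batches, with each stage assignable to an edge device or a Cloud VM, here is a minimal sketch in plain Python. It does not use NiFi or its API: the names `micro_batches`, `run_dataflow`, the stage functions and the record fields are all hypothetical, and the placement is only recorded in a trace rather than actually dispatched to remote machines.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Batch = List[dict]
Stage = Callable[[Batch], Batch]


def micro_batches(stream: Iterable[dict], size: int) -> Iterator[Batch]:
    """Group a stream of records into batches of at most `size` items."""
    batch: Batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def run_dataflow(stream: Iterable[dict],
                 stages: List[Tuple[str, Stage]],
                 placement: Dict[str, str],
                 size: int = 2):
    """Apply each stage in order to every micro-batch.

    `placement` maps a stage name to "edge" or "cloud"; in this sketch it is
    only logged, standing in for a real engine that would ship the batch there.
    """
    results: Batch = []
    trace: List[Tuple[str, str]] = []
    for batch in micro_batches(stream, size):
        for name, fn in stages:
            trace.append((name, placement[name]))
            batch = fn(batch)
        results.extend(batch)
    return results, trace


# Hypothetical stages: drop empty frames, then attach a placeholder label.
def drop_empty(batch: Batch) -> Batch:
    return [rec for rec in batch if rec["objects"] > 0]


def label(batch: Batch) -> Batch:
    return [{**rec, "label": "vehicle"} for rec in batch]


if __name__ == "__main__":
    frames = [{"id": 1, "objects": 2}, {"id": 2, "objects": 0}, {"id": 3, "objects": 1}]
    stages = [("drop_empty", drop_empty), ("label", label)]
    out, trace = run_dataflow(frames, stages, {"drop_empty": "edge", "label": "cloud"})
    print(out)
    print(trace)
```

Changing only the `placement` mapping moves a stage between edge and Cloud without touching the dataflow definition, which is the flexibility the paragraph above describes.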
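The off-loading conditions named above (an overloaded edge, a draining battery, or a need for large datasets) can be written as a simple policy. This is a sketch under stated assumptions: the `EdgeStatus` fields and the thresholds of 85% CPU load and 20% battery are hypothetical, not values from any particular platform.

```python
from dataclasses import dataclass


@dataclass
class EdgeStatus:
    cpu_load: float        # fraction of CPU in use, 0.0 to 1.0
    battery_pct: float     # remaining battery, 0 to 100
    cloud_reachable: bool  # whether the link to the Cloud is currently up


def should_offload(status: EdgeStatus,
                   needs_large_dataset: bool,
                   max_load: float = 0.85,       # assumed threshold
                   min_battery: float = 20.0) -> bool:  # assumed threshold
    """Return True if the task should run in the Cloud instead of on the edge."""
    if not status.cloud_reachable:
        # With intermittent connectivity the edge must keep working locally.
        return False
    return (status.cpu_load > max_load
            or status.battery_pct < min_battery
            or needs_large_dataset)


if __name__ == "__main__":
    busy = EdgeStatus(cpu_load=0.95, battery_pct=80, cloud_reachable=True)
    idle = EdgeStatus(cpu_load=0.10, battery_pct=80, cloud_reachable=True)
    print(should_offload(busy, False), should_offload(idle, False))
```

Falling back to local execution when the Cloud is unreachable reflects the intermittent connectivity noted earlier; a fuller policy might queue or degrade the task instead.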
Yogesh Simmhan is an Assistant Professor of Computational and Data Sciences at the Indian Institute of Science, Bangalore. His research explores abstractions, algorithms and applications on distributed systems, spanning Cloud and Edge Computing, Internet of Things, and Big Data platforms. He has applied these to several Smart City projects including the Los Angeles Smart Grid and the IISc Smart Campus.
Yogesh has won the IEEE/ACM HPC Storage Challenge and IEEE TCSC SCALE Challenge, and has over 75 refereed publications. He is a Senior Member of IEEE and ACM. He has a Ph.D. in Computer Science from Indiana University, and was earlier affiliated with Microsoft Research, and the University of Southern California. He can be reached at email@example.com and www.dream-lab.in.