Event-Driven Cloud Architecture Considerations for an Interactive Internet of Things

Javier Moreno Molina
March 14, 2018

 

The Internet of Things represents a vision of a world where computer systems are connected and completely integrated with the physical world. Communication, sensing, and actuation interfaces are now present in more and more objects, not only in industry but also in our daily lives.

However, the lifecycle of data has grown in a very uneven way. Huge amounts of data are now being generated by millions of embedded sensors. Data is kept in storage systems and data warehouses all over the world. Unfortunately, the use of this data is still far from reaching its full potential. Most data is meaningless at collection time and only acquires value during offline analytics, when actionable insights are finally obtained [2].

Data that is actionable at collection time still has a long way to go. IoT applications must match every new data input against current and previously collected inputs in order to identify relevant events and provide the best context-aware responses. Moreover, they must do so within reasonable latency. Only then will a seamless, interactive Internet of Things become a reality.

From the cloud architecture perspective, neither the challenges nor the paradigm shift are small.

Event-Driven Services

Unlike Internet applications so far, which have been driven by request-response schemes, IoT applications excel with event-driven services. Instead of delivering a service in response to a well-specified request, made proactively by a human end user who directly supervises most of the process, an IoT application can identify the appropriate service from heterogeneous data inputs from different sources and deliver it at the very moment users notice that they need it [1].

This seamless interaction with users, anticipating their demands, and delivering what they need at the very moment they need it, is the cornerstone of the interactive Internet of Things. To achieve it, IoT applications need to introduce the following changes:

  • Databases to Streams: There is no specific trigger that tells you when to query for the exact information you need. Instead, triggers must be recognized by data-driven IoT services that continuously receive new data inputs. While databases were optimized for the former case, the latter requires continuous queries, which are more effective on data streams.
  • Immutable Data Sources: To enable the evolution of IoT applications, the data sources used to produce application events must be preserved. They need to survive as the source of truth for the applications that deliver services based on them. Hence, an improved application that fixes bugs, enhances its algorithms, or expands the spectrum of input data sources can reprocess all data with its current implementation, without being restricted or burdened by anachronistic decisions from the past.
  • Non-Blocking Interfaces: The throughput of incoming events increases significantly. At the same time, in most cases, there will not be a customer sitting in front of a screen waiting for a response to every request they make. An immediate response is not always required, and a blocked input may go unnoticed. As happened with voice communication, maintaining a sufficient Quality of Service allows for satisfactory service delivery at a much higher throughput.
  • Asynchronous Downward Channels: Whenever an action results from event processing, IoT cloud services need to be able to quickly communicate this action to the device or devices that will take part in its execution. In most cases, these devices will not even be aware of the events being processed. Therefore, they need an asynchronous channel through which they can obtain the information necessary to perform their corresponding actions in time.
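As a minimal sketch of the "Databases to Streams" idea, the following Python generator evaluates a continuous query over a stream of readings instead of issuing occasional database queries. The temperature readings, the 60-second window, and the 30 °C threshold are assumptions made purely for illustration; they do not come from any particular IoT platform.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical reading format: (timestamp in seconds, temperature in Celsius).
Reading = Tuple[float, float]


def overheating_events(
    readings: Iterable[Reading],
    window_s: float = 60.0,
    threshold_c: float = 30.0,
) -> Iterator[Tuple[float, float]]:
    """Continuous query: emit (timestamp, window mean) each time the mean
    of the readings in the last `window_s` seconds exceeds `threshold_c`."""
    window: Deque[Reading] = deque()
    total = 0.0
    for ts, value in readings:
        window.append((ts, value))
        total += value
        # Evict readings that have fallen out of the sliding time window.
        while window and window[0][0] <= ts - window_s:
            _, old_value = window.popleft()
            total -= old_value
        mean = total / len(window)
        if mean > threshold_c:
            yield ts, mean


# The query runs as data arrives; here a short list stands in for the stream.
stream = [(0, 28.0), (20, 29.0), (40, 33.0), (60, 34.0), (80, 35.0)]
events = list(overheating_events(stream))
```

The trigger is derived from the data itself: no caller has to decide when to ask, and each new input updates the result incrementally.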

The result of applying these design principles is fully asynchronous communication. It may be difficult to assimilate at first, but it ends up being decisive in providing the best responses to customers who are continuously supplying data through different sources at the same time.
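The asynchronous flow described above could be sketched with Python's standard asyncio library, as shown next. The device names, the 30 °C rule, and the "fan" command are hypothetical, and in-process queues stand in for whatever messaging system a real deployment would use.

```python
import asyncio
from typing import List, Optional, Tuple

Reading = Tuple[str, float]   # (device id, temperature), hypothetical format
Command = Tuple[str, str]     # (target device, action), hypothetical format


async def ingest(events: asyncio.Queue, readings: List[Reading]) -> None:
    """Non-blocking interface: the producer only waits if the queue is full."""
    for reading in readings:
        await events.put(reading)
    await events.put(None)  # sentinel marking the end of this demo stream


async def process(events: asyncio.Queue, commands: asyncio.Queue) -> None:
    """Turns incoming events into actions for devices."""
    while (reading := await events.get()) is not None:
        device, temperature = reading
        if temperature > 30.0:
            await commands.put((f"fan-{device}", "ON"))
    await commands.put(None)


async def actuator(commands: asyncio.Queue, executed: List[Command]) -> None:
    """Asynchronous downward channel: devices receive actions without
    knowing which events produced them."""
    while (command := await commands.get()) is not None:
        executed.append(command)


async def main() -> List[Command]:
    events: asyncio.Queue = asyncio.Queue(maxsize=100)
    commands: asyncio.Queue = asyncio.Queue()
    executed: List[Command] = []
    readings = [("room1", 25.0), ("room2", 31.5), ("room1", 32.0)]
    await asyncio.gather(
        ingest(events, readings),
        process(events, commands),
        actuator(commands, executed),
    )
    return executed


executed = asyncio.run(main())
```

No stage waits for a reply from the next one; each simply publishes to a queue and moves on, which is what allows the throughput to grow.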

Figure 1: IoT Event-Driven Application Architecture.


Low-Latency Complex Event Processing

In order to provide context-aware responses, IoT applications must be capable of building their own events [4] from the broad space of data input streams. This means determining which data inputs, or combinations of them, are relevant to their service delivery. This problem has been studied in the field of Complex Event Processing for decades.

The critical challenge for interactive IoT services is to perform complex event processing and deliver the result in “real-time”. There is frequently no time to perform database queries at processing time. The way to reduce latency is to provision event processors with as much context as possible, so that when a new event occurs, all the data required to take an action is quickly accessible.

There are two main, non-exclusive ways to achieve this:

  • Edge Computing: Assuming most of the contextual data has geographical coherence, every data input could be enriched with additional contextual data at the edge location. In the ideal case, there is no need to access any other remote data, as the incoming event contains all the information necessary to make it actionable. Cloud services then only host the business logic that decides which action to execute, and could even be implemented as lambda functions.
  • Stateful Stream Processors: In this approach, cloud services must consume all the data they consider relevant for their decisions and store the resulting states in a fast-access cache that they can look up while processing new events. There are already tools, such as Apache Samza or Kafka Streams, that maintain key-value tables with their relevant states in order to perform stateful real-time event processing. They both keep states in embedded databases (LevelDB, RocksDB) so that no remote database needs to be accessed.
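The stateful approach can be illustrated with a minimal Python sketch. It is not the Apache Samza or Kafka Streams API: a plain in-memory dictionary stands in for the embedded key-value store, and the presence and temperature events, the `PresenceProcessor` name, and the 26 °C rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PresenceProcessor:
    """Keeps the latest known state per key locally, so that handling a
    new event never requires a remote database lookup."""

    # Local state table: room id -> whether someone is currently present.
    occupied: Dict[str, bool] = field(default_factory=dict)

    def on_presence(self, room: str, is_occupied: bool) -> None:
        # Consume a context stream and update the local state table.
        self.occupied[room] = is_occupied

    def on_temperature(self, room: str, celsius: float) -> Optional[str]:
        # Decide using only local state: cool a room if it is occupied and warm.
        if self.occupied.get(room, False) and celsius > 26.0:
            return f"cool:{room}"
        return None


processor = PresenceProcessor()
processor.on_presence("lab", True)
action = processor.on_temperature("lab", 27.5)
no_action = processor.on_temperature("hall", 29.0)  # no presence known for "hall"
```

The processor pays for its low latency with memory: it must hold every piece of state it may need before the triggering event arrives.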

In practice, while edge computing is crucial for seamlessly providing contextual data that may be difficult to infer otherwise, applications are also very likely to need additional information from external data sources.
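For the edge computing case, a hedged Python sketch of the ideal scenario follows: the edge gateway attaches its local context to each raw input, and the cloud side is a pure function of the enriched event. The gateway context, the event fields, and the after-hours rule are assumptions chosen for illustration only.

```python
from typing import Any, Dict, Optional

# Hypothetical context that only the edge gateway knows about its location.
EDGE_CONTEXT: Dict[str, Any] = {
    "gateway": "building-7",
    "zone": "lobby",
    "opening_hours": (8, 20),  # open from 08:00 up to, not including, 20:00
}


def enrich_at_edge(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Runs at the edge: returns a copy of the raw input with context attached."""
    return {**raw, "context": dict(EDGE_CONTEXT)}


def decide(event: Dict[str, Any]) -> Optional[str]:
    """Runs in the cloud: stateless business logic that needs nothing beyond
    the event itself, so it could be deployed as a serverless function."""
    opens, closes = event["context"]["opening_hours"]
    after_hours = not (opens <= event["hour"] < closes)
    if event["type"] == "motion" and after_hours:
        return f"alert:{event['context']['zone']}"
    return None


night_action = decide(enrich_at_edge({"type": "motion", "hour": 23}))
day_action = decide(enrich_at_edge({"type": "motion", "hour": 10}))
```

As soon as `decide` needed data that the gateway cannot supply, it would have to fetch it remotely or keep local state, which is where the stateful approach comes in.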

Conclusions

Event streams are present in some well-known architectures, such as the lambda architecture, where they provide real-time data analytics. However, IoT applications, especially those focused on closing the cyber-physical loop, rely solely on providing a “real-time” response based on all the information available, a concept that matches very well with the so-called Kappa architecture [3].

Serverless architectures may work well for self-contained events, but as soon as additional data must be fetched remotely, performance will drop dramatically.

The interactive IoT application can be seen as a set of memory-hungry micro-applications that subscribe to events of interest and provision themselves with all the additional data they require to execute their business logic (see Figure 1). Ideally, all data will persist in immutable data streams that allow the micro-applications to rebuild their own state tables whenever they need to.
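Rebuilding state tables from an immutable stream can be sketched in a few lines of Python. The `EventLog` class, the sensor readings, and the two fold functions below are hypothetical; an in-memory list stands in for a durable, append-only log.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = Tuple[str, float]  # (sensor id, value), hypothetical format
Fold = Callable[[Optional[float], float], float]


class EventLog:
    """Append-only log: events are added and read, never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def replay(self) -> Tuple[Event, ...]:
        # Return an immutable snapshot in arrival order.
        return tuple(self._events)


def rebuild(log: EventLog, fold: Fold) -> Dict[str, float]:
    """Rebuild a micro-application's state table by replaying the full log."""
    state: Dict[str, float] = {}
    for key, value in log.replay():
        state[key] = fold(state.get(key), value)
    return state


def latest(previous: Optional[float], new: float) -> float:
    return new  # first version of the application: keep the last value


def peak(previous: Optional[float], new: float) -> float:
    return new if previous is None else max(previous, new)  # later version


log = EventLog()
for event in [("s1", 21.0), ("s2", 19.5), ("s1", 20.0)]:
    log.append(event)

latest_table = rebuild(log, latest)
peak_table = rebuild(log, peak)
```

Because the log is preserved, switching from `latest` to `peak` only requires a replay; the new logic is applied to all historical data rather than only to future events.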

References

  1. Bonér, J., Farley, D., Kuhn, R., & Thompson, M. (2014). The reactive manifesto.
  2. Reinsel, D., Gantz, J., & Rydning, J. (2017). Data Age 2025: The Evolution of Data to Life-Critical. IDC White Paper.
  3. Kreps, J. (2014). Questioning the Lambda Architecture. O’Reilly.
  4. Fowler, M. (2006). Focusing on Events.

 


 

Javier Moreno Molina (IEEE Member) received his master’s degree in Telecommunications Engineering from the Technical University of Madrid and his Ph.D. in Electrical Engineering from the Vienna University of Technology. He worked as a researcher in Cyber-Physical Systems at the University of Kaiserslautern before joining BEEVA (BBVA) as an Internet of Things Solutions Architect. He has participated in several international projects in the IoT field, such as GEODES, FP7 SmartCoDe and H2020 VICINITY, and has numerous publications on networked and distributed embedded systems.