Abstract: The Internet of Things (IoT) is revolutionizing how information is processed and stored. Because of latency-sensitive applications and the huge amounts of data produced at the edge of the network, more and more data is processed where it is produced, namely at the edge. This development results in completely new network topologies in which, besides massive data centers, a growing number of so-called micro data centers appear at the edge of the network. However, the increasing complexity of running an application across multiple data centers poses a new challenge for the deployment and runtime operation of large-scale applications such as those in smart cities, self-driving vehicles, and telemedicine. The challenge is to deploy an application so that it satisfies user requirements, expressed as Quality of Service parameters (e.g., latency), while minimizing the energy consumed to execute it.
In this talk, we discuss several research challenges that arise when deploying near-real-time analytics at the edge of the network. A critical challenge for data stream processing is keeping machine learning models consistent across distributed worker nodes. Especially for non-stationary streams, which exhibit a high degree of data set shift, mismanagement of models risks suboptimal accuracy due to staleness and ignored data. We discuss the model consistency challenges based on an online machine learning scenario. Additionally, we propose metrics for measuring the level and speed of data set shift.