Extending Apache Flink stream processing with Apache SAMOA machine learning methods

Many stream processing applications benefit from, or depend on, predictions made with machine learning (ML) methods. This presentation introduces new features of Apache SAMOA and demonstrates them in a real data processing scenario. These features make Apache SAMOA fully accessible to Apache Flink users: (1) a data stream processed within Apache Flink is forwarded to the Apache SAMOA stream mining engine, which makes predictions with stream-oriented ML models; (2) the ML models are updated after every labelled instance while new predictions are sent back to Apache Flink. In both directions, Apache Kafka is used for data exchange. Apache SAMOA thus serves as a stream mining engine that receives its input data from Apache Flink and returns its predictions to it. The presentation illustrates practical aspects with code examples, such as integrating the input and prediction streams and monitoring the latency of data processing and stream mining.
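The round trip described above can be pictured with a minimal, self-contained sketch. It is not the Apache SAMOA or Apache Flink API: two in-memory queues stand in for the Kafka topics, a toy majority-class learner stands in for a stream-oriented ML model, and every name (Instance, Prediction, MajorityClassModel, mining_step, run_demo) is hypothetical. It only illustrates the pattern of predicting for each incoming instance, updating the model when a label is present, and attaching a latency measurement to each prediction sent back.

```python
import time
from collections import Counter, deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    key: int
    features: tuple
    label: Optional[str]  # None for unlabelled instances
    sent_ns: int          # timestamp set when the instance is "published"


@dataclass
class Prediction:
    key: int
    predicted: Optional[str]  # None until the model has seen a label
    latency_ns: int           # time from publishing to prediction


class MajorityClassModel:
    """Toy incremental learner (assumption): predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, features):
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.counts[label] += 1


def mining_step(model, input_topic, prediction_topic, clock=time.monotonic_ns):
    """Drain the input queue: predict first, then learn if the instance is labelled."""
    while input_topic:
        inst = input_topic.popleft()
        predicted = model.predict(inst.features)
        if inst.label is not None:
            model.learn(inst.features, inst.label)
        latency = clock() - inst.sent_ns
        prediction_topic.append(Prediction(inst.key, predicted, latency))


def run_demo():
    # The two deques are stand-ins for the input and prediction Kafka topics.
    input_topic, prediction_topic = deque(), deque()
    for key, label in enumerate(["a", "a", "b", None, "a"]):
        input_topic.append(Instance(key, (float(key),), label, time.monotonic_ns()))
    mining_step(MajorityClassModel(), input_topic, prediction_topic)
    return list(prediction_topic)


if __name__ == "__main__":
    for p in run_demo():
        print(p.key, p.predicted, p.latency_ns)
```

In a real deployment the queues would be replaced by Kafka producers and consumers on both sides, and the latency would have to be computed from timestamps carried in the messages, since the two systems do not share a monotonic clock.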