“Hit me, baby, just one time” – Building End-to-End Exactly-Once Applications with Flink

Getting data in and out of Flink is one of the most important aspects of building Flink applications, and an everyday requirement. Doing so in an end-to-end exactly-once manner, however, can be tricky. An application must reliably consume data from the outside world without any duplicate processing, keep its distributed state consistent, and at the same time deliver computed results back to the outside world without introducing duplicates. This is crucial for the consistency and correctness of applications built upon stream processors. In this talk, we will explain how end-to-end exactly-once guarantees can be achieved with Apache Flink. We will cover Flink’s checkpointing mechanism and how exactly to leverage it when consuming data into and producing data from your Flink streaming pipelines. In particular, we will review in detail how our supported connectors do so, with the aim of providing reference implementations for your own custom consumers and sinks.