Over the last five years, Kafka and Flink have matured into technologies that let us embrace the streaming paradigm. You can bet on them to build reliable and efficient applications: both are active projects, backed by companies running them in production, with strong communities contributing code and sharing experience and knowledge. Kafka and Flink are solid choices if you want to build a data platform that your data scientists or developers can use to collect, process, and distribute data. You can put together Kafka Connect, Kafka, Schema Registry, and Flink. First, you will take care of their deployment. Then, for each use case, you will set up each part and, of course, develop the Flink job so it integrates easily with the rest. Sounds like a challenging but exciting project, doesn't it? In this session, you will learn how to build such a data platform, the nitty-gritty details of each part, how to plug them together (in particular, how to plug Flink into the Kafka ecosystem), the common pitfalls to avoid, and what it takes to deploy it all on Kubernetes. Even if you are not familiar with all the technologies, there will be enough introduction for you to follow. Come and learn how we can actually cross the streams!
34. ⚠Warning
➔ JVM Heap & RocksDB Memory & Container Memory → explicit allocation
➔ State Backend → e.g. HDFS
➔ HA Setup → e.g. HDFS
➔ Rootless Container with random UID → Build your own Docker Image
Flink
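The warnings above can be captured in Flink's configuration file. Below is a minimal flink-conf.yaml sketch, assuming an HDFS namenode at hdfs://namenode:8020 and a ZooKeeper quorum at zookeeper:2181 (hostnames and paths are illustrative, not from the slides):

```yaml
# Explicit memory allocation: size the whole TaskManager process so the
# JVM heap, RocksDB memory, and the container memory limit agree.
taskmanager.memory.process.size: 4g
taskmanager.memory.managed.fraction: 0.4   # share of memory handed to RocksDB

# State backend: RocksDB, with checkpoints stored on HDFS.
state.backend: rocksdb
state.checkpoints.dir: hdfs://namenode:8020/flink/checkpoints

# HA setup: JobManager metadata persisted on HDFS, leader election via ZooKeeper.
high-availability: zookeeper
high-availability.storageDir: hdfs://namenode:8020/flink/ha
high-availability.zookeeper.quorum: zookeeper:2181
```

For the last point (rootless containers with a random UID, as on OpenShift), there is no configuration key: building your own Docker image with suitable file permissions is the way out.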
40. ghosts
  id: INT  name: TEXT
  2        Slimer

movies
  id: INT  name: TEXT       year: INT
  1        Ghostbusters     1984
  2        Ghostbusters II  1989

ghosts_in_movies
  ghost_id: INT  movie_id: INT  id: INT
  2              1              2
  2              2              3

Seed the data into PostgreSQL (seed.sql)
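A possible shape for seed.sql, matching the tables shown above (a sketch only; the actual file from the talk may differ, and only the rows visible on the slide are seeded):

```sql
CREATE TABLE ghosts (
  id   INT PRIMARY KEY,
  name TEXT
);

CREATE TABLE movies (
  id   INT PRIMARY KEY,
  name TEXT,
  year INT
);

CREATE TABLE ghosts_in_movies (
  id       INT PRIMARY KEY,
  ghost_id INT REFERENCES ghosts (id),
  movie_id INT REFERENCES movies (id)
);

INSERT INTO ghosts VALUES (2, 'Slimer');
INSERT INTO movies VALUES
  (1, 'Ghostbusters', 1984),
  (2, 'Ghostbusters II', 1989);
INSERT INTO ghosts_in_movies (ghost_id, movie_id, id) VALUES
  (2, 1, 2),
  (2, 2, 3);
```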
50. ghosts
  id: INT  name: TEXT
  2        Slimer

movies
  id: INT  name: TEXT       year: INT
  1        Ghostbusters     1984
  2        Ghostbusters II  1989

ghosts_in_movies
  ghost_id: INT  movie_id: INT  id: INT
  2              1              2
  2              2              3

Query Result Example
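A query result over these tables would come from joining them. A hedged example of the kind of query that could produce such a result (not taken from the talk):

```sql
SELECT g.name AS ghost, m.name AS movie, m.year
FROM ghosts_in_movies gm
JOIN ghosts g ON g.id = gm.ghost_id
JOIN movies m ON m.id = gm.movie_id
ORDER BY m.year;
-- Slimer | Ghostbusters    | 1984
-- Slimer | Ghostbusters II | 1989
```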