What is Apache Kafka?
• It is an 🙌 open-source 🙌 event streaming platform.
• Simply put, it is a way of moving data between systems.
• (i.e. between applications, and servers)
3 key capabilities for event streaming:
• To publish (write) and subscribe to (read) streams of events, including
continuous import/export of data from other systems.
• To store streams of events durably and reliably for as long as a dev needs.
• To process streams of events as they occur or retrospectively.
What can I use event streaming for?
• To process payments and financial transactions in real-time
• stock exchanges
• To track and monitor cars, trucks, fleets, and shipments in real-time
• automotive industry
• To continuously capture and analyze sensor data from IoT devices/equipment
• wind parks
Event streaming – cont’d…
• To collect and immediately react to customer interactions and orders
• hotel and travel industry
• mobile applications
• To connect, store, and make available data produced by different divisions of
• To serve as the foundation for data platforms, event-driven architectures,
Where is Apache Kafka used?
• Netflix is using Kafka to apply
recommendations in real-time while
you're watching TV shows.
• Uber uses Kafka to gather user, taxi
and trip data in real-time to compute
and forecast demand and pricing in
• LinkedIn uses Kafka to prevent spam in
their platform, collect user
interactions and make better
T is for Topics and Twilight Sparkle
• A particular stream of data.
• A topic is identified by its name.
• Topics are split into partitions.
• Each partition is ordered.
• Each message within a partition gets an
incremental id, called offset.
• Central protagonist of the show
• Most intellectual member of the Mane Six
• Her cutie mark represents her talent
for magic and her love
for books and knowledge.
0 1 2 3 4
0 1 2 3
0 1 2 3 4 5 6
Topic Replication is the process to offer fail-over
capability for a topic.
• If one broker (holds topics and partitions) is
down, the other broker can serve the data.
• This replication factor defines the number of
copies of a topic in a Kafka cluster (made up of
multiple Kafka brokers).
• Kafka stores messages in topics that
are partitioned and replicated across
multiple brokers in a cluster.
B is for Brokers and Babs Seed
• Holds topics and partitions.
• Each broker is identified with its ID (integer).
• Each broker contains certain topic partitions.
• After connecting to any broker, you're
connected to an entire cluster.
• Apple Bloom's cousin from Manehattan.
• Former member of the Cutie Mark
P is for Partitions and Pinkie Pie
• An ordered, immutable record sequence.
• Once data is written to a partition, it can't be
• Data is assigned randomly to a partition, unless a key
• At any one time, only ONE broker can be a leader for
a given partition.
• That leader can receive and serve data for a
• The other brokers will just be passive
replicas and synchronize the data.
• Each partition is going to have one leader, and
multiple ISR (in-sync replica).
• Baker at Sugarcube Corner.
• Toothless pet alligator, Gummy.
• Represents the element of laughter.
P is also for Producers and Pound Cake
• How do we get data in Kafka?
• Producers write data to topics (which are made up of
• Producers automatically know to which broker and
partition to write to.
• Producers Message Keys: producers can choose to
send a key with a message. (i.e. string, number, .etc)
• If a key is sent, all messages for that key will
always go to the same partition.
• A key is sent if you need message ordering for a
specific field (i.e. truck_id).
• Parents are surprised to find out that
Pound Cake is a male Pegasus, and his twin
foal sister (Pumpkin Cake) is a female
unicorn… even though their parents are
both Earth ponies.
C is for Consumers and Princess Celestia
• How do we read data in Kafka?
• They read data from a topic (identified by name).
• Consumers know which broker to read from.
• In case of broker failures, consumers know how to recover.
• Data is read in order within each partition.
• Consumer Groups: consumers read data in consumer
• Each consumer within a group will read directly from
• If you have more consumers than partitions, some will
• Most magical pony.
• Responsible for raising the sun to create
light in Equestria.
• Over 1,000 years old!
Z is for Zookeeper and Zecora
• Keeps track of status of the Kafka cluster nodes.
• Also keeps track of Kafka topics and partitions.
• Currently, Apache Kafka® uses Apache ZooKeeper™ to
store its metadata (i.e. location of partitions, the
configuration of topics).
⚠️ Apache Kafka is removing the Apache ZooKeeper
• In 2019, they outlined a plan to break this
dependency and bring metadata management back
into Kafka itself.
• Female zebra shaman and herbalist.
• Always speaks in rhyme.
Offenbar haben Sie einen Ad-Blocker installiert. Wenn Sie SlideShare auf die Whitelist für Ihren Werbeblocker setzen, helfen Sie unserer Gemeinschaft von Inhaltserstellern.
Sie hassen Werbung?
Wir haben unsere Datenschutzbestimmungen aktualisiert.
Wir haben unsere Datenschutzbestimmungen aktualisiert, um den neuen globalen Regeln zum Thema Datenschutzbestimmungen gerecht zu werden und dir einen Einblick in die begrenzten Möglichkeiten zu geben, wie wir deine Daten nutzen.
Die Einzelheiten findest du unten. Indem du sie akzeptierst, erklärst du dich mit den aktualisierten Datenschutzbestimmungen einverstanden.