What’s Kafka
• It’s an open-source message broker written in Scala
and Java.
• Which aims to provide a unified, high-throughput,
low-latency platform for handling real-time data
feeds.
• Whose design is heavily influenced by transaction
logs.
Kafka is also…
• A distributed, partitioned, replicated commit log
service.
• A stream processing platform.
• A system that supports both the queue and publish/subscribe paradigms.
Use Cases
Kafka concepts
• Maintains feeds of messages in categories called
topics.
• Processes that publish messages to Kafka are
called producers.
• Processes that subscribe to topics and process the
feed of published messages are called consumers.
• Kafka runs as a cluster of one or more servers,
each of which is called a broker.
Data Retention
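• Kafka retains all published messages, whether or not
they have been consumed, for a configurable period of time.
• Retaining lots of data is not a problem: Kafka’s performance
is effectively constant with respect to data size.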
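The effect of time-based retention can be sketched with a toy model. This is only an illustration under stated assumptions: `StoredMessage` and `Retention.retained` are hypothetical names, and real Kafka removes whole log segments according to settings such as `log.retention.hours` (broker) or `retention.ms` (topic) rather than filtering individual messages.

```scala
// Toy model of time-based retention. All names here are hypothetical.
final case class StoredMessage(offset: Long, timestampMs: Long, value: String)

object Retention {
  /** Keeps only the messages whose age does not exceed the retention period. */
  def retained(log: Seq[StoredMessage], nowMs: Long, retentionMs: Long): Seq[StoredMessage] =
    log.filter(m => nowMs - m.timestampMs <= retentionMs)
}

object RetentionDemo {
  def main(args: Array[String]): Unit = {
    val log = Seq(
      StoredMessage(0L, 1000L, "old"),
      StoredMessage(1L, 5000L, "recent")
    )
    // With a 2000 ms retention period evaluated at t = 6000 ms,
    // only messages written at or after t = 4000 ms survive.
    println(Retention.retained(log, nowMs = 6000L, retentionMs = 2000L))
  }
}
```

Note that consuming a message plays no part in the decision: only its age matters.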
Producers and Consumers
Producers send messages over the network to the
Kafka cluster which in turn serves them up to
consumers like this:
The Topic
A topic is a category or feed name to which messages are published.
For each topic, the Kafka cluster maintains a partitioned log that looks
like this:
The Partition
• Each partition is an ordered, immutable sequence
of messages that is continually appended to.
• The messages in the partitions are each assigned a
sequential number called the offset.
• The offset uniquely identifies each message within
the partition.
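The relationship between appends and offsets can be sketched as a small in-memory model. `PartitionLog` is a hypothetical class written for this illustration; it is not how Kafka stores data on disk.

```scala
import scala.collection.mutable.ArrayBuffer

// Toy model of a single partition: an append-only sequence of messages.
// Hypothetical class for illustration only.
final class PartitionLog {
  private val messages = ArrayBuffer.empty[String]

  /** Appends a message and returns the offset assigned to it. */
  def append(message: String): Long = {
    messages += message
    (messages.size - 1).toLong
  }

  /** Reads the message stored at the given offset, if there is one. */
  def read(offset: Long): Option[String] =
    if (offset >= 0 && offset < messages.size) Some(messages(offset.toInt)) else None

  /** The offset that the next appended message will receive. */
  def nextOffset: Long = messages.size.toLong
}

object PartitionLogDemo {
  def main(args: Array[String]): Unit = {
    val log = new PartitionLog
    val first = log.append("order-created")
    val second = log.append("order-paid")
    println(s"offsets: $first, $second")
    println(s"message at offset $second: ${log.read(second)}")
  }
}
```

Existing entries are never modified or reordered, so an offset keeps pointing at the same message for as long as that message is retained.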
Partitions and Consumers
More on partitions
• Partitions in the log allow it to scale beyond a size
that would fit on a single server.
• A topic may have many partitions.
• Partitions also act as the unit of parallelism.
Partitions… again…
• Partitions are distributed over the servers in the
Kafka cluster.
• Each partition is replicated across servers for fault
tolerance.
Guess what… Yep, partitions…
• Each partition has one server which acts as the
“leader”.
• Each partition has zero or more servers which act
as “followers”.
• If the leader fails, one of the followers will become
the leader.
…
• The leader handles all requests for the partition
while the followers replicate the leader.
• Each server/node/broker acts as a leader for some
of its partitions and a follower for others.
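The leader and follower roles for one partition can be modelled roughly as follows. This is a simplified sketch: `PartitionReplicas` is a hypothetical type, brokers are plain integer ids, and the first surviving follower is promoted, whereas real Kafka chooses the new leader from the in-sync replicas through the cluster controller.

```scala
// Toy model of the replica set of one partition. Hypothetical type for illustration.
final case class PartitionReplicas(leader: Option[Int], followers: List[Int]) {

  /** Returns the replica set after the given broker has failed. */
  def fail(brokerId: Int): PartitionReplicas = {
    if (leader.contains(brokerId)) {
      // The leader is gone: promote the first remaining follower, if any.
      followers match {
        case next :: rest => PartitionReplicas(Some(next), rest)
        case Nil          => PartitionReplicas(None, Nil)
      }
    } else {
      // A follower is gone: the leader keeps serving requests.
      copy(followers = followers.filterNot(_ == brokerId))
    }
  }
}

object FailoverDemo {
  def main(args: Array[String]): Unit = {
    val replicas = PartitionReplicas(leader = Some(1), followers = List(2, 3))
    println(replicas.fail(1))         // broker 1 was the leader
    println(replicas.fail(1).fail(2)) // a second failure
  }
}
```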
Data Replication
Producers
• Producers publish data to the topics of their choice.
• The producer is responsible for choosing which
message to assign to which partition within the
topic.
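One common way for a producer to pick a partition is to hash the message key, falling back to round-robin when there is no key. The sketch below assumes that strategy; `SimplePartitioner` is a hypothetical class, and `String.hashCode` stands in for the murmur2 hash that Kafka's Java client applies to the serialized key.

```scala
// Hypothetical partitioner for illustration: hash the key, or rotate when there is none.
final class SimplePartitioner(numPartitions: Int) {
  require(numPartitions > 0, "a topic needs at least one partition")

  private var nextUnkeyed = 0

  /** Chooses the partition for a message with an optional key. */
  def partitionFor(key: Option[String]): Int = key match {
    case Some(k) =>
      // floorMod keeps the result non-negative even for negative hash codes.
      Math.floorMod(k.hashCode, numPartitions)
    case None =>
      val partition = nextUnkeyed
      nextUnkeyed = (nextUnkeyed + 1) % numPartitions
      partition
  }
}

object PartitionerDemo {
  def main(args: Array[String]): Unit = {
    val partitioner = new SimplePartitioner(numPartitions = 3)
    println(partitioner.partitionFor(Some("customer-42")))
    println(partitioner.partitionFor(Some("customer-42"))) // same key, same partition
    println(partitioner.partitionFor(None))
  }
}
```

Because equal keys always land on the same partition, all messages for one key keep their relative order.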
Consumers
• Kafka offers a single consumer abstraction called
the consumer group.
• Consumers label themselves with a consumer
group name.
• Each message published to a topic is delivered to
one consumer within each consumer group.
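Within a group, this is achieved by giving each partition to exactly one consumer. A minimal round-robin sketch of that idea follows; `ConsumerGroupAssignment.assign` is a hypothetical helper, and Kafka's real assignors handle rebalancing and multiple topics, which this does not.

```scala
// Hypothetical helper for illustration: spread partitions over the consumers of ONE group.
object ConsumerGroupAssignment {

  /** Assigns every partition to exactly one consumer, in round-robin order. */
  def assign(partitions: Seq[Int], consumers: Seq[String]): Map[String, Seq[Int]] = {
    require(consumers.nonEmpty, "a consumer group needs at least one consumer")
    val nothingAssigned: Map[String, Vector[Int]] =
      consumers.map(c => c -> Vector.empty[Int]).toMap
    partitions.zipWithIndex.foldLeft(nothingAssigned) { case (assignment, (partition, index)) =>
      val owner = consumers(index % consumers.size)
      assignment.updated(owner, assignment(owner) :+ partition)
    }
  }
}

object ConsumerGroupDemo {
  def main(args: Array[String]): Unit = {
    // Two consumers share four partitions.
    println(ConsumerGroupAssignment.assign(Seq(0, 1, 2, 3), Seq("c1", "c2")))
    // More consumers than partitions: one consumer stays idle.
    println(ConsumerGroupAssignment.assign(Seq(0, 1), Seq("c1", "c2", "c3")))
  }
}
```

A second group would run the same assignment independently, so every group sees every message (publish/subscribe) while the members of one group split the work (queue).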
Consumer Groups
Guarantees
• Messages sent by a producer to a particular topic
partition will be appended in the order they are
sent.
• A consumer instance sees messages in the order
they are stored in the log.
• For a topic with replication factor N, Kafka will
tolerate up to N-1 server failures without losing
any messages committed to the log.
ZooKeeper
• Kafka uses ZooKeeper to store metadata about
the Kafka cluster, as well as consumer client
details.
Avro
• Avro is a commonly preferred serialization format for
Kafka messages.
• It’s independent of platform and language.
• It allows schemas to evolve.
• Schemas are defined in JSON.
Avro
{"namespace": "customerManagement.avro",
 "type": "record",
 "name": "Customer",
 "fields": [
   {"name": "id", "type": "int"},
   {"name": "name", "type": "string"},
   {"name": "faxNumber", "type": ["null", "string"], "def
 ]
}
Schema Registry
• It’s a REST service.
• Allows an Avro schema to be registered for one or
more topics.
• Stores multiple versions of a schema.
• Validates schema compatibility.
Schema Registry API
https://docs.confluent.io/current/schema-registry/docs/api.html
http://ae34acbe5ed9b11e8810a0a4e9b68c10-2021023861.us-east-1.elb.amazonaws.com:8081/subjects
http://ae34acbe5ed9b11e8810a0a4e9b68c10-2021023861.us-east-1.elb.amazonaws.com:8081/subjects/orders-avro-value/versions
http://ae34acbe5ed9b11e8810a0a4e9b68c10-2021023861.us-east-1.elb.amazonaws.com:8081/subjects/orders-avro-value/versions/1
http://ae34acbe5ed9b11e8810a0a4e9b68c10-2021023861.us-east-1.elb.amazonaws.com:8081/subjects/orders-avro-value/versions/1/schema
http://ae34acbe5ed9b11e8810a0a4e9b68c10-2021023861.us-east-1.elb.amazonaws.com:8081/schemas/ids/41
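The URLs above follow a small set of path patterns, which can be captured in a helper like the one below. This is a sketch only: `SchemaRegistryEndpoints` is a hypothetical class, `http://localhost:8081` is an assumed base URL, and subject names are assumed to be URL-safe so no escaping is done.

```scala
// Hypothetical helper for illustration: builds the Schema Registry paths listed above.
final case class SchemaRegistryEndpoints(baseUrl: String) {
  private val root = baseUrl.stripSuffix("/")

  /** Lists all registered subjects. */
  def subjects: String = s"$root/subjects"

  /** Lists the versions registered under a subject. */
  def versions(subject: String): String = s"$root/subjects/$subject/versions"

  /** Describes one version of a subject. */
  def versionOf(subject: String, number: Int): String = s"${versions(subject)}/$number"

  /** Returns only the schema of one version of a subject. */
  def schemaOf(subject: String, number: Int): String = s"${versionOf(subject, number)}/schema"

  /** Looks a schema up by its globally unique id. */
  def schemaById(id: Int): String = s"$root/schemas/ids/$id"
}

object SchemaRegistryDemo {
  def main(args: Array[String]): Unit = {
    val registry = SchemaRegistryEndpoints("http://localhost:8081") // assumed address
    println(registry.subjects)
    println(registry.schemaOf("orders-avro-value", 1))
    println(registry.schemaById(41))
  }
}
```

Each resulting string can be handed to any HTTP client as a GET request against a running Schema Registry.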
Monitoring
http://aebb8cb14eec211e8810a0a4e9b68c10-1296651079.us-east-1.elb.amazonaws.com:9000/
Kafka Streams
• Kafka Streams is a client library for building applications and microservices
where the input and output data are stored in Kafka clusters.
Kafka Streams
• A stream is the most important abstraction provided by Kafka Streams. It
represents an unbounded, continuously updating data set.
• A stream processing application is any program that makes use of the Kafka
Streams library.
• A stream processor is a node in the processor topology.
• There are two special processors in the topology:
  • Source Processor: A source processor is a special type of stream
  processor that does not have any upstream processors.
  • Sink Processor: A sink processor is a special type of stream processor that
  does not have any downstream processors.
Spark & Kafka
• DataFrame schema (reading)
• DataFrame schema (writing)
• Required configurations (reading)
• Required configurations (writing)
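Spark & Kafka
• Reading
// Assumes an active SparkSession `spark` and String values `brokers` and `topic`.
val ordersStreamDF = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", brokers) // comma-separated host:port list
  .option("subscribe", topic)
  .option("startingOffsets", "earliest")      // start from the oldest retained message
  .load()
• Writing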
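// Assumes a streaming DataFrame `ordersWithItemsAndProductsDF` and String values
// `brokers`, `topic` and `checkpointBucketKafka`.
val ordersDFQueryKafka =
  ordersWithItemsAndProductsDF
    // The second expression is truncated in the source; the Kafka sink expects a `value` column.
    .selectExpr("CAST(Timestamp as STRING) as key", "CAST(Discount as
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", brokers)
    .option("topic", topic + "-out")
    .option("checkpointLocation", checkpointBucketKafka)
    .start()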
Ecosystem
• https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
HANDS ON
THANK YOU!

Kafka basics