SITE RELIABILITY ENGINEERING ©2014 LinkedIn Corporation. All Rights Reserved.
Enterprise Kafka
Why Am I Here?
• You want to find out what this “Kafka” thing is
• You’re running Kafka, but you want to go big
• You’re looking for some neat whizbangs
Clark Haskins
Todd Palino
Who Are We?
• Kafka SRE at LinkedIn
• Site Reliability Engineering
  – Administrators
  – Architects
  – Developers
• Keep the site running, always
Kafka Overview
What Is Kafka?
[Diagram: Producer, Broker (topic A, partitions P0 and P1), Consumer, Zookeeper]
Attributes of a Kafka Cluster
• Disk Based
• Durable
• Scalable
• Low Latency
• Finite Retention
• NOT Idempotent (yet)
Kafka At LinkedIn
• Multiple Datacenters, Multiple Clusters
• Mirroring between clusters
• Message Types
  – Metrics
  – Tracking
  – Queuing
• Data transport from applications to Hadoop, and back
 300+ Kafka brokers
 Over 18,000 topics
 140,000+ Partitions
 220 Billion messages per day
 40 Terabytes In
 160 Terabytes Out
 Peak Load
– 3.25 Million messages per second
– 5.5 Gigabits/sec Inbound
– 18 Gigabits/sec Outbound
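
The daily totals above imply average rates that can be set against the peak figures. The following is a rough sketch of that arithmetic, assuming decimal units (1 TB = 10^12 bytes) and an 86,400-second day; the variable names are illustrative and not from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 220e9     # 220 billion messages per day
bytes_in_per_day = 40e12     # 40 TB in (assumption: decimal terabytes)
bytes_out_per_day = 160e12   # 160 TB out (assumption: decimal terabytes)

# Average rates over a whole day, for comparison with the peak numbers.
avg_msgs_per_sec = messages_per_day / SECONDS_PER_DAY
avg_in_gbps = bytes_in_per_day * 8 / SECONDS_PER_DAY / 1e9
avg_out_gbps = bytes_out_per_day * 8 / SECONDS_PER_DAY / 1e9

# How many times each inbound byte is read back out, on average.
fan_out = bytes_out_per_day / bytes_in_per_day

# Average inbound message size in bytes.
avg_msg_bytes = bytes_in_per_day / messages_per_day

print(f"average messages/sec: {avg_msgs_per_sec:,.0f}")
print(f"average inbound Gbit/s: {avg_in_gbps:.2f}")
print(f"average outbound Gbit/s: {avg_out_gbps:.2f}")
print(f"read fan-out: {fan_out:.1f}x")
print(f"average message size: {avg_msg_bytes:.0f} bytes")
```

Under these assumptions the averages come out a little below the stated peaks, and the data is read about four times for every time it is written.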
Challenges We Have Overcome
Solutions
• Kafka is young, so we influenced its development
• Operations wizardry…
Hyper Growth
• Need to expand clusters to keep up with site traffic, and then balance them.
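
One way to picture the balancing step is a greedy pass that moves partitions from the fullest broker to the emptiest one until the counts are nearly even. This is only a sketch of the idea, not the leveling tool used at LinkedIn or Kafka's own reassignment logic; `level_brokers` and the broker and partition names are hypothetical, and the sketch ignores replica placement rules and partition size.

```python
def level_brokers(assignment):
    """Even out partition counts across brokers.

    `assignment` maps a broker id to the list of partitions it hosts.
    Returns the leveled mapping and the list of moves that were made.
    """
    # Copy so the caller's mapping is left untouched.
    result = {broker: list(parts) for broker, parts in assignment.items()}
    moves = []
    while True:
        fullest = max(result, key=lambda b: len(result[b]))
        emptiest = min(result, key=lambda b: len(result[b]))
        # Stop once no broker holds more than one partition above another.
        if len(result[fullest]) - len(result[emptiest]) <= 1:
            return result, moves
        partition = result[fullest].pop()
        result[emptiest].append(partition)
        moves.append((partition, fullest, emptiest))


# Two existing brokers with four partitions each, plus one newly added broker.
before = {
    "broker1": ["A-0", "A-1", "B-0", "B-1"],
    "broker2": ["A-2", "A-3", "B-2", "B-3"],
    "broker3": [],
}
after, moves = level_brokers(before)
for broker, parts in after.items():
    print(broker, parts)
```

Without a step like this, a newly added broker stays empty until new topics are created, so it does nothing to relieve the existing brokers.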
Adding brokers
[Diagram: Producers, Brokers, Consumers; partitions P0 to P7 of topics A and B and partitions P0 to P3 of topic C spread across the brokers]
Adding a broker (with broker leveling)
[Diagram: Producers, Brokers, Consumers; partitions P0 to P7 of topics A and B and partitions P0 to P3 of topic C spread across the brokers]
Logs vs. Metrics
• Logging data killed the metrics cluster
Quality of Service with Kafka
[Diagram: Producers, Brokers, Consumers; partitions of topics A, B, and C spread across the brokers]
Deployment Nightmares
• Parallel deployment wasn’t possible, so…
• …we babysat sequential deployments
Easy deployments
• Kafka 0.8.1 makes sure the cluster is in a good state before shutting down
  – If any broker in the cluster has under-replicated partitions, Kafka will not shut down
  – Kafka ensures that only one broker is in its shutdown sequence at a time
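
The same two rules can be mirrored in a deployment script that wraps the restarts. The sketch below is an external orchestration loop written for illustration, not Kafka's internal shutdown code; `rolling_restart` and both callbacks are hypothetical stand-ins for whatever metrics and deployment tooling is in use.

```python
def rolling_restart(brokers, under_replicated_count, restart):
    """Restart brokers one at a time while the cluster stays fully replicated.

    `under_replicated_count` is a callable returning the current number of
    under-replicated partitions; `restart` restarts a single broker and
    returns once it is back. Both are assumptions of this sketch.
    """
    restarted = []
    for broker in brokers:
        if under_replicated_count() > 0:
            # The cluster is not healthy: stop instead of taking
            # another replica offline.
            break
        restart(broker)  # only one broker is ever restarting at a time
        restarted.append(broker)
    return restarted


# Simulated health readings: healthy, healthy, then 3 under-replicated partitions.
readings = iter([0, 0, 3])
restarted = rolling_restart(
    ["broker1", "broker2", "broker3"],
    under_replicated_count=lambda: next(readings),
    restart=lambda broker: None,
)
print(restarted)
```

A production script would more likely wait and re-check than give up at the first unhealthy reading; stopping keeps the sketch short.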
Killing Zookeeper
• Consumer offset management done within Zookeeper
• Every consumer committing offsets every minute for every partition makes ZK very unhappy.
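
A back-of-the-envelope estimate shows why this adds up: with offsets kept in Zookeeper, each commit for each partition is a separate Zookeeper write. The helper below is a sketch, and the consumer and partition counts in the example are hypothetical numbers chosen for illustration, not LinkedIn's.

```python
def offset_commits_per_second(consumers, partitions_per_consumer,
                              commit_interval_seconds=60):
    """Estimated Zookeeper writes per second caused by offset commits."""
    return consumers * partitions_per_consumer / commit_interval_seconds


# Hypothetical fleet: 1,000 consumers, each owning 120 partitions,
# committing once a minute.
writes = offset_commits_per_second(1000, 120)
print(f"{writes:.0f} Zookeeper writes/sec from offset commits alone")
```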
Zookeeper on SSD
Monitoring
Kafka Is Broken!
• Everything is Kafka’s fault first
• What is lag?
• Consumer Problems
  – Application problems
  – Kafka client problems
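
On the question of what lag is: for each partition, it is the gap between the newest offset in the log and the offset the consumer has committed. A minimal sketch, in which the function name and the input dictionaries are illustrative and a partition with no committed offset is assumed to start from offset 0:

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: messages in the log that the consumer has not committed.

    Both arguments map a partition id to an offset. A partition missing from
    `committed_offsets` is treated as committed at 0 (an assumption of this sketch).
    """
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    }


lag = consumer_lag(
    log_end_offsets={0: 1500, 1: 1200},
    committed_offsets={0: 1500, 1: 900},
)
print(lag)
```

Lag that keeps growing while the brokers are healthy usually means the consumer is reading more slowly than the producers are writing.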
How Do We Sleep At Night?
• Educating Users
  – Why lag is their fault
• Monitoring the Ecosystem
  – Kafka Brokers
  – Zookeeper
  – Mirror Makers
  – Audit
  – REST Interfaces
• Week Over Week
Cluster Health and Utilization
• Under-replicated partitions
• Offline partitions
• Broker partition count
• Data size on disk
• Leader partition count
• Network utilization
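
Several of these metrics can be derived from partition metadata alone. The sketch below assumes each partition is described by its leader, its replica list, and its in-sync replica (ISR) list; the record layout and the `cluster_health` name are hypothetical, not a Kafka API.

```python
from collections import Counter


def cluster_health(partitions):
    """Summarize health and balance from a list of partition records."""
    return {
        # Fewer in-sync replicas than assigned replicas.
        "under_replicated": [p["name"] for p in partitions
                             if len(p["isr"]) < len(p["replicas"])],
        # No leader means the partition cannot be read or written.
        "offline": [p["name"] for p in partitions if p["leader"] is None],
        # Replicas hosted per broker (broker partition count).
        "partitions_per_broker": Counter(
            broker for p in partitions for broker in p["replicas"]),
        # Partitions led per broker (leader partition count).
        "leaders_per_broker": Counter(
            p["leader"] for p in partitions if p["leader"] is not None),
    }


partitions = [
    {"name": "A-0", "leader": 1, "replicas": [1, 2], "isr": [1, 2]},
    {"name": "A-1", "leader": 2, "replicas": [2, 3], "isr": [2]},
    {"name": "B-0", "leader": None, "replicas": [3, 1], "isr": []},
]
health = cluster_health(partitions)
print(health["under_replicated"], health["offline"])
```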
Zookeeper
• Ensemble availability
• Latency
• Outstanding requests
Mirror Maker and Audit
• Mirror Maker
  – Lag
  – Dropped Messages
• Audit Consumer
  – Lag
  – Completeness check
• Audit UI
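
The slides name a completeness check without spelling it out. One plausible reading is to compare the message counts reported at the producer with the counts seen downstream for the same topic and time bucket. The sketch below assumes that reading; the `completeness` function, the bucket keys, and the sample counts are all hypothetical.

```python
def completeness(produced_counts, consumed_counts):
    """Percentage of produced messages seen downstream, per (topic, time bucket)."""
    report = {}
    for key, produced in produced_counts.items():
        consumed = consumed_counts.get(key, 0)
        # An empty bucket has nothing to lose, so count it as complete.
        report[key] = 100.0 * consumed / produced if produced else 100.0
    return report


produced = {("topic-a", "12:00"): 1000, ("topic-a", "12:10"): 500}
consumed = {("topic-a", "12:00"): 1000, ("topic-a", "12:10"): 450}
for key, percent in completeness(produced, consumed).items():
    print(key, f"{percent:.1f}%")
```

A bucket that stays below 100% after the mirror has caught up points to dropped messages rather than lag.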
[Diagram: Producer, two Clusters linked by Mirror Maker (MM), Audit Consumers, Audit State, Audit UI; flows labeled Messages, Message Counts, and All Messages]
Audit UI
Tuning
Hardware and OS
• Kernel Tuning
  – Swapping is Death
  – Allow more dirty pages
  – Allow less dirty cache
• Disk throughput
  – More spindles
  – Longer commit interval
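
On Linux, the kernel tuning bullets most plausibly map to the `vm.swappiness`, `vm.dirty_ratio`, and `vm.dirty_background_ratio` sysctls. That mapping is an interpretation, and the values below are placeholders for illustration, not numbers given in the talk. The sketch renders them as lines in `sysctl.conf` style.

```python
# Hypothetical values for illustration only; tune against your own workload.
KERNEL_SETTINGS = {
    "vm.swappiness": 1,              # avoid swapping unless truly out of memory
    "vm.dirty_background_ratio": 5,  # start background flushing early
    "vm.dirty_ratio": 60,            # let more dirty pages build up before writers block
}


def render_sysctl(settings):
    """Render settings as 'key = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


# Background flushing should begin before writers are forced to block.
assert KERNEL_SETTINGS["vm.dirty_background_ratio"] < KERNEL_SETTINGS["vm.dirty_ratio"]
print(render_sysctl(KERNEL_SETTINGS))
```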
Java Virtual Machine
Garbage Collection
• Java 7, update 51
• Garbage First (G1) Collector
  – Set the heap size
  – Specify a target GC pause time
  – Don’t set the New size
• GC Times
  – Less than 15 ms per second in GC
  – Steady 20-22 ms GC intervals
  – Almost no full GC cycles (and only 200-400 ms when they do occur)
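
The three G1 bullets correspond to standard HotSpot options: `-Xms`/`-Xmx` for the heap, `-XX:+UseG1GC` to select the collector, and `-XX:MaxGCPauseMillis` for the pause target. The sketch below assembles them and also turns the "15 ms per second" figure into a percentage. The heap size and pause target in the example are hypothetical, and setting `-Xms` equal to `-Xmx` is an assumption about what "set the heap size" means.

```python
def g1_jvm_flags(heap_size, pause_target_ms):
    """Build JVM flags for G1 with a fixed heap and a pause-time target.

    No young-generation size flag is included, in line with the
    "don't set the New size" advice.
    """
    return [
        f"-Xms{heap_size}",
        f"-Xmx{heap_size}",
        "-XX:+UseG1GC",
        f"-XX:MaxGCPauseMillis={pause_target_ms}",
    ]


def gc_overhead_percent(gc_ms_per_second):
    """Share of wall-clock time spent in GC, as a percentage."""
    return gc_ms_per_second * 100 / 1000


# Hypothetical heap size and pause target, for illustration only.
print(" ".join(g1_jvm_flags("4g", 20)))
print(f"GC overhead at 15 ms/sec: {gc_overhead_percent(15)}%")
```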
Closing
What’s Coming in 0.8.2
• Consumer offsets in the broker
• Delete topic
• Further down the road
  – New producer
  – Improved producer API
Upcoming Operational Work
• Learning to share
• Shrinking a cluster
• Cluster comparison
• Advanced monitoring
How Can You Get Involved?
• http://kafka.apache.org
• Join the mailing lists
  – users@kafka.apache.org
• irc.freenode.net - #apache-kafka
• Contribute tools
Talk To Us
• Kafka SREs at LinkedIn
  – Clark Haskins
    • https://www.linkedin.com/in/clarkhaskins
    • chaskins@linkedin.com
  – Todd Palino
    • https://www.linkedin.com/in/toddpalino
    • tpalino@linkedin.com
Questions
42
Enterprise Kafka: Kafka as a Service

Weitere ähnliche Inhalte

Was ist angesagt?

Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedInBuilding a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
DataWorks Summit
 

Was ist angesagt? (20)

Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixWhoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
Towards Flink 2.0: Unified Batch & Stream Processing - Aljoscha Krettek, Verv...
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
 
Extending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use cases
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
 
Kafka Summit NYC 2017 - Data Processing at LinkedIn with Apache Kafka
Kafka Summit NYC 2017 - Data Processing at LinkedIn with Apache KafkaKafka Summit NYC 2017 - Data Processing at LinkedIn with Apache Kafka
Kafka Summit NYC 2017 - Data Processing at LinkedIn with Apache Kafka
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Apache Pulsar Development 101 with Python
Apache Pulsar Development 101 with PythonApache Pulsar Development 101 with Python
Apache Pulsar Development 101 with Python
 
Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Kafka internals
Kafka internalsKafka internals
Kafka internals
 
Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedInBuilding a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
 
Kafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier ArchitecturesKafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier Architectures
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Hardening Kafka Replication
Hardening Kafka Replication Hardening Kafka Replication
Hardening Kafka Replication
 

Ähnlich wie Enterprise Kafka: Kafka as a Service

Linked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafkaLinked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafka
Nitin Kumar
 
Web rtc infrastructure the hard parts v4
Web rtc infrastructure the hard parts v4Web rtc infrastructure the hard parts v4
Web rtc infrastructure the hard parts v4
Dialogic Inc.
 
App-First & Cloud-Native: How InterMiles Boosted CX with AWS & Infostretch
App-First & Cloud-Native: How InterMiles Boosted CX with AWS & InfostretchApp-First & Cloud-Native: How InterMiles Boosted CX with AWS & Infostretch
App-First & Cloud-Native: How InterMiles Boosted CX with AWS & Infostretch
Infostretch
 

Ähnlich wie Enterprise Kafka: Kafka as a Service (20)

Kafka overview and use cases
Kafka overview and use casesKafka overview and use cases
Kafka overview and use cases
 
WebRTC Infrastructure the Hard Parts: Media
WebRTC Infrastructure the Hard Parts: MediaWebRTC Infrastructure the Hard Parts: Media
WebRTC Infrastructure the Hard Parts: Media
 
Scribe Online CDK & Connector Development
Scribe Online CDK & Connector DevelopmentScribe Online CDK & Connector Development
Scribe Online CDK & Connector Development
 
Linked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafkaLinked in multi tier, multi-tenant, multi-problem kafka
Linked in multi tier, multi-tenant, multi-problem kafka
 
Multi tier, multi-tenant, multi-problem kafka
Multi tier, multi-tenant, multi-problem kafkaMulti tier, multi-tenant, multi-problem kafka
Multi tier, multi-tenant, multi-problem kafka
 
Web rtc infrastructure the hard parts v4
Web rtc infrastructure the hard parts v4Web rtc infrastructure the hard parts v4
Web rtc infrastructure the hard parts v4
 
Cisco Connect Vancouver 2017 - Cisco's Digital Network Architecture - deeper ...
Cisco Connect Vancouver 2017 - Cisco's Digital Network Architecture - deeper ...Cisco Connect Vancouver 2017 - Cisco's Digital Network Architecture - deeper ...
Cisco Connect Vancouver 2017 - Cisco's Digital Network Architecture - deeper ...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Continuous Delivery pour vos applications avec Cloud Foundry et Jenkins
Continuous Delivery pour vos applications avec Cloud Foundry et JenkinsContinuous Delivery pour vos applications avec Cloud Foundry et Jenkins
Continuous Delivery pour vos applications avec Cloud Foundry et Jenkins
 
App-First & Cloud-Native: How InterMiles Boosted CX with AWS & Infostretch
App-First & Cloud-Native: How InterMiles Boosted CX with AWS & InfostretchApp-First & Cloud-Native: How InterMiles Boosted CX with AWS & Infostretch
App-First & Cloud-Native: How InterMiles Boosted CX with AWS & Infostretch
 
Cisco Meraki - Simplifying Powerful Technology
Cisco Meraki - Simplifying Powerful TechnologyCisco Meraki - Simplifying Powerful Technology
Cisco Meraki - Simplifying Powerful Technology
 
Concevoir et déployer vos applications a base de microservices sur Cloud Foundry
Concevoir et déployer vos applications a base de microservices sur Cloud FoundryConcevoir et déployer vos applications a base de microservices sur Cloud Foundry
Concevoir et déployer vos applications a base de microservices sur Cloud Foundry
 
Build your first IoT device - The tricky interface of Product and R&D with Ni...
Build your first IoT device - The tricky interface of Product and R&D with Ni...Build your first IoT device - The tricky interface of Product and R&D with Ni...
Build your first IoT device - The tricky interface of Product and R&D with Ni...
 
INTERFACE, by apidays - Design and Build Great Web APIs
INTERFACE, by apidays - Design and Build Great Web APIsINTERFACE, by apidays - Design and Build Great Web APIs
INTERFACE, by apidays - Design and Build Great Web APIs
 
IBM i Development: Increase Accuracy and Efficiency with SEQUEL's ABSTRACT a...
 IBM i Development: Increase Accuracy and Efficiency with SEQUEL's ABSTRACT a... IBM i Development: Increase Accuracy and Efficiency with SEQUEL's ABSTRACT a...
IBM i Development: Increase Accuracy and Efficiency with SEQUEL's ABSTRACT a...
 
What does it take to be an architect
What does it take to be an architectWhat does it take to be an architect
What does it take to be an architect
 
What does it take to be architect (for Cjicago JUG)
What does it take to be architect (for Cjicago JUG)What does it take to be architect (for Cjicago JUG)
What does it take to be architect (for Cjicago JUG)
 
Vbrownbag container networking for real workloads
Vbrownbag container networking for real workloadsVbrownbag container networking for real workloads
Vbrownbag container networking for real workloads
 
apidays LIVE New York - Building Great Web APIs by Mike Amundsen
apidays LIVE New York - Building Great Web APIs by Mike Amundsenapidays LIVE New York - Building Great Web APIs by Mike Amundsen
apidays LIVE New York - Building Great Web APIs by Mike Amundsen
 
Bringing Partners, Teams & Systems Together through APIs
Bringing Partners, Teams & Systems Together through APIsBringing Partners, Teams & Systems Together through APIs
Bringing Partners, Teams & Systems Together through APIs
 

Mehr von Todd Palino

Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...
Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...
Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...
Todd Palino
 
I'm No Hero: Full Stack Reliability at LinkedIn
I'm No Hero: Full Stack Reliability at LinkedInI'm No Hero: Full Stack Reliability at LinkedIn
I'm No Hero: Full Stack Reliability at LinkedIn
Todd Palino
 

Mehr von Todd Palino (11)

Leading Without Managing: Becoming an SRE Technical Leader
Leading Without Managing: Becoming an SRE Technical LeaderLeading Without Managing: Becoming an SRE Technical Leader
Leading Without Managing: Becoming an SRE Technical Leader
 
From Operations to Site Reliability in Five Easy Steps
From Operations to Site Reliability in Five Easy StepsFrom Operations to Site Reliability in Five Easy Steps
From Operations to Site Reliability in Five Easy Steps
 
Code Yellow: Helping Operations Top-Heavy Teams the Smart Way
Code Yellow: Helping Operations Top-Heavy Teams the Smart WayCode Yellow: Helping Operations Top-Heavy Teams the Smart Way
Code Yellow: Helping Operations Top-Heavy Teams the Smart Way
 
Why Does (My) Monitoring Suck?
Why Does (My) Monitoring Suck?Why Does (My) Monitoring Suck?
Why Does (My) Monitoring Suck?
 
URP? Excuse You! The Three Kafka Metrics You Need to Know
URP? Excuse You! The Three Kafka Metrics You Need to KnowURP? Excuse You! The Three Kafka Metrics You Need to Know
URP? Excuse You! The Three Kafka Metrics You Need to Know
 
Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...
Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...
Redefine Operations in a DevOps World: The New Role for Site Reliability Eng...
 
Running Kafka for Maximum Pain
Running Kafka for Maximum PainRunning Kafka for Maximum Pain
Running Kafka for Maximum Pain
 
I'm No Hero: Full Stack Reliability at LinkedIn
I'm No Hero: Full Stack Reliability at LinkedInI'm No Hero: Full Stack Reliability at LinkedIn
I'm No Hero: Full Stack Reliability at LinkedIn
 
More Datacenters, More Problems
More Datacenters, More ProblemsMore Datacenters, More Problems
More Datacenters, More Problems
 
Putting Kafka Into Overdrive
Putting Kafka Into OverdrivePutting Kafka Into Overdrive
Putting Kafka Into Overdrive
 
Tuning Kafka for Fun and Profit
Tuning Kafka for Fun and ProfitTuning Kafka for Fun and Profit
Tuning Kafka for Fun and Profit
 

Kürzlich hochgeladen

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Kürzlich hochgeladen (20)

Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 

Enterprise Kafka: Kafka as a Service

  • 1. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Enterprise Kafka
  • 2. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Why Am I Here?  You want to find out what this “Kafka” thing is  You’re running Kafka, but you want to go big  You’re looking for some neat whizbangs 2
  • 3. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Clark Haskins Todd Palino
  • 4. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Who Are We?  Kafka SRE at LinkedIn  Site Reliability Engineering – Administrators – Architects – Developers  Keep the site running, always 4
  • 5. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Kafka Overview 5
  • 6. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. What Is Kafka? 6
  • 7. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. What Is Kafka? Broker A P0 A P1 A P0 7 Consumer Producer Zookeeper
  • 8. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Attributes of a Kafka Cluster  Disk Based  Durable  Scalable  Low Latency  Finite Retention  NOT Idempotent (yet) 8
  • 9. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Kafka At LinkedIn  Multiple Datacenters, Multiple Clusters  Mirroring between clusters  Message Types – Metrics – Tracking – Queuing  Data transport from applications to Hadoop, and back 9
  • 10. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Kafka At LinkedIn 10
  • 11. SITE RELIABILITY ENGINEERING©2014 LinkedIn Corporation. All Rights Reserved. Kafka At LinkedIn  300+ Kafka brokers  Over 18,000 topics  140,000+ Partitions  220 Billion messages per day  40 Terabytes In  160 Terabytes Out  Peak Load – 3.25 Million messages per second – 5.5 Gigabits/sec Inbound – 18 Gigabits/sec Outbound 11
  • 12. Challenges We Have Overcome
  • 13. Solutions
      • Kafka is young, so we influenced its development
      • Operations wizardry…
  • 14. Hyper Growth
      • Need to expand clusters to keep up with site traffic, and then balance them.
  • 15. Adding brokers
      [Diagram: Producers, Brokers, Consumers; partition labels A P0 to P7, B P0 to P7, C P0 to P3]
  • 16. Adding a broker (with broker leveling)
      [Diagram: Producers, Brokers, Consumers; partition labels A P0 to P7, B P0 to P7, C P0 to P3]
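The idea of leveling brokers after an expansion (slides 14 to 16) can be sketched as moving partitions from the most loaded broker to the least loaded one until the counts differ by at most one. This is a simplified model and not Kafka's own tooling: `level_brokers` is a hypothetical helper, and it ignores replicas, leadership, and partition size. It also assumes every broker in the current assignment appears in the broker list.

```python
from typing import Dict, List


def level_brokers(assignment: Dict[str, int], brokers: List[int]) -> Dict[str, int]:
    """Return a partition -> broker map with partition counts evened out."""
    by_broker: Dict[int, List[str]] = {b: [] for b in brokers}
    for partition, broker in sorted(assignment.items()):
        by_broker[broker].append(partition)

    while True:
        most = max(brokers, key=lambda b: len(by_broker[b]))
        least = min(brokers, key=lambda b: len(by_broker[b]))
        if len(by_broker[most]) - len(by_broker[least]) <= 1:
            break  # counts are as even as they can get
        # Move one partition from the fullest broker to the emptiest one.
        by_broker[least].append(by_broker[most].pop())

    return {p: b for b, parts in by_broker.items() for p in parts}


# Eight partitions of a topic on brokers 1 and 2, then broker 3 is added.
before = {f"A-{i}": 1 + i % 2 for i in range(8)}
after = level_brokers(before, [1, 2, 3])
moved = [p for p in before if before[p] != after[p]]
```

Moving one partition at a time keeps the number of moves small, which matters because each move means copying that partition's data to another broker.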
  • 17. Logs vs. Metrics
      • Logging data killed the metrics cluster
  • 18. Quality of Service with Kafka
      [Diagram: Producers, Brokers, Consumers; partition labels A P0 to P7, B P0 to P7, C P0 to P3]
  • 19. Deployment Nightmares
      • Parallel deployment wasn’t possible so…
      • Babysitting sequential deployments
  • 20. Easy deployments
      • Kafka 0.8.1 makes sure the cluster is in a good state before shutting down
          – If any brokers in the cluster have under-replicated partitions, Kafka will not shut down
          – Kafka ensures that only one broker is in the shutdown sequence at a time.
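The two checks described on slide 20 can be modeled as a guard in front of each broker restart. This is a minimal sketch of the rule as stated on the slide, not Kafka's implementation; `safe_to_shut_down`, `rolling_restart`, and the two callbacks are hypothetical names standing in for whatever monitoring and deployment hooks are available.

```python
import time
from typing import Callable, Dict, Iterable, Set


def safe_to_shut_down(under_replicated: Dict[int, int], shutting_down: Set[int]) -> bool:
    """True only if no broker is already shutting down and no broker
    reports under-replicated partitions."""
    if shutting_down:
        return False
    return all(count == 0 for count in under_replicated.values())


def rolling_restart(
    broker_ids: Iterable[int],
    get_under_replicated: Callable[[], Dict[int, int]],
    restart_broker: Callable[[int], None],
    poll_seconds: float = 0.0,
    max_polls: int = 100,
) -> None:
    """Restart brokers one at a time, waiting for a healthy cluster before each."""
    for broker_id in broker_ids:
        for _ in range(max_polls):
            # Sequential loop, so no other broker is in shutdown at this point.
            if safe_to_shut_down(get_under_replicated(), set()):
                break
            time.sleep(poll_seconds)
        else:
            raise RuntimeError(f"cluster never became healthy before broker {broker_id}")
        restart_broker(broker_id)
```

With the guard in place, a deployment can walk the broker list unattended instead of someone watching each restart by hand.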
  • 21. Killing Zookeeper
      • Consumer offset management done within Zookeeper
      • Every consumer committing offsets every minute for every partition makes ZK very unhappy.
  • 22. Zookeeper on SSD
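A back-of-the-envelope estimate shows why the commit pattern on slide 21 adds up. The sketch below assumes one Zookeeper write per consumer group, per partition, per commit interval. The figure of 10 consumer groups is purely hypothetical; the partition count is the one quoted on slide 11.

```python
def offset_commits_per_second(
    consumer_groups: int,
    partitions_per_group: int,
    commit_interval_s: float = 60.0,
) -> float:
    """Estimated Zookeeper writes per second caused by offset commits."""
    return consumer_groups * partitions_per_group / commit_interval_s


# Hypothetical: 10 consumer groups, each reading all 140,000 partitions.
writes = offset_commits_per_second(consumer_groups=10, partitions_per_group=140_000)
print(f"about {writes:,.0f} Zookeeper writes per second")
```

Since Zookeeper persists writes to a transaction log on disk, a sustained write rate of this size is bounded by disk latency, which is consistent with the move to SSD on slide 22.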
  • 23. Monitoring
  • 24. Kafka Is Broken!
  • 25. Kafka Is Broken!
      • Everything is Kafka’s fault first
      • What is lag?
      • Consumer Problems
          – Application problems
          – Kafka client problems
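Slide 25 asks what lag is without defining it. A common definition, used here as an assumption, is the distance between the newest offset in a partition and the offset the consumer has committed. `consumer_lag` is a hypothetical helper, and the offsets are made-up sample values.

```python
from typing import Dict


def consumer_lag(log_end_offsets: Dict[int, int], committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: newest offset in the log minus the committed offset.

    Clamped at zero because the two offsets may be sampled at slightly
    different times.
    """
    return {
        partition: max(end - committed_offsets[partition], 0)
        for partition, end in log_end_offsets.items()
    }


lag = consumer_lag(
    log_end_offsets={0: 100, 1: 250},
    committed_offsets={0: 100, 1: 200},
)
total_lag = sum(lag.values())
```

Lag that keeps growing while the brokers are healthy points at the consumer side, which is the distinction the slide draws between application problems and Kafka client problems.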
  • 26. How Do We Sleep At Night?
      • Educating Users
          – Why lag is their fault
      • Monitoring the Ecosystem
          – Kafka Brokers
          – Zookeeper
          – Mirror Makers
          – Audit
          – REST Interfaces
      • Week Over Week
  • 27. Cluster Health and Utilization
      • Under-replicated partitions
      • Offline partitions
      • Broker partition count
      • Data size on disk
      • Leader partition count
      • Network utilization
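Several of the slide 27 health metrics can be derived from per-partition state. The sketch below assumes a partition is under-replicated when its in-sync replica set is smaller than its assigned replica set, and offline when it has no leader. `PartitionState` and `cluster_health` are hypothetical names, and the sample data is invented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PartitionState:
    topic: str
    partition: int
    replicas: List[int]     # brokers assigned to hold a copy
    isr: List[int]          # replicas currently in sync with the leader
    leader: Optional[int]   # None models a partition with no leader


def cluster_health(partitions: List[PartitionState]) -> Dict[str, object]:
    """Summarize under-replicated, offline, per-broker and per-leader counts."""
    return {
        "under_replicated": sum(1 for p in partitions if len(p.isr) < len(p.replicas)),
        "offline": sum(1 for p in partitions if p.leader is None),
        "partitions_per_broker": Counter(b for p in partitions for b in p.replicas),
        "leaders_per_broker": Counter(p.leader for p in partitions if p.leader is not None),
    }


state = [
    PartitionState("A", 0, [1, 2], [1, 2], 1),
    PartitionState("A", 1, [2, 3], [2], 2),
    PartitionState("B", 0, [3, 1], [], None),
]
report = cluster_health(state)
```

Data size on disk and network utilization are not modeled here because they come from the host rather than from partition metadata.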
  • 28. Zookeeper
      • Ensemble availability
      • Latency
      • Outstanding requests
  • 29. Mirror Maker and Audit
      • Mirror Maker
          – Lag
          – Dropped Messages
      • Audit Consumer
          – Lag
          – Completeness check
      • Audit UI
      [Diagram: Producer, Cluster, MM, Cluster, Audit Consumers, Audit UI; flows labeled Messages, Message Counts, All Messages, Audit State]
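The completeness check on slide 29 can be pictured as comparing the message counts reported by producers with the counts an audit consumer actually sees in a cluster. The sketch below assumes counts are keyed by topic and time bucket; the key layout, the 10-minute bucket, and the `completeness` helper are illustrative assumptions and not the audit system's actual design.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # (topic, start of time bucket in seconds)


def completeness(produced: Dict[Key, int], seen: Dict[Key, int]) -> Dict[Key, float]:
    """Fraction of produced messages that were seen downstream, per bucket."""
    report = {}
    for key, produced_count in produced.items():
        seen_count = seen.get(key, 0)
        # An empty bucket is treated as complete.
        report[key] = seen_count / produced_count if produced_count else 1.0
    return report


produced_counts = {("tracking", 0): 1000, ("tracking", 600): 500}
cluster_counts = {("tracking", 0): 1000, ("tracking", 600): 450}
result = completeness(produced_counts, cluster_counts)
incomplete = [key for key, ratio in result.items() if ratio < 1.0]
```

Running the same comparison against each cluster in the mirroring chain shows where messages were lost, not just that they were.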
  • 30. Audit UI
  • 31. Audit UI
  • 32. Tuning
  • 33. Hardware and OS
      • Kernel Tuning
          – Swapping is Death
          – Allow more dirty pages
          – Allow less dirty cache
      • Disk throughput
          – More spindles
          – Longer commit interval
  • 34. Java Virtual Machine
  • 35. Garbage Collection
  • 36. Garbage Collection
      • Java 7, update 51
      • Garbage First (G1) Collector
          – Set the heap size
          – Specify a target GC pause time
          – Don’t set the New size
      • GC Times
          – Less than 15ms per second in GC
          – Steady 20-22ms GC intervals
          – Almost no full GC cycles (and only 200-400ms when it does)
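The G1 advice on slide 36 maps onto a small set of standard HotSpot options: a fixed heap size, the G1 collector, and a pause-time target, with no explicit young-generation size. The sketch below builds that option list. The 4 GB heap and 20 ms target are placeholder values chosen for illustration; the slide does not give the settings used.

```python
from typing import List


def g1_jvm_options(heap_gb: int, pause_target_ms: int) -> List[str]:
    """JVM options for G1 with a fixed heap and a pause-time goal.

    Deliberately sets no -Xmn or -XX:NewSize, so G1 can size the young
    generation itself.
    """
    return [
        f"-Xms{heap_gb}g",
        f"-Xmx{heap_gb}g",
        "-XX:+UseG1GC",
        f"-XX:MaxGCPauseMillis={pause_target_ms}",
    ]


def gc_time_fraction(gc_ms_per_second: float) -> float:
    """Share of wall-clock time spent in GC."""
    return gc_ms_per_second / 1000.0


options = g1_jvm_options(heap_gb=4, pause_target_ms=20)   # placeholder values
overhead = gc_time_fraction(15)                            # slide: under 15 ms per second
```

Read this way, the slide's figure of less than 15 ms per second in GC corresponds to under 1.5 percent of wall-clock time.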
  • 37. Closing
  • 38. What’s Coming in 0.8.2
      • Consumer offsets in the broker
      • Delete topic
      • Further down the road
          – New producer
          – Improved producer API
  • 39. Upcoming Operational Work
      • Learning to share
      • Shrinking a cluster
      • Cluster comparison
      • Advanced monitoring
  • 40. How Can You Get Involved?
      • http://kafka.apache.org
      • Join the mailing lists
          – users@kafka.apache.org
      • irc.freenode.net - #apache-kafka
      • Contribute tools
  • 41. Talk To Us
      • Kafka SREs at LinkedIn
          – Clark Haskins: https://www.linkedin.com/in/clarkhaskins, chaskins@linkedin.com
          – Todd Palino: https://www.linkedin.com/in/toddpalino, tpalino@linkedin.com
  • 42. Questions