Kahn Chen, Liang Wang — 13/06/2021
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline
at Tencent
Agenda
- Self-introduction
- Main application scenarios at Tencent
- Business requirements in using Apache Kafka
- Architecture evolution process in using Apache Kafka
- Operational data at Tencent
- Incorporate new features of the Apache Kafka community
Self-Introduction
Name: Liang Wang
Title: Senior Engineer, Tencent
Main application scenarios
As one of the world’s biggest internet-based platform companies, Tencent uses technology to enrich the lives of users and assist the digital upgrade of
enterprises. An example product is the popular WeChat application, which has over one billion active users worldwide. The Platform and Content Group (PCG)
is responsible for integrating Tencent’s internet, social, and content platforms. PCG promotes the cross-platform and multi-modal development of IP, with
the overall goal of creating more diversified premium digital content experiences. Since its inception, many major products—from the well-known QQ,
QZone, video, App Store, news, and browser, to relatively new members of the ecosystem such as live broadcast, anime, and movies—have been evolving on
top of consolidated infrastructure and a set of foundational technology components.
At our center stands the real-time messaging system that connects data and computing. We have proudly built many essential data pipelines and messaging
queues around Apache Kafka. Our application of Kafka is similar to other organizations: We build pipelines for cross-region log ingestion, machine learning
platforms, and asynchronous communication among microservices. The unique challenges come from stretching our architecture along multiple dimensions,
namely scalability, customization, and SLA.
Business requirements in using Apache Kafka
- Low latency and high SLA
In actual operation, the business often requires very low latency. We found that the annual disk failure rate is as high as 5%, and there can also be failures such as data-center network outages. However, community Kafka's disaster tolerance only works at the broker level, which makes it difficult to deliver the low latency our business requires.
- Topic partitions need to support lossless scale-down
In some real-world scenarios, such as after peak holiday traffic, we need to recycle the machines that were temporarily added to the cluster and scale down the temporarily expanded partitions. Kafka currently does not support this capability; it requires developers to change the business logic of their code to achieve it.
- A large number of topics must be supported on a single cluster
In Tencent's real-time data warehouse scenarios (table1 -> flink_sql -> table2 -> ...), table1 and table2 are usually single topics in a real-time stream, so a large number of topics are needed. However, because of the ZooKeeper performance bottleneck, Apache Kafka could not handle more than one million partitions well before version 2.8.0.
What have we done to meet these requirements?
The first federated Kafka design
Example: a logical topic with eight partitions (P0–P7) is distributed across two physical clusters, each holding four partitions (P0–P3).
Key components and workflows
On the one hand, we add two proxy services, one for the producer (pproxy) and one for the consumer (cproxy), which implement the core protocols of the Kafka broker. They are also responsible for mapping logical topics to their physical incarnations. The application uses the standard Kafka SDK to connect directly to the proxy, which acts as a broker.
On the other hand, we extract the logic of the Kafka controller node (such as topic metadata management) into a separate service. It is also called "the controller," but it differs from Kafka's own controller functionality. This service collects the metadata of the physical clusters, composes the partition information of the logical topics, and publishes it to the proxy brokers.
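To make the mapping concrete, here is a minimal sketch of how a producer proxy might translate a logical partition into a physical sub-cluster and partition. The class and record names (LogicalPartitionRouter, PhysicalLocation) are hypothetical illustrations under our own assumptions, not part of any real Kafka API.

```java
import java.util.List;

/**
 * Sketch: routing a logical partition to its physical cluster and
 * partition, as a producer proxy (pproxy) might do. All names here
 * are hypothetical.
 */
public class LogicalPartitionRouter {

    /** One physical sub-cluster hosting a contiguous slice of partitions. */
    record PhysicalCluster(String bootstrapServers, int partitionCount) {}

    /** Where a logical partition physically lives. */
    record PhysicalLocation(String bootstrapServers, int physicalPartition) {}

    private final List<PhysicalCluster> clusters;

    public LogicalPartitionRouter(List<PhysicalCluster> clusters) {
        this.clusters = clusters;
    }

    /**
     * Logical partitions are laid out contiguously across sub-clusters:
     * e.g. P0–P3 on cluster 1 and P4–P7 on cluster 2 (each cluster holding
     * physical partitions P0–P3). The proxy subtracts the offset to find
     * the physical partition id inside the sub-cluster.
     */
    public PhysicalLocation route(int logicalPartition) {
        int offset = 0;
        for (PhysicalCluster c : clusters) {
            if (logicalPartition < offset + c.partitionCount()) {
                return new PhysicalLocation(c.bootstrapServers(),
                                            logicalPartition - offset);
            }
            offset += c.partitionCount();
        }
        throw new IllegalArgumentException("Unknown partition " + logicalPartition);
    }
}
```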
Cluster provisioning workflow
1. The controller reads the mapping between logical topics and physical topics from the config database.
2. The controller scans the metadata of all physical topics from each Kafka cluster.
3. The controller finds the list of available pproxy instances from heartbeat messages.
4. The controller composes the metadata for the logical topics.
5. The controller pushes both the topic mapping and the topic metadata to the proxy brokers.
6. The metadata is sent to the Kafka SDK.
Cluster decommission workflow
1. The controller marks the cluster to be decommissioned (cluster 1) as read-only in the config database.
2. The controller picks up this state change when it fetches the metadata from the database.
3. The controller reconstructs the metadata for producers, removing cluster 1 from the physical clusters.
4. The controller pushes a new version of the producer metadata to the proxy. From this moment, data stops being written to cluster 1, while the consuming side is not impacted.
5. Once all topics in the cluster have expired, the controller reconstructs the consumer metadata again and removes the physical topics that are no longer valid.
6. The consumer proxy loads the new metadata, and newly arriving consumers no longer pick up cluster 1. This completes the decommission operation.
A sketch of the metadata reconstruction in steps 3 and 5 follows.
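As a rough illustration, the sketch below removes a decommissioned cluster from the producer-side view immediately, while the consumer-side view keeps it until all of its topics have expired. The types (DecommissionPlanner, ClusterTopics) are hypothetical, not from the Kafka codebase.

```java
import java.util.List;

/**
 * Hypothetical sketch of the controller's metadata reconstruction
 * during decommission. Producer metadata drops the read-only cluster
 * right away; consumer metadata keeps it until the data has expired.
 */
public class DecommissionPlanner {

    record ClusterTopics(String clusterId, List<String> physicalTopics) {}

    /** Step 3: producers must stop writing to the decommissioned cluster. */
    static List<ClusterTopics> producerView(List<ClusterTopics> all,
                                            String decommissionedId) {
        return all.stream()
                  .filter(c -> !c.clusterId().equals(decommissionedId))
                  .toList();
    }

    /** Step 5: consumers drop the cluster only once its topics expired. */
    static List<ClusterTopics> consumerView(List<ClusterTopics> all,
                                            String decommissionedId,
                                            boolean allTopicsExpired) {
        if (!allTopicsExpired) {
            return all; // still draining: consumers keep reading cluster 1
        }
        return producerView(all, decommissionedId);
    }
}
```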
Advantages and disadvantages of Cluster Federation
Advantages
- Topic partitions can be expanded and contracted losslessly
- Disaster tolerance can be achieved at the sub-cluster level
Disadvantages
- The proxy service must implement all Kafka protocols, which makes it difficult to incorporate new features from the Apache Kafka community
- Deploying the proxy service adds roughly 30% extra cost
Now we have two new problems
1. How to reduce the additional ~30% overhead caused by the proxy services
2. How to make it easy to incorporate new features from the Apache Kafka community
New design of federated Kafka (aims to remove the proxy service)
The role of the proxy service is to forward the partition requests of a logical topic to the correct physical topic. If we want to remove the proxy service, the metadata obtained by the SDK must point directly at the real brokers. We can merge the metadata of the topics distributed across multiple sub-clusters so that the SDK sees one complete topic.
Example: a logical topic with eight partitions (P0–P7) is distributed across two physical clusters, the first with four partitions (P0–P3) and the second with four partitions (P4–P7). Note that in the second cluster the partition IDs start from 4 rather than 0, and the topic name must be identical in both clusters.
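A minimal sketch of that merge step, under our own assumptions: partition metadata from each sub-cluster is concatenated and ordered by partition ID, so the SDK sees a single eight-partition topic. PartitionMeta and MetadataMerger are illustrative names, not real Kafka classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/**
 * Illustrative sketch: merging per-sub-cluster metadata into one
 * logical view of a topic. Because the second cluster's partition
 * IDs start at 4 (not 0), a plain union ordered by partition ID
 * yields a complete, gap-free topic P0–P7.
 */
public class MetadataMerger {

    record PartitionMeta(int partitionId, String leaderBroker) {}

    static List<PartitionMeta> merge(List<List<PartitionMeta>> perCluster) {
        List<PartitionMeta> merged = new ArrayList<>();
        perCluster.forEach(merged::addAll);
        merged.sort(Comparator.comparingInt(PartitionMeta::partitionId));
        return merged; // one topic, P0..P7, leaders on the real brokers
    }
}
```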
The improved version of Cluster Federation (without the proxy service layer)
To let the SDK receive merged metadata, we added a new module, the "Master Cluster." It is a complete Kafka cluster with ZooKeeper, but it hosts no broker role. Under this Master Cluster we mount multiple Kafka clusters that do have brokers, and we divide these sub-clusters into groups. In operation, we guarantee that a topic's partitions are allocated only to sub-clusters within one group, never across groups. When a topic runs short of partitions, this improved federated cluster can expand them by adding sub-clusters to the group where the topic lives; when the federation runs short of topic capacity, it can expand by adding groups.
The Master Cluster obtains the metadata of all sub-clusters, then sorts and merges the metadata of the same topic into a complete view that it sends to the SDK. This alone makes non-transactional message production work. Since the sub-clusters are independent of one another, and a consumer group tends to subscribe to multiple topics, the management of consumer groups and transactions must be implemented on the Master Cluster.
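The sketch below models the group constraint described above: a topic may gain partitions only on sub-clusters of its own group. It is a hypothetical data model for illustration, not code from the actual system.

```java
import java.util.List;
import java.util.Map;

/**
 * Hypothetical model of the federation layout: sub-clusters are
 * partitioned into groups, and every topic is pinned to one group.
 * Partition expansion stays inside the topic's group; topic-level
 * growth is handled by adding new groups.
 */
public class FederationLayout {

    record Group(String name, List<String> subClusters) {}

    private final Map<String, Group> topicToGroup;

    public FederationLayout(Map<String, Group> topicToGroup) {
        this.topicToGroup = topicToGroup;
    }

    /** A topic may only gain partitions on sub-clusters of its group. */
    boolean canPlacePartition(String topic, String subCluster) {
        Group g = topicToGroup.get(topic);
        return g != null && g.subClusters().contains(subCluster);
    }
}
```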
How does the new Cluster Federation work?
After the "Master Cluster" starts, it aggregates the metadata of all sub-clusters and sorts it by topic.
1. The SDK obtains metadata from the "Master Cluster".
2. The "Master Cluster" returns the aggregated topic metadata.
3. The SDK reads and writes messages normally based on that metadata, but metadata updates and FindCoordinator requests go only to the "Master Cluster".
Example: Produce/Fetch requests for partition 0 and partition 3 of U_TopicA go directly to the sub-cluster brokers that host them.
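Because only metadata and coordinator lookups are redirected, an application can use the stock Java client unchanged; it simply bootstraps against the Master Cluster. A minimal sketch, assuming a hypothetical master-cluster address:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FederatedProduceExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical address: the Master Cluster serves Metadata and
        // FindCoordinator; actual Produce requests go straight to the
        // sub-cluster brokers named in the returned metadata.
        props.put("bootstrap.servers", "master-cluster.example.com:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Writes to U_TopicA land on whichever sub-cluster hosts
            // the chosen partition (e.g. P0 on cluster 1, P4 on cluster 2).
            producer.send(new ProducerRecord<>("U_TopicA", "key", "value"));
        }
    }
}
```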
How do we realize the expansion and disaster tolerance of the "Master Cluster"?
1. We have implemented the broker roles along the lines of KIP-500, so the Coordinator can easily be scaled out according to actual needs.
2. When a new sub-cluster is registered or an existing one is unregistered, the change is written to Redis.
3. The data carried by OffsetCommit requests is also written to Redis asynchronously.
4. When the "Master Cluster" fails, we immediately use the backup cluster to recover the data from Redis.
A sketch of the asynchronous offset mirroring in step 3 follows.
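As an illustration of step 3, this sketch mirrors committed offsets into Redis off the hot path, using the Jedis client. The key layout (one hash per group and topic, one field per partition) and the Redis address are our own assumptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import redis.clients.jedis.JedisPooled;

/**
 * Sketch: asynchronously mirroring OffsetCommit data into Redis so a
 * backup Master Cluster can recover consumer positions after failure.
 * Key layout (hash per group/topic, field per partition) is assumed.
 */
public class OffsetMirror {

    private final JedisPooled redis = new JedisPooled("redis.example.com", 6379);
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    /** Called on the OffsetCommit path; never blocks the request thread. */
    public CompletableFuture<Void> mirror(String group, String topic,
                                          int partition, long offset) {
        return CompletableFuture.runAsync(() ->
                redis.hset("offsets:" + group + ":" + topic,
                           Integer.toString(partition),
                           Long.toString(offset)),
                pool);
    }
}
```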
How to achieve high cluster performance and high availability
- Limit the number of partitions according to cluster specifications, to prevent performance problems caused by an out-of-control file count.
- Mute the topics and partitions of a failed cluster; we have submitted KIP-694 for this.
- Mute reads and writes to allow lossless cluster upgrades.
- The SDK can mute abnormal partitions; we have submitted KIP-693 for this. A sketch of such client-side muting follows the list.
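To show one way an SDK can mute abnormal partitions (the spirit of KIP-693, not its actual implementation), here is a sketch of a custom producer Partitioner that skips partitions currently marked unhealthy. The static mute registry is a stand-in for a real failure detector.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

/**
 * Sketch of client-side partition muting: round-robin partitioning
 * that skips partitions an external health checker marked abnormal.
 * Illustrates the idea behind KIP-693; not its implementation.
 */
public class MutingPartitioner implements Partitioner {

    // Stand-in for a real failure detector feeding muted partition ids.
    static final Set<Integer> MUTED = ConcurrentHashMap.newKeySet();

    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int start = counter.getAndIncrement();
        for (int i = 0; i < partitions.size(); i++) {
            int candidate =
                partitions.get(Math.floorMod(start + i, partitions.size())).partition();
            if (!MUTED.contains(candidate)) {
                return candidate;
            }
        }
        // All partitions muted: fall back to plain round-robin.
        return partitions.get(Math.floorMod(start, partitions.size())).partition();
    }

    @Override public void close() {}
    @Override public void configure(Map<String, ?> configs) {}
}
```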
Operational data at Tencent

| | Apache Kafka | Tencent's customized Kafka |
| --- | --- | --- |
| Read/write recovery time after cluster failure | It usually takes 30–60 minutes to restart a broker holding 20 TB of data | The client stops writing to the abnormal sub-cluster; recovery time is <1 min |
| Number of brokers in a single cluster | Before version 2.8.0, about 200 brokers per cluster | Up to 1,000 brokers per cluster |
| Machine recycling and partition shrinking after traffic peaks | Machines can be recycled through rebalancing, but partition scale-down is not supported | Business-insensitive partition scale-down, with machines recycled without rebalancing |
Incorporate new features of the Apache Kafka community
When we designed the federation mode, Kafka 2.8.0 had not yet been released, so our overall design aims for good compatibility with the future KIP-500 features, such as role splitting and unified metadata management by the controller.
In the future we will incorporate these cool features from the community.
Thanks