Getting Started with MirrorMaker 2
Mickael Maison - IBM
Ryanne Dolan - Twitter
Kafka Summit EU 2021
Summary
- Pain points of MM1
- Overview of MM2 Connectors
- Deployment modes
- Use cases and Scenarios
- Tips and Tricks to get started
Why MM2?
• Address problems with legacy MirrorMaker (MM1)
• Take advantage of Connect ecosystem
• Enable new replication use-cases
MirrorMaker1 Pain Point #1
Lack of consumer group offsets mirroring
• Data replicated, but not consumer offsets
• No offset translation
• Timestamp-based recovery
MM2:
• Offset translation
• Consumer group checkpoints
MirrorMaker1 Pain Point #2
Hard to deploy and monitor
• No centralized "control plane"
• Each individual consumer and producer configured separately
• No high-level metrics
MM2:
• High-level "driver" manages replication between many clusters
• High-level configuration file defines global replication topology
• Cross-cluster metrics like Replication Latency
MirrorMaker1 Pain Point #3
Unable to keep topics synchronized
• Configuration changes not sync'd
• Partitions not sync'd
• ACLs not sync'd
MM2:
• Topic configuration sync'd
• Partitions sync'd
• ACLs sync'd
The Connectors
MirrorSourceConnector
• Replicates "remote topics"
• Syncs topic configuration
• Syncs topic ACLs
• Emits offset syncs
[Diagram: MirrorSourceConnector copies records, configs and ACLs from topic1 on us-west to us-west.topic1 on us-east, and writes offset syncs to mm2-offset-syncs.us-east.internal on us-west.]
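The "remote topic" naming used on this slide (topic1 on us-west becomes us-west.topic1 on us-east) can be sketched in a few lines. This is a minimal illustration of the prefixing convention, assuming a "." separator; the function names are hypothetical Python stand-ins, not the MM2 API.

```python
from typing import Optional

SEPARATOR = "."  # assumption for this sketch: alias and topic are joined with "."


def format_remote_topic(source_alias: str, topic: str) -> str:
    """Name a mirrored topic gets on the target cluster."""
    return f"{source_alias}{SEPARATOR}{topic}"


def topic_source(topic: str) -> Optional[str]:
    """Alias of the cluster a remote topic came from, or None for a local topic."""
    prefix, sep, _rest = topic.partition(SEPARATOR)
    return prefix if sep else None


def upstream_topic(topic: str) -> Optional[str]:
    """Name of the topic on the source cluster, or None for a local topic."""
    _prefix, sep, rest = topic.partition(SEPARATOR)
    return rest if sep else None


remote = format_remote_topic("us-west", "topic1")
print(remote, topic_source(remote), upstream_topic(remote))
```

One consequence of prefix-based naming is visible in the sketch: a local topic whose name already contains the separator would be read as a remote topic.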
MirrorCheckpointConnector
• Consumes offset syncs
• Emits checkpoints: consumer group state
• Enables failover:
  • Automatically: translated offsets synced to __consumer_offsets (since 2.7.0)
  • Programmatically: mirror-client's translateOffsets()
[Diagram: MirrorCheckpointConnector on us-east reads mm2-offset-syncs.us-east.internal and __consumer_offsets from us-west, and writes checkpoints to mm2-checkpoints.us-west.internal and __consumer_offsets on us-east.]
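The idea behind offset translation can be sketched as follows. This is a simplified model, not the MM2 implementation: it assumes an offset sync is a pair (upstream offset, downstream offset) for one partition, and that every source record after the sync point produced exactly one record on the target. The class and function names are hypothetical, and the real translation logic differs between Kafka releases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OffsetSync:
    """One known pair of matching offsets for a single topic-partition."""
    upstream_offset: int    # offset on the source cluster
    downstream_offset: int  # offset of the same record on the target cluster


def translate_downstream(sync: Optional[OffsetSync], upstream_offset: int) -> Optional[int]:
    """Map a committed source offset to a target offset, or None if unknown."""
    if sync is None or upstream_offset < sync.upstream_offset:
        # No sync at or before this offset: nothing can be said.
        return None
    # Assumption: one target record per source record after the sync point.
    return sync.downstream_offset + (upstream_offset - sync.upstream_offset)


sync = OffsetSync(upstream_offset=100, downstream_offset=40)
print(translate_downstream(sync, 130))
```

A checkpoint then records, per consumer group and partition, the source offset together with its translated target offset, which is what a failed-over consumer needs to resume.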
MirrorHeartbeatConnector
• Sends heartbeats to remote clusters
• Useful for monitoring replication flows
• Enables clients to discover replication topology
  • mirror-client's upstreamClusters()
[Diagram: MirrorHeartbeatConnector emits heartbeats on us-west; MirrorSourceConnector replicates them to us-west.heartbeats on us-east.]
Deployment Modes
Dedicated aka Driver mode
• connect-mirror-maker.sh
• Easy configuration
• Runs all connectors
[Diagram: in dedicated mode, Mirror Maker 2 runs the Source, Checkpoint and Heartbeat connectors itself.]
Connect Distributed
• Reuse existing Connect cluster
• Full control
• More configuration
Use cases and scenarios
Active/Standby
[Diagram: MM2 mirrors topic1 from us-west to us-east as us-west.topic1; topic2 has no mirror.]
Active/Standby - Dedicated
mm2.properties
clusters=us-west,us-east
us-west.bootstrap.servers=…
us-east.bootstrap.servers=…
us-west->us-east.enabled=true
Active/Standby - Connect
connect-distributed.properties
https://github.com/apache/kafka/blob/trunk/config/connect-distributed.properties
source-connector.json
{
  "name": "MirrorSourceConnector",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "name": "MirrorSourceConnector",
    "topics": ".*",
    "tasks.max": "30",
    "source.cluster.alias": "us-west",
    "target.cluster.alias": "us-east"
  }
}
checkpoint-connector.json
{
  "name": "MirrorCheckpointConnector",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "name": "MirrorCheckpointConnector",
    "groups": ".*",
    "tasks.max": "15",
    "source.cluster.alias": "us-west",
    "target.cluster.alias": "us-east"
  }
}
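In Connect mode, connector definitions like the ones above are submitted through the Connect REST API. The sketch below builds (but does not send) a PUT request to /connectors/<name>/config using only the Python standard library. The worker URL http://localhost:8083 is an assumption for illustration, and the config map repeats only the settings shown on the slide; a real deployment would also need the connection settings for both clusters.

```python
import json
import urllib.request


def build_put_config_request(connect_url: str, name: str, config: dict) -> urllib.request.Request:
    """Build a request that creates or updates a connector's configuration."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors/{name}/config",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Settings copied from source-connector.json above.
source_config = {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "name": "MirrorSourceConnector",
    "topics": ".*",
    "tasks.max": "30",
    "source.cluster.alias": "us-west",
    "target.cluster.alias": "us-east",
}

# Hypothetical worker address; replace with a real Connect worker.
request = build_put_config_request("http://localhost:8083", "MirrorSourceConnector", source_config)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would submit it to a running worker; the same helper would serve for the checkpoint connector definition.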
Active/Active
[Diagram: MM2 mirrors topic1 from us-west to us-east as us-west.topic1, and topic2 from us-east to us-west as us-east.topic2.]
Active/Active - Dedicated
mm2.properties
clusters=us-west,us-east
us-west.bootstrap.servers=…
us-east.bootstrap.servers=…
us-west->us-east.enabled=true
us-east->us-west.enabled=true
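With both directions enabled, a mirrored topic must not be mirrored back to the cluster it came from. The sketch below shows one way such a loop check can work with prefix-based remote topic names. It is an illustration under assumptions ("." as separator, hypothetical function names), loosely modeled on the naming convention shown earlier rather than on MM2's actual code.

```python
from typing import Optional

SEPARATOR = "."  # assumption: remote topics are named <source alias>.<topic>


def topic_source(topic: str) -> Optional[str]:
    """Alias of the cluster a remote topic came from, or None for a local topic."""
    prefix, sep, _rest = topic.partition(SEPARATOR)
    return prefix if sep else None


def is_cycle(topic: str, target_alias: str) -> bool:
    """True if mirroring this topic to target_alias would send data back home."""
    source = topic_source(topic)
    if source is None:
        return False  # local topic, safe to mirror
    if source == target_alias:
        return True   # topic originated on the target cluster
    # Strip one prefix and check the rest of the replication chain.
    upstream = topic.partition(SEPARATOR)[2]
    return is_cycle(upstream, target_alias)


# Topics on us-east considered for the us-east->us-west flow.
for name in ["topic2", "us-west.topic1"]:
    print(name, is_cycle(name, "us-west"))
```

In this model, the us-east->us-west flow mirrors topic2 but skips us-west.topic1, so the two flows do not feed each other.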
Active/Active - Connect
[Diagram: two MM2 instances, one per direction: topic1 on us-west is mirrored to us-west.topic1 on us-east, and topic2 on us-east is mirrored to us-east.topic2 on us-west.]
connect-distributed.properties
source-connector.json
checkpoint-connector.json
heartbeat-connector.json
connect-distributed.properties
source-connector.json
checkpoint-connector.json
heartbeat-connector.j
Going into Production
Monitoring
• Throughput/latency per partition
  • kafka.connect.mirror:type=MirrorSourceConnector - byte-rate|record-age-ms|replication-latency-ms
• Offset checkpoint latency
  • kafka.connect.mirror:type=MirrorCheckpointConnector - checkpoint-latency-ms
• Connect task/connector health
  • http://kafka.apache.org/documentation/#connect_monitoring
• Connect task configurations
  • /connectors/<connector>/tasks-config since Kafka 2.8
• Duplicated tasks, Connect JIRA: KAFKA-9849
  • Fixed in 2.4.2, 2.5.1, 2.6.0 and above
Controls
• Scale Connect
  • tasks.max
  • Number of workers
• Select mirroring workload
  • topics and groups settings
• Offset reset policy
  • consumer.auto.offset.reset=latest since Kafka 2.8
Kafka Improvement Proposals
• KIP-310: Add a Kafka Source Connector to Kafka Connect ✅ (withdrawn in favor of MM2)
• KIP-382: MirrorMaker 2.0 ✅
• KIP-597: MirrorMaker2 internal topics Formatters ✅
• KIP-605: Expand Connect Worker Internal Topic Settings ✅
• KIP-618: Atomic commit of source connector records and offsets
• KIP-661: Expose task configurations in Connect REST API ✅
• KIP-656: MirrorMaker2 Exactly-once Semantics
• KIP-690: Add additional configuration to control MirrorMaker 2 internal topics naming convention
• KIP-710: Full support for distributed mode in dedicated MirrorMaker 2.0 clusters
• KIP-712: Shallow Mirroring
• KIP-716: Allow configuring the location of the offset-syncs topic with MirrorMaker2
• KIP-720: Deprecate MirrorMaker 1 ✅
Notable Progress
• KAFKA-8930: MirrorMaker v2 documentation
• KAFKA-9175: MirrorMaker 2 emits invalid topic partition metrics
• KAFKA-9352: Unbalanced assignment of topic-partition to tasks
• KAFKA-9849: Fix issue with worker.unsync.backoff.ms creating zombie workers when incremental cooperative rebalancing is used
• KAFKA-10710: MirrorMaker 2 creates all combinations of herders
• KAFKA-12254: MirrorMaker 2.0 creates destination topic with default configs
Ongoing:
• KAFKA-10339 and KAFKA-10483: MirrorSinkConnectors and EOS
• KAFKA-9726: LegacyReplicationPolicy
Thank You!
Mickael Maison - @MickaelMaison
Ryanne Dolan - @DolanRyanne
https://kafka.apache.org/documentation/#georeplication
https://github.com/apache/kafka/tree/trunk/connect/mirror
https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0

Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Getting up to speed with MirrorMaker 2 | Mickael Maison, IBM and Ryanne Dolan, Twitter, Inc

Editor's notes

  1. Replication use cases include disaster recovery, backup, failover/failback, cloud migration, and so on.
  2. The basic problem here is that, in Kafka, offsets are never guaranteed to be consistent between clusters, even if the same records are sent in the exact same order. (You can observe this even within one cluster by sending the same records, in the same order, to two different topics.) This is a problem if we want one cluster to be a mirror of another cluster: the data might be the same, but the offsets will almost certainly be different. So unless we solve this problem, we can’t really have a so-called “backup cluster”; it would not be a very good backup. Timestamp-based recovery has been available since KIP-33: rewind to a previous point in time and use this as a basis for disaster recovery. It is very problematic in practice. For example, you have to assume each consumer is caught up to real time. If there is a lagging consumer, you might end up fast-forwarding it accidentally. MM2: we’ll talk about how MM2 solves this problem, but basically we need to keep a mapping of offsets between clusters so we can translate offsets between them. Consumer group offset mirroring is the biggest feature of MM2. Each consumer group is checkpointed automatically between clusters, so you know how to recover each individual consumer.
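The idea of keeping a mapping of offsets between clusters can be sketched as follows. This is a simplified illustration, not MM2's actual algorithm: the `translate_offset` helper and the sample sync values are hypothetical, and the sketch assumes each sync simply pairs a source-cluster offset with the offset the same record received in the target cluster.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (upstream_offset, downstream_offset): the same record's offset in the
# source cluster and in the target cluster. Values are illustrative only.
OffsetSync = Tuple[int, int]


def translate_offset(syncs: List[OffsetSync], upstream_committed: int) -> Optional[int]:
    """Map a committed source-cluster offset to a safe target-cluster offset.

    Hypothetical helper: picks the latest sync at or before the committed
    offset and returns its downstream offset. This never skips records the
    consumer has not processed yet, at the cost of re-reading some records.
    """
    upstream_offsets = [up for up, _ in syncs]  # assumed sorted ascending
    idx = bisect_right(upstream_offsets, upstream_committed) - 1
    if idx < 0:
        return None  # no sync at or before this offset; cannot translate
    return syncs[idx][1]


syncs = [(0, 0), (100, 40), (200, 95)]
print(translate_offset(syncs, 150))  # 40
```

Rounding down to the nearest known sync is a deliberately conservative choice in this sketch: after a failover the consumer may see some records twice, but it should not miss any.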
  3. Consumer/producer config: bad UX. High-level driver: think of it as a bunch of replication workers running together under one consistent control plane. Much better than configuring a bunch of individual producers and consumers: the driver spins up a whole bunch of producers and consumers for you.
  4. The key word here is “synchronized” (not just replicated). “Topics” is more than just “records”. Topics have metadata, e.g. the number of partitions, ACLs, etc. So again, MM1 didn’t create a very good “mirror”.
  5. In the second part of the session, I want to give you tips and practical knowledge about running MM2. By the end of this session, you should be able to get it running yourself. The first decision to make is the deployment mode: how are you going to run MM2? As said, MM2 is a set of connectors for Kafka Connect, but there are 2 options: dedicated mode, or explicitly on Connect.
  6. Within the MM2 process, you get 2 Connect runtimes: 1 runtime for the target cluster, where the source and checkpoint connectors run, and 1 runtime for the source cluster, as the heartbeat connector produces records to the source cluster.
  8. Dedicated mode, also known as driver mode. This is the mode first encountered by many people, as it’s what happens when you run the connect-mirror-maker.sh tool. A lot happens behind the scenes: you don’t interact with Connect explicitly, and the REST API is not available. This mode offers a very expressive way to configure MM2 and is set up via a single file. It runs all connectors directly. It’s great for getting started, or if you have a small to medium use case without specific requirements.
  10. You can also run the connectors directly in Connect, like any other connectors. We know and already use Connect Distributed; I’m not going to cover Connect Standalone. Great if you already have Connect clusters. This provides full control: you can start exactly the connectors you want. It also lets you keep Connect runtimes near the clusters with their topics. Trade-off: it is configured via JSON files, 1 per connector, so it’s more complex.
  11. Hopefully you now understand the deployment options and have picked your preferred solution. Let’s now look at the use cases MM2 enables. It covers a lot of scenarios, and pretty much any cluster topology can be built. In the interest of time, I’ll cover the 2 most common ones. Ryanne, in his talk at the last Kafka Summit in London, demonstrated a few more advanced scenarios.
  12. Active/Standby is a misleading name, as you can still use the target cluster; it is just that mirroring is unidirectional. Any topics/groups on us-west will be mirrored to us-east. Naming is fully configurable.
  13. List your clusters. Connection information + SSL + SASL. Use the fancy arrow notation to describe the mirroring direction. Very simple.
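As a rough illustration of what such a file contains, the sketch below renders a minimal dedicated-mode configuration as properties text. The `render_mm2_properties` helper and the host names are hypothetical, and SSL/SASL settings are left out; the `clusters`, `<alias>.bootstrap.servers` and `<source>-><target>.enabled` / `.topics` keys follow the MM2 configuration format.

```python
from typing import Dict, List, Tuple


def render_mm2_properties(
    bootstrap_servers: Dict[str, str],
    flows: List[Tuple[str, str]],
    topics: str = ".*",
) -> str:
    """Render a minimal MM2 dedicated-mode configuration as properties text."""
    # First, list the clusters by alias and say how to connect to each one.
    lines = ["clusters = " + ", ".join(bootstrap_servers)]
    for alias, servers in bootstrap_servers.items():
        lines.append(f"{alias}.bootstrap.servers = {servers}")
    # Then enable each replication flow using the "arrow" notation.
    for source, target in flows:
        lines.append(f"{source}->{target}.enabled = true")
        lines.append(f"{source}->{target}.topics = {topics}")
    return "\n".join(lines) + "\n"


# Placeholder host names; security settings are omitted for brevity.
servers = {"us-west": "west-kafka:9092", "us-east": "east-kafka:9092"}
active_standby = render_mm2_properties(servers, [("us-west", "us-east")])
print(active_standby)
```

Passing a second flow, `("us-east", "us-west")`, would produce the extra lines needed for mirroring in the other direction.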
  14. A bit more configuration with Connect. Slides will be available later. The point is not to look at the exact payloads but instead to see that it’s not a lot of JSON in the end. No heartbeat connector, as it would require a second Connect runtime.
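To give a sense of the size of such a payload, here is a minimal sketch that builds the JSON body for a MirrorSourceConnector as it could be submitted to the Connect REST API. The `mirror_source_payload` helper, the connector name and the host names are assumptions made for illustration; the payload on the actual slide may differ.

```python
import json


def mirror_source_payload(name, source, target, source_servers, target_servers, topics):
    """Build the JSON body for creating a MirrorSourceConnector on Connect."""
    return {
        "name": name,
        "config": {
            "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
            "source.cluster.alias": source,
            "target.cluster.alias": target,
            "source.cluster.bootstrap.servers": source_servers,
            "target.cluster.bootstrap.servers": target_servers,
            "topics": topics,
            # Mirror records as raw bytes, without deserializing them.
            "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
            "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
        },
    }


# Placeholder connector name and host names.
payload = mirror_source_payload(
    "mm2-source-west-to-east", "us-west", "us-east",
    "west-kafka:9092", "east-kafka:9092", "orders.*",
)
print(json.dumps(payload, indent=2))
```

The checkpoint connector would need a similar, separate payload, which is why this mode involves one JSON file per connector.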
  15. Very similar to Active/Standby. MM2 prevents loops. Both runtimes run all 3 connectors.
  16. Basically just add an extra line enabling mirroring in the other direction. That’s it, done! Note that here you’ll be running source connectors on both runtimes. One of them is distant from its Kafka cluster.
  17. In order to do active-active you need 2 Connect clusters. You could deploy Dedicated this way too.
  18. More configuration files. Hopefully at this point you are not using curl to start connectors. You should have a system to deploy connectors, so in the end this should not be a lot of work/overhead.
  19. Now that we’ve learned how to run MM2, let’s look at some tips for going into production.
  20. Obviously, like any production system, you want to monitor MM2 closely. Fortunately, the MM2 connectors provide many metrics. Source connector: check throughput and latency. Also consider record-age if mirroring existing topics with old records. Record-age is the difference between the record timestamp and the time MM2 consumes the record. Latency is the difference between the record timestamp and the time Connect successfully produces the record to the target cluster. Checkpoint connector: latency. Overall Connect health: task count and state. Are all tasks running? How many tasks? Have we reached the max? Be sure to run one of the latest releases to have the fix for KAFKA-9849: Connect could duplicate tasks when rebalancing, so data could be mirrored twice, with a significant load increase!
  21. It’s also important to be aware of the controls you have as an operator. In terms of performance, you can scale connectors via 2 mechanisms: the number of tasks (how many tasks can be packed onto a worker depends on many factors, so monitor your worker system resources) and the number of workers running tasks. You can also adjust the workload and make sure that what is being mirrored is what you want. MM2 prevents creating loops, but still be careful, as the default setting is .*! In many scenarios you typically don’t want to mirror all your topics/groups. Be careful with regexes, as it’s easy to make a mistake. Since 2.5 (KIP-558), you can use the Connect REST API to see the active list of topics and check it’s what you expect. From Kafka 2.8 (KIP-661), you can use connector/tasks-config to see the partitions assigned to each task. Finally, you can adjust where your connectors start mirroring with the offset reset policy, especially if mirroring large topics.
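One way to catch regex mistakes early is to preview a pattern against the list of existing topic names before deploying it. The sketch below is an approximation under stated assumptions: the `preview_mirrored_topics` helper and the topic names are hypothetical, it uses Python's `re` module rather than the Java regex engine MM2 runs on, and it assumes a pattern has to match the whole topic name.

```python
import re
from typing import Iterable, List, Sequence


def preview_mirrored_topics(
    topics: Iterable[str],
    include: Sequence[str],
    exclude: Sequence[str] = (),
) -> List[str]:
    """Return the topics selected by the include patterns and not excluded."""
    include_res = [re.compile(p) for p in include]
    exclude_res = [re.compile(p) for p in exclude]
    selected = []
    for topic in topics:
        included = any(r.fullmatch(topic) for r in include_res)
        excluded = any(r.fullmatch(topic) for r in exclude_res)
        if included and not excluded:
            selected.append(topic)
    return selected


existing = ["orders", "orders.eu", "orders-audit", "payments", "__consumer_offsets"]

# An unescaped dot matches any character, so this also picks up "orders-audit".
broad = preview_mirrored_topics(existing, ["orders.*"])
# Escaping the dot restricts the match to "orders" and "orders.<suffix>".
narrow = preview_mirrored_topics(existing, [r"orders(\..*)?"])
print(broad)   # ['orders', 'orders.eu', 'orders-audit']
print(narrow)  # ['orders', 'orders.eu']
```

A preview like this complements, rather than replaces, checking the active topic list on the running connector.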
  22. MM2 leverages Connect, so improvements to Connect (e.g. EOS) help MM2! Lots of MM2-related KIPs recently. Real momentum! Sorry if I missed some!
  23. New georeplication section replaces old MM1 documentation.
  24. We hope we’ve given you the tools to get started with MM2 and that you’ll be able to run it successfully. Thank you for attending our session. Feel free to reach out to us on Twitter if you have any questions.