SlideShare ist ein Scribd-Unternehmen logo
1 von 22
1
1
How Zillow Unlocked Kafka to 50
Teams in 8 months
2
PROPERTY
MANAGERS
& LANDLORDS
BUYERS &
SELLERS
RENTERS
HOMEOWNERS
REAL
ESTATE
AGENTS
MORTGAGE
PROVIDERS
“Give people the power to unlock life’s next chapter”
The Living Database of All Homes
3
Technology at Zillow
• AWS Shop
• Many AWS Accounts
• 1000s of Microservices
• 100s of Data Pipelines
• “Right tool for the job”
Notable streaming use cases
• Click Stream Analysis
• Property & Listings Data
• Personalization
• Data Lake ingestion
4
Messaging and Streaming Data at Zillow: 2017 Look
Microservice Communication
5
Messaging and Streaming Data at Zillow: 2017 Look
Data Streaming Platform
6
Proliferation of Streaming Scenarios
Capacity sharing
7
Proliferation of Streaming Scenarios
Data Quality Issues
8
Proliferation of Streaming Scenarios
Long tail costs
9
Proliferation of Streaming Scenarios
Slow to innovate
10
Kafka Ecosystem as All-in-One
● Better scale for both streaming & pub/sub scenarios
○ Especially for 1:many consumers
● All data is structured & documented through the Schema Registry
● One place for metadata - All data is “owned”
● Multi tenancy for cost efficiency
11
The Reality of a new Platform Team
● Top down approach is not healthy
● Trust needs to be earned
● Migration is an overhead/technical debt task
● Platform team is a bottleneck
12
Lesson 1: Gain Trust - Quickly!
● Published SLOs from day 1
○ Availability:
under min-isr = 0
successful read canary events
○ Latency:
P50 & P99 Produce
P99 Consume (without Remote)
● Onboarded large (~40MB/s) non
critical stream
13
Lesson 2: Meet Developers Where They Are
● For us it was Terraform
● Open source Kafka Terraform exists but...
● Balance self-service & control
○ Ticketing / Merge Requests?
○ Another approval Flow?
○ Guardrail proxy
14
Lesson 2: Meet Developers Where They Are
● Terraform is declarative
● K8s Custom Resource
Definitions + K8s Operator
● Cluster protection
● Allowed configs
● Metadata
● Devs are in control
● Authz with Namespaces
15
Lesson 3: Migration comes from Value
● No one has time for migration now
● Engaging with the right customers first
○ Very high throughput
○ Many consumers
○ >1MB Message Support
○ Compaction
● Under the hood migrations
16
Lesson 4: Platform is a Product
● Customer (Developer) is our north Star
● Documentation
● CLI
● Archetypes
● Abstracted Client Library
● Internal blog posts
17
Lesson 5: This is Where we Failed - Schema Registry
● 1:1 Kafka → Schema Registry
environments
● CICD for schemas with
“RecordNameStrategy”
● Had to support Multiple Customer Envs
(e.g dev/qa in pre-prod)
Can you guess what happened next?
18
Fail Reason 1 - Schema Sharing
● Shared schemas between
customer environments
● Incompatible dev schema change
failed QA de-serializtion
19
Fail Reason 2 - Porting Data
● Kafka Avro wire format does not have
Schema Registry “ID”
● And access is needed too
20
How we Look at Schema Registry Now
● Still support multi customer envs
● Single global Schema Registry
(think “Maven for Runtime”)
● TopicNameStrategy
● Customer-environment-aware
CICD for schemas
21
Summary
● Gain trust - quickly
● Meet developers where they are
● Bottom up migrations - show value
● Platform as a Product
● Stick to single Schema Registry
How Zillow Unlocked Kafka to 50 Teams in 8 Months

Weitere ähnliche Inhalte

Was ist angesagt?

Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentHostedbyConfluent
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
Improving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberImproving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberYing Zheng
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Timothy Spann
 
Apache Kafka® and API Management
Apache Kafka® and API ManagementApache Kafka® and API Management
Apache Kafka® and API Managementconfluent
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin PodvalMartin Podval
 
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMESet your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMEconfluent
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...HostedbyConfluent
 

Was ist angesagt? (20)

Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, ConfluentTemporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
Temporal-Joins in Kafka Streams and ksqlDB | Matthias Sax, Confluent
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Improving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberImproving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at Uber
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Kafka internals
Kafka internalsKafka internals
Kafka internals
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 
Apache Kafka® and API Management
Apache Kafka® and API ManagementApache Kafka® and API Management
Apache Kafka® and API Management
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LMESet your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
Set your Data in Motion with Confluent & Apache Kafka Tech Talk Series LME
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
Building Streaming Data Pipelines with Google Cloud Dataflow and Confluent Cl...
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 

Ähnlich wie How Zillow Unlocked Kafka to 50 Teams in 8 Months

Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...Timothy Spann
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processingconfluent
 
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...HostedbyConfluent
 
Uber Real Time Data Analytics
Uber Real Time Data AnalyticsUber Real Time Data Analytics
Uber Real Time Data AnalyticsAnkur Bansal
 
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architecture
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architectureCommit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architecture
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architectureJordi Puigsegur Figueras
 
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsKubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsRightScale
 
Single tenant software to multi-tenant SaaS using K8S
Single tenant software to multi-tenant SaaS using K8SSingle tenant software to multi-tenant SaaS using K8S
Single tenant software to multi-tenant SaaS using K8SCloudLinux
 
A day in the life of a log message
A day in the life of a log messageA day in the life of a log message
A day in the life of a log messageJosef Karásek
 
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...Gary Arora
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaSteven Wu
 
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...Ambassador Labs
 
Потоковая обработка больших данных
Потоковая обработка больших данныхПотоковая обработка больших данных
Потоковая обработка больших данныхCEE-SEC(R)
 
How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...
How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...
How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...StreamNative
 
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and DaemonsDisenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and DaemonsC4Media
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...Timothy Spann
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
In-Stream Processing Service Blueprint, Reference architecture for real-time ...
In-Stream Processing Service Blueprint, Reference architecture for real-time ...In-Stream Processing Service Blueprint, Reference architecture for real-time ...
In-Stream Processing Service Blueprint, Reference architecture for real-time ...Grid Dynamics
 
Building Enterprise Grade Applications in Yarn with Apache Twill
Building Enterprise Grade Applications in Yarn with Apache TwillBuilding Enterprise Grade Applications in Yarn with Apache Twill
Building Enterprise Grade Applications in Yarn with Apache TwillCask Data
 

Ähnlich wie How Zillow Unlocked Kafka to 50 Teams in 8 Months (20)

Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
 
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
 
Uber Real Time Data Analytics
Uber Real Time Data AnalyticsUber Real Time Data Analytics
Uber Real Time Data Analytics
 
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architecture
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architectureCommit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architecture
Commit Conf 2018 - Hotelbeds' journey to a microservice cloud-based architecture
 
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOpsKubernetes and Terraform in the Cloud: How RightScale Does DevOps
Kubernetes and Terraform in the Cloud: How RightScale Does DevOps
 
Single tenant software to multi-tenant SaaS using K8S
Single tenant software to multi-tenant SaaS using K8SSingle tenant software to multi-tenant SaaS using K8S
Single tenant software to multi-tenant SaaS using K8S
 
A day in the life of a log message
A day in the life of a log messageA day in the life of a log message
A day in the life of a log message
 
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
Leapfrog into Serverless - a Deloitte-Amtrak Case Study | Serverless Confere...
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
 
Потоковая обработка больших данных
Потоковая обработка больших данныхПотоковая обработка больших данных
Потоковая обработка больших данных
 
How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...
How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...
How Tencent Applies Apache Pulsar to Apache InLong —— A Streaming Data Integr...
 
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and DaemonsDisenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...
 
Cloud arch patterns
Cloud arch patternsCloud arch patterns
Cloud arch patterns
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
In-Stream Processing Service Blueprint, Reference architecture for real-time ...
In-Stream Processing Service Blueprint, Reference architecture for real-time ...In-Stream Processing Service Blueprint, Reference architecture for real-time ...
In-Stream Processing Service Blueprint, Reference architecture for real-time ...
 
Building Enterprise Grade Applications in Yarn with Apache Twill
Building Enterprise Grade Applications in Yarn with Apache TwillBuilding Enterprise Grade Applications in Yarn with Apache Twill
Building Enterprise Grade Applications in Yarn with Apache Twill
 

Mehr von HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

Mehr von HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Kürzlich hochgeladen

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

How Zillow Unlocked Kafka to 50 Teams in 8 Months

  • 1. 1 1 How Zillow Unlocked Kafka to 50 Teams in 8 months
  • 2. 2 PROPERTY MANAGERS & LANDLORDS BUYERS & SELLERS RENTERS HOMEOWNERS REAL ESTATE AGENTS MORTGAGE PROVIDERS “Give people the power to unlock life’s next chapter” The Living Database of All Homes
  • 3. 3 Technology at Zillow • AWS Shop • Many AWS Accounts • 1000s of Microservices • 100s of Data Pipelines • “Right tool for the job” Notable streaming use cases • Click Stream Analysis • Property & Listings Data • Personalization • Data Lake ingestion
  • 4. 4 Messaging and Streaming Data at Zillow: 2017 Look Microservice Communication
  • 5. 5 Messaging and Streaming Data at Zillow: 2017 Look Data Streaming Platform
  • 6. 6 Proliferation of Streaming Scenarios Capacity sharing
  • 7. 7 Proliferation of Streaming Scenarios Data Quality Issues
  • 8. 8 Proliferation of Streaming Scenarios Long tail costs
  • 9. 9 Proliferation of Streaming Scenarios Slow to innovate
  • 10. 10 Kafka Ecosystem as All-in-One ● Better scale for both streaming & pub/sub scenarios ○ Especially for 1:many consumers ● All data is structured & documented through the Schema Registry ● One place for metadata - All data is “owned” ● Multi tenancy for cost efficiency
  • 11. 11 The Reality of a new Platform Team ● Top down approach is not healthy ● Trust needs to be earned ● Migration is an overhead/technical debt task ● Platform team is a bottleneck
  • 12. 12 Lesson 1: Gain Trust - Quickly! ● Published SLOs from day 1 ○ Availability: under min-isr = 0 successful read canary events ○ Latency: P50 & P99 Produce P99 Consume (without Remote) ● Onboarded large (~40MB/s) non critical stream
  • 13. 13 Lesson 2: Meet Developers Where They Are ● For us it was Terraform ● Open source Kafka Terraform exists but... ● Balance self-service & control ○ Ticketing / Merge Requests? ○ Another approval Flow? ○ Guardrail proxy
  • 14. 14 Lesson 2: Meet Developers Where They Are ● Terraform is declarative ● K8s Custom Resource Definitions + K8s Operator ● Cluster protection ● Allowed configs ● Metadata ● Devs are in control ● Authz with Namespaces
  • 15. 15 Lesson 3: Migration comes from Value ● No one has time for migration now ● Engaging with the right customers first ○ Very high throughput ○ Many consumers ○ >1MB Message Support ○ Compaction ● Under the hood migrations
  • 16. 16 Lesson 4: Platform is a Product ● Customer (Developer) is our north Star ● Documentation ● CLI ● Archetypes ● Abstracted Client Library ● Internal blog posts
  • 17. 17 Lesson 5: This is Where we Failed - Schema Registry ● 1:1 Kafka → Schema Registry environments ● CICD for schemas with “RecordNameStrategy” ● Had to support Multiple Customer Envs (e.g dev/qa in pre-prod) Can you guess what happened next?
  • 18. 18 Fail Reason 1 - Schema Sharing ● Shared schemas between customer environments ● Incompatible dev schema change failed QA de-serializtion
  • 19. 19 Fail Reason 2 - Porting Data ● Kafka Avro wire format does not have Schema Registry “ID” ● And access is needed too
  • 20. 20 How we Look at Schema Registry Now ● Still support multi customer envs ● Single global Schema Registry (think “Maven for Runtime”) ● TopicNameStrategy ● Customer-environment-aware CICD for schemas
  • 21. 21 Summary ● Gain trust - quickly ● Meet developers where they are ● Bottom up migrations - show value ● Platform as a Product ● Stick to single Schema Registry