SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Downloaden Sie, um offline zu lesen
1 © Hortonworks Inc. 2011–2018. All rights reserved
Curing the Kafka blindness—
Streams Messaging Manager
Andrew Psaltis
2 © Hortonworks Inc. 2011–2018. All rights reserved
Why Kafka
Source
System
Source
System
Source
System
Source
System
Kafka
Hadoop Security
Systems
Real-Time
Monitoring
Data
Warehouse
Producers
Brokers
Consumers
Kafka decouples data pipelines
3 © Hortonworks Inc. 2011–2018. All rights reserved
Streaming Analytics Reference Architecture
Data Flow Apps
Powered by NiFi
Subcribing
Streaming Analytics
Apps
Kafka is Everywhere - Critical Component of Streaming Architectures
Kafka Producers Kafka Topics Kafka TopicsKafka Consumers & Producers Kafka Consumers
Data Syndication
Services Powered by
Kafka
syndicate-
speed
Kafka Topic
syndicate-
geo
Kafka Topic
syndicate-
transmission
Kafka Topic
syndicate-
temp
Kafka Topic
syndicate-
oil
Kafka Topic
syndicate-
breaks
Kafka Topic
syndicate-
battery
Kafka Topic
syndicate-
start/stop
Kafka Topic
syndicate-
acceleration
Kafka Topic
syndicate-
idle
Kafka Topic
Data Collection
at the Edge
US West Fleet
Truck	Sensors	 C++
Agent
US Central Fleet
Truck	Sensors	 C++
Agent
US East Fleet
Truck	Sensors	 C++
Agent
Analytics App 1
Analytics App 2
Analytics App 5
Analytics App 3
Analytics App 4
gateway-west-
raw-sensors
Kafka Topic
IOT Ingest Gateway
Powered by Kafka
gateway-central-
raw-sensors
Kafka Topic
gateway-east-
raw-sensors
Kafka Topic
4 © Hortonworks Inc. 2011–2018. All rights reserved
Kafka’s Omnipresence Has Led to the Onset of “Kafka
Blindness”
• Who is Affected?
• Platform Operation Teams
• Developers / DevOps Teams
• Security / Governance Teams
• What is “Kafka Blindness”?
• Customers who use Kafka today struggle with monitoring /
“seeing”/troubleshooting what is happening in their clusters
• What are the Symptoms?
• Difficulty seeing who is producing and consuming data
• Difficulty understanding the flow of data from producers -> topics à consumers
• Difficulty troubleshooting/monitoring.
5 © Hortonworks Inc. 2011–2018. All rights reserved
Kafka Personas and Challenges
6 © Hortonworks Inc. 2011–2018. All rights reserved
Distinct Needs of 3 Personas/Teams using Kafka
Concerned with monitoring
the overall health of the
cluster and the infrastructure
it runs on
Concerned with monitoring
the Kafka entities associated
with their apps
Concerned with audit,
compliance, access control &
chain of custody requirements
7 © Hortonworks Inc. 2011–2018. All rights reserved
SMM Platform
Operations Use Cases
8 © Hortonworks Inc. 2011–2018. All rights reserved
SMM for the Platform Ops Team
Are any of my brokers down?
How many total active producers/
consumers is there currently?
Which producers are
generating the most data
right now?
Which producers are generating
the most data right now?
Do I have any offline topic partitions?
Which producers are generating
the most data right now?
Which consumer group us
falling behind the most?
Are there any skewed
partitions for a broker?
What is the throughput in/out
for a given partition on that
broker?
How many total active
producers and consumers
exist now?
Are any of my brokers
running out of disk space?
What hosts are my brokers
located on?
How of the cluster is being used,
how much capacity do I have
available per broker / cluster?
Are any of my brokers running
hot? Which broker has the highest
throughput in and and rates?
Which partitions are
located on each broker?
How many total topics does
my Kafka cluster have?
Are all my replicas in my
topic in-synch?
Which of my topics has
produced/consumers the most
messages over the last N
minutes/hours?
9 © Hortonworks Inc. 2011–2018. All rights reserved
How would you find the most active
producer in the cluster ?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
ü
10 © Hortonworks Inc. 2011–2018. All rights reserved
Find the Most Active Producer in my Cluster
A Kafka producer called
minfi-eu-i1 is the most
active producer sending
39K messages in the last
30 mins
Click on Messages to
sort on messages sent
by all producers in the
last 30 mins
11 © Hortonworks Inc. 2011–2018. All rights reserved
How would you find the consumer
that has fallen behind the most?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
ü
12 © Hortonworks Inc. 2011–2018. All rights reserved
Find the consumer that has fallen behind the most
Click on LAG to sort on consumer lag
across all consumers in the last 30 mins
Consumer group named route-micro-
service has significantly more lag (97K)
than any another consumer in the
cluster.
13 © Hortonworks Inc. 2011–2018. All rights reserved
How would you find which broker has
the highest throughput?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
ü
ü
ü
14 © Hortonworks Inc. 2011–2018. All rights reserved
Which broker has the highest throughput?
Step 1
Click on the Brokers tab to
see a broker centric view of
the Dashboard
Step 2
Click on Throughput to sort
on data in across all brokers
For Grafana Metrics
Click on Ambari icon to view Broker
host level metrics
15 © Hortonworks Inc. 2011–2018. All rights reserved
How would you find the partitions on
a given broker and understand the
flow of data from producers to
consumers?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
X
16 © Hortonworks Inc. 2011–2018. All rights reserved
Understand flow of data flow from Producer to Consumer
Step 1
Click on Panel expand to get more
details on the broker like all partitions
that are stored on the broker
Step 2
Click on the partition to see the producers and
consumers sending/consuming from that partition.
17 © Hortonworks Inc. 2011–2018. All rights reserved
SMM DevOps/App Dev
Use Cases
18 © Hortonworks Inc. 2011–2018. All rights reserved
SMM for the DevOps/AppDev Teams
Find all entities (producer,
consumers, topics) associated with
my app.
What is the retention
rate for my app topics? What is the replication
factor for my app topics?
What brokers holds the
partitions for my app
topics?
Who are all the producers
and consumers connected
to my app topics?
What type of events are in my
application topics? What does the
event look like?
What is the total number of
messages into my topic over
the last N minutes/hours?
Did a consumer rebalance
occur for a given topic?
Are any of my consumers/
consumer-groups that are
over-consuming?
Are there consumers in a
consumer-group for a given
topic slow/falling behind?
How many active consumers
instances are in a given
consumer group?
Are any of my consumers/consumer-
groups that are under-consuming?What topic(s) are the consumer
group consuming messages from?
19 © Hortonworks Inc. 2011–2018. All rights reserved
How would you filter all of the
producers, consumers, and topics that
are related to your application?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
X
20 © Hortonworks Inc. 2011–2018. All rights reserved
Filtering producers, consumers, and topics
Use the Filter to filter on topics
and select all the IOT gateway
topics
21 © Hortonworks Inc. 2011–2018. All rights reserved
How would you find the partitions on
a given broker and understand the
flow of data from producers to
consumers?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
X
22 © Hortonworks Inc. 2011–2018. All rights reserved
Are any consumers /consumer-groups under-consuming?
Step 1
Expand topic panel
to see more details
Step 2
Click on the Topic
to see all producers
Analysis
Note that for each
partition there is no
data going out (0B)
and no data going to
any consumer groups.
23 © Hortonworks Inc. 2011–2018. All rights reserved
How does Data Flow between Producers to Topics to Consumers?
Step 1
Expand details of a topic
that has consumers
Step 2
Click on the topic to see
producers and consumers
24 © Hortonworks Inc. 2011–2018. All rights reserved
How would search for and view the
data in a topic of interest?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
ü
25 © Hortonworks Inc. 2011–2018. All rights reserved
Explore/Search Messages in the Kafka Topic
Click on the explorer icon to search for
events in the Kafka Topic
26 © Hortonworks Inc. 2011–2018. All rights reserved
SMM for the Security and Governance Team
How does data flow across
multiple kafka hops?
When was a topic created?
When was the topic configuration
last modified?
Which producers have
sent data to the topic?
Which consumers have
consumed from a topic?
What is the schema for a given topic?
How has the schema evolved for
a given topic?
When were additional
brokers added to the
topic?Which users/groups/service
accounts have read from a given
topic?
Which users/groups/service
accounts have sent data to a topic?
What are the ACL policies for a
given topic?
Who has edited the ACL
policies of a given topic?
What is the linage of a
kafka topic?
27 © Hortonworks Inc. 2011–2018. All rights reserved
How would you trace the data lineage
across multiple Kafka topics?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
X
28 © Hortonworks Inc. 2011–2018. All rights reserved
Step 1
Click on Atlas Icon to see
lineage of the the topic
gateway-west-raw-sensors
Analysis
NiFi App consumes from the gateway-
west-raw-sensors topic and publishes
events to downstream Kafka topic
called syndicate-geo-event-avro
The topic has one active consumer which is a
NiFi consumer. Which Kafka topic if any is this
NiFi Flow consumer publishing events to?
Step 2
Search for the Kafka topic
that the the nifi-truck-
sensor-west consumer was
publishing events to
Analysis
We see that that this topic has 4
downstream consumers. We just tracked
a flow across multiple Kafka topics
29 © Hortonworks Inc. 2011–2018. All rights reserved
How would integrate all of the metrics
with an APM, alerting or ticketing
system ?
Open Source Monitoring Tools
Grafana / Graphite
Command Line / Script
X
X
X
30 © Hortonworks Inc. 2011–2018. All rights reserved
SMM REST Admin Server
What is the SMM REST Server?
à SMM UI powered by first class REST
Services via the SMM REST Admin Server
à Monitoring Rest Endpoints can be used to
integrate with APM/Alerting/Ticketing
solutions
à Powered by Apache Knox
à Installed via an Ambari Management Pack
on a target HDP/HDF cluster where Kafka
Service is running
à Supports both HDP & HDF Platforms
31 © Hortonworks Inc. 2011–2018. All rights reserved
HORTONWORKS
DATA PLATFORM (HDP®)
DATA-AT-REST
PRODUCERS CONSUMERS
TOPICS
BROKERS
SMM REST Server
HORTONWORKS
DATAFLOW (HDF)
DATA-IN-MOTION
Optimize your
Kafka Clusters
Troubleshoot
Kafka Streams
Trace end-to-
end Kafka flows
Streams Messaging Manager (SMM)
32 © Hortonworks Inc. 2011–2018. All rights reserved
Questions?
33 © Hortonworks Inc. 2011–2018. All rights reserved
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
Data in the Cloud Crash Course
Data in the Cloud Crash CourseData in the Cloud Crash Course
Data in the Cloud Crash CourseDataWorks Summit
 
Ozone and HDFS's Evolution
Ozone and HDFS's EvolutionOzone and HDFS's Evolution
Ozone and HDFS's EvolutionDataWorks Summit
 
Ozone and HDFS’s evolution
Ozone and HDFS’s evolutionOzone and HDFS’s evolution
Ozone and HDFS’s evolutionDataWorks Summit
 
10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the GlobeDataWorks Summit
 
The Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricThe Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricDataWorks Summit
 
Solving Cybersecurity at Scale
Solving Cybersecurity at ScaleSolving Cybersecurity at Scale
Solving Cybersecurity at ScaleDataWorks Summit
 
Hadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureHadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureDataWorks Summit
 
Navigating Idiosyncrasies of IoT Development
Navigating Idiosyncrasies of IoT DevelopmentNavigating Idiosyncrasies of IoT Development
Navigating Idiosyncrasies of IoT DevelopmentDataWorks Summit
 
What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4DataWorks Summit
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0DataWorks Summit
 
Apache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionApache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionDataWorks Summit
 
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureDataWorks Summit
 
Data Centric Transformation in Telecom
Data Centric Transformation in TelecomData Centric Transformation in Telecom
Data Centric Transformation in TelecomDataWorks Summit
 
Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo DataWorks Summit
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNDataWorks Summit
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?DataWorks Summit
 
What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4DataWorks Summit
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHortonworks
 

Was ist angesagt? (20)

The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
Data in the Cloud Crash Course
Data in the Cloud Crash CourseData in the Cloud Crash Course
Data in the Cloud Crash Course
 
Keynote
KeynoteKeynote
Keynote
 
Ozone and HDFS's Evolution
Ozone and HDFS's EvolutionOzone and HDFS's Evolution
Ozone and HDFS's Evolution
 
Ozone and HDFS’s evolution
Ozone and HDFS’s evolutionOzone and HDFS’s evolution
Ozone and HDFS’s evolution
 
10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe10 Lessons Learned from Meeting with 150 Banks Across the Globe
10 Lessons Learned from Meeting with 150 Banks Across the Globe
 
The Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricThe Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data Centric
 
Solving Cybersecurity at Scale
Solving Cybersecurity at ScaleSolving Cybersecurity at Scale
Solving Cybersecurity at Scale
 
Hadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and FutureHadoop Operations - Past, Present, and Future
Hadoop Operations - Past, Present, and Future
 
Navigating Idiosyncrasies of IoT Development
Navigating Idiosyncrasies of IoT DevelopmentNavigating Idiosyncrasies of IoT Development
Navigating Idiosyncrasies of IoT Development
 
What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4What s new in spark 2.3 and spark 2.4
What s new in spark 2.3 and spark 2.4
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
 
Apache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the UnionApache Hadoop YARN: State of the Union
Apache Hadoop YARN: State of the Union
 
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
 
Data Centric Transformation in Telecom
Data Centric Transformation in TelecomData Centric Transformation in Telecom
Data Centric Transformation in Telecom
 
Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo Apache Hadoop YARN: state of the union - Tokyo
Apache Hadoop YARN: state of the union - Tokyo
 
Lessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARNLessons learned running a container cloud on YARN
Lessons learned running a container cloud on YARN
 
What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?What's New in Apache Hive 3.0?
What's New in Apache Hive 3.0?
 
What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical Workshop
 

Ähnlich wie Curing the Kafka Blindness – Streams Messaging Manager

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks
 
Unlocking insights in streaming data
Unlocking insights in streaming dataUnlocking insights in streaming data
Unlocking insights in streaming dataCarolyn Duby
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Curing the Kafka Blindness
Curing the Kafka BlindnessCuring the Kafka Blindness
Curing the Kafka Blindnessgvetticaden
 
HDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesHDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesTimothy Spann
 
Paris FOD meetup - Streams Messaging Manager
Paris FOD meetup - Streams Messaging ManagerParis FOD meetup - Streams Messaging Manager
Paris FOD meetup - Streams Messaging ManagerAbdelkrim Hadjidj
 
The Implacable advance of the data
The Implacable advance of the dataThe Implacable advance of the data
The Implacable advance of the dataDataWorks Summit
 
7_considerations_final
7_considerations_final7_considerations_final
7_considerations_finalJane Roberts
 
Social Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and SupersetSocial Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and SupersetThiago Santiago
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics ManagerSriharsha Chintalapani
 
SAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasySAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasyDataWorks Summit
 
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...DataWorks Summit
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerDataWorks Summit
 
Express Scripts: Driving Digital Transformation from Mainframe to Microservices
Express Scripts: Driving Digital Transformation from Mainframe to MicroservicesExpress Scripts: Driving Digital Transformation from Mainframe to Microservices
Express Scripts: Driving Digital Transformation from Mainframe to Microservicesconfluent
 

Ähnlich wie Curing the Kafka Blindness – Streams Messaging Manager (20)

Kafka/SMM Crash Course
Kafka/SMM Crash CourseKafka/SMM Crash Course
Kafka/SMM Crash Course
 
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
Unlocking insights in streaming data
Unlocking insights in streaming dataUnlocking insights in streaming data
Unlocking insights in streaming data
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Curing the Kafka Blindness
Curing the Kafka BlindnessCuring the Kafka Blindness
Curing the Kafka Blindness
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
HDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesHDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New Features
 
Paris FOD meetup - Streams Messaging Manager
Paris FOD meetup - Streams Messaging ManagerParis FOD meetup - Streams Messaging Manager
Paris FOD meetup - Streams Messaging Manager
 
The Implacable advance of the data
The Implacable advance of the dataThe Implacable advance of the data
The Implacable advance of the data
 
7_considerations_final
7_considerations_final7_considerations_final
7_considerations_final
 
Social Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and SupersetSocial Media Monitoring with NiFi, Druid and Superset
Social Media Monitoring with NiFi, Druid and Superset
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics Manager
 
Streaming analytics manager
Streaming analytics managerStreaming analytics manager
Streaming analytics manager
 
SAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made EasySAM - Streaming Analytics Made Easy
SAM - Streaming Analytics Made Easy
 
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging Manager
 
Express Scripts: Driving Digital Transformation from Mainframe to Microservices
Express Scripts: Driving Digital Transformation from Mainframe to MicroservicesExpress Scripts: Driving Digital Transformation from Mainframe to Microservices
Express Scripts: Driving Digital Transformation from Mainframe to Microservices
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 

Mehr von DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Mehr von DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Kürzlich hochgeladen

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Kürzlich hochgeladen (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Curing the Kafka Blindness – Streams Messaging Manager

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved Curing the Kafka blindness— Streams Messaging Manager Andrew Psaltis
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved Why Kafka Source System Source System Source System Source System Kafka Hadoop Security Systems Real-Time Monitoring Data Warehouse Producers Brokers Consumers Kafka decouples data pipelines
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved Streaming Analytics Reference Architecture Data Flow Apps Powered by NiFi Subcribing Streaming Analytics Apps Kafka is Everywhere - Critical Component of Streaming Architectures Kafka Producers Kafka Topics Kafka TopicsKafka Consumers & Producers Kafka Consumers Data Syndication Services Powered by Kafka syndicate- speed Kafka Topic syndicate- geo Kafka Topic syndicate- transmission Kafka Topic syndicate- temp Kafka Topic syndicate- oil Kafka Topic syndicate- breaks Kafka Topic syndicate- battery Kafka Topic syndicate- start/stop Kafka Topic syndicate- acceleration Kafka Topic syndicate- idle Kafka Topic Data Collection at the Edge US West Fleet Truck Sensors C++ Agent US Central Fleet Truck Sensors C++ Agent US East Fleet Truck Sensors C++ Agent Analytics App 1 Analytics App 2 Analytics App 5 Analytics App 3 Analytics App 4 gateway-west- raw-sensors Kafka Topic IOT Ingest Gateway Powered by Kafka gateway-central- raw-sensors Kafka Topic gateway-east- raw-sensors Kafka Topic
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved Kafka’s Omnipresence Has Led to the Onset of “Kafka Blindness” • Who is Affected? • Platform Operation Teams • Developers / DevOps Teams • Security / Governance Teams • What is “Kafka Blindness”? • Customers who use Kafka today struggle with monitoring / “seeing”/troubleshooting what is happening in their clusters • What are the Symptoms? • Difficulty seeing who is producing and consuming data • Difficulty understanding the flow of data from producers -> topics à consumers • Difficulty troubleshooting/monitoring.
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved Kafka Personas and Challenges
  • 6. 6 © Hortonworks Inc. 2011–2018. All rights reserved Distinct Needs of 3 Personas/Teams using Kafka Concerned with monitoring the overall health of the cluster and the infrastructure it runs on Concerned with monitoring the Kafka entities associated with their apps Concerned with audit, compliance, access control & chain of custody requirements
  • 7. 7 © Hortonworks Inc. 2011–2018. All rights reserved SMM Platform Operations Use Cases
  • 8. 8 © Hortonworks Inc. 2011–2018. All rights reserved SMM for the Platform Ops Team Are any of my brokers down? How many total active producers/ consumers is there currently? Which producers are generating the most data right now? Which producers are generating the most data right now? Do I have any offline topic partitions? Which producers are generating the most data right now? Which consumer group us falling behind the most? Are there any skewed partitions for a broker? What is the throughput in/out for a given partition on that broker? How many total active producers and consumers exist now? Are any of my brokers running out of disk space? What hosts are my brokers located on? How of the cluster is being used, how much capacity do I have available per broker / cluster? Are any of my brokers running hot? Which broker has the highest throughput in and and rates? Which partitions are located on each broker? How many total topics does my Kafka cluster have? Are all my replicas in my topic in-synch? Which of my topics has produced/consumers the most messages over the last N minutes/hours?
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved How would you find the most active producer in the cluster ? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X ü
  • 10. 10 © Hortonworks Inc. 2011–2018. All rights reserved Find the Most Active Producer in my Cluster A Kafka producer called minfi-eu-i1 is the most active producer sending 39K messages in the last 30 mins Click on Messages to sort on messages sent by all producers in the last 30 mins
  • 11. 11 © Hortonworks Inc. 2011–2018. All rights reserved How would you find the consumer that has fallen behind the most? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X ü
  • 12. 12 © Hortonworks Inc. 2011–2018. All rights reserved Find the consumer that has fallen behind the most Click on LAG to sort on consumer lag across all consumers in the last 30 mins Consumer group named route-micro- service has significantly more lag (97K) than any another consumer in the cluster.
  • 13. 13 © Hortonworks Inc. 2011–2018. All rights reserved How would you find which broker has the highest throughput? Open Source Monitoring Tools Grafana / Graphite Command Line / Script ü ü ü
  • 14. 14 © Hortonworks Inc. 2011–2018. All rights reserved Which broker has the highest throughput? Step 1 Click on the Brokers tab to see a broker centric view of the Dashboard Step 2 Click on Throughput to sort on data in across all brokers For Grafana Metrics Click on Ambari icon to view Broker host level metrics
  • 15. 15 © Hortonworks Inc. 2011–2018. All rights reserved How would you find the partitions on a given broker and understand the flow of data from producers to consumers? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X X
  • 16. 16 © Hortonworks Inc. 2011–2018. All rights reserved Understand flow of data flow from Producer to Consumer Step 1 Click on Panel expand to get more details on the broker like all partitions that are stored on the broker Step 2 Click on the partition to see the producers and consumers sending/consuming from that partition.
  • 17. 17 © Hortonworks Inc. 2011–2018. All rights reserved SMM DevOps/App Dev Use Cases
  • 18. 18 © Hortonworks Inc. 2011–2018. All rights reserved SMM for the DevOps/AppDev Teams Find all entities (producer, consumers, topics) associated with my app. What is the retention rate for my app topics? What is the replication factor for my app topics? What brokers holds the partitions for my app topics? Who are all the producers and consumers connected to my app topics? What type of events are in my application topics? What does the event look like? What is the total number of messages into my topic over the last N minutes/hours? Did a consumer rebalance occur for a given topic? Are any of my consumers/ consumer-groups that are over-consuming? Are there consumers in a consumer-group for a given topic slow/falling behind? How many active consumers instances are in a given consumer group? Are any of my consumers/consumer- groups that are under-consuming?What topic(s) are the consumer group consuming messages from?
  • 19. 19 © Hortonworks Inc. 2011–2018. All rights reserved How would you filter all of the producers, consumers, and topics that are related to your application? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X X
  • 20. 20 © Hortonworks Inc. 2011–2018. All rights reserved Filtering producers, consumers, and topics Use the Filter to filter on topics and select all the IOT gateway topics
  • 21. 21 © Hortonworks Inc. 2011–2018. All rights reserved How would you find the partitions on a given broker and understand the flow of data from producers to consumers? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X X
  • 22. 22 © Hortonworks Inc. 2011–2018. All rights reserved Are any consumers /consumer-groups under-consuming? Step 1 Expand topic panel to see more details Step 2 Click on the Topic to see all producers Analysis Note that for each partition there is no data going out (0B) and no data going to any consumer groups.
  • 23. 23 © Hortonworks Inc. 2011–2018. All rights reserved How does Data Flow between Producers to Topics to Consumers? Step 1 Expand details of a topic that has consumers Step 2 Click on the topic to see producers and consumers
  • 24. 24 © Hortonworks Inc. 2011–2018. All rights reserved How would search for and view the data in a topic of interest? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X ü
  • 25. 25 © Hortonworks Inc. 2011–2018. All rights reserved Explore/Search Messages in the Kafka Topic Click on the explorer icon to search for events in the Kafka Topic
  • 26. 26 © Hortonworks Inc. 2011–2018. All rights reserved SMM for the Security and Governance Team How does data flow across multiple kafka hops? When was a topic created? When was the topic configuration last modified? Which producers have sent data to the topic? Which consumers have consumed from a topic? What is the schema for a given topic? How has the schema evolved for a given topic? When were additional brokers added to the topic?Which users/groups/service accounts have read from a given topic? Which users/groups/service accounts have sent data to a topic? What are the ACL policies for a given topic? Who has edited the ACL policies of a given topic? What is the linage of a kafka topic?
  • 27. 27 © Hortonworks Inc. 2011–2018. All rights reserved How would you trace the data lineage across multiple Kafka topics? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X X
  • 28. 28 © Hortonworks Inc. 2011–2018. All rights reserved Step 1 Click on Atlas Icon to see lineage of the the topic gateway-west-raw-sensors Analysis NiFi App consumes from the gateway- west-raw-sensors topic and publishes events to downstream Kafka topic called syndicate-geo-event-avro The topic has one active consumer which is a NiFi consumer. Which Kafka topic if any is this NiFi Flow consumer publishing events to? Step 2 Search for the Kafka topic that the the nifi-truck- sensor-west consumer was publishing events to Analysis We see that that this topic has 4 downstream consumers. We just tracked a flow across multiple Kafka topics
  • 29. 29 © Hortonworks Inc. 2011–2018. All rights reserved How would integrate all of the metrics with an APM, alerting or ticketing system ? Open Source Monitoring Tools Grafana / Graphite Command Line / Script X X X
  • 30. 30 © Hortonworks Inc. 2011–2018. All rights reserved SMM REST Admin Server What is the SMM REST Server? à SMM UI powered by first class REST Services via the SMM REST Admin Server à Monitoring Rest Endpoints can be used to integrate with APM/Alerting/Ticketing solutions à Powered by Apache Knox à Installed via an Ambari Management Pack on a target HDP/HDF cluster where Kafka Service is running à Supports both HDP & HDF Platforms
  • 31. 31 © Hortonworks Inc. 2011–2018. All rights reserved HORTONWORKS DATA PLATFORM (HDP®) DATA-AT-REST PRODUCERS CONSUMERS TOPICS BROKERS SMM REST Server HORTONWORKS DATAFLOW (HDF) DATA-IN-MOTION Optimize your Kafka Clusters Troubleshoot Kafka Streams Trace end-to- end Kafka flows Streams Messaging Manager (SMM)
  • 32. 32 © Hortonworks Inc. 2011–2018. All rights reserved Questions?
  • 33. 33 © Hortonworks Inc. 2011–2018. All rights reserved Thank you