SlideShare ist ein Scribd-Unternehmen logo
Twitter Heron In
Practice
KARTHIK RAMASAMY
@KARTHIKZ
#TwitterHeron
MAOSONG FU
@LOUIS_FUMAOSONG
BILL GRAHAM
@BILLGRAHAM
AGENDA
HERON OVERVIEW
HERON HANDS-ON
BEGIN
END
HERON
OVERVIEW
!
I
HERON IN
PRODUCTION
(II
CONCLUSION
K
V
HERON
LOAD
SHEDDING
ZIV
TALK OUTLINE
HERON
BACKPRESSURE
bIII
HERON
OVERVIEW
b
STORM/HERON TERMINOLOGY
TOPOLOGY
Directed acyclic graph
Vertices=computation, and edges=streams of data tuples
SPOUTS
Sources of data tuples for the topology
Examples - Kafka/Kestrel/MySQL/Postgres
BOLTS
Process incoming tuples and emit outgoing tuples
Examples - filtering/aggregation/join/arbitrary function
,
%
STORM/HERON TOPOLOGY
%
%
%
%
%
SPOUT 1
SPOUT 2
BOLT 1
BOLT 2
BOLT 3
BOLT 4
BOLT 5
WHY HERON?
PERFORMANCE PREDICTABILITY
EASE OF MANAGEABILITY
!
d
" IMPROVE DEVELOPER PRODUCTIVITY
HERON ARCHITECTURE
Topology 1
TOPOLOGY
SUBMISSION
Scheduler
Topology 2
Topology 3
Topology N
TOPOLOGY ARCHITECTURE
Topology
Master
ZK
CLUSTER
Stream
Manager
I1 I2 I3 I4
Stream
Manager
I1 I2 I3 I4
Logical Plan,
Physical Plan and
Execution State
Sync Physical Plan
CONTAINER CONTAINER
Metrics
Manager
Metrics
Manager
HERON IN
PRODUCTION
x
9
HERON SAMPLE TOPOLOGIES
Large amount of data
produced every day
Large cluster Several hundred
topologies deployed
Several billion
messages every day
HERON @TWITTER
1 stage 10 stages
3x reduction in cores and memory
Heron has been in production for 2 years
HERON USE CASES
REALTIME
ETL
REAL TIME
BI
SPAM
DETECTION
REAL TIME
TRENDS
REALTIME
ML
REAL TIME
MEDIA
REAL TIME
OPS
HERON BACK
PRESSURE
x
9
STRAGGLERS
BAD HOST EXECUTION
SKEW
INADEQUATE
PROVISIONING
b  Ñ
Stragglers are the norm in a multi-tenant distributed systems
APPROACHES TO HANDLE STRAGGLERS
SENDERS TO STRAGGLER DROP DATA
DETECT STRAGGLERS AND RESCHEDULE THEM
!
d
" SENDERS SLOW DOWN TO THE SPEED OF STRAGGLER
DROP DATA STRATEGY
UNPREDICTABLE AFFECTS
ACCURACY
POOR
VISIBILITY
b  Ñ
SLOW DOWN SENDER STRATEGY
PROVIDES
PREDICTABILITY
PROCESSES
DATA AT
MAXIMUM
RATE
REDUCE
RECOVERY
TIMES
HANDLES
TEMPORARY
SPIKES
/b  Ñ
back pressure
SLOW DOWN SENDER STRATEGY
% %
S1 B2 B3
%
B4
back pressure
S1 B2
B3
SLOW DOWN SENDERS STRATEGY
Stream
Manager
Stream
Manager
Stream
Manager
Stream
Manager
S1 B2
B3 B4
S1 B2
B3
S1 B2
B3 B4
B4
S1 S1
S1S1
BACK PRESSURE IN PRACTICE
Read pointer Write pointer
Manually restart the container
BACK PRESSURE IN PRACTICE
IN MOST SCENARIOS BACK PRESSURE RECOVERS
Without any manual intervention
SOMETIMES USER PREFER DROPPING OF DATA
Care about only latest data
!
d
"
SUSTAINED BACK PRESSURE
Irrecoverable GC cycles
Bad or faulty host
DETECTING BAD/FAULTY HOSTS
ENVIRONMENT'S SUPPORTED
STORM API
PRE- 1.0.0
POST 1.0.0
!
SUMMINGBIRD FOR HERON
CURIOUS TO LEARN MORE…
Twitter Heron: Stream Processing at Scale
Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg,
Sailesh Mittal, Jignesh M. Patel*,1
, Karthik Ramasamy, Siddarth Taneja
@sanjeevrk, @challenger_nik, @Louis_Fumaosong, @vikkyrk, @cckellogg,
@saileshmittal, @pateljm, @karthikz, @staneja
Twitter, Inc., *University of Wisconsin – Madison
ABSTRACT
Storm has long served as the main platform for real-time analytics
at Twitter. However, as the scale of data being processed in real-
time at Twitter has increased, along with an increase in the
diversity and the number of use cases, many limitations of Storm
have become apparent. We need a system that scales better, has
better debug-ability, has better performance, and is easier to
manage – all while working in a shared cluster infrastructure. We
considered various alternatives to meet these needs, and in the end
concluded that we needed to build a new real-time stream data
processing system. This paper presents the design and
implementation of this new system, called Heron. Heron is now
system process, which makes debugging very challenging. Thus, we
needed a cleaner mapping from the logical units of computation to
each physical process. The importance of such clean mapping for
debug-ability is really crucial when responding to pager alerts for a
failing topology, especially if it is a topology that is critical to the
underlying business model.
In addition, Storm needs dedicated cluster resources, which requires
special hardware allocation to run Storm topologies. This approach
leads to inefficiencies in using precious cluster resources, and also
limits the ability to scale on demand. We needed the ability to work
in a more flexible way with popular cluster scheduling software that
allows sharing the cluster resources across different types of data
Storm @Twitter
Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel*, Sanjeev Kulkarni,
Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, Dmitriy Ryaboy
@ankitoshniwal, @staneja, @amits, @karthikz, @pateljm, @sanjeevrk,
@jason_j, @krishnagade, @Louis_Fumaosong, @jakedonham, @challenger_nik, @saileshmittal, @squarecog
Twitter, Inc., *University of Wisconsin – Madison
Streaming@Twitter
Maosong Fu, Sailesh Mittal, Vikas Kedigehalli, Karthik Ramasamy, Michael Barry,
Andrew Jorgensen, Christopher Kellogg, Neng Lu, Bill Graham, Jingwei Wu
Twitter, Inc.
Abstract
Twitter generates tens of billions of events per hour when users interact with it. Analyzing these
events to surface relevant content and to derive insights in real time is a challenge. To address this, we
developed Heron, a new real time distributed streaming engine. In this paper, we first describe the design
goals of Heron and show how the Heron architecture achieves task isolation and resource reservation
to ease debugging, troubleshooting, and seamless use of shared cluster infrastructure with other critical
Twitter services. We subsequently explore how a topology self adjusts using back pressure so that the
pace of the topology goes as its slowest component. Finally, we outline how Heron implements at most
once and at least once semantics and we describe a few operational stories based on running Heron in
production.
1 Introduction
INTERESTED IN HERON?
CONTRIBUTIONS ARE WELCOME!
https://github.com/twitter/heron
http://heronstreaming.io
HERON IS OPEN SOURCED
FOLLOW US @HERONSTREAMING
QUESTIONS
and
ANSWERS
R
" Go ahead. Ask away.
Heron Hands On
BILL GRAHAM, MAOSONG FU
@BILLGRAHAM
@LOUIS_FUMAOSONG
#TwitterHeron
BEGIN
END
INSTALL
HERON
!
I
LAUNCH
TOPOLOGIES
(II
CONCLUSION
K
V
HERON
STARTER
ZIV
AGENDA OUTLINE
EXPLORE UI
bIII
HELP LINE
SLACK INVITE
heronusers.herokuapp.com
DOCUMENTATION
heronstreaming.io
INSTALL
HERON
x
9
HERON PREREQS
JAVA 7 OR ABOVE
PYTHON 2.7
LINUX PLATFORMS: INSTALL LIBUNWIND
INSTALL HERON CLIENT
STEP 1: DOWNLOAD CLIENT BINARIES
https://github.com/twitter/heron/releases/tag/0.14.2
heron-client-install-0.14.2-<platform>.sh
STEP 2: INSTALL CLIENT BINARIES
chmod +x heron-client-install-0.14.2-<platform>.sh
./heron-client-install-0.14.2-<platform>.sh —user
INSTALL HERON TOOLS
STEP 1: DOWNLOAD TOOLS BINARIES
https://github.com/twitter/heron/releases/tag/0.14.2
heron-tools-install-0.14.2-<platform>.sh
STEP 2: INSTALL TOOLS BINARIES
chmod +x heron-tools-install-0.14.2-<platform>.sh
./heron-tools-install-0.14.2-<platform>.sh —user
LAUNCH EXAMPLE TOPOLOGIES
STEP 1: SUBMIT TOPOLOGY
heron submit local ~/.heron/examples/heron-examples.jar 
com.twitter.heron.examples.ExclamationTopology 
ExclamationTopology
STEP 2: SUBMIT YET ANOTHER TOPOLOGY
heron submit local ~/.heron/examples/heron-examples.jar 
com.twitter.heron.examples.AckingTopology 
AckingTopology
STEP 3: SUBMIT YET ANOTHER ANOTHER TOPOLOGY
heron submit local ~/.heron/examples/heron-examples.jar 
com.twitter.heron.examples.MultiStageAckingTopology 
MultiStageAckingTopology
LAUNCH HERON TOOLS
STEP 1: LAUNCH HERON TRACKER
heron-tracker
STEP 2: LAUNCH HERON UI
heron-ui
http://localhost:8888
http://localhost:8889
UNDER THE COVERS
EXPLORE UI
STEP 1: LOGICAL PLAN AND PHYSICAL PLAN
STEP 2: EXPLORE DASHBOARD
STEP 3: VIEW METRICS
EXPLORE UI
STEP 4: VIEW PROCESS LOGS
STEP 5: VIEW EXCEPTIONS
STEP 6: EXPLORE OTHER FEATURES
HERON STARTER
SEVERAL STARTER EXAMPLES
https://github.com/kramasamy/heron-starter
INTERESTING EXERCISES
Trending Twitter Words
Trending Twitter Hashtags
Find top retweeters for given hash tag
QUESTIONS?
https://groups.google.com/forum/#!forum/heron-users
http://heronstreaming.io
FOR UPDATES FOLLOW
@heronstreaming
#
#ThankYou

Weitere ähnliche Inhalte

Andere mochten auch

Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Data Con LA
 
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive
 
Handson with Twitter Heron
Handson with Twitter HeronHandson with Twitter Heron
Handson with Twitter Heron
Data Engineers Guild Meetup Group
 
Social Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroSocial Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve Omohundro
The Hive
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
Karthik Ramasamy
 
Sidechain talk
Sidechain talkSidechain talk
Sidechain talk
jojva
 
Financial Modelling - Skyline and Luge queue time
Financial Modelling - Skyline and Luge queue timeFinancial Modelling - Skyline and Luge queue time
Financial Modelling - Skyline and Luge queue time
Singapore Management University
 
The Hive Think Tank: Sidechains by Adam Back, President of Blockstream
The Hive Think Tank: Sidechains by Adam Back, President of BlockstreamThe Hive Think Tank: Sidechains by Adam Back, President of Blockstream
The Hive Think Tank: Sidechains by Adam Back, President of Blockstream
The Hive
 
Bitcoin Cold-Storage With Bit-Card And BIP38
Bitcoin Cold-Storage With Bit-Card And BIP38Bitcoin Cold-Storage With Bit-Card And BIP38
Bitcoin Cold-Storage With Bit-Card And BIP38
Brian Fabian Crain
 
The Future of and Alternatives to Bitcoin
The Future of and Alternatives to BitcoinThe Future of and Alternatives to Bitcoin
The Future of and Alternatives to Bitcoin
Stockholm School of Economics
 
Sidechains Presentation
Sidechains PresentationSidechains Presentation
Sidechains Presentation
Brian Fabian Crain
 
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
The Hive
 
Bitcoin economics brian crain
Bitcoin economics   brian crainBitcoin economics   brian crain
Bitcoin economics brian crain
Brian Fabian Crain
 
Harvard Endowment Fund
Harvard Endowment Fund Harvard Endowment Fund
Harvard Endowment Fund
Singapore Management University
 
Bitcoin 2.0
Bitcoin 2.0 Bitcoin 2.0
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
The Hive
 

Andere mochten auch (19)

Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
Big Data Day LA 2016/ Big Data Track - Twitter Heron @ Scale - Karthik Ramasa...
 
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 
Handson with Twitter Heron
Handson with Twitter HeronHandson with Twitter Heron
Handson with Twitter Heron
 
Social Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve OmohundroSocial Impact & Ethics of AI by Steve Omohundro
Social Impact & Ethics of AI by Steve Omohundro
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
 
Sidechain talk
Sidechain talkSidechain talk
Sidechain talk
 
Financial Modelling - Skyline and Luge queue time
Financial Modelling - Skyline and Luge queue timeFinancial Modelling - Skyline and Luge queue time
Financial Modelling - Skyline and Luge queue time
 
The Hive Think Tank: Sidechains by Adam Back, President of Blockstream
The Hive Think Tank: Sidechains by Adam Back, President of BlockstreamThe Hive Think Tank: Sidechains by Adam Back, President of Blockstream
The Hive Think Tank: Sidechains by Adam Back, President of Blockstream
 
Bitcoin Cold-Storage With Bit-Card And BIP38
Bitcoin Cold-Storage With Bit-Card And BIP38Bitcoin Cold-Storage With Bit-Card And BIP38
Bitcoin Cold-Storage With Bit-Card And BIP38
 
The Future of and Alternatives to Bitcoin
The Future of and Alternatives to BitcoinThe Future of and Alternatives to Bitcoin
The Future of and Alternatives to Bitcoin
 
Sidechains Presentation
Sidechains PresentationSidechains Presentation
Sidechains Presentation
 
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
“ High Precision Analytics for Healthcare: Promises and Challenges” by Sriram...
 
Bitcoin economics brian crain
Bitcoin economics   brian crainBitcoin economics   brian crain
Bitcoin economics brian crain
 
Harvard Endowment Fund
Harvard Endowment Fund Harvard Endowment Fund
Harvard Endowment Fund
 
Bitcoin 2.0
Bitcoin 2.0 Bitcoin 2.0
Bitcoin 2.0
 
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
"The Future of Manufacturing" by Sujeet Chand, SVP&CTO, Rockwell Automation
 

Ähnlich wie The Hive Think Tank: Heron at Twitter

Evolution of The Twitter Stack
Evolution of The Twitter StackEvolution of The Twitter Stack
Evolution of The Twitter Stack
Chris Aniszczyk
 
Real Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik RamasamyReal Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik Ramasamy
Data Con LA
 
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Karthik Ramasamy
 
Storm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperStorm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paper
Karthik Ramasamy
 
QCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark StreamingQCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark Streaming
Paco Nathan
 
IRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using Cobweb
IRJET Journal
 
s2gx2015 who needs batch
s2gx2015 who needs batchs2gx2015 who needs batch
s2gx2015 who needs batch
Gunnar Hillert
 
From Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of StreamsFrom Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of Streams
Mike Fowler
 
Systems Bioinformatics Workshop Keynote
Systems Bioinformatics Workshop KeynoteSystems Bioinformatics Workshop Keynote
Systems Bioinformatics Workshop Keynote
Deepak Singh
 
The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter
The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of TwitterThe Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter
The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter
The Hive
 
Applying principles of chaos engineering to serverless (ServerlessCPH)
Applying principles of chaos engineering to serverless (ServerlessCPH)Applying principles of chaos engineering to serverless (ServerlessCPH)
Applying principles of chaos engineering to serverless (ServerlessCPH)
Yan Cui
 
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network ModelsActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
Minsuk Kahng
 
Making friends with big data resource links
Making friends with big data resource linksMaking friends with big data resource links
Making friends with big data resource links
Heather Stark
 
Twitter Heron. Evolution or Revolution
Twitter Heron. Evolution or Revolution Twitter Heron. Evolution or Revolution
Twitter Heron. Evolution or Revolution
Grzegorz Kolpuc
 
Experiments in Data Portability 2
Experiments in Data Portability 2Experiments in Data Portability 2
Experiments in Data Portability 2
Glenn Jones
 
Reanimating DevOps to Build Things that Work
Reanimating DevOps to Build Things that WorkReanimating DevOps to Build Things that Work
Reanimating DevOps to Build Things that Work
DevOpsDays Baltimore
 
Ingesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseIngesting streaming data into Graph Database
Ingesting streaming data into Graph Database
Guido Schmutz
 
Observability
ObservabilityObservability
Observability
Ebru Cucen Çüçen
 
Observability
ObservabilityObservability
Observability
Ebru Cucen Çüçen
 
Velocity 2015-final
Velocity 2015-finalVelocity 2015-final
Velocity 2015-final
Arun Kejariwal
 

Ähnlich wie The Hive Think Tank: Heron at Twitter (20)

Evolution of The Twitter Stack
Evolution of The Twitter StackEvolution of The Twitter Stack
Evolution of The Twitter Stack
 
Real Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik RamasamyReal Time Processing Using Twitter Heron by Karthik Ramasamy
Real Time Processing Using Twitter Heron by Karthik Ramasamy
 
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
Twitter's Real Time Stack - Processing Billions of Events Using Distributed L...
 
Storm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperStorm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paper
 
QCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark StreamingQCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark Streaming
 
IRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using CobwebIRJET - Twitter Spam Detection using Cobweb
IRJET - Twitter Spam Detection using Cobweb
 
s2gx2015 who needs batch
s2gx2015 who needs batchs2gx2015 who needs batch
s2gx2015 who needs batch
 
From Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of StreamsFrom Warehouses to Lakes: The Value of Streams
From Warehouses to Lakes: The Value of Streams
 
Systems Bioinformatics Workshop Keynote
Systems Bioinformatics Workshop KeynoteSystems Bioinformatics Workshop Keynote
Systems Bioinformatics Workshop Keynote
 
The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter
The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of TwitterThe Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter
The Hive Think Tank: "Stream Processing Systems" by Karthik Ramasamy of Twitter
 
Applying principles of chaos engineering to serverless (ServerlessCPH)
Applying principles of chaos engineering to serverless (ServerlessCPH)Applying principles of chaos engineering to serverless (ServerlessCPH)
Applying principles of chaos engineering to serverless (ServerlessCPH)
 
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network ModelsActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
 
Making friends with big data resource links
Making friends with big data resource linksMaking friends with big data resource links
Making friends with big data resource links
 
Twitter Heron. Evolution or Revolution
Twitter Heron. Evolution or Revolution Twitter Heron. Evolution or Revolution
Twitter Heron. Evolution or Revolution
 
Experiments in Data Portability 2
Experiments in Data Portability 2Experiments in Data Portability 2
Experiments in Data Portability 2
 
Reanimating DevOps to Build Things that Work
Reanimating DevOps to Build Things that WorkReanimating DevOps to Build Things that Work
Reanimating DevOps to Build Things that Work
 
Ingesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseIngesting streaming data into Graph Database
Ingesting streaming data into Graph Database
 
Observability
ObservabilityObservability
Observability
 
Observability
ObservabilityObservability
Observability
 
Velocity 2015-final
Velocity 2015-finalVelocity 2015-final
Velocity 2015-final
 

Mehr von The Hive

"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead
The Hive
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
The Hive
 
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTDigital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
The Hive
 
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
The Hive
 
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
The Hive
 
AI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseAI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the Enterprise
The Hive
 
The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.
The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.
The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.
The Hive
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank:  Rocking the Database World with RocksDBThe Hive Think Tank:  Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive
 
The Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQL
The Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQLThe Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQL
The Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQL
The Hive
 
The Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapR
The Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapRThe Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapR
The Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapR
The Hive
 
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
The Hive
 

Mehr von The Hive (14)

"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead"Responsible AI", by Charlie Muirhead
"Responsible AI", by Charlie Muirhead
 
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
Translating a Trillion Points of Data into Therapies, Diagnostics, and New In...
 
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoTDigital Transformation; Digital Twins for Delivering Business Value in IIoT
Digital Transformation; Digital Twins for Delivering Business Value in IIoT
 
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
Quantum Computing (IBM Q) - Hive Think Tank Event w/ Dr. Bob Sutor - 02.22.18
 
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
The Hive Think Tank: Rendezvous Architecture Makes Machine Learning Logistics...
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
AI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the EnterpriseAI in Software for Augmenting Intelligence Across the Enterprise
AI in Software for Augmenting Intelligence Across the Enterprise
 
The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.
The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.
The Hive Think Tank: Ceph + RocksDB by Sage Weil, Red Hat.
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank:  Rocking the Database World with RocksDBThe Hive Think Tank:  Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
The Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQL
The Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQLThe Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQL
The Hive Think Tank: Stream Processing Systems by Nikita Shamgunov of MemSQL
 
The Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapR
The Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapRThe Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapR
The Hive Think Tank: "Stream Processing Systems" by M.C. Srivas of MapR
 
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
Advanced Visual Analytics and Real-time Analytics at Platform scale by Brian ...
 

Kürzlich hochgeladen

Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 

Kürzlich hochgeladen (20)

Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 

The Hive Think Tank: Heron at Twitter

  • 1. Twitter Heron In Practice KARTHIK RAMASAMY @KARTHIKZ #TwitterHeron MAOSONG FU @LOUIS_FUMAOSONG BILL GRAHAM @BILLGRAHAM
  • 5. STORM/HERON TERMINOLOGY TOPOLOGY Directed acyclic graph Vertices=computation, and edges=streams of data tuples SPOUTS Sources of data tuples for the topology Examples - Kafka/Kestrel/MySQL/Postgres BOLTS Process incoming tuples and emit outgoing tuples Examples - filtering/aggregation/join/arbitrary function , %
  • 6. STORM/HERON TOPOLOGY % % % % % SPOUT 1 SPOUT 2 BOLT 1 BOLT 2 BOLT 3 BOLT 4 BOLT 5
  • 7. WHY HERON? PERFORMANCE PREDICTABILITY EASE OF MANAGEABILITY ! d " IMPROVE DEVELOPER PRODUCTIVITY
  • 9. TOPOLOGY ARCHITECTURE Topology Master ZK CLUSTER Stream Manager I1 I2 I3 I4 Stream Manager I1 I2 I3 I4 Logical Plan, Physical Plan and Execution State Sync Physical Plan CONTAINER CONTAINER Metrics Manager Metrics Manager
  • 12. Large amount of data produced every day Large cluster Several hundred topologies deployed Several billion messages every day HERON @TWITTER 1 stage 10 stages 3x reduction in cores and memory Heron has been in production for 2 years
  • 13. HERON USE CASES REALTIME ETL REAL TIME BI SPAM DETECTION REAL TIME TRENDS REALTIME ML REAL TIME MEDIA REAL TIME OPS
  • 15. STRAGGLERS BAD HOST EXECUTION SKEW INADEQUATE PROVISIONING b Ñ Stragglers are the norm in a multi-tenant distributed systems
  • 16. APPROACHES TO HANDLE STRAGGLERS SENDERS TO STRAGGLER DROP DATA DETECT STRAGGLERS AND RESCHEDULE THEM ! d " SENDERS SLOW DOWN TO THE SPEED OF STRAGGLER
  • 17. DROP DATA STRATEGY UNPREDICTABLE AFFECTS ACCURACY POOR VISIBILITY b Ñ
  • 18. SLOW DOWN SENDER STRATEGY PROVIDES PREDICTABILITY PROCESSES DATA AT MAXIMUM RATE REDUCE RECOVERY TIMES HANDLES TEMPORARY SPIKES /b Ñ back pressure
  • 19. SLOW DOWN SENDER STRATEGY % % S1 B2 B3 % B4 back pressure
  • 20. S1 B2 B3 SLOW DOWN SENDERS STRATEGY Stream Manager Stream Manager Stream Manager Stream Manager S1 B2 B3 B4 S1 B2 B3 S1 B2 B3 B4 B4 S1 S1 S1S1
  • 21. BACK PRESSURE IN PRACTICE Read pointer Write pointer Manually restart the container
  • 22. BACK PRESSURE IN PRACTICE IN MOST SCENARIOS BACK PRESSURE RECOVERS Without any manual intervention SOMETIMES USER PREFER DROPPING OF DATA Care about only latest data ! d " SUSTAINED BACK PRESSURE Irrecoverable GC cycles Bad or faulty host
  • 24. ENVIRONMENT'S SUPPORTED STORM API PRE- 1.0.0 POST 1.0.0 ! SUMMINGBIRD FOR HERON
  • 25. CURIOUS TO LEARN MORE… Twitter Heron: Stream Processing at Scale Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel*,1 , Karthik Ramasamy, Siddarth Taneja @sanjeevrk, @challenger_nik, @Louis_Fumaosong, @vikkyrk, @cckellogg, @saileshmittal, @pateljm, @karthikz, @staneja Twitter, Inc., *University of Wisconsin – Madison ABSTRACT Storm has long served as the main platform for real-time analytics at Twitter. However, as the scale of data being processed in real- time at Twitter has increased, along with an increase in the diversity and the number of use cases, many limitations of Storm have become apparent. We need a system that scales better, has better debug-ability, has better performance, and is easier to manage – all while working in a shared cluster infrastructure. We considered various alternatives to meet these needs, and in the end concluded that we needed to build a new real-time stream data processing system. This paper presents the design and implementation of this new system, called Heron. Heron is now system process, which makes debugging very challenging. Thus, we needed a cleaner mapping from the logical units of computation to each physical process. The importance of such clean mapping for debug-ability is really crucial when responding to pager alerts for a failing topology, especially if it is a topology that is critical to the underlying business model. In addition, Storm needs dedicated cluster resources, which requires special hardware allocation to run Storm topologies. This approach leads to inefficiencies in using precious cluster resources, and also limits the ability to scale on demand. We needed the ability to work in a more flexible way with popular cluster scheduling software that allows sharing the cluster resources across different types of data Storm @Twitter Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel*, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, Dmitriy Ryaboy @ankitoshniwal, @staneja, @amits, @karthikz, @pateljm, @sanjeevrk, @jason_j, @krishnagade, @Louis_Fumaosong, @jakedonham, @challenger_nik, @saileshmittal, @squarecog Twitter, Inc., *University of Wisconsin – Madison Streaming@Twitter Maosong Fu, Sailesh Mittal, Vikas Kedigehalli, Karthik Ramasamy, Michael Barry, Andrew Jorgensen, Christopher Kellogg, Neng Lu, Bill Graham, Jingwei Wu Twitter, Inc. Abstract Twitter generates tens of billions of events per hour when users interact with it. Analyzing these events to surface relevant content and to derive insights in real time is a challenge. To address this, we developed Heron, a new real time distributed streaming engine. In this paper, we first describe the design goals of Heron and show how the Heron architecture achieves task isolation and resource reservation to ease debugging, troubleshooting, and seamless use of shared cluster infrastructure with other critical Twitter services. We subsequently explore how a topology self adjusts using back pressure so that the pace of the topology goes as its slowest component. Finally, we outline how Heron implements at most once and at least once semantics and we describe a few operational stories based on running Heron in production. 1 Introduction
  • 26. INTERESTED IN HERON? CONTRIBUTIONS ARE WELCOME! https://github.com/twitter/heron http://heronstreaming.io HERON IS OPEN SOURCED FOLLOW US @HERONSTREAMING
  • 28. Heron Hands On BILL GRAHAM, MAOSONG FU @BILLGRAHAM @LOUIS_FUMAOSONG #TwitterHeron
  • 32. HERON PREREQS JAVA 7 OR ABOVE PYTHON 2.7 LINUX PLATFORMS: INSTALL LIBUNWIND
  • 33. INSTALL HERON CLIENT STEP 1: DOWNLOAD CLIENT BINARIES https://github.com/twitter/heron/releases/tag/0.14.2 heron-client-install-0.14.2-<platform>.sh STEP 2: INSTALL CLIENT BINARIES chmod +x heron-client-install-0.14.2-<platform>.sh ./heron-client-install-0.14.2-<platform>.sh —user
  • 34. INSTALL HERON TOOLS STEP 1: DOWNLOAD TOOLS BINARIES https://github.com/twitter/heron/releases/tag/0.14.2 heron-tools-install-0.14.2-<platform>.sh STEP 2: INSTALL TOOLS BINARIES chmod +x heron-tools-install-0.14.2-<platform>.sh ./heron-tools-install-0.14.2-<platform>.sh —user
  • 35. LAUNCH EXAMPLE TOPOLOGIES STEP 1: SUBMIT TOPOLOGY heron submit local ~/.heron/examples/heron-examples.jar com.twitter.heron.examples.ExclamationTopology ExclamationTopology STEP 2: SUBMIT YET ANOTHER TOPOLOGY heron submit local ~/.heron/examples/heron-examples.jar com.twitter.heron.examples.AckingTopology AckingTopology STEP 3: SUBMIT YET ANOTHER ANOTHER TOPOLOGY heron submit local ~/.heron/examples/heron-examples.jar com.twitter.heron.examples.MultiStageAckingTopology MultiStageAckingTopology
  • 36. LAUNCH HERON TOOLS STEP 1: LAUNCH HERON TRACKER heron-tracker STEP 2: LAUNCH HERON UI heron-ui http://localhost:8888 http://localhost:8889
  • 38. EXPLORE UI STEP 1: LOGICAL PLAN AND PHYSICAL PLAN STEP 2: EXPLORE DASHBOARD STEP 3: VIEW METRICS
  • 39. EXPLORE UI STEP 4: VIEW PROCESS LOGS STEP 5: VIEW EXCEPTIONS STEP 6: EXPLORE OTHER FEATURES
  • 40. HERON STARTER SEVERAL STARTER EXAMPLES https://github.com/kramasamy/heron-starter INTERESTING EXERCISES Trending Twitter Words Trending Twitter Hashtags Find top retweeters for given hash tag