SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Downloaden Sie, um offline zu lesen
APACHE STORM
Viet-Dung TRINH (Bill), 03/2016
Saltlux – Vietnam Development Center
Agenda
•  Overview
•  Core Storm Concepts
•  Components of Storm Cluster
•  Example
Overview
•  Apache Storm is a free and open source distributed real-time
computation system.
•  Storm makes it easy to reliably process unbounded streams of
data, doing for real-time processing what Hadoop did for batch
processing.
•  Storm is fast (million tuples processed/second/node)
•  Can be used with any programming language
Overview (cont)
•  Use cases:
•  Real-time analytics,
•  Online machine learning,
•  Continuous computation
•  …
•  Integration: with any queueing and any database system
such as:
•  Kafka
•  Kestrel
•  RabbitMG/ AMQP
•  JMS
•  Amazon Kinesis
Core Storm Concepts
•  Topology
•  Tuple
•  Stream
•  Spout
•  Bolt
•  Stream grouping
Core Storm Concepts: Topology (cont)
•  Topology: is a graph of computation, consits of NODEs
and EDGEs.
•  Nodes: represent some individual computations.
•  Edges: represent the data being passed between nodes.
Core Storm Concepts: Tuple (cont)
•  Nodes in topology send data in form of tuples
•  Tuple: is ordered list of values, where each value is
assigned a name
•  Processing of sending a tuple is called emitting tuple
Core Storm Concepts: Stream (cont)
•  Stream: is an unbounded sequence of tuples between two
nodes in topology.
•  A topology can contain any number of streams
Core Storm Concepts: Spout (cont)
•  Spout: is the source of stream in topology
•  Read data from external data source and emits tuples into
topology.
Core Storm Concepts: Bolt (cont)
•  Bolt: accepts a tuple from its input stream, performs some
computation or transformation – filtering, aggregation, join
– on tuple, and optional emits a new tuple(s)
Core Storm Concepts: Stream Grouping
•  Defines how tuples are sent between instance of spouts
and bolts.
•  Two most common groupings: shuffle grouping and fields
grouping
•  SHUFFLE GROUPING: type of stream grouping where
tuples are emitted to bolts at random.
•  FIELDS GROUPING: ensures that tuples with the same
value for a particular field name are always emitted to the
same bolt.
Components of Storm Cluster
•  Two kinds of nodes: Master and Worker
•  Master node runs daemon called Nimbus
•  Worker node runs daemon called Supervisor
•  All coordination between Nimbus and Supervisor is done
through Zookeeper.
Example: GitHub Commit Feed
Example: GitHub Commit Feed (cont)
•  Each commit comes into feed as single string containing
COMMIT_ID, followed by a SPACE, followed by EMAIL.
Breaking Down the Problem
•  Component: reads from live feed of
commits and produces single
commit message
•  Component: accepts single commit
message, extracts the developer’s
email from that commit, produces
email
•  Component: accepts developer’s
email and updates in-memory map
where key is email and value is
number of commits for that email.
Breaking Down the Problem (cont)
Tuples
•  Two types of tuple in
topology
•  COMMIT: contain
commit_id and email
•  EMAIL: developer
email
Spout
•  Listen to real-time feed of
commits being made to
repository
Bolts
•  1st Bolt: extracts developer’s
email
•  2nd Bolt: updates map of
emails to commit counts
References
[1]. Apache Storm, http://storm.apache.org
[2]. Sean T. Allen, Matthew Jankowski, Peter Pathirana,
Storm Applied, 2015
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop TechnologyManish Borkar
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationnathanmarz
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache KafkaPaul Brebner
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARNAdam Kawa
 
Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streamingdatamantra
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used forAljoscha Krettek
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark InternalsPietro Michiardi
 

Was ist angesagt? (20)

Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop Technology
 
Spark
SparkSpark
Spark
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Spark SQL
Spark SQLSpark SQL
Spark SQL
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
Cloudera Hadoop Distribution
Cloudera Hadoop DistributionCloudera Hadoop Distribution
Cloudera Hadoop Distribution
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache Kafka
 
Apache Hive - Introduction
Apache Hive - IntroductionApache Hive - Introduction
Apache Hive - Introduction
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Apache Hadoop YARN
Apache Hadoop YARNApache Hadoop YARN
Apache Hadoop YARN
 
Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streaming
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 

Ähnlich wie Apache Storm

Cleveland HUG - Storm
Cleveland HUG - StormCleveland HUG - Storm
Cleveland HUG - Stormjustinjleet
 
Ruby Microservices with RabbitMQ
Ruby Microservices with RabbitMQRuby Microservices with RabbitMQ
Ruby Microservices with RabbitMQZoran Majstorovic
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormDavorin Vukelic
 
Low Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in HadoopLow Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in HadoopInSemble
 
Storm presentation
Storm presentationStorm presentation
Storm presentationShyam Raj
 
Real-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesReal-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesOleksii Diagiliev
 
Simulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresSimulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresCloudLightning
 
Pune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPrashant Rane
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormEugene Dvorkin
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewPoo Kuan Hoong
 
Neural Networks with Google TensorFlow
Neural Networks with Google TensorFlowNeural Networks with Google TensorFlow
Neural Networks with Google TensorFlowDarshan Patel
 

Ähnlich wie Apache Storm (20)

Cleveland HUG - Storm
Cleveland HUG - StormCleveland HUG - Storm
Cleveland HUG - Storm
 
Apache Storm Internals
Apache Storm InternalsApache Storm Internals
Apache Storm Internals
 
Ruby Microservices with RabbitMQ
Ruby Microservices with RabbitMQRuby Microservices with RabbitMQ
Ruby Microservices with RabbitMQ
 
Storm
StormStorm
Storm
 
1 storm-intro
1 storm-intro1 storm-intro
1 storm-intro
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache Storm
 
Low Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in HadoopLow Latency Streaming Data Processing in Hadoop
Low Latency Streaming Data Processing in Hadoop
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
 
Real-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpacesReal-Time Big Data with Storm, Kafka and GigaSpaces
Real-Time Big Data with Storm, Kafka and GigaSpaces
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Simulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud InfrastructuresSimulation of Heterogeneous Cloud Infrastructures
Simulation of Heterogeneous Cloud Infrastructures
 
Apache Storm Tutorial
Apache Storm TutorialApache Storm Tutorial
Apache Storm Tutorial
 
Apache Storm Basics
Apache Storm BasicsApache Storm Basics
Apache Storm Basics
 
Pune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCD
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache Storm
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Storm
StormStorm
Storm
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An Overview
 
Neural Networks with Google TensorFlow
Neural Networks with Google TensorFlowNeural Networks with Google TensorFlow
Neural Networks with Google TensorFlow
 

Mehr von Nguyen Quang

Deep Reinforcement Learning
Deep Reinforcement LearningDeep Reinforcement Learning
Deep Reinforcement LearningNguyen Quang
 
Deep Dialog System Review
Deep Dialog System ReviewDeep Dialog System Review
Deep Dialog System ReviewNguyen Quang
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksNguyen Quang
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandraNguyen Quang
 
Web browser architecture
Web browser architectureWeb browser architecture
Web browser architectureNguyen Quang
 
X Query for beginner
X Query for beginnerX Query for beginner
X Query for beginnerNguyen Quang
 
Redistributable introtoscrum
Redistributable introtoscrumRedistributable introtoscrum
Redistributable introtoscrumNguyen Quang
 
Text categorization
Text categorizationText categorization
Text categorizationNguyen Quang
 
A holistic lexicon based approach to opinion mining
A holistic lexicon based approach to opinion miningA holistic lexicon based approach to opinion mining
A holistic lexicon based approach to opinion miningNguyen Quang
 

Mehr von Nguyen Quang (13)

Apache Zookeeper
Apache ZookeeperApache Zookeeper
Apache Zookeeper
 
Deep Reinforcement Learning
Deep Reinforcement LearningDeep Reinforcement Learning
Deep Reinforcement Learning
 
Deep Dialog System Review
Deep Dialog System ReviewDeep Dialog System Review
Deep Dialog System Review
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural Networks
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Web browser architecture
Web browser architectureWeb browser architecture
Web browser architecture
 
Eclipse orion
Eclipse orionEclipse orion
Eclipse orion
 
X Query for beginner
X Query for beginnerX Query for beginner
X Query for beginner
 
Html 5
Html 5Html 5
Html 5
 
Redistributable introtoscrum
Redistributable introtoscrumRedistributable introtoscrum
Redistributable introtoscrum
 
Text categorization
Text categorizationText categorization
Text categorization
 
A holistic lexicon based approach to opinion mining
A holistic lexicon based approach to opinion miningA holistic lexicon based approach to opinion mining
A holistic lexicon based approach to opinion mining
 
Overview of NoSQL
Overview of NoSQLOverview of NoSQL
Overview of NoSQL
 

Kürzlich hochgeladen

Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburgmasabamasaba
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsBert Jan Schrijver
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfkalichargn70th171
 

Kürzlich hochgeladen (20)

Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 

Apache Storm

  • 1. APACHE STORM Viet-Dung TRINH (Bill), 03/2016 Saltlux – Vietnam Development Center
  • 2. Agenda •  Overview •  Core Storm Concepts •  Components of Storm Cluster •  Example
  • 3. Overview •  Apache Storm is a free and open source distributed real-time computation system. •  Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. •  Storm is fast (million tuples processed/second/node) •  Can be used with any programming language
  • 4. Overview (cont) •  Use cases: •  Real-time analytics, •  Online machine learning, •  Continuous computation •  … •  Integration: with any queueing and any database system such as: •  Kafka •  Kestrel •  RabbitMG/ AMQP •  JMS •  Amazon Kinesis
  • 5. Core Storm Concepts •  Topology •  Tuple •  Stream •  Spout •  Bolt •  Stream grouping
  • 6. Core Storm Concepts: Topology (cont) •  Topology: is a graph of computation, consits of NODEs and EDGEs. •  Nodes: represent some individual computations. •  Edges: represent the data being passed between nodes.
  • 7. Core Storm Concepts: Tuple (cont) •  Nodes in topology send data in form of tuples •  Tuple: is ordered list of values, where each value is assigned a name •  Processing of sending a tuple is called emitting tuple
  • 8. Core Storm Concepts: Stream (cont) •  Stream: is an unbounded sequence of tuples between two nodes in topology. •  A topology can contain any number of streams
  • 9. Core Storm Concepts: Spout (cont) •  Spout: is the source of stream in topology •  Read data from external data source and emits tuples into topology.
  • 10. Core Storm Concepts: Bolt (cont) •  Bolt: accepts a tuple from its input stream, performs some computation or transformation – filtering, aggregation, join – on tuple, and optional emits a new tuple(s)
  • 11. Core Storm Concepts: Stream Grouping •  Defines how tuples are sent between instance of spouts and bolts. •  Two most common groupings: shuffle grouping and fields grouping •  SHUFFLE GROUPING: type of stream grouping where tuples are emitted to bolts at random. •  FIELDS GROUPING: ensures that tuples with the same value for a particular field name are always emitted to the same bolt.
  • 12. Components of Storm Cluster •  Two kinds of nodes: Master and Worker •  Master node runs daemon called Nimbus •  Worker node runs daemon called Supervisor •  All coordination between Nimbus and Supervisor is done through Zookeeper.
  • 14. Example: GitHub Commit Feed (cont) •  Each commit comes into feed as single string containing COMMIT_ID, followed by a SPACE, followed by EMAIL.
  • 15. Breaking Down the Problem •  Component: reads from live feed of commits and produces single commit message •  Component: accepts single commit message, extracts the developer’s email from that commit, produces email •  Component: accepts developer’s email and updates in-memory map where key is email and value is number of commits for that email.
  • 16. Breaking Down the Problem (cont)
  • 17. Tuples •  Two types of tuple in topology •  COMMIT: contain commit_id and email •  EMAIL: developer email
  • 18. Spout •  Listen to real-time feed of commits being made to repository
  • 19. Bolts •  1st Bolt: extracts developer’s email •  2nd Bolt: updates map of emails to commit counts
  • 20. References [1]. Apache Storm, http://storm.apache.org [2]. Sean T. Allen, Matthew Jankowski, Peter Pathirana, Storm Applied, 2015