Low-Latency Streaming
Data Processing in Hadoop
InSemble Inc.
http://www.insemble.com
Agenda
1. Reference Architecture for Low-Latency Streaming
2. Flume
3. Kafka
4. Storm
5. Demo
Hadoop Ecosystem
Source: Apache Hadoop Documentation
Cloudera Platform
Hortonworks Data Platform (HDP)
Real-Time Stream Processing Architecture with Hadoop
Flume Architecture
• A distributed system for collecting and aggregating data from multiple sources into a centralized data store
• An agent is a JVM process that hosts the Flume components (source, channel, sink)
• A channel stores each event until a sink picks it up
• Several types of Flume sources are available
• Source and sink are decoupled
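The decoupling of source and sink through a channel can be illustrated with a minimal Python sketch. This is not Flume code and does not use the Flume API: `MemoryChannel` and `drain` are hypothetical names, and a bounded in-memory buffer is assumed as a stand-in for a Flume memory channel.

```python
from collections import deque


class MemoryChannel:
    """Hypothetical stand-in for a Flume memory channel: a bounded FIFO buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        # Source side: refuse the event when the channel is full.
        if len(self._events) >= self.capacity:
            return False
        self._events.append(event)
        return True

    def take(self):
        # Sink side: remove and return the oldest event, or None if empty.
        return self._events.popleft() if self._events else None

    def __len__(self):
        return len(self._events)


def drain(channel, deliver):
    """Sink loop: take every buffered event and hand it to `deliver`."""
    count = 0
    event = channel.take()
    while event is not None:
        deliver(event)
        count += 1
        event = channel.take()
    return count


channel = MemoryChannel(capacity=3)

# The source writes to the channel without knowing anything about the sink.
accepted = [channel.put(e) for e in ["e1", "e2", "e3", "e4"]]

# Events stay in the channel until a sink picks them up.
delivered = []
drained = drain(channel, delivered.append)
```

Because the source only ever talks to the channel, the sink can be slower, restarted, or replaced without changing the source.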
Consolidation Architecture
Multiplexing Architecture
Kafka Introduction
• A distributed, partitioned, and replicated messaging system
• Kafka brokers run as a cluster
• Producers and consumers can be written in any language
Topic
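• Each partition of a topic is an ordered, immutable sequence of messages; each message is identified by a sequence number (its offset)
• Messages are retained for a configurable period of time
• Each consumer controls its own “offset”, that is, its position in the partition
• Each partition is replicated and has one “leader” and zero or more “followers”; reads and writes go only through the leader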
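A small Python sketch may help make the offset idea concrete. It is a toy model under stated assumptions, not the Kafka client API: `PartitionLog` and `Consumer` are hypothetical names, a plain list stands in for one partition, and retention and replication are left out.

```python
class PartitionLog:
    """Hypothetical append-only log standing in for one topic partition."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        # Messages are only ever appended; the offset is the list position.
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        # Reading never removes anything from the log.
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Each consumer tracks its own offset into the log."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_messages=10):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        # The consumer may rewind to re-read older messages.
        self.offset = offset


log = PartitionLog()
offsets = [log.append(m) for m in ["a", "b", "c"]]

fast = Consumer(log)
slow = Consumer(log)

first = fast.poll()      # reads everything currently in the log
partial = slow.poll(1)   # reads a single message
fast.seek(1)             # rewind and replay from offset 1
replayed = fast.poll()
```

The two consumers progress independently because the position lives in the consumer, not in the log.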
Producers and Consumers
• The producer controls which partition each message goes to
• Supports both queuing and pub/sub
– Through an abstraction called the consumer group
• Ordering is guaranteed within a partition
– For a subscriber to see messages in order, only one consumer in its group may read that partition
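The partitioning and consumer-group ideas above can be sketched in plain Python. This is an illustration only, not Kafka's client library: `choose_partition` and `assign_partitions` are hypothetical helpers, `zlib.crc32` is assumed as a deterministic stand-in for whatever hash a real producer uses, and round-robin assignment is just one possible strategy.

```python
import zlib


def choose_partition(key, num_partitions):
    # crc32 is a deterministic stand-in, not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group (round-robin)."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


NUM_PARTITIONS = 4
partitions = {p: [] for p in range(NUM_PARTITIONS)}

# Producer side: messages with the same key land in the same partition.
messages = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-1", "logout"),
]
for key, value in messages:
    partitions[choose_partition(key, NUM_PARTITIONS)].append((key, value))

# Queue-style: one group, partitions are split between its members.
queue_group = assign_partitions(list(partitions), ["c1", "c2"])

# Pub/sub-style: two groups, each group receives every partition.
group_a = assign_partitions(list(partitions), ["a1"])
group_b = assign_partitions(list(partitions), ["b1"])
```

Within one group a partition has a single reader, which is what preserves per-partition ordering; separate groups each get their own copy of the stream.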
Storm Introduction
• Distributed real-time computation system
– Processes unbounded streams of data
– Can be used with multiple programming languages
– Scalable, fault-tolerant, and guarantees that data will be processed
• Use Cases
– Real-time analytics, online machine learning
– Continuous computation
– Distributed RPC
– ETL
• Concepts
– Topology
– Spouts
– Bolts
Concepts
• Storm Cluster
– Master node (Nimbus)
  • Distributes code
  • Assigns tasks to machines
  • Monitors for failures
– Worker nodes (Supervisor)
  • Start and stop worker processes
  • Each worker process executes a subset of a topology
– Zookeeper
  • Coordinates between Nimbus and the Supervisors
  • Nimbus and the Supervisors are completely stateless
  • State is kept in Zookeeper or on local disk
Details
• Stream
– Unbounded sequence of tuples
• Spout (user-written logic)
– Source of a stream; emits tuples
• Bolt (user-written logic)
– Processes input streams and emits new tuples
• Topology
– DAG of spouts and bolts
– Submitted to a Storm cluster for execution
– Each node runs in parallel, and the degree of parallelism is configurable
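The spout, bolt, and topology roles can be mimicked with Python generators. This is a conceptual sketch, not the Storm API: the function names are hypothetical, the stream is finite so the example terminates, and everything runs sequentially in one process rather than in parallel on a cluster.

```python
from collections import Counter


def sentence_spout():
    """Spout: source of the stream. A real stream is unbounded; this one is finite."""
    for sentence in ["storm processes streams", "kafka feeds storm"]:
        yield (sentence,)


def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Bolt: keeps a running count and emits (word, count) after each tuple."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


# The topology is the wiring: spout -> split_bolt -> count_bolt.
emitted = list(count_bolt(split_bolt(sentence_spout())))

# The last count emitted for each word is its final count.
final_counts = dict(emitted)
```

Each function only sees tuples from its upstream neighbour, which is the property that lets a real cluster run many copies of each node side by side.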
Stream groupings
• A stream grouping tells a topology how to send tuples between two components
• Because tasks run in parallel, the grouping determines which task receives each tuple
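Two groupings that Storm offers, shuffle grouping and fields grouping, can be sketched as routing functions. The code below is illustrative Python, not Storm's implementation: the helper names are hypothetical, round-robin is assumed as a deterministic stand-in for Storm's randomized but evenly balanced shuffle, and `zlib.crc32` stands in for the real hash.

```python
import itertools
import zlib


def shuffle_grouping(num_tasks):
    """Spread tuples evenly over tasks (round-robin stand-in for random shuffle)."""
    counter = itertools.count()
    return lambda tup: next(counter) % num_tasks


def fields_grouping(field_index, num_tasks):
    """Route tuples with the same field value to the same task."""
    def route(tup):
        value = str(tup[field_index]).encode("utf-8")
        return zlib.crc32(value) % num_tasks
    return route


tuples = [("storm",), ("kafka",), ("storm",), ("flume",), ("storm",), ("kafka",)]

shuffle = shuffle_grouping(num_tasks=3)
by_word = fields_grouping(field_index=0, num_tasks=3)

# Task index chosen for each tuple under each grouping.
shuffle_targets = [shuffle(t) for t in tuples]
field_targets = [by_word(t) for t in tuples]
```

A counting bolt needs the fields-style routing, since every occurrence of a word must reach the task that holds that word's count; a stateless parsing bolt can use the shuffle style.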
Demo - Twitter TopN Trending Topics
• Use the Flume Twitter source to ingest data and publish events to a Kafka topic
• Use Storm as a real-time event processing system to calculate the TopN trending topics
• Use Redis to store the TopN result
• Use Node.js/jQuery for visualization
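The TopN calculation at the heart of the demo can be sketched in a few lines of Python. This is not the demo's code: it assumes that a "topic" is a hashtag, `parse_hashtags` and `top_n` are hypothetical helpers, the tweets are made-up sample strings, and Flume, Kafka, Redis, and the visualization layer are left out entirely.

```python
import re
from collections import Counter

# Assumption: a trending "topic" is a hashtag in the tweet text.
HASHTAG = re.compile(r"#(\w+)")


def parse_hashtags(tweet_text):
    """Extract lower-cased hashtags from a tweet's text."""
    return [tag.lower() for tag in HASHTAG.findall(tweet_text)]


def top_n(counts, n):
    """Rank topics by count (descending), breaking ties alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up sample input standing in for the tweet stream.
tweets = [
    "Learning #Storm and #Kafka today",
    "#kafka makes a good buffer",
    "Streaming with #storm #KAFKA #flume",
]

counts = Counter()
for text in tweets:
    counts.update(parse_hashtags(text))

ranking = top_n(counts, 2)
```

In a streaming setting the counts would be updated continuously and the ranking recomputed periodically, rather than once over a fixed list.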
Flow Chart
Twitter → Flume Agent (Twitter Source → Mem Channel → Kafka Sink) → Kafka → Storm (Kafka Spout → Parse Twitter Bolt → Count Bolt → TopN Ranker Bolt → Report Bolt) → Redis → Node.js + jQuery
Flume Agent — Source
Flume Agent — Channel
Flume Agent — Sink
Storm Topology Design
Submit Topology to Cluster
ParseTweetBolt Code (three slides)
