Building Streaming Applications with Apache Storm 1.1
Meetup
Hortonworks, April 20th, 2017
Presenters
• Sriharsha Chintalapani, Storm & Kafka Committer, PMC @ Hortonworks
• Karthik Deivasigamani, Walmart Labs
• Roshan Naik, Storm Contributor, Flume Committer @ Hortonworks
• Hugo Louro, Storm Committer, PMC @ Hortonworks
Apache Storm
Apache Storm Brief History
• 2010 – First Streaming Framework (BackType)
• 2011 – Acquired by and Deployed at Twitter
• 2013 – Entered the Apache Incubator
• Present – Large Scale Production Deployments
– Yahoo: 3500+ Nodes
– Alibaba: 1 PB of Data per Day
Prior Releases Highlights
• 0.9.x
– Storm becomes an Apache TLP
– First Official Apache Release
– Expanded Kafka, HDFS, HBase Integration
• 0.10.x
– Multi Tenancy
– Rolling Upgrades
– Improved Logging (Log4j2)
– JDBC, Event Hubs, Hive Integration
Prior Releases Highlights
• 1.0
– Pacemaker (Replaces ZooKeeper for Heartbeats)
– Security (Kerberos/Digest Authentication)
– Nimbus HA (Eliminates Single Point of Failure)
– Supervisor Health Checks
– Resource Aware Scheduler
Prior Releases Highlights
• 1.0
– Stateful Bolts
– Automatic Checkpointing/Snapshots
• ABS [2], Chandy-Lamport [3] Algorithms
– Streaming Windows
• Sliding, Tumbling, Watermarks, Out of Order Tuples
– Dynamic Log Levels
– Distributed Log Search
– Worker Profiling
– Solr, Cassandra, Elasticsearch, MQTT Integration
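The difference between the tumbling and sliding windows listed above can be sketched in plain Java. This is only an illustration of the window semantics over a count-based window; `WindowSketch` and its `windows` method are hypothetical names, it does not use Storm's windowing API, and it emits full windows only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating window semantics; not part of Storm.
class WindowSketch {
    // Groups events into windows of `length` events, opening a new window
    // every `slide` events. slide == length yields tumbling (non-overlapping)
    // windows; slide < length yields sliding (overlapping) windows.
    static List<List<Integer>> windows(List<Integer> events, int length, int slide) {
        if (length <= 0 || slide <= 0) {
            throw new IllegalArgumentException("length and slide must be positive");
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int start = 0; start + length <= events.size(); start += slide) {
            result.add(new ArrayList<>(events.subList(start, start + length)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> events = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(windows(events, 3, 3)); // tumbling: [[1, 2, 3], [4, 5, 6]]
        System.out.println(windows(events, 3, 2)); // sliding:  [[1, 2, 3], [3, 4, 5]]
    }
}
```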
Apache Storm 1.1.0
March 29, 2017
• Streaming SQL
• Improved Apache Kafka Integration
• PMML Support (Machine Learning)
• Druid Integration
• OpenTSDB Integration
Apache Storm 1.1.0
March 29, 2017
• AWS Kinesis Support
• HDFS Spout
• Other Enhancements
– Flux
– Topology Deployment
– Resource Aware Scheduler
Streaming SQL
• Apache Calcite for Query Parsing/Planning
• Define Topology Using SQL-Like Query
• SQL Compiled and Transformed onto a Trident Topology
• Streaming Onto/From Arbitrary Data Sources
– Kafka, Redis, HDFS, MongoDB
– Extensible by Implementing ISqlTridentDataSource
Streaming SQL
• Tuple Filtering
• Projections
• CSV, TSV, and Avro input/output formats
• User Defined Functions (UDFs)
• Fine-Grained User Control over Parallelism of Generated Components
Streaming SQL - Aggregate UDF
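The code shown on this slide is not preserved in the text. As a minimal sketch, assuming the static `init` / `add` / `result` method convention that Storm SQL uses for aggregate functions, a summing UDF might look like this. `MySum` is a hypothetical class name.

```java
// Hypothetical aggregate UDF that sums integer values.
// In a real project the class would be public and live in its own file.
class MySum {
    // Starting value of the accumulator.
    public static Integer init() {
        return 0;
    }

    // Folds one incoming value into the accumulator.
    public static Integer add(Integer accumulator, Integer val) {
        return accumulator + val;
    }

    // Produces the final aggregate from the accumulator.
    public static Integer result(Integer accumulator) {
        return accumulator;
    }
}
```

Such a class would then be registered with a `CREATE FUNCTION ... AS '<fully qualified class name>'` statement before being used in a query; the exact statement depends on the package the class is placed in.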
Streaming SQL – Example [1]
• Read Apache HTTPD server logs from Kafka
• Filter out everything but error log events
• Write the error events onto a Kafka topic
Streaming SQL – Example [1]
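The SQL shown on this slide is not preserved in the text. The three steps above (read, filter, write) can be mimicked in plain Java with in-memory lists standing in for the Kafka topics. Everything here is an assumption made for illustration: `LogEvent` and its fields are hypothetical, and an "error" is taken to mean an HTTP status of 400 or higher.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for the filter step; does not use Storm or Kafka.
class ErrorLogFilterSketch {
    // Hypothetical parsed access-log record.
    static final class LogEvent {
        final String request;
        final int status;

        LogEvent(String request, int status) {
            this.request = request;
            this.status = status;
        }
    }

    // Assumption: any response with status >= 400 counts as an error event.
    static boolean isError(LogEvent e) {
        return e.status >= 400;
    }

    // Keeps only error events, playing the role of a WHERE clause.
    static List<LogEvent> errorsOnly(List<LogEvent> input) {
        List<LogEvent> out = new ArrayList<>();
        for (LogEvent e : input) {
            if (isError(e)) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<LogEvent> source = Arrays.asList(
                new LogEvent("GET /index.html", 200),
                new LogEvent("GET /missing", 404),
                new LogEvent("POST /api", 500));
        // The "sink" here is standard output instead of a Kafka topic.
        for (LogEvent e : errorsOnly(source)) {
            System.out.println(e.status + " " + e.request);
        }
    }
}
```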
Improved Apache Kafka Integration
• Enhanced configuration API
• Support Consumer Groups
• Pluggable Translators (Kafka Record -> Tuple)
• Support for Topic Wildcards
• Support Multiple Streams, Topics/Stream
• Trident Kafka supporting Kafka 0.10 onwards
• Integrates with Secure Kafka Environments
Improved Apache Kafka Integration
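The code shown on this slide is not preserved in the text. Two of the ideas listed earlier, topic wildcards and pluggable record-to-tuple translators, can be illustrated in plain Java. This is a sketch under stated assumptions, not the storm-kafka-client API: `Record`, `SimpleTranslator`, and `matchTopics` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustration only; none of these types come from Storm or Kafka.
class KafkaIntegrationSketch {
    // Hypothetical stand-in for a consumed Kafka record.
    static final class Record {
        final String topic;
        final String key;
        final String value;

        Record(String topic, String key, String value) {
            this.topic = topic;
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical pluggable translator: maps a record to tuple values.
    interface SimpleTranslator {
        List<Object> toTuple(Record record);
    }

    // Wildcard subscription: keeps topics whose full name matches the pattern.
    static List<String> matchTopics(List<String> topics, Pattern wildcard) {
        List<String> matched = new ArrayList<>();
        for (String topic : topics) {
            if (wildcard.matcher(topic).matches()) {
                matched.add(topic);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> topics = Arrays.asList("logs-web", "logs-db", "metrics");
        System.out.println(matchTopics(topics, Pattern.compile("logs-.*")));

        // A translator that emits (topic, value) and drops the key.
        SimpleTranslator topicAndValue = r -> Arrays.<Object>asList(r.topic, r.value);
        System.out.println(topicAndValue.toTuple(new Record("logs-web", "k1", "GET /")));
    }
}
```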
PMML Support (Machine Learning)
• Predictive Model Markup Language
• Describes Model Learned by ML algorithms
• PmmlPredictorBolt Computes Predicted Scores for Live Tuples according to PMML Model
• PMML Model Uploaded to or Downloaded from Distributed Cache
PMML Support (Machine Learning)
Storm 1.1.0 Improvements
• Flux
– Visualization in Storm UI
• Specify the resource requirements (Memory/CPU) for individual topology components (Spouts/Bolts)
• Topology Deployment
– Alternative to Uber Jar
– storm jar --jars /path/to/local/jar --artifacts `resolve Maven dependencies` --artifactRepositories `additional Maven repos`
Try Storm 1.1.0
https://hortonworks.com/hadoop-tutorial/processing-trucking-iot-data-with-apache-storm/
Apache Storm 2.0
• Storm Code entirely in Java (no more Clojure)
• Performance Improvements
• Worker/Threading Model Redesign
• Apache Beam Integration
• Bounded Spouts
• Metrics Enhancements
• Worker-Classloader Isolation
• Improved Backpressure
• Dynamic Topology Updates
References
• [1] Taylor Goetz Presentation @ DataWorks/Hadoop Summit, Munich 2017
• [2] http://arxiv.org/pdf/1506.08603v1.pdf
• [3] http://research.microsoft.com/en-us/um/people/lamport/pubs/chandy.pdf
