SlideShare ist ein Scribd-Unternehmen logo
1 von 7
© 2015 DataTorrent
Munagala V. Ramanath (“Ram”) <ram@datatorrent.com>
Dec 15th, 2015
Building your First Apache Apex
Application
© 2015 DataTorrent
Outline
Main concepts of an Apex Application.
Brief description of the "Sorted Word Count" application.
Hands on demonstration of cloning the source repository and building
Apex source code.
Hands on demonstration of creating a new application.
Running the application.
Code walk-through.
Questions
© 2015 DataTorrent
Main Concepts
Applications are built from Operators which implement the Operator interface;
each operator has input/output ports which are connected by streams to form a
directed acyclic graph (DAG).
A BaseOperator class is provided which provides empty implementations of all the
required methods.
Within an operator, define necessary input and output ports typically using the
DefaultInputPort and DefaultOutputPort classes.
The Application class implements the StreamingApplication interface; need only
implement populateDAG() method which wires the operators together.
Applications process data within time-based windows, typically 0.5s.
© 2015 DataTorrent
The Sorted Word Count Application
The following operators are involved:
LineReader: reads file dropped into input directory and outputs lines (on its
output port).
WordReader: splits each line into words using a regex.
WindowWordCount: compute and emit word frequencies for all words in lines
processed in current window.
FileWordCount: accumulates all word counts for current file and emits final
sorted list when EOF is reached.
WordCountWriter: writes list to output file in output directory.
© 2015 DataTorrent
The DAG
© 2015 DataTorrent
Resources
6
Apache Apex Community Page
Apache Apex LinkedIn Group
© 2015 DataTorrent
End
7

Weitere ähnliche Inhalte

Was ist angesagt?

Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
Apache Apex
 

Was ist angesagt? (20)

Apex as yarn application
Apex as yarn applicationApex as yarn application
Apex as yarn application
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017From Batch to Streaming with Apache Apex Dataworks Summit 2017
From Batch to Streaming with Apache Apex Dataworks Summit 2017
 
Java High Level Stream API
Java High Level Stream APIJava High Level Stream API
Java High Level Stream API
 
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
IoT Ingestion & Analytics using Apache Apex - A Native Hadoop Platform
 
Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)Developing streaming applications with apache apex (strata + hadoop world)
Developing streaming applications with apache apex (strata + hadoop world)
 
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra TagareActionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
Actionable Insights with Apache Apex at Apache Big Data 2017 by Devendra Tagare
 
University program - writing an apache apex application
University program  - writing an apache apex applicationUniversity program  - writing an apache apex application
University program - writing an apache apex application
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
 
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - HackacIntro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
Intro to Apache Apex - Next Gen Native Hadoop Platform - Hackac
 
Apache Apex Meetup at Cask
Apache Apex Meetup at CaskApache Apex Meetup at Cask
Apache Apex Meetup at Cask
 
Apache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and ApplicationsApache Apex: Stream Processing Architecture and Applications
Apache Apex: Stream Processing Architecture and Applications
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
Building your first aplication using Apache Apex
Building your first aplication using Apache ApexBuilding your first aplication using Apache Apex
Building your first aplication using Apache Apex
 
Apache Apex Fault Tolerance and Processing Semantics
Apache Apex Fault Tolerance and Processing SemanticsApache Apex Fault Tolerance and Processing Semantics
Apache Apex Fault Tolerance and Processing Semantics
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 
DataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application MeetupDataTorrent Presentation @ Big Data Application Meetup
DataTorrent Presentation @ Big Data Application Meetup
 
Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)Smart Partitioning with Apache Apex (Webinar)
Smart Partitioning with Apache Apex (Webinar)
 
Introduction to Apache Apex
Introduction to Apache ApexIntroduction to Apache Apex
Introduction to Apache Apex
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 

Andere mochten auch

Цветочные легенды
Цветочные легендыЦветочные легенды
Цветочные легенды
Ninel Kek
 
Римский корсаков снегурочка
Римский корсаков снегурочкаРимский корсаков снегурочка
Римский корсаков снегурочка
Ninel Kek
 
High Performance Distributed Systems with CQRS
High Performance Distributed Systems with CQRSHigh Performance Distributed Systems with CQRS
High Performance Distributed Systems with CQRS
Jonathan Oliver
 
правописание приставок урок№4
правописание приставок урок№4правописание приставок урок№4
правописание приставок урок№4
HomichAlla
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
Apache Apex
 

Andere mochten auch (20)

HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
 
Hadoop Interacting with HDFS
Hadoop Interacting with HDFSHadoop Interacting with HDFS
Hadoop Interacting with HDFS
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data Processing
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
 
Цветочные легенды
Цветочные легендыЦветочные легенды
Цветочные легенды
 
Римский корсаков снегурочка
Римский корсаков снегурочкаРимский корсаков снегурочка
Римский корсаков снегурочка
 
High Performance Distributed Systems with CQRS
High Performance Distributed Systems with CQRSHigh Performance Distributed Systems with CQRS
High Performance Distributed Systems with CQRS
 
правописание приставок урок№4
правописание приставок урок№4правописание приставок урок№4
правописание приставок урок№4
 
бсп (обоб. урок)
бсп (обоб. урок)бсп (обоб. урок)
бсп (обоб. урок)
 
Troubleshooting mysql-tutorial
Troubleshooting mysql-tutorialTroubleshooting mysql-tutorial
Troubleshooting mysql-tutorial
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
 
Windowing in Apache Apex
Windowing in Apache ApexWindowing in Apache Apex
Windowing in Apache Apex
 
The 5 People in your Organization that grow Legacy Code
The 5 People in your Organization that grow Legacy CodeThe 5 People in your Organization that grow Legacy Code
The 5 People in your Organization that grow Legacy Code
 
Hadoop File System Shell Commands,
Hadoop File System Shell Commands,Hadoop File System Shell Commands,
Hadoop File System Shell Commands,
 
Hadoop basic commands
Hadoop basic commandsHadoop basic commands
Hadoop basic commands
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
 
Build your shiny new pc, with Pangoly
Build your shiny new pc, with PangolyBuild your shiny new pc, with Pangoly
Build your shiny new pc, with Pangoly
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
 
Introduction to UNIX Command-Lines with examples
Introduction to UNIX Command-Lines with examplesIntroduction to UNIX Command-Lines with examples
Introduction to UNIX Command-Lines with examples
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 

Ähnlich wie Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application

ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...
ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...
ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...
khadijahd2
 
Streamline - Stream Analytics for Everyone
Streamline - Stream Analytics for EveryoneStreamline - Stream Analytics for Everyone
Streamline - Stream Analytics for Everyone
DataWorks Summit/Hadoop Summit
 

Ähnlich wie Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application (20)

Overview of MVC Framework - by software outsourcing company india
Overview of MVC Framework - by software outsourcing company indiaOverview of MVC Framework - by software outsourcing company india
Overview of MVC Framework - by software outsourcing company india
 
Building Your First Apache Apex Application
Building Your First Apache Apex ApplicationBuilding Your First Apache Apex Application
Building Your First Apache Apex Application
 
PLNOG15: The Power of the Open Standards SDN API’s - Mikael Holmberg
PLNOG15: The Power of the Open Standards SDN API’s - Mikael Holmberg PLNOG15: The Power of the Open Standards SDN API’s - Mikael Holmberg
PLNOG15: The Power of the Open Standards SDN API’s - Mikael Holmberg
 
Oracle APEX 18.1 New Features
Oracle APEX 18.1 New FeaturesOracle APEX 18.1 New Features
Oracle APEX 18.1 New Features
 
App Load Presentation 2009
App Load Presentation 2009App Load Presentation 2009
App Load Presentation 2009
 
Stream Processing with Apache Apex
Stream Processing with Apache ApexStream Processing with Apache Apex
Stream Processing with Apache Apex
 
T2 Web Framework
T2 Web FrameworkT2 Web Framework
T2 Web Framework
 
Eclipse - Single Source;Three Runtimes
Eclipse - Single Source;Three RuntimesEclipse - Single Source;Three Runtimes
Eclipse - Single Source;Three Runtimes
 
API workshop: Introduction to APIs (TC Camp)
API workshop: Introduction to APIs (TC Camp)API workshop: Introduction to APIs (TC Camp)
API workshop: Introduction to APIs (TC Camp)
 
Oracle Application Express 20.2 New Features
Oracle Application Express 20.2 New FeaturesOracle Application Express 20.2 New Features
Oracle Application Express 20.2 New Features
 
Building Bridges: Merging RPA Processes, UiPath Apps, and Data Service to bu...
Building Bridges:  Merging RPA Processes, UiPath Apps, and Data Service to bu...Building Bridges:  Merging RPA Processes, UiPath Apps, and Data Service to bu...
Building Bridges: Merging RPA Processes, UiPath Apps, and Data Service to bu...
 
ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...
ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...
ilide.info-talend-open-studio-for-data-integration-pr_f4a743b84c8b04cbebbf4c7...
 
Oracle APEX Introduction (release 18.1)
Oracle APEX Introduction (release 18.1)Oracle APEX Introduction (release 18.1)
Oracle APEX Introduction (release 18.1)
 
Streamline - Stream Analytics for Everyone
Streamline - Stream Analytics for EveryoneStreamline - Stream Analytics for Everyone
Streamline - Stream Analytics for Everyone
 
Visualizing Big Data in Realtime
Visualizing Big Data in RealtimeVisualizing Big Data in Realtime
Visualizing Big Data in Realtime
 
APEX – jak vytvořit jednoduše aplikaci
APEX – jak vytvořit jednoduše aplikaciAPEX – jak vytvořit jednoduše aplikaci
APEX – jak vytvořit jednoduše aplikaci
 
Introduction to Apache Camel.pdf
Introduction to Apache Camel.pdfIntroduction to Apache Camel.pdf
Introduction to Apache Camel.pdf
 
Document_format_for_OData_In_A_Nutshell.pdf
Document_format_for_OData_In_A_Nutshell.pdfDocument_format_for_OData_In_A_Nutshell.pdf
Document_format_for_OData_In_A_Nutshell.pdf
 
Java files and io streams
Java files and io streamsJava files and io streams
Java files and io streams
 
Ruby On Rails Siddhesh
Ruby On Rails SiddheshRuby On Rails Siddhesh
Ruby On Rails Siddhesh
 

Mehr von Apache Apex

Mehr von Apache Apex (8)

Low Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache ApexLow Latency Polyglot Model Scoring using Apache Apex
Low Latency Polyglot Model Scoring using Apache Apex
 
Introduction to Yarn
Introduction to YarnIntroduction to Yarn
Introduction to Yarn
 
Intro to Big Data Hadoop
Intro to Big Data HadoopIntro to Big Data Hadoop
Intro to Big Data Hadoop
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Ingestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexIngestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache Apex
 
Apache Beam (incubating)
Apache Beam (incubating)Apache Beam (incubating)
Apache Beam (incubating)
 
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache ApexMaking sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
Making sense of Apache Bigtop's role in ODPi and how it matters to Apache Apex
 
Apache Apex & Bigtop
Apache Apex & BigtopApache Apex & Bigtop
Apache Apex & Bigtop
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Building Your First Apache Apex (Next Gen Big Data/Hadoop) Application

  • 1. © 2015 DataTorrent Munagala V. Ramanath (“Ram”) <ram@datatorrent.com> Dec 15th, 2015 Building your First Apache Apex Application
  • 2. © 2015 DataTorrent Outline Main concepts of an Apex Application. Brief description of the "Sorted Word Count" application. Hands on demonstration of cloning the source repository and building Apex source code. Hands on demonstration of creating a new application. Running the application. Code walk-through. Questions
  • 3. © 2015 DataTorrent Main Concepts Applications are built from Operators which implement the Operator interface; each operator has input/output ports which are connected by streams to form a directed acyclic graph (DAG). A BaseOperator class is provided which provides empty implementations of all the required methods. Within an operator, define necessary input and output ports typically using the DefaultInputPort and DefaultOutputPort classes. The Application class implements the StreamingApplication interface; need only implement populateDAG() method which wires the operators together. Applications process data within time-based windows, typically 0.5s.
  • 4. © 2015 DataTorrent The Sorted Word Count Application The following operators are involved: LineReader: reads file dropped into input directory and outputs lines (on its output port). WordReader: splits each line into words using a regex. WindowWordCount: compute and emit word frequencies for all words in lines processed in current window. FileWordCount: accumulates all word counts for current file and emits final sorted list when EOF is reached. WordCountWriter: writes list to output file in output directory.
  • 6. © 2015 DataTorrent Resources 6 Apache Apex Community Page Apache Apex LinkedIn Group