Apache Flink Introduction
By: Ahmed Nader
Agenda
• What’s Apache Flink?
• Deeper into Flink
• Quick Start and Configuration
• Get your hands dirty
• Tips and some useful links
• References
What’s Apache Flink?
• An open-source platform for distributed stream and batch processing.
• A large-scale data processing engine.
• A true streaming engine: it processes events as they arrive rather than cutting the stream into micro-batches.
• Flink has two core APIs: DataStream and DataSet.
DataStream API
• Represents a continuous stream of data of a certain type.
• Operations are applied to each element of the stream or to windows of elements.
• Data flow: Source → DataStream → Operation → DataStream → Sink
DataStream API
• Example: a live stock feed.
  - Incoming events: Apple 235, Google 516, Microsoft 124.
  - Alert if Microsoft > 120.
  - Write each event to a database.
  - Sum every 10 seconds, then alert if the sum > 10000.
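The rules in the stock feed example can be sketched in plain Java. This is only an illustration of the logic on a finite list, not the Flink DataStream API: the `Tick` type, the method names, and the arrival times in seconds are all hypothetical and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the stock feed rules; this is NOT the Flink API.
final class StockFeedSketch {

    // Hypothetical event type: a symbol, a price, and an arrival time in seconds.
    static final class Tick {
        final String symbol;
        final int price;
        final long second;

        Tick(String symbol, int price, long second) {
            this.symbol = symbol;
            this.price = price;
            this.second = second;
        }
    }

    // Per-element rule: one alert for every Microsoft tick above 120.
    static List<String> microsoftAlerts(List<Tick> ticks) {
        List<String> alerts = new ArrayList<>();
        for (Tick t : ticks) {
            if (t.symbol.equals("Microsoft") && t.price > 120) {
                alerts.add("Microsoft > 120: " + t.price);
            }
        }
        return alerts;
    }

    // Windowed rule: sum prices per 10-second tumbling window,
    // keyed by the window start time in seconds (assumes non-negative times).
    static Map<Long, Integer> sumPerTenSeconds(List<Tick> ticks) {
        Map<Long, Integer> sums = new TreeMap<>();
        for (Tick t : ticks) {
            long windowStart = (t.second / 10) * 10;
            sums.merge(windowStart, t.price, Integer::sum);
        }
        return sums;
    }

    // Rule on the windowed result: window starts whose sum exceeds 10000.
    static List<Long> windowsOverLimit(Map<Long, Integer> sums) {
        List<Long> over = new ArrayList<>();
        for (Map.Entry<Long, Integer> e : sums.entrySet()) {
            if (e.getValue() > 10000) {
                over.add(e.getKey());
            }
        }
        return over;
    }

    public static void main(String[] args) {
        List<Tick> ticks = Arrays.asList(
                new Tick("Apple", 235, 1),
                new Tick("Google", 516, 4),
                new Tick("Microsoft", 124, 12));
        System.out.println(microsoftAlerts(ticks));
        System.out.println(sumPerTenSeconds(ticks));
        System.out.println(windowsOverLimit(sumPerTenSeconds(ticks)));
    }
}
```

In a real streaming job the input never ends, so the engine would emit each window's sum when that window closes instead of computing all sums at the end as this sketch does.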
DataSet API
• Uses batch processing.
• Batch is a special case of stream processing: finite data sources are just streams that happen to end.
• Offers a dedicated API, with machine learning and graph processing libraries.
• Data flow: Source → DataSet → Operation → DataSet → Sink
DataSet API
• Example: the Map/Reduce paradigm.
  [Diagram: input elements pass through a Map step and then a Reduce step]
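As a rough illustration of the Map/Reduce paradigm, here is a word count in plain Java. Word count is an assumed example (the slide's diagram does not say which computation it shows), the class and method names are hypothetical, and the code uses only the Java standard library rather than the Flink DataSet API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of Map/Reduce; this is NOT the Flink DataSet API.
final class MapReduceSketch {

    // Map step: turn every word of every line into the pair (word, 1).
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
                }
            }
        }
        return pairs;
    }

    // Reduce step: combine all pairs that share a key by summing their values.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or", "not to be");
        System.out.println(reduce(map(lines)));
    }
}
```

Keeping the two steps as separate methods mirrors the paradigm: the map step looks at one element at a time, while the reduce step only needs the pairs that share a key.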
Flink Stack
Analyzing the Flink Stack
• At the core is a streaming dataflow runtime that interprets every program as a dataflow graph.
• Several libraries sit on top of the DataStream and DataSet APIs, such as:
  - Table: enables SQL-like queries.
  - Gelly: graph processing, to transform and traverse graphs in a distributed fashion.
  - ML: offers a handful of machine learning algorithms, but is still fairly basic.
  - CEP (complex event processing): detects complex event patterns in a data stream, which helps you get hold of what is really important in your data.
Deeper into Flink
• Data sources:
  - From an input file
  - From a socket
  - From a collection
Deeper into Flink
• Data sinks:
  - Write to a CSV file
  - Write to a socket
  - Print to the terminal
Deeper into Flink
• Data transformations (DataStream API):
  - Map: takes 1 element and produces 1 element.
  - FlatMap: takes 1 element and produces 0 or more elements.
  - Filter: evaluates a boolean function for each element and retains those for which it returns true.
  - KeyBy: partitions a stream into disjoint partitions, each containing the elements with the same key.
  - Window: groups stream events according to some characteristic, e.g. the data that arrived in the last 5 seconds.
  - Union, Join, Split, Select…
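The semantics of the first four transformations can be sketched with `java.util.stream` on a finite list. This is an analogy in plain Java, not the Flink API: the class and method names are hypothetical, and grouping a list by key only approximates what KeyBy does on an unbounded stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java analogy using java.util.stream on a finite list; NOT the Flink API.
final class TransformationSketch {

    // Map: exactly one output element per input element.
    static List<Integer> lengths(List<String> words) {
        return words.stream().map(String::length).collect(Collectors.toList());
    }

    // FlatMap: zero or more output elements per input element.
    // Empty tokens are dropped, so an empty line yields no output at all.
    static List<String> splitWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toList());
    }

    // Filter: keep only the elements for which the predicate returns true.
    static List<Integer> evens(List<Integer> numbers) {
        return numbers.stream().filter(n -> n % 2 == 0).collect(Collectors.toList());
    }

    // KeyBy (analogy): split elements into disjoint groups, one per key.
    // Assumes every word is non-empty, since the key is the first letter.
    static Map<Character, List<String>> byFirstLetter(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(w -> w.charAt(0)));
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bcd")));
        System.out.println(splitWords(Arrays.asList("hello flink", "", "stream")));
        System.out.println(evens(Arrays.asList(1, 2, 3, 4)));
        System.out.println(byFirstLetter(Arrays.asList("apple", "avocado", "banana")));
    }
}
```

Window is left out here because it needs a notion of time or count; on an unbounded stream it is what turns "all elements so far" into finite groups that can be aggregated.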
Deeper into Flink
• Interesting use cases:
  - Processing a Twitter feed; one good application is collecting statistics on that feed.
    see: http://blog.brakmic.com/stream-processing-with-apache-flink/
  - Identifying popular locations where people arrive by taxi, by applying filter and map functions to a DataStream of taxi ride records and then finding the most popular places over, for example, the last 15 minutes.
    see: https://www.mapr.com/blog/essential-guide-streaming-first-processing-apache-flink
Setup
• Prerequisites:
  - Java 7.x or higher.
  - Maven 3.0.4 or higher.
• Start a new Flink project using Maven. Run the following command in the terminal:
    mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-java -DarchetypeVersion=1.0.1
  OR
• Add Flink to an existing project:
    see: https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/common/index.html
Get your hands dirty:
• Execution options:
  - Local (debugging)
  - Cluster
  - Command Line Interface
  - Web interface
See: https://ci.apache.org/projects/flink/flink-docs-release-0.7/programming_guide.htm
Tips and some useful links:
• Subscribe to the mailing list by sending an empty email to user-subscribe@flink.apache.org.
• Clone the Flink project on GitHub for more examples.
• There's a free training course by data Artisans.
  see: http://dataartisans.github.io/flink-training/index.html
• Here are some other useful links:
  - http://www.slideshare.net/sbaltagi/stepbystep-introduction-to-apache-flink
  - https://ci.apache.org/projects/flink/flink-docs-release-0.7/programming_guide.html
  - https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/common/index.html
References
• http://blog.brakmic.com/stream-processing-with-apache-flink/
• http://www.slideshare.net/sbaltagi/stepbystep-introduction-to-apache-flink
• https://www.mapr.com/blog/essential-guide-streaming-first-processing-apache-flink
• https://ci.apache.org/projects/flink/flink-docs-release-0.7/programming_guide.html
• http://dataartisans.github.io/flink-training/index.html
• https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/common/index.html
Thanks!
Any questions?
