SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Downloaden Sie, um offline zu lesen
Introduction to DataFlow
management using Apache NiFi
Presented by: Anshuman Ghosh
Topics we will cover
 DataFlow and problems.
 What is Apache NiFi – History, key features, core components
 Architecture To start with NiFi (Single server setup)
 Architecture To scale with NiFi (NiFi cluster setup)
 Fundamentals of NiFi Web UI
 Building a NiFi DataFlow Processor
 Live demo
 Testing
 Deployment and automation
 What next?
 Q&A
DataFlow
 The term “DataFlow” can be used in variety of contexts.
 In our context it is the flow of information between systems.
 It is crucial to have a robust platform to create, manage and automate the
flow of enterprise data.
 There are many tools for data gathering and data flow, but more often
than not we lack an integrated platform for that.
 Probably an ideal situation would be have a seamless integration ,..
What enterprises look for
To be able to get data from any source
… To the systems that performs Analytics
… And to those for user availability
Common DataFlow challenges
 System failure
 Difference between data production and consumption
 Change in dynamic data priority
 Protocols and format changes; new systems, new protocols
 Need of bidirectional data flow
 Transparency and control
 Security and privacy
Brief history of Apache NiFi
 Developed at NSA (National Security Agency, USA) for over 8 years.
 Onyara engineers, for NSA, have developed a project called “Niagara
Files” which later went on to become NiFi.
 Trough NSA Technology transfer program it was made available as an open
source Apache project “Apache NiFi” in the year 2014.
 Hortonworks has a partnership with Onyara on their “Hortonworks DataFlow
powered by Apache NiFi”
What is Apache NiFi
 Holistically Apache NiFi is an integrated platform to collect, conduct and
curate real-time data (data in motion).
 Provides an end to end DataFlow management from any source* to any
destination*.
 Provides data logistics – real-time operational visibility and control of
DataFlow.
 Supports powerful and scalable directed graphs of data routing and data
transformation.
 All these in a reliable and secure manner.
*complete list of source and destination on official documentation
Key features
 Guaranteed data delivery – “at least once” semantics
 Data buffering and Back pressure
 Data prioritization in queue
 Flow specific setting for “latency vs. throughput”
 Data provenance
 Visual control
 Flow templates
 Recovery/ Recording through content repository
 Clustering to scale-out
 Security
 Classloader Isolation
Core components of NiFi
 NiFi at it’s core follow the concept of Flow Based programming.
 Core components of NiFi are
 FlowFile – the unit of information packet
 FlowFile Processor – the processing engine; black box.
 Connection – the relation between Processors and bounded buffer.
 Flow Controller – the scheduler in real world.
 Process Group – the compact function or subnet
Core components diagram
 This is how a typical NiFi DataFlow might look
NiFi Architecture
 NiFi executes within a JVM on a host Operating System.
NiFi Architecture – Clustering
 Typical NiFi cluster
Core components of NiFi Cluster
 NiFi Cluster Manager
 Nodes
 Primary Node
 Isolated Processors
 Heartbeats
Fundamentals of the Web UI
Building a DataFlow Processor
 Drag the “Processor” icon from “Component Toolbar” into the canvas; this
will provide a ‘Add Processor’ wizard
Building a DataFlow Processor
 General ‘SETTINGS’ for the processor
Building a DataFlow Processor
 ‘SCHEDULING’ information
Building a DataFlow Processor
 Setting up mandatory and optional ‘PROPERTIES’
Building a DataFlow Processor
 Auto alert mechanism
 If there is an error it will not allow to start the processor
Building a DataFlow Processor
 If everything is se, we are ready to initiate/ start the process
Demo 1
 In this demo, we will go through a NiFi DataFlow that deals with the
following steps
 Connect to Kafka and consume from a topic.
 Store consumed data in a local storage (optional).
 Anonymize IP address.
 Merge content before writing to HDFS (small file issues).
 Finally store Kafka data onto HDFS
 Look into error handling.
 Look into use of expression language.
Demo 2
 In this demo, we will go through a NiFi DataFlow that deals with the
following steps
 Collect/ fetch data files from a local location.
 Update/ add attributes.
 Parse JSON strings to DB Insert statements.
 Connect to PostgreSQL and Insert.
 Error handling.
Unit testing components
 For component testing nifi-mock module can be used with JUnit.
 The TestRunner interface allows us to test Processors and Controller Services.
 We need to instantiate and get a new TestRunner (org.apache.nifi.util)
 Add Controller Services and configure
 Set property of Processors setProperty(PropertyDescriptor, String)
 Enqueue FlowFiles by using the enqueue methods of the TestRunner class.
 Processor can be started by triggering run() method of TestRunner.
 Validate output – using the TestRunners assertAllFlowFilesTransferred and
assertTransferCount methods.
 More details can be found here – https://nifi.apache.org/docs/nifi-
docs/html/developer-guide.html#testing
 Add Maven dependency
 Call static newTestRunner method of the TestRunners class
 Call addControllerService method to add controller
 Set properties by setProperty(ControllerService, PropertyDescriptor, String)
 Enable services by enableControllerService(ControllerService)
 Set processor property setProperty(PropertyDescriptor, String)
 Override enqueue method for byte[], InputStream, or Path.
 run(int); This will call methods with @OnScheduled annotation, Processor’s
onTrigger method, and then run the @OnUnscheduled and finally @OnStopped
methods.
 Validate result by assertAllFlowFilesTransferred and assertTransferCount methods.
 Access FlowFiles by calling getFlowFilesForRelationship() method
Error handling
 Following can occur
 Unexpected data format
 Network connection, disk failure
 Bug in processor
 ProcessException and all others (like null pointer)
 ProcessException – Rollback and penalize the FlowFiles
 All others – Rollback, penalize the FlowFiles and Yield the Processor
Testing automation, Deployment
 NiFi provides ‘ReST’ API for all components and entire documentation can
be found here https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
 Apache NiFi Community is working to improve on this area
 We can setup the deployment in following way
 Create an application i.e. entire DataFlow in your local machine and test.
 Create a process group around that (optional though)
 Create a template. (Can be done from Web UI/ ReST API call)
 Download the template. (Can be done from Web UI/ ReST API call)
 Use ReST API call to import the template in new environment.
 Use ReST API call to Update Processors (Properties, Schedule, and Settings etc.)
 Use ReST API call to Instantiate a template
Deployment
 There can be one more option to do it.
 Copying the whole flow (flow.xml.gz) from one environment to another
 Need to copy the entire canvas.
 Need to take care of sensitive properties encryption.
What is next
 We are planning to work on the testing, deployment side and update it.
 Please read more on NiFi development here –
https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
 And for user guide – https://nifi.apache.org/docs/nifi-docs/html/user-
guide.html
 We have carried out POCs on some of our real use cases; please find them
here
 Link HDFS data ingestion using Apache
 Link How to setup Apache NiFi
 Link Expression Language Guide
 Any questions and/ or suggestions please come by or write 
Q&A
 Questions?
Thank you!
Presented by: Anshuman Ghosh

Weitere ähnliche Inhalte

Was ist angesagt?

Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseAldrin Piri
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
 
BYOP: Custom Processor Development with Apache NiFi
BYOP: Custom Processor Development with Apache NiFiBYOP: Custom Processor Development with Apache NiFi
BYOP: Custom Processor Development with Apache NiFiDataWorks Summit
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and FlinkBryan Bende
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxVinay Shukla
 
Data ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiLev Brailovskiy
 
Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitAldrin Piri
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For OperatorsKevin Brockhoff
 
OpenShift 4 installation
OpenShift 4 installationOpenShift 4 installation
OpenShift 4 installationRobert Bohne
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesIsheeta Sanghi
 
Containers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatContainers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatAmazon Web Services
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiDataWorks Summit
 

Was ist angesagt? (20)

Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
BYOP: Custom Processor Development with Apache NiFi
BYOP: Custom Processor Development with Apache NiFiBYOP: Custom Processor Development with Apache NiFi
BYOP: Custom Processor Development with Apache NiFi
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Nifi
NifiNifi
Nifi
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
 
Data ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFi
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop Summit
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For Operators
 
OpenShift 4 installation
OpenShift 4 installationOpenShift 4 installation
OpenShift 4 installation
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
 
Containers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red HatContainers Anywhere with OpenShift by Red Hat
Containers Anywhere with OpenShift by Red Hat
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
 

Andere mochten auch

2015 Internet Trends Report
2015 Internet Trends Report2015 Internet Trends Report
2015 Internet Trends ReportIQbal KHan
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks
 
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processingApache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processingTimo Walther
 
[OracleCode SF] In memory analytics with apache spark and hazelcast
[OracleCode SF] In memory analytics with apache spark and hazelcast[OracleCode SF] In memory analytics with apache spark and hazelcast
[OracleCode SF] In memory analytics with apache spark and hazelcastViktor Gamov
 
Tracxn Research - Finance & Accounting Landscape, February 2017
Tracxn Research - Finance & Accounting Landscape, February 2017Tracxn Research - Finance & Accounting Landscape, February 2017
Tracxn Research - Finance & Accounting Landscape, February 2017Tracxn
 
Tracxn Research - Construction Tech Landscape, February 2017
Tracxn Research - Construction Tech Landscape, February 2017Tracxn Research - Construction Tech Landscape, February 2017
Tracxn Research - Construction Tech Landscape, February 2017Tracxn
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 
Akka-chan's Survival Guide for the Streaming World
Akka-chan's Survival Guide for the Streaming WorldAkka-chan's Survival Guide for the Streaming World
Akka-chan's Survival Guide for the Streaming WorldKonrad Malawski
 
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFi
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFiTaking DataFlow Management to the Edge with Apache NiFi/MiNiFi
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFiBryan Bende
 
3P Learning (3PL) - Earning from Learning - equity research initiation report
3P Learning (3PL) - Earning from Learning - equity research initiation report3P Learning (3PL) - Earning from Learning - equity research initiation report
3P Learning (3PL) - Earning from Learning - equity research initiation reportGeorge Gabriel
 
Comparing 30 MongoDB operations with Oracle SQL statements
Comparing 30 MongoDB operations with Oracle SQL statementsComparing 30 MongoDB operations with Oracle SQL statements
Comparing 30 MongoDB operations with Oracle SQL statementsLucas Jellema
 
Tracxn Research - Healthcare Analytics Landscape, February 2017
Tracxn Research - Healthcare Analytics Landscape, February 2017Tracxn Research - Healthcare Analytics Landscape, February 2017
Tracxn Research - Healthcare Analytics Landscape, February 2017Tracxn
 
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Lightbend
 
Tracxn Research - Insurance Tech Landscape, February 2017
Tracxn Research - Insurance Tech Landscape, February 2017Tracxn Research - Insurance Tech Landscape, February 2017
Tracxn Research - Insurance Tech Landscape, February 2017Tracxn
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the LogBen Stopford
 
Apache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellApache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellKoji Kawamura
 

Andere mochten auch (20)

Streamsets and spark
Streamsets and sparkStreamsets and spark
Streamsets and spark
 
2015 Internet Trends Report
2015 Internet Trends Report2015 Internet Trends Report
2015 Internet Trends Report
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processingApache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
 
[OracleCode SF] In memory analytics with apache spark and hazelcast
[OracleCode SF] In memory analytics with apache spark and hazelcast[OracleCode SF] In memory analytics with apache spark and hazelcast
[OracleCode SF] In memory analytics with apache spark and hazelcast
 
Tracxn Research - Finance & Accounting Landscape, February 2017
Tracxn Research - Finance & Accounting Landscape, February 2017Tracxn Research - Finance & Accounting Landscape, February 2017
Tracxn Research - Finance & Accounting Landscape, February 2017
 
Tracxn Research - Construction Tech Landscape, February 2017
Tracxn Research - Construction Tech Landscape, February 2017Tracxn Research - Construction Tech Landscape, February 2017
Tracxn Research - Construction Tech Landscape, February 2017
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
Akka-chan's Survival Guide for the Streaming World
Akka-chan's Survival Guide for the Streaming WorldAkka-chan's Survival Guide for the Streaming World
Akka-chan's Survival Guide for the Streaming World
 
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFi
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFiTaking DataFlow Management to the Edge with Apache NiFi/MiNiFi
Taking DataFlow Management to the Edge with Apache NiFi/MiNiFi
 
CV
CVCV
CV
 
2017 biological databases_part1_vupload
2017 biological databases_part1_vupload2017 biological databases_part1_vupload
2017 biological databases_part1_vupload
 
3P Learning (3PL) - Earning from Learning - equity research initiation report
3P Learning (3PL) - Earning from Learning - equity research initiation report3P Learning (3PL) - Earning from Learning - equity research initiation report
3P Learning (3PL) - Earning from Learning - equity research initiation report
 
Comparing 30 MongoDB operations with Oracle SQL statements
Comparing 30 MongoDB operations with Oracle SQL statementsComparing 30 MongoDB operations with Oracle SQL statements
Comparing 30 MongoDB operations with Oracle SQL statements
 
Tracxn Research - Healthcare Analytics Landscape, February 2017
Tracxn Research - Healthcare Analytics Landscape, February 2017Tracxn Research - Healthcare Analytics Landscape, February 2017
Tracxn Research - Healthcare Analytics Landscape, February 2017
 
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
 
Tracxn Research - Insurance Tech Landscape, February 2017
Tracxn Research - Insurance Tech Landscape, February 2017Tracxn Research - Insurance Tech Landscape, February 2017
Tracxn Research - Insurance Tech Landscape, February 2017
 
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the Log
 
Apache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellApache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in Nutshell
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 

Ähnlich wie Introduction to data flow management using apache nifi

Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAdam Doyle
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101Timothy Spann
 
ApacheCon 2021: Apache NiFi 101- introduction and best practices
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practicesTimothy Spann
 
ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300Timothy Spann
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps_Fest
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampTimothy Spann
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveAldrin Piri
 
Play framework : A Walkthrough
Play framework : A WalkthroughPlay framework : A Walkthrough
Play framework : A Walkthroughmitesh_sharma
 
Integração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxIntegração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxMarco Garcia
 
A DevOps guide to Kubernetes
A DevOps guide to KubernetesA DevOps guide to Kubernetes
A DevOps guide to KubernetesPaul Czarkowski
 
Api world apache nifi 101
Api world   apache nifi 101Api world   apache nifi 101
Api world apache nifi 101Timothy Spann
 
Maven from Scratch to Production (.odp)
Maven from Scratch to Production (.odp)Maven from Scratch to Production (.odp)
Maven from Scratch to Production (.odp)Johan Mynhardt
 
Running High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioRunning High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioiguazio
 
Frequently asked MuleSoft Interview Questions and Answers from Techlightning
Frequently asked MuleSoft Interview Questions and Answers from TechlightningFrequently asked MuleSoft Interview Questions and Answers from Techlightning
Frequently asked MuleSoft Interview Questions and Answers from TechlightningArul ChristhuRaj Alphonse
 
Basic concepts for_clustered_data_ontap_8.3_v1.1-lab_guide
Basic concepts for_clustered_data_ontap_8.3_v1.1-lab_guideBasic concepts for_clustered_data_ontap_8.3_v1.1-lab_guide
Basic concepts for_clustered_data_ontap_8.3_v1.1-lab_guideVikas Sharma
 
TechWiseTV Workshop: Catalyst Switching Programmability
TechWiseTV Workshop: Catalyst Switching ProgrammabilityTechWiseTV Workshop: Catalyst Switching Programmability
TechWiseTV Workshop: Catalyst Switching ProgrammabilityRobb Boyd
 
Create Home Directories on Storage Using WFA and ServiceNow integration
Create Home Directories on Storage Using WFA and ServiceNow integrationCreate Home Directories on Storage Using WFA and ServiceNow integration
Create Home Directories on Storage Using WFA and ServiceNow integrationRutul Shah
 

Ähnlich wie Introduction to data flow management using apache nifi (20)

Automate your data flows with Apache NIFI
Automate your data flows with Apache NIFIAutomate your data flows with Apache NIFI
Automate your data flows with Apache NIFI
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
 
ApacheCon 2021: Apache NiFi 101- introduction and best practices
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practices
 
ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
 
Play framework : A Walkthrough
Play framework : A WalkthroughPlay framework : A Walkthrough
Play framework : A Walkthrough
 
Integração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxIntegração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia Cetax
 
A DevOps guide to Kubernetes
A DevOps guide to KubernetesA DevOps guide to Kubernetes
A DevOps guide to Kubernetes
 
Api world apache nifi 101
Api world   apache nifi 101Api world   apache nifi 101
Api world apache nifi 101
 
My Saminar On Php
My Saminar On PhpMy Saminar On Php
My Saminar On Php
 
Maven from Scratch to Production (.odp)
Maven from Scratch to Production (.odp)Maven from Scratch to Production (.odp)
Maven from Scratch to Production (.odp)
 
Running High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioRunning High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclio
 
Function as a Service
Function as a ServiceFunction as a Service
Function as a Service
 
Frequently asked MuleSoft Interview Questions and Answers from Techlightning
Frequently asked MuleSoft Interview Questions and Answers from TechlightningFrequently asked MuleSoft Interview Questions and Answers from Techlightning
Frequently asked MuleSoft Interview Questions and Answers from Techlightning
 
Lamp Zend Security
Lamp Zend SecurityLamp Zend Security
Lamp Zend Security
 
Basic concepts for_clustered_data_ontap_8.3_v1.1-lab_guide
Basic concepts for_clustered_data_ontap_8.3_v1.1-lab_guideBasic concepts for_clustered_data_ontap_8.3_v1.1-lab_guide
Basic concepts for_clustered_data_ontap_8.3_v1.1-lab_guide
 
TechWiseTV Workshop: Catalyst Switching Programmability
TechWiseTV Workshop: Catalyst Switching ProgrammabilityTechWiseTV Workshop: Catalyst Switching Programmability
TechWiseTV Workshop: Catalyst Switching Programmability
 
Create Home Directories on Storage Using WFA and ServiceNow integration
Create Home Directories on Storage Using WFA and ServiceNow integrationCreate Home Directories on Storage Using WFA and ServiceNow integration
Create Home Directories on Storage Using WFA and ServiceNow integration
 

Kürzlich hochgeladen

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...Elaine Werffeli
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...gajnagarg
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 

Kürzlich hochgeladen (20)

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men  🔝mahisagar🔝   Esc...
➥🔝 7737669865 🔝▻ mahisagar Call-girls in Women Seeking Men 🔝mahisagar🔝 Esc...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 

Introduction to data flow management using apache nifi

  • 1. Introduction to DataFlow management using Apache NiFi Presented by: Anshuman Ghosh
  • 2. Topics we will cover  DataFlow and problems.  What is Apache NiFi – History, key features, core components  Architecture To start with NiFi (Single server setup)  Architecture To scale with NiFi (NiFi cluster setup)  Fundamentals of NiFi Web UI  Building a NiFi DataFlow Processor  Live demo  Testing  Deployment and automation  What next?  Q&A
  • 3. DataFlow  The term “DataFlow” can be used in variety of contexts.  In our context it is the flow of information between systems.  It is crucial to have a robust platform to create, manage and automate the flow of enterprise data.  There are many tools for data gathering and data flow, but more often than not we lack an integrated platform for that.  Probably an ideal situation would be have a seamless integration ,..
  • 4. What enterprises look for To be able to get data from any source … To the systems that performs Analytics … And to those for user availability
  • 5. Common DataFlow challenges  System failure  Difference between data production and consumption  Change in dynamic data priority  Protocols and format changes; new systems, new protocols  Need of bidirectional data flow  Transparency and control  Security and privacy
  • 6. Brief history of Apache NiFi  Developed at NSA (National Security Agency, USA) for over 8 years.  Onyara engineers, for NSA, have developed a project called “Niagara Files” which later went on to become NiFi.  Trough NSA Technology transfer program it was made available as an open source Apache project “Apache NiFi” in the year 2014.  Hortonworks has a partnership with Onyara on their “Hortonworks DataFlow powered by Apache NiFi”
  • 7. What is Apache NiFi  Holistically Apache NiFi is an integrated platform to collect, conduct and curate real-time data (data in motion).  Provides an end to end DataFlow management from any source* to any destination*.  Provides data logistics – real-time operational visibility and control of DataFlow.  Supports powerful and scalable directed graphs of data routing and data transformation.  All these in a reliable and secure manner. *complete list of source and destination on official documentation
  • 8. Key features  Guaranteed data delivery – “at least once” semantics  Data buffering and Back pressure  Data prioritization in queue  Flow specific setting for “latency vs. throughput”  Data provenance  Visual control  Flow templates  Recovery/ Recording through content repository  Clustering to scale-out  Security  Classloader Isolation
  • 9. Core components of NiFi  NiFi at it’s core follow the concept of Flow Based programming.  Core components of NiFi are  FlowFile – the unit of information packet  FlowFile Processor – the processing engine; black box.  Connection – the relation between Processors and bounded buffer.  Flow Controller – the scheduler in real world.  Process Group – the compact function or subnet
  • 10. Core components diagram  This is how a typical NiFi DataFlow might look
  • 11. NiFi Architecture  NiFi executes within a JVM on a host Operating System.
  • 12. NiFi Architecture – Clustering  Typical NiFi cluster
  • 13. Core components of NiFi Cluster  NiFi Cluster Manager  Nodes  Primary Node  Isolated Processors  Heartbeats
  • 15. Building a DataFlow Processor  Drag the “Processor” icon from “Component Toolbar” into the canvas; this will provide a ‘Add Processor’ wizard
  • 16. Building a DataFlow Processor  General ‘SETTINGS’ for the processor
  • 17. Building a DataFlow Processor  ‘SCHEDULING’ information
  • 18. Building a DataFlow Processor  Setting up mandatory and optional ‘PROPERTIES’
  • 19. Building a DataFlow Processor  Auto alert mechanism  If there is an error it will not allow to start the processor
  • 20. Building a DataFlow Processor  If everything is se, we are ready to initiate/ start the process
  • 21. Demo 1  In this demo, we will go through a NiFi DataFlow that deals with the following steps  Connect to Kafka and consume from a topic.  Store consumed data in a local storage (optional).  Anonymize IP address.  Merge content before writing to HDFS (small file issues).  Finally store Kafka data onto HDFS  Look into error handling.  Look into use of expression language.
  • 22.
  • 23. Demo 2  In this demo, we will go through a NiFi DataFlow that deals with the following steps  Collect/ fetch data files from a local location.  Update/ add attributes.  Parse JSON strings to DB Insert statements.  Connect to PostgreSQL and Insert.  Error handling.
  • 24.
  • 25. Unit testing components  For component testing nifi-mock module can be used with JUnit.  The TestRunner interface allows us to test Processors and Controller Services.  We need to instantiate and get a new TestRunner (org.apache.nifi.util)  Add Controller Services and configure  Set property of Processors setProperty(PropertyDescriptor, String)  Enqueue FlowFiles by using the enqueue methods of the TestRunner class.  Processor can be started by triggering run() method of TestRunner.  Validate output – using the TestRunners assertAllFlowFilesTransferred and assertTransferCount methods.  More details can be found here – https://nifi.apache.org/docs/nifi- docs/html/developer-guide.html#testing
  • 26.  Add Maven dependency  Call static newTestRunner method of the TestRunners class  Call addControllerService method to add controller  Set properties by setProperty(ControllerService, PropertyDescriptor, String)  Enable services by enableControllerService(ControllerService)  Set processor property setProperty(PropertyDescriptor, String)  Override enqueue method for byte[], InputStream, or Path.  run(int); This will call methods with @OnScheduled annotation, Processor’s onTrigger method, and then run the @OnUnscheduled and finally @OnStopped methods.  Validate result by assertAllFlowFilesTransferred and assertTransferCount methods.  Access FlowFiles by calling getFlowFilesForRelationship() method
  • 27. Error handling  Following can occur  Unexpected data format  Network connection, disk failure  Bug in processor  ProcessException and all others (like null pointer)  ProcessException – Rollback and penalize the FlowFiles  All others – Rollback, penalize the FlowFiles and Yield the Processor
  • 28. Testing automation, Deployment  NiFi provides ‘ReST’ API for all components and entire documentation can be found here https://nifi.apache.org/docs/nifi-docs/rest-api/index.html  Apache NiFi Community is working to improve on this area  We can setup the deployment in following way  Create an application i.e. entire DataFlow in your local machine and test.  Create a process group around that (optional though)  Create a template. (Can be done from Web UI/ ReST API call)  Download the template. (Can be done from Web UI/ ReST API call)  Use ReST API call to import the template in new environment.  Use ReST API call to Update Processors (Properties, Schedule, and Settings etc.)  Use ReST API call to Instantiate a template
  • 29. Deployment  There can be one more option to do it.  Copying the whole flow (flow.xml.gz) from one environment to another  Need to copy the entire canvas.  Need to take care of sensitive properties encryption.
  • 30. What is next  We are planning to work on the testing, deployment side and update it.  Please read more on NiFi development here – https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html  And for user guide – https://nifi.apache.org/docs/nifi-docs/html/user- guide.html  We have carried out POCs on some of our real use cases; please find them here  Link HDFS data ingestion using Apache  Link How to setup Apache NiFi  Link Expression Language Guide  Any questions and/ or suggestions please come by or write 
  • 32. Thank you! Presented by: Anshuman Ghosh