Data at Scales &
the Values of
Starting Small
Aldrin Piri - @aldrinpiri
DataWorks Summit 2017 – Munich
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Key: 'Apache NiFi'
Value: 'PMC Member'
Key: 'Work'
Value: 'Sr. Member of Technical Staff @ Hortonworks'
Key: 'Working with NiFi Since'
Value: '2010'
Agenda
Apache NiFi: A Primer
Apache MiNiFi
Architecture
Apache NiFi: The Ecosystem
Community
Byte Scales for Data
SI Prefix | Power of ten
- | 10^0
kilo | 10^3
mega | 10^6
giga | 10^9
tera | 10^12
peta | 10^15
exa | 10^18
zetta | 10^21
yotta | 10^24
“Big Data”
“everything else”
Greek for
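As a rough illustration of the scale table on this slide, the Python sketch below picks the largest SI prefix that keeps a byte count at or above 1. The function name `format_bytes` is hypothetical and is not part of NiFi or MiNiFi; it only mirrors the powers of ten listed here.

```python
# SI prefixes from the slide's table; each step is a factor of 10^3.
SI_PREFIXES = ["", "kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def format_bytes(num_bytes: int) -> str:
    """Scale num_bytes to the largest SI prefix that keeps the value >= 1."""
    if num_bytes < 0:
        raise ValueError("byte count must be non-negative")
    index = 0
    # Integer comparison avoids floating-point drift for very large counts.
    while index < len(SI_PREFIXES) - 1 and num_bytes >= 1000 ** (index + 1):
        index += 1
    value = num_bytes / 1000 ** index
    return f"{value:.1f} {SI_PREFIXES[index]}bytes"
```

For example, `format_bytes(1500)` yields a value in kilobytes, while anything below 1000 stays in plain bytes.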
The Problem at Hand
Producers A.K.A Things
Anything
AND
Everything
Internet!
Consumers
• User
• Storage
• System
• …More Things
Use Case: Courier Service
Gathering data from disparate sources
Diagram locations: Physical Store; Distribution Center; Core Data Center at HQ; On Delivery Routes
Diagram components: Gateway Server, Mobile Devices, Registers, Server Cluster, Kafka, Storm / Spark / Flink / Apex, Others, Trucks, Deliverers, NiFi
Icon credits:
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
Apache NiFi: A Primer
Key Features and Principles
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
NiFi is based on Flow-Based Programming (FBP)
FBP Term | NiFi Term | Description
Information Packet | FlowFile | Each object moving through the system.
Black Box | FlowFile Processor | Performs the work, doing some combination of data routing, transformation, or mediation between systems.
Bounded Buffer | Connection | The linkage between processors, acting as queues and allowing various processes to interact at differing rates.
Scheduler | Flow Controller | Maintains the knowledge of how processes are connected, and manages the threads and allocations thereof which all processes use.
Subnet | Process Group | A set of processes and their connections, which can receive and send data via ports. A process group allows creation of an entirely new component simply by composition of its components.
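To make the "Bounded Buffer / Connection" idea concrete, here is a minimal Python sketch using only the standard library. It is not NiFi's implementation: `run_flow`, the producer, and the consumer are hypothetical names, and a `queue.Queue` with a `maxsize` stands in for a connection. When the queue is full the producer blocks, which is the essence of backpressure between components running at different rates.

```python
import queue
import threading


def run_flow(items, capacity=2):
    """Pass items from a producer to a consumer through a bounded queue."""
    connection = queue.Queue(maxsize=capacity)  # the bounded buffer
    received = []
    max_depth = 0
    done = object()  # sentinel marking the end of the stream

    def producer():
        for item in items:
            connection.put(item)  # blocks while the queue is full (backpressure)
        connection.put(done)

    def consumer():
        nonlocal max_depth
        while True:
            # Track the deepest queue size observed; it never exceeds capacity.
            max_depth = max(max_depth, connection.qsize())
            item = connection.get()
            if item is done:
                break
            received.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received, max_depth
```

Calling `run_flow(range(10), capacity=2)` returns the items in order together with the largest queue depth seen, which stays within the configured capacity.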
FlowFiles are like HTTP data
HTTP Data
Header:
HTTP/1.1 200 OK
Date: Sun, 10 Oct 2010 23:26:07 GMT
Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
ETag: "45b6-834-49130cc1182c0"
Accept-Ranges: bytes
Content-Length: 13
Connection: close
Content-Type: text/html
Content:
Hello world!
FlowFile
Header:
Standard FlowFile Attributes
Key: 'entryDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'lineageStartDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'fileSize' Value: '23609'
FlowFile Attribute Map Content
Key: 'filename' Value: '15650246997242'
Key: 'path' Value: './'
Content:
Binary Content *
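The header/content analogy on this slide can be sketched in a few lines of Python. This is an illustration only, not NiFi's FlowFile class: `FlowFileSketch`, `from_http_response`, and the attribute key `http.status.line` are hypothetical names chosen for the example. The sketch splits a raw HTTP response into a string-to-string attribute map and an opaque byte payload.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FlowFileSketch:
    """Hypothetical stand-in for a FlowFile: attribute map plus opaque content."""
    content: bytes = b""
    attributes: Dict[str, str] = field(default_factory=dict)

    @property
    def file_size(self) -> int:
        # Derived from the content, similar in spirit to the 'fileSize' attribute.
        return len(self.content)


def from_http_response(raw: bytes) -> FlowFileSketch:
    """Split a raw HTTP response into header fields (attributes) and body (content)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    attributes = {"http.status.line": lines[0]}  # hypothetical attribute key
    for line in lines[1:]:
        name, _, value = line.partition(":")
        attributes[name.strip().lower()] = value.strip()
    return FlowFileSketch(content=body, attributes=attributes)
```

The content stays as raw bytes throughout, so nothing in the sketch needs to know the payload format.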
FlowFiles & Data Agnosticism
• NiFi is data agnostic!
• But NiFi was designed with the understanding that users can care about specifics, and it provides tooling to interact with specific formats, protocols, etc.
ISO 8601 - http://xkcd.com/1179/
Robustness principle
“Be conservative in what you do, be liberal in what you accept from others”
Apache MiNiFi
• NiFi lives in the data center. Give it an enterprise server or a cluster of them.
• MiNiFi lives as close as possible to where data is born and is a guest on that device or system.
“Let me get the key parts of NiFi close to where data begins and provide bidirectional data transfer”
Apache MiNiFi
Realities of computing outside the comforts of the data center
• Limited computing capability
• Limited power/network
• Restricted software library/platform availability
• No UI
• Physically inaccessible
• Not frequently updated
• Competing standards/protocols
• Scalability
• Privacy & Security
MiNiFi: Precedent from NiFi
A quick look at NiFi Site to Site
• Provides the semantics between two NiFi components across network boundaries
– A custom protocol for inter-NiFi communication
– Secure, Extensible, Load Balanced & Scalable Delivery to Cluster
• Extracted out to a client library which powers integration into popular frameworks like Apache Spark, Apache Storm, Apache Flink, and Apache Apex
• Attributes and the FlowFile format maintained
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
MiNiFi: Precedent from NiFi
A deeper dive into provenance
• Fine-grained, event-level access to interactions with FlowFiles
– CREATE, RECEIVE, FETCH, SEND, DOWNLOAD, DROP, EXPIRE, FORK, JOIN …
• Captures the associated attributes/metadata at the time of the event
• A map of a FlowFile’s journey and how it relates to other FlowFiles in a system
– MiNiFi enables us to get more and further illuminate the map of data processing
http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data-provenance
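The provenance idea on this slide, an event per interaction plus a snapshot of attributes at that moment, can be sketched as follows. This is a toy model under stated assumptions, not NiFi's provenance repository: `ProvenanceEventSketch`, `ProvenanceLog`, and the identifiers used are hypothetical. Only the event type names (RECEIVE, SEND, CREATE) come from the slide.

```python
import time
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProvenanceEventSketch:
    event_type: str             # e.g. "CREATE", "RECEIVE", "SEND", "DROP"
    flowfile_id: str
    timestamp: float
    attributes: Dict[str, str]  # snapshot of attributes at event time


class ProvenanceLog:
    """Append-only list of events, queryable per FlowFile identifier."""

    def __init__(self) -> None:
        self._events: List[ProvenanceEventSketch] = []

    def record(self, event_type: str, flowfile_id: str, attributes: Dict[str, str]):
        # Copy the attributes so later changes do not alter the recorded snapshot.
        event = ProvenanceEventSketch(event_type, flowfile_id, time.time(), dict(attributes))
        self._events.append(event)
        return event

    def lineage(self, flowfile_id: str) -> List[str]:
        """Return the ordered event types recorded for one FlowFile."""
        return [e.event_type for e in self._events if e.flowfile_id == flowfile_id]
```

Because each event copies the attribute map, the log preserves what was known at the time of the event even if the attributes change afterwards.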
MiNiFi: Precedent from NiFi
RECEIVE event
Apache MiNiFi
Departures from NiFi in getting the right fit
• The feedback loop is longer and not guaranteed
– Removal of Web Server and UI
• Declarative configuration
– Lends itself well to configuration management (CM) processes
– Extensible interface to support varying formats
   • Currently provided in YAML
• Reduced set of bundled components
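One reason a declarative configuration suits configuration management is that a flow described as plain data can be checked before it is ever deployed. The Python sketch below illustrates that idea only. It does not reproduce MiNiFi's actual YAML schema: the keys `processors`, `connections`, `source`, and `destination`, the type strings, and the function `validate_flow` are all assumptions made for the example.

```python
def validate_flow(config):
    """Return a list of problems found in a declarative flow description."""
    problems = []
    names = [p["name"] for p in config.get("processors", [])]
    if len(names) != len(set(names)):
        problems.append("duplicate processor names")
    for conn in config.get("connections", []):
        for end in ("source", "destination"):
            # Every connection endpoint must refer to a declared processor.
            if conn[end] not in names:
                problems.append(f"connection references unknown processor: {conn[end]}")
    return problems


# Hypothetical flow description; the type strings are placeholders.
EXAMPLE_FLOW = {
    "processors": [
        {"name": "collect", "type": "ExampleSource"},
        {"name": "forward", "type": "ExampleSink"},
    ],
    "connections": [
        {"source": "collect", "destination": "forward"},
    ],
}
```

A check like `validate_flow(EXAMPLE_FLOW)` could run in a review or deployment pipeline, which matters when the agent has no UI and the feedback loop is long.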
Apache MiNiFi: Scoping
Provide all the key principles of NiFi in varying, smaller footprints
• Go small: Java – Write once, run anywhere*
– Feature parity and reuse of core NiFi libraries
• Go smaller: C++ – Write once**, run anywhere
• Go smallest: Write n-many times, run anywhere
Language libraries to support tagging, FlowFile format, Site to Site protocol, and provenance generation without a processing framework
– Mobile: Android & iOS
– Language SDKs
WHAT IS THIS!?
A NiFi FOR ANTS!?!
MiNiFi: Use Case - Connected Car
• Outside vehicle’s network firewall
• On telematics layer
Diagram labels: VEHICLE NETWORK FIREWALL; TRANSMIT, EXECUTE, FILTER, PRIORITIZE, PARSE, LISTEN, ROUTE
Connecting the Drops
SOURCES
REGIONAL INFRASTRUCTURE
CORE INFRASTRUCTURE
Use Case: Courier Service with Apache NiFi & MiNiFi
Gathering data from disparate sources
Diagram locations: Physical Store; Distribution Center; Core Data Center at HQ; On Delivery Routes
Diagram components: Gateway Server, Mobile Devices, Registers, Server Cluster, Kafka, Storm / Spark / Flink / Apex, Others, Trucks, Deliverers, NiFi, MiNiFi, Client Libraries
Icon credits:
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
Data Provenance
▪ Constrained
▪ High-latency
▪ Localized context
▪ Hybrid – cloud/on-premises
▪ Low-latency
▪ Global context
Origin – attribution
Replay – recovery
Evolution of topologies
Long retention
Types of Lineage
• Event
• Configuration
Apache NiFi: The Ecosystem
We’ve provided a framework to extend the reach of data ingest
• Site-to-Site in MiNiFi instances provides machine-to-machine (M2M) communication
– Data arrives at NiFi transparently, allowing integration with existing flows
• Similar attention to extensibility in both Java and C++ clients allows agents to fit the needs of your organization
• Reduced footprint allows NiFi functionality to aid in production of high-fidelity data, more closely attributable and tracked from where it is generated
Apache NiFi: The Ecosystem
But more instances complicate my operational management!
• Enter the MiNiFi Command & Control
– Provide tooling to map the UX of interactive command and control in NiFi to the design and deploy approach of MiNiFi
https://cwiki.apache.org/confluence/display/MINIFI/MiNiFi+Command+and+Control
Apache NiFi: The Ecosystem
Building on efforts for reusable components in the community
• Configuration Management of Flows & Versioning
– The evolution of templates to better support SDLC functions
– https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
• Extension Repositories
– Publish & Share extension bundles (NARs)
– https://cwiki.apache.org/confluence/display/NIFI/Extension+Repositories+%28aka+Extension+Registry%29+for+Dynamically-loaded+Extensions
• Variable Registry
– Initial framework support & file-based implementation
– https://cwiki.apache.org/confluence/display/NIFI/Variable+Registry
Why Apache NiFi & MiNiFi?
• Moving data poses multifaceted challenges, and these are present in different contexts at varying scopes
– Think of our courier example and organizations like it: inter vs. intra, domestically, internationally
• Provide tooling and extensions that are commonly needed, but remain flexible for extension
– Leverage existing libraries and the expansive Java ecosystem for functionality
– Allow organizations to integrate with their existing infrastructure
• Empower the folks managing your infrastructure to make changes and reason about issues that are occurring
– Data Provenance to show context and data’s journey
– User Interface/Experience is a key component
Learn more and join us!
Apache NiFi site
https://nifi.apache.org
Subproject MiNiFi site
https://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
https://issues.apache.org/jira/browse/MINIFI
Follow us on Twitter
@apachenifi
Apache NiFi Crash Course
Thursday, 6 April
11:15 AM – 1:45 PM, Room 12
• Learn more about NiFi, the community, and work through a hands-on lab
• Seats available on a first come, first served basis
• Make sure you are in possession of the latest version of VirtualBox
• More details: https://tinyurl.com/nifi-cc-munich17
Learn, Share at Birds of a Feather
IOT, STREAMING & DATA FLOW
Thursday, April 6
5:50 pm, Room 5
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
 
Apache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scaleApache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scaleAbdelkrim Hadjidj
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseAldrin Piri
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifiAnshuman Ghosh
 
BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiAldrin Piri
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiDataWorks Summit
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop EcosystemApache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop EcosystemBryan Bende
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHaimo Liu
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiDataWorks Summit
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Timothy Spann
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks
 
Apache NiFi User Guide
Apache NiFi User GuideApache NiFi User Guide
Apache NiFi User GuideDeon Huang
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiDataWorks Summit
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Timothy Spann
 

Was ist angesagt? (20)

Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Apache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scaleApache NiFi: latest developments for flow management at scale
Apache NiFi: latest developments for flow management at scale
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
 
BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFi
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop EcosystemApache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFIHarnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
Harnessing Data-in-Motion with HDF 2.0, introduction to Apache NIFI/MINIFI
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
 
Apache NiFi Crash Course Intro
Apache NiFi Crash Course IntroApache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Apache NiFi User Guide
Apache NiFi User GuideApache NiFi User Guide
Apache NiFi User Guide
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016
 

Ähnlich wie Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi

Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiDataWorks Summit
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupJoseph Witt
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFiHortonworks
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionMilind Pandit
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiJoe Percivall
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityAccumulo Summit
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018
IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018
IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018Timothy Spann
 
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisDataWorks Summit/Hadoop Summit
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveBryan Bende
 
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方HortonworksJapan
 
Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5Hortonworks
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerDataWorks Summit
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiDataWorks Summit
 

Ähnlich wie Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi (20)

Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming Meetup
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi Introduction
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018
IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018
IoT Edge Processing with Apache NiFi and MiniFi and Apache MXNet for IoT NY 2018
 
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
 
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方Apache NiFi + Tensorflow + Hadoop:Big Data AI サンドイッチの作り方
Apache NiFi + Tensorflow + Hadoop: Big Data AI サンドイッチの作り方
 
Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5Webinar Series Part 5 New Features of HDF 5
Webinar Series Part 5 New Features of HDF 5
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
Curing the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging ManagerCuring the Kafka blindness—Streams Messaging Manager
Curing the Kafka blindness—Streams Messaging Manager
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFi
 

Kürzlich hochgeladen

VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi

  • 1. Data at Scales & the Values of Starting Small Aldrin Piri - @aldrinpiri DataWorks Summit 2017 – Munich
  • 2. © Hortonworks Inc. 2011 – 2016. All Rights Reserved. Key: 'Apache NiFi', Value: 'PMC Member'; Key: 'Work', Value: 'Sr. Member of Technical Staff @ Hortonworks'; Key: 'Working with NiFi Since', Value: '2010'
  • 3. Agenda: Apache NiFi: A Primer; Apache MiNiFi; Architecture; Apache NiFi: The Ecosystem; Community
  • 4. Byte Scales for Data. SI prefixes: (none) 10^0, kilo 10^3, mega 10^6, giga 10^9, tera 10^12, peta 10^15, exa 10^18, zetta 10^21, yotta 10^24. Chart annotations: “Big Data”, “everything else”, “Greek for”
  • 5. The Problem at Hand. Producers, a.k.a. Things: anything AND everything. Internet! Consumers: User, Storage, System, …More Things
  • 6. Use Case: Courier Service. Gathering data from disparate sources. Diagram labels: Physical Store, Gateway Server, Mobile Devices, Registers, Server Cluster, Distribution Center, Kafka, Core Data Center at HQ, Server Cluster, Others, Storm / Spark / Flink / Apex, On Delivery Routes, Trucks, Deliverers, NiFi (repeated across the diagram). Icon credits: Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/; Deliverer: Rigo Peter, https://thenounproject.com/rigo/; Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/; Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
  • 7. Apache NiFi: A Primer. Key Features and Principles
      • Guaranteed delivery
      • Data buffering
        - Backpressure
        - Pressure release
      • Prioritized queuing
      • Flow specific QoS
        - Latency vs. throughput
        - Loss tolerance
      • Data provenance
      • Recovery/recording a rolling log of fine-grained history
      • Visual command and control
      • Flow templates
      • Pluggable/multi-role security
      • Designed for extension
      • Clustering
  • 8. NiFi is based on Flow Based Programming (FBP)
      FBP Term | NiFi Term | Description
      Information Packet | FlowFile | Each object moving through the system.
      Black Box | FlowFile Processor | Performs the work, doing some combination of data routing, transformation, or mediation between systems.
      Bounded Buffer | Connection | The linkage between processors, acting as a queue and allowing various processes to interact at differing rates.
      Scheduler | Flow Controller | Maintains the knowledge of how processes are connected, and manages the threads and allocations thereof which all processes use.
      Subnet | Process Group | A set of processes and their connections, which can receive and send data via ports. A process group allows creation of an entirely new component simply by composition of its components.
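To make the FBP vocabulary on slide 8 concrete, here is a minimal Python sketch of a FlowFile, a bounded Connection, a processor, and one scheduler step. Every class and function name below is hypothetical and chosen for illustration; none of it is NiFi's actual (Java) API. It only shows how a full bounded buffer can hold back the upstream step, which is the idea behind backpressure.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Information Packet: opaque content plus an attribute map."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """Bounded Buffer: a queue linking two processors (hypothetical class)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._queue = deque()

    def is_full(self) -> bool:
        return len(self._queue) >= self.capacity

    def offer(self, flowfile: FlowFile) -> bool:
        """Enqueue unless the buffer is full; False signals backpressure."""
        if self.is_full():
            return False
        self._queue.append(flowfile)
        return True

    def poll(self):
        """Dequeue the oldest FlowFile, or None when empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)


def uppercase_processor(flowfile: FlowFile) -> FlowFile:
    """Black Box: transforms the content and tags the result."""
    attributes = {**flowfile.attributes, "transformed": "true"}
    return FlowFile(flowfile.content.upper(), attributes)


def run_once(source: Connection, processor, destination: Connection) -> bool:
    """Scheduler step: move one FlowFile only if the destination has room."""
    if destination.is_full():
        return False  # backpressure: leave the FlowFile queued upstream
    flowfile = source.poll()
    if flowfile is None:
        return False
    destination.offer(processor(flowfile))
    return True


inbound, outbound = Connection(capacity=10), Connection(capacity=1)
inbound.offer(FlowFile(b"hello", {"filename": "a.txt"}))
inbound.offer(FlowFile(b"world", {"filename": "b.txt"}))
first = run_once(inbound, uppercase_processor, outbound)   # moves one FlowFile
second = run_once(inbound, uppercase_processor, outbound)  # held back: outbound is full
```

In this sketch the second FlowFile stays in `inbound` until something drains `outbound`, so the slower consumer sets the pace without any data being dropped.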
  • 9. FlowFiles are like HTTP data
      HTTP Data (Header, then Content):
        HTTP/1.1 200 OK
        Date: Sun, 10 Oct 2010 23:26:07 GMT
        Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
        Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
        ETag: "45b6-834-49130cc1182c0"
        Accept-Ranges: bytes
        Content-Length: 13
        Connection: close
        Content-Type: text/html

        Hello world!
      FlowFile (Header, then Content):
        Standard FlowFile Attributes
          Key: 'entryDate', Value: 'Fri Jun 17 17:15:04 EDT 2016'
          Key: 'lineageStartDate', Value: 'Fri Jun 17 17:15:04 EDT 2016'
          Key: 'fileSize', Value: '23609'
        FlowFile Attribute Map Content
          Key: 'filename', Value: '15650246997242'
          Key: 'path', Value: './'
        Binary Content *
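The analogy on slide 9 can be sketched in a few lines of Python: an HTTP message splits into headers and a body, and a FlowFile pairs an attribute map with opaque content in the same way. The function names and the `http.headers.` attribute prefix below are assumptions made for this illustration, not a statement of how NiFi itself names things; only `fileSize` is taken from the slide.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP message into status line, header dict, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return status_line, headers, body


def to_flowfile(headers: dict, body: bytes) -> dict:
    """Map headers to attributes and the body to content (illustrative)."""
    attributes = {f"http.headers.{name}": value for name, value in headers.items()}
    attributes["fileSize"] = str(len(body))
    return {"attributes": attributes, "content": body}


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 13\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"Hello world!\n"
)
status, headers, body = split_http_message(raw)
flowfile = to_flowfile(headers, body)
```

The content is never interpreted here, only measured, which mirrors the point that the attribute map carries the metadata while the payload stays opaque.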
  • 10. FlowFiles & Data Agnosticism
      • NiFi is data agnostic!
      • But NiFi was designed with the understanding that users can care about specifics, and it provides tooling to interact with specific formats, protocols, etc.
      ISO 8601 - http://xkcd.com/1179/
      Robustness principle: “Be conservative in what you do, be liberal in what you accept from others”
  • 11.
  • 12. Apache MiNiFi
      • NiFi lives in the data center. Give it an enterprise server or a cluster of them.
      • MiNiFi lives as close as possible to where data is born and is a guest on that device or system.
      “Let me get the key parts of NiFi close to where data begins and provide bidirectional data transfer”
  • 13. Apache MiNiFi. Realities of computing outside the comforts of the data center
      • Limited computing capability
      • Limited power/network
      • Restricted software library/platform availability
      • No UI
      • Physically inaccessible
      • Not frequently updated
      • Competing standards/protocols
      • Scalability
      • Privacy & Security
  • 14. MiNiFi: Precedent from NiFi. A quick look at NiFi Site to Site
      • Provides the semantics between two NiFi components across network boundaries
        - A custom protocol for inter-NiFi communication
        - Secure, Extensible, Load Balanced & Scalable Delivery to Cluster
      • Extracted out to a client library which powers integration into popular frameworks like Apache Spark, Apache Storm, Apache Flink, and Apache Apex
      • Attributes and the FlowFile format maintained
      https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
  • 15. MiNiFi: Precedent from NiFi. A deeper dive into provenance
      • Fine-grained, event level access of interactions with FlowFiles
        - CREATE, RECEIVE, FETCH, SEND, DOWNLOAD, DROP, EXPIRE, FORK, JOIN …
      • Captures the associated attributes/metadata at the time of the event
      • A map of a FlowFile’s journey and how they relate to other FlowFiles in a system
        - MiNiFi enables us to get more and further illuminate the map of data processing
      http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data-provenance
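The provenance ideas on slide 15 (event-level records, an attribute snapshot taken at event time, and a map relating FlowFiles to one another) can be sketched roughly as follows. The `ProvenanceEvent` and `ProvenanceLog` classes are hypothetical and much simpler than NiFi's provenance repository; in particular, recording a fork on the child with a list of parent ids is a simplification assumed for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ProvenanceEvent:
    event_type: str                          # e.g. RECEIVE, FORK, SEND
    flowfile_id: str
    attributes: Tuple[Tuple[str, str], ...]  # snapshot taken at event time
    parent_ids: Tuple[str, ...] = ()


class ProvenanceLog:
    """Append-only event log with a simple lineage query (illustrative)."""

    def __init__(self):
        self.events: List[ProvenanceEvent] = []

    def record(self, event_type, flowfile_id, attributes, parent_ids=()):
        # Copy the attributes so later changes do not rewrite history.
        snapshot = tuple(sorted(attributes.items()))
        self.events.append(
            ProvenanceEvent(event_type, flowfile_id, snapshot, tuple(parent_ids))
        )

    def lineage(self, flowfile_id):
        """Events for a FlowFile and all of its ancestors, oldest first."""
        parents = {}
        for event in self.events:
            if event.parent_ids:
                parents.setdefault(event.flowfile_id, set()).update(event.parent_ids)
        wanted, stack = set(), [flowfile_id]
        while stack:
            current = stack.pop()
            if current not in wanted:
                wanted.add(current)
                stack.extend(parents.get(current, ()))
        return [(e.event_type, e.flowfile_id) for e in self.events if e.flowfile_id in wanted]


log = ProvenanceLog()
attrs = {"filename": "scan.csv"}
log.record("RECEIVE", "ff-1", attrs)
attrs["route"] = "priority"  # changed after the RECEIVE event was recorded
log.record("FORK", "ff-2", attrs, parent_ids=["ff-1"])
log.record("SEND", "ff-2", attrs)
log.record("RECEIVE", "ff-9", {"filename": "other.csv"})
trail = log.lineage("ff-2")
```

Here `trail` covers the forked FlowFile and its parent but not the unrelated `ff-9`, and the first event still shows the attributes as they were when the data was received.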
  • 16. MiNiFi: Precedent from NiFi. RECEIVE event
  • 17. Apache MiNiFi. Departures from NiFi in getting the right fit
      • The feedback loop is longer and not guaranteed
        - Removal of Web Server and UI
      • Declarative configuration
        - Lends itself well to CM processes
        - Extensible interface to support varying formats (currently provided in YAML)
      • Reduced set of bundled components
  • 18. Apache MiNiFi: Scoping. Provide all the key principles of NiFi in varying, smaller footprints
      • Go small: Java
        - Write once, run anywhere*
        - Feature parity and reuse of core NiFi libraries
      • Go smaller: C++
        - Write once**, run anywhere
      • Go smallest: Write n-many times, run anywhere. Language libraries to support tagging, FlowFile format, Site to Site protocol, and provenance generation without a processing framework
        - Mobile: Android & iOS
        - Language SDKs
  • 19. WHAT IS THIS!? A NiFi FOR ANTS!?!
  • 20. MiNiFi: Use Case - Connected Car
      • Outside vehicle’s network firewall
      • On telematics layer
      Diagram labels: VEHICLE NETWORK, FIREWALL, TRANSMIT, EXECUTE, FILTER, PRIORITIZE, PARSE, LISTEN, ROUTE
  • 21. Connecting the Drops: SOURCES, REGIONAL INFRASTRUCTURE, CORE INFRASTRUCTURE
  • 22. Use Case: Courier Service with Apache NiFi & MiNiFi. Gathering data from disparate sources. Diagram labels: Physical Store, Gateway Server, Mobile Devices, Registers, Server Cluster, Distribution Center, Kafka, Core Data Center at HQ, Server Cluster, Others, Storm / Spark / Flink / Apex, On Delivery Routes, Trucks, Deliverers, Client Libraries, MiNiFi, NiFi (each repeated across the diagram). Icon credits: Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/; Deliverer: Rigo Peter, https://thenounproject.com/rigo/; Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/; Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
  • 23. Data Provenance. Diagram labels: Constrained, High-latency, Localized context; Hybrid (cloud/on-premises), Low-latency, Global context; Origin (attribution); Replay (recovery); Evolution of topologies; Long retention. Types of Lineage: Event, Configuration
  • 24. Apache NiFi: The Ecosystem. We’ve provided a framework to extend the reach of data ingest
      • Site-to-Site in MiNiFi instances provides machine-to-machine (M2M) communication
        - Data arrives to NiFi in a transparent manner allowing integration to existing flows
      • Similar attention to extensibility in both Java and C++ clients allows agents to fit the needs of your organization
      • Reduced footprint allows NiFi functionality to aid in production of high fidelity data, more closely attributable and tracked from where it is generated
  • 25. Apache NiFi: The Ecosystem. But more instances complicate my operational management!
      • Enter the MiNiFi Command & Control
        - Provide tooling to map the UX of interactive command and control in NiFi to the design and deploy approach of MiNiFi
      https://cwiki.apache.org/confluence/display/MINIFI/MiNiFi+Command+and+Control
  • 26. Apache NiFi: The Ecosystem. Building on efforts for reusable components in the community
      • Configuration Management of Flows & Versioning
        - The evolution of templates to better support SDLC functions
        - https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
      • Extension Repositories
        - Publish & Share extension bundles (NARs)
        - https://cwiki.apache.org/confluence/display/NIFI/Extension+Repositories+%28aka+Extension+Registry%29+for+Dynamically-loaded+Extensions
      • Variable Registry
        - Initial framework support & file-based implementation
        - https://cwiki.apache.org/confluence/display/NIFI/Variable+Registry
  • 27. Why Apache NiFi & MiNiFi?
      • Moving data is multifaceted in its challenges and these are present in different contexts at varying scopes
        - Think of our courier example and organizations like it: inter vs intra, domestically, internationally
      • Provide common tooling and extensions that are commonly needed but be flexible for extension
        - Leverage existing libraries and expansive Java ecosystem for functionality
        - Allow organizations to integrate with their existing infrastructure
      • Empower folks managing your infrastructure to make changes and reason about issues that are occurring
        - Data Provenance to show context and data’s journey
        - User Interface/Experience a key component
  • 28. Learn more and join us!
      • Apache NiFi site: https://nifi.apache.org
      • Subproject MiNiFi site: https://nifi.apache.org/minifi/
      • Subscribe to and collaborate at: dev@nifi.apache.org, users@nifi.apache.org
      • Submit Ideas or Issues: https://issues.apache.org/jira/browse/NIFI, https://issues.apache.org/jira/browse/MINIFI
      • Follow us on Twitter: @apachenifi
  • 29. Apache NiFi Crash Course. Thursday, 6 April, 11:15 AM – 1:45 PM, Room 12
      • Learn more about NiFi, the community, and work through a hands-on lab
      • Seats available on a first come, first served basis
      • Make sure you are in possession of the latest version of VirtualBox
      • More details: https://tinyurl.com/nifi-cc-munich17
  • 30. Learn, Share at Birds of a Feather: IOT, STREAMING & DATA FLOW. Thursday, April 6, 5:50 pm, Room 5
  • 31. Thank You