7. • Created by the United States National Security Agency (NSA)
• Originally named NiagaraFiles
• In 2014 the NSA submitted the source code to the Apache
Software Foundation via the NSA Technology Transfer
Program; the project entered incubation in December 2014
• Development of Apache NiFi continued at Onyara, Inc., a
startup company
• Became an Apache Top-Level Project in July 2015
• Hortonworks acquired Onyara, Inc. in August 2015
9. • Data acquisition and delivery
• Simple transformation and data routing
• Simple event processing
• End to end provenance
• Edge intelligence and bi-directional communication
10. NOT intended to REPLACE
‘distributed computation engines’
(a.k.a. stream processing frameworks)
12. Highly configurable
• Loss tolerant vs guaranteed delivery
• Low latency vs high throughput
• Dynamic prioritization
• Flow can be modified at runtime
• Back pressure
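To make the back pressure bullet concrete, here is a minimal sketch of the idea: a bounded queue sits between two steps, and the upstream step stops pushing once the queue reaches its threshold. The names `Connection`, `threshold`, and `run_source` are hypothetical and chosen for illustration; this is not NiFi's API.

```python
from collections import deque


class Connection:
    """Hypothetical bounded queue between two processing steps."""

    def __init__(self, threshold):
        self.threshold = threshold  # max queued items before back pressure applies
        self.queue = deque()

    def is_full(self):
        return len(self.queue) >= self.threshold

    def offer(self, item):
        """Enqueue item unless back pressure is active; report whether it was accepted."""
        if self.is_full():
            return False
        self.queue.append(item)
        return True

    def poll(self):
        """Dequeue the oldest item, or return None when the queue is empty."""
        return self.queue.popleft() if self.queue else None


def run_source(items, connection):
    """Push items until the connection signals back pressure; return the unsent rest."""
    pending = deque(items)
    while pending and connection.offer(pending[0]):
        pending.popleft()
    return list(pending)


conn = Connection(threshold=3)
leftover = run_source(["a", "b", "c", "d", "e"], conn)  # stops once three items are queued
drained = conn.poll()                   # downstream consumes one item, freeing capacity
leftover = run_source(leftover, conn)   # upstream resumes and sends one more item
```

The source never drops data here; it simply waits until the downstream step drains the queue. A loss-tolerant configuration would instead discard or expire items when the queue is full.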
13. More…
• Designed for extension
• Build your own processors and more
• Secure
• SSL, SSH, HTTPS, encrypted content, etc.
• Multi-tenant authorization and internal authorization/policy management
• MiNiFi subproject
• Reduces footprint to ~40 MB
19. • Spout: a source of streams in a topology
• Bolt: a processing component; a bolt can also act as a sink
• Stream: an unbounded sequence of tuples, defined with a schema
• Stream groupings: define how a stream should be
partitioned among a bolt's tasks
• Topology: the logic of a realtime application, represented as a
DAG
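The concepts above can be illustrated with a small simulation in plain Python. This is a sketch only and does not use the Storm API: the function names (`sentence_spout`, `split_bolt`, `fields_grouping`, `run_topology`) are hypothetical, the stream is finite, and everything runs in a single process. It shows a fields grouping, where tuples with the same field value are always routed to the same bolt task.

```python
import zlib
from collections import Counter


def sentence_spout():
    """Spout: the source of the stream (finite here; unbounded in a real topology)."""
    for sentence in ["the cow jumped", "the moon rose", "the cow slept"]:
        yield (sentence,)


def split_bolt(tup):
    """Bolt: turns one sentence tuple into one tuple per word."""
    (sentence,) = tup
    return [(word,) for word in sentence.split()]


def fields_grouping(tup, field_index, num_tasks):
    """Fields grouping: tuples with the same field value always go to the same task."""
    key = tup[field_index].encode("utf-8")
    return zlib.crc32(key) % num_tasks  # deterministic hash of the grouping field


def run_topology(num_count_tasks=2):
    """Topology: spout -> split bolt -> (fields grouping on word) -> count bolt tasks."""
    count_tasks = [Counter() for _ in range(num_count_tasks)]
    for sentence_tuple in sentence_spout():
        for word_tuple in split_bolt(sentence_tuple):
            task = fields_grouping(word_tuple, 0, num_count_tasks)
            count_tasks[task][word_tuple[0]] += 1
    return count_tasks


tasks = run_topology()
totals = sum(tasks, Counter())  # merge the per-task counts
```

Because the grouping is keyed on the word, each word's count lives in exactly one task, so per-task counts can be merged without double counting. A shuffle grouping would instead spread tuples across tasks without regard to the key.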
21. Aspect              Core                     Trident
Computation Unit        Record (tuple)           Micro batch
Latency                 Very low (sub-second)    High (up to batch size), similar to Spark Streaming
Delivery Guarantee      At least once            Exactly once
API                     Compositional            Declarative
Stateful Operator       Supported from v1.0.0    Core feature (exactly-once)
Windowing: Time (processing time, event time), Count; Tumbling window, Sliding window
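The difference between tumbling and sliding windows is easiest to see with count-based windows. The following is a minimal sketch in plain Python, not Storm's windowing API; the function names are hypothetical, and only complete windows are emitted.

```python
def sliding_windows(events, size, slide):
    """Windows of `size` events that advance by `slide` events (overlap when slide < size)."""
    return [events[i:i + size] for i in range(0, len(events) - size + 1, slide)]


def tumbling_windows(events, size):
    """Non-overlapping windows: a sliding window whose slide equals its size."""
    return sliding_windows(events, size, slide=size)


events = [1, 2, 3, 4, 5, 6]
tumbling = tumbling_windows(events, size=3)
sliding = sliding_windows(events, size=3, slide=1)
```

With these inputs every event falls into exactly one tumbling window, while in the sliding case an event such as `3` appears in several consecutive windows. Time-based windows follow the same pattern, with the window boundaries defined by timestamps (processing time or event time) instead of event counts.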