Introduction to Apache NiFi 1.11.4

  1. Introduction to Apache NiFi 1.11.4. Timothy Spann, Principal DataFlow Field Engineer, Cloudera, @PaasDev
  2. Welcome to Future of Data - Princeton (@PaasDev)
     https://www.meetup.com/futureofdata-princeton/
     From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...
     © 2020 Cloudera, Inc. All rights reserved.
  3. Meetup Presenter: Who am I?
     • Principal DataFlow Field Engineer, @PaasDev
     • DZone Zone Leader and Big Data MVB
     • Princeton NJ Future of Data Meetup
     • ex-Pivotal Field Engineer
     • Apache Kafka, TensorFlow, Apache Spark RefCards
     https://github.com/tspannhw
     https://www.datainmotion.dev/
     https://dzone.com/users/297029/bunkertor.html
  5. Example Reference Architecture (diagram labels)
     • Sensors
     • Data flow apps powered by NiFi (Apache NiFi)
     • Data syndication service by Kafka (Apache Kafka; Kafka topic: iot)
     • Storage layer
     • Apache Impala
     • Model execution (Cloudera Machine Learning)
  6. Cloudera Flow Management
     Enable easy ingestion, routing, management and delivery of any data anywhere (edge, cloud, data center) to any downstream system, with built-in end-to-end security and provenance.
     ACQUIRE, PROCESS, DELIVER
     • Over 300 prebuilt processors
     • Easy to build your own
     • Parse, enrich and apply schema
     • Filter, split, merge and route
     • Throttle and backpressure
     • Guaranteed delivery
     • Full data provenance from acquisition to delivery
     • Diverse, non-traditional sources
     • Ecosystem integration
     Advanced tooling to industrialize flow development (Flow Development Life Cycle)
  7. NiFi 1.10
  8. Stateless Engine
     • Granular containers per flow
     • Flows from NiFi Registry
     Command: bin/nifi.sh stateless RunFromRegistry Continuous --file kafka.json
     https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
     https://github.com/apache/nifi/blob/ea1becac4fc519c54b8b4d21773e68f8da364755/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-stateless/README.md
  9. Stateless Engine
     • See also Parameters
     • Docker
     • YARN
     • Kubernetes (K8s)
     • Stateful NiFi clusters
     • Apache OpenWhisk (FaaS)
     Example configuration file:
       {
         "registryUrl": "http://tspann-mbp15-hw14277:18080",
         "bucketId": "140b30f0-5a47-4747-9021-19d4fde7f993",
         "flowId": "0540e1fd-c7ca-46fb-9296-e37632021945",
         "ssl": {
           "keystoreFile": "",
           "keystorePass": "",
           "keyPass": "",
           "keystoreType": "",
           "truststoreFile": "/Library/Java/JavaVirtualMachines/amazon-corretto-11.jdk/Contents/Home/lib/security/cacerts",
           "truststorePass": "changeit",
           "truststoreType": "JKS"
         },
         "parameters": {
           "broker": "4.317.852.100:9092",
           "topic": "iot",
           "group_id": "nifi-stateless-kafka-consumer",
           "DestinationDirectory": "/tmp/nifistateless/output2/",
           "output_dir": "/Users/tspann/Documents/nifi-1.10.0-SNAPSHOT/logs/output"
         }
       }
     https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
     https://github.com/tspannhw/stateless-examples
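The configuration file on slide 9 can also be generated rather than written by hand. The following is a minimal Python sketch, not part of the original deck: the helper names `build_stateless_config` and `stateless_command` are hypothetical, the registry URL and broker address are placeholder values, and the SSL block is left empty for brevity. The bucket and flow IDs and the command arguments are copied from slides 8 and 9.

```python
import json


def build_stateless_config(registry_url, bucket_id, flow_id, parameters, ssl=None):
    """Build a dict with the same top-level keys as the slide's kafka.json."""
    # Empty SSL settings by default; the key names mirror the slide's example.
    ssl_block = {
        "keystoreFile": "", "keystorePass": "", "keyPass": "", "keystoreType": "",
        "truststoreFile": "", "truststorePass": "", "truststoreType": "",
    }
    if ssl:
        ssl_block.update(ssl)
    return {
        "registryUrl": registry_url,
        "bucketId": bucket_id,
        "flowId": flow_id,
        "ssl": ssl_block,
        "parameters": dict(parameters),
    }


def stateless_command(config_path):
    """Return the argument list for the command shown on slide 8 (not executed here)."""
    return ["bin/nifi.sh", "stateless", "RunFromRegistry", "Continuous",
            "--file", str(config_path)]


config = build_stateless_config(
    registry_url="http://localhost:18080",   # placeholder, not from the slides
    bucket_id="140b30f0-5a47-4747-9021-19d4fde7f993",
    flow_id="0540e1fd-c7ca-46fb-9296-e37632021945",
    parameters={
        "broker": "localhost:9092",          # placeholder broker address
        "topic": "iot",
        "group_id": "nifi-stateless-kafka-consumer",
    },
)
config_json = json.dumps(config, indent=2)
command = stateless_command("kafka.json")
```

Writing `config_json` to `kafka.json` and running `command` from a NiFi 1.10+ installation directory would correspond to the invocation on slide 8; the sketch stops short of executing anything.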
  10. Parameters • Parameters • Parameter Context https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
  11. Parameters • Advanced Editors • Easy to Use • PARAM https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
  12. Parameters • Configure Externally with JSON Files to Execute Stateless Flows https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
  13. Parameters • Create / Edit Parameters from NiFi or in JSON Files https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
  14. Parameter Context • Sensitive or Normal • Connect to Multiple Process Groups https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
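In NiFi, a property value references a parameter with the `#{name}` syntax, and each parameter in a Parameter Context is either sensitive or normal. The sketch below illustrates that idea in plain Python; it is not NiFi's implementation. The `resolve` function, the dictionary layout of the context, and the masking string are all assumptions made for illustration, and escaping rules are not handled.

```python
import re

# Matches references of the form #{name}; a simplification for illustration.
PARAM_REF = re.compile(r"#\{([^}]+)\}")


def resolve(value, context, mask_sensitive=False):
    """Replace every #{name} in value with the parameter's value from context.

    context maps a parameter name to {"value": str, "sensitive": bool}.
    With mask_sensitive=True, sensitive values are shown as asterisks,
    roughly as a UI would display them.
    """
    def substitute(match):
        name = match.group(1)
        if name not in context:
            raise KeyError(f"unknown parameter: {name}")
        param = context[name]
        if mask_sensitive and param["sensitive"]:
            return "********"
        return param["value"]

    return PARAM_REF.sub(substitute, value)


# A hypothetical Parameter Context with two normal and one sensitive parameter.
kafka_context = {
    "broker": {"value": "localhost:9092", "sensitive": False},
    "topic": {"value": "iot", "sensitive": False},
    "password": {"value": "s3cret", "sensitive": True},
}

resolved = resolve("#{broker}/#{topic}", kafka_context)
masked = resolve("pw=#{password}", kafka_context, mask_sensitive=True)
```

Because the context is a separate object, the same dictionary could be passed when resolving properties for several process groups, which mirrors the "Connect to Multiple Process Groups" point on slide 14.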
  15. RetryFlowFile
     • Configurable Retries
     • Maximum #
     • Penalties
     • When to Fail
     • Reuse Mode
     https://medium.com/@abdelkrim.hadjidj/apache-nifi-1-10-series-simplifying-error-handling-7de86f130acd
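The retry behaviour listed on slide 15 can be pictured as a counter carried in a FlowFile attribute. The Python sketch below is a simplified model under stated assumptions, not the processor's code: the `FlowFile` class, the `route_retry` function, the attribute name, and the route labels are illustrative, and Reuse Mode is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi FlowFile: attributes plus a penalized flag."""
    attributes: dict = field(default_factory=dict)
    penalized: bool = False


def route_retry(flowfile, max_retries=3, retry_attribute="flowfile.retries",
                penalize=True):
    """Decide where a failed FlowFile goes next and update its retry counter.

    Returns one of the illustrative route labels "retry", "retries_exceeded",
    or "failure" (used when the counter attribute is not a number).
    """
    raw = flowfile.attributes.get(retry_attribute, "0")
    try:
        count = int(raw)
    except ValueError:
        return "failure"
    if count >= max_retries:
        return "retries_exceeded"
    # Still within budget: bump the counter and optionally penalize.
    flowfile.attributes[retry_attribute] = str(count + 1)
    flowfile.penalized = penalize
    return "retry"


# Simulate the same FlowFile failing repeatedly with a budget of 3 retries.
ff = FlowFile()
routes = [route_retry(ff, max_retries=3) for _ in range(4)]
```

With a budget of three, the first three passes are routed to retry and the fourth is routed to the exceeded route, which is where an alerting or dead-letter branch would typically attach.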
  16. BackPressure Prediction
     • OrdinaryLeastSquares
     • SimpleRegression
     • Enable analytics feature
     http://lonnifi.blogspot.com/2019/11/back-pressure-prediction-deep-dive.html?es_id=5233333939
     https://youtu.be/Tt8TSlHu7PE
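The idea behind back pressure prediction is to fit a line to recent queue-size observations and extrapolate to the moment the configured threshold would be reached. The Python sketch below shows that technique with a plain ordinary least squares fit on one variable; it illustrates the arithmetic only and does not reproduce NiFi's analytics code. The function names and sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def seconds_until_threshold(times, counts, threshold):
    """Estimate seconds from the last observation until counts reach threshold.

    Returns None when the queue is flat or draining, since the fitted line
    never reaches the threshold in that case.
    """
    slope, intercept = fit_line(times, counts)
    if slope <= 0:
        return None
    t_hit = (threshold - intercept) / slope
    return max(0.0, t_hit - times[-1])


# Hypothetical samples: queue depth observed once a minute, growing steadily.
sample_times = [0, 60, 120, 180]           # seconds
sample_counts = [1000, 2000, 3000, 4000]   # FlowFiles in the connection
eta = seconds_until_threshold(sample_times, sample_counts, threshold=10000)
```

For these samples the queue grows by 1000 FlowFiles per minute, so a threshold of 10000 objects would be reached about six minutes after the last observation.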
  17. ParquetReader / ParquetWriter Records
     • Native Record Processors for Apache Parquet Files!
     • CSV <-> Parquet
     • XML <-> Parquet
     • Avro <-> Parquet
     • JSON <-> Parquet
     • More...
     https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
     https://www.datainmotion.dev/2019/10/migrating-apache-flume-flows-to-apache_7.html
  18. PostSlack • Post Images to Slack
     https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
     https://www.datainmotion.dev/2019/11/nifi-110-postslack-easy-image-upload.html
  19. Remote Input Port in a Process Group
     • Put Remote Connections for Site-To-Site (S2S) Anywhere!
     • Not only top level
     • Drop down simplicity
     https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
  20. Many New Features
     • Prometheus Reporting Task
     • Experimental Encrypted content repository
     • PublishKafka Partition Support
     • Toolkit module to generate and build Swagger
     • GeoEnrichIPRecord Processor
     • Command Line Diagnostics
     • RocksDB FlowFile Repository
     • PutBigQueryStreaming Processor
     • Enhanced DevOps and CI/CD
     https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
     ELT/ETL Lookup Services
     • DatabaseRecordLookupService
     • KuduLookupService
     • HBase_2_ListLookupService
     https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.10.0
  21. NiFi 1.11 Features
     • Improved handling and support for partitions when sending data to Azure Event Hubs.
     • All repositories (Content, FlowFile, Provenance) can now be encrypted on disk, controlled at the application level.
     • Class loader isolation now includes isolating native libraries within the NARs. This is a huge help for interacting with many Hadoop vendors or other systems from the same NiFi cluster.
     • Keytab Credential Service is now supported for easily configured secure communication with the Hortonworks Schema Registry.
     • IBM MQ is now easier to integrate with the existing NiFi JMS processors.
     • Metrics Events Reporting Task
     • Rules Action Handler Lookup Service
     https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12346451
  22. Apache NiFi 1.11.4 Features
     Reporting Tasks (total number of reporting tasks). Examples of new components:
     - Prometheus Reporting Task
     - Azure Log Analytics RT
     - Azure Provenance RT
     - Query NiFi Reporting Task
     - Metrics Event Reporting Task
     Controller Services (total number of controller services). Examples of new components:
     - Rules Engine Controller Service
     - Kudu Lookup Service
     - Azure Storage Credentials
     - Amazon S3 Encryption Service
     - HBase List Lookup Service
     - Parquet Reader/Writer
     Processors (total number of processors). Examples of new components:
     - Accumulo processors
     - Put Elasticsearch Record
     - Put BigQuery Streaming
     - RetryFlowFile
  23. Other Features of Apache NiFi 1.11.4
     JDK 11 Support
     Improvements:
     - Class loading isolation with native libraries
     Security:
     - Encrypted content repository & flow file repository (tech preview)
     Operations Improvements:
     - Monitoring analytics and rule based monitoring
     - Parameters to improve CI/CD and support sensitive properties
     https://www.youtube.com/watch?v=IUjz-rhA3xs
  24. Cloud, VMs, Containers and Pods
     https://hub.docker.com/r/apache/nifi/
     https://hub.helm.sh/charts/cetic/nifi
  25. Example
  26. Useful Links
     https://www.datainmotion.dev/2020/02/connecting-apache-nifi-to-apache-atlas.html
     https://dev.to/tspannhw/quicktip-ingesting-google-analytics-api-with-apache-nifi-mg1
     https://dev.to/tspannhw/analyzing-wood-burning-stoves-with-flank-stack-minifi-flink-nifi-kafka-kudu-36on
     https://dev.to/tspannhw/cloudera-edge2ai-minifi-java-agent-with-raspberry-pi-and-thermal-camera-and-air-quality-sensor-part-1-3oo9
     https://dev.to/tspannhw/iot-series-minifi-agent-on-raspberry-pi-4-with-enviro-hat-for-environmental-monitoring-and-analytics-l8d
     https://dev.to/tspannhw/introducing-mm-flank-an-apache-flink-stack-for-rapid-streaming-development-from-edge-2-ai-5c12
     https://dev.to/tspannhw/nifi-1-10-postslack-easy-image-upload-22mh
     https://dev.to/tspannhw/nifi-toolkit-cli-for-nifi-1-10-213h
  27. THANK YOU
