Apache NiFi Toronto Meetup

Slides from the Apache NiFi meetup in Toronto

Published in: Technology

Apache NiFi Toronto Meetup

  1. Introducing Hortonworks DataFlow
     © Hortonworks Inc. 2011 – 2015. All Rights Reserved
  2. Simplistic View of Enterprise Data Flow
     Diagram labels: The Data Flow Thing; Process and Analyze Data; Acquire Data; Store Data
  3. Realistic View of Enterprise Data Flow
  4. Enterprise DataFlow Challenges
     Gather, deliver, prioritize: track from the edge through the datacenter
       • Variability in Data Protocols, Formats and Schemas
       • Data Size and Speed
       • Security at Data Plane
       • Traceability (Data Lineage)
       • Prioritization of Resources
       • Multi-Directional Flow
       • Recoverability and Replay
       • Transparency of DataFlow
       • Scaling Down
       • Enrichment/Transformation
       • Unreliable Comms
  5. Typical Answer to Challenges
     Add Systems…
       • Add new systems to handle the protocol differences
       • Add new systems to convert the data
       • Add new systems to reorder the data
       • Add new systems to filter the unauthorized data
       • Add new system to slow down or speed up data
       • Add new topics to represent ‘stages of the flow’
     And Complexity…
  6. Hortonworks DataFlow (Powered by Apache NiFi)
       • Visual User Interface: HTML 5, drag and drop, for agile execution
       • Provenance: Metadata for governance and compliance
       • Secure: End-to-End Data Routing with encryption and compression
  7. Manage Flow of Data in Real Time
     Operators
       • Transparency
       • Immediate feedback
       • Agility
     Data Scientists
       • Flexibility
       • Autonomy
  8. Track Flow of Data from Beginning to End
     IT and Cloud Operators
       • Understand Traceability, Lineage
       • Enable Recovery and Replay
     Compliance Regulations
       • Provide an Audit Trail
       • Remediation Capabilities
     Diagram labels: BEGIN; END; LINEAGE
  9. Secure Data at the Edge
     Beyond Simple Encryption
       • Enterprise authorization services: entitlements can change often
       • People and systems with different roles require different access levels
     Understanding and Classifying Data
       • Tagged/classified data traced
       • Understand who/what/when/where data is leveraged
  10. Common Apache NiFi Use Cases
       • Compliance: Gain full transparency into provenance and flow of data
       • Digital Security: Acquire and prioritize data into data lake for analysis
       • IoT Optimization: Secure, Prioritize, Enrich and Trace data at the edge
       • Fraud Detection: Move sales transaction data in real time to analyze on demand
       • Big Data Ingest: Easily and efficiently ingest data into Hadoop
       • Value Resources: Gain visibility into how data sources are used to determine value
  11. Architecture
      Master: NiFi Cluster Manager (NCM)
       • OS/Host running a JVM
       • Web Server
       • NiFi Cluster Manager – Request Replicator
      Slaves: NiFi Nodes (the diagram shows three identical nodes)
       • OS/Host running a JVM
       • Web Server
       • Flow Controller with Processor 1 through Extension N
       • Local Storage: FlowFile Repository, Content Repository, Provenance Repository
  12. Security
      Administration: Central management and consistent security
       • NiFi Cluster Manager
      Authentication: Authenticate users and systems
       • 2-Way SSL support out of the box; additional types coming
      Authorization: Provision access to data
       • Pluggable authorization designed to fit any Identity and Access Management (IAM) scheme
       • File-based authority provider out of the box
       • Multi-role
      Audit: Maintain a record of data access
       • Detailed logging of all user actions
       • Detailed logging of key system behaviors
       • Data Provenance enables unparalleled tracking from the edge through the Lake
      Data Protection: Protect data at rest and in motion
       • Support a variety of SSL/encrypted protocols
       • Tag and utilize tags on data for fine grained access controls
       • Encrypt/decrypt content using pre-shared key mechanisms
      Roles
       • Administrator: Configure system threads, user accounts, and flow audit history
       • Data Flow Manager: Manipulate the dataflow
       • Read Only: View the dataflow only
       • NiFi: Configure system threads, user accounts, and flow audit history
       • Proxy: Manipulate the dataflow
       • Provenance: Query the provenance repository and download content
  13. Apache NiFi User Quotes
       • “The NiFi user interface and ease of extension have made it extremely easy to get up and running and even customize. It is great that it also easily integrates with other parts of the Apache Big Data world like Spark, Kafka and Hadoop.” Craig Connell, Chief Technology Officer, Leverege
       • “NiFi's well designed, mature API has made our integration process remarkably straightforward. With it, we're able to track the origin, transformation, and persistence of data throughout our analytic processes.” Mike Bishop, Chief Systems Architect, Prescient Edge
       • “NiFi addresses dataflow challenges we have right now and provides upside for where we're heading. That it is designed for the global enterprise, is also a big win for us.” Alexandar Ryabov, Senior Director of Data Engineering, Wargaming.net
  14. Thank You
  15. Hortonworks DataFlow Use Cases
      Administer Flows, Enhance Security and Manage Equipment
  16. Data Flow Management
       • Data Ingestion
       • Data as a Service Provenance
       • Data Regulatory Compliance
  17. Data Flow Management: Data Ingestion
      Data Ingestion, with bi-directional intelligence and provenance metadata
       • Problem: Most ingest tools are unidirectional: data streams in the same way no matter what. They don’t preserve detail on in-flow data transformations.
       • Solution: HDF manages bi-directional, point-to-point data flows that are easily configured. Data reaches its destination with its provenance data intact.
       • Impact: Users can update data flow logic to always receive the data they need. Provenance data improves confidence in your insights.
      “The NiFi user interface and ease of extension have made it extremely easy to get up and running and even customize.” Craig Connell, CTO, Leverege
  18. Data Flow Management: Data as a Service Provenance
      Providers of data as a service assign value to data using NiFi’s provenance metadata
       • Problem: A new genre of companies provides data as a service. They have limited ability to prioritize which data is most valuable.
       • Solution: NiFi’s data provenance capabilities help DaaS companies understand (in much more detail) how their data is consumed.
       • Impact: They can understand which information resources are valuable and which are not. This helps them invest in capturing the most valuable data sources.
  19. Data Flow Management: Data Regulatory Compliance
      Firms Comply with Financial Regulations by Showing Complete Chain of Custody
       • Problem: Financial firms such as retail banks, capital markets firms and insurance companies are required to show chain of custody for certain transactions.
       • Solution: Apache NiFi’s data provenance capabilities show a complete chain of custody, for compliance with rules such as Basel capital requirements.
       • Impact: Firms can go back to a point in time and show regulators exactly what happened to a key piece of data in a transaction.
  20. Enhance Security
       • Asset and People Security
       • Secure Data Ingestion
       • Fraud and Theft Protection
  21. Enhance Security: Asset and People Security
      Prescient Edge Helps Its Customers Protect the Physical Safety of Their Personnel
       • Problem: Globally distributed firms and government agencies have personnel in risky areas. Prescient Edge provides analytics to protect employees.
       • Solution: The company uses Apache NiFi to feed real-time, unstructured data from dozens of sources to Prescient Edge analytics systems, to determine emergent threats.
       • Impact: Prescient Edge is able to provide its clients with detailed, up-to-the-minute threat and risk information, allowing those clients to respond quickly to safeguard their teams and assets.
      “With [Apache NiFi], we're able to track the origin, transformation, and persistence of data throughout our analytic processes.” Mike Bishop, Chief Systems Architect, Prescient Edge
  22. Enhance Security: Secure Data Ingestion
      A major US financial firm uses HDF to prioritize data ingest and speed time to protection
       • Problem: Digital security depends on the ability to detect threats quickly. Protection algorithms evaluate metadata with equal priority, slowing time to protection.
       • Solution: Apache NiFi helps to more effectively acquire, evaluate and prioritize security logs upstream, before they reach the analytics engine.
       • Impact: By prioritizing which data to send to its analytics engine, the company sees faster time to protection for its cyber assets.
  23. Enhance Security: Fraud and Theft Protection
      A huge US retailer uses Apache NiFi to reduce theft and shrinkage by hundreds of millions annually
       • Problem: Thieves shoplift merchandise in the morning and then return the stolen goods later the same day for credit to their card.
       • Solution: Apache NiFi pushes a real-time stream of inventory and transactional data into Hadoop more quickly, reducing the time to detect this fraudulent pattern.
       • Impact: The company expects to reduce shrinkage by hundreds of millions of dollars annually.
  24. Manage Equipment
       • Equipment Repair
       • Remote Security Protection
  25. Manage Equipment: Equipment Repair
      Global oil company uses Apache NiFi to prioritize which sensor data to send ashore from offshore rigs
       • Problem: Offshore oil rigs have physical constraints on their hardware footprints and associated bandwidth. Far more sensor data is generated than can be transmitted to shore.
       • Solution: Apache NiFi uses rules-based prioritization to determine which sensor data is most important and thus needs to be transmitted back first, for immediate analysis.
       • Impact: The ability to distinguish important readings from standard readings helps the company isolate important signals and take action to improve efficiency and safety.
  26. Manage Equipment: Remote Security Protection
      Firm with a high security profile enriches on-site video data to detect intrusions
       • Problem: Digital security cameras present a “needle in a haystack” problem. Individuals monitoring video feeds can be lulled by hundreds of hours where nothing happens.
       • Solution: Hortonworks DataFlow can identify a “trigger moment”, such as a human face appearing in a video, enrich that “trigger moment” with additional data and prioritize it for immediate analysis.
       • Impact: Analytics systems and analysts are able to more quickly sift through the “noise” to identify known human threats in a particular area.
  27. Apache NiFi User Quotes
       • “The NiFi user interface and ease of extension have made it extremely easy to get up and running and even customize. It is great that it also easily integrates with other parts of the Apache Big Data world like Spark, Kafka and Hadoop.” Craig Connell, Chief Technology Officer, Leverege
       • “NiFi's well designed, mature API has made our integration process remarkably straightforward. With it, we're able to track the origin, transformation, and persistence of data throughout our analytic processes.” Mike Bishop, Chief Systems Architect, Prescient Edge
       • “NiFi addresses dataflow challenges we have right now and provides upside for where we're heading. That it is designed for the global enterprise, is also a big win for us.” Alexandar Ryabov, Senior Director of Data Engineering, Wargaming.net
  28. Thank You
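
The rules-based prioritization described on slide 25 (more sensor data than the link can carry, so the most important readings go first) can be illustrated with a small priority queue. This is a minimal sketch in plain Python, not NiFi code or its API: the `Reading` type, the `RULES` table, the thresholds, and the `PrioritizedQueue` class are all hypothetical names invented for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Reading:
    """A single sensor reading (hypothetical shape, for illustration only)."""
    sensor: str
    value: float


# Hypothetical rules: (predicate, priority). A lower number is sent first.
RULES = [
    (lambda r: r.sensor == "pressure" and r.value > 900.0, 0),
    (lambda r: r.sensor == "temperature" and r.value > 120.0, 1),
]
DEFAULT_PRIORITY = 9  # readings that match no rule go last


def priority_of(reading):
    """Return the priority of the first matching rule, else the default."""
    for predicate, priority in RULES:
        if predicate(reading):
            return priority
    return DEFAULT_PRIORITY


class PrioritizedQueue:
    """Buffers readings and releases the most important ones first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: keeps arrival order within a priority

    def put(self, reading):
        heapq.heappush(self._heap, (priority_of(reading), next(self._seq), reading))

    def drain(self, budget):
        """Return up to `budget` readings, most important first."""
        sent = []
        while self._heap and len(sent) < budget:
            _, _, reading = heapq.heappop(self._heap)
            sent.append(reading)
        return sent
```

Calling `drain(budget)` with a small budget models a constrained uplink: readings that match a rule leave first, and everything else waits in the buffer until bandwidth is available.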