Introducing Cloudera DataFlow (CDF) 2.13.19

Watch this webinar to understand how Hortonworks DataFlow (HDF) has evolved into the new Cloudera DataFlow (CDF). Learn about the key capabilities that CDF delivers, such as:
- Powerful data ingestion powered by Apache NiFi
- Edge data collection by Apache MiNiFi
- IoT-scale streaming data processing with Apache Kafka
- Enterprise services to offer unified security and governance from edge to enterprise

Published in: Technology

  1. © Cloudera, Inc. All rights reserved. INTRODUCING CLOUDERA DATAFLOW (CDF). Dinesh Chandrasekhar, Product Marketing Lead, Data-in-Motion BU, Cloudera (@AppInt4All); George Vetticaden, Product Management Lead, Data-in-Motion BU, Cloudera (@gvetticaden)
  2. MARKET OPPORTUNITIES: Cloud ~$410 B; Streaming ~$1.65 B; Data Science ~$180 B; Big Data ~$210 B; IoT ~$1.2 T
  3. IOT MARKET: By 2024, more than 24.9 billion IoT connections will be established. An estimated $70 billion will be spent by global manufacturers on IoT solutions in 2020. An estimated 646 million healthcare devices (excluding fitness trackers and wearable devices) will be connected by 2020. An estimated 78% of cars shipped globally will be built with hardware that connects to the internet by 2020. 50% of decision-makers in IT, services, utilities, and manufacturing have either deployed IoT or will deploy it in the next 12-24 months.
  4. KEY CUSTOMER CHALLENGES. Visibility: Lack of visibility into end-to-end streaming data flows; inability to troubleshoot bottlenecks, consumption patterns, etc. Data Ingestion: High-volume streaming sources, multiple message formats, diverse protocols, and multi-vendor devices create data ingestion challenges. Real-time Insights: Analyzing the continuous and rapid inflow (velocity) of streaming data at high volumes creates major challenges for gaining real-time insights.
  5. CLOUDERA DATAFLOW
  6. WHAT IS CLOUDERA DATAFLOW (CDF)? Cloudera DataFlow (CDF) is a scalable, real-time streaming data platform that collects, curates, and analyzes data so customers gain key insights for immediate actionable intelligence.
  7. HISTORY OF CDF. Mid-2000s: NiFi was developed and used at NSA. 2015: Onyara is acquired; HDF is born. 2018: Strong streaming platform (support for Kafka 2.0; SMM is introduced). 2019: Cloudera merger. Enable Edge Intelligence. Tomorrow: Edge-to-AI; bring this to the edge with connected platforms. Data-in-Motion: • Comprehensive real-time streaming data platform • Manage data-in-motion from edge to enterprise • Power IoT-scale streaming architectures. Enable next-generation Modern Data Architecture.
  8. COMMON USE CASES. Data Movement: Optimize resource utilization by moving data between data centers or between on-premises infrastructure and cloud infrastructure. Optimize Log Collection & Analysis: Optimize log analytics solutions by using CDF as a single platform to collect and deliver multiple data sources. Gain key insights with Streaming Analytics: Accelerate big data ROI by analyzing streaming data for patterns, comparing with ML models, and delivering actionable intelligence. Single view / 360° view of customer: Ingest, transform, and combine customer data from multiple sources into a single data view / lake. Stream Processing: Combine multiple streams of data in real time, enrich the data, and route it to different end points based on rules. Capture IoT Data: Ingest sensor data from IoT devices and stream it for further processing and comprehensive analysis.
  9. COMMON IOT USE CASES BY INDUSTRY. Industries: Public Sector, Transportation, Utilities, Healthcare, Manufacturing, Retail. Use cases listed (the slide labels a "Top 5 Use cases" group): Fleet Management, Connected Cars, Smart Cities, Predictive Analytics, Inventory / Material Tracking, Utility Monitoring, Predictive Maintenance, Patient Monitoring, Usage-based Insurance, Asset Tracking / Monitoring, Edge Data Collection. • IoT is a $1.13T market opportunity in 2021. • Americas: $329B IoT spending. Manufacturing and Transportation are top industries, accounting for 26% of total spending. • APAC: $500B IoT spending. Manufacturing, Utilities, and Transportation are top industries. • EMEA: $264B IoT spending. Manufacturing is top industry, powered by Industry 4.0 initiatives. • Worldwide IoT Analytics and Information Management Market = $573M
  10. CUSTOMERS
  11. Improving Healthcare with SMART data (slide sections: Challenge, Result, Solution, Impact). Combine multi-format data streams, with hundreds of sources, into one platform: • Needed a platform that could combine multi-format data streaming • Data scarcity & latency problems • Machine learning & data science. Cloud-based systems architected to deliver SMART data, using HDP and CDF: • First to deliver SMART real-time streaming data • Clearsense's Inception™ product enables fast decisions for clinicians • Customers have access to all data sources with HDP & CDF. Mission-critical data and relevant insight for 2,000 rural providers: • Mission-critical data is now available for doctors to make critical decisions • Cost efficiencies led to access for 2,000 rural providers • Real-time data helps prevent "Code Blue". Lack of medical expertise around patient care, post surgery: • Patient Code Blue status • Possible cardiac arrest 4–6 hours post surgery. (Photo by rawpixel on Unsplash)
  12. Positioning technology products & services empower companies worldwide (slide sections: Challenge, Result, Solution, Impact). Provide accurate data for small carriers to improve business results: • 95% of small carriers (less than 50 trucks) have a deficit of data available • Estimated data, price points and revenue base opportunity for controlling fuel cost • Understanding of freight and lane movement. Big Data in the Cloud with HDP, CDF, and Microsoft Azure: • Leveraging big data powering Blockchain, with machine learning, to revolutionize Transportation and Logistics industries • Analyzed fuel data; can consolidate data set for small carriers to generate community data lake. Double-digit revenue increase, year over year: • Managing for 4 million trucks daily • $31 billion in freight movement guides customers to profitability • Blockchain-driven architecture. Continuing on current path would slow organizational growth and impact customers: • Being unable to predict weather patterns would lead to delays and decreased product quality • Operational inefficiencies prevent reaching business revenue goals, lack of insights • Loss of product during transportation. (Photo by rawpixel.com on Unsplash)
  13. PRODUCT OVERVIEW
  14. CLOUDERA DATAFLOW
  15. CLOUDERA DATAFLOW: Data-in-motion platform
  16. EDGE DATA MANAGEMENT • Edge data collection powered by Apache MiNiFi • MiNiFi: smaller footprint than NiFi • Guaranteed delivery • Data buffering • Prioritized queuing • Flow-specific QoS • Data provenance • Designed for extension • C++ / Java agents • Designed for IoT
  17. CLOUDERA DATAFLOW: Data-in-motion platform
  18. FLOW MANAGEMENT • Web-based user interface • Highly configurable • Out-of-the-box data provenance • Designed for extensibility • Secure • NiFi Registry • DevOps support • FDLC • Versioning • Deployment
  19. 280+ PROCESSORS FOR DEEPER ECOSYSTEM INTEGRATION: Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch, HTTP, Syslog, Email, HTML, Image, HL7, FTP, UDP, XML, SFTP, AMQP, WebSocket. All Apache project logos are trademarks of the ASF and the respective projects.
  20. CLOUDERA DATAFLOW: Data-in-motion platform
  21. Streaming Analytics Reference Architecture. Kafka is Everywhere: Critical Component of Streaming Architectures. Data Flow Apps Powered by NiFi. Pipeline stages: Kafka Producers; Kafka Topics; Kafka Consumers & Producers; Kafka Topics; Kafka Consumers. Sources: US West Fleet Truck Sensors (C++ Agent), US Central Fleet Truck Sensors (C++ Agent), US East Fleet Truck Sensors (C++ Agent). Consumers: Analytics App 1, Analytics App 2, Analytics App 3, Analytics App 4, Analytics App 5.
  22. Cloudera Streams Messaging Manager (SMM). What is SMM? • Kafka Management and Monitoring tool • Cure the "Kafka Blindness" • Single Monitoring Dashboard for all your Kafka Clusters across 4 entities: Broker, Producer, Topic, Consumer • REST as a First Class Citizen • Alerting • Schema Management • Integration with Schema Registry
  23. CLOUDERA DATAFLOW: Data-in-motion platform
  24. STREAMING ANALYTICS • Pattern matching • Predictive and Prescriptive Analytics • Complex Event Processing • Continuous & Real-time Insights
  25. 3 Kafka Analytics Access Patterns: Streaming Access Pattern (New); SQL Access Pattern, KAFKA SQL (New); OLAP Access Pattern, KAFKA OLAP (New). Streaming Event Storage Substrate: Kafka Topic A, Kafka Topic B, Kafka Topic C, Kafka Topic D, Kafka Topic X.
  26. CLOUDERA DATAFLOW: Data-in-motion platform
  27. ENTERPRISE SERVICES • Provisioning • Management • Monitoring • Unified Security • Single Sign-on • Audit • Compliance • Edge-to-Enterprise Governance
  28. CLOUDERA DATAFLOW: Data-in-motion platform
  29. KEY DIFFERENTIATORS. Comprehensive streaming platform: Only big data vendor to offer a comprehensive streaming platform from real-time data ingestion, transformation, and routing to descriptive, prescriptive, and predictive analytics. 100% open source technology: Only vendor with this strategy; prevents vendor lock-in. 280+ pre-built processors: Only product to offer such comprehensive connectivity from edge to enterprise. Built-in data provenance: Only product in the market to offer out-of-the-box data provenance on data-in-motion. 3 streaming analytics engines: Only vendor to offer a choice of three streaming analytics engines to customers for all their streaming architecture needs.
  30. DEMO
  31. QUESTIONS?