Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google Cloud, Josh Treichel & Jeff Ferguson, Confluent) Kafka Summit 2020

Apache Kafka users who want to leverage Google Cloud Platform's (GCP's) data analytics platform and open source hosting capabilities can bridge their existing Kafka infrastructure, on-premises or in other clouds, to GCP using Confluent's Replicator tool and managed Kafka service on GCP. Using actual customer examples and a reference architecture, we'll showcase how existing Kafka users can stream data to GCP and use it in popular tools like Apache Beam on Dataflow, BigQuery, Google Cloud Storage (GCS), Spark on Dataproc, and TensorFlow for data warehousing, data processing, data storage, and advanced analytics using AI and ML.

  1. Hybrid Streaming Analytics for Apache Kafka Users
  2. Cody Irwin, Solution Manager, Smart Analytics, Google Cloud; Josh Treichel, Sr. Manager, Partner Solutions, Confluent; Jeff Ferguson, Global Google Cloud Alliance Lead, Confluent
  3. Agenda: 1) Reference Architecture (5 min - Jeff Ferguson, Confluent) 2) Stream Analytics on GCP (10 min - Cody Irwin, Google Cloud) 3) Bridging your Kafka Deployment to GCP (10 min - Josh Treichel, Confluent) ➢ Customer Architectures and Stories ➢ Demo, Quickstart Guide, & Free Trial
  4. Apache Kafka is an Event Streaming Platform. Diagram labels: Hadoop; ...; Device Logs; ...; App; ...; Microservice; Mainframes; GCP Data Services; Splunk; ...; Data Stores; Logs; 3rd Party Apps; Custom Apps / Microservices; Same Day Transactioning (Account Open); Fees Charges & Billing; Real-time Customer 360; Machine Learning Models; Real-time Data Transformation; Real-time Fraud Detection. (1) App modernization (2) Data modernization (3) Bidirectional hybrid
  5. Hybrid Kafka Reference Architecture. Diagram labels: On-premises or Other Cloud; Dataflow; BigQuery; Cloud Storage; Data Studio; Cloud Functions; AI Platform; Bigtable; Confluent Replicator; KSQL; MySQL; HDFS; Teradata, Netezza; Mainframe
  6. Business is transforming. Businesses have to anticipate and act on risks and opportunities faster than ever before. The data and events needed for analysis are increasing in velocity, volume, and type. Companies that are able to quickly identify and capitalize on insights within this changing landscape have a strategic advantage.
  7. Why Enterprises choose Google Cloud for Streaming Analytics: Serverless Architecture; Robust ingestion services; Unified batch and stream processing; Comprehensive set of analysis tools; Flexibility for users
  8. Serverless data analytics: From infrastructure to platform for insights. The traditional data analytics platform: Analysis and insights; Performance tuning; Monitoring; Reliability; Deployment & configuration; Utilization improvements; Resource provisioning; Handling growing scale. The serverless data analytics model: Analysis and insights
  9. Right-time Action (Looker Blocks). 1 Dashboard: Visualize and share anomalous events in your data. 2 Alerts: Manage by exception through condition-based notifications. 3 Actions: Automatically trigger workflows in other systems using conditions.
  10. Comprehensive set of analysis tools. BigQuery (Cloud Data Warehouse). Easy setup: Directly integrated with streaming Dataflow and Confluent Cloud. Real time: Fast insights and action powered by BigQuery’s Streaming API. Intelligent: Built-in ML for out-of-the-box predictive insights. Cloud AI Platform (AI & ML Tools). Plug-and-play: Easily experiment and collaborate with Google’s AI Hub. Building blocks: Tools for sight, language, conversation, and structured data. Fast deployment: Code-based AI platform quickly moves ML ideas to deployment. TensorFlow Extended (TFX)
  11. Improve the customer experience with Real-time AI. TFX uses Dataflow and Apache Beam as the distributed data processing engine to enable several aspects of the ML life cycle, all supported with CI/CD for ML through Kubeflow Pipelines. Predictive Analytics; Fraud Detection; Real-time Personalization; More!
  12. AI for every level of expertise. Diagram labels (partly fused in extraction): APIs; Pre-trained Models; Infrastructure; AI Foundation; Building Blocks; Platform; AI Platform; Development Environment; AutoML; Custom Models; Services & Solutions; Ease of Implementation; Structured Data; Sight; Language; Conversation; Services; Solutions; Collaboration; Integrated with; Built-in Tools; On-prem; Frameworks; Accelerators; Document Understanding AI; Talent Solution; Contact Center AI; ASL; Professional Services; Cloud AI Partners; AI Hub; Video Intelligence; Vision; Natural Language; Translation; Inference; Recommendations AI; Speech-to-Text; Text-to-Speech; Dialogflow Enterprise; Tables; Video; Datasets; Training; Dataproc; Dataflow; Dataprep; Data Studio; BigQuery; Kubeflow; Predictions; Data Labeling; Pre-built Algorithms; Notebook; VM Images; GPU; TPU; CPU (several items carry a "New" badge)
  13. Flexibility for users: Unified stream and batch processing. Apache Beam: Open-source, unified model and set of SDKs for defining and executing data processing. Open source programming model: Serves as the SDK for creating Cloud Dataflow jobs; community development increases flexibility. Choose your language: Java, Python, Scala, and Go are available; join DA Spotlight for news on languages. Portability: Program in Beam, and gain the ability to move between Spark, Flink, Dataflow, and more. Dataflow: Simplified stream and batch data processing. Batch and Stream: Reduce complexity and reuse code by driving batch and stream workloads from the same tool. Reliable and consistent processing: Exactly-once processing with built-in support for fault-tolerant execution. Simplified operations & management: Performance, scaling, availability, security, and compliance handled automatically. Integrated: Integration with Kafka/Confluent Cloud, the Google Data Analytics suite, and GCP broadly
  14. Flexible stream analytics with OSS. Ingest: Ingest and distribute data reliably. Transform: Fast, correct computations quickly and simply. Analyze: Machine learning & data warehouse. Diagram labels: Cloud Dataflow; Cloud ML; Pub/Sub; BigQuery; Dataflow; KSQL
  15. The Business Case; Architectural Approach; Business Solutions: 1 FSI | Fraud Analytics, Trade Data Capture; 2 Retail | Recommendations, Inventory Management, POS Processing; 3 Manufacturing | Anomaly Detection, Edge-to-Cloud ML; 4 General | Real-time Clickstream, CDC; ∞ Many more to come!
  16. Kafka as the Real-Time Bridge Simplifies Cloud Migration. Diagram labels: On-premises or Other Cloud; Dataflow; BigQuery; Cloud Storage; Data Studio; Cloud Functions; AI Platform; Bigtable; Confluent Replicator; KSQL; MySQL; HDFS; Teradata, Netezza; Mainframe
  17. Confluent Replicator Architecture. Diagram labels: Origin; Kafka Broker (test-topic); Replicator (consumer, producer); events; Destination; Kafka Broker (test-topic). Make clusters globally available: Replicate clusters or a subset of topics across any distance. Aggregate or migrate clusters anywhere: Aggregate many clusters together or migrate entire clusters to a preferred environment. Bridge self-managed clusters to a fully managed Kafka service: Enable hybrid-cloud deployments with Confluent Cloud
  18. Confluent Cloud - Fully Managed Kafka and Much more! Schema Registry: Make data backwards compatible and future-proof. KSQL: Develop real-time stream processing apps writing only SQL. Connectors: Easily send data to cloud storage with BigQuery, GCS + more. Diagram labels: Schema Registry; Kafka topic; Serializer; Serializer
  19. Kafka enables Unity’s massive GCP migration. Diagram labels: Unity Monetization Platform & Gaming Dev Platform; Confluent Connector; Dataflow; Cloud Storage; Confluent Replicator; BigQuery; Other Cloud
  20. Unlock advanced AI/ML on GCP using data on prem. Diagram labels: On-premises or Other Cloud; Web; IoT; Mobile; Data Store; Mainframe, Hadoop, Oracle; Events; Confluent Replicator; KSQL; Dataflow; BigQuery; Train Fraud Models; Deploy Models; TensorFlow; Curated Data Streams; Full Data Stream; Fraud Applications; Fraud App Consumption and Production
  21. FinServ Fraud Analytics - On-prem to GCP. Diagram labels: On-premises; Confluent Connector; Cloud Dataproc; Cloud Dataflow; BigQuery; Cloud Storage; Cloud Bigtable; Cloud Machine Learning Engine; Confluent Replicator
  22. Give it a try: https://docs.confluent.io/current/tutorials/examples/kubernetes/replicator-gke-cc/docs/index.html Launch it from the GCP console. Confluent Cloud: $200 per month free for 3 months
  23. Thank You!
  24. What does a hybrid Kafka architecture look like on GCP? Stages: 1 Trigger & Send; 2 Ingest & Prepare; 3 Transform & Enrich; 4 Store & Analyze; 5 Share & Activate. Diagram labels: On-premises or Other Cloud; Web; IoT; Mobile; Data Store; Events; Confluent Replicator; Dataflow; BigQuery; Cloud Storage; Data Studio; Cloud Functions; AI Platform; Bigtable
  25. Unlock the value of event streaming
  26. Gaming & Media. Unity leveraged the Confluent hybrid Kafka platform to build a massive data infrastructure and migrated from AWS to GCP. This infrastructure powers the Unity Gaming Dev Platform and Monetization Network, scaling to process a million events per second with zero outages. Solution: Confluent was chosen for better control, enterprise scale, Kafka innovation, and guidance on Kafka architecture and best practices. Challenge: Bring together, unify, and modernize all the different data pipelines and technology stacks running in each department of the company, as well as migrate from AWS to GCP. “As a small team we have large responsibilities that include managing the data infrastructure that underpins the Unity platform and helping make Unity a data-driven company. That’s one of the reasons we built our data infrastructure on Confluent Platform and Apache Kafka. Today, this infrastructure handles on average about a half million events per second, with peaks of about a million events per second. It also reliably handles millions of dollars of monetary transactions. In fact, since we went live with Confluent Platform and Kafka a year ago we have had zero outages that resulted in money loss.” Oguz Kayral, Engineering Manager, Data Platform, Unity. Results: ● Completed a massive migration with petabytes of data from AWS to GCP ● Scaled to handle a million events per second and reliably handle millions of dollars of monetary transactions with zero outages ● A well-proven data infrastructure based on Confluent Platform, GCP Dataflow, and BigQuery analytics has opened a lot of new possibilities for product teams across Unity. Unity-Confluent blog / Unity-Google blog
  27. Other Solution Opportunities: 1) App modernization with Event-driven Microservices (with Anthos option) 2) Data Lake & Data Warehouse modernization (Mainframe, Oracle, Hadoop, Teradata) 3) IoT (Manufacturing, Utilities, Smart Cars, etc.)
  28. Other Use Cases: Confluent Cloud provides other opportunities to help customers in unique ways like...
  29. Reference Architecture. Columns: Ingest; Pipelines; Storage; Analytics; Application & Presentation. Diagram labels: App Engine; Kubernetes Engine; Cloud Storage; Cloud Dataflow; Cloud Dataflow; Cloud Datastore; Cloud Bigtable; BigQuery; Cloud Dataproc; Cloud Datalab; Compute Engine; colo / dc / on-premises / other cloud
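
Slide 17 describes Replicator as a consumer/producer pair that copies a subset of topics from an origin cluster to a destination cluster. The sketch below shows roughly what a Kafka Connect REST payload for such a connector could look like. It is an illustration under assumptions, not the speakers' configuration: the connector name and broker addresses are placeholders, the helper `build_replicator_config` is hypothetical, and the property names reflect the Replicator configuration reference as commonly documented, so they should be checked against the installed version. Security settings that a Confluent Cloud destination would need are omitted.

```python
import json


def build_replicator_config(name, src_bootstrap, dest_bootstrap, topics):
    """Build a Kafka Connect REST payload for a Replicator source connector.

    Hypothetical helper: all argument values are supplied by the caller.
    """
    if not topics:
        raise ValueError("at least one topic is required")
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
            # Origin cluster (on-premises or another cloud).
            "src.kafka.bootstrap.servers": src_bootstrap,
            # Destination cluster (for example a managed cluster on GCP).
            "dest.kafka.bootstrap.servers": dest_bootstrap,
            # Replicate only the listed topics rather than the whole cluster.
            "topic.whitelist": ",".join(topics),
            # Copy keys and values as raw bytes so they are not re-serialized.
            "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
            "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
            "tasks.max": "1",
        },
    }


# Placeholder addresses; "test-topic" is the topic name shown on the slide.
payload = build_replicator_config(
    "replicate-to-gcp", "onprem-broker:9092", "gcp-broker:9092", ["test-topic"]
)
print(json.dumps(payload, indent=2))
```

A payload like this would typically be submitted to a Kafka Connect worker's REST interface; that step is left out here because it depends on the deployment.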
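
Slide 18 says Schema Registry keeps data "backwards compatible and future-proof". As a rough illustration of what a backward-compatibility check means, the sketch below tests whether a reader using a new schema could still read records written with the old one. It is a deliberately simplified model in plain Python, not the Schema Registry API and not the full Avro resolution rules (type promotion, aliases, and unions are ignored). The `field` helper and the example schemas are hypothetical.

```python
_NO_DEFAULT = object()  # sentinel: the field declares no default value


def field(type_name, default=_NO_DEFAULT):
    """Describe a schema field as a (type, default) pair."""
    return (type_name, default)


def is_backward_compatible(old_schema, new_schema):
    """Return True if a reader on new_schema can read data written with old_schema.

    Simplified rules: a field present in both schemas must keep its type,
    and a field that exists only in the new schema must have a default.
    Fields removed in the new schema are simply ignored by the reader.
    """
    for name, (new_type, default) in new_schema.items():
        if name in old_schema:
            if old_schema[name][0] != new_type:
                return False
        elif default is _NO_DEFAULT:
            return False
    return True


v1 = {"id": field("string"), "amount": field("double")}
v2 = {**v1, "currency": field("string", default="USD")}  # new field with default
v3 = {**v1, "region": field("string")}                   # new field, no default

print(is_backward_compatible(v1, v2))  # True
print(is_backward_compatible(v1, v3))  # False
```

The design point is that consumers can be upgraded to the new schema first without breaking on records that older producers have already written to the topic.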
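
Slide 13 argues that driving batch and stream workloads from the same tool reduces complexity and lets code be reused. The sketch below illustrates only that idea: one transform applied unchanged to a bounded list and to an unbounded generator. It uses plain Python iterators, not the Apache Beam SDK, and it leaves out what Beam and Dataflow actually provide (windowing, distributed execution, exactly-once processing). The event shape, the threshold, and the function names are assumptions made for the example.

```python
import itertools


def flag_large_payments(events, threshold=1000.0):
    """Yield events at or above the threshold, tagged for review.

    Works lazily on any iterable, so the same code serves a bounded
    batch (a list) and an unbounded stream (a generator).
    """
    for event in events:
        if event["amount"] >= threshold:
            yield {**event, "flag": "review"}


# Batch: a bounded, in-memory dataset.
batch = [{"id": 1, "amount": 250.0}, {"id": 2, "amount": 5000.0}]
batch_result = list(flag_large_payments(batch))


def live_events():
    """Hypothetical unbounded source; stands in for a Kafka topic."""
    for i in itertools.count(start=1):
        yield {"id": i, "amount": 600.0 * i}


# Stream: take the first two flagged events from the endless source.
stream_result = list(itertools.islice(flag_large_payments(live_events()), 2))

print(batch_result)
print(stream_result)
```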
