Applying ML on your Data in Motion with AWS and Confluent | Joseph Morais, Confluent and Kanchan Waikar, AWS

  1. 1. Applying ML on your Data in Motion with AWS and Confluent Kanchan Waikar Senior Specialist Solutions Architect at AWS kanchanwaikar
  2. 2. © 2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Working with your data Data in your data lake Continuously generated clickstream data Third-party data procured from vendor Amazon Simple Storage Service (Amazon S3) Amazon Kinesis Amazon Managed Streaming for Apache Kafka AWS Data Exchange
  3. 3. Use AWS Data Exchange for procuring third-party data on AWS Data Providers No longer need to maintain data storage, delivery, billing, or entitling technology Automatically access new data Migrate existing subscriptions at no additional cost Easily analyze data as it's published Distribute data in a secure and compliant way Quickly find diverse data in one place Migrate existing subscriptions at no additional cost Reach millions of AWS customers Data Subscribers
  4. 4. The AWS ML stack Broadest and most complete set of machine learning capabilities ML FRAMEWORKS & INFRASTRUCTURE TensorFlow, PyTorch, Apache MXNet Deep learning AMIs & containers GPUs Inferentia Elastic inference FPGA AI SERVICES Vision Rekognition Speech Polly Transcribe Chatbots Lex Contact centers Contact Lens Connect Voice ID Code + DevOps CodeGuru DevOps Guru Text Comprehend Translate Textract Business tools Personalize, Forecast Fraud Detector Lookout for Metrics Search Kendra Industrial Panorama Appliance and SDK, Monitron, Lookout for Equipment, Lookout for Vision Healthcare HealthLake Comprehend Medical Transcribe Medical Label data Data collection prep Store features Detect bias and explain predictions Visualize in notebooks Pick algorithm Manage & monitor Train models faster Deploy in production Tune parameters Manage edge devices SAGEMAKER STUDIO IDE CI/CD AMAZON SAGEMAKER
  5. 5. AWS Marketplace Flexible consumption and contract models Quick and easy deployment Helpful humans to support you 8,000+ listings 1,600+ ISVs 24 regions 290,000+ customers 1.5M+ subscriptions
  6. 6. AWS Marketplace can help you get started Find A breadth of tools: Buy Free trial Pay-as-you-go Hourly | Monthly | Annual | Multi-Year Bring Your Own License (BYOL) Seller Private Offers Channel Partner Private Offers Through flexible pricing options: Deploy AWS Control Tower AWS Service Catalog AWS CloudFormation (Infrastructure as Code) Software as a Service (SaaS) Amazon Machine Image (AMI) Amazon Elastic Container Service (ECS) Amazon Elastic Kubernetes Service (EKS) With multiple deployment options:
  7. 7. Confluent on the AWS Marketplace (free trial) https://aws.amazon.com/marketplace
  8. 8. Pre-trained ML Models
  9. 9. ? Battle of security Buyer's data Vs Seller's IP Where to evaluate and qualify Which Model to buy Third-party AI Adoption challenges
  10. 10. Amazon SageMaker AWS Marketplace for Machine Learning Computer vision NLP Speech recognition Text Image Audio Video Structured AWS Marketplace for Machine Learning Simplified provisioning Consolidated into your AWS billing Free | Free trial | Paid subscriptions Curated and trusted catalog of hundreds of ML model packages and algorithms
  11. 11. A catalog of products from And more...
  12. 12. • Let you train a custom model. • Are ready-to-use. • Use models for: • Batch inference • Real-time inference • Generating Synthetic features • Use algorithms for: • Training a model • Hyperparameter optimization Pre-trained models Algorithms 2 1 E.g. Vehicle Damage Inspection E.g. AutoGluon Tabular Algorithm AWS Marketplace : Models and Algorithms Amazon SageMaker
  13. 13. Deploying ML models from AWS Marketplace 1 Find and try a model from AWS Marketplace 2 Subscribe 3 Deploy models in Amazon SageMaker with: • Network isolation mode (container has no internet access) • Endpoint configured in your VPC • IAM policies and encryption 4 Secure REST API access Batch transform job Real-time inference Amazon SageMaker Model from AWS Marketplace SECURE YOUR DATA BY DEPLOYING MODELS IN YOUR VPC, IN NETWORK ISOLATION MODE
  14. 14. Security Private AWS Marketplace AWS Service Catalog Scanning IAM policies Network isolation Ensures data protection Vulnerability scans Encryption Secure REST API access
  15. 15. Model packages: Perform inference Identify your customer Identify Passport Detail Page, etc. Generate better recommendations Identify brands, understand customer propensity to buy, etc. Improve workplace safety: Identify compliance Identify presence of non-workers at worksite, whether workers are wearing PPE, masks, etc. Insurance claim automation Claim prediction, vehicle's make/model/year identification, etc. Process call-center data intelligently Speech recognition, source separation, etc. To address business needs Want to solve your problem quickly? Yes Use pre-trained ML model packages Machine Learning problem?
  16. 16. Model packages: Identify high-quality data Identify and use good quality data Is all data of good quality? No Yes Identify and use good quality data Do Feature Engineering and train an ML model Image Image Quality Score: identify blurry or pixelated images Audio Background Noise Classifier (CPU): identify audio files with background noise Text Text summarizer: convert wordy documents into short summaries Phishing email classifier Dynamic AI Text Similarity Model: identify text similarity Language Scoring Inference Model: identify difficulty level
  17. 17. Model packages: Generate synthetic features A custom ML model needs high-quality features Can you use an out-of-the-box solution? No Feature engineer data Train a custom ML model Image Fashion localization, interior localization: extract features for recommendation engine Audio Source separation: separate lyrics from background sound and use lyrics for your transcribe job Text Insult detection: emotion analysis model Sentence classification: extract additional insights about characteristics associated with the author
  18. 18. Demo
  19. 19. Demo: Deploy model and perform inference 1 Choose ML model Subnet Deploy model in form of an endpoint VPC 2 Subscribe AWS CloudFormation Perform inference AWS Command Line Interface 3
  20. 20. Amazon MSK Fully managed, highly available, and secure Apache Kafka service Highly secure Protect your data with multiple levels of security, including VPC network isolation, encryption at-rest and in-transit, and more Highly available Take advantage of multi-AZ replication within an AWS region Elastic stream processing Run Apache Flink applications written in SQL, Java, or Scala that elastically scale to process data streams Fully managed Focus on creating applications not managing your Apache Kafka environment Fully compatible Run your existing Apache Kafka applications on AWS without changes to source code
  21. 21. A sample use-case
  22. 22. A real world streaming use-case Construction work-site Use-case: A non-compliance identification and notification system Goal: Reduce accidents by identifying and correcting non-compliance Infrastructure-scale: Tens of thousands of surveillance cameras, hundreds of construction sites, thousands of construction workers Project Requirement: Design an automated PPE non-compliance reporting system that reports incidents in a matter of minutes. Technical requirements: • Must scale up-and-down automatically based on loads (Cameras turned ON and OFF based on shift hours)
  23. 23. Construction site non-compliance notification system Courtesy - https://pixabay.com/videos/construction-road-excavator-worker-26239/
  24. 24. Summary Logs
      Format: (Start) HH:mm:SSS-(End) HH:mm:SSS : Alarm/No alarm : Status Details
      00:00:000-00:00:015 : No Alarm : 1 truck(s), 1 excavator(s), 1 workers found.
      00:00:015-00:00:045 : No Alarm : 2 truck(s), 1 excavator(s), 1 workers found.
      00:00:045-00:00:060 : No Alarm : 1 truck(s), 1 excavator(s), 1 workers found.
      00:00:060-00:00:075 : No Alarm : 1 truck(s), 1 excavator(s), no workers found.
      00:00:075-00:00:090 : ALARM : 1 worker(s) wearing PPE but 0 wearing hard hats, 1 truck(s), 1 excavator(s) found.
      00:00:090-00:00:105 : No Alarm : 1 truck(s), 1 excavator(s), no workers found.
      00:00:105-End : ALARM : 1 worker(s) wearing PPE but 0 wearing hard hats, 1 truck(s), 1 excavator(s) found.
      Snapshots: 0.0, 1.5, 3.0, 4.5, 6.0, 7.5, 9.0, 10.5
  25. 25. Components: Building ML-driven Streaming applications Amazon Simple Notification Service Notification mechanism Metadata storage ML Model Video/data storage Amazon QuickSight Amazon DynamoDB ML Model deployment mechanism Amazon SageMaker Model Data Visualization tool Amazon Athena Interactive Query service Amazon Simple Storage Service (S3) ksqlDB Event Streaming platform AWS Lambda Serverless compute
  26. 26. Amazon SageMaker Pre-trained ML Models AWS Lambda ksqlDB Email notification Safety Administrator Mobile client Amazon Simple Notification Service AWS Lambda S3 Sink S3 Bucket Topic Amazon Athena Alarms S3 Bucket Construction work-site Amazon Kinesis Video Streams
  27. 27. Demo!
  28. 28. ML Models AKTE - Forklift Detector Pre-trained Model PPE Detector for Laboratory Safety TensorIoT CV PPE Mask Detection Helmet & Vest Detector for Worker Safety Social Distancing Detector Hard Hat Detector for Worker Safety Construction Worker Detection Construction Machines Detector GluonCV YoloV3 Object Detector Pre-trained Model Pre-trained Model Pre-trained Model Pre-trained Model Pre-trained Model Pre-trained Model Pre-trained Model Pre-trained Model .. 400+ more! aws-mp-bd-ml@amazon.com
  29. 29. Conclusion Experiment, do POCs to evaluate solutions, and innovate on behalf of your organization Scale your applications using streaming platforms such as Confluent Cloud from AWS Marketplace Build powerful streaming applications powered by machine learning models
  30. 30. Questions? Kanchan Waikar Senior Specialist Solutions Architect at AWS kanchanwaikar

Editor's notes

  • Hi, my name is Kanchan Waikar. I am a Senior Specialist Solutions Architect at AWS, and today I am going to share how you can add differentiating features backed by machine learning to your applications.

    First, I will talk about the different types of data that you typically extract insights from, and then I will share how you can use hundreds of pre-trained ML models to build differentiating features. Finally, I will share a sample use case, and Joseph and I will demo how you would integrate ML models into your Kafka applications.
  • Often there are three types of datasets that you use: in-house data sitting in your data lake in your S3 buckets; clickstream or real-time data, also known as data in motion; and third-party data that you procure from your data vendors.

    Amazon S3 is the largest and most performant object storage service for structured and unstructured data, and the storage service of choice for building a data lake. There are several tools and services that you can use to build your data lake in S3.

    For your data in motion, Amazon Kinesis offers a variety of key capabilities, such as Data Streams for your clickstream data, Firehose for persisting data in S3, Kinesis Analytics for real-time analytics, and Video Streams for video data. And then there is Amazon Managed Streaming for Apache Kafka, a managed service that you can use to build and run applications that use Apache Kafka to process streaming data.

    Many customers have in-house ML capabilities and time, and these customers need external, real-world, high-quality data. However, they need to worry about moving this data into their AWS cloud, not just once but at regular intervals, since third-party data vendors often produce and share data at regular intervals. This is exactly the problem AWS Data Exchange solves.
  • AWS Data Exchange makes it easy for you to procure third-party data from your data vendors. AWS Data Exchange contains over three thousand data products, and once you have procured one, the data can be loaded into your S3 bucket. Once it's in your data lake, you can use whichever tools you wish to perform analytics.

    AWS Data Exchange also supports incremental data delivery from sellers, so whenever a new revision of a dataset you have subscribed to becomes available, you get a CloudWatch event notification that you can use to consume the incremental data as part of your application.
    Once you have data in place, you can perform machine learning and extract the insights your business needs.
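As a rough illustration of consuming that revision notification, the sketch below pulls the data set ID and the new revision IDs out of an event. The event shape used here (`source`, `resources`, `detail.RevisionIds`) is an assumption about how AWS Data Exchange events are laid out and should be checked against the current AWS documentation; the function name and the sample IDs are hypothetical.

```python
from typing import Dict, List


def revisions_from_event(event: Dict) -> Dict[str, object]:
    """Extract the data set ID and new revision IDs from a revision event.

    Assumed shape: the data set ID is the first entry in "resources" and
    the revision IDs are listed under detail["RevisionIds"].
    """
    if event.get("source") != "aws.dataexchange":
        raise ValueError("not an AWS Data Exchange event")
    resources: List[str] = event.get("resources", [])
    if not resources:
        raise ValueError("event carries no data set ID")
    revision_ids = event.get("detail", {}).get("RevisionIds", [])
    return {"data_set_id": resources[0], "revision_ids": list(revision_ids)}


# Hypothetical sample event; every ID below is made up for illustration.
sample_event = {
    "source": "aws.dataexchange",
    "detail-type": "Revision Published To Data Set",
    "resources": ["example-data-set-id"],
    "detail": {"RevisionIds": ["example-revision-id"]},
}

result = revisions_from_event(sample_event)
print(result)
```

A handler built this way could then start an export of only the listed revisions into the S3 data lake, instead of re-downloading the whole data set.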
  • And Amazon SageMaker can help you with that.

    You can use Amazon SageMaker to build, train, and deploy ML models. It provides several features that help you with the end-to-end delivery and management of machine learning models.

    If you prefer something out of the box, I recommend checking out the AI services suite.

    Amazon Rekognition makes it easy to add image and video analysis to your applications.

    Amazon Transcribe provides speech-to-text capability.

    Amazon Translate provides high-quality language translation.

    Amazon Comprehend is a natural language processing (NLP) service.

    Amazon Lex is a service for building conversational chatbots.

    Amazon Forecast helps you train forecasting ML models.

    And you can use Amazon Personalize to build recommendation systems.

    In short, AWS offers a range of AI, ML, and analytics services.


    And apart from first-party services, you also get a large selection of third-party AI and ML solutions in AWS Marketplace.
  • AWS Marketplace is where you find, try, buy, and deploy third-party software. It contains over 10,000 products across software categories such as machine learning, analytics, and data, among several others. They are easy to deploy, and a lot of products are even available as SaaS offerings, so you have nothing to deploy.

    And the billing of these third-party products is consolidated on your AWS bill.

  • The AWS Marketplace offers a wide variety of pricing options to fit your specific need, with many of the sellers providing free trials.

    AWS Marketplace supports standard pricing options, such as hourly, monthly, and annual. If you migrate to AWS and want to keep some of your existing tools, you can bring your own license as well.
     
    When you have a relationship with the seller or consultant, you can negotiate the price with them, and they can generate a private offer for you via AWS Marketplace.
     
  • In fact, the Confluent Cloud SaaS product for Kafka that Joseph spoke about can be procured via AWS Marketplace.

    Confluent Cloud helps you manage Apache Kafka, Schema Registry, Connect, and ksqlDB so you can effectively focus on development and delivery for your real-time streaming and analytics use cases.
    Now I am going to tell you about pre-trained machine learning models, which you can use to instantly add ML-backed differentiating features to your application.
  • A pre-trained ML model is an entity that accepts an input payload and returns a prediction. A pre-trained ML model typically solves one type of problem.
    For example, there is an ML model that accepts a picture of a car as input and returns a prediction of the car's make, model, and year.

    There is another pre-trained machine learning model that identifies whether a person is wearing a mask or not.

    Customers like using pre-trained models because they let you avoid the heavy lifting of hiring ML resources and training and tuning ML models from scratch.

    So users look for high-quality pre-trained ML models, typically developed by machine learning vendors. However, it can be tricky to use third-party ML solutions.
  • There are plenty of high-quality models available from technology companies, but during the initial discussion phase itself, an important question arises: where to evaluate and qualify the model?

    Does it happen in the seller's environment or in the buyer's environment?

    The seller wants to protect their IP.

    And the buyer wants to protect their data (which is often sensitive to their business).

    You also need to learn a seller-specific interface to interact with the model.

    We heard all these challenges from customers and decided to solve them via AWS Marketplace.

  • You can find hundreds of third-party machine learning models and algorithms that you can try, buy, and deploy in your AWS environment via Amazon SageMaker.

    You can try them without having to learn different API interfaces just to be able to evaluate them.
  • AWS Marketplace contains a large set of models from leading machine learning ISVs (independent software vendors) and some open-source frameworks from AWS too.
  • There are two types of Amazon SageMaker-compatible machine learning products.

    The first is a model package: an ML model is an entity that accepts an input payload and returns a prediction.

    The second is an algorithm, which you use to train a custom ML model.

    For example, say you have a large dataset you want to train a regression model on. You can use a high-performance AutoML algorithm such as AutoGluon-Tabular from AWS Marketplace and train a high-quality ML model without having to learn a whole lot about machine learning.
  • AWS Marketplace for machine learning works much like other AWS Marketplace products. You can browse and choose the model you like. Many models have an option that lets you try a demo with your data for free, without having to subscribe.

    <click 1>
    These models are deployed using Amazon SageMaker in your AWS account with network isolation to protect your data.

    <click 1>

    You can easily perform real-time or batch inference on these models via a REST API.
  • AWS Marketplace enables customers to use third party models securely via four key features.

    When sellers list the models, static and dynamic scans are performed by Amazon SageMaker for vulnerabilities to help you secure your data.

    Amazon SageMaker encrypts algorithm and model artifacts and other system artifacts in transit and at rest, and it isolates the deployed algorithm and model artifacts from internet access, helping you to secure your data.

    These containers are deployed in an internet-free environment. You can even choose to deploy them in your own private VPC and control access to it. You can also configure and monitor VPC flow logs to see what's going in and what's coming out of the container.

    Requests to the Amazon SageMaker API are made over a secure (SSL) connection.

    Amazon SageMaker requires AWS Identity and Access Management credentials to access resources and data on your deployment via an IAM execution role.

    And you can also use Private Marketplace and AWS Service Catalog to further control procurement and distribution of the model.
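To make the security settings above concrete, here is a minimal sketch of the request one might pass to the SageMaker `create_model` API when deploying a subscribed model package with network isolation inside a VPC. The helper name, the ARNs, and the subnet and security group IDs are hypothetical placeholders; the demo in the talk used a CloudFormation template rather than this code.

```python
from typing import Dict, List


def build_create_model_request(
    model_name: str,
    model_package_arn: str,
    execution_role_arn: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
) -> Dict:
    """Build keyword arguments for SageMaker's create_model call."""
    return {
        "ModelName": model_name,
        # IAM execution role that SageMaker assumes for this model.
        "ExecutionRoleArn": execution_role_arn,
        # A Marketplace model is referenced by its model package ARN.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        # The container gets no inbound or outbound network access.
        "EnableNetworkIsolation": True,
        # Place the model inside your own VPC.
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


# All identifiers below are made-up placeholders.
request = build_create_model_request(
    model_name="hard-hat-detector",
    model_package_arn="arn:aws:sagemaker:us-east-1:123456789012:model-package/example",
    execution_role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    subnet_ids=["subnet-0example"],
    security_group_ids=["sg-0example"],
)
print(request["EnableNetworkIsolation"])
```

With AWS credentials in place, the request could be submitted with `boto3.client("sagemaker").create_model(**request)`, followed by creating an endpoint configuration and an endpoint.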
  • Developers with little data science expertise and business users use models to easily build AI/ML-backed solutions.

    You can see multiple sample use cases on the slide. For example, you can use an ML model to perform optical character recognition to extract characters from an image.
  • Data analysts have to manually identify and use only good-quality data, and often this process can be expedited by using an ML model.


    For example, a background noise classifier can be used to identify whether an audio file contains background noise or not. With ML models you can go one step further and use a source separation model that separates the audio from background sounds, and then use the resulting audio files for training your ML models.

    There are ML models in the image and text categories too.
  • During the feature engineering step, data analysts and data engineers identify and create multiple features that can be used.

     There are ML models that you can use to generate additional synthetic features.

    E.g. there is a model that accepts text and returns emotion, which can be a powerful synthetic feature for your book-genre classification ML model.
     
    Now let me show you how to explore and deploy an ML model.
  • Now let me show you how to browse and perform inference on an ML model.
  • Here is a quick summary of what I did during my demo today.

    I chose a model, subscribed to it, and then executed a CloudFormation template to deploy the model.

    Once the model was deployed, I used the AWS CLI to perform inference.

    You can see how application developers without any ML knowledge can integrate ML into their solutions.
  • Now, before I start talking about how you can integrate ML into your Kafka application, I want to quickly cover a little more about Amazon Managed Streaming for Apache Kafka (Amazon MSK).
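The demo performed inference with the AWS CLI. As a hedged Python equivalent, the sketch below sends an image to a deployed endpoint through the SageMaker runtime `invoke_endpoint` call. The endpoint name, the content type, and the assumption that the model answers with JSON are all illustrative; each Marketplace model documents its own input and output formats. A stand-in client is included so the sketch runs without AWS credentials.

```python
import io
import json
from typing import Any


def invoke_model(client: Any, endpoint_name: str, image_bytes: bytes,
                 content_type: str = "image/jpeg") -> Any:
    """Send one image to a SageMaker endpoint and decode a JSON reply."""
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        Body=image_bytes,
    )
    # The response body is a stream; read it fully, then parse it.
    return json.loads(response["Body"].read())


class StubRuntimeClient:
    """Stand-in for boto3.client("sagemaker-runtime"), for illustration only."""

    def __init__(self) -> None:
        self.last_request = None

    def invoke_endpoint(self, **kwargs):
        self.last_request = kwargs
        # Made-up prediction payload; real models define their own schema.
        return {"Body": io.BytesIO(b'{"hard_hat": false}')}


stub = StubRuntimeClient()
prediction = invoke_model(stub, "example-ppe-endpoint", b"fake-image-bytes")
print(prediction)
```

Against a real deployment, `stub` would be replaced by `boto3.client("sagemaker-runtime")` and the bytes would come from an actual image file.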


    Amazon MSK makes it easy for you to build and run production applications on Apache Kafka without needing Apache Kafka infrastructure management expertise. That means you spend less time managing infrastructure and more time building applications.

    With a few clicks in the Amazon MSK console you can create highly available Apache Kafka clusters with settings based on best practices. MSK automatically provisions and runs your clusters.

    It continuously monitors cluster health and automatically replaces unhealthy nodes with no downtime to your application.
  • Now let me show you how you can add machine learning to your applications via a hypothetical use case.
  • Imagine that you work for a company that makes a construction site safety surveillance product.

    Your customers have installed cameras provided by your company, and your company is supposed to monitor them to help customers improve compliance with safety standards.


    So, you need to ensure that onsite workers are wearing personal protective equipment and hard hats, which help avoid serious injuries. You know that you need a solution that helps you identify non-compliance early. You also need a system that scales to accommodate a large number of cameras.
  • Let me show you what I am talking about via this video. We see that there are two workers and one of them is not wearing a hard hat, which means a guideline has not been followed.
    We would like to identify this non-compliance incident so that it can be fixed, to reduce the probability of any accidents or head injuries.
  • Ideally we want a system that summarizes what is happening via a summary log. And whenever non-compliance happens, it should be detected.
     
    On this slide, you can see, snapshot by snapshot, the progress happening at the worksite.
    And you can see that around 7.5 seconds into the video, the driver who is handling the excavator is detected and an alarm is generated, which can potentially inform the driver to wear PPE.
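The summary log lines shown on the slide follow a `start-end : status : details` layout, so downstream consumers can split them into fields. The parser below is a minimal sketch of that idea; the `SummaryEntry` type and function name are hypothetical, and the sample lines are copied from the slide.

```python
from typing import List, NamedTuple


class SummaryEntry(NamedTuple):
    start: str
    end: str
    alarm: bool
    details: str


def parse_summary_line(line: str) -> SummaryEntry:
    """Split one 'start-end : status : details' line into its fields."""
    # Fields are separated by " : "; timestamps use ":" without spaces.
    time_range, status, details = (p.strip() for p in line.split(" : ", 2))
    start, end = time_range.split("-", 1)
    return SummaryEntry(start, end, status.upper() == "ALARM", details)


lines = [
    "00:00:060-00:00:075 : No Alarm : 1 truck(s), 1 excavator(s), no workers found.",
    "00:00:075-00:00:090 : ALARM : 1 worker(s) wearing PPE but 0 wearing hard hats, 1 truck(s), 1 excavator(s) found.",
    "00:00:105-End : ALARM : 1 worker(s) wearing PPE but 0 wearing hard hats, 1 truck(s), 1 excavator(s) found.",
]

entries: List[SummaryEntry] = [parse_summary_line(line) for line in lines]
alarms = [entry for entry in entries if entry.alarm]
print(len(alarms), "alarm window(s)")
```

Keeping the alarm flag as a separate field makes it straightforward to filter non-compliance windows later in the pipeline.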
  • To build such an architecture, here are the components we are going to need.
    Amazon S3 is a scalable object storage service.
    Detection of a hard hat or PPE is a machine learning task: you need what we call a machine learning model, which can take an image and return a prediction.

    Next you need Amazon SageMaker, a machine learning platform on which you can build, train, and deploy ML models. To build a solution for the problem I just discussed, we would deploy computer vision models that help us identify whether a person is in the picture and whether he or she is wearing a high-visibility vest and a hard hat.
    Whenever we detect non-compliance, we would want to notify administrators, and for that we would use SNS (Simple Notification Service), which lets us send email, text messages, and other kinds of notifications.

    We also need an event streaming platform that helps us scale the analysis of the summary logs generated for hundreds and thousands of cameras. We would use Kafka topics created using Confluent Cloud.

    Similarly, we will use ksqlDB to transfer non-compliance messages into a separate topic for further processing.

    Once the data is stored in S3, we can use tools such as QuickSight, EMR, and Athena to transform, analyze, and visualize it.

    Now let me show you the architecture as well as the demo of how we implemented the solution for the problem.
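To sketch how the alarm path could fit together, the example below filters alarm records (a plain-Python stand-in for what the ksqlDB query does, not ksqlDB syntax) and publishes each one through the SNS `publish` call. The record fields, camera IDs, subject line, and topic ARN are assumptions made for illustration; a recording stub replaces the real SNS client so the sketch runs without AWS access.

```python
from typing import Any, Dict, Iterable, List


def alarms_only(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only alarm records, mirroring the filter into the alarm topic."""
    return [r for r in records if str(r.get("status", "")).upper() == "ALARM"]


def notify(sns_client: Any, topic_arn: str, record: Dict[str, Any]) -> None:
    """Publish one non-compliance alarm to an SNS topic."""
    sns_client.publish(
        TopicArn=topic_arn,
        Subject="PPE non-compliance alarm",
        Message=f"Camera {record['camera_id']}: {record['details']}",
    )


class RecordingSns:
    """Stand-in for boto3.client("sns") that records publish calls."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    def publish(self, **kwargs):
        self.sent.append(kwargs)
        return {"MessageId": "stub"}


# Hypothetical records; the field names are not from the original demo.
records = [
    {"camera_id": "camera-17", "status": "No Alarm",
     "details": "1 truck(s), 1 excavator(s), no workers found."},
    {"camera_id": "camera-17", "status": "ALARM",
     "details": "1 worker(s) wearing PPE but 0 wearing hard hats."},
]

sns = RecordingSns()
for alarm in alarms_only(records):
    notify(sns, "arn:aws:sns:us-east-1:123456789012:example-ppe-alarms", alarm)
print(len(sns.sent), "notification(s) sent")
```

Separating the filter from the notifier mirrors the architecture described above, where the streaming layer routes alarms and a Lambda function handles notification.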
  • Here is what the architecture that uses ML models and Confluent Cloud from AWS Marketplace would look like.
    The feed from the camera can be fed to a Kinesis video stream and then stored in S3. The S3 bucket can be configured to trigger a Lambda function via an event notification; the function performs inference, generates summary logs, and identifies non-compliance alarms.

    We want the architecture to be elastic, pay-as-you-go, and scalable, so that it can scale to hundreds and thousands of such cameras. I also didn't want to manage Kafka infrastructure, so I chose to use Confluent Cloud from AWS Marketplace. So we would push summary logs into a Kafka topic, and a ksqlDB query would move the alarm messages into another topic and generate an SNS notification via a Lambda function.

    Now let us switch to the AWS and Confluent consoles and show you how this architecture works.
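The first hop of this architecture can be sketched as follows: a function that reads the bucket and object key out of an S3 event notification, and one that pushes a summary log line to a Kafka topic. The producer is expected to behave like a `confluent_kafka.Producer` (`produce` and `flush`); the topic name, camera ID, message layout, and sample event are hypothetical, and a stub producer is used so the sketch runs without a cluster.

```python
import json
from typing import Any, Dict, Iterator, List, Tuple
from urllib.parse import unquote_plus


def s3_objects(event: Dict) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) pairs from an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def publish_summary(producer: Any, topic: str, camera_id: str,
                    summary_line: str) -> None:
    """Send one summary log line to a Kafka topic, keyed by camera."""
    payload = json.dumps({"camera_id": camera_id, "summary": summary_line})
    producer.produce(topic, key=camera_id, value=payload)
    producer.flush()


class StubProducer:
    """Stand-in for confluent_kafka.Producer, for illustration only."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, str, str]] = []

    def produce(self, topic, key=None, value=None):
        self.messages.append((topic, key, value))

    def flush(self):
        return 0


# Made-up event and names.
sample_event = {"Records": [{"s3": {
    "bucket": {"name": "example-worksite-video"},
    "object": {"key": "camera-17/clip+0001.mp4"},
}}]}

objects = list(s3_objects(sample_event))
producer = StubProducer()
publish_summary(producer, "summary-logs", "camera-17",
                "00:00:075-00:00:090 : ALARM : 1 worker(s) wearing PPE but 0 wearing hard hats.")
print(objects)
```

In a real Lambda function, the inference step would sit between these two calls, and the producer would be configured with the Confluent Cloud bootstrap servers and API credentials.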
  • Ok, now let me show you how you can add machine learning to your applications via a hypothetical use-case.
  • Thanks Joseph,

    To give you a little more insight into the architecture, I have listed some relevant models on this slide.

    In today's use case, we used a machine learning model that detects specialized construction machinery such as a forklift.

    We also used a machine learning model that identifies whether a person is wearing a hard hat and a high-visibility protective vest, which are mandated for construction workers under OSHA guidelines to minimize exposure to hazards that cause serious workplace injuries and illnesses.

    So as you can see, with pre-trained ML models and the AI services offered by AWS, it becomes really easy to add machine-learning-backed features to your Kafka applications.

    I recommend that you explore AWS Marketplace and identify ML models that are suitable for you. And if you need any customizations to an ML model, get in touch with the AWS Marketplace team.
  • To summarize: do experiment with different tools, such as Confluent Cloud, to see which tools or services can help you scale your architectures; evaluate them and see if they fit your needs.
    Use pre-trained machine learning models to make your workflows intelligent and to add differentiating features that help you stand out.

    Use managed solutions such as Confluent Cloud from AWS Marketplace.
    And most importantly, innovate on behalf of your organization.
  • Feel free to reach out to me if you have any questions.
