© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
BENGALURU
Build, Train and Deploy your ML
models with Amazon SageMaker
Sangeetha Krishnan, Member of Technical Staff | Adobe
ABOUT MYSELF
● Software Development Engineer at Adobe Systems.
● Part of CloudTech & Adobe I/O Events development
team.
● Areas of interest:
○ Machine Learning
○ Natural Language Processing
○ Music + Travel!
WHY AMAZON SAGEMAKER?
★ It is a fully managed machine learning service
★ Very quick and easy to build, train and deploy your ML models
★ Integrated Jupyter notebook
★ Several built-in machine learning algorithms provided by Amazon SageMaker
★ Capability to automatically tune the machine learning models to generate the best solution
★ Easy deployment to production
★ Automatic scaling for production variants
AGENDA
● Getting started with Amazon SageMaker
● Built-in ML Algorithms
● Hyperparameter tuning
● Accessing the model endpoints
● Blue/Green Deployments
● Security and Best Practices
Getting Started with Amazon
SageMaker
Storage and deployment dependencies:
1. S3 bucket
2. EC2 Container Registry
3. Notebook instance
Setting up the prerequisites and Notebook instance
● Create an S3 bucket (preferably with a name starting with sagemaker-)
● Create a notebook instance from the console: https://console.aws.amazon.com/sagemaker/
● Create an IAM role that grants access to the S3 bucket you created
● Create a Jupyter notebook
SMS Spam Detection
● Problem Statement: Given an SMS text, determine whether the message is spam or ham (non-spam)
● Kaggle dataset: https://www.kaggle.com/uciml/sms-spam-collection-dataset
● The first column is the label (spam/ham), the second column in the dataset is
the SMS text.
● 5574 messages in the dataset. 87% ham (4850) , 13% spam (724)
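The class balance quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts from the slide; the helper name `share` is introduced here for illustration. One consequence worth keeping in mind: a model that always predicts ham would already reach roughly the ham share as its accuracy.

```python
# Counts as quoted on the slide above.
ham, spam = 4850, 724
total = ham + spam


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


ham_share = share(ham, total)
spam_share = share(spam, total)
print(f"total={total} ham={ham_share}% spam={spam_share}%")
```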
Built-in Machine Learning
Algorithms
Different Categories of Algorithms
Supervised Learning
● DeepAR Forecasting
● Factorization Machines
● Linear Learner
● XGBoost
● K-Nearest Neighbors (k-NN)
Unsupervised Learning
● K-Means
● PCA
● Random Cut Forest
Image and Object Focused
● Image Classification (CNN-based)
● Object Detection
Text Related
● BlazingText
● LDA
● Sequence2Sequence
● Neural Topic Model
Problem categories
Factorization Machines
● Recommendation systems
● Ad-click predictions
XGBoost
● Fraud predictions
DeepAR Forecasting
● Traffic, electricity, pageviews
Random Cut Forests
● Detecting anomalous data points
K-Means algorithm and KNN
● Document clustering
● Identifying related articles
Image and Object Detection and Classification
BlazingText and Seq2Seq
● Sentiment Analysis
● Named Entity Recognition
● Machine Translation
Advantages of using Built-in Algorithms
● Designed for solving issues in a commercial setting
● General-purpose algorithms
● Designed for training on huge datasets
● Can support terabytes of data
● Greater reliability
● Faster training and streaming of the datasets
● Training on multiple instances that share their state
BlazingText Algorithm
● Provides highly optimized implementations of the Word2vec and text classification algorithms.
● Used for problems such as text classification, named entity recognition, and machine translation.
● Word2vec generates word embeddings
● Can generate meaningful vectors for out-of-vocabulary words
● Captures semantic relationships between words
● Useful for many NLP problems
● Can be trained on huge datasets in a couple of minutes
● Supports multi-core CPU and single-GPU modes for training
Data format
● Supervised Learning
● Binary classification
Text Classification:
● Training and Validation set (text file)
__label__spam sms_text1
__label__ham sms_text2
● Test Data (JSON request)
{"instances": ["sms_text_1", "sms_text_2", ...]}
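The two formats above can be produced with a couple of small helpers. This is a minimal sketch, not the notebook's actual preprocessing: the function names and the sample messages are hypothetical, and real preprocessing (tokenization, lowercasing) may need more than the whitespace cleanup shown here.

```python
import json
from typing import List


def to_blazingtext_line(label: str, text: str) -> str:
    """Format one example as '__label__<label> <text>' on a single line."""
    # Collapse newlines and repeated whitespace so each example stays on one line.
    cleaned = " ".join(text.split())
    return f"__label__{label} {cleaned}"


def to_inference_request(texts: List[str]) -> str:
    """Build the JSON request body with an 'instances' list."""
    return json.dumps({"instances": [" ".join(t.split()) for t in texts]})


# Hypothetical sample rows: (label, sms_text).
rows = [("spam", "WINNER!! Claim your  prize now"), ("ham", "See you at 5\nthen")]
train_lines = [to_blazingtext_line(label, text) for label, text in rows]
request_body = to_inference_request(["Free entry in a weekly draw"])
print("\n".join(train_lines))
print(request_body)
```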
Hyperparameter Tuning
Tunable parameters for BlazingText Classification Model
Parameter       Range
buckets         [1000000, 10000000]
epochs          [5, 15]
learning_rate   [0.005, 0.05]
min_count       [0, 100]
mode            ['supervised']
vector_dim      [32, 300]
word_ngrams     [1, 3]
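Before launching a tuning job, it can help to compare the planned search ranges with the ranges listed above. The sketch below is illustrative only: the helper name is hypothetical, the candidate values mirror the ones used in the tuning code in this deck, and a range falling outside the listed one does not necessarily mean the job will fail.

```python
# Numeric ranges as listed on the slide above (inclusive bounds).
LISTED_RANGES = {
    "buckets": (1_000_000, 10_000_000),
    "epochs": (5, 15),
    "learning_rate": (0.005, 0.05),
    "min_count": (0, 100),
    "vector_dim": (32, 300),
    "word_ngrams": (1, 3),
}


def outside_listed_ranges(candidates):
    """Return the names whose (low, high) search range is not contained in the listed range."""
    flagged = []
    for name, (low, high) in candidates.items():
        listed_low, listed_high = LISTED_RANGES[name]
        if low < listed_low or high > listed_high:
            flagged.append(name)
    return sorted(flagged)


# Example search ranges, matching the tuning code in this deck.
candidates = {
    "learning_rate": (0.01, 0.05),
    "vector_dim": (20, 50),
    "word_ngrams": (1, 4),
}
flagged = outside_listed_ranges(candidates)
print(flagged)
```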
import sagemaker
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

# container, role, s3_output_location and sess are assumed to be defined earlier in the notebook.
# Argument names follow SageMaker Python SDK v1; v2 renames the train_* arguments to
# instance_count, instance_type, volume_size and max_run.
bt_model = sagemaker.estimator.Estimator(container,
                                         role,
                                         train_instance_count=1,
                                         train_instance_type='ml.m4.xlarge',
                                         train_volume_size=30,
                                         train_max_run=360000,
                                         input_mode='File',
                                         output_path=s3_output_location,
                                         sagemaker_session=sess)

# Fixed hyperparameters; the tuner searches over the ranges below.
bt_model.set_hyperparameters(mode="supervised",
                             epochs=10,
                             min_count=2,
                             early_stopping=True,
                             patience=4,
                             min_epochs=5)

hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.01, 0.05),
                         'vector_dim': IntegerParameter(20, 50),
                         'word_ngrams': IntegerParameter(1, 4)}

objective_metric_name = 'validation:accuracy'
objective_type = 'Maximize'

tuner = HyperparameterTuner(estimator=bt_model,
                            objective_metric_name=objective_metric_name,
                            objective_type=objective_type,
                            hyperparameter_ranges=hyperparameter_ranges,
                            max_jobs=4,
                            max_parallel_jobs=2)
Hyperparameter Tuning Jobs
blazingtext-181002-1157-001-073948fb
blazingtext-181002-1157-002-fba7f0b8
blazingtext-181002-1157-003-f7407aa1
blazingtext-181002-1157-004-3a4b4863
Maximize/Minimize Objective Metric
● In this example, the objective is to maximize the 'validation:accuracy' metric
Accessing the Model Endpoints
Accessing the Model Endpoints via Postman
● Get the invocation endpoint from the model details in the Amazon SageMaker console
● Generate an access key ID and secret access key from your security credentials. These are used to create the Authorization token in Postman
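For readers curious what the signing step does with those credentials, here is a rough sketch of two pieces: the shape of the invocation URL and the AWS Signature Version 4 signing-key derivation. It is not a full request signer (the canonical request and string-to-sign steps are omitted). The endpoint name and secret are placeholders, the region is borrowed from the Lambda example in this deck, and the signing service name `sagemaker` is an assumption.

```python
import hashlib
import hmac


def invocation_url(region: str, endpoint_name: str) -> str:
    """Build the HTTPS URL that endpoint invocation requests are sent to."""
    return f"https://runtime.sagemaker.{region}.amazonaws.com/endpoints/{endpoint_name}/invocations"


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sigv4_signing_key(secret_access_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the Signature Version 4 signing key from the secret access key."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder values for illustration only; never hard-code real credentials.
url = invocation_url("ap-southeast-2", "my-blazingtext-endpoint")
key = sigv4_signing_key("EXAMPLE-SECRET-KEY", "20181002", "ap-southeast-2", "sagemaker")
print(url)
```

The derived key is scoped to a date, region and service, which is why a signed request expires and cannot be replayed against another service.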
Accessing the Model Endpoints via Lambda Functions
● Set up a Lambda function that accesses the SageMaker endpoint and returns the response
● Set up an API Gateway that:
1. Accepts the client request
2. Forwards the request parameters to the Lambda function and waits for the response
3. Forwards the response to the client
Lambda Function
import os
import json
import boto3

# Name of the deployed SageMaker endpoint, supplied as a Lambda environment variable.
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime = boto3.Session().client(service_name='sagemaker-runtime', region_name='ap-southeast-2')

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    # The event is the SMS text; json.dumps takes care of quoting and escaping.
    payload = json.dumps({"instances": [event]})
    print(payload)
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='application/json',
                                       Body=payload)
    predictions = response['Body'].read().decode('utf8')
    print(predictions)
    return predictions
API Gateway Configuration
Curl request:
curl -X POST \
  https://los599anje.execute-api.ap-southeast-2.amazonaws.com/test/spamdetection \
  -d '"Hello from Airtel. For 1 months free access call 9113851022"'
Response:
"[{\"prob\": [0.880291223526001], \"label\": [\"__label__spam\"]}]"
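On the client side, the response can be reduced to a label and a probability. This is a minimal sketch with a hypothetical helper name. It assumes the response body has the shape shown above, and it tolerates the case where the prediction arrives as JSON wrapped inside a JSON string, since the Lambda function returns the prediction as a string.

```python
import json


def parse_prediction(raw: str):
    """Return (label, probability) for the first prediction in a response body."""
    decoded = json.loads(raw)
    # If the body is JSON inside a JSON string, decode a second time.
    if isinstance(decoded, str):
        decoded = json.loads(decoded)
    first = decoded[0]
    label = first["label"][0].replace("__label__", "", 1)
    return label, first["prob"][0]


raw = '[{"prob": [0.880291223526001], "label": ["__label__spam"]}]'
label, prob = parse_prediction(raw)
double_encoded = json.dumps(raw)
print(label, prob)
```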
Blue/Green Deployments using
Amazon SageMaker
Deploying multiple models to the same endpoint
● Particularly useful in blue/green deployments
● Blue deployment -> current deployment
● Green deployment -> the new model that is to be tested in production
● We can do this by diverting a small amount of traffic to the green
deployment. We achieve this using ProductionVariant in SageMaker.
● Easy rollback to the blue deployment if the green one fails.
Steps in Blue/Green Deployment
[Figure: traffic flowing from the user through the SageMaker endpoint to model variants A and B]
1. A: 100%
2. A: 90%, B: 10%
3. A: 0%, B: 100%
4. B: 100%
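The traffic split in these steps follows from variant weights: each variant receives its weight divided by the sum of all weights. The sketch below only builds the configuration dictionaries and makes no AWS calls. The helper names, model names and variant names are hypothetical; the dictionary keys mirror the `ProductionVariants` entries of boto3's `create_endpoint_config` and the `DesiredWeightsAndCapacities` entries of `update_endpoint_weights_and_capacities`.

```python
def production_variant(variant_name, model_name, weight,
                       instance_type="ml.m4.xlarge", instance_count=1):
    """Build one ProductionVariants entry for an endpoint configuration."""
    return {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InitialInstanceCount": instance_count,
        "InstanceType": instance_type,
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Traffic share per variant: its weight divided by the sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


def desired_weights(weights):
    """Build the DesiredWeightsAndCapacities list used to shift traffic later."""
    return [{"VariantName": name, "DesiredWeight": w} for name, w in weights.items()]


# Hypothetical model and variant names: blue is the current model, green the new one.
canary = [
    production_variant("blue", "spam-model-a", 9),
    production_variant("green", "spam-model-b", 1),
]
shares = traffic_shares(canary)
cutover = desired_weights({"blue": 0, "green": 1})
print(shares)
```

Rolling back amounts to sending the earlier weights again, which is what makes the blue variant a cheap safety net while the green one is being evaluated.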
Switching to the new Model
Production Variant Deployment
Invocation Metrics
Security and Best Practices
Deployment Recommendations
● Deploy multiple instances for each production endpoint
● Deploy in a VPC with more than one subnet, spread across multiple Availability Zones
Security Practices
● Specify a Virtual Private Cloud (VPC) for your notebook instance, with outbound connections via Network Address Translation (NAT)
● Exercise judgement when granting individuals access to notebook
instances that are attached to a VPC that contains sensitive
information.
Securing Training jobs and Endpoints
● Run training jobs in a private VPC.
● Create a VPC endpoint to access S3
● Configure a custom policy that allows access to S3 only from your private VPC
● To deny access to certain resources, add them to the custom policy
● Configure a security group rule that allows inbound communication between members of the same security group
● Configure a NAT gateway that allows only outbound connections from the private VPC.
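One way to express the "S3 only from your private VPC" rule is a bucket policy that denies requests not arriving from that VPC. The sketch below just builds the policy document; the function name, bucket name and VPC ID are placeholders. Note that the `aws:SourceVpc` condition key is only present for requests that travel through a VPC endpoint, and that a blanket deny like this also blocks console access from outside the VPC, so it should be tested carefully before use.

```python
import json


def vpc_only_bucket_policy(bucket: str, vpc_id: str) -> dict:
    """Deny all S3 actions on the bucket unless the request comes from the given VPC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessFromOutsideVpc",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


# Placeholder bucket name and VPC ID for illustration.
policy = vpc_only_bucket_policy("sagemaker-example-bucket", "vpc-0123456789abcdef0")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```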
THANK YOU

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

AWS CodeStar 및 Cloud9을 통한 서버리스(Serverless) 앱 개발 길잡이 - 윤석찬 (AWS 테크에반젤리스트)
AWS CodeStar 및 Cloud9을 통한 서버리스(Serverless) 앱 개발 길잡이 - 윤석찬 (AWS 테크에반젤리스트)AWS CodeStar 및 Cloud9을 통한 서버리스(Serverless) 앱 개발 길잡이 - 윤석찬 (AWS 테크에반젤리스트)
AWS CodeStar 및 Cloud9을 통한 서버리스(Serverless) 앱 개발 길잡이 - 윤석찬 (AWS 테크에반젤리스트)
 
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:I...
 
A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...
A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...
A New Approach to Continuous Monitoring in the Cloud: Migrate to AWS with NET...
 
End-to-End Machine Learning with Amazon SageMaker
End-to-End Machine Learning with Amazon SageMakerEnd-to-End Machine Learning with Amazon SageMaker
End-to-End Machine Learning with Amazon SageMaker
 
Increase the Value of Video with ML & Media Services - SRV322 - Chicago AWS S...
Increase the Value of Video with ML & Media Services - SRV322 - Chicago AWS S...Increase the Value of Video with ML & Media Services - SRV322 - Chicago AWS S...
Increase the Value of Video with ML & Media Services - SRV322 - Chicago AWS S...
 
Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)
 
Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015 Azure Machine Learning Dotnet Campus 2015
Azure Machine Learning Dotnet Campus 2015
 
Increase the Value of Video with ML & Media Services - SRV322 - Toronto AWS S...
Increase the Value of Video with ML & Media Services - SRV322 - Toronto AWS S...Increase the Value of Video with ML & Media Services - SRV322 - Toronto AWS S...
Increase the Value of Video with ML & Media Services - SRV322 - Toronto AWS S...
 
Build Text Analytics Solutions with Amazon Comprehend and Amazon Translate
Build Text Analytics Solutions with Amazon Comprehend and Amazon TranslateBuild Text Analytics Solutions with Amazon Comprehend and Amazon Translate
Build Text Analytics Solutions with Amazon Comprehend and Amazon Translate
 
Fraud Detection and Prevention on AWS using Machine Learning
Fraud Detection and Prevention on AWS using Machine LearningFraud Detection and Prevention on AWS using Machine Learning
Fraud Detection and Prevention on AWS using Machine Learning
 
Get the Most out of Your Amazon Elasticsearch Service Domain (ANT334-R1) - AW...
Get the Most out of Your Amazon Elasticsearch Service Domain (ANT334-R1) - AW...Get the Most out of Your Amazon Elasticsearch Service Domain (ANT334-R1) - AW...
Get the Most out of Your Amazon Elasticsearch Service Domain (ANT334-R1) - AW...
 
Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
 
Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)Become a Machine Learning developer with AWS services (May 2019)
Become a Machine Learning developer with AWS services (May 2019)
 
Understanding GBM and XGBoost in Scikit-Learn
Understanding GBM and XGBoost in Scikit-LearnUnderstanding GBM and XGBoost in Scikit-Learn
Understanding GBM and XGBoost in Scikit-Learn
 
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
 
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMakerAWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
 
DevSecOps: Instituting Cultural Transformation for Public Sector Organization...
DevSecOps: Instituting Cultural Transformation for Public Sector Organization...DevSecOps: Instituting Cultural Transformation for Public Sector Organization...
DevSecOps: Instituting Cultural Transformation for Public Sector Organization...
 
Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)
 

Ähnlich wie Build, train and deploy your ML models with Amazon Sage Maker

AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
Amazon Web Services Korea
 

Ähnlich wie Build, train and deploy your ML models with Amazon Sage Maker (20)

Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
 
Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...
Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...
Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (...
 
Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...
Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...
Build Deep Learning Applications Using Apache MXNet, Featuring Workday (AIM40...
 
Build, Train & Deploy Your ML Application on Amazon SageMaker
Build, Train & Deploy Your ML Application on Amazon SageMakerBuild, Train & Deploy Your ML Application on Amazon SageMaker
Build, Train & Deploy Your ML Application on Amazon SageMaker
 
Best Practices for Scalable Monitoring (ENT310-S) - AWS re:Invent 2018
Best Practices for Scalable Monitoring (ENT310-S) - AWS re:Invent 2018Best Practices for Scalable Monitoring (ENT310-S) - AWS re:Invent 2018
Best Practices for Scalable Monitoring (ENT310-S) - AWS re:Invent 2018
 
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
AWS의 새로운 언어, 음성, 텍스트 처리 인공 지능 서비스, Amazon SageMaker::Sunil Mallya::AWS Summit...
 
Get Started with Deep Learning and Computer Vision Using AWS DeepLens (AIM316...
Get Started with Deep Learning and Computer Vision Using AWS DeepLens (AIM316...Get Started with Deep Learning and Computer Vision Using AWS DeepLens (AIM316...
Get Started with Deep Learning and Computer Vision Using AWS DeepLens (AIM316...
 
Introducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksIntroducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech Talks
 
Deep learning acceleration with Amazon Elastic Inference
Deep learning acceleration with Amazon Elastic Inference  Deep learning acceleration with Amazon Elastic Inference
Deep learning acceleration with Amazon Elastic Inference
 
[NEW LAUNCH!] Introducing Amazon Elastic Inference: Reduce Deep Learning Infe...
[NEW LAUNCH!] Introducing Amazon Elastic Inference: Reduce Deep Learning Infe...[NEW LAUNCH!] Introducing Amazon Elastic Inference: Reduce Deep Learning Infe...
[NEW LAUNCH!] Introducing Amazon Elastic Inference: Reduce Deep Learning Infe...
 
Building Content Recommendation Systems using MXNet Gluon
Building Content Recommendation Systems using MXNet GluonBuilding Content Recommendation Systems using MXNet Gluon
Building Content Recommendation Systems using MXNet Gluon
 
Introducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksIntroducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech Talks
 
Machine Learning e Amazon SageMaker: Algoritmos, Modelos e Inferências - MCL...
Machine Learning e Amazon SageMaker: Algoritmos, Modelos e Inferências -  MCL...Machine Learning e Amazon SageMaker: Algoritmos, Modelos e Inferências -  MCL...
Machine Learning e Amazon SageMaker: Algoritmos, Modelos e Inferências - MCL...
 
Amazon SageMaker and Chainer: Tips & Tricks (AIM329-R1) - AWS re:Invent 2018
Amazon SageMaker and Chainer: Tips & Tricks (AIM329-R1) - AWS re:Invent 2018Amazon SageMaker and Chainer: Tips & Tricks (AIM329-R1) - AWS re:Invent 2018
Amazon SageMaker and Chainer: Tips & Tricks (AIM329-R1) - AWS re:Invent 2018
 
Introduction to Scalable Deep Learning on AWS with Apache MXNet
Introduction to Scalable Deep Learning on AWS with Apache MXNetIntroduction to Scalable Deep Learning on AWS with Apache MXNet
Introduction to Scalable Deep Learning on AWS with Apache MXNet
 
Demystifying Machine Learning On AWS - AWS Summit Sydney 2018
Demystifying Machine Learning On AWS - AWS Summit Sydney 2018Demystifying Machine Learning On AWS - AWS Summit Sydney 2018
Demystifying Machine Learning On AWS - AWS Summit Sydney 2018
 
End to End Model Development to Deployment using SageMaker
End to End Model Development to Deployment using SageMakerEnd to End Model Development to Deployment using SageMaker
End to End Model Development to Deployment using SageMaker
 
Quickly and easily build, train, and deploy machine learning models at any scale
Quickly and easily build, train, and deploy machine learning models at any scaleQuickly and easily build, train, and deploy machine learning models at any scale
Quickly and easily build, train, and deploy machine learning models at any scale
 
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
Build, Train, and Deploy ML Models Quickly and Easily with Amazon SageMaker, ...
 
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker WorkshopAWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
 

Mehr von AWS User Group Bengaluru

Mehr von AWS User Group Bengaluru (20)

Demystifying identity on AWS
Demystifying identity on AWSDemystifying identity on AWS
Demystifying identity on AWS
 
AWS Secrets for Best Practices
AWS Secrets for Best PracticesAWS Secrets for Best Practices
AWS Secrets for Best Practices
 
Cloud Security
Cloud SecurityCloud Security
Cloud Security
 
Lessons learnt building a Distributed Linked List on S3
Lessons learnt building a Distributed Linked List on S3Lessons learnt building a Distributed Linked List on S3
Lessons learnt building a Distributed Linked List on S3
 
Medlife journey with AWS
Medlife journey with AWSMedlife journey with AWS
Medlife journey with AWS
 
Building Efficient, Scalable and Resilient Front-end logging service with AWS
Building Efficient, Scalable and Resilient Front-end logging service with AWSBuilding Efficient, Scalable and Resilient Front-end logging service with AWS
Building Efficient, Scalable and Resilient Front-end logging service with AWS
 
Exploring opportunities with communities for a successful career
Exploring opportunities with communities for a successful careerExploring opportunities with communities for a successful career
Exploring opportunities with communities for a successful career
 
Slack's transition away from a single AWS account
Slack's transition away from a single AWS accountSlack's transition away from a single AWS account
Slack's transition away from a single AWS account
 
Log analytics with ELK stack
Log analytics with ELK stackLog analytics with ELK stack
Log analytics with ELK stack
 
Serverless Culture
Serverless CultureServerless Culture
Serverless Culture
 
Refactoring to serverless
Refactoring to serverlessRefactoring to serverless
Refactoring to serverless
 
Amazon EC2 Spot Instances Workshop
Amazon EC2 Spot Instances WorkshopAmazon EC2 Spot Instances Workshop
Amazon EC2 Spot Instances Workshop
 
Building Efficient, Scalable and Resilient Front-end logging service with AWS
Building Efficient, Scalable and Resilient Front-end logging service with AWSBuilding Efficient, Scalable and Resilient Front-end logging service with AWS
Building Efficient, Scalable and Resilient Front-end logging service with AWS
 
Medlife's journey with AWS from 0(zero) orders to 6 digit mark
Medlife's journey with AWS from 0(zero) orders to 6 digit markMedlife's journey with AWS from 0(zero) orders to 6 digit mark
Medlife's journey with AWS from 0(zero) orders to 6 digit mark
 
AWS Secrets for Best Practices
AWS Secrets for Best PracticesAWS Secrets for Best Practices
AWS Secrets for Best Practices
 
Exploring opportunities with communities for a successful career
Exploring opportunities with communities for a successful careerExploring opportunities with communities for a successful career
Exploring opportunities with communities for a successful career
 
Lessons learnt building a Distributed Linked List on S3
Lessons learnt building a Distributed Linked List on S3Lessons learnt building a Distributed Linked List on S3
Lessons learnt building a Distributed Linked List on S3
 
Cloud Security
Cloud SecurityCloud Security
Cloud Security
 
Amazon EC2 Spot Instances
Amazon EC2 Spot InstancesAmazon EC2 Spot Instances
Amazon EC2 Spot Instances
 
Cost Optimization in AWS
Cost Optimization in AWSCost Optimization in AWS
Cost Optimization in AWS
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

Build, train and deploy your ML models with Amazon Sage Maker

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. BENGALURU
  • 2. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Build, Train and Deploy your ML models with Amazon SageMaker Sangeetha Krishnan, Member of Technical Staff | Adobe
  • 3. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ABOUT MYSELF ● Software Development Engineer at Adobe Systems. ● Part of CloudTech & Adobe I/O Events development team. ● Areas of interest: ○ Machine Learning ○ Natural Language Processing ○ Music + Travel!
  • 4. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. WHY AMAZON SAGEMAKER? ★ It is a fully managed machine learning service ★ Very quick and easy to build, train and deploy your ML models ★ Integrated Jupyter notebook ★ Several built-in machine learning algorithms provided by Amazon Sagemaker ★ Capability to automatically tune the machine learning models to generate the best solution ★ Easy deploy to production ★ Automatic scaling for production variants
  • 5. AGENDA
      ● Getting started with Amazon SageMaker
      ● Built-in ML Algorithms
      ● Hyperparameter tuning
      ● Accessing the model endpoints
      ● Blue/Green Deployments
      ● Security and Best Practices
  • 6. Getting Started with Amazon SageMaker
  • 7. Storage and deployment dependencies:
      1. S3 bucket
      2. EC2 Container Registry
      3. Notebook instance
  • 8. Setting up the prerequisites and Notebook instance
      ● Create an S3 bucket (preferably with a name starting with sagemaker-*)
      ● Create a notebook instance: https://console.aws.amazon.com/sagemaker/
      ● Create an IAM role that gives access to the S3 bucket created
      ● Create a Jupyter notebook
  • 9. SMS Spam Detection
      ● Problem statement: given an SMS text, determine whether the message is spam or ham (non-spam)
      ● Kaggle dataset: https://www.kaggle.com/uciml/sms-spam-collection-dataset
      ● The first column in the dataset is the label (spam/ham); the second column is the SMS text.
      ● 5574 messages in the dataset: 87% ham (4850), 13% spam (724)
  • 10. Built-in Machine Learning Algorithms
  • 11. Different Categories of Algorithms
      Supervised Learning
          ● DeepAR Forecasting
          ● Factorisation Machine
          ● Linear Learner
          ● XGBoost
      Unsupervised Learning
          ● K Means Algorithm
          ● K Nearest Neighbours
          ● PCA
          ● Random Cut Forest
      Image and Object Focused
          ● Image Classification algorithm that uses CNN
          ● Object Detection Algorithm
      Text Related
          ● BlazingText
          ● LDA
          ● Sequence2Sequence
          ● Neural Topic Model
  • 12. Problem categories
      Factorization Machines
          ● Recommendation Systems
          ● Ad-click predictions
      XGBoost
          ● Fraud predictions
      DeepAR Forecasting
          ● Traffic, electricity, pageviews
      Random Cut Forests
          ● Detecting anomalous data points
      K-Means algorithm and KNN
          ● Document clustering
          ● Identifying related articles
      Image and Object Detection and Classification
      BlazingText and Seq2Seq
          ● Sentiment Analysis
          ● Named Entity Recognition
          ● Machine Translation
  • 13. Advantages of using Built-in Algorithms
      ● Designed for solving problems in a commercial setting
      ● General-purpose algorithms
      ● Designed for training on huge datasets
      ● Can support terabytes of data
      ● Greater reliability
      ● Faster training and streaming of the datasets
      ● Training on multiple instances that share their state
  • 14. BlazingText Algorithm
      ● Provides highly optimized implementations of the Word2vec and text classification algorithms.
      ● Is used in problems like text classification, named entity recognition, machine translation, etc.
      ● Word2vec generates word embeddings
      ● Can generate meaningful vectors for out-of-vocabulary words
      ● Captures semantic relationships between words
      ● Useful for many NLP problems
      ● Can be trained on huge datasets in a couple of minutes
      ● Supports multi-core CPU and single-GPU modes for training
  • 15. Data format
      ● Supervised Learning
      ● Binary classification
      Text Classification:
      ● Training and Validation set (text file)
          __label__spam sms_text1
          __label__ham sms_text2
      ● Test Data (JSON request)
          {"instances":["sms_text_1", "sms_text_2",... ]}
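The two formats above can be produced with a few lines of Python. This is a minimal sketch, not the speaker's preprocessing code: the helper names `to_blazingtext_line` and `to_inference_request` are hypothetical, and collapsing whitespace so each message stays on one line is an assumption about how the text is cleaned.

```python
import json


def to_blazingtext_line(label, text):
    """Format one labelled SMS as a BlazingText supervised training line."""
    # Collapse all whitespace (including newlines) so the sample stays on one line.
    tokens = text.split()
    return "__label__{} {}".format(label, " ".join(tokens))


def to_inference_request(texts):
    """Build the JSON request body with an "instances" list of SMS texts."""
    return json.dumps({"instances": list(texts)})


if __name__ == "__main__":
    print(to_blazingtext_line("ham", "See you at  lunch\ntomorrow"))
    print(to_inference_request(["sms_text_1", "sms_text_2"]))
```

Each training or validation line carries exactly one label and one message, which is why embedded newlines are removed before writing the file.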
  • 16. Hyperparameter Tuning
  • 17. Tunable parameters for BlazingText Classification Model
      buckets         [1000000, 10000000]
      epochs          [5, 15]
      learning_rate   [0.005, 0.05]
      min_count       [0, 100]
      mode            ['supervised']
      vector_dim      [32, 300]
      word_ngrams     [1, 3]
  • 18. # Estimator for the BlazingText container; `container`, `role`, `s3_output_location`
      # and `sess` are assumed to be defined earlier in the notebook.
      import sagemaker
      from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

      bt_model = sagemaker.estimator.Estimator(container,
                                               role,
                                               train_instance_count=1,
                                               train_instance_type='ml.m4.xlarge',
                                               train_volume_size=30,
                                               train_max_run=360000,
                                               input_mode='File',
                                               output_path=s3_output_location,
                                               sagemaker_session=sess)

      # Fixed hyperparameters shared by every training job
      bt_model.set_hyperparameters(mode="supervised",
                                   epochs=10,
                                   min_count=2,
                                   early_stopping=True,
                                   patience=4,
                                   min_epochs=5)

      # Ranges the tuner is allowed to search
      hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.01, 0.05),
                               'vector_dim': IntegerParameter(20, 50),
                               'word_ngrams': IntegerParameter(1, 4)}
  • 19. objective_metric_name = 'validation:accuracy'
      objective_type = 'Maximize'

      tuner = HyperparameterTuner(estimator=bt_model,
                                  objective_metric_name=objective_metric_name,
                                  objective_type=objective_type,
                                  hyperparameter_ranges=hyperparameter_ranges,
                                  max_jobs=4,
                                  max_parallel_jobs=2)
  • 20. Hyperparameter Tuning Jobs
      blazingtext-181002-1157-001-073948fb
      blazingtext-181002-1157-002-fba7f0b8
      blazingtext-181002-1157-003-f7407aa1
      blazingtext-181002-1157-004-3a4b4863
  • 21. Maximize/Minimize Objective Metric
      ● In this example, the objective is to maximize the 'validation:accuracy' metric
  • 22. Accessing the Model Endpoints
  • 23. Accessing the Model Endpoints via Postman ● Get the invocation endpoint from the model details in the Amazon SageMaker console
  • 24. Accessing the Model Endpoints via Postman ● Generate the access key ID and secret access key from your security credentials. These are used to create the authorization token in Postman
  • 25. Accessing the Model Endpoints via Lambda Functions
      Setting up a Lambda function to access the SageMaker endpoint and return the response
      Setting up the API Gateway that:
      1. Accepts the client request
      2. Forwards the request parameters to the Lambda function and waits for the response
      3. Forwards the response to the client
  • 26. Lambda Function
      import os
      import json
      import boto3

      ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
      runtime = boto3.Session().client(service_name='sagemaker-runtime', region_name='ap-southeast-2')

      def lambda_handler(event, context):
          print("Received event: " + json.dumps(event, indent=2))
          # Build the JSON request body; json.dumps escapes any quotes in the SMS text
          payload = json.dumps({"instances": [event]})
          print(payload)
          response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                             ContentType='application/json',
                                             Body=payload)
          predictions = response['Body'].read().decode('utf8')
          print(predictions)
          return predictions
  • 27. API Gateway Configuration
      Curl request:
      curl -X POST https://los599anje.execute-api.ap-southeast-2.amazonaws.com/test/spamdetection -d '"Hello from Airtel. For 1 months free access call 9113851022"'
      Response:
      "[{"prob": [0.880291223526001], "label": ["__label__spam"]}]"
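A client usually wants a plain label and a probability rather than the raw response. The sketch below assumes the endpoint's response body is a JSON array of objects with `prob` and `label` lists, as in the response above; the helper name `parse_prediction` is hypothetical.

```python
import json

LABEL_PREFIX = "__label__"


def parse_prediction(body):
    """Return (label, probability) pairs from a BlazingText-style response body."""
    parsed = []
    for item in json.loads(body):
        label = item["label"][0]
        # Strip the "__label__" prefix so callers see "spam" or "ham".
        if label.startswith(LABEL_PREFIX):
            label = label[len(LABEL_PREFIX):]
        parsed.append((label, item["prob"][0]))
    return parsed


if __name__ == "__main__":
    sample = '[{"prob": [0.880291223526001], "label": ["__label__spam"]}]'
    print(parse_prediction(sample))
```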
  • 28. Blue/Green Deployments using Amazon SageMaker
  • 29. Deploying multiple models to the same endpoint
      ● Particularly useful in blue/green deployments
      ● Blue deployment -> the current deployment
      ● Green deployment -> the new model that is to be tested in production
      ● We do this by diverting a small amount of traffic to the green deployment, using ProductionVariant in SageMaker.
      ● Easy rollback to the blue deployment if the green one fails.
  • 30. Steps in Blue/Green Deployment
      [Figure: traffic from the user to the SageMaker endpoint, shown in stages: A 100%; A 90% and B 10%; A 0% and B 100%; B 100%]
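The traffic split in these steps comes from per-variant weights: each variant receives its weight divided by the sum of all weights. The sketch below only builds and inspects the variant definitions locally. The field names follow the `ProductionVariants` structure of the SageMaker `CreateEndpointConfig` API; the model names, variant names, instance type and helper functions are hypothetical, and nothing here calls AWS.

```python
def blue_green_variants(blue_model, green_model, green_weight,
                        instance_type="ml.m4.xlarge"):
    """Build two production variant definitions with the given traffic split."""
    if not 0.0 <= green_weight <= 1.0:
        raise ValueError("green_weight must be between 0 and 1")
    return [
        {"VariantName": "blue", "ModelName": blue_model,
         "InitialInstanceCount": 1, "InstanceType": instance_type,
         "InitialVariantWeight": 1.0 - green_weight},
        {"VariantName": "green", "ModelName": green_model,
         "InitialInstanceCount": 1, "InstanceType": instance_type,
         "InitialVariantWeight": green_weight},
    ]


def traffic_shares(variants):
    """Fraction of requests each variant receives: weight / sum of weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


if __name__ == "__main__":
    # Hypothetical model names; 10% of traffic goes to the green model.
    variants = blue_green_variants("spam-model-a", "spam-model-b", 0.1)
    print(traffic_shares(variants))
```

A list like this could be passed as `ProductionVariants` when creating an endpoint configuration with boto3, and rolling back amounts to setting the green weight back to 0.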
  • 31. Switching to the new Model; Production Variant Deployment
  • 32. Invocation Metrics
  • 33. Security and Best Practices
  • 34. Deployment Recommendations
      ● Deploy multiple instances for each production endpoint
      ● Deploy in a VPC with more than one subnet, spread across multiple Availability Zones
  • 35. Security Practices
      ● Specify a Virtual Private Cloud (VPC) for your notebook instance, with outbound connections via Network Address Translation (NAT)
      ● Exercise judgement when granting individuals access to notebook instances that are attached to a VPC that contains sensitive information.
  • 36. Securing Training jobs and Endpoints
      ● Run training jobs in a private VPC.
      ● Create a VPC endpoint to access S3
      ● Configure a custom policy that allows access to S3 only from your private VPC
      ● If you want to deny access to certain resources, add that to the custom policy
      ● Configure a security group rule that allows inbound communication between members of the same security group
      ● Configure a NAT gateway that allows only outbound connections from the private VPC.
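One common way to express "access to S3 only from your private VPC" is a bucket policy that denies requests not arriving through a specific VPC endpoint, using the `aws:SourceVpce` condition key. The sketch below only builds the policy document; the bucket name, the endpoint ID and the helper name are hypothetical, and this is one possible policy rather than the one used in the talk.

```python
import json


def vpc_only_bucket_policy(bucket, vpc_endpoint_id):
    """Bucket policy that denies S3 access unless it comes via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyAccessOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::{}".format(bucket),
                "arn:aws:s3:::{}/*".format(bucket),
            ],
            # Requests from any other network path fail this condition and are denied.
            "Condition": {"StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}},
        }],
    }


if __name__ == "__main__":
    # Hypothetical bucket name and VPC endpoint ID.
    policy = vpc_only_bucket_policy("sagemaker-example-bucket", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

A policy of this shape also blocks console access from outside the VPC, so it is worth testing on a non-critical bucket first.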
  • 37. BENGALURU THANK YOU