Build, train and deploy your ML models with Amazon SageMaker

© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
BENGALURU

Build, Train and Deploy your ML
models with Amazon SageMaker
Sangeetha Krishnan, Member of Technical Staff | Adobe

ABOUT MYSELF
● Software Development Engineer at Adobe Systems.
● Part of CloudTech & Adobe I/O Events development
team.
● Areas of interest:
○ Machine Learning
○ Natural Language Processing
○ Music + Travel!

WHY AMAZON SAGEMAKER?
★ It is a fully managed machine learning service
★ Very quick and easy to build, train and deploy your ML models
★ Integrated Jupyter notebook
★ Several built-in machine learning algorithms provided by Amazon SageMaker
★ Capability to automatically tune the machine learning models to
generate the best solution
★ Easy deployment to production
★ Automatic scaling for production variants

AGENDA
● Getting started with Amazon SageMaker
● Built-in ML Algorithms
● Hyperparameter tuning
● Accessing the model endpoints
● Blue/Green Deployments
● Security and Best Practices

Getting Started with Amazon
SageMaker

Storage and deployment dependencies:
1. S3 bucket
2. EC2 Container Registry
3. Notebook instance

Setting up the prerequisites and Notebook instance
● Create an S3 bucket (preferably with a name starting with sagemaker-)
● Create a notebook instance:
https://console.aws.amazon.com/sagemaker/
● Create an IAM role that grants access to the S3 bucket you created
● Create a Jupyter notebook

SMS Spam Detection
● Problem Statement: Given an SMS text, determine whether the message is spam
or ham (non-spam)
● Kaggle dataset: https://www.kaggle.com/uciml/sms-spam-collection-dataset
● The first column is the label (spam/ham), the second column in the dataset is
the SMS text.
● 5574 messages in the dataset: 87% ham (4850), 13% spam (724)
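
A minimal sketch of reading the dataset and checking the class balance. It assumes a CSV with a header row, the label in the first column and the SMS text in the second, as described above. The function names and the in-memory sample rows are hypothetical stand-ins for the real file.

```python
import csv
import io
from collections import Counter


def read_labeled_rows(handle):
    """Return (label, text) pairs from a CSV whose first two columns are label and SMS text."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row (assumed to be present)
    return [(row[0].strip(), row[1].strip()) for row in reader if len(row) >= 2]


def class_shares(rows):
    """Return the fraction of messages per label."""
    counts = Counter(label for label, _ in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Tiny in-memory stand-in for the real file (hypothetical rows).
sample_csv = io.StringIO(
    "label,text\n"
    "ham,See you at lunch\n"
    "spam,\"WINNER! Claim your prize now\"\n"
    "ham,Running late\n"
    "ham,Ok\n"
)
rows = read_labeled_rows(sample_csv)
shares = class_shares(rows)
print(shares)
```

With a split as uneven as the one on this slide, a model that always answers "ham" already scores close to the ham share, so it helps to know that share before reading any accuracy number.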

Built-in Machine Learning
Algorithms

Different Categories of Algorithms
Supervised Learning
● DeepAR Forecasting
● Factorization Machines
● Linear Learner
● XGBoost
● K-Nearest Neighbours
Unsupervised Learning
● K-Means Algorithm
● PCA
● Random Cut Forest
Image and Object Focused
● Image Classification algorithm that uses CNN
● Object Detection Algorithm
Text Related
● BlazingText
● LDA
● Sequence2Sequence
● Neural Topic Model
Problem categories
Factorization Machines
● Recommendation Systems
● Ad-click predictions
XGBoost
● Fraud predictions
DeepAR Forecasting
● Traffic, electricity, pageviews
Random Cut Forests
● Detecting anomalous data points
K-Means algorithm and KNN
● Document clustering
● Identifying related articles
Image and Object Detection and Classification
BlazingText and Seq2Seq
● Sentiment Analysis
● Named Entity Recognition
● Machine Translation

Advantages of using Built-in Algorithms
● Designed for solving issues in a commercial setting
● General-purpose algorithms
● Designed for training on huge datasets
● Can support terabytes of data
● Greater reliability
● Faster training and streaming of the datasets
● Training on multiple instances that share their state

BlazingText Algorithm
● Provides highly optimized implementations of the Word2vec and text
classification algorithms.
● Is used in problems like text classification, named entity recognition, machine
translation, etc.
● Word2vec generates word embeddings
● Ability to generate meaningful vectors for out-of-vocabulary words
● Captures semantic relationships between words
● Useful for many NLP problems
● Can be trained on huge datasets in a couple of minutes
● Supports multi-core CPU and single-GPU modes for training
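
One common way to get vectors for out-of-vocabulary words is to represent each word by its character n-grams, so that an unseen word still shares pieces with known words. The sketch below illustrates only that general subword idea. It is not a description of BlazingText internals, and the function names are hypothetical.

```python
def char_ngrams(word, n=3):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n] for i in range(len(wrapped) - n + 1)]


def shared_ngrams(word_a, word_b, n=3):
    """Return the n-grams two words have in common, as a sorted list."""
    return sorted(set(char_ngrams(word_a, n)) & set(char_ngrams(word_b, n)))


print(char_ngrams("where"))              # ['<wh', 'whe', 'her', 'ere', 're>']
print(shared_ngrams("spam", "spammer"))  # ['<sp', 'pam', 'spa']
```

A word such as "spammer" that never appeared in training still overlaps with "spam" in three n-grams, which is what lets a subword model place it near related words.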

Data format
● Supervised Learning
● Binary classification
Text Classification:
● Training and Validation set (Text file)
__label__spam sms_text1
__label__ham sms_text2
● Test Data (JSON request)
{"instances": ["sms_text_1", "sms_text_2", ...]}
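
The two formats above can be produced with a few lines of Python. This is a minimal sketch: the helper names are hypothetical, and collapsing whitespace so each example stays on one line is an assumption about sensible preprocessing, not something stated on the slide.

```python
import json


def to_blazingtext_line(label, text):
    """Format one example as '__label__<label> <text>' on a single line."""
    single_line = " ".join(text.split())  # collapse newlines and repeated spaces
    return f"__label__{label} {single_line}"


def to_inference_request(messages):
    """Build the JSON request body for a list of SMS texts."""
    return json.dumps({"instances": list(messages)})


print(to_blazingtext_line("spam", "Free entry!\nText WIN now"))
print(to_inference_request(["sms_text_1", "sms_text_2"]))
```

Building the request with json.dumps rather than string concatenation keeps quotes and special characters in the SMS text correctly escaped.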

Hyperparameter Tuning

Tunable parameters for BlazingText Classification Model
Parameter        Range
buckets          [1000000, 10000000]
epochs           [5, 15]
learning_rate    [0.005, 0.05]
min_count        [0, 100]
mode             ['supervised']
vector_dim       [32, 300]
word_ngrams      [1, 3]

import sagemaker
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

# container, role, sess and s3_output_location are assumed to be defined
# earlier in the notebook. Argument names follow the SageMaker Python SDK v1
# used in this talk; SDK v2 renames the train_* arguments
# (for example, train_instance_count becomes instance_count).
bt_model = sagemaker.estimator.Estimator(container,
                                         role,
                                         train_instance_count=1,
                                         train_instance_type='ml.m4.xlarge',
                                         train_volume_size=30,
                                         train_max_run=360000,
                                         input_mode='File',
                                         output_path=s3_output_location,
                                         sagemaker_session=sess)

# Fixed hyperparameters shared by every tuning job
bt_model.set_hyperparameters(mode="supervised",
                             epochs=10,
                             min_count=2,
                             early_stopping=True,
                             patience=4,
                             min_epochs=5)

# Hyperparameters the tuner is allowed to vary
hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.01, 0.05),
                         'vector_dim': IntegerParameter(20, 50),
                         'word_ngrams': IntegerParameter(1, 4)}

objective_metric_name = 'validation:accuracy'
objective_type = 'Maximize'

# Run at most 4 training jobs, 2 at a time
tuner = HyperparameterTuner(estimator=bt_model,
                            objective_metric_name=objective_metric_name,
                            objective_type=objective_type,
                            hyperparameter_ranges=hyperparameter_ranges,
                            max_jobs=4,
                            max_parallel_jobs=2)
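
The ranges passed to the tuner above do not all sit inside the ranges listed on the "Tunable parameters" slide: vector_dim starts at 20 and word_ngrams ends at 4. The slides do not say whether SageMaker enforces the listed ranges or treats them as recommendations, so the sketch below only flags the differences. The helper name is hypothetical, and the numbers are copied from the slide and the code.

```python
# Ranges listed on the "Tunable parameters" slide.
LISTED_RANGES = {
    "buckets": (1000000, 10000000),
    "epochs": (5, 15),
    "learning_rate": (0.005, 0.05),
    "min_count": (0, 100),
    "vector_dim": (32, 300),
    "word_ngrams": (1, 3),
}


def out_of_range(requested):
    """Return {name: (low, high)} for requested ranges that exceed the listed ones."""
    problems = {}
    for name, (low, high) in requested.items():
        listed_low, listed_high = LISTED_RANGES[name]
        if low < listed_low or high > listed_high:
            problems[name] = (low, high)
    return problems


# Ranges used in the tuner configuration above.
requested = {
    "learning_rate": (0.01, 0.05),
    "vector_dim": (20, 50),
    "word_ngrams": (1, 4),
}
print(out_of_range(requested))  # {'vector_dim': (20, 50), 'word_ngrams': (1, 4)}
```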

Hyperparameter Tuning Jobs
● blazingtext-181002-1157-001-073948fb
● blazingtext-181002-1157-002-fba7f0b8
● blazingtext-181002-1157-003-f7407aa1
● blazingtext-181002-1157-004-3a4b4863

Maximize/Minimize Objective Metric
● In this example, the objective is to maximize the 'validation:accuracy' metric
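
Conceptually, the objective type decides which training job counts as the best one. A minimal sketch of that selection follows; the job names and metric values are hypothetical and are not results from the tuning jobs listed earlier.

```python
def best_job(results, objective_type="Maximize"):
    """Return the (job_name, metric) pair that best satisfies the objective."""
    if objective_type not in ("Maximize", "Minimize"):
        raise ValueError(f"unknown objective type: {objective_type}")
    pick = max if objective_type == "Maximize" else min
    return pick(results.items(), key=lambda item: item[1])


# Hypothetical metric values, for illustration only.
results = {"job-a": 0.95, "job-b": 0.97, "job-c": 0.93}
print(best_job(results))              # ('job-b', 0.97)
print(best_job(results, "Minimize"))  # ('job-c', 0.93)
```

For a metric such as accuracy, "Maximize" is the natural choice; for a loss or error metric the same tuner would be configured with "Minimize".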

Accessing the Model Endpoints

Accessing the Model Endpoints via Postman
● Get the invocation endpoint from the model details in the Amazon
SageMaker console

Accessing the Model Endpoints via Postman
● Generate the access key ID and secret access key from your security
credentials. These are used to create the AWS Signature authorization in Postman
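
The pieces entered into Postman can be sketched in Python as follows. The URL pattern is the one the SageMaker runtime uses for invocations; the endpoint name is hypothetical, and the region is taken from the Lambda example in this deck. Signing is left out: Postman's AWS Signature option takes the access key, secret key, region and a service name, which is assumed here to be sagemaker.

```python
import json


def invocation_url(region, endpoint_name):
    """Return the HTTPS URL used to invoke a SageMaker endpoint."""
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


def request_body(messages):
    """Return the JSON body expected by the text classification model."""
    return json.dumps({"instances": list(messages)})


# "my-blazingtext-endpoint" is a hypothetical endpoint name.
url = invocation_url("ap-southeast-2", "my-blazingtext-endpoint")
headers = {"Content-Type": "application/json"}
body = request_body(["Hello, are we still meeting today?"])
print(url)
print(body)
```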

Accessing the Model Endpoints via Lambda Functions
● Setting up a Lambda function to access the SageMaker endpoint and return
the response
● Setting up the API Gateway that
1. Accepts the client request
2. Forwards the request parameters to the Lambda function and waits for the
response
3. Forwards the response to the client

Lambda Function

import os
import json
import boto3

# Name of the SageMaker endpoint, read from the Lambda environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime = boto3.Session().client(service_name='sagemaker-runtime', region_name='ap-southeast-2')

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    # event is the SMS text; json.dumps takes care of quoting and escaping
    payload = json.dumps({"instances": [event]})
    print(payload)
    # ContentType is set explicitly on the assumption that the endpoint expects JSON
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='application/json',
                                       Body=payload)
    predictions = response['Body'].read().decode('utf8')
    print(predictions)
    return predictions

API Gateway Configuration
Curl request:
curl -X POST https://los599anje.execute-api.ap-southeast-2.amazonaws.com/test/spamdetection -d '"Hello from Airtel. For 1 months free access call 9113851022"'
Response:
"[{"prob": [0.880291223526001], "label": ["__label__spam"]}]"

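A client still has to turn the API response into a label and a probability. The sketch below assumes the body is a JSON string that itself contains the model's JSON output, which the outer quotes in the response above suggest. The function name is hypothetical.

```python
import json


def parse_prediction(body):
    """Return (label, probability) for the first prediction in the response body."""
    decoded = json.loads(body)
    if isinstance(decoded, str):  # the Lambda returned a JSON string, so decode once more
        decoded = json.loads(decoded)
    first = decoded[0]
    label = first["label"][0].replace("__label__", "", 1)
    return label, first["prob"][0]


# Rebuild a response like the one above, including the extra layer of string encoding.
model_output = json.dumps([{"prob": [0.880291223526001], "label": ["__label__spam"]}])
api_body = json.dumps(model_output)
print(parse_prediction(api_body))  # ('spam', 0.880291223526001)
```

The same function also accepts the model output without the extra encoding layer, so it can be reused when calling the endpoint directly.
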
Blue/Green Deployments using
Amazon SageMaker

Deploying multiple models to the same endpoint
● Particularly useful in blue/green deployments
● Blue deployment -> current deployment
● Green deployment -> the new model that is to be tested in production
● We can do this by diverting a small amount of traffic to the green
deployment. We achieve this using ProductionVariant in SageMaker.
● Easy rollback to the blue deployment if the green one fails.
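
A minimal sketch of how two production variants with different weights might be described. The dictionary keys match the ProductionVariants structure accepted by boto3's create_endpoint_config; the model names, variant names, instance type and helper functions are hypothetical, and no AWS call is made here.

```python
def production_variants(blue_model, green_model, green_percent, instance_type="ml.m4.xlarge"):
    """Build a two-variant list; green_percent is the share of traffic for the green model."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")

    def variant(name, model, weight):
        return {
            "VariantName": name,
            "ModelName": model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight,
        }

    return [
        variant("blue", blue_model, 100 - green_percent),
        variant("green", green_model, green_percent),
    ]


def traffic_shares(variants):
    """Return each variant's fraction of traffic (its weight divided by the total weight)."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Hypothetical model names.
variants = production_variants("spam-model-v1", "spam-model-v2", green_percent=10)
shares = traffic_shares(variants)
print(shares)  # {'blue': 0.9, 'green': 0.1}
# With AWS access, the list could be passed as ProductionVariants to
# boto3.client("sagemaker").create_endpoint_config(...).
```

Weights are relative, so only their ratio matters: 90 and 10 route traffic the same way as 9 and 1.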

Steps in Blue/Green Deployment
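[Diagram: traffic split between model variants A and B behind the SageMaker endpoint]
1. User -> SageMaker Endpoint: A 100%
2. User -> SageMaker Endpoint: A 90%, B 10%
3. User -> SageMaker Endpoint: A 0%, B 100%
4. User -> SageMaker Endpoint: B 100%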
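
The traffic percentages in the stages above can be expressed as a sequence of weight updates. The sketch below builds payloads in the shape that boto3's update_endpoint_weights_and_capacities accepts as DesiredWeightsAndCapacities. The variant names "blue" (A) and "green" (B), the endpoint name in the comment and the helper function are hypothetical, and no AWS call is made.

```python
def weight_updates(stages):
    """Turn (blue_weight, green_weight) stages into DesiredWeightsAndCapacities payloads."""
    return [
        [
            {"VariantName": "blue", "DesiredWeight": blue},
            {"VariantName": "green", "DesiredWeight": green},
        ]
        for blue, green in stages
    ]


# Stages taken from the diagram: all traffic on A, a 90/10 split, then all traffic on B.
stages = [(100, 0), (90, 10), (0, 100)]
updates = weight_updates(stages)
for payload in updates:
    # With AWS access, each payload could be applied with
    # boto3.client("sagemaker").update_endpoint_weights_and_capacities(
    #     EndpointName="my-endpoint", DesiredWeightsAndCapacities=payload)
    print(payload)
```

Rolling back to the blue deployment amounts to re-applying an earlier payload from the list.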

Switching to the new Model

Production Variant Deployment

Invocation Metrics

Security and Best Practices

Deployment Recommendations
● Deploy multiple instances for each production endpoint
● Deploy in a VPC with more than one subnet, spread across multiple
Availability Zones

Security Practices
● Specify a Virtual Private Cloud (VPC) for your notebook instance, with
outbound connections routed through Network Address Translation (NAT)
● Exercise judgement when granting individuals access to notebook
instances that are attached to a VPC that contains sensitive
information.

Securing Training jobs and Endpoints
● Run Training jobs in a private VPC.
● Create a VPC endpoint to access S3
● Configure a custom policy that allows access to S3 only from your private
VPC
● If you want to deny access to certain resources, add deny statements for
them to the custom policy
● Configure a security group rule that allows inbound communication
between members of the same security group
● Configure a NAT gateway that allows only outbound connections from
the private VPC.
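
One way to restrict a bucket to a VPC endpoint is a bucket policy that denies every request not arriving through that endpoint, using the aws:SourceVpce condition key. The sketch below only builds the policy document; the bucket name, the VPC endpoint ID and the function name are hypothetical. A policy like this also blocks access from outside the VPC, including the console, so it should be tried on a test bucket first.

```python
import json


def vpc_only_bucket_policy(bucket, vpc_endpoint_id):
    """Return a bucket policy that denies S3 access unless it comes via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }


# Hypothetical bucket name and VPC endpoint ID.
policy = vpc_only_bucket_policy("sagemaker-spam-demo", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```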

THANK YOU
