© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Alastair Cousins
Senior Solutions Architect, Amazon Web Services
Building A Recommender
System On AWS
Let’s Build
Amazon SageMaker
1. Notebook Instances
2. Algorithms
3. ML Training Service
4. ML Hosting Service
Problem Framing
Data Exploration &
Preparation
Training
Optimisation
Deployment
Content-Based Filtering
UserId Rock Jazz HipHop Classical
7653 5 2 3 1
Generate Recommendations based on known user preferences
• Easy to understand and implement
• Applicable to domains where user
preferences can be captured in full
• Cannot predict new user preferences
Collaborative Filtering
• Based on Item-to-Item
relationships
• Derived from Explicit and
Implicit features
• Recommendations are based
on other users' experiences
What Should Our Solution Look Like?
UserId Liked MovieId
123 12 19 87 171
456 15 19 87 231
Movies 19, 87 are
liked by multiple users
What Should Our Solution Look Like?
UserId Liked MovieId
123 12 19 87 171
456 15 19 87 231
Likely
Recommendation
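The two-user table above can be worked through in a few lines. This is a minimal sketch, assuming the "likely recommendation" for a user is any movie liked by another user with overlapping taste that the first user has not liked yet; the function name `recommend_for` is hypothetical.

```python
# Liked movies per user, copied from the table above.
likes = {
    123: {12, 19, 87, 171},
    456: {15, 19, 87, 231},
}

def recommend_for(user_id, likes):
    """Suggest movies liked by users who share at least one liked movie."""
    mine = likes[user_id]
    suggestions = set()
    for other_id, theirs in likes.items():
        # Only borrow from users whose likes overlap with ours.
        if other_id != user_id and mine & theirs:
            suggestions |= theirs - mine
    return sorted(suggestions)

shared = sorted(likes[123] & likes[456])
print(shared)                     # [19, 87]
print(recommend_for(123, likes))  # [15, 231]
print(recommend_for(456, likes))  # [12, 171]
```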
Problem Framing
Data Exploration &
Preparation
Training
Optimisation
Deployment
Our Data Set: MovieLens
• Public Data Set produced by GroupLens Research
• https://grouplens.org/datasets/movielens/
Item Information
User Information
Visualising The Data
Total Feature Count = Users + Movies
= 2625 Features
Visualising The Data
Minimum rating
count per user: 20
Data Preparation: Binary Classification
Not Liked Liked
Matrix Factorisation With Factorisation Machines
Rating Matrix (users x items) ≈ User Matrix (users x k) × Item Matrix (k x items)
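To make the factorisation concrete, here is a minimal sketch of the standard second-order factorisation machine score for binary (one-hot) features. With exactly one user feature and one movie feature active, the pairwise term reduces to the dot product of the two length-k latent vectors, which is the matrix-factorisation view above. The function name `fm_score` and all numbers are illustrative assumptions, not values from the talk.

```python
def fm_score(w0, w, V, active):
    """Second-order factorisation machine score for binary features.

    w0: global bias; w: per-feature weights; V: per-feature latent
    vectors of length k; active: indices of the features equal to 1.
    """
    linear = sum(w[i] for i in active)
    pairwise = 0.0
    for a in range(len(active)):
        for b in range(a + 1, len(active)):
            vi, vj = V[active[a]], V[active[b]]
            # Interaction weight is the dot product of the latent vectors.
            pairwise += sum(p * q for p, q in zip(vi, vj))
    return w0 + linear + pairwise

# Toy model: 2 users + 2 movies = 4 features, k = 2 (made-up numbers).
w0 = 0.1
w = [0.0, 0.2, -0.1, 0.3]
V = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]]

# Second user (feature index 1) paired with second movie (feature index 3).
score = fm_score(w0, w, V, [1, 3])
print(score)
```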
Data Preparation: One-Hot Encoding
UserId MovieId Rating
2 4 5
Users (6 columns) | Movies (8 columns) | Rating
0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1
Data Preparation: One-Hot Encoding
UserId MovieId Rating
2 4 5
2 7 2
Users (6 columns) | Movies (8 columns) | Rating
0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1
0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0
Data Preparation: One-Hot Encoding
UserId MovieId Rating
2 4 5
2 7 2
4 5 4
Users (6 columns) | Movies (8 columns) | Rating
0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1
0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0
0 0 0 1 0 0 | 0 0 0 0 1 0 0 0 | 1
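The encoding in the table above can be sketched in plain Python. Two things here are assumptions read off the example rather than stated in the talk: the 14 columns split into 6 user columns followed by 8 movie columns, and a rating counts as "liked" when it is 4 or higher (consistent with 5 and 4 mapping to 1 and 2 mapping to 0). The helper names are hypothetical.

```python
def one_hot_row(user_id, movie_id, num_users, num_movies):
    """Return a dense one-hot row: user columns first, then movie columns."""
    row = [0] * (num_users + num_movies)
    row[user_id - 1] = 1                # IDs are 1-based
    row[num_users + movie_id - 1] = 1   # movie block starts after the users
    return row

def liked(rating, threshold=4):
    """Binary label: 1 if rating >= threshold (assumed cut-off), else 0."""
    return 1 if rating >= threshold else 0

ratings = [(2, 4, 5), (2, 7, 2), (4, 5, 4)]   # (UserId, MovieId, Rating)
rows = [one_hot_row(u, m, num_users=6, num_movies=8) for u, m, _ in ratings]
labels = [liked(r) for _, _, r in ratings]

for row, label in zip(rows, labels):
    print(row, label)
```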
Sparse Data
• One-Hot Encoding produces a 90,570 x 2,625 matrix (samples x features) for
our training set
• This data set is 99.92% zeros.
• Use a memory-efficient data structure: scipy.sparse.lil_matrix
Users (6 columns) | Movies (8 columns) | Rating
0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1
0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0
0 0 0 1 0 0 | 0 0 0 0 1 0 0 0 | 1
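A minimal sketch of the same three rows stored sparsely with `scipy.sparse.lil_matrix`, which only keeps the non-zero entries. The helper name `to_sparse`, the 6/8 column split and the "rating of 4 or more means liked" threshold are assumptions for illustration. The last lines check the sparsity figure: with two non-zeros per row and 2,625 columns, the share of zeros is 1 - 2/2625.

```python
import numpy as np
from scipy.sparse import lil_matrix

def to_sparse(ratings, num_users, num_movies, threshold=4):
    """Build a sparse one-hot feature matrix and a binary label vector."""
    X = lil_matrix((len(ratings), num_users + num_movies), dtype=np.float32)
    y = []
    for row, (user_id, movie_id, rating) in enumerate(ratings):
        X[row, user_id - 1] = 1                # user block
        X[row, num_users + movie_id - 1] = 1   # movie block
        y.append(1 if rating >= threshold else 0)
    return X, np.array(y, dtype=np.float32)

ratings = [(2, 4, 5), (2, 7, 2), (4, 5, 4)]   # (UserId, MovieId, Rating)
X, y = to_sparse(ratings, num_users=6, num_movies=8)
print(X.shape, X.nnz)   # (3, 14) 6

# Two non-zeros per row out of 2,625 columns.
sparsity = 1 - 2 / 2625
print(f"{sparsity:.2%}")   # 99.92%
```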
Data Preparation
[Diagram: Raw Data (Amazon S3) -> SageMaker Notebooks (Prepare Training Data) -> Prepared Data (Amazon S3) -> SageMaker Training]
1. Import Data
2. Identify and Enrich Features
3. Format data for training
4. Write out data for distributed training
Problem Framing
Data Exploration &
Preparation
Training
Optimisation
Deployment
Factorisation Machines
                   Log_loss   F1 Score   Seconds
SageMaker          0.494      0.277      820
Other (10 Iter)    0.516      0.190      650
Other (20 Iter)    0.507      0.254      1300
Other (50 Iter)    0.481      0.313      3250
Click Prediction 1 TB advertising dataset, m4.4xlarge machines, perfect scaling.
[Chart: Cost in Dollars vs Billable Time in Hours, for 10, 20, 30, 40 and 50 machines]
Training – Single Instance
Training – Multiple Instances
Evaluating The Model
F1 Score: 73.8%
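For reference, the F1 score is the harmonic mean of precision and recall on the "liked" class. The sketch below computes it from scratch on made-up labels; the data is illustrative only and is not the evaluation set behind the figure above.

```python
def f1_score(true_labels, predicted_labels):
    """F1 for the positive class (label 1) from two equal-length lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [1, 0, 1, 1, 0, 1]   # made-up ground truth
y_pred = [1, 0, 0, 1, 1, 1]   # made-up predictions
print(f1_score(y_true, y_pred))   # 0.75
```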
Model Training
[Diagram: Raw Data (Amazon S3) -> SageMaker Notebooks (Prepare Training Data) -> Prepared Data (Amazon S3) -> SageMaker Training (Train & Optimise, with the Training Algorithm in a Docker Container) -> Trained Model (Amazon S3)]
Problem Framing
Data Exploration &
Preparation
Training
Optimisation
Deployment
Optimisation Approaches
Add higher order
features
Hyperparameter
Optimisation
Hybrid Models
Higher Order Features
• Adding additional features can improve accuracy
• Additional Features help with Cold Start suggestions
• Select features by experimentation
Users (6 columns) | Movies (8 columns) | Genres (6 columns)
0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1 0 0 1 0 0
0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0 1 0 0 1 0
0 0 0 1 0 0 | 0 0 0 0 1 0 0 0 | 1 0 0 0 1 0
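Adding the genre columns amounts to appending a multi-hot genre vector to each one-hot row, which also widens the feature dimension the model has to be trained with. A minimal sketch, using the first row of the table above; the helper name `with_genres` is hypothetical.

```python
def with_genres(one_hot_row, genre_flags):
    """Append multi-hot genre flags to a user+movie one-hot row."""
    return list(one_hot_row) + list(genre_flags)

base = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]   # user 2, movie 4
genres = [1, 0, 0, 1, 0, 0]                          # movie belongs to two genres
row = with_genres(base, genres)

print(len(base), len(row))   # 14 20
```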
Hyperparameters
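A minimal sketch of a hyperparameter set for the SageMaker Factorisation Machines algorithm, written as a plain dictionary. The keys are hyperparameter names of that built-in algorithm; only `feature_dim` = 2625 comes from this talk, and the remaining values are illustrative assumptions rather than the settings used by the presenter.

```python
hyperparameters = {
    "feature_dim": 2625,                    # users + movies, from the data exploration slide
    "predictor_type": "binary_classifier",  # liked / not liked
    "num_factors": 64,                      # latent dimension k (illustrative value)
    "mini_batch_size": 1000,                # illustrative value
    "epochs": 50,                           # illustrative value
}

for name, value in hyperparameters.items():
    print(f"{name} = {value}")
```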
Hyperparameter Optimisation
Apply Machine Learning to optimise model training hyperparameters
F1 Score with Optimised Hyperparameters: 77.2% (+3.5%)
Hybrid Models
1. Cluster individual users into groups
2. Sort prediction data sets based on genres
3. Generate predictions using the clustered user and filtered prediction data set that aligns best to the application context
Example labels from the slide: Horror Fans, Age 8-10, New Signups, Comedies, New Releases, Animation
Problem Framing
Data Exploration &
Preparation
Training
Optimisation
Deployment
Deploying An Endpoint
[Diagram: Raw Data (Amazon S3) -> SageMaker Notebooks (Prepare Training Data) -> Prepared Data (Amazon S3) -> SageMaker Training (Train & Optimise with HPO, Training Algorithm in an Algorithm Container) -> Trained Model (Amazon S3) -> SageMaker Hosting (Deploy the Trained Model)]
Calling SageMaker Endpoints
• Understand the inference request format for your algorithm
• Factorisation Machines support JSON & protobuf
• Sample JSON Payload:
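A minimal sketch of building a JSON request in the sparse format accepted by the Factorisation Machines inference container: each instance lists the indices of its non-zero features (`keys`), the total feature count (`shape`) and the non-zero values. The helper name `build_payload` and the small 6-user, 8-movie layout are assumptions for illustration; a model trained on the data set above would use a shape of 2625.

```python
import json

def build_payload(pairs, num_users, num_movies):
    """Encode (user_id, movie_id) pairs as sparse one-hot JSON instances."""
    feature_dim = num_users + num_movies
    instances = []
    for user_id, movie_id in pairs:
        # Zero-based column indices: user block first, then movie block.
        keys = [user_id - 1, num_users + movie_id - 1]
        instances.append(
            {"data": {"features": {"keys": keys, "shape": [feature_dim], "values": [1, 1]}}}
        )
    return json.dumps({"instances": instances})

payload = build_payload([(2, 4)], num_users=6, num_movies=8)
print(payload)
```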
Integrating Endpoints With Applications
[Diagram: Client -> API Gateway -> Lambda Function -> SageMaker Endpoint]
Endpoint Invocation
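A minimal sketch of a Lambda handler that forwards a JSON request body to a SageMaker endpoint through the boto3 `sagemaker-runtime` client and returns the parsed predictions. The endpoint name is hypothetical, and the optional `runtime` argument is an assumption added so the handler can be exercised without AWS credentials; it is not taken from the talk.

```python
import json

ENDPOINT_NAME = "movie-recommender"   # hypothetical endpoint name

def lambda_handler(event, context, runtime=None):
    """Forward the request body to the endpoint and return its predictions."""
    if runtime is None:
        import boto3   # imported lazily so the handler can be tested offline
        runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=event["body"],
    )
    # The response body is a stream; read it fully and decode the JSON.
    result = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```

With an API Gateway proxy integration, `event["body"]` would carry the JSON payload sent by the client.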
Enriching The Endpoint Response
Endpoint
Response
Enriched
with MovieId
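The endpoint returns predictions in request order without identifiers, so the caller has to pair them with the movie IDs it asked about. A minimal sketch, assuming the binary-classifier response shape with `score` and `predicted_label` per prediction; the helper name `enrich` and the numbers are made up.

```python
def enrich(movie_ids, response):
    """Attach movie IDs to predictions and rank by score, highest first."""
    ranked = [
        {"movie_id": movie_id, "score": p["score"], "predicted_label": p["predicted_label"]}
        for movie_id, p in zip(movie_ids, response["predictions"])
    ]
    ranked.sort(key=lambda item: item["score"], reverse=True)
    return ranked

# Made-up response for two candidate movies, in request order.
response = {"predictions": [
    {"score": 0.31, "predicted_label": 0.0},
    {"score": 0.87, "predicted_label": 1.0},
]}
top = enrich([12, 171], response)
print(top[0]["movie_id"])   # 171
```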
Solution Architecture
[Diagram: Raw Data (Amazon S3) -> SageMaker Notebooks (Prepare Training Data) -> Prepared Data (Amazon S3) -> SageMaker Training (Train & Optimise with HPO, Training Algorithm in an Algorithm Container) -> Trained Model (Amazon S3) -> SageMaker Hosting (Deploy the Trained Model). User Interactions send Inference requests through API Gateway and AWS Lambda to SageMaker Hosting.]
Where To From Here?
- Sample Code: https://medium.com/@julsimon/building-a-movie-recommender-with-factorization-machines-on-amazon-sagemaker-cedbfc8c93d8
- SageMaker HPO Preview: https://pages.awscloud.com/amazon-sagemaker-hpo-preview.html
Thank You

Building a Recommender System on AWS - AWS Summit Sydney 2018
