© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Kwun-Hok Chan, Solution Architect
5 September 2018
Building a Recommender System on AWS
Let’s Build
Amazon SageMaker
1. Notebook Instances
2. Algorithms
3. ML Training Service
4. ML Hosting Service
github.com/chankh/sagemaker-fm-movielens
Problem Framing
Data Exploration & Preparation
Training
Optimization
Deployment
Content-Based Filtering
UserId  Rock  Jazz  HipHop  Classical
7653    5     2     3       1
Generate recommendations based on known user preferences
• Easy to understand and implement
• Applicable to domains where user preferences can be captured in full
• Cannot predict new user preferences
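The idea can be sketched in a few lines of Python. The user profile is the row for user 7653 shown in the table; the catalogue, its item names, and the averaging rule are hypothetical and only illustrate the approach.

```python
# Known preferences for user 7653, taken from the table (higher = stronger preference).
user_profile = {"Rock": 5, "Jazz": 2, "HipHop": 3, "Classical": 1}

# Hypothetical catalogue: each item is tagged with one or more genres.
catalogue = {
    "album_a": ["Rock"],
    "album_b": ["Jazz", "Classical"],
    "album_c": ["HipHop", "Rock"],
    "album_d": ["Classical"],
}

def score(genres, profile):
    """Average the user's stated preference over the item's genres."""
    return sum(profile.get(g, 0) for g in genres) / len(genres)

def recommend(profile, items, top_n=2):
    """Rank items by preference score, highest first."""
    ranked = sorted(items, key=lambda name: score(items[name], profile), reverse=True)
    return ranked[:top_n]

recommendations = recommend(user_profile, catalogue)
print(recommendations)
```

A genre missing from the profile scores 0 here, which mirrors the limitation in the last bullet: nothing can be inferred about preferences the user has never stated.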
Collaborative Filtering
• Based on item-to-item relationships
• Derived from explicit and implicit features
• Recommendations are based on other users' experience
What Should Our Solution Look Like?
UserId  Liked MovieIds
123     12, 19, 87, 171
456     15, 19, 87, 231
Movies 19 and 87 are liked by multiple users
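Likely recommendations: the movies one user liked that do not appear in the other user's list (15 and 231 for user 123; 12 and 171 for user 456)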
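A minimal sketch of this reasoning, using the two users from the table. The `min_overlap` rule is an assumption added for illustration; the deck itself goes on to use Factorization Machines rather than set arithmetic.

```python
# Liked movies per user, taken from the table.
liked = {
    123: {12, 19, 87, 171},
    456: {15, 19, 87, 231},
}

def co_liked(a, b):
    """Movies that both users liked."""
    return liked[a] & liked[b]

def candidates(target, neighbour, min_overlap=2):
    """Movies the neighbour liked that the target has not, if their tastes overlap enough."""
    if len(co_liked(target, neighbour)) < min_overlap:
        return []
    return sorted(liked[neighbour] - liked[target])

shared = sorted(co_liked(123, 456))
for_123 = candidates(123, 456)
for_456 = candidates(456, 123)
print(shared, for_123, for_456)
```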
Our Data Set: MovieLens
• Public Data Set produced by GroupLens Research
• https://grouplens.org/datasets/movielens/
Item Information
User Information
Visualizing The Data
Total feature count = users + movies = 2,625 features
Visualizing The Data
Minimum rating count per user: 20
Data Preparation: Binary Classification
Ratings are split into two classes: Not Liked and Liked
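A sketch of the labelling step. The exact cut-off is not stated in this transcript; a threshold of 4 is an assumption, chosen because it is consistent with the one-hot encoding examples in the deck, where ratings 5 and 4 map to 1 and rating 2 maps to 0.

```python
LIKED_THRESHOLD = 4  # assumption: ratings of 4 or 5 count as "Liked"

def to_label(rating, threshold=LIKED_THRESHOLD):
    """Map a 1-5 star rating to a binary label: 1 = Liked, 0 = Not Liked."""
    return 1 if rating >= threshold else 0

ratings = [5, 2, 4]
labels = [to_label(r) for r in ratings]
print(labels)
```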
Matrix Factorization With Factorization Machines
Rating Matrix ≈ User Matrix × Item Matrix (shared inner dimension 𝑘)
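In standard notation this can be written as follows. The symbols are assumptions, not taken from the slide: R is the m × n rating matrix, U and V hold one k-dimensional factor vector per user and per item, and the second line is the usual second-order Factorization Machine model over a feature vector x of length d.

```latex
R \approx U V^{\top}, \qquad
R \in \mathbb{R}^{m \times n},\quad
U \in \mathbb{R}^{m \times k},\quad
V \in \mathbb{R}^{n \times k}

\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{d} w_i x_i
  + \sum_{i=1}^{d} \sum_{j=i+1}^{d} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
```

When x contains exactly one active user feature and one active movie feature, the pairwise sum collapses to a single inner product between that user's and that movie's factor vectors, which is the matrix factorization prediction plus bias terms.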
Data Preparation: One-Hot Encoding
UserId  MovieId  Rating
2       4        5
2       7        2
4       5        4

Users         Movies             Rating
0 1 0 0 0 0   0 0 0 1 0 0 0 0    1
0 1 0 0 0 0   0 0 0 0 0 0 1 0    0
0 0 0 1 0 0   0 0 0 0 1 0 0 0    1
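The encoding can be reproduced with a short sketch. The sizes of 6 user columns and 8 movie columns are inferred from the 14-column example rows; IDs are assumed to be 1-based.

```python
N_USERS = 6   # assumption: matches the user columns in the example rows
N_MOVIES = 8  # assumption: matches the movie columns in the example rows

def one_hot(user_id, movie_id, n_users=N_USERS, n_movies=N_MOVIES):
    """Return a dense one-hot row: user block first, then movie block (1-based IDs)."""
    row = [0] * (n_users + n_movies)
    row[user_id - 1] = 1
    row[n_users + movie_id - 1] = 1
    return row

samples = [(2, 4, 5), (2, 7, 2), (4, 5, 4)]  # (UserId, MovieId, Rating)
rows = [one_hot(user_id, movie_id) for user_id, movie_id, _ in samples]
for row in rows:
    print(" ".join(map(str, row)))
```

Every row has exactly two non-zero entries regardless of how many users and movies exist, which is why the full matrix is so sparse.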
Sparse Data
• One-hot encoding produces a 90,570 × 2,625 matrix for our training set (one row per rating, one column per feature)
• This data set is 99.92% zeros
• Use a memory-efficient data structure: scipy.sparse.lil_matrix
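A minimal sketch of building such a matrix with `scipy.sparse.lil_matrix`. The user and movie counts (943 and 1,682) are an assumption based on the MovieLens 100K data set, chosen because they sum to the 2,625 features quoted in the deck; the three samples and the liked threshold are illustrative only.

```python
import numpy as np
from scipy.sparse import lil_matrix

N_USERS, N_MOVIES = 943, 1682  # assumption: MovieLens 100K sizes, summing to 2,625

def build_sparse(samples, n_users=N_USERS, n_movies=N_MOVIES, liked_threshold=4):
    """Build one sparse row per (user_id, movie_id, rating) sample; IDs are 1-based."""
    X = lil_matrix((len(samples), n_users + n_movies), dtype=np.float32)
    y = np.zeros(len(samples), dtype=np.float32)
    for i, (user_id, movie_id, rating) in enumerate(samples):
        X[i, user_id - 1] = 1
        X[i, n_users + movie_id - 1] = 1
        y[i] = 1 if rating >= liked_threshold else 0  # threshold is an assumption
    return X, y

X, y = build_sparse([(2, 4, 5), (2, 7, 2), (4, 5, 4)])
sparsity = 1 - X.nnz / (X.shape[0] * X.shape[1])
print(X.shape, X.nnz, f"{sparsity:.2%}")
```

With two non-zero entries in each 2,625-column row, the fraction of zeros is 1 - 2/2625, which matches the 99.92% figure. Only the non-zero entries are stored, so memory grows with the number of ratings rather than with rows × columns.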
Data Preparation
Raw Data (Amazon S3) → SageMaker Notebooks: Prepare Training Data → Prepared Data → SageMaker Training
1. Import data
2. Identify and enrich features
3. Format data for training
4. Write out data for distributed training
Training – Single Instance
Training – Multiple Instances
Evaluating The Model
F1 Score: 73.8%
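For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for binary labels; the sample labels and predictions are hypothetical and unrelated to the 73.8% result.

```python
def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1]
f1 = f1_score(y_true, y_pred)
print(round(f1, 3))
```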
Model Training
Raw Data (Amazon S3) → SageMaker Notebooks: Prepare Training Data → Prepared Data (Amazon S3) → SageMaker Training: Train & Optimize (Training Algorithm, Docker Container) → Trained Model (Amazon S3)
Optimization Approaches
• Add higher order features
• Hyperparameter Optimization
• Hybrid Models
Higher Order Features
• Adding additional features can improve accuracy
• Additional Features help with Cold Start suggestions
• Select features by experimentation
Users         Movies             Genres
0 1 0 0 0 0   0 0 0 1 0 0 0 0    1 0 0 1 0 0
0 1 0 0 0 0   0 0 0 0 0 0 1 0    0 1 0 0 1 0
0 0 0 1 0 0   0 0 0 0 1 0 0 0    1 0 0 0 1 0
Hyperparameters
Hyperparameter Optimization
Apply Machine Learning to optimize model training hyperparameters
F1 Score with optimized hyperparameters: 77.2% (+3.5%)
Hybrid Models
1. Cluster individual users into groups
2. Sort prediction data sets based on genres
3. Generate predictions using the clustered user and filtered prediction data set that aligns best to the application context
Example user groups: Horror Fans, Age 8-10, New Signups
Example prediction data sets: Comedies, New Releases, Animation
Deploying An Endpoint
Raw Data (Amazon S3) → SageMaker Notebooks: Prepare Training Data → Prepared Data (Amazon S3) → SageMaker Training: Train & Optimize (Training Algorithm, Algorithm Container, HPO) → Trained Model (Amazon S3) → SageMaker Hosting: Deploy (Trained Model)
Calling SageMaker Endpoints
• Understand the inference request format for your algorithm
• Factorization Machines support JSON & protobuf
• Sample JSON Payload:
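The sample payload from the slide is not reproduced in this transcript. As a sketch, the dense JSON request shape documented for SageMaker built-in algorithms wraps each feature row in an `instances` list; the helper name `build_payload` is hypothetical, and the row is the toy 14-column encoding for user 2 and movie 4.

```python
import json

def build_payload(rows):
    """Wrap dense feature rows in the {"instances": [{"features": [...]}]} request shape."""
    return json.dumps({"instances": [{"features": list(row)} for row in rows]})

# One-hot row for user 2 and movie 4 in the 14-column toy example.
row = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
payload = build_payload([row])
print(payload)
```

For the real 2,625-column feature space a dense row is mostly zeros, so a sparse request format may be preferable; check the algorithm's inference format documentation before relying on either shape.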
Integrating Endpoints With Applications
Client → API Gateway → Lambda Function → SageMaker Endpoint
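A minimal sketch of the Lambda function in this chain, assuming an API Gateway proxy event whose JSON body has the shape `{"rows": [[...], ...]}`. The endpoint name and request shape are hypothetical. `invoke_endpoint` is the boto3 `sagemaker-runtime` client call; the client is injectable so that the handler can be exercised without AWS access.

```python
import json

ENDPOINT_NAME = "fm-movielens-endpoint"  # hypothetical endpoint name

def handler(event, context, runtime=None):
    """Lambda entry point: forward feature rows from the API request to the endpoint."""
    if runtime is None:
        import boto3  # imported lazily so the sketch can be tested without AWS access
        runtime = boto3.client("sagemaker-runtime")
    rows = json.loads(event["body"])["rows"]  # assumed request shape
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps({"instances": [{"features": r} for r in rows]}),
    )
    # The response body is a stream; read it fully and parse the JSON predictions.
    predictions = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(predictions)}
```

Keeping the endpoint behind Lambda and API Gateway means clients never need SageMaker credentials, and the function is a natural place to validate input and reshape the response.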
Endpoint Invocation
Enriching The Endpoint Response
Endpoint response enriched with MovieId
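A sketch of the enrichment step. It assumes that the endpoint returns one prediction per submitted row, in request order, with `score` and `predicted_label` fields; the candidate movie IDs and scores below are hypothetical.

```python
def enrich(movie_ids, predictions):
    """Attach each candidate MovieId to its prediction and rank by score, highest first."""
    enriched = [
        {"movie_id": m, "score": p["score"], "predicted_label": p["predicted_label"]}
        for m, p in zip(movie_ids, predictions)
    ]
    return sorted(enriched, key=lambda item: item["score"], reverse=True)

# Hypothetical candidates and endpoint output, for illustration only.
movie_ids = [15, 231, 12]
predictions = [
    {"score": 0.41, "predicted_label": 0},
    {"score": 0.87, "predicted_label": 1},
    {"score": 0.66, "predicted_label": 1},
]
ranked = enrich(movie_ids, predictions)
print([item["movie_id"] for item in ranked])
```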
Solution Architecture
Training path: Raw Data (Amazon S3) → SageMaker Notebooks: Prepare Training Data → Prepared Data (Amazon S3) → SageMaker Training: Train & Optimize (Training Algorithm, Algorithm Container, HPO) → Trained Model (Amazon S3) → SageMaker Hosting: Deploy (Trained Model)
Inference path: User Interactions → API Gateway → AWS Lambda → SageMaker Hosting (inference requests)
Where To From Here?
- Amazon SageMaker: https://aws.amazon.com/sagemaker/developer-resources/
- SageMaker Automatic Model Tuning: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html
Your Feedback is Important!
http://bit.ly/2Cjblrc
Thank You!
@kwunhok

Building a Recommender System on AWS

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Kwun-Hok Chan, Solution Architect 5 September 2018 Building a Recommender System on AWS
  • 5. Let’s Build
  • 6. Amazon SageMaker: 1 | Notebook Instances, 2 | Algorithms, 3 | ML Training Service, 4 | ML Hosting Service
  • 7. github.com/chankh/sagemaker-fm-movielens
  • 8. Problem Framing → Data Exploration & Preparation → Training → Optimization → Deployment
  • 9. Content-Based Filtering: generate recommendations based on known user preferences. Example: UserId 7653 rates Rock 5, Jazz 2, HipHop 3, Classical 1. • Easy to understand and implement • Applicable to domains where user preferences can be captured in full • Cannot predict new user preferences
  • 10. Collaborative Filtering • Based on item-to-item relationships • Derived from explicit and implicit features • Recommendations are based on other users’ experiences
  • 11. What Should Our Solution Look Like? UserId 123 liked MovieIds 12, 19, 87, 171; UserId 456 liked MovieIds 15, 19, 87, 231. Movies 19 and 87 are liked by multiple users.
  • 12. What Should Our Solution Look Like? UserId 123 liked MovieIds 12, 19, 87, 171; UserId 456 liked MovieIds 15, 19, 87, 231. The slide highlights the likely recommendation.
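The co-liked idea in the two "What Should Our Solution Look Like?" slides can be sketched in a few lines of Python. This is only an illustration of the intuition, not the method used in the talk (which trains a Factorization Machines model); the `recommend` function and its overlap-based scoring are assumptions made for this example.

```python
from collections import Counter

# Liked-movie sets copied from the slide's example table.
liked = {
    123: {12, 19, 87, 171},
    456: {15, 19, 87, 231},
}

def recommend(user_id, liked_by_user):
    """Suggest movies liked by users who share at least one liked movie."""
    own = liked_by_user[user_id]
    scores = Counter()
    for other_id, other in liked_by_user.items():
        if other_id == user_id:
            continue
        overlap = len(own & other)  # shared likes act as a similarity weight
        if overlap == 0:
            continue
        for movie in other - own:   # only movies the user has not liked yet
            scores[movie] += overlap
    # Highest score first; ties are broken by movie ID for determinism.
    return [m for m, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

print(recommend(123, liked))
print(recommend(456, liked))
```

Because users 123 and 456 both liked movies 19 and 87, each user is offered the movies that only the other one liked.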
  • 13. Problem Framing → Data Exploration & Preparation → Training → Optimization → Deployment
  • 14. Our Data Set: MovieLens • Public data set produced by GroupLens Research • https://grouplens.org/datasets/movielens/
  • 15. Item Information
  • 16. User Information
  • 17. Visualizing The Data: Total Feature Count = Users + Movies = 2,625 features
  • 18. Visualizing The Data: Minimum rating count per user: 20
  • 19. Data Preparation: Binary Classification (Not Liked / Liked)
  • 20. Matrix Factorization With Factorization Machines: Rating Matrix ≈ User Matrix × Item Matrix, where k is the shared inner dimension of the two factor matrices
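To make the link between matrix factorization and Factorization Machines concrete, here is a minimal sketch of the standard second-order factorization machine score. It is not taken from the talk and says nothing about how the SageMaker algorithm is implemented internally; the function name `fm_predict` and all weights below are made-up values for illustration. With a one-hot user and a one-hot movie as the only active features, the pairwise term reduces to the dot product of that user's and that movie's k-dimensional latent vectors, which is the matrix factorization picture on the slide.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine score for one feature vector.

    x:  list of n feature values
    w0: global bias
    w:  list of n linear weights
    v:  list of n latent vectors, each of length k
    """
    n, k = len(x), len(v[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if v_jf x_i x_j
        #         = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Toy setup: 2 users + 2 movies, k = 2 (illustrative numbers only).
x = [1, 0, 0, 1]                      # user 1 and movie 2 are active
w0 = 0.1
w = [0.2, 0.0, 0.0, 0.3]
v = [[1.0, 2.0], [0.0, 0.0], [0.0, 0.0], [0.5, -1.0]]

score = fm_predict(x, w0, w, v)
dot = sum(a * b for a, b in zip(v[0], v[3]))  # user vector . movie vector
print(score, w0 + w[0] + w[3] + dot)
```

The two printed numbers agree, showing that for one-hot inputs the interaction term is exactly the user-movie dot product.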
  • 21. Data Preparation: One-Hot Encoding. Input rows (UserId, MovieId, Rating): (2, 4, 5). Encoded rows (Users | Movies | Rating): 0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1
  • 22. Data Preparation: One-Hot Encoding. Input rows (UserId, MovieId, Rating): (2, 4, 5), (2, 7, 2). Encoded rows (Users | Movies | Rating): 0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1; 0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0
  • 23. Data Preparation: One-Hot Encoding. Input rows (UserId, MovieId, Rating): (2, 4, 5), (2, 7, 2), (4, 5, 4). Encoded rows (Users | Movies | Rating): 0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1; 0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0; 0 0 0 1 0 0 | 0 0 0 0 1 0 0 0 | 1
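The encoding on these three slides can be reproduced with a short sketch. The block widths (6 user columns, 8 movie columns) are inferred from the slide's example so that the IDs line up, and the "liked" threshold of 4 is an assumption that happens to match the three labels shown; neither is stated explicitly in the deck.

```python
NUM_USERS = 6        # assumed width of the Users block in the slide example
NUM_MOVIES = 8       # assumed width of the Movies block in the slide example
LIKED_THRESHOLD = 4  # assumption: ratings of 4 or 5 count as "liked"

def encode(user_id, movie_id, rating):
    """One-hot encode a (user, movie) pair and binarize the rating."""
    row = [0] * (NUM_USERS + NUM_MOVIES)
    row[user_id - 1] = 1                 # IDs are 1-based, columns 0-based
    row[NUM_USERS + movie_id - 1] = 1    # movie columns follow user columns
    label = 1 if rating >= LIKED_THRESHOLD else 0
    return row, label

ratings = [(2, 4, 5), (2, 7, 2), (4, 5, 4)]  # rows from the slide
for user_id, movie_id, rating in ratings:
    row, label = encode(user_id, movie_id, rating)
    print(row, label)
```

Each encoded row has exactly two non-zero entries, one in the user block and one in the movie block.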
  • 24. Sparse Data • One-hot encoding produces a 90,570 × 2,625 matrix for our training set (one row per rating, one column per feature) • This data set is 99.92% zeros • Use a memory-efficient data structure: scipy.sparse.lil_matrix. Encoded rows (Users | Movies | Rating): 0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1; 0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0; 0 0 0 1 0 0 | 0 0 0 0 1 0 0 0 | 1
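The 99.92% figure follows directly from the encoding: every row has two non-zero entries out of 2,625 columns. The sketch below checks that arithmetic and shows, in plain Python, the row-wise "list of column indices" idea that a list-of-lists sparse format relies on. It is a stand-in for illustration, not the notebook's scipy code; `to_sparse_rows` is a hypothetical helper.

```python
NUM_FEATURES = 2625   # users + movies, from the slides
NUM_ROWS = 90570      # training ratings, from the slides
NONZEROS_PER_ROW = 2  # one user column and one movie column per rating

sparsity = 1 - NONZEROS_PER_ROW / NUM_FEATURES
dense_cells = NUM_ROWS * NUM_FEATURES
stored_entries = NUM_ROWS * NONZEROS_PER_ROW
print(f"zeros: {sparsity:.2%}")
print(f"dense cells: {dense_cells:,} vs stored entries: {stored_entries:,}")

def to_sparse_rows(samples, num_users):
    """Keep only the column indices of the non-zero entries of each row."""
    return [[user - 1, num_users + movie - 1] for user, movie in samples]

# The three (UserId, MovieId) pairs from the one-hot slides, 6 user columns.
print(to_sparse_rows([(2, 4), (2, 7), (4, 5)], num_users=6))
```

Storing two indices per row instead of 2,625 cells is what makes the training matrix manageable in memory.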
  • 25. Data Preparation: 1. Import Data 2. Identify and Enrich Features 3. Format data for training 4. Write out data for distributed training. Diagram: Raw Data (Amazon S3) → SageMaker Notebooks (Prepare Training Data) → Prepared Data (Amazon S3) → SageMaker Training
  • 26. Problem Framing → Data Exploration & Preparation → Training → Optimization → Deployment
  • 27. Training – Single Instance
  • 28. Training – Multiple Instances
  • 29. Evaluating The Model: F1 Score 73.8%
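For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall on the "liked" class. The sketch below computes it from scratch; the label lists are invented for illustration and are unrelated to the 73.8% reported on the slide.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1) of a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative labels only: 1 = liked, 0 = not liked.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
print(f1_score(y_true, y_pred))
```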
  • 30. Model Training. Diagram: Raw Data (Amazon S3) → SageMaker Notebooks (Prepare Training Data) → Prepared Data (Amazon S3) → SageMaker Training (Train & Optimize; Training Algorithm in a Docker Container) → Trained Model (Amazon S3)
  • 31. Problem Framing → Data Exploration & Preparation → Training → Optimization → Deployment
  • 32. Optimization Approaches: add higher-order features, hyperparameter optimization, hybrid models
  • 33. Higher Order Features • Adding features can improve accuracy • Additional features help with cold-start suggestions • Select features by experimentation. Encoded rows (Users | Movies | Genres): 0 1 0 0 0 0 | 0 0 0 1 0 0 0 0 | 1 0 0 1 0 0; 0 1 0 0 0 0 | 0 0 0 0 0 0 1 0 | 0 1 0 0 1 0; 0 0 0 1 0 0 | 0 0 0 0 1 0 0 0 | 1 0 0 0 1 0
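Adding a feature block amounts to appending more columns to each encoded row. The sketch below extends the earlier one-hot example with a genre block; the widths (6 users, 8 movies, 6 genres) and the movie-to-genre mapping are read off the slide's example rows under the assumption that the genre flags belong to the movie in each row.

```python
NUM_USERS = 6    # assumed width of the Users block in the slide example
NUM_MOVIES = 8   # assumed width of the Movies block in the slide example

# Assumed mapping: the genre flags shown on each slide row, keyed by the
# MovieId of that row. A movie may carry more than one genre.
MOVIE_GENRES = {
    4: [1, 0, 0, 1, 0, 0],
    7: [0, 1, 0, 0, 1, 0],
    5: [1, 0, 0, 0, 1, 0],
}

def encode_with_genres(user_id, movie_id):
    """One-hot user and movie columns followed by multi-hot genre columns."""
    row = [0] * (NUM_USERS + NUM_MOVIES)
    row[user_id - 1] = 1
    row[NUM_USERS + movie_id - 1] = 1
    return row + MOVIE_GENRES[movie_id]

for user_id, movie_id in [(2, 4), (2, 7), (4, 5)]:
    print(encode_with_genres(user_id, movie_id))
```

Because genre columns are shared across movies, a movie with no ratings yet still activates features the model has seen, which is why extra features help with cold start.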
  • 34. Hyperparameters
  • 35. Hyperparameter Optimization: apply machine learning to optimize model training hyperparameters. F1 Score with optimized hyperparameters: 77.2% (+3.5%)
  • 36. Hybrid Models 1. Cluster individual users into groups 2. Sort prediction data sets based on genres 3. Generate predictions using the clustered user and filtered prediction data set that aligns best to the application context. Example user groups: Horror Fans, Age 8-10, New Signups. Example prediction sets: Comedies, New Releases, Animation
  • 37. Problem Framing → Data Exploration & Preparation → Training → Optimization → Deployment
  • 38. Deploying An Endpoint. Diagram: Raw Data (Amazon S3) → SageMaker Notebooks (Prepare Training Data) → Prepared Data (Amazon S3) → SageMaker Training (Train & Optimize; HPO; Algorithm Container) → Trained Model (Amazon S3) → SageMaker Hosting (Deploy; Trained Model)
  • 39. Calling SageMaker Endpoints • Understand the inference request format for your algorithm • Factorization Machines support JSON & protobuf • Sample JSON Payload:
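The payload itself did not survive in this transcript. As a hedged illustration, the sketch below builds a request in the sparse JSON layout (`instances` → `data` → `features` with `keys`, `shape`, `values`) that the SageMaker documentation describes for Factorization Machines inference, as far as I recall it; check the current documentation before relying on the exact field names. `build_payload` is a hypothetical helper, and the 14-column rows reuse the slide's toy encoding rather than the real 2,625-feature vectors.

```python
import json

def build_payload(rows, num_features):
    """Serialize one-hot rows as sparse JSON instances."""
    instances = []
    for row in rows:
        keys = [i for i, value in enumerate(row) if value != 0]
        values = [row[i] for i in keys]
        instances.append({
            "data": {
                "features": {
                    "keys": keys,             # indices of non-zero columns
                    "shape": [num_features],  # total feature count
                    "values": values,         # the non-zero values
                }
            }
        })
    return json.dumps({"instances": instances})

# Toy rows from the one-hot slides (6 user columns + 8 movie columns).
rows = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
]
payload = build_payload(rows, num_features=14)
print(payload)
```

Sending only the non-zero indices keeps the request small even when the real feature vector has thousands of columns.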
  • 40. Integrating Endpoints With Applications: Client → API Gateway → Lambda Function → SageMaker Endpoint
  • 41. Endpoint Invocation
  • 42. Enriching The Endpoint Response: Endpoint Response Enriched with MovieId
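The endpoint only returns one prediction per submitted row, so the caller has to attach the movie each prediction refers to. Below is a minimal sketch of that enrichment step as it might run inside the Lambda function. The response shape (`predictions` with `score` and `predicted_label`) is an assumption about the binary-classifier output, the `enrich` helper and its field names are hypothetical, and the sample values are invented.

```python
def enrich(response, movie_ids, top_n=3):
    """Pair each prediction with its MovieId and return the best matches.

    response:  parsed endpoint output, assumed to look like
               {"predictions": [{"score": float, "predicted_label": float}]}
    movie_ids: MovieIds in the same order as the rows sent to the endpoint
    """
    predictions = response["predictions"]
    ranked = sorted(
        zip(movie_ids, predictions),
        key=lambda pair: pair[1]["score"],
        reverse=True,  # most confident "liked" predictions first
    )
    return [
        {
            "movieId": movie_id,
            "score": prediction["score"],
            "liked": prediction["predicted_label"] == 1.0,
        }
        for movie_id, prediction in ranked[:top_n]
    ]

# Invented sample response for three candidate movies.
sample_response = {
    "predictions": [
        {"score": 0.7, "predicted_label": 1.0},
        {"score": 0.2, "predicted_label": 0.0},
        {"score": 0.9, "predicted_label": 1.0},
    ]
}
print(enrich(sample_response, movie_ids=[12, 19, 87], top_n=2))
```

Keeping this step in the Lambda function means clients receive movie identifiers directly and never need to know the row order of the inference request.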
  • 43. Solution Architecture. Diagram: Raw Data (Amazon S3) → SageMaker Notebooks (Prepare Training Data) → Prepared Data (Amazon S3) → SageMaker Training (Train & Optimize; HPO; Algorithm Container) → Trained Model (Amazon S3) → SageMaker Hosting (Deploy; Trained Model). User Interactions → API Gateway → AWS Lambda → inference requests → SageMaker Hosting
  • 44. Where To From Here? - Amazon SageMaker: https://aws.amazon.com/sagemaker/developer-resources/ - SageMaker Automatic Model Tuning: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html
  • 45. Your Feedback is Important! http://bit.ly/2Cjblrc
  • 46. Thank You! @kwunhok