Here I describe the main challenges faced by data scientists involved in deploying machine learning models into real production environments.
I also include references to, and examples of, Python libraries and multi-model systems that require advanced features such as A/B testing, high scalability and high availability.
While discussing the limitations of traditional deployment strategies, I will demonstrate how serverless computing can simplify your deployment workflow.
3. What does a real Data Scientist look like?
Data Scientist
Very smart & curious
Numbers lover (i.e. Data)
Great teamwork skills
40% analysis, 30% design, 30% code
clda.co/serverless-tokyo
7. What’s the outcome of an ML pipeline?
[Diagram labels: Data Scientist, Data, Time, ML Model, Data Visualisation, Prototype, Web Developer, DevOps, A lot of Time]
8. The Lack of Ownership
Data Scientist != DevOps
Data Scientist: mathematical modeling, statistical analysis, data mining
DevOps: (cloud) operations, system administration, software best practices
9. What about MLaaS?
Machine Learning as a Service
You wouldn’t need a Data Scientist at all, but…
Data exploration does not come for free
Parameter tuning becomes blind guessing
Very little control over your models
Not everybody likes black-box abstractions
13. Traditional deployment strategies
2. Fleet of servers
Load Balancing
Bigger capacity and no code changes, but…
same problems as before
many more servers to maintain
no elasticity (over-provisioning)
14. Traditional deployment strategies
3. Auto Scaling
AWS ELB + Auto Scaling (or maybe Elastic Beanstalk?)
Elastic and highly available, but…
still shared resources?
even bigger lack of ownership
what about caching, versioning and auth?
15. Serverless Machine Learning
Versioning, staging & caching
1 model = 1 microservice
Intuitive RESTful interface
High Availability (no downtime)
Very little operational effort
Transparent elasticity (PAYG)
Failure isolation / Decentralisation
Offline training phase
Production-ready prototypes
A/B testing through composition
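To illustrate "A/B testing through composition", here is a minimal Python sketch, not the setup used in the talk. It assumes each model is exposed as its own callable (standing in for one microservice per model) and that a thin router assigns each user to a variant. The names `model_a`, `model_b`, `choose_variant`, `ab_router` and the `b_share` parameter are hypothetical.

```python
import hashlib


def model_a(features):
    # Hypothetical stand-in for the current production model.
    return {"model": "A", "score": sum(features) / len(features)}


def model_b(features):
    # Hypothetical stand-in for the candidate model under test.
    return {"model": "B", "score": max(features)}


def choose_variant(user_id, b_share=0.2):
    """Deterministically assign a user to variant 'A' or 'B'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "B" if bucket < b_share else "A"


def ab_router(user_id, features, b_share=0.2):
    """Compose the two model services behind a single entry point."""
    variant = choose_variant(user_id, b_share)
    model = model_b if variant == "B" else model_a
    return model(features)
```

Hashing the user ID keeps the assignment stable across requests without storing any state, which fits a stateless function. In a real deployment the two callables would be replaced by calls to the two model endpoints.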
17. A real example: Serverless ML @ Cloud Academy
Multi-model architecture
RESTful interface for each ML model
1 Lambda Function for each ML model
S3 + RDS for storage
Periodic training (offline)
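The "1 Lambda Function for each ML model" pattern could look roughly like the following Python sketch. It is an illustration under assumptions, not Cloud Academy's actual code: `load_model` is a hypothetical placeholder for fetching an offline-trained model from S3, the linear model and its weights are invented for the example, and the event is assumed to arrive in the API Gateway proxy shape (a JSON string under `body`).

```python
import json

_MODEL = None  # cached across warm invocations of the same container


def load_model():
    # Hypothetical loader: a real function would download a serialized,
    # offline-trained model from S3 here. A fixed linear model stands in.
    return {"weights": [0.5, 0.25], "bias": 1.0}


def get_model():
    # Load the model once per container instead of once per request.
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model()
    return _MODEL


def predict(model, features):
    return model["bias"] + sum(w * x for w, x in zip(model["weights"], features))


def lambda_handler(event, context):
    """Entry point in the shape Lambda expects behind an API Gateway proxy."""
    try:
        payload = json.loads(event.get("body") or "{}")
        features = [float(x) for x in payload["features"]]
    except (KeyError, TypeError, ValueError):
        error = {"error": "expected a JSON body with a numeric 'features' list"}
        return {"statusCode": 400, "body": json.dumps(error)}
    prediction = predict(get_model(), features)
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

Keeping the model in a module-level variable means only the first request served by a container pays the loading cost; later requests reuse the cached object.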
20. Limitations of Serverless ML
AWS Lambda
No real-time models (only pseudo real-time)
Deployment package management: size limit and OS libraries
Not suitable for model training yet (5 min max execution time)
Cold start time is long and hard to avoid
Unit/integration tests help, but not enough
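One way to live with the maximum execution time is to split long-running work into batches and stop before the budget runs out. The Python sketch below shows the idea under assumptions: `process_batches`, its `safety_margin_ms` parameter and the summing "work" are hypothetical, and `FakeContext` is a local stand-in for testing. The only real API relied on is the `get_remaining_time_in_millis()` method of the Lambda context object.

```python
class FakeContext:
    """Local stand-in for the Lambda context object (for testing only)."""

    def __init__(self, remaining_ms, cost_per_call_ms):
        self.remaining_ms = remaining_ms
        self.cost_per_call_ms = cost_per_call_ms

    def get_remaining_time_in_millis(self):
        # Each call pretends that one batch worth of time has elapsed.
        remaining = self.remaining_ms
        self.remaining_ms -= self.cost_per_call_ms
        return remaining


def process_batches(batches, context, start=0, safety_margin_ms=10_000):
    """Process batches until the time budget runs low; return a checkpoint."""
    total = 0.0
    index = start
    while index < len(batches):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break  # stop early so a later invocation can resume
        total += sum(batches[index])  # hypothetical unit of work
        index += 1
    return {"next_index": index, "partial_sum": total, "done": index >= len(batches)}
```

The returned checkpoint (`next_index`) can be passed as `start` to a follow-up invocation. This does not remove the limit; it only keeps each invocation within it.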