Parameter Server Approach for
Online Learning @ Twitter
Joe Xie, Yong Wang and Yue Lu
ML Infra Group, Ads Prediction Team
Oct 10, 2017
Outline
• Background
– Online learning
– Challenges
• Parameter Server Approaches
– v1.0 Decouple the training and prediction
– v2.0 Scale the training
– v3.0 Scale the model
• Future Directions
Background
Twitter is Realtime
• Twitter is all about real time: news, events, trends,
hashtags.
– Users' interests and intent change in real time.
– Context changes in real time.
– New advertisers and new campaigns are added in real time.
• ML is increasingly at the core of everything we build at
Twitter
– ML models adapt dynamically to changes over spans as short as a few
hours or even minutes
Real time: Online Learning vs. Offline Learning
[Figure: two timelines, each showing a Model, a Data Stream and a Prediction Stream. Online Learning has a single Learning Phase; Offline Learning has a Training Phase followed by a Serving Phase. Arrows are labeled Read, Write, and Read & Write.]
Real time – Online Learning
Architecture
Simple and efficient for Ads Prediction and
Moments Relevance production services
Challenges
• Network fanout
– The same traffic stream is sent many times over to each prediction
instance, wasting network bandwidth.
• Limit on training traffic size
– Online training throughput is currently limited by the capacity (CPU /
network bandwidth) of a single Mesos worker.
• Limit on model size
– All models are hosted in memory on each instance.
Parameter Server Approaches
Model Architecture
[Diagram labels: Raw Features; Feature Crosses; Decision Tree (e.g., XGBoost...); Neural Network (e.g., Torch, TensorFlow...); ...; Distributed Large-scale Online Logistic Regression (Parameter Server)]
● Fully explore feature interactions without a training latency constraint.
● Historically, the feature interactions don't change frequently.
● Flexible architecture that accommodates new model structures and external
machine learning frameworks.
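To make the "online logistic regression" layer concrete, here is a minimal single-process sketch. It assumes plain SGD on the log loss over sparse features; the slides do not state which optimizer is used, and the class name, the learning rate and the feature names are illustrative only.

```python
import math
from collections import defaultdict


class OnlineLogisticRegression:
    """Sparse logistic regression trained one example at a time.

    Assumption: plain SGD on the log loss. The deck does not name the optimizer.
    """

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = defaultdict(float)  # feature name -> weight

    def predict(self, features):
        """Return P(label = 1) for a dict of feature name -> value."""
        score = sum(self.weights.get(name, 0.0) * value
                    for name, value in features.items())
        score = max(-35.0, min(35.0, score))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        """Apply one SGD step; return the prediction made before the step."""
        prediction = self.predict(features)
        error = prediction - label  # gradient of the log loss w.r.t. the score
        for name, value in features.items():
            self.weights[name] -= self.learning_rate * error * value
        return prediction


# Usage: the same model object is read (predict) and written (update)
# on every example, which is what distinguishes online from offline learning.
model = OnlineLogisticRegression()
for _ in range(200):
    model.update({"bias": 1.0, "clicked_before": 1.0}, 1)
    model.update({"bias": 1.0, "never_clicked": 1.0}, 0)
```

Outputs of a decision tree or neural network would enter this model as additional feature names in the same dict.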
Parameter Server Approaches
• 20X training data
– Parameter server v2.0 to scale the training traffic
• 10X features + algorithm complexity
– Parameter server v3.0 to scale the model size
• 10X prediction QPS
– Parameter server v1.0 to decouple the training and prediction requests
Parameter Server v1.0
[Diagram: Training Traffic (Downsampling) → Training Worker → model Updates. Observation Workers (Observation Service) and Prediction Workers (instances of the Prediction Service, each holding a Model) Pull the Model. Prediction Requests arrive at the Prediction Workers.]
Through
■ New architecture to decouple the training / prediction services
into different clusters.
10X prediction capacity
Higher serving efficiency
Parameter Server v1.0
• Separated training service
– Takes training traffic and generates incremental model updates
• New observation service
– Consumes incremental model updates
– Evaluates training traffic for model quality assurance
• Separated prediction service
– Consumes incremental model updates
– Serves prediction requests
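The split described above can be sketched as follows: one trainer owns the model and emits only the weights that changed, while read-only replicas apply those updates and serve predictions. All class and method names are hypothetical, the transport between services is omitted, and SGD is an assumption rather than the production algorithm.

```python
import math


def sigmoid(score):
    score = max(-35.0, min(35.0, score))  # keep exp() in a safe range
    return 1.0 / (1.0 + math.exp(-score))


class TrainingService:
    """Hypothetical trainer: owns the model, emits incremental updates."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}
        self.changed = set()  # features touched since the last emitted update

    def train(self, features, label):
        score = sum(self.weights.get(k, 0.0) * v for k, v in features.items())
        error = sigmoid(score) - label
        for k, v in features.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.learning_rate * error * v
            self.changed.add(k)

    def emit_update(self):
        """Return only the weights changed since the previous call.

        The returned dict is meant to be delivered to every consumer.
        """
        update = {k: self.weights[k] for k in self.changed}
        self.changed.clear()
        return update


class PredictionService:
    """Hypothetical read-only replica: applies updates, never trains."""

    def __init__(self):
        self.weights = {}

    def apply_update(self, update):
        self.weights.update(update)

    def predict(self, features):
        score = sum(self.weights.get(k, 0.0) * v for k, v in features.items())
        return sigmoid(score)


# Usage: one trainer, several prediction replicas fed by the same update.
trainer = TrainingService()
replicas = [PredictionService() for _ in range(3)]
trainer.train({"ad_42": 1.0}, 1)
update = trainer.emit_update()
for replica in replicas:
    replica.apply_update(update)
```

An observation service would consume the same updates and additionally score incoming training traffic to track model quality.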
Parameter Server v1.0
• Launched into ads engagement
prediction models.
– Mesos Efficiency: 40% reduction in CPU cores
required.
– Network Efficiency: 60% reduction in fan-out
messages required.
Parameter Server v2.0
[Diagram: Training Traffic (NO downsampling) → Dispatch Workers → Training Workers, which Push/Pull against the Parameter Server. Observation Workers (Observation Service) and Prediction Workers (instances of the Prediction Service, each holding a Model) Pull the Model from the Parameter Server. Prediction Requests arrive at the Prediction Workers. A Downsampling label also appears in the diagram.]
Through
■ New architecture to distribute the training
20X training data
Higher model quality
Parameter Server v2.0
• New dispatch service
– Takes un-sampled training traffic and dispatches it to the training service
• Updated training service
– Takes training traffic and produces updates for the parameter service
– Receives model updates from the parameter service
• New parameter service
– Aggregates the updates from the training services
– Sends model updates to the training / observation / prediction services
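As a rough illustration of the dispatch step, the sketch below splits one stream of training examples across several training workers. Round-robin assignment is an assumption: the slides do not say how traffic is partitioned, and the function name is hypothetical.

```python
from itertools import cycle


def dispatch(examples, num_workers):
    """Split a stream of examples across workers in round-robin order.

    Assumption: round-robin. The deck does not describe the real policy.
    """
    shards = [[] for _ in range(num_workers)]
    for worker, example in zip(cycle(range(num_workers)), examples):
        shards[worker].append(example)
    return shards


# Usage: seven examples spread over three training workers.
shards = dispatch(range(7), 3)
```

Because each worker then sees only a fraction of the traffic, total training throughput is no longer bounded by a single worker's CPU or network capacity.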
Parameter Server v2.0
• Launched into ads engagement
prediction models.
• First version using simple model-average
aggregation.
–20x training capacity
–xx% model quality gain
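A simple model average on the parameter service could look like the sketch below. The slides only say "simple model-average aggregation", so the details are assumptions: models are sparse dicts of weights, every worker counts equally, and a feature missing from a worker is treated as weight 0.0.

```python
def average_models(worker_models):
    """Average sparse weight dicts pushed by the training workers.

    Assumptions: equal worker weighting, and a feature absent from a
    worker's model contributes 0.0 to the average.
    """
    if not worker_models:
        return {}
    count = len(worker_models)
    totals = {}
    for model in worker_models:
        for name, weight in model.items():
            totals[name] = totals.get(name, 0.0) + weight
    return {name: total / count for name, total in totals.items()}


# Usage: two workers push their local models; the average is what
# training, observation and prediction services would pull.
averaged = average_models([{"a": 1.0, "b": 2.0}, {"a": 3.0}])
```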
Parameter Server v3.0
[Diagram: same layout as v2.0, but with multiple Parameter Server instances. Training Traffic (NO downsampling) → Dispatch Workers → Training Workers, which Push/Pull against the Parameter Servers. Observation Workers (Observation Service) and Prediction Workers (instances of the Prediction Service, each holding a Model) Pull the Model. Prediction Requests arrive at the Prediction Workers. A Downsampling label also appears in the diagram.]
Through
■ New architecture for model / feature sharding
More complex model
Higher model quality
Parameter Server v3.0
• Updated parameter service (in progress)
– Model sharding: each parameter instance hosts a single model instead of
multiple models.
• xx% model quality gain in experimentation.
– Feature sharding: each parameter instance hosts part of a single model.
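Feature sharding can be illustrated with a small in-process sketch in which each shard stands in for one parameter instance and holds only part of the model. Hashing the feature name with CRC32 is an assumption made for the example; the slides do not describe the actual partitioning scheme, and all names are hypothetical.

```python
import zlib


def shard_for(feature_name, num_shards):
    """Map a feature to a shard with a stable hash (assumed scheme)."""
    return zlib.crc32(feature_name.encode("utf-8")) % num_shards


class ShardedParameterStore:
    """Each shard stands in for one parameter instance holding part of a model."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def push(self, updates):
        """Route each weight to the shard that owns its feature."""
        for name, weight in updates.items():
            self.shards[shard_for(name, len(self.shards))][name] = weight

    def pull(self, feature_names):
        """Fetch the weights for the requested features, skipping unknown ones."""
        result = {}
        for name in feature_names:
            shard = self.shards[shard_for(name, len(self.shards))]
            if name in shard:
                result[name] = shard[name]
        return result


# Usage: three weights spread over four shards, then a partial pull.
store = ShardedParameterStore(4)
store.push({"a": 1.0, "b": 2.0, "c": 3.0})
pulled = store.pull(["a", "c", "unseen"])
```

With this layout no single instance has to hold the whole model in memory, which is the model-size limit listed under Challenges.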
Future Directions
Future Works
•
•
Parameter Server Approach for Online Learning at Twitter

  • 1. Parameter Server Approach for Online Learning @ Twitter Joe Xie, Yong Wang and Yue Lu ML Infra Group, Ads Prediction Team Oct 10, 2017
  • 2. Outline • Background – Online learning – Challenges • Parameter Server Approaches – v1.0 Decouple the training and prediction – v2.0 Scale the training – v3.0 Scale the model • Future Directions
  • 4. Twitter is Realtime • Twitter is all about real-time: news, events, trends, hashtags. – Users interest and intent change in realtime. – Context changes in realtime. – New advertisers, new campaigns are added in realtime. • ML is increasingly at the core of everything we build at Twitter – ML model dynamically adapts to changes spanning as short as a few hours even minutes
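  • 5. Real time: [Figure: Online Learning vs. Offline Learning. Each panel shows a Model over Time with a Data Stream and a Prediction Stream. Online learning has a single Learning Phase; offline learning has a separate Training Phase and Serving Phase. The model accesses are labeled Read, Write, and Read & Write.]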
  • 6. Real time – Online Learning Architecture: simple and efficient for Ads Prediction and Moments Relevance production services
  • 7. Challenges • Network fanout – The same traffic stream is sent many times over, once to each prediction instance, wasting network bandwidth. • Limit to training traffic size – Online training throughput is currently limited by the capacity (CPU / network bandwidth) of a single Mesos worker. • Limit to model size – All models are hosted in the memory of each instance.
  • 9. Model Architecture [Diagram labels: Raw Features; Feature Crosses; Decision Tree (e.g., XGBoost...); Neural Network (e.g., Torch, TensorFlow...); ...; Distributed Large-scale Online Logistic Regression (Parameter Server)] ● Fully explore feature interactions without a training latency constraint. ● Historically, the feature interactions have not changed frequently. ● Flexible architecture that accommodates new model structures and external machine learning frameworks.
  • 10. Parameter Server Approaches • 20X training data – Parameter server v2.0 to scale the training traffic • 10X features + algorithm complexity – Parameter server v3.0 to scale the model size • 10X prediction QPS – Parameter server v1.0 to decouple the training and prediction requests
  • 11. Parameter Server v1.0 [Diagram labels: Training Traffic; Downsampling; Training Worker; Updates; Observation Workers (Observation Service); Prediction Workers (instances of Prediction Service, each holding a Model); Pull Model; Prediction Requests] ■ Through a new architecture that decouples the training / prediction services into different clusters: 10X prediction capacity, higher serving efficiency.
  • 12. Parameter Server v1.0 • Separated training service – Takes training traffic and generates incremental model updates • New observation service – Consumes incremental model updates – Evaluates training traffic for model quality assurance • Separated prediction service – Consumes incremental model updates – Serves prediction requests
  • 13. Parameter Server v1.0 • Launched into ads engagement prediction models. – Mesos Efficiency: 40% reduction in CPU cores required. – Network Efficiency: 60% reduction in fan-out messages required.
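The v1.0 split between a training service that emits incremental model updates and prediction instances that pull them can be sketched as follows. This is a minimal illustration, not the production design: the class names `TrainingWorker` and `PredictionWorker`, the in-memory `update_log`, and the plain SGD step for logistic regression are all assumptions made for the example.

```python
import math
from typing import Dict, List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class TrainingWorker:
    """Hypothetical trainer: owns the weights and logs incremental updates."""

    def __init__(self, learning_rate: float = 0.1) -> None:
        self.learning_rate = learning_rate
        self.weights: Dict[str, float] = {}
        self.update_log: List[Dict[str, float]] = []  # one delta per example

    def train(self, features: Dict[str, float], label: int) -> Dict[str, float]:
        """Run one SGD step of logistic regression and record the weight delta."""
        z = sum(self.weights.get(f, 0.0) * v for f, v in features.items())
        error = sigmoid(z) - label
        delta = {f: -self.learning_rate * error * v for f, v in features.items()}
        for f, d in delta.items():
            self.weights[f] = self.weights.get(f, 0.0) + d
        self.update_log.append(delta)
        return delta


class PredictionWorker:
    """Hypothetical serving replica: never trains, only pulls and applies deltas."""

    def __init__(self) -> None:
        self.weights: Dict[str, float] = {}
        self.applied = 0  # number of log entries already applied

    def pull(self, trainer: TrainingWorker) -> int:
        """Apply every update this replica has not seen yet; return how many."""
        new_updates = trainer.update_log[self.applied:]
        for delta in new_updates:
            for f, d in delta.items():
                self.weights[f] = self.weights.get(f, 0.0) + d
        self.applied += len(new_updates)
        return len(new_updates)

    def predict(self, features: Dict[str, float]) -> float:
        z = sum(self.weights.get(f, 0.0) * v for f, v in features.items())
        return sigmoid(z)
```

In this sketch the training traffic reaches only the trainer, and each replica receives only the much smaller stream of weight deltas, which is the fan-out reduction the slides describe.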
  • 14. Parameter Server v2.0 [Diagram labels: Training Traffic; Dispatch Workers; NO downsampling; Training Workers; Push/Pull; Parameter Server (Model); Observation Worker (Observation Service); Downsampling; Prediction Workers (instances of Prediction Service, each holding a Model); Pull Model; Prediction Requests] ■ Through a new architecture that distributes the training: 20X training data, higher model quality.
  • 15. Parameter Server v2.0 • New dispatch service – Takes unsampled training traffic and dispatches it to the training service • Updated training service – Takes training traffic and produces updates for the parameter service – Receives model updates from the parameter service • New parameter service – Aggregates the updates from the training services – Sends model updates to the training / observation / prediction services
  • 16. Parameter Server v2.0 • Launched into ads engagement prediction models. • First version uses simple model-average aggregation. – 20X training capacity – xx% model quality gain
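The "simple model-average aggregation" mentioned for v2.0 might look like the sketch below, in which a parameter server averages the weight maps pushed by several training workers. The function name `average_models` and the choice to treat a feature missing from one worker's model as weight 0.0 are assumptions for illustration; the slides do not specify either.

```python
from typing import Dict, List


def average_models(models: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-feature weights across worker models.

    Assumption: a feature absent from a worker's model counts as 0.0 there.
    """
    if not models:
        raise ValueError("need at least one model to average")
    totals: Dict[str, float] = {}
    for model in models:
        for feature, weight in model.items():
            totals[feature] = totals.get(feature, 0.0) + weight
    count = len(models)
    return {feature: total / count for feature, total in totals.items()}
```

For example, averaging `{"a": 1.0, "b": 2.0}` with `{"a": 3.0}` yields `{"a": 2.0, "b": 1.0}` under the missing-means-zero assumption.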
  • 17. Parameter Server v3.0 [Diagram labels: Training Traffic; Dispatch Workers; NO downsampling; Training Workers; Push/Pull; multiple Parameter Servers (Model); Observation Worker (Observation Service); Downsampling; Prediction Workers (instances of Prediction Service, each holding a Model); Pull Model; Prediction Requests] ■ Through a new architecture for model / feature sharding: more complex model, higher model quality.
  • 18. Parameter Server v3.0 • Updated parameter service (in progress) – Model sharding: each parameter instance hosts a single model instead of multiple models. • xx% model quality gain in experimentation. – Feature sharding: each parameter instance hosts part of a single model.
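Feature sharding, where each parameter instance hosts only part of a single model, can be illustrated with a hash-based partition of the weight map. This is a sketch under assumptions: the slides do not say how features are assigned to shards, so the CRC32-modulo routing and the names `shard_for` and `ShardedModel` are hypothetical.

```python
import zlib
from typing import Dict, List


def shard_for(feature: str, num_shards: int) -> int:
    """Route a feature name to a shard with a hash that is stable across processes."""
    return zlib.crc32(feature.encode("utf-8")) % num_shards


class ShardedModel:
    """Hypothetical model whose weights are split across several shards."""

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.num_shards = num_shards
        # Each dict stands in for the memory of one parameter server instance.
        self.shards: List[Dict[str, float]] = [{} for _ in range(num_shards)]

    def update(self, feature: str, delta: float) -> None:
        """Add delta to the feature's weight on the one shard that owns it."""
        shard = self.shards[shard_for(feature, self.num_shards)]
        shard[feature] = shard.get(feature, 0.0) + delta

    def weight(self, feature: str) -> float:
        return self.shards[shard_for(feature, self.num_shards)].get(feature, 0.0)

    def score(self, features: Dict[str, float]) -> float:
        """Linear score; each weight lookup touches only the owning shard."""
        return sum(self.weight(f) * v for f, v in features.items())
```

Because every feature maps to exactly one shard, no single instance needs to hold the whole model in memory, which is the model-size limit the earlier slides identify.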