Abstract: In this paper, I will explain the paper "Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations", which won the Best Long Paper award at RecSys 2020.
MTL addresses the challenge of learning several tasks, e.g. predicting cat breeds and dog breeds, with a single DNN. The basic idea in this type of DNN is to share lower-level features across tasks so that the model learns some general representations. However, such a network would not work well for an unrelated task that does not share the same features, such as, in this context, predicting a car model.
Relatedly, in the field of Recommender Systems we would like to learn different tasks such as the likelihood of clicking, finishing, sharing, favoriting, and commenting. Some of these tasks are often loosely correlated or even conflicting, which may lead to negative transfer. Scenarios arise where MTL models improve certain tasks at the cost of degrading others (the seesaw phenomenon). Related works such as cross-stitch networks, sluice networks, and Multi-gate Mixture-of-Experts address this problem in various ways.
The idea behind this paper is to explicitly separate shared and task-specific experts to avoid harmful parameter interference. On top of this, multi-level experts and gating networks are introduced to fuse more abstract representations. Finally, the model adopts a novel progressive separation routing to model interactions between experts and achieve more efficient knowledge transfer between complicatedly correlated tasks.
Speaker info: Vaibhav Singh currently heads machine learning work in the areas of Fraud Detection, App Personalization and Consumer Growth at Klarna.
2. Who am I
• Name Pronunciation: y bhav
• Currently Head of Machine Learning at Klarna, focusing on Fraud, Shopping App Recommendations and Consumer Growth
• Past Machine Learning Experience in
• Large Scale Image/Ads Moderation
• Credit Risk for P2P Lending
• Moved from Software Engineering to Machine Learning
3. What are we learning today?
● Multi Task Learning
● Mixture of Experts
● MTL in Recommendation Systems
● PLE and CGC in MTL
5. Image Source: KDD2018 video. (2018). Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts [YouTube Video].
Retrieved from https://www.youtube.com/watch?v=Dweg47Tswxw
8. Current Challenges in MTL
• Uncorrelated features
• Performance of the network might be degraded by unrelated features
• Negative Transfer
• Mitigated by multi-gating networks - MMoE - from Google
• Seesaw Phenomenon
• Mitigated by CGC and PLE - from Tencent
10. Mixture of Experts
Image Source: “Lecture 38 Mixture of Experts Neural Network.” SlideServe, 14 Mar. 2019,
www.slideserve.com/quincy-morrow/lecture-38-mixture-of-experts-neural-network-powerpoint-ppt-presentation. Accessed 2 Dec. 2020.
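As a concrete sketch of the idea on this slide: each expert is its own sub-network, and a gating network produces a softmax weight per expert, so the final output is a gate-weighted mixture of the expert outputs. A minimal NumPy toy (all weights are random and purely illustrative, not the paper's implementation):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
d_in, d_out, n_experts = 8, 4, 3
x = rng.normal(size=d_in)

# Each expert is a simple linear layer here (toy stand-in for a sub-network).
experts = [rng.normal(size=(d_out, d_in)) for _ in range(n_experts)]

# Gate: linear transformation of the input followed by a softmax.
W_gate = rng.normal(size=(n_experts, d_in))
gate = softmax(W_gate @ x)                        # (n_experts,), sums to 1

# MoE output: gate-weighted sum of the expert outputs.
expert_outs = np.stack([E @ x for E in experts])  # (n_experts, d_out)
y = gate @ expert_outs                            # (d_out,)
```

In MMoE, each task gets its own gate over the same pool of experts, so different tasks can weight the experts differently.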
11. Image Source: Ma, Jiaqi, et al. “Modeling Task Relationships in Multi-Task Learning with Multi-Gate Mixture-of-Experts.” Proceedings of the 24th ACM SIGKDD International Conference Knowledge
Discovery & Data Mining, 19 July 2018, 10.1145/3219819.3220007. Accessed 25 Nov. 2020.
12. Image Source: Tang, Hongyan, et al. “Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations.” Fourteenth ACM Conference
on Recommender Systems, 22 Sept. 2020, 10.1145/3383313.3412236. Accessed 25 Nov. 2020.
Single Level MTL Models
19. Customized Gate Control
20. CGC - Customized Gate Control
● Explicitly separates shared and task-specific layers
● Shared experts and task-specific experts are combined through a gating network for selective fusion
● Output of task k's gating network: g^k(x) = w^k(x) S^k(x)
● w^k(x) = Softmax(W^k_g x) is a weighting function that computes the weight vector of task k through a linear transformation and a softmax layer
● S^k(x) is a selected matrix composed of all selected vectors, i.e. the outputs of the shared experts and of task k's own experts
● Prediction of task k: y^k(x) = t^k(g^k(x)), where t^k denotes the tower network of task k
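The formulas above can be wired up as a toy sketch, assuming plain linear experts, a single task k, and made-up dimensions (names like `S_k`, `w_k` mirror the slide's notation, not any real API):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(1)
d_in, d_exp = 8, 4
x = rng.normal(size=d_in)

# Hypothetical experts: two shared, two specific to task k (linear toys).
shared = [rng.normal(size=(d_exp, d_in)) for _ in range(2)]
task_k = [rng.normal(size=(d_exp, d_in)) for _ in range(2)]

# S^k(x): matrix of the selected expert outputs (shared + task k's own).
S_k = np.stack([E @ x for E in shared + task_k])   # (4, d_exp)

# w^k(x) = Softmax(W^k_g x): linear transformation + softmax.
W_gk = rng.normal(size=(len(S_k), d_in))
w_k = softmax(W_gk @ x)

# g^k(x) = w^k(x) S^k(x): selectively fused representation for task k.
g_k = w_k @ S_k

# Tower t^k: task-specific head on top of the fused representation.
t_k = rng.normal(size=(1, d_exp))
y_k = t_k @ g_k
```

Note that task k's gate never sees the other tasks' experts, which is exactly the explicit separation CGC adds over MMoE.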
21. PLE - Progressive Layered Extraction
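A minimal sketch of the progressive routing, assuming two tasks, two experts per module, and the same representation width at every level. All weights are drawn randomly inside the function purely to show the wiring (the real model learns them, and drops the shared module at the final level):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(2)
d, n_exp = 6, 2  # one representation width everywhere, for simplicity

def extraction_level(h):
    """One PLE extraction level over inputs h = {'task0', 'task1', 'shared'}."""
    shared_outs = [rng.normal(size=(d, d)) @ h['shared'] for _ in range(n_exp)]
    out, everything = {}, list(shared_outs)
    for t in ('task0', 'task1'):
        own = [rng.normal(size=(d, d)) @ h[t] for _ in range(n_exp)]
        everything += own
        sel = np.stack(shared_outs + own)  # task gate sees own + shared experts
        w = softmax(rng.normal(size=(len(sel), d)) @ h[t])
        out[t] = w @ sel
    sel = np.stack(everything)             # shared gate sees ALL experts
    w = softmax(rng.normal(size=(len(sel), d)) @ h['shared'])
    out['shared'] = w @ sel
    return out

x = rng.normal(size=d)
h = {'task0': x, 'task1': x, 'shared': x}  # level 0 takes the raw input
for _ in range(2):                         # two stacked extraction levels
    h = extraction_level(h)
# h['task0'] and h['task1'] would feed the task towers
```

The "progressive" part is the stacking: each level's task-specific and shared outputs become the next level's inputs, so representations separate gradually rather than in a single step.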
23. Loss function for MTL
● Weighted sum of the losses of the individual tasks: L(θ_1, …, θ_K, θ_s) = Σ_k ω_k L_k(θ_k, θ_s)
● MTL loss in practice for Recommendation Systems
○ To train these tasks jointly, the union of the sample spaces of all tasks is taken as the whole training set, and samples outside a task's own sample space are ignored when calculating that task's loss:
○ L_k(θ_k, θ_s) = (1 / Σ_i δ_i^k) Σ_i δ_i^k · loss_k(ŷ_i^k(θ_k, θ_s), y_i^k)
○ where loss_k is task k's loss on sample i, calculated from the prediction ŷ_i^k and the ground truth y_i^k, and δ_i^k ∈ {0, 1} indicates whether sample i lies in the sample space of task k
○ Finally, the loss weight of each task is updated every epoch: ω_k^(t) = ω_k,0 × γ_k^t
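The masked, weighted loss above can be sketched directly, assuming binary cross-entropy per task and made-up batch values; the per-epoch update ω_k^(t) = ω_k,0 · γ_k^t uses hypothetical initial weights and decay ratios:

```python
import numpy as np

def bce(y_hat, y):
    """Per-sample binary cross-entropy, clipped for numerical safety."""
    eps = 1e-7
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Toy batch of 3 samples for 2 tasks (values invented for illustration).
y_hat = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.5]])  # predictions per task
y     = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])  # ground truth per task
delta = np.array([[1, 1, 0], [1, 0, 1]])              # sample-space masks δ_i^k

def task_loss(k):
    # L_k = (1 / Σ_i δ_i^k) Σ_i δ_i^k · loss_k(ŷ_i^k, y_i^k)
    m = delta[k]
    return (m * bce(y_hat[k], y[k])).sum() / m.sum()

omega_0 = np.array([0.5, 0.5])   # initial loss weights ω_k,0 (hypothetical)
gamma   = np.array([0.99, 0.95]) # per-epoch decay ratios γ_k (hypothetical)

epoch = 3
omega_t = omega_0 * gamma ** epoch  # ω_k^(t) = ω_k,0 · γ_k^t
total = sum(w * task_loss(k) for k, w in enumerate(omega_t))
```

Masking with δ_i^k keeps, say, a share-rate task from being penalized on impressions where sharing was never possible.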
24. Links and references
1. Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts MMoE. LINK
2. Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations LINK
3. Lecture 38 Mixture of Experts Neural Network LINK
4. Andrej Karpathy: Tesla Autopilot and Multi-Task Learning for Perception and Prediction VIDEO LINK
5. Andrew Ng Multitask Learning (C3W2L08) VIDEO LINK
6. Keras-MMoE Github