UltraGCN: Ultra Simplification of
Graph Convolutional Networks for
Recommendation
CIKM’21, Kelong Mao (Huawei Noah’s Ark Lab) et al.
POSTECH DI Lab
Presenter: Changsoo Kwak
2021.11.23
1
Preliminary: Previous GNN model
2
Message passing in GCN[1]:
$$E^{(l+1)} = \sigma\left(D^{-\frac{1}{2}} A D^{-\frac{1}{2}} E^{(l)} W^{(l)}\right)$$
Message passing in LightGCN[2]:
$$E^{(l+1)} = \left(D^{-\frac{1}{2}} A D^{-\frac{1}{2}}\right) E^{(l)}$$
[1] Semi-Supervised Classification with Graph Convolutional Networks (ICLR’17)
[2] LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation (SIGIR’20)
Removes feature transformations & non-linear activations
Predicts using the dot product of the user/item representations from the last layer
$N(i)$: neighbors of node $i$
$e_i^{(l)}$: node $i$'s embedding at the $l$-th layer
$E^{(l)}$: embedding matrix at the $l$-th layer
$W^{(l)}$: trainable weight matrix at the $l$-th layer
$A$: adjacency matrix with self-connections
$D$: diagonal node degree matrix with self-connections
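As a concrete illustration, here is a minimal NumPy sketch (not the authors' code) of one parameter-free LightGCN propagation step; the toy adjacency matrix and embedding size are assumptions for the example.

```python
import numpy as np

def lightgcn_layer(A, E):
    """One LightGCN step: E^(l+1) = D^{-1/2} A D^{-1/2} E^(l)."""
    d = A.sum(axis=1)                       # node degrees (self-loops included)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    return D_inv_sqrt @ A @ D_inv_sqrt @ E

# Toy graph: users {0,1}, items {2,3}, edges 0-2, 0-3, 1-3,
# with self-connections on the diagonal as in the formulas above.
A = np.array([[1, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 1]], dtype=float)

E0 = np.random.default_rng(0).normal(size=(4, 8))  # layer-0 embeddings
E1 = lightgcn_layer(A, E0)                          # layer-1 embeddings
```

Since there is no $W^{(l)}$ and no nonlinearity, stacking layers is just repeated multiplication by the same normalized matrix.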
$$e_u^{(l+1)} \cdot e_i^{(l+1)} = \alpha_{ui}\, e_u^{(l)} \cdot e_i^{(l)} + \sum_{k \in N(u)} \alpha_{ik}\, e_i^{(l)} \cdot e_k^{(l)} + \sum_{v \in N(i)} \alpha_{uv}\, e_u^{(l)} \cdot e_v^{(l)} + \sum_{v \in N(i)} \sum_{k \in N(u)} \alpha_{kv}\, e_k^{(l)} \cdot e_v^{(l)}$$
Preliminary: Dot product in LightGCN
3
$$e_u^{(l+1)} \cdot e_i^{(l+1)} = \alpha_{ui}\, e_u^{(l)} \cdot e_i^{(l)} + \sum_{k \in N(u)} \alpha_{ik}\, e_i^{(l)} \cdot e_k^{(l)} + \sum_{v \in N(i)} \alpha_{uv}\, e_u^{(l)} \cdot e_v^{(l)} + \sum_{v \in N(i)} \sum_{k \in N(u)} \alpha_{kv}\, e_k^{(l)} \cdot e_v^{(l)}$$
1st term: similarity between the target user and the target item
2nd term: similarity between the target item and interacted items
3rd term: similarity between the target user and interacted users
4th term: similarity between interacted users and interacted items
Constructing representations in LightGCN
$$e_u^{(l+1)} = \frac{1}{d_u+1}\, e_u^{(l)} + \sum_{k \in N(u)} \frac{1}{\sqrt{d_u+1}\sqrt{d_k+1}}\, e_k^{(l)}$$
$$e_i^{(l+1)} = \frac{1}{d_i+1}\, e_i^{(l)} + \sum_{v \in N(i)} \frac{1}{\sqrt{d_v+1}\sqrt{d_i+1}}\, e_v^{(l)}$$
Expanding the dot product gives the weights:
$$\alpha_{ui} = \frac{1}{(d_u+1)(d_i+1)}, \qquad \alpha_{ik} = \frac{1}{\sqrt{d_u+1}\sqrt{d_k+1}\,(d_i+1)}$$
$$\alpha_{uv} = \frac{1}{\sqrt{d_v+1}\sqrt{d_i+1}\,(d_u+1)}, \qquad \alpha_{kv} = \frac{1}{\sqrt{d_u+1}\sqrt{d_k+1}\sqrt{d_v+1}\sqrt{d_i+1}}$$
→ The contributions of the various relationships are different.
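The expansion can be checked numerically. The sketch below (a toy neighborhood with assumed degrees and random embeddings, not from the paper) propagates one layer for a user $u$ and item $i$ and confirms that the direct dot product equals the $\alpha$-weighted sum of the four relationship terms.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy neighborhood: user u interacted with items {i, k}; item i with users {u, v}.
d = {'u': 2, 'i': 2, 'k': 1, 'v': 1}           # assumed node degrees
e = {node: rng.normal(size=8) for node in 'uikv'}  # layer-l embeddings
N_u, N_i = ['i', 'k'], ['u', 'v']

def c(a, b):  # coefficient 1 / (sqrt(d_a + 1) * sqrt(d_b + 1))
    return 1.0 / (np.sqrt(d[a] + 1) * np.sqrt(d[b] + 1))

# One propagation step for u and i, then the direct dot product.
e_u1 = e['u'] / (d['u'] + 1) + sum(c('u', k) * e[k] for k in N_u)
e_i1 = e['i'] / (d['i'] + 1) + sum(c('i', v) * e[v] for v in N_i)
direct = e_u1 @ e_i1

# The same quantity via the four alpha-weighted similarity terms.
expanded = (e['u'] @ e['i']) / ((d['u'] + 1) * (d['i'] + 1))                # alpha_ui
expanded += sum(c('u', k) / (d['i'] + 1) * (e['i'] @ e[k]) for k in N_u)   # alpha_ik
expanded += sum(c('i', v) / (d['u'] + 1) * (e['u'] @ e[v]) for v in N_i)   # alpha_uv
expanded += sum(c('u', k) * c('i', v) * (e[k] @ e[v])
                for k in N_u for v in N_i)                                  # alpha_kv
```

The two values agree, so the four $\alpha$ weights above are exactly the cross terms of the layer-wise propagation.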
Limitation of LightGCN
4
Limitation 1
The weights $\alpha_{ik}$ and $\alpha_{uv}$ are not reasonable, because the factors contributed by users and items are asymmetric.
$$\alpha_{ik} = \frac{1}{\sqrt{d_u+1}\sqrt{d_k+1}\,(d_i+1)}$$
This is the weight for modeling the item-item relationship, yet the target item $i$ and the interacted item $k$ receive different weights.
Limitation 2
Message passing combines the various relationships via stacked layers.
Stacking such problematic information has a negative impact on the results.
→ Need to adjust the weights (importance) of the various relationships.
Limitation of LightGCN
5
Limitation 3
Stacking more layers → captures higher-order collaborative signals.
But LightGCN performs best with only 2–3 layers → the over-smoothing problem may occur.
From Theorem 1 in GCNII[1], the limit of infinite powers of the message-passing matrix can be derived:
$$\lim_{l \to \infty} \left(D^{-\frac{1}{2}} A D^{-\frac{1}{2}}\right)^l_{i,j} = \frac{\sqrt{d_i+1}\sqrt{d_j+1}}{2m+n}$$
($m$: number of edges, $n$: number of nodes)
[1] Simple and Deep Graph Convolutional Networks (ICML’20)
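This limit is easy to verify numerically. A sketch on an assumed toy graph (a 3-node path, so $n=3$, $m=2$, with self-connections added as in the formulas above):

```python
import numpy as np

# Path graph 0-1-2: n = 3 nodes, m = 2 edges.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
n, m = 3, 2
d = A.sum(axis=1)                          # degrees without self-loops
A_tilde = A + np.eye(n)                    # adjacency with self-connections
D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1))
M = D_inv_sqrt @ A_tilde @ D_inv_sqrt      # D^{-1/2} A D^{-1/2}

P = np.linalg.matrix_power(M, 200)         # a high power approximates the limit
limit = np.sqrt(np.outer(d + 1, d + 1)) / (2 * m + n)
```

`P` matches `limit` entry-wise: every pair $(i,j)$ converges to the same rank-one matrix that depends only on degrees, which is exactly the over-smoothing problem.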
Motivation: Removing explicit message passing!
Proposed method: UltraGCN
6
Figure 1: UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation (CIKM’21)
Remove explicit message passing
Directly approximate the convergence state:
$$e_i = \lim_{l \to \infty} e_i^{(l+1)} = \lim_{l \to \infty} e_i^{(l)}$$
Link prediction in graph
Learning on user-item graph
7
Idea:
$$e_u = \lim_{l \to \infty} e_u^{(l+1)} = \lim_{l \to \infty} e_u^{(l)}$$
Propagation in LightGCN at the convergence state:
$$e_u = \frac{1}{d_u+1}\, e_u + \sum_{i \in N(u)} \frac{1}{\sqrt{d_u+1}\sqrt{d_i+1}}\, e_i$$
Rearranging:
$$e_u = \sum_{i \in N(u)} \beta_{u,i}\, e_i, \qquad \beta_{u,i} = \frac{1}{d_u} \sqrt{\frac{d_u+1}{d_i+1}}$$
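The rearrangement can be sanity-checked: an $e_u$ built from the $\beta$-weighted sum must satisfy the original fixed-point propagation equation. A sketch with assumed toy degrees and random item embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
d_u = 3                                    # assumed user degree, so |N(u)| = 3
d_i = np.array([1.0, 2.0, 4.0])            # assumed degrees of the items in N(u)
E_items = rng.normal(size=(3, 8))          # item embeddings e_i

# e_u from the rearranged form: e_u = sum_i beta_{u,i} e_i
beta = (1.0 / d_u) * np.sqrt((d_u + 1) / (d_i + 1))
e_u = beta @ E_items

# It must satisfy the fixed-point equation:
# e_u = e_u / (d_u + 1) + sum_i e_i / (sqrt(d_u+1) sqrt(d_i+1))
coef = 1.0 / (np.sqrt(d_u + 1) * np.sqrt(d_i + 1))
rhs = e_u / (d_u + 1) + coef @ E_items
```

`e_u` and `rhs` coincide, confirming the closed form of $\beta_{u,i}$.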
Objective: minimize the difference between the two sides.
$$L_C = -\sum_{(u,i) \in N^+} \beta_{u,i} \log \sigma(e_u^\top e_i) - \sum_{(u,j) \in N^-} \beta_{u,j} \log \sigma(-e_u^\top e_j)$$
$N^+$: set of positive pairs
$N^-$: set of randomly sampled negative pairs
Learning on user-item graph
8
Typical prediction in a recommender system → link prediction on a graph.
Possible losses: pairwise BPR vs. pointwise BCE.
$$L_O = -\sum_{(u,i) \in N^+} \log \sigma(e_u^\top e_i) - \sum_{(u,j) \in N^-} \log \sigma(-e_u^\top e_j)$$
$$L = L_O + \lambda L_C$$
The loss above depends only on the user-item graph (UltraGCN_Base).
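A minimal NumPy sketch of the combined objective $L = L_O + \lambda L_C$ (not the authors' implementation; the pair indices, $\beta$ values, and $\lambda$ below are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ultragcn_base_loss(E_u, E_i, pos, neg, beta_pos, beta_neg, lam):
    """L = L_O + lam * L_C over positive pairs `pos` and sampled negatives `neg`.
    pos/neg: integer arrays of shape (P, 2) holding (user, item) indices;
    beta_pos/beta_neg: the matching beta_{u,i} weights."""
    s_pos = np.einsum('ij,ij->i', E_u[pos[:, 0]], E_i[pos[:, 1]])  # e_u^T e_i
    s_neg = np.einsum('ij,ij->i', E_u[neg[:, 0]], E_i[neg[:, 1]])  # e_u^T e_j
    L_O = -np.log(sigmoid(s_pos)).sum() - np.log(sigmoid(-s_neg)).sum()
    L_C = -(beta_pos * np.log(sigmoid(s_pos))).sum() \
          - (beta_neg * np.log(sigmoid(-s_neg))).sum()
    return L_O + lam * L_C

# Toy usage with random embeddings: 4 users, 5 items, dimension 8.
rng = np.random.default_rng(3)
E_u, E_i = rng.normal(size=(4, 8)), rng.normal(size=(5, 8))
pos = np.array([[0, 1], [2, 3]])
neg = np.array([[0, 4], [2, 0]])
loss = ultragcn_base_loss(E_u, E_i, pos, neg,
                          beta_pos=np.array([0.5, 0.7]),
                          beta_neg=np.array([0.6, 0.4]),
                          lam=1.0)
```

Note that with no message passing at all, this objective reduces to a weighted matrix-factorization-style loss over sampled pairs.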
Learning on item-item graph
9
Limitation 2: need to adjust the weights (importance) of the various relationships.
UltraGCN does not use explicit message passing → it can flexibly adjust the weights of the various relationships.
The item-item co-occurrence graph is useful for recommendation[1].
1. Build the item-item co-occurrence graph $G = A^\top A \in \mathbb{R}^{|I| \times |I|}$.
2. Approximate the infinite state on $G$ in the same way $\beta_{u,i}$ was derived:
$$e_i = \sum_{j \in N_G(i)} \omega_{i,j}\, e_j, \quad \text{where}\quad \omega_{i,j} = \frac{G_{i,j}}{g_i - G_{i,i}} \sqrt{\frac{g_i}{g_j}}, \qquad g_i = \sum_k G_{i,k}$$
[1]: M2GRL: A Multi-task Multi-view Graph Representation Learning Framework for Web-scale Recommender Systems (KDD’20 ads track, oral)
Rather than using all 𝑗 ∈ 𝑁𝐺(𝑖), select top-K most similar items 𝑆(𝑖) based on 𝜔𝑖,𝑗 for training
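A sketch of this construction (here $A$ is the $|U| \times |I|$ user-item interaction matrix; the toy matrix and $K$ are assumptions, and real data would need guards against items with no co-occurrences):

```python
import numpy as np

def item_item_weights(A, K):
    """Build G = A^T A and the omega weights; return the top-K item set S(i)."""
    G = A.T @ A                                  # item-item co-occurrence counts
    g = G.sum(axis=1)                            # g_i = sum_k G_{i,k}
    omega = (G / (g - np.diag(G))[:, None]) * np.sqrt(g[:, None] / g[None, :])
    np.fill_diagonal(omega, -np.inf)             # exclude the item itself
    S = np.argsort(-omega, axis=1)[:, :K]        # top-K most similar items S(i)
    return omega, S

# Toy interaction matrix: 3 users x 3 items.
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]], dtype=float)
omega, S = item_item_weights(A, K=2)
```

Since $\omega$ depends only on the interaction matrix, both $\omega_{i,j}$ and $S(i)$ can be precomputed once before training.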
Learning on item-item graph & Final loss
10
On the item-item graph, the proper representation of item $i$ is
$$e_i = \sum_{j \in S(i)} \omega_{i,j}\, e_j$$
For each positive pair $(u,i) \in N^+$, a BCE loss with the infinite state of $e_i$:
$$L_I = -\sum_{(u,i) \in N^+} \sum_{j \in S(i)} \omega_{i,j} \log \sigma(e_u^\top e_j) \quad (+\ \text{negative sampling})$$
Final loss:
$$L = L_O + \lambda L_C + \gamma L_I$$
Experiments
11
Experiments
12
Epochs needed to achieve the best performance
Experiments – Ablation study
13
Checklists
1. Is each part of UltraGCN effective?
2. Is training user-item pairs on the item-item co-occurrence graph better than training item-item pairs?
3. Why not use a user-user co-occurrence graph?
$$L_I' = -\sum_{(u,i) \in N^+} \sum_{j \in S(i)} \omega_{i,j} \log \sigma(e_i^\top e_j)$$
Experiments – Ablation study
14
Checklists
1. Is each part of UltraGCN effective?
2. Is training user-item pairs on the item-item co-occurrence graph better than training item-item pairs?
3. Why not use a user-user co-occurrence graph?


Editor's notes
1. Is it right to treat item k, which the user has already interacted with, differently from item i, whose score we want to predict? These effects accumulate as layers stack — is the resulting behavior correct?
2. The original paper says a ±α term appears due to the spectral gap, but it is dropped here. *Spectral gap: the difference between the moduli (absolute values) of the two largest eigenvalues of a matrix (Wikipedia).
3. Message passing is problematic + the limit of infinite powers exists → why not skip infinite-layer message passing and directly approximate a suitable convergence state?
4. Training only on positive pairs can again cause over-smoothing. Existing GCN-based models mitigated this by reducing the number of layers; since UltraGCN approximates the limit of infinite-layer message passing, it uses negative sampling instead. In the end, the resulting form looks like a weighted MF...?!
5. A user's interests are broader than the properties any single item has, so a user-user graph may not capture user-user relationships well — which is why adding it has little effect on performance.