[RecSys2023] Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis

Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis
Vito Walter Anelli¹, Daniele Malitesta¹, Claudio Pomo¹, Alejandro Bellogín², Eugenio Di Sciascio¹, Tommaso Di Noia¹
¹ Politecnico di Bari, Bari (Italy), email: firstname.lastname@poliba.it
² Universidad Autónoma de Madrid, Madrid (Spain), email: alejandro.bellogin@uam.es
The 17th ACM Conference on Recommender Systems
Singapore, SG, 09-20-2023
Main Track - Reproducibility
Outline
● Introduction and motivations
● Background and reproducibility analysis
● Replication of prior results (RQ1)
● Benchmarking graph CF approaches using alternative baselines (RQ2)
● Extending the experimental comparison to new datasets (RQ3 - RQ4)
● Conclusion and future work
Introduction and motivations
Graph collaborative filtering: message-passing
In collaborative filtering (CF), graph convolutional networks (GCNs) have gained momentum thanks to their ability to aggregate neighbor-node information into the ego nodes at multiple hops (i.e., message-passing), thus effectively distilling the collaborative signal.
[Figure: a bipartite user-item graph in which messages from the one-hop neighborhood are propagated to the ego node.]
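As a reference for the propagation step sketched above, this is the feature-transformation-free message-passing rule popularized by LightGCN [He et al.]; the deck does not show the formula, so it is quoted from that paper rather than from the slides:

$$
\mathbf{e}_u^{(k+1)} \;=\; \sum_{i \in \mathcal{N}_u} \frac{1}{\sqrt{|\mathcal{N}_u|}\sqrt{|\mathcal{N}_i|}}\, \mathbf{e}_i^{(k)},
\qquad
\mathbf{e}_i^{(k+1)} \;=\; \sum_{u \in \mathcal{N}_i} \frac{1}{\sqrt{|\mathcal{N}_i|}\sqrt{|\mathcal{N}_u|}}\, \mathbf{e}_u^{(k)},
$$

where $\mathcal{N}_u$ and $\mathcal{N}_i$ are the one-hop neighborhoods of user $u$ and item $i$, and $\mathbf{e}^{(k)}$ denotes the embeddings after $k$ hops.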
Graph collaborative filtering: a non-exhaustive timeline
● 2017-2018: pioneer approaches proposing GCN-based aggregation methods [van den Berg et al., Ying et al.]
● 2019: explore inter-dependencies between nodes and their neighbors [Wang et al. (2019a)]; use graph attention networks to recognize meaningful user-item interactions at a higher-grained level [Wang et al. (2019b)]
● 2020: lighten the graph convolutional layer [Chen et al., He et al.]; use graph attention networks to recognize meaningful user-item interactions at a higher-grained level [Wang et al. (2020), Tao et al.]
● 2021-2022: simplify the message-passing formulation [Mao et al., Peng et al., Shen et al.]
● 2021-2022: explore other latent spaces [Shen et al., Sun et al., Zhang et al.]
● 2021-2022: self-supervised and contrastive learning [Wu et al., Yu et al.]
● 2022: exploit hypergraphs [Wei et al., Xia et al.]
● 2022: use graph attention networks to recognize meaningful user-item interactions at a finer-grained level [Zhang et al.]
Reproducibility and graph collaborative filtering
● Reproducibility in machine learning research is the cutting-edge task of replicating experimental results under the same shared settings [Bellogín and Said, Anelli et al. (2021a-2022), Ferrari Dacrema et al. (2019-2021), Sun et al.]
● In graph collaborative filtering, reproducibility is not always feasible, since novel approaches usually tend to
○ copy and paste baseline results from previous papers
○ not provide full details about the experimental settings
● What the research community should seek to do
○ provide more detailed descriptions of the experimental settings
○ establish standard evaluation metrics and experimental protocols
Research questions
● RQ1. Is the state-of-the-art (i.e., the six most important papers) of graph collaborative filtering (graph CF) replicable?
● RQ2. How does the state-of-the-art of graph CF position itself with respect to the classic CF state-of-the-art?
● RQ3. How does the state-of-the-art of graph CF perform on datasets from different domains and with different topological aspects, not commonly adopted for graph CF recommendation?
● RQ4. What information (or lack of it) impacts the performance of the graph CF methods across the various datasets?
Background and reproducibility analysis
Background notions
A toy example: a bipartite and undirected user-item graph with users U = {u1, u2} and items I = {i1, i2, i3}, where u1 interacts with i1, i2, i3 and u2 interacts with i1, i2. The user-item interaction matrix R and the adjacency matrix A are:

R =
1 1 1
1 1 0

A =
0 0 1 1 1
0 0 1 1 0
1 1 0 0 0
1 1 0 0 0
1 0 0 0 0
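A minimal sketch in Python of how the adjacency matrix above is assembled from R and how one normalized propagation step spreads neighbor information. This mirrors the construction common to the surveyed graph CF models; the symmetric normalization and the variable names are our assumptions, not taken from the slides.

```python
import numpy as np

# Toy interaction matrix from the slide: 2 users x 3 items.
R = np.array([[1, 1, 1],
              [1, 1, 0]], dtype=float)
n_users, n_items = R.shape

# Bipartite adjacency matrix: A = [[0, R], [R^T, 0]].
A = np.block([[np.zeros((n_users, n_users)), R],
              [R.T, np.zeros((n_items, n_items))]])

# Symmetric normalization (assumed here, as in most GCN-based CF models).
deg = A.sum(axis=1)
d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
A_hat = d_inv_sqrt @ A @ d_inv_sqrt

# One message-passing step: each node aggregates its neighbors' embeddings.
rng = np.random.default_rng(0)
E0 = rng.normal(size=(n_users + n_items, 4))  # initial user+item embeddings
E1 = A_hat @ E0                               # embeddings after one hop
```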
Selected graph-based recommender systems
Model Venue Year Strategy
NGCF SIGIR 2019
• Pioneer approach in graph CF
• Models inter-dependencies among ego and neighbor nodes
DGCF SIGIR 2020
• Disentangles users' and items' representations into intents and weighs their importance
• Updates the graph structure according to those learned intents
LightGCN SIGIR 2020
• Lightens the graph convolutional layer
• Removes feature transformations and non-linearities
SGL SIGIR 2021
• Brings self-supervised and contrastive learning to recommendation
• Learns multiple node views through node/edge dropout and random walk
UltraGCN CIKM 2021
• Approximates infinite propagation layers through a constraint loss and negative sampling
• Explores item-item connections
GFCF CIKM 2021
• Questions graph convolution in recommendation through graph signal processing
• Proposes a strong closed-form algorithm
Analysis on reported baselines
● Most of the approaches (apart from UltraGCN) are compared against only a small subset of classical CF solutions
● The recent literature has raised concerns about usually-untested strong CF baselines [Anelli et al. (2021a-2022), Ferrari Dacrema et al. (2019-2021), Zhu et al.]
Models: NGCF [71], DGCF [73], LightGCN [28], SGL [78], UltraGCN [47], GFCF [59]
Used as graph CF baseline in (2021 — present): NGCF [10, 13, 32, 62, 77, 84]; DGCF [19, 39, 46, 74, 75, 92]; LightGCN [40, 54, 78, 82, 88, 89]; SGL [22, 46, 77, 82, 85, 93]; UltraGCN [17, 24, 42, 48, 95, 96]; GFCF [4, 5, 41, 50, 80, 96]
Classic CF baselines (✓ marks the graph CF models that compare against each baseline):
MF-BPR [55] ✓ ✓ ✓
NeuMF [29] ✓
CMN [18] ✓
MacridVAE [44] ✓
Mult-VAE [38] ✓ ✓ ✓
DNN+SSL [86] ✓
ENMF [11] ✓
CML [30] ✓
DeepWalk [52] ✓
LINE [66] ✓
Node2Vec [25] ✓
NBPO [91] ✓
Analysis on reported baselines (cont.)
● Conversely, most of the approaches are compared against graph CF solutions
● Orange ticks indicate that no extensive comparison among the selected baselines is performed (for chronological reasons)
Models: NGCF [71], DGCF [73], LightGCN [28], SGL [78], UltraGCN [47], GFCF [59]
Used as graph CF baseline in (2021 — present): NGCF [10, 13, 32, 62, 77, 84]; DGCF [19, 39, 46, 74, 75, 92]; LightGCN [40, 54, 78, 82, 88, 89]; SGL [22, 46, 77, 82, 85, 93]; UltraGCN [17, 24, 42, 48, 95, 96]; GFCF [4, 5, 41, 50, 80, 96]
Graph CF baselines (✓ marks the graph CF models that compare against each baseline):
HOP-Rec [83] ✓
GC-MC [68] ✓ ✓
PinSage [87] ✓
NGCF [71] ✓ ✓ ✓ ✓ ✓
DisenGCN [43] ✓
GRMF [53] ✓ ✓
GRMF-Norm [28] ✓ ✓
NIA-GCN [64] ✓
LightGCN [28] ✓ ✓ ✓
DGCF [73] ✓
LR-GCCF [14] ✓
SCF [94] ✓
BGCF [63] ✓
LCFN [90] ✓
Analysis on reported datasets
● Only a limited subset of shared recommendation datasets
● We include novel, never-investigated datasets
Models Gowalla Yelp 2018 Amazon Book Alibaba-iFashion Movielens 1M Amazon Electronics Amazon CDs
NGCF ✓ ✓ ✓
DGCF ✓ ✓ ✓
LightGCN ✓ ✓ ✓
SGL ✓ ✓ ✓
UltraGCN ✓ ✓ ✓ ✓ ✓ ✓
GFCF ✓ ✓ ✓
Analysis on experimental comparison
● NGCF trains all baselines from scratch
● DGCF reports the results for the shared baselines directly from the NGCF paper
● LightGCN, SGL, and UltraGCN copy and paste baseline results from the original papers
● GFCF reproduces the results from LightGCN, as the baselines are exactly the same
● Some authors are shared across these works
What we have done
● Re-implement all baselines from scratch by carefully following the original works
● Train/evaluate them within Elliot [Anelli et al. (2021b), Malitesta et al. (2023a)]
● Our goal is to provide a fair and repeatable experimental environment
● Use the same hyper-parameter settings as reported in the original papers and codes
Replication of prior results (RQ1)
Settings
● All approaches (except for SGL) use the same dataset filtering and splitting (80/20 hold-out splitting, user-wise); a minimal splitting sketch follows below
● 10% of the training set is left out for validation to tune the hyper-parameters (no indication is given in the papers and/or codes)
● The all-unrated-items evaluation protocol is adopted
● Evaluation through Recall@20 and nDCG@20 (Recall@20 as validation metric)
● The best hyper-parameter settings are usually shared in the paper and/or code
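A minimal sketch of the user-wise hold-out splitting just described (our own illustrative code, assuming per-user random sampling; the papers and codes do not specify the exact procedure):

```python
import random
from collections import defaultdict

def split_user_wise(interactions, test_ratio=0.2, val_ratio=0.1, seed=42):
    """80/20 user-wise hold-out split, with 10% of each user's training
    interactions further held out as validation (illustrative sketch)."""
    rng = random.Random(seed)
    per_user = defaultdict(list)
    for user, item in interactions:
        per_user[user].append(item)

    train, val, test = [], [], []
    for user, items in per_user.items():
        rng.shuffle(items)
        n_test = int(round(len(items) * test_ratio))
        test += [(user, i) for i in items[:n_test]]
        rest = items[n_test:]
        n_val = int(round(len(rest) * val_ratio))
        val += [(user, i) for i in rest[:n_val]]
        train += [(user, i) for i in rest[n_val:]]
    return train, val, test
```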
Results
● The most significant performance shift is in the order of 10⁻³
● GFCF is the best-replicated model (no random initialization of model weights)
● NGCF and DGCF rarely reach shifts in the order of 10⁻⁴ because of the random initializations and stochastic learning processes involved
● Replicability is ensured, and the copy-paste practice did not hurt the results
Datasets Models | Ours (Recall, nDCG) | Original (Recall, nDCG) | Performance Shift (Recall, nDCG)
Gowalla
NGCF | 0.1556 0.1320 | 0.1569 0.1327 | −1.3·10⁻³ −7·10⁻⁴
DGCF | 0.1736 0.1477 | 0.1794 0.1521 | −5.8·10⁻³ −4.4·10⁻³
LightGCN | 0.1826 0.1545 | 0.1830 0.1554 | −4·10⁻⁴ −9·10⁻⁴
SGL* | — — | — — | — —
UltraGCN | 0.1863 0.1580 | 0.1862 0.1580 | +1·10⁻⁴ 0
GFCF | 0.1849 0.1518 | 0.1849 0.1518 | 0 0
Yelp 2018
NGCF | 0.0556 0.0452 | 0.0579 0.0477 | −2.3·10⁻³ −2.5·10⁻³
DGCF | 0.0621 0.0505 | 0.0640 0.0522 | −1.9·10⁻³ −1.7·10⁻³
LightGCN | 0.0629 0.0516 | 0.0649 0.0530 | −2·10⁻³ −1.4·10⁻³
SGL | 0.0669 0.0552 | 0.0675 0.0555 | −6·10⁻⁴ −3·10⁻⁴
UltraGCN | 0.0672 0.0553 | 0.0683 0.0561 | −1.1·10⁻³ −8·10⁻⁴
GFCF | 0.0697 0.0571 | 0.0697 0.0571 | 0 0
Amazon Book
NGCF | 0.0319 0.0246 | 0.0337 0.0261 | −1.8·10⁻³ −1.5·10⁻³
DGCF | 0.0384 0.0295 | 0.0399 0.0308 | −1.5·10⁻³ −1.3·10⁻³
LightGCN | 0.0419 0.0323 | 0.0411 0.0315 | +8·10⁻⁴ +8·10⁻⁴
SGL | 0.0474 0.0372 | 0.0478 0.0379 | −4·10⁻⁴ −7·10⁻⁴
UltraGCN | 0.0688 0.0561 | 0.0681 0.0556 | +7·10⁻⁴ +5·10⁻⁴
GFCF | 0.0710 0.0584 | 0.0710 0.0584 | 0 0
*Results are not provided since SGL was not originally trained and tested on Gowalla.
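For reference, a minimal sketch of how Recall@20 and nDCG@20 can be computed per user under the all-unrated-items protocol (standard binary-relevance definitions; this is our illustration, not necessarily the exact implementation used in Elliot):

```python
import math

def recall_at_k(ranked_items, relevant_items, k=20):
    """Fraction of the user's relevant (test) items retrieved in the top-k."""
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items) if relevant_items else 0.0

def ndcg_at_k(ranked_items, relevant_items, k=20):
    """Binary-relevance nDCG: DCG over the top-k divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant_items)
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Per-user values are averaged over all users to obtain Recall@20 and nDCG@20.
print(recall_at_k(["i3", "i7", "i1"], {"i1", "i9"}, k=20))  # 0.5
print(ndcg_at_k(["i3", "i7", "i1"], {"i1", "i9"}, k=20))
```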
Benchmarking graph CF approaches using alternative baselines (RQ2)
Settings
● Expand the investigation to four classic CF recommender systems: UserkNN, ItemkNN, RP3β, EASEᴿ [Ferrari Dacrema et al. (2019), Anelli et al. (2022)]
● Consider two unpersonalized approaches (MostPop and Random)
● Follow the exact same 80/20 train/test splitting, and retain our 10% validation split of the training set
● Tune hyper-parameters with the Tree-structured Parzen Estimator (20 explorations) [Bergstra et al.]; a minimal sketch follows below
● Recall@20 is used as validation metric
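A minimal sketch of a Tree-structured Parzen Estimator search with 20 evaluations using the hyperopt library (which implements the algorithm of Bergstra et al.); the search space, the `train_and_validate` stub, and its return value are illustrative assumptions, not the deck's actual configuration:

```python
from hyperopt import fmin, hp, tpe, Trials

def train_and_validate(params):
    """Placeholder: train a model with `params` and return its Recall@20
    on the validation split. Replaced here by a dummy score."""
    return 1.0 / (1.0 + abs(params["lr"] - 0.001) + abs(params["reg"] - 1e-4))

def objective(params):
    # hyperopt minimizes the objective, so return the negated validation metric.
    return -train_and_validate(params)

space = {
    "lr": hp.loguniform("lr", -9, -2),     # illustrative learning-rate range
    "reg": hp.loguniform("reg", -12, -3),  # illustrative regularization range
}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=20, trials=trials)   # 20 explorations, as in the slides
print(best)
```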
Results
● Neither MostPop nor Random achieves acceptable results: popularity bias is not present in the datasets or was removed (see later)
● Some of the classic CF approaches reach better performance than some graph CF baselines, and on Yelp 2018 and Amazon Book they reach the best or second-best performance
Families Models | Gowalla (Recall, nDCG) | Yelp 2018 (Recall, nDCG) | Amazon Book (Recall, nDCG)
Reference
MostPop | 0.0416 0.0316 | 0.0125 0.0101 | 0.0051 0.0044
Random | 0.0005 0.0003 | 0.0005 0.0004 | 0.0002 0.0002
Classic CF
UserkNN | 0.1685 0.1370 | 0.0630 0.0528 | 0.0582 0.0477
ItemkNN | 0.1409 0.1165 | 0.0610 0.0507 | 0.0634 0.0524
RP3β | 0.1829 0.1520 | 0.0671 0.0559 | 0.0683 0.0565
EASEᴿ* | 0.1661 0.1384 | 0.0655 0.0552 | 0.0710 0.0567
Graph CF
NGCF | 0.1556 0.1320 | 0.0556 0.0452 | 0.0319 0.0246
DGCF | 0.1736 0.1477 | 0.0621 0.0505 | 0.0384 0.0295
LightGCN | 0.1826 0.1545 | 0.0629 0.0516 | 0.0419 0.0323
SGL | — — | 0.0669 0.0552 | 0.0474 0.0372
UltraGCN | 0.1863 0.1580 | 0.0672 0.0553 | 0.0688 0.0561
GFCF | 0.1849 0.1518 | 0.0697 0.0571 | 0.0710 0.0584
*Results for EASEᴿ on Amazon Book are taken from the BARS benchmark [Zhu et al.].
Extending the experimental comparison to new datasets (RQ3 - RQ4)
Settings
● Two novel datasets, Allrecipes and BookCrossing, with discordant characteristics compared to the other datasets
● Allrecipes:
○ users are more numerous than items
○ much lower average user and item degrees
● BookCrossing:
○ lowest ratio between users and items
○ much higher density than the other datasets
● Useful to assess the performance in different (and never-explored) topological settings
● Use the same experimental setting as in RQ2, with a validation set (10% of the training set)
Statistics Gowalla Yelp 2018 Amazon Book Allrecipes BookCrossing
Users 29,858 31,668 52,643 10,084 6,754
Items 40,981 38,048 91,599 8,407 13,670
Edges 810,128 1,237,259 2,380,730 80,540 234,762
Density 0.0007 0.0010 0.0005 0.0010 0.0025
Avg. Deg. (U) 27.1327 39.0697 45.2241 7.9869 34.7590
Avg. Deg. (I) 19.7684 32.5184 25.9908 9.5801 17.1735
Results
● Classic CF methods are very competitive
● Especially on BookCrossing, the classic CF baselines are the top-performing approaches
● Only UltraGCN and LightGCN maintain the performance observed on the previous datasets
● The performance of the other graph-based models drops significantly
Families Models | Allrecipes (Recall, nDCG) | BookCrossing (Recall, nDCG)
Reference
MostPop | 0.0472 0.0242 | 0.0352 0.0319
Random | 0.0024 0.0010 | 0.0013 0.0011
Classic CF
UserkNN | 0.0339 0.0188 | 0.0871 0.0769
ItemkNN | 0.0326 0.0180 | 0.0779 0.0739
RP3β | 0.0170 0.0089 | 0.0941 0.0821
EASEᴿ | 0.0351 0.0192 | 0.0925 0.0847
Graph CF
NGCF | 0.0291 0.0144 | 0.0670 0.0546
DGCF | 0.0448 0.0234 | 0.0643 0.0543
LightGCN | 0.0459 0.0236 | 0.0803 0.0660
SGL | 0.0365 0.0192 | 0.0716 0.0600
UltraGCN | 0.0475 0.0248 | 0.0800 0.0651
GFCF | 0.0101 0.0051 | 0.0819 0.0712
Discussion (graph-based models' ranking)
● UltraGCN and GFCF are the two best-performing approaches
● All the other approaches rank according to their chronological order of publication
● On Allrecipes and BookCrossing
○ UltraGCN preserves its role as the leading approach
○ GFCF's and DGCF's performance fluctuates considerably
○ LightGCN is in the top positions and surpasses models that should ideally outperform it (e.g., SGL)
○ NGCF's poor performance is confirmed
Metric Gowalla Yelp 2018 Amazon Book Allrecipes BookCrossing
Recall
1. UltraGCN (+19.73%) GFCF (+25.36%) GFCF (+122.57%) UltraGCN (+370.30%) GFCF (+27.37%)
2. GFCF (+18.83%) UltraGCN (+20.86%) UltraGCN (+115.67%) LightGCN (+354.46%) LightGCN (+24.88%)
3. LightGCN (+17.35%) SGL (+20.32%) SGL (+48.59%) DGCF (+343.56%) UltraGCN (+24.42%)
4. DGCF (+11.57%) LightGCN (+13.13%) LightGCN (+31.35%) SGL (+261.39%) SGL (+11.35%)
5. NGCF ( — ) DGCF (+11.69%) DGCF (+20.38%) NGCF (+188.12%) NGCF (+4.20%)
6. SGL* ( — ) NGCF ( — ) NGCF ( — ) GFCF ( — ) DGCF ( — )
nDCG
1. UltraGCN (+19.70%) GFCF (+26.33%) GFCF (+137.40%) UltraGCN (+386.27%) GFCF (+31.12%)
2. LightGCN (+17.05%) UltraGCN (+22.35%) UltraGCN (+128.05%) LightGCN (+362.75%) LightGCN (+21.55%)
3. GFCF (+15.00%) SGL (+22.12%) SGL (+51.22%) DGCF (+358.82%) UltraGCN (+19.89%)
4. DGCF (+11.89%) LightGCN (+14.16%) LightGCN (+31.30%) SGL (+276.47%) SGL (+10.50%)
5. NGCF ( — ) DGCF (+11.73%) DGCF (+19.92%) NGCF (+182.35%) NGCF (+0.55%)
6. SGL* ( — ) NGCF ( — ) NGCF ( — ) GFCF ( — ) DGCF ( — )
*SGL is not classifiable on the Gowalla dataset as results were not calculated in the original paper.
Discussion (analysis on the node degree)
● We reinterpret the node degree as the information flow from neighbor nodes to the ego nodes after multiple hops
● Only users are considered as ending nodes, because accuracy metrics are calculated user-wise
● The information flow is measured at 1, 2, and 3 hops (the per-user information after 1 hop is collected into a column vector)
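The slide does not spell out the formula, but one natural formalization consistent with the description (a hedged reconstruction, not necessarily the paper's exact notation) counts, for each user, the messages reaching it after h hops via powers of the adjacency matrix:

$$
\mathbf{f}^{(h)} \;=\; \big(\mathbf{A}^{h}\,\mathbf{1}\big)\big|_{\mathcal{U}}, \qquad h \in \{1, 2, 3\},
$$

so that $f^{(1)}_u$ is the degree of user $u$ (its activeness), $f^{(2)}_u$ sums the popularity of the items $u$ interacted with, and $f^{(3)}_u$ sums the activeness of the users co-interacting with those items.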
Analysis on the node degree (1-hop)
Indication of the activeness of users on the platform
[Figure: average performance (nDCG@20) and performance improvement across user quartiles computed over the 1-hop information values.]
● The 4th quartile is favoured over the other ones
● The trend is even more evident for GFCF
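A minimal sketch of the quartile analysis behind these plots (our own illustration with pandas; the column names and toy data are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical per-user results: 1-hop information value (user degree)
# and per-user nDCG@20 for one model.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "info_1hop": rng.integers(1, 200, size=1000),
    "ndcg_at_20": rng.uniform(0.0, 0.2, size=1000),
})

# Split users into quartiles of the information values and average the metric.
df["quartile"] = pd.qcut(df["info_1hop"], q=4, labels=["Q1", "Q2", "Q3", "Q4"])
print(df.groupby("quartile", observed=True)["ndcg_at_20"].mean())
```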
Analysis on the node degree (2-hop)
Indication of the influence of items' popularity on users
● Models favour the warm users who enjoyed popular items over the cold users who enjoyed niche items
● On Allrecipes, UltraGCN, DGCF, and LightGCN show less discriminatory behavior across quartiles; SGL and NGCF show a steeper slope, comparable to classic CF methods; GFCF's behavior is even more accentuated than in the 1-hop setting
● On BookCrossing, the trend is almost aligned across all models
Analysis on the node degree (3-hop)
Indication of the influence of co-interacting users' activeness on users
● On Allrecipes, UltraGCN, DGCF, and LightGCN exhibit more consistency across quartiles, while NGCF, SGL, and GFCF have a more disparate range of results
● On BookCrossing, the information at 3 hops does not provide more insight than at 2 hops
Conclusion and future work
Conclusion
● We replicate the results of six state-of-the-art graph CF methods
● We include other state-of-the-art approaches and other (unexplored) datasets
● The topological graph characteristics (i.e., node degree) may impact the performance
● This happens especially for the information flow at 2 hops (i.e., user activeness + item popularity)
Future work
● Further investigation into the diversity and fairness of graph CF approaches
● Analyze the impact of other topological graph characteristics on the performance (currently on arXiv [Malitesta et al. (2023b)])
Useful resources
A Topology-aware Analysis of Graph Collaborative Filtering [Malitesta et al. (2023b)]
References 1/2
[van den Berg et al.] 2017. Graph convolutional matrix completion. CoRR abs/1706.02263.
[Ying et al.] 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In KDD. ACM, 974–983.
[Wang et al. (2019a)] 2019. Neural Graph Collaborative Filtering. In SIGIR. ACM, 165–174.
[Wang et al. (2019b)] 2019. KGAT: Knowledge Graph Attention Network for Recommendation. In KDD. ACM, 950–958.
[Chen et al.] 2020. Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach. In AAAI. AAAI Press, 27–34.
[He et al.] 2020. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. In SIGIR. ACM, 639–648.
[Wang et al. (2020)] 2020. Disentangled Graph Collaborative Filtering. In SIGIR. ACM, 1001–1010.
[Tao et al.] 2020. MGAT: Multimodal Graph Attention Network for Recommendation. Inf. Process. Manag. 57, 5 (2020), 102277.
[Wu et al.] 2021. Self-supervised Graph Learning for Recommendation. In SIGIR. ACM, 726–735.
[Yu et al.] 2022. Are Graph Augmentations Necessary?: Simple Graph Contrastive Learning for Recommendation. In SIGIR. ACM, 1294–1303.
[Mao et al.] 2021. UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation. In CIKM. ACM, 1253–1262.
[Peng et al.] 2022. SVD-GCN: A Simplified Graph Convolution Paradigm for Recommendation. In CIKM. ACM, 1625–1634.
[Shen et al.] 2021. How Powerful is Graph Convolution for Recommendation?. In CIKM. ACM, 1619–1629.
[Sun et al.] 2021. HGCF: Hyperbolic Graph Convolution Networks for Collaborative Filtering. In WWW. ACM / IW3C2, 593–601.
[Zhang et al.] 2022. Geometric Disentangled Collaborative Filtering. In SIGIR. ACM, 80–90.
[Wei et al.] 2022. Dynamic Hypergraph Learning for Collaborative Filtering. In CIKM. ACM, 2108–2117.
[Xia et al.] 2022. Hypergraph Contrastive Collaborative Filtering. In SIGIR. ACM, 70–79.
[Bellogín and Said] 2021. Improving accountability in recommender systems research through reproducibility. User Model. User Adapt. Interact. 31, 5 (2021), 941–977.
[Anelli et al. (2021a)] 2021. Reenvisioning the comparison between Neural Collaborative Filtering and Matrix Factorization. In RecSys. ACM, 521–529.
[Anelli et al. (2022)] 2022. Top-N Recommendation Algorithms: A Quest for the State-of-the-Art. In UMAP. ACM, 121–131.
References 2/2
[Ferrari Dacrema et al. (2019)] 2019. Are we really making much progress? A worrying analysis of recent neural recommendation approaches. In RecSys. ACM, 101–109.
[Ferrari Dacrema et al. (2021)] 2021. A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research. ACM Trans. Inf. Syst. 39, 2 (2021), 20:1–20:49.
[Sun et al.] 2020. Are We Evaluating Rigorously? Benchmarking Recommendation for Reproducible Evaluation and Fair Comparison. In RecSys. ACM, 23–32.
[Zhu et al.] 2022. BARS: Towards Open Benchmarking for Recommender Systems. In SIGIR. ACM, 2912–2923.
[Anelli et al. (2021b)] 2021. Elliot: A Comprehensive and Rigorous Framework for Reproducible Recommender Systems Evaluation. In SIGIR. ACM, 2405–2414.
[Malitesta et al. (2023a)] 2023. An Out-of-the-Box Application for Reproducible Graph Collaborative Filtering extending the Elliot Framework. In UMAP (Adjunct Publication). ACM, 12–15.
[Bergstra et al.] 2011. Algorithms for Hyper-Parameter Optimization. In NIPS. 2546–2554.
[Malitesta et al. (2023b)] 2023. A Topology-aware Analysis of Graph Collaborative Filtering. CoRR abs/2308.10778.