The document proposes a Partially Labeled Stochastic Block Model (PLSBM) as a probabilistic generative model for networks with some labeled nodes. It proves the relationship between label propagation (LP) and the Stochastic Block Model (SBM) through PLSBM. Specifically, it shows that the solution to LP is identical to maximum likelihood estimation under PLSBM when the label ratio is uniform, edge probabilities are uniform within and between labels, and the network is assortative. When these conditions do not hold, LP is shown to fail both theoretically and experimentally.
5. Label Propagation (2/2)
$$Q(F; X, Y, \lambda) = \frac{1}{2} \sum_{i=1}^{N} \lVert f_i - y_i \rVert_2^2 + \frac{\lambda}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} x_{ij} \lVert f_i - f_j \rVert_2^2$$
Given: adjacency matrix X and labels Y
Find: F = { fi } that minimizes Q
17/08/22
IJCAI@Melbourne
$F \in \mathbb{R}^{N \times K}$
$Y \in \{0, 1\}^{N \times K}$
$X \in \{0, 1\}^{N \times N}$
N: # of nodes
K: # of labels
$\lambda \in \mathbb{R}_{+}$: user parameter
[Zhu+, 03], [Zhou+, 03], etc.
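Because Q is a convex quadratic in F, its minimizer can be written as the solution of a linear system. The sketch below is one way to compute it, assuming X is symmetric (an undirected network) and that the rows of Y for unlabeled nodes are all zero; neither assumption is stated on the slide. The names `label_propagation` and `objective` are illustrative and are not taken from the authors' code.

```python
import numpy as np

def label_propagation(X, Y, lam):
    """Minimize Q(F) = 1/2 sum_i ||f_i - y_i||^2 + lam/2 sum_ij x_ij ||f_i - f_j||^2."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Assumption: X is symmetric, so sum_ij x_ij ||f_i - f_j||^2 = 2 tr(F^T L F)
    # with the unnormalized graph Laplacian L = D - X.
    L = np.diag(X.sum(axis=1)) - X
    # Gradient of Q is (F - Y) + 2 * lam * L @ F; setting it to zero gives
    # the linear system (I + 2 * lam * L) F = Y.
    return np.linalg.solve(np.eye(X.shape[0]) + 2.0 * lam * L, Y)

def objective(F, X, Y, lam):
    """Evaluate Q directly from its definition (useful as a sanity check)."""
    fit = 0.5 * np.sum((F - Y) ** 2)
    diff = F[:, None, :] - F[None, :, :]  # shape (N, N, K)
    smooth = 0.5 * lam * np.sum(X * np.sum(diff ** 2, axis=2))
    return fit + smooth

# Toy example: path graph 0 - 1 - 2 - 3, node 0 labeled class 0, node 3 labeled class 1.
X = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Y = np.array([[1, 0],
              [0, 0],
              [0, 0],
              [0, 1]])
F = label_propagation(X, Y, lam=1.0)
labels = F.argmax(axis=1)  # discretized solution: one label per node
```

In this toy case each unlabeled node takes the label of the nearer labeled node, which is the behavior the smoothness term in Q encourages.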
6. Cases when LP fails (practically known)
• Different labels are connected
• Label ratio is not uniform
• Edge probability is not uniform
Q. So, do we know why LP fails in these cases?
A. No. Since LP is not a probabilistic model, we don't know the assumptions behind it.
7. What we do in this work
1. Prove a theoretical relationship between LP and the Stochastic Block Model, which is a well-studied probabilistic generative model
2. Find the assumptions behind LP through the assumptions behind SBM
3. Show when and why LP fails
9. Stochastic Block Model
Generative process
①: Generate cluster assignment for each node from a Multinomial distribution
(the assignments can be thought of as labels)
②: Generate adjacency matrix from Bernoulli distributions
Parameters: $\gamma \in \mathbb{R}^{K}$, $\Pi \in \mathbb{R}^{K \times K}$
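The two-step generative process can be sketched as a small sampler. This assumes the standard SBM form, in which each node's cluster is drawn with probabilities γ and each edge is drawn with probability Π[z_i, z_j]; it also assumes an undirected network without self-loops, which the slide does not specify. The function name `sample_sbm` is hypothetical.

```python
import numpy as np

def sample_sbm(N, gamma, Pi, rng):
    """Draw cluster assignments z and an adjacency matrix X from an SBM."""
    gamma = np.asarray(gamma, dtype=float)
    Pi = np.asarray(Pi, dtype=float)
    K = len(gamma)
    # Step 1: cluster assignment for each node, drawn with probabilities gamma.
    z = rng.choice(K, size=N, p=gamma)
    # Step 2: edge (i, j) is Bernoulli with probability Pi[z_i, z_j].
    P = Pi[z[:, None], z[None, :]]
    # Assumption: undirected, no self-loops. Sample the strict upper triangle
    # and mirror it to the lower triangle.
    upper = np.triu(rng.random((N, N)) < P, k=1)
    X = (upper | upper.T).astype(int)
    return z, X

rng = np.random.default_rng(0)
z, X = sample_sbm(N=30, gamma=[0.5, 0.5], Pi=[[0.8, 0.1], [0.1, 0.8]], rng=rng)
```

Here Π has larger diagonal than off-diagonal entries, so nodes in the same cluster are connected more often than nodes in different clusters.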
10. Proposed: Partially Labeled SBM (PLSBM)
Generative process (steps ①, ②, ③)
②: Generate labels for “labeled nodes”. This step depends on parameter α
(large α → $y_i$ is more likely to be the same as $z_i$)
Parameters: $\gamma \in \mathbb{R}^{K}$, $\Pi \in \mathbb{R}^{K \times K}$, $\alpha \in [0, 1]$
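The label-generation step can be illustrated with a short sketch. The slide only says that a larger α makes y_i more likely to equal z_i, so the exact distribution used here is an assumption: a labeled node receives its true cluster z_i with probability α and otherwise one of the other K − 1 labels uniformly at random. The function name `sample_observed_labels` and the boolean mask `labeled` are hypothetical, and the sketch needs K ≥ 2.

```python
import numpy as np

def sample_observed_labels(z, K, alpha, labeled, rng):
    """Return Y in {0,1}^(N x K); rows of unlabeled nodes stay all zero."""
    z = np.asarray(z)
    Y = np.zeros((len(z), K), dtype=int)
    for i in np.flatnonzero(labeled):
        if rng.random() < alpha:
            y = z[i]  # observed label agrees with the cluster assignment
        else:
            # Assumed noise model: pick one of the other K - 1 labels uniformly.
            y = rng.choice([k for k in range(K) if k != z[i]])
        Y[i, y] = 1
    return Y

rng = np.random.default_rng(0)
z = np.array([0, 1, 2, 1])
labeled = np.array([True, False, True, False])
Y = sample_observed_labels(z, K=3, alpha=1.0, labeled=labeled, rng=rng)
```

With α = 1 the observed labels of labeled nodes always match their cluster assignments; lowering α adds label noise.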
12. Main Result
The MAP estimator of Z under PLSBM is identical to the solution of
(discretized) LP when the following conditions hold:
Condition 1:
Condition 2:
Condition 3:
Condition 4: (omitted)
13. Condition 1
Implication (implicit assumption of LP)
• Label ratio is uniform
Violates this assumption ☹
14. Condition 2
Implication (implicit assumptions of LP)
• Edge probabilities between the same labels are all the same (μ)
• Edge probabilities between different labels are all the same (ν)
Violates this assumption ☹
15. Condition 3
Implication (implicit assumption of LP)
• Assortative (same labels tend to be connected)
Violates this assumption ☹
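The implications of Conditions 1 to 3 can be turned into a simple check on SBM parameters. The sketch below tests whether a given γ and Π satisfy the three implicit assumptions as they are worded on the slides; it does not reproduce the exact mathematical conditions, which are not recoverable here. Treating "assortative" as "every within-label probability exceeds every between-label probability" is an assumption, and the function name `check_lp_assumptions` is hypothetical. The sketch needs K ≥ 2.

```python
import numpy as np

def check_lp_assumptions(gamma, Pi, tol=1e-9):
    """Report which implicit assumptions of LP hold for parameters (gamma, Pi)."""
    gamma = np.asarray(gamma, dtype=float)
    Pi = np.asarray(Pi, dtype=float)
    K = len(gamma)
    within = np.diag(Pi)                   # edge probabilities inside each label
    between = Pi[~np.eye(K, dtype=bool)]   # edge probabilities across labels
    return {
        # Condition 1: label ratio is uniform.
        "uniform_label_ratio": bool(np.allclose(gamma, 1.0 / K, atol=tol)),
        # Condition 2: one shared within-label value (mu) and one shared
        # between-label value (nu).
        "uniform_edge_probs": bool(
            np.allclose(within, within[0], atol=tol)
            and np.allclose(between, between[0], atol=tol)
        ),
        # Condition 3 (assumed reading): same labels are more likely connected.
        "assortative": bool(within.min() > between.max()),
    }

result = check_lp_assumptions([0.5, 0.5], [[0.3, 0.05], [0.05, 0.3]])
```

For these example parameters all three checks pass; skewing γ, making the diagonal of Π uneven, or swapping the diagonal and off-diagonal values breaks one check each.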
16. Experimental results
Setup:
1. Generate datasets by PLSBM
2. Infer labels (Z) by PLSBM, SBM, and LP
3. Report mean accuracy over 20 trials
[Figure: results for Assortative and Disassortative networks, with the "Better" direction marked]
Agree with theoretical results
… Come see full results at the poster session ☺
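The contrast between the assortative and disassortative settings can be seen on a tiny deterministic example. This is not the authors' experiment: it uses two hand-built six-node networks instead of PLSBM samples, and it solves LP in closed form under the assumption that X is symmetric and unlabeled rows of Y are zero. The function name `discretized_lp` is hypothetical.

```python
import numpy as np

def discretized_lp(X, Y, lam=1.0):
    """Solve (I + 2*lam*L) F = Y with L = D - X, then take the argmax per node."""
    X = np.asarray(X, dtype=float)
    L = np.diag(X.sum(axis=1)) - X
    F = np.linalg.solve(np.eye(len(X)) + 2.0 * lam * L, np.asarray(Y, dtype=float))
    return F.argmax(axis=1)

# Two groups of three nodes; one labeled node per group (nodes 0 and 3).
truth = np.array([0, 0, 0, 1, 1, 1])
Y = np.zeros((6, 2))
Y[0, 0] = 1
Y[3, 1] = 1

ones = np.ones((3, 3))
zero = np.zeros((3, 3))
# Assortative: edges only inside each group (two disjoint triangles).
X_assort = np.block([[ones - np.eye(3), zero], [zero, ones - np.eye(3)]])
# Disassortative: edges only between the groups (complete bipartite graph).
X_disassort = np.block([[zero, ones], [ones, zero]])

acc_assort = np.mean(discretized_lp(X_assort, Y) == truth)        # 1.0
acc_disassort = np.mean(discretized_lp(X_disassort, Y) == truth)  # 2/6
```

In the disassortative network every unlabeled node is adjacent to the labeled node of the opposite group, so LP assigns it the wrong label and only the two labeled nodes are classified correctly.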
17. Summary
• Proposed Partially-Labeled SBM (PLSBM)
• Proved the relationship between LP and SBM via PLSBM
• Showed cases when LP fails
• Experimental and theoretical results agree
Github: yamaguchiyuto/plsbm