SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Downloaden Sie, um offline zu lesen
1 1
B
. 6 1 1
I + L
1 1
1 12
0 1 1
6 1 +
•
-
u
w x
•
- p
- u p p
•
- w x
- u
p
ʼt
u
u
p u
( O XO 3KbO d C O
u
( w Y 9 x
•
- ö p ~
p
– |p2ZZOXNSa
– W J J!W
(
u
C 9D!
u w x
k(x, y) = exp( ||x y||2
/ 2
)p u
X HX
NKM
(Xi) = k(·, Xi)
: X ! HX
f(x) = hf, k(·, x)iHX
, 8f 2 HX . )
u |
b
Y
C 9D
r
• p u
pA u r
• × p t
Z
k(·, y) x(y) = k(·, x)
X ⇠ pX k(x1, x2) = h (x1), (x2)iHX
µX :=
Z
k(·, x)dpX (x) =
Z
(x)dpX(x)
pX
u
• × u p
wA oJA x
p G u p
u
• × ʼ ~
-
- u ×
k(x, x0
) = exx0
µX(t) = EX⇠P [k(t, X)] =
Z
etx
dP(x)
=
Z
1 + tx +
t2
x2
2
+
t3
x3
3!
+ · · · dP(x)
= 1 + tEX⇠P [X] +
t2
2
EX⇠P [X2
] +
t3
3!
EX⇠P [X3
] + · · ·
ˆµX :=
1
n
nX
i=1
k(·, xi)
!
• w8 O YX O K ?:AD .x
- u w x
-
•
- 5 WKaSW W WOKX NS M OZKXMb!r
w x
- A0B 5
• u HY RS K K O K ?:AD !
- u w pFS SZONSKä
x 5
X1, ..., X` ⇠ P Y1, ..., Yn ⇠ Q
H0 : P = Q vs H1 : P 6= Q
C 9D
rBSS K
y
f |
u fz
M2
(P, Q) := ||µP µQ||2
Hk
µP
µQ
,
•
- C 9D
•
r
• u
ä Z
u pS O
hg, CY X fiHY
= E[f(X)g(Y )] 8f 2 HX , g 2 HY
CY X : HX ! HY
7 YW DYXQ O K (
k((x1, y1), (x2, y2)) = kX(x1, x2)kY (y1, y2)
HX ⌦ HY
X, Y ⇠ p HX and (x), HY and (y).
CY X = E[ (Y ) ⌦ (X)] 2 HY ⌦ HX
u
-
• a~ H A HeG0a! u
r
|
wERW x
- u r
× pDYXQ O K (
• r
7 YW DYXQ O K (
µY |X=x = E[ (Y )|X = x]
µY |X=x = CY X C 1
XX (x)
ˆµY |X=x = ˆCY X ( ˆCXX + I) 1
(x)
E[g(Y )|X = x] ⇡ hg, ˆµY |X=xiHY
,
8 g 2 HY .
.
u
• 3KbO d O
- ~ p p
-
• ~ r 4 4 D 4 234 O Mv
• ×
~
• u u
- u
p × |
t
p⇡
(Y |X = x) / ⇡(Y )p(X = x|Y )
ö
O XO 3KbO C O7 WSc O K ?:AD !
• ö
• 8YK
• p ö ~
u ~
• r
• u
ʼ
p
Y ⇡(Y ) p(X = x|Y )
x ⇡(Y )
p⇡
(Y |X = x)
µ⇡
Y (X = x)
µ⇡
Y (X = x) = C⇡
Y X C⇡ 1
XX (x)
3 C⇡
Y X = Ep⇡(Y,X)[ Y (Y ) ⌦ X (X)]
p⇡
(Y, X) = ⇡(Y )p(X|Y ) 6= p(Y, X).
p | ~p(Y, X)
• r
- ä u
• EROY OW A YYP SX 2ZZOXNSa! p
~ w u | x
• u | p
~
~ p p
- ×
ˆµ⇡
Y (X = x) = ˆC⇡
Y X ([ ˆC⇡
XX]2
+ I) 1 ˆC⇡
XX (x)
ˆC⇡
XX (B + I) 1
z
(B2
+ I) 1
Bz
C⇡
Y X = C(Y X)Y C 1
Y Y ⇡Y , C⇡
XX = C(XX)Y C 1
Y Y ⇡Y
u
• L O _K SYX
• u
8 lXO j NO O K :4 = !
•
COQ3KbO IR O K =C )!
u |s
H6D
|s
ER O RY NSXQ COQ K ScK SYX COQ3KbO
(
u
• u
A YYP SX 2ZZOXNSa!
• COQ3KbO IR O K =C )!
• p
min
p(Y |X=x)
KL(p(Y |X = x)||⇡(Y ))
Z
log p(X = x|Y )dp(Y |X = x)
s.t. p(Y |X = x) 2 Pprob
min
p(Y |X=x),⇠
KL(p(Y |X = x)||⇡(Y ))
Z
log p(X = x|Y )dp(Y |X = x) + U(⇠)
s.t. p(Y |X = x) 2 Pprob(⇠)
u
)
|
r u p
p
A YZ
ER O RY NSXQ
OQ K ScK SYX
COQ3KbO
ä
"[µ]
"[µ]
"S[µ]
ˆ"S[µ]
ˆ"+
S [µ]
ˆ" ,n[µ]
|
r ~ u
A YZ
ER O RY NSXQ
OQ K ScK SYX
COQ3KbO
ä
"[µ]
"[µ]
"S[µ]
ˆ"S[µ]
ˆ"+
S [µ]
ˆ" ,n[µ]
~ ERW )!
ERW !
ERW (!
ERW !
• p
p
• p p
- – p p
r u | p
ä A YYP SX 2ZZOXNSa!
• EROY OW )
p
→ ~
u
p⇡
(Y |X = x) µ⇡
Y (X = x) 2 HY
h 2 HY
hh, µ⇡
Y (X = x)iHY
= Ep⇡(Y |X=x)[h(Y )]
"[µ] := sup
||h||HY
1
EX [(EY [h(Y )|X]] hh, µ(X)iHY
)2
]
EX [·] p⇡
(X) EY [·|X] p⇡
(Y |X)
"S[µ] = E(X,Y )⇠p⇡(X,Y )[|| (Y ) µ(X)||2
HY
]
µ⇤
:= argminµ2HK
"[µ] = argminµ2HK
"s[µ] pX a.s.
EROY OW ) Y O NO KS SX 2ZZOXNSXa!
(xi, yi)n
i=1
"[µ] µ : X ! HY
,
"[µ] "S[µ]
s
• r
|
•
- r
→
ö u r
QS_OX
→ w pERW x
RO O
"S[µ]
f(x, y) := || (y) µ(x)||2
HY
2 HX ⌦ HY
"s[µ] = E(X,Y )[|| (Y ) µ(X)||2
HY
] = hf, µ(X,Y )iHX ⌦HY
ˆµ(X,Y ) = ˆC(X,Y )Y ( ˆCY Y + I) 1
ˆ⇡Y
ˆ"s[µ] = hˆµ(X,Y ), fiHX ⌦HY
p⇡
(X, Y )
A YZ A YYP SX 2ZZOXNSa!
ˆ⇡Y =
P`
i=1 ˜↵i (˜yi)
"S[µ]
-
ˆ✏s[µ] =
nX
i=1
i|| (yi) µ(xi)||2
HY
,
where. = (GY + n I) 1 ˜GY ˜↵, (GY )ij = kY (yi, yj), ( ˜GY )ij = kY (yi, ˜yj)
ER O RY NSXQ OQ K ScK SYX
• pA YZ ~ p
• EROY OW ( | p
R O RY NSXQ OQ K ScK SYX p?S RSbKWK O K K GS_
ü A 5A ! u
| p EROY OW (
MYX S OXMb
i
i
+
i := max(0, i)
ER O RY NSXQ OQ K ScK SYX
✏+
s [µ]✏s[µ]r
|ˆ✏+
s [µ] ✏s[µ]|
p
! 0.
.
SXSWScSXQ KK
• p
• EROY OW
- × u p
ˆ✏+
s [µ]
ˆ✏ ,n[µ] =
nX
i=1
+
i || (yi) µ(xi)||2
HY
+ ||µ||2
HK
A YZY S SYX A YYP SX 2ZZOXNSXa!
ˆµ ,n = argminµ ˆ" ,n[µ]
ˆµ ,n(x) = (KX + n⇤+
) 1
K:x
where. = ( (y1), ..., (yn)), (KX )ij = kX (xi, xj),
⇤+
= diag(1/ +
1 , ..., 1/ +
n ), K:x = (kX (x, x1), ..., kX (x, xn))T
.
"S[ˆµ n,n]
P
! min
µ
"S[µ]. u
COQ3KbO
• ER O RY NSXQ OQ K ScK SYX p
ʼ
• COQLKbO
- r
- A YZ p
- s
• –p ü m n – 11
• u ö t
L :=
mX
i=1
+
i || (yi) µ(xi)||2
HY
+ ||µ||2
HK
+
nX
i=m+1
||µ(xi) (ti)||2
HY
.
A YZY S SYX (
ˆµreg(x) = (KX + ⇤+
) 1
K:x
where. = ( (y1), ... (yn)), (KX )ij = kX (xi, xj),
⇤+
= diag(1/ +
1 , ..., 1/ +
m, 1/ , ..., 1/ ), K:x = (kX (x, x1), ..., kX (x, xn))T
.
6aZO SWOX
• 4KWO K ZY S SYX OMY_O b !
• r × p
p
r i × !
• r
w x
u u | p u ×
| p × !
{(xt, yt)}m
t=1.
✓t+1 = ✓t + 0.2 + N (0, 4e 1), rt+1 = max(R2, min(R1, rt + N (0, 1))
xt+1 = rt+1 cos ✓t+1, yt+1 = rt+1 sin ✓t+1
{(xt, yt)}m
t=1.
×
• u u
u
• ö COQ3KbO ~
→ ä | D ZO _S SYX NK K p
COQ3KbO
• u 3C!p
7!
ER O RY NSXQ OQ K ScK SYX Z 3C!
COQ3KbO D6
- COQ3KbO ä
u | ö ~ p
~
ä ʼ
~
(
R1 = 0, R2 = 10
R1 = 5, R2 = 7
0 ä
~ ʼ
ö
It It+1
(xt, yt) (xt+1, yt+1)
A ONSM SYX
ö !
w x
p((xt+1, yt+1)|I1, ..., IT )
p((xt+1, yt+1)|I1, ..., IT , It+1)
COPO OXMO
• 2 R 8 O YX K OX 3Y Q K N K O CK MR 3O XRK N DMRk YZP KXN 2 DWY K 2
O XO WO RYN PY RO Y KWZ O Z YL OW
ZKQO (g ,
• =O DYXQ YXK RKX 9 KXQ 2 Oa DWY K KXN OXTS 7 WSc 9S LO ZKMO OWLONNSXQ YP
MYXNS SYXK NS SL SYX S R KZZ SMK SYX Y NbXKWSMK b OW
ZKQO . g. - 24 .
• =O DYXQ OXTS 7 WSc KXN 2 R 8 O YX O XO OWLONNSXQ YP MYXNS SYXK
NS SL SYX 2 XSPSON O XO P KWO Y PY XYXZK KWO SM SXPO OXMO SX Q KZRSMK WYNO
( ) .- (
• OXTS 7 WSc =O DYXQ KXN 2 R 8 O YX O XO LKbO d O
ZKQO ,(,g ,)
• HKXQ DYXQ X IR KXN HYXQ COX O XO 3KbO SKX :XPO OXMO S R AY O SY
COQ K ScK SYX ZKQO ), ( ),,
• H bK HY RS K K EYWYRK : K K 9S Y RS DK KNK KXN EK O RS HKWKNK 4 Y NYWKSX
WK MRSXQ PY LKQ YP Y N NK K _SK O XO OWLONNSXQ YP K OX NS SL SYX
ZKQO ) ) (
• H ?S RSbKWK 2LNO KW 3Y K SK 2 R 8 O YX KXN OXTS 7 WSc 9S LO ZKMO
OWLONNSXQ YP ZYWNZ K GS_ Z OZ SX K GS_ )--,
• D OPPOX 8 lXO j NO 8 b =O_O = MK 3K NK K O DKW AK O YX 2 R 8 O YX KXN
K SWS SKXY AYX S 4YXNS SYXK WOKX OWLONNSXQ K OQ O Y
ZKQO - (g -(
• X IR ?SXQ 4ROX KXN 6 SM A GSXQ 3KbO SKX SXPO OXMO S R ZY O SY OQ K ScK SYX KXN
KZZ SMK SYX Y SXPSXS O K OX _W
! ,..g -), )
)
Kernel Bayesian Inference with Posterior Regularization (Appendix)
Yuchi Matsuoka
2017 3 18
1 Preliminaries
(X, BX ) pX HX k(·, ·) RKHS pX
µX = EpX
[φ(X)] ∈ HX φ(X) = k(X, ·). 1
f ∈ H EpX
[f(X)] = EpX
[⟨f, φ(X)⟩] =
⟨f, µX⟩ universal kernel RKHS H sup norm CX
2 (X, BX ), (Y, BY) φ(x), ψ(y) RKHS HX ,HY p X ×Y
(X, Y ) CXY CXY = Ep[φ(X)⊗ψ(Y )] k((x1, y1), (x2, y2)) = kX (x1, x2)kY(y1, y2)
RKHS HX ⊗ HY µ(XY )
Theorem 1 CXX µX ∈ R(CXX) g ∈ HY E[g(Y )|X = ·] ∈ HX
µY = CY XC−1
XXµX, µY |X=x = E[ψ(Y )|X = x] = CY XC−1
XXφ(x).
2
µX pX {xi}N
i=1 ˆµX = 1
N
N
i=1 φ(xi), CXY
ˆCXY = 1
N
N
i=1 φ(xi) ⊗ ψ(yi)
RKHS Op(N−1/2
)
1
,i.e. supx kX (x, x) < ∞
1
3 Theorem 1
Theorem 1 ((Song et al., 2009, Equation 6)) mΠ mQy HX Π HY QY CXX
mΠ ∈ R(CXX) g ∈ HY E[g(Y )|X = ·] ∈ HX
mQy = CY XC−1
XXmΠ.
C−1
XXmΠ CXX mΠ
Proof CXXf = mΠ f ∈ HX g ∈ HY
⟨CY Xf, g⟩ = ⟨f, CXY g⟩ = ⟨f, CXXE[g(Y )|X = ·]⟩
= ⟨CXXf, E[g(Y )|X = ·]⟩ = ⟨mΠ, E[g(Y )|X = ·]⟩ = ⟨mQy , g⟩.
⟨mΠ, E[g(Y )|X = ·]⟩ = ⟨mQy , g⟩
⟨f, mX⟩ = E[f(X)]
(X, Y ) ∼ p(x, y). U ∼ π(u). (Z, W) ∼ q(x, y) = π(x)p(y|x), qY(y) = q(x, y)dx
⟨mQY
, g⟩HY
= E[g(W)] = g(w)qY(w)dw
⟨mΠ, E[g(Y )|X = ·]⟩ = EU [EY [g(Y )|U]] =
X
(
Y
g(y)p(y|u)dx)π(u)du
=
Y
(
X
g(y)q(u, y)du)dx =
Y
g(y)qY(y)dy.
mQy = CY XC−1
XXmΠ mΠ kX (·, x) E[kY(·, Y )|X = x] = CY XC−1
XXkX (·, x) ✷
4
π(Y ) Y p(X = x|Y ) pπ
(Y |X = x) π(Y ) x pπ
(X, Y ) = π(Y )p(X|Y )
πY CXY µπ
Y (X = x)
pπ
(Y |X = x) ∝ π(Y )p(X = x|Y ).
2
p(X|Y ) X × Y p CXY Thm.
1 Cπ
Y X pπ
Cπ
XX pπ
X
µπ
Y (X = x) = Cπ
Y XCπ −1
XX φ(x).
Cπ
Y X HY ⊗ HX µ(Y X) Thm 1.
µ(Y X) = C(Y X)Y C−1
Y Y πY , where. C(Y X)Y := E[ψ(Y ) ⊗ φ(X) ⊗ ψ(Y )].
Cπ
XX
µ(XX) = C(XX)Y C−1
Y Y πY
5
Regularized Bayesian inference (RegBayes) Pprob
minp(Y |X=x) KL(p(Y |X = x)||π(Y )) − log p(X = x|Y )dp(Y |X = x)
s.t. p(Y |X = x) ∈ Pprob
Proof
KL(p(Y |X = x)||π(Y )) − log p(X = x|Y )dp(Y |X = x) = log
p(Y |X = x)
π(Y )
dp(Y |X = x) − log p(X = x|Y )dp(Y |X = x)
= log
p(Y |X = x)
π(Y )p(X = x|Y )
dp(Y |X = x)
= log
p(Y |X = x)
π(Y )p(X=x|Y )
pπ(X=x)
dp(Y |X = x) + log pπ
(X = x)dp(Y |X = x)
= KL p(Y |X = x)||
π(Y )p(X = x|Y )
pπ(X = x)
+ log pπ
(X = x).
arg minp(Y |X=x) KL(p(Y |X = x)||π(Y )) − log p(X = x|Y )dp(Y |X = x) = π(Y )p(X=x|Y )
pπ(X=x)
. ✷
3
6 Vector-valued regression
(RKHS )
E(f) :=
n
i=1
||yj − f(xj)||2
HY
+ λ||f||2
HK
,
where. yj ∈ HY, f : X → HY.
f RKHS HY f RKHS HK
6.0.1 Vector-values regression and RKHSs
{(xi, vi)}i≤m X × V i.i.d X (V, ⟨·, ·⟩V
E(X,V )[||f(X) − V ||2
V]
f : X → V vector-valued regression problem
[Definition] h : X → V (H, ⟨·, ·⟩Γ) x ∈ X, v ∈ V h → ⟨v, h(x)⟩V
RKHS HΓ
Riesz 2
x ∈ X, v ∈ V V HΓ Γx (Γxv ∈ HΓ ) h ∈ HΓ
⟨v, h(x)⟩V = ⟨h, Γxv⟩Γ
HΓ RKHS
Γx L(V) V V
Γ(x, x′
) ∈ L(V)
Γ(x, x′
)v ∈ (Γx′ v)(x) ∈ V
2
(Riesz ) H H R H∗
H φ ∈ H∗
yφ ∈ H x ∈ H φ(x) = ⟨x, yφ⟩
4
[Proposition 2.1] Γ : X × X → L(V)
(1) Γ(x, x′
) = Γ(x′
, x)∗
.
(2) n ∈ N, {(xi, vi)}i≤n ⊂ X × V i,j≤n⟨vi, Γ(xi, xj)vj⟩V ≥ 0.
E(X,V )[||f(X) − V ||2
V] n
i=1 ||vi − f(xi)||2
V f RKHS HΓ
HΓ
ˆϵλ(f) :=
n
i=1
||vi − f(xi)||2
V + λ||f||2
Γ.
Γxi
Theorem 2.2.(Adapted from G. Lever and S. Gr¨unew¨alder+ 2012) f∗
ˆϵλ HΓ
f∗
=
n
i=1
Γxi
ci
{ci}, ci ∈ V
i≤n
(Γ(xj, xi) + λδji)ci = vj, 1 ≤ j ≤ n.
ˆϵλ(f)
5
7 ϵs[µ] ε[µ]
Proof
ε[µ] := sup
||h||HY
≤1
EX[(EY [h(Y )|X]] − ⟨h, µ(X)⟩HY
)2
]
= sup
||h||HY ≤1
EX[(EY [⟨h, ψ(Y )⟩HY
|X] − ⟨, h, µ(X)⟩HY
)2
]
≤ sup
||h||HY ≤1
EX,Y [⟨h, ψ(Y ) − µ(X)⟩2
HY
]
≤ sup
||h||HY ≤1
||h||2
HY
EX,Y [||ψ(Y ) − µ(X)||2
HY
]
= EX,Y [||ψ(Y ) − µ(X)||2
HY
] = ϵs[µ].
✷
8 Proposition 1
Proposition 1 (X, Y ) X × Y Y prior π(Y ) p(X|Y ) HX kX φ(x)
RKHS HY kY ψ(y) RKHS φ(x, y) HX ⊗ HY
ˆπY = ℓ
i=1 ˜αiψ(˜yi) πY {(xi, yi)}n
i=1 p(X|Y )
f(x, y) = ||ψ(y) − µ(x)||2
HY
f ∈ HX ⊗ HY
ˆϵs[µ] =
n
i=1
βi||ψ(yi) − µ(xi)||2
HY
,
β = (β1, ..., βn)T
β = (GY + nλI)−1 ˜GY ˜α (GY )ij = kY(yi, yj), ( ˜GY )ij = kY(yi, ˜yj), ˜α = (˜α1, ...˜αℓ)T
.
Proof K. Fukumizu 2016. Kernel Bayes Rule. Proposition 4
6
ΦX,Y = (φ(x1, y1), ..., φ(xn, yn)) ˆµ(X,Y ) = ΦX,Y β = ΦX,Y (GY + nλI)−1 ˜GY ˜α
HX ⊗ HY
ϵs[µ] = ⟨ˆµ(X,Y ), f⟩HX ⊗HY
= ⟨ΦX,Y (GY + nλI)−1 ˜GY ˜α, f⟩HX ⊗HY
= ⟨ΦX,Y β, f⟩HX ⊗HY
=
n
i=1
βi||ψ(yi) − µ(xi)||2
HY
.
ˆµ(X,Y ) = ˆC(X,Y )Y ( ˆCY Y + λI)−1
ˆπY h = ( ˆCY Y + λI)−1
ˆπY
h =
n
i=1
aiψ(yi) + h⊥
h⊥ h span(ψ(y1), ..., ψ(yn)} ( ˆCY Y + λI)h = ˆπY
1
n i,j≤n
aikY(yi, yj)ψ(yj) + λ
i≤n
aiψ(yi) + h⊥ =
i≤ℓ
˜αiψ(˜yi)
ψ(yk)|n
k=1
1
n
G2
Y a + λGY a = ˜GY ˜α ⇔
1
n
(GY + nλI)GY a = ˜GY ˜α ⇔
1
n
GY a = (GY + nλI)−1 ˜GY ˜α
ˆµ(X,Y )
ˆµ(X,Y ) =
1
n i≤n
φ(xi, yi) ⊗ ψ(yi) h =
1
n
ΦX,Y GY a = ΦX,Y (GY + nλI)−1 ˜GY ˜α
✷
7
9 Proposition 2
Proposition 2 i β+
i ̸= 0 µ ∈ HK HK K(xi, xj) = kX (xi, xj)I
I : HK → HK
ˆµλ,n(x) = Ψ(KX + λnΛ+
)−1
K:x
Ψ = (ψ(y1), ..., ψ(yn)) (KX)ij = kX (xi, xj) Λ+
= diag(1/β+
1 , ..., 1/β+
n ) K:x = (kX (x, x1), ..., kX (x, xn))T
λn
Proof β+
i = 0 (xi, yi) i β+
i ̸= 0
µ = µ0 + g µ0 = n
i=1 Kxi
ci ˆϵλ,n[µ]
ˆϵλ,n[µ] =
n
i=1
β+
i ||ψ(yi) − µ(xi)||2
HY
+ λn||µ||2
HK
=
n
i=1
β+
i ||ψ(yi) − (µ0(xi) + g(xi))||2
HY
+ λn||µ0 + g||2
HK
=
n
i=1
β+
i ||ψ(yi) − µ0(xi)||2
+ λn||µ0||2
+
n
i=1
β+
i ||g(xi)||2
+ λn||g||2
+ 2λn⟨µ0, g⟩ − 2
n
i=1
β+
i ⟨g(xi), ψ(yi) − µ0(xi)⟩.
i ψ(yi) − n
j=1 kX (xi, xj)cj = λn
β+
i
ci ˆϵλ,n[µ]
λn⟨µ0, g⟩ −
n
i=1
β+
i ⟨g(xi), ψ(yi) − µ0(xi)⟩ = 0
ˆϵλ,n[µ] = ˆϵλ,n[µ0] +
n
i=1
β+
i ||g(xi)||2
+ λn||g||2
≥ ˆϵλ,n[µ0]
ψ(yi) − n
j=1 kX (xi, xj)cj = λn
β+
i
ci ci µ0 = n
i=1 Kxi
ci
(KX + λnΛ+
)c = Ψ
µ0(x) =
n
i=1
kX (x, xi)ci = Ψ(KX + λnΛ+
)−1
K:x
✷
8

Weitere ähnliche Inhalte

Was ist angesagt?

GradStudentSeminarSept30
GradStudentSeminarSept30GradStudentSeminarSept30
GradStudentSeminarSept30
Ryan White
 
A Note on the Derivation of the Variational Inference Updates for DILN
A Note on the Derivation of the Variational Inference Updates for DILNA Note on the Derivation of the Variational Inference Updates for DILN
A Note on the Derivation of the Variational Inference Updates for DILN
Tomonari Masada
 
Actuarial Science Reference Sheet
Actuarial Science Reference SheetActuarial Science Reference Sheet
Actuarial Science Reference Sheet
Daniel Nolan
 

Was ist angesagt? (20)

Soluções dos exercícios de cinética química digitados
Soluções dos exercícios de cinética química digitadosSoluções dos exercícios de cinética química digitados
Soluções dos exercícios de cinética química digitados
 
Query Suggestion @ tokyotextmining#2
Query Suggestion @ tokyotextmining#2Query Suggestion @ tokyotextmining#2
Query Suggestion @ tokyotextmining#2
 
GradStudentSeminarSept30
GradStudentSeminarSept30GradStudentSeminarSept30
GradStudentSeminarSept30
 
Fixed point theorems for random variables in complete metric spaces
Fixed point theorems for random variables in complete metric spacesFixed point theorems for random variables in complete metric spaces
Fixed point theorems for random variables in complete metric spaces
 
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
[DL輪読会]近年のオフライン強化学習のまとめ —Offline Reinforcement Learning: Tutorial, Review, an...
 
A Note on the Derivation of the Variational Inference Updates for DILN
A Note on the Derivation of the Variational Inference Updates for DILNA Note on the Derivation of the Variational Inference Updates for DILN
A Note on the Derivation of the Variational Inference Updates for DILN
 
Probability based learning (in book: Machine learning for predictve data anal...
Probability based learning (in book: Machine learning for predictve data anal...Probability based learning (in book: Machine learning for predictve data anal...
Probability based learning (in book: Machine learning for predictve data anal...
 
PU Learning
PU LearningPU Learning
PU Learning
 
Lec4a policy-gradients-actor-critic
Lec4a policy-gradients-actor-criticLec4a policy-gradients-actor-critic
Lec4a policy-gradients-actor-critic
 
ゲーム理論BASIC 演習16 -2回繰り返し寡占市場ゲーム
ゲーム理論BASIC 演習16 -2回繰り返し寡占市場ゲームゲーム理論BASIC 演習16 -2回繰り返し寡占市場ゲーム
ゲーム理論BASIC 演習16 -2回繰り返し寡占市場ゲーム
 
Cii integral 5
Cii integral 5Cii integral 5
Cii integral 5
 
Patakis themata tth
Patakis themata tthPatakis themata tth
Patakis themata tth
 
Actuarial Science Reference Sheet
Actuarial Science Reference SheetActuarial Science Reference Sheet
Actuarial Science Reference Sheet
 
確率伝播
確率伝播確率伝播
確率伝播
 
A Fifth-Order Iterative Method for Solving Nonlinear Equations
A Fifth-Order Iterative Method for Solving Nonlinear EquationsA Fifth-Order Iterative Method for Solving Nonlinear Equations
A Fifth-Order Iterative Method for Solving Nonlinear Equations
 
[DL輪読会]近年のエネルギーベースモデルの進展
[DL輪読会]近年のエネルギーベースモデルの進展[DL輪読会]近年のエネルギーベースモデルの進展
[DL輪読会]近年のエネルギーベースモデルの進展
 
QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...
QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...
QMC: Operator Splitting Workshop, Boundedness of the Sequence if Iterates Gen...
 
Some recent developments in the traffic flow variational formulation
Some recent developments in the traffic flow variational formulationSome recent developments in the traffic flow variational formulation
Some recent developments in the traffic flow variational formulation
 
Teaching Population Genetics with R
Teaching Population Genetics with RTeaching Population Genetics with R
Teaching Population Genetics with R
 
Numerical Computing
Numerical ComputingNumerical Computing
Numerical Computing
 

Ähnlich wie 関西NIPS+読み会発表スライド

jhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfg
jhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfgjhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfg
jhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfg
Tonn Za
 
ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-
ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-
ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-
ssusere0a682
 
Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892
drayertaurus
 

Ähnlich wie 関西NIPS+読み会発表スライド (20)

Hiroaki Shiokawa
Hiroaki ShiokawaHiroaki Shiokawa
Hiroaki Shiokawa
 
統計的学習の基礎 4章 前半
統計的学習の基礎 4章 前半統計的学習の基礎 4章 前半
統計的学習の基礎 4章 前半
 
jhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfg
jhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfgjhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfg
jhkl,l.มือครูคณิตศาสตร์พื้นฐาน ม.4 สสวท เล่ม 2fuyhfg
 
情報幾何の基礎とEMアルゴリズムの解釈
情報幾何の基礎とEMアルゴリズムの解釈情報幾何の基礎とEMアルゴリズムの解釈
情報幾何の基礎とEMアルゴリズムの解釈
 
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
 
ゲーム理論BASIC 第19回 -有限回繰り返しゲーム-
ゲーム理論BASIC 第19回 -有限回繰り返しゲーム-ゲーム理論BASIC 第19回 -有限回繰り返しゲーム-
ゲーム理論BASIC 第19回 -有限回繰り返しゲーム-
 
ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-
ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-
ゲーム理論BASIC 第23回 -ベイジアンゲームにおける戦略と均衡-
 
cps170_bayes_nets.ppt
cps170_bayes_nets.pptcps170_bayes_nets.ppt
cps170_bayes_nets.ppt
 
Mechanical Engineering
Mechanical EngineeringMechanical Engineering
Mechanical Engineering
 
On maximal and variational Fourier restriction
On maximal and variational Fourier restrictionOn maximal and variational Fourier restriction
On maximal and variational Fourier restriction
 
Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892Tugasmatematikakelompok 150715235527-lva1-app6892
Tugasmatematikakelompok 150715235527-lva1-app6892
 
Ejercicios radhames ultima unidad
Ejercicios radhames ultima unidadEjercicios radhames ultima unidad
Ejercicios radhames ultima unidad
 
El text.life science6.matsubayashi191120
El text.life science6.matsubayashi191120El text.life science6.matsubayashi191120
El text.life science6.matsubayashi191120
 
ตรรกวิทยา
ตรรกวิทยาตรรกวิทยา
ตรรกวิทยา
 
Tugas matematika kelompok
Tugas matematika kelompokTugas matematika kelompok
Tugas matematika kelompok
 
[Paper Reading] Causal Bandits: Learning Good Interventions via Causal Inference
[Paper Reading] Causal Bandits: Learning Good Interventions via Causal Inference[Paper Reading] Causal Bandits: Learning Good Interventions via Causal Inference
[Paper Reading] Causal Bandits: Learning Good Interventions via Causal Inference
 
Solvable models on noncommutative spaces with minimal length uncertainty rela...
Solvable models on noncommutative spaces with minimal length uncertainty rela...Solvable models on noncommutative spaces with minimal length uncertainty rela...
Solvable models on noncommutative spaces with minimal length uncertainty rela...
 
Bayes2
Bayes2Bayes2
Bayes2
 
Tugasmatematikakelompok
TugasmatematikakelompokTugasmatematikakelompok
Tugasmatematikakelompok
 
Semi vae memo (2)
Semi vae memo (2)Semi vae memo (2)
Semi vae memo (2)
 

Kürzlich hochgeladen

LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.
Silpa
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
ANSARKHAN96
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
Silpa
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
1301aanya
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
seri bangash
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
Scintica Instrumentation
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Silpa
 
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Silpa
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.LUNULARIA -features, morphology, anatomy ,reproduction etc.
LUNULARIA -features, morphology, anatomy ,reproduction etc.
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
 
Role of AI in seed science Predictive modelling and Beyond.pptx
Role of AI in seed science  Predictive modelling and  Beyond.pptxRole of AI in seed science  Predictive modelling and  Beyond.pptx
Role of AI in seed science Predictive modelling and Beyond.pptx
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.Cyathodium bryophyte: morphology, anatomy, reproduction etc.
Cyathodium bryophyte: morphology, anatomy, reproduction etc.
 
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 

関西NIPS+読み会発表スライド

  • 1. 1 1 B . 6 1 1 I + L 1 1 1 12 0 1 1 6 1 +
  • 2. • - u w x • - p - u p p • - w x - u p ʼt
  • 3. u u p u ( O XO 3KbO d C O u ( w Y 9 x • - ö p ~ p – |p2ZZOXNSa – W J J!W (
  • 4. u C 9D! u w x k(x, y) = exp( ||x y||2 / 2 )p u X HX NKM (Xi) = k(·, Xi) : X ! HX f(x) = hf, k(·, x)iHX , 8f 2 HX . )
  • 5. u | b Y C 9D r • p u pA u r • × p t Z k(·, y) x(y) = k(·, x) X ⇠ pX k(x1, x2) = h (x1), (x2)iHX µX := Z k(·, x)dpX (x) = Z (x)dpX(x) pX
  • 6. u • × u p wA oJA x p G u p u • × ʼ ~ - - u × k(x, x0 ) = exx0 µX(t) = EX⇠P [k(t, X)] = Z etx dP(x) = Z 1 + tx + t2 x2 2 + t3 x3 3! + · · · dP(x) = 1 + tEX⇠P [X] + t2 2 EX⇠P [X2 ] + t3 3! EX⇠P [X3 ] + · · · ˆµX := 1 n nX i=1 k(·, xi)
  • 7. ! • w8 O YX O K ?:AD .x - u w x - • - 5 WKaSW W WOKX NS M OZKXMb!r w x - A0B 5 • u HY RS K K O K ?:AD ! - u w pFS SZONSKä x 5 X1, ..., X` ⇠ P Y1, ..., Yn ⇠ Q H0 : P = Q vs H1 : P 6= Q C 9D rBSS K y f | u fz M2 (P, Q) := ||µP µQ||2 Hk µP µQ ,
  • 8. • - C 9D • r • u ä Z u pS O hg, CY X fiHY = E[f(X)g(Y )] 8f 2 HX , g 2 HY CY X : HX ! HY 7 YW DYXQ O K ( k((x1, y1), (x2, y2)) = kX(x1, x2)kY (y1, y2) HX ⌦ HY X, Y ⇠ p HX and (x), HY and (y). CY X = E[ (Y ) ⌦ (X)] 2 HY ⌦ HX u -
  • 9. • a~ H A HeG0a! u r | wERW x - u r × pDYXQ O K ( • r 7 YW DYXQ O K ( µY |X=x = E[ (Y )|X = x] µY |X=x = CY X C 1 XX (x) ˆµY |X=x = ˆCY X ( ˆCXX + I) 1 (x) E[g(Y )|X = x] ⇡ hg, ˆµY |X=xiHY , 8 g 2 HY . .
  • 10. u • 3KbO d O - ~ p p - • ~ r 4 4 D 4 234 O Mv • × ~ • u u - u p × | t p⇡ (Y |X = x) / ⇡(Y )p(X = x|Y ) ö
  • 11. O XO 3KbO C O7 WSc O K ?:AD ! • ö • 8YK • p ö ~ u ~ • r • u ʼ p Y ⇡(Y ) p(X = x|Y ) x ⇡(Y ) p⇡ (Y |X = x) µ⇡ Y (X = x) µ⇡ Y (X = x) = C⇡ Y X C⇡ 1 XX (x) 3 C⇡ Y X = Ep⇡(Y,X)[ Y (Y ) ⌦ X (X)] p⇡ (Y, X) = ⇡(Y )p(X|Y ) 6= p(Y, X). p | ~p(Y, X)
  • 12. • r - ä u • EROY OW A YYP SX 2ZZOXNSa! p ~ w u | x • u | p ~ ~ p p - × ˆµ⇡ Y (X = x) = ˆC⇡ Y X ([ ˆC⇡ XX]2 + I) 1 ˆC⇡ XX (x) ˆC⇡ XX (B + I) 1 z (B2 + I) 1 Bz C⇡ Y X = C(Y X)Y C 1 Y Y ⇡Y , C⇡ XX = C(XX)Y C 1 Y Y ⇡Y
  • 13. u • L O _K SYX • u 8 lXO j NO O K :4 = ! • COQ3KbO IR O K =C )! u |s H6D |s ER O RY NSXQ COQ K ScK SYX COQ3KbO (
  • 14. u • u A YYP SX 2ZZOXNSa! • COQ3KbO IR O K =C )! • p min p(Y |X=x) KL(p(Y |X = x)||⇡(Y )) Z log p(X = x|Y )dp(Y |X = x) s.t. p(Y |X = x) 2 Pprob min p(Y |X=x),⇠ KL(p(Y |X = x)||⇡(Y )) Z log p(X = x|Y )dp(Y |X = x) + U(⇠) s.t. p(Y |X = x) 2 Pprob(⇠) u )
  • 15. | r u p p A YZ ER O RY NSXQ OQ K ScK SYX COQ3KbO ä "[µ] "[µ] "S[µ] ˆ"S[µ] ˆ"+ S [µ] ˆ" ,n[µ]
  • 16. | r ~ u A YZ ER O RY NSXQ OQ K ScK SYX COQ3KbO ä "[µ] "[µ] "S[µ] ˆ"S[µ] ˆ"+ S [µ] ˆ" ,n[µ] ~ ERW )! ERW ! ERW (! ERW !
  • 17. • p p • p p - – p p r u | p ä A YYP SX 2ZZOXNSa! • EROY OW ) p → ~ u p⇡ (Y |X = x) µ⇡ Y (X = x) 2 HY h 2 HY hh, µ⇡ Y (X = x)iHY = Ep⇡(Y |X=x)[h(Y )] "[µ] := sup ||h||HY 1 EX [(EY [h(Y )|X]] hh, µ(X)iHY )2 ] EX [·] p⇡ (X) EY [·|X] p⇡ (Y |X) "S[µ] = E(X,Y )⇠p⇡(X,Y )[|| (Y ) µ(X)||2 HY ] µ⇤ := argminµ2HK "[µ] = argminµ2HK "s[µ] pX a.s. EROY OW ) Y O NO KS SX 2ZZOXNSXa! (xi, yi)n i=1 "[µ] µ : X ! HY , "[µ] "S[µ]
  • 18. s • r | • - r → ö u r QS_OX → w pERW x RO O "S[µ] f(x, y) := || (y) µ(x)||2 HY 2 HX ⌦ HY "s[µ] = E(X,Y )[|| (Y ) µ(X)||2 HY ] = hf, µ(X,Y )iHX ⌦HY ˆµ(X,Y ) = ˆC(X,Y )Y ( ˆCY Y + I) 1 ˆ⇡Y ˆ"s[µ] = hˆµ(X,Y ), fiHX ⌦HY p⇡ (X, Y ) A YZ A YYP SX 2ZZOXNSa! ˆ⇡Y = P` i=1 ˜↵i (˜yi) "S[µ] - ˆ✏s[µ] = nX i=1 i|| (yi) µ(xi)||2 HY , where. = (GY + n I) 1 ˜GY ˜↵, (GY )ij = kY (yi, yj), ( ˜GY )ij = kY (yi, ˜yj)
  • 19. ER O RY NSXQ OQ K ScK SYX • pA YZ ~ p • EROY OW ( | p R O RY NSXQ OQ K ScK SYX p?S RSbKWK O K K GS_ ü A 5A ! u | p EROY OW ( MYX S OXMb i i + i := max(0, i) ER O RY NSXQ OQ K ScK SYX ✏+ s [µ]✏s[µ]r |ˆ✏+ s [µ] ✏s[µ]| p ! 0. .
  • 20. SXSWScSXQ KK • p • EROY OW - × u p ˆ✏+ s [µ] ˆ✏ ,n[µ] = nX i=1 + i || (yi) µ(xi)||2 HY + ||µ||2 HK A YZY S SYX A YYP SX 2ZZOXNSXa! ˆµ ,n = argminµ ˆ" ,n[µ] ˆµ ,n(x) = (KX + n⇤+ ) 1 K:x where. = ( (y1), ..., (yn)), (KX )ij = kX (xi, xj), ⇤+ = diag(1/ + 1 , ..., 1/ + n ), K:x = (kX (x, x1), ..., kX (x, xn))T . "S[ˆµ n,n] P ! min µ "S[µ]. u
  • 21. COQ3KbO • ER O RY NSXQ OQ K ScK SYX p ʼ • COQLKbO - r - A YZ p - s • –p ü m n – 11 • u ö t L := mX i=1 + i || (yi) µ(xi)||2 HY + ||µ||2 HK + nX i=m+1 ||µ(xi) (ti)||2 HY . A YZY S SYX ( ˆµreg(x) = (KX + ⇤+ ) 1 K:x where. = ( (y1), ... (yn)), (KX )ij = kX (xi, xj), ⇤+ = diag(1/ + 1 , ..., 1/ + m, 1/ , ..., 1/ ), K:x = (kX (x, x1), ..., kX (x, xn))T .
  • 22. 6aZO SWOX • 4KWO K ZY S SYX OMY_O b ! • r × p p r i × ! • r w x u u | p u × | p × ! {(xt, yt)}m t=1. ✓t+1 = ✓t + 0.2 + N (0, 4e 1), rt+1 = max(R2, min(R1, rt + N (0, 1)) xt+1 = rt+1 cos ✓t+1, yt+1 = rt+1 sin ✓t+1 {(xt, yt)}m t=1. ×
  • 23. • u u u • ö COQ3KbO ~ → ä | D ZO _S SYX NK K p COQ3KbO • u 3C!p 7! ER O RY NSXQ OQ K ScK SYX Z 3C! COQ3KbO D6 - COQ3KbO ä u | ö ~ p ~ ä ʼ ~ ( R1 = 0, R2 = 10 R1 = 5, R2 = 7 0 ä ~ ʼ ö It It+1 (xt, yt) (xt+1, yt+1) A ONSM SYX ö ! w x p((xt+1, yt+1)|I1, ..., IT ) p((xt+1, yt+1)|I1, ..., IT , It+1)
  • 24. COPO OXMO • 2 R 8 O YX K OX 3Y Q K N K O CK MR 3O XRK N DMRk YZP KXN 2 DWY K 2 O XO WO RYN PY RO Y KWZ O Z YL OW ZKQO (g , • =O DYXQ YXK RKX 9 KXQ 2 Oa DWY K KXN OXTS 7 WSc 9S LO ZKMO OWLONNSXQ YP MYXNS SYXK NS SL SYX S R KZZ SMK SYX Y NbXKWSMK b OW ZKQO . g. - 24 . • =O DYXQ OXTS 7 WSc KXN 2 R 8 O YX O XO OWLONNSXQ YP MYXNS SYXK NS SL SYX 2 XSPSON O XO P KWO Y PY XYXZK KWO SM SXPO OXMO SX Q KZRSMK WYNO ( ) .- ( • OXTS 7 WSc =O DYXQ KXN 2 R 8 O YX O XO LKbO d O ZKQO ,(,g ,) • HKXQ DYXQ X IR KXN HYXQ COX O XO 3KbO SKX :XPO OXMO S R AY O SY COQ K ScK SYX ZKQO ), ( ),, • H bK HY RS K K EYWYRK : K K 9S Y RS DK KNK KXN EK O RS HKWKNK 4 Y NYWKSX WK MRSXQ PY LKQ YP Y N NK K _SK O XO OWLONNSXQ YP K OX NS SL SYX ZKQO ) ) ( • H ?S RSbKWK 2LNO KW 3Y K SK 2 R 8 O YX KXN OXTS 7 WSc 9S LO ZKMO OWLONNSXQ YP ZYWNZ K GS_ Z OZ SX K GS_ )--, • D OPPOX 8 lXO j NO 8 b =O_O = MK 3K NK K O DKW AK O YX 2 R 8 O YX KXN K SWS SKXY AYX S 4YXNS SYXK WOKX OWLONNSXQ K OQ O Y ZKQO - (g -( • X IR ?SXQ 4ROX KXN 6 SM A GSXQ 3KbO SKX SXPO OXMO S R ZY O SY OQ K ScK SYX KXN KZZ SMK SYX Y SXPSXS O K OX _W ! ,..g -), ) )
  • 25. Kernel Bayesian Inference with Posterior Regularization (Appendix) Yuchi Matsuoka 2017 3 18 1 Preliminaries (X, BX ) pX HX k(·, ·) RKHS pX µX = EpX [φ(X)] ∈ HX φ(X) = k(X, ·). 1 f ∈ H EpX [f(X)] = EpX [⟨f, φ(X)⟩] = ⟨f, µX⟩ universal kernel RKHS H sup norm CX 2 (X, BX ), (Y, BY) φ(x), ψ(y) RKHS HX ,HY p X ×Y (X, Y ) CXY CXY = Ep[φ(X)⊗ψ(Y )] k((x1, y1), (x2, y2)) = kX (x1, x2)kY(y1, y2) RKHS HX ⊗ HY µ(XY ) Theorem 1 CXX µX ∈ R(CXX) g ∈ HY E[g(Y )|X = ·] ∈ HX µY = CY XC−1 XXµX, µY |X=x = E[ψ(Y )|X = x] = CY XC−1 XXφ(x). 2 µX pX {xi}N i=1 ˆµX = 1 N N i=1 φ(xi), CXY ˆCXY = 1 N N i=1 φ(xi) ⊗ ψ(yi) RKHS Op(N−1/2 ) 1 ,i.e. supx kX (x, x) < ∞ 1
  • 26. 3 Theorem 1 Theorem 1 ((Song et al., 2009, Equation 6)) mΠ mQy HX Π HY QY CXX mΠ ∈ R(CXX) g ∈ HY E[g(Y )|X = ·] ∈ HX mQy = CY XC−1 XXmΠ. C−1 XXmΠ CXX mΠ Proof CXXf = mΠ f ∈ HX g ∈ HY ⟨CY Xf, g⟩ = ⟨f, CXY g⟩ = ⟨f, CXXE[g(Y )|X = ·]⟩ = ⟨CXXf, E[g(Y )|X = ·]⟩ = ⟨mΠ, E[g(Y )|X = ·]⟩ = ⟨mQy , g⟩. ⟨mΠ, E[g(Y )|X = ·]⟩ = ⟨mQy , g⟩ ⟨f, mX⟩ = E[f(X)] (X, Y ) ∼ p(x, y). U ∼ π(u). (Z, W) ∼ q(x, y) = π(x)p(y|x), qY(y) = q(x, y)dx ⟨mQY , g⟩HY = E[g(W)] = g(w)qY(w)dw ⟨mΠ, E[g(Y )|X = ·]⟩ = EU [EY [g(Y )|U]] = X ( Y g(y)p(y|u)dx)π(u)du = Y ( X g(y)q(u, y)du)dx = Y g(y)qY(y)dy. mQy = CY XC−1 XXmΠ mΠ kX (·, x) E[kY(·, Y )|X = x] = CY XC−1 XXkX (·, x) ✷ 4 π(Y ) Y p(X = x|Y ) pπ (Y |X = x) π(Y ) x pπ (X, Y ) = π(Y )p(X|Y ) πY CXY µπ Y (X = x) pπ (Y |X = x) ∝ π(Y )p(X = x|Y ). 2
  • 27. p(X|Y ) X × Y p CXY Thm. 1 Cπ Y X pπ Cπ XX pπ X µπ Y (X = x) = Cπ Y XCπ −1 XX φ(x). Cπ Y X HY ⊗ HX µ(Y X) Thm 1. µ(Y X) = C(Y X)Y C−1 Y Y πY , where. C(Y X)Y := E[ψ(Y ) ⊗ φ(X) ⊗ ψ(Y )]. Cπ XX µ(XX) = C(XX)Y C−1 Y Y πY 5 Regularized Bayesian inference (RegBayes) Pprob minp(Y |X=x) KL(p(Y |X = x)||π(Y )) − log p(X = x|Y )dp(Y |X = x) s.t. p(Y |X = x) ∈ Pprob Proof KL(p(Y |X = x)||π(Y )) − log p(X = x|Y )dp(Y |X = x) = log p(Y |X = x) π(Y ) dp(Y |X = x) − log p(X = x|Y )dp(Y |X = x) = log p(Y |X = x) π(Y )p(X = x|Y ) dp(Y |X = x) = log p(Y |X = x) π(Y )p(X=x|Y ) pπ(X=x) dp(Y |X = x) + log pπ (X = x)dp(Y |X = x) = KL p(Y |X = x)|| π(Y )p(X = x|Y ) pπ(X = x) + log pπ (X = x). arg minp(Y |X=x) KL(p(Y |X = x)||π(Y )) − log p(X = x|Y )dp(Y |X = x) = π(Y )p(X=x|Y ) pπ(X=x) . ✷ 3
  • 28. 6 Vector-valued regression (RKHS ) E(f) := n i=1 ||yj − f(xj)||2 HY + λ||f||2 HK , where. yj ∈ HY, f : X → HY. f RKHS HY f RKHS HK 6.0.1 Vector-values regression and RKHSs {(xi, vi)}i≤m X × V i.i.d X (V, ⟨·, ·⟩V E(X,V )[||f(X) − V ||2 V] f : X → V vector-valued regression problem [Definition] h : X → V (H, ⟨·, ·⟩Γ) x ∈ X, v ∈ V h → ⟨v, h(x)⟩V RKHS HΓ Riesz 2 x ∈ X, v ∈ V V HΓ Γx (Γxv ∈ HΓ ) h ∈ HΓ ⟨v, h(x)⟩V = ⟨h, Γxv⟩Γ HΓ RKHS Γx L(V) V V Γ(x, x′ ) ∈ L(V) Γ(x, x′ )v ∈ (Γx′ v)(x) ∈ V 2 (Riesz ) H H R H∗ H φ ∈ H∗ yφ ∈ H x ∈ H φ(x) = ⟨x, yφ⟩ 4
  • 29. [Proposition 2.1] Γ : X × X → L(V) (1) Γ(x, x′ ) = Γ(x′ , x)∗ . (2) n ∈ N, {(xi, vi)}i≤n ⊂ X × V i,j≤n⟨vi, Γ(xi, xj)vj⟩V ≥ 0. E(X,V )[||f(X) − V ||2 V] n i=1 ||vi − f(xi)||2 V f RKHS HΓ HΓ ˆϵλ(f) := n i=1 ||vi − f(xi)||2 V + λ||f||2 Γ. Γxi Theorem 2.2.(Adapted from G. Lever and S. Gr¨unew¨alder+ 2012) f∗ ˆϵλ HΓ f∗ = n i=1 Γxi ci {ci}, ci ∈ V i≤n (Γ(xj, xi) + λδji)ci = vj, 1 ≤ j ≤ n. ˆϵλ(f) 5
  • 30. 7 ϵs[µ] ε[µ] Proof ε[µ] := sup ||h||HY ≤1 EX[(EY [h(Y )|X]] − ⟨h, µ(X)⟩HY )2 ] = sup ||h||HY ≤1 EX[(EY [⟨h, ψ(Y )⟩HY |X] − ⟨, h, µ(X)⟩HY )2 ] ≤ sup ||h||HY ≤1 EX,Y [⟨h, ψ(Y ) − µ(X)⟩2 HY ] ≤ sup ||h||HY ≤1 ||h||2 HY EX,Y [||ψ(Y ) − µ(X)||2 HY ] = EX,Y [||ψ(Y ) − µ(X)||2 HY ] = ϵs[µ]. ✷ 8 Proposition 1 Proposition 1 (X, Y ) X × Y Y prior π(Y ) p(X|Y ) HX kX φ(x) RKHS HY kY ψ(y) RKHS φ(x, y) HX ⊗ HY ˆπY = ℓ i=1 ˜αiψ(˜yi) πY {(xi, yi)}n i=1 p(X|Y ) f(x, y) = ||ψ(y) − µ(x)||2 HY f ∈ HX ⊗ HY ˆϵs[µ] = n i=1 βi||ψ(yi) − µ(xi)||2 HY , β = (β1, ..., βn)T β = (GY + nλI)−1 ˜GY ˜α (GY )ij = kY(yi, yj), ( ˜GY )ij = kY(yi, ˜yj), ˜α = (˜α1, ...˜αℓ)T . Proof K. Fukumizu 2016. Kernel Bayes Rule. Proposition 4 6
  • 31. ΦX,Y = (φ(x1, y1), ..., φ(xn, yn)) ˆµ(X,Y ) = ΦX,Y β = ΦX,Y (GY + nλI)−1 ˜GY ˜α HX ⊗ HY ϵs[µ] = ⟨ˆµ(X,Y ), f⟩HX ⊗HY = ⟨ΦX,Y (GY + nλI)−1 ˜GY ˜α, f⟩HX ⊗HY = ⟨ΦX,Y β, f⟩HX ⊗HY = n i=1 βi||ψ(yi) − µ(xi)||2 HY . ˆµ(X,Y ) = ˆC(X,Y )Y ( ˆCY Y + λI)−1 ˆπY h = ( ˆCY Y + λI)−1 ˆπY h = n i=1 aiψ(yi) + h⊥ h⊥ h span(ψ(y1), ..., ψ(yn)} ( ˆCY Y + λI)h = ˆπY 1 n i,j≤n aikY(yi, yj)ψ(yj) + λ i≤n aiψ(yi) + h⊥ = i≤ℓ ˜αiψ(˜yi) ψ(yk)|n k=1 1 n G2 Y a + λGY a = ˜GY ˜α ⇔ 1 n (GY + nλI)GY a = ˜GY ˜α ⇔ 1 n GY a = (GY + nλI)−1 ˜GY ˜α ˆµ(X,Y ) ˆµ(X,Y ) = 1 n i≤n φ(xi, yi) ⊗ ψ(yi) h = 1 n ΦX,Y GY a = ΦX,Y (GY + nλI)−1 ˜GY ˜α ✷ 7
  • 32. 9 Proposition 2 Proposition 2 i β+ i ̸= 0 µ ∈ HK HK K(xi, xj) = kX (xi, xj)I I : HK → HK ˆµλ,n(x) = Ψ(KX + λnΛ+ )−1 K:x Ψ = (ψ(y1), ..., ψ(yn)) (KX)ij = kX (xi, xj) Λ+ = diag(1/β+ 1 , ..., 1/β+ n ) K:x = (kX (x, x1), ..., kX (x, xn))T λn Proof β+ i = 0 (xi, yi) i β+ i ̸= 0 µ = µ0 + g µ0 = n i=1 Kxi ci ˆϵλ,n[µ] ˆϵλ,n[µ] = n i=1 β+ i ||ψ(yi) − µ(xi)||2 HY + λn||µ||2 HK = n i=1 β+ i ||ψ(yi) − (µ0(xi) + g(xi))||2 HY + λn||µ0 + g||2 HK = n i=1 β+ i ||ψ(yi) − µ0(xi)||2 + λn||µ0||2 + n i=1 β+ i ||g(xi)||2 + λn||g||2 + 2λn⟨µ0, g⟩ − 2 n i=1 β+ i ⟨g(xi), ψ(yi) − µ0(xi)⟩. i ψ(yi) − n j=1 kX (xi, xj)cj = λn β+ i ci ˆϵλ,n[µ] λn⟨µ0, g⟩ − n i=1 β+ i ⟨g(xi), ψ(yi) − µ0(xi)⟩ = 0 ˆϵλ,n[µ] = ˆϵλ,n[µ0] + n i=1 β+ i ||g(xi)||2 + λn||g||2 ≥ ˆϵλ,n[µ0] ψ(yi) − n j=1 kX (xi, xj)cj = λn β+ i ci ci µ0 = n i=1 Kxi ci (KX + λnΛ+ )c = Ψ µ0(x) = n i=1 kX (x, xi)ci = Ψ(KX + λnΛ+ )−1 K:x ✷ 8