3.
• Kernel Bayes' Rule
• Appendix
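4. RKHS
• Gaussian kernel (example): $k(x, y) = \exp\left(-\|x - y\|^2 / (2\sigma^2)\right)$
• $\mathcal{H}_X$: the RKHS associated with $k$
• Feature map: $\phi(X_i) = k(\cdot, X_i)$, with $\phi : \mathcal{X} \to \mathcal{H}_X$
• Reproducing property: $f(x) = \langle f, k(\cdot, x) \rangle_{\mathcal{H}_X}$, $\forall f \in \mathcal{H}_X$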
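5. Kernel mean embedding (RKHS)
• A single point $x$ is represented by $k(\cdot, x)$; for the point mass $\delta_x$: $\int k(\cdot, y)\, d\delta_x(y) = k(\cdot, x)$
• With $X \sim p_X$ and $k(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle_{\mathcal{H}_X}$, the distribution $p_X$ is represented by
\[ \mu_X := \int k(\cdot, x)\, dp_X(x) = \int \phi(x)\, dp_X(x) \]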
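6. • Example: with $k(x, x') = e^{x x'}$ the embedding is the moment-generating function of $P$, so it carries all moments:
\begin{align*}
\mu_X(t) = E_{X \sim P}[k(t, X)] &= \int e^{tx}\, dP(x) \\
&= \int \left( 1 + tx + \frac{t^2 x^2}{2} + \frac{t^3 x^3}{3!} + \cdots \right) dP(x) \\
&= 1 + t\, E_{X \sim P}[X] + \frac{t^2}{2} E_{X \sim P}[X^2] + \frac{t^3}{3!} E_{X \sim P}[X^3] + \cdots
\end{align*}
• Empirical estimate from samples $x_1, \dots, x_n$:
\[ \hat{\mu}_X := \frac{1}{n} \sum_{i=1}^{n} k(\cdot, x_i) \]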
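The moment expansion above can be checked numerically. The following is a minimal sketch, assuming NumPy and synthetic Gaussian samples; the function names and the sample distribution are illustrative choices, not part of the slides. It evaluates the empirical embedding for the kernel $k(x, x') = e^{xx'}$ at a point $t$ and compares it with the truncated series built from empirical moments.

```python
import math
import numpy as np

def empirical_mean_embedding(t, samples):
    """Evaluate mu_hat_X(t) = (1/n) * sum_i k(t, x_i) for k(x, x') = exp(x * x')."""
    samples = np.asarray(samples, dtype=float)
    return float(np.mean(np.exp(t * samples)))

def truncated_moment_series(t, samples, order=3):
    """Partial sum 1 + t*m_1 + t^2/2! * m_2 + ... using empirical moments m_k."""
    samples = np.asarray(samples, dtype=float)
    return float(sum((t ** k) / math.factorial(k) * np.mean(samples ** k)
                     for k in range(order + 1)))

# Synthetic data (an assumption made only for this illustration).
rng = np.random.default_rng(0)
xs = rng.normal(loc=0.5, scale=1.0, size=1000)

t = 0.1
embedding_value = empirical_mean_embedding(t, xs)
series_value = truncated_moment_series(t, xs, order=3)
print(embedding_value, series_value)
```

For small $|t|$ the two values agree closely, because the neglected terms start at order $t^4$.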
7. • Two-sample test (Gretton et al., NIPS)
- Samples $X_1, \dots, X_\ell \sim P$ and $Y_1, \dots, Y_n \sim Q$
- Hypotheses: $H_0 : P = Q$ vs $H_1 : P \neq Q$
- MMD (maximum mean discrepancy): the RKHS distance between the embeddings $\mu_P$ and $\mu_Q$,
\[ M^2(P, Q) := \|\mu_P - \mu_Q\|^2_{\mathcal{H}_k} \]
• Yoshikawa et al. (NIPS)
- Data: Wikipedia
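As an illustration of the quantity $\|\mu_P - \mu_Q\|^2_{\mathcal{H}_k}$, here is a minimal sketch of its plug-in (biased) estimate obtained by replacing both embeddings with their empirical versions. It assumes NumPy, a Gaussian kernel with bandwidth `sigma`, and synthetic samples; the helper names are hypothetical, and this is not the test statistic or threshold procedure of any particular paper.

```python
import numpy as np

def gaussian_gram(a, b, sigma=1.0):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq_dists = (np.sum(a ** 2, axis=1)[:, None]
                + np.sum(b ** 2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def mmd2_biased(x, y, sigma=1.0):
    """Plug-in estimate of ||mu_P - mu_Q||^2 using empirical mean embeddings."""
    return float(gaussian_gram(x, x, sigma).mean()
                 + gaussian_gram(y, y, sigma).mean()
                 - 2.0 * gaussian_gram(x, y, sigma).mean())

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 1))          # sample from P
y_same = rng.normal(0.0, 1.0, size=(200, 1))     # sample from Q = P
y_shifted = rng.normal(2.0, 1.0, size=(200, 1))  # sample from Q != P

mmd_same = mmd2_biased(x, y_same)
mmd_shifted = mmd2_biased(x, y_shifted)
print(mmd_same, mmd_shifted)
```

The estimate is close to zero when both samples come from the same distribution and clearly larger when the means differ.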
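8. • Cross-covariance operator $C_{YX} : \mathcal{H}_X \to \mathcal{H}_Y$, defined by
\[ \langle g, C_{YX} f \rangle_{\mathcal{H}_Y} = E[f(X) g(Y)] \quad \forall f \in \mathcal{H}_X,\ g \in \mathcal{H}_Y \]
• Product kernel on pairs: $k((x_1, y_1), (x_2, y_2)) = k_X(x_1, x_2)\, k_Y(y_1, y_2)$, whose RKHS is $\mathcal{H}_X \otimes \mathcal{H}_Y$
• For $X, Y \sim p$, with $\mathcal{H}_X$ and feature map $\phi(x)$, $\mathcal{H}_Y$ and feature map $\psi(y)$:
\[ C_{YX} = E[\psi(Y) \otimes \phi(X)] \in \mathcal{H}_Y \otimes \mathcal{H}_X \]
(Figure from Song et al.)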
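9. • Conditional embedding (figures from Song et al.):
\[ \mu_{Y|X=x} = E[\psi(Y) \mid X = x] \]
• In terms of covariance operators:
\[ \mu_{Y|X=x} = C_{YX} C_{XX}^{-1} \phi(x) \]
• Regularized empirical estimate:
\[ \hat{\mu}_{Y|X=x} = \hat{C}_{YX} (\hat{C}_{XX} + \lambda I)^{-1} \phi(x) \]
• Conditional expectations become inner products:
\[ E[g(Y) \mid X = x] \approx \langle g, \hat{\mu}_{Y|X=x} \rangle_{\mathcal{H}_Y}, \quad \forall g \in \mathcal{H}_Y \]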
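In finite samples the regularized estimate reduces to Gram-matrix algebra: $\hat{\mu}_{Y|X=x} = \sum_i w_i(x)\, \psi(y_i)$ with $w(x) = (K_X + n\lambda I)^{-1} K_{:x}$, so $E[g(Y) \mid X = x] \approx \sum_i w_i(x)\, g(y_i)$. The sketch below assumes NumPy, a Gaussian kernel on $X$, and synthetic data; the helper names are hypothetical, and $g(y) = y$ is used only as a convenient test function (strictly speaking it need not belong to $\mathcal{H}_Y$).

```python
import numpy as np

def gaussian_gram(a, b, sigma=1.0):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq_dists = (np.sum(a ** 2, axis=1)[:, None]
                + np.sum(b ** 2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def cme_weights(x_train, x_query, lam=1e-3, sigma=1.0):
    """Columns are w(x) = (K_X + n*lam*I)^{-1} k_X(x) for each query point x."""
    n = len(x_train)
    K = gaussian_gram(x_train, x_train, sigma)
    k_query = gaussian_gram(x_train, x_query, sigma)  # shape (n, n_queries)
    return np.linalg.solve(K + n * lam * np.eye(n), k_query)

# Synthetic regression data (assumption for illustration): Y = sin(X) + noise.
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(300, 1))
y_train = np.sin(x_train) + 0.1 * rng.normal(size=(300, 1))

x_query = np.array([[0.5], [1.5]])
w = cme_weights(x_train, x_query, lam=1e-3, sigma=0.5)
cond_mean = w.T @ y_train  # estimate of E[Y | X = x] at each query point
print(cond_mean.ravel(), np.sin(x_query).ravel())
```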
10. Bayes' rule
• Posterior from prior and likelihood:
\[ p^\pi(Y \mid X = x) \propto \pi(Y)\, p(X = x \mid Y) \]
• Usual computational approaches: MCMC, SMC, ABC, etc.
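11. Kernel Bayes' Rule [Fukumizu et al., NIPS]
• Goal: with $Y \sim \pi(Y)$ and likelihood $p(X = x \mid Y)$, given an observation $x$ and the prior $\pi(Y)$, compute the embedding $\mu^\pi_Y(X = x)$ of the posterior $p^\pi(Y \mid X = x)$
• Posterior embedding:
\[ \mu^\pi_Y(X = x) = C^\pi_{YX} \left( C^\pi_{XX} \right)^{-1} \phi(x) \]
• where $C^\pi_{YX} = E_{p^\pi(Y, X)}[\psi(Y) \otimes \phi(X)]$
• Note that $p^\pi(Y, X) = \pi(Y)\, p(X \mid Y) \neq p(Y, X)$, while the available samples are drawn from $p(Y, X)$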
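12. • Empirical estimate of the posterior embedding:
\[ \hat{\mu}^\pi_Y(X = x) = \hat{C}^\pi_{YX} \left( [\hat{C}^\pi_{XX}]^2 + \lambda I \right)^{-1} \hat{C}^\pi_{XX}\, \phi(x) \]
- $\hat{C}^\pi_{XX}$ is not guaranteed to be positive definite, so the usual regularized inverse $(B + \lambda I)^{-1} z$ is replaced by $(B^2 + \lambda I)^{-1} B z$
• The operators are obtained from the prior embedding $\pi_Y$:
\[ C^\pi_{YX} = C_{(YX)Y}\, C_{YY}^{-1}\, \pi_Y, \qquad C^\pi_{XX} = C_{(XX)Y}\, C_{YY}^{-1}\, \pi_Y \]
• Theorem (proof in Appendix)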
13. • Observation
- [Grünewälder et al., ICML]
• RegBayes [Zhu et al., JMLR]
• Thresholding Regularization → RegBayes
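14. • Bayes' rule as an optimization problem (proof in Appendix):
\[ \min_{p(Y|X=x)}\ \mathrm{KL}\big(p(Y \mid X = x)\,\|\,\pi(Y)\big) - \int \log p(X = x \mid Y)\, dp(Y \mid X = x) \quad \text{s.t. } p(Y \mid X = x) \in \mathcal{P}_{\mathrm{prob}} \]
• RegBayes [Zhu et al., JMLR]: add slack variables $\xi$ with penalty $U(\xi)$ and constrain the posterior:
\[ \min_{p(Y|X=x),\, \xi}\ \mathrm{KL}\big(p(Y \mid X = x)\,\|\,\pi(Y)\big) - \int \log p(X = x \mid Y)\, dp(Y \mid X = x) + U(\xi) \quad \text{s.t. } p(Y \mid X = x) \in \mathcal{P}_{\mathrm{prob}}(\xi) \]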
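15. [Diagram: chain of objectives $\mathcal{E}[\mu]$, $\mathcal{E}_s[\mu]$, $\hat{\mathcal{E}}_s[\mu]$, $\hat{\mathcal{E}}^+_s[\mu]$, $\hat{\mathcal{E}}_{\lambda,n}[\mu]$, with steps labelled Prop., Thresholding Regularization, and RegBayes]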
16. [Same diagram as slide 15, with the theorems (Thm.) that justify each step marked on the arrows]
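17. • The posterior $p^\pi(Y \mid X = x)$ has embedding $\mu^\pi_Y(X = x) \in \mathcal{H}_Y$: for every $h \in \mathcal{H}_Y$,
\[ \langle h, \mu^\pi_Y(X = x) \rangle_{\mathcal{H}_Y} = E_{p^\pi(Y|X=x)}[h(Y)] \]
• Loss for a candidate $\mu : \mathcal{X} \to \mathcal{H}_Y$ (proof in Appendix):
\[ \mathcal{E}[\mu] := \sup_{\|h\|_{\mathcal{H}_Y} \le 1} E_X\left[ \left( E_Y[h(Y) \mid X] - \langle h, \mu(X) \rangle_{\mathcal{H}_Y} \right)^2 \right] \]
where $E_X[\cdot]$ is taken over $p^\pi(X)$ and $E_Y[\cdot \mid X]$ over $p^\pi(Y \mid X)$.
- Surrogate loss:
\[ \mathcal{E}_s[\mu] = E_{(X,Y) \sim p^\pi(X,Y)}\left[ \|\psi(Y) - \mu(X)\|^2_{\mathcal{H}_Y} \right] \]
• Theorem (more details in Appendix):
\[ \mu^* := \arg\min_{\mu \in \mathcal{H}_K} \mathcal{E}[\mu] = \arg\min_{\mu \in \mathcal{H}_K} \mathcal{E}_s[\mu] \quad p_X\text{-a.s.} \]
→ Given samples $(x_i, y_i)_{i=1}^n$, the loss $\mathcal{E}[\mu]$ of $\mu : \mathcal{X} \to \mathcal{H}_Y$ can be controlled through the surrogate, since
\[ \mathcal{E}[\mu] \le \mathcal{E}_s[\mu] \]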
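18. • Estimating $\mathcal{E}_s[\mu]$
- Given $f(x, y) := \|\psi(y) - \mu(x)\|^2_{\mathcal{H}_Y} \in \mathcal{H}_X \otimes \mathcal{H}_Y$,
\[ \mathcal{E}_s[\mu] = E_{(X,Y)}\left[ \|\psi(Y) - \mu(X)\|^2_{\mathcal{H}_Y} \right] = \langle f, \mu_{(X,Y)} \rangle_{\mathcal{H}_X \otimes \mathcal{H}_Y} \]
where $\mu_{(X,Y)}$ is the embedding of $p^\pi(X, Y)$.
→ Plug in the kernel estimate of the joint embedding:
\[ \hat{\mu}_{(X,Y)} = \hat{C}_{(X,Y)Y} (\hat{C}_{YY} + \lambda I)^{-1} \hat{\pi}_Y, \qquad \hat{\mathcal{E}}_s[\mu] = \langle \hat{\mu}_{(X,Y)}, f \rangle_{\mathcal{H}_X \otimes \mathcal{H}_Y} \]
• Prop. (proof in Appendix): with the prior estimate $\hat{\pi}_Y = \sum_{i=1}^{\ell} \tilde{\alpha}_i \psi(\tilde{y}_i)$,
\[ \hat{\mathcal{E}}_s[\mu] = \sum_{i=1}^{n} \beta_i \|\psi(y_i) - \mu(x_i)\|^2_{\mathcal{H}_Y}, \]
where $\beta = (G_Y + n\lambda I)^{-1} \tilde{G}_Y \tilde{\alpha}$, $(G_Y)_{ij} = k_Y(y_i, y_j)$, $(\tilde{G}_Y)_{ij} = k_Y(y_i, \tilde{y}_j)$.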
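The weight vector $\beta$ only requires two Gram matrices and one linear solve. A minimal sketch follows, assuming NumPy, a Gaussian kernel $k_Y$, and synthetic data; the function name `kbr_beta`, the prior sample `y_prior`, and the uniform prior weights are illustrative assumptions. The last line counts negative entries of $\beta$, which the formula does not rule out.

```python
import numpy as np

def gaussian_gram(a, b, sigma=1.0):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq_dists = (np.sum(a ** 2, axis=1)[:, None]
                + np.sum(b ** 2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def kbr_beta(y, y_prior, alpha_prior, lam=1e-2, sigma=1.0):
    """beta = (G_Y + n*lam*I)^{-1} G~_Y alpha~, with G~_Y[i, j] = k_Y(y_i, y~_j)."""
    n = len(y)
    G = gaussian_gram(y, y, sigma)
    G_tilde = gaussian_gram(y, y_prior, sigma)  # shape (n, l)
    return np.linalg.solve(G + n * lam * np.eye(n), G_tilde @ alpha_prior)

# Synthetic setup (assumption): training y's and a weighted prior sample.
rng = np.random.default_rng(0)
y = rng.normal(0.0, 1.0, size=(50, 1))
y_prior = rng.normal(1.0, 1.0, size=(20, 1))
alpha_prior = np.full(20, 1.0 / 20.0)

beta = kbr_beta(y, y_prior, alpha_prior, lam=1e-2)
num_negative = int(np.sum(beta < 0))
print(beta[:5], num_negative)
```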
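19. Thresholding Regularization
• The weights $\beta_i$ from the Prop. are not guaranteed to be non-negative
• Thresholding regularization (Nishiyama et al., arXiv; introduced for POMDPs) replaces each weight by
\[ \beta_i^+ := \max(0, \beta_i) \]
• Theorem (consistency): the thresholded estimator $\hat{\mathcal{E}}^+_s[\mu]$ of $\mathcal{E}_s[\mu]$ satisfies
\[ \left| \hat{\mathcal{E}}^+_s[\mu] - \mathcal{E}_s[\mu] \right| \xrightarrow{p} 0. \]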
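20. Minimizing $\hat{\mathcal{E}}^+_s[\mu]$
• Regularized objective built from $\hat{\mathcal{E}}^+_s[\mu]$:
\[ \hat{\mathcal{E}}_{\lambda,n}[\mu] = \sum_{i=1}^{n} \beta_i^+ \|\psi(y_i) - \mu(x_i)\|^2_{\mathcal{H}_Y} + \lambda_n \|\mu\|^2_{\mathcal{H}_K} \]
• Proposition (proof in Appendix): the minimizer $\hat{\mu}_{\lambda,n} = \arg\min_{\mu} \hat{\mathcal{E}}_{\lambda,n}[\mu]$ is
\[ \hat{\mu}_{\lambda,n}(x) = \Psi (K_X + \lambda_n \Lambda^+)^{-1} K_{:x}, \]
where $\Psi = (\psi(y_1), \dots, \psi(y_n))$, $(K_X)_{ij} = k_X(x_i, x_j)$, $\Lambda^+ = \mathrm{diag}(1/\beta_1^+, \dots, 1/\beta_n^+)$, $K_{:x} = (k_X(x, x_1), \dots, k_X(x, x_n))^T$.
• Theorem:
\[ \mathcal{E}_s[\hat{\mu}_{\lambda_n,n}] \xrightarrow{P} \min_{\mu} \mathcal{E}_s[\mu]. \]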
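21. RegBayes
• Thresholding regularization turns the posterior embedding into the solution of an optimization problem
• RegBayes
- Extra regularization terms can be added to that objective
• Example: $m$ weighted training points plus supervision points $(x_i, t_i)$, $i = m+1, \dots, n$:
\[ L := \sum_{i=1}^{m} \beta_i^+ \|\psi(y_i) - \mu(x_i)\|^2_{\mathcal{H}_Y} + \lambda \|\mu\|^2_{\mathcal{H}_K} + \delta \sum_{i=m+1}^{n} \|\mu(x_i) - \psi(t_i)\|^2_{\mathcal{H}_Y}. \]
Proposition:
\[ \hat{\mu}_{\mathrm{reg}}(x) = \Psi (K_X + \lambda \Lambda^+)^{-1} K_{:x}, \]
where $\Psi = (\psi(y_1), \dots, \psi(y_n))$, $(K_X)_{ij} = k_X(x_i, x_j)$, $\Lambda^+ = \mathrm{diag}(1/\beta_1^+, \dots, 1/\beta_m^+, 1/\delta, \dots, 1/\delta)$, $K_{:x} = (k_X(x, x_1), \dots, k_X(x, x_n))^T$.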
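Both closed forms, $\Psi(K_X + \lambda\Lambda^+)^{-1}K_{:x}$ with thresholded weights only and with additional supervision weights, amount to a weighted kernel ridge solve. The sketch below is one possible reading, assuming NumPy; the function name, the toy data, and the values of `lam` and `delta` are hypothetical. Points with zero weight do not contribute to the objective, so they are dropped before inverting (their $1/\beta_i^+$ entry would be infinite) and receive coefficient zero.

```python
import numpy as np

def posterior_embedding_weights(K_X, k_x, point_weights, lam):
    """Return w such that mu_hat(x) = sum_i w_i * psi(y_i).

    w = (K_X + lam * Lambda^+)^{-1} K_{:x} restricted to points with positive
    weight, where Lambda^+ = diag(1 / point_weights).
    """
    point_weights = np.asarray(point_weights, dtype=float)
    active = point_weights > 0
    K_active = K_X[np.ix_(active, active)]
    Lambda_plus = np.diag(1.0 / point_weights[active])
    w = np.zeros(len(point_weights))
    w[active] = np.linalg.solve(K_active + lam * Lambda_plus, k_x[active])
    return w

# Toy data (assumption for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 1))
K_X = np.exp(-(x - x.T) ** 2 / 2.0)          # Gaussian Gram matrix on X
k_x = np.exp(-(x[:, 0] - 0.3) ** 2 / 2.0)    # K_{:x} for the query x = 0.3

beta = np.array([0.4, -0.1, 0.3, 0.2, -0.05, 0.25])
beta_plus = np.maximum(0.0, beta)            # thresholding regularization
w_thresholded = posterior_embedding_weights(K_X, k_x, beta_plus, lam=0.1)

# RegBayes-style variant: first m points keep beta^+, the rest get weight delta.
m, delta = 4, 2.0
reg_weights = np.concatenate([beta_plus[:m], np.full(len(beta) - m, delta)])
w_reg = posterior_embedding_weights(K_X, k_x, reg_weights, lam=0.1)
print(w_thresholded, w_reg)
```

With all weights equal to one the routine reduces to ordinary kernel ridge regression weights $(K_X + \lambda I)^{-1} K_{:x}$.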
22. Experiment
• Camera position recovery
• The camera positions $\{(x_t, y_t)\}_{t=1}^{m}$ follow the dynamics
\[ \theta_{t+1} = \theta_t + 0.2 + \mathcal{N}(0, 4\mathrm{e}{-1}), \qquad r_{t+1} = \max(R_2, \min(R_1, r_t + \mathcal{N}(0, 1))) \]
\[ x_{t+1} = r_{t+1} \cos\theta_{t+1}, \qquad y_{t+1} = r_{t+1} \sin\theta_{t+1} \]
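The dynamics above can be simulated in a few lines. This is a rough sketch under stated assumptions, not the experimental code: it assumes NumPy, treats the second argument of $\mathcal{N}(\cdot, \cdot)$ as a variance, and clips the radius to an interval `[r_min, r_max]`. The clipping is an interpretation: read literally, $\max(R_2, \min(R_1, \cdot))$ pins the radius to $R_2$ whenever $R_2 \ge R_1$, so the bounds are exposed here as hypothetical parameters.

```python
import numpy as np

def simulate_camera_path(m, r_min, r_max, theta0=0.0, r0=None,
                         theta_noise_var=0.4, seed=0):
    """Simulate m positions (x_t, y_t) on a noisy circular path.

    theta_{t+1} = theta_t + 0.2 + N(0, theta_noise_var)
    r_{t+1}     = clip(r_t + N(0, 1), r_min, r_max)   # assumed reading
    """
    rng = np.random.default_rng(seed)
    theta = theta0
    r = 0.5 * (r_min + r_max) if r0 is None else r0
    path = np.empty((m, 2))
    for t in range(m):
        theta = theta + 0.2 + rng.normal(0.0, np.sqrt(theta_noise_var))
        r = float(np.clip(r + rng.normal(0.0, 1.0), r_min, r_max))
        path[t] = (r * np.cos(theta), r * np.sin(theta))
    return path

# Example run with radius bounds 5 and 7 (one of the settings on the slides).
path = simulate_camera_path(m=100, r_min=5.0, r_max=7.0)
radii = np.hypot(path[:, 0], path[:, 1])
print(path[:3], radii.min(), radii.max())
```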
23. • Settings: $R_1 = 0, R_2 = 10$ and $R_1 = 5, R_2 = 7$
• Observations: images $I_t$, $I_{t+1}$ taken at positions $(x_t, y_t)$, $(x_{t+1}, y_{t+1})$
• Prediction: $p((x_{t+1}, y_{t+1}) \mid I_1, \dots, I_T)$, then $p((x_{t+1}, y_{t+1}) \mid I_1, \dots, I_T, I_{t+1})$ once $I_{t+1}$ is observed
• Compared methods (metric: MSE)
- KBR
- Thresholding Regularization (pKBR)
- RegBayes, with supervision data
24. Reference
• Arthur Gretton, Karsten Borgwardt, Malte Rasch, Bernhard Schölkopf, and A. Smola. A kernel method for the two-sample problem.
• Le Song, Jonathan Huang, Alex Smola, and Kenji Fukumizu. Hilbert space embeddings of conditional distributions with applications to dynamical systems. ACM.
• Le Song, Kenji Fukumizu, and Arthur Gretton. Kernel embeddings of conditional distributions: A unified kernel framework for nonparametric inference in graphical models.
• Kenji Fukumizu, Le Song, and Arthur Gretton. Kernel Bayes' rule.
• Yang Song, Jun Zhu, and Yong Ren. Kernel Bayesian Inference with Posterior Regularization.
• Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada, and Takeshi Yamada. Cross-domain matching for bag-of-words data via kernel embeddings of latent distributions.
• Y. Nishiyama, Abdeslam Boularias, Arthur Gretton, and Kenji Fukumizu. Hilbert space embeddings of POMDPs. arXiv preprint.
• Steffen Grünewälder, Guy Lever, Luca Baldassarre, Sam Patterson, Arthur Gretton, and Massimiliano Pontil. Conditional mean embeddings as regressors.
• Jun Zhu, Ning Chen, and Eric P. Xing. Bayesian inference with posterior regularization and applications to infinite latent SVMs.
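28. 6 Vector-valued regression
Vector-valued regression (with RKHS-valued outputs) minimizes
\[ E(f) := \sum_{j=1}^{n} \|y_j - f(x_j)\|^2_{\mathcal{H}_Y} + \lambda \|f\|^2_{\mathcal{H}_K}, \]
where $y_j \in \mathcal{H}_Y$ and $f : \mathcal{X} \to \mathcal{H}_Y$. Here $f$ takes values in the RKHS $\mathcal{H}_Y$ and is itself an element of the RKHS $\mathcal{H}_K$.
6.0.1 Vector-valued regression and RKHSs
Let $\{(x_i, v_i)\}_{i \le m} \subset \mathcal{X} \times V$ be i.i.d. samples, where $\mathcal{X}$ is a set and $(V, \langle \cdot, \cdot \rangle_V)$ is a Hilbert space. Finding $f : \mathcal{X} \to V$ that minimizes
\[ E_{(X,V)}\left[ \|f(X) - V\|^2_V \right] \]
is called the vector-valued regression problem.
[Definition] A Hilbert space $(H, \langle \cdot, \cdot \rangle_\Gamma)$ of functions $h : \mathcal{X} \to V$ is an RKHS, written $\mathcal{H}_\Gamma$, if for all $x \in \mathcal{X}$, $v \in V$ the linear functional $h \mapsto \langle v, h(x) \rangle_V$ is continuous.
By the Riesz representation theorem (footnote 2), for each $x \in \mathcal{X}$ and $v \in V$ there is a linear operator $\Gamma_x$ from $V$ to $\mathcal{H}_\Gamma$ (so $\Gamma_x v \in \mathcal{H}_\Gamma$) such that for all $h \in \mathcal{H}_\Gamma$
\[ \langle v, h(x) \rangle_V = \langle h, \Gamma_x v \rangle_\Gamma. \]
This is the reproducing property of the RKHS $\mathcal{H}_\Gamma$.
Let $\mathcal{L}(V)$ denote the bounded linear operators from $V$ to $V$. The kernel $\Gamma(x, x') \in \mathcal{L}(V)$ is defined by
\[ \Gamma(x, x') v = (\Gamma_{x'} v)(x) \in V. \]
Footnote 2 (Riesz representation theorem): let $H$ be a Hilbert space over $\mathbb{R}$ and $H^*$ its dual. For every $\varphi \in H^*$ there is a unique $y_\varphi \in H$ such that $\varphi(x) = \langle x, y_\varphi \rangle$ for all $x \in H$.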
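29. [Proposition 2.1] A function $\Gamma : \mathcal{X} \times \mathcal{X} \to \mathcal{L}(V)$ is a kernel if and only if
(1) $\Gamma(x, x') = \Gamma(x', x)^*$.
(2) For all $n \in \mathbb{N}$ and $\{(x_i, v_i)\}_{i \le n} \subset \mathcal{X} \times V$: $\sum_{i,j \le n} \langle v_i, \Gamma(x_i, x_j) v_j \rangle_V \ge 0$.
In practice $E_{(X,V)}[\|f(X) - V\|^2_V]$ is replaced by the empirical sum $\sum_{i=1}^{n} \|v_i - f(x_i)\|^2_V$, $f$ is restricted to the RKHS $\mathcal{H}_\Gamma$, and the $\mathcal{H}_\Gamma$ norm is added as a regularizer:
\[ \hat{\epsilon}_\lambda(f) := \sum_{i=1}^{n} \|v_i - f(x_i)\|^2_V + \lambda \|f\|^2_\Gamma. \]
The minimizer can be written in terms of the operators $\Gamma_{x_i}$.
Theorem 2.2 (adapted from G. Lever and S. Grünewälder+ 2012). If $f^*$ minimizes $\hat{\epsilon}_\lambda$ in $\mathcal{H}_\Gamma$, it is unique and has the form
\[ f^* = \sum_{i=1}^{n} \Gamma_{x_i} c_i, \]
where the coefficients $\{c_i\}$, $c_i \in V$, are the unique solution of
\[ \sum_{i \le n} \left( \Gamma(x_j, x_i) + \lambda \delta_{ji} \right) c_i = v_j, \quad 1 \le j \le n. \]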
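30. 7 $\mathcal{E}_s[\mu]$ is an upper bound of $\mathcal{E}[\mu]$
Proof
\begin{align*}
\mathcal{E}[\mu] &:= \sup_{\|h\|_{\mathcal{H}_Y} \le 1} E_X\left[ \left( E_Y[h(Y) \mid X] - \langle h, \mu(X) \rangle_{\mathcal{H}_Y} \right)^2 \right] \\
&= \sup_{\|h\|_{\mathcal{H}_Y} \le 1} E_X\left[ \left( E_Y[\langle h, \psi(Y) \rangle_{\mathcal{H}_Y} \mid X] - \langle h, \mu(X) \rangle_{\mathcal{H}_Y} \right)^2 \right] \\
&\le \sup_{\|h\|_{\mathcal{H}_Y} \le 1} E_{X,Y}\left[ \langle h, \psi(Y) - \mu(X) \rangle^2_{\mathcal{H}_Y} \right] && \text{(Jensen's inequality)} \\
&\le \sup_{\|h\|_{\mathcal{H}_Y} \le 1} \|h\|^2_{\mathcal{H}_Y}\, E_{X,Y}\left[ \|\psi(Y) - \mu(X)\|^2_{\mathcal{H}_Y} \right] && \text{(Cauchy-Schwarz)} \\
&= E_{X,Y}\left[ \|\psi(Y) - \mu(X)\|^2_{\mathcal{H}_Y} \right] = \mathcal{E}_s[\mu].
\end{align*}
✷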
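8 Proposition 1
Proposition 1. Let $(X, Y)$ be a random variable on $\mathcal{X} \times \mathcal{Y}$, where $Y$ has prior $\pi(Y)$ and the likelihood is $p(X \mid Y)$. Let $\mathcal{H}_X$ be the RKHS with kernel $k_X$ and feature map $\phi(x)$, $\mathcal{H}_Y$ the RKHS with kernel $k_Y$ and feature map $\psi(y)$, and $\phi(x, y)$ the feature map of $\mathcal{H}_X \otimes \mathcal{H}_Y$. Let $\hat{\pi}_Y = \sum_{i=1}^{\ell} \tilde{\alpha}_i \psi(\tilde{y}_i)$ be an estimate of the prior embedding $\pi_Y$, and $\{(x_i, y_i)\}_{i=1}^{n}$ a sample representing $p(X \mid Y)$. If $f(x, y) = \|\psi(y) - \mu(x)\|^2_{\mathcal{H}_Y}$ satisfies $f \in \mathcal{H}_X \otimes \mathcal{H}_Y$, then
\[ \hat{\mathcal{E}}_s[\mu] = \sum_{i=1}^{n} \beta_i \|\psi(y_i) - \mu(x_i)\|^2_{\mathcal{H}_Y}, \]
where $\beta = (\beta_1, \dots, \beta_n)^T$ is given by $\beta = (G_Y + n\lambda I)^{-1} \tilde{G}_Y \tilde{\alpha}$, with $(G_Y)_{ij} = k_Y(y_i, y_j)$, $(\tilde{G}_Y)_{ij} = k_Y(y_i, \tilde{y}_j)$, $\tilde{\alpha} = (\tilde{\alpha}_1, \dots, \tilde{\alpha}_\ell)^T$.
Proof. The argument follows K. Fukumizu 2016, Kernel Bayes Rule, Proposition 4.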
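31. Write $\Phi_{X,Y} = (\phi(x_1, y_1), \dots, \phi(x_n, y_n))$. Once $\hat{\mu}_{(X,Y)} = \Phi_{X,Y} \beta = \Phi_{X,Y} (G_Y + n\lambda I)^{-1} \tilde{G}_Y \tilde{\alpha}$ is established, the reproducing property of $\mathcal{H}_X \otimes \mathcal{H}_Y$ gives
\begin{align*}
\hat{\mathcal{E}}_s[\mu] &= \langle \hat{\mu}_{(X,Y)}, f \rangle_{\mathcal{H}_X \otimes \mathcal{H}_Y} \\
&= \langle \Phi_{X,Y} (G_Y + n\lambda I)^{-1} \tilde{G}_Y \tilde{\alpha}, f \rangle_{\mathcal{H}_X \otimes \mathcal{H}_Y} \\
&= \langle \Phi_{X,Y} \beta, f \rangle_{\mathcal{H}_X \otimes \mathcal{H}_Y} \\
&= \sum_{i=1}^{n} \beta_i \|\psi(y_i) - \mu(x_i)\|^2_{\mathcal{H}_Y}.
\end{align*}
To obtain this form, recall $\hat{\mu}_{(X,Y)} = \hat{C}_{(X,Y)Y} (\hat{C}_{YY} + \lambda I)^{-1} \hat{\pi}_Y$ and set $h = (\hat{C}_{YY} + \lambda I)^{-1} \hat{\pi}_Y$. Decompose
\[ h = \sum_{i=1}^{n} a_i \psi(y_i) + h_\perp, \]
where $h_\perp$ is orthogonal to $\mathrm{span}\{\psi(y_1), \dots, \psi(y_n)\}$. Substituting into $(\hat{C}_{YY} + \lambda I) h = \hat{\pi}_Y$ gives
\[ \frac{1}{n} \sum_{i,j \le n} a_i k_Y(y_i, y_j) \psi(y_j) + \lambda \Big( \sum_{i \le n} a_i \psi(y_i) + h_\perp \Big) = \sum_{i \le \ell} \tilde{\alpha}_i \psi(\tilde{y}_i). \]
Taking the inner product of both sides with $\psi(y_k)$, $k = 1, \dots, n$:
\[ \frac{1}{n} G_Y^2 a + \lambda G_Y a = \tilde{G}_Y \tilde{\alpha} \ \Leftrightarrow\ \frac{1}{n} (G_Y + n\lambda I) G_Y a = \tilde{G}_Y \tilde{\alpha} \ \Leftrightarrow\ \frac{1}{n} G_Y a = (G_Y + n\lambda I)^{-1} \tilde{G}_Y \tilde{\alpha}. \]
Substituting into $\hat{\mu}_{(X,Y)}$:
\[ \hat{\mu}_{(X,Y)} = \frac{1}{n} \sum_{i \le n} \big( \phi(x_i, y_i) \otimes \psi(y_i) \big) h = \frac{1}{n} \Phi_{X,Y} G_Y a = \Phi_{X,Y} (G_Y + n\lambda I)^{-1} \tilde{G}_Y \tilde{\alpha}. \]
✷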
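32. 9 Proposition 2
Proposition 2. Assume without loss of generality that $\beta_i^+ \neq 0$ for all $i$. Let $\mu \in \mathcal{H}_K$, where $\mathcal{H}_K$ is the vector-valued RKHS with kernel $K(x_i, x_j) = k_X(x_i, x_j) I$ and $I : \mathcal{H}_Y \to \mathcal{H}_Y$ is the identity. Then the minimizer of $\hat{\mathcal{E}}_{\lambda,n}$ is
\[ \hat{\mu}_{\lambda,n}(x) = \Psi (K_X + \lambda_n \Lambda^+)^{-1} K_{:x}, \]
where $\Psi = (\psi(y_1), \dots, \psi(y_n))$, $(K_X)_{ij} = k_X(x_i, x_j)$, $\Lambda^+ = \mathrm{diag}(1/\beta_1^+, \dots, 1/\beta_n^+)$, $K_{:x} = (k_X(x, x_1), \dots, k_X(x, x_n))^T$, and $\lambda_n$ is the regularization constant.
Proof. If $\beta_i^+ = 0$, the pair $(x_i, y_i)$ does not contribute to the objective, so we may assume $\beta_i^+ \neq 0$ for all $i$. Write $\mu = \mu_0 + g$ with $\mu_0 = \sum_{i=1}^{n} K_{x_i} c_i$ and expand $\hat{\mathcal{E}}_{\lambda,n}[\mu]$:
\begin{align*}
\hat{\mathcal{E}}_{\lambda,n}[\mu] &= \sum_{i=1}^{n} \beta_i^+ \|\psi(y_i) - \mu(x_i)\|^2_{\mathcal{H}_Y} + \lambda_n \|\mu\|^2_{\mathcal{H}_K} \\
&= \sum_{i=1}^{n} \beta_i^+ \|\psi(y_i) - (\mu_0(x_i) + g(x_i))\|^2_{\mathcal{H}_Y} + \lambda_n \|\mu_0 + g\|^2_{\mathcal{H}_K} \\
&= \sum_{i=1}^{n} \beta_i^+ \|\psi(y_i) - \mu_0(x_i)\|^2 + \lambda_n \|\mu_0\|^2 + \sum_{i=1}^{n} \beta_i^+ \|g(x_i)\|^2 + \lambda_n \|g\|^2 \\
&\quad + 2\lambda_n \langle \mu_0, g \rangle - 2 \sum_{i=1}^{n} \beta_i^+ \langle g(x_i), \psi(y_i) - \mu_0(x_i) \rangle.
\end{align*}
Choose the $c_i$ so that, for every $i$,
\[ \psi(y_i) - \sum_{j=1}^{n} k_X(x_i, x_j) c_j = \frac{\lambda_n}{\beta_i^+} c_i. \]
Since $\langle \mu_0, g \rangle = \sum_{i=1}^{n} \langle c_i, g(x_i) \rangle$ by the reproducing property, the cross terms of $\hat{\mathcal{E}}_{\lambda,n}[\mu]$ vanish:
\[ \lambda_n \langle \mu_0, g \rangle - \sum_{i=1}^{n} \beta_i^+ \langle g(x_i), \psi(y_i) - \mu_0(x_i) \rangle = 0. \]
Hence
\[ \hat{\mathcal{E}}_{\lambda,n}[\mu] = \hat{\mathcal{E}}_{\lambda,n}[\mu_0] + \sum_{i=1}^{n} \beta_i^+ \|g(x_i)\|^2 + \lambda_n \|g\|^2 \ge \hat{\mathcal{E}}_{\lambda,n}[\mu_0], \]
so $\mu_0 = \sum_{i=1}^{n} K_{x_i} c_i$ with these $c_i$ is the minimizer. In matrix form the condition on the $c_i$ reads
\[ (K_X + \lambda_n \Lambda^+) c = \Psi, \]
and therefore
\[ \mu_0(x) = \sum_{i=1}^{n} k_X(x, x_i) c_i = \Psi (K_X + \lambda_n \Lambda^+)^{-1} K_{:x}. \]
✷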