Introduction to Gaussian Processes
Dmytro Fishman (dmytro@ut.ee)
x → f(x) → y

Let's take a look inside y = f(x).
Let y = f(x) be a linear function:

y = \theta_0 + \theta_1 x
Find \theta_0 and \theta_1 by optimising the error:

\arg\min_{\theta} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \hat{y}_i = \theta_0 + \theta_1 x_i, \qquad y_i = \theta_0 + \theta_1 x_i + \epsilon_i
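As an illustration (not on the original slides), a minimal NumPy sketch of this least-squares fit; the toy data and the use of np.linalg.lstsq are my assumptions for the example:

```python
import numpy as np

# Toy data (made up): y = 2 + 3x plus Gaussian noise epsilon_i
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2 + 3 * x + rng.normal(0, 0.1, size=x.shape)

# Design matrix [1, x]; lstsq solves argmin_theta ||X theta - y||^2
X = np.column_stack([np.ones_like(x), x])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)  # close to [2, 3], i.e. [theta_0, theta_1]
```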
But what if the data is not linear? We can add polynomial terms:

y = \theta_0 + \theta_1 x

y = \theta_0 + \theta_1 x + \theta_2 x^2

y = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3
What if we don't want to assume a specific form?

GPs let you model any function directly.
Parametric ML: a learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples) is called a parametric model. Example: y = \theta_0 + \theta_1 x.

Nonparametric ML: algorithms that do not make strong assumptions about the form of the mapping function are called nonparametric machine learning algorithms.

Question: is K-nearest neighbours a parametric or a nonparametric algorithm according to these definitions?
GPs let you model any function directly and estimate the uncertainty for each new prediction.

If I ask you to predict y_i for a new x_i, you better be very uncertain.
How is it even possible? We will need the normal distribution.
N(\mu, \sigma^2): \quad \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}

with average coordinate \mu and standard deviation \sigma from the centre. Many important processes follow a normal distribution.

X_1 \sim N(\mu_1, \sigma_1^2)

What if I draw another distribution?

X_1 \sim N(\mu_1, \sigma_1^2), \qquad X_2 \sim N(\mu_2, \sigma_2^2)
Take \mu_1 = 0, \sigma_1 = 1 and \mu_2 = 0, \sigma_2 = 1, so X_1 \sim N(0, 1) and X_2 \sim N(0, 1).

What if we joined them into one plot?
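A quick sketch (mine, not the slides') of joining two independent standard normals into one scatter plot:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x1 = rng.normal(0, 1, 500)  # draws from X1 ~ N(0, 1)
x2 = rng.normal(0, 1, 500)  # independent draws from X2 ~ N(0, 1)

# Each pair (x1[i], x2[i]) becomes one point in the joint plot
plt.scatter(x1, x2, s=5)
plt.xlabel("X1")
plt.ylabel("X2")
plt.show()
```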
Stack the means and the variables into vectors:

M = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad X = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

Joint distribution of variables x_1 and x_2:

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)

The second argument is the covariance matrix, \Sigma.
Similarity:

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right): no similarity (no correlation). A positive value of X_1 does not tell much about X_2.

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix} \right): some similarity (correlation). A positive value of X_1, with good probability, means a positive X_2.
Joint distribution of variables x_1 and x_2, P(x_1, x_2):

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)

Fixing x_2 gives the conditional distribution:

P(x_1 \mid x_2) = N(x_1 \mid \mu_{1|2}, \Sigma_{1|2})

\mu_{1|2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2)

\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}
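A small sketch (not from the slides) of these conditional formulas in the bivariate case, where every \Sigma block is a scalar:

```python
import numpy as np

mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

x2 = 1.0  # observed value of the second variable

# mu_{1|2} = mu_1 + Sigma_12 Sigma_22^{-1} (x2 - mu_2)
mu_cond = mu[0] + Sigma[0, 1] / Sigma[1, 1] * (x2 - mu[1])
# Sigma_{1|2} = Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21
var_cond = Sigma[0, 0] - Sigma[0, 1] / Sigma[1, 1] * Sigma[1, 0]

print(mu_cond, var_cond)  # 0.5 and 0.75 for this Sigma
```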
N(\mu, \sigma^2) is the normal distribution, or a 1D Gaussian; the two-variable version above is a 2D Gaussian.
2D Gaussian: \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)
Sampling from this 2D Gaussian gives pairs such as (-0.23, 1.13) and (-1.14, 0.65). Plotting each pair against its coordinate index (1st, 2nd) shows there is little dependency between the two coordinates.

Now sample instead from

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix} \right)

Pairs such as (0.13, 0.52) and (-0.03, -0.24) show more dependent values: the two coordinates tend to move together.
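A sketch of both sampling experiments; the seed is arbitrary, so the draws will differ from the slides' example pairs:

```python
import numpy as np

rng = np.random.default_rng(2)
mean = np.zeros(2)

cov_indep = np.array([[1.0, 0.0], [0.0, 1.0]])
cov_dep   = np.array([[1.0, 0.5], [0.5, 1.0]])

# Pairs from the identity covariance: little dependency between coordinates
print(rng.multivariate_normal(mean, cov_indep, size=3))
# Pairs from the 0.5 covariance: the two coordinates tend to move together
print(rng.multivariate_normal(mean, cov_dep, size=3))
```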
How would a sample from a 20D Gaussian look?

\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{20} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \end{bmatrix} \right)

Sampling gives a 20-dimensional vector, e.g. (0.73, -0.12, 0.42, 1.2, …, 16 more), which we can plot against the coordinate index (1st, 2nd, 3rd, …).
Let's add more dependency between points:

\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{20} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 & 0.5 & \dots & 0.5 \\ 0.5 & 1 & 0.5 & \dots & 0.5 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0.5 & 0.5 & 0.5 & \dots & 1 \end{bmatrix} \right)

A sample now looks like (0.73, 0.18, 0.68, -0.2, …, 16 more).

We want some notion of smoothness between points, so that the dependency between the 1st and 2nd points is larger than between the 1st and the 3rd. We might have just increased the corresponding values in the covariance matrix by hand, right? But we need a way to generate a "smooth" covariance matrix automatically, depending on the distance between points.
We will use a similarity measure:

K_{ij} = e^{-\|z_i - z_j\|^2} = \begin{cases} \to 0, & \|z_i - z_j\| \to \infty \\ 1, & z_i = z_j \end{cases}

20D Gaussian with a kernel-built covariance matrix:

\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{20} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} K_{11} & K_{12} & K_{13} & \dots & K_{1\,20} \\ K_{21} & K_{22} & K_{23} & \dots & K_{2\,20} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ K_{20\,1} & K_{20\,2} & K_{20\,3} & \dots & K_{20\,20} \end{bmatrix} \right)

Samples from this Gaussian now vary smoothly across the coordinate index.
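A sketch of building K from the similarity measure on a 1D grid and sampling from N(0, K); the gamma parameter and the small diagonal jitter are my additions, for generality and numerical stability:

```python
import numpy as np

def rbf_kernel(z, gamma=1.0):
    # K_ij = exp(-gamma * (z_i - z_j)^2) for 1D inputs z
    d2 = (z[:, None] - z[None, :]) ** 2
    return np.exp(-gamma * d2)

z = np.linspace(0, 5, 20)          # 20 input locations
K = rbf_kernel(z)                  # 20x20 covariance with K_ii = 1
K += 1e-8 * np.eye(len(z))         # jitter for numerical stability

rng = np.random.default_rng(3)
f = rng.multivariate_normal(np.zeros(len(z)), K, size=3)
# Each row of f is one 20D sample; nearby coordinates are highly
# correlated, so each sample traces out a smooth-looking curve.
```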
The same recipe scales up. A 200D Gaussian,

\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{200} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} K_{11} & K_{12} & K_{13} & \dots & K_{1\,200} \\ K_{21} & K_{22} & K_{23} & \dots & K_{2\,200} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ K_{200\,1} & K_{200\,2} & K_{200\,3} & \dots & K_{200\,200} \end{bmatrix} \right),

produces samples that already look like smooth functions of the index.
We are interested in modelling F(z) for a given Z with inputs z_1, z_2, z_3 and values f_1, f_2, f_3, so that f_2 is more correlated with f_1 than with f_3.

Previously we were using

\begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix} \right)

to generate correlated points; can we do it again here? Wait! But now we have three points, we cannot use the same formula!

Ok… what about now?

\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.5 & 0.5 \\ 0.5 & 1 & 0.5 \\ 0.5 & 0.5 & 1 \end{bmatrix} \right)

Wait, did he just say that f_2 should be more correlated with f_1 than with f_3? Arrrr…. Better now?

\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0.7 & 0.2 \\ 0.7 & 1 & 0.5 \\ 0.2 & 0.5 & 1 \end{bmatrix} \right)

Yes, but what if we want to obtain this matrix automatically, based on how close the points are in Z? We will use the similarity measure again:

K_{ij} = e^{-\|z_i - z_j\|^2}

So now it becomes:

\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{bmatrix} \right)
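For instance, with toy z values of my choosing, placing z_1 and z_2 close together and z_3 farther away yields K_{12} > K_{13} automatically:

```python
import numpy as np

z = np.array([0.0, 0.5, 2.0])  # z1, z2 close together; z3 farther away
K = np.exp(-(z[:, None] - z[None, :]) ** 2)
print(np.round(K, 2))
# [[1.   0.78 0.02]
#  [0.78 1.   0.11]
#  [0.02 0.11 1.  ]]
```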
What is f_*? Given: {(f_1, z_1), (f_2, z_2), (f_3, z_3)} and a new input z_* (also given), what is f_*?

Ok, so we have just modelled f:

\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{bmatrix} \right)

which is the same as saying f \sim N(0, K).

But how do we model f_*? Well, probably again some kind of normal… Maybe something like f_* \sim N(0, ?). But what is this "?": the covariance of z_* with z_*? But isn't K_{**} just 1?

f_* \sim N(0, K_{**}), \qquad K_{**} = e^{-\|z_* - z_*\|^2} = 1
So we have modelled f \sim N(0, K) and f_* \sim N(0, K_{**}). What else is left? The joint distribution of f and f_*:

\begin{bmatrix} f \\ f_* \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} K & K_* \\ K_*^T & K_{**} \end{bmatrix} \right), \qquad K = \begin{bmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{bmatrix}, \quad K_* = \begin{bmatrix} K_{1*} \\ K_{2*} \\ K_{3*} \end{bmatrix}

Only one entity is left: K_{i*} = K(z_i, z_*), and we already know how to calculate this one:

K_{ij} = e^{-\|z_i - z_j\|^2}

Yeah! We did it! Wait… but what do we do now?
Remember the conditional distribution of a joint Gaussian:

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)

P(x_1 \mid x_2) = N(x_1 \mid \mu_{1|2}, \Sigma_{1|2})

\mu_{1|2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2)

\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}

What if we substitute x_1 with f_* and x_2 with f? Then we can compute the mean and standard deviation of f_*! Exactly!
Substituting gives the GP posterior at z_*:

\mu_* = \mu(z_*) + K_*^T K^{-1} (f - \mu_f)

\Sigma_* = K_{**} - K_*^T K^{-1} K_*

Computing \mu_* and \sigma_* for every candidate z_* gives the familiar GP regression picture: the posterior mean passes through the observed points (f_1, z_1), (f_2, z_2), (f_3, z_3), and the uncertainty band widens as z_* moves away from them.
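Putting it all together, a minimal GP regression sketch under the slides' zero-mean prior (so \mu(z_*) = 0 and \mu_f = 0); the training data, gamma, and jitter are assumptions of the example:

```python
import numpy as np

def kernel(a, b, gamma=1.0):
    # K(a_i, b_j) = exp(-gamma * (a_i - b_j)^2) for 1D inputs
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

# Hypothetical training data {(f_i, z_i)} and a grid of test inputs z_*
z = np.array([1.0, 3.0, 5.0])
f = np.array([0.5, -0.4, 1.0])
z_star = np.linspace(0, 6, 100)

K = kernel(z, z) + 1e-8 * np.eye(len(z))  # K, with jitter for stability
K_star = kernel(z, z_star)                # K_*, shape (3, 100)
K_ss = kernel(z_star, z_star)             # K_**

K_inv = np.linalg.inv(K)
mu_star = K_star.T @ K_inv @ f                 # mu_* = K_*^T K^{-1} f  (zero prior mean)
Sigma_star = K_ss - K_star.T @ K_inv @ K_star  # Sigma_* = K_** - K_*^T K^{-1} K_*
sd_star = np.sqrt(np.clip(np.diag(Sigma_star), 0.0, None))  # pointwise band
```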
Pros:
1. Can model almost any function directly
2. Can be made more flexible with different kernels
3. Provides uncertainty estimates
Cons:
1. Cannot be interpreted
2. Loses efficiency in high-dimensional spaces
3. Overfitting
Cat or Dog?

"It's always seemed obvious to me that it's better to know that you don't know, than to think you know and act on wrong information." (Katherine Bailey)

Teaching statistics vs. doing statistics
Resources:
Katherine Bailey's presentation: Gaussian_Processes.pdf (http://katbailey.github.io/gp_talk/)
Katherine Bailey's blog post: From both sides now: the math of linear regression (http://katbailey.github.io/post/from-both-sides-now-the-math-of-linear-regression/)
Katherine Bailey's blog post: Gaussian processes for dummies (http://katbailey.github.io/post/gaussian-processes-for-dummies/)
Kevin P. Murphy's book: Machine Learning: A Probabilistic Perspective, Chapter 15 (https://www.amazon.com/Machine-Learning-Probabilistic-Perspective-Computation/dp/0262018020)
Alex Bridgland's blog post: Introduction to Gaussian Processes - Part I (http://bridg.land/posts/gaussian-processes-1)
Nando de Freitas: Machine Learning - Introduction to Gaussian Processes (https://youtu.be/4vGiHC35j9s)
in class
Under the review

Weitere ähnliche Inhalte

Was ist angesagt?

An Efficient Boundary Integral Method for Stiff Fluid Interface Problems
An Efficient Boundary Integral Method for Stiff Fluid Interface ProblemsAn Efficient Boundary Integral Method for Stiff Fluid Interface Problems
An Efficient Boundary Integral Method for Stiff Fluid Interface ProblemsAlex (Oleksiy) Varfolomiyev
 
ゲーム理論BASIC 演習4 -交渉集合を求める-
ゲーム理論BASIC 演習4 -交渉集合を求める-ゲーム理論BASIC 演習4 -交渉集合を求める-
ゲーム理論BASIC 演習4 -交渉集合を求める-ssusere0a682
 
On Convergence of Jungck Type Iteration for Certain Contractive Conditions
On Convergence of Jungck Type Iteration for Certain Contractive ConditionsOn Convergence of Jungck Type Iteration for Certain Contractive Conditions
On Convergence of Jungck Type Iteration for Certain Contractive Conditionsresearchinventy
 
INFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERS
INFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERSINFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERS
INFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERSZac Darcy
 
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...iosrjce
 
Flexural analysis of thick beams using single
Flexural analysis of thick beams using singleFlexural analysis of thick beams using single
Flexural analysis of thick beams using singleiaemedu
 
Vcla - Inner Products
Vcla - Inner ProductsVcla - Inner Products
Vcla - Inner ProductsPreetshah1212
 
Kites team l3
Kites team l3Kites team l3
Kites team l3aero103
 
近似ベイズ計算によるベイズ推定
近似ベイズ計算によるベイズ推定近似ベイズ計算によるベイズ推定
近似ベイズ計算によるベイズ推定Kosei ABE
 
Power Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's EquationPower Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's EquationArijitDhali
 
Weighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with ApplicationsWeighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with ApplicationsPremier Publishers
 
8 ashish final paper--85-90
8 ashish   final paper--85-908 ashish   final paper--85-90
8 ashish final paper--85-90Alexander Decker
 
2 backlash simulation
2 backlash simulation2 backlash simulation
2 backlash simulationSolo Hermelin
 

Was ist angesagt? (20)

Energy principles Att 6607
Energy principles Att 6607Energy principles Att 6607
Energy principles Att 6607
 
5. stress function
5.  stress function5.  stress function
5. stress function
 
An Efficient Boundary Integral Method for Stiff Fluid Interface Problems
An Efficient Boundary Integral Method for Stiff Fluid Interface ProblemsAn Efficient Boundary Integral Method for Stiff Fluid Interface Problems
An Efficient Boundary Integral Method for Stiff Fluid Interface Problems
 
Krishna
KrishnaKrishna
Krishna
 
ゲーム理論BASIC 演習4 -交渉集合を求める-
ゲーム理論BASIC 演習4 -交渉集合を求める-ゲーム理論BASIC 演習4 -交渉集合を求める-
ゲーム理論BASIC 演習4 -交渉集合を求める-
 
On Convergence of Jungck Type Iteration for Certain Contractive Conditions
On Convergence of Jungck Type Iteration for Certain Contractive ConditionsOn Convergence of Jungck Type Iteration for Certain Contractive Conditions
On Convergence of Jungck Type Iteration for Certain Contractive Conditions
 
Dyadics
DyadicsDyadics
Dyadics
 
INFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERS
INFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERSINFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERS
INFLUENCE OF OVERLAYERS ON DEPTH OF IMPLANTED-HETEROJUNCTION RECTIFIERS
 
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
 
Flexural analysis of thick beams using single
Flexural analysis of thick beams using singleFlexural analysis of thick beams using single
Flexural analysis of thick beams using single
 
Vcla - Inner Products
Vcla - Inner ProductsVcla - Inner Products
Vcla - Inner Products
 
Kites team l3
Kites team l3Kites team l3
Kites team l3
 
近似ベイズ計算によるベイズ推定
近似ベイズ計算によるベイズ推定近似ベイズ計算によるベイズ推定
近似ベイズ計算によるベイズ推定
 
1638 vector quantities
1638 vector quantities1638 vector quantities
1638 vector quantities
 
BS2506 tutorial 1
BS2506 tutorial 1BS2506 tutorial 1
BS2506 tutorial 1
 
Power Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's EquationPower Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's Equation
 
Linreg
LinregLinreg
Linreg
 
Weighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with ApplicationsWeighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with Applications
 
8 ashish final paper--85-90
8 ashish   final paper--85-908 ashish   final paper--85-90
8 ashish final paper--85-90
 
2 backlash simulation
2 backlash simulation2 backlash simulation
2 backlash simulation
 

Ähnlich wie Introduction to Gaussian Processes

07-Convolution.pptx signal spectra and signal processing
07-Convolution.pptx signal spectra and signal processing07-Convolution.pptx signal spectra and signal processing
07-Convolution.pptx signal spectra and signal processingJordanJohmMallillin
 
Diffusion kernels on SNP data embedded in a non-Euclidean metric
Diffusion kernels on SNP data embedded in a non-Euclidean metricDiffusion kernels on SNP data embedded in a non-Euclidean metric
Diffusion kernels on SNP data embedded in a non-Euclidean metricGota Morota
 
進化ゲーム理論入門第4回 -漸近安定な平衡点-
進化ゲーム理論入門第4回 -漸近安定な平衡点-進化ゲーム理論入門第4回 -漸近安定な平衡点-
進化ゲーム理論入門第4回 -漸近安定な平衡点-ssusere0a682
 
Formulario cálculo
Formulario cálculoFormulario cálculo
Formulario cálculoMan50035
 
A direct method for estimating linear non-Gaussian acyclic models
A direct method for estimating linear non-Gaussian acyclic modelsA direct method for estimating linear non-Gaussian acyclic models
A direct method for estimating linear non-Gaussian acyclic modelsShiga University, RIKEN
 
Calculus Early Transcendentals 10th Edition Anton Solutions Manual
Calculus Early Transcendentals 10th Edition Anton Solutions ManualCalculus Early Transcendentals 10th Edition Anton Solutions Manual
Calculus Early Transcendentals 10th Edition Anton Solutions Manualnodyligomi
 
Detailed Description on Cross Entropy Loss Function
Detailed Description on Cross Entropy Loss FunctionDetailed Description on Cross Entropy Loss Function
Detailed Description on Cross Entropy Loss Function범준 김
 
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTIONPROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTIONJournal For Research
 
Formulario oficial-calculo
Formulario oficial-calculoFormulario oficial-calculo
Formulario oficial-calculoFavian Flores
 
51554 0131469657 ism-13
51554 0131469657 ism-1351554 0131469657 ism-13
51554 0131469657 ism-13Carlos Fuentes
 
Solutions Manual for Calculus Early Transcendentals 10th Edition by Anton
Solutions Manual for Calculus Early Transcendentals 10th Edition by AntonSolutions Manual for Calculus Early Transcendentals 10th Edition by Anton
Solutions Manual for Calculus Early Transcendentals 10th Edition by AntonPamelaew
 
51546 0131469657 ism-5
51546 0131469657 ism-551546 0131469657 ism-5
51546 0131469657 ism-5Carlos Fuentes
 
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihoodDeep Learning JP
 
Means and variances of random variables
Means and variances of random variablesMeans and variances of random variables
Means and variances of random variablesUlster BOCES
 
Paul Bleau Calc III Project 2 - Basel Problem
Paul Bleau Calc III Project 2 - Basel ProblemPaul Bleau Calc III Project 2 - Basel Problem
Paul Bleau Calc III Project 2 - Basel ProblemPaul Bleau
 

Ähnlich wie Introduction to Gaussian Processes (20)

07-Convolution.pptx signal spectra and signal processing
07-Convolution.pptx signal spectra and signal processing07-Convolution.pptx signal spectra and signal processing
07-Convolution.pptx signal spectra and signal processing
 
Diffusion kernels on SNP data embedded in a non-Euclidean metric
Diffusion kernels on SNP data embedded in a non-Euclidean metricDiffusion kernels on SNP data embedded in a non-Euclidean metric
Diffusion kernels on SNP data embedded in a non-Euclidean metric
 
進化ゲーム理論入門第4回 -漸近安定な平衡点-
進化ゲーム理論入門第4回 -漸近安定な平衡点-進化ゲーム理論入門第4回 -漸近安定な平衡点-
進化ゲーム理論入門第4回 -漸近安定な平衡点-
 
Formulario calculo
Formulario calculoFormulario calculo
Formulario calculo
 
Formulario cálculo
Formulario cálculoFormulario cálculo
Formulario cálculo
 
A direct method for estimating linear non-Gaussian acyclic models
A direct method for estimating linear non-Gaussian acyclic modelsA direct method for estimating linear non-Gaussian acyclic models
A direct method for estimating linear non-Gaussian acyclic models
 
Calculus Early Transcendentals 10th Edition Anton Solutions Manual
Calculus Early Transcendentals 10th Edition Anton Solutions ManualCalculus Early Transcendentals 10th Edition Anton Solutions Manual
Calculus Early Transcendentals 10th Edition Anton Solutions Manual
 
lec32.ppt
lec32.pptlec32.ppt
lec32.ppt
 
Detailed Description on Cross Entropy Loss Function
Detailed Description on Cross Entropy Loss FunctionDetailed Description on Cross Entropy Loss Function
Detailed Description on Cross Entropy Loss Function
 
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTIONPROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
 
Formulario oficial-calculo
Formulario oficial-calculoFormulario oficial-calculo
Formulario oficial-calculo
 
Hypergdistribution
HypergdistributionHypergdistribution
Hypergdistribution
 
51554 0131469657 ism-13
51554 0131469657 ism-1351554 0131469657 ism-13
51554 0131469657 ism-13
 
Solutions Manual for Calculus Early Transcendentals 10th Edition by Anton
Solutions Manual for Calculus Early Transcendentals 10th Edition by AntonSolutions Manual for Calculus Early Transcendentals 10th Edition by Anton
Solutions Manual for Calculus Early Transcendentals 10th Edition by Anton
 
51546 0131469657 ism-5
51546 0131469657 ism-551546 0131469657 ism-5
51546 0131469657 ism-5
 
Capitulo 5 Soluciones Purcell 9na Edicion
Capitulo 5 Soluciones Purcell 9na EdicionCapitulo 5 Soluciones Purcell 9na Edicion
Capitulo 5 Soluciones Purcell 9na Edicion
 
Jacobi method
Jacobi methodJacobi method
Jacobi method
 
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
 
Means and variances of random variables
Means and variances of random variablesMeans and variances of random variables
Means and variances of random variables
 
Paul Bleau Calc III Project 2 - Basel Problem
Paul Bleau Calc III Project 2 - Basel ProblemPaul Bleau Calc III Project 2 - Basel Problem
Paul Bleau Calc III Project 2 - Basel Problem
 

Mehr von Dmytro Fishman

DOME: Recommendations for supervised machine learning validation in biology
DOME: Recommendations for supervised machine learning validation in biologyDOME: Recommendations for supervised machine learning validation in biology
DOME: Recommendations for supervised machine learning validation in biologyDmytro Fishman
 
Tips for effective presentations
Tips for effective presentationsTips for effective presentations
Tips for effective presentationsDmytro Fishman
 
Autonomous Driving Lab - Simultaneous Localization and Mapping WP
Autonomous Driving Lab - Simultaneous Localization and Mapping WPAutonomous Driving Lab - Simultaneous Localization and Mapping WP
Autonomous Driving Lab - Simultaneous Localization and Mapping WPDmytro Fishman
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningDmytro Fishman
 
Introduction to Machine Learning for Taxify/Bolt
Introduction to Machine Learning for Taxify/BoltIntroduction to Machine Learning for Taxify/Bolt
Introduction to Machine Learning for Taxify/BoltDmytro Fishman
 
Detecting Nuclei from Microscopy Images with Deep Learning
Detecting Nuclei from Microscopy Images with Deep LearningDetecting Nuclei from Microscopy Images with Deep Learning
Detecting Nuclei from Microscopy Images with Deep LearningDmytro Fishman
 
Deep Learning in Healthcare
Deep Learning in HealthcareDeep Learning in Healthcare
Deep Learning in HealthcareDmytro Fishman
 
5 Introduction to neural networks
5 Introduction to neural networks5 Introduction to neural networks
5 Introduction to neural networksDmytro Fishman
 
4 Dimensionality reduction (PCA & t-SNE)
4 Dimensionality reduction (PCA & t-SNE)4 Dimensionality reduction (PCA & t-SNE)
4 Dimensionality reduction (PCA & t-SNE)Dmytro Fishman
 
3 Unsupervised learning
3 Unsupervised learning3 Unsupervised learning
3 Unsupervised learningDmytro Fishman
 
What does it mean to be a bioinformatician?
What does it mean to be a bioinformatician?What does it mean to be a bioinformatician?
What does it mean to be a bioinformatician?Dmytro Fishman
 
Machine Learning in Bioinformatics
Machine Learning in BioinformaticsMachine Learning in Bioinformatics
Machine Learning in BioinformaticsDmytro Fishman
 

Mehr von Dmytro Fishman (14)

DOME: Recommendations for supervised machine learning validation in biology
DOME: Recommendations for supervised machine learning validation in biologyDOME: Recommendations for supervised machine learning validation in biology
DOME: Recommendations for supervised machine learning validation in biology
 
Tips for effective presentations
Tips for effective presentationsTips for effective presentations
Tips for effective presentations
 
Autonomous Driving Lab - Simultaneous Localization and Mapping WP
Autonomous Driving Lab - Simultaneous Localization and Mapping WPAutonomous Driving Lab - Simultaneous Localization and Mapping WP
Autonomous Driving Lab - Simultaneous Localization and Mapping WP
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Introduction to Machine Learning for Taxify/Bolt
Introduction to Machine Learning for Taxify/BoltIntroduction to Machine Learning for Taxify/Bolt
Introduction to Machine Learning for Taxify/Bolt
 
Biit group 2018
Biit group 2018Biit group 2018
Biit group 2018
 
Detecting Nuclei from Microscopy Images with Deep Learning
Detecting Nuclei from Microscopy Images with Deep LearningDetecting Nuclei from Microscopy Images with Deep Learning
Detecting Nuclei from Microscopy Images with Deep Learning
 
Deep Learning in Healthcare
Deep Learning in HealthcareDeep Learning in Healthcare
Deep Learning in Healthcare
 
5 Introduction to neural networks
5 Introduction to neural networks5 Introduction to neural networks
5 Introduction to neural networks
 
4 Dimensionality reduction (PCA & t-SNE)
4 Dimensionality reduction (PCA & t-SNE)4 Dimensionality reduction (PCA & t-SNE)
4 Dimensionality reduction (PCA & t-SNE)
 
3 Unsupervised learning
3 Unsupervised learning3 Unsupervised learning
3 Unsupervised learning
 
1 Supervised learning
1 Supervised learning1 Supervised learning
1 Supervised learning
 
What does it mean to be a bioinformatician?
What does it mean to be a bioinformatician?What does it mean to be a bioinformatician?
What does it mean to be a bioinformatician?
 
Machine Learning in Bioinformatics
Machine Learning in BioinformaticsMachine Learning in Bioinformatics
Machine Learning in Bioinformatics
 

Kürzlich hochgeladen

Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 

Kürzlich hochgeladen (20)

Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 

Introduction to Gaussian Processes

  • 2.
  • 3. x f(x) y Let’s take a look inside
  • 5. x Let be linear functiony= f(x) y= ✓0 + ✓1x y
  • 6. x Let be linear functiony= f(x) y= ✓0 + ✓1x y
  • 7. x Let be linear functiony= f(x) arg min ✓ nX i=1 (yi ˆyi)2 ˆyi = ✓0 + ✓1xi yi = ✓0 + ✓1xi + ✏i arg min ✓ nX i=1 (yi ˆyi)2 ˆyi = ✓0 + ✓1xi yi = ✓0 + ✓1xi + ✏i Find and by optimising error arg min ✓ nX i=1 (yi ˆyi)2 ˆyi = ✓0 + ✓1xi y = ✓ + ✓ x + ✏ arg min ✓ nX i=1 (yi ˆyi)2 ˆyi = ✓0 + ✓1xi y = ✓ + ✓ x + ✏ y= ✓0 + ✓1x y
  • 8. x But if data is not linear? y
  • 9. x But if data is not linear? y= ✓0 + ✓1x y
  • 10. x y= ✓0 + ✓1x + ✓2x2 But if data is not linear? y
  • 11. x But if data is not linear? y= ✓0 + ✓1x + ✓2x2 + ✓3x3 y
  • 12. x What if don’t want to assume a specific form? y
  • 13. x GPs let you model any function directly y
  • 14. x y x y Parametric ML Nonparametric ML A learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples) is called a parametric model. Algorithms that do not make strong assumptions about the form of the mapping function are called nonparametric machine learning algorithms. y= ✓0 + ✓1x
  • 15. x y x y Parametric ML Nonparametric ML A learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples) is called a parametric model. Algorithms that do not make strong assumptions about the form of the mapping function are called nonparametric machine learning algorithms. y= ✓0 + ✓1x Question: is K-nearest neighbour parametric or nonparametric algorithm according to these definitions?
  • 16. x GPs let you model any function directly y
  • 17. x y GPs let you model any function directly estimates the uncertainty for each new prediction
  • 18. x y If I ask you to predict for xi xiyi
  • 19. x y If I ask you to predict for ? xiyi You better be very uncertain xi
  • 20. How is it even possible?
  • 21. We will need Normal distribution x xi y
  • 22. µ 1p 2⇡ e (x µ)2 2 2 With average coordinate and standard deviation from centre µ Many important processes follow normal distribution
  • 23. µ 1p 2⇡ e (x µ)2 2 2N(µ, 2 ) With average coordinate and standard deviation from centre µ Many important processes follow normal distribution
  • 24. X1 ⇠ N(µ1, 2 1)1p 2⇡ e (x µ)2 2 2 With average coordinate and standard deviation from centre µ Many important processes follow normal distribution µ1 1
  • 25. 1p 2⇡ e (x µ)2 2 2 What If I draw another distribution? With average coordinate and standard deviation from centre µ Many important processes follow normal distribution X1 ⇠ N(µ1, 2 1) µ1 1
  • 26. X1 ⇠ N(µ1, 2 1) X2 ⇠ N(µ2, 2 2) µ1 = 0 1 = 1 µ2 = 0 2 = 1 X1 X20 0
  • 27. X1 X20 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) µ1 = 0 1 = 1 µ2 = 0 2 = 1
  • 28. µ1 = 0 1 = 1 µ2 = 0 2 = 1 X1 X20 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1)
  • 29. µ1 = 0 1 = 1 µ2 = 0 2 = 1 X1 X20 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1)
  • 30. µ1 = 0 1 = 1 µ2 = 0 2 = 1 X1 X20 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1)
  • 31. µ1 = 0 1 = 1 µ2 = 0 2 = 1 X1 X20 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1)
  • 32. µ1 = 0 1 = 1 µ2 = 0 2 = 1 X1 X20 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) What if we would join them into one plot?
  • 33. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) X2 X1
  • 34. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) X2 X1
  • 35. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) X2 X1
  • 36. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) X2 X1
  • 37. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1) X2 X1
  • 38. X2 X1 µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 X =  x1 x2 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1)
  • 39. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 X2 ⇠ N(0, 1)X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ Joint distribution of variables and X =  x1 x2 x1 x2 X2 X1
  • 40. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 X2 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ Joint distribution of variables and X =  x1 x2 x1 x2 X1 ⇠ N(0, 1) X2 X1
  • 41. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 X2 ⇠ N(0, 1) Joint distribution of variables and X =  x1 x2 x1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 X1
  • 42. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 Joint distribution of variables and X =  x1 x2 x1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 ⇠ N(0, 1) X2 X1
  • 43. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 Joint distribution of variables and X =  x1 x2 x1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 ⇠ N(0, 1) X2 X1
  • 44. µ1 = 0 1 = 1 µ2 = 0 2 = 1 0 M =  µ1 µ2 Joint distribution of variables and X =  x1 x2 x1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 ⇠ N(0, 1) X2 X1 Covariance matrix or ⌃
  • 45. µ1 = 0 1 = 1 µ2 = 0 2 = 1 Joint distribution of variables and 0 x1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 ⇠ N(0, 1) X2 X1
  • 46. µ1 = 0 1 = 1 µ2 = 0 2 = 1 Joint distribution of variables andx1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 ⇠ N(0, 1) X2 X1
  • 47. µ1 = 0 1 = 1 µ2 = 0 2 = 1 Joint distribution of variables andx1 x2 X1 ⇠ N(0, 1)  x1 x2 ⇠ N ✓ 0 0  1 0 0 1 ◆ X2 ⇠ N(0, 1) X2 X1 Similarity
  • 48. With [x1; x2] ∼ N([0; 0], [[1, 0], [0, 1]]) there is no similarity (no correlation): a positive value of X1 does not tell us much about X2. With [x1; x2] ∼ N([0; 0], [[1, 0.5], [0.5, 1]]) there is some similarity (correlation): a positive value of X1 means that with good probability X2 is also positive.
  • 50. Joint distribution of x1 and x2 with correlation: [x1; x2] ∼ N([0; 0], [[1, 0.5], [0.5, 1]]).
  • 51. In general: [x1; x2] ∼ N([µ1; µ2], [[Σ11, Σ12], [Σ21, Σ22]]), with joint density P(x1, x2).
  • 55. Conditioning the joint P(x1, x2) on an observed x2 gives the conditional distribution P(x1 | x2) = N(x1 | µ1|2, Σ1|2).
  • 56. The conditional distribution P(x1 | x2) = N(x1 | µ1|2, Σ1|2) has mean µ1|2 = µ1 + Σ12 Σ22⁻¹ (x2 − µ2) and variance Σ1|2 = Σ11 − Σ12 Σ22⁻¹ Σ21.
  • 57. Recap: the normal distribution N(µ, σ²) is a 1D Gaussian; the joint distribution [x1; x2] ∼ N([0; 0], [[1, 0.5], [0.5, 1]]) is a 2D Gaussian; conditioning gives P(x1 | x2) = N(x1 | µ1|2, Σ1|2) with µ1|2 = µ1 + Σ12 Σ22⁻¹ (x2 − µ2) and Σ1|2 = Σ11 − Σ12 Σ22⁻¹ Σ21.
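To make the conditioning formulas concrete, here is a minimal NumPy sketch for the 2D case; the 0.5 correlation matches the slides, but the observed value x2 = 1.0 is an illustrative choice:

```python
import numpy as np

# Joint 2D Gaussian over (x1, x2): mean and covariance with 0.5 correlation
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

def condition_on_x2(mu, Sigma, x2_obs):
    """Return mean and variance of P(x1 | x2 = x2_obs) for a 2D Gaussian."""
    mu1, mu2 = mu
    s11, s12, s21, s22 = Sigma[0, 0], Sigma[0, 1], Sigma[1, 0], Sigma[1, 1]
    mu_cond = mu1 + s12 / s22 * (x2_obs - mu2)   # mu_{1|2}
    var_cond = s11 - s12 / s22 * s21             # Sigma_{1|2}
    return mu_cond, var_cond

# Observing x2 shifts our belief about x1 and shrinks its variance
print(condition_on_x2(mu, Sigma, 1.0))  # (0.5, 0.75)
```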
  • 60. Sampling from the 2D Gaussian ∼ N([0; 0], [[1, 0], [0, 1]]) gives a pair such as (−0.23, 1.13).
  • 66. Plotting the samples (−0.23, 1.13) and (−1.14, 0.65) by coordinate index (1st, 2nd): there is little dependency between the two values.
  • 74. Now sample from the 2D Gaussian ∼ N([0; 0], [[1, 0.5], [0.5, 1]]): samples such as (0.13, 0.52) and (−0.03, −0.24) have more dependent values.
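A minimal sketch of this sampling experiment, assuming NumPy and an arbitrary seed; the two covariance matrices match the slides, but the printed samples will of course differ from the ones shown:

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
mean = np.zeros(2)

Sigma_indep = np.array([[1.0, 0.0], [0.0, 1.0]])  # no correlation
Sigma_dep   = np.array([[1.0, 0.5], [0.5, 1.0]])  # some correlation

# With the identity covariance the two coordinates vary freely;
# with the 0.5 off-diagonals they tend to move together.
print(rng.multivariate_normal(mean, Sigma_indep, size=3))
print(rng.multivariate_normal(mean, Sigma_dep, size=3))
```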
  • 75. What would a sample from a 20D Gaussian look like?
  • 78. A 20D Gaussian with mean [0, 0, …, 0] and identity covariance. Sampling gives 20 numbers, e.g. (0.73, −0.12, 0.42, 1.2, …, 16 more), which we can plot against their index (1st, 2nd, 3rd, …).
  • 79. Let's add more dependency between points: keep the mean at [0, 0, …, 0] and set every off-diagonal entry of the covariance matrix to 0.5.
  • 80. Sampling now gives, e.g., (0.73, 0.18, 0.68, −0.2, …, 16 more).
  • 81. We want some notion of smoothness between points…
  • 82. …so that the dependency between the 1st and 2nd points is larger than between the 1st and the 3rd.
  • 83. We might have just increased the corresponding values in the covariance matrix, right?
  • 84. We need a way to generate a "smooth" covariance matrix automatically, depending on the distance between points.
  • 85. We will use a similarity measure (kernel): Kij = e^(−||zi − zj||²), so Kij → 0 as ||zi − zj|| → ∞ and Kij = 1 when zi = zj.
  • 86. 20D Gaussian with the kernel as covariance: mean [0, 0, …, 0] and covariance [[K11, K12, …, K1,20], …, [K20,1, …, K20,20]]. Samples plotted by index now vary smoothly.
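A sketch of how such a smooth covariance matrix and a sample from it could be generated; the 20 input locations on [0, 1] and the small diagonal jitter (added so the matrix stays numerically positive definite) are my own assumptions:

```python
import numpy as np

def rbf_kernel(a, b):
    """K_ij = exp(-||a_i - b_j||^2): 1 on the diagonal, decaying with distance."""
    d = a[:, None] - b[None, :]
    return np.exp(-d ** 2)

z = np.linspace(0, 1, 20)   # 20 input locations
K = rbf_kernel(z, z)        # 20x20 "smooth" covariance matrix

rng = np.random.default_rng(2)
# jitter on the diagonal keeps K numerically positive definite
f = rng.multivariate_normal(np.zeros(len(z)), K + 1e-8 * np.eye(len(z)))
print(f)  # neighbouring coordinates take similar values
```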
  • 87. The same with a 200D Gaussian: a mean of 200 zeros and a 200×200 kernel covariance [[K11, …, K1,200], …, [K200,1, …, K200,200]]. Each sample now looks like a smooth function.
  • 92. At any input location we can also mark the mean µ* and the standard deviation σ* of the sampled function values.
  • 94. We are interested in modelling F(z) for given Z = {z1, z2, z3}, so that f2 is more correlated with f1 than with f3.
  • 95. Previously we used [f1; f2] ∼ N([0; 0], [[1, 0.5], [0.5, 1]]) to generate correlated points; can we do it again here?
  • 96. Wait! But now we have three points; we cannot use the same formula!
  • 97. Ok… What about now? [f1; f2; f3] ∼ N([0; 0; 0], [[1, 0.5, 0.5], [0.5, 1, 0.5], [0.5, 0.5, 1]])
  • 98. Wait, did he just say that f2 should be more correlated with f1 than with f3?
  • 99. Arrrr….
  • 100. Better now? [f1; f2; f3] ∼ N([0; 0; 0], [[1, 0.7, 0.2], [0.7, 1, 0.5], [0.2, 0.5, 1]])
  • 101. Yes, but what if we want to obtain this matrix automatically, based on how close the points are in Z?
  • 102. We will use the same similarity measure: Kij = e^(−||zi − zj||²).
  • 103. So now it becomes: [f1; f2; f3] ∼ N([0; 0; 0], [[K11, K12, K13], [K21, K22, K23], [K31, K32, K33]])
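A small sketch with hypothetical input locations (z1 = 0.0, z2 = 0.3, z3 = 1.0, chosen so that z2 sits between z1 and z3) showing that the kernel produces exactly this ordering of correlations:

```python
import numpy as np

# Hypothetical input locations: z1 is closer to z2 than to z3
z = np.array([0.0, 0.3, 1.0])

d = z[:, None] - z[None, :]
K = np.exp(-d ** 2)           # K_ij = exp(-||z_i - z_j||^2)
print(np.round(K, 2))
# [[1.   0.91 0.37]
#  [0.91 1.   0.61]
#  [0.37 0.61 1.  ]]
# K12 > K23 > K13, so f2 is more correlated with f1 than with f3
```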
  • 104. What is f*? Given: {(f1, z1); (f2, z2); (f3, z3)} and a new input z*.
  • 107. Ok, so we have just modelled f: [f1; f2; f3] ∼ N([0; 0; 0], [[K11, K12, K13], [K21, K22, K23], [K31, K32, K33]])
  • 108. Which is the same as saying: f ∼ N(0, K)
  • 109. But how do we model f*?
  • 110. Well, probably again with some kind of normal…
  • 111. Maybe something like: f* ∼ N(0, ?)
  • 113. But what is this "?": the covariance of z* with z*?
  • 114. f* ∼ N(0, K**)
  • 115. But isn't K** just 1? Indeed: K** = e^(−||z* − z*||²) = 1.
  • 117. So we have modelled f ∼ N(0, K) and f* ∼ N(0, K**). What else is left?
  • 118. The joint distribution of f and f*: [f; f*] ∼ N([0; 0], [[K, K*], [K*ᵀ, K**]]), where K is the 3×3 training covariance, K* = [K1*; K2*; K3*] is a column and K*ᵀ = [K*1, K*2, K*3] a row.
  • 120. Only one entity is left to compute: K1* = K(z1, z*).
  • 121. I guess we know how to calculate this one! Ki* = e^(−||zi − z*||²).
  • 122. Yeah! We did it!
  • 123. Wait… but what do we do now?
  • 124. Remember…
  • 125. Recall the conditional distribution: if [x1; x2] ∼ N([µ1; µ2], [[Σ11, Σ12], [Σ21, Σ22]]), then P(x1 | x2) = N(x1 | µ1|2, Σ1|2) with µ1|2 = µ1 + Σ12 Σ22⁻¹ (x2 − µ2) and Σ1|2 = Σ11 − Σ12 Σ22⁻¹ Σ21.
  • 126. What if we substitute x1 with f* and x2 with f?
  • 127. Then we can compute the mean and standard deviation of f*!
  • 128. Exactly!
  • 129. Substituting gives the posterior mean: µ* = µ(z*) + K*ᵀ K⁻¹ (f − µf).
  • 131. …and the posterior variance: Σ* = K** − K*ᵀ K⁻¹ K*.
  • 133. Evaluating µ* and Σ* at different test inputs z* traces out the predicted function together with its uncertainty band.
  • 136. Putting it all together. Given {(f1, z1); (f2, z2); (f3, z3)} and a test input z*: µ* = µ(z*) + K*ᵀ K⁻¹ (f − µf) and Σ* = K** − K*ᵀ K⁻¹ K*.
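The whole pipeline fits in a few lines of NumPy. This is a sketch under the slides' assumptions (zero prior mean, kernel Kij = e^(−||zi − zj||²)); the training values and the test point are made up, and the diagonal jitter is added for numerical stability:

```python
import numpy as np

def rbf_kernel(a, b):
    return np.exp(-(a[:, None] - b[None, :]) ** 2)

# Made-up training data: three observed pairs (z_i, f_i)
z = np.array([0.0, 0.3, 1.0])
f = np.array([0.5, 0.8, -0.3])

z_star = np.array([0.6])             # test input where we want a prediction

K      = rbf_kernel(z, z)            # K:   train vs train
K_star = rbf_kernel(z, z_star)       # K*:  train vs test (column)
K_ss   = rbf_kernel(z_star, z_star)  # K**: test vs test

# jitter on the diagonal for numerical stability
K_inv = np.linalg.inv(K + 1e-8 * np.eye(len(z)))

# zero prior mean, so mu(z*) = 0 and mu_f = 0
mu_star    = K_star.T @ K_inv @ f                 # posterior mean
Sigma_star = K_ss - K_star.T @ K_inv @ K_star     # posterior covariance

print(mu_star, np.sqrt(np.diag(Sigma_star)))
```

In practice one would solve the linear system with a Cholesky factorisation rather than forming K⁻¹ explicitly, but the inverse keeps the sketch close to the formulas on the slide above.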
  • 137. Pros: 1. Can model almost any function directly. 2. Can be made more flexible with different kernels. 3. Provides uncertainty estimates. Cons: 1. Cannot be easily interpreted. 2. Loses efficiency in high-dimensional spaces. 3. Prone to overfitting.
  • 138. Cat or Dog? “It’s always seemed obvious to me that it’s better to know that you don’t know, than to think you know and act on wrong information.” Katherine Bailey
  • 140. Resources:
  Katherine Bailey's presentation: http://katbailey.github.io/gp_talk/Gaussian_Processes.pdf
  Katherine Bailey's blog post: From both sides now: the math of linear regression (http://katbailey.github.io/post/from-both-sides-now-the-math-of-linear-regression/)
  Katherine Bailey's blog post: Gaussian processes for dummies (http://katbailey.github.io/post/gaussian-processes-for-dummies/)
  Kevin P. Murphy's book: Machine Learning: A Probabilistic Perspective, Chapter 15 (https://www.amazon.com/Machine-Learning-Probabilistic-Perspective-Computation/dp/0262018020)
  Alex Bridgland's blog post: Introduction to Gaussian Processes - Part I (http://bridg.land/posts/gaussian-processes-1)
  Nando de Freitas: Machine Learning - Introduction to Gaussian Processes (https://youtu.be/4vGiHC35j9s)