Paper Review: An exact mapping between the Variational Renormalization Group and Deep Learning
1. An exact mapping between the Variational
Renormalization Group and Deep Learning
Kai-Wen Zhao, kv
Physics, National Taiwan University
kelispinor@gmail.com
December 1, 2016
Kai-Wen Zhao, kv (NTU-PHYS) Review December 1, 2016 1 / 18
2. Outline
Overview
Renormalization Group
Physical world with various length scales
Symmetry and Scale Invariance
Restricted Boltzmann Machine
Generative, Energy-based Model, Unsupervised Learning Algorithm
Richard Feynman: What I Cannot Create, I Do Not Understand.
Mapping
Unsupervised Deep Learning Implements the Kadanoff Real Space
Variational Renormalization Group
$$H^{\mathrm{RG}}_\lambda[\{h_j\}] = H^{\mathrm{RBM}}_\lambda[\{h_j\}]$$
3. Overview of Variational RG
Statistical Physics
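An ensemble of N spins $\{v_i\}$, each taking the value ±1, where i is the position index on some lattice. Boltzmann distribution and partition function:
$$P(\{v_i\}) = \frac{e^{-H(\{v_i\})}}{Z}, \quad \text{where} \quad Z = \mathrm{Tr}_{v_i}\, e^{-H(\{v_i\})} = \sum_{v_1, v_2, \ldots = \pm 1} e^{-H(\{v_i\})}$$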
Typically, the Hamiltonian depends on a set of couplings $\{K_s\}$:
$$H[\{v_i\}] = -\sum_{i} K_i v_i - \sum_{ij} K_{ij} v_i v_j - \sum_{ijk} K_{ijk} v_i v_j v_k + \ldots$$
Free energy of the spin system:
$$F = -\log Z = -\log\left(\mathrm{Tr}_{v_i}\, e^{-H(\{v_i\})}\right)$$
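As a rough illustration of these definitions, the sketch below enumerates every configuration of a tiny spin system and computes Z, F and the Boltzmann probabilities by brute force. The 3-spin open chain, the coupling value 0.5, the truncation at pairwise terms, and the convention that the pair sum runs over i < j are all assumptions made for the example, not taken from the slides.

```python
import itertools
import math

def hamiltonian(v, K1, K2):
    """H = -sum_i K1[i] v_i - sum_{i<j} K2[i][j] v_i v_j (higher-order terms dropped)."""
    n = len(v)
    field = sum(K1[i] * v[i] for i in range(n))
    pair = sum(K2[i][j] * v[i] * v[j] for i in range(n) for j in range(i + 1, n))
    return -field - pair

def boltzmann(n, K1, K2):
    """Return (Z, F, P), where P maps each configuration to its probability."""
    configs = list(itertools.product((-1, 1), repeat=n))
    weights = {v: math.exp(-hamiltonian(v, K1, K2)) for v in configs}
    Z = sum(weights.values())
    P = {v: wt / Z for v, wt in weights.items()}
    return Z, -math.log(Z), P

# Hypothetical 3-spin open chain: nearest-neighbour coupling 0.5, no field.
K1 = [0.0, 0.0, 0.0]
K2 = [[0.0, 0.5, 0.0], [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]]
Z, F, P = boltzmann(3, K1, K2)
```

Brute-force enumeration costs 2^N terms, so this is only practical for a handful of spins; it is meant to make the trace notation concrete.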
4. Overview of Variational RG
Overview of Variational Renormalization Group
Idea behind RG: find a new coarse-grained description of the spin
system, in which short-distance fluctuations have been integrated out.
N physical spins: $\{v_i\}$, couplings $\{K\}$
M coarse-grained spins: $\{h_j\}$, couplings $\{\tilde{K}\}$, where M < N
The renormalization transformation is often represented as a mapping
$$\{K\} \to \{\tilde{K}\}$$
Coarse-grained Hamiltonian
$$H^{\mathrm{RG}}[\{h_j\}] = -\sum_{i} \tilde{K}_i h_i - \sum_{ij} \tilde{K}_{ij} h_i h_j - \sum_{ijk} \tilde{K}_{ijk} h_i h_j h_k + \ldots$$
From here on, we write $v_i$ for $\{v_i\}$ (and $h_j$ for $\{h_j\}$) when there is no ambiguity.
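A standard textbook case where the mapping from K to the renormalized coupling can be written down exactly is decimation of the 1D Ising chain, which is not discussed on the slides and is used here only as an illustration. Tracing out the middle spin of three neighbours gives a new coupling of (1/2) log cosh(2K). The function name `decimate` and the starting value K = 1.0 are hypothetical choices.

```python
import math

def decimate(K):
    """Effective coupling after tracing out the middle spin of a 3-spin chain."""
    # Sum over s2 = +/-1 of exp(K*s1*s2 + K*s2*s3), for the two end-spin cases.
    w_aligned = 2.0 * math.cosh(2.0 * K)  # s1 == s3
    w_anti = 2.0                          # s1 == -s3
    # Match to A * exp(K_new * s1 * s3): the weight ratio equals exp(2 * K_new).
    return 0.5 * math.log(w_aligned / w_anti)

# Iterating the map shows the coupling flowing towards K = 0.
K = 1.0
flow = [K]
for _ in range(5):
    flow.append(decimate(flow[-1]))
```

Each application halves the number of spins, so the list `flow` traces the couplings seen at successively longer length scales.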
5. Overview of Variational RG
Overview of Variational Renormalization Group
Variational RG scheme (Kadanoff)
Coarse-graining procedure: $T_\lambda(v_i, h_j)$ couples auxiliary spins $h_j$ to physical spins $v_i$.
Naturally, we marginalize over the physical spins:
$$\exp\left(-H^{\mathrm{RG}}_\lambda(h_j)\right) = \mathrm{Tr}_{v_i} \exp\left(T_\lambda(v_i, h_j) - H(v_i)\right)$$
The free energy of the coarse-grained system:
$$F^{h}_\lambda = -\log\left(\mathrm{Tr}_{h_j}\, e^{-H^{\mathrm{RG}}_\lambda(h_j)}\right)$$
Choose the parameters λ so that long-distance observables are invariant.
Minimize the free energy difference
$$\Delta F = F^{h}_\lambda - F^{v}$$
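The definitions above can be checked numerically on a toy system. In the sketch below, the physical Hamiltonian `H` (an open chain with coupling 0.5) and the form of `T` (one auxiliary spin coupled to the block sum, normalised so that the trace over h of exp(T) equals 1 for every v) are assumptions chosen for illustration. With that normalisation, the definitions imply ΔF = 0 for any λ, which the code confirms by enumeration.

```python
import itertools
import math

def H(v, J=0.5):
    """Hypothetical physical Hamiltonian: open chain, nearest-neighbour coupling J."""
    return -J * sum(v[i] * v[i + 1] for i in range(len(v) - 1))

def T(v, h, lam):
    """Hypothetical coupling of one auxiliary spin h to the block sum of v,
    normalised so that sum over h of exp(T(v, h)) is 1 for every v."""
    s = sum(v)
    return lam * h * s - math.log(2.0 * math.cosh(lam * s))

def delta_F(n, lam):
    """Free energy difference F_h - F_v for n physical spins and one auxiliary spin."""
    vs = list(itertools.product((-1, 1), repeat=n))
    F_v = -math.log(sum(math.exp(-H(v)) for v in vs))
    # exp(-H_RG(h)) = Tr_v exp(T(v, h) - H(v))
    H_rg = {h: -math.log(sum(math.exp(T(v, h, lam) - H(v)) for v in vs))
            for h in (-1, 1)}
    F_h = -math.log(sum(math.exp(-H_rg[h]) for h in H_rg))
    return F_h - F_v

dF = delta_F(3, lam=0.7)
```

Dropping the `math.log(...)` normalisation term from `T` makes ΔF non-zero, which is the situation where λ has to be tuned.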
6. Overview of Variational RG
Overview of Variational Renormalization Group
7. RBMs and Deep Neural Networks
Restricted Boltzmann Machine
Binary data with probability distribution $P(v_i)$. Energy function:
$$E(v_i, h_j) = \sum_{ij} w_{ij} v_i h_j + \sum_{i} c_i v_i + \sum_{j} b_j h_j$$
where we denote the parameters by $\lambda = \{w, b, c\}$. Joint probability:
$$p_\lambda(v_i, h_j) = \frac{e^{-E(v_i, h_j)}}{Z}$$
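A minimal sketch of the energy function and joint distribution, evaluated by enumerating all states of a very small RBM. The layer sizes (3 visible, 2 hidden), the parameter values, and the use of ±1 units rather than 0/1 units are assumptions for the example.

```python
import itertools
import math

def energy(v, h, w, c, b):
    """E(v, h) = sum_ij w[i][j] v_i h_j + sum_i c[i] v_i + sum_j b[j] h_j."""
    inter = sum(w[i][j] * v[i] * h[j] for i in range(len(v)) for j in range(len(h)))
    return inter + sum(ci * vi for ci, vi in zip(c, v)) + sum(bj * hj for bj, hj in zip(b, h))

def joint(w, c, b, values=(-1, 1)):
    """Return p_lambda(v, h) for every configuration, by brute-force enumeration."""
    n, m = len(c), len(b)
    states = [(v, h) for v in itertools.product(values, repeat=n)
              for h in itertools.product(values, repeat=m)]
    weights = {s: math.exp(-energy(s[0], s[1], w, c, b)) for s in states}
    Z = sum(weights.values())
    return {s: wt / Z for s, wt in weights.items()}

# Hypothetical parameters: 3 visible units, 2 hidden units.
w = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
c = [0.0, 0.1, -0.1]
b = [0.2, -0.3]
p = joint(w, c, b)
```

The dictionary `p` is keyed by `(v, h)` tuples, so `p[((1, 1, 1), (1, 1))]` is the probability of the all-up state.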
8. RBMs and Deep Neural Networks
Restricted Boltzmann Machine
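Variational distribution of the visible variables, and the corresponding marginal over the hidden variables:
$$p_\lambda(v_i) = \sum_{h_j} p_\lambda(v_i, h_j) = \mathrm{Tr}_{h_j}\, p_\lambda(v_i, h_j) := \frac{e^{-H^{\mathrm{RBM}}_\lambda(v_i)}}{Z}$$
$$p_\lambda(h_j) = \sum_{v_i} p_\lambda(v_i, h_j) = \mathrm{Tr}_{v_i}\, p_\lambda(v_i, h_j) := \frac{e^{-H^{\mathrm{RBM}}_\lambda(h_j)}}{Z}$$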
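Because both marginals share the same Z as the joint distribution, the visible-layer Hamiltonian is minus the log of the trace of exp(-E) over the hidden units. The sketch below computes it by brute force and compares it with the closed form that follows when the hidden units take values ±1 (an assumption; 0/1 units give a different closed form). Parameter values are hypothetical.

```python
import itertools
import math

def energy(v, h, w, c, b):
    """RBM energy: sum_ij w[i][j] v_i h_j + sum_i c[i] v_i + sum_j b[j] h_j."""
    inter = sum(w[i][j] * v[i] * h[j] for i in range(len(v)) for j in range(len(h)))
    return inter + sum(ci * vi for ci, vi in zip(c, v)) + sum(bj * hj for bj, hj in zip(b, h))

def H_rbm_visible(v, w, c, b):
    """Brute force: H_RBM(v) = -log sum_h exp(-E(v, h))."""
    hs = itertools.product((-1, 1), repeat=len(b))
    return -math.log(sum(math.exp(-energy(v, h, w, c, b)) for h in hs))

def H_rbm_visible_closed(v, w, c, b):
    """Closed form for +/-1 hidden units: each hidden unit contributes -log(2 cosh(a_j))."""
    total = sum(ci * vi for ci, vi in zip(c, v))
    for j in range(len(b)):
        a = b[j] + sum(w[i][j] * v[i] for i in range(len(v)))
        total -= math.log(2.0 * math.cosh(a))
    return total

# Hypothetical parameters: 3 visible units, 2 hidden units.
w = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
c = [0.0, 0.1, -0.1]
b = [0.2, -0.3]
gaps = [abs(H_rbm_visible(v, w, c, b) - H_rbm_visible_closed(v, w, c, b))
        for v in itertools.product((-1, 1), repeat=3)]
```

The closed form is what makes the hidden units cheap to sum out: the bipartite structure means the trace factorises over j.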
Kullback-Leibler divergence:
$$D_{\mathrm{KL}}\left(P(v_i)\,\|\,p_\lambda(v_i)\right) = \sum_{v_i} P(v_i) \log \frac{P(v_i)}{p_\lambda(v_i)}$$
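The divergence is straightforward to evaluate when both distributions are stored as tables over the same configurations. In this sketch the data distribution `P_data` and the uniform model `p_model` are made-up two-spin examples; in RBM training this is the quantity one tries to reduce by adjusting λ.

```python
import math

def kl_divergence(P, p):
    """D_KL(P || p) = sum_x P(x) log(P(x) / p(x)); terms with P(x) = 0 contribute 0."""
    return sum(Px * math.log(Px / p[x]) for x, Px in P.items() if Px > 0.0)

# Hypothetical two-spin data distribution favouring aligned spins.
P_data = {(1, 1): 0.4, (1, -1): 0.1, (-1, 1): 0.1, (-1, -1): 0.4}
# Hypothetical model: uniform over the four configurations.
p_model = {x: 0.25 for x in P_data}

d = kl_divergence(P_data, p_model)
```

The divergence is zero only when the model reproduces the data distribution exactly, and it is not symmetric in its two arguments.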
9. Exact Mapping VRG to DL
Mapping Variational RG to RBM
In the RG scheme, the couplings between visible and hidden spins are encoded
by the operator $T_\lambda$. In an RBM, the analogous role is played by the joint energy
function:
$$T_\lambda(v_i, h_j) = -E(v_i, h_j) + H(v_i)$$
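To derive the equivalent statement, start from the coarse-grained Hamiltonian:
$$\frac{e^{-H^{\mathrm{RG}}_\lambda(h_j)}}{Z} = \frac{\mathrm{Tr}_{v_i}\, e^{T_\lambda(v_i, h_j) - H(v_i)}}{Z} = \mathrm{Tr}_{v_i} \frac{e^{-E(v_i, h_j)}}{Z} = p_\lambda(h_j) = \frac{e^{-H^{\mathrm{RBM}}_\lambda(h_j)}}{Z}$$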
Equating the first and last expressions yields
$$H^{\mathrm{RG}}_\lambda[\{h_j\}] = H^{\mathrm{RBM}}_\lambda[\{h_j\}] \qquad (1)$$
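Equation (1) can be confirmed numerically on a toy system by computing both Hamiltonians independently from their definitions. The physical Hamiltonian `H` (an open chain with coupling 0.5), the RBM parameter values, and the ±1 convention for all units are assumptions for this sketch.

```python
import itertools
import math

def H(v, J=0.5):
    """Hypothetical physical Hamiltonian: open chain with coupling J."""
    return -J * sum(v[i] * v[i + 1] for i in range(len(v) - 1))

def E(v, h, w, c, b):
    """RBM energy: sum_ij w[i][j] v_i h_j + sum_i c[i] v_i + sum_j b[j] h_j."""
    inter = sum(w[i][j] * v[i] * h[j] for i in range(len(v)) for j in range(len(h)))
    return inter + sum(ci * vi for ci, vi in zip(c, v)) + sum(bj * hj for bj, hj in zip(b, h))

def T(v, h, w, c, b):
    """Coupling operator chosen as T = -E + H."""
    return -E(v, h, w, c, b) + H(v)

def H_rg(h, w, c, b):
    """RG side: exp(-H_RG(h)) = Tr_v exp(T(v, h) - H(v))."""
    vs = itertools.product((-1, 1), repeat=len(c))
    return -math.log(sum(math.exp(T(v, h, w, c, b) - H(v)) for v in vs))

def H_rbm(h, w, c, b):
    """RBM side: exp(-H_RBM(h)) = Tr_v exp(-E(v, h))."""
    vs = itertools.product((-1, 1), repeat=len(c))
    return -math.log(sum(math.exp(-E(v, h, w, c, b)) for v in vs))

# Hypothetical parameters: 3 visible units, 2 hidden units.
w = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
c = [0.0, 0.1, -0.1]
b = [0.2, -0.3]
max_gap = max(abs(H_rg(h, w, c, b) - H_rbm(h, w, c, b))
              for h in itertools.product((-1, 1), repeat=len(b)))
```

The agreement holds for any choice of `H`, since the physical Hamiltonian cancels inside the trace once T is defined this way.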
10. Exact Mapping VRG to DL
Mapping Variational RG to RBM
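The operator $T_\lambda$ can be viewed as a variational approximation for the
conditional probability:
$$e^{T_\lambda(v_i, h_j)} = e^{-E(v_i, h_j) + H(v_i)} = \frac{p_\lambda(v_i, h_j)}{p_\lambda(v_i)}\, e^{H(v_i) - H^{\mathrm{RBM}}_\lambda(v_i)} = p_\lambda(h_j | v_i)\, e^{H(v_i) - H^{\mathrm{RBM}}_\lambda(v_i)}$$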
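The middle equality may be easier to follow with the intermediate step written out. Using only the definitions of the joint and marginal distributions from the RBM slides, one way to fill it in is:

```latex
\begin{aligned}
e^{-E(v_i,h_j)} &= Z\,p_\lambda(v_i,h_j), \qquad
e^{-H^{\mathrm{RBM}}_\lambda(v_i)} = Z\,p_\lambda(v_i) \\
e^{T_\lambda(v_i,h_j)} &= e^{-E(v_i,h_j)}\,e^{H(v_i)}
  = Z\,p_\lambda(v_i,h_j)\,e^{H(v_i)} \\
&= \frac{p_\lambda(v_i,h_j)}{p_\lambda(v_i)}\,\bigl(Z\,p_\lambda(v_i)\bigr)\,e^{H(v_i)}
  = p_\lambda(h_j \mid v_i)\,e^{H(v_i)-H^{\mathrm{RBM}}_\lambda(v_i)}
\end{aligned}
```

When the RBM reproduces the physical Hamiltonian on the visible spins, so that the exponent in the last factor vanishes, the operator reduces to the conditional probability and its trace over the hidden spins equals 1.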
11. Examples
Examples: 2D Ising Model
Two-dimensional nearest-neighbor Ising model with ferromagnetic coupling:
$$H(\{v_i\}) = -J \sum_{\langle ij \rangle} v_i v_j$$
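The phase transition is quoted at $J/(k_B T) = 0.4352$; for comparison, Onsager's exact square-lattice result is $J/(k_B T_c) = \tfrac{1}{2}\ln(1+\sqrt{2}) \approx 0.4407$.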
Experiment Setup
20,000 samples, 40x40 periodic lattice
RBM’s architecture 1600-400-100-25
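The slides do not say how the samples were generated. One common choice is single-spin-flip Metropolis sampling, sketched below under that assumption on a small periodic lattice. The function name `metropolis_ising`, the 8x8 lattice size, the coupling 0.4 (standing for J/(k_B T)), the sweep count, and the seed are all hypothetical values picked to keep the example fast.

```python
import math
import random

def metropolis_ising(L, J, sweeps, seed=0):
    """Return one L x L configuration of +/-1 spins after `sweeps` Metropolis sweeps."""
    rng = random.Random(seed)
    spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    for _ in range(sweeps * L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        # Sum of the four nearest neighbours with periodic boundaries.
        nb = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
              + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        # Change in H = -J * sum_<ij> v_i v_j if spin (i, j) is flipped.
        dE = 2.0 * J * spins[i][j] * nb
        if dE <= 0.0 or rng.random() < math.exp(-dE):
            spins[i][j] = -spins[i][j]
    return spins

sample = metropolis_ising(L=8, J=0.4, sweeps=50)
# Flatten to a vector of visible units, one per lattice site.
visible = [s for row in sample for s in row]
```

For the 40x40 lattice on the slide the flattened vector has 1600 entries, matching the first RBM layer; the later layer sizes 400, 100 and 25 correspond to 20x20, 10x10 and 5x5 grids, a factor of 4 fewer units per layer.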
12. Examples
Examples: 2D Ising Model
Figure: Top layer
13. Examples
Examples: 2D Ising Model
Figure: Middle layer
15. Conclusion
Conclusion and Discussion
One-to-one mapping between RBM-based DNNs and variational RG
Suggests that deep learning implements an RG-like scheme to extract important
features from data
16. Relate to us
Relate to us: Auto-Encoder and Convolutional AE
z is the code extracted by the machine
$$\phi : X \to Z, \qquad \psi : Z \to X$$
$$\arg\min_{\phi, \psi} \|X - (\psi \circ \phi) X\|^2$$
Figure: Scheme of Auto-Encoder
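To make the reconstruction objective concrete, here is a minimal sketch with a linear encoder and decoder sharing one weight vector. The tied weights, the one-dimensional code, and the toy data lying on a line are assumptions for illustration; a real auto-encoder would use nonlinear layers and learn the weights by gradient descent.

```python
import math

def encode(x, w):
    """phi: X -> Z, a hypothetical linear encoder projecting x onto direction w."""
    return sum(xi * wi for xi, wi in zip(x, w))

def decode(z, w):
    """psi: Z -> X, the matching linear decoder (tied weights, an assumption)."""
    return [z * wi for wi in w]

def reconstruction_loss(X, w):
    """Sum over samples of ||x - psi(phi(x))||^2."""
    total = 0.0
    for x in X:
        x_hat = decode(encode(x, w), w)
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total

# Toy data lying on the line x2 = 2 * x1, so a 1D code suffices.
X = [[t, 2.0 * t] for t in (-1.0, 0.5, 2.0)]
w_good = [1.0 / math.sqrt(5.0), 2.0 / math.sqrt(5.0)]  # unit vector along the line
w_bad = [1.0, 0.0]                                     # ignores the second coordinate

loss_good = reconstruction_loss(X, w_good)
loss_bad = reconstruction_loss(X, w_bad)
```

The aligned direction reconstructs the data essentially exactly, while the misaligned one loses the second coordinate, which is the sense in which the minimiser picks out the code that best summarises the data.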
17. Relate to us
Relate to us: Auto-Encoder and Convolutional AE