This document provides an overview of autoencoders and variational autoencoders. It discusses how principal component analysis (PCA) is related to linear autoencoders and can be performed using backpropagation. Deep and nonlinear autoencoders are also covered. The document then introduces variational autoencoders, which combine variational inference with autoencoders to allow for probabilistic latent space modeling. It explains how variational autoencoders are trained using backpropagation through reparameterization to maximize the evidence lower bound.
4. PCA for dimensionality reduction
● The $u$ (with $\|u\| = 1$) that maximizes the variance of PC1, $\max_u \mathrm{Var}(Xu)$,
● also minimizes the reconstruction error $\|X - Xuu^\top\|_F^2$
○ Note: this is not the same as OLS, which minimizes the vertical residuals $\sum_i (y_i - x_i^\top \beta)^2$; PCA minimizes the perpendicular distances to the fitted subspace
There are efficient solvers for this, but we could also use backpropagation
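A minimal sketch of the "efficient solver" route, using NumPy's SVD on centered data (the toy data and variable names are illustrative assumptions). It checks both claims above: PC1's direction maximizes variance and minimizes rank-1 reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # toy correlated data
Xc = X - X.mean(axis=0)                                  # PCA assumes centered data

# Efficient solver: right singular vectors of Xc are the principal directions
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
u1 = Vt[0]                                               # direction of PC1

# PC1 maximizes variance: Var(Xc @ u1) equals S[0]^2 / (n - 1)
print((Xc @ u1).var(ddof=1), S[0] ** 2 / (len(Xc) - 1))  # equal

# ...and the same u1 minimizes the rank-1 reconstruction error ||Xc - Xc u1 u1^T||_F^2,
# which equals the energy in the remaining singular values
recon_err = np.linalg.norm(Xc - np.outer(Xc @ u1, u1)) ** 2
print(recon_err, (S[1:] ** 2).sum())                     # equal
```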
5. PCA through backpropagation
● [Figure: a network with an input layer, a narrow hidden bottleneck, and an output layer, trained to reproduce its own input]
● This is an autoencoder
● If the neurons are linear, it is similar to PCA (see the sketch after this list)
○ Caveat: PCs are orthogonal, autoencoded components are not, but they will span the same space
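A hedged illustration of that caveat (PyTorch; the data, sizes, and hyperparameters are assumptions, not from the slides): a linear autoencoder trained by backpropagation learns a bottleneck whose span matches the top principal subspace, even though its individual directions are not orthogonal.

```python
import numpy as np
import torch

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 10))  # toy data
X = torch.tensor(X - X.mean(axis=0), dtype=torch.float32)    # centered

k = 3                                     # bottleneck width
enc = torch.nn.Linear(10, k, bias=False)  # "linear neurons": no activation
dec = torch.nn.Linear(k, 10, bias=False)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)

for _ in range(2000):
    opt.zero_grad()
    loss = ((dec(enc(X)) - X) ** 2).mean()  # reconstruction error
    loss.backward()
    opt.step()

# Compare the learned subspace with the top-k principal subspace: the
# cosines of the principal angles should be near 1 even though the
# learned directions themselves are not orthonormal.
W = dec.weight.detach().numpy()                  # 10 x k decoder columns
_, _, Vt = np.linalg.svd(X.numpy(), full_matrices=False)
Q_pca, _ = np.linalg.qr(Vt[:k].T)
Q_ae, _ = np.linalg.qr(W)
print(np.linalg.svd(Q_pca.T @ Q_ae, compute_uv=False))  # ≈ [1, 1, 1]
```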
12. Many variations of autoencoders
● Sparse autoencoders
● Denoising autoencoders (sketched after this list)
● Convolutional autoencoders
○ U-Net is a sort of autoencoder
● And more…
● I’d like to introduce Variational Autoencoders
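To make one of these variants concrete, here is a hedged sketch of the denoising idea: corrupt the input with noise, but compute the loss against the clean target. The architecture, sizes, and noise level are illustrative assumptions.

```python
import torch

# Illustrative sizes: flattened 28x28 images -> 64-d code -> reconstruction
model = torch.nn.Sequential(
    torch.nn.Linear(784, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 784),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def denoising_step(x_clean, noise_std=0.3):
    x_noisy = x_clean + noise_std * torch.randn_like(x_clean)  # corrupt the input
    opt.zero_grad()
    loss = ((model(x_noisy) - x_clean) ** 2).mean()  # reconstruct the CLEAN target
    loss.backward()
    opt.step()
    return loss.item()

x = torch.rand(128, 784)  # stand-in batch; real use would feed image data
for _ in range(100):
    denoising_step(x)
```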
16. Variational Inference (quick overview)
● Generative model: a latent variable $z$ produces an observation $x$, i.e. $p(x, z) = p(x \mid z)\,p(z)$
● We want the posterior $p(z \mid x) = p(x \mid z)\,p(z)\,/\,p(x)$
○ The evidence $p(x) = \int p(x \mid z)\,p(z)\,dz$ is intractable: problematic...
● Variational Inference solution: approximate $p(z \mid x)$ with a $q(z)$, chosen to be a distribution we can work with
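The standard way to turn this approximation into a training objective (implied by the overview's mention of the evidence lower bound; this is the textbook identity, not something specific to these slides) is to decompose the log evidence:

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log p(x \mid z)\right]
    - \mathrm{KL}\!\left(q(z) \,\|\, p(z)\right)}_{\text{ELBO}}
  + \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x)\right)
```

Since the last KL term is nonnegative, maximizing the ELBO over $q$ both tightens the bound on $\log p(x)$ and pushes $q(z)$ toward the intractable posterior.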
17. Side note on information theory
● Information
○ “How many bits do we need to represent event x if we optimized for p(x)?”
● Entropy
○ “What is the expected amount of information in each event drawn from p(x)?” (how many bits?)
● Cross-entropy
○ “What is the expected amount of information needed to represent events from p(x) if we optimized for q(x)?” (how many bits?)
● Kullback-Leibler divergence: “cross-entropy - entropy”
○ “How many more bits will we need to represent events from p(x) if we optimized for q(x)?”
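In symbols (a standard rendering of the four definitions above, using base-2 logs to match the "bits" phrasing):

```latex
I(x) = -\log_2 p(x), \qquad
H(p) = \mathbb{E}_{x \sim p}\!\left[-\log_2 p(x)\right], \qquad
H(p, q) = \mathbb{E}_{x \sim p}\!\left[-\log_2 q(x)\right],
```
```latex
\mathrm{KL}(p \,\|\, q) = H(p, q) - H(p)
  = \mathbb{E}_{x \sim p}\!\left[\log_2 \frac{p(x)}{q(x)}\right] \ge 0.
```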