1) The document discusses super-resolution techniques in deep learning, including inverse problems, image restoration problems, and different deep learning models.
2) Early models like SRCNN used convolutional networks for super-resolution but were shallow; later models incorporated residual learning (VDSR) and recursive learning (DRCN), and grew very deep with residual blocks (SRResNet).
3) Key developments include EDSR, which provided a strong backbone model, and GAN-based approaches like SRGAN, which aim at more realistic textures but call for new evaluation metrics.
3. Inverse problems: scattering
[Figure: inverse scattering setup. An observed field $y$ (e.g. an electromagnetic or acoustic wave, governed by the Maxwell or Lamé equations) is related to the original density $x$ through the forward model $y = H(x) + n$ with noise $n$; the inverse problem recovers the restored density, for physical property reconstruction* or source localization.]
* Yoo J et al., SIAM, 2016
5. System (H)
Forward model with noise: $y = H(x) + n$
Image restoration (IR): by specifying a different degradation operator $H$, one correspondingly gets different IR tasks:
• Deblurring or deconvolution: $Hx = k \otimes x$
• Super-resolution: $Hx = (k \otimes x)\downarrow_s$
• Denoising: $Hx = Ix$
• Inpainting: $Hx = I_{\text{masked}}\,x$
General formulation of IR problems:
6. Given a single image $y$, solve for $x$:
• $y$: known low-resolution (LR) image
• $x$: unknown high-resolution (HR) image
• $k$: unknown blur kernel (typically set as identity)
• $\downarrow_s$: downsample by a factor of $s$ (typically done with a bicubic function)
• $n$: additive white Gaussian noise (AWGN)
Single Image Super-Resolution: $y = (x \otimes k)\downarrow_s + n$
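To make the formulation concrete, here is a minimal NumPy/SciPy sketch of this degradation model for a 2-D grayscale image; the Gaussian kernel, plain decimation in place of bicubic downsampling, and the default parameter values are illustrative assumptions, not the slides' exact pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(x, sigma_blur=1.6, scale=3, noise_level=2.0, rng=None):
    """Simulate y = (x conv k) downsampled by s, plus AWGN n.

    Assumptions: k is a Gaussian kernel (sigma_blur) and downsampling is
    plain decimation; the slides note bicubic is typical in practice.
    """
    rng = np.random.default_rng() if rng is None else rng
    blurred = gaussian_filter(x, sigma=sigma_blur)           # x conv k
    down = blurred[::scale, ::scale]                         # downsample by s
    return down + rng.normal(0.0, noise_level, down.shape)   # + AWGN n
```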
8. Image restoration (IR)
From a Bayesian perspective, the solution $\hat{x}$ can be obtained by solving a Maximum A Posteriori (MAP) problem:
$\hat{x} = \arg\max_x\ \log p(y \mid x) + \log p(x)$
More formally,
$\hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x)$
where the data fidelity term guarantees that the solution accords with the degradation process, and the regularization term enforces the desired properties of the output.
1) Model-based optimization 2) Discriminative learning methods
For model-based optimization, the question is: what kinds of prior knowledge can we "impose on" our model?
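As an illustration of "imposing" prior knowledge, below is a minimal sketch of model-based optimization by gradient descent with a hand-crafted Tikhonov smoothness prior $\Phi(x) = \tfrac{1}{2}\lVert \nabla x \rVert^2$; the callables `H` and `Ht` (the degradation operator and its adjoint) and all step sizes are hypothetical placeholders.

```python
import numpy as np

def map_restore(y, H, Ht, lam=0.05, step=0.2, n_iters=200):
    """Gradient descent on (1/2)||y - Hx||^2 + lam * Phi(x), with the
    hand-crafted smoothness prior Phi(x) = (1/2)||grad x||^2."""
    x = Ht(y)  # crude initialization by back-projection
    for _ in range(n_iters):
        fidelity_grad = Ht(H(x) - y)
        # the gradient of the Tikhonov prior is the negative Laplacian of x
        lap = (np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4 * x)
        x -= step * (fidelity_grad - lam * lap)
    return x
```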
10. Image restoration (IR)
(Same MAP formulation as above: $\hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x)$, data fidelity plus regularization.)
For discriminative learning methods, the question becomes: what kinds of prior knowledge can we "learn using" our model?
11. Discriminative learning methods
What kinds of prior knowledge can we "learn using" our model?
$\min_\Theta\ \mathcal{L}(\hat{x}, x), \quad \text{s.t. } \hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x; \Theta)$
compared with the fixed-prior objective $\hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x)$ (data fidelity + regularization).
Here, we learn the prior parameters $\Theta$ through the optimization of a loss function $\mathcal{L}$ on a training set (image pairs).
12. Discriminative learning methods
CNNs ($f$): $f(y)$ built from Conv, ReLU, pooling, etc.
General statement of the problem:
$\min_\Theta\ \mathcal{L}(\hat{x}, x), \quad \text{s.t. } \hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x; \Theta)$
By replacing MAP inference with a predefined nonlinear function $\hat{x} = f(y, H; \Theta)$, solving an IR problem with CNNs can be treated as one of the discriminative learning methods: the network itself learns the image prior model.
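A minimal PyTorch sketch of this discriminative setup, assuming a generic `model` and a `loader` yielding (degraded, clean) image pairs; the Adam optimizer, MSE loss, and hyper-parameters are illustrative choices, not prescribed by the slides.

```python
import torch
import torch.nn.functional as F

def train_discriminative(model, loader, n_epochs=10, lr=1e-4):
    """Fit Theta of the predefined nonlinear function x_hat = f(y; Theta)
    on a training set of (degraded, clean) image pairs."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_epochs):
        for y, x in loader:      # degraded input y, clean target x
            x_hat = model(y)     # the CNN replaces MAP inference
            loss = F.mse_loss(x_hat, x)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```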
13. SRCNN
The start of deep learning in SISR
• Links the CNN architecture to traditional "sparse coding" methods
• The first end-to-end framework: every module is optimized jointly through the learning process
14. SRCNN: Results
[Figure: results on the Set5 dataset for a fixed upscaling factor. SRCNN surpasses the bicubic baseline and outperforms the sparse-coding-based method. Also shown: the first-layer filters trained for that upscaling factor, and example feature maps of different layers.]
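For reference, a minimal PyTorch sketch of the three-layer SRCNN described above (the 9-1-5 filter setting from the paper, with padding added here so spatial sizes are preserved); it operates on a bicubically upsampled LR image.

```python
import torch.nn as nn

class SRCNN(nn.Module):
    """Three stages: patch extraction -> nonlinear mapping -> reconstruction."""
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4),  # patch extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),                   # nonlinear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),  # reconstruction
        )

    def forward(self, x):
        return self.body(x)
```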
21. VDSR (CVPR '16)
Network design: 1st cornerstone, residual learning
Very Deep SR network
• The first "deep" SR network (20 layers)
• Proposed a practical method to actually train the "deep" layers (before batch normalization was common)
27. Network design: 2nd cornerstone
EDSR (CVPR '17)
• The first to provide a backbone for the SR task: a very stable and reproducible model
• Removed batch normalization layers
• Residual scaling (see the block sketch below)
• Self-geometric ensemble method
• Exploits the pretrained ×2 model for the other scales
• Gains: performance, model-size reduction (43M → 8M), flexibility (partially scale-agnostic)
30. Network design
Generative Adversarial Networks in Super-Resolution: candidate of the 4th cornerstone?
• "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network" (SRGAN, CVPR '17)
• "A Fully Progressive Approach to Single-Image Super-Resolution" (ProSR, CVPR '18)
• "ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks" (ECCV '18)
• "2018 PIRM Challenge on Perceptual Image Super-Resolution" (ECCV '18)
Non-local & attention modules: the 3rd cornerstone, at least performance-wise, though too sensitive and with many hyper-parameters (a channel-attention sketch follows this list)
• "Image Super-Resolution Using Very Deep Residual Channel Attention Networks" (RCAN, ECCV '18)
• "Non-local Recurrent Network for Image Restoration" (NLRN, NIPS '18)
• "Residual Non-local Attention Networks for Image Restoration" (RNAN, ICLR '19)
31. GANs in SR: candidate of the 4th cornerstone?
Problems
• GAN-based results do not go along with the traditional metrics, PSNR / SSIM
• A new metric? See the "2018 PIRM Challenge on Perceptual Image Super-Resolution" (ECCV '18)
[Figure: ProSRGAN (CVPR '18) result.]
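To see why GAN training and PSNR/SSIM pull in different directions, here is a hedged sketch of an SRGAN-style generator objective: a content loss on feature maps (VGG features in the paper) plus a small adversarial term. The 1e-3 weight follows the paper, while the function name and tensor arguments are illustrative.

```python
import torch
import torch.nn.functional as F

def generator_loss(sr_feat, hr_feat, fake_logits, adv_weight=1e-3):
    """SRGAN-style objective: content loss on (e.g. VGG) feature maps plus
    an adversarial term pushing the discriminator to call SR outputs real."""
    content = F.mse_loss(sr_feat, hr_feat)
    adversarial = F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))
    return content + adv_weight * adversarial
```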
33. Summary (until now)
General formulation of IR: $\hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x)$
Methods:
1) Model-based optimization
   Pros: general enough to handle different IR problems; clear physical meanings
   Cons: hand-crafted priors (weak representations); the optimization task is time-consuming
2) Discriminative learning methods
   Pros: data-driven end-to-end learning; efficient inference during the test phase
   Cons: the generality of the model is limited; the interpretability of the model is limited
"Can we somehow get the best of both worlds?"
35. Getting the best of both worlds
Variable Splitting Methods
• Want to deal with the data fidelity term and the regularization term separately
• Specifically, the regularization term then corresponds only to a denoising subproblem
• Alternating Direction Method of Multipliers (ADMM), Half Quadratic Splitting (HQS)

Split off an auxiliary variable $z$:
$\hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(z) \quad \text{s.t. } z = x$
Cost function of HQS: $\mathcal{L}_\mu(x, z) = \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(z) + \tfrac{\mu}{2}\lVert z - x \rVert^2$

Alternate between the two subproblems:
$x_{k+1} = \arg\min_x\ \lVert y - Hx \rVert^2 + \mu\lVert x - z_k \rVert^2 = (H^\top H + \mu I)^{-1}(H^\top y + \mu z_k)$
$z_{k+1} = \arg\min_z\ \tfrac{1}{2(\sqrt{\lambda/\mu})^2}\lVert z - x_{k+1} \rVert^2 + \Phi(z)$

In the Bayesian perspective, the $z$-update is a Gaussian denoising subproblem with noise level $\sqrt{\lambda/\mu}$:
$z_{k+1} = \mathrm{Denoiser}(x_{k+1}, \sqrt{\lambda/\mu})$
1. Any gray or color denoiser can be plugged in to solve a variety of inverse problems.
2. The explicit image prior can remain unknown while solving the original equation.
3. Several complementary denoisers that exploit different image priors can be jointly utilized to solve one specific problem.
(A minimal sketch of the resulting loop follows.)
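A minimal sketch of the resulting HQS plug-and-play loop: `H` and `Ht` are hypothetical callables for the degradation operator and its adjoint, the x-update uses a few gradient steps in place of the closed form $(H^\top H + \mu I)^{-1}(H^\top y + \mu z_k)$, and `denoiser` stands in for any plugged-in (e.g. CNN) denoiser.

```python
import numpy as np

def hqs_plug_and_play(y, H, Ht, denoiser, lam=0.01, mu=0.1, n_iters=30):
    """Half Quadratic Splitting with a plug-and-play denoiser prior."""
    z = Ht(y)  # crude initialization by back-projection
    for _ in range(n_iters):
        # x-update: quadratic data-fidelity subproblem, solved here by a
        # few gradient steps instead of the closed-form normal equations
        x = z.copy()
        for _ in range(10):
            grad = Ht(H(x) - y) + mu * (x - z)
            x -= grad / (1.0 + mu)  # safe step if ||H|| <= 1
        # z-update: Gaussian denoising subproblem with
        # effective noise level sqrt(lam / mu)
        z = denoiser(x, np.sqrt(lam / mu))
    return z
```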
40. IRCNN
HQS: Plug and Play
• Image restoration with a CNN denoiser prior
• Kai Zhang et al., "Learning Deep CNN Denoiser Prior for Image Restoration"
$z_{k+1} = \mathrm{Denoiser}(x_{k+1}, \sqrt{\lambda/\mu})$
41. IRCNN
[Figure: image deblurring performance comparison for the Leaves image (the blur kernel is a Gaussian kernel with standard deviation 1.6; the noise level $\sigma$ is 2).]
42. IRCNN
[Figure: SISR performance comparison on Set5; IRCNN can change the blur kernel and scale factor without retraining (the blur kernel is a 7×7 Gaussian kernel with standard deviation 1.6; the scale factor is ×3).]
43. Summary (until now)
General formulation of IR: $\hat{x} = \arg\min_x\ \tfrac{1}{2}\lVert y - Hx \rVert^2 + \lambda\Phi(x)$
Methods: 1) Model-based optimization 2) Discriminative learning methods
Revisiting the pros-and-cons comparison above: plug-and-play approaches such as IRCNN pair the generality and clear physical meaning of model-based optimization with powerful learned CNN priors, addressing weaknesses on both sides.
45. Problems yet to be solved
• It WORKS, but there are no WHYS: many studies just blindly suggest a new architecture that happens to work.
• Recent architectures are (somewhat) overfitted to the benchmark datasets.
• The bicubic-downsampling task is saturated, and models fail under other downsampling schemes or realistic noise; we need more "realistic" and "pragmatic" models that work in real environments.
• Lack of fair comparisons.
• Lighter (greener) and faster (at inference) models.
• New architectures (more than just shared parameters) and new methods.
46. Jaejun Yoo
Ph.D., Research Scientist @ NAVER Clova AI Research, South Korea
Interested in generative models, signal processing, interpretable AI, and algebraic topology
Techblog: https://jaejunyoo.blogspot.com
Github: https://github.com/jaejun-yoo / LinkedIn: www.linkedin.com/in/jaejunyoo
Research keywords: deep learning, inverse problem, signal processing, generative models
Thank you
Q&A?