Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction

Chung Hyung Jin

CVPR 2022

Come-Closer-Diffuse-Faster
Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction
Hyungjin Chung, Byeongsu Sim, Jong Chul Ye

Diffusion models are slow (Unconditional)
$\boldsymbol{x}_{i+1} = f(\boldsymbol{x}_i, i) + g(\boldsymbol{x}_i, i)\,\boldsymbol{z}_i$

Diffusion models are slow (Conditional)
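$\boldsymbol{x}'_{i+1} = f(\boldsymbol{x}_i, i) + g(\boldsymbol{x}_i, i)\,\boldsymbol{z}_i$
$\boldsymbol{x}_{i+1} = \boldsymbol{A}\boldsymbol{x}'_{i+1} + \boldsymbol{b}$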

Why use the whole process?
Is this part necessary?

Forward diffusion initialization
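$\boldsymbol{x}_{N'} = a_{N'}\boldsymbol{x}_0 + b_{N'}\boldsymbol{z}$
• $t_0 = 0.3$
• $N' = t_0 N = 300$
[Figure labels: $\boldsymbol{x}_0$, $\boldsymbol{x}_{N'}$; annotation: "use this much"]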

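Forward diffusion initialization
$\boldsymbol{x}_{N'} = a_{N'}\boldsymbol{x}_0 + b_{N'}\boldsymbol{z}$
$\tilde{\boldsymbol{x}}_{N'} = a_{N'}\tilde{\boldsymbol{x}}_0 + b_{N'}\boldsymbol{z}$
• $t_0 = 0.3$
• $N' = t_0 N = 300$
[Figure labels: $\boldsymbol{x}_0$, $\boldsymbol{x}_{N'}$, $\tilde{\boldsymbol{x}}_0$, $\tilde{\boldsymbol{x}}_{N'}$; annotation: "use this much"]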

Come-Closer-Diffuse-Faster
[Figure labels: $\boldsymbol{x}_0$, $\tilde{\boldsymbol{x}}_0$, $\boldsymbol{x}_{N'}$]
Shortcut exists?
Possible path?

Come-Closer-Diffuse-Faster
$\varepsilon_0 := \|\boldsymbol{x}_0 - \tilde{\boldsymbol{x}}_0\|^2$

Come-Closer-Diffuse-Faster
$\bar{\varepsilon}_{N'} := \mathbb{E}\|\boldsymbol{x}_{N'} - \tilde{\boldsymbol{x}}_{N'}\|^2$

Come-Closer-Diffuse-Faster
$\bar{\varepsilon}_{0,r} := \mathbb{E}\|\boldsymbol{x}_{0,r} - \tilde{\boldsymbol{x}}_{0,r}\|^2$

Theoretical Findings
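Theorem 1.
$\bar{\varepsilon}_{0,r} \le \frac{2C\tau}{1 - \lambda^2} + \lambda^{2N'}\,\bar{\varepsilon}_{N'}$, where $\tau = \frac{\mathrm{Tr}(\boldsymbol{A}^T\boldsymbol{A})}{n}$
Error decreases exponentially with reverse diffusion!
[Slide annotations: $\lambda$, $C$]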

Theoretical Findings
Theorem 2. (shortcut path)
• For any $0 < \mu \le 1$, there exists a minimum $N'$ s.t. $\bar{\varepsilon}_{0,r} \le \mu\varepsilon_0$
• Optimal $N'$ decreases as $\varepsilon_0$ gets smaller

Theoretical Findings
Feed-forward network correction

Empirical Results: Super-resolution (SR)
20 step diffusion
• ILVR, SR3: $N = 20$, $t_0 = 1.0$
• proposed: $N = 100$, $t_0 = 0.2$

Empirical Results: Super-resolution (SR)

Empirical Results: Inpainting
20 step diffusion
• Score-SDE: $N = 20$, $t_0 = 1.0$
• proposed: $N = 100$, $t_0 = 0.2$

Empirical Results: CS-MRI
20 step diffusion
• Chung & Ye, 2022: $N = 1000$, $t_0 = 1.0$
• proposed: $N = 1000$, $t_0 = 0.02$

Thank you!
Hyungjin Chung (hj.chung@kaist.ac.kr)
Byeongsu Sim (byeongsu.s@kaist.ac.kr)
Jong Chul Ye (jong.ye@kaist.ac.kr)

Editor's Notes

Hi, I'm Hyungjin Chung, and I will be presenting Come-Closer-Diffuse-Faster, a work aimed at accelerating diffusion models for inverse problems.
Diffusion models are slow. They define an iterative chain that starts from pure Gaussian noise and is slowly denoised to form a high-fidelity image. Note that the diffusion process takes a few thousand iterations to perform well. Written as a stochastic difference equation, the process takes the following form, where z represents Gaussian noise.
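As a rough illustration of why the iteration count matters, here is a minimal NumPy sketch of the chain x_{i+1} = f(x_i, i) + g(x_i, i) z_i. The functions `toy_f` and `toy_g` are hypothetical placeholders, not a trained model; in practice a learned score network would define both terms, and each of the thousand steps would cost one network evaluation.

```python
import numpy as np

def run_chain(x_init, f, g, num_steps, rng):
    """Iterate x_{i+1} = f(x_i, i) + g(x_i, i) * z_i for num_steps steps."""
    x = x_init
    for i in range(num_steps):
        z = rng.standard_normal(x.shape)  # fresh Gaussian noise z_i
        x = f(x, i) + g(x, i) * z
    return x

# Hypothetical stand-ins for f and g (not a trained model):
# f shrinks the state slightly, g injects a small constant amount of noise.
def toy_f(x, i):
    return 0.99 * x

def toy_g(x, i):
    return 0.01

rng = np.random.default_rng(0)
x_start = rng.standard_normal((8, 8))  # pure Gaussian noise
x_end = run_chain(x_start, toy_f, toy_g, num_steps=1000, rng=rng)
```

The loop is strictly sequential, so the run time grows linearly with `num_steps`.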
Diffusion models are very flexible: unconditional models can be used directly to solve inverse problems without any fine-tuning. Just by imposing data consistency constraints between the update steps, one can sample from a conditional distribution given a measurement. However, again, the process is slow.
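A minimal sketch of this alternation, assuming a toy inpainting problem: each unconditional update x' is followed by the data consistency step x = A x' + b. The choices A = I - diag(mask) and b = mask * y are a simplified assumption for illustration (practical samplers often re-insert a noised copy of the measurement instead), and the `f` and `g` lambdas are placeholders rather than a learned model.

```python
import numpy as np

def conditional_chain(x_init, f, g, apply_A, b, num_steps, rng):
    """Alternate an unconditional update with x = A x' + b."""
    x = x_init
    for i in range(num_steps):
        z = rng.standard_normal(x.shape)
        x_prime = f(x, i) + g(x, i) * z   # unconditional update
        x = apply_A(x_prime) + b          # data consistency step
    return x

rng = np.random.default_rng(0)
y = rng.standard_normal((8, 8))           # hypothetical measurement
mask = np.zeros((8, 8))
mask[:, :4] = 1.0                         # left half of the image is observed

def apply_A(x):
    return (1.0 - mask) * x               # A = I - diag(mask)

b = mask * y                              # re-insert the observed pixels

x_rec = conditional_chain(
    rng.standard_normal((8, 8)),
    f=lambda x, i: 0.99 * x,              # placeholder for the learned update
    g=lambda x, i: 0.01,                  # placeholder noise scale
    apply_A=apply_A, b=b, num_steps=50, rng=rng,
)
```

After the last step the observed region of `x_rec` matches `y` exactly, while the unobserved region is left to the diffusion update.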
When we inspect the diffusion process, specifically conditional diffusion, it seems odd that almost half of the reverse diffusion process does not contain any relevant information about the reconstruction. So we ask: is this part necessary?
Let us assume that the total number of discretization steps, N, is set to one thousand, and N' is defined as t_0 times N. For solving inverse problems, recall that we have the corrupted image x_0, and due to the linearity of the forward diffusion, one can sample the intermediate state at step N' with a single reparameterization step.
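The single reparameterization step can be sketched as follows. The slides do not specify the coefficients a_{N'} and b_{N'}, so this example assumes a variance-preserving schedule with linearly spaced betas; `vp_coefficients`, `forward_init`, and the default beta range are hypothetical names and values chosen only for illustration.

```python
import numpy as np

def vp_coefficients(n_prime, N, beta_min=1e-4, beta_max=0.02):
    """Return (a, b) at step n_prime for an assumed linear-beta VP schedule."""
    betas = np.linspace(beta_min, beta_max, N)
    alpha_bar = np.cumprod(1.0 - betas)
    a = np.sqrt(alpha_bar[n_prime - 1])
    b = np.sqrt(1.0 - alpha_bar[n_prime - 1])
    return a, b

def forward_init(x0, t0, N, rng):
    """Sample x_{N'} = a_{N'} x_0 + b_{N'} z in one step, with N' = t0 * N."""
    n_prime = int(round(t0 * N))
    if not 1 <= n_prime <= N:
        raise ValueError("t0 * N must round to a step between 1 and N")
    a, b = vp_coefficients(n_prime, N)
    z = rng.standard_normal(x0.shape)
    return a * x0 + b * z, n_prime

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))          # stand-in for the corrupted image
x_n_prime, n_prime = forward_init(x0, t0=0.3, N=1000, rng=rng)
```

With t_0 = 0.3 and N = 1000 this gives N' = 300, so the reverse process would run for 300 steps instead of 1000.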
When we compare this to the forward diffusion of the ground truth image, tilde x_0, the two images look very similar.
Our intuition is that starting the reverse diffusion from x_N' will eventually lead us to tilde x_0, given that our model is properly trained.
To study this property more rigorously, let us first define epsilon_0 as the error between the ground truth and the measurement.
Define bar epsilon_N' as the expected distance between the two images after the forward diffusion.
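As a worked step, this quantity has a closed form under two assumptions that are ours rather than the slides': the error is measured as a squared Euclidean norm, and the two forward diffusions use independent noise draws z and z', both standard Gaussian in n dimensions (n being the number of pixels).

```latex
\begin{aligned}
\bar{\varepsilon}_{N'}
&= \mathbb{E}\,\bigl\| a_{N'}(\boldsymbol{x}_0 - \tilde{\boldsymbol{x}}_0)
   + b_{N'}(\boldsymbol{z} - \boldsymbol{z}') \bigr\|^2 \\
&= a_{N'}^2 \,\|\boldsymbol{x}_0 - \tilde{\boldsymbol{x}}_0\|^2
   + b_{N'}^2 \,\mathbb{E}\|\boldsymbol{z} - \boldsymbol{z}'\|^2 \\
&= a_{N'}^2 \,\varepsilon_0 + 2 n\, b_{N'}^2 .
\end{aligned}
```

The cross term vanishes because z - z' has zero mean, and the last line uses the fact that each of the n coordinates of z - z' has variance 2. If both diffusions shared the same noise draw, the second term would drop out.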
Finally, we define bar epsilon_{0,r}, with r standing for reverse, as the expected distance between the two images after the reverse diffusion.
Then, our first theorem tells us that the error bar epsilon_{0,r} will decrease exponentially with the reverse diffusion, according to the theory of stochastic contraction. Here, the specific values of lambda and C are determined by the type of diffusion process you use, and tau by the ill-posedness of the problem.
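To get a feel for the bound, the sketch below evaluates its right-hand side, 2 C tau / (1 - lambda^2) + lambda^(2N') bar epsilon_N', for a few values of N'. All numbers here (lambda = 0.9, C = 0.01, the initial error of 100, and the 4 x 4 matrix A) are hypothetical values picked for illustration, and lambda is assumed to lie strictly between 0 and 1.

```python
import numpy as np

def tau(A):
    """tau = Tr(A^T A) / n, where A is an n x n matrix."""
    n = A.shape[0]
    return np.trace(A.T @ A) / n

def theorem1_bound(eps_n_prime, n_prime, lam, C, tau_value):
    """Right-hand side of the bound: 2*C*tau / (1 - lam^2) + lam^(2N') * eps_N'."""
    if not 0.0 < lam < 1.0:
        raise ValueError("this sketch assumes 0 < lam < 1")
    floor_term = 2.0 * C * tau_value / (1.0 - lam**2)   # does not shrink with N'
    decay_term = lam ** (2 * n_prime) * eps_n_prime     # shrinks exponentially in N'
    return floor_term + decay_term

# Toy operator: keeps the first two coordinates and zeroes the other two.
A = np.diag([1.0, 1.0, 0.0, 0.0])
tau_value = tau(A)

bounds = [
    theorem1_bound(100.0, n_prime, lam=0.9, C=0.01, tau_value=tau_value)
    for n_prime in (10, 20, 40)
]
```

The first term is a floor that does not depend on N', while the second term decays geometrically, which is the exponential decrease the theorem refers to.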
Visually, we can plot the error as the reverse diffusion progresses, as follows. This will be the case when we use full reverse diffusion.
Our second theorem, which is our main theorem, states that for any mu between 0 and 1, there exists a minimum N' such that the final error is bounded by mu times epsilon_0. Moreover, this optimal N' decreases as epsilon_0, the initial error of the problem, gets smaller.
Visually speaking, the error increases polynomially with the forward diffusion initialization and decreases exponentially with the reverse diffusion, so we can find an optimal value of N', or t_0.
Even better, if we have a feed-forward network that is trained for a specific task, for example, super-resolution, we can directly start from that estimate and use an even smaller number of diffusion steps, as in this figure.
Using our strategy, we show that we can achieve state-of-the-art reconstruction results for various types of imaging problems. First, for super-resolution, we achieve high-quality reconstructions using only 20-step diffusion, whereas other diffusion-based methods such as SR3 or ILVR suffer heavily.
Plotting the FID score, we again observe that CCDF consistently performs better than ILVR under the same budget.
We see similarly high-quality results with 20-step diffusion for image inpainting, whereas the strategy adopted in, for example, Score-SDE fails to remove the noise properly.
Our final application shows that our strategy can also be directly applied to compressed sensing MRI, which is arguably the most practical application, since fast reconstruction is of utmost importance in medical imaging. We show that our 20-step diffusion can even outperform the strategy that uses full reverse diffusion.
Thank you for your attention.
