8. Inverse Problem Regularization
Observations: $y = \Phi x_0 + w \in \mathbb{R}^P$.
Estimator: $x(y)$ depends only on
  - the observations $y$;
  - a parameter $\lambda$.
Example: variational methods
    $x(y) \in \operatorname{argmin}_{x \in \mathbb{R}^N} \tfrac{1}{2} \|y - \Phi x\|^2 + \lambda J(x)$
(data fidelity + regularity).
Choice of $\lambda$: tradeoff between the noise level $\|w\|$ and the regularity $J(x_0)$ of $x_0$.
No noise: $\lambda \to 0^+$, minimize
    $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N, \, \Phi x = y} J(x)$
This course:
  - Performance analysis.
  - Fast computational scheme (see the sketch below).
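To make the variational scheme concrete, here is a minimal numerical sketch (an editorial addition, not part of the original deck) of the case $J = \|\cdot\|_1$, solved with ISTA, the standard proximal-gradient iteration for this problem; the Gaussian $\Phi$, the sparsity level and the value of $\lambda$ are illustrative choices.

```python
import numpy as np

def ista(Phi, y, lam, n_iter=500):
    """Proximal gradient (ISTA) for min_x 0.5 * ||y - Phi x||^2 + lam * ||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2                 # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - Phi.T @ (Phi @ x - y) / L           # gradient step on the data fidelity
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)  # soft threshold = prox of (lam/L)||.||_1
    return x

# Toy usage: an s-sparse x0 observed through a Gaussian Phi with small noise.
rng = np.random.default_rng(0)
P, N, s = 50, 200, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
y = Phi @ x0 + 0.01 * rng.standard_normal(P)
x_hat = ista(Phi, y, lam=0.1)
```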
12. Union of Linear Models for Data Processing
Union of models: $T \in \mathcal{T}$, linear spaces.
Synthesis sparsity: sparse coefficients $x$, image synthesized from $x$.
Structured sparsity: coefficients $x$ sparse over blocks, model $T$.
Analysis sparsity: image $x$, sparse gradient $D^* x$.
Low-rank: image $x$ with few nonzero singular values.
Multi-spectral imaging: $x_{i,\cdot} = \sum_{j=1}^{r} A_{i,j} S_{j,\cdot}$ (each observed channel mixes $r$ source spectra $S_{j,\cdot}$).
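As a quick illustration of these models (an editorial addition, with toy sizes and signals chosen for illustration), the sketch below builds one small instance each of synthesis sparsity, analysis sparsity via finite differences, and the low-rank/multi-spectral mixing model.

```python
import numpy as np

# Synthesis sparsity: the object is described by a few active coefficients.
x = np.zeros(16)
x[[3, 11]] = [2.0, -1.0]                       # 2-sparse coefficient vector

# Analysis sparsity: a piecewise-constant signal has a sparse gradient D* u.
u = np.repeat([2.0, 5.0, 1.0], 5)
print(np.diff(u))                              # nonzero only at the two jumps

# Low-rank / multi-spectral mixing: x_{i,.} = sum_{j=1}^r A_{i,j} S_{j,.}
rng = np.random.default_rng(0)
r = 2
A = rng.standard_normal((20, r))               # mixing weights
S = rng.standard_normal((r, 30))               # source spectra
print(np.linalg.matrix_rank(A @ S))            # rank r = 2
```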
18. Gauges for Union of Linear Models
Gauge: $J : \mathbb{R}^N \to \mathbb{R}^+$ convex, with $\forall \alpha \in \mathbb{R}^+,\ J(\alpha x) = \alpha J(x)$.
Piecewise regular ball $\Leftrightarrow$ union of linear models $(T)_{T \in \mathcal{T}}$.
Examples (unit ball of $J$, model space $T$, and the model $T_0$ of $x_0$ shown in the figures):
  - $J(x) = \|x\|_1$: $T$ = sparse vectors.
  - $J(x) = |x_1| + \|x_{2,3}\|$: $T$ = block sparse vectors.
  - $J(x) = \|x\|_*$: $T$ = low-rank matrices.
  - $J(x) = \|x\|_\infty$: $T$ = antisparse vectors.
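The two defining properties of a gauge are easy to probe numerically. The sketch below (an editorial illustration, with arbitrary test vectors) verifies positive homogeneity and midpoint convexity for three of the gauges above.

```python
import numpy as np

rng = np.random.default_rng(1)
gauges = {
    "l1":      lambda v: np.abs(v).sum(),
    "linf":    lambda v: np.abs(v).max(),
    "nuclear": lambda v: np.linalg.svd(v.reshape(4, 4), compute_uv=False).sum(),
}
for name, J in gauges.items():
    v, z = rng.standard_normal(16), rng.standard_normal(16)
    alpha = rng.uniform(0.0, 5.0)
    assert np.isclose(J(alpha * v), alpha * J(v))           # positive homogeneity
    assert J((v + z) / 2) <= (J(v) + J(z)) / 2 + 1e-12      # midpoint convexity
    print(name, "ok")
```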
26. Examples
$\ell^1$ sparsity: $J(x) = \|x\|_1$, $e_x = \operatorname{sign}(x)$, $T_x = \{ z : \operatorname{supp}(z) \subset \operatorname{supp}(x) \}$.
Structured sparsity: $J(x) = \sum_b \|x_b\|$, $e_x = (N(x_b))_{b \in B}$ with $N(a) = a/\|a\|$, $T_x = \{ z : \operatorname{supp}(z) \subset \operatorname{supp}(x) \}$.
Nuclear norm: $J(x) = \|x\|_*$ with SVD $x = U \Lambda V^*$, $e_x = U V^*$, $T_x = \{ U A + B V^* : (A, B) \in (\mathbb{R}^{n \times n})^2 \}$.
Anti-sparsity: $J(x) = \|x\|_\infty$, $I = \{ i : |x_i| = \|x\|_\infty \}$, $e_x = |I|^{-1} \operatorname{sign}(x)$, $T_x = \{ y : y_I \propto \operatorname{sign}(x_I) \}$.
[Figures: the subdifferential $\partial J(x)$ at a model point $x$ for each gauge.]
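For later use (the pre-certificate constructions below rely on $e_x$ and on projecting onto $T_x$), here is a sketch (an editorial addition) of $e_x$ and $\operatorname{Proj}_{T_x}$ for the $\ell^1$ and nuclear-norm cases; the nuclear-norm projector uses the standard tangent-space formula $\operatorname{Proj}_T(z) = U U^* z + z V V^* - U U^* z V V^*$.

```python
import numpy as np

def l1_model(x):
    """e_x = sign(x); Proj onto T_x = {z : supp(z) in supp(x)} zeroes the off-support."""
    I = x != 0
    return np.sign(x), lambda z: np.where(I, z, 0.0)

def nuclear_model(x, tol=1e-10):
    """e_x = U V^T and Proj onto T_x = {U A + B V^T} from the compact SVD of x."""
    U, s, Vt = np.linalg.svd(x, full_matrices=False)
    r = int((s > tol).sum())
    U, V = U[:, :r], Vt[:r, :].T
    proj = lambda z: U @ U.T @ z + z @ V @ V.T - U @ U.T @ z @ V @ V.T
    return U @ V.T, proj

# Quick check: e_x belongs to the model space T_x.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))   # rank-2 matrix
e, proj = nuclear_model(x)
print(np.allclose(proj(e), e))                                  # True
```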
38. Compressed Sensing Setting
Random matrix: $\Phi \in \mathbb{R}^{P \times N}$, $\Phi_{i,j} \sim \mathcal{N}(0, 1)$ i.i.d.
Sparse vectors: $J = \|\cdot\|_1$. [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]
Theorem: Let $s = \|x_0\|_0$. If
    $P > 2 s \log(N/s)$
then $\exists \, \eta \in \bar{D}(x_0)$ with high probability on $\Phi$.
Low-rank matrices: $J = \|\cdot\|_*$, $x_0 \in \mathbb{R}^{N_1 \times N_2}$. [Chandrasekaran et al. 2011]
Theorem: Let $r = \operatorname{rank}(x_0)$. If
    $P > 3 r (N_1 + N_2 - r)$
then $\exists \, \eta \in \bar{D}(x_0)$ with high probability on $\Phi$.
→ Similar results for $\|\cdot\|_{1,2}$, $\|\cdot\|_\infty$.
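These theorems can be probed empirically. The sketch below (an editorial addition with illustrative sizes; basis pursuit written as a linear program) estimates the probability of exact $\ell^1$ recovery as $P$ varies around the $2 s \log(N/s)$ threshold.

```python
import numpy as np
from scipy.optimize import linprog

def bp_recovers(P, N, s, rng):
    """One trial: does min ||x||_1 s.t. Phi x = Phi x0 recover an s-sparse x0?"""
    Phi = rng.standard_normal((P, N))
    x0 = np.zeros(N)
    x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
    # LP form: x = u - v with u, v >= 0, minimize sum(u + v).
    res = linprog(np.ones(2 * N), A_eq=np.hstack([Phi, -Phi]), b_eq=Phi @ x0,
                  bounds=(0, None), method="highs")
    x = res.x[:N] - res.x[N:]
    return np.linalg.norm(x - x0) < 1e-6 * np.linalg.norm(x0)

rng = np.random.default_rng(0)
N, s = 100, 5
for P in (20, 30, 40, 60):                     # 2 s log(N/s) ~ 30 here
    rate = np.mean([bp_recovers(P, N, s, rng) for _ in range(20)])
    print(f"P = {P}: empirical success rate {rate:.2f}")
```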
39. Phase Transitions
[Figure 2.2 of [Amelunxen et al. 2013], "The geometry of phase transitions in convex optimization": phase transitions for linear inverse problems. Left: $J = \|\cdot\|_1$, the empirical probability that $\ell^1$ minimization identifies a sparse vector $x_0 \in \mathbb{R}^{100}$ from random linear measurements, plotted over $s/N$ vs. $P/N$. Right: $J = \|\cdot\|_*$, the analogous low-rank experiment, plotted over $r/\sqrt{N}$ vs. $P/N$.]
44. Minimal-norm Certificate
$\eta \in D(x_0) \implies \eta = \Phi^* q$ with $\operatorname{Proj}_T(\eta) = e$, where $T = T_{x_0}$, $e = e_{x_0}$.
Minimal-norm pre-certificate:
    $\eta_0 = \underset{\eta = \Phi^* q, \ \eta_T = e}{\operatorname{argmin}} \ \|q\|$
Proposition: One has $\eta_0 = (\Phi_T^+ \Phi)^* e$ where $\Phi_T = \Phi \operatorname{Proj}_T$.
Theorem: If $\eta_0 \in \bar{D}(x_0)$ and $\lambda \sim \|w\|$, the unique solution $x^\star$ of $P_\lambda(y)$ for $y = \Phi x_0 + w$ satisfies
    $T_{x^\star} = T_{x_0}$ and $\|x^\star - x_0\| = O(\|w\|)$. [Vaiter et al. 2013]
Special cases: [Fuchs 2004]: $J = \|\cdot\|_1$; [Vaiter et al. 2011]: $J = \|D^* \cdot\|_1$; [Bach 2008]: $J = \|\cdot\|_{1,2}$ and $J = \|\cdot\|_*$.
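For $J = \|\cdot\|_1$, the proposition specializes to $\eta_0 = \Phi^* \Phi_I^{+,*} \operatorname{sign}(x_{0,I})$ with $I = \operatorname{supp}(x_0)$, which is straightforward to compute. A sketch (editorial addition, illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
P, N, s = 60, 100, 5
Phi = rng.standard_normal((P, N))
I = rng.choice(N, s, replace=False)
x0 = np.zeros(N)
x0[I] = rng.standard_normal(s)

# Minimal-norm q with (Phi^* q)_I = sign(x0_I), then eta_0 = Phi^* q.
q0 = np.linalg.pinv(Phi[:, I]).T @ np.sign(x0[I])
eta0 = Phi.T @ q0
off = np.setdiff1d(np.arange(N), I)
print(np.allclose(eta0[I], np.sign(x0[I])))    # eta_0 agrees with e on T
print(np.abs(eta0[off]).max())                 # < 1 means eta_0 is non-degenerate
```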
48. Compressed Sensing Setting
Random matrix: $\Phi \in \mathbb{R}^{P \times N}$, $\Phi_{i,j} \sim \mathcal{N}(0, 1)$ i.i.d.
Sparse vectors: $J = \|\cdot\|_1$. [Wainwright 2009] [Dossal et al. 2011]
Theorem: Let $s = \|x_0\|_0$. If
    $P > 2 s \log(N)$
then $\eta_0 \in \bar{D}(x_0)$ with high probability on $\Phi$.
Phase transitions: $L^2$ stability, $P \sim 2 s \log(N/s)$, vs. model stability, $P \sim 2 s \log(N)$.
→ Similar results for $\|\cdot\|_{1,2}$, $\|\cdot\|_*$, $\|\cdot\|_\infty$.
→ Not using RIP techniques (non-uniform result on $x_0$).
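The gap between the two scalings can be seen numerically by testing how often $\eta_0$ is non-degenerate. A Monte Carlo sketch (editorial addition, illustrative sizes and trial counts):

```python
import numpy as np

def eta0_nondegenerate(P, N, s, rng):
    """Is the minimal-norm pre-certificate strictly below 1 off the support?"""
    Phi = rng.standard_normal((P, N))
    I = rng.choice(N, s, replace=False)
    eta0 = Phi.T @ (np.linalg.pinv(Phi[:, I]).T @ np.sign(rng.standard_normal(s)))
    off = np.setdiff1d(np.arange(N), I)
    return np.abs(eta0[off]).max() < 1

rng = np.random.default_rng(0)
N, s = 200, 5
for P in (30, 40, 55, 80, 120):                # 2 s log(N/s) ~ 37, 2 s log(N) ~ 53
    rate = np.mean([eta0_nondegenerate(P, N, s, rng) for _ in range(50)])
    print(f"P = {P}: eta_0 non-degenerate in {rate:.0%} of trials")
```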
54. Support Instability and Measures
When $N \to +\infty$, the support is not stable:
    $\|\eta_{0, I^c}\|_\infty \xrightarrow[N \to +\infty]{} c > 1$.
[Figure: $\|\eta_{0, I^c}\|_\infty$ as a function of $1/N$, converging to $c > 1$; below the level $1$ the support is stable, above it unstable.]
Intuition: spikes want to move laterally.
→ Use Radon measures $m \in \mathcal{M}(\mathbb{T})$, $\mathbb{T} = \mathbb{R}/\mathbb{Z}$.
Extension of $\ell^1$: the total variation
    $\|m\|_{TV} = \sup_{\|g\|_\infty \leq 1} \int_{\mathbb{T}} g(x) \, dm(x)$
Discrete measure: $m_{x,a} = \sum_i a_i \delta_{x_i}$. One has $\|m_{x,a}\|_{TV} = \|a\|_1$.
58. Sparse Measure Regularization
Measurements: $y = \Phi(m_0) + w$ where $m_0 \in \mathcal{M}(\mathbb{T})$, $\Phi : \mathcal{M}(\mathbb{T}) \to L^2(\mathbb{T})$, $w \in L^2(\mathbb{T})$.
Acquisition operator:
    $(\Phi m)(x) = \int_{\mathbb{T}} \varphi(x, x') \, dm(x')$ where $\varphi \in C^2(\mathbb{T} \times \mathbb{T})$.
Total-variation-over-measures regularization:
    $\min_{m \in \mathcal{M}(\mathbb{T})} \tfrac{1}{2} \|\Phi(m) - y\|^2 + \lambda \|m\|_{TV}$
→ Infinite-dimensional convex program.
→ If $\dim(\operatorname{Im}(\Phi)) < +\infty$, the dual is finite-dimensional.
→ If $\Phi$ is a filtering, the dual can be re-cast as an SDP program.
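A common way to approach this program numerically is to discretize: sample the measurements on a grid and restrict the measure to a grid of candidate spike locations, which turns $\Phi$ into a matrix and the TV norm into an $\ell^1$ norm (the "on a grid" problem of the next slide). The sketch below (an editorial addition) uses a periodized Gaussian as an illustrative stand-in for a generic $C^2$ kernel $\varphi$; grid sizes and $\lambda$ are arbitrary choices.

```python
import numpy as np

def phi(x, xp, sigma=0.05):
    """Periodized-Gaussian stand-in for a generic C^2 kernel on T = R/Z."""
    d = (x - xp + 0.5) % 1.0 - 0.5             # wrap-around distance on the torus
    return np.exp(-d ** 2 / (2 * sigma ** 2))

K, N = 64, 256                                  # measurement grid / recovery grid
xs = np.arange(K) / K
z = np.arange(N) / N
Phi = phi(xs[:, None], z[None, :])              # Phi[k, j] = phi(x_k, z_j)

# Observe m0 = 1.0 * delta_{0.31} - 0.7 * delta_{0.62}, then solve the grid Lasso by ISTA.
y = 1.0 * phi(xs, 0.31) - 0.7 * phi(xs, 0.62)
lam = 0.01
L = np.linalg.norm(Phi, 2) ** 2
a = np.zeros(N)
for _ in range(2000):
    g = a - Phi.T @ (Phi @ a - y) / L
    a = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0)
print(np.nonzero(np.abs(a) > 1e-3)[0] / N)      # recovered spike locations on the grid
```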
64. Fuchs vs. Vanishing Pre-Certificates
Measures: $\min_{m \in \mathcal{M}} \tfrac{1}{2} \|\Phi m - y\|^2 + \lambda \|m\|_{TV}$
On a grid $z$: $\min_{a \in \mathbb{R}^N} \tfrac{1}{2} \|\Phi_z a - y\|^2 + \lambda \|a\|_1$
[Figure: the pre-certificates $\eta_F$ and $\eta_V$, constrained to $[-1, +1]$, plotted against the grid points $z_i$.]
For $m_0 = m_{z, a_0}$ with $\operatorname{supp}(m_0) = x_0$, $\operatorname{supp}(a_0) = I$:
    $\eta_F = \Phi^* \Phi_I^{*,+} \operatorname{sign}(a_{0,I})$
    $\eta_V = \Phi^* \Gamma_{x_0}^{+,*} (\operatorname{sign}(a_0), 0)$, where $\Gamma_x(a, b) = \sum_i a_i \varphi(\cdot, x_i) + b_i \varphi'(\cdot, x_i)$.
Theorem [Fuchs 2004]: If $\forall j \notin I$, $|\eta_F(x_j)| < 1$, then $\operatorname{supp}(a_\lambda) = \operatorname{supp}(a_0)$.
Theorem [Duval-Peyré 2013]: If $\forall t \notin x_0$, $|\eta_V(t)| < 1$, then $m_\lambda = m_{x_\lambda, a_\lambda}$ with $\|x_\lambda - x_0\|_\infty = O(\|w\|)$.
(Both hold for $\|w\|$ small enough and $\lambda \sim \|w\|$.)
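Both pre-certificates can be computed by least squares once $L^2(\mathbb{T})$ is discretized on a fine grid: $\eta_F$ interpolates $\operatorname{sign}(a_0)$ at the spikes, while $\eta_V$ additionally forces a vanishing derivative there. A sketch (an editorial addition; plain Euclidean discretization of $L^2$, and a periodized Gaussian standing in for a generic $C^2$ kernel):

```python
import numpy as np

S2 = 0.05 ** 2                                  # kernel bandwidth (illustrative)

def phi(x, t):
    d = (x - t + 0.5) % 1.0 - 0.5
    return np.exp(-d ** 2 / (2 * S2))

def dphi(x, t):                                 # derivative of phi in its second argument
    d = (x - t + 0.5) % 1.0 - 0.5
    return (d / S2) * np.exp(-d ** 2 / (2 * S2))

K = 512
xs = np.arange(K) / K                           # discretization of L^2(T)
x0 = np.array([0.3, 0.5])
sgn = np.array([1.0, -1.0])                     # sign(a_0)

# eta_F: minimal-norm interpolation of sgn at the spikes.
F = phi(xs[:, None], x0[None, :])
qF = np.linalg.pinv(F).T @ sgn
etaF = lambda t: phi(xs[:, None], np.atleast_1d(t)[None, :]).T @ qF

# eta_V: also forces eta'(x_i) = 0, i.e. uses Gamma_{x0} = [phi(., x_i), phi'(., x_i)].
G = np.hstack([F, dphi(xs[:, None], x0[None, :])])
qV = np.linalg.pinv(G).T @ np.concatenate([sgn, np.zeros(len(x0))])
etaV = lambda t: phi(xs[:, None], np.atleast_1d(t)[None, :]).T @ qV

t = np.linspace(0.0, 1.0, 1000)
print(np.abs(etaF(t)).max(), np.abs(etaV(t)).max())   # validity: sup |eta| <= 1
```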
67. Numerical Illustration
Ideal low-pass filter: $\varphi(x, x') = \dfrac{\sin((2 f_c + 1) \pi (x - x'))}{\sin(\pi (x - x'))}$, $f_c = 6$.
[Figure: $\eta_F$ and $\eta_V$, constrained to $[-1, +1]$, with a zoom near the spikes.]
Discrete → continuous:
Theorem [Duval-Peyré 2013]: If $\eta_V$ is valid, then $a_\lambda$ is supported on pairs of neighbors around $\operatorname{supp}(m_0)$.
[Figure: solution path $\lambda \mapsto a_\lambda$.]
(Holds for $\lambda \sim \|w\|$ small enough.)
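The pairs-of-neighbors phenomenon can be reproduced with the ideal low-pass filter above: measure two off-grid spikes, solve the grid Lasso, and inspect the recovered support. A sketch (an editorial addition; grid sizes, spike positions and $\lambda$ are illustrative, and the neighbor-pair support is the typical outcome rather than a guarantee):

```python
import numpy as np

def dirichlet(x, xp, fc=6):
    """phi(x, x') = sin((2 fc + 1) pi (x - x')) / sin(pi (x - x'))."""
    d = x - xp
    den = np.sin(np.pi * d)
    with np.errstate(invalid="ignore", divide="ignore"):
        out = np.sin((2 * fc + 1) * np.pi * d) / den
    return np.where(np.abs(den) < 1e-12, 2.0 * fc + 1.0, out)

K, N = 128, 64                                  # measurement grid / recovery grid
xs = np.arange(K) / K
z = np.arange(N) / N
Phi = dirichlet(xs[:, None], z[None, :])

# Two spikes placed OFF the recovery grid, measured through the low-pass filter.
y = 1.0 * dirichlet(xs, 0.2 + 0.3 / N) - 0.8 * dirichlet(xs, 0.6 + 0.4 / N)

lam = 1.0
L = np.linalg.norm(Phi, 2) ** 2
a = np.zeros(N)
for _ in range(5000):                           # ISTA on the grid Lasso
    g = a - Phi.T @ (Phi @ a - y) / L
    a = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0)
print(np.nonzero(np.abs(a) > 1e-3)[0])          # typically pairs of adjacent indices
```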