Incoherent Bases Span Incoherent Subspaces
Lemma 2.1
If Assumption 1.12 holds, A1 holds with µ1 = CµB √(log n) with high prob...
Random Subspaces Span Incoherent Subspaces
Now, we prove that the random orthogonal model obeys the two assumptions A0 and A1 (...
Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
Proof Strategy
Problem 1.5: Convex relaxation of the matrix completion problem
minimize ‖X‖∗
subject to RΩ(X) = RΩ(M)
We will p...
Subgradient of Nuclear Norm
Convex optimization theory says:
X is a solution to (1.5) ⇔ ∃λ ∈ R^{|Ω|} s.t. RΩ∗ λ ∈ ∂‖X‖∗
We know...
Subgradient of Nuclear Norm (cont.)
For a rank-r, n1 × n2 matrix A, there are r non-zero singular values. Then, we know ∂ σ ...
Subgradient of Nuclear Norm (cont.)
∂‖A‖∗ = { U(1) V(1)∗ + U(2) R V(2)∗ : R ∈ R^{(n1−r)×(n2−r)}, ‖R‖2 ≤ 1 }
was express...
Conditions for Unique Minimizer
Lemma 3.1: conditions for a unique minimizer
Consider a matrix X0 = Σ_{k=1}^r σk uk vk∗ of rank r ...
Construction of Subgradient
Define PΩ(X)ij = Xij if (i, j) ∈ Ω, and 0 otherwise. Then set the matrix Y as the solution to the least squar...
Injectivity
We study the injectivity of AΩT. For convenience, we use the Bernoulli model rather than uniform sampling: P(δij = 1) = ...
Injectivity (cont.)
With m large enough so that CR µ0 (nr/m) log n ≤ 1/2,
(p/2) ‖PT(X)‖F ≤ ‖(PT PΩ PT)(X)‖F ≤ (3p/2) ‖PT(X)‖F (4...
Size of Spectral Norm
We will investigate the probability that such a Y satisfies ‖PT⊥(Y)‖2 < 1.
Denote H ≡ PT − p−1 PT PΩ...
Size of Spectral Norm (cont.)
If we set k0 = 3 in Lemma 4.8, we can bound the spectral norm ‖PT⊥(Y)‖2 < 1 with probabi...
Overall Structure of the Paper
[Figure: diagram of the paper's overall proof structure]
Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
Experiment Overview
1. Experiment of Theorem 1.3
2. Experiment of Theorem 1.3 with the assumption that M is a semi-definite matrix ...
Experiment 1
Success: white, Failure: black. (a): n = 40, (b): n = 50
The two experiments give very similar plots for different n...
Experiment 2
For the positive semidefinite case, the recovery region is much larger.
Future work is needed to investig...
Experiment 3
To check the theoretical bound of the matrix sensing problem (Fazel's original problem).
Reproduce Experiments
We've reproduced these experiments using Python, using the CVXOPT package for the SDP.
https://github.com/Joon...
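The authors' reproduction solves the SDP with CVXOPT. As a lightweight illustration only (our own sketch, not the authors' code and not the paper's algorithm), a simple "impute and truncate to rank r" heuristic already recovers easy low-rank instances from partial observations:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 20, 1
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r target
mask = rng.random((n, n)) < 0.6                                 # ~60% of entries observed

X = np.where(mask, M, 0.0)                  # zero-fill the missing entries
for _ in range(500):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X = (U[:, :r] * s[:r]) @ Vt[:r]         # project onto rank-r matrices
    X[mask] = M[mask]                       # re-impose the observed entries

err = np.linalg.norm(X - M) / np.linalg.norm(M)
assert err < 0.05                           # this easy rank-1 instance is recovered closely
```

Unlike the nuclear norm SDP, this heuristic needs the target rank r as input and carries no exact-recovery guarantee; it is shown only to make the completion task concrete.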
Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
Improvements
The main result of this paper:
m = Ω(n^{1.25} r log n)
With the low-rank assumption (r ≤ n^{1/5}):
m = Ω(n^{1.2} r log n)
C...
Further directions
(CASE 1) Noise Handling
The observation is not Mij, but Yij:
Yij = Mij + zij, (i, j) ∈ Ω, where z is a dete...
Further directions
(CASE 2) Low-rank Matrix Fitting
Let M = Σ_{1≤k≤n} σk uk vk∗ where σ1 ≥ σ2 ≥ · · · ≥ 0, and the truncated SVD...
Further directions
(CASE 3) Slow SDP
This direction can be done after (CASE 2).
SDP is too slow, so the algorithm described i...
Slides for the paper "Exact Matrix Completion via Convex Optimization" by Emmanuel J. Candès and Benjamin Recht. We presented these slides in the KAIST CS592 class, April 2018.

- Code: https://github.com/JoonyoungYi/MCCO-numpy
- Abstract of the paper: We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys
m ≥ C n^{1.2} r log n
for some positive numerical constant C, then with very high probability, most n×n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.

Exact Matrix Completion via Convex Optimization Slide (PPT)

  1. Exact Matrix Completion via Convex Optimization. Emmanuel J. Candès and Benjamin Recht. Minsu Kim¹, Joonyoung Yi¹. ¹School of Computing, KAIST. CS592, Advanced Machine Learning: High-Dimensional Data Analysis. April 2018. CS592 Exact Matrix Completion 2018. 4. 24 1 / 51
  2. Table of Contents: 1 Introduction, 2 Results of Paper, 3 Rationality of Incoherence Assumption, 4 Proofs, 5 Experiments, 6 Discussion
  3. Table of Contents: 1 Introduction, 2 Results of Paper, 3 Rationality of Incoherence Assumption, 4 Proofs, 5 Experiments, 6 Discussion
  4. Matrix Completion Problem (Informal). Given: a matrix M with some entries missing. Goal: complete the matrix M.
  5. Matrix Completion Problem (Informal, cont.). In general this is impossible, but it becomes possible under a low-rank assumption.
  6. Low-rank Assumption. For an n1 × n2 matrix M of rank r, assume min(n1, n2) ≫ r. Why is this assumption needed?
  7. Low-rank Assumption (cont.). For simplicity, consider an n × n matrix M of rank r. It has (2n − r)r degrees of freedom, obtained by counting the parameters of its SVD: (number of singular values) + (degrees of freedom of the left singular vectors) + (degrees of freedom of the right singular vectors) = r + (2n − r − 1)r/2 + (2n − r − 1)r/2 = (2n − r)r. This is considerably smaller than n²: the low-rank assumption cuts the degrees of freedom to roughly a 2r/n fraction of n².
  8. Low-rank Assumption (cont.). This assumption is analogous to the sparsity assumption in the Lasso paper we studied.
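The degree-of-freedom count on slide 7 can be sanity-checked with a few lines of Python (a quick arithmetic check of the slide's formula, not code from the paper):

```python
# Verify that the SVD parameter count equals (2n - r) * r and that it is a
# roughly 2r/n fraction of the n^2 entries of the full matrix.
def dof(n, r):
    # r singular values + degrees of freedom of left and right singular vectors
    return r + ((2 * n - r - 1) * r) // 2 + ((2 * n - r - 1) * r) // 2

for n, r in [(100, 3), (1000, 10), (50, 50)]:
    assert dof(n, r) == (2 * n - r) * r

n, r = 1000, 5
print(dof(n, r) / n**2)  # close to 2r/n = 0.01
```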
  9. Which Matrices? Consider the rank-1 matrix M = e1 en∗, the n × n matrix whose only nonzero entry is a 1 in the top-right corner. Let m be the number of observed entries of M.
  10. Which Matrices? (cont.) Then we observe only zeros with probability 1 − m/n².
  11. Which Matrices? (cont.) If the sample set does not contain the 1, we cannot complete the matrix exactly.
  12. Which Matrices? (cont.) So it is impossible to recover all low-rank matrices.
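The failure probability for this adversarial matrix can be computed exactly (a small illustrative check, not from the paper): sampling m of the n² entries uniformly without replacement misses the single nonzero entry with probability 1 − m/n².

```python
from fractions import Fraction
from math import comb

# Probability that m entries sampled uniformly without replacement from an
# n x n matrix all avoid the single nonzero entry of M = e1 en^*.
def miss_probability(n, m):
    # all m samples must fall among the n^2 - 1 zero entries
    return Fraction(comb(n**2 - 1, m), comb(n**2, m))

n, m = 10, 30
assert miss_probability(n, m) == 1 - Fraction(m, n**2)  # exactly 1 - m/n^2
print(float(miss_probability(n, m)))  # 0.7
```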
  13. Which Matrices? Consider the SVD of a matrix M = Σ_{k=1}^r σk uk vk∗, where the uk's and vk's are the left and right singular vectors and the σk's are the singular values. Random orthogonal model: the family {uk}_{1≤k≤r} is selected uniformly at random among all families of r orthonormal vectors, and similarly for the family {vk}_{1≤k≤r}. Under this model we can see, intuitively, that the exceptional cases mentioned above occur with small probability. Formal proofs are given by Lemmas 2.1 and 2.2.
  14. Which Sampling Sets? Clearly, we cannot recover Mij if the sampling set avoids the i-th row or the j-th column. So we have to define which sampling sets allow recovery.
  15. Which Sampling Sets? (cont.) Definition of the sampling set Ω: if Mij is observed, then (i, j) ∈ Ω; the cardinality of Ω is m. Uniform sampling assumption: the set Ω is sampled uniformly at random.
  16. Which Sampling Sets? (cont.) Additionally, the paper introduces a sampling operator RΩ for convenience. Sampling operator RΩ: let RΩ : R^{n1×n2} → R^{|Ω|} be the operator which extracts the observed entries, RΩ(X) = (Xij)_{(i,j)∈Ω}.
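As a sketch (the function name and list-of-pairs representation of Ω are our own, not the paper's), the sampling operator RΩ can be implemented directly:

```python
import numpy as np

def R_omega(X, omega):
    """Sampling operator R_Omega: extract the observed entries (X_ij for (i, j) in Omega)."""
    return np.array([X[i, j] for (i, j) in omega])

M = np.arange(12.0).reshape(3, 4)
omega = [(0, 1), (2, 3), (1, 0)]
print(R_omega(M, omega))  # entries M[0,1], M[2,3], M[1,0], i.e. 1, 11, 4

# The constraint R_Omega(X) = R_Omega(M) only pins down the observed entries:
X = M.copy()
X[0, 0] = 99.0            # change an unobserved entry
assert np.array_equal(R_omega(X, omega), R_omega(M, omega))
```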
  17. Which Algorithm? Problem 1.3: Formal Matrix Completion Problem. minimize rank(X) subject to RΩ(X) = RΩ(M).
  18. Which Algorithm? (cont.) Unfortunately, this problem is NP-hard and non-convex: the rank function is not convex, and all known algorithms require time exponential in n.
  19. Which Algorithm? (cont.) Problem 1.5: Convex Relaxation of the Matrix Completion Problem. minimize ‖X‖∗ subject to RΩ(X) = RΩ(M).
  20. Which Algorithm? (cont.) The nuclear norm ‖X‖∗ is a surrogate for the rank, and it is a convex function. This heuristic was introduced by [Fazel et al. 2002].
  21. Nuclear Norm Relaxation is Reasonable? (Informal) We already know that the l1-norm is a surrogate for the l0-norm from the Lasso paper.
  22. Nuclear Norm Relaxation is Reasonable? (Informal, cont.) Figure: http://public.lanl.gov/mewall/kluwer2002.html. Let σ = (σ1, σ2, ..., σn), where σi is the i-th singular value of the matrix M. The rank function is the l0-norm of σ; the nuclear norm is the l1-norm of σ.
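This correspondence between the rank/nuclear norm of M and the l0/l1 norms of its singular value vector is easy to check numerically (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a rank-2 matrix as a product of thin Gaussian factors
M = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))

sigma = np.linalg.svd(M, compute_uv=False)   # singular values of M
rank = int(np.sum(sigma > 1e-10))            # rank = l0-"norm" of sigma
nuclear = sigma.sum()                        # nuclear norm = l1-norm of sigma

assert rank == np.linalg.matrix_rank(M)
assert np.isclose(nuclear, np.linalg.norm(M, ord="nuc"))
print(rank, round(nuclear, 3))
```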
  23. How to Solve Nuclear Norm Minimization? This method is also suggested by Fazel: the convex relaxation of the matrix completion problem (nuclear norm minimization) can be solved by semidefinite programming (SDP).
  24. How to Solve Nuclear Norm Minimization? (cont.) Let the SVD of the matrix X be X = UΣV∗. Define W1 = UΣU∗, W2 = VΣV∗, and X' = [W1 X; X∗ W2].
  25. How to Solve Nuclear Norm Minimization? (cont.) Under this construction the following properties hold: ‖X‖∗ = ‖W1‖∗ = ‖W2‖∗; X' ⪰ 0 (positive semidefinite), since X' = [U; V] Σ [U; V]∗ = ([U; V] √Σ)([U; V] √Σ)∗ and AA∗ ⪰ 0 for every matrix A; and for every symmetric positive semidefinite matrix X', trace(X') = ‖X'‖∗. Reference: https://chrischoy.github.io/research/matrix-norms/
  26. How to Solve Nuclear Norm Minimization? (cont.) Therefore, Problem 1.5 can be reduced to the following form: minimize trace(X') subject to RΩ(X) = RΩ(M) and X' ⪰ 0, where X' = [W1 X; X∗ W2]. Note that trace(X') = trace(W1) + trace(W2) = ‖X‖∗ + ‖X‖∗ = 2‖X‖∗.
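The two identities behind this reduction, trace(X') = 2‖X‖∗ and X' ⪰ 0, can be verified numerically on a random matrix (a sketch of the construction above, with our own variable names):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T
W1 = U @ np.diag(s) @ U.T              # W1 = U Sigma U*
W2 = V @ np.diag(s) @ V.T              # W2 = V Sigma V*
Xp = np.block([[W1, X], [X.T, W2]])    # X' = [[W1, X], [X*, W2]]

nuclear = s.sum()                      # ||X||_* = sum of singular values
assert np.isclose(np.trace(Xp), 2 * nuclear)   # trace(X') = 2 ||X||_*
assert np.linalg.eigvalsh(Xp).min() > -1e-9    # X' is positive semidefinite
```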
  27. How to Solve Nuclear Norm Minimization? (cont.) The reduced form can now be solved by SDP, and many algorithms for solving SDPs already exist. Semidefinite Programming (SDP): minimize_{X'∈S^n} ⟨C, X'⟩ subject to ⟨Ak, X'⟩ = bk for k = 1, ..., m, and X' ⪰ 0. Here S^n is the set of n × n symmetric matrices, ⟨X, Y⟩ = trace(X∗Y) is the entrywise (trace) inner product, and each Ak is all zeros except for a single 1 at an entry (i, j) ∈ Ω.
  28. Table of Contents: 1 Introduction, 2 Results of Paper, 3 Rationality of Incoherence Assumption, 4 Proofs, 5 Experiments, 6 Discussion
  29. Nuclear Norm Relaxation is Reasonable? (Formal) A first typical result. M: an n1 × n2 matrix of rank r; n = max(n1, n2). Theorem 1.1 (General): M obeys the random orthogonal model, with the uniformly random sampling assumption. Then there exist constants C, c such that if m ≥ C n^{5/4} r log n, the minimizer of Problem 1.5 is unique and equal to M with probability at least 1 − cn^{−3}.
  30. Nuclear Norm Relaxation is Reasonable? (Formal, cont.) For the low-rank case, a tighter bound is given. Theorem 1.1 (Low-rank): if r ≤ n^{1/5}, the recovery is exact with the same probability provided that m ≥ C n^{6/5} r log n.
  31. Nuclear Norm Relaxation is Reasonable? (Formal, cont.) Asymptotically: Theorem 1.1 (General): m = Ω(n^{1.25} r log n); Theorem 1.1 (Low-rank): m = Ω(n^{1.2} r log n). This bound is not tight (plug r ← n into the general case).
  32. Nuclear Norm Relaxation is Reasonable? (Formal, cont.) What this theorem tells us: under some conditions, we can recover the matrix exactly; Problem 1.3 and Problem 1.5 are formally equivalent under those conditions; this justifies the nuclear norm relaxation.
  33. Relaxation of Random Orthogonal Model. Recall why we introduced the random orthogonal model: the counterexample M = e1 en∗, the matrix whose only nonzero entry is a 1 in the top-right corner.
  34. Relaxation of Random Orthogonal Model (cont.) More generally, it is hard to recover M if its singular vectors are similar to the standard basis, because the information is highly concentrated in a small region. Hence the singular vectors need to be sufficiently spread out.
  35. Relaxation of Random Orthogonal Model (cont.) The random orthogonal model is one model that avoids this problem. We want to relax this model assumption. Why?
  36. Relaxation of Random Orthogonal Model (cont.) To get a more general theorem that covers a much larger set of matrices M.
  37. Coherence. Definition 1.2: Let U be a subspace of R^n of dimension r and let PU be the orthogonal projection onto U. Then the coherence of U is defined to be µ(U) ≡ (n/r) max_{1≤i≤n} ‖PU ei‖².
  38. Coherence (cont.) Examples. Incoherence example (minimum case): if U is spanned by vectors whose entries all have magnitude 1/√n, then µ(U) = 1. Coherence example (maximum case): if U contains a standard basis element, then µ(U) = n/r. The more similar U is to the standard basis, the larger µ(U); the farther from it, the smaller µ(U).
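Definition 1.2 and both extreme examples can be checked with a short function (our own sketch; the flat basis is built from Hadamard columns, one concrete choice of vectors with entries of magnitude 1/√n):

```python
import numpy as np

def coherence(U):
    """mu(U) = (n/r) * max_i ||P_U e_i||^2 for a matrix U with orthonormal columns."""
    n, r = U.shape
    # ||P_U e_i||^2 equals the squared norm of the i-th row of U
    return (n / r) * np.max(np.sum(U**2, axis=1))

n, r = 8, 2
# Maximum case: U spanned by standard basis vectors -> mu(U) = n/r
U_std = np.eye(n)[:, :r]
assert np.isclose(coherence(U_std), n / r)

# Minimum case: all entries of magnitude 1/sqrt(n) -> mu(U) = 1
H = np.array([[1.0]])
for _ in range(3):                       # Sylvester construction of an 8x8 Hadamard matrix
    H = np.block([[H, H], [H, -H]])
U_flat = H[:, :r] / np.sqrt(n)
assert np.isclose(coherence(U_flat), 1.0)
```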
  39. Coherence (cont.) From Definition 1.2, we define two assumptions, A0 and A1. Assumption A0: the coherences obey max(µ(U), µ(V)) ≤ µ0 for some positive µ0. Assumption A1: the n1 × n2 matrix Σ_{1≤k≤r} uk vk∗ has maximum entry bounded by µ1 √(r/(n1 n2)) in absolute value, for some positive µ1. A1 always holds with µ1 = µ0 √r.
  40. Coherence (cont.) The paper argues that A0 and A1 are more general than the random orthogonal model; Lemma 2.2 proves that the random orthogonal model is a special case of A0 and A1.
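The claim that A1 always holds with µ1 = µ0√r follows from Cauchy–Schwarz applied to the rows of U and V, and the resulting inequality can be spot-checked on random orthonormal factors (our own sketch, not the paper's proof):

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, r = 30, 20, 4
# Random orthonormal bases via QR of Gaussian matrices
U = np.linalg.qr(rng.standard_normal((n1, r)))[0]
V = np.linalg.qr(rng.standard_normal((n2, r)))[0]

def coherence(B):
    n, r = B.shape
    return (n / r) * np.max(np.sum(B**2, axis=1))

mu0 = max(coherence(U), coherence(V))
E = U @ V.T                       # sum_k u_k v_k^*
max_entry = np.abs(E).max()

# A1 with mu1 = mu0 * sqrt(r): max entry <= mu1 * sqrt(r / (n1 * n2))
assert max_entry <= mu0 * np.sqrt(r) * np.sqrt(r / (n1 * n2)) + 1e-12
```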
  41. Coherence (cont.) Let's investigate a quite natural assumption: the maximum entries of the left and right singular matrices are bounded.
  42. Coherence (cont.) Assumption 1.12: assume that the uj's and vj's obey max_{i,j} |⟨ei, uj⟩|² ≤ µB/n and max_{i,j} |⟨ei, vj⟩|² ≤ µB/n, for some value µB = O(1).
  43. Coherence (cont.) With this assumption, A0 and A1 hold: µ(U), µ(V) ≤ µB, and µ1 = O(√(log n)) (proved by Lemma 2.1).
  44. Main Result. With the incoherence conditions, Theorem 1.1 is relaxed to Theorem 1.3. Theorem 1.3 (General): M obeys A0 and A1, with the uniformly random sampling assumption. Then there exist constants C, c such that if m ≥ C max(µ1², µ0^{1/2} µ1, µ0 n^{1/4}) n r (β log n) for some β > 2, then the minimizer of Problem (1.5) is unique and equal to M with probability at least 1 − cn^{−β}.
  45. Main Result (cont.) Theorem 1.3 (Low-rank): if r ≤ µ0^{−1} n^{1/5}, the recovery is exact with the same probability provided that m ≥ C µ0 n^{6/5} r (β log n).
  46. Main Result (cont.) Asymptotically similar to Theorem 1.1.
  47. Connections to Lasso. Recall the Lasso paper: compressive sampling (or compressed sensing, matrix sensing): Ax = b (A is a design matrix, b is an observation vector). The Lasso paper adds a sparsity assumption on x to tackle the regime where the problem dimension is much larger than the number of observations (n ≪ p). If x is k-sparse in the Fourier domain, it can be perfectly recovered with high probability by l1 minimization when m = Ω(k log n).
  48. Connections to Lasso (cont.) This paper also claims to generalize the notion of incoherence to problems beyond the setting of sparse signal recovery.
49-50. Connections to the Matrix Sensing Problem
Fazel's original problem: solving the matrix sensing problem under a low-rank assumption via the nuclear norm heuristic (plus other assumptions).
Contributions of this paper compared to Fazel's original work:
1. Extends Fazel's work to the matrix completion problem.
2. Defines conditions (including incoherence) under which the matrix can be completed exactly.
3. Provides a theoretical bound for the convex relaxation (nuclear norm minimization).
51. Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
52-53. Which Matrices are Incoherent?
Theorem 1.3 says completing the matrix is possible if the incoherence conditions hold. Now we want to see how reasonable the incoherence assumptions are.
In this section, we show that most random matrices are incoherent:
Lemma 2.1 shows the incoherence assumptions are reasonable.
Lemma 2.2 shows a matrix drawn from the random orthogonal model satisfies the incoherence conditions.
54-55. Incoherent Bases Span Incoherent Subspaces
Recall Assumption 1.12: the uj's and vj's obey
max_{i,j} |⟨ei, uj⟩|^2 ≤ µB/n, max_{i,j} |⟨ei, vj⟩|^2 ≤ µB/n
for some µB = O(1).
If Assumption 1.12 holds:
A0 holds with µ0 = µB (trivial).
A1 holds with µ1 = C µB √(log n) (not trivial).
A1 trivially holds with µ1 = µB √r, but we cannot assume r = O(log n), so we prove that A1 holds with µ1 = C µB √(log n).
56-58. Incoherent Bases Span Incoherent Subspaces
Lemma 2.1: If Assumption 1.12 holds, A1 holds with µ1 = C µB √(log n) with high probability.
(Proof) Consider the matrix M = Σ_{k=1}^r ε_k u_k v_k^*, where {ε_k}_{1≤k≤r} is an arbitrary sign sequence (note ε_k u_k obeys the same incoherence bound as u_k).
A direct application of Hoeffding's inequality gives
P(|Mij| ≥ λ µB √r / n) ≤ 2 e^{−λ^2/2}.
Setting λ = √(2β log n) and applying a union bound over the n^2 entries,
P(‖M‖_∞ ≥ µB √(2βr log n) / n) ≤ 2 n^2 e^{−β log n} = 2 n^{2−β}.
Substituting β ← β + 2 yields Lemma 2.1.
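The Hoeffding bound in this proof can be sanity-checked numerically. A minimal sketch (our own, not code from the paper), assuming random orthonormal bases stand in for incoherent u_k's and v_k's and taking µB = 3 as an illustrative incoherence level:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, beta = 200, 5, 3.0
mu_B = 3.0  # assumed incoherence level for the sampled bases (illustrative)

def max_entry(rng, n, r):
    # Random orthonormal bases are incoherent with high probability.
    U = np.linalg.qr(rng.standard_normal((n, r)))[0]
    V = np.linalg.qr(rng.standard_normal((n, r)))[0]
    eps = rng.choice([-1.0, 1.0], size=r)  # arbitrary sign sequence
    M = (U * eps) @ V.T                    # M = sum_k eps_k u_k v_k^T
    return np.abs(M).max()

# Lemma 2.1's bound on ||M||_inf; nearly all trials should fall below it.
bound = mu_B * np.sqrt(2 * beta * r * np.log(n)) / n
hits = sum(max_entry(rng, n, r) < bound for _ in range(20))
print(hits)
```

In practice the observed max entry sits well below the bound, which is consistent with the union-bound argument being loose but correct.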
59-61. Random Subspaces Span Incoherent Subspaces
Now we show that the random orthogonal model obeys the two assumptions A0 and A1 (with appropriate values of the µ's) with large probability.
Lemma 2.3 is one of the results of [Laurent et al. 2000]; Lemma 2.2 follows from it.
Lemma 2.2: Set r̄ = max(r, log n). Then there exist constants C, c such that the random orthogonal model obeys, with high probability:
1. µ(U) = (n/r) max_i ‖P_U e_i‖^2 ≤ C r̄/r = µ0,
2. (n/√r) ‖Σ_{1≤k≤r} u_k v_k^*‖_∞ ≤ C √(log n) √(r̄/r) = µ1.
Hence µ0 = O(1) and µ1 = O(log n). As a result of Lemma 2.2, random subspaces span incoherent subspaces with high probability.
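The coherence µ(U) in Lemma 2.2 is cheap to estimate empirically. A minimal numpy sketch (our own illustration): sample a uniformly random r-dimensional subspace and compute µ(U) directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 500, 10

# Random orthogonal model: columns of U span a uniformly random
# r-dimensional subspace of R^n (QR of a Gaussian matrix).
U = np.linalg.qr(rng.standard_normal((n, r)))[0]

# ||P_U e_i||^2 is the squared Euclidean norm of the i-th row of U.
mu_U = (n / r) * (U ** 2).sum(axis=1).max()
r_bar = max(r, np.log(n))
print(mu_U, r_bar / r)  # mu_U should be a modest constant times r_bar / r
```

Note µ(U) ≥ 1 always (the max row norm is at least the average r/n), and for random subspaces it stays a small constant, matching Lemma 2.2.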
62. Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
63. Proof Strategy
Problem (1.5): the convex relaxation of the matrix completion problem:
minimize ‖X‖_* subject to RΩ(X) = RΩ(M).
We will prove that the original matrix M is the unique solution to problem (1.5).
What conditions make a matrix X the unique minimizer?
64-65. Subgradient of the Nuclear Norm
Convex optimization theory says:
X is a solution to (1.5) ⇔ ∃ λ ∈ R^{|Ω|} such that RΩ^* λ ∈ ∂‖X‖_*.
The nuclear norm of a matrix is the sum of its singular values, and we can derive its subgradient from this.
Preliminary: subgradient of a matrix norm (from [36]):
∂‖A‖ = conv{ U D V^* : A = U Σ V^*, d ∈ ∂φ(σ) },
where D, Σ are m × n diagonal matrices with d, σ as diagonal entries, and φ is a symmetric gauge function. For the nuclear norm, we put φ(σ) = ‖σ‖_1 = Σ_i |σ_i|.
66-67. Subgradient of the Nuclear Norm (cont.)
A rank-r, n1 × n2 matrix A has r nonzero singular values, so
∂‖σ‖_1 = { x ∈ R^{min(n1,n2)} : x_i = 1 for i = 1, ..., r, |x_i| ≤ 1 otherwise }.
If we partition the singular vectors as
U = [U^{(1)} | U^{(2)}], V = [V^{(1)} | V^{(2)}], where U^{(1)}, V^{(1)} have r columns,
then using the definition of the convex hull (ALL convex combinations of elements) we get
∂‖A‖_* = { U^{(1)} V^{(1)*} + U^{(2)} R V^{(2)*} : R ∈ R^{(n1−r)×(n2−r)}, ‖R‖_2 ≤ 1 }.
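This closed form can be verified numerically: any Y = U^{(1)} V^{(1)*} + U^{(2)} R V^{(2)*} with ‖R‖_2 ≤ 1 must satisfy the subgradient inequality ‖B‖_* ≥ ‖A‖_* + ⟨Y, B − A⟩ for every B. A small numpy check (our own, assuming nothing beyond the formula above):

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, r = 8, 6, 2

# Rank-r matrix A and its full SVD, partitioned as in the slide.
A = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
U, s, Vt = np.linalg.svd(A)
U1, U2 = U[:, :r], U[:, r:]
V1, V2 = Vt[:r].T, Vt[r:].T

# One element of the subdifferential: U1 V1^* + U2 R V2^* with ||R||_2 <= 1.
R = rng.standard_normal((n1 - r, n2 - r))
R /= np.linalg.norm(R, 2)  # normalize spectral norm to exactly 1
Y = U1 @ V1.T + U2 @ R @ V2.T

nuc = lambda X: np.linalg.svd(X, compute_uv=False).sum()

# Subgradient inequality ||B||_* >= ||A||_* + <Y, B - A> for random B.
ok = True
for _ in range(100):
    B = rng.standard_normal((n1, n2))
    ok &= nuc(B) >= nuc(A) + np.sum(Y * (B - A)) - 1e-9
print(ok)
```

The small tolerance only absorbs floating-point error; the inequality itself holds exactly for genuine subgradients.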
68. Subgradient of the Nuclear Norm (cont.)
∂‖A‖_* = { U^{(1)} V^{(1)*} + U^{(2)} R V^{(2)*} : R ∈ R^{(n1−r)×(n2−r)}, ‖R‖_2 ≤ 1 }
is expressed in the paper as
Y = Σ_{1≤k≤r} u_k v_k^* + W, (3.4)
where W obeys two properties:
the column space of W is orthogonal to U ≡ span(u1, ..., ur), and the row space of W is orthogonal to V ≡ span(v1, ..., vr);
the spectral norm of W is less than or equal to 1.
This says we can decompose any subgradient across two orthogonal spaces, T and T⊥. We define PT, PT⊥ as the projections onto these spaces.
69. Conditions for a Unique Minimizer
Lemma 3.1 (conditions for a unique minimizer): Consider a matrix X0 = Σ_{k=1}^r σ_k u_k v_k^* of rank r which is feasible for problem (1.5), and suppose that the following two conditions hold:
1. there exists a dual point λ such that Y = RΩ^* λ obeys PT(Y) = Σ_{k=1}^r u_k v_k^* and ‖PT⊥(Y)‖_2 < 1;
2. the sampling operator RΩ restricted to elements in T is injective.
Then X0 is the unique minimizer.
Making the spectral norm inequality strict and adding condition 2 turns "a minimizer" into "the unique minimizer". Constructing such a Y is now the key task.
70. Construction of the Subgradient
Define PΩ(X)_{ij} = X_{ij} if (i, j) ∈ Ω, and 0 otherwise. Then set Y as the solution to the least squares problem
minimize ‖X‖_F subject to (PT PΩ)(X) = Σ_{k=1}^r u_k v_k^*.
This makes Y vanish on Ω^c and keeps ‖PT⊥(Y)‖_2 small as well.
Statement 4.2 (specification of Y): Denote AΩT(M) = PΩ PT(M). Then if AΩT^* AΩT = PT PΩ PT has full rank when restricted to T, the minimizer is given by
Y = AΩT (AΩT^* AΩT)^{−1}(E), where E = Σ_{k=1}^r u_k v_k^*.
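On a toy instance, this Y can be computed explicitly by dense linear algebra. A minimal sketch (our own construction, not code from the paper, and not scalable): parametrize X by its entries on Ω, impose PT(X) = E by least squares, and take the minimum-norm solution.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 12, 1
U = np.linalg.qr(rng.standard_normal((n, r)))[0]
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
E = U @ V.T                 # E = sum_k u_k v_k^*
PU, PV = U @ U.T, V @ V.T

def PT(Z):                  # projection onto T
    return PU @ Z + Z @ PV - PU @ Z @ PV

# Sample Omega (Bernoulli, p = 0.5).
mask = rng.random((n, n)) < 0.5
idx = np.argwhere(mask)     # row-major order, matching boolean indexing

# Columns: vec(PT(e_i e_j^T)) for each (i, j) in Omega.
A = np.stack(
    [PT(np.outer(np.eye(n)[i], np.eye(n)[j])).ravel() for i, j in idx], axis=1
)

# Minimum-Frobenius-norm X supported on Omega with PT(X) = E
# (lstsq returns the minimum-norm solution of the rank-deficient system).
coef, *_ = np.linalg.lstsq(A, E.ravel(), rcond=None)
Y = np.zeros((n, n))
Y[mask] = coef

resid = np.linalg.norm(PT(Y) - E)          # should vanish: PT(Y) = E
spec = np.linalg.norm(Y - PT(Y), 2)        # ||P_{T_perp}(Y)||_2
print(resid, spec)
```

With half the entries observed and r = 1, the system PT(X) = E is solvable with high probability, and the resulting spectral norm of PT⊥(Y) is typically well below 1, as the proof requires.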
71. Injectivity
We study the injectivity of AΩT. For convenience, we use the Bernoulli model rather than uniform sampling:
P(δij = 1) = p ≡ m/(n1 n2), Ω = {(i, j) : δij = 1}.
Theorem 4.1 (small operator norm): Suppose Ω is sampled according to the Bernoulli model and put n = max(n1, n2). Suppose that the coherences obey max(µ(U), µ(V)) ≤ µ0. Then there is a numerical constant CR such that for all β > 1,
‖p^{−1} PT PΩ PT − PT‖_2 ≤ CR √(µ0 n r (β log n) / m)
with probability at least 1 − 3 n^{−β}, provided that CR √(µ0 n r (β log n) / m) < 1.
72. Injectivity (cont.)
With m large enough that CR √(µ0 (nr/m) β log n) ≤ 1/2,
(p/2) ‖PT(X)‖_F ≤ ‖(PT PΩ PT)(X)‖_F ≤ (3p/2) ‖PT(X)‖_F. (4.11)
Corollary 4.3 (injectivity of RΩ on T): Under the same assumption and with the same probability as in Theorem 4.1, we have
√(p/2) ‖PT(X)‖_F ≤ ‖PΩ PT(X)‖_F ≤ √(3p/2) ‖PT(X)‖_F.
The lower bound shows RΩ restricted to T is injective, providing the second condition for a unique minimizer.
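For small n, the deviation in Theorem 4.1 can be measured exactly by materializing the operator on vectorized matrices. A rough sketch (our own, with illustrative parameters n = 30, r = 1, p = 0.8):

```python
import numpy as np

rng = np.random.default_rng(4)
n, r, p = 30, 1, 0.8
U = np.linalg.qr(rng.standard_normal((n, r)))[0]
V = np.linalg.qr(rng.standard_normal((n, r)))[0]
PU, PV = U @ U.T, V @ V.T
mask = rng.random((n, n)) < p   # Bernoulli sampling model

def PT(Z):
    return PU @ Z + Z @ PV - PU @ Z @ PV

# Matrix of the operator Z -> p^{-1} PT(P_Omega(PT(Z))) - PT(Z)
# acting on vec'd n x n matrices.
cols = []
for k in range(n * n):
    ek = np.zeros(n * n)
    ek[k] = 1.0
    Z = ek.reshape(n, n)
    cols.append((PT(mask * PT(Z)) / p - PT(Z)).ravel())
dev = np.linalg.norm(np.stack(cols, axis=1), 2)
print(dev)  # deviation; below 1 means PT P_Omega PT is invertible on T
```

A deviation below 1 certifies that PT PΩ PT is invertible on T for this draw of Ω, so the Y of Statement 4.2 is well defined.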
73. Size of the Spectral Norm
We investigate the probability that such a Y satisfies ‖PT⊥(Y)‖_2 < 1.
Denote H ≡ PT − p^{−1} PT PΩ PT. Then we can expand PT⊥(Y) as the series
PT⊥(Y) = p^{−1}(PT⊥ PΩ PT)(E + H(E) + H^2(E) + ···), E = Σ_{1≤k≤r} u_k v_k^*.
Lemmas 4.4-4.8 give upper bounds on each term:
‖p^{−1}(PT⊥ PΩ PT) E‖_2, ‖p^{−1}(PT⊥ PΩ PT) H(E)‖_2, ‖p^{−1}(PT⊥ PΩ PT) H^2(E)‖_2, ..., ‖p^{−1}(PT⊥ PΩ PT) Σ_{k≥k0} H^k(E)‖_2.
74. Size of the Spectral Norm (cont.)
Setting k0 = 3 in Lemma 4.8, we can bound ‖PT⊥(Y)‖_2 < 1 with probability at least 1 − c n^{−β} provided
m ≥ C max(µ1^2, µ0^{1/2} µ1, µ0^{4/3} r^{1/3}, µ0 n^{1/4}) nr β log n.
When µ0^{4/3} r^{1/3} attains the maximum, the bound is trivial (greater than n^2); dropping it gives the first result of Theorem 1.3.
Setting k0 = 4 in Lemma 4.8 instead gives
m ≥ C max(µ0^2 r, µ0 n^{1/5}) nr β log n,
which is the second result of Theorem 1.3.
75. Overall Structure of the Paper
(figure: diagram of the paper's overall structure)
76. Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
77-80. Experiment Overview
1. Experiment for Theorem 1.3.
2. Experiment for Theorem 1.3 with the assumption that M is positive semidefinite.
3. Experiment for Fazel's work.
Generating the n × n, rank-r matrix M:
Sample two n × r factors ML and MR with i.i.d. Gaussian entries.
Set M = ML MR^*.
Sample a subset Ω of m entries uniformly at random.
Determining exactness: M is declared recovered if the solution Xopt returned by the SDP satisfies ‖Xopt − M‖_F / ‖M‖_F < 10^{−3}.
The procedure is repeated 50 times for experiments 1 and 2, and 10 times for experiment 3.
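The paper solves the nuclear norm problem as an SDP; as a lightweight, dependency-free stand-in, the same experiment can be sketched with singular value thresholding [Cai, Candès, Shen 2010] in pure numpy (our own substitution, not the paper's solver; parameters τ = 5n, δ = 1.2/p follow the SVT paper's recommendation):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r, p = 30, 2, 0.7

# Random rank-r test matrix, as in the experiment setup above.
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < p

# Singular value thresholding: approximate nuclear norm minimization
# subject to agreement with M on the observed entries.
tau, delta = 5 * n, 1.2 / p
Y = np.zeros((n, n))
for _ in range(1000):
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    X = (U * np.maximum(s - tau, 0)) @ Vt     # prox of tau * nuclear norm
    Y += delta * mask * (M - X)               # dual ascent on Omega

rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
print(rel_err)  # the paper declares success below 1e-3
```

With this oversampled instance (p = 0.7, r = 2), the recovery should comfortably meet the paper's exactness criterion.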
81. Experiment 1
Success: white, failure: black. (a): n = 40, (b): n = 50.
The two values of n yield very similar plots, which suggests the results of this paper may be conservative.
82. Experiment 2
For positive semidefinite matrices, the recovery region is much larger. Future work is needed to investigate why.
83. Experiment 3
Checks the theoretical bound for the matrix sensing problem (Fazel's original problem).
84-85. Reproduced Experiments
We reproduced these experiments in Python, using the CVXOPT package for the SDP: https://github.com/JoonyoungYi/exact-mc.
We plotted the relation between n and m/n, with rank = 2 and one trial per point. The exactness metric is the same as in the paper: ‖Xopt − M‖_F / ‖M‖_F < 10^{−3}.
86. Table of Contents
1 Introduction
2 Results of Paper
3 Rationality of Incoherence Assumption
4 Proofs
5 Experiments
6 Discussion
87-89. Improvements
The main results of this paper:
m = Ω(n^{1.25} r log n), and with the low-rank assumption (r ≤ n^{1/5}), m = Ω(n^{1.2} r log n).
Can we find a tighter bound? The authors argue it would be hard with this approach. The Ω(n^{1.2}) bound is mainly determined by (PT⊥ PΩ PT) H^k(E) for k = 1, 2, 3 in the series (4.13).
Expanding to k = 4 would improve the bound to Ω(n^{7/6}).
Expanding to k = 5 would improve it to Ω(n^{8/7}).
This can be continued up to k of size about log n, but the decoupling constant CD grows with k, which is why stronger results are hard to obtain this way.
However, [Candès et al. 2010] showed m = Ω(n r log n) under additional assumptions.
90. Further Directions (Case 1): Noise Handling
The observation is not Mij but Yij:
Yij = Mij + zij, (i, j) ∈ Ω,
where z is a deterministic or stochastic perturbation.
91. Further Directions (Case 2): Low-rank Matrix Fitting
Let M = Σ_{1≤k≤n} σ_k u_k v_k^* with σ1 ≥ σ2 ≥ ··· ≥ 0, and let the truncated SVD of M be Mr = Σ_{1≤k≤r} σ_k u_k v_k^*, where the sum extends over the r largest singular values. Here r is the number of singular values that cannot be neglected; this is similar to the process of PCA.
Low-rank matrix fitting (low-rank matrix approximation): for a general-rank M,
minimize ‖X‖_* subject to ‖X − M‖_* ≤ ‖Mr − M‖_*.
It may be possible to derive a theoretical bound for the low-rank matrix fitting problem.
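The truncated SVD Mr above is the best rank-r approximation in Frobenius norm (Eckart-Young, a standard fact not proved in this deck), and the approximation error is exactly the norm of the discarded singular values. A minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
n, r = 20, 3
M = rng.standard_normal((n, n))   # general (full-rank) matrix

U, s, Vt = np.linalg.svd(M)
Mr = (U[:, :r] * s[:r]) @ Vt[:r]  # truncated SVD: keep r largest sigma_k

# ||M - Mr||_F equals the l2 norm of the discarded tail of singular values.
err = np.linalg.norm(M - Mr)
tail = np.sqrt((s[r:] ** 2).sum())
print(err, tail)
```

This identity is what makes "r = number of non-negligible singular values" a well-defined notion for the fitting problem above.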
92-93. Further Directions (Case 3): Slow SDP
This direction can be pursued after Case 2. The SDP is too slow, so the algorithm described in this paper is rarely used in practice.
In practice, alternating minimization over a low-rank factorization is widely used instead, but it is much harder to prove theoretical bounds for it. However, [Jain et al. 2012] showed m = Ω(n r^{2.5} log n). This bound is slightly higher than in the convex optimization case, but [Jain et al. 2012] argued that it is not tight.
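The practical method mentioned here, alternating minimization, can be sketched in a few lines (our own illustration with hypothetical parameters, not the algorithm of [Jain et al. 2012]): fix V and solve each row of U by least squares over the observed entries, then swap.

```python
import numpy as np

rng = np.random.default_rng(7)
n, r, p = 30, 2, 0.6
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < p

# Alternating least squares on the factorization X = U V^T.
U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
for _ in range(100):
    for i in range(n):                 # update rows of U, V fixed
        cols = mask[i]
        U[i] = np.linalg.lstsq(V[cols], M[i, cols], rcond=None)[0]
    for j in range(n):                 # update rows of V, U fixed
        rows = mask[:, j]
        V[j] = np.linalg.lstsq(U[rows], M[rows, j], rcond=None)[0]

rel_err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
print(rel_err)
```

Each subproblem is a tiny least squares solve, which is why this scales far better than the SDP, at the cost of a nonconvex objective whose analysis is the hard part.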
Slide deck for the paper "Exact Matrix Completion via Convex Optimization" by Emmanuel J. Candès and Benjamin Recht. We presented these slides in the KAIST CS592 class, April 2018. - Code: https://github.com/JoonyoungYi/MCCO-numpy - Abstract of the paper: We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m ≥ C n^{1.2} r log n for some positive numerical constant C, then with very high probability, most n × n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.
