Coordinate descent method
2013.11.21
Sanghyuk Chun
Many contents are from
Large Scale Optimization Lecture 5 by Caramanis & Sanghavi at the University of Texas at Austin
Optimization Lecture 25 by Geoff Gordon and Ryan Tibshirani at CMU
Convex Optimization Lecture 20 by Suvrit Sra at UC Berkeley
Contents 
•Overview 
•Convergence Analysis 
•Examples 
Overview of Coordinate descent method
•Idea
•Recall: unconstrained minimization problem
•From Lecture 1, an unconstrained optimization problem is formulated as follows
•$\min_x f(x)$
•where $f: \mathbb{R}^n \to \mathbb{R}$ is convex and smooth
•In this problem, the necessary and sufficient condition for $x_0$ to be an optimal solution is
•$\nabla f(x) = 0$ at $x = x_0$
•$\nabla f(x) = \frac{\partial f}{\partial x_1} e_1 + \cdots + \frac{\partial f}{\partial x_n} e_n = 0$
•Thus, in this situation, $\frac{\partial f}{\partial x_1} = \cdots = \frac{\partial f}{\partial x_n} = 0$
•What if we minimize along each basis direction separately?
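Overview of Coordinate descent method
•Description
•Let $e_1, e_2, \dots, e_n$ be the coordinate basis of the domain of $f$
•If $x^{(k)}$ is given, the $i$-th coordinate of $x^{(k+1)}$ is given by
•$x_i^{(k+1)} \leftarrow \arg\min_{y \in \mathbb{R}} f(x_1^{(k+1)}, \dots, x_{i-1}^{(k+1)}, y, x_{i+1}^{(k)}, \dots, x_n^{(k)})$
•$x_i^{(k+1)}$ overwrites the value in $x_i^{(k)}$ (in actual implementation)
•Algorithm
•Initialize with guess $x = (x_1, x_2, \dots, x_n)^T$
•repeat
    for all $j$ in $1, 2, \dots, n$ do
        $x_j \leftarrow \arg\min_{x_j} f(x)$
    end for
 until convergence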
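The algorithm above can be sketched in a few lines of Python. This is a minimal illustration rather than the slides' own code: the helper `golden_section`, the search `radius`, the tolerances, and the test function `f` are all assumptions chosen for the example, and the one-dimensional solver only works when each coordinate slice is unimodal (true for convex `f`).

```python
import math

def golden_section(phi, lo, hi, tol=1e-9):
    """Minimize a unimodal 1-D function phi on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            # The minimizer lies in [a, d]; shrink from the right.
            b, d = d, c
            c = b - ratio * (b - a)
        else:
            # The minimizer lies in [c, b]; shrink from the left.
            a, c = c, d
            d = a + ratio * (b - a)
    return (a + b) / 2

def coordinate_descent(f, x0, radius=10.0, tol=1e-6, max_sweeps=1000):
    """Cyclic coordinate descent: minimize f over one coordinate at a time."""
    x = list(x0)
    for _ in range(max_sweeps):
        largest_move = 0.0
        for j in range(len(x)):
            def phi(y, j=j):
                # f restricted to coordinate j, all other coordinates held fixed.
                return f(x[:j] + [y] + x[j + 1:])
            new_xj = golden_section(phi, x[j] - radius, x[j] + radius)
            largest_move = max(largest_move, abs(new_xj - x[j]))
            x[j] = new_xj  # overwrite in place, as noted on the slide
        if largest_move < tol:
            break  # a full sweep changed nothing: treat as converged
    return x

# Hypothetical smooth convex test function; its minimizer is (8/3, -10/3).
def f(x):
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2 + x[0] * x[1]

x_star = coordinate_descent(f, [0.0, 0.0])
```

Note that no step size appears anywhere: the only tuning knobs are those of the one-dimensional solver.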
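Overview of Coordinate descent method
•Start with some initial guess $x^{(0)}$, and repeat for $k = 1, 2, 3, \dots$
•$x_1^{(k)} \in \arg\min_{x_1} f(x_1, x_2^{(k-1)}, x_3^{(k-1)}, \dots, x_n^{(k-1)})$
•$x_2^{(k)} \in \arg\min_{x_2} f(x_1^{(k)}, x_2, x_3^{(k-1)}, \dots, x_n^{(k-1)})$
•$x_3^{(k)} \in \arg\min_{x_3} f(x_1^{(k)}, x_2^{(k)}, x_3, \dots, x_n^{(k-1)})$
…
•$x_n^{(k)} \in \arg\min_{x_n} f(x_1^{(k)}, x_2^{(k)}, x_3^{(k)}, \dots, x_n)$
•In every iteration, it moves along each coordinate basis direction in turn
•c.f. Gradient Descent Method
•In every iteration (step), it moves along the negative gradient direction $-\nabla f = -\left(\frac{\partial f}{\partial x_1} e_1 + \cdots + \frac{\partial f}{\partial x_n} e_n\right)$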
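To make the contrast with gradient descent concrete, here is a small sketch on a quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x$, where the coordinate-wise minimizer has a closed form. The matrix `A`, the vector `b`, the step size 0.3, and the function names are illustrative assumptions, not taken from the slides.

```python
def cd_sweep(A, b, x):
    """One cycle of exact coordinate minimization (moves along e_1, ..., e_n in turn)."""
    x = list(x)
    n = len(x)
    for i in range(n):
        # Setting df/dx_i = 0 with the other coordinates fixed gives a closed form.
        rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - rest) / A[i][i]
    return x

def gd_step(A, b, x, step):
    """One gradient descent step (moves along -grad f = b - A x)."""
    n = len(x)
    grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
    return [x[i] - step * grad[i] for i in range(n)]

# Symmetric positive definite example; the minimizer solves A x = b, i.e. x = (1/3, 1/3).
A = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]
x_cd = [0.0, 0.0]
x_gd = [0.0, 0.0]
for _ in range(50):
    x_cd = cd_sweep(A, b, x_cd)
    x_gd = gd_step(A, b, x_gd, step=0.3)
```

The coordinate descent sweep needs no step size, while the gradient step only converges when `step` is small enough for the given `A`.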
Properties of Coordinate Descent 
•Note: 
•Order of cycle through coordinates is arbitrary, can use any permutation of {1,2,…,n} 
•Cyclic order: 1,2,…,n,1,…, repeat 
•Almost cyclic: each coordinate $1 \le i \le n$ is picked at least once every $B$ successive iterations ($B \ge n$)
•Double sweep: 1,2,…,n,n-1,…,2,1, repeat 
•Cyclic with permutation: random order each cycle 
•Random sampling: pick random index at each iteration 
•Individual coordinates can be replaced everywhere with blocks of coordinates (Block Coordinate Descent Method)
•The “one-at-a-time” update scheme is critical; an “all-at-once” scheme does not necessarily converge
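The last point can be checked numerically. The sketch below minimizes $\tfrac{1}{2} x^T A x - b^T x$ for a symmetric positive definite $A$ in two ways: a cyclic sweep that reuses freshly updated coordinates, and an all-at-once variant that minimizes every coordinate against the same old point. The specific matrix (off-diagonal entries 0.8) is an assumption picked because it makes the all-at-once scheme diverge; it is not from the slides.

```python
def one_at_a_time(A, b, x):
    """Cyclic sweep: each coordinate update sees the coordinates updated before it."""
    x = list(x)
    n = len(x)
    for i in range(n):
        rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - rest) / A[i][i]
    return x

def all_at_once(A, b, x):
    """Every coordinate is minimized against the same old point, then all are replaced."""
    n = len(x)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]

def error(x, x_ref):
    """Largest coordinate-wise distance to the reference solution."""
    return max(abs(u - v) for u, v in zip(x, x_ref))

a = 0.8
A = [[1.0, a, a], [a, 1.0, a], [a, a, 1.0]]  # eigenvalues 1 + 2a and 1 - a, so A is positive definite
b = [1.0, 1.0, 1.0]
x_ref = [1.0 / (1.0 + 2.0 * a)] * 3  # exact solution of A x = b

x_seq = [0.0, 0.0, 0.0]
x_par = [0.0, 0.0, 0.0]
for _ in range(100):
    x_seq = one_at_a_time(A, b, x_seq)
    x_par = all_at_once(A, b, x_par)

err_seq = error(x_seq, x_ref)  # shrinks toward zero
err_par = error(x_par, x_ref)  # grows without bound
```

On a quadratic, the one-at-a-time sweep coincides with the Gauss-Seidel iteration and the all-at-once variant with the Jacobi iteration, which is why the first converges for any positive definite `A` while the second may not.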
Properties of Coordinate Descent 
•Advantages 
•Parallel algorithm is possible 
•No step size tuning 
•Each iteration usually cheap (single variable optimization) 
•No extra storage vectors needed 
•No other pesky parameters (usually) that must be tuned 
•Works well for large-scale problems 
•Very useful in cases where the actual gradient of $f$ is not known
•Easy to implement 
•Disadvantages 
•Tricky if single variable optimization is hard 
•Convergence theory can be complicated 
•Can be slower near optimum than more sophisticated methods 
•The non-smooth case is trickier
Convergence of Coordinate descent 
•Recall: $x_i^{(k+1)} \leftarrow \arg\min_{y \in \mathbb{R}} f(x_1^{(k+1)}, \dots, x_{i-1}^{(k+1)}, y, x_{i+1}^{(k)}, \dots, x_n^{(k)})$
•Thus, one begins with an initial guess $x^{(0)}$ for a local minimum of $F$ (the objective, written $f$ above), and obtains a sequence $x^{(0)}, x^{(1)}, x^{(2)}, \dots$ iteratively
•By doing line search in each iteration, we automatically have
•$F(x^{(0)}) \ge F(x^{(1)}) \ge F(x^{(2)}) \ge \cdots$
•It can be shown that this sequence has convergence properties similar to those of steepest descent
•No improvement after one cycle of line search along the coordinate directions implies that a stationary point has been reached
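The monotone decrease can be spelled out in one chain of inequalities. The intermediate points $z_i^{(k)}$ below are notation introduced here for the partially updated iterate within cycle $k$; each step uses only that the old value $x_i^{(k)}$ is one of the candidates in the one-dimensional minimization.

```latex
\begin{aligned}
z_0^{(k)} &= x^{(k)}, \qquad z_n^{(k)} = x^{(k+1)}, \\
z_i^{(k)} &= \bigl(x_1^{(k+1)}, \dots, x_i^{(k+1)}, x_{i+1}^{(k)}, \dots, x_n^{(k)}\bigr), \\
f\bigl(z_i^{(k)}\bigr) &= \min_{y \in \mathbb{R}} f\bigl(x_1^{(k+1)}, \dots, x_{i-1}^{(k+1)}, y, x_{i+1}^{(k)}, \dots, x_n^{(k)}\bigr) \le f\bigl(z_{i-1}^{(k)}\bigr), \\
f\bigl(x^{(k+1)}\bigr) &= f\bigl(z_n^{(k)}\bigr) \le \cdots \le f\bigl(z_1^{(k)}\bigr) \le f\bigl(z_0^{(k)}\bigr) = f\bigl(x^{(k)}\bigr).
\end{aligned}
```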
Convergence Analysis 
•For continuously differentiable cost functions, the method can be shown to generate sequences whose limit points are stationary
•Lemma 5.4
•Proof
•In the Caramanis lecture notes
•Idea: show that $\lim_{j \to \infty} \bigl(x_1^{(k_j+1)} - x_1^{(k_j)}\bigr) = 0$ using $\lim_{j \to \infty} \bigl(z_1^{(k_j)} - x^{(k_j)}\bigr) = 0$, where $z_i^{(k)} = (x_1^{(k+1)}, \dots, x_i^{(k+1)}, x_{i+1}^{(k)}, \dots, x_n^{(k)})$
Convergence Analysis 
•Question 
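•Given convex, differentiable $f: \mathbb{R}^n \to \mathbb{R}$, if we are at a point $x$ such that $f(x)$ is minimized along each coordinate axis, have we found a global minimizer?
•i.e., does $f(x + d \cdot e_i) \ge f(x)$ for all $d, i$ imply $f(x) = \min_z f(z)$?
•Here, $e_i = (0, \dots, 1, \dots, 0) \in \mathbb{R}^n$, the $i$-th standard basis vector
•Answer
•Yes
•Proof
•Minimality along each axis gives $\frac{\partial f}{\partial x_i}(x) = 0$ for every $i$, so $\nabla f(x) = \frac{\partial f}{\partial x_1} e_1 + \cdots + \frac{\partial f}{\partial x_n} e_n = 0$
•For convex $f$, $\nabla f(x) = 0$ is sufficient for $x$ to be a global minimizer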
Convergence Analysis 
•Question 
•Same question, but what if $f$ is non-differentiable?
•Answer 
•No 
•Proof: Counterexample 
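The counterexample on the original slide was a figure that does not survive in this transcript. As a stand-in, the sketch below uses a function of our own choosing, $f(x_1, x_2) = |x_1 + x_2| + 3|x_1 - x_2|$, which is convex but not differentiable along the line $x_1 = x_2$. At the point $(1, 1)$ no move along either axis lowers $f$, yet the origin has a smaller value.

```python
def f(x1, x2):
    """Convex but non-differentiable: a sum of absolute values of linear functions."""
    return abs(x1 + x2) + 3.0 * abs(x1 - x2)

point = (1.0, 1.0)
steps = [k / 100.0 for k in range(-500, 501)]  # trial displacements d in [-5, 5]

# f cannot be decreased by moving along either coordinate axis from `point` ...
stuck_along_x1 = all(f(point[0] + d, point[1]) >= f(*point) for d in steps)
stuck_along_x2 = all(f(point[0], point[1] + d) >= f(*point) for d in steps)

# ... yet the origin has a strictly smaller value, so `point` is not a global minimizer.
value_at_point = f(*point)
value_at_origin = f(0.0, 0.0)
```

Coordinate descent started at $(1, 1)$ would therefore stop immediately, even though the only descent directions are diagonal ones that no single-coordinate move can follow.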
Convergence Analysis 
•Question 
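•Same again, but now $f(x) = g(x) + \sum_{i=1}^n h_i(x_i)$
•where $g$ is convex and differentiable and each $h_i$ is convex?
•Here, the non-smooth part is called separable
•Answer
•Yes
•Proof: for any $y$
•$f(y) - f(x) \ge \nabla g(x)^T (y - x) + \sum_{i=1}^n [h_i(y_i) - h_i(x_i)] = \sum_{i=1}^n \underbrace{[\nabla_i g(x)(y_i - x_i) + h_i(y_i) - h_i(x_i)]}_{\ge 0} \ge 0$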
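A familiar instance of this separable structure is the lasso, with $g(x) = \tfrac{1}{2}\|Ax - b\|^2$ and $h_i(x_i) = \lambda |x_i|$. The slides do not name this example; it is used here as an assumption to show that each coordinate update still has a closed form (soft-thresholding). The function names and the tiny data set are hypothetical.

```python
def soft_threshold(z, t):
    """Minimizer over u of 0.5 * (u - z)**2 + t * |u|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(A, b, lam, sweeps=200):
    """Cyclic coordinate descent for 0.5 * ||A x - b||^2 + lam * ||x||_1.

    A is a list of rows; every column is assumed to be nonzero.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = list(b)  # b - A x, kept up to date as x changes
    col_sq = [sum(A[r][i] ** 2 for r in range(m)) for i in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            # Correlation of column i with the residual, excluding x_i's own contribution.
            rho = sum(A[r][i] * (residual[r] + A[r][i] * x[i]) for r in range(m))
            new_xi = soft_threshold(rho, lam) / col_sq[i]
            delta = new_xi - x[i]
            for r in range(m):
                residual[r] -= A[r][i] * delta
            x[i] = new_xi
    return x

# With A equal to the identity, the solution is the soft-thresholded b.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [3.0, 0.5]
x_hat = lasso_cd(A, b, lam=1.0)
```

For this identity-matrix case `x_hat` is `[2.0, 0.0]`: the first coordinate is shrunk by `lam` and the second is set exactly to zero.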
Example 
•Example Matlab code
•Reuse source code from http://www.mathworks.com/matlabcentral/fileexchange/35535-simplified-gradient-descent-optimization
END OF DOCUMENT
