Metric Learning for Clustering
SCC5945 - Análise Semi-Supervisionada e Não-Supervisionada
de Padrões em Dados
(Seminar)
Sidgley Camargo de Andrade
PhD student in computer science
Institute of Computer Science and Mathematics
University of São Paulo
June 2016
1 / 12
Agenda
Constraint-based algorithms
Motivation
Metrics
Metric learning for clustering
MPCK-means algorithm
References
2 / 12
Constraint-based algorithms
How can we help unsupervised algorithms find better solutions?
Constraint-based methods, e.g. background knowledge supplied
through pairwise constraints (Wagstaff et al., 2001):
$Con_{=} \subseteq D \times D$: must-link constraints
$Con_{\neq} \subseteq D \times D$: cannot-link constraints
Active and self-learning
Other . . .
Are there “problems” with the algorithms above?
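The pairwise constraints on this slide can be pictured as lists of index pairs. The following is a minimal sketch, assuming points are identified by integer indices and a clustering is a list of labels; the function name `violated_constraints` and the toy data are hypothetical and are not taken from Wagstaff et al. (2001).

```python
def violated_constraints(labels, must_link, cannot_link):
    """Return the constraints that a cluster assignment breaks.

    labels:      list where labels[i] is the cluster of point i
    must_link:   iterable of (i, j) pairs that should share a cluster
    cannot_link: iterable of (i, j) pairs that should be separated
    """
    broken_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    broken_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return broken_ml, broken_cl


# Five points assigned to two clusters (hypothetical toy data).
labels = [0, 0, 1, 1, 0]
must_link = [(0, 1), (1, 2)]      # points 1 and 2 are split: violated
cannot_link = [(0, 3), (0, 4)]    # points 0 and 4 share a cluster: violated

broken_ml, broken_cl = violated_constraints(labels, must_link, cannot_link)
print(broken_ml)
print(broken_cl)
```

A constraint-based algorithm would typically either forbid assignments that produce a non-empty result here or charge a cost for each broken pair.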
3 / 12
Motivation
Figure: (Basu et al., 2008). Legend [–] must-link [- -] cannot-link
4 / 12
Metrics
Metrics describe the relationships between data points (e.g.
Euclidean distance, Mahalanobis distance, etc.)
What is the right metric?
There are few systematic mechanisms for tuning distance
metrics, and tuning is often done by hand (Xing et al., 2003).
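To make the choice of metric concrete, here is a small sketch that measures the same pair of points under two quadratic-form distances. The helper name `quadratic_form_distance` and the matrices are illustrative assumptions; the Mahalanobis distance is the special case where the matrix is the inverse covariance of the data, which this sketch does not estimate.

```python
import math


def quadratic_form_distance(x, y, A):
    """Distance sqrt((x - y)^T A (x - y)) for a square matrix A (list of rows)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    # Matrix-vector product A @ diff, written out with plain lists.
    A_diff = [sum(a * d for a, d in zip(row, diff)) for row in A]
    return math.sqrt(sum(d * ad for d, ad in zip(diff, A_diff)))


x, y = [0.0, 0.0], [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]    # recovers the Euclidean distance
rescaled = [[1.0, 0.0], [0.0, 0.25]]   # hypothetical: second feature counts less

d_euclid = quadratic_form_distance(x, y, identity)
d_scaled = quadratic_form_distance(x, y, rescaled)
print(d_euclid, d_scaled)
```

The two results differ only because of the matrix, which is the quantity a hand-tuned metric fixes in advance.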
5 / 12
Metric learning for clustering
Assumption: keeping dissimilar points far from each other and
similar points close to each other reduces the risk of errors.
Xing et al. (2003):
Suppose a user indicates that certain points in an input space (say,
$\mathbb{R}^n$) are considered by them to be “similar” (or “dissimilar”). Can we
automatically learn a distance metric over $\mathbb{R}^n$ that respects these
relationships, i.e., one that assigns small distances between the
similar pairs and greater distances otherwise?
Learn a metric $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ over the input space.
6 / 12
Problem
A simple approach is to require that similar (must-link) pairs have a
small distance between them, whereas dissimilar (cannot-link) pairs
have a greater distance between them:
$$d(x, y) = d_A(x, y) = \|x - y\|_A = \sqrt{(x - y)^T A (x - y)}$$
$$\min_A \sum_{(x_i, x_j) \in S} \|x_i - x_j\|_A^2$$
$$\text{s.t.} \quad \sum_{(x_i, x_j) \in D} \|x_i - x_j\|_A^2 \geq c, \qquad A \succeq 0$$
where $A \succeq 0$ constrains the symmetric matrix $A$ to be
positive semi-definite (so $d_A$ is a “pseudo-metric”) and $c$ is any
positive constant $\geq 1$.
1. Question for class: Why is the constant c positive?
2. Question for class: How can this be transformed into a max problem?
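The optimization problem on this slide can be explored numerically by evaluating its objective and constraint for a candidate matrix. This is only a sketch under stated assumptions: the names `sq_dist` and `evaluate`, the toy points, and the hand-picked matrices are hypothetical, and the code does not solve the problem or check positive semi-definiteness (both matrices below are diagonal with non-negative entries, so they satisfy it).

```python
def sq_dist(x, y, A):
    """Squared distance (x - y)^T A (x - y); A is a list of rows."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    A_diff = [sum(a * d for a, d in zip(row, diff)) for row in A]
    return sum(d * ad for d, ad in zip(diff, A_diff))


def evaluate(A, points, similar, dissimilar, c=1.0):
    """Return (objective, constraint_sum, feasible) for a candidate A."""
    objective = sum(sq_dist(points[i], points[j], A) for i, j in similar)
    constraint_sum = sum(sq_dist(points[i], points[j], A) for i, j in dissimilar)
    return objective, constraint_sum, constraint_sum >= c


# Hypothetical toy data: feature 0 separates the groups, feature 1 is noise.
points = [[0.0, 0.0], [0.0, 2.0], [1.0, 0.0], [1.0, 2.0]]
similar = [(0, 1), (2, 3)]       # S: pairs that differ only in feature 1
dissimilar = [(0, 2), (1, 3)]    # D: pairs that differ only in feature 0

identity = [[1.0, 0.0], [0.0, 1.0]]
ignore_noise = [[1.0, 0.0], [0.0, 0.0]]   # hand-picked matrix that drops feature 1

print(evaluate(identity, points, similar, dissimilar))
print(evaluate(ignore_noise, points, similar, dissimilar))
```

Both matrices satisfy the constraint with c = 1, but the second one brings the similar pairs to distance zero, which is the kind of solution the minimization favours.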
7 / 12
Example – Xing et al. (2003)
8 / 12
Metric Pairwise Constraint K-means
(MPCK-means)
Assumes a separate matrix $A_h$ (metric) for each cluster $h$
Permits the specification of an individual weight for each constraint
($f_M$ and $f_C$); the penalty for a constraint violation is proportional to
the violated constraint's weight
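The weighted-penalty idea can be sketched as follows. Everything specific here is an assumption made for illustration: per-cluster metrics are restricted to diagonal matrices, the must-link cost averages the distance under the two clusters' metrics, and the cannot-link cost is a constant placeholder. These choices are not claimed to match the functions used by Bilenko et al. (2004).

```python
def sq_dist(x, y, diag):
    """Squared distance under a diagonal metric given as a list of weights."""
    return sum(w * (xi - yi) ** 2 for w, xi, yi in zip(diag, x, y))


def violation_penalty(points, labels, metrics, must_link, cannot_link):
    """Sum weight * cost over every violated constraint.

    metrics:     metrics[h] is the diagonal of the matrix for cluster h
    must_link:   list of (i, j, weight) triples
    cannot_link: list of (i, j, weight) triples
    """
    total = 0.0
    for i, j, w in must_link:
        if labels[i] != labels[j]:
            # Illustrative f_M: average the distance under both clusters' metrics.
            f_m = 0.5 * (sq_dist(points[i], points[j], metrics[labels[i]])
                         + sq_dist(points[i], points[j], metrics[labels[j]]))
            total += w * f_m
    for i, j, w in cannot_link:
        if labels[i] == labels[j]:
            # Illustrative f_C: constant placeholder cost of 1 per violation.
            total += w * 1.0
    return total


# Hypothetical toy data: three points, two clusters, one metric per cluster.
points = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
labels = [0, 1, 0]
metrics = {0: [1.0, 1.0], 1: [0.5, 1.0]}
must_link = [(0, 1, 2.0)]     # violated: points 0 and 1 are in different clusters
cannot_link = [(0, 2, 3.0)]   # violated: points 0 and 2 share a cluster

penalty = violation_penalty(points, labels, metrics, must_link, cannot_link)
print(penalty)
```

Doubling a constraint's weight doubles its contribution to the total, which is the proportionality described on the slide.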
9 / 12
MPCK-means algorithm – Bilenko et al. (2004)
10 / 12
MPCK-means algorithm – Bilenko et al. (2004)
11 / 12
References
Basu, S., Davidson, I., and Wagstaff, K. (2008). Constrained Clustering:
Advances in Algorithms, Theory, and Applications. Chapman &
Hall/CRC, 1st edition.
Bilenko, M., Basu, S., and Mooney, R. J. (2004). Integrating constraints
and metric learning in semi-supervised clustering. In Proceedings of
the Twenty-first International Conference on Machine Learning, ICML
’04, pages 11–, New York, NY, USA. ACM.
Wagstaff, K., Cardie, C., Rogers, S., and Schrödl, S. (2001). Constrained
k-means clustering with background knowledge. In Proceedings of the
Eighteenth International Conference on Machine Learning, ICML ’01,
pages 577–584, San Francisco, CA, USA. Morgan Kaufmann
Publishers Inc.
Xing, E. P., Ng, A. Y., Jordan, M. I., and Russell, S. (2003). Distance
metric learning, with application to clustering with side-information. In
Advances in Neural Information Processing Systems, pages 505–512.
MIT Press.
12 / 12
