Semi-Supervised Learning 
Lukas Tencer 
PhD student @ ETS
Motivation
Image Similarity 
- Domain of origin 
Face Recognition 
- Cross-race effect 
Motivation in Machine Learning 
Methodology
When to use Semi-Supervised Learning? 
• Labelled data is hard to get and expensive 
– Speech analysis: 
• Switchboard dataset 
• 400 hours annotation time for 1 hour of speech 
– Natural Language Processing 
• Penn Chinese Treebank 
• 2 Years for 4000 sentences 
– Medical Application 
• Require an expert's opinion, which might not be unique 
• Unlabelled data is cheap 
Types of Semi-Supervised Learning 
• Transductive Learning 
– Does not generalize to unseen data 
– Produces labels only for the data at training time 
• 1. Assume labels 
• 2. Train classifier on assumed labels 
• Inductive Learning 
– Does generalize to unseen data 
– Not only produces labels, but also the final classifier 
– Manifold Assumption 
Selected Semi-Supervised Algorithms 
• Self-Training 
• Help-Training 
• Transductive SVM (S3VM) 
• Multiview Algorithms 
• Graph-Based Algorithms 
• Generative Models 
• ……. 
….. 
… 
Self-Training 
• The Idea: If I am highly confident about an example's label, I 
assume it is correct 
• Given a training set T = {x_i} and an unlabelled set U = {u_j} (runnable sketch below): 
1. Train f on T 
2. Get predictions P = f(U) 
3. If P_i > α, add (u_i, f(u_i)) to T 
4. Retrain f on T 
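For concreteness, here is a minimal Python sketch of that loop (not the code used in the talk); the GaussianNB base classifier, the threshold α, and all names are illustrative assumptions.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def self_training(X_train, y_train, X_unlabelled, alpha=0.95, max_iter=10):
    """Self-training: repeatedly add the most confident predictions on U to T."""
    T_X, T_y, U = X_train.copy(), y_train.copy(), X_unlabelled.copy()
    clf = GaussianNB()
    for _ in range(max_iter):
        if len(U) == 0:
            break
        clf.fit(T_X, T_y)                              # 1. train f on T
        proba = clf.predict_proba(U)                   # 2. predictions P = f(U)
        pred = clf.classes_[proba.argmax(axis=1)]
        confident = proba.max(axis=1) > alpha          # 3. keep examples with P_i > alpha
        if not confident.any():
            break
        T_X = np.vstack([T_X, U[confident]])           #    add (u_i, f(u_i)) to T
        T_y = np.concatenate([T_y, pred[confident]])
        U = U[~confident]
    clf.fit(T_X, T_y)                                  # 4. final retrain of f on T
    return clf
```

A larger α means fewer but cleaner pseudo-labels per round; a smaller α grows T faster but amplifies labelling noise.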
Self-Training 
• Advantages: 
– Very simple and fast method 
– Frequently used in NLP 
• Disadvantages: 
– Amplifies noise in the labelled data 
– Requires an explicit definition of P(y|x) 
– Hard to implement for discriminative classifiers (SVM) 
Self-Training 
1. Naïve Bayes Classifier on Bag-of-Visual-Words features for 2 Classes 
2. Classify Unlabelled Data based on the Learned Classifier 
Self-Training 
3. Add the most confident images to the training set 
4. Retrain and repeat 
Help-Training 
• The Challenge: How to make Self-Training work for 
Discriminative Classifiers (SVM) ? 
• The Idea: Train a Generative Help Classifier to get p(y|x) 
• Given a training set T = {x_i}, an unlabelled set U = {u_j}, a 
generative classifier g and a discriminative classifier f (sketch below): 
1. Train f and g on T 
2. Get predictions P_g = g(U) and P_f = f(U) 
3. If P_g,i > α, add (u_i, f(u_i)) to T 
4. Reduce the value of α if no prediction satisfies P_g,i > α 
5. Retrain f and g on T; repeat until U = ∅ 
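A hedged sketch of the same loop with a generative helper g (GaussianNB, supplying p(y|x)) and a discriminative classifier f (an SVM); the α decay schedule and all names are assumptions for illustration, not the original implementation.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

def help_training(X_train, y_train, X_unlabelled, alpha=0.95, alpha_decay=0.9):
    """Help-training: g decides which examples are confident, f provides their labels."""
    T_X, T_y, U = X_train.copy(), y_train.copy(), X_unlabelled.copy()
    g, f = GaussianNB(), SVC()                     # generative helper, discriminative classifier
    while len(U) > 0:
        g.fit(T_X, T_y)                            # 1. train f and g on T
        f.fit(T_X, T_y)
        proba_g = g.predict_proba(U)               # 2. P_g = g(U)
        pred_f = f.predict(U)                      #    P_f = f(U)
        confident = proba_g.max(axis=1) > alpha
        if not confident.any():                    # 4. nothing passes the threshold:
            alpha *= alpha_decay                   #    relax alpha and try again
            continue
        T_X = np.vstack([T_X, U[confident]])       # 3. add (u_i, f(u_i)) to T
        T_y = np.concatenate([T_y, pred_f[confident]])
        U = U[~confident]
    f.fit(T_X, T_y)                                # 5. final retrain of f on the enlarged T
    return f
```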
Transductive SVM (S3VM) 
• The Idea: Find the largest-margin classifier such that the 
unlabelled data lie outside the margin as much as possible; 
this adds a regularization term over the unlabelled data 
• Given a training set T = {x_i} and an unlabelled set U = {u_j} (toy sketch below): 
1. Enumerate all possible labelings U^1 ⋯ U^n of U 
2. For each T^k = T ∪ U^k, train a standard SVM 
3. Choose the SVM with the largest margin 
• What is the catch? 
• An NP-hard problem; fortunately, approximations exist 
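To make the catch concrete, here is a toy brute-force sketch (binary labels encoded as ±1): it enumerates all 2^|U| labelings, so it is only feasible for a handful of unlabelled points and merely illustrates why approximations are needed. LinearSVC and all names are assumptions for illustration.

```python
import numpy as np
from itertools import product
from sklearn.svm import LinearSVC

def brute_force_s3vm(X_train, y_train, X_unlabelled, C=1.0):
    """Try every labeling of U (binary +/-1 case) and keep the widest-margin SVM."""
    X_all = np.vstack([X_train, X_unlabelled])
    best_model, best_margin = None, -np.inf
    for labeling in product([-1, 1], repeat=len(X_unlabelled)):  # 1. all labelings U^k
        y_all = np.concatenate([y_train, labeling])              # 2. T^k = T ∪ U^k
        svm = LinearSVC(C=C).fit(X_all, y_all)                   #    train a standard SVM
        margin = 1.0 / np.linalg.norm(svm.coef_)                 #    margin ∝ 1 / ||w||
        if margin > best_margin:                                 # 3. keep the largest margin
            best_model, best_margin = svm, margin
    return best_model
```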
Transductive SVM (S3VM) 
• Solving a non-convex optimization problem: 

  J(θ) = (1/2) ‖w‖² + c₁ Σ_{x_i ∈ T} L(y_i f_θ(x_i)) + c₂ Σ_{x_i ∈ U} L(|f_θ(x_i)|) 

• Methods: 
– Local Combinatorial Search 
– Standard unconstrained optimization solvers (CG, BFGS, …) 
– Continuation Methods 
– Concave-Convex Procedure (CCCP) 
– Branch and Bound 
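As a worked illustration of the objective above, a short sketch that evaluates J(θ) for a linear model f_θ(x) = w·x + b, with the hinge loss L(z) = max(0, 1 − z) on labelled points and the symmetric L(|f_θ(x)|) on unlabelled points; the function and parameter names are assumptions.

```python
import numpy as np

def hinge(z):
    """L(z) = max(0, 1 - z)."""
    return np.maximum(0.0, 1.0 - z)

def s3vm_objective(w, b, X_lab, y_lab, X_unlab, c1=1.0, c2=0.5):
    """J(theta) = 1/2 ||w||^2 + c1 * sum_T L(y_i f(x_i)) + c2 * sum_U L(|f(x_i)|)."""
    f_lab = X_lab @ w + b            # decision values on labelled points
    f_unlab = X_unlab @ w + b        # decision values on unlabelled points
    return (0.5 * np.dot(w, w)
            + c1 * hinge(y_lab * f_lab).sum()       # labelled: standard hinge loss
            + c2 * hinge(np.abs(f_unlab)).sum())    # unlabelled: penalize points inside the margin
```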
Transductive SVM (S3VM) 
• Advantages: 
– Can be used with any SVM 
– Clear, mathematically well-formulated optimization criterion 
• Disadvantages: 
– Hard to optimize 
– Prone to local minima – non convex 
– Only small gain given modest assumptions 
Multiview Algorithms 
• The Idea: Train 2 classifiers on 2 disjoint sets of features, 
then let each classifier label unlabelled examples and 
teach the other classifier 
• Given a training set T = {x_i} and an unlabelled set U = {u_j} (sketch below): 
1. Split T into T_1 and T_2 along the feature dimension 
2. Train f_1 on T_1 and f_2 on T_2 
3. Get predictions P_1 = f_1(U) and P_2 = f_2(U) 
4. Add the top k from P_1 to T_2 and the top k from P_2 to T_1 
5. Repeat until U = ∅ 
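A minimal co-training sketch, assuming two aligned feature views (X1/X2 labelled, U1/U2 unlabelled) and using a shared-pool variant where each round's pseudo-labelled examples enlarge both training views; GaussianNB as both view classifiers, the top-k rule and all names are illustrative assumptions.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_training(X1, X2, y, U1, U2, k=5, max_iter=20):
    """Two classifiers on two feature views teach each other their most confident labels."""
    f1, f2 = GaussianNB(), GaussianNB()
    for _ in range(max_iter):
        if len(U1) == 0:
            break
        f1.fit(X1, y)                                        # 2. train f1 on T1, f2 on T2
        f2.fit(X2, y)
        p1, p2 = f1.predict_proba(U1), f2.predict_proba(U2)  # 3. P1 = f1(U), P2 = f2(U)
        top1 = np.argsort(p1.max(axis=1))[-k:]               # f1's most confident examples
        top2 = np.argsort(p2.max(axis=1))[-k:]               # f2's most confident examples
        lab1 = f1.classes_[p1.argmax(axis=1)][top1]          # labels f1 teaches
        lab2 = f2.classes_[p2.argmax(axis=1)][top2]          # labels f2 teaches
        chosen = np.concatenate([top1, top2])                # (overlaps may be added twice; fine for a sketch)
        new_y = np.concatenate([lab1, lab2])
        X1 = np.vstack([X1, U1[chosen]])                     # 4. both views grow with the new examples
        X2 = np.vstack([X2, U2[chosen]])
        y = np.concatenate([y, new_y])
        keep = np.setdiff1d(np.arange(len(U1)), chosen)      # 5. remove them from U
        U1, U2 = U1[keep], U2[keep]
    return f1, f2
```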
Multiview Algorithms 
• Application: Web-page Topic Classification 
– 1. Classifier for Images; 2. Classifier for Text 
Multiview Algorithms 
• Advantages: 
– Simple Method applicable to any classifier 
– Can correct mistakes in classification between the 2 
classifiers 
• Disadvantages: 
– Assumes conditional independence between features 
– Natural split may not exist 
– An artificial split may be complicated if there are only a few features 
Graph-Based Algorithms 
• The Idea: Create a connected graph from labelled and 
unlabelled examples, propagate labels over the graph 
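A short sketch of this idea using scikit-learn's LabelPropagation, which builds an RBF-kernel graph over all points and propagates labels from T to U; marking unlabelled points with -1 follows the scikit-learn convention, the rest of the names are assumptions.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

def propagate_labels(X_lab, y_lab, X_unlab, gamma=20):
    """Build an RBF-kernel graph over labelled + unlabelled points and propagate labels."""
    X = np.vstack([X_lab, X_unlab])
    y = np.concatenate([y_lab, -np.ones(len(X_unlab), dtype=int)])  # -1 marks unlabelled
    model = LabelPropagation(kernel="rbf", gamma=gamma)
    model.fit(X, y)
    return model.transduction_[len(X_lab):]   # propagated labels for the unlabelled points
```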
Graph-Based Algorithms 
• Advantages: 
– Great performance if the graph fits the task 
– Can be used in combination with any model 
– Explicit mathematical formulation 
• Disadvantages: 
– Problem if graph does not fit the task 
– Hard to construct graph in sparse spaces 
Generative Models 
• The Idea: Assume distribution using labelled data, update 
using unlabelled data 
• A simple model is GMM + EM (sketch below) 
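A minimal sketch of that idea, assuming one Gaussian per class: parameters are initialised from the labelled data via fixed one-hot responsibilities, then EM re-estimates only the responsibilities of the unlabelled points. The parameterisation and all names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def semi_supervised_gmm(X_lab, y_lab, X_unlab, n_iter=50):
    """One Gaussian per class; labelled responsibilities are fixed, unlabelled ones come from EM."""
    classes = np.unique(y_lab)
    X = np.vstack([X_lab, X_unlab])
    R_lab = (y_lab[:, None] == classes[None, :]).astype(float)          # fixed one-hot responsibilities
    R_unlab = np.full((len(X_unlab), len(classes)), 1.0 / len(classes))  # uniform init for unlabelled
    for _ in range(n_iter):
        R = np.vstack([R_lab, R_unlab])
        # M-step: weighted priors, means and covariances over all points
        pis = R.mean(axis=0)
        mus = (R.T @ X) / R.sum(axis=0)[:, None]
        covs = [np.cov(X.T, aweights=R[:, k]) + 1e-6 * np.eye(X.shape[1])
                for k in range(len(classes))]
        # E-step: recompute responsibilities for the unlabelled points only
        dens = np.column_stack([pis[k] * multivariate_normal.pdf(X_unlab, mus[k], covs[k])
                                for k in range(len(classes))])
        R_unlab = dens / dens.sum(axis=1, keepdims=True)
    return classes[np.argmax(R_unlab, axis=1)]
```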
Generative Models 
• Advantages: 
– Nice probabilistic framework 
– Instead of EM you can go fully Bayesian and include a 
prior via MAP 
• Disadvantages: 
– EM finds only local optima 
– Makes strong assumptions about class distributions 
What could go wrong? 
• Semi-Supervised Learning makes a lot of assumptions 
– Smoothness 
– Clusters 
– Manifolds 
• Some techniques (Co-Training) require very specific 
setup 
• Frequently problem with noisy labels 
• There is no free lunch 
There is much more out there 
• Structural Learning 
• Co-EM 
• Tri-Training 
• Co-Boosting 
• Unsupervised pretraining – deep learning 
• Transductive Inference 
• Universum Learning 
• Active Learning + Semi-Supervised Learning 
• ……. 
• ….. 
• … 
My work
Demo
Conclusion 
• Play with Semi-Supervised Learning 
• Basic methods are very simple to implement and can give 
you up to a 5-10% accuracy gain 
• You can cheat at competitions by using unlabelled data, 
often no assumption is made about external data 
• Be careful when running Semi-Supervised Learning in a 
production environment; keep an eye on your algorithm 
• If running in production, be aware that data patterns 
change and old assumptions about labels may screw up 
your new unlabelled data 
Some more resources 
Videos to watch: 
Semisupervised Learning Approaches – Tom Mitchell CMU : 
http://videolectures.net/mlas06_mitchell_sla/ 
MLSS 2012 Graph based semi-supervised learning - Zoubin 
Ghahramani Cambridge : 
https://www.youtube.com/watch?v=HZQOvm0fkLA 
Books to read: 
• Semi-Supervised Learning – Chapelle, Schölkopf, Zien 
• Introduction to Semi-Supervised Learning – Zhu, Goldberg, 
Brachman, Dietterich 
THANKS FOR YOUR TIME 
Lukas Tencer 
lukas.tencer@gmail.com 
http://lukastencer.github.io/ 
https://github.com/lukastencer 
https://twitter.com/lukastencer 
Graduating August 2015, looking for ML and DS opportunities
