Applying Deep Learning
with Weak and Noisy labels
Darian Frajberg
darian.frajberg@polimi.it
September 21, 2018
2
Introduction
 Traditional Programming
 Machine Learning
3
Introduction
 Deep Learning is a subfield of Machine Learning that has
achieved impressive results outperforming previous
techniques and even humans, thus becoming the state-of-
the-art in a wide range of tasks
 This success was possible due to 3 main factors:
- Processing power
- Models
- Data
 Computer Vision has been among the
areas that benefited the most
4
Introduction
 Computer Vision tasks
[http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf]
5
Types of Learning
 Supervised learning
- Training data includes desired outputs
 Unsupervised learning
- Training data does not include desired outputs
 Weakly or semi-supervised learning
- Training data includes a few desired outputs
 Reinforcement learning
- Rewards from sequence of actions
[ECE 5984: Introduction to Machine Learning, Dhruv Batra]
6
Bias – Variance tradeoff
 Bias is an error from erroneous assumptions in the learning algorithm. High bias
can cause an algorithm to simplify and miss the relevant relations between
features and target outputs (underfitting).
 Variance is an error from sensitivity to small fluctuations in the training set. High
variance can cause an algorithm to model the random noise in the training data,
rather than the intended outputs (overfitting).
 Supervised learning with large and clean data can generalize well, achieving high
accuracy by reducing both bias and variance (see the decomposition below).
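The tradeoff can be made precise with the standard squared-error decomposition, where f is the true function, f̂ the learned model, and σ² the irreducible noise:

```latex
\mathbb{E}\!\left[(y - \hat{f}(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^{2}}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^{2}\right]}_{\text{variance}}
  + \underbrace{\sigma^{2}}_{\text{irreducible noise}}
```

More clean data shrinks the variance term without forcing a simpler, higher-bias model.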
7
Large-scale dataset labelling
 Dataset with over 15M labeled images
 Approximately 22k categories
 Collected from the web and labeled by Amazon Mechanical Turk
(crowdsourcing tool)
[Deng, Jia, et al. "ImageNet: A large-scale hierarchical image database." CVPR 2009. IEEE, 2009.]
8
Large-scale dataset labelling
 Dataset for autonomous driving with
semantic segmentation pixel-labels
released by Baidu
 Classes: 26
 Size: 200K image frames
 Resolution: 3,384 x 2,710 pixels
 10x bigger than previous datasets
(KITTI and Cityscapes)
 200k images x 3,384 x 2,710 pixels =
1,834,128,000,000 pixels
(+1.8 TRILLION pixels!!!)
[Huang, Xinyu, et al. "The ApolloScape Dataset for Autonomous Driving." arXiv preprint arXiv:1803.06184 (2018).]
9
Large-scale dataset labelling
 State-of-the-art machine learning models
require massive labeled training sets, which
usually do not exist for real-world
applications
 High quality data characteristics:
- Accuracy
- Completeness
- Consistency
- Uniqueness
- Validity
- Timeliness
 Constraints:
- Labor-intensive
- Expensive
- Tedious
- May require domain expertise
10
Large-scale dataset labelling
 Solution:
- Exploit cheaper sources of labels
 Options:
- Publicly available data
- Weakly or semi-supervised approaches
- Data augmentation (see the sketch below)
- Transfer learning
- Synthetic data generation with GANs
 They can all be combined
 In this presentation we will review some
recent works to learn from large scale
datasets containing weak and noisy labels
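As a concrete illustration of one of these options, here is a minimal data-augmentation sketch using torchvision (the directory name `train_dir` is illustrative, not from the slides):

```python
# Minimal data-augmentation sketch with torchvision: each epoch sees a
# different random variant of every image, multiplying the effective
# size of a small labeled set at no labeling cost.
import torchvision.transforms as T
from torchvision.datasets import ImageFolder

augment = T.Compose([
    T.RandomResizedCrop(224),                     # random scale and crop
    T.RandomHorizontalFlip(),                     # random mirroring
    T.ColorJitter(brightness=0.2, contrast=0.2),  # mild photometric noise
    T.ToTensor(),
])

# "train_dir" is a hypothetical folder of class-labeled images.
dataset = ImageFolder("train_dir", transform=augment)
```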
11
Classification in the presence of Label Noise: a
Survey (2014) [1/2]
 Classification consists of predicting the classes of new samples using a model
inferred from training data
 Class label noise (mislabeled instances) is an important issue in classification
problems
 Sources
- Labelers are humans
- Insufficient information provided to labelers
- Labels may be collected by non-experts
- Labeling may be subjective
- Data encoding or communication problems
 Consequences
- Decreased accuracy of predictions
- Increased difficulty to identify relevant features
- Increased complexity of inferred models
- Increased number of necessary training samples
- Increased distortion of observed frequencies
12
Classification in the presence of Label Noise: a
Survey (2014) [2/2]
 Taxonomy of methods to deal with label noise
• Label noise-robust models
• Algorithms not too sensitive to label noise
• Common losses are not completely robust
• Label noise handled by avoiding overfitting
• Ensemble methods can improve robustness
• Label noise cleaning methods (see the sketch after this list)
• Filter approaches to remove or relabel mislabeled instances
• Anomaly detection through model predictions or ad hoc
measures and thresholds
• Cheap and easy to implement
• Overcleansing may reduce the performance of classifiers
• Label noise-tolerant algorithms
• Probabilistic models can take advantage of prior knowledge
• Label noise can be directly modeled during training
• Increased complexity and overfitting risk
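A minimal sketch of the filtering idea, assuming scikit-learn and illustrative arrays `X`, `y`: flag samples where a cross-validated classifier confidently disagrees with the given label.

```python
# Confidence-based cleaning filter (sketch): suspect labels are those a
# held-out model contradicts with high confidence.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def flag_suspect_labels(X, y, threshold=0.9):
    # Out-of-fold probabilities avoid judging a sample with a model
    # that has already memorized it.
    proba = cross_val_predict(LogisticRegression(max_iter=1000),
                              X, y, cv=5, method="predict_proba")
    pred = proba.argmax(axis=1)
    conf = proba.max(axis=1)
    # Suspect: confident prediction that disagrees with the given label.
    return (pred != y) & (conf > threshold)
```

Flagged samples can then be removed, relabeled, or sent for human review; as noted above, overcleansing can hurt.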
13
Impact of biased mislabeling on learning with deep
networks (2017) [1/2]
 Types of mislabeled data
- Unbiased mislabeling: inconsistent noise
- Biased mislabeling: consistent noise
 Questions
- What is the degree to which mislabeled data affects the accuracy of
classifiers?
- Is it better to provide a large less accurate training set for deep learning or is
it better to put more effort into increasing the accuracy of the data?
 Experiment
- MNIST dataset
- Study the impact of mislabeling ratios (0%-10%-20%-30%-40%-50%) on
different dataset sizes (500 – 8,000 – 55,000)
- Study the impact of unbiased and biased mislabeling (noise injection sketched below)
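The noise-injection step of such an experiment could look like the sketch below; the biased mapping (e.g. 3 → 8) is a hypothetical choice of visually similar digits, not taken from the paper.

```python
# Sketch: inject unbiased vs. biased label noise into a label vector y.
import numpy as np

rng = np.random.default_rng(0)

def corrupt_unbiased(y, ratio, num_classes=10):
    # Inconsistent noise: each corrupted sample gets a random wrong class.
    y = y.copy()
    idx = rng.choice(len(y), int(ratio * len(y)), replace=False)
    y[idx] = (y[idx] + rng.integers(1, num_classes, idx.size)) % num_classes
    return y

def corrupt_biased(y, ratio, mapping=None):
    # Consistent noise: the same classes are always confused the same way.
    mapping = mapping or {3: 8, 7: 1}   # hypothetical confusions
    y = y.copy()
    idx = rng.choice(len(y), int(ratio * len(y)), replace=False)
    for i in idx:
        y[i] = mapping.get(int(y[i]), y[i])
    return y
```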
14
Impact of biased mislabeling on learning with deep
networks (2017) [2/2]
 Results
 Conclusion
- Clear tradeoff between data size and amount of mislabeled data
- Deep Learning can model noise intrinsically on large data sets
15
Learning from Noisy Labels with Deep Neural
Networks (2014) [1/2]
 User tags from social websites and keywords from image search engines are
very noisy labels and are unlikely to help train deep neural networks without
additional tricks
 Question
- How to use clean data and high volumes of noisy data for Deep Learning
classification tasks?
 Approach
- Bottom-up noise model
- Reweighting of noisy data with the loss function (noise layer sketched below)
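A minimal PyTorch rendering of the noise-layer idea described in the speaker notes (a sketch of the mechanism, not the authors' code): a learned transition matrix on top of the softmax maps clean class probabilities to noisy-label probabilities.

```python
# Noise-adaptation layer (sketch): p(noisy label) = p(true class) @ T,
# where each row of T is a distribution p(noisy | true).
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseAdaptationLayer(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Scaled identity so the row-softmax starts close to identity,
        # i.e. the model initially trusts the labels.
        self.transition = nn.Parameter(5.0 * torch.eye(num_classes))

    def forward(self, clean_probs):
        T = F.softmax(self.transition, dim=1)  # row-normalize
        return clean_probs @ T

# Usage idea: batches of noisy data are trained through this layer,
# while batches of clean data bypass it.
```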
16
Learning from Noisy Labels with Deep Neural
Networks (2014) [2/2]
 Experiment
- ImageNet dataset + Web Image Search
- Study the impact of using 1.3M images with clean labels over 1,000 classes
(ImageNet) and 1.4M noisily labeled images (scraped from the Internet)
 Results
17
Learning from Weak and Noisy Labels for Semantic
Segmentation (2017) [1/2]
 Weakly supervised semantic segmentation (WSSS) aims to learn a segmentation
model from weak (image-level) as opposed to strong (pixel-level) labels
 User-tagged images from media sharing sites (e.g. Flickr) can be exploited as an
unlimited supply of noisy labeled data
 Approach
- Label noise reduction technique over classes and super-pixels (see the sketch below)
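To illustrate the superpixel half of the idea, a hedged sketch (this shows the mechanism only, not the paper's full noise-reduction model): noisy per-pixel labels are smoothed by majority vote inside SLIC superpixels.

```python
# Superpixel smoothing sketch: pixels inside a coherent region should
# share a label, so each SLIC superpixel takes its majority class.
import numpy as np
from skimage.segmentation import slic

def superpixel_vote(image, pixel_labels, n_segments=200):
    # pixel_labels: 2-D array of non-negative integer class ids.
    segments = slic(image, n_segments=n_segments, compactness=10)
    refined = pixel_labels.copy()
    for s in np.unique(segments):
        mask = segments == s
        refined[mask] = np.bincount(pixel_labels[mask]).argmax()
    return refined
```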
18
Learning from Weak and Noisy Labels for Semantic
Segmentation (2017) [2/2]
 Experiment
- PASCAL VOC dataset
- Study the impact of the proposed approach for WSSS
- Study the impact of noisy class labels for WSSS
 Results
19
Weakly Supervised Instance Segmentation using
Class Peak Response (2018) [1/2]
 Weakly supervised instance segmentation (WSIS) is similar to weakly
supervised semantic segmentation (WSSS), but it also aims to predict
precise masks for each object instance.
 Approach
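The key step is locating class peaks, i.e. local maxima in a class response map that mark likely object instances. A minimal scipy-based sketch of that step only (the paper's peak back-propagation to instance masks is omitted):

```python
# Peak localization sketch: a pixel is a peak if it equals the local
# maximum in its window and exceeds a confidence threshold.
import numpy as np
from scipy.ndimage import maximum_filter

def find_class_peaks(response_map, window=3, threshold=0.5):
    local_max = maximum_filter(response_map, size=window)
    peaks = (response_map == local_max) & (response_map > threshold)
    return np.argwhere(peaks)   # (row, col) coordinates of candidate instances
```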
20
Weakly Supervised Instance Segmentation using
Class Peak Response (2018) [2/2]
 Experiment
- PASCAL VOC dataset
- Study the impact of the proposed approach for WSIS
 Results
21
Learning to Segment Every Thing (2018) [1/3]
 Partially supervised instance segmentation (PSIS) aims at using bounding
box annotations to train a model for instance segmentation
 Approach
- Transfer learning (weight-transfer sketch below)
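Concretely, the transfer is a learned weight-transfer function: a small network predicts a class's mask-head weights from its box-head weights, so classes annotated only with bounding boxes still get a mask predictor. A hedged PyTorch sketch with illustrative dimensions:

```python
# Weight-transfer sketch: map per-class detection weights to per-class
# mask weights with a small MLP, trained only on the classes that have masks.
import torch.nn as nn

class WeightTransfer(nn.Module):
    def __init__(self, box_dim, mask_dim, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(box_dim, hidden),
            nn.LeakyReLU(),
            nn.Linear(hidden, mask_dim),
        )

    def forward(self, box_weights):      # (num_classes, box_dim)
        return self.mlp(box_weights)     # predicted (num_classes, mask_dim)
```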
22
Learning to Segment Every Thing (2018) [2/3]
 Experiment
- COCO dataset (80 classes)
- Study the impact of the proposed approach for PSIS
- Split dataset (20/60 – 60/20)
- Quantitative evaluation
 Results
23
Learning to Segment Every Thing (2018) [3/3]
 Experiment
- COCO dataset (80 classes)
- Visual Genome dataset (3,000 classes)
- Study the impact of the proposed approach for PSIS
- Split dataset (80/3,000)
- Qualitative evaluation
 Results
- Green boxes: instances with
instance mask annotations
- Red boxes: instances with only
bounding boxes
24
Label Refinery: Improving ImageNet Classification
through Label Progression (2018) [1/2]
 Supervised learning relies on three main components: data, labels and
models. However, labels, which are often incomplete, ambiguous and
redundant, are not usually studied.
 Question
- How to modify labels during training so as to improve generalization and
accuracy of learning models?
 Approach
- Iterative soft, multi-category, dynamically generated label refinery (training loop sketched below)
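A minimal sketch of the refinement loop, assuming a PyTorch-style setup (`models`, `loader`, `optimizers` are illustrative names): each generation trains against the soft labels produced by the previous one.

```python
# Label-refinery loop sketch: generation k learns from the softened,
# dynamically generated predictions of generation k-1.
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, soft_targets):
    # Cross-entropy against a full distribution rather than a hard class.
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def refine(models, loader, optimizers, epochs=1):
    prev = None
    for model, opt in zip(models, optimizers):
        for _ in range(epochs):
            for x, y in loader:
                if prev is None:
                    loss = F.cross_entropy(model(x), y)        # hard labels
                else:
                    with torch.no_grad():
                        soft = F.softmax(prev(x), dim=1)       # refined labels
                    loss = soft_cross_entropy(model(x), soft)
                opt.zero_grad(); loss.backward(); opt.step()
        prev = model.eval()
    return prev   # the last, most refined generation
```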
25
Label Refinery: Improving ImageNet Classification
through Label Progression (2018) [2/2]
 Experiment
- ImageNet dataset
- Study the impact of the proposed approach on state-of-the-art architectures
 Results
26
Conclusions
 Data collection for supervised learning
tasks is not always feasible
 There are many alternatives for exploiting
publicly available noisy data
 Weakly supervised learning techniques
can be very effective
 Supervised learning can be boosted by
complementing it with the presented
techniques
27
Thank you
Applying Deep Learning with
Weak and Noisy Labels
Darian Frajberg
darian.frajberg@polimi.it
Speaker notes

  1. In recent years, Deep Learning has achieved outstanding results, outperforming previous techniques and even humans, thus becoming the state of the art in a wide range of tasks, among which Computer Vision is one of the areas that benefited the most. Nonetheless, most of this success is tightly coupled to strongly supervised learning tasks, which require highly accurate, expensive and labor-intensive ground-truth labels. In this presentation, we introduce diverse alternatives to deal with this problem and support the training of Deep Learning models for Computer Vision tasks by simplifying the process of data labelling or by exploiting the unlimited supply of publicly available data on the Internet (such as user-tagged images from Flickr). Such alternatives rely on data with noisy and weak labels, which are much easier to collect but require special care to be used.
  2. Underfitting = high bias, low variance. Overfitting = low bias, high variance. Small dataset → high bias → underfitting. Random noise → high variance → overfitting. Large and clean data = more relevant relations to represent + less noise. Ideally, one wants a model that both accurately captures the regularities in its training data and generalizes well to unseen data.
  3. Completeness: are all data sets and data items recorded? Consistency: can we match the data set across data stores? Uniqueness: is there a single view of the data set? Validity: does the data match the rules? Accuracy: does the data reflect the data set?
  4. Mislabeled instances: class noise vs. feature noise.
  5. Label noise cleansing can be applied by detecting possible outliers/mislabeled instances and then applying semi-supervised learning or submitting them for human review. Filtering mechanisms can introduce bias into the data. Manual review is costly, time-consuming and not scalable.
  8. This can be achieved by adding a linear layer on top of the softmax layer, which learns and models the noise distribution. Noise distribution estimation: train a CNN on clean data; generate a confusion matrix on the clean data; generate a confusion matrix on the noisy data; calculate the noise distribution from their difference; train a new model.
  11. CVPR: state of the art in weakly supervised instance segmentation. This technique exploits the class response maps extracted from classification networks, which are generally visualized for interpretability purposes, localizing in a fully convolutional manner the image regions that motivate the inferred class selection, together with their confidence values. Local maxima (peaks) usually correspond to informative regions of instances.
  12. CVPR: state of the art in weakly supervised instance segmentation.
  13. The authors present an approach for scaling up instance segmentation to large-scale scenarios, where data collection for fully supervised learning becomes far too expensive.
  14. This work proposes simple, effective and general techniques to refine the quality of labels for image classification problems, significantly improving the accuracy of many state-of-the-art models (e.g., AlexNet, MobileNet, VGG19, ResNet).