Active Learning
Ragib Ahsan
Committee
Prof. Xinhua Zhang (Chair)
Prof. Brian Ziebart
Prof. Jon A Solworth
Overview
● What is active learning?
● Does active learning make any difference?
● Active learning from multiple oracles
● Active learning with weak and strong oracle
● Multiple oracles with varying expertise
2
What is Active Learning?
● Introduced in education by the 1990s
● Let students participate actively
● Doing things rather than just listening
● Inspired machine learning
● Also known as Query Learning
3
Contrast to passive learning
Passive Learning Active Learning
4
Applications
● Fewer labeled data
● Speech Recognition
○ Word level annotation can take ten times longer
than actual audio (Zhu, 2005)
● Medical Diagnosis
○ Expert doctors
● Document Classification
5
Active Learning Examples
Pool-based active learning (Settles, 2009)
6
Active Learning Examples
(a) Toy dataset of two Gaussians; (b) a logistic regression model trained on randomly sampled labels reaches 70% accuracy;
(c) logistic regression with active querying reaches 90% accuracy (Settles, 2009)
7
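The pool-based setting above can be sketched in a few lines. This is a minimal illustration, not the experiment from Settles (2009): the Gaussian centers, the learning rate, the budget, and all function names are assumptions made for the sketch, and the logistic regression is a plain gradient-descent fit written out so that the block needs only the standard library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, epochs=200, lr=0.5):
    """Fit w0 + w1*x1 + w2*x2 by batch gradient descent on the log loss."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), t in zip(X, y):
            err = sigmoid(w[0] + w[1] * x1 + w[2] * x2) - t
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def prob(w, x):
    """Predicted probability of class 1 for a 2-D point x."""
    return sigmoid(w[0] + w[1] * x[0] + w[2] * x[1])


def accuracy(w, X, y):
    hits = sum((prob(w, x) >= 0.5) == (t == 1) for x, t in zip(X, y))
    return hits / len(X)


def make_pool(n_per_class, rng):
    """Two Gaussians (assumed centers at x1 = -2 and x1 = +2, unit variance)."""
    X, y = [], []
    for label, center in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)))
            y.append(label)
    return X, y


def pool_based_active_learning(X, y, budget):
    """Uncertainty sampling: repeatedly query the pool point whose
    predicted probability is closest to 0.5, then refit."""
    labeled = [y.index(0), y.index(1)]  # seed with one example per class
    while len(labeled) < budget:
        w = fit_logreg([X[i] for i in labeled], [y[i] for i in labeled])
        seen = set(labeled)
        unlabeled = [i for i in range(len(X)) if i not in seen]
        query = min(unlabeled, key=lambda i: abs(prob(w, X[i]) - 0.5))
        labeled.append(query)  # the oracle reveals y[query]
    w = fit_logreg([X[i] for i in labeled], [y[i] for i in labeled])
    return w, labeled


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_pool(100, rng)
    w, labeled = pool_based_active_learning(X, y, budget=20)
    print("labels used:", len(labeled), "pool accuracy:", accuracy(w, X, y))
```

Replacing the `min(...)` query rule with a random choice from `unlabeled` gives the passive baseline for comparison.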
Human Active Learning
[Source: JSLHR]
8
Does AL make any difference?
“Learners do benefit from the
opportunity to actively select
examples during learning. But
it is very difficult to assess the
magnitude of difference that
active learning makes
compared to passive learning”
Laughlin (1973)
There were conflicting claims
throughout the literature on
the effectiveness of active
learning
9
Does AL make any difference?
“People make inappropriate
queries to assess simple logical
hypotheses such as if p then q
(frequently examining q
instances to see if they are p, and
failing to explore not-q instances)”
Wason and Johnson-Laird (1972)
“If the learning task is properly
construed, humans actually do a
great job in asking questions”
Gigerenzer and Selten (2002)
Oaksford and Chater (2007)
10
Does AL make any difference?
Castro et al. (2008) addressed these questions:
[Q1] Do humans perform better when they can select their own examples for labeling,
compared to passive observation of labeled examples?
[Q2] If so, do they achieve the full benefit of active learning suggested by statistical
learning theory?
[Q3] If they do not, can machine learning be used to enhance human performance?
[Q4] Do the answers to these questions vary depending upon the difficulty of the
learning problem?
11
Task Formulation
● Binary classification on the interval [0, 1]
● Unknown decision boundary
● Classes 0 and 1
● n samples
● X_i ∈ [0, 1], Y_i ∈ {0, 1}
● Y_i is correct with probability 1 − ε
● 0 ≤ ε < 1/2
12
[Source: Castro et al. (2008)]
Error bound (ε = 0)
● Passive Learning
○ Random sampling
○ Error: O(1/n)
● Active Learning
○ Binary search
○ Error: O(2^{-n})
13
[Source: Castro et al. (2008)]
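A small sketch of why the two rates differ in the noise-free case. It assumes a threshold classifier that labels x as 1 when x is at or above the boundary; the names `theta`, `passive_estimate`, and `active_estimate` are illustrative and not taken from Castro et al. (2008).

```python
import random


def oracle(x, theta):
    """Noise-free label: 1 at or to the right of the boundary, else 0."""
    return 1 if x >= theta else 0


def passive_estimate(theta, n, rng):
    """Label n uniformly random points and return the midpoint of the gap
    between the largest 0-labeled and the smallest 1-labeled point."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()
        if oracle(x, theta):
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2


def active_estimate(theta, n):
    """Binary search: always query the midpoint of the remaining interval,
    so the interval halves with every label."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        mid = (lo + hi) / 2
        if oracle(mid, theta):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 0.3
    for n in (5, 10, 20):
        print(n,
              abs(passive_estimate(theta, n, rng) - theta),
              abs(active_estimate(theta, n) - theta))
```

After n queries the binary-search interval has width 2^{-n}, so the midpoint is within 2^{-(n+1)} of the boundary, while the random-sampling gap shrinks only on the order of 1/n.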
Error bound (ε > 0)
● Passive learning
● Active learning
[ Maximum Likelihood Estimate ]
14
[Source: Castro et al. (2008)]
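For the noisy passive case, a maximum likelihood estimate of the boundary can be sketched as follows. Under the stated model (each label flipped independently with probability ε < 1/2), the likelihood of a threshold is (1 − ε)^agreements · ε^disagreements, so maximizing it is the same as minimizing the number of disagreements with the observed labels. The function names and the sample sizes are assumptions for the sketch; the active-learning procedure for ε > 0 is not reproduced here.

```python
import random


def noisy_label(x, theta, eps, rng):
    """True label 1[x >= theta], flipped with probability eps."""
    clean = 1 if x >= theta else 0
    return 1 - clean if rng.random() < eps else clean


def disagreements(t, samples):
    """Number of samples whose label differs from the threshold rule at t."""
    return sum((1 if x >= t else 0) != y for x, y in samples)


def mle_threshold(samples):
    """Search the n + 1 distinct labelings a threshold can induce:
    one candidate per sample point, plus 1.0 (everything labeled 0)."""
    candidates = sorted(x for x, _ in samples) + [1.0]
    return min(candidates, key=lambda t: disagreements(t, samples))


def draw_samples(theta, eps, n, rng):
    """Passive learning: n uniformly random points with noisy labels."""
    samples = []
    for _ in range(n):
        x = rng.random()
        samples.append((x, noisy_label(x, theta, eps, rng)))
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    samples = draw_samples(theta=0.6, eps=0.2, n=200, rng=rng)
    print("estimated boundary:", mle_threshold(samples))
```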
Experiment
A few 3D visual stimuli and their X values used in the experiment.
Participants were asked to guess the decision boundary
after every three iterations
15
Experiment
● Random
○ No queries
● Human Active
○ Active queries
● Machine Yoked
○ Machine makes query
○ Human observes
16
Results
[Figure: results plotted against iteration, n]
17
[Source: Castro et al. (2008)]
Answers
[Q1] Do humans perform better when they can select their own examples for labeling,
compared to passive observation of labeled examples? - Yes, at low noise levels
[Q2] If so, do they achieve the full benefit of active learning suggested by statistical
learning theory? - No, slower decay constants
[Q3] If they do not, can machine learning be used to enhance human performance? -
Inconclusive
[Q4] Do the answers to these questions vary depending upon the difficulty of the
learning problem? - Yes, with noise levels
18
Conclusion
● Simple learning task
● Machine Yoked Learning
● Impact on:
○ Fields of psychology and cognitive sciences
○ Intelligent tutoring systems
19
AL from multiple oracles
20
Why multiple oracles?
21
Multiple Oracle: Challenges
● How to select the most informative query?
● How to select the best oracle to ask questions?
● How to deal with disagreement among the
oracles?
● How to deal with a noisy or weak oracle?
22
Weak and strong labeler
● Zhang and Chaudhuri (2015) considered exactly two oracles
● One standard oracle
○ Accurate but costly
● One weak oracle
○ Noisy but cheap
● Goal
○ Reduce number of queries to standard oracle
○ No impact on accuracy
23
Observations
● Difference Classifier to predict disagreement between
strong and weak labeler
○ Might not be statistically consistent
○ Can use cost-sensitive difference classifier
● Active learning queries a localized region of space
○ Train difference classifier on that localized region
24
Disagreement Based Active Learning (DBAL)
[Figure: version space V_t containing hypotheses h_1 to h_7 and h*; input space X containing points x_1 to x_8]
● h_1(x_1) = h_2(x_1) = h_3(x_1) = h_4(x_1) = h_5(x_1)
● h_1(x_3) != h_2(x_3) = h_3(x_3) = h_4(x_3) = h_5(x_3)
● h_1(x_4) = h_2(x_4) = h_3(x_4) = h_4(x_4) = h_5(x_4)
● The hypotheses disagree on x_3: query x_3 to the oracle O, then update V_t
25
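The query rule of disagreement-based active learning can be sketched for a finite class of threshold hypotheses. This is a simplified, realizable-case illustration (the true threshold is assumed to be in the class and the oracle is noise-free); the function names and the grid of thresholds are assumptions made for the sketch.

```python
import random


def predict(t, x):
    """Threshold hypothesis: label 1 when x >= t."""
    return 1 if x >= t else 0


def disagreement_based_learning(pool, oracle, hypotheses):
    """Stream over the pool; query the oracle only when the surviving
    hypotheses disagree on x, then drop the hypotheses that got x wrong."""
    version_space = list(hypotheses)
    queries = 0
    for x in pool:
        predictions = {predict(t, x) for t in version_space}
        if len(predictions) > 1:  # x lies in the disagreement region
            y = oracle(x)
            queries += 1
            version_space = [t for t in version_space if predict(t, x) == y]
        # otherwise every surviving hypothesis agrees, so no query is needed
    return version_space, queries


if __name__ == "__main__":
    rng = random.Random(2)
    hypotheses = [i / 100 for i in range(101)]
    theta = hypotheses[37]
    pool = [rng.random() for _ in range(500)]
    vs, queries = disagreement_based_learning(
        pool, lambda x: predict(theta, x), hypotheses)
    print("queries:", queries, "of", len(pool), "| surviving hypotheses:", len(vs))
```

Because the true hypothesis is never eliminated, every label that is skipped is one the surviving hypotheses already agree on.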
Problem Formulation
● Unlabeled Distribution, U
● Input space, X
● Label space, Y
● Hypothesis class, H
● Data distribution, D
● Excess error,
● Goal:
with as few queries to O as possible
[Figure: strong oracle O and weak oracle W]
26
Algorithm
● Three key ideas
○ Difference classifier
○ Disagreement region DIS(V)
■ Region of the input space
where two member
classifiers disagree
○ Epoch based agnostic CAL
■ Train fresh difference
classifier in each epoch
27
[Source: Theory of Active Learning
(Steve Hanneke, 2014)]
Algorithm
● Initialize error ε_0, total number of epochs k_0, and draw some n_0 examples to form labeled dataset S_0
● In each iteration up to k’ iterations:
○ Set target error
○ Draw n_k unlabeled samples
○ Identify disagreement region A_k
○ Train difference classifier h_df with A_k, O, W
○ Active learning using h_df
■ Draw m_k examples, use h_df to query either O or W, and add the labeled data to S_k
● Return a classifier learned from the labeled dataset S_k’
28
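The routing idea behind the difference classifier can be illustrated with a toy sketch. This is not the algorithm of Zhang and Chaudhuri (2015): there are no epochs, no disagreement region, and no cost-sensitive training. The boundary `THETA`, the unreliable band `BAND`, the interval-shaped difference classifier, and the `margin` parameter are all hypothetical choices made for the illustration.

```python
import random

THETA = 0.5        # hypothetical true decision boundary
BAND = (0.4, 0.6)  # hypothetical region where the weak labeler is wrong


def strong_oracle(x):
    """O: accurate but costly."""
    return 1 if x >= THETA else 0


def weak_oracle(x):
    """W: cheap, but flips the label inside BAND."""
    y = strong_oracle(x)
    return 1 - y if BAND[0] <= x <= BAND[1] else y


def train_difference_classifier(xs):
    """Ask both labelers on a small sample and return the interval that
    covers every observed disagreement (None if they never disagree)."""
    diff = [x for x in xs if strong_oracle(x) != weak_oracle(x)]
    return (min(diff), max(diff)) if diff else None


def label_with_routing(xs, interval, margin=0.05):
    """Query O where disagreement is predicted, W everywhere else."""
    labeled, strong_queries = [], 0
    for x in xs:
        predicted_disagreement = (
            interval is not None
            and interval[0] - margin <= x <= interval[1] + margin
        )
        if predicted_disagreement:
            labeled.append((x, strong_oracle(x)))
            strong_queries += 1
        else:
            labeled.append((x, weak_oracle(x)))
    return labeled, strong_queries


if __name__ == "__main__":
    rng = random.Random(3)
    interval = train_difference_classifier([rng.random() for _ in range(50)])
    xs = [rng.random() for _ in range(1000)]
    labeled, strong_queries = label_with_routing(xs, interval)
    print("queries to O:", strong_queries, "of", len(xs))
```

In this toy setup the strong oracle is needed only near the band where W is unreliable; labels elsewhere come from W at lower cost.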
Performance Guarantee
● First term for learning, second for training difference classifier
● Second term is a lower-order term when d ≈ d’
● Fitting the difference classifier does not incur a high overhead
29
AL from crowds
30
AL from crowds
● Multiple experts in supervised learning (Raykar et al.,
2009 and Yan et al., 2010)
● NLP tasks from AMT data (Snow et al., 2008)
● Yan et al. (2011) proposed a novel method for active
learning
● Focus:
○ Most informative query
○ Most useful annotator
31
Proposed Model
32
[Source: Yan et al. (2011)]
Proposed Model
(3.3)
33
Algorithm
● Two key steps
○ Select a sample to label next
○ Select the best annotator to label
● Select sample
○ Uncertainty sampling
■ Select the sample about which the classifier is least
certain
34
Algorithm: Select Sample
Where, and (ᾶ > 0)
Separating hyperplane:
35
Algorithm: Select Annotator
(3.6)
36
Algorithm
37
Experiment
(left) Labels, (center) areas of labeler expertise, and (right) annotator selection information for the
simplified two-dimensional Galaxy Dim data (Yan et al., 2011)
38
Experiment: Baselines
● active learning+majority vote
○ Active query based on majority vote of all annotators
● random sample+multi-labeler
○ Multi labeler algorithm on randomly sampled
examples
● random sample+majority vote
○ Random sampling with majority vote
39
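The majority-vote baselines aggregate the labels of all annotators into a single label per example. A minimal sketch, assuming each example comes with one label per annotator; the tie-breaking rule (smallest label wins) is an assumption, since the slides do not specify one.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the smallest label."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def aggregate(annotations):
    """annotations: one list of annotator labels per example."""
    return [majority_vote(labels) for labels in annotations]


if __name__ == "__main__":
    # three hypothetical annotators, four examples
    annotations = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 1, 0]]
    print(aggregate(annotations))
```

Unlike the multi-labeler model, this treats every annotator as equally reliable on every example.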
Experimental Result
Accuracy comparisons on text data for the polarity, focus and the evidence labelings (Yan et al., 2011)
40
More Analyses
● Decision boundary intersects
all regions of expertise
● Comparison with single oracle
AL
● Specialized vs General
expertise
41
[Source: Yan et al. (2011)]
Future Direction
● More Applications
○ Real world problems
● Optimal number of oracles
○ Do multiple oracles always perform better than a single oracle?
○ Is there an optimal number of oracles that works best?
● Cost function associated with labeling
○ Choose single vs multiple oracles
● General expertise
○ Each of the multiple oracles has general expertise
42
References
● Castro, Rui M. et al. (2008). “Human Active Learning”. In: NIPS.
● Gigerenzer, Gerd and Reinhard Selten (2002). Bounded rationality: The
adaptive toolbox. MIT press.
● Laughlin, Patrick R. (1973). “Focusing strategy in concept attainment as a
function of instructions and task”. In: Journal of Experimental Psychology.
● Oaksford, Mike and Nick Chater (2007). Bayesian rationality: The
probabilistic approach to human reasoning. Oxford University Press.
● Raykar, Vikas C. et al. (2009). “Supervised learning from multiple experts:
whom to trust when everyone lies a bit”. In: ICML.
● Settles, Burr (2009). Active Learning Literature Survey. Computer Sciences
Technical Report 1648. University of Wisconsin–Madison.
43
References
● Snow, Rion et al. (2008). “Cheap and Fast - But is it Good? Evaluating
Non-Expert Annotations for Natural Language Tasks”. In: EMNLP.
● Wason, Peter Cathcart and Philip N Johnson-Laird (1972). Psychology of
reasoning: Structure and content. Vol. 86. Harvard University Press.
● Yan, Yan et al. (2010). “Modeling annotator expertise: Learning when
everybody knows a bit of something”. In: AISTATS.
● Yan, Yan et al. (2011). “Active Learning from Crowds”. In: ICML.
● Zhang, Chicheng and Kamalika Chaudhuri (2015). “Active Learning from
Weak and Strong Labelers”. In: NIPS.
● Zhu, Xiaojin (2005). “Semi-supervised Learning with Graphs”. AAI3179046.
PhD thesis. Pittsburgh, PA, USA
● Hanneke, Steve (2014). “Theory of Active Learning”
44
Questions?
45
Appendix: HAL Results
46
Appendix: WeakStrong Algorithm
47
Appendix: WeakStrong Algorithm
48
Appendix: WeakStrong Performance Guarantee
49
Appendix: Agnostic CAL
50

Weitere ähnliche Inhalte

Was ist angesagt?

Spiking neural network: an introduction I
Spiking neural network: an introduction ISpiking neural network: an introduction I
Spiking neural network: an introduction IDalin Zhang
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter TuningJon Lederman
 
Continual Learning: why, how, and when
Continual Learning: why, how, and whenContinual Learning: why, how, and when
Continual Learning: why, how, and whenGabriele Graffieti
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronMostafa G. M. Mostafa
 
Multi armed bandit
Multi armed banditMulti armed bandit
Multi armed banditJie-Han Chen
 
Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018
Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018
Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural NetworkKnoldus Inc.
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Yuta Niki
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1Srinivasan R
 
Introduction to Multi-armed Bandits
Introduction to Multi-armed BanditsIntroduction to Multi-armed Bandits
Introduction to Multi-armed BanditsYan Xu
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Modelspetitegeek
 
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 Once-for-All: Train One Network and Specialize it for Efficient Deployment Once-for-All: Train One Network and Specialize it for Efficient Deployment
Once-for-All: Train One Network and Specialize it for Efficient Deploymenttaeseon ryu
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksFrancesco Collova'
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in VisionSangmin Woo
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?Kazuki Yoshida
 

Was ist angesagt? (20)

Spiking neural network: an introduction I
Spiking neural network: an introduction ISpiking neural network: an introduction I
Spiking neural network: an introduction I
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
04 Multi-layer Feedforward Networks
04 Multi-layer Feedforward Networks04 Multi-layer Feedforward Networks
04 Multi-layer Feedforward Networks
 
Continual Learning: why, how, and when
Continual Learning: why, how, and whenContinual Learning: why, how, and when
Continual Learning: why, how, and when
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
Multi armed bandit
Multi armed banditMulti armed bandit
Multi armed bandit
 
Meta learning tutorial
Meta learning tutorialMeta learning tutorial
Meta learning tutorial
 
Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018
Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018
Loss Functions for Deep Learning - Javier Ruiz Hidalgo - UPC Barcelona 2018
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1
 
Introduction to Multi-armed Bandits
Introduction to Multi-armed BanditsIntroduction to Multi-armed Bandits
Introduction to Multi-armed Bandits
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Models
 
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 Once-for-All: Train One Network and Specialize it for Efficient Deployment Once-for-All: Train One Network and Specialize it for Efficient Deployment
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural Networks
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in Vision
 
Neural Networks: Introducton
Neural Networks: IntroductonNeural Networks: Introducton
Neural Networks: Introducton
 
Perceptron & Neural Networks
Perceptron & Neural NetworksPerceptron & Neural Networks
Perceptron & Neural Networks
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 

Ähnlich wie Active learning

Teacher-Aware Active Robot Learning
Teacher-Aware Active Robot LearningTeacher-Aware Active Robot Learning
Teacher-Aware Active Robot LearningMattia Racca
 
TS4-3: Takumi Sato from Nagoya Institute of Technology
TS4-3: Takumi Sato from Nagoya Institute of TechnologyTS4-3: Takumi Sato from Nagoya Institute of Technology
TS4-3: Takumi Sato from Nagoya Institute of TechnologyJawad Haqbeen
 
Statistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and HowStatistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and HowJulián Urbano
 
Data Analytics.03. Data processing
Data Analytics.03. Data processingData Analytics.03. Data processing
Data Analytics.03. Data processingAlex Rayón Jerez
 
Data Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for UniversitiesData Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for UniversitiesHendrik Drachsler
 
NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation Deep Learning Italia
 
DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha A_Palalas C_G...
DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha  A_Palalas C_G...DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha  A_Palalas C_G...
DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha A_Palalas C_G...Agnieszka (Aga) Palalas, Ed.D.
 
Active Content-Based Crowdsourcing Task Selection
Active Content-Based Crowdsourcing Task SelectionActive Content-Based Crowdsourcing Task Selection
Active Content-Based Crowdsourcing Task SelectionCarsten Eickhoff
 
Analyzing workflows and improving communication across departments
Analyzing workflows and improving communication across departments Analyzing workflows and improving communication across departments
Analyzing workflows and improving communication across departments NASIG
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesMatthew Lease
 
Iterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionIterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionClaudio Greco
 
Whether simulation models that fall under the information systems category ad...
Whether simulation models that fall under the information systems category ad...Whether simulation models that fall under the information systems category ad...
Whether simulation models that fall under the information systems category ad...Elisavet Andrikopoulou
 
Iterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionIterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionAlessandro Suglia
 
Resonance Introduction at SacPy
Resonance Introduction at SacPyResonance Introduction at SacPy
Resonance Introduction at SacPymoorepants
 
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdfVincenzo Lomonaco
 

Ähnlich wie Active learning (20)

Teacher-Aware Active Robot Learning
Teacher-Aware Active Robot LearningTeacher-Aware Active Robot Learning
Teacher-Aware Active Robot Learning
 
XAI (IIT-Patna).pdf
XAI (IIT-Patna).pdfXAI (IIT-Patna).pdf
XAI (IIT-Patna).pdf
 
TS4-3: Takumi Sato from Nagoya Institute of Technology
TS4-3: Takumi Sato from Nagoya Institute of TechnologyTS4-3: Takumi Sato from Nagoya Institute of Technology
TS4-3: Takumi Sato from Nagoya Institute of Technology
 
Statistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and HowStatistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and How
 
Data Analytics.03. Data processing
Data Analytics.03. Data processingData Analytics.03. Data processing
Data Analytics.03. Data processing
 
Data Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for UniversitiesData Sets as Facilitator for new Products and Services for Universities
Data Sets as Facilitator for new Products and Services for Universities
 
Deep Meta Learning
Deep Meta Learning Deep Meta Learning
Deep Meta Learning
 
NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation
 
Mental rotation skills
Mental rotation skillsMental rotation skills
Mental rotation skills
 
DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha A_Palalas C_G...
DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha  A_Palalas C_G...DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha  A_Palalas C_G...
DBR (Design-Based Research) in mobile learning-Mlearn2013 Doha A_Palalas C_G...
 
Active Content-Based Crowdsourcing Task Selection
Active Content-Based Crowdsourcing Task SelectionActive Content-Based Crowdsourcing Task Selection
Active Content-Based Crowdsourcing Task Selection
 
Xiangen Hu - WESST Keynote - Conversational Tutors and the Experience API
Xiangen Hu - WESST Keynote - Conversational Tutors and the Experience APIXiangen Hu - WESST Keynote - Conversational Tutors and the Experience API
Xiangen Hu - WESST Keynote - Conversational Tutors and the Experience API
 
Analyzing workflows and improving communication across departments
Analyzing workflows and improving communication across departments Analyzing workflows and improving communication across departments
Analyzing workflows and improving communication across departments
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Iterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionIterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer Prediction
 
Whether simulation models that fall under the information systems category ad...
Whether simulation models that fall under the information systems category ad...Whether simulation models that fall under the information systems category ad...
Whether simulation models that fall under the information systems category ad...
 
Iterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer PredictionIterative Multi-document Neural Attention for Multiple Answer Prediction
Iterative Multi-document Neural Attention for Multiple Answer Prediction
 
Resonance Introduction at SacPy
Resonance Introduction at SacPyResonance Introduction at SacPy
Resonance Introduction at SacPy
 
MILA DL & RL summer school highlights
MILA DL & RL summer school highlights MILA DL & RL summer school highlights
MILA DL & RL summer school highlights
 
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
 

Kürzlich hochgeladen

Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 

Kürzlich hochgeladen (20)

Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 

Active learning

  • 1. Active Learning Ragib Ahsan Committee Prof. Xinhua Zhang (Chair) Prof. Brian Ziebart Prof. Jon A Solworth
  • 2. Overview ● What is active learning? ● Does active learning make any difference? ● Active learning from multiple oracles ● Active learning with weak and strong oracle ● Multiple oracles with varying expertise 2
  • 3. What is Active Learning? ● Introduced in education by the 1990s ● Let students participate actively ● Doing things rather than just listening ● Inspired machine learning ● Also known as Query Learning 3
  • 4. Contrast to passive learning Passive Learning Active Learning 4
  • 5. Applications ● Fewer labeled data ● Speech Recognition ○ Word level annotation can take ten times longer than actual audio (Zhu, 2005) ● Medical Diagnosis ○ Expert doctors ● Document Classification 5
  • 6. Active Learning Examples Pool based active learning (Settles, 2009) 6
  • 7. Active Learning Examples a) Toy dataset, two Gaussians b) logistic regression model produces 70% accuracy c) logistic regression with active querying produces 90% accuracy (Settles, 2009) 7
  • 9. Does AL make any difference? “Learners do benefit from the opportunity to actively select examples during learning. But it is very difficult to assess the magnitude of difference that active learning makes compared to passive learning” Laughlin (1973) There were conflicting claims throughout the literature on the effectiveness of active learning 9
  • 10. Does AL make any difference? “People make inappropriate queries to assess simple logical hypotheses such as if p then q (frequently examining q instances to see if they are p, and failing to explore not-q instances)” Wason et al. (1972) “If the learning task is properly construed, humans actually do a great job in asking questions” Gigerenzer et al. (2002) Oaksford et al. (2007) 10
  • 11. Does AL make any difference? Castro et al. (2008) addressed these questions: [Q1] Do humans perform better when they can select their own examples for labeling, compared to passive observation of labeled examples? [Q2] If so, do they achieve the full benefit of active learning suggested by statistical learning theory? [Q3] If they do not, can machine learning be used to enhance human performance? [Q4] Do the answers to these questions vary depending upon the difficulty of the learning problem? 11
  • 12. Task Formulation ● Binary classification on the interval [0, 1] ● Unknown decision boundary ● Classes 0 and 1 ● n samples ● Xi ∈ [0, 1], Yi ∈ {0, 1} ● Yi is correct with probability 1 − ε ● 0 ≤ ε < 1/2 12 [Source: Castro et al. (2008)]
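The task formulation above can be made concrete with a small data generator. This is only a sketch under stated assumptions: the slide does not say how the Xi are drawn (uniform sampling is assumed here), which side of the boundary is class 1 is a convention chosen for illustration, and the names `sample_labeled`, `theta` and `eps` are hypothetical.

```python
import random


def sample_labeled(n, theta, eps, rng):
    """Draw n pairs (x, y) with x in [0, 1] and y in {0, 1}.

    Assumptions (not stated on the slide): x is uniform on [0, 1] and the
    clean label is 1 when x >= theta. Each label is flipped with
    probability eps, so it is correct with probability 1 - eps.
    """
    if not 0 <= eps < 0.5:
        raise ValueError("eps must satisfy 0 <= eps < 1/2")
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x >= theta else 0
        if rng.random() < eps:  # label noise
            y = 1 - y
        data.append((x, y))
    return data
```

With `eps = 0` every label is correct; as `eps` approaches 1/2 the labels carry less and less information about the boundary.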
  • 13. Error bound (ε = 0) ● Passive Learning ○ Random sampling ○ Error: O(1/n) ● Active Learning ○ Binary search ○ Error: O(2^-n) 13 [Source: Castro et al. (2008)]
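The gap between the two noiseless rates can be checked with a short simulation. The sketch below is illustrative and not taken from Castro et al. (2008): the function names, the midpoint estimator, and the convention that class 1 lies at or to the right of the boundary `theta` are all assumptions.

```python
import random


def label(x, theta):
    """Noiseless oracle: class 1 at or to the right of the boundary theta."""
    return 1 if x >= theta else 0


def passive_estimate(theta, n, rng):
    """Random sampling: midpoint between the largest 0-labeled point
    and the smallest 1-labeled point seen among n uniform samples."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()
        if label(x, theta) == 0:
            lo = max(lo, x)
        else:
            hi = min(hi, x)
    return (lo + hi) / 2


def active_estimate(theta, n):
    """Binary search: each query halves the interval that contains theta."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        mid = (lo + hi) / 2
        if label(mid, theta) == 0:
            lo = mid  # boundary lies to the right of mid
        else:
            hi = mid  # boundary lies at or to the left of mid
    return (lo + hi) / 2
```

After n queries the binary search has narrowed the boundary to an interval of width 2^-n, whereas the random sample only leaves a gap whose width shrinks on the order of 1/n.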
  • 14. Error bound (ε > 0) ● Passive learning ● Active learning [ Maximum Likelihood Estimate ] 14 [Source: Castro et al. (2008)]
  • 15. Experiment A few 3D visual stimuli and their X values used in our experiment. Participant was asked to guess the decision boundary after every three iterations 15
  • 16. Experiment ● Random ○ No queries ● Human Active ○ Active queries ● Machine Yoked ○ Machine makes query ○ Human observes 16
  • 18. Answers [Q1] Do humans perform better when they can select their own examples for labeling, compared to passive observation of labeled examples? - Yes, in low noise levels [Q2] If so, do they achieve the full benefit of active learning suggested by statistical learning theory? - No, slower decay constants [Q3] If they do not, can machine learning be used to enhance human performance? - Inconclusive [Q4] Do the answers to these questions vary depending upon the difficulty of the learning problem? - Yes, with noise levels 18
  • 19. Conclusion ● Simple learning task ● Machine Yoked Learning ● Impact on: ○ Fields of psychology and cognitive sciences ○ Intelligent tutoring systems 19
  • 20. AL from multiple oracles 20
  • 22. Multiple Oracle: Challenges ● How to select the most informative query? ● How to select the best oracle to ask questions? ● How to deal with disagreement among the oracles? ● How to deal with a noisy or weak oracle? 22
  • 23. Weak and strong labeler ● Zhang et al. (2015) considered exactly two oracles ● One standard oracle ○ Accurate but costly ● One weak oracle ○ Noisy but cheap ● Goal ○ Reduce number of queries to standard oracle ○ No impact on accuracy 23
  • 24. Observations ● Difference Classifier to predict disagreement between strong and weak labeler ○ Might not be statistically consistent ○ Can use cost-sensitive difference classifier ● Active learning queries a localized region of space ○ Train difference classifier on that localized region 24
  • 25. Disagreement Based Active Learning (DBAL) [Figure: version space Vt of hypotheses h1 to h7 around h*, over input space X with points x1 to x8] ● h1(x1) = h2(x1) = h3(x1) = h4(x1) = h5(x1): all agree, no query ● h1(x3) ≠ h2(x3) = h3(x3) = h4(x3) = h5(x3): disagreement, so query x3 to O and update Vt ● h1(x4) = h2(x4) = h3(x4) = h4(x4) = h5(x4): all agree, no query 25
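A minimal sketch of the disagreement-based idea, assuming the simplest possible hypothesis class: 1D thresholds h_t(x) = 1 if x ≥ t on [0, 1] with a noiseless oracle. The function name `cal_threshold` and this choice of hypothesis class are illustrative assumptions, not part of the slides.

```python
import random


def cal_threshold(stream, oracle):
    """CAL-style loop for 1D thresholds h_t(x) = 1 if x >= t.

    The version space is the set of thresholds t with lo < t <= hi that are
    consistent with every label seen so far. Two such thresholds disagree on
    x exactly when lo < x < hi, so only those points are sent to the oracle.
    """
    lo, hi = 0.0, 1.0
    queries = 0
    for x in stream:
        if lo < x < hi:  # x lies in the disagreement region
            queries += 1
            if oracle(x) == 1:
                hi = x  # the true threshold is at or below x
            else:
                lo = x  # the true threshold is above x
        # otherwise all remaining hypotheses agree on x and no query is spent
    return lo, hi, queries
```

On a stream of uniformly drawn points most of them fall outside the shrinking disagreement region, so the number of oracle queries grows far more slowly than the stream length.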
  • 26. Problem Formulation ● Unlabeled Distribution, U ● Input space, X ● Label space, Y ● Hypothesis class, H ● Data distribution, D ● Excess error ● Goal: reach the target excess error with as few queries to O as possible Strong Oracle O Weak Oracle W 26
  • 27. Algorithm ● Three key ideas ○ Difference classifier ○ Disagreement region DIS(V) ■ Region of the input space where two member classifiers disagree ○ Epoch based agnostic CAL ■ Train fresh difference classifier in each epoch 27 [Source: Theory of Active Learning (Steve Hanneke, 2014)]
  • 28. Algorithm ● Set the initial error and the total number of epochs k0, and draw n0 examples to form the labeled dataset S0 ● In each epoch k, up to k’ epochs: ○ Set the target error ○ Draw nk unlabeled samples ○ Identify the disagreement region Ak ○ Train the difference classifier hdf with Ak, O, W ○ Active learning using hdf ■ Draw mk examples, use hdf to decide whether to query O or W, and add the labeled data to Sk ● Return a classifier learned from the labeled dataset Sk’ 28
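The routing step at the heart of the algorithm above (use a difference classifier to decide which oracle to ask) can be sketched in a heavily simplified form. This is not the algorithm of Zhang and Chaudhuri (2015): it omits the epochs, the disagreement region and the cost-sensitive training, and it assumes a toy 1D difference classifier that is just an interval. All function names are hypothetical.

```python
import random


def fit_difference_interval(points, strong, weak):
    """Toy difference classifier: the smallest interval covering every
    sampled point on which O (strong) and W (weak) disagree.
    Returns None if no disagreement was observed."""
    diff = [x for x in points if strong(x) != weak(x)]
    if not diff:
        return None
    return min(diff), max(diff)


def label_with_routing(points, strong, weak, interval):
    """Ask O where the difference classifier predicts disagreement and W
    elsewhere. Returns the labeled data and the number of queries to O."""
    labeled, strong_queries = [], 0
    for x in points:
        if interval is not None and interval[0] <= x <= interval[1]:
            labeled.append((x, strong(x)))
            strong_queries += 1
        else:
            labeled.append((x, weak(x)))
    return labeled, strong_queries
```

If the weak oracle is wrong only in a small region, most points are routed to it and the strong oracle is reserved for that region. Points where the oracles disagree but that fall outside the learned interval still receive weak labels, which is one reason the original method controls the false negatives of the difference classifier.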
  • 29. Performance Guarantee ● First term for learning, second for training difference classifier ● Second term is lower order term when d ≈ d’ ● Fitting the difference classifier does not incur a high overhead 29
  • 31. AL from crowds ● Multiple experts in supervised learning (Raykar et al., 2009 and Yan et al., 2010) ● NLP tasks from AMT data (Snow et al., 2008) ● Yan et al. (2011) proposed a novel active learning method ● Focus: ○ Most informative query ○ Most useful annotator 31
  • 34. Algorithm ● Two key steps ○ Select a sample to label next ○ Select the best annotator to label ● Select sample ○ Uncertainty sampling ■ Select the sample about which the classifier is least certain 34
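Uncertainty sampling can be illustrated with a small sketch. It assumes a logistic model with weights `w` and bias `b` and picks the pool point whose predicted probability is closest to 0.5; the function names are hypothetical and this is not the exact selection criterion of Yan et al. (2011).

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic model with weights w, bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def most_uncertain(pool, w, b):
    """Index of the pool point whose predicted probability is closest to 0.5,
    i.e. the point nearest to the separating hyperplane."""
    return min(
        range(len(pool)),
        key=lambda i: abs(predict_proba(w, b, pool[i]) - 0.5),
    )
```

For example, with `w = (1.0, 0.0)` and `b = 0.0`, the point `(0.1, 0.0)` is selected ahead of `(2.0, 0.0)` and `(-3.0, 0.0)` because it lies closest to the hyperplane.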
  • 35. Algorithm: Select Sample Where, and (ᾶ > 0) Separating hyperplane: 35
  • 38. Experiment (left) Labels, (center) Areas of Labeler expertise and (right) annotator selection information for the simplified two dimensional Galaxy Dim Data (Yan et al., 2011) 38
  • 39. Experiment: Baselines ● active learning+majority vote ○ Active query based on majority vote of all annotators ● random sample+multi-labeler ○ Multi labeler algorithm on randomly sampled examples ● random sample+majority vote ○ Random sampling with majority vote 39
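The majority-vote step used by two of these baselines can be sketched as follows. The tie-breaking rule (smallest label wins) is an assumption made for determinism; the slides do not say how ties are resolved.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the annotators' votes.

    Ties are broken in favor of the smallest label (an assumption made
    here so that the result is deterministic)."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)
```

Majority vote treats all annotators as equally reliable, which is the assumption the multi-labeler approach is meant to relax.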
  • 40. Experimental Result Accuracy comparisons on text data for the polarity, focus and the evidence labelings (Yan et al., 2011) 40
  • 41. More Analyses ● Decision boundary intersects all regions of expertise ● Comparison with single oracle AL ● Specialized vs General expertise 41 [Source: Yan et al. (2011)]
  • 42. Future Direction ● More Applications ○ Real world problems ● Optimal number of oracles ○ Do multiple oracles always perform better than a single oracle? ○ Is there an optimal number of oracles that works best? ● Cost function associated with labeling ○ Choose single vs multiple oracles ● General expertise ○ Each of the multiple oracles has general expertise 42
  • 43. References ● Castro, Rui M. et al. (2008). “Human Active Learning”. In: NIPS. ● Gigerenzer, Gerd and Reinhard Selten (2002). Bounded rationality: The adaptive toolbox. MIT Press. ● Laughlin, Patrick R. (1973). “Focusing strategy in concept attainment as a function of instructions and task”. In: Journal of Experimental Psychology. ● Oaksford, Mike and Nick Chater (2007). Bayesian rationality: The probabilistic approach to human reasoning. Oxford University Press. ● Raykar, Vikas C. et al. (2009). “Supervised learning from multiple experts: whom to trust when everyone lies a bit”. In: ICML. ● Settles, Burr (2009). Active Learning Literature Survey. Computer Sciences Technical Report 1648. University of Wisconsin–Madison. 43
  • 44. References ● Snow, Rion et al. (2008). “Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks”. In: EMNLP. ● Wason, Peter Cathcart and Philip N. Johnson-Laird (1972). Psychology of reasoning: Structure and content. Vol. 86. Harvard University Press. ● Yan, Yan et al. (2010). “Modeling annotator expertise: Learning when everybody knows a bit of something”. In: AISTATS. ● Yan, Yan et al. (2011). “Active Learning from Crowds”. In: ICML. ● Zhang, Chicheng and Kamalika Chaudhuri (2015). “Active Learning from Weak and Strong Labelers”. In: NIPS. ● Zhu, Xiaojin (2005). “Semi-supervised Learning with Graphs”. AAI3179046. PhD thesis. Pittsburgh, PA, USA. ● Hanneke, Steve (2014). “Theory of Active Learning”. 44