SlideShare ist ein Scribd-Unternehmen logo
1 von 17
INTRODUCTION
TO
MACHINE
LEARNING
3RD EDITION
ETHEM ALPAYDIN
© The MIT Press, 2014
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e
Lecture Slides for
CHAPTER 2:
SUPERVISED
LEARNING
Learning a Class from
Examples
3
 Class C of a “family car”
 Prediction: Is car x a family car?
 Knowledge extraction: What do people expect from a
family car?
 Output:
Positive (+) and negative (–) examples
 Input representation:
x1: price, x2 : engine power
Training set X
N
t
t
t
,r 1
}
{ 
 x
X




negative
is
if
positive
is
if
x
x
0
1
r
4







2
1
x
x
x
Class C
5
   
2
1
2
1 power
engine
AND
price e
e
p
p 



Hypothesis class H




negative
is
says
if
positive
is
says
if
)
(
x
x
x
h
h
h
0
1
 
 




N
t
t
t
r
h
h
E
1
1 x
)
|
( X
6
Error of h on H
S, G, and the Version Space
7
most specific hypothesis, S
most general hypothesis, G
h H, between S and G is
consistent and make up the
version space
(Mitchell, 1997)
Margin
8
 Choose h with largest margin
VC Dimension
9
 N points can be labeled in 2N ways as +/–
 H shatters N if there
exists h  H consistent
for any of these:
VC(H ) = N
An axis-aligned rectangle shatters 4 points only !
Probably Approximately Correct
(PAC) Learning
10
 How many training examples N should we have, such that with
probability at least 1 ‒ δ, h has error at most ε ?
(Blumer et al., 1989)
 Each strip is at most ε/4
 Pr that we miss a strip 1‒ ε/4
 Pr that N instances miss a strip (1 ‒ ε/4)N
 Pr that N instances miss 4 strips 4(1 ‒ ε/4)N
 4(1 ‒ ε/4)N ≤ δ and (1 ‒ x)≤exp( ‒ x)
 4exp(‒ εN/4) ≤ δ and N ≥ (4/ε)log(4/δ)
Noise and Model Complexity
11
Use the simpler one because
 Simpler to use
(lower computational
complexity)
 Easier to train (lower
space complexity)
 Easier to explain
(more interpretable)
 Generalizes better (lower
variance - Occam’s razor)
Multiple Classes, Ci i=1,...,K
N
t
t
t
,r 1
}
{ 
 x
X







,
if
if
i
j
r
j
t
i
t
t
i
C
C
x
x
0
1
 







,
if
if
i
j
h
j
t
i
t
t
i
C
C
x
x
x
0
1
12
Train hypotheses
hi(x), i =1,...,K:
Regression
  0
1 w
x
w
x
g 

  0
1
2
2 w
x
w
x
w
x
g 


   
 




N
t
t
t
x
g
r
N
g
E
1
2
1
X
|
13
   
 





N
t
t
t
w
x
w
r
N
w
w
E
1
2
0
1
0
1
1
X
|
,
 
  




 
t
t
t
N
t
t
t
x
f
r
r
r
x 1
,
X
Model Selection &
Generalization
14
 Learning is an ill-posed problem; data is not
sufficient to find a unique solution
 The need for inductive bias, assumptions
about H
 Generalization: How well a model performs on
new data
 Overfitting: H more complex than C or f
 Underfitting: H less complex than C or f
Triple Trade-Off
15
 There is a trade-off between three factors
(Dietterich, 2003):
1. Complexity of H, c (H),
2. Training set size, N,
3. Generalization error, E, on new data
 As N,E
 As c (H),first Eand then E
Cross-Validation
16
 To estimate generalization error, we need data
unseen during training. We split the data as
 Training set (50%)
 Validation set (25%)
 Test (publication) set (25%)
 Resampling when there is few data
Dimensions of a Supervised
Learner
1. Model:
2. Loss function:
3. Optimization procedure:
 

|
x
g
   
 


t
t
t
g
r
L
E 
 |
,
| x
X
17
 
X
|
min
arg
* 


E


Weitere ähnliche Inhalte

Was ist angesagt?

Practicle application of maxima and minima
Practicle application of maxima and minimaPracticle application of maxima and minima
Practicle application of maxima and minimaBritish Council
 
Concepts of Maxima And Minima
Concepts of Maxima And MinimaConcepts of Maxima And Minima
Concepts of Maxima And MinimaJitin Pillai
 
Molodtsov's Soft Set Theory and its Applications in Decision Making
Molodtsov's Soft Set Theory and its Applications in Decision MakingMolodtsov's Soft Set Theory and its Applications in Decision Making
Molodtsov's Soft Set Theory and its Applications in Decision Makinginventionjournals
 
Maximums and minimum
Maximums and minimum Maximums and minimum
Maximums and minimum rubimedina01
 
Mb0048 operations research
Mb0048 operations researchMb0048 operations research
Mb0048 operations researchsmumbahelp
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015Christian Robert
 
3.1 Extreme Values of Functions
3.1 Extreme Values of Functions3.1 Extreme Values of Functions
3.1 Extreme Values of FunctionsSharon Henry
 
Algorithm of some numerical /computational methods
Algorithm of some numerical /computational methodsAlgorithm of some numerical /computational methods
Algorithm of some numerical /computational methodsChandan
 
Mathematical blog #2
Mathematical blog #2Mathematical blog #2
Mathematical blog #2Steven Pauly
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...tuxette
 
Permutation graphsandapplications
Permutation graphsandapplicationsPermutation graphsandapplications
Permutation graphsandapplicationsJoe Krall
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forestsChristian Robert
 
Dirichlet processes and Applications
Dirichlet processes and ApplicationsDirichlet processes and Applications
Dirichlet processes and ApplicationsSaurav Jha
 

Was ist angesagt? (17)

Practicle application of maxima and minima
Practicle application of maxima and minimaPracticle application of maxima and minima
Practicle application of maxima and minima
 
Intractable likelihoods
Intractable likelihoodsIntractable likelihoods
Intractable likelihoods
 
Concepts of Maxima And Minima
Concepts of Maxima And MinimaConcepts of Maxima And Minima
Concepts of Maxima And Minima
 
Molodtsov's Soft Set Theory and its Applications in Decision Making
Molodtsov's Soft Set Theory and its Applications in Decision MakingMolodtsov's Soft Set Theory and its Applications in Decision Making
Molodtsov's Soft Set Theory and its Applications in Decision Making
 
Maximums and minimum
Maximums and minimum Maximums and minimum
Maximums and minimum
 
Mb0048 operations research
Mb0048 operations researchMb0048 operations research
Mb0048 operations research
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
3.1 Extreme Values of Functions
3.1 Extreme Values of Functions3.1 Extreme Values of Functions
3.1 Extreme Values of Functions
 
Algorithm of some numerical /computational methods
Algorithm of some numerical /computational methodsAlgorithm of some numerical /computational methods
Algorithm of some numerical /computational methods
 
Mathematical blog #2
Mathematical blog #2Mathematical blog #2
Mathematical blog #2
 
Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...Classification and regression based on derivatives: a consistency result for ...
Classification and regression based on derivatives: a consistency result for ...
 
Lecture5 xing
Lecture5 xingLecture5 xing
Lecture5 xing
 
Permutation graphsandapplications
Permutation graphsandapplicationsPermutation graphsandapplications
Permutation graphsandapplications
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forests
 
Lar calc10 ch03_sec1
Lar calc10 ch03_sec1Lar calc10 ch03_sec1
Lar calc10 ch03_sec1
 
Dirichlet processes and Applications
Dirichlet processes and ApplicationsDirichlet processes and Applications
Dirichlet processes and Applications
 
Math HW/SW
Math HW/SWMath HW/SW
Math HW/SW
 

Ähnlich wie I2ml3e chap2

Alpaydin - Chapter 2
Alpaydin - Chapter 2Alpaydin - Chapter 2
Alpaydin - Chapter 2butest
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodFrank Nielsen
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep LearningRayKim51
 
Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Daisuke Yoneoka
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector MachinesEdgar Marca
 
Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Frank Nielsen
 
Cheatsheet supervised-learning
Cheatsheet supervised-learningCheatsheet supervised-learning
Cheatsheet supervised-learningSteve Nouri
 
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Varad Meru
 
Bayesian Subset Simulation
Bayesian Subset SimulationBayesian Subset Simulation
Bayesian Subset SimulationJulien Bect
 
ENBIS 2018 presentation on Deep k-Means
ENBIS 2018 presentation on Deep k-MeansENBIS 2018 presentation on Deep k-Means
ENBIS 2018 presentation on Deep k-Meanstthonet
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
Thesis oral defense
Thesis oral defenseThesis oral defense
Thesis oral defenseFan Zhitao
 
Monte Carlo Methods
Monte Carlo MethodsMonte Carlo Methods
Monte Carlo MethodsJames Bell
 
Gaussian process in machine learning
Gaussian process in machine learningGaussian process in machine learning
Gaussian process in machine learningVARUN KUMAR
 
Tensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationTensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationAlexander Litvinenko
 
Meta-learning and the ELBO
Meta-learning and the ELBOMeta-learning and the ELBO
Meta-learning and the ELBOYoonho Lee
 

Ähnlich wie I2ml3e chap2 (20)

Alpaydin - Chapter 2
Alpaydin - Chapter 2Alpaydin - Chapter 2
Alpaydin - Chapter 2
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihood
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep Learning
 
ML unit-1.pptx
ML unit-1.pptxML unit-1.pptx
ML unit-1.pptx
 
Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9Murphy: Machine learning A probabilistic perspective: Ch.9
Murphy: Machine learning A probabilistic perspective: Ch.9
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
 
Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...
 
Cheatsheet supervised-learning
Cheatsheet supervised-learningCheatsheet supervised-learning
Cheatsheet supervised-learning
 
i2ml3e-chap3.pptx
i2ml3e-chap3.pptxi2ml3e-chap3.pptx
i2ml3e-chap3.pptx
 
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
 
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
 
Bayesian Subset Simulation
Bayesian Subset SimulationBayesian Subset Simulation
Bayesian Subset Simulation
 
ENBIS 2018 presentation on Deep k-Means
ENBIS 2018 presentation on Deep k-MeansENBIS 2018 presentation on Deep k-Means
ENBIS 2018 presentation on Deep k-Means
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Thesis oral defense
Thesis oral defenseThesis oral defense
Thesis oral defense
 
Monte Carlo Methods
Monte Carlo MethodsMonte Carlo Methods
Monte Carlo Methods
 
Gaussian process in machine learning
Gaussian process in machine learningGaussian process in machine learning
Gaussian process in machine learning
 
Tensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationTensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantification
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 
Meta-learning and the ELBO
Meta-learning and the ELBOMeta-learning and the ELBO
Meta-learning and the ELBO
 

Kürzlich hochgeladen

Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoorTop Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoordharasingh5698
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086anil_gaur
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfRagavanV2
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf203318pmpc
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...soginsider
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 

Kürzlich hochgeladen (20)

Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoorTop Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdf
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 

I2ml3e chap2

  • 1. INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN © The MIT Press, 2014 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml3e Lecture Slides for
  • 3. Learning a Class from Examples 3  Class C of a “family car”  Prediction: Is car x a family car?  Knowledge extraction: What do people expect from a family car?  Output: Positive (+) and negative (–) examples  Input representation: x1: price, x2 : engine power
  • 4. Training set X N t t t ,r 1 } {   x X     negative is if positive is if x x 0 1 r 4        2 1 x x x
  • 5. Class C 5     2 1 2 1 power engine AND price e e p p    
  • 6. Hypothesis class H     negative is says if positive is says if ) ( x x x h h h 0 1         N t t t r h h E 1 1 x ) | ( X 6 Error of h on H
  • 7. S, G, and the Version Space 7 most specific hypothesis, S most general hypothesis, G h H, between S and G is consistent and make up the version space (Mitchell, 1997)
  • 8. Margin 8  Choose h with largest margin
  • 9. VC Dimension 9  N points can be labeled in 2N ways as +/–  H shatters N if there exists h  H consistent for any of these: VC(H ) = N An axis-aligned rectangle shatters 4 points only !
  • 10. Probably Approximately Correct (PAC) Learning 10  How many training examples N should we have, such that with probability at least 1 ‒ δ, h has error at most ε ? (Blumer et al., 1989)  Each strip is at most ε/4  Pr that we miss a strip 1‒ ε/4  Pr that N instances miss a strip (1 ‒ ε/4)N  Pr that N instances miss 4 strips 4(1 ‒ ε/4)N  4(1 ‒ ε/4)N ≤ δ and (1 ‒ x)≤exp( ‒ x)  4exp(‒ εN/4) ≤ δ and N ≥ (4/ε)log(4/δ)
  • 11. Noise and Model Complexity 11 Use the simpler one because  Simpler to use (lower computational complexity)  Easier to train (lower space complexity)  Easier to explain (more interpretable)  Generalizes better (lower variance - Occam’s razor)
  • 12. Multiple Classes, Ci i=1,...,K N t t t ,r 1 } {   x X        , if if i j r j t i t t i C C x x 0 1          , if if i j h j t i t t i C C x x x 0 1 12 Train hypotheses hi(x), i =1,...,K:
  • 13. Regression   0 1 w x w x g     0 1 2 2 w x w x w x g              N t t t x g r N g E 1 2 1 X | 13            N t t t w x w r N w w E 1 2 0 1 0 1 1 X | ,            t t t N t t t x f r r r x 1 , X
  • 14. Model Selection & Generalization 14  Learning is an ill-posed problem; data is not sufficient to find a unique solution  The need for inductive bias, assumptions about H  Generalization: How well a model performs on new data  Overfitting: H more complex than C or f  Underfitting: H less complex than C or f
  • 15. Triple Trade-Off 15  There is a trade-off between three factors (Dietterich, 2003): 1. Complexity of H, c (H), 2. Training set size, N, 3. Generalization error, E, on new data  As N,E  As c (H),first Eand then E
  • 16. Cross-Validation 16  To estimate generalization error, we need data unseen during training. We split the data as  Training set (50%)  Validation set (25%)  Test (publication) set (25%)  Resampling when there is few data
  • 17. Dimensions of a Supervised Learner 1. Model: 2. Loss function: 3. Optimization procedure:    | x g         t t t g r L E   | , | x X 17   X | min arg *    E 