SlideShare a Scribd company logo
1 of 46
Download to read offline
Prof. Pier Luca Lanzi
Classification
Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
Prof. Pier Luca Lanzi
In classification, we have data that have been labeled according to a certain concept
(good/bad, positive/negative, etc.) identified by a supervisor who actually labeled the data
Prof. Pier Luca Lanzi
The goal of classification is to compute a model that can discriminate between examples that have
been labeled differently, e.g., the points above the planes are green and the one below are red
Prof. Pier Luca Lanzi
The model is then used to label previously unseen points.
?
?
?
?
Prof. Pier Luca Lanzi
As for regression, we have to compute a
model using known data that will perform
well on unknown data.
Prof. Pier Luca Lanzi
Prof. Pier Luca Lanzi
Contact Lenses Data 7
NoneReducedYesHypermetropePre-presbyopic
NoneNormalYesHypermetropePre-presbyopic
NoneReducedNoMyopePresbyopic
NoneNormalNoMyopePresbyopic
NoneReducedYesMyopePresbyopic
HardNormalYesMyopePresbyopic
NoneReducedNoHypermetropePresbyopic
SoftNormalNoHypermetropePresbyopic
NoneReducedYesHypermetropePresbyopic
NoneNormalYesHypermetropePresbyopic
SoftNormalNoHypermetropePre-presbyopic
NoneReducedNoHypermetropePre-presbyopic
HardNormalYesMyopePre-presbyopic
NoneReducedYesMyopePre-presbyopic
SoftNormalNoMyopePre-presbyopic
NoneReducedNoMyopePre-presbyopic
hardNormalYesHypermetropeYoung
NoneReducedYesHypermetropeYoung
SoftNormalNoHypermetropeYoung
NoneReducedNoHypermetropeYoung
HardNormalYesMyopeYoung
NoneReducedYesMyopeYoung
SoftNormalNoMyopeYoung
NoneReducedNoMyopeYoung
Recommended lensesTear production rateAstigmatismSpectacle prescriptionAge
Prof. Pier Luca Lanzi
Rows are typically called
examples or data points
Columns are typically called
attributes, variables or features
The target variable to be
predicted is usually called “class”
Prof. Pier Luca Lanzi
A Model for the Contact Lenses Data 9
If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no
and tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope
and astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no
and tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age young and astigmatic = yes
and tear production rate = normal then recommendation = hard
If age = pre-presbyopic
and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
and astigmatic = yes then recommendation = none
Prof. Pier Luca Lanzi
Classification
The target to predict is a label (the class variable)
(good/bad, none/soft/hard, etc.)
Prediction/Regression
The target to predict is numerical
10
Prof. Pier Luca Lanzi
classification = model building + model usage
Prof. Pier Luca Lanzi
Classification: Model Construction 14
Classification
Algorithm
IF rank = ‘professor’
OR years > 6
THEN tenured = ‘yes’
name rank years tenured
Mike Assistant Prof 3 no
Mary Assistant Prof 7 yes
Bill Professor 2 yes
Jim Associate Prof 7 yes
Dave Assistant Prof 6 no
Anne Associate Prof 3 no
Training
Data
Classifier
(Model)
Prof. Pier Luca Lanzi
Classification: Model Usage 15
tenured = yes
name rank years tenured
Tom Assistant Prof 2 no
Merlisa Associate Prof 7 no
George Professor 5 yes
Joseph Assistant Prof 7 yes
Test
Data
Classifier
(Model)
Unseen Data
Jeff, Professor, 4
Prof. Pier Luca Lanzi
Evaluating Classification Methods
• Accuracy
§Classifier accuracy in predicting the correct the class labels
• Speed
§Time to construct the model (training time)
§Time to use the model to label unseen data
• Other Criteria
§Robustness in handling noise
§Scalability
§Interpretability
§…
16
Prof. Pier Luca Lanzi
Logistic Regression
20
Prof. Pier Luca Lanzi
To build a model to label the green and red data points we can compute an hyperplane
Score(x) that separates the examples labeled green from the one labeled red
Prof. Pier Luca Lanzi
To predict the labels we check whether the value of Score(x0, x1) is
positive or negative and we label the examples accordingly
Score(x0,x1) = 0
Score(x0,x1) > 0
Score(x0,x1) < 0
Prof. Pier Luca Lanzi
Intuition Behind Linear Classifiers
(Two Classes)
• We define a score function similar to the one used in linear
regression,
• The label is determined by the sign of the score value,
23
Prof. Pier Luca Lanzi
Logistic Regression
• Well-known and widely used statistical classification method
• Instead of computing the label, it computes the probability of
assigning a class to an example, that is,
• For this purpose, logistic regression assumes that,
• By making the score computation explicit and using h to identify
all the feature transformation hj we get,
24
Prof. Pier Luca Lanzi
25
Prof. Pier Luca Lanzi
Logistic Regression
• Logistic Regression search for the weight vector that corresponds
to the highest likelihood
• For this purpose, it performs a gradient ascent on the log
likelihood function
• Which updates weight j using,
26
Prof. Pier Luca Lanzi
Classification using Logistic
Regression
• To classify an example X, we select the class Y that maximizes
the probability P(Y=y | X) or check
or
• By taking the natural logarithm of both sides, we obtain a linear
classification rule that assign the label Y=-1 if
• Y=+1 otherwise
27
Prof. Pier Luca Lanzi
Overfitting & Regularization
28
Prof. Pier Luca Lanzi
L1 & L2 Regularization
• Logistic regression can use L1 and L2 regularization to limit
overfitting and produce sparse solution
• Like it happened with regression, overfitting is often associated to
large weights so to limit overfitting we penalize large weights
Total cost = Measure of Fit - Magnitude of Coefficients
• Regularization
§L1 uses the sum of absolute values, which penalizes large
weights and at the same time promotes sparse solutions
§L2 uses the sum of squares, which penalizes large weights
29
Prof. Pier Luca Lanzi
L1 Regularization
L2 Regularization
30
Prof. Pier Luca Lanzi
If ⍺ is zero, then we have no regularization
If ⍺ tends to infinity,
the solution is a zero weight vector
⍺ must balance fit and the weight magnitude
Prof. Pier Luca Lanzi
Example
32
Prof. Pier Luca Lanzi
Run the Logistic Regression Python
notebooks provided on Beep
Prof. Pier Luca Lanzi
Safe Loans Prediction
• Given data downloaded from the Lending Club Corporation
website regarding safe and risky loans, build a model to predict
whether a loan should be given
• Model evaluation using 10-fold crossvalidation
§Simple Logistic Regression μ=0.63 σ=0.05
§Logistic Regression (L1) μ=0.63 σ=0.07
§Logistic Regression (L2) μ=0.64 σ=0.05
34
Prof. Pier Luca Lanzi
Weights of the simple logistic model, the L1 regularized model, and the L2 regularized model.
35
Prof. Pier Luca Lanzi
From now on we are going to use k-fold
crossvalidation a lot to score models
k-fold crossvalidation generates
k separate models
Which one should be deployed?
Prof. Pier Luca Lanzi
K-fold crossvalidation provides an evaluation
of model performance on unknown data
Its output it’s the evaluation, not the model!
The model should be obtained by running
the algorithm on the entire dataset!
Prof. Pier Luca Lanzi
Multiclass Classification
38
Prof. Pier Luca Lanzi
Logistic regression assumes that
there are only two class values (e.g., +1/-1)
What if we have more?
(e.g., none, soft, hard or Setosa/Versicolour/Virginica)
39
Prof. Pier Luca Lanzi
One Versus the Rest Evaluation
• For each class, it creates one classifier that predicts the target
class against all the others
• For instance, given three classes A, B, C, it computes three
models: one that predicts A against B and C, one that predicts B
against A and C, and one that predicts C against A and B
• Then, given an example, all the three classifiers are applied and
the label with the highest probability is returned
• Alternative approaches include the minimization of loss based on
the multinomial loss fit across the entire probability distribution
40
Prof. Pier Luca Lanzi
One Versus Rest multiclass model using the Iris dataset
41
Prof. Pier Luca Lanzi
Multinomial multiclass model using the Iris dataset
Prof. Pier Luca Lanzi
Run the Logistic Regression Python notebooks
on multiclass classification available on Beep
Prof. Pier Luca Lanzi
Categorical Attributes
44
Prof. Pier Luca Lanzi
So far we applied Logistic Regression
(and regression) to numerical variables
What if the variables are categorical?
Prof. Pier Luca Lanzi
The Weather Dataset
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot	 High	 True No
Overcast	 Hot		 High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
46
Prof. Pier Luca Lanzi
One Hot Encoding
Map each categorical attribute with
n values into n binary 0/1 variables
Each one describing one
specific attribute values
For example, attribute Outlook is replaced by three
binary variables Sunny, Overcast, and Rainy
Prof. Pier Luca Lanzi
One Hot Encoding for the Outlook attribute.
overcast rainy sunny
0 0 1
0 0 1
1 0 0
0 1 0
0 1 0
0 1 0
1 0 0
0 0 1
0 0 1
0 1 0
0 0 1
1 0 0
1 0 0
0 1 0
Prof. Pier Luca Lanzi
Run the Logistic Regression Python notebooks
on one hot encoding available on Beep
Prof. Pier Luca Lanzi
Summary
50
Prof. Pier Luca Lanzi
Classification
• The goal of classification is to build a model of a target concept (the class
attribute) available in the data
• The target concept is typically a symbolic label (good/bad, yes/no,
none/soft/hard, etc.)
• Linear classifiers are one of the simplest of classification algorithm that
computes a model represented as a hyperplane that separates examples of the
two classes
• Logistic regression is a classification algorithm that computes the probability to
classify an example with a certain label
• It works similarly to linear regression and can be coupled with L1 and L2
regularization to limit overfitting
• It assumes that the class has only two values but it can be extended to
problems involving more than two classes
• Like linear regression it assumes that the variables are numerical, if they are
not, categorical variables must be mapped into numerical values
• One of the typical approaches is to use one-hot-encoding
51

More Related Content

What's hot

DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsPier Luca Lanzi
 
DMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesDMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesPier Luca Lanzi
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringPier Luca Lanzi
 
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsDMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsPier Luca Lanzi
 
DMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringDMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringPier Luca Lanzi
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsPier Luca Lanzi
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationPier Luca Lanzi
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringPier Luca Lanzi
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesPier Luca Lanzi
 
DMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationDMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationPier Luca Lanzi
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringPier Luca Lanzi
 
DMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesDMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesPier Luca Lanzi
 
DMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationDMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationPier Luca Lanzi
 
DMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationDMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationPier Luca Lanzi
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningPier Luca Lanzi
 
DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringPier Luca Lanzi
 
DMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationDMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationPier Luca Lanzi
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationPier Luca Lanzi
 
DMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningDMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningPier Luca Lanzi
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesPier Luca Lanzi
 

What's hot (20)

DMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethodsDMTM Lecture 09 Other classificationmethods
DMTM Lecture 09 Other classificationmethods
 
DMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rulesDMTM Lecture 08 Classification rules
DMTM Lecture 08 Classification rules
 
DMTM Lecture 11 Clustering
DMTM Lecture 11 ClusteringDMTM Lecture 11 Clustering
DMTM Lecture 11 Clustering
 
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other MethodsDMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
DMTM 2015 - 13 Naive bayes, Nearest Neighbours and Other Methods
 
DMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based ClusteringDMTM 2015 - 08 Representative-Based Clustering
DMTM 2015 - 08 Representative-Based Clustering
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification Models
 
DMTM Lecture 05 Data representation
DMTM Lecture 05 Data representationDMTM Lecture 05 Data representation
DMTM Lecture 05 Data representation
 
DMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clusteringDMTM Lecture 12 Hierarchical clustering
DMTM Lecture 12 Hierarchical clustering
 
DMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision treesDMTM Lecture 07 Decision trees
DMTM Lecture 07 Decision trees
 
DMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data ExplorationDMTM 2015 - 04 Data Exploration
DMTM 2015 - 04 Data Exploration
 
DMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical ClusteringDMTM 2015 - 07 Hierarchical Clustering
DMTM 2015 - 07 Hierarchical Clustering
 
DMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification RulesDMTM 2015 - 12 Classification Rules
DMTM 2015 - 12 Classification Rules
 
DMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to ClassificationDMTM 2015 - 10 Introduction to Classification
DMTM 2015 - 10 Introduction to Classification
 
DMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data RepresentationDMTM 2015 - 03 Data Representation
DMTM 2015 - 03 Data Representation
 
DMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph miningDMTM Lecture 18 Graph mining
DMTM Lecture 18 Graph mining
 
DMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based ClusteringDMTM 2015 - 09 Density Based Clustering
DMTM 2015 - 09 Density Based Clustering
 
DMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data PreparationDMTM 2015 - 16 Data Preparation
DMTM 2015 - 16 Data Preparation
 
DMTM Lecture 19 Data exploration
DMTM Lecture 19 Data explorationDMTM Lecture 19 Data exploration
DMTM Lecture 19 Data exploration
 
DMTM Lecture 17 Text mining
DMTM Lecture 17 Text miningDMTM Lecture 17 Text mining
DMTM Lecture 17 Text mining
 
DMTM Lecture 16 Association rules
DMTM Lecture 16 Association rulesDMTM Lecture 16 Association rules
DMTM Lecture 16 Association rules
 

Similar to DMTM Lecture 04 Classification

GECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System TutorialGECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System TutorialPier Luca Lanzi
 
GECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle IntroductionGECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle IntroductionPier Luca Lanzi
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesPier Luca Lanzi
 
Introduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in PythonIntroduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in PythonPeadar Coyle
 
Replicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender SystemsReplicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender SystemsAlejandro Bellogin
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptxHadrian7
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Ian Morgan
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Bayes Nets meetup London
 
Evolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems WayEvolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems WayPier Luca Lanzi
 
Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)Greg Landrum
 
AI -learning and machine learning.pptx
AI  -learning and machine learning.pptxAI  -learning and machine learning.pptx
AI -learning and machine learning.pptxGaytriDhingra1
 
ML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptxML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptxMayankChadha14
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningKhaled Saleh
 
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...Yun Huang
 
HOP-Rec_RecSys18
HOP-Rec_RecSys18HOP-Rec_RecSys18
HOP-Rec_RecSys18Matt Yang
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learningSubrat Panda, PhD
 
anintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdfanintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdfssuseradaf5f
 

Similar to DMTM Lecture 04 Classification (20)

GECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System TutorialGECCO 2014 - Learning Classifier System Tutorial
GECCO 2014 - Learning Classifier System Tutorial
 
GECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle IntroductionGECCO-2014 Learning Classifier Systems: A Gentle Introduction
GECCO-2014 Learning Classifier Systems: A Gentle Introduction
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers Ensembles
 
Introduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in PythonIntroduction to Bayesian Analysis in Python
Introduction to Bayesian Analysis in Python
 
Replicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender SystemsReplicable Evaluation of Recommender Systems
Replicable Evaluation of Recommender Systems
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
Evolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems WayEvolving Rules to Solve Problems: The Learning Classifier Systems Way
Evolving Rules to Solve Problems: The Learning Classifier Systems Way
 
Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)Building useful models for imbalanced datasets (without resampling)
Building useful models for imbalanced datasets (without resampling)
 
Machine Learning
Machine Learning Machine Learning
Machine Learning
 
AI -learning and machine learning.pptx
AI  -learning and machine learning.pptxAI  -learning and machine learning.pptx
AI -learning and machine learning.pptx
 
ML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptxML Study Jams - Session 3.pptx
ML Study Jams - Session 3.pptx
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement Learning
 
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski...
 
Hierarchical Object Detection with Deep Reinforcement Learning
Hierarchical Object Detection with Deep Reinforcement LearningHierarchical Object Detection with Deep Reinforcement Learning
Hierarchical Object Detection with Deep Reinforcement Learning
 
HOP-Rec_RecSys18
HOP-Rec_RecSys18HOP-Rec_RecSys18
HOP-Rec_RecSys18
 
An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learning
 
anintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdfanintroductiontoreinforcementlearning-180912151720.pdf
anintroductiontoreinforcementlearning-180912151720.pdf
 

More from Pier Luca Lanzi

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i VideogiochiPier Luca Lanzi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiPier Luca Lanzi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomePier Luca Lanzi
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...Pier Luca Lanzi
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaPier Luca Lanzi
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Pier Luca Lanzi
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringPier Luca Lanzi
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionPier Luca Lanzi
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningPier Luca Lanzi
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelinePier Luca Lanzi
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityPier Luca Lanzi
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationPier Luca Lanzi
 
VDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game designVDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game designPier Luca Lanzi
 

More from Pier Luca Lanzi (14)

11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi11 Settembre 2021 - Giocare con i Videogiochi
11 Settembre 2021 - Giocare con i Videogiochi
 
Breve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei VideogiochiBreve Viaggio al Centro dei Videogiochi
Breve Viaggio al Centro dei Videogiochi
 
Global Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning WelcomeGlobal Game Jam 19 @ POLIMI - Morning Welcome
Global Game Jam 19 @ POLIMI - Morning Welcome
 
Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018Data Driven Game Design @ Campus Party 2018
Data Driven Game Design @ Campus Party 2018
 
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
GGJ18 al Politecnico di Milano - Presentazione che precede la presentazione d...
 
GGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di aperturaGGJ18 al Politecnico di Milano - Presentazione di apertura
GGJ18 al Politecnico di Milano - Presentazione di apertura
 
Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018Presentation for UNITECH event - January 8, 2018
Presentation for UNITECH event - January 8, 2018
 
DMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clusteringDMTM Lecture 14 Density based clustering
DMTM Lecture 14 Density based clustering
 
DMTM Lecture 01 Introduction
DMTM Lecture 01 IntroductionDMTM Lecture 01 Introduction
DMTM Lecture 01 Introduction
 
DMTM Lecture 02 Data mining
DMTM Lecture 02 Data miningDMTM Lecture 02 Data mining
DMTM Lecture 02 Data mining
 
VDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipelineVDP2016 - Lecture 16 Rendering pipeline
VDP2016 - Lecture 16 Rendering pipeline
 
VDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with UnityVDP2016 - Lecture 15 PCG with Unity
VDP2016 - Lecture 15 PCG with Unity
 
VDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generationVDP2016 - Lecture 14 Procedural content generation
VDP2016 - Lecture 14 Procedural content generation
 
VDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game designVDP2016 - Lecture 13 Data driven game design
VDP2016 - Lecture 13 Data driven game design
 

Recently uploaded

HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinojohnmickonozaleda
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 

DMTM Lecture 04 Classification

  • 1. Prof. Pier Luca Lanzi Classification Data Mining andText Mining (UIC 583 @ Politecnico di Milano)
  • 2. Prof. Pier Luca Lanzi In classification, we have data that have been labeled according to a certain concept (good/bad, positive/negative, etc.) identified by a supervisor who actually labeled the data
  • 3. Prof. Pier Luca Lanzi The goal of classification is to compute a model that can discriminate between examples that have been labeled differently, e.g., the points above the planes are green and the one below are red
  • 4. Prof. Pier Luca Lanzi The model is then used to label previously unseen points. ? ? ? ?
  • 5. Prof. Pier Luca Lanzi As for regression, we have to compute a model using known data that will perform well on unknown data.
  • 7. Prof. Pier Luca Lanzi Contact Lenses Data 7 NoneReducedYesHypermetropePre-presbyopic NoneNormalYesHypermetropePre-presbyopic NoneReducedNoMyopePresbyopic NoneNormalNoMyopePresbyopic NoneReducedYesMyopePresbyopic HardNormalYesMyopePresbyopic NoneReducedNoHypermetropePresbyopic SoftNormalNoHypermetropePresbyopic NoneReducedYesHypermetropePresbyopic NoneNormalYesHypermetropePresbyopic SoftNormalNoHypermetropePre-presbyopic NoneReducedNoHypermetropePre-presbyopic HardNormalYesMyopePre-presbyopic NoneReducedYesMyopePre-presbyopic SoftNormalNoMyopePre-presbyopic NoneReducedNoMyopePre-presbyopic hardNormalYesHypermetropeYoung NoneReducedYesHypermetropeYoung SoftNormalNoHypermetropeYoung NoneReducedNoHypermetropeYoung HardNormalYesMyopeYoung NoneReducedYesMyopeYoung SoftNormalNoMyopeYoung NoneReducedNoMyopeYoung Recommended lensesTear production rateAstigmatismSpectacle prescriptionAge
  • 8. Prof. Pier Luca Lanzi Rows are typically called examples or data points Columns are typically called attributes, variables or features The target variable to be predicted is usually called “class”
  • 9. Prof. Pier Luca Lanzi A Model for the Contact Lenses Data 9 If tear production rate = reduced then recommendation = none If age = young and astigmatic = no and tear production rate = normal then recommendation = soft If age = pre-presbyopic and astigmatic = no and tear production rate = normal then recommendation = soft If age = presbyopic and spectacle prescription = myope and astigmatic = no then recommendation = none If spectacle prescription = hypermetrope and astigmatic = no and tear production rate = normal then recommendation = soft If spectacle prescription = myope and astigmatic = yes and tear production rate = normal then recommendation = hard If age young and astigmatic = yes and tear production rate = normal then recommendation = hard If age = pre-presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none If age = presbyopic and spectacle prescription = hypermetrope and astigmatic = yes then recommendation = none
  • 10. Prof. Pier Luca Lanzi Classification The target to predict is a label (the class variable) (good/bad, none/soft/hard, etc.) Prediction/Regression The target to predict is numerical 10
  • 11. Prof. Pier Luca Lanzi classification = model building + model usage
  • 12. Prof. Pier Luca Lanzi Classification: Model Construction 14 Classification Algorithm IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’ name rank years tenured Mike Assistant Prof 3 no Mary Assistant Prof 7 yes Bill Professor 2 yes Jim Associate Prof 7 yes Dave Assistant Prof 6 no Anne Associate Prof 3 no Training Data Classifier (Model)
  • 13. Prof. Pier Luca Lanzi Classification: Model Usage 15 tenured = yes name rank years tenured Tom Assistant Prof 2 no Merlisa Associate Prof 7 no George Professor 5 yes Joseph Assistant Prof 7 yes Test Data Classifier (Model) Unseen Data Jeff, Professor, 4
  • 14. Prof. Pier Luca Lanzi Evaluating Classification Methods • Accuracy §Classifier accuracy in predicting the correct the class labels • Speed §Time to construct the model (training time) §Time to use the model to label unseen data • Other Criteria §Robustness in handling noise §Scalability §Interpretability §… 16
  • 15. Prof. Pier Luca Lanzi Logistic Regression 20
  • 16. Prof. Pier Luca Lanzi To build a model to label the green and red data points we can compute an hyperplane Score(x) that separates the examples labeled green from the one labeled red
  • 17. Prof. Pier Luca Lanzi To predict the labels we check whether the value of Score(x0, x1) is positive or negative and we label the examples accordingly Score(x0,x1) = 0 Score(x0,x1) > 0 Score(x0,x1) < 0
  • 18. Prof. Pier Luca Lanzi Intuition Behind Linear Classifiers (Two Classes) • We define a score function similar to the one used in linear regression, • The label is determined by the sign of the score value, 23
  • 19. Prof. Pier Luca Lanzi Logistic Regression • Well-known and widely used statistical classification method • Instead of computing the label, it computes the probability of assigning a class to an example, that is, • For this purpose, logistic regression assumes that, • By making the score computation explicit and using h to identify all the feature transformation hj we get, 24
  • 20. Prof. Pier Luca Lanzi 25
  • 21. Prof. Pier Luca Lanzi Logistic Regression • Logistic Regression search for the weight vector that corresponds to the highest likelihood • For this purpose, it performs a gradient ascent on the log likelihood function • Which updates weight j using, 26
  • 22. Prof. Pier Luca Lanzi Classification using Logistic Regression • To classify an example X, we select the class Y that maximizes the probability P(Y=y | X) or check or • By taking the natural logarithm of both sides, we obtain a linear classification rule that assign the label Y=-1 if • Y=+1 otherwise 27
  • 23. Prof. Pier Luca Lanzi Overfitting & Regularization 28
  • 24. Prof. Pier Luca Lanzi L1 & L2 Regularization • Logistic regression can use L1 and L2 regularization to limit overfitting and produce sparse solution • Like it happened with regression, overfitting is often associated to large weights so to limit overfitting we penalize large weights Total cost = Measure of Fit - Magnitude of Coefficients • Regularization §L1 uses the sum of absolute values, which penalizes large weights and at the same time promotes sparse solutions §L2 uses the sum of squares, which penalizes large weights 29
  • 25. Prof. Pier Luca Lanzi L1 Regularization L2 Regularization 30
  • 26. Prof. Pier Luca Lanzi If ⍺ is zero, then we have no regularization If ⍺ tends to infinity, the solution is a zero weight vector ⍺ must balance fit and the weight magnitude
  • 27. Prof. Pier Luca Lanzi Example 32
  • 28. Prof. Pier Luca Lanzi Run the Logistic Regression Python notebooks provided on Beep
  • 29. Prof. Pier Luca Lanzi Safe Loans Prediction • Given data downloaded from the Lending Club Corporation website regarding safe and risky loans, build a model to predict whether a loan should be given • Model evaluation using 10-fold crossvalidation §Simple Logistic Regression μ=0.63 σ=0.05 §Logistic Regression (L1) μ=0.63 σ=0.07 §Logistic Regression (L2) μ=0.64 σ=0.05 34
  • 30. Prof. Pier Luca Lanzi Weights of the simple logistic model, the L1 regularized model, and the L2 regularized model. 35
  • 31. Prof. Pier Luca Lanzi From now on we are going to use k-fold crossvalidation a lot to score models k-fold crossvalidation generates k separate models Which one should be deployed?
  • 32. Prof. Pier Luca Lanzi K-fold crossvalidation provides an evaluation of model performance on unknown data Its output it’s the evaluation, not the model! The model should be obtained by running the algorithm on the entire dataset!
  • 33. Prof. Pier Luca Lanzi Multiclass Classification 38
  • 34. Prof. Pier Luca Lanzi Logistic regression assumes that there are only two class values (e.g., +1/-1) What if we have more? (e.g., none, soft, hard or Setosa/Versicolour/Virginica) 39
  • 35. Prof. Pier Luca Lanzi One Versus the Rest Evaluation • For each class, it creates one classifier that predicts the target class against all the others • For instance, given three classes A, B, C, it computes three models: one that predicts A against B and C, one that predicts B against A and C, and one that predicts C against A and B • Then, given an example, all the three classifiers are applied and the label with the highest probability is returned • Alternative approaches include the minimization of loss based on the multinomial loss fit across the entire probability distribution 40
  • 36. Prof. Pier Luca Lanzi One Versus Rest multiclass model using the Iris dataset 41
  • 37. Prof. Pier Luca Lanzi Multinomial multiclass model using the Iris dataset
  • 38. Prof. Pier Luca Lanzi Run the Logistic Regression Python notebooks on multiclass classification available on Beep
  • 39. Prof. Pier Luca Lanzi Categorical Attributes 44
  • 40. Prof. Pier Luca Lanzi So far we applied Logistic Regression (and regression) to numerical variables What if the variables are categorical?
  • 41. Prof. Pier Luca Lanzi The Weather Dataset Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 46
  • 42. Prof. Pier Luca Lanzi One Hot Encoding Map each categorical attribute with n values into n binary 0/1 variables Each one describing one specific attribute values For example, attribute Outlook is replaced by three binary variables Sunny, Overcast, and Rainy
  • 43. Prof. Pier Luca Lanzi One Hot Encoding for the Outlook attribute. overcast rainy sunny 0 0 1 0 0 1 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 1 0 0 0 1 0
  • 44. Prof. Pier Luca Lanzi Run the Logistic Regression Python notebooks on one hot encoding available on Beep
  • 45. Prof. Pier Luca Lanzi Summary 50
  • 46. Prof. Pier Luca Lanzi Classification • The goal of classification is to build a model of a target concept (the class attribute) available in the data • The target concept is typically a symbolic label (good/bad, yes/no, none/soft/hard, etc.) • Linear classifiers are one of the simplest of classification algorithm that computes a model represented as a hyperplane that separates examples of the two classes • Logistic regression is a classification algorithm that computes the probability to classify an example with a certain label • It works similarly to linear regression and can be coupled with L1 and L2 regularization to limit overfitting • It assumes that the class has only two values but it can be extended to problems involving more than two classes • Like linear regression it assumes that the variables are numerical, if they are not, categorical variables must be mapped into numerical values • One of the typical approaches is to use one-hot-encoding 51