Introduction to Machine Learning
Lecture 7
Instance Based Learning

Albert Orriols i Puig
aorriols@salle.url.edu

Artificial Intelligence – Machine Learning
Enginyeria i Arquitectura La Salle
Universitat Ramon Llull
Recap of Lecture 6

                          LET’S START WITH DATA
                             CLASSIFICATION




                                                               Slide 2
Artificial Intelligence                     Machine Learning
Recap of Lecture 6

                  [Figure: Data Set → Classification Model. How?]




       We are going to deal with:
               • Data described by nominal and continuous attributes
               • Data that may have instances with missing values



Recap of Lecture 6
        We want to build decision trees
        How can I automatically
        generate these types
        of trees?
                Decide which attribute we
                should put in each node
                Decide a split point




                Rely on information theory
                We also saw many other improvements



Today’s Agenda


        Classification without building a model
        K-Nearest Neighbor (kNN)
        Effect of K
        Distance functions
        Variants of K-NN
        Strengths and weaknesses



Classification without Building a Model

        Forget about a global model!
                Simply store all the training examples
                Build a local model for each new test instance
                Referred to as lazy learners


        Some approaches to IBL
                Nearest neighbors
                Locally weighted regression
                Case-based reasoning




k-Nearest Neighbors
        Algorithm
                Store all the training data
                Given a new test instance
                          Retrieve the k nearest neighbors of the test instance
                          Predict the majority class among the neighbors


                Voronoi cells: for k=1, the feature space is decomposed
                into cells, one per training instance. Every point in a cell
                has that instance as its nearest neighbor.
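
The algorithm above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the function names (`euclidean`, `knn_predict`) and the choice of Euclidean distance as the default are assumptions made for the example.

```python
from collections import Counter
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Return the majority class among the k training instances closest to query."""
    # "Training" is only storage: all the work happens at query time.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
train_y = ["a", "a", "a", "b", "b"]
print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
```

With k=1 the prediction is simply the label of the single closest stored instance, which is what the Voronoi decomposition describes.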




k-Nearest Neighbors
        But, where is the learning process?
                Is selecting the k neighbors and returning the majority class learning?
                No, that’s just retrieving


        But still, some important issues
                Which k should I use?
                Which distance functions should I use?
                Should I maintain all instances of the training data set?




Which k Should I Use?
        The effect of k
                [Figure: two panels, 15-NN and 1-NN]




                Do you remember the discussion about overfitting in C4.5?
                          Apply the same concepts here!

Which k Should I Use?
        Some experimental results on the use of different k
                [Figure: error against number of neighbors, with 7-NN highlighted]

                Notice that the test error decreases as k increases, but at k ≈ 5-7
                it starts increasing again
                Rule of thumb: k=3, k=5, and k=7 seem to work OK in the
                majority of problems
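
Rather than relying only on the rule of thumb, k can be chosen empirically. The sketch below assumes leave-one-out error on the training data as the selection criterion (the slide does not say how its curve was obtained), and the names `loo_error` and `choose_k` are hypothetical.

```python
from collections import Counter


def knn_predict(train, query, k):
    """train is a list of (features, label) pairs; uses squared Euclidean distance."""
    def dist(x):
        return sum((a - b) ** 2 for a, b in zip(x, query))

    nearest = sorted(train, key=lambda pair: dist(pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_error(data, k):
    """Leave-one-out error rate: classify each instance using all the others."""
    mistakes = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, x, k) != label:
            mistakes += 1
    return mistakes / len(data)


def choose_k(data, candidates=(1, 3, 5, 7)):
    """Pick the candidate k with the lowest leave-one-out error (smallest k on ties)."""
    return min(candidates, key=lambda k: (loo_error(data, k), k))


# Hypothetical toy data: two clusters of four points each.
data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
        ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"), ((11, 11), "b")]
for k in (1, 3, 5, 7):
    print(k, loo_error(data, k))
```

On this toy set, k=7 forces every left-out point to be outvoted by the four instances of the other cluster, which shows how a k that is too large can hurt as much as one that is too small.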
Distance Functions
        Distance functions must be able to handle
                Nominal attributes
                Continuous attributes
                Missing values
        The key
                They must return a low value for similar objects and a high
                value for different objects
                Seems obvious, right? But still, it is domain dependent
        There are many of them. Let’s see some of the most
        commonly used ones



Distance Functions
        Distance between two points in the same space
                d(x, y)


        Some properties expected to be satisfied in general
                d(x, y) ≥ 0 and d(x, x) = 0
                d(x, y) = d(y, x)
                d(x, y) + d(y, z) ≥ d(x, z)




Distances for Continuous Variables
          Given x = (x_1, …, x_n)' and y = (y_1, …, y_n)'

                  Euclidean
                          $d_E(x, y) = \left[ \sum_{i=1}^{n} (x_i - y_i)^2 \right]^{1/2}$

                  Minkowski
                          $d_M(x, y) = \left[ \sum_{i=1}^{n} |x_i - y_i|^q \right]^{1/q}$

                  Absolute value distance
                          $d_{ABS}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$
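
The three distances are closely related: Euclidean and absolute value distance are the Minkowski distance with q=2 and q=1. A short sketch (function names are chosen for this example only):

```python
def minkowski(x, y, q=2):
    """Minkowski distance of order q between two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of attributes")
    return sum(abs(xi - yi) ** q for xi, yi in zip(x, y)) ** (1.0 / q)


def euclidean(x, y):
    """Minkowski distance with q = 2."""
    return minkowski(x, y, q=2)


def absolute(x, y):
    """Minkowski distance with q = 1 (absolute value distance)."""
    return minkowski(x, y, q=1)


print(euclidean((0, 0), (3, 4)), absolute((0, 0), (3, 4)))
```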




Distances for Continuous Variables
          What if attributes are measured over different scales?
                  Attribute 1 ranging in [0,1]
                  Attribute 2 ranging in [0,1000]
                  Can you detect any potential problem in the aforementioned
                  distance functions?

                  [Figure: two panels, one with x in [0,1] and y in [0,1000], the other with x in [0,1000] and y in [0,1000]]

Distances for Continuous Variables
        The larger the scale, the larger the influence of the
        attribute in the distance function
        Solution: Normalize each attribute
        How:
                Normalization by means of the range

                          $d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{\max_a - \min_a}$

                Normalization by means of the standard deviation

                          $d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{4\sigma_a}$
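
Both normalizations can be sketched per attribute as follows. The example assumes the per-attribute distance d is the absolute difference and that σ is the population standard deviation of the attribute over the training data; neither choice is stated on the slide.

```python
import statistics


def range_normalized_diff(a, b, values):
    """|a - b| divided by the range of the attribute over the training values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # constant attribute: it cannot separate instances
    return abs(a - b) / (hi - lo)


def std_normalized_diff(a, b, values):
    """|a - b| divided by four standard deviations of the attribute."""
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return 0.0
    return abs(a - b) / (4 * sigma)


# Hypothetical attribute measured in [0, 1000].
print(range_normalized_diff(0, 500, [0, 250, 1000]))
print(std_normalized_diff(0, 2, [0, 2]))
```

After normalization, an attribute ranging in [0,1000] contributes on the same scale as one ranging in [0,1].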
Distances for Nominal Attributes
        Several metrics to deal with nominal attributes
                Overlap distance function




                          Idea: Two nominal attributes are equal only if they have the same
                          value
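
A minimal sketch of the overlap idea, assuming the common definition in which two nominal values are at distance 0 when they are equal and 1 otherwise, summed over attributes. The function names are chosen for this example.

```python
def overlap(a, b):
    """Distance between two nominal values: 0 if identical, 1 otherwise."""
    return 0 if a == b else 1


def overlap_distance(x, y):
    """Number of nominal attributes on which two instances disagree."""
    return sum(overlap(a, b) for a, b in zip(x, y))


print(overlap_distance(("red", "small", "round"), ("red", "large", "round")))
```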




Distances for Nominal Attributes
        Several metrics to deal with nominal attributes
                Value difference metric (VDM)




                C = number of classes
                P(a, ex_ia, c) = conditional probability that the output
                class is c given that attribute a has the value ex_ia.




                          Idea: Two nominal values are similar if they have more similar
                          correlations with the output classes
                See (Wilson & Martinez) for more distance functions
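
A sketch of the value difference metric for a single nominal attribute. It assumes the form usually given for VDM, the sum over classes of |P(c | v1) - P(c | v2)|^q with q=2, and estimates the probabilities by simple counting; the helper names `fit_vdm` and `vdm` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Estimate P(class | value) for one nominal attribute from training data."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    probs = {
        v: {c: cnt[c] / sum(cnt.values()) for c in classes}
        for v, cnt in counts.items()
    }
    return probs, classes


def vdm(probs, classes, v1, v2, q=2):
    """Sum over classes of |P(c | v1) - P(c | v2)| ** q."""
    return sum(abs(probs[v1][c] - probs[v2][c]) ** q for c in classes)


# Hypothetical attribute "color" with a yes/no output class.
values = ["red", "red", "blue", "blue", "green", "green"]
labels = ["yes", "yes", "yes", "no", "no", "no"]
probs, classes = fit_vdm(values, labels)
print(vdm(probs, classes, "red", "blue"), vdm(probs, classes, "red", "green"))
```

Here "red" ends up closer to "blue" than to "green" because its class distribution is more similar, even though all three values are different symbols.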
Distances for Heterogeneous Attributes


        What if my data set is described by both nominal and
        continuous attributes?
                Apply the same distance function
                Use nominal distance functions for nominal attributes
                Use continuous distance function for continuous attributes
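
One way to combine the two kinds of per-attribute distance is sketched below, in the spirit of the heterogeneous Euclidean-overlap metric described by Wilson & Martinez. The treatment of missing values (maximal distance 1) and the parameter names `nominal` and `ranges` are assumptions of this example.

```python
import math


def heterogeneous_distance(x, y, nominal, ranges):
    """Combine overlap (nominal) and range-normalized (continuous) distances.

    nominal: set of indices of nominal attributes.
    ranges:  dict mapping each continuous attribute index to (min, max).
    Missing values are represented by None.
    """
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if a is None or b is None:
            d = 1.0  # assumption: a missing value counts as maximally distant
        elif i in nominal:
            d = 0.0 if a == b else 1.0
        else:
            lo, hi = ranges[i]
            d = abs(a - b) / (hi - lo) if hi > lo else 0.0
        total += d * d
    return math.sqrt(total)


# Hypothetical instances: (color, ratio in [0,1], weight in [0,1000]).
nominal = {0}
ranges = {1: (0.0, 1.0), 2: (0.0, 1000.0)}
print(heterogeneous_distance(("red", 0.2, 100.0), ("blue", 0.7, 600.0), nominal, ranges))
```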




Variants of kNN


        Different variants of kNN
                Distance-weighted kNN
                Attribute-weighted kNN




Distance-Weighted kNN
         Inference of original kNN
                 The k nearest neighbors vote for the class
         Shouldn’t the closest examples have a higher influence in the
         decision process?
                 Weight the contribution of each of the k neighbors according to its distance
                 E.g., for classification
                          $\hat{f}(x_q) = \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))$
                 and for real-valued targets
                          $\hat{f}(x_q) = \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$
                 where $w_i = \frac{1}{d(x_q, x_i)^2}$

                 More robust to noisy instances and outliers

         E.g.: Shepard’s method (Shepard, 1968)
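
The two weighted formulas can be sketched as follows, using w_i = 1 / d(x_q, x_i)^2. The handling of an exact match (distance 0, where the weight is undefined) by returning that instance's target directly is an assumption of this example, as are the function names.

```python
from collections import defaultdict
import math


def _nearest(train, query, k):
    """Return the k (distance, target) pairs closest to query."""
    scored = [(math.dist(x, query), target) for x, target in train]
    return sorted(scored, key=lambda pair: pair[0])[:k]


def weighted_knn_classify(train, query, k=3):
    """Each neighbor votes for its class with weight 1 / d^2."""
    votes = defaultdict(float)
    for d, label in _nearest(train, query, k):
        if d == 0:
            return label  # exact match: the weight would be infinite
        votes[label] += 1.0 / d ** 2
    return max(votes, key=votes.get)


def weighted_knn_regress(train, query, k=3):
    """Weighted average of the neighbors' real-valued targets."""
    num = den = 0.0
    for d, value in _nearest(train, query, k):
        if d == 0:
            return value
        w = 1.0 / d ** 2
        num += w * value
        den += w
    return num / den


# Hypothetical 1-D data: the single close "a" outweighs two farther "b" instances.
train_c = [((0.0,), "a"), ((3.0,), "b"), ((4.0,), "b")]
train_r = [((0.0,), 10.0), ((2.0,), 20.0)]
print(weighted_knn_classify(train_c, (1.0,), k=3))
print(weighted_knn_regress(train_r, (1.0,), k=2))
```

In the classification example an unweighted 3-NN vote would return "b" (two votes against one), while the distance weighting lets the much closer "a" instance decide.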


Attribute-weighted kNN
        What if some attributes are irrelevant or misleading?
                If irrelevant: cost increases, but accuracy is not affected
                If misleading: cost increases and accuracy may decrease


        Weight attributes:
                $d_w(x, y) = \sum_{i=1}^{n} w_i (x_i - y_i)^2$

        How to determine the weights?
                Option 1: The expert provides us with the weights
                Option 2: Use a machine learning approach
                More will be said in the next lecture!
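
A small sketch of the weighted distance, following the formula as written on the slide (a weighted sum of squared differences, without a square root). The weights in the example are made up to show how a weight of 0 removes an attribute from the comparison.

```python
def weighted_distance(x, y, weights):
    """Sum over attributes of w_i * (x_i - y_i)^2."""
    return sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y))


# Hypothetical weights: the second attribute is first ignored, then down-weighted.
print(weighted_distance((1, 10), (2, 20), (1, 0)))
print(weighted_distance((1, 10), (2, 20), (1, 0.01)))
```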

Strengths and Weaknesses
  Strengths of kNN
           Building of a new local model for each test instance
           Learning has no cost
           Empirical results show that the method is highly accurate compared with other
           machine learning techniques
  Weaknesses
           A retrieval approach: it does not learn
           No global model. The knowledge is not legible
           Test cost increases linearly with the number of stored instances
           No generalization
           Curse of dimensionality: What happens if we have many attributes?
           Noise and outliers may have a very negative effect
Next Class

        From instance-based to case-based reasoning
        A little bit more on learning
                Distance functions
                Prototype selection




Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 

Recently uploaded (20)

2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 

Lecture7 - IBk

  • 1. Introduction to Machine Learning. Lecture 7: Instance Based Learning. Albert Orriols i Puig (aorriols@salle.url.edu). Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull.
  • 2. Recap of Lecture 6. Let's start with data classification.
  • 3. Recap of Lecture 6. Data Set → Classification Model: how? We are going to deal with: data described by nominal and continuous attributes; data that may have instances with missing values.
  • 4. Recap of Lecture 6. We want to build decision trees. How can I automatically generate these types of trees? Decide which attribute we should put in each node; decide a split point; rely on information theory. We also saw many other improvements.
  • 5. Today's Agenda. Classification without building a model; K-Nearest Neighbor (kNN); effect of K; distance functions; variants of K-NN; strengths and weaknesses.
  • 6. Classification without Building a Model. Forget about a global model! Simply store all the training examples and build a local model for each new test instance. These methods are referred to as lazy learners. Some approaches to instance-based learning (IBL): nearest neighbors, locally weighted regression, case-based reasoning.
  • 7. k-Nearest Neighbors. Algorithm: store all the training data; given a new test instance, retrieve the k nearest neighbors of the test instance and predict the majority class among those neighbors. Voronoi cells: the feature space is decomposed into several cells, e.g. for k=1.
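The algorithm on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the function names, the toy data, and the use of Euclidean distance are assumptions made here for the example.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train, query, k=3, distance=euclidean):
    """Return the majority class among the k stored examples closest to query.

    train is a list of (feature_vector, label) pairs: "learning" is just storing it.
    Ties in the vote are broken by the order in which labels are first seen.
    """
    neighbors = sorted(train, key=lambda ex: distance(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]
```

With k=1 the prediction is simply the label of the single closest stored example, which is what produces the Voronoi-cell decomposition mentioned on the slide.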
  • 8. k-Nearest Neighbors. But where is the learning process? Is selecting the k neighbors and returning the majority class learning? No, that's just retrieving. Still, there are some important issues: Which k should I use? Which distance functions should I use? Should I maintain all instances of the training data set?
  • 9. Which k Should I Use? The effect of k. [Figure: 15-NN compared with 1-NN] Do you remember the discussion about overfitting in C4.5? Apply the same concepts here!
  • 10. Which k Should I Use? Some experimental results on the use of different k. [Figure: results against the number of neighbors, with 7-NN marked] Notice that the test error decreases as k increases, but at k ≈ 5-7 it starts increasing again. Rule of thumb: k=3, k=5, and k=7 seem to work OK in the majority of problems.
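One way to compare values of k on a given data set is to estimate the error for each candidate. The sketch below uses leave-one-out error, which is an assumption of this example (the slide only reports test error without saying how it was measured). The data set is a hypothetical 1-D toy problem with one deliberately noisy point labeled "b" inside the "a" region.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k nearest examples (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_error(data, k):
    """Leave-one-out error: classify each example using all the others."""
    wrong = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, x, k) != label:
            wrong += 1
    return wrong / len(data)

# Hypothetical data: the point at 1.5 is noise inside the "a" region.
data = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"), ((3.0,), "a"),
        ((1.5,), "b"),
        ((6.0,), "b"), ((7.0,), "b"), ((8.0,), "b"), ((9.0,), "b")]

errors = {k: loo_error(data, k) for k in (1, 3, 5)}
best_k = min(errors, key=errors.get)
```

On this toy data 1-NN is misled by the noisy point, while 3-NN outvotes it, which mirrors the overfitting discussion on the previous slide.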
  • 11. Distance Functions. Distance functions must be able to handle nominal attributes, continuous attributes, and missing values. The key: they must return a low value for similar objects and a high value for different objects. Seems obvious, right? But still, it is domain dependent. There are many of them. Let's see some of the most used.
  • 12. Distance Functions. Distance between two points in the same space: $d(x, y)$. Some properties expected to be satisfied in general: $d(x, y) \geq 0$ and $d(x, x) = 0$; $d(x, y) = d(y, x)$; $d(x, y) + d(y, z) \geq d(x, z)$.
  • 13. Distances for Continuous Variables. Given $x = (x_1, \ldots, x_n)'$ and $y = (y_1, \ldots, y_n)'$. Euclidean: $d_E(x, y) = \left[\sum_{i=1}^{n} (x_i - y_i)^2\right]^{1/2}$. Minkowski: $d_M(x, y) = \left[\sum_{i=1}^{n} |x_i - y_i|^q\right]^{1/q}$. Absolute value distance: $d_{ABS}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$.
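The three distances on this slide are the same formula with different exponents: q=2 gives the Euclidean distance and q=1 gives the absolute value distance. A minimal sketch follows; the function name `minkowski` is chosen here for illustration.

```python
def minkowski(x, y, q=2):
    """Minkowski distance of order q between two equal-length numeric vectors.

    q=1 is the absolute value (Manhattan) distance, q=2 the Euclidean distance.
    """
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

d_euclidean = minkowski((0.0, 0.0), (3.0, 4.0), q=2)
d_absolute = minkowski((0.0, 0.0), (3.0, 4.0), q=1)
```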
  • 14. Distances for Continuous Variables. What if attributes are measured over different scales? Attribute 1 ranging in [0, 1]; attribute 2 ranging in [0, 1000]. Can you detect any potential problem in the aforementioned distance functions? [Figure: two panels, x in [0, 1] with y in [0, 1000], and x in [0, 1000] with y in [0, 1000]]
  • 15. Distances for Continuous Variables. The larger the scale, the larger the influence of the attribute in the distance function. Solution: normalize each attribute. How: normalization by means of the range, $d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{\max_a - \min_a}$; normalization by means of the standard deviation, $d_a^{norm}(ex_1^a, ex_2^a) = \frac{d(ex_1^a, ex_2^a)}{4\sigma_a}$.
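Both normalizations can be sketched per attribute as below. The function names are hypothetical, the per-attribute distance is taken to be the absolute difference, and the population standard deviation is used; the slide does not say whether the population or the sample estimate is intended.

```python
import statistics

def range_normalized_diff(v1, v2, column):
    """|v1 - v2| divided by the observed range of the attribute."""
    span = max(column) - min(column)
    return abs(v1 - v2) / span if span > 0 else 0.0

def sigma_normalized_diff(v1, v2, column):
    """|v1 - v2| divided by four standard deviations of the attribute."""
    sigma = statistics.pstdev(column)  # assumption: population std. deviation
    return abs(v1 - v2) / (4 * sigma) if sigma > 0 else 0.0

# Hypothetical attribute measured on a [0, 1000] scale.
column = [0.0, 250.0, 500.0, 750.0, 1000.0]
by_range = range_normalized_diff(0.0, 500.0, column)
by_sigma = sigma_normalized_diff(0.0, 500.0, column)
```

After range normalization every attribute contributes a value in [0, 1], so an attribute in [0, 1000] no longer dominates one in [0, 1].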
  • 16. Distances for Nominal Attributes. Several metrics exist to deal with nominal attributes. Overlap distance function. Idea: two nominal attribute values are considered equal only if they are the same value.
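The formula for the overlap distance did not survive in this transcript. As it is usually defined (for instance in Wilson & Martinez, which the next slide cites), and written here in the $ex_i^a$ notation of the neighboring slides, it is:

```latex
d_a(ex_1^a, ex_2^a) =
\begin{cases}
0 & \text{if } ex_1^a = ex_2^a \\
1 & \text{otherwise}
\end{cases}
```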
  • 17. Distances for Nominal Attributes. Several metrics exist to deal with nominal attributes. Value difference metric (VDM): $C$ = number of classes; $P(a, ex_i^a, c)$ = conditional probability that the output class is $c$ given that attribute $a$ has the value $ex_i^a$. Idea: two nominal values are similar if they have more similar correlations with the output classes. See (Wilson & Martinez) for more distance functions.
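The VDM formula itself is missing from this transcript. The sketch below assumes the form given by Wilson & Martinez, $vdm_a(x, y) = \sum_{c=1}^{C} |P(a, x, c) - P(a, y, c)|^q$, with the probabilities estimated as relative frequencies on the training data. The function names and the toy column are hypothetical.

```python
from collections import Counter, defaultdict

def fit_vdm(values, labels):
    """Estimate P(class | attribute value) for one nominal attribute."""
    counts = defaultdict(Counter)
    for v, c in zip(values, labels):
        counts[v][c] += 1
    classes = sorted(set(labels))
    probs = {
        v: {c: cnt[c] / sum(cnt.values()) for c in classes}
        for v, cnt in counts.items()
    }
    return probs, classes

def vdm(probs, classes, v1, v2, q=1):
    """Value difference between two values of the same nominal attribute."""
    return sum(abs(probs[v1][c] - probs[v2][c]) ** q for c in classes)

# Hypothetical training column and class labels.
values = ["red", "red", "red", "blue", "blue", "green", "green"]
labels = ["y", "y", "n", "n", "n", "y", "y"]
probs, classes = fit_vdm(values, labels)
```

Here "red" and "green" both co-occur mostly with class "y", so they come out closer to each other than "blue" and "green", even though all three values are different symbols.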
  • 18. Distances for Heterogeneous Attributes. What if my data set is described by both nominal and continuous attributes? Apply the same distance function: use nominal distance functions for nominal attributes and a continuous distance function for continuous attributes.
  • 19. Variants of kNN. Different variants of kNN: distance-weighted kNN and attribute-weighted kNN.
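One possible way to combine the two kinds of per-attribute distance is sketched below. It follows the general shape of the heterogeneous Euclidean-overlap metric described by Wilson & Martinez, which is an assumption here since the slide does not name a specific combination rule. Treating a missing value as maximally distant is also an assumption, and the function and parameter names are hypothetical.

```python
import math

def heterogeneous_distance(x, y, nominal, ranges):
    """Distance over mixed attributes.

    x, y    : sequences of attribute values (None marks a missing value)
    nominal : set of indices of the nominal attributes
    ranges  : dict mapping each continuous attribute index to (min, max)
    """
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        if a is None or b is None:
            d = 1.0  # assumption: a missing value counts as maximally distant
        elif i in nominal:
            d = 0.0 if a == b else 1.0  # overlap distance
        else:
            lo, hi = ranges[i]
            d = abs(a - b) / (hi - lo) if hi > lo else 0.0  # range-normalized
        total += d * d
    return math.sqrt(total)

# Hypothetical instances: (color, attribute in [0, 1], attribute in [0, 1000]).
x = ("red", 0.2, 100.0)
y = ("blue", 0.4, 600.0)
dist = heterogeneous_distance(x, y, nominal={0}, ranges={1: (0.0, 1.0), 2: (0.0, 1000.0)})
```

Because every per-attribute term lies in [0, 1], nominal and continuous attributes contribute on a comparable scale.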
  • 20. Distance-Weighted kNN. Inference of original kNN: the k nearest neighbors vote for the class. Shouldn't the closest examples have a higher influence in the decision process? Weight the contribution of each of the k neighbors w.r.t. their distance. E.g., for classification, $\hat{f}(x_q) = \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))$, and for real-valued targets, $\hat{f}(x_q) = \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$, where $w_i = \frac{1}{d(x_q, x_i)^2}$. More robust to noisy instances and outliers. E.g.: Shepard's method (Shepard, 1968).
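Both weighted rules on this slide can be sketched as follows, using $w_i = 1 / d(x_q, x_i)^2$. The function names and data are hypothetical, and the small `eps` added to the squared distance is an assumption of this sketch to avoid dividing by zero when the query coincides with a stored example.

```python
import math
from collections import defaultdict

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def k_nearest(train, query, k):
    """Return the k closest examples as (distance, target) pairs."""
    scored = [(euclidean(x, query), target) for x, target in train]
    return sorted(scored, key=lambda pair: pair[0])[:k]

def weighted_knn_classify(train, query, k=3, eps=1e-12):
    """Each neighbor votes for its class with weight 1 / d^2."""
    scores = defaultdict(float)
    for d, label in k_nearest(train, query, k):
        scores[label] += 1.0 / (d * d + eps)
    return max(scores, key=scores.get)

def weighted_knn_regress(train, query, k=3, eps=1e-12):
    """Weighted average of the neighbors' real-valued targets."""
    neighbors = k_nearest(train, query, k)
    weights = [1.0 / (d * d + eps) for d, _ in neighbors]
    return sum(w * t for w, (_, t) in zip(weights, neighbors)) / sum(weights)

# Hypothetical data: one close "a" and two farther "b" examples.
train_cls = [((0.0,), "a"), ((3.0,), "b"), ((4.0,), "b")]
train_reg = [((0.0,), 0.0), ((2.0,), 10.0)]
```

For the query `(1.0,)` with k=3, an unweighted vote would return "b" (two votes against one), while the weighted vote returns "a" because the single close neighbor outweighs the two distant ones.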
  • 21. Attribute-weighted kNN. What if some attributes are irrelevant or misleading? If irrelevant, cost increases but accuracy is not affected. If misleading, cost increases and accuracy may decrease. Weight attributes: $d_w(x, y) = \sum_{i=1}^{n} w_i (x_i - y_i)^2$. How to determine the weights? Option 1: the expert provides us with the weights. Option 2: use a machine learning approach. More will be said in the next lecture!
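The weighted distance above is a one-liner in code. In this sketch the weights are simply supplied by hand, which corresponds to Option 1 on the slide; the function name and the example values are hypothetical.

```python
def attribute_weighted_distance(x, y, w):
    """Weighted sum of squared differences, one weight per attribute."""
    return sum(wi * (a - b) ** 2 for wi, a, b in zip(w, x, y))

# Hypothetical instances whose second attribute is known to be misleading.
x = (1.0, 100.0)
y = (2.0, -50.0)
unweighted = attribute_weighted_distance(x, y, (1.0, 1.0))
ignoring_second = attribute_weighted_distance(x, y, (1.0, 0.0))
```

Setting a weight to zero removes the attribute from the distance entirely, so a misleading attribute can no longer push relevant neighbors away.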
  • 22. Strengths and Weaknesses. Strengths of kNN: a new local model is built for each test instance; learning has no cost; empirical results show that the method is highly accurate w.r.t. other machine learning techniques. Weaknesses: it is a retrieving approach that does not learn; there is no global model, so the knowledge is not legible; test cost increases linearly with the number of input instances; no generalization; curse of dimensionality (what happens if we have many attributes?); noise and outliers may have a very negative effect.
  • 23. Next Class. From instance-based to case-based reasoning. A little bit more on learning: distance functions and prototype selection.
  • 24. Introduction to Machine Learning. Lecture 7: Instance Based Learning. Albert Orriols i Puig (aorriols@salle.url.edu). Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull.