SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Downloaden Sie, um offline zu lesen
Weiwei Cheng, Jens Hühn and Eyke Hüllermeier
     Knowledge Engineering & Bioinformatics Lab
   Department of Mathematics and Computer Science
          University of Marburg, Germany
Label Ranking (an example)
Learning customers’ preferences on cars:

                                   label ranking  
       customer 1              MINI _ Toyota _ BMW 
       customer 2              BMW _ MINI _ Toyota
       customer 3              BMW _ Toyota _ MINI
       customer 4              Toyota _ MINI _ BMW
       new customer                     ???

      where the customers could be described by feature 
      vectors, e.g., (gender, age, place of birth, has child, …)

                                                                   1/17
Label Ranking (an example)
Learning customers’ preferences on cars:

                        MINI       Toyota      BMW
       customer 1         1          2           3
       customer 2         2          3           1
       customer 3         3          2           1
       customer 4         2           1          3
       new customer       ?           ?          ?


        π(i) = position of the i‐th label in the ranking
             1: MINI        2: Toyota       3: BMW
                                                           2/17
Label Ranking (more formally)
Given:
  a set of training instances 
  a set of labels
  for each training instance     : a set of pairwise preferences
  of the form                (for some of the labels)

Find:
  A ranking function (            mapping) that maps each           
  to a ranking      of L (permutation     ) and generalizes well 
  in terms of a loss function on rankings (e.g., Kendall’s tau)

                                                                   3/17
Existing Approaches
… essentially reduce label ranking to classification:

  Ranking by pairwise comparison
  Fürnkranz and Hüllermeier,  ECML‐03
  Constraint classification  (CC)
  Har‐Peled , Roth and Zimak,  NIPS‐03
  Log linear models for label ranking
  Dekel, Manning and Singer,  NIPS‐03


   − are efficient but may come with a loss of information
   − may have an improper bias and lack flexibility
   − may produce models that are not easily interpretable

                                                             4/17
Local Approach (this work)
        nearest neighbor                      decision tree




 Target function         is estimated (on demand) in a local way.
 Distribution of rankings is (approx.) constant in a local region.
 Core part is to estimate the locally constant model.
                                                                     5/17
Local Approach (this work)
 Output (ranking) of an instance x is generated according to a 
 distribution              on 

 This distribution is (approximately) constant within the local 
 region under consideration.

 Nearby preferences are considered as a sample generated by  
 which is estimated on the basis of this sample via ML. 




                                                                   6/17
Probabilistic Model for Ranking 
Mallows model (Mallows, Biometrika, 1957)


with 
center ranking
spread parameter
and        is a right invariant metric on permutations




                                                         7/17
Inference (complete rankings)
Rankings                           observed locally.

                                                 ML




                                                       monotone in θ
Inference (incomplete rankings)
Probability of an incomplete ranking:


where       denotes the set of consistent extensions of    .

Example for label set {a,b,c}:
              Observation σ       Extensions E(σ)
                                     a_b_c
                  a_b                a_c_b
                                     c_a_b


                                                               9/17
Inference (incomplete rankings)  cont.
The corresponding likelihood:




Exact MLE                                        becomes infeasible when 
n is large. Approximation is needed.
                                                                            10/17
1 2 3 4
                                                         1 2 4 3
                                                         4 3
                                                                   1 2 3 4
                                                                   1 2 4 3
                                                               1 2 4 3
                                                               1 4 3
Inference (incomplete rankings)  cont.
  Approximation via a variant of EM, viewing the non‐observed 
  labels as hidden variables. 
     replace the E‐step of EM algorithm with a maximization step 
     (widely used in learning HMM, K‐means clustering, etc.)

  1.   Start with an initial center ranking (via generalized Borda count)
  2.   Replace an incomplete observation with its most probable 
       extension (first M‐step, can be done efficiently)
  3.   Obtain MLE as in the complete ranking case (second M‐step)
  4.   Replace the initial center ranking with current estimation
  5.   Repeat until convergence

                                                                               11/17
Inference
                                      low variance




                                          high variance



Not only the estimated ranking     is of interest …
… but also the spread parameter    , which is a measure of precision 
and, therefore, reflects the confidence/reliability of the prediction 
(just like the variance of an estimated mean). 
The bigger   , the more peaked the distribution around the center 
ranking.
                                                                    12/17
Label Ranking Trees
Major modifications: 
    split criterion 
    Split ranking set      into        and        , maximizing goodness‐of‐fit




    stopping criterion for partition
    1. tree is pure
         any two labels in two different rankings have the same order
    2. number of labels in a node is too small 
         prevent an excessive fragmentation
                                                                                 13/17
Label Ranking Trees
Labels: BMW, Mini, Toyota




                            14/17
Experimental Results




    IBLR: instance‐based label ranking        LRT: label ranking trees
                      CC: constraint classification
                  Performance in terms of Kentall’s tau                  15/17
Accuracy (Kendall’s tau)
Typical “learning curves”:
                                        0,8

                                       0,75                                     LRT
                 Ranking performance




                                        0,7
                                                                     CC
                                       0,65

                                        0,6

                                                            Probability of missing label
                                       0,55
                                                0     0,1      0,2        0,3    0,4   0,5   0,6    0,7
                                              more     amount of preference information            less



    Main observation:  Local methods are more flexible and can exploit more 
    preference information compared with the model‐based approach.

                                                                                                          16/17
Take‐away Messages
  An instance‐based method for label ranking using a 
  probabilistic model.
  Suitable for complete and incomplete rankings.
  Comes with a natural measure of the reliability of a 
  prediction. Makes other types of learners possible: 
  label ranking trees.

o More efficient inference for the incomplete case. 
o Dealing with variants of the label ranking problem, such as 
  calibrated label ranking and multi‐label classification.

                                                             17/17
Google “kebi germany” for more info.

Weitere ähnliche Inhalte

Was ist angesagt?

daa-unit-3-greedy method
daa-unit-3-greedy methoddaa-unit-3-greedy method
daa-unit-3-greedy method
hodcsencet
 

Was ist angesagt? (20)

Distance Vector Routing
Distance Vector RoutingDistance Vector Routing
Distance Vector Routing
 
Automata Theory
Automata TheoryAutomata Theory
Automata Theory
 
Statistics and data science
Statistics and data scienceStatistics and data science
Statistics and data science
 
R programming presentation
R programming presentationR programming presentation
R programming presentation
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learning
 
Exploratory Data Analysis using Python
Exploratory Data Analysis using PythonExploratory Data Analysis using Python
Exploratory Data Analysis using Python
 
2.7 normal forms cnf & problems
2.7 normal forms  cnf & problems2.7 normal forms  cnf & problems
2.7 normal forms cnf & problems
 
Coloring graphs
Coloring graphsColoring graphs
Coloring graphs
 
Algorithms Lecture 2: Analysis of Algorithms I
Algorithms Lecture 2: Analysis of Algorithms IAlgorithms Lecture 2: Analysis of Algorithms I
Algorithms Lecture 2: Analysis of Algorithms I
 
Class ppt intro to r
Class ppt intro to rClass ppt intro to r
Class ppt intro to r
 
Graphs bfs dfs
Graphs bfs dfsGraphs bfs dfs
Graphs bfs dfs
 
Graph Traversal Algorithm
Graph Traversal AlgorithmGraph Traversal Algorithm
Graph Traversal Algorithm
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Computational Complexity: Introduction-Turing Machines-Undecidability
Computational Complexity: Introduction-Turing Machines-UndecidabilityComputational Complexity: Introduction-Turing Machines-Undecidability
Computational Complexity: Introduction-Turing Machines-Undecidability
 
Unit 2.pptx
Unit 2.pptxUnit 2.pptx
Unit 2.pptx
 
daa-unit-3-greedy method
daa-unit-3-greedy methoddaa-unit-3-greedy method
daa-unit-3-greedy method
 
Data analysis with R
Data analysis with RData analysis with R
Data analysis with R
 
Attention in Deep Learning
Attention in Deep LearningAttention in Deep Learning
Attention in Deep Learning
 
Software Engineering : Software testing
Software Engineering : Software testingSoftware Engineering : Software testing
Software Engineering : Software testing
 

Ähnlich wie Decision tree and instance-based learning for label ranking

Query Linguistic Intent Detection
Query Linguistic Intent DetectionQuery Linguistic Intent Detection
Query Linguistic Intent Detection
butest
 
ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3
zukun
 
Figure 1
Figure 1Figure 1
Figure 1
butest
 
Tutorial rpo
Tutorial rpoTutorial rpo
Tutorial rpo
mosi2005
 

Ähnlich wie Decision tree and instance-based learning for label ranking (20)

A new instance-based label ranking approach using the Mallows model
A new instance-based label ranking approach using the Mallows modelA new instance-based label ranking approach using the Mallows model
A new instance-based label ranking approach using the Mallows model
 
A Simple Instance-Based Approach to Multilabel Classi cation Using the Mallow...
A Simple Instance-Based Approach to Multilabel Classication Using the Mallow...A Simple Instance-Based Approach to Multilabel Classication Using the Mallow...
A Simple Instance-Based Approach to Multilabel Classi cation Using the Mallow...
 
Query Linguistic Intent Detection
Query Linguistic Intent DetectionQuery Linguistic Intent Detection
Query Linguistic Intent Detection
 
0 Model Interpretation setting.pdf
0 Model Interpretation setting.pdf0 Model Interpretation setting.pdf
0 Model Interpretation setting.pdf
 
Dimensionality reduction
Dimensionality reductionDimensionality reduction
Dimensionality reduction
 
07 dimensionality reduction
07 dimensionality reduction07 dimensionality reduction
07 dimensionality reduction
 
ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3
 
Shape context
Shape context Shape context
Shape context
 
LR2. Summary Day 2
LR2. Summary Day 2LR2. Summary Day 2
LR2. Summary Day 2
 
Combining Instance-Based Learning and Logistic Regression for Multilabel Clas...
Combining Instance-Based Learning and Logistic Regression for Multilabel Clas...Combining Instance-Based Learning and Logistic Regression for Multilabel Clas...
Combining Instance-Based Learning and Logistic Regression for Multilabel Clas...
 
Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)
 
17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx
 
Figure 1
Figure 1Figure 1
Figure 1
 
Tutorial rpo
Tutorial rpoTutorial rpo
Tutorial rpo
 
OPTIMAL CHOICE: NEW MACHINE LEARNING PROBLEM AND ITS SOLUTION
OPTIMAL CHOICE: NEW MACHINE LEARNING PROBLEM AND ITS SOLUTIONOPTIMAL CHOICE: NEW MACHINE LEARNING PROBLEM AND ITS SOLUTION
OPTIMAL CHOICE: NEW MACHINE LEARNING PROBLEM AND ITS SOLUTION
 
SVM & KNN Presentation.pptx
SVM & KNN Presentation.pptxSVM & KNN Presentation.pptx
SVM & KNN Presentation.pptx
 
Machine Learning Explanations: LIME framework
Machine Learning Explanations: LIME framework Machine Learning Explanations: LIME framework
Machine Learning Explanations: LIME framework
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
Fractal analysis of good programming style
Fractal analysis of good programming styleFractal analysis of good programming style
Fractal analysis of good programming style
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Decision tree and instance-based learning for label ranking