

Jordan Smith
MUMT 611
Written summary of classifiers
18 February 2008


                            A Review of Support Vector Machines

Abstract
        A support vector machine (SVM) is a learning machine that can be used for classification
problems (Cortes 1995) as well as for regression and novelty detection (Bennett 2000). SVMs
look for the hyperplane that optimally separates two classes of data. Important features of SVMs
are the absence of local minima, the well-controlled capacity of the solution (Cristianini 2000),
and the ability to handle high-dimensional input data efficiently (Cortes 1995). The SVM is
conceptually quite simple but very powerful: despite its relative youth, it has performed well
against other popular classifiers (Meyer 2002, 2003) and has been applied to problems in several
fields, including music information retrieval.

1.     Introduction

1.1      History
         The support vector machine was developed quite recently, emerging only in the early
1990s. However, it is also the product of decades of research in computational learning theory by
Russian mathematicians Vladimir Vapnik and Alexey Chervonenkis. Their resulting theory,
summarized in Vapnik’s 1982 book Estimation of Dependences Based on Empirical Data, has
been called Vapnik-Chervonenkis or VC theory (Vapnik 2006). That book describes the
implementation of a support vector machine for linearly separable training data (Cortes 1995).
Beginning in the early 1990s, researchers at Bell Labs made a number of important extensions to
the SVM: in 1992, Boser, Guyon, and Vapnik proposed using Aizerman's kernel trick to classify
data that may only be separable by polynomial or radial basis functions; in 1995, Vapnik and
Cortes extended the theory to handle non-separable training data by using a cost function;
finally, a method of support vector regression was developed in 1996 (Drucker 1996).

1.2     Summary
        This very brief introduction to SVMs will first describe in historical order: the case of
using a SVM to classify linear, separable data; the case of using a kernel function to make a non-
linear classification; and the case of using a cost function to allow for non-separable data. In
section 3, a number of studies using SVMs will be described, including several related to music
information retrieval. Finally, some studies evaluating the performance of SVMs are
summarized.

2.     Support Vector Machines

2.1    Linear, separable data
       The basic problem that a SVM learns and solves is a two-category classification problem.
Following the method of Bennett’s discussion (2000), suppose we have a set of l observations.
Each observation can be represented by a pair {xi, yi}, where xi є R^N and yi є {-1, 1}. That is,
each observation contains an N-dimensional vector x and a class assignment y. Our goal is to find
the optimal separating hyperplane; that is, the flat (N-1)-dimensional surface that best separates
the data.
         For the time being we assume that a separating hyperplane exists and is defined by the
normal vector w. On either side of this plane we construct a pair of parallel planes such that:
                               w·xi ≥ b + 1   for yi = 1
                               w·xi ≤ b – 1   for yi = -1
where b indicates the offset of the plane from the origin. This situation is pictured in Figure 1,
where the separating plane is the solid line and the two parallel planes are the dashed lines. The
dashed lines ‘push up’ against some of the training data points: these points are called ‘support
vectors,’ and in fact they completely determine the solution. The gap between these lines is called
the margin, and we wish to maximize its width, which is 2/||w||. Equivalently, we minimize:
                                                ½||w||²
subject to the constraint:
                                          yi (w·xi – b) ≥ 1
The solution can be obtained using Lagrange multipliers (Burges 1998).

Figure 1. Two data sets, represented by squares and circles, are separated by two parallel
hyperplanes subtended by support vectors (circled). The distance between these planes – the
margin – is the quantity maximized by a SVM. The solid line is the optimal separating
hyperplane.
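As an illustrative aside (not part of the original summary), the sketch below shows this
hard-margin case in practice. It assumes the scikit-learn library, whose SVC class is built on the
LIBSVM package cited in section 3.1; the toy data are invented, and a very large value of the
penalty parameter C (introduced in section 2.3) approximates the strict constraint above.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in R^2 (invented toy data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.4, size=(20, 2)),    # class -1, near the origin
               rng.normal(3.0, 0.4, size=(20, 2))])   # class +1, near (3, 3)
y = np.array([-1] * 20 + [1] * 20)

# A very large C approximates the hard-margin, separable case of section 2.1.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]                    # normal vector of the separating hyperplane
b = clf.intercept_[0]               # note: scikit-learn's decision function is w·x + b,
                                    # so its intercept has the opposite sign of the b above
margin = 2.0 / np.linalg.norm(w)    # width of the margin, 2 / ||w||

print("support vectors:\n", clf.support_vectors_)
print("w =", w, "b =", b, "margin =", margin)

Only the circled points of Figure 1 end up in clf.support_vectors_; removing any other training
point would leave the fitted hyperplane unchanged.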


2.2     Kernel functions
        Often, a non-linear decision surface is required to separate the data. Repeating the above
steps to maximize the separation between two non-linear surfaces can be computationally
expensive. Instead, the kernel trick is used: input data are mapped into a higher-dimensional
feature space via a specified kernel function, and the data are linearly separable in that higher-
dimensional space. Furthermore, if a good kernel function is selected, the dot product is
preserved in the feature space (Cortes 1995), so that the mathematical approach outlined in
section 2.1 is still applicable. The kernel functions that have been used and whose properties
have been studied most extensively are linear and polynomial functions, the radial-basis
function, and the sigmoid function (Sherrod 2008).

Figure 2. Visualization of the kernel trick. Input data are mapped into a higher-dimensional
feature space using a kernel function, resulting in linearly-separable training data. Source:
Holbrey, R. “Dimension Reduction Algorithms for Data Mining and Visualization.”
<http://www.comp.leeds.ac.uk/richardh/astro/index.html> Accessed 12 February 2008.
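To make the “dot product preserved in the feature space” idea concrete, here is a small numerical
sketch (added for illustration, not drawn from the cited sources): for the homogeneous degree-2
polynomial kernel on R², evaluating the kernel directly in input space gives exactly the dot
product of the explicitly mapped feature vectors, so the mapping never has to be computed.

import numpy as np

def phi(x):
    # Explicit feature map for the homogeneous degree-2 polynomial kernel on R^2:
    # phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2).
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(x, z):
    # k(x, z) = (x·z)^2, evaluated directly in the 2-dimensional input space.
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

print(np.dot(phi(x), phi(z)))   # dot product in the 3-dimensional feature space: 16.0
print(poly_kernel(x, z))        # the same value, without ever computing phi: 16.0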

2.3     Non-separable data
        A method of accommodating errors and outliers in the input data was developed in 1995
(Cortes 1995). It can be implemented simply by allowing each observation to violate the margin
by a slack of up to ξi (resulting in a ‘fuzzy margin’) and adding the penalty term C·Σ ξi to the
optimization problem (Burges 1998). We then want to minimize:
                                             ½||w||² + C·(Σ ξi)
subject to the constraints:
                                    yi (w·xi – b) + ξi ≥ 1,   ξi ≥ 0
(Bennett 2000). This is substantially harder to solve than the separable case. In Chang and Lin’s
LIBSVM manual, the minimization conditions, constraints, and resulting decision functions are
defined for each type of classification, along with algorithms for solving the required quadratic
programming problems (2007).
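A brief sketch of the role of the cost parameter C follows, again assuming scikit-learn's
LIBSVM-backed SVC; the two overlapping clusters are invented for the example. A small C
tolerates margin violations (a wide, ‘fuzzy’ margin), while a large C penalizes them heavily and
fits the training data more tightly.

import numpy as np
from sklearn.svm import SVC

# Two overlapping (non-separable) clusters, invented for the example.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(2.5, 1.0, size=(50, 2))])
y = np.array([-1] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print("C = %g: %d support vectors, training accuracy = %.2f"
          % (C, clf.n_support_.sum(), clf.score(X, y)))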

3      Studies using SVMs

3.1     Applications
        Throughout his early papers, Vapnik often used optical character recognition as an
experimental example application (Boser 1992, Cortes 1995, Schölkopf 1996). (See also
Sebastiani 1999, Joachims 1997.) Since then, many authors have used SVMs to develop
classifiers in other disciplines: see, for instance, the work on face detection by Osuna et al.
(1997b) or on gene expression data by Brown et al. (2000). In the field of music information
retrieval, Dhanaraj and Logan used SVMs in their automatic identification of hit songs based on
lyrics and acoustic features (2005); Laurier and Herrera submitted a mood classifier based on
SVMs and acoustic features that finished second at MIREX 2007; and Meng used SVMs at
multiple stages in his dissertation: first to perform temporal feature integration and second to
perform automatic genre identification based on these features (2006). Both Mandel (2005,
2006) and Xu (2003) have studied musical genre classification using SVMs based on acoustic
features. The free software package LIBSVM is a library of tools for implementing various
types of SVMs (Chang 2007), while DTREG can fit a number of predictive models, from
SVMs to various types of neural nets and decision trees (Sherrod 2008).

3.2     Performance
        According to Vapnik, his SVM hand-written digit classifier easily outperforms
state-of-the-art classifiers based on other learning routines. However, since their rise
in popularity in the 1990s, SVMs have been the object of closer scrutiny: a study by Meyer
concluded that although SVMs performed very well in classification and regression tasks, other
methods were as competitive (2002).
        While the two-category classification problem is the classic one to study analytically, in
practice more categories must often be distinguished. Hsu (2002) compared the
performance of various methods of combining binary classifiers, concluding that one-against-one
and ‘directed acyclic graph SVM’ were better than one-against-all.
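As a rough illustration of these multiclass strategies (using scikit-learn rather than the exact
implementations Hsu compared): SVC trains one-against-one binary classifiers internally, as in
LIBSVM, while wrapping it in OneVsRestClassifier gives one-against-all; cross-validation on a
small standard dataset compares the two.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes, 4 features

# SVC combines one-against-one binary classifiers internally;
# OneVsRestClassifier wraps it to give the one-against-all strategy instead.
one_vs_one = SVC(kernel="rbf", C=1.0)
one_vs_all = OneVsRestClassifier(SVC(kernel="rbf", C=1.0))

for name, clf in [("one-against-one", one_vs_one), ("one-against-all", one_vs_all)]:
    scores = cross_val_score(clf, X, y, cv=5)
    print("%s: mean cross-validated accuracy = %.3f" % (name, scores.mean()))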


                                          Bibliography


Bennett, K., and C. Campbell. 2000. “Support vector machines: Hype or hallelujah?” Special
       Interest Group on Knowledge Discovery and Data Mining Explorations. 2(2): 1–13.

Boser, B., I. Guyon, and V. Vapnik. 1992. “A training algorithm for optimal margin classifiers.”
       Proceedings of the 5th Annual Workshop on Computational Learning Theory. 144–52.

Brown, M., W. Grundy, D. Lin, N. Cristianini, C. Sugnet, T. Furey, M. Ares Jr., and D. Haussler.
      2000. Knowledge-based analysis of microarray gene expression data by using support
      vector machines. Proceedings of the National Academy of Science. 97: 262–267.

Burges, C. 1998. “A tutorial on support vector machines for pattern recognition.” Data Mining
       and Knowledge Discovery. 2(2): 121–67.

Chang, C., and C. Lin. 2007. “LIBSVM: a library for support vector machines.” Manual for
       software available online: <http://www.csie.ntu.edu.tw/~cjlin/libsvm/>

Cristianini, N., and J. Shawe-Taylor. 2000. Chapter 6: Further reading and advanced topics. In An
        Introduction to Support Vector Machines. Cambridge: Cambridge University Press.
        <http://www.support-vector.net/chapter_6.html>

Cortes, C., and V. Vapnik. 1995. Support-vector networks. Machine Learning. 20(3): 273–297.

Dhanaraj, R., and B. Logan. 2005. “Automatic prediction of hit songs.” International
      Conference on Music Information Retrieval, London UK. 488–91.

Drucker, H., C. Burges, L. Kaufman, A. Smola, and V. Vapnik. 1996. Support vector regression
      machines. Advances in Neural Information Processing Systems 9. Cambridge: MIT Press.
      155–61.

Hsu, C., and C. Lin. 2002. A comparison of methods for multiclass support vector machines.
       IEEE Transactions on Neural Networks. 13(2): 415–425.

Hsu, C., C. Chang, and C. Lin. 2007. A practical guide to support vector classification.
       <http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf>

Joachims, T. 1997. Text categorization with support vector machines: Learning with many
       relevant features. Springer Lecture Notes in Computer Science. 1398: 137–42.

Laurier, C., and P. Herrera. 2007. “Audio music mood classification using support vector
       machine.” Proceedings of the 8th International Conference on Music Information Retrieval.

Mandel, M., and D. Ellis. 2005. Song-level features and support vector machines for music
      classification. Proceedings of the 6th International Conference on Music Information
       Retrieval. 594–599.

Mandel, M., G. Poliner, and D. Ellis. 2006. Support vector machine active learning for music
      retrieval. Multimedia Systems. 12(1): 3–13.

Meng, A. 2006. Temporal feature integration for music organization. PhD diss., Technical
      University of Denmark.

Meyer, D., F. Leisch, and K. Hornik. 2002. Benchmarking support vector machines. Report
       Series SFB, Adaptive Information Systems and Modelling in Economics and
       Management Science. 78.

Meyer, D., F. Leisch, and K. Hornik. 2003. The support vector machine under test.
       Neurocomputing. 55: 169–86.

Osuna, E., R. Freund, and F. Girosi. 1997a. An improved training algorithm for support vector
       machines. Proceedings of the IEEE Workshop on Neural Networks for Signal
       Processing. 276–85.

Osuna, E., R. Freund, and F. Girosi. 1997b. Training support vector machines: an application to
       face detection. Proceedings of the IEEE Conference on Computer Vision and Pattern
       Recognition. 130–7.

Schölkopf, B., C. Burges, and V. Vapnik. 1996. Incorporating invariances in support vector
      learning machines. Springer Lecture Notes in Computer Science. 1112: 47–52.

Sebastiani, F. 1999. Machine learning in automated text categorization. Technical Report,
       Consiglio Nazionale delle Ricerche. Pisa, Italy. 1–59.

Sherrod, P. 2008. “DTREG Predictive Modeling Software.” Manual for software available
       online: <www.dtreg.com>

Smola, A., and B. Schölkopf. 1998. A tutorial on support vector regression. NeuroCOLT2
       Technical Report NC2-TR-1998-030. Holloway College, London.

Vapnik, V. 2006. Empirical Inference Science. Afterword in 1982 reprint of Estimation of
      Dependences Based on Empirical Data.

Xu, C., N. Maddage, X. Shao, F. Cao, and Q. Tian. 2003. Musical genre classification using
       support vector machines. Proceedings of IEEE International Conference on Acoustics,
       Speech, and Signal Processing. 5: 429–32.
