Support Vector Machines (SVM)

Support Vector Machine
Classification, Regression and Outlier Detection
Khan
Introduction
A Support Vector Machine (SVM) is a discriminative classifier: given labelled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.
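The idea above maps onto just a few lines of code. A minimal sketch (assuming scikit-learn is available; the toy data below is made up purely for illustration):

```python
from sklearn.svm import SVC

# Toy labelled training data: two features per sample, two classes (0 and 1).
X_train = [[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]]
y_train = [0, 0, 0, 1, 1, 1]

# Supervised learning step: the classifier learns a separating hyperplane.
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# The learned hyperplane categorizes new examples.
print(clf.predict([[0.5, 0.5], [4.5, 4.5]]))  # -> [0 1]
```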
What could be drawn to separate the black dots from the blue squares?
A line drawn between these data points classifies the black dots and blue squares.
Linearly separable data
Linearly vs. nonlinearly separable data
What could be drawn to separate these data points (red dots from blue stars)?
Nonlinearly separable data
Here the hyperplane is a 2D plane drawn parallel to the x-axis that acts as the separator.
Nonlinearly separable data
Nonlinear data (type 2)
Raw data and the line used as a hyperplane
For the previous data, if the line is used as a hyperplane:
● Two black dots also fall into the category of blue squares
● Data separation is not perfect
● It tolerates some outliers in the classification
This type of separator provides the best classification.
But:
● It is quite difficult to train a model like this.
● The trade-off between perfect separation and tolerating outliers is controlled by what is termed the regularization parameter.
Tuning Parameters of SVM
1. Kernel
2. Regularization
3. Gamma
4. Margin
Margin
The margin is the perpendicular distance between the closest data points and the hyperplane (on both sides).
The best optimised line (hyperplane) with the maximum margin is termed the Maximum Margin Hyperplane.
The closest points, from which the margin distance is calculated, are the support vectors.
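In code, the support vectors and the margin width can be read off a fitted linear SVM. A minimal sketch (scikit-learn, made-up toy data; the margin width 2/||w|| follows from the learned weight vector w):

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data.
X = np.array([[0, 0], [1, 1], [3, 3], [4, 4]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

print(clf.support_vectors_)       # the closest points that define the margin
w = clf.coef_[0]
print(2 / np.linalg.norm(w))      # perpendicular margin width across both sides
```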
Regularization
● Known as the ‘C’ parameter in Python’s scikit-learn library
● Tunes how strongly the SVM classifier is penalised for misclassifying training data
● C large → margin of the hyperplane becomes small
● C small → margin of the hyperplane becomes large (some misclassification is possible)
1. C large → chance of overfitting
2. C small → chance of underfitting
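A hedged sketch of how C enters in practice (scikit-learn's SVC, made-up toy data): a small C favours a wide, forgiving margin, while a large C penalises every misclassified training point.

```python
from sklearn.svm import SVC

# Toy data with one point sitting close to the other class.
X = [[0, 0], [1, 1], [2, 0], [3, 3], [4, 4], [2.5, 2.5]]
y = [0, 0, 0, 1, 1, 1]

soft = SVC(kernel="linear", C=0.01).fit(X, y)   # small C: wide margin, may underfit
hard = SVC(kernel="linear", C=100.0).fit(X, y)  # large C: narrow margin, may overfit

# A tighter fit typically leaves fewer points on or inside the margin.
print(len(soft.support_vectors_), len(hard.support_vectors_))
```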
Gamma
● Defines how far the influence of a single training example reaches in the calculation of the plausible line of separation.
● Low gamma → points far from the plausible line are also considered in the calculation
● High gamma → only points close to the plausible line are considered in the calculation
[Figure: decision boundaries for a high gamma value vs. a low gamma value]
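A brief sketch of the same idea in code (scikit-learn's SVC with an RBF kernel, toy data made up for illustration):

```python
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

# Low gamma: each point's influence reaches far, giving a smoother boundary.
low_gamma = SVC(kernel="rbf", gamma=0.01).fit(X, y)

# High gamma: only points very close to the boundary matter, giving a wigglier fit.
high_gamma = SVC(kernel="rbf", gamma=10.0).fit(X, y)

print(low_gamma.predict([[2.5, 2.5]]), high_gamma.predict([[2.5, 2.5]]))
```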
Kernels
● Mathematical functions for transforming data using some linear algebra
● Different SVM algorithms use different types of kernel functions
Various kernels available
1. Linear kernel
2. Non-linear kernel
3. Radial basis function (RBF)
4. Sigmoid
5. Polynomial
6. Exponential
Example :
K(x, y) = <f(x), f(y)>
The kernel function equals the dot product of the feature maps of the n-dimensional inputs.
Mathematical representation
x = (x1, x2, x3); y = (y1, y2, y3)
f(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3)
f(y) = (y1y1, y1y2, y1y3, y2y1, y2y2, y2y3, y3y1, y3y2, y3y3)
K(x, y ) = (<x, y>)²
x = (1, 2, 3)
y = (4, 5, 6)
f(x) = (1, 2, 3, 2, 4, 6, 3, 6, 9)
f(y) = (16, 20, 24, 20, 25, 30, 24, 30, 36)
<f(x), f(y)> = 16 + 40 + 72 + 40 + 100+ 180 + 72 + 180 + 324 = 1024
K(x, y) = (4 + 10 + 18)² = 1024 → the kernel function gives the same result without computing f explicitly
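The same arithmetic can be checked in a few lines of Python (NumPy assumed; the helper f below simply enumerates the pairwise products used above):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Explicit feature map: all pairwise products (x1x1, x1x2, ..., x3x3).
f = lambda v: np.outer(v, v).ravel()

print(np.dot(f(x), f(y)))   # dot product in feature space -> 1024
print(np.dot(x, y) ** 2)    # kernel shortcut (<x, y>)^2    -> 1024
```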
Pros:
● It works really well when there is a clear margin of separation.
● It is effective in high-dimensional spaces.
● It is effective in cases where the number of dimensions is greater than the number of samples.
● It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Cons:
● It doesn't perform well when we have a large data set, because the required training time is higher.
● It also doesn't perform very well when the data set has a lot of noise, i.e. the target classes are overlapping.
● SVM doesn't directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (see the related SVC method in Python's scikit-learn library).
Applications :
1. Face detection
2. Text and hypertext categorization
3. Classification of images
4. Bioinformatics
5. Handwriting recognition
6. Protein fold and remote homology detection
7. Generalized predictive control (GPC)
Let's code now
Data used: Iris from scikit-learn
Plots: Matplotlib
Kernels: linear and RBF
File: svm_final.py
Link to code: Click here for code
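The linked svm_final.py is not reproduced here; the sketch below shows what a script along those lines could look like (Iris data from scikit-learn, linear and RBF kernels, decision regions plotted with Matplotlib). It is an illustrative reconstruction, not the original file.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, svm

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target            # first two features, so we can plot in 2D

for i, kernel in enumerate(("linear", "rbf")):
    clf = svm.SVC(kernel=kernel, gamma="scale", C=1.0).fit(X, y)

    # Evaluate the classifier on a grid to draw its decision regions.
    xx, yy = np.meshgrid(
        np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
        np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200),
    )
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    plt.subplot(1, 2, i + 1)
    plt.contourf(xx, yy, Z, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.title(f"SVC with {kernel} kernel")

plt.show()
```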
Thank You