1
BDIGITAL: After Work Knowledge Program
Practical approach to machine learning techniques for
classification and anomaly detection
Xavier Rafael-Palou
xrafael@bdigital.org
(12/12/2014)
2
New Hype surrounding AI
3
Even… Turing test!!
4
(Classic test)
Natural Language Processing - communication
Knowledge representation - knowledge storage (KS)
Automated reasoning - use KS to answer questions
Machine Learning - detect patterns, adapt (total Turing Test)
(Advanced Turing Test)
Computer vision - perceive objects
Robotics - manipulate objects + move around
Blade Runner (Ridley Scott, 1982): Deckard and the Voight-Kampff machine in 2019.
Inspired by Philip K. Dick's novel "Do Androids Dream of Electric Sheep?" (1968)
(*) Source: "Artificial Intelligence: A Modern Approach" by Stuart Russell & Peter Norvig.
5
Agenda
1. Introduction (15 min)
2. Basic Techniques (45 min)
3. Guides & Tips Building a Classifier (15 min)
4. Practice:
- Environment(15 min)
- Examples & exercises (60 min)
6. References
6
Introduction
Classification and anomaly detection, but also clustering and regression, are examples of Machine Learning (ML) tasks.
ML is a subfield of Artificial Intelligence. Two classic definitions:
- The field of study that gives computers the ability to learn without being explicitly programmed. (Arthur Samuel, 1959)
- A computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. (Tom Mitchell, 1998)
Data mining (DM) overlaps in many ways with Machine Learning:
- DM uses many ML methods, but often with the slightly different goal of discovering previously unknown knowledge.
- ML, in contrast, aims to perform accurately on new, unseen examples/tasks after having experienced a training data set.
7
Main ML tasks:
Supervised learning: given a set of examples, the goal is to learn a general rule that maps inputs to outputs.
Others:
Unsupervised learning: no labels are given; the goal is to discover patterns in the data.
Reinforcement learning: the program interacts with a dynamic environment in which it must achieve a certain goal without a teacher.
Semi-supervised learning: the teacher gives an incomplete training set in which some of the target outputs are missing.
8
Classification
Examples:
Email: Spam / Not Spam?
Online Transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign?
Variable to predict (y):
0: "Negative Class" (e.g., benign tumor)
1: "Positive Class" (e.g., malignant tumor)
9
Classification
Binary Classification (y = 0 or 1)
[Figure: "Malignant?" (1 = Yes, 0 = No) plotted against Tumor Size, with a decision boundary separating the two classes. Anomaly?]
10
Classification Complexities
11
Multiclass classification
[Figure: two scatter plots over the features x1 and x2. Left: binary classification. Right: multi-class classification.]
Email foldering/tagging: Work, Friends, Family, Hobby
Medical diagnosis: Not ill, Cold, Flu
Weather: Sunny, Cloudy, Rain, Snow
12
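One-vs-all (one-vs-rest)
Principle: Divide & conquer
[Figure: a three-class problem over the features x1 and x2 is split into three binary problems (Class 1, Class 2, Class 3), each separating one class from the rest.]
On a new input x, output the class whose binary classifier gives the highest score.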
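The one-vs-rest principle can be sketched in a few lines. This is only an illustration under stated assumptions: the three synthetic clusters, the choice of scikit-learn's LogisticRegression as the binary learner, and the helper name predict_one_vs_rest are not from the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic 3-class data (hypothetical): one Gaussian blob per class
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])
y = np.repeat([0, 1, 2], 30)

# Divide: train one binary classifier per class (class k vs the rest)
classifiers = []
for k in range(3):
    clf = LogisticRegression()
    clf.fit(X, (y == k).astype(int))
    classifiers.append(clf)

def predict_one_vs_rest(X_new):
    """Conquer: pick the class whose classifier gives the highest probability."""
    scores = np.column_stack(
        [clf.predict_proba(X_new)[:, 1] for clf in classifiers]
    )
    return scores.argmax(axis=1)

pred = predict_one_vs_rest(centers)
print(pred)
```

scikit-learn also ships a ready-made wrapper (sklearn.multiclass.OneVsRestClassifier); the explicit loop is used here only to make the principle visible.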
13
Agenda
1. Introduction
2. Basic Techniques
3. Guides & Tips Building a Classifier
4. Practice:
- Environment
- Examples & exercises
6. References
14
There are multiple classification techniques:
- Probabilistic
- Decision Tree
- Linear
- Instance-based
- Genetic algorithms
- Fuzzy logic
- …
Each of them learns a decision function in a different way.
Basic Classification Methods
15
Probabilistic classifiers
Example: “Automatic fruit classification”
- The random variable y says whether a fruit is M or A
- Observing the conveyor belt for some time, we get the probabilities of M and A ("a priori" knowledge of the harvest): P(y=M) and P(y=A), which sum to 1
- Classifier: M if P(y=M) >= P(y=A), else A. Is that enough?
CompacInVision 9000
16
- We add a new random variable x to the system for better performance:
x = size grade of the fruit [1, 2, 3, …]
- So we also get the probabilities p(x)
- Since x depends on the type of fruit, we get the densities of x for each type of fruit:
p(x | y=A), p(x | y=M) ("class-conditional probability densities")
How does size affect our belief about the type of fruit in question?
- P(y=A | x) = p(x | y=A) P(y=A) / p(x)
- P(y=M | x) = p(x | y=M) P(y=M) / p(x)
Naive Bayes: A if P(y=A | x) >= P(y=M | x), else M ("a posteriori" probabilities)
Probabilistic classifiers
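A worked version of the fruit example may help. The priors and the class-conditional probabilities below are hypothetical numbers chosen only for illustration; the slides do not give values.

```python
# Hypothetical a priori probabilities of each fruit type (they sum to 1)
prior = {"A": 0.6, "M": 0.4}

# Hypothetical class-conditional probabilities p(x | y) for size grades 1..3
likelihood = {
    "A": {1: 0.1, 2: 0.3, 3: 0.6},
    "M": {1: 0.5, 2: 0.4, 3: 0.1},
}

def posterior(x):
    """Bayes' rule: P(y | x) = p(x | y) P(y) / p(x)."""
    joint = {c: likelihood[c][x] * prior[c] for c in prior}
    evidence = sum(joint.values())  # p(x), the normalizing constant
    return {c: joint[c] / evidence for c in joint}

def classify(x):
    """Decide A if P(y=A | x) >= P(y=M | x), else M."""
    post = posterior(x)
    return "A" if post["A"] >= post["M"] else "M"

for size in (1, 2, 3):
    print(size, posterior(size), classify(size))
```

With these numbers the prior alone would always answer A, while the posterior switches to M for the smallest size grade.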
17
Pros:
- Simple to implement
- Fast to compute (e.g. fits the MapReduce paradigm)
- Works surprisingly well
- Compatible with missing data
- Widely used in text mining (Multinomial Naive Bayes)
Cons:
- Unrealistic hypothesis: all features are equally important and independent of one another given the class
- Dependencies among features are ignored (i.e. recall that all features are given the same weight)
- A zero probability holds a veto over all the other ones
- Requires processing all the data
Probabilistic classifiers
18
- Widely used because the learned knowledge is easy to understand
- Set of conditions (nodes) organized hierarchically
- Prediction: pass a new, unseen instance from the root down to a leaf of the tree
Decision Tree Learning
19
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Training
Greedy strategy. Split the records based on an attribute test that optimizes a certain criterion.
The tree is built recursively, adding conditions until each leaf contains elements of a single class.
- Partitioning strategy: choose the best attribute and the best condition (finding the optimal tree is an NP-hard problem, hence the greedy strategy)
- Determine when to stop
Example:
[Figure: the tree is grown step by step from the training records.
Step 1: a single leaf, Don't Cheat.
Step 2: split on Refund (Yes: Don't Cheat; No: Don't Cheat).
Step 3: under Refund = No, split on Marital Status (Single, Divorced: Cheat; Married: Don't Cheat).
Step 4: under Single, Divorced, split on Taxable Income (< 80K: Don't Cheat; >= 80K: Cheat).]
Decision Tree Learning
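As a rough sketch, the ten training records of the example can be fed to scikit-learn's DecisionTreeClassifier. The numeric encoding (one-hot marital status, income in thousands) and the variable names are assumptions made for this illustration, and the tree the library learns may split differently from the one drawn on the slide.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: refund (1 = Yes), single, married, divorced (one-hot), income in K
feature_names = ["refund", "single", "married", "divorced", "income_k"]
X = [
    [1, 1, 0, 0, 125],
    [0, 0, 1, 0, 100],
    [0, 1, 0, 0, 70],
    [1, 0, 1, 0, 120],
    [0, 0, 0, 1, 95],
    [0, 0, 1, 0, 60],
    [1, 0, 0, 1, 220],
    [0, 1, 0, 0, 85],
    [0, 0, 1, 0, 75],
    [0, 1, 0, 0, 90],
]
y = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]  # 1 = Cheat, 0 = Don't Cheat

# Greedy, recursive partitioning; stops when every leaf is pure
tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X, y)

# Text rendering of the learned conditions, root to leaves
print(export_text(tree, feature_names=feature_names))
train_accuracy = tree.score(X, y)
print("training accuracy:", train_accuracy)
```

A perfect training accuracy here only shows that the tree was grown until the leaves were pure; it says nothing about performance on unseen records.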
20
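Partitioning strategy: prefer attributes that generate homogeneous subsets.
[Figure: a node with mixed classes is non-homogeneous (high degree of impurity); a node dominated by one class is homogeneous (low degree of impurity).]
Strategy examples, where p(j | t) is the relative frequency of class j at node t:
Gini index: $GINI(t) = 1 - \sum_j [p(j \mid t)]^2$
Measures the impurity (lack of homogeneity) of a node. Used in CART, SLIQ, SPRINT.
Classification error: $Error(t) = 1 - \max_j p(j \mid t)$
Measures the misclassification error made by a node.
Information gain: $GAIN_{split} = Entropy(p) - \sum_{i=1}^{k} \frac{n_i}{n} Entropy(i)$
where the parent node p, with n records, is split into k partitions and n_i is the number of records in partition i.
Choose the split that achieves the largest reduction in impurity (e.g. ID3, C4.5).
Decision Tree Learning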
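The impurity measures named on this slide (Gini, classification error, entropy-based gain) are short enough to compute by hand. The sketch below uses the standard definitions; the function names are illustrative, and the base-2 logarithm for entropy is an assumption.

```python
import math

def gini(p):
    """Gini index of a node given its class frequencies p(j | t)."""
    return 1.0 - sum(q * q for q in p)

def entropy(p):
    """Entropy of a node in bits; empty classes contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def classification_error(p):
    """Misclassification error when predicting the majority class."""
    return 1.0 - max(p)

def information_gain(parent_counts, children_counts):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    def freqs(counts):
        total = sum(counts)
        return [c / total for c in counts]
    n = sum(parent_counts)
    weighted = sum(sum(ch) / n * entropy(freqs(ch)) for ch in children_counts)
    return entropy(freqs(parent_counts)) - weighted

# Class counts are [Don't Cheat, Cheat]; splitting the 10 example records on Refund
gain_refund = information_gain([7, 3], [[3, 0], [4, 3]])
print(gini([0.5, 0.5]), entropy([0.5, 0.5]), classification_error([0.5, 0.5]))
print("gain of the Refund split:", gain_refund)
```

A 50/50 node is the most impure case for two classes, and a pure node scores 0 on all three measures.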
21
Based on the principle that the instances within a dataset will generally exist in close proximity to other instances that have similar properties.
kNN (Cover and Hart, 1967) locates the k nearest instances to the query instance and determines its class by identifying the single most frequent class label among them.
Instances can be considered as points within an n-dimensional instance space, where each of the n dimensions corresponds to one of the n features.
A good distance metric should minimize the distance between two similarly classified instances, while maximizing the distance between instances of different classes.
Instance-Based Learning
22
Distance metrics: Euclidean distance (*), Mahalanobis, Manhattan, …
To determine the class given the neighbour list, we can use e.g. majority voting or votes weighted according to distance (1/d^2)
Instance-Based Learning
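A minimal kNN sketch with Euclidean distance, covering both voting schemes mentioned above. The toy data, the function name knn_predict and the small constant that avoids division by zero are assumptions made for this illustration.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, query, k=3, weighted=False):
    """Classify `query` from its k nearest training instances."""
    X_train = np.asarray(X_train, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every stored instance
    dists = np.linalg.norm(X_train - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter()
    for i in nearest:
        # Majority voting, or votes weighted by 1/d^2 (guarded against d = 0)
        weight = 1.0 / (dists[i] ** 2 + 1e-12) if weighted else 1.0
        votes[y_train[i]] += weight
    return votes.most_common(1)[0][0]

X_train = [[1, 1], [1, 2], [2, 1], [6, 6], [6, 7], [7, 6]]
y_train = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X_train, y_train, [1.5, 1.5]))
print(knn_predict(X_train, y_train, [6.5, 6.5], weighted=True))
```

Note that there is no training step at all: the work happens at query time, which is the "lazy learning" trade-off.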
23
Pros:
- Low computational cost during training (lazy learning)
Cons:
- Slow classification
- Requires storing large amounts of information
- Sensitive to the choice of the similarity measure
- No clear criterion for selecting k
Instance-Based Learning
24
The idea is to find a function h(x) (of parameters and attributes) that partitions the data into the desired output classes.
[Figure: decision boundary in the (x1, x2) plane, both axes ticked from 1 to 3; the classifier predicts one class on one side of the boundary.]
Probabilistic Statistical Classification
Principal objective is to find h(x).
25
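Expected values for h(x) are: $0 \le h(x) \le 1$
We need to transform h(x) to obtain this behavior (sigmoid function): $g(z) = \frac{1}{1 + e^{-z}}$
[Figure: the sigmoid curve g(z) plotted against z, saturating at 1.]
Logistic Regression: replace z with $\theta^T x$, so that $h(x) = g(\theta^T x)$
Then we predict "y = 1" if $h(x) \ge 0.5$
and predict "y = 0" if $h(x) < 0.5$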
Probabilistic Statistical Classification
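The sigmoid transformation and the resulting prediction rule can be sketched as follows. The parameter vector theta (here describing the boundary x1 + x2 = 3) and the 0.5 threshold are assumptions chosen for illustration, since the formulas on the slide did not survive extraction.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, X, threshold=0.5):
    """Predict 1 when h(x) = sigmoid(theta . [1, x]) reaches the threshold."""
    X = np.asarray(X, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend the intercept term
    return (sigmoid(Xb @ theta) >= threshold).astype(int)

# Hypothetical parameters: intercept -3, weights 1 and 1
theta = np.array([-3.0, 1.0, 1.0])

print(sigmoid(0.0))
print(predict(theta, [[1, 1], [3, 3]]))
```

Points with x1 + x2 below 3 fall on the "0" side of the boundary, the others on the "1" side.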
26
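How to choose the parameters? Pick those that minimize the error (cost).
Cost function:
$Cost(h(x), y) = -\log(h(x))$ if y = 1
$Cost(h(x), y) = -\log(1 - h(x))$ if y = 0
[Figure: the cost plotted against h(x) between 0 and 1 for the case y = 1.]
The further our hypothesis is from y, the larger the cost function output. If our hypothesis is equal to y, then our cost is 0.
Gradient descent: method to find a local minimum of the cost
Logistic Regression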
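A small sketch of fitting logistic regression by gradient descent on the logarithmic cost. The toy data, learning rate, iteration count and the clipping constant are assumptions made for this illustration, not values from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, Xb, y):
    """Average logistic cost: -log(h) when y = 1, -log(1 - h) when y = 0."""
    h = np.clip(sigmoid(Xb @ theta), 1e-12, 1 - 1e-12)  # avoid log(0)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient_descent(Xb, y, lr=0.1, n_iter=2000):
    """Repeatedly step against the gradient of the cost."""
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        grad = Xb.T @ (sigmoid(Xb @ theta) - y) / len(y)
        theta -= lr * grad
    return theta

# Hypothetical two-feature data: small values are class 0, large values class 1
X = np.array([[0.5, 1.0], [1.0, 0.5], [1.0, 1.0],
              [2.5, 3.0], [3.0, 2.5], [3.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
Xb = np.column_stack([np.ones(len(X)), X])  # add the intercept column

theta = gradient_descent(Xb, y)
pred = (sigmoid(Xb @ theta) >= 0.5).astype(int)
print("cost before:", cost(np.zeros(3), Xb, y), "after:", cost(theta, Xb, y))
print("predictions:", pred)
```

For logistic regression this cost is convex, so the minimum that gradient descent approaches is also the global one.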
27
Logistic vs SVM vs Neural Networks
- N (features) is large: prefer logistic regression, or an SVM without a kernel (the "linear kernel").
- N is small and M (instances) is intermediate: prefer an SVM with a Gaussian kernel.
- N is small and M is large: manually create/add more features, then use logistic regression or an SVM without a kernel.
A neural network is likely to work well in any of these situations, but may be slower to train.
Comparative Classification Methods
28
Comparative Classification Methods
29
Source: S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica 31 (2007) 249–268
Comparative Classification Methods
30
Anomaly Detection
Anomalous behaviors (anomaly detection):
• Fraud detection
• Manufacturing (e.g. aircraft engines)
• Monitoring machines in a data center
Classification:
• Email spam classification
• Weather prediction (sunny/rainy/etc.)
• Cancer classification
31
Anomaly detection vs Classification
Anomaly detection:
- Very small number of positive examples (y=1) (0-20 is common).
- Large number of negative (y=0) examples.
- Many different "types" of anomalies. Hard for any algorithm to learn from positive examples what the anomalies look like; future anomalies may look nothing like any of the anomalous examples we've seen so far.
Classification:
- Large number of positive and negative examples.
- Enough positive examples for the algorithm to get a sense of what positive examples are like; future positive examples are likely to be similar to the ones in the training set.
32
Given a new example, we want to know whether it is abnormal/anomalous.
We define a "model" p(x) that gives the probability that the example is not anomalous.
We use a threshold ϵ (epsilon) as a dividing line, so we can say which examples are anomalous and which are not.
If our anomaly detector is flagging too many examples as anomalous, we need to decrease the threshold ϵ.
Anomaly Detection Methods
33
The Gaussian distribution is the familiar bell-shaped curve, described by a function N(μ, σ^2).
μ (mu) describes the centre of the curve, called the mean. The width of the curve is described by σ (sigma), called the standard deviation.
The parameter μ is estimated as the average of all the examples:
$\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$
We can estimate σ^2 with our familiar squared error formula:
$\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu)^2$
Gaussian Distribution Method
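Estimating the two Gaussian parameters takes a couple of NumPy calls. This is a minimal sketch on a made-up three-example data set; the function name estimate_gaussian is illustrative, and the variance divides by the number of examples m (not m - 1), which is an assumption about the intended estimator.

```python
import numpy as np

def estimate_gaussian(X):
    """Per-feature mean and variance of the training examples."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    # Mean squared deviation from mu; equivalent to X.var(axis=0)
    sigma2 = ((X - mu) ** 2).mean(axis=0)
    return mu, sigma2

# Hypothetical data: 3 examples, 2 features
X = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]])
mu, sigma2 = estimate_gaussian(X)
print("mu =", mu)
print("sigma^2 =", sigma2)
```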
34
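Given a training set of examples {x^(1), …, x^(m)}, where each example is a vector x ∈ R^n.
We make an independence assumption on the values of the features inside a training example x:
$p(x) = p(x_1; \mu_1, \sigma_1^2) \, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2)$
More compactly, the above expression can be written as follows:
$p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$
Anomaly if p(x) < ϵ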
Gaussian Distribution Method
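Under the independence assumption, p(x) is just a product of one-dimensional Gaussian densities. The sketch below fits the model on synthetic "normal" data and flags a far-away point; the data, the function name gaussian_density and the value of epsilon are hypothetical choices for illustration.

```python
import numpy as np

def gaussian_density(X, mu, sigma2):
    """p(x) as the product over features of univariate Gaussian densities."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    per_feature = (
        np.exp(-((X - mu) ** 2) / (2.0 * sigma2))
        / np.sqrt(2.0 * np.pi * sigma2)
    )
    return per_feature.prod(axis=1)

# Synthetic normal (non-anomalous) training data with two features
rng = np.random.RandomState(0)
X_train = rng.normal(loc=[5.0, 10.0], scale=[1.0, 2.0], size=(500, 2))
mu = X_train.mean(axis=0)
sigma2 = X_train.var(axis=0)

epsilon = 1e-4  # hypothetical threshold
p = gaussian_density([[5.0, 10.0], [12.0, 25.0]], mu, sigma2)
is_anomaly = p < epsilon
print(p, is_anomaly)
```

The first query point sits at the centre of the training data, while the second lies several standard deviations away in both features.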
35
Fit the model p(x) on the training set
On a cross-validation/test example x, predict:
y = 1 (anomaly) if p(x) < ϵ
y = 0 (normal) if p(x) >= ϵ
Possible evaluation metrics:
- True positive, false positive, false negative, true negative
- Precision/Recall
- F1-score
Tricks:
- Choose features that might take on unusually large or small values in the event of an anomaly
- Use the cross-validation set to choose the threshold ϵ
- Train only on normal data
- Validation and test sets: add the anomalies (50% each)
Gaussian Distribution Method
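One plausible way to pick the threshold on a labelled cross-validation set is to scan candidate values and keep the one with the best F1-score. The validation densities and labels below are made up, and the scan over 1000 evenly spaced candidates is an assumption, not a procedure given in the slides.

```python
import numpy as np

def f1_score_binary(y_true, y_pred):
    """F1-score for the positive (anomalous) class."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_threshold(p_val, y_val, n_steps=1000):
    """Scan candidate thresholds and keep the one with the best F1."""
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_val.min(), p_val.max(), n_steps):
        pred = (p_val < eps).astype(int)  # flag low-density examples
        f1 = f1_score_binary(y_val, pred)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1

# Hypothetical densities p(x) on a validation set, with true labels (1 = anomaly)
p_val = np.array([0.30, 0.25, 0.28, 0.33, 0.001, 0.27, 0.002, 0.31])
y_val = np.array([0, 0, 0, 0, 1, 0, 1, 0])

best_eps, best_f1 = select_threshold(p_val, y_val)
print("best epsilon:", best_eps, "F1:", best_f1)
```

F1 is used rather than accuracy because anomalies are rare: a detector that never flags anything would still score a high accuracy.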
36
An extension of the Gaussian anomaly detection model that may (or may not) catch more anomalies.
Instead of modelling p(x1), p(x2), … separately, we model p(x) all in one go.
Parameters are: μ ∈ R^n and Σ ∈ R^(n×n)
We can vary Σ to change the shape, width, and orientation of the contours.
Changing μ will move the centre of the distribution.
Anomaly if p(x)<ϵ
Multivariate Gaussian Distribution
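The multivariate density can be written directly from its standard formula. The sketch below is illustrative only: the function name and the example covariance matrix are assumptions, and in practice μ and Σ would be estimated from normal training data (for instance with X.mean(axis=0) and np.cov(X, rowvar=False, bias=True)).

```python
import numpy as np

def multivariate_gaussian_density(X, mu, Sigma):
    """Density of N(mu, Sigma) evaluated at each row of X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    n = mu.shape[0]
    diff = X - mu
    # Squared Mahalanobis distance, solving a linear system instead of
    # forming the inverse of Sigma explicitly
    mahalanobis = np.sum(diff * np.linalg.solve(Sigma, diff.T).T, axis=1)
    norm_const = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * mahalanobis) / norm_const

mu = np.zeros(2)
# Hypothetical covariance: the two features are strongly positively correlated
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])

p_along = multivariate_gaussian_density([[1.0, 1.0]], mu, Sigma)[0]
p_against = multivariate_gaussian_density([[1.0, -1.0]], mu, Sigma)[0]
print(p_along, p_against)
```

Both points are equally far from the centre, yet the one that contradicts the correlation gets a much lower density, which a per-feature model could not express.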
37
One-class SVM
The multivariate Gaussian model can automatically capture correlations between
different features of x.
However, the original model is computationally cheaper (no matrix to invert) and it
performs well even with small training set size.
A one-class SVM can also be used for anomaly detection.
It could work better than the multivariate Gaussian model when the data do not follow a Gaussian distribution.
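A short sketch with scikit-learn's OneClassSVM, trained only on normal data. The synthetic data and the parameter values (RBF kernel, nu=0.05) are assumptions chosen for illustration; nu roughly bounds the fraction of training points treated as outliers.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic "normal" training data centred at the origin
rng = np.random.RandomState(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# Hypothetical settings: RBF kernel, about 5% of training points allowed outside
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(X_train)

X_new = np.array([[0.0, 0.0], [8.0, 8.0]])
labels = model.predict(X_new)  # +1 = inlier, -1 = outlier
print(labels)
```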
38
Agenda
1. Introduction
2. Basic Techniques
3. Guides & Tips Building a Classifier
4. Practice:
- Environment
- Examples & exercises
6. References
39
If classification performance is not what we expected, What to work on?
- Get more training examples?
- Try smaller sets of features?
- Try getting additional features?
- Try changing model?
- Try decreasing regularization?
- Try increasing regularization?
Guides & Tips Building Classifiers
40
[Figure: scatter plot of the Iris data; the attributes petal width and petal length provide a moderate separation of the Iris species.]
Data exploration
Manually examine the examples (in the cross-validation set) that your algorithm made errors on.
See if you spot any systematic trend in the type of examples it is making errors on.
Choose good features for your classifier:
- Discrimination ability: values are significantly different for objects of different classes
- Reliability: values are similar for objects of the same class
- Independence: attributes should be uncorrelated; if they are correlated, combine them instead.
E.g. diameter and weight: diameter^3 / weight (scale invariant)
41
Bias-Variance Trade-Off
- Balance between the capacity of the classifier and its ability to generalize
- Plot learning curves to decide whether more data, more features, etc. are likely to help.
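The numbers behind a learning curve can be computed with scikit-learn's learning_curve helper. The Iris data set, the logistic regression model and the chosen training sizes are assumptions made for this sketch; plotting the two mean curves with Matplotlib is left out to keep it short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_iris(return_X_y=True)

# Score the model on growing training subsets, with 5-fold cross-validation
train_sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000),
    X,
    y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=5,
    shuffle=True,
    random_state=0,
)

train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"{int(n):4d} examples: train {tr:.3f}, validation {va:.3f}")
```

A large, persistent gap between the two curves suggests high variance (more data may help); two low curves that have converged suggest high bias (more features or a richer model may help).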
42
Start with a simple algorithm that you can implement quickly.
Implement it and test it on your cross-validation data.
Split the data into 3 different sets: Training + Validation + Test
Accuracy: percentage of correct predictions (SPAM or not) over all predictions
Precision: percentage of the e-mails classified as SPAM that truly are SPAM
$precision = \frac{TP}{TP + FP}$
Recall: percentage of the e-mails that are SPAM that are classified as SPAM
$recall = TPR = \frac{TP}{TP + FN}$
How to compare precision/recall numbers?
F1 Score: $F_1 = 2 \cdot \frac{precision \cdot recall}{precision + recall}$
Model Evaluation
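These evaluation metrics follow directly from the four confusion-matrix counts. The counts in the example are invented for illustration, and returning 0.0 when a denominator is zero is a convention assumed here.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical spam filter: 40 spam caught, 10 false alarms,
# 20 spam missed, 130 legitimate e-mails passed through
m = evaluation_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

F1 combines precision and recall into a single number, which makes two classifiers with different precision/recall trade-offs directly comparable.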
43
Agenda
1. Introduction
2. Basic Techniques
3. Guides & Tips Building a Classifier
4. Practice:
- Environment
- Examples & exercises
6. References
44
Practice: Environment
0) Python:
An interpreted, dynamically typed language
Download:
- Python already installed:
pip install ipython, or with the notebook dependencies: pip install "ipython[notebook]"
- Otherwise:
Anaconda (http://continuum.io/downloads) is a completely free Python distribution (including for commercial use and redistribution). It includes over 195 of the most popular Python packages for science, math, engineering and data analysis.
$ conda info
$ conda install <packageName>
$ conda update <packageName>
45
Practice: Environment
1) IPython:
IPython provides a rich architecture for interactive computing with:
- Powerful interactive shells (terminal and Qt-based).
- A browser-based notebook with support for code, text, mathematical expressions,
inline plots and other rich media.
- Support for interactive data visualization and use of GUI toolkits.
- Flexible, embeddable interpreters to load into your own projects.
- Easy to use, high performance tools for parallel computing.
Start the console: ipython --pylab
46
Practice: Environment
2) Notebook
Web-based interactive computational environment in which you can combine code execution, text, mathematics, plots and rich media into a single document.
Start the notebook server: ipython notebook
(http://127.0.0.1:8888)
Open an existing notebook: ipython notebook <name.ipynb>
The notebook consists of a sequence of cells.
A cell is a multi-line text input field; its contents can be executed with keyboard commands, by clicking the "Play" button, or via Cell | Run in the menu bar.
Commands:
Shift-Enter: runs the cell and goes to the next one
Ctrl-Enter: runs the cell and stays in the same cell
Esc and Enter: switch to command mode and to edit mode, respectively
Tab: auto-complete
47
Practice: Environment
3) NumPy + SciPy
NumPy offers a specific data structure for high-performance numerical computing: the multidimensional array
- Data is stored in a contiguous block of memory in RAM. This makes more efficient use of CPU cycles and cache
- Array operations are implemented internally with C loops rather than Python loops.
NumPy has all the standard array functions, linear algebra, and fancy indexing.
NumPy + SciPy docs: http://docs.scipy.org
4) Matplotlib
Graphical library to plot and visualize your data
5) scikit-learn
Library for machine learning
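A few lines are enough to see the array features listed under item 3 in action: contiguous storage, loop-free arithmetic, fancy indexing and linear algebra. The small array used here is arbitrary.

```python
import numpy as np

# A 2 x 3 array stored in one contiguous block of memory
a = np.arange(6, dtype=float).reshape(2, 3)
print(a.flags["C_CONTIGUOUS"])

# Vectorized operations: no explicit Python loop is needed
col_means = a.mean(axis=0)
centered = a - col_means  # broadcasting subtracts the mean of each column

# Fancy indexing: boolean masks and integer index lists
evens = a[a % 2 == 0]
picked = a[[0, 1], [2, 0]]  # elements (0, 2) and (1, 0)

# Linear algebra: matrix product of a with its transpose
gram = a @ a.T

print(col_means, evens, picked)
print(gram)
```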
48
Agenda
1. Introduction
2. Basic Techniques
3. Guides & Tips Building a Classifier
4. Practice:
- Environment
- Examples & exercises
6. References
49
Practice: Exercises
- An introduction to machine learning with Python and scikit-learn (repo and overview)
by Hannes Schulz and Andreas Mueller.
- PyCon 2014 scikit-learn Tutorial (IPython and machine learning) by Jake VanderPlas
50
Agenda
1. Introduction
2. Basic Techniques
3. Guides & Tips Building a Classifier
4. Practice:
- Environment
- Examples & exercises
6. References
51
References
- Data Mining: Practical Machine Learning Tools and Techniques. I. Witten, E. Frank, et al.
- Introduction to Machine Learning with IPython. LxMLS 2014. A. Mueller
- IPython and machine learning. PyCon '14
- Introduction to Machine Learning. Coursera 2014. A. Ng
- scikit-learn. http://scikit-learn.org (see especially the narrative documentation)
- Matplotlib. http://matplotlib.org (see especially the gallery section)
- IPython. http://ipython.org (also check out http://nbviewer.ipython.org)
- Anaconda. https://store.continuum.io/cshop/anaconda/
- Notebook. http://ipython.org/ipython-doc/stable/notebook/index.html
52
Thanks a lot!!
xrafael@bdigital.org

Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 

Kürzlich hochgeladen (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 

Introduction to conventional machine learning techniques

  • 1. BDIGITAL: After Work Knowledge Program. Practical approach to machine learning techniques for classification and anomaly detection. Xavier Rafael-Palou, xrafael@bdigital.org (12/12/2014)
  • 4. Classic test: natural language processing (communication), knowledge representation (knowledge storage, KS), automated reasoning (use the KS to answer questions) and machine learning (detect patterns, adapt). Total (advanced) Turing test: computer vision (perceive objects) and robotics (manipulate objects and move around). Image: Blade Runner (Ridley Scott, 1982), Deckard and the Voight-Kampff machine in 2019; inspired by Philip K. Dick's book "Do Androids Dream of Electric Sheep?" (1968). Source: "Artificial Intelligence: A Modern Approach" by Stuart Russell & Peter Norvig.
  • 5. Agenda: 1. Introduction (15 min); 2. Basic Techniques (45 min); 3. Guides & Tips Building a Classifier (15 min); 4. Practice: Environment (15 min), Examples & exercises (60 min); 6. References.
  • 6. Introduction. Classification and anomaly detection, but also clustering and regression, are examples of Machine Learning (ML) tasks. ML is a subfield of Artificial Intelligence that aims to give computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959). More formally, a computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E (Tom Mitchell, 1998). Data mining (DM) overlaps in many ways with ML: DM uses many ML methods, but often with the slightly different goal of discovering previously unknown knowledge, while ML aims to perform accurately on new, unseen examples/tasks after having experienced a learning data set.
  • 7. Main ML tasks. Supervised learning: the goal is to learn, from a set of examples, a general rule that maps inputs to outputs. Others: unsupervised learning, where no labels are given and the goal is to discover patterns in the data; reinforcement learning, where the program interacts with a dynamic environment in which it must achieve a certain goal without a teacher; semi-supervised learning, where the teacher gives an incomplete training set with some of the target outputs missing.
  • 8. Classification. Examples: Email: spam / not spam? Online transactions: fraudulent (yes / no)? Tumor: malignant / benign? Variable to predict: y = 0 or 1, where 0 is the "negative class" (e.g., benign tumor) and 1 is the "positive class" (e.g., malignant tumor).
  • 9. Classification. [Figure: binary classification (y = 0 or 1) of tumors as malignant (1, yes) or not (0, no) as a function of tumor size, showing the decision boundary and a possible anomaly.]
  • 11. Multiclass classification. Examples: email foldering/tagging (Work, Friends, Family, Hobby); medical diagnosis (Not ill, Cold, Flu); weather (Sunny, Cloudy, Rain, Snow). [Figure: binary classification vs multi-class classification in the (x1, x2) plane.]
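  • 12. One-vs-all (one-vs-rest). Principle: divide & conquer. One binary classifier is trained per class (class 1, class 2, class 3), each separating its class from the rest. On a new input, output the class whose classifier gives the maximum value. [Figure: a three-class problem in the (x1, x2) plane and the three binary problems derived from it.]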
  • 14. Basic Classification Methods. There are multiple classification techniques: probabilistic, decision trees, linear, instance-based, genetic algorithms, fuzzy logic, … Each of them learns a decision function in a different way.
  • 15. Probabilistic classifiers. Example: "Automatic fruit classification" (pictured: CompacInVision 9000). A random variable y says whether a fruit is M or A. Watching the conveyor belt for some time gives the prior probabilities P(y=M) and P(y=A) ("a priori" knowledge of the harvest), which sum to 1. Classifier: M if P(y=M) >= P(y=A), else A. Is this enough?
  • 16. Probabilistic classifiers. To improve performance we add a new random variable x to the system: x = size grade of the fruit [1, 2, 3, …], which also gives us p(x). Since x depends on the type of fruit, we get the densities of x for each type: p(x | y=A) and p(x | y=M) ("conditional probability densities"). How does size affect our belief about the type of fruit in question? P(y=A | x) = p(x | y=A) P(y=A) / p(x) and P(y=M | x) = p(x | y=M) P(y=M) / p(x). Naive Bayes: A if P(y=A | x) >= P(y=M | x), else M ("a posteriori" probabilities).
  • 17. Probabilistic classifiers. Pros: simple to implement; fast to compute (e.g. fits the map & reduce paradigm); works surprisingly well; compatible with missing data; used in text mining (Multinomial Naive Bayes). Cons: unrealistic hypothesis (all features are equally important and independent of one another given the class); in practice there are dependencies among features (i.e. recall that all are assumed to have the same power); a zero probability holds a veto over the other ones; requires processing all the data.
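The Bayes rule from slide 16 can be sketched in a few lines of plain Python. This is only an illustration: the observations, the size grades and the smoothing constant `alpha` are assumptions made up for the example, not data from the talk. The additive smoothing is one common way to soften the "zero probability veto" listed among the cons.

```python
from collections import Counter

SIZES = [1, 2, 3]  # hypothetical size grades

# Hypothetical observations from the conveyor belt: (size grade, fruit type).
samples = [(1, "M")] * 3 + [(2, "M")] * 2 + [(2, "A")] + [(3, "A")] * 2


def train_bayes(samples, sizes, alpha=1.0):
    """Estimate the priors P(y) and the smoothed likelihoods p(x | y)."""
    label_counts = Counter(label for _, label in samples)
    pair_counts = Counter(samples)
    priors = {y: n / len(samples) for y, n in label_counts.items()}
    likelihoods = {
        # Additive smoothing keeps every likelihood strictly positive.
        y: {x: (pair_counts[(x, y)] + alpha) / (n + alpha * len(sizes)) for x in sizes}
        for y, n in label_counts.items()
    }
    return priors, likelihoods


def posterior(x, priors, likelihoods):
    """P(y | x) = p(x | y) P(y) / p(x) for every class y."""
    joint = {y: likelihoods[y][x] * priors[y] for y in priors}
    evidence = sum(joint.values())  # p(x)
    return {y: value / evidence for y, value in joint.items()}


def classify(x, priors, likelihoods):
    """Pick the class with the highest a posteriori probability."""
    post = posterior(x, priors, likelihoods)
    return max(post, key=post.get)


priors, likelihoods = train_bayes(samples, SIZES)
for size in SIZES:
    print(size, classify(size, priors, likelihoods), posterior(size, priors, likelihoods))
```

With a single feature this is just the Bayes decision rule; the "naive" independence assumption only comes into play when several features are multiplied together.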
  • 18. Decision Tree Learning. Widely used because the knowledge it produces is easy to understand. A tree is a set of conditions (nodes) organized hierarchically. Prediction: pass a new, unseen instance from the root down to a leaf of the tree.
  • 19. Decision Tree Learning. Example training set:

        Tid  Refund  Marital Status  Taxable Income  Cheat
        1    Yes     Single          125K            No
        2    No      Married         100K            No
        3    No      Single          70K             No
        4    Yes     Married         120K            No
        5    No      Divorced        95K             Yes
        6    No      Married         60K             No
        7    Yes     Divorced        220K            No
        8    No      Single          85K             Yes
        9    No      Married         75K             No
        10   No      Single          90K             Yes

    Training: greedy strategy. Split the records based on an attribute test that optimizes a certain criterion. The tree is built recursively, adding conditions until each leaf contains elements of the same kind. Two issues: the partitioning strategy (best attribute, best condition; an NP problem) and determining when to stop. [Figure: the tree grown step by step. Final tree: Refund = Yes → Don't Cheat; Refund = No → Marital Status; Married → Don't Cheat; Single, Divorced → Taxable Income; < 80K → Don't Cheat; >= 80K → Cheat.]
  • 20. Decision Tree Learning. Partitioning strategy: prefer attributes that generate homogeneous subsets. A non-homogeneous node has a high degree of impurity; a homogeneous node has a low degree of impurity. Strategy examples, where p(j | t) is the relative frequency of class j at node t: Gini index, $GINI(t) = 1 - \sum_j [p(j \mid t)]^2$, measures the homogeneity of a node (used in CART, SLIQ, SPRINT); classification error, $Error(t) = 1 - \max_j P(j \mid t)$, measures the misclassification error made by a node; information gain, $GAIN_{split} = Entropy(p) - \sum_{i=1}^{k} \frac{n_i}{n} Entropy(i)$, where a parent node p with n records is split into k partitions and n_i is the number of records in partition i; choose the split that achieves the largest reduction (e.g. ID3, C4.5).
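As a rough illustration, the three impurity measures from slide 20 can be evaluated on the "Cheat" table from slide 19. The class counts below are read off that table as [Yes, No]; the function names are chosen for this sketch and do not come from any particular library.

```python
from math import log2


def gini(counts):
    """GINI(t) = 1 - sum_j p(j|t)^2, from the class counts at node t."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def classification_error(counts):
    """Error(t) = 1 - max_j p(j|t)."""
    n = sum(counts)
    return 1.0 - max(counts) / n


def entropy(counts):
    """Entropy(t) = -sum_j p(j|t) log2 p(j|t); empty classes contribute 0."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Class counts [Cheat = Yes, Cheat = No] taken from the table on slide 19.
root = [3, 7]
split_on_refund = [[0, 3], [3, 4]]           # Refund = Yes, Refund = No
split_on_marital = [[2, 2], [0, 4], [1, 1]]  # Single, Married, Divorced

print("Gini(root)  =", gini(root))
print("Error(root) =", classification_error(root))
print("Gain(Refund)         =", information_gain(root, split_on_refund))
print("Gain(Marital Status) =", information_gain(root, split_on_marital))
```

A greedy tree builder would evaluate such a score for every candidate split at a node and keep the best one, then recurse on the resulting partitions.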
  • 21. Instance-Based Learning. Based on the principle that the instances within a dataset will generally exist in close proximity to other instances that have similar properties. kNN (Cover and Hart, 1967) locates the k nearest instances to the query instance and determines its class by identifying the single most frequent class label among them. Instances can be considered as points within an n-dimensional instance space, where each of the n dimensions corresponds to one of the n features. A good distance metric should minimize the distance between two similarly classified instances while maximizing the distance between instances of different classes.
  • 22. Instance-Based Learning. Distance metrics: Euclidean, Mahalanobis, Manhattan, … To determine the class given the neighbour list we can use, e.g., majority voting or votes weighted according to distance (1/d^2).
  • 23. Instance-Based Learning. Pros: low computational cost during training (lazy learning). Cons: slow classification; requires storing large amounts of information; sensitive to the choice of the similarity method; no clear criterion for selecting k.
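A minimal kNN sketch, assuming Euclidean distance and either majority voting or 1/d^2 weighting as described on slide 22. The training points and labels are invented for the example, and `math.dist` requires Python 3.8 or later.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train, query, k=3, weighted=False):
    """Classify `query` from its k nearest neighbours in `train`.

    train: list of (point, label) pairs, where point is a tuple of floats.
    weighted=False -> majority voting; weighted=True -> votes weighted by 1/d^2.
    """
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter()
    for point, label in neighbours:
        d = dist(point, query)
        # The small constant avoids a division by zero for exact matches.
        votes[label] += 1.0 / (d * d + 1e-12) if weighted else 1.0
    return votes.most_common(1)[0][0]


# Hypothetical two-class training set in a two-dimensional feature space.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((4.0, 4.2), "b"), ((4.1, 3.9), "b"), ((3.8, 4.0), "b"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.0, 4.0), weighted=True))
```

Note that there is no training step at all (lazy learning): every prediction sorts the whole stored training set, which is exactly the slow-classification and storage cost listed among the cons.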
  • 24. Probabilistic Statistical Classification. The idea is to obtain a function h(x) (of the parameters and attributes) that partitions the data into the desired output classes. The principal objective is to find h(x). [Figure: decision boundary in the (x1, x2) plane; the predicted class depends on which side of the boundary x falls.]
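  • 25. Probabilistic Statistical Classification. We predict one class or the other depending on the value of h(x). The expected values of h(x) lie between 0 and 1, so we need to transform h(x) to accommodate this behavior with the sigmoid function, g(z) = 1 / (1 + e^(-z)). Logistic regression: replace z with the linear combination of parameters and attributes. [Figure: sigmoid curve g(z), approaching 1 for large z.]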
  • 26. Logistic Regression. How to choose the parameters? Those that minimize the error (cost). Cost function: the further our hypothesis is from y, the larger the cost function output; if our hypothesis is equal to y, then the cost is 0. Gradient descent: a method to find a local minimum of the cost. [Figure: cost for y = 1 plotted over h(x) between 0 and 1.]
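The sketch below fits a one-feature logistic regression with batch gradient descent. It assumes the usual cross-entropy cost, which matches the description on slide 26 (zero when the hypothesis equals y, growing as it moves away); the tumor-size data, the learning rate `alpha` and the number of steps are made-up values for illustration only.

```python
from math import exp, log


def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)), which maps any real z into (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_proba(theta, x):
    """h(x) = g(theta0 + theta1 * x), read as the probability that y = 1."""
    return sigmoid(theta[0] + theta[1] * x)


def cost(theta, xs, ys):
    """Average cross-entropy cost over the data set."""
    tiny = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        h = predict_proba(theta, x)
        total += -y * log(h + tiny) - (1 - y) * log(1 - h + tiny)
    return total / len(xs)


def gradient_descent(xs, ys, alpha=0.1, steps=5000):
    """Repeatedly step against the gradient of the cost."""
    theta = [0.0, 0.0]
    m = len(xs)
    for _ in range(steps):
        errors = [predict_proba(theta, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta = [theta[0] - alpha * grad0, theta[1] - alpha * grad1]
    return theta


# Hypothetical tumor sizes and labels (1 = malignant); not perfectly separable.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

theta = gradient_descent(xs, ys)
print("cost before:", cost([0.0, 0.0], xs, ys), "cost after:", cost(theta, xs, ys))
print("P(y=1 | x=1) =", predict_proba(theta, 1), " P(y=1 | x=8) =", predict_proba(theta, 8))
```

The learning rate is deliberately small here; with unscaled features a larger value can make the updates overshoot and diverge.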
  • 27. Comparative Classification Methods: logistic regression vs SVM vs neural networks. If N (features) is large: prefer logistic regression or an SVM without a kernel (the "linear kernel"). If N is small and M (instances) is intermediate: prefer an SVM with a Gaussian kernel. If N is small and M is large: manually create/add more features, then use logistic regression or an SVM without a kernel. Neural networks are likely to work well in any of these situations, but may be slower to train.
  • 29. Comparative Classification Methods. Source: S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica 31 (2007) 249–268.
  • 30. Anomaly Detection. Anomalous behaviors: fraud detection; manufacturing (e.g. aircraft engines); monitoring machines in a data center. Classification: email spam classification; weather prediction (sunny/rainy/etc.); cancer classification.
  • 31. Anomaly detection vs classification. Anomaly detection: very small number of positive examples (y=1; 0-20 is common) and a large number of negative (y=0) examples; many different "types" of anomalies, so it is hard for any algorithm to learn from the positive examples what the anomalies look like; future anomalies may look nothing like any of the anomalous examples seen so far. Classification: large number of positive and negative examples; enough positive examples for the algorithm to get a sense of what positive examples are like; future positive examples are likely to be similar to the ones in the training set.
  • 32. Anomaly Detection Methods. Given a new example, we want to know whether it is abnormal/anomalous. We define a "model" p(x) that gives the probability that the example is not anomalous. We use a threshold ϵ (epsilon) as a dividing line to say which examples are anomalous and which are not. If our anomaly detector is flagging too many examples as anomalous, we need to decrease the threshold ϵ.
  • 33. Gaussian Distribution Method. The Gaussian distribution is the familiar bell-shaped curve, described by a function N(μ, σ^2). Mu, or μ, describes the centre of the curve, called the mean. The width of the curve is described by sigma, or σ, called the standard deviation. The parameter μ is the average of all the examples: $\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$. We can estimate σ^2 with our familiar squared error formula: $\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu)^2$.
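  • 34. Gaussian Distribution Method. Given a training set of examples {x(1), …, x(m)}, where each example is a vector x ∈ R^n, we make an "independence assumption" on the values of the features inside a training example x: $p(x) = p(x_1; \mu_1, \sigma_1^2)\, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2)$. More compactly, the above expression can be written as $p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2)$. Anomaly if p(x) < ϵ.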
  • 35. Gaussian Distribution Method. Fit the model on the training set. On a cross-validation/test example, predict x as anomalous (y = 1) if p(x) < ϵ and as normal (y = 0) otherwise. Possible evaluation metrics: true positives, false positives, false negatives, true negatives; precision/recall; F1-score. Tricks: choose features that might take on unusually large or small values in the event of an anomaly; use the cross-validation set to choose the threshold parameter; train only on normal data; add the anomalies to the validation and test sets (50% each).
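The per-feature Gaussian method of slides 33 to 35 can be sketched as follows. The training examples and the threshold `epsilon` are hypothetical; in practice the threshold would be picked on a cross-validation set rather than hard-coded.

```python
from math import exp, pi, sqrt


def fit_gaussian(examples):
    """Estimate the mean and variance of every feature (training set = normal data)."""
    m = len(examples)
    n = len(examples[0])
    mu = [sum(x[j] for x in examples) / m for j in range(n)]
    var = [sum((x[j] - mu[j]) ** 2 for x in examples) / m for j in range(n)]
    return mu, var


def density(x, mu, var):
    """p(x) = product over the features of the univariate Gaussian densities."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= exp(-((xj - mj) ** 2) / (2 * vj)) / sqrt(2 * pi * vj)
    return p


def is_anomaly(x, mu, var, epsilon):
    """Flag x as anomalous when p(x) < epsilon."""
    return density(x, mu, var) < epsilon


# Hypothetical "normal" measurements with two features each.
normal = [(10.0, 5.0), (10.5, 5.2), (9.5, 4.8), (10.2, 5.1), (9.8, 4.9)]
mu, var = fit_gaussian(normal)
epsilon = 0.01  # assumed threshold, for illustration only

for candidate in [(10.1, 5.0), (14.0, 5.0)]:
    print(candidate, "anomaly" if is_anomaly(candidate, mu, var, epsilon) else "normal")
```

Since p(x) is a probability density, its value can exceed 1 for tightly clustered data; only the comparison against the threshold matters.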
  • 36. Multivariate Gaussian Distribution. An extension of the previous anomaly detection method that may (or may not) catch more anomalies. Instead of modelling p(x1), p(x2), … separately, we model p(x) all in one go. The parameters are μ ∈ R^n and Σ ∈ R^(n×n). Varying Σ changes the shape, width and orientation of the contours; changing μ moves the centre of the distribution. Anomaly if p(x) < ϵ.
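To see what the full covariance matrix Σ adds, the sketch below fits a two-dimensional multivariate Gaussian and uses the closed-form inverse of a 2×2 matrix, so no linear algebra library is needed. It is restricted to n = 2 for brevity, and the strongly correlated data points are invented for the example.

```python
from math import exp, pi, sqrt


def fit_mvn_2d(examples):
    """Estimate the mean vector and the 2x2 covariance matrix."""
    m = len(examples)
    mu = [sum(x[0] for x in examples) / m, sum(x[1] for x in examples) / m]
    s00 = sum((x[0] - mu[0]) ** 2 for x in examples) / m
    s11 = sum((x[1] - mu[1]) ** 2 for x in examples) / m
    s01 = sum((x[0] - mu[0]) * (x[1] - mu[1]) for x in examples) / m
    return mu, [[s00, s01], [s01, s11]]


def mvn_density_2d(x, mu, sigma):
    """Density of N(mu, Sigma) at x, for the two-dimensional case only."""
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] ** 2
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form d^T Sigma^-1 d, using the closed-form 2x2 inverse.
    q = (sigma[1][1] * dx * dx - 2 * sigma[0][1] * dx * dy + sigma[0][0] * dy * dy) / det
    return exp(-0.5 * q) / (2 * pi * sqrt(det))


# Hypothetical normal data in which the two features rise together.
normal = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1), (2.5, 2.4), (3.5, 3.6)]
mu, sigma = fit_mvn_2d(normal)

on_trend = (4.0, 4.0)   # both features large, consistent with the correlation
off_trend = (2.0, 4.0)  # each value is ordinary on its own, the combination is not
print("p(on trend)  =", mvn_density_2d(on_trend, mu, sigma))
print("p(off trend) =", mvn_density_2d(off_trend, mu, sigma))
```

The off-trend point has unremarkable values feature by feature, so a model that treats the features independently would not single it out; the covariance term is what makes its density collapse.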
  • 37. One-class SVM. The multivariate Gaussian model can automatically capture correlations between different features of x. However, the original model is computationally cheaper (no matrix to invert) and performs well even with a small training set. A one-class SVM can also be used for anomaly detection; it could work better than the multivariate Gaussian when the data does not follow a Gaussian distribution.
  • 39. Guides & Tips Building Classifiers. If classification performance is not what we expected, what should we work on? Get more training examples? Try smaller sets of features? Try getting additional features? Try changing the model? Try decreasing regularization? Try increasing regularization?
  • 40. Data exploration. Example: the attributes petal width and petal length provide a moderate separation of the Iris species. Manually examine the examples (in the cross-validation set) that your algorithm made errors on, and see if you spot any systematic trend in the type of examples it gets wrong. Arrange good features for your classifier: discrimination ability (values significantly different for objects of different classes); reliability (similar values for objects of the same class); independence (attributes should be uncorrelated; otherwise combine them, e.g. diameter and weight into diameter^3 / weight, which is scale invariant).
  • 41. Bias-Variance Trade-Off. Balance the capacity of the classifier against its ability to generalize. Plot learning curves to decide whether more data, more features, etc. are likely to help.
  • 42. Model Evaluation. Start with a simple algorithm that you can implement quickly. Implement it and test it on your cross-validation data. Split the data into 3 different sets: training + validation + test. Accuracy: percentage of correct predictions (SPAM or not) over all predictions. Precision: percentage of e-mails classified as SPAM that truly are SPAM, $precision = \frac{TP}{TP + FP}$. Recall: percentage of the e-mails that truly are SPAM that are classified as SPAM, $recall = TPR = \frac{TP}{TP + FN}$. How to compare precision/recall numbers? F1 score: $F_1 = 2 \cdot \frac{precision \cdot recall}{precision + recall}$.
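The evaluation metrics from slide 42 are easy to compute by hand, which helps when checking what a library reports. The labels and predictions below are invented for the example (1 = SPAM, 0 = not SPAM), and the helper names are chosen for this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 computed from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and classifier output.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

print(confusion_counts(y_true, y_pred))
print(evaluate(y_true, y_pred))
```

On heavily skewed data, such as the anomaly detection setting, accuracy alone is misleading because always predicting the majority class already scores high; precision, recall and F1 expose that failure.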
  • 44. Practice: Environment. 0) Python: an interpreted, dynamically typed language. Download: if Python is already installed, run pip install ipython, or pip install "ipython[notebook]" to install only the notebook dependencies. Otherwise, use Anaconda (http://continuum.io/downloads), a completely free Python distribution (including for commercial use and redistribution) that includes over 195 of the most popular Python packages for science, math, engineering and data analysis. Commands: $ conda info; $ conda install <packageName>; $ conda update <packageName>
  • 45. Practice: Environment. 1) IPython. IPython provides a rich architecture for interactive computing with: powerful interactive shells (terminal and Qt-based); a browser-based notebook with support for code, text, mathematical expressions, inline plots and other rich media; support for interactive data visualization and use of GUI toolkits; flexible, embeddable interpreters to load into your own projects; easy-to-use, high-performance tools for parallel computing. Start the console: ipython --pylab
  • 46. Practice: Environment. 2) Notebook. A web-based interactive computational environment in which to combine code execution, text, mathematics, plots and rich media into a single document. Start the notebook server: ipython notebook (http://127.0.0.1:8888). Open an existing notebook: ipython notebook <name.ipynb>. The notebook consists of a sequence of cells. A cell is a multi-line text input field, and its contents can be executed with keyboard commands, by clicking the "Play" button, or via Cell | Run in the menu bar. Commands: Shift-Enter runs the cell and goes to the next one; Ctrl-Enter runs the cell and stays in the same cell; Esc and Enter switch to command mode and edit mode; Tab auto-completes.
  • 47. Practice: Environment. 3) NumPy + SciPy. NumPy offers a specific data structure for high-performance numerical computing: the multidimensional array. Data is stored in a contiguous block of memory in RAM, which makes more efficient use of CPU cycles and cache. Array operations are implemented internally with C loops rather than Python loops. NumPy has all the standard array functions, linear algebra and fancy indexing. NumPy + SciPy docs: http://docs.scipy.org. 4) Matplotlib. Graphical library to plot and visualize your data. 5) Scikit-Learn. Library for machine learning.
  • 49. Practice: Exercises. An introduction to machine learning with Python and scikit-learn (repo and overview) by Hannes Schulz and Andreas Mueller. PyCon 2014 Scikit-learn Tutorial (IPython and machine learning) by Jake VanderPlas.
  • 51. References. Data Mining: Practical Machine Learning Tools and Techniques. I. H. Witten, E. Frank, et al. Introduction to Machine learning with IPython. LxMLS 2014. A. Mueller. IPython and machine learning. PyCon '14. Introduction to Machine learning. Coursera 2014. A. Ng. scikit-learn: http://scikit-learn.org (see especially the narrative documentation). Matplotlib: http://matplotlib.org (see especially the gallery section). IPython: http://ipython.org (also check out http://nbviewer.ipython.org). Anaconda: https://store.continuum.io/cshop/anaconda/. Notebook: http://ipython.org/ipython-doc/stable/notebook/index.html