3. Today let’s try to demystify machine learning
Focus less on glorifying machine learning and
more on the technical details
Demystify and Glorify
4. The slides and the talk solely represent the speakers’
personal views
5. In its general form, Machine Learning means
teaching a computational machine how to solve a
problem by giving it examples
Machine learning algorithms then automatically
infer rules that associate inputs with the corresponding
outputs
Machine Learning
8. Real Example: Character Recognition
Letters have predefined shapes: we can measure and quantify
their relative proportions
9. In the real world, rules are hard to formulate
Even if we had a comprehensive set of rules, it would
be hard to scale them and make them robust across different
writing styles
Handwritten Character Recognition in
the Real World
10. [Diagram: Input + Output → Machine Learning → Learned Program]
Machine Learning: Inputs and Outputs
We will focus on Supervised Learning i.e.,
learning associations between Inputs and
Outputs
There exist other types of learning:
Unsupervised: learning associations
among inputs only
Semi-supervised: learning from few
outputs and a large amount of inputs
Reinforcement: giving rewards for good
associations
11. Inputs are generally represented by features:
characteristic and meaningful measures computed on raw data
they provide domain information from human to machine
they make the learning process easier
Machine Learning: Inputs
12. Signal processing: from sound we can extract frequency, maximum
amplitude, power spectrum, etc.
Probability and statistics: from text we can compute probability
distributions of common words, word co-occurrences, etc.
Nevertheless, many inputs already come structured in feature (numeric)
format
For example: customer information such as income range and payment
dues are possible features for credit-risk profiling.
What are Features?
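As a minimal sketch of the credit-risk example above (the field names and encodings here are hypothetical, not from the talk), feature extraction just maps a raw record to a numeric vector:

```python
# Minimal sketch: turning a raw customer record into numeric features
# for credit-risk profiling. Field names and encodings are made up.

def extract_features(customer):
    """Map a raw customer dict to a numeric feature vector."""
    income_buckets = {"low": 0, "medium": 1, "high": 2}
    return [
        income_buckets[customer["income_range"]],  # ordinal encoding
        customer["payment_dues"],                  # already numeric
        1 if customer["has_mortgage"] else 0,      # boolean -> 0/1
    ]

raw = {"income_range": "medium", "payment_dues": 2, "has_mortgage": True}
print(extract_features(raw))  # [1, 2, 1]
```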
14. What Learning looks like in a Feature
space
Using features, we can use linear algebra as the main tool for the learning process
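To illustrate the point (a toy sketch with made-up data, not the talk's example): once inputs are feature vectors, the data set is a matrix and fitting a model is a linear-algebra problem, e.g. least squares:

```python
import numpy as np

# Toy sketch: inputs as a feature matrix X, outputs as a vector y,
# and learning as solving a least-squares problem. Data is made up.

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])         # desired outputs

Xb = np.hstack([X, np.ones((len(X), 1))])  # add a bias column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None) # solve min ||Xb w - y||^2

preds = (Xb @ w > 0.5).astype(int)         # threshold the linear scores
print(preds)  # [0 0 1 1]
```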
15. [Diagram: Inputs and Outputs feed a Machine Learning Model built on Signal Processing, Probability and Statistics, and Linear Algebra; a New Input yields a Predicted Output]
…the final ingredients
23. Non-linear SVMs: The Kernel Trick
Φ: x → φ(x)
The feature space can always be mapped to some higher-dimensional
space where the training set is linearly separable
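A minimal illustration of the idea behind Φ (a hand-picked feature map on XOR-labeled toy data, not an actual kernelized SVM): the four XOR points are not linearly separable in 2D, but adding the product feature x1·x2 makes them separable in 3D:

```python
import numpy as np

# XOR data: not linearly separable in the original 2D space.
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([0, 0, 1, 1])  # class labels

def phi(x):
    """Map to a 3D space by adding the product feature x1*x2."""
    return np.array([x[0], x[1], x[0] * x[1]])

# A separating hyperplane in the mapped space (found by hand).
w = np.array([-1.0, -1.0, 3.0])
b = 0.5

scores = np.array([phi(x) @ w + b for x in X])
preds = (scores < 0).astype(int)  # negative side -> class 1
print(preds)  # [0 0 1 1]
```

Note that a real kernel SVM never computes φ(x) explicitly; it only evaluates inner products K(x, x') = φ(x)·φ(x'), which is what makes very high-dimensional mappings affordable.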
33. [Diagram: original training data D → Step 1: create multiple data sets D1, D2, …, Dt-1, Dt → Step 2: build multiple classifiers C1, C2, …, Ct-1, Ct → Step 3: combine classifiers into C*]
Ensemble Methods: General Idea
• Construct a set of classifiers from the training data
• Predict class label of previously unseen records by aggregating
predictions made by multiple classifiers
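The three steps above can be sketched with bagging (bootstrap aggregating) over decision stumps; a toy implementation on made-up 1D data, using only the standard library:

```python
import random

# Toy bagging sketch: Step 1 resample the data, Step 2 train one weak
# classifier per sample, Step 3 combine them by majority vote.
# The weak learner is a 1D decision stump; the data is made up.

random.seed(0)
data = [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]

def train_stump(sample):
    """Pick the threshold that best splits the bootstrap sample."""
    best_t, best_err = 0.0, len(sample) + 1
    for t in [x for x, _ in sample]:
        err = sum((x > t) != bool(label) for x, label in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Step 1 + Step 2: bootstrap samples -> one stump each
stumps = []
for _ in range(25):
    sample = [random.choice(data) for _ in data]
    stumps.append(train_stump(sample))

# Step 3: majority vote of the 25 stumps
def predict(x):
    votes = sum(x > t for t in stumps)
    return int(votes > len(stumps) / 2)

print([predict(x) for x, _ in data])
```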
34. • Suppose there are 25 base classifiers
• Each classifier has error rate ε = 0.35
• Assume the classifiers are independent
• The majority vote is wrong only when at least 13 of the 25
classifiers are wrong, so the probability that the ensemble
makes a wrong prediction is:
Σ_{i=13}^{25} C(25, i) ε^i (1 − ε)^(25−i) ≈ 0.06
Why does it work?
This is the only formula in this presentation.
Enjoy it!!
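The sum above can be checked numerically with the standard library:

```python
from math import comb

# Probability that a majority (>= 13 of 25) of independent base
# classifiers, each with error rate eps = 0.35, are wrong.
eps = 0.35
p_wrong = sum(comb(25, i) * eps**i * (1 - eps)**(25 - i)
              for i in range(13, 26))
print(round(p_wrong, 2))  # 0.06
```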
36. Deep Learning: the new Frontier
Brief history of Neural Nets: born in the 70s, developed in the 80s, dead
in the 90s, forgotten in the 2000s, state of the art in the 2010s
38. Real-world problems: Unpredictable size
The volume and velocity of the input data can be very large
For example:
a single wearable device streams out more than a million
messages per day
Scalable Deployment