This slide deck gives a brief overview of supervised, unsupervised and reinforcement learning. The algorithms discussed are Naive Bayes, K-nearest neighbour, SVM, decision trees and the Markov model.
It also covers the difference between regression and classification, the difference between supervised and reinforcement learning, the iterative functioning of the Markov model, and machine learning applications.
1. Artificial Intelligence [ECS 801] Presentation
Subject Professor: Dr Y.N. Singh
Topic: Introduction to Machine Learning
Presented by:
Akshay Kanchan (1205210006)
Mohd Iqbal (1305210903)
Institute of Engineering and Technology
Lucknow
2. In Artificial Intelligence, an intelligent machine
should be able to:
1. Think and act rationally
2. Store and retrieve knowledge
3. Adapt and learn in new environments and
with new data (Machine Learning)
3. "Field of study that gives computers the
ability to learn without being explicitly
programmed." (attributed to Arthur Samuel, 1959)
What is machine learning?
5. - autonomous, self-driving cars
- predicting election results
- developing pharmaceutical drugs (combinatorial chemistry)
- predicting tastes in music (Pandora)
- predicting tastes in movies/shows (Netflix)
- search engines (Google)
- predicting interests (Facebook)
- predicting other books you might like (Amazon)
Where is Machine Learning being used?
8. • 1950: Alan Turing proposed the “Turing Test” to determine whether a computer has
real intelligence.
• 1952: Arthur Samuel wrote the first computer learning program, which played
the game of checkers.
• 1957: Frank Rosenblatt designed the first neural network for computers (the perceptron).
• 1967: The “nearest neighbour” algorithm was written, allowing computers to
begin using very basic pattern recognition.
• 1979: Students at Stanford University invented the “Stanford Cart”, which could
navigate obstacles in a room on its own.
Brief History
9. • 1990s: Work on machine learning shifted from a knowledge-driven
approach to a data-driven approach. Scientists began creating programs
for computers to analyze large amounts of data and draw conclusions
(or “learn”) from the results.
• 2000: Honda introduced ASIMO, a humanoid robot it designed and developed.
• 2016: Google's AlphaGo program beat professional Go champion Lee Sedol
by 4 games to 1.
10. In Machine Learning, a computer program is
said to learn from experience E with respect to
some class of tasks T and performance measure P if its
performance at tasks in T, as measured by P,
improves with experience E.
Formal Definition
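The definition can be made concrete with a toy sketch. Everything in it is a hypothetical illustration rather than material from the slides: the task T is labelling the integers 0 to 99 as 1 when they are at or above a hidden boundary of 30, the experience E is a list of labelled examples, and the performance measure P is accuracy over all 100 integers. The learner, learn_threshold(), is an assumed and deliberately simple rule.

```python
def learn_threshold(examples):
    """Learn a cut-off from labelled (value, label) pairs (experience E).

    The cut-off is the midpoint between the largest value labelled 0
    and the smallest value labelled 1.
    """
    largest_negative = max(value for value, label in examples if label == 0)
    smallest_positive = min(value for value, label in examples if label == 1)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold, true_boundary=30):
    """Performance measure P: fraction of the integers 0..99 labelled correctly."""
    correct = 0
    for value in range(100):
        predicted = 1 if value >= threshold else 0
        actual = 1 if value >= true_boundary else 0
        correct += predicted == actual
    return correct / 100


# Hypothetical experience: two examples first, then two more near the boundary.
little_experience = [(10, 0), (90, 1)]
more_experience = little_experience + [(25, 0), (35, 1)]

accuracy_before = accuracy(learn_threshold(little_experience))
accuracy_after = accuracy(learn_threshold(more_experience))
print(accuracy_before, accuracy_after)
```

With the extra examples the learned cut-off moves from 50 to 30, so P improves as E grows, which is what the definition asks for.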
11. Why is Machine Learning Important?
•Some tasks cannot be defined well, except by examples
(e.g., recognizing people).
•Relationships and correlations can be hidden within
large amounts of data. Machine Learning may be
able to find these relationships.
12. Areas of Influence for Machine Learning
•Statistics: How best to use samples drawn from
unknown probability distributions to help decide
from which distribution some new sample is drawn.
•Psychology: How to model human performance on
various learning tasks?
•Economics: How to write algorithms to maximize
profits.
•Neural/Brain Models: How to model certain aspects
of biological neural networks to improve the performance
of computer programs?
13. • Prepare Data
Remove noise, smooth the data, extract features, reduce dimensionality.
• Choose an Algorithm
Consider linear vs. non-linear models, complexity, speed and accuracy.
• Train a Model
Prevent overfitting and underfitting.
• Test the Model
• Use for Prediction
Steps involved in Learning:
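The five steps can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions: the dataset, the class names "small" and "large", the min-max scaling used for data preparation and the choice of a 1-nearest-neighbour classifier are all hypothetical and chosen only to keep the example short.

```python
import math
import random

# Hypothetical labelled data: (feature_1, feature_2) -> class label.
DATA = [
    ((1.0, 200.0), "small"), ((1.2, 220.0), "small"), ((0.9, 180.0), "small"),
    ((1.1, 210.0), "small"), ((1.3, 190.0), "small"), ((1.0, 230.0), "small"),
    ((3.0, 800.0), "large"), ((3.2, 760.0), "large"), ((2.9, 820.0), "large"),
    ((3.1, 780.0), "large"), ((3.3, 790.0), "large"), ((2.8, 810.0), "large"),
]


def fit_min_max_scaler(rows):
    """Step 1 (prepare data): learn per-feature ranges, return a rescaling function."""
    lows = [min(column) for column in zip(*rows)]
    highs = [max(column) for column in zip(*rows)]

    def scale(row):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, lows, highs)]

    return scale


def predict(train_x, train_y, query):
    """Step 2 (choose an algorithm): 1-nearest-neighbour, picked here for brevity."""
    distances = [math.dist(x, query) for x in train_x]
    return train_y[distances.index(min(distances))]


# Hold out a quarter of the data so the model is tested on unseen samples.
shuffled = DATA[:]
random.Random(0).shuffle(shuffled)
split = int(0.75 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]

# Step 3 (train): for nearest neighbour, training just stores the scaled samples.
scale = fit_min_max_scaler([x for x, _ in train])
train_x = [scale(x) for x, _ in train]
train_y = [y for _, y in train]

# Step 4 (test): measure accuracy on the held-out samples.
hits = sum(predict(train_x, train_y, scale(x)) == y for x, y in test)
test_accuracy = hits / len(test)

# Step 5 (predict): label a new, unseen sample.
new_label = predict(train_x, train_y, scale((1.05, 205.0)))
print(test_accuracy, new_label)
```

The scaler is fitted on the training split only, so no information from the test samples leaks into training.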
23. Naïve Bayes
• Naive Bayes methods are a set
of supervised learning
algorithms based on applying
Bayes’ theorem with the “naive”
assumption that every pair of
features is conditionally
independent given the class.
• They use a probabilistic approach
to assign a label to the data.
• Classification combines the prior
probability of each class with the
likelihood of the observed features
(normalized by the evidence) to
obtain the posterior probability;
the class with the highest
posterior is chosen.
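A minimal sketch of a categorical Naive Bayes classifier shows how prior, likelihood, evidence and posterior fit together. The weather data, the function names and the use of add-one (Laplace) smoothing are assumptions made for illustration; they do not come from the slides.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Count how often each class and each (class, feature, value) occurs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (label, feature index) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for features, label in zip(samples, labels):
        for index, value in enumerate(features):
            value_counts[(label, index)][value] += 1
            feature_values[index].add(value)
    return class_counts, value_counts, feature_values


def posterior(model, features):
    """Return P(class | features) for every class."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        log_score = math.log(count / total)  # prior P(class)
        for index, value in enumerate(features):
            # Likelihood P(value | class) with add-one smoothing, multiplied
            # across features because of the independence assumption.
            seen = value_counts[(label, index)][value]
            log_score += math.log((seen + 1) / (count + len(feature_values[index])))
        log_scores[label] = log_score
    # Dividing by the evidence P(features) turns the scores into posteriors.
    evidence = sum(math.exp(score) for score in log_scores.values())
    return {label: math.exp(score) / evidence for label, score in log_scores.items()}


# Hypothetical training data: (outlook, wind) -> play outside?
samples = [("sunny", "calm"), ("sunny", "calm"), ("overcast", "calm"), ("sunny", "windy"),
           ("rainy", "windy"), ("rainy", "windy"), ("overcast", "windy"), ("rainy", "calm")]
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]

model = train_naive_bayes(samples, labels)
probabilities = posterior(model, ("sunny", "calm"))
predicted = max(probabilities, key=probabilities.get)
print(predicted, probabilities)
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied together.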
24. Support Vector Machine
Support vector machines (SVMs) are a set of supervised learning
methods used for classification, regression and outlier
detection.
• Effective in high-dimensional spaces.
• Still effective in cases where the number of dimensions is greater
than the number of samples.
• Uses a subset of training points in the decision function (called
support vectors), so it is also memory efficient.
• Versatile: different Kernel functions can be specified for the
decision function. Common kernels are provided, but it is also
possible to specify custom kernels.
25. SVM (cont’d)
• SVMs try to maximize the margin of the separating hyperplane.
• SVMs use kernel functions that take a low-dimensional input
space and map it to a higher-dimensional space, for example
(X, Y) → kernel → (X1, X2, X3).
• An SVM is configured through parameters such as gamma, C and the
kernel.
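The mapping from two input dimensions to three can be illustrated with a degree-2 polynomial kernel. This is a sketch under an assumption: the slide does not say which kernel it has in mind, so the feature map (x, y) to (x², √2·x·y, y²) is chosen here only because it is a standard textbook case. The point is that the kernel gives the same inner product as the explicit 3-D mapping without ever computing that mapping.

```python
import math


def feature_map(point):
    """Map a 2-D point (x, y) to the 3-D point (x^2, sqrt(2)*x*y, y^2)."""
    x, y = point
    return (x * x, math.sqrt(2) * x * y, y * y)


def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(p * q for p, q in zip(a, b))


def polynomial_kernel(a, b):
    """Degree-2 polynomial kernel, computed directly in the 2-D input space."""
    return dot(a, b) ** 2


a, b = (1.0, 2.0), (3.0, 4.0)
explicit = dot(feature_map(a), feature_map(b))  # inner product after mapping to 3-D
implicit = polynomial_kernel(a, b)              # same value, no mapping needed
print(explicit, implicit)
```

Both numbers agree up to floating-point rounding, which is the "kernel trick" that lets an SVM work in the higher-dimensional space at low cost.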
29. SVM (cont’d)
1. Kernel: can be linear or non-linear (for example polynomial or RBF).
2. Gamma: describes how far the influence of a single training
example reaches.
Low gamma: the influence reaches far.
High gamma: the influence stays close to the example.
3. C: controls the trade-off between a smooth decision boundary and
classifying the training points correctly, i.e. a trade-off between bias and variance.
Low C: smooth decision boundary.
High C: more complex boundary that fits more training points correctly.
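The effect of gamma is easiest to see in the RBF kernel, exp(-gamma · ‖a − b‖²), which is one common kernel in which gamma appears. The sketch below is illustrative only: the two points and the gamma values 0.1 and 10 are arbitrary assumptions, and no SVM is actually trained.

```python
import math


def rbf_kernel(a, b, gamma):
    """Similarity exp(-gamma * squared distance) between points a and b."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-gamma * squared_distance)


# Hypothetical training example and a point two units away from it.
training_example = (0.0, 0.0)
distant_point = (2.0, 0.0)

low_gamma_influence = rbf_kernel(training_example, distant_point, gamma=0.1)
high_gamma_influence = rbf_kernel(training_example, distant_point, gamma=10.0)
print(low_gamma_influence, high_gamma_influence)
```

With low gamma the training example still has noticeable influence on the distant point; with high gamma its influence there is practically zero.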
30. Decision Tree
• Decision Trees (DTs) are a non-parametric supervised learning
method. The goal is to create a model that predicts the value of a
target variable by learning simple decision rules inferred from the data features.
• They are a white-box model: if a given situation is observable in the
model, the explanation for the condition is easily expressed in
Boolean logic.
• The problem of learning an optimal decision tree is known to be
NP-complete, so practical algorithms are greedy and make locally optimal decisions at each node.
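The "locally optimal decision at each node" can be sketched as a search for the single best split. The version below assumes numeric features and uses Gini impurity as the split criterion; both are assumptions for illustration (other criteria such as entropy are equally common), and the age/income data is hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for a mixed one."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Greedy step for one node: return (feature index, threshold, weighted impurity)."""
    best = None
    n = len(rows)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical data: (age, income) -> did the customer buy?
rows = [(25, 20), (40, 25), (30, 30), (35, 60), (22, 70), (45, 80)]
labels = ["no", "no", "no", "yes", "yes", "yes"]

feature, threshold, impurity = best_split(rows, labels)
print(f"if feature[{feature}] <= {threshold}: predict 'no', else predict 'yes'")
```

A full tree builder would apply best_split() recursively to the left and right groups until they are pure or a depth limit is reached. The resulting rule is readable as plain Boolean logic, which is the white-box property mentioned above.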
32. Regression
• Regression includes many techniques for modelling and analysing several
variables, when the focus is on the relationship between
a dependent variable and one or more independent variables (or
'predictors').
• Regression analysis is also used to understand which of the
independent variables are related to the dependent variable, and to
explore the forms of these relationships.
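The simplest case, one independent variable and a straight-line relationship, can be sketched with ordinary least squares. The data below is hypothetical and lies exactly on the line y = 1 + 2x so that the result is easy to check by hand; fit_line() is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: independent variable x, dependent variable y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(xs, ys)
prediction = intercept + slope * 5.0  # predict y for an unseen x
print(intercept, slope, prediction)
```

The fitted slope describes the form of the relationship: here each unit increase in x raises the predicted y by 2.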
34. Classification vs Regression
• Classification means
assigning the output to a
class.
• Example: predicting the
type of a tumor, i.e.
harmful or not harmful,
from training data.
• If the output is a
discrete/categorical
variable, it is a
classification problem.
• Regression means
predicting a numeric
output value from
training data.
• Example: predicting a
house price from
training data.
• If the output is a real
number/continuous, it is
a regression problem.
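The contrast can be sketched with K-nearest neighbour, which handles both kinds of problem: for classification the neighbours vote on a class, for regression their values are averaged. The tumor sizes, house areas and prices below are hypothetical numbers, and the helper names are assumptions made for this illustration.

```python
from collections import Counter


def nearest_indices(xs, query, k):
    """Indices of the k training values closest to the query."""
    return sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))[:k]


def knn_classify(xs, labels, query, k=3):
    """Classification: the output is a class, chosen by majority vote."""
    votes = Counter(labels[i] for i in nearest_indices(xs, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(xs, values, query, k=2):
    """Regression: the output is a number, the mean of the neighbours' values."""
    neighbours = nearest_indices(xs, query, k)
    return sum(values[i] for i in neighbours) / k


# Hypothetical tumor sizes (cm) with a categorical target.
tumor_sizes = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
tumor_types = ["not harmful"] * 3 + ["harmful"] * 3
tumor_prediction = knn_classify(tumor_sizes, tumor_types, query=4.2)

# Hypothetical house areas (square metres) with a continuous target (price).
house_areas = [50.0, 60.0, 70.0, 100.0, 110.0, 120.0]
house_prices = [150.0, 180.0, 210.0, 300.0, 330.0, 360.0]
price_prediction = knn_regress(house_areas, house_prices, query=65.0)

print(tumor_prediction, price_prediction)
```

The neighbour search is identical in both functions; only the way the neighbours' targets are combined differs.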
35. The correct classes
of the training data
are not known
Unsupervised Learning
36. Clustering
• Cluster analysis or clustering is the task of grouping a set of objects in
such a way that objects in the same group (called a cluster) are more
similar (in some sense or another) to each other than to those in
other groups (clusters).
37. K means clustering
• The algorithm clusters data by trying to separate samples in n groups of
equal variance, minimizing a criterion known as the inertia or within-cluster
sum-of-squares.
• This algorithm requires the number of clusters to be specified.
• It scales well to a large number of samples and has been used across a large
range of application areas in many different fields.
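A minimal sketch of the usual K-means procedure (Lloyd's algorithm) shows the two alternating steps and the inertia criterion. Two simplifications are assumed here for the sake of a short, reproducible example: the first k points are used as the initial centroids (real implementations typically use random or smarter initialization), and the six 2-D points are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def k_means(points, k, max_iterations=100):
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    # Inertia: within-cluster sum of squared distances to the centroids.
    inertia = sum(squared_distance(p, centroids[label])
                  for p, label in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical data: two visually separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
labels, centroids, inertia = k_means(points, k=2)
print(labels, centroids, inertia)
```

The number of clusters k has to be supplied by the caller, as noted above, and a poor initialization can lead to a worse local minimum of the inertia.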
39. The algorithm is presented
with a state that depends on
the input data and takes an action;
a user (or the environment) rewards
or punishes the algorithm for the
action it took, and this continues over time.
Reinforcement Learning
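The state, action and reward loop can be sketched with tabular Q-learning, one widely used reinforcement learning algorithm. The slides do not name a specific algorithm, so this choice is an assumption, as are the tiny five-state environment, the reward of 1 at the goal and all parameter values (alpha, discount, epsilon, seed).

```python
import random

N_STATES = 5                  # states 0..4 on a line; state 4 is the goal
ACTIONS = ["left", "right"]   # action 0 moves left, action 1 moves right


def step(state, action):
    """Hypothetical environment: move along the line, reward 1 on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: explore at random sometimes (and on ties), otherwise exploit."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(2)
    return 0 if q_row[0] > q_row[1] else 1


def q_learning(episodes=500, alpha=0.5, discount=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values Q[state][action] from repeated episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # cap on the number of steps per episode
            action = choose_action(q[state], epsilon, rng)
            next_state, reward = step(state, action)
            # The reward feeds back into the value of the action just taken.
            target = reward + discount * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break  # goal reached, the episode ends
    return q


q_table = q_learning()
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

After training, the greedy policy read from the table moves right in every non-goal state, because that is the shortest route to the reward.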
41. Markov model
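• A Markov model describes a system whose next state depends only on its
current state. Markov decision processes are the usual formal framework for
reinforcement learning.
• There are three fundamental problems for hidden Markov models (HMMs):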
1. Given the model parameters and observed data, estimate the
optimal sequence of hidden states.
2. Given the model parameters and observed data, calculate the
likelihood of the data.
3. Given just the observed data, estimate the model parameters.
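The first two problems have compact dynamic-programming solutions: the Viterbi algorithm for the most likely hidden-state sequence (problem 1) and the forward algorithm for the likelihood of the data (problem 2). The sketch below assumes a small two-state HMM whose states, observations and probabilities are hypothetical illustration values. Problem 3 is usually handled with the iterative Baum-Welch algorithm, which is not shown here.

```python
def forward_likelihood(observations, states, start, transition, emission):
    """Problem 2: likelihood of the observations, summed over all hidden paths."""
    alpha = {s: start[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: emission[s][obs] * sum(alpha[p] * transition[p][s] for p in states)
                 for s in states}
    return sum(alpha.values())


def viterbi(observations, states, start, transition, emission):
    """Problem 1: probability and path of the most likely hidden-state sequence."""
    best = {s: (start[s] * emission[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        current = {}
        for s in states:
            # Pick the best predecessor state for arriving in state s.
            prob, path = max(((best[p][0] * transition[p][s], best[p][1]) for p in states),
                             key=lambda item: item[0])
            current[s] = (prob * emission[s][obs], path + [s])
        best = current
    return max(best.values(), key=lambda item: item[0])


# Hypothetical model parameters (illustration values only).
states = ["Healthy", "Fever"]
start = {"Healthy": 0.6, "Fever": 0.4}
transition = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
              "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emission = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
            "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
observations = ["normal", "cold", "dizzy"]

likelihood = forward_likelihood(observations, states, start, transition, emission)
best_probability, best_path = viterbi(observations, states, start, transition, emission)
print(likelihood, best_probability, best_path)
```

Both functions sweep through the observations one step at a time and reuse the results of the previous step, which is the iterative functioning of the model.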
43. References
1. All definitions and explanations: http://scikit-learn.org/
2. Machine Learning History: http://www.forbes.com/
3. Images: online lectures of CMU Prof. Sebastian Thrun.
44. Existing technologies in every field are being replaced by smart machines: the stock market,
e-commerce, personalized customer experience and more.
In the future, maybe presentations will be
prepared and given by robots!
Conclusion