This was part of my inaugural lecture of Summer Internship on Machine Learning at NMAM Institute of Technology, Nitte on 7th June, 2018. A lot more than what was on this presentation was discussed. We spoke on the ethics of choices we make as developers, socio-cultural impact of AI and ML and the political repercussions of deploying ML and AI.
2. Before we begin….
How many people have heard about Machine Learning?
How many people know about Machine Learning?
How many people are using Machine Learning?
Machine Learning
Deep Learning
Artificial Intelligence
Introduction
• Basics
• Classification
• Clustering
• Regression
• Use-Cases
3. “A computer program is said to learn from
experience (E) with some class of tasks (T) and
a performance measure (P) if its performance at
tasks in T as measured by P improves with E”
subfield of Artificial Intelligence (AI)
• name is derived from the concept that it deals with
“construction and study of systems that can learn from data”
• can be seen as building blocks to make computers learn to behave more
intelligently
• It is a theoretical concept. There are various techniques with various
implementations.
Terminology
Features
– Distinct traits that can be used to describe each item in a
quantitative manner.
Samples
– A sample is an item to process (e.g. classify). It can be a document, a picture, a
sound, a video, a row in database or CSV file, or whatever you can describe with a
fixed set of quantitative traits.
Feature vector
– is an n-dimensional vector of numerical features that represent some
object.
Feature extraction
– Preparation of feature vector
– transforms the data in the high-dimensional space to a space of
fewer dimensions.
Training / Evaluation set
– Set of data to discover potentially predictive relationships.
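To make the terminology concrete, here is a minimal sketch of turning samples into feature vectors. The feature names, their order and the numeric values are hypothetical, chosen only for illustration.

```python
# Hypothetical feature order, chosen only for this illustration.
FEATURE_NAMES = ["redness", "roundness", "diameter_cm"]


def to_feature_vector(sample):
    """Feature extraction: turn a described item into a fixed-length numeric vector."""
    return [float(sample[name]) for name in FEATURE_NAMES]


# Each sample is one item to process; the values below are made up.
samples = [
    {"name": "apple", "redness": 0.9, "roundness": 0.8, "diameter_cm": 8},
    {"name": "banana", "redness": 0.1, "roundness": 0.2, "diameter_cm": 4},
]

# The training set is the collection of feature vectors (plus labels, if any).
training_set = [to_feature_vector(s) for s in samples]
labels = [s["name"] for s in samples]

print(training_set)
print(labels)
```

Every sample ends up as a vector of the same length, which is what most learning algorithms expect as input.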
4. Apple
What comes to your mind?
Learning (Training)
Features:
1. Color: Reddish/Red
2. Type : Fruit
3. Shape
etc…
Features:
1. Sky Blue
2. Logo
3. Shape
etc…
Features:
1. Yellow
2. Fruit
3. Shape
etc…
Learning (Training)
6. With supervised learning, you feed the expected output into the system
along with the input. The machine already knows the output.
It works out the steps or process needed to get from the input to the output.
A training data set exists.
If the process goes haywire and the algorithms come up with results
completely different than what should be expected, then the training data
does its part to guide the algorithm back towards the right path.
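As a rough illustration of learning from labeled data, here is a minimal sketch of a 1-nearest-neighbour classifier. This technique, the feature values (redness, roundness), the labels and the function name are assumptions made for this example and were not part of the lecture.

```python
import math


def nearest_neighbour_predict(train_x, train_y, query):
    """Return the label of the training sample closest to `query`."""
    distances = [math.dist(x, query) for x in train_x]
    best_index = distances.index(min(distances))
    return train_y[best_index]


# Labeled training data: each input (x) comes with its known output (y).
# Hypothetical features: [redness, roundness], both on a 0..1 scale.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = ["apple", "apple", "banana", "banana"]

prediction_a = nearest_neighbour_predict(train_x, train_y, [0.85, 0.7])
prediction_b = nearest_neighbour_predict(train_x, train_y, [0.15, 0.3])
print(prediction_a, prediction_b)
```

The known labels are what make this supervised: the prediction for a new input is always taken from an output the system was given during training.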
Supervised Machine Learning currently makes up most of the ML that is being
used by systems across the world. The input variable (x) is used to connect
with the output variable (y) through the use of an algorithm. All of the input,
the output, the algorithm, and the scenario are being provided by humans. We
can understand supervised learning in an even better way by looking at it
through two types of problems.
Classification: Classification problems categorize all the variables that form
the output. Examples of these categories formed through classification would
Whenever people talk about computers and machines developing the ability
to “teach themselves” in a seamless manner, rather than us humans having to
do the honor, they are in a way alluding to the processes involved in
unsupervised learning.
Consider a digital image that contains a variety of colored geometric
shapes. These shapes need to be sorted into groups according to color and
other classification features. For a system that follows supervised learning,
this is straightforward: you just have to teach the computer the details
pertaining to the figures. You can let the system know that shapes with four
sides are known as quadrilaterals, shapes with eight sides are known as
octagons, and so on. We can also teach the system to interpret the colors and
how each one is classified.
However, in unsupervised learning, the whole process becomes a little trickier.
7. The biggest difference between supervised and unsupervised machine
learning is this: Supervised machine learning algorithms are trained on
datasets that include labels added by a machine learning engineer or data
scientist that guide the algorithm to understand which features are important
to the problem at hand. Unsupervised machine learning algorithms, on the
other hand, are trained on unlabeled data and must determine feature
importance on their own based on inherent patterns in the data.
As you may have guessed, semi-supervised learning algorithms are trained on
a combination of labeled and unlabeled data. This is useful for a few reasons.
First, the process of labeling massive amounts of data for supervised learning
is often prohibitively time-consuming and expensive. What’s more, too much
labeling can impose human biases on the model. That means including lots of
unlabeled data during the training process actually tends to improve the
Semi-supervised learning
Semi-supervised learning is a win-win for use cases like webpage
classification, speech recognition, or even for genetic sequencing. In all of
these cases, data scientists can access large volumes of unlabeled data, but
the process of actually assigning supervision information to all of it would be
an insurmountable task.
Using classification as an example, let’s compare how these three approaches
work in practice:
Supervised classification: The algorithm learns to assign labels to types of
webpages based on the labels that were inputted by a human during the
training process.
Unsupervised clustering: The algorithm looks at inherent similarities
between webpages to place them into groups.
Semi-supervised classification: Labeled data is used to help identify that
there are specific groups of webpage types present in the data and what they
might be. The algorithm is then trained on unlabeled data to define the
8. Reinforcement Learning branches off from the concept of Unsupervised
Learning, and gives software agents and machines a high degree of control to
determine what the ideal behavior within a context can be. The aim is to
maximize the performance of the machine in a way that helps it to grow.
Simple feedback that informs the machine about its progress is required here
to help the machine learn its behavior.
Reinforcement Learning is not simple, and is tackled by a plethora of different
algorithms. In Reinforcement Learning, an agent decides the best action based
on the current state of its environment.
The growth in Reinforcement Learning has led to the production of a wide
variety of algorithms that help machines learn the outcome of what they are
doing. Since we have a basic understanding of Reinforcement Learning by
now, we can get a better grasp by forming a comparative analysis between
Reinforcement Learning and the concepts of Supervised and Unsupervised
Learning that we have studied in detail before.
Machine Learning Techniques
Supervised Learning: The correct classes of the training data are known
Unsupervised Learning: The correct classes of the training data are not
known
Semi-supervised learning: A Mix of Supervised and Unsupervised learning
Reinforcement Learning: Allows the machine or software agent to learn its
behavior based on feedback from the environment. This behavior can be
learnt once and for all, or keep on adapting as time goes by.
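A minimal sketch of learning from feedback, assuming the simplest possible setting: an epsilon-greedy agent choosing between three actions whose reward probabilities are hidden from it. The function name, the probabilities, the step count and the seed are all made up for illustration.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit; returns its value estimates."""
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)  # agent's current guess of each action's value
    counts = [0] * len(reward_probs)       # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore a random action
        else:
            action = estimates.index(max(estimates))   # exploit the best guess so far
        # Feedback from the environment: reward 1 with the action's hidden probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average: the estimate keeps adapting as feedback arrives.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


estimates = run_bandit([0.2, 0.5, 0.8])
best_action = estimates.index(max(estimates))
print(best_action, [round(e, 2) for e in estimates])
```

The agent is never told which action is correct. It only sees rewards, and its behavior shifts toward the action that has paid off most often.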
9. Classification
Clustering
Regression
Classification: predict class from observations
Clustering: Group observations into “meaningful” groups
Regression (Prediction): Predict value from observations
Classification
Classify a document into a predefined category.
Documents can be text, images etc.
Popular one is Naïve Bayes Classifier.
Simple technique for constructing classifiers: models that assign class labels
to problem instances, represented as vectors of feature values, where the
class labels are drawn from some finite set.
NOT A SINGLE ALGORITHM
They assume that the value of a particular feature is independent of the value
of any other feature, given the class variable.
A fruit may be considered to be an apple if it is red, round, and about 10 cm in
diameter. A naive Bayes classifier considers each of these features to
contribute independently to the probability that this fruit is an apple,
regardless of any possible correlations between the color, roundness, and
diameter features.
Used for text categorization, automatic medical diagnosis, etc. Highly scalable.
10. Steps:
– Step 1: Train the program (building a model) using a training set labeled
with categories, e.g. sports, cricket, news.
– The classifier computes, for each word, the probability that it
makes a document belong to each of the considered categories.
– Step 2: Test the model against a test data set.
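The two steps above can be sketched as a tiny word-count Naive Bayes classifier. This is a simplified illustration, not a production implementation: the training sentences, the category names and the use of add-one (Laplace) smoothing are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """documents: list of (text, category). Returns the counts that form the model."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    category_counts = Counter()         # category -> number of documents
    vocabulary = set()
    for text, category in documents:
        words = text.lower().split()
        word_counts[category].update(words)
        category_counts[category] += 1
        vocabulary.update(words)
    return word_counts, category_counts, vocabulary


def classify(text, model):
    """Pick the category with the highest log-probability for the text."""
    word_counts, category_counts, vocabulary = model
    total_docs = sum(category_counts.values())
    scores = {}
    for category in category_counts:
        # Prior: how common the category is in the training set.
        score = math.log(category_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the product.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[category] = score
    return max(scores, key=scores.get)


# Step 1: build the model from a (made-up) labeled training set.
model = train([
    ("india won the cricket match", "sports"),
    ("the batsman scored a century", "sports"),
    ("parliament passed the new budget", "news"),
    ("the minister announced a new policy", "news"),
])

# Step 2: test the model on documents it has not seen.
result_sports = classify("cricket match century", model)
result_news = classify("new budget policy", model)
print(result_sports, result_news)
```

Each word contributes its own factor to the score, independently of the other words, which is the "naive" assumption in the name.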
Clustering
Clustering is the task of grouping a set of objects in such a
way that objects in the same group (called a cluster) are
more similar to each other than to objects in other groups.
Groups are not predefined
Popular ones are K-means clustering and Hierarchical
clustering
E.g. these keywords
– “man’s shoe”
– “women’s shoe”
– “women’s t-shirt”
– “man’s t-shirt”
– can be clustered into 2 categories: “shoe” and “t-shirt”, or
“man” and “women”
11. K-Means Clustering
https://youtu.be/wFL6JcepP3M
Pizza delivery centers, Swiggy etc.
Pick random centroids -> Assign each point to the nearest centroid ->
Identify new cluster centroids -> Reassign based on minimum distance to
centroid -> Repeat the last two steps until the assignments stop changing
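The loop above can be sketched in a few lines of Python. One deliberate simplification, flagged as an assumption: the starting centroids are the first k points rather than random picks, so the run is reproducible. The customer coordinates are made up.

```python
import math


def k_means(points, k, max_iterations=100):
    """Plain k-means on 2-D points. Returns (labels, centroids)."""
    # Assumption: start from the first k points instead of random picks,
    # so that the result is the same on every run.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical customer locations; the goal is to place 2 delivery centers.
customers = [(1, 1), (9, 8), (1, 2), (8, 8), (2, 1), (8, 9)]
labels, centers = k_means(customers, 2)
print(labels)
print(centers)
```

Each centroid ends up at the average position of its group, which is why the technique suits problems like choosing delivery center locations.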
Hierarchical Clustering
Method of cluster analysis which seeks to build a hierarchy of clusters.
Agglomerative: This is a "bottom up" approach: each observation starts in its
own cluster, and pairs of clusters are merged as one moves up the hierarchy.
Divisive: This is a "top down" approach: all observations start in one cluster,
and splits are performed recursively as one moves down the hierarchy.
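A minimal sketch of the agglomerative ("bottom up") approach on one-dimensional values. Measuring the distance between clusters by their two closest members (single linkage) is an assumption made here; other linkage rules exist. The input numbers are made up.

```python
def agglomerative(values, target_clusters):
    """Bottom-up single-linkage clustering of 1-D values."""
    clusters = [[v] for v in sorted(values)]  # every observation starts alone
    while len(clusters) > max(target_clusters, 1):
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


three_groups = agglomerative([1, 2, 9, 10, 25], 3)
two_groups = agglomerative([1, 2, 9, 10, 25], 2)
print(three_groups)
print(two_groups)
```

Stopping at different cluster counts corresponds to cutting the hierarchy at different heights.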
12. Regression
is a measure of the relation between the mean value of one variable (e.g.
output) and corresponding values of other variables (e.g. time and cost).
Regression analysis is a statistical process for estimating the relationships
among variables.
Regression means to predict the output value using training data.
Popular one is Linear regression.
(Logistic regression, despite its name, predicts a binary class, so it is used
for classification.)
Regression: To predict the house price from training data
If the output is a real number/continuous, then it is a regression problem.
If it is a discrete/categorical variable, then it is a classification problem.
Classification: To predict the type of tumor i.e. harmful or not harmful using
training data
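The house price example can be sketched with simple linear regression (ordinary least squares with one input variable), used here as an illustrative technique. The house sizes and prices are made-up numbers that happen to lie on a straight line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size (square metres) and price (in thousands).
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept  # predict for an unseen 90 square metre house
print(slope, intercept, predicted_price)
```

The prediction is a continuous number, which is what separates this from the tumor example, where the output is one of a fixed set of classes.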
13. Spam Email Detection
Machine Translation (Language Translation)
Image Search (Similarity)
Clustering (K-Means) : Amazon Recommendations
Classification : Google News
Use-Cases
Text Summarization - Google News
Rating a Review / Comment: Zomato
Fraud detection : Credit card Providers
Decision Making : e.g. Bank/Insurance sector
Sentiment Analysis
Speech Understanding – iPhone with Siri
Face Detection – Facebook’s Photo tagging