Machine Learning
An Introduction
$:whoami
Vedaj J P
It doesn’t really matter
Twitter: @vedaj | Quora: Vedaj-J-Padman
Before we begin….
How many people have heard about Machine Learning?
How many people know about Machine Learning?
How many people are using Machine Learning?
Machine Learning
Deep Learning
Artificial Intelligence
Introduction 

• Basics

• Classification

• Clustering 

• Regression 

• Use-Cases
“A computer program is said to learn from
experience (E) with respect to some class of tasks (T) and
a performance measure (P) if its performance at
tasks in T, as measured by P, improves with E” (Tom M. Mitchell)
• Subfield of Artificial Intelligence (AI)

• name is derived from the concept that it deals with

“construction and study of systems that can learn from data”

• can be seen as building blocks to make computers learn to behave more
intelligently

• It is a theoretical concept. There are various techniques with various
implementations.
Terminology
Features
– The distinct traits that can be used to describe
each item in a quantitative manner.
Samples
– A sample is an item to process (e.g. classify). It can be a document, a picture, a
sound, a video, a row in database or CSV file, or whatever you can describe with a
fixed set of quantitative traits.
Feature vector
– is an n-dimensional vector of numerical features that represent some
object.
Feature extraction
– Preparation of feature vector
– transforms the data in the high-dimensional space to a space of
fewer dimensions.
Training / Evaluation set
– Set of data used to discover potentially predictive relationships.
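As a rough illustration of the terminology above, the sketch below turns two samples into feature vectors. The fruit records, the trait names, and the `to_feature_vector` helper are made up for illustration and are not part of the slides.

```python
# Hypothetical samples: each one is an item described by a fixed set of traits.
samples = [
    {"name": "apple", "redness": 0.9, "diameter_cm": 8.0, "is_fruit": True},
    {"name": "banana", "redness": 0.1, "diameter_cm": 3.5, "is_fruit": True},
]

# The features: the quantitative traits chosen to describe every sample.
FEATURES = ["redness", "diameter_cm", "is_fruit"]


def to_feature_vector(sample):
    """Feature extraction: map one sample to an n-dimensional numeric vector."""
    return [float(sample[name]) for name in FEATURES]


# One feature vector per sample, all with the same length n = len(FEATURES).
vectors = [to_feature_vector(s) for s in samples]
print(vectors)
```

Every sample ends up as a list of the same length, which is what lets a learning algorithm compare samples with each other.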
Apple
What comes to your mind?
Learning (Training)
Features:
1. Color: Reddish/Red
2. Type : Fruit
3. Shape
etc…
Features:
1. Sky Blue
2. Logo
3. Shape
etc…
Features:
1. Yellow
2. Fruit
3. Shape
etc…
Learning (Training)
Workflow
Workflow
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning
With supervised learning, you feed the expected output into the system together
with the input. The machine already knows the correct output for the training
examples.

Its job is to work out the steps or process needed to get from the input to the
output. A labeled training data set exists.

If the process goes haywire and the algorithms come up with results
completely different than what should be expected, then the training data
does its part to guide the algorithm back towards the right path.
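To make the idea of learning a path from input to known output concrete, here is a minimal sketch. It uses a 1-nearest-neighbour rule, chosen only because it is short; the slides do not name a specific algorithm, and the data and the `predict` helper are hypothetical.

```python
# Hypothetical labeled training set: input x = (height_cm, weight_kg), output y = label.
training_data = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict(x):
    """Return the known output of the training input that is closest to x."""
    _, label = min(training_data, key=lambda pair: squared_distance(pair[0], x))
    return label


print(predict((152.0, 53.0)))
print(predict((182.0, 88.0)))
```

The known outputs in `training_data` are what makes this supervised: every prediction is traced back to an example whose answer was provided in advance.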

Supervised Machine Learning currently makes up most of the ML that is being
used by systems across the world. The input variable (x) is used to connect
with the output variable (y) through the use of an algorithm. All of the input,
the output, the algorithm, and the scenario are being provided by humans. We
can understand supervised learning in an even better way by looking at it
through two types of problems.

Classification: Classification problems assign each input to one of a fixed set
of output categories. Examples of these categories formed through classification would

Whenever people talk about computers and machines developing the ability
to “teach themselves”, rather than humans having to teach them, they are in
a way alluding to the processes involved in unsupervised learning.

Consider a digital image that contains a variety of colored geometric shapes.
These shapes need to be sorted into groups according to color and other
features. For a system that follows supervised learning, this is
straightforward: you teach the computer all the details pertaining to the
figures. You can tell the system that shapes with four equal sides and four
right angles are squares, that shapes with eight sides are octagons, and so on.
You can also teach the system how to interpret the colors and which color
class each one belongs to.

However, in unsupervised learning, the whole process becomes a little trickier.
The biggest difference between supervised and unsupervised machine
learning is this: Supervised machine learning algorithms are trained on
datasets that include labels added by a machine learning engineer or data
scientist that guide the algorithm to understand which features are important
to the problem at hand. Unsupervised machine learning algorithms, on the
other hand, are trained on unlabeled data and must determine feature
importance on their own based on inherent patterns in the data.

As you may have guessed, semi-supervised learning algorithms are trained on
a combination of labeled and unlabeled data. This is useful for a few reasons.
First, the process of labeling massive amounts of data for supervised learning
is often prohibitively time-consuming and expensive. What’s more, too much
labeling can impose human biases on the model. That means including lots of
unlabeled data during the training process actually tends to improve the
Semi-supervised learning
Semi-supervised learning is a win-win for use cases like webpage
classification, speech recognition, or even for genetic sequencing. In all of
these cases, data scientists can access large volumes of unlabeled data, but
the process of actually assigning supervision information to all of it would be
an insurmountable task.

Using classification as an example, let’s compare how these three approaches
work in practice:

Supervised classification: The algorithm learns to assign labels to types of
webpages based on the labels that were inputted by a human during the
training process.

Unsupervised clustering: The algorithm looks at inherent similarities
between webpages to place them into groups.

Semi-supervised classification: Labeled data is used to help identify that
there are specific groups of webpage types present in the data and what they
might be. The algorithm is then trained on unlabeled data to define the
Reinforcement Learning, like Unsupervised Learning, does not rely on labeled
examples. It gives software agents and machines a high degree of control to
determine what the ideal behavior within a context is, with the aim of
maximizing performance. Simple feedback (a reward signal) that informs the
machine about its progress is required to help the machine learn its behavior.

Reinforcement Learning is not simple, and is tackled by a plethora of different
algorithms. In Reinforcement Learning, an agent decides
the best action based on its current state and the feedback received so far.
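As one possible illustration of an agent choosing actions from its state and learning from feedback, the sketch below uses tabular Q-learning, which is just one of the many algorithms alluded to. The corridor environment, the reward of 1 at the goal, and all parameter values are assumptions made for this example.

```python
import random

# Hypothetical environment: a corridor of 5 cells; reaching the last cell gives reward 1.
N_STATES = 5
ACTIONS = [-1, +1]  # move left or move right
GOAL = N_STATES - 1


def step(state, action):
    """Environment feedback: next state, reward, and whether the episode ended."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore sometimes, otherwise pick the best-known action for this state.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy for every non-goal cell: the action with the highest learned value.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q_table[s][i])] for s in range(GOAL)]
print(policy)
```

The only signal the agent receives is the reward from `step`; there are no labeled examples telling it which move is correct in each cell.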

The growth in Reinforcement Learning has led to the production of a wide
variety of algorithms that help machines learn the outcome of what they are
doing. Since we have a basic understanding of Reinforcement Learning by
now, we can get a better grasp by forming a comparative analysis between
Reinforcement Learning and the concepts of Supervised and Unsupervised
Learning that we have studied in detail before.
Machine Learning Techniques
Supervised Learning: The correct classes of the training data are known 

Unsupervised Learning: The correct classes of the training data are not
known 

Semi-supervised learning: A Mix of Supervised and Unsupervised learning 

Reinforcement Learning: Allows the machine or software agent to learn its
behavior based on feedback from the environment. This behavior can be
learnt once and for all, or keep on adapting as time goes by. 

Classification
Clustering
Regression
Classification: predict class from observations

Clustering: Group observations into “meaningful” groups 

Regression (Prediction): Predict value from observations 

Classification
Classify a document into a predefined category. 

Documents can be text, images etc.

A popular one is the Naïve Bayes classifier. 

Naïve Bayes is a simple technique for constructing classifiers: models that
assign class labels to problem instances, represented as vectors of feature
values, where the class labels are drawn from some finite set.

It is not a single algorithm but a family of algorithms built on a common principle.

They all assume that the value of a particular feature is independent of the
value of any other feature, given the class variable. 

A fruit may be considered to be an apple if it is red, round, and about 10 cm in
diameter. A naive Bayes classifier considers each of these features to
contribute independently to the probability that this fruit is an apple,
regardless of any possible correlations between the color, roundness, and
diameter features.
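The independence assumption in the apple example can be written out as a formula. The symbols ($C$ for the class, $x_1, \dots, x_n$ for the feature values) are notation chosen here for illustration and do not appear in the slides.

```latex
\begin{align*}
P(C \mid x_1, \dots, x_n) &\propto P(C) \prod_{i=1}^{n} P(x_i \mid C) \\
P(\text{apple} \mid \text{red}, \text{round}, 10\,\text{cm}) &\propto
  P(\text{apple})\, P(\text{red} \mid \text{apple})\,
  P(\text{round} \mid \text{apple})\, P(10\,\text{cm} \mid \text{apple})
\end{align*}
```

Each feature contributes its own factor, so correlations between color, roundness, and diameter are ignored.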

Typical uses: text categorization, automatic medical diagnosis, etc. Highly scalable.
Steps: 

– Step 1: Train the program (build a model) using a training set in which each
document has a category, e.g. sports, cricket, news. 

– The classifier computes, for each word, the probability that it
makes a document belong to each of the considered categories. 

– Step 2: Test this model against a test data set. 
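The two steps above can be sketched as a small word-count Naïve Bayes classifier. The training documents, the category names, and the use of add-one smoothing are assumptions made for this example, not details from the slides.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training set: (document text, category).
TRAINING_SET = [
    ("the team won the cricket match", "sports"),
    ("batsman scored a century in the match", "sports"),
    ("government announces new budget", "news"),
    ("election results announced by the government", "news"),
]


def train(training_set):
    """Step 1: build the model (word counts per category and document counts)."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in training_set:
        doc_counts[category] += 1
        word_counts[category].update(text.split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Step 2: score each category for a new document and return the best one."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior plus the sum of per-word log likelihoods (add-one smoothing).
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.split():
            count = word_counts[category][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


model = train(TRAINING_SET)
print(classify("cricket match today", model))
print(classify("new government budget", model))
```

Working with log probabilities avoids multiplying many small numbers together, and the smoothing keeps an unseen word such as "today" from zeroing out a category.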

Clustering
Clustering is the task of grouping a set of objects in such a
way that objects in the same group (called a cluster) are
more similar to each other than to objects in other groups. 

The groups are not predefined. 

Popular methods are K-means clustering and hierarchical
clustering. 

For example, these keywords 

	 –  “men’s shoe” 

	 –  “women’s shoe” 

	 –  “women’s t-shirt” 

	 –  “men’s t-shirt” 

can be clustered into 2 categories, “shoe” and “t-shirt”, or “men” and “women”. 

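K-Means Clustering
https://youtu.be/wFL6JcepP3M

Example uses: placing pizza delivery centers, Swiggy, etc. 

Start with random centroids, then:

Assign each point to the most similar (nearest) center -> Identify cluster
centroids -> Reassign based on minimum distance to centroid -> Identify new
cluster centroids -> repeat the last two steps until the assignments stop changing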
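The assign-and-recompute loop described here can be sketched in plain Python. The customer coordinates, the fixed random seed, and the choice of `k=2` are illustrative assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Recompute each centroid as the mean of its cluster (keep it if the cluster is empty).
        new_centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new_centroids == centroids:  # centroids stopped moving
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customer locations (x, y) to be served by two delivery centers.
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(customers, k=2)
print(sorted(centroids))
```

In the delivery example, the final centroids would be candidate locations for the delivery centers, and each cluster is the set of customers served by one center.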
Hierarchical Clustering
Method of cluster analysis which seeks to build a hierarchy of clusters. 

Agglomerative: This is a "bottom up" approach: each observation starts in its
own cluster, and pairs of clusters are merged as one moves up the hierarchy. 

Divisive: This is a "top down" approach: all observations start in one cluster,
and splits are performed recursively as one moves down the hierarchy. 
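A minimal sketch of the agglomerative ("bottom up") approach follows. It assumes one-dimensional observations and single-linkage distance (the smallest gap between members of two clusters); both are choices made for this example, since the slides do not specify a linkage.

```python
def single_linkage(cluster_a, cluster_b):
    """Distance between two clusters: the smallest gap between any two members."""
    return min(abs(a - b) for a in cluster_a for b in cluster_b)


def agglomerative(values, target_clusters):
    # Bottom up: every observation starts in its own cluster.
    clusters = [[v] for v in values]
    merges = []
    while len(clusters) > target_clusters:
        # Find the closest pair of clusters.
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: single_linkage(clusters[p[0]], clusters[p[1]]))
        merged = sorted(clusters[i] + clusters[j])
        merges.append(merged)
        # Replace the two clusters with the merged one.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


# Hypothetical one-dimensional observations.
clusters, merges = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], target_clusters=2)
print(merges)
print(sorted(clusters))
```

The `merges` list records the hierarchy: each entry is a cluster formed by joining two smaller ones, in the order the merges happened.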



Regression
Regression is a measure of the relation between the mean value of one variable
(e.g. output) and corresponding values of other variables (e.g. time and cost). 

Regression analysis is a statistical process for estimating the relationships
among variables. 

Regression means predicting a numeric output value using training data. 

Popular ones are linear regression and logistic regression (the latter models
a binary outcome and, despite its name, is mostly used for classification). 

Regression: To predict the house price from training data

If the output is a real number/continuous, then it is a regression problem. 

If it is a discrete/categorical variable, then it is a classification problem. 

Classification: To predict the type of tumor, i.e. harmful or not harmful, using
training data 
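For the house price case, a regression model predicts a continuous value. The sketch below fits a straight line with ordinary least squares; the areas, prices, and the single-feature setup are hypothetical and only meant to show the idea.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house area (square metres) and price (in thousands).
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(areas, prices)
# Predict the price of an unseen 90 square metre house.
predicted = slope * 90.0 + intercept
print(round(predicted, 2))
```

The output is a real number rather than a category, which is what separates this from the tumor classification case.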



Use-Cases

Spam Email Detection 

Machine Translation (Language Translation) 

Image Search (Similarity) 

Clustering (K-Means): Amazon Recommendations 

Classification: Google News
Text Summarization - Google News

Rating a Review / Comment: Zomato

Fraud detection : Credit card Providers

Decision Making : e.g. Bank/Insurance sector
Sentiment Analysis

Speech Understanding – iPhone with Siri

Face Detection – Facebook’s Photo tagging
Thank You
