 Self Introduction
Name: Rizwan Shaukat (Razi)
Major: MS(Software Engineering)
ID: 2019272110006
Introduction to Machine Learning
Basics of Machine Learning
 About
 Subfield of Artificial Intelligence (AI)
 Name is derived from the concept that it deals with
“construction and study of systems that can learn from data”
 Can be seen as building blocks to make computers learn to behave more
intelligently
 It is a theoretical concept. There are various techniques with various
implementations.
 http://en.wikipedia.org/wiki/Machine_learning
 In other words:
 “A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P if its performance
at tasks in T, as measured by P, improves with experience E.”
 Definitions
 “Learning is any process by which a system improves performance from experience.”
(By Herbert Simon)
 Definition by Tom Mitchell (1998):
 Machine Learning is the study of algorithms that
 improve their performance P
 at some task T
 with experience E.
 A well-defined learning task is given by <P, T, E>.
 Machine learning investigates the mechanisms by which knowledge is acquired through
experience
 Machine Learning is the field that concentrates on induction algorithms and on other
algorithms that can be said to “learn.”
 When Do We Use Machine Learning
 ML is used when:
 Human expertise does not exist (navigating on Mars)
 Humans can’t explain their expertise (speech recognition)
 Models must be customized (personalized medicine)
 Models are based on huge amounts of data (genomics)
 Terminologies
 Features
 The distinct, measurable traits that can be used to describe each item
in a quantitative manner.
 Samples
 A sample is an item to process (e.g. to classify). It can be a document, a
picture, a sound, a video, a row in a database or CSV file, or anything
else you can describe with a fixed set of quantitative traits.
 Feature vector
 An n-dimensional vector of numerical features that represents an
object.
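A minimal sketch of the three terms above, assuming a hypothetical feature order (redness, weight in grams, is-fruit flag) that is not part of the slides. Each sample is a dictionary of traits, and the feature vector is the same traits as a fixed-length list of numbers.

```python
# Hypothetical feature order chosen for illustration only.
FEATURE_NAMES = ["redness", "weight_g", "is_fruit"]

def to_feature_vector(sample):
    """Turn one sample (a dict of traits) into a fixed-length numeric vector."""
    return [float(sample[name]) for name in FEATURE_NAMES]

# Two made-up samples; the "name" key is a label, not a feature.
samples = [
    {"name": "red apple", "redness": 0.9, "weight_g": 150, "is_fruit": 1},
    {"name": "banana", "redness": 0.1, "weight_g": 120, "is_fruit": 1},
]
vectors = [to_feature_vector(s) for s in samples]
print(vectors)  # one 3-dimensional vector per sample
```

Here n = 3, so every sample becomes a point in a 3-dimensional feature space.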
 Terminologies (continued)
 Feature extraction
 Preparation of the feature vector
 Transforms data from a high-dimensional space into a space of fewer
dimensions.
 Training/Evaluation set
 The training set is the data used to discover potentially predictive relationships;
a separate evaluation set is held back to check them.
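One common way to obtain such a set is to hold back part of the available samples so that the learned relationships can be checked on data the learner has not seen. The sketch below assumes a hypothetical helper `split_samples` and a 25% hold-out fraction; neither comes from the slides.

```python
import random

def split_samples(samples, eval_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and hold back a fraction for evaluation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_eval = int(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]

data = list(range(20))  # stand-in for 20 feature vectors
train_set, eval_set = split_samples(data)
print(len(train_set), len(eval_set))  # 15 5
```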
 Let’s go a little deeper…
 What do you mean by “Apple”?
 Learning (Training)
 Example of Apple:
Features:
1. Color: Reddish/Red
2. Type: Fruit
3. Shape
etc.
Features:
1. Color: Sky blue
2. Type: Logo
3. Shape
etc.
Features:
1. Color: Yellow
2. Type: Fruit
3. Shape
etc.
 Workflow
 Growth of Machine Learning
 Machine learning is the preferred approach to
 Speech recognition, Natural language processing
 Computer vision
 Medical outcomes analysis
 Robot control
 Computational biology
 Growth of Machine Learning (continued)
 This trend is accelerating
 Improved machine learning algorithms
 Improved data capture, networking, faster computers
 Software too complex to write by hand
 New sensors / IO devices
 Demand for self-customization to user, environment
 It turns out to be difficult to extract knowledge from
human experts, hence the failure of expert systems in the 1980s.
 Applications
 Association Analysis
 Supervised Learning
 Unsupervised Learning
 Semi-Supervised Learning
 Reinforcement Learning
 Learning Association
 Basket analysis:
P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are
products/services.
Example: P(chips | beer) = 0.7
Market-basket transactions:
TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke
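As a rough illustration, P(Y | X) can be estimated from the five transactions above by counting: the baskets that contain both X and Y, divided by the baskets that contain X. The function name `conditional_probability` is a hypothetical helper, not something the slides define.

```python
# The five market-basket transactions from the table, one set per TID.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def conditional_probability(y, x, baskets):
    """Estimate P(y | x) as count(x and y) / count(x)."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        raise ValueError(f"no basket contains {x!r}")
    return sum(1 for b in with_x if y in b) / len(with_x)

p = conditional_probability("Beer", "Diaper", transactions)
print(p)  # 0.75
```

Four baskets contain Diaper and three of those also contain Beer, which gives 3/4.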
 Supervised Learning
 The correct classes of the training data are known.
Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
 Supervised Learning: Uses
 Prediction of future cases: Use the rule to predict the output for future
inputs
 Knowledge extraction: The rule is easy to understand
 Compression: The rule is simpler than the data it explains
 Outlier detection: Exceptions that are not covered by the rule, e.g., fraud
Example: decision trees are tools that create rules.
 Unsupervised Learning
 The correct classes of the training data are not known.
Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
 Unsupervised Learning (continued)
 Learning “what normally happens”
 No output
 Clustering: Grouping similar instances
 Other applications: Summarization, Association Analysis
 Example applications
 Customer segmentation in CRM
 Image compression: Color quantization
 Bioinformatics: Learning motifs
 Semi-Supervised Learning
 A mix of supervised and unsupervised learning: a small amount of labeled data
is combined with a larger amount of unlabeled data.
Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
 Reinforcement Learning
 Allows a machine or software agent to learn its behavior from
feedback it receives from the environment.
 This behavior can be learnt once and for all, or keep adapting as time
goes by.
Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
 Reinforcement Learning
 Topics:
 Policies: what actions should an agent take in a particular situation
 Utility estimation: how good is a state (used by policy)
 No supervised output but delayed reward
 Credit assignment problem (what was responsible for the outcome)
 Applications:
 Game playing
 Robot in a maze
 Multiple agents, partial observability, ...
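The topics above can be sketched with tabular Q-learning on a toy "robot in a corridor" task. This is an illustrative sketch under stated assumptions, not a method the slides prescribe: the five-cell corridor, the reward of 1 at the goal, the constants `GAMMA` and `ALPHA`, and the purely random exploration are all hypothetical choices.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (assumed layout)
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor for delayed reward (assumed value)
ALPHA = 0.5           # learning rate (assumed value)

def step(state, action):
    """Environment: move within the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, seed=0):
    """Learn utility estimates Q(state, action) from experienced transitions."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # purely random exploration
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Credit assignment: move the estimate toward
            # reward + discounted value of the best next action.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# Policy: in each non-goal cell, take the action with the highest estimate.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # greedy action per non-goal cell
```

No cell is ever labeled with the "correct" move; the agent only sees a reward at the goal, and the discounted update passes that delayed reward back to earlier cells.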
 Machine Learning Techniques
 Techniques
 classification:
 predict class from observations
 clustering:
 group observations into “meaningful” groups
 regression (prediction):
 predict value from observations
 Classification
 Example: Credit scoring
 Differentiating between low-risk and high-risk customers from their
income and savings
Discriminant (the model): IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
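The discriminant above translates almost directly into code. In this sketch the threshold values `THETA1` and `THETA2` are hypothetical; in a learned classifier they would be fitted from labeled customer data rather than written by hand.

```python
def classify_customer(income, savings, theta1, theta2):
    """Apply the discriminant: low-risk only if both thresholds are exceeded."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"

# Hypothetical thresholds, chosen only to make the example concrete.
THETA1, THETA2 = 30_000, 10_000
print(classify_customer(45_000, 12_000, THETA1, THETA2))  # low-risk
print(classify_customer(45_000, 5_000, THETA1, THETA2))   # high-risk
```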
 Classification : Applications
 Pattern recognition
 Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair
style
 Character recognition: Different handwriting styles.
 Speech recognition: Temporal dependency.
 Use of a dictionary or the syntax of the language.
 Sensor fusion: Combine multiple modalities, e.g., visual (lip image) and
acoustic for speech
 Medical diagnosis: From symptoms to illnesses
 Web Advertising: Predict if a user clicks on an ad on the Internet.
 Face Recognition
 In face recognition, raw image data is used for detection, e.g. of a person’s
identity or mood.
 The system is trained on this data and gives its output as percentages.
[Figure: training examples of a person; test images]
 Clustering
 Clustering is the task of grouping a set of objects in such a way
that objects in the same group (called a cluster) are more similar
to each other than to objects in other groups
 The groups are not predefined
 For example, the keywords
 “men’s shoe”
 “women’s shoe”
 “women’s t-shirt”
 “men’s t-shirt”
 can be clustered into two categories: “shoe” and “t-shirt”, or “men”
and “women”
 Popular methods are K-means clustering and hierarchical clustering
 K-means Clustering
 Partitions n observations into k clusters, in which each observation belongs to
the cluster with the nearest mean; that mean serves as the prototype of the cluster.
 http://en.wikipedia.org/wiki/K-means_clustering
http://pypr.sourceforge.net/kmeans.html
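A minimal sketch of the idea, using the standard alternating procedure (often called Lloyd's algorithm): assign every observation to its nearest mean, then recompute each mean. The sample points and starting centroids are made up for illustration, and a fixed iteration count stands in for a proper convergence check.

```python
def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, centroids, iterations=10):
    """Alternate assignment and mean-update steps for a fixed number of rounds."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical 2-D observations forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

The result depends on the starting centroids; practical implementations usually try several random starts and keep the best partition.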
 Hierarchical clustering
 A method of cluster analysis which seeks to build a hierarchy of clusters.
 There are two strategies
 Agglomerative:
 This is a "bottom up" approach: each observation starts in its own cluster,
and pairs of clusters are merged as one moves up the hierarchy.
 Time complexity of the standard algorithm is O(n^3)
 Divisive:
 This is a "top down" approach: all observations start in one cluster, and splits
are performed recursively as one moves down the hierarchy.
 Time complexity with an exhaustive search over splits is O(2^n)
 http://en.wikipedia.org/wiki/Hierarchical_clustering
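The agglomerative ("bottom up") strategy can be sketched in a few lines. This naive version assumes 1-D observations and single linkage (cluster distance = distance of the closest pair), and it stops at a requested number of clusters instead of building the full hierarchy; all three are simplifying assumptions for illustration.

```python
def single_linkage(a, b):
    """Distance between two clusters = distance of their closest pair of points."""
    return min(abs(x - y) for x in a for y in b)

def agglomerative(points, n_clusters):
    """Naive bottom-up clustering of 1-D points; stops at n_clusters groups."""
    clusters = [[p] for p in points]  # every observation starts in its own cluster
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (the expensive step of the naive method).
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge the pair
        del clusters[j]
    return clusters

print(agglomerative([1.0, 1.5, 5.0, 5.2, 9.0], n_clusters=3))
# [[1.0, 1.5], [5.0, 5.2], [9.0]]
```

Scanning every pair of clusters before each merge is what makes the straightforward approach costly on large data sets.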
 Prediction : Regression
 Regression measures the relation between the mean
value of one variable (e.g. output) and
corresponding values of other variables (e.g.
time and cost).
 Regression analysis is a statistical process
for estimating the relationships among
variables.
 Regression means predicting the output
value using training data.
 A popular method is linear regression. Logistic regression
(binary regression), despite its name, predicts the probability
of a binary outcome and is mostly used for classification.
 http://en.wikipedia.org/wiki/Logistic_regression
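To make "predict the output value using training data" concrete, here is a minimal sketch using simple linear least squares with one input variable. Linear regression is swapped in here for illustration because it predicts a continuous value; the training data and the helper name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)         # 2.0 1.0
print(slope * 5.0 + intercept)  # predicted output for a new input x = 5
```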
 Regression Application
 Navigating a car: Angle of the steering wheel (CMU NavLab)
 Kinematics of a robot arm
α1 = g1(x, y)
α2 = g2(x, y)
[Figure: robot arm with joint angles α1 and α2 reaching the point (x, y)]
 Classification vs Regression
 Classification
 Classification means assigning the output to a class.
 Example: predicting the type of a tumor, i.e. harmful or
not harmful, from training data.
 If the output is a discrete/categorical variable, it is a
classification problem.
 Regression
 Regression means predicting an output value from
training data.
 Example: predicting a house price from training data.
 If the output is a real number/continuous, it is a
regression problem.
 Popular Frameworks/Tools
 Weka
 Carrot2
 GATE
 OpenNLP
 LingPipe
 Stanford NLP
 Mallet – Topic Modelling
 Popular Frameworks/Tools (continued)
 Gensim – Topic Modelling (Python)
 Apache Mahout
 MLlib – Apache Spark
 scikit-learn – Python
 LIBSVM : Support Vector Machines
 and many more…
Any Questions?
Machine learning presentation (razi)
