Machine Learning
Andrea Iacono
https://github.com/andreaiacono/MachineLearning
Machine Learning: Intro

What is Machine Learning?
[Wikipedia]: a branch of artificial intelligence
that allows the construction and the study of
systems that can learn from data
Machine Learning: Intro

Some approaches:
- Regression analysis
- Similarity and metric learning
- Decision tree learning
- Association rule learning
- Artificial neural networks
- Genetic programming
- Support vector machines
(classification and regression analysis)
- Clustering
- Bayesian networks
Machine Learning: Intro

Supervised learning vs unsupervised learning
Machine learning vs data mining
Machine Learning: Regression analysis

Regression Analysis
A statistical technique for estimating
the relationships between a dependent
variable and one or more independent variables
Machine Learning: Regression analysis

Prediction of house prices
Size (x)    Price (y)
0.80        70
0.90        83
1.00        74
1.10        93
1.40        89
1.40        58
1.50        85
1.60        114
1.80        95
2.00        100
2.40        138
2.50        111
2.70        124
3.20        172
3.50        172
Machine Learning: Regression analysis

Prediction of house prices

Hypothesis:

$$h_\theta(x) = \theta_0 + \theta_1 x$$
Machine Learning: Regression analysis

Cost function for linear regression:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$
Machine Learning: Regression analysis

Gradient Descent
repeat until convergence:

$$\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)$$

$$\theta_1 = \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \bigl[\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}\bigr]$$
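
The update rules can be tried directly on the house price table from earlier in the deck. The following is a minimal sketch, not the code from the linked repository: the learning rate `alpha`, the iteration count and the zero starting point are illustrative assumptions, and both parameters are updated simultaneously from the same error vector.

```python
# House sizes (x) and prices (y) from the table in this deck.
SIZES = [0.80, 0.90, 1.00, 1.10, 1.40, 1.40, 1.50, 1.60,
         1.80, 2.00, 2.40, 2.50, 2.70, 3.20, 3.50]
PRICES = [70, 83, 74, 93, 89, 58, 85, 114,
          95, 100, 138, 111, 124, 172, 172]


def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x.

    alpha and iterations are illustrative choices, not values from the slides.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


theta0, theta1 = gradient_descent(SIZES, PRICES)
final_cost = cost(theta0, theta1, SIZES, PRICES)
print(f"h(x) = {theta0:.2f} + {theta1:.2f} x, cost = {final_cost:.2f}")
```

A learning rate that is too large makes the iteration diverge, while a very small one needs many more iterations; 0.1 happens to be stable for this particular data set.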
Machine Learning: Regression analysis

Prediction of house prices
Iterative minimization of cost function
with gradient descent
Machine Learning: Regression analysis

Hands on
Machine Learning: Regression analysis

Regression analysis
- one / multiple variables
- linear / higher order curves
- several optimization algorithms
- linear regression
- logistic regression
- simulated annealing
- ...
Machine Learning: Regression analysis

Overfitting vs underfitting
Machine Learning: Similarity and metric learning

Similarity and metric learning
- concept of distance
Machine Learning: Similarity and metric learning

Euclidean distance

$$\text{euclidean distance}(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
Machine Learning: Similarity and metric learning

Manhattan distance

$$\text{manhattan distance}(p, q) = \sum_{i=1}^{n} \lvert p_i - q_i \rvert$$
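
Both the Euclidean and the Manhattan distance translate into one-line functions. This is a small sketch in which `p` and `q` are assumed to be equal-length sequences of numbers; the sample vectors are made up for illustration.

```python
import math


def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan_distance(p, q):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


p, q = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(euclidean_distance(p, q))  # 5.0
print(manhattan_distance(p, q))  # 7.0
```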
Machine Learning: Similarity and metric learning

Pearson's correlation

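$$\text{Pearson's correlation}(p, q) = \frac{\sum_{i=1}^{n} p_i q_i - \frac{\sum_{i=1}^{n} p_i \sum_{i=1}^{n} q_i}{n}}{\sqrt{\left(\sum_{i=1}^{n} p_i^2 - \frac{\left(\sum_{i=1}^{n} p_i\right)^2}{n}\right)\left(\sum_{i=1}^{n} q_i^2 - \frac{\left(\sum_{i=1}^{n} q_i\right)^2}{n}\right)}}$$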
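
The formula can be computed from five running sums. The sketch below follows the slide term by term; returning 0.0 when one of the vectors is constant (zero denominator) is an assumption made here, since the slide does not cover that case.

```python
import math


def pearson_correlation(p, q):
    """Pearson's correlation between two equal-length sequences."""
    n = len(p)
    sum_p, sum_q = sum(p), sum(q)
    sum_pq = sum(a * b for a, b in zip(p, q))
    sum_p2 = sum(a * a for a in p)
    sum_q2 = sum(b * b for b in q)
    numerator = sum_pq - sum_p * sum_q / n
    denominator = math.sqrt((sum_p2 - sum_p ** 2 / n) * (sum_q2 - sum_q ** 2 / n))
    if denominator == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return numerator / denominator


print(pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```

Unlike the two distances, a higher value here means more similar, and the result does not change if one user consistently rates higher than another.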
Machine Learning: Similarity and metric learning

Collaborative filtering
Searches a large group of users to find a
small subset whose tastes are similar to yours.
Based on what this subset likes or dislikes,
the system can recommend other items to you.
Two main approaches:
- User based filtering
- Item based filtering
Machine Learning: Similarity and metric learning

User based filtering
- based on ratings given to
the items, we can measure
the distance among users
- we can recommend to the
user the items that have
the highest ratings among
the closest users
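
The two steps above can be sketched as follows. The ratings, user names and the `k` parameter are hypothetical, and the distance is the Euclidean distance restricted to the items both users rated, which is one of several reasonable choices.

```python
import math

# Hypothetical ratings (user -> {item: rating}); not data from the slides.
RATINGS = {
    "ann":  {"A": 5, "B": 4, "C": 1},
    "bob":  {"A": 5, "B": 5, "C": 1, "D": 5, "E": 2},
    "carl": {"A": 1, "B": 2, "C": 5, "D": 1, "E": 5},
}


def user_distance(ratings, u, v):
    """Euclidean distance over the items both users rated."""
    shared = set(ratings[u]) & set(ratings[v])
    if not shared:
        return math.inf
    return math.sqrt(sum((ratings[u][i] - ratings[v][i]) ** 2 for i in shared))


def recommend(ratings, user, k=1):
    """Rank unseen items by their average rating among the k closest users."""
    others = sorted((v for v in ratings if v != user),
                    key=lambda v: user_distance(ratings, user, v))
    scores = {}
    for neighbour in others[:k]:
        for item, rating in ratings[neighbour].items():
            if item not in ratings[user]:
                scores.setdefault(item, []).append(rating)
    averaged = {item: sum(r) / len(r) for item, r in scores.items()}
    return sorted(averaged, key=averaged.get, reverse=True)


print(recommend(RATINGS, "ann"))  # ['D', 'E']
```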
Machine Learning: Similarity and metric learning

Hands on
Machine Learning: Similarity and metric learning

Is user based filtering good for
- scalability?
- sparse data?
- quickly changing data?
Machine Learning: Similarity and metric learning

No, it's better to use item based filtering.
Machine Learning: Similarity and metric learning

Euclidean distance for item based filtering:
nothing has changed!
- based on the ratings received
from the users, we can measure
the distance among items
- we can recommend items
to a user by picking the
items that are closest to
the ones the user rated
highest
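
Since the distance itself is unchanged, an item based recommender mostly amounts to transposing the ratings. The sketch below uses hypothetical ratings and, as a simplification, looks only at the single item the user rated highest.

```python
import math

# Hypothetical ratings (user -> {item: rating}); not data from the slides.
RATINGS = {
    "ann":  {"A": 5, "B": 1},
    "bob":  {"A": 5, "B": 1, "C": 5, "D": 1},
    "carl": {"A": 4, "B": 2, "C": 5, "D": 2},
    "dana": {"A": 1, "B": 5, "C": 2, "D": 5},
}


def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item


def distance(a, b):
    """Euclidean distance over the keys present in both rating dicts."""
    shared = set(a) & set(b)
    if not shared:
        return math.inf
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


def recommend_items(ratings, user):
    """Rank unrated items by distance from the user's highest-rated item."""
    by_item = transpose(ratings)
    favourite = max(ratings[user], key=ratings[user].get)
    candidates = [i for i in by_item if i not in ratings[user]]
    return sorted(candidates, key=lambda i: distance(by_item[favourite], by_item[i]))


print(recommend_items(RATINGS, "ann"))  # ['C', 'D']
```

Item-to-item distances tend to change more slowly than user-to-user ones, so in practice they can be precomputed and reused across requests.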
Machine Learning: Similarity and metric learning

Hands on
Machine Learning: Bayes' classifier

Bayes' theorem
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Example: given a company where 70% of developers use Java and 30%
use C++, and knowing that half of the Java developers always use the
enhanced for loop, if you look at the snippet:
for (int j=0; j<100; j++) {
    t = tests[j];
}
what is the probability that the developer who wrote it uses Java?
Machine Learning: Bayes' classifier

Hint:
A = developer uses Java
B = developer writes old for loops
Machine Learning: Bayes' classifier

Solution:
A = developer uses Java
B = developer writes old for loops
P(A) = prob. that a developer uses Java = 0.7
P(B) = prob. that any developer uses an old for loop = 0.3 + 0.7 * 0.5 = 0.65
(this assumes that all C++ developers, and the other half of the Java
developers, write old for loops)
P(B|A) = prob. that a Java developer uses an old for loop = 0.5

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{0.5 \cdot 0.7}{0.65} \approx 0.54$$
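
The arithmetic of the solution can be checked in a few lines. The value `p_old_given_cpp = 1.0` is the assumption implied by the 0.3 term in P(B), namely that every C++ developer writes old for loops; the variable names are chosen here for illustration.

```python
# Numbers from the example; the C++ figure is the assumption the solution implies.
p_java = 0.7             # P(A)
p_cpp = 0.3              # P(not A)
p_old_given_java = 0.5   # P(B|A)
p_old_given_cpp = 1.0    # P(B|not A), assumed

# Law of total probability, then Bayes' theorem.
p_old = p_old_given_java * p_java + p_old_given_cpp * p_cpp
p_java_given_old = p_old_given_java * p_java / p_old

print(round(p_old, 2), round(p_java_given_old, 2))  # 0.65 0.54
```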
Machine Learning: Bayes' classifier

Naive Bayes' classifier
- supervised learning
- trained on a set of known classes
- computes the probability that an element belongs to a class
- smoothing required

$$P_c(w_1, \ldots, w_n) = \frac{\prod_{i=1}^{n} P(c \mid w_i)}{\prod_{i=1}^{n} P(c \mid w_i) + \prod_{i=1}^{n} \bigl(1 - P(c \mid w_i)\bigr)}$$
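
The combining formula is short enough to write out directly. In this sketch `probabilities` is assumed to hold the per-word values P(c|w_i), each strictly between 0 and 1. That restriction is one reason smoothing is needed: a single 0 forces the result to 0, and a 0 together with a 1 makes the denominator vanish.

```python
import math


def combine(probabilities):
    """Combine per-word probabilities P(c|w_i) into P_c(w_1, ..., w_n)."""
    in_class = math.prod(probabilities)
    not_in_class = math.prod(1.0 - p for p in probabilities)
    return in_class / (in_class + not_in_class)


print(combine([0.5, 0.5]))       # 0.5: uninformative words leave the result neutral
print(combine([0.9, 0.8, 0.3]))
```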
Machine Learning: Bayes' classifier

Naive Bayes' classifier
Example
- we want a classifier for Twitter messages
- define a set of classes: {art, tech, home, events,.. }
- train the classifier with a set of already classified tweets
- when a new tweet arrives, the classifier will (hopefully)
tell us which class it belongs to
Machine Learning: Bayes' classifier

Hands on
Machine Learning: Bayes' classifier

Sentiment analysis
- define two classes: { +, - }
- define a set of words: { like, enjoy, hate, bore, fun, …}
- train an NBC with a set of known +/- comments
- let the NBC classify any new comment as + or -
- performance depends on the quality of the training set
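
These steps can be sketched with a toy classifier. Everything below is illustrative: the training comments are invented, P(+|w) is estimated per word with add-one smoothing, both classes are assumed equally likely a priori, and the per-word values are combined with the product formula shown for the Naive Bayes' classifier.

```python
import math
from collections import Counter

# Hypothetical training comments; labels are "+" or "-".
TRAINING = [
    ("i like this talk and enjoy the demos", "+"),
    ("great fun i enjoy it", "+"),
    ("i hate this it is a bore", "-"),
    ("what a bore i hate slides", "-"),
]
VOCABULARY = {"like", "enjoy", "hate", "bore", "fun"}


def train(samples, vocabulary):
    """Estimate P(+|w) for each word with add-one smoothing."""
    positive, total = Counter(), Counter()
    for text, label in samples:
        for word in set(text.split()) & vocabulary:
            total[word] += 1
            if label == "+":
                positive[word] += 1
    # (count + 1) / (total + 2) keeps every estimate strictly between 0 and 1.
    return {w: (positive[w] + 1) / (total[w] + 2) for w in vocabulary}


def classify(text, p_positive):
    """Return "+" or "-" by combining P(+|w) over the known words in text."""
    probs = [p_positive[w] for w in text.split() if w in p_positive]
    if not probs:
        return "+"  # arbitrary choice when no known word appears
    in_class = math.prod(probs)
    not_in_class = math.prod(1 - p for p in probs)
    return "+" if in_class / (in_class + not_in_class) >= 0.5 else "-"


model = train(TRAINING, VOCABULARY)
print(classify("i enjoy the fun", model))  # +
print(classify("such a bore", model))      # -
```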
Machine Learning: Clustering

Clustering
- Unsupervised learning
- Different algorithms:
  - Hierarchical clustering
  - K-Means clustering
  - ...
Common use cases:
- navigation habits
- online commerce
- social/political attitudes
- ...
Machine Learning: Clustering

K-Means clustering
K-Means aims at identifying
cluster centroids such that an
item belonging to a cluster X
is closer to the centroid of
cluster X than to the centroid
of any other cluster.
Machine Learning: Clustering

K-Means clustering
The algorithm requires a
number of clusters to start, in
this case 3. The centroids are
placed in the item space,
typically in random locations.
Machine Learning: Clustering

K-Means clustering
The algorithm will then assign
to each centroid all items that
are closer to it than to any
other centroid.
Machine Learning: Clustering

K-Means clustering
The centroids are then moved
to the center of mass of the
items in the clusters.
Machine Learning: Clustering

K-Means clustering
A new iteration occurs, taking
into account the new centroid
positions.
Machine Learning: Clustering

K-Means clustering
The centroids are again moved
to the center of mass of the
items in the clusters.
Machine Learning: Clustering

K-Means clustering
Another iteration occurs,
taking into account the new
centroid positions.
Machine Learning: Clustering

K-Means clustering
The centroids are again moved
to the center of mass of the
items in the clusters.
Machine Learning: Clustering

K-Means clustering
Another iteration occurs,
taking into account the new
centroid positions. Note that
this time the cluster
membership did not change.
The cluster centers will not
move anymore.
Machine Learning: Clustering

K-Means clustering
The solution is found.
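
The walkthrough above maps onto a short loop: assign items, move centroids, stop when membership no longer changes. This is a minimal sketch rather than the repository's implementation; starting from `k` randomly chosen data points (instead of arbitrary random locations) and keeping the old centroid for an empty cluster are choices made here.

```python
import math
import random


def closest(point, centroids):
    """Index of the centroid nearest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, k, seed=0, max_iterations=100):
    """Iterate assignment and centroid update; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [closest(p, centroids) for p in points]
        if new_assignment == assignment:
            break  # membership did not change: the solution is found
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
centroids, assignment = k_means(POINTS, k=3)
print(centroids)
print(assignment)
```

The result depends on the starting centroids, so the algorithm can settle on a poor local solution; running it with several seeds and keeping the best result is a common remedy.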
Machine Learning: Clustering

Hands on
Machine Learning: Neural networks

Neural networks
"A logical calculus of the ideas immanent in nervous activity",
by McCulloch and Pitts, 1943
Machine Learning: Neural networks

Neural networks
Feedforward Perceptron
Machine Learning: Neural networks

Neural networks
Logic operators with neural networks:
Threshold = 0

X0     X1     X2     Σ      Result
-10    0      0      -10    0
-10    0      20     10     1
-10    20     0      10     1
-10    20     20     30     1

OR operator
Machine Learning: Neural networks

Neural networks
Logic operators with neural networks:
Threshold = 0

X0     X1     X2     Σ      Result
-30    0      0      ?      ?
-30    0      20     ?      ?
-30    20     0      ?      ?
-30    20     20     ?      ?

which operator?
Machine Learning: Neural networks

Neural networks
Logic operators with neural networks:
Threshold = 0

X0     X1     X2     Σ      Result
-30    0      0      -30    0
-30    0      20     -10    0
-30    20     0      -10    0
-30    20     20     10     1

AND operator
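
Both tables can be reproduced with a single threshold unit. The sketch reads the table columns as weight times input, which is an interpretation made here: X0 is a bias input fixed at 1 with weight -10 or -30, and X1 and X2 are 0/1 inputs with weight 20.

```python
def threshold_unit(weights, inputs, threshold=0):
    """Fire (1) when the weighted sum of the inputs exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


# Weights read off the slides: bias weight first (its input X0 is always 1).
OR_WEIGHTS = (-10, 20, 20)
AND_WEIGHTS = (-30, 20, 20)

for x1 in (0, 1):
    for x2 in (0, 1):
        inputs = (1, x1, x2)
        print(x1, x2, threshold_unit(OR_WEIGHTS, inputs), threshold_unit(AND_WEIGHTS, inputs))
```

Only the bias weight differs between the two operators: it sets how many active inputs are needed to push the sum above the threshold.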
Machine Learning: Neural networks

Hands on
Machine Learning: Neural networks

Neural networks
Backpropagation
Phase 1: Propagation
- Forward propagation of a training pattern's input
through the neural network, to generate the
output activations
- Backward propagation of the output activations
through the neural network, using the training
pattern's target, to generate the deltas of all
output and hidden neurons
Phase 2: Weight update (for each weight)
- Multiply its output delta and input activation to
get the gradient of the weight
- Move the weight in the opposite direction of the
gradient by subtracting a fraction of the gradient
(the learning rate) from the weight
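
The two phases can be sketched for a small network with one hidden layer. All specifics here are assumptions for illustration: sigmoid activations, a squared-error loss, three hidden neurons, a learning rate of 0.5, and XOR as the training set.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward propagation; the first weight of every neuron is its bias."""
    hidden = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in w_hidden]
    output = sigmoid(w_out[0] + sum(wi * hi for wi, hi in zip(w_out[1:], hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, rate):
    """One backpropagation step for a single training pattern (updates in place)."""
    hidden, output = forward(x, w_hidden, w_out)
    # Phase 1 (backward): deltas of the output neuron and of each hidden neuron.
    delta_out = (output - target) * output * (1 - output)
    delta_hidden = [delta_out * w_out[j + 1] * h * (1 - h) for j, h in enumerate(hidden)]
    # Phase 2: gradient = delta * input activation; step against the gradient.
    w_out[0] -= rate * delta_out
    for j, h in enumerate(hidden):
        w_out[j + 1] -= rate * delta_out * h
    for j, w in enumerate(w_hidden):
        w[0] -= rate * delta_hidden[j]
        for i, xi in enumerate(x):
            w[i + 1] -= rate * delta_hidden[j] * xi


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
rng = random.Random(1)
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]  # 3 hidden neurons
w_out = [rng.uniform(-1, 1) for _ in range(4)]

error_before = mean_squared_error(XOR, w_hidden, w_out)
for _ in range(10000):
    for x, t in XOR:
        train_step(x, t, w_hidden, w_out, rate=0.5)
error_after = mean_squared_error(XOR, w_hidden, w_out)
print(f"mean squared error: {error_before:.3f} -> {error_after:.3f}")
```

XOR is used because a single threshold unit cannot represent it, so the hidden layer is doing real work; how low the error ends up depends on the random starting weights.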
Machine Learning: Neural networks

Neural networks
Multilayer perceptrons
Machine Learning: Neural networks

Hands on
Machine Learning: Genetic algorithms

Genetic algorithms
GA is a programming technique that mimics
biological evolution as a problem-solving strategy

Steps
- map the variables of the problem into a sequence of
bits, a chromosome
- create a random population of chromosomes
- let the population evolve using the laws of evolution:
  - the higher the fitness, the higher the chance of breeding
  - crossover of chromosomes
  - mutation in chromosomes
- stop the process when an optimal solution is found
or after n steps
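
The steps can be sketched on a toy problem. The fitness function (count of 1 bits), the population size, the mutation rate and the use of elitism (carrying the best chromosome over unchanged) are all assumptions made for this example, not details from the slides.

```python
import random


def fitness(chromosome):
    """Toy fitness (an assumption): the number of 1 bits, so all-ones is optimal."""
    return sum(chromosome)


def select(population, rng):
    """Fitness-proportionate selection: fitter chromosomes breed more often."""
    weights = [fitness(c) + 1 for c in population]  # +1 avoids all-zero weights
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover of two parent chromosomes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chromosome, rng, rate=0.01):
    """Flip each bit independently with probability rate."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


def evolve(length=20, size=30, max_steps=200, seed=0):
    """Return the best chromosome and the best fitness seen at each step."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(size)]
    best_per_step = []
    for _ in range(max_steps):
        best = max(population, key=fitness)
        best_per_step.append(fitness(best))
        if fitness(best) == length:  # optimal solution found
            break
        children = [
            mutate(crossover(select(population, rng), select(population, rng), rng), rng)
            for _ in range(size - 1)
        ]
        population = [best] + children  # elitism: keep the best chromosome unchanged
    return max(population, key=fitness), best_per_step


best, history = evolve()
print(fitness(best), len(history))
```

For a real problem only `fitness` and the bit encoding need to change; selection, crossover and mutation stay the same.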
Machine Learning: Genetic algorithms

Genetic algorithms
Mutation

Crossover
Machine Learning: Genetic algorithms

Hands on
Machine Learning

Thanks!
The code is available on:
https://github.com/andreaiacono/MachineLearning

Weitere ähnliche Inhalte

Was ist angesagt?

Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Simplilearn
 
Machine Learning 1 - Introduction
Machine Learning 1 - IntroductionMachine Learning 1 - Introduction
Machine Learning 1 - Introduction
butest
 
Download presentation source
Download presentation sourceDownload presentation source
Download presentation source
butest
 
Machine Learning : why we should know and how it works
Machine Learning : why we should know and how it worksMachine Learning : why we should know and how it works
Machine Learning : why we should know and how it works
Kevin Lee
 
3_learning.ppt
3_learning.ppt3_learning.ppt
3_learning.ppt
butest
 

Was ist angesagt? (20)

Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1
 
Classification with Naive Bayes
Classification with Naive BayesClassification with Naive Bayes
Classification with Naive Bayes
 
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
 
Machine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Machine learning by Dr. Vivek Vijay and Dr. Sandeep YadavMachine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Machine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
 
Intro to Machine Learning for non-Data Scientists
Intro to Machine Learning for non-Data ScientistsIntro to Machine Learning for non-Data Scientists
Intro to Machine Learning for non-Data Scientists
 
Introduction to-machine-learning
Introduction to-machine-learningIntroduction to-machine-learning
Introduction to-machine-learning
 
Ml ppt at
Ml ppt atMl ppt at
Ml ppt at
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
 
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
Machine learning basics using trees algorithm (Random forest, Gradient Boosting)
 
Machine Learning 1 - Introduction
Machine Learning 1 - IntroductionMachine Learning 1 - Introduction
Machine Learning 1 - Introduction
 
Using Deep Learning to Find Similar Dresses
Using Deep Learning to Find Similar DressesUsing Deep Learning to Find Similar Dresses
Using Deep Learning to Find Similar Dresses
 
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 
Introdution and designing a learning system
Introdution and designing a learning systemIntrodution and designing a learning system
Introdution and designing a learning system
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
boosting algorithm
boosting algorithmboosting algorithm
boosting algorithm
 
Download presentation source
Download presentation sourceDownload presentation source
Download presentation source
 
large scale Machine learning
large scale Machine learninglarge scale Machine learning
large scale Machine learning
 
Machine Learning : why we should know and how it works
Machine Learning : why we should know and how it worksMachine Learning : why we should know and how it works
Machine Learning : why we should know and how it works
 
Data-Driven Recommender Systems
Data-Driven Recommender SystemsData-Driven Recommender Systems
Data-Driven Recommender Systems
 
3_learning.ppt
3_learning.ppt3_learning.ppt
3_learning.ppt
 

Andere mochten auch

Capital Bikeshare Presentation
Capital Bikeshare PresentationCapital Bikeshare Presentation
Capital Bikeshare Presentation
donahuerm
 
Probabilistic generative models for machine vision
Probabilistic generative models for machine visionProbabilistic generative models for machine vision
Probabilistic generative models for machine vision
zukun
 
Discriminant analysis basicrelationships
Discriminant analysis basicrelationshipsDiscriminant analysis basicrelationships
Discriminant analysis basicrelationships
divyakalsi89
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
Elkana Rorio
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
butest
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine Learning
butest
 

Andere mochten auch (20)

CDTW Capstone Presentation
CDTW Capstone Presentation CDTW Capstone Presentation
CDTW Capstone Presentation
 
Red Blue Presentation
Red Blue PresentationRed Blue Presentation
Red Blue Presentation
 
Lets eat presentation_final_20160521
Lets eat presentation_final_20160521Lets eat presentation_final_20160521
Lets eat presentation_final_20160521
 
Personalizing a Stream of Content
Personalizing a Stream of ContentPersonalizing a Stream of Content
Personalizing a Stream of Content
 
Analysis of differential investor performance captstone presentation final
Analysis of differential investor  performance   captstone  presentation finalAnalysis of differential investor  performance   captstone  presentation final
Analysis of differential investor performance captstone presentation final
 
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
 
Capital Bikeshare Presentation
Capital Bikeshare PresentationCapital Bikeshare Presentation
Capital Bikeshare Presentation
 
Probabilistic generative models for machine vision
Probabilistic generative models for machine visionProbabilistic generative models for machine vision
Probabilistic generative models for machine vision
 
Machine learning
Machine learningMachine learning
Machine learning
 
Georgetown Data Analytics - Team 1 Capstone Project
Georgetown Data Analytics - Team 1 Capstone ProjectGeorgetown Data Analytics - Team 1 Capstone Project
Georgetown Data Analytics - Team 1 Capstone Project
 
Georgetown Data Analytics Project (Team DC)
Georgetown Data Analytics Project (Team DC)Georgetown Data Analytics Project (Team DC)
Georgetown Data Analytics Project (Team DC)
 
Discriminant analysis basicrelationships
Discriminant analysis basicrelationshipsDiscriminant analysis basicrelationships
Discriminant analysis basicrelationships
 
Intro au Big Data & Machine Learning
Intro au Big Data & Machine LearningIntro au Big Data & Machine Learning
Intro au Big Data & Machine Learning
 
Hotel Performance FINAL
Hotel Performance FINALHotel Performance FINAL
Hotel Performance FINAL
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Google TensorFlow Tutorial
Google TensorFlow TutorialGoogle TensorFlow Tutorial
Google TensorFlow Tutorial
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine Learning
 

Ähnlich wie Machine learning

Computational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding RegionsComputational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding Regions
butest
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819
HODCSE21
 
Search Engines
Search EnginesSearch Engines
Search Engines
butest
 
Data.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and predictionData.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and prediction
Margaret Wang
 

Ähnlich wie Machine learning (20)

The ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptxThe ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptx
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
 
Computational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding RegionsComputational Biology, Part 4 Protein Coding Regions
Computational Biology, Part 4 Protein Coding Regions
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.ppt
 
Essentials of machine learning algorithms
Essentials of machine learning algorithmsEssentials of machine learning algorithms
Essentials of machine learning algorithms
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
Support Vector Machine (Classification) - Step by Step
Support Vector Machine (Classification) - Step by StepSupport Vector Machine (Classification) - Step by Step
Support Vector Machine (Classification) - Step by Step
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Neural networks using tensor flow in amazon deep learning server
Neural networks using tensor flow in amazon deep learning serverNeural networks using tensor flow in amazon deep learning server
Neural networks using tensor flow in amazon deep learning server
 
Machine Learning Algorithms Review(Part 2)
Machine Learning Algorithms Review(Part 2)Machine Learning Algorithms Review(Part 2)
Machine Learning Algorithms Review(Part 2)
 
Machine learning @ Spotify - Madison Big Data Meetup
Machine learning @ Spotify - Madison Big Data MeetupMachine learning @ Spotify - Madison Big Data Meetup
Machine learning @ Spotify - Madison Big Data Meetup
 
Machine learning for computer vision part 2
Machine learning for computer vision part 2Machine learning for computer vision part 2
Machine learning for computer vision part 2
 
nnml.ppt
nnml.pptnnml.ppt
nnml.ppt
 
Data.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and predictionData.Mining.C.6(II).classification and prediction
Data.Mining.C.6(II).classification and prediction
 
Introduction to Deep Learning with Python
Introduction to Deep Learning with PythonIntroduction to Deep Learning with Python
Introduction to Deep Learning with Python
 
maxbox starter60 machine learning
maxbox starter60 machine learningmaxbox starter60 machine learning
maxbox starter60 machine learning
 
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
 
Naive_hehe.pptx
Naive_hehe.pptxNaive_hehe.pptx
Naive_hehe.pptx
 
Design and Analysis of Algorithm Brute Force 1.ppt
Design and Analysis of Algorithm Brute Force 1.pptDesign and Analysis of Algorithm Brute Force 1.ppt
Design and Analysis of Algorithm Brute Force 1.ppt
 

Mehr von Andrea Iacono (6)

Mapreduce by examples
Mapreduce by examplesMapreduce by examples
Mapreduce by examples
 
The Pregel Programming Model with Spark GraphX
The Pregel Programming Model with Spark GraphXThe Pregel Programming Model with Spark GraphX
The Pregel Programming Model with Spark GraphX
 
Graphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphXGraphs are everywhere! Distributed graph computing with Spark GraphX
Graphs are everywhere! Distributed graph computing with Spark GraphX
 
Real time and reliable processing with Apache Storm
Real time and reliable processing with Apache StormReal time and reliable processing with Apache Storm
Real time and reliable processing with Apache Storm
 
Functional Java 8 in everyday life
Functional Java 8 in everyday lifeFunctional Java 8 in everyday life
Functional Java 8 in everyday life
 
How to build_a_search_engine
How to build_a_search_engineHow to build_a_search_engine
How to build_a_search_engine
 

Kürzlich hochgeladen

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

Machine learning

  • 2. Machine Learning: Intro What is Machine Learning? [Wikipedia]: a branch of artificial intelligence that allows the construction and the study of systems that can learn from data
  • 3. Machine Learning: Intro Some approaches: - Regression analysis - Similarity and metric learning - Decision tree learning - Association rule learning - Artificial neural networks - Genetic programming - Support vector machines (classification and regression analysis) - Clustering - Bayesian networks
  • 4. Machine Learning: Intro Supervised learning vs Unsupervised learning Machine learning vs Data mining
  • 5. Machine Learning: Regression analysis Regression Analysis A statistical technique for estimating the relationships among a dependent variable and independent variables
  • 6. Machine Learning: Regression analysis Prediction of house prices Size (x) Price (y) 0.80 70 0.90 83 1.00 74 1.10 93 1.40 89 1.40 58 1.50 85 1.60 114 1.80 95 2.00 100 2.40 138 2.50 111 2.70 124 3.20 172 3.50 172
  • 7. Machine Learning: Regression analysis Prediction of house prices Hypothesis: h θ ( x )=θ0 + θ1 x
  • 8. Machine Learning: Regression analysis Prediction of house prices Hypothesis: h θ (x )=θ0 + θ1 x Cost function for linear regression: m 1 J (θ 0, θ1 )= (h θ (x (i) )− y(i ) )2 ∑ 2m i=1
  • 9. Machine Learning: Regression analysis Prediction of house prices Hypothesis: h θ (x )=θ0 + θ1 x Cost function for linear regression: m 1 J (θ 0, θ1 )= (h θ (x (i) )− y(i ) )2 ∑ 2m i=1 Gradient Descent repeat until convergence : m 1 (i ) (i ) θ 0=θ 0−α ∑ (hθ ( x )− y ) m i =1 m 1 θ1 =θ1 −α ∑ [(h θ (x (i) )− y(i )) x (i) ] m i =1
  • 10. Machine Learning: Regression analysis Prediction of house prices Iterative minimization of cost function with gradient descent
  • 11. Machine Learning: Regression analysis Hands on
  • 12. Machine Learning: Regression analysis Regression analysis - one / multiple variables - linear / higher order curves - several optimization algorithms - linear regression - logistic regression - simulated annealing - ...
  • 13. Machine Learning: Regression analysis Overfitting vs underfitting
  • 14. Machine Learning: Similarity and metric learning Similarity and metric learning - concept of distance
  • 15. Machine Learning: Similarity and metric learning Euclidean distance: $d_{\text{euclidean}}(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
  • 16. Machine Learning: Similarity and metric learning Manhattan distance: $d_{\text{manhattan}}(p, q) = \sum_{i=1}^{n} |p_i - q_i|$
  • 17. Machine Learning: Similarity and metric learning Pearson's correlation: $r(p, q) = \frac{\sum_{i=1}^{n} p_i q_i - \frac{\sum_{i=1}^{n} p_i \sum_{i=1}^{n} q_i}{n}}{\sqrt{\left( \sum_{i=1}^{n} p_i^2 - \frac{\left( \sum_{i=1}^{n} p_i \right)^2}{n} \right) \left( \sum_{i=1}^{n} q_i^2 - \frac{\left( \sum_{i=1}^{n} q_i \right)^2}{n} \right)}}$
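The three measures from slides 15 to 17 translate almost term by term into code. This is a small sketch with assumed function names; returning 0.0 when the Pearson denominator is zero (a constant vector) is an assumption made here, not something the slides specify.

```python
from math import sqrt


def euclidean_distance(p, q):
    """Square root of the summed squared coordinate differences."""
    return sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def manhattan_distance(p, q):
    """Sum of the absolute coordinate differences."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def pearson_correlation(p, q):
    """Pearson's correlation, written with the same sums as the slide formula."""
    n = len(p)
    sum_p, sum_q = sum(p), sum(q)
    sum_pq = sum(pi * qi for pi, qi in zip(p, q))
    sum_p2 = sum(pi ** 2 for pi in p)
    sum_q2 = sum(qi ** 2 for qi in q)
    numerator = sum_pq - sum_p * sum_q / n
    denominator = sqrt((sum_p2 - sum_p ** 2 / n) * (sum_q2 - sum_q ** 2 / n))
    if denominator == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return numerator / denominator


print(euclidean_distance([0, 0], [3, 4]))
print(manhattan_distance([0, 0], [3, 4]))
print(pearson_correlation([1, 2, 3], [2, 4, 6]))
```

Note that the first two are distances (smaller means more similar), while Pearson's correlation is a similarity in [-1, 1] (larger means more similar).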
  • 18. Machine Learning: Similarity and metric learning Collaborative filtering Searches a large group of users to find a small subset with tastes like yours. Based on what this subset likes or dislikes, the system can recommend other items to you. Two main approaches: - User based filtering - Item based filtering
  • 19. Machine Learning: Similarity and metric learning User based filtering - based on ratings given to the items, we can measure the distance among users - we can recommend to the user the items that have the highest ratings among the closest users
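The two steps of user based filtering can be sketched as follows. The ratings table is invented for illustration, and the choices to compare users only on the items both have rated and to use a single nearest neighbour (`k=1`) are assumptions, not the method used in the linked repository.

```python
from math import sqrt

# Hypothetical ratings, invented for illustration: user -> {item: rating}.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "C": 1, "D": 5, "E": 2},
    "cat": {"A": 1, "B": 2, "C": 5, "D": 1, "E": 5},
}


def user_distance(ratings, u, v):
    """Euclidean distance between two users over the items both have rated."""
    shared = set(ratings[u]) & set(ratings[v])
    if not shared:
        return float("inf")
    return sqrt(sum((ratings[u][i] - ratings[v][i]) ** 2 for i in shared))


def recommend(ratings, user, k=1):
    """Rank the items the user has not rated by their average rating
    among the k closest users."""
    others = [v for v in ratings if v != user]
    neighbours = sorted(others, key=lambda v: user_distance(ratings, user, v))[:k]
    scores = {}
    for v in neighbours:
        for item, rating in ratings[v].items():
            if item not in ratings[user]:
                scores.setdefault(item, []).append(rating)
    averaged = {item: sum(rs) / len(rs) for item, rs in scores.items()}
    return sorted(averaged, key=averaged.get, reverse=True)


print(recommend(ratings, "ann"))  # ['D', 'E']
```

Here "bob" is the closest user to "ann", so the items he rated that she has not seen are recommended in order of his ratings.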
  • 20. Machine Learning: Similarity and metric learning Hands on
  • 21. Machine Learning: Similarity and metric learning Is user based filtering good for - scalability? - sparse data? - quickly changing data?
  • 22. Machine Learning: Similarity and metric learning Is user based filtering good for - scalability? - sparse data? - quickly changing data? No, it's better to use item based filtering
  • 23. Machine Learning: Similarity and metric learning Euclidean distance for item based filtering: nothing has changed! - based on the ratings given by the users, we can measure the distance among items - we can recommend items to a user by picking the items that are closest to the ones the user rated highest
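Item based filtering reuses the same distance, applied to a transposed ratings table. The sketch below uses invented ratings and compares candidates only against the user's single highest rated item, which is a simplifying assumption.

```python
from math import sqrt

# Hypothetical ratings, invented for illustration: user -> {item: rating}.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 5, "B": 5, "C": 1, "D": 5, "E": 2},
    "cat": {"A": 1, "B": 2, "C": 5, "D": 1, "E": 5},
}


def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item


def item_distance(by_item, a, b):
    """Euclidean distance between two items over the users who rated both."""
    shared = set(by_item[a]) & set(by_item[b])
    if not shared:
        return float("inf")
    return sqrt(sum((by_item[a][u] - by_item[b][u]) ** 2 for u in shared))


def recommend_items(ratings, user):
    """Rank unrated items by their distance to the user's highest rated item."""
    by_item = transpose(ratings)
    favourite = max(ratings[user], key=ratings[user].get)
    candidates = [item for item in by_item if item not in ratings[user]]
    return sorted(candidates, key=lambda item: item_distance(by_item, favourite, item))


print(recommend_items(ratings, "ann"))  # ['D', 'E']
```

Because item-to-item distances change slowly, they can be precomputed, which is one reason this approach tends to cope better with large or sparse data.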
  • 24. Machine Learning: Similarity and metric learning Hands on
  • 25. Machine Learning: Bayes' classifier Bayes' theorem $P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$ Example: given a company where 70% of developers use Java and 30% use C++, and knowing that half of the Java developers always use the enhanced for loop, if you look at the snippet: for (int j=0; j<100; j++) { t = tests[j]; } what is the probability that the developer who wrote it uses Java?
  • 26. Machine Learning: Bayes' classifier Bayes' theorem $P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$ Example: given a company where 70% of developers use Java and 30% use C++, and knowing that half of the Java developers always use the enhanced for loop, if you look at the snippet: for (int j=0; j<100; j++) { t = tests[j]; } what is the probability that the developer who wrote it uses Java? Hint: A = developer uses Java, B = developer writes old for loops
  • 27. Machine Learning: Bayes' classifier Bayes' theorem $P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$ Example: given a company where 70% of developers use Java and 30% use C++, and knowing that half of the Java developers always use the enhanced for loop, if you look at the snippet: for (int j=0; j<100; j++) { t = tests[j]; } what is the probability that the developer who wrote it uses Java? Solution: A = developer uses Java, B = developer writes old for loops. P(A) = prob. that a developer uses Java = 0.7. P(B) = prob. that any developer uses the old for loop = 0.3 + 0.7 * 0.5 = 0.65 (all C++ developers, plus the half of the Java developers who do not use the enhanced for loop). P(B|A) = prob. that a Java developer uses the old for loop = 0.5. $P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} = \frac{0.5 \cdot 0.7}{0.65} \approx 0.54$
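The arithmetic of the solution can be checked in a few lines. The variable names are illustrative, and `p_old_given_cpp = 1.0` makes explicit the assumption implied by the slide's value of P(B): every C++ developer writes old style for loops.

```python
p_java = 0.7            # P(A): developer uses Java
p_cpp = 0.3             # developer uses C++
p_old_given_java = 0.5  # P(B|A): Java developer writes old for loops
p_old_given_cpp = 1.0   # assumption implied by P(B) = 0.3 + 0.7 * 0.5

# Law of total probability gives P(B).
p_old = p_old_given_cpp * p_cpp + p_old_given_java * p_java

# Bayes' theorem gives P(A|B).
p_java_given_old = p_old_given_java * p_java / p_old

print(round(p_old, 2))             # 0.65
print(round(p_java_given_old, 2))  # 0.54
```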
  • 28. Machine Learning: Bayes' classifier Naive Bayes' classifier - supervised learning - trained on a set of known classes - computes probabilities of elements to be in a class - smoothing required $P_c(w_1, \ldots, w_n) = \frac{\prod_{i=1}^{n} P(c \mid w_i)}{\prod_{i=1}^{n} P(c \mid w_i) + \prod_{i=1}^{n} \left( 1 - P(c \mid w_i) \right)}$
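The combining formula on this slide can be written as a short function. This is a sketch with an assumed function name; the input is the list of per-word probabilities $P(c \mid w_i)$, which are taken as given here.

```python
from math import prod


def combine(probabilities):
    """Combine per-word probabilities P(c|w_i) into P_c(w_1, ..., w_n)."""
    in_class = prod(probabilities)
    not_in_class = prod(1 - p for p in probabilities)
    return in_class / (in_class + not_in_class)


print(combine([0.5, 0.5]))  # 0.5
print(combine([0.9, 0.8]))
```

If one word has probability exactly 0 and another exactly 1, both products vanish and the division fails, which illustrates why the slide lists smoothing as required.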
  • 29. Machine Learning: Bayes' classifier Naive Bayes' classifier Example - we want a classifier for Twitter messages - define a set of classes: {art, tech, home, events, ...} - train the classifier with a set of already classified tweets - when a new tweet arrives, the classifier will (hopefully) tell us which class it belongs to
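The train-then-classify workflow above can be sketched with a tiny word count classifier. Note that this uses the common multinomial formulation with add-one (Laplace) smoothing and log probabilities, which is a swapped-in variant rather than the combining formula from slide 28; the class name, methods, and training tweets are all invented for illustration.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayes:
    """Multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # class -> word -> count
        self.doc_counts = Counter()              # class -> training documents
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, already classified tweets.
classifier = NaiveBayes()
classifier.train("new phone with faster chip released", "tech")
classifier.train("software update fixes security bug", "tech")
classifier.train("gallery opens modern painting exhibition", "art")
classifier.train("sculpture and painting show this weekend", "art")

print(classifier.classify("painting exhibition downtown"))    # art
print(classifier.classify("security bug in phone software"))  # tech
```

The same structure would serve for sentiment analysis by training with the two classes "+" and "-" instead of topic labels.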
  • 30. Machine Learning: Bayes' classifier Hands on
  • 31. Machine Learning: Bayes' classifier Sentiment analysis - define two classes: { +, - } - define a set of words: { like, enjoy, hate, bore, fun, … } - train a NBC (naive Bayes' classifier) with a set of known +/- comments - let the NBC classify any new comment as + or - Performance depends on the quality of the training set
  • 32. Machine Learning: Clustering Clustering - Unsupervised learning - Different algorithms: - Hierarchical clustering - K-Means clustering - ... Common use cases: - navigation habits - online commerce - social/political attitudes - ...
  • 33. Machine Learning: Clustering K-Means clustering K-Means aims to identify cluster centroids such that an item belonging to cluster X is closer to the centroid of cluster X than to the centroid of any other cluster.
  • 34. Machine Learning: Clustering K-Means clustering The algorithm requires a number of clusters to start, in this case 3. The centroids are placed in the item space, typically in random locations.
  • 35. Machine Learning: Clustering K-Means clustering The algorithm will then assign to each centroid all items that are closer to it than to any other centroid.
  • 36. Machine Learning: Clustering K-Means clustering The centroids are then moved to the center of mass of the items in the clusters.
  • 37. Machine Learning: Clustering K-Means clustering A new iteration occurs, taking into account the new centroid positions.
  • 38. Machine Learning: Clustering K-Means clustering The centroids are again moved to the center of mass of the items in the clusters.
  • 39. Machine Learning: Clustering K-Means clustering Another iteration occurs, taking into account the new centroid positions.
  • 40. Machine Learning: Clustering K-Means clustering The centroids are again moved to the center of mass of the items in the clusters.
  • 41. Machine Learning: Clustering K-Means clustering Another iteration occurs, taking into account the new centroid positions. Note that this time the cluster membership did not change. The cluster centers will not move anymore.
  • 42. Machine Learning: Clustering K-Means clustering The solution is found.
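The iteration walked through on slides 34 to 42 (assign items to the nearest centroid, move each centroid to the center of mass, stop when membership no longer changes) can be sketched as below. Starting the centroids on randomly chosen data points, the `seed` parameter, and the iteration cap are assumptions of this sketch; the example uses 2 clusters on invented points rather than the 3 clusters of the slides.

```python
import random
from math import dist


def k_means(points, k, seed=0, max_iterations=100):
    """Return (centroids, assignment) for a list of coordinate tuples."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k randomly chosen data points.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iterations):
        # Assign each item to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # membership did not change: the solution is found
        assignment = new_assignment
        # Move each centroid to the center of mass of its cluster.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignment


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, assignment = k_means(points, k=2)
print(sorted(centroids))
print(assignment)
```

On larger data sets the result can depend on the initial centroid positions, so running the algorithm several times with different seeds is a common precaution.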
  • 44. Machine Learning: Neural networks Neural networks A logical calculus of the ideas immanent in nervous activity by McCulloch and Pitts in 1943
  • 45. Machine Learning: Neural networks Neural networks Feedforward Perceptron
  • 46. Machine Learning: Neural networks Neural networks Logic operators with neural networks: Threshold = 0 (one value per input combination) X0: -10, -10, -10, -10; X1: 0, 0, 20, 20; X2: 0, 20, 0, 20; Σ: -10, 10, 10, 30; Result: 0, 1, 1, 1 (OR operator)
  • 47. Machine Learning: Neural networks Neural networks Logic operators with neural networks: Threshold = 0 (one value per input combination) X0: -30, -30, -30, -30; X1: 0, 0, 20, 20; X2: 0, 20, 0, 20; Σ: ?; Result: ? Which operator?
  • 48. Machine Learning: Neural networks Neural networks Logic operators with neural networks: Threshold = 0 (one value per input combination) X0: -30, -30, -30, -30; X1: 0, 0, 20, 20; X2: 0, 20, 0, 20; Σ: -30, -10, -10, 10; Result: 0, 0, 0, 1 (AND operator)
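The two tables can be reproduced with a single threshold neuron. In this sketch the X0 column is read as a constant bias term (-10 or -30) and the X1 and X2 columns as binary inputs multiplied by a weight of 20; the function and constant names are assumptions.

```python
def neuron(weights, inputs, threshold=0):
    """Output 1 when the bias plus the weighted inputs exceeds the threshold."""
    bias, *input_weights = weights
    total = bias + sum(w * x for w, x in zip(input_weights, inputs))
    return 1 if total > threshold else 0


OR_WEIGHTS = (-10, 20, 20)   # bias, weight of X1, weight of X2
AND_WEIGHTS = (-30, 20, 20)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              "OR:", neuron(OR_WEIGHTS, (x1, x2)),
              "AND:", neuron(AND_WEIGHTS, (x1, x2)))
```

Only the bias differs between the two operators: with -10 a single active input is enough to cross the threshold, with -30 both are needed.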
  • 49. Machine Learning: Neural networks Hands on
  • 50. Machine Learning: Neural networks Neural networks Backpropagation Phase 1: Propagation - Forward propagation of a training pattern's input through the neural network in order to generate the output activations - Backward propagation of the output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons Phase 2: Weight update, for each weight: - Multiply its output delta and input activation to get the gradient of the weight - Move the weight in the opposite direction of the gradient by subtracting a ratio of it (the learning rate) from the weight
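The two phases can be sketched for a small network with two inputs, two hidden neurons, and one output. The sigmoid activation, the squared error loss, the weight layout `[bias, w1, w2]`, the starting weights, and the learning rate of 0.5 are all assumptions of this sketch, not details given on the slide.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def forward(w_hidden, w_output, inputs):
    """Forward propagation; every weight row is [bias, w1, w2, ...]."""
    hidden = [
        sigmoid(row[0] + sum(w * x for w, x in zip(row[1:], inputs)))
        for row in w_hidden
    ]
    output = sigmoid(w_output[0] + sum(w * h for w, h in zip(w_output[1:], hidden)))
    return hidden, output


def train_step(w_hidden, w_output, inputs, target, rate=0.5):
    """One backpropagation step for a single training pattern."""
    # Phase 1: forward propagation, then deltas from the output backwards.
    hidden, output = forward(w_hidden, w_output, inputs)
    delta_out = (output - target) * output * (1 - output)
    delta_hidden = [
        delta_out * w_output[j + 1] * h * (1 - h) for j, h in enumerate(hidden)
    ]
    # Phase 2: gradient = delta * input activation; subtract a ratio of it.
    new_output = [w_output[0] - rate * delta_out] + [
        w - rate * delta_out * h for w, h in zip(w_output[1:], hidden)
    ]
    new_hidden = [
        [row[0] - rate * d] + [w - rate * d * x for w, x in zip(row[1:], inputs)]
        for row, d in zip(w_hidden, delta_hidden)
    ]
    return new_hidden, new_output


# Hypothetical starting weights and a single training pattern.
hidden_weights = [[0.1, 0.2, -0.1], [-0.2, 0.1, 0.3]]
output_weights = [0.05, 0.4, -0.3]
pattern, target = (1, 0), 1

for step in range(50):
    hidden_weights, output_weights = train_step(
        hidden_weights, output_weights, pattern, target
    )
print(forward(hidden_weights, output_weights, pattern)[1])
```

Repeating the step on the same pattern moves the output towards the target; training on a full data set would loop over all patterns instead.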
  • 51. Machine Learning: Neural networks Neural networks Multilayer perceptrons
  • 52. Machine Learning: Neural networks Hands on
  • 53. Machine Learning: Genetic algorithms Genetic algorithms A GA is a programming technique that mimics biological evolution as a problem-solving strategy Steps - map the variables of the problem into a sequence of bits, a chromosome - create a random population of chromosomes - let the population evolve using evolution laws: - the higher the fitness, the higher the chance of breeding - crossover of chromosomes - mutation in chromosomes - stop the process when an optimal solution is found or after n steps
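The steps above can be sketched on a toy problem in which the fitness of a chromosome is simply its number of 1 bits. The toy fitness, the parameter values, single point crossover, and keeping the best chromosome in every generation (elitism) are assumptions of this sketch rather than choices stated on the slide.

```python
import random


def evolve(fitness, target, length=20, population_size=30,
           max_generations=200, mutation_rate=0.02, seed=0):
    """Evolve bit-list chromosomes; fitness must return a non-negative number."""
    rng = random.Random(seed)
    # Create a random population of chromosomes.
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(population_size)
    ]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        if fitness(best) >= target:
            break  # optimal solution found
        # The higher the fitness, the higher the chance of breeding.
        weights = [fitness(c) + 1 for c in population]
        next_generation = [best[:]]  # assumption: elitism keeps the current best
        while len(next_generation) < population_size:
            mother, father = rng.choices(population, weights=weights, k=2)
            # Crossover: head of one parent, tail of the other.
            cut = rng.randint(1, length - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_generation.append(child)
        population = next_generation
        best = max(population, key=fitness)
    return best


best = evolve(fitness=sum, target=20)
print(best, sum(best))
```

For a real problem, the main work is choosing how to map the variables onto the bits of the chromosome and how to score a chromosome with the fitness function.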
  • 54. Machine Learning: Genetic algorithms Genetic algorithms Mutation Crossover
  • 55. Machine Learning: Genetic algorithms Hands on
  • 56. Machine Learning Thanks! The code is available on: https://github.com/andreaiacono/MachineLearning