The document introduces artificial intelligence and defines it as machines exhibiting human-like intelligence through tasks like reasoning, planning, learning from data, natural language processing, and perception. It then discusses supervised machine learning, where humans provide training data to teach algorithms to classify new examples, and some challenges of this approach including the need for large, representative training datasets customized to each user.
3. Artificial intelligence: a definition
“An ideal intelligent machine is a flexible rational agent that perceives its
environment and takes actions that maximize its chance of success at an
arbitrary goal”
7. Are machines really intelligent?
Machines learn to do a task without having a proper knowledge base
Machines are NOT intelligent! They mimic human behaviors without attaching
any semantics to them
Human reasoning
✔ Know precisely what a cat is
✔ Recognize the cat in the photo
✔ “The photo contains a cat”
Machine “reasoning”
✔ Cat = set of features
✔ Lack of features in “ears” area
✔ “The photo does not contain a cat”
10. Well, sometimes even humans are not so good!
“Machines cannot learn what humans cannot do”
11. Beware!
Artificial intelligence is NOT the solution to every task you could think of
(machines are not as intelligent as humans, remember?)
12. Remember this guy?
Andrew exhibits artificial general
intelligence (or, full AI):
“The machine can successfully perform any
intellectual task that a human being can”
(BTW, thanks Asimov for the wishful thinking)
13. The bicentennial man is NOT the reality
No matter what you wish, what you can
achieve is what is called weak AI:
“Weak AI defines a non-sentient computer
intelligence that is focused on one narrow
task”
(So long, Andrew, right?)
14. And again, beware!
Artificial intelligence is NOT typical of just robots
(computers are machines, too)
(or, better: robots are computers, too)
16. Reasoning
Reasoning is the act of applying logic to establish and verify facts
Example: Automated theorem proving
http://www.repubblica.it/scienze/2016/06/13/news/dimostrazione_matematica_piu_lunga-141910538/?ref=HREC1-36
17. Planning
Planning is the act of realizing strategies or action sequences, typically for execution by:
- intelligent agents
- autonomous robots
- unmanned vehicles
18. Learning
Learners are systems that can learn from data
Example: distinguish between spam/non-spam emails
...We will discuss this later…
19. Natural language processing (NLP)
NLP is the act of understanding natural language, i.e., enabling computers to
derive meaning from human or natural input
20. Perception
Perception is the identification and interpretation of sensory information in
order to understand and represent the environment
22. What is an algorithm?
An algorithm is a self-contained step-by-step
set of operations to be performed
23. What is machine learning?
Machine learning explores the study and construction of algorithms that can:
- learn from data
- make predictions on data
24. Different “types” of machine learning
Supervised learning: humans assist computers while learning
Unsupervised learning: humans just give data to computers, and
let them understand the rules governing data by themselves
Reinforcement learning: humans let computers do their
learning, but from time to time help them in
understanding whether they are learning well
27. Definitions
From now on, I’ll refer to the following concepts:
(Well, I could use boring images of computers, but it’s cuter in this way)
The learner (the dog)
Task: “sit down”
Features (x):
- the amount of time required for him to decide to sit down
- the volume of his owner’s voice
Class (y):
- positive class (label: “happy”)
- negative class (label: “sad”)
This is called binary classification (two classes in the outcomes)
28. How to learn to make your owner happy?
The learner has to:
- Observe the distribution of classes with
respect to features (i.e., the voice volume
and time-to-sit distributions)
- Understand which feature values
(time/volume) make the
owner happy/sad
29. How does it work?
Data collection: build a set of data used to train the model
Training: make your learner read and understand the data in the training set, so that it
extracts the rules that describe such data
Validation: test your algorithm, to evaluate its performance on fresh data
(Figure: the collected data are split into a training set and a test set)
30. Data collection
What is “data collection” in our case?
1. we make the dog sit in different conditions
- different voice volumes
- different times needed to sit down
2. we observe the owner to understand if he is happy or sad
31. Data collection: representation
The data we collect are points in this
space, where:
- the coordinates of each point are its features
(voice volume and time to sit)
- each point is associated with a
class (happy/sad)
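A minimal sketch of this representation in Python. The measurements, their scales, and the names `Sample`, `voice_volume`, and `time_to_sit` are invented for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    voice_volume: float  # feature, hypothetical scale from 0 to 10
    time_to_sit: float   # feature, hypothetical seconds
    label: str           # class: "happy" or "sad"


# Invented data points, for illustration only
dataset = [
    Sample(2.0, 1.0, "happy"),
    Sample(3.0, 2.5, "happy"),
    Sample(8.0, 7.0, "sad"),
    Sample(9.0, 6.0, "sad"),
]

# Features (x) and classes (y) kept as two parallel lists
X = [(s.voice_volume, s.time_to_sit) for s in dataset]
y = [s.label for s in dataset]
```

Each entry of `X` is one point in the feature space, and the entry of `y` at the same position is its class.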
32. Data collection: representation
The further we move in this direction, the sadder
the owner should be, because he will
have to scream and wait for the dog to sit
Hence, we expect to find more “sad” samples
in the upper-right corner of the graph
34. Data collection: labeling errors
(Table with columns Features (x) and Class (y))
There could be some labels that are wrongly attributed to data points
The more of these errors, the higher the probability that the learner
will learn wrong things
35. Data collection: forming training and test set
(Table with columns Features (x) and Class (y))
We keep these as the training set and use them to train our learner
We keep these as the test set and use them to compute the performance of the learner
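A possible way to form the two sets, sketched in Python. The function name `split_train_test`, the 25% test fraction, and the rule used to invent the labels are assumptions made only for this example.

```python
import random


def split_train_test(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and return (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Invented samples: ((voice volume, time to sit), label).
# The labeling rule below is made up just to have some data.
collected = [((v, t), "happy" if v + t < 10 else "sad")
             for v in range(1, 9) for t in range(1, 9)]

training_set, test_set = split_train_test(collected)
```

Shuffling before splitting is a common precaution, so that neither set ends up with only one region of the feature space.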
36. Training (i.e., using training set to learn from data)
The learner tries to understand how to
divide “happy” and “sad” samples
This will help him classify new
samples: when a new sample
arrives, the learner will know whether it falls
on this or that side of the line
(Figure: a line splits the feature space into a SAD region and a HAPPY region)
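The slides do not name a specific learning algorithm. As one simple possibility, the sketch below uses a nearest-centroid classifier: training computes the average point of each class, and the dividing line is implicitly the set of points equally distant from the two averages. All data values are invented.

```python
def train_nearest_centroid(samples):
    """Training: compute one centroid (mean feature vector) per class."""
    sums, counts = {}, {}
    for (volume, wait), label in samples:
        sum_volume, sum_wait = sums.get(label, (0.0, 0.0))
        sums[label] = (sum_volume + volume, sum_wait + wait)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sv / counts[label], sw / counts[label])
            for label, (sv, sw) in sums.items()}


def classify(centroids, features):
    """Assign the class whose centroid is closest to the new sample."""
    volume, wait = features
    return min(centroids,
               key=lambda label: (centroids[label][0] - volume) ** 2
                                 + (centroids[label][1] - wait) ** 2)


# Invented training set: ((voice volume, time to sit), label)
training_set = [
    ((1.0, 1.0), "happy"), ((2.0, 3.0), "happy"), ((3.0, 2.0), "happy"),
    ((7.0, 8.0), "sad"), ((8.0, 6.0), "sad"), ((9.0, 7.0), "sad"),
]
centroids = train_nearest_centroid(training_set)
```

With these numbers, `classify(centroids, (2.5, 2.0))` lands on the “happy” side and `classify(centroids, (8.5, 8.0))` on the “sad” side.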
39. Training with a small dataset
(Figure: how the learner divides between happy and sad, with SAD and HAPPY regions)
Probably our mighty owner would still be
happy here, but the learner does not have
enough data to know it
OVERFITTING
The learner learned the available data,
but not the rule that generated them
41. Training with an adequate amount of data
(Figure: how the learner divides between happy and sad, with SAD and HAPPY regions)
The samples that fall on the wrong side of the line determine
the classification error
42. The model
By the way, the boundary dividing the space into
“what belongs to the SAD class” and “what
belongs to the HAPPY class” is called the
model
We can see it as a function
y = f(x)
i.e., the function that computes the class
given the features
43. How can I use the model?
Given a new sample (e.g., from the test set)
described by its features, i.e.:
- time needed to sit down
- volume of voice
the model tells us which class it will be
associated with:
- happy, if below the line
- sad, if above the line
(Figure: a new sample lying above the line will be associated with the “sad” class)
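As a sketch, the model can be written as a small Python function. It assumes that voice volume is on the horizontal axis and time to sit on the vertical axis; the slope and intercept of the line are hypothetical values, not learned from real data.

```python
def model(time_to_sit, voice_volume, slope=-1.0, intercept=10.0):
    """y = f(x): return the class of a sample, given a dividing line.

    The line is time_to_sit = slope * voice_volume + intercept.
    Both parameters are hypothetical placeholders.
    """
    line_height = slope * voice_volume + intercept
    return "sad" if time_to_sit > line_height else "happy"
```

For example, `model(2.0, 3.0)` falls below the line, while `model(8.0, 6.0)` falls above it.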
44. Validation: computing the performance on test set
To measure the performance of a model:
                      Actual positive        Actual negative
Predicted positive    True Positive (TP)     False Positive (FP)
Predicted negative    False Negative (FN)    True Negative (TN)
$\text{Precision} = \frac{TP}{TP + FP}$
(how many of the positive samples the model
recognized were actually positive)
$\text{Recall} = \frac{TP}{TP + FN}$
(how many of the positive samples the model
was able to recognize)
$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
(how many samples were detected with their actual,
correct label)
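The three measures can be computed directly from the four counts. The sketch below uses invented labels and treats “happy” as the positive class; the function names are chosen for this example only.

```python
def confusion_counts(actual, predicted, positive="happy"):
    """Count TP, FP, FN, TN by comparing predicted and actual classes."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    return tp / (tp + fp)


def recall(tp, fn):
    return tp / (tp + fn)


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + tn + fp + fn)


# Invented test-set labels and model predictions
actual = ["happy", "happy", "happy", "sad", "sad", "sad", "sad", "happy"]
predicted = ["happy", "happy", "sad", "sad", "happy", "sad", "sad", "happy"]
tp, fp, fn, tn = confusion_counts(actual, predicted)
```

Note that `precision` and `recall` divide by zero when the model predicts no positives or the data contain none; a real implementation would need to handle those cases.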
45. Examples of application of supervised learning
Given an ECG beat, determine which arrhythmia class it belongs to
- Features (x): the points constituting the beat
- Classes (y): normal beat, arrhythmia type 1, arrhythmia type 2…
Given some HRV characteristics, determine if the patient was stressed during
their acquisition
- Features (x): the HRV characteristics
- Classes (y): stressed, non-stressed
46. Problems of supervised learning
Expensive data gathering
Imagine how many times the dog has to sit to
learn how to make his owner happy
The larger the training set:
- The better the result
- The larger the cost
47. Problems of supervised learning
A learner is customized
If owners were all the same, a single learner could
make all owners happy…
…but since owners have different behaviors and
preferences, each owner has to be provided with
his own learner
49. Problems of supervised learning
A learner is customized
What does this mean?
To satisfy the first owner, this learner can be trained to sit down...
- ...with this dataset: this trains a model with a certain accuracy
- ...or with this dataset: this trains the same model as before, with a different accuracy
These models can both be used to satisfy the first owner, but not the second
To satisfy the second owner instead, this learner can be trained to fetch the ball...
- ...with this dataset: this trains a different model with its own accuracy
This model can be used to satisfy the second owner, but not the first
52. Problems of supervised learning
A learner is customized
More in “health” terms:
To detect arrhythmia in ECG, we build a model...
- ...with this dataset (beats labeled “normal” / “arrhythmia”): this trains a model with a certain accuracy
- ...or with this dataset (beats labeled “normal” / “arrhythmia”): this trains the same model as before, with a different accuracy
To detect stress in HRV, we build a model...
- ...with this dataset: this trains a different model with its own accuracy
2.2 1.7 … 5.0 “stress”
7.8 1.5 … 4.6 “stress”
1.4 5.7 … 9.2 “no stress”
53. Problems of supervised learning
Training data: a good representative of real data?
A practical example:
I may decide to train my learner to recognize cats
in photos (classes: “there is a cat” / “there is not a
cat”), using super beautiful photos that are
taken with professional cameras
The model I extract will probably not work with
pictures of my cat, taken with a smartphone,
with a different resolution and different cat
postures
54. Problems of supervised learning
Training data: a good representative of real data?
This leads to a lack of accuracy, even in
production
(Figure labels: Villa Del Balbianello, furniture, squirrel)
56. Multiclass classification
Up to now, we have seen binary classification (i.e., the task of classifying an
item into one of two classes)
Multiclass classification is the task of classifying an item into one of three or more
classes
Classes
Husky
Samoyed
French bulldog
…
Australian shepherd
57. Multiclass classification
In our health-related examples:
Stress detection
This task is a binary classification task (stress/no-stress)
Arrhythmia detection
This task is a multiclass classification task (normal/arrhythmia-type-1, …,
arrhythmia-type-N)
59. The neuron
Remember the fact that a model can be seen as a function?
y = f(x)
Visually: inputs x1, x2, x3 → f(x) → output y
60. The neuron: a practical example
We could build, for instance, a function that computes the price of a house (y)
given the size of the house (x):
size → f(x) → price
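A minimal sketch of such a function in Python. The slides only say that f maps size to price; the linear form, the price per square metre (`weight`), and the base price (`bias`) are assumptions with invented values.

```python
def price_neuron(size_m2, weight=2000.0, bias=50000.0):
    """A single 'neuron' seen as a function y = f(x): price from size.

    weight and bias are invented numbers; in practice they would be
    learned from data.
    """
    return weight * size_m2 + bias
```

Under these assumed values, a house of 100 square metres gets `price_neuron(100.0)`, i.e., the base price plus 100 times the price per square metre.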
61. Connecting more neurons
But what if we can combine many of these functions?
- some of the inputs build up more complex features
- these features are further combined
- at the end, via combinations of combinations, I reach the outcome
(Figure: a network whose nodes are labeled size, number of rooms, city, wealth, family size, walkability, school quality, and price)
62. Artificial neural network
An artificial neural network (ANN) is a model used to learn concepts for
which data are described by a large number of inputs (i.e., “many x”)
The magic of a neural network is that you don’t have to define the measures in
the middle; we just provide inputs, and the network is able on its own to assign
semantics to the internal nodes
(Figure: the house-price network from the previous slide)
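To make “combinations of combinations” concrete, here is a sketch of a forward pass through a tiny network with four inputs, three internal nodes, and one output. The weights are fixed placeholders (a real network learns them during training), and the ReLU activation is an assumption not mentioned in the slides.

```python
def relu(value):
    """Activation function: negative values are clipped to zero."""
    return max(0.0, value)


def layer(inputs, weights, biases):
    """One layer: each neuron is a weighted sum of all inputs, then ReLU."""
    return [relu(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
            for neuron_weights, b in zip(weights, biases)]


def network(inputs):
    """Forward pass: 4 inputs -> 3 internal nodes -> 1 output."""
    # Placeholder weights; training would normally set these values
    hidden_weights = [[0.5, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.5, 0.5]]
    hidden_biases = [0.0, 0.0, 0.0]
    output_weights = [[1.0, 2.0, 3.0]]
    output_biases = [0.0]
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, output_weights, output_biases)[0]
```

Nothing in the code gives the three internal nodes a meaning: whatever they end up representing is determined only by the weights.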
63. What does this have to do with the human brain?
Not a lot, actually :)
The only similarity (which is a loose analogy) is as follows:
(Figure: two neuron diagrams side by side, each with inputs x1, x2, x3 and output y)
However, even today we do not know exactly what a single neuron does
64. Types of neural networks
Shallow neural network (NN)
Few layers (i.e., few neurons)
Deep neural network (DNN)
Many layers (i.e., many neurons)
(ah, the trending “deep learning”!)
Convolutional neural network (CNN)
Used mostly for images
Recurrent neural network (RNN)
Used mostly for time series (e.g., audio)
65. A couple of nice examples
MariFlow: self-driving Mario Kart with Recurrent Neural Network
https://www.youtube.com/watch?v=Ipi40cb_RsI
Neural network plays Flappy Bird:
https://www.youtube.com/watch?v=QWdEub_7EcA
67. Unsupervised learning
Unsupervised learning is the machine learning task of inferring a function to
describe hidden structure from unlabeled data
Since the examples given to the learner are unlabeled, there is no error or reward
signal to evaluate a potential solution
Supervised learning: labeled examples
- {#edges=4;label=‘square’} (4 examples) → class square
- {#edges=3;label=‘triangle’} (3 examples) → class triangle
Unsupervised learning: unlabeled examples
- {#edges=4} (4 examples) → class 1
- {#edges=3} (3 examples) → class 2
Two classes
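A deliberately trivial sketch of the unsupervised case in Python: samples are grouped by their feature values alone, and the groups get numbers instead of names. Real clustering algorithms group by similarity rather than exact equality; the function name `group_by_features` is made up for this example.

```python
from collections import defaultdict


def group_by_features(samples):
    """Put samples with identical feature values in the same numbered group."""
    group_ids = {}
    groups = defaultdict(list)
    for sample in samples:
        key = tuple(sorted(sample.items()))
        if key not in group_ids:
            group_ids[key] = len(group_ids) + 1  # class 1, class 2, ...
        groups[group_ids[key]].append(sample)
    return dict(groups)


# Unlabeled examples: no 'square' or 'triangle' label is given
unlabeled = [{"edges": 4}, {"edges": 4}, {"edges": 3}, {"edges": 4},
             {"edges": 3}, {"edges": 4}, {"edges": 3}]
groups = group_by_features(unlabeled)
```

The learner finds two groups, but it has no way to know that one is called “square” and the other “triangle”.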
69. Reinforcement learning
With reinforcement learning, a computer program interacts with a dynamic
environment in which it must achieve a certain goal without a teacher
explicitly telling it whether it has come close to its goal
(Figure: over time, the learner takes “good” and “bad” actions on its way to the goal)
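One concrete reinforcement learning technique is Q-learning, which the slides do not mention; it is used here only as an example. In this sketch the environment is an invented corridor of five cells, the learner receives a reward only when it reaches the last cell, and the learning rate and discount factor are arbitrary choices.

```python
import random

N_STATES, GOAL = 5, 4   # corridor cells 0..4, the goal is cell 4
ACTIONS = (-1, +1)      # step left or step right


def step(state, action):
    """Environment: move inside the corridor, reward 1 only at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Q-learning with purely random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            a = rng.randrange(2)  # try a random action
            next_state, reward = step(state, ACTIONS[a])
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Best known action in each non-goal cell after training
policy = [ACTIONS[row.index(max(row))] for row in q_table[:GOAL]]
```

Nobody tells the learner that moving right is correct; it only sees the reward at the goal, and the value of “good” actions propagates backwards from there.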
72. Short answer: no DNA involved
A genetic algorithm mimics the process of natural selection
✔ Start with an initial population
✔ Evaluate the fitness of each individual to the desired requirements
Enough generations?
- yes: END
- no: apply the steps below, then evaluate the fitness again
Selection: select the individuals that best fit the requirements
Crossover: combine aspects of the selected individuals
Mutation: randomly change a portion of the chromosome
Genetic algorithms are evolutionary algorithms, which are not part of the
machine learning suite
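The loop can be sketched in Python on a toy problem. Everything specific here is an assumption made for illustration: the chromosome is a list of bits, the “requirement” is to have as many 1s as possible, the fittest half is selected as parents, and the best individual is copied unchanged into the next generation.

```python
import random


def fitness(individual):
    """Toy requirement: as many 1s as possible in the chromosome."""
    return sum(individual)


def evolve(pop_size=20, length=16, generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Start with an initial random population
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Evaluate fitness, then select the fittest half as parents
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        parents = population[:pop_size // 2]
        children = [population[0][:]]  # the best individual survives unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each gene with a small probability
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness), best_per_generation


best, history = evolve()
```

Because the best individual is always kept, the best fitness recorded in `history` can never decrease from one generation to the next.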
73. A couple of nice examples
Genetic cars: https://rednuht.org/genetic_cars_2/
Genetic walkers: http://rednuht.org/genetic_walkers/