3. What is Learning?
memorizing something
learning facts through observation and exploration
improving motor and/or cognitive skills through practice
organizing new knowledge into general, effective representations
4. Some types of learning
Rote learning
Reinforcement learning – an agent interacting with the world makes observations, takes actions, and is rewarded or punished; it should learn to choose actions in such a way as to obtain a lot of reward
Supervised induction (with a teacher) – given a set of input/output pairs, find a rule that does a good job of predicting the output associated with a new input (see the sketch after this list)
Unsupervised induction
Analogy (case-based) learning
Evolutionary learning
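A minimal sketch of the supervised-induction setting, assuming a made-up one-dimensional data set and a simple nearest-neighbour rule (none of the names or values below come from the slides):

# Supervised induction sketch: from (input, output) pairs, form a rule that
# predicts the output for a new input. Here the "rule" is 1-nearest-neighbour
# on an invented toy data set.

def nearest_neighbour_predict(pairs, new_x):
    """Predict the label of new_x as the label of the closest training input."""
    closest_x, closest_label = min(pairs, key=lambda p: abs(p[0] - new_x))
    return closest_label

# Toy training set: small inputs are labelled "-", large inputs "+".
pairs = [(1, "-"), (2, "-"), (3, "-"), (7, "+"), (8, "+"), (9, "+")]
print(nearest_neighbour_predict(pairs, 6.5))  # -> "+"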
5. Learning
Rote learning – Samuel's checker program
Learning by taking advice – FOO
Learning from problem-solving experience – STRIPS (Shakey)
Learning from examples & counterexamples – Winston's concept learner
Learning by parameter adjustment – Samuel's checker program
Learning by chunking – SOAR
Learning by analogy
Learning by discovery – AM/EURISKO
Genetic learning
Connectionism (Neural Net)
7. Learning
The idea behind learning is that percepts should not only be used for acting now, but also for improving the agent's ability to act in the future.
In the psychology literature, learning is considered one of the keys to human intelligence.
8. Learning element
Design of a learning element is affected by:
Which components of the performance element are to be learned
What feedback is available to learn these components
What representation is used for the components
Type of feedback:
Supervised learning: correct answers for each example
Unsupervised learning: correct answers not given
Reinforcement learning: occasional rewards
9. Rote Learning
This is the simplest form of learning.
It involves nothing more than memorizing experiences, e.g., sensor inputs, actions taken, rewards received.
It's surprisingly effective!
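As a concrete (invented) illustration of memorizing experiences verbatim, a rote learner can be nothing more than a lookup table; the situation and action strings below are made up:

# Rote learning as a bare lookup table: the agent stores each experience
# verbatim and replays it when the same situation recurs.

memory = {}

def experience(situation, action, reward):
    memory[situation] = (action, reward)   # memorize the experience as-is

def recall(situation):
    return memory.get(situation)           # None if the situation is new

experience("opponent plays e5", "reply Nf3", reward=+1)
print(recall("opponent plays e5"))  # ('reply Nf3', 1)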
10. Machine learning programs for
classification (concept learning)
Assume you have a goal concept that you are trying to learn, called the target concept. Your guesses or approximations of the target concept are called hypotheses.
An example target concept might be a description of diseased soybean plants.
An object (fact) that is used to help learn the goal concept is called an instance or an example. It can also be called a case.
An instance/example x is described by a vector of features, also called attributes, i.e., x = <x1, ..., xn>.
11. Classification tasks
Many engineering and diagnosis tasks involve classification or prediction, e.g.:
Parts inspection: classify parts as defective or OK.
Mammogram analysis: given a mammogram, estimate the probability that it is normal, pre-cancerous, or cancerous.
Document understanding: given a rectangular region from a scanned page, classify it as text or graphics.
Soybean plant analysis: classify plants as diseased or not diseased.
12. Machine learning programs for
classification (concept learning)
Given: A labeling function f that maps feature vectors into a discrete set of k classes, i.e., f(x) is in {0, 1, ..., k-1}. Often there are only 2 classes, called "positive" (+) and "negative" (-).
Represent each training example as a pair (x, f(x)). These are the examples that will be used for learning the concept.
Problem: From a set of (x, f(x)) pairs, learn the target concept f.
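A sketch of this representation, with invented soybean-style attribute names and an invented hypothesis (none of it from the slides):

# Each training example is a pair (x, f(x)), where x = <x1, ..., xn> is a
# feature vector and f(x) is one of k discrete classes.

training_set = [
    # x = (leaf_spots, stem_cankers, wilting)   f(x): 1 = diseased, 0 = healthy
    (("yes", "yes", "yes"), 1),
    (("yes", "no",  "no"),  0),
    (("no",  "no",  "yes"), 0),
    (("yes", "yes", "no"),  1),
]

# One (deliberately naive) hypothesis h that tries to approximate the target f:
def h(x):
    leaf_spots, stem_cankers, _wilting = x
    return 1 if leaf_spots == "yes" and stem_cankers == "yes" else 0

print(all(h(x) == label for x, label in training_set))  # True on this toy set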
13. The learning problem
Given <x,f(x)> pairs, infer f
x:    1  2  3  4   5
f(x): 1  4  9  16  ?
Given a finite sample, it is often impossible to guess the true function f.
Approach: Find some pattern (called a hypothesis) in the training examples, and assume that the pattern will hold for future examples too.
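A minimal sketch of that approach on the table above; the hypothesis f(x) = x**2 is the obvious consistent guess, and the prediction for x = 5 rests entirely on assuming the pattern continues:

# One hypothesis consistent with all four training pairs is f(x) = x**2.
# Whether it also holds for x = 5 is an assumption; the finite sample cannot prove it.

sample = [(1, 1), (2, 4), (3, 9), (4, 16)]

def hypothesis(x):
    return x ** 2

assert all(hypothesis(x) == y for x, y in sample)  # fits every training example
print(hypothesis(5))  # 25, a prediction that is only right if the pattern holds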
14. Hypothesis or model selection
What class of hypotheses (models) should we consider?
Assume f is a set of rules. Then the space of hypotheses consists of rule sets.
Assume f is a simple polynomial. Then the space of hypotheses consists of simple polynomials. Regression could be used to learn f.
Assume f can be expressed as a decision tree. Then the space of hypotheses consists of decision trees. Decision tree learning can be applied.
Assume f can be expressed as a neural network. Then the space of hypotheses consists of neural nets, and backprop can be used to learn the weights.
Let H be the space of all possible hypotheses h that a learning program considers. The learner then seeks the h in H that "fits" the given data "best." This is a process of search through the space of possible hypotheses H.
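A toy sketch of this search, where the hypothesis space and the fit criterion are both invented for illustration:

# "Search through the hypothesis space H": here H is just three hand-picked
# candidate functions, and "fits best" means lowest squared error on the
# training data. Real learners (regression, decision-tree induction, backprop)
# search far larger, structured spaces, but the principle is the same.

sample = [(1, 1), (2, 4), (3, 9), (4, 16)]

H = {
    "linear: h(x) = 4x - 5": lambda x: 4 * x - 5,
    "square: h(x) = x**2": lambda x: x ** 2,
    "cubic: h(x) = x**3 / 2": lambda x: x ** 3 / 2,
}

def squared_error(h):
    return sum((h(x) - y) ** 2 for x, y in sample)

best = min(H, key=lambda name: squared_error(H[name]))
print(best)  # -> "square: h(x) = x**2", the hypothesis in H that fits best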
15. Learning decision trees
Problem: decide whether to wait for a table at a
restaurant, based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
16. Attribute-based representations
Examples described by attribute values (Boolean, discrete, continuous)
E.g., situations where I will/won't wait for a table:
Classification of examples is positive (T) or negative (F)
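For concreteness, here is a hand-written guess at what a learned tree for this problem could look like; it is not necessarily the tree induced from the slides' example table:

# A sketch of a WillWait decision tree for the restaurant attributes above.
# Each example is a dict keyed by the attribute names; True means "wait".

def will_wait(example):
    if example["Patrons"] == "None":
        return False
    if example["Patrons"] == "Some":
        return True
    # Patrons == "Full": fall back on the estimated wait and hunger
    if example["WaitEstimate"] == ">60":
        return False
    if example["WaitEstimate"] == "0-10":
        return True
    return example["Hungry"]

print(will_wait({"Patrons": "Full", "WaitEstimate": "10-30", "Hungry": True}))  # True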
18. Recall: Physical Symbol System
A physical symbol system is a machine that produces through time an evolving collection of symbol structures. Such a system exists in a world of objects wider than just these symbolic expressions themselves.
The Physical Symbol System Hypothesis
A physical symbol system has the necessary and sufficient means for general intelligent action.
PSS => Intelligence, and Intelligence => PSS.
Well... maybe???
19. Symbols & The Main Goals of AI
Engineering: Build intelligent systems
Lots of fantastic symbolic AI systems for a multitude of specialized tasks... and many more to come!
But general intelligent systems are a major problem, since common sense is hard to represent and reason with symbolically.
Science: Understand natural intelligence via computers
Cognitive Science was founded by symbolic AI researchers, but they took the metaphor too far.
Organisms clearly compute, but not necessarily:
as Von Neumann computers (i.e., serially)
as logic theorem provers (i.e., mathematically complete & consistent)
with symbols! We can interpret the reasoning process as symbolic, but the underlying mechanism may not be. The ends don't explain the means!
20. The Intelligence Spectrum
[Figure: the intelligence spectrum from Calculate to Reason to Sense & Act, with Humans and Computers placed along it (Moravec, Robot, 1999)]
■ On the fringes:
Humans are slow, error-prone calculators
Robots sense and act no better (and much slower) than frogs
■ The battle for middle ground:
Deep Blue beat the best human chess player
But minimax search ≠ “reasoning”.
21. GOFAI -vs- The New AI
[Figure: GOFAI and the New AI placed along the Calculate, Reason, Sense & Act spectrum]
GOFAI ("Good Old Fashioned AI")
■ Disembodied reasoning systems can't plug-and-play on robots.
■ Lack of common sense => no general human reasoning abilities.
New AI
■ Embodied sense & act gives a basis for common sense but has not yet scaled up to sophisticated human-like abstract reasoning.
22. Evolutionary Progressions along the
Intelligence Spectrum
              Living organisms      Computers
Sense & Act:  10,000,000+ years     15+ years
Reason:       100,000+ years        30+ years
Calculate:    1,000+ years          50+ years
Evolution of reasoning was tightly constrained and influenced by sensorimotor capabilities. Else extinction!
GOFAI systems are often in their own little worlds, making unreasonable assumptions about independent sensorimotor apparatus.
To achieve AI's scientific goal of understanding human intelligence, the road from sense-and-act to reasoning via simulated evolution may be the only way.
But, to achieve AI's engineering goals, both approaches seem important.
23. The Situated & Embodied AI Hypothesis
Complex intelligence is better understood, and more successfully embodied in artifacts, by working up from low-level sensory-motor agents than from abstract cognitive mechanisms of rationality (e.g., logic, means-ends analysis, etc.).
Cognitive Incrementalism: Cognition (and hence common sense) is an extension of sensorimotor behavior.
Brooks, Steels, Pfeifer, Scheier, Beer, Nolfi, Floreano…
24. The Artificial Life Approach to AI
Simple Robots
Cellular Automata (e.g., the Langton Loop)
Simulated Real Worlds
[Figure: a Langton Loop configuration in a 2-D cellular automaton]
■ Synthetic: Bottom-up, multiple interacting agents
■ Self-Organizing: Global structure is emergent.
■ Self-Regulating: No global/centralized control.
■ Adaptive: Learning and/or evolving
■ Complex: On the edge of chaos; dissipative
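The Langton loop itself is too intricate to reproduce here; as a deliberately simpler stand-in, the sketch below runs elementary rule 110 on a 1-D ring of cells, where each cell updates from purely local information yet structured global patterns emerge (the self-organizing point above):

# Elementary cellular automaton, Wolfram rule 110, on a ring of 64 cells.
RULE = 110
WIDTH, STEPS = 64, 32

cells = [0] * WIDTH
cells[WIDTH // 2] = 1                                   # a single seed cell

for _ in range(STEPS):
    print("".join("#" if c else "." for c in cells))
    # Next state of each cell depends only on itself and its two neighbours.
    cells = [
        (RULE >> (4 * cells[(i - 1) % WIDTH] + 2 * cells[i] + cells[(i + 1) % WIDTH])) & 1
        for i in range(WIDTH)
    ]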
25. Adaptation
Key focus of Situated & Embodied AI (i.e., Alife AI)
But now, often at the level of simple organisms (ants, flies, frogs, etc.)
Machine Learning (ML) is also a key part of GOFAI.
Alife AI is very interested in subsymbolic ML techniques:
Artificial Neural Networks (ANNs)
Evolutionary Algorithms (EAs)
Learning: agents modify their own behavior (normally to improve performance) within their lifetime.
Evolution: populations of agents change their behavior over the course of many generations.
Both: Evolving populations of learning agents
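A minimal sketch of the evolutionary side of adaptation; the fitness function, population size, and mutation rate below are arbitrary stand-ins, not values from the slides:

# A population of "agents" (here just bit strings) changes over generations
# through selection and mutation.
import random

def fitness(agent):
    return sum(agent)                     # toy goal: an all-ones bit string

def mutate(agent, rate=0.05):
    return [bit ^ (random.random() < rate) for bit in agent]

population = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]             # keep only the fittest third
    population = [mutate(random.choice(parents)) for _ in range(30)]

print(max(fitness(agent) for agent in population))  # typically close to 20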
26. Neural Networks
Also called 'connectionism', 'Parallel Distributed Processing', 'subsymbolic AI'
An AI technique analogous to processes in the brain
"Intelligence emerges from the interactions of large numbers of simple processing units" (Rumelhart et al., 1986)
Roughly based on brains – some simplification is made
27. Real Neuron
(from Searleman & Searleman, Introduction to Cognition)
28. Two interacting neurons
Excitatory (E) and Inhibitory (I) impulses
(from Searleman & Searleman, Introduction to Cognition)
29. Neurons
Neurophysiology
[Figure: neuron anatomy showing axon, dendrites, and synapses]
• Dense: the human brain has 10^11 neurons and 10^14 synapses.
• Highly interconnected: human neurons have a fan-in of about 10^4.
• Firing: a neuron sends action potentials (APs) down its axon when sufficiently stimulated by the SUM of incoming APs along its dendrites.
• Neurons can either stimulate or inhibit other neurons.
• Synapses vary in transmission efficiency.
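An abstraction of this biology as code: an artificial neuron "fires" when the weighted sum of its inputs reaches a threshold, with positive weights standing in for excitatory synapses, negative weights for inhibitory ones, and weight magnitude for transmission efficiency. All numbers below are made up.

def neuron(inputs, weights, threshold):
    # Weighted sum of incoming signals, compared to a firing threshold.
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

print(neuron(inputs=[1, 1, 0], weights=[0.6, 0.5, -1.0], threshold=1.0))  # 1: fires
print(neuron(inputs=[1, 1, 1], weights=[0.6, 0.5, -1.0], threshold=1.0))  # 0: inhibited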
30. Features of the human brain
Robust – fault tolerant and degrades gracefully
Flexible – can learn without being explicitly programmed
Can deal with fuzzy, probabilistic information
Is highly parallel
31. Connectionist models
Key intuition: Much of intelligence is in the connections between the 10 billion neurons in the human brain.
Neuron switching time is roughly 0.001 second; scene recognition time is about 0.1 second. This suggests that the brain is massively parallel, because roughly 100 sequential steps (0.1 s / 0.001 s) are simply not sufficient to accomplish scene recognition.
Development: formation of the basic connection topology
Learning: fine-tuning of topology + major synaptic-efficiency changes
The matrix IS the intelligence!
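A sketch of "learning as change of synaptic efficiencies": a perceptron-style rule that nudges each weight whenever the unit's output is wrong. This is only an illustration of the idea, not a rule these slides specify.

def predict(weights, x):
    # Threshold unit: fire (1) if the weighted sum of inputs is non-negative.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0

def train(examples, n_inputs, rate=0.1, epochs=20):
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(weights, x)
            # Adjust each "synaptic efficiency" in proportion to the error.
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights

# Learn logical OR; the constant 1 in each input acts as a bias term.
data = [((0, 0, 1), 0), ((0, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 1), 1)]
w = train(data, n_inputs=3)
print([predict(w, x) for x, _ in data])  # [0, 1, 1, 1]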