18. Intelligent Agents
What is an agent?
An agent is anything that perceives its
environment through sensors and acts
upon that environment through actuators
Examples:
A human is an agent
A robot is also an agent, with cameras as sensors and motors as actuators
A thermostat detecting room temperature is an agent too
21. Simple Terms
Percept
Agent’s perceptual inputs at any given instant
Percept sequence
Complete history of everything that the agent
has ever perceived.
22. Agent function & program
An agent’s behavior is mathematically
described by the
Agent function
A function mapping any given percept
sequence to an action
Practically, it is described by
An agent program
The real implementation
25. Program implements the agent
function tabulated in Fig. 2.3
function REFLEX-VACUUM-AGENT([location, status])
returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
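A minimal runnable sketch of this agent program in Python (an illustrative transcription of the pseudocode above, not part of the original slides):

    # Python sketch of REFLEX-VACUUM-AGENT.
    def reflex_vacuum_agent(location, status):
        """Return an action for the two-square vacuum world."""
        if status == 'Dirty':
            return 'Suck'
        elif location == 'A':
            return 'Right'
        elif location == 'B':
            return 'Left'

    print(reflex_vacuum_agent('A', 'Dirty'))   # -> Suck
    print(reflex_vacuum_agent('A', 'Clean'))   # -> Right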
26. Concept of Rationality
Rational agent
One that does the right thing
= every entry in the table for the agent
function is correct (rational).
What is correct?
The actions that cause the agent to be
most successful
So we need ways to measure success.
27. Performance measure
Performance measure
An objective criterion that measures
how successfully the agent behaves
E.g., a score of 90% or 30%?
If an agent’s action sequence, based on its
percepts, is desirable,
it is said to be performing well.
There is no universal performance measure for all
agents
28. Performance measure
A general rule:
Design performance measures according to
what one actually wants in the environment
rather than how one thinks the agent should
behave
E.g., in the vacuum-cleaner world
we want the floor clean, no matter how the
agent behaves
We don’t restrict how the agent behaves
29. Rationality
What is rational at any given time depends
on four things:
The performance measure defining the criterion
of success
The agent’s prior knowledge of the environment
The actions that the agent can perform
The agent’s percept sequence up to now
30. Rational agent
For each possible percept sequence,
a rational agent should select
an action expected to maximize its performance
measure, given the evidence provided by the
percept sequence and whatever built-in knowledge
the agent has
E.g., an exam
Maximize marks, based on
the questions on the paper & your knowledge
31. Omniscience
An omniscient agent
knows the actual outcome of its actions in
advance
no other outcome is possible
However, omniscience is impossible in the real world
An example:
you cross a street, but are killed by a cargo
door falling from a plane at 33,000 ft. Irrational?
32. Omniscience
Based on the circumstances, crossing was rational.
Rationality maximizes
expected performance
Perfection maximizes
actual performance
Hence rational agents need not be
omniscient.
33. Learning
Does a rational agent depend only on the
current percept?
No, the past percept sequence should also
be used
This is called learning
After experiencing an episode, the agent
should adjust its behavior to perform better
at the same job next time.
34. Autonomy
If an agent just relies on the prior knowledge of
its designer rather than its own percepts, then
the agent lacks autonomy
A rational agent should be autonomous: it
should learn what it can to compensate
for partial or incorrect prior knowledge.
E.g., a clock
No input (percepts)
Runs only on its own algorithm (prior knowledge)
No learning, no experience, etc.
35. Software Agents
Sometimes, the environment may not be
the real world
E.g., a flight simulator, video games, the Internet
These are all artificial but very complex
environments
Agents working in these environments
are called
software agents (softbots)
because all parts of the agent are software
36. Task environments
Task environments are the problems,
while rational agents are the solutions
Specifying the task environment:
a PEAS description, as fully as possible
Performance measure
Environment
Actuators
Sensors
In designing an agent, the first step must always be to
specify the task environment as fully as possible.
We use the automated taxi driver as an example
37. Task environments
Performance measure
How can we judge the automated driver?
Which factors are considered?
getting to the correct destination
minimizing fuel consumption
minimizing the trip time and/or cost
minimizing the violations of traffic laws
maximizing the safety and comfort, etc.
38. Task environments
Environment
A taxi must deal with a variety of roads
Traffic lights, other vehicles, pedestrians,
stray animals, road works, police cars, etc.
Interact with the customer
39. Task environments
Actuators (for outputs)
Control over the accelerator, steering, gear
shifting and braking
A display to communicate with the
customers
Sensors (for inputs)
Detect other vehicles, road situations
GPS (Global Positioning System) to know
where the taxi is
Many more devices are necessary
41. Properties of task environments
Fully observable vs. partially observable
If an agent’s sensors give it access to the
complete state of the environment at each
point in time, then the environment is
fully observable
An environment is effectively fully observable
if the sensors detect all aspects
that are relevant to the choice of action
42. Partially observable
An environment might be partially observable
because of noisy and inaccurate sensors, or
because parts of the state are simply missing
from the sensor data.
Example:
a vacuum cleaner with only a local dirt sensor
cannot tell whether other squares are clean or not
43. Properties of task environments
Deterministic vs. stochastic
If the next state of the environment is completely
determined by the current state and the
actions executed by the agent, then the
environment is deterministic; otherwise, it
is stochastic.
The vacuum cleaner and taxi-driver environments are
stochastic because of unobservable
aspects (noise or unknown factors)
44. Properties of task environments
Episodic vs. sequential
An episode = the agent’s single pair of perception & action
The quality of the agent’s action does not depend on
other episodes
every episode is independent of the others
An episodic environment is simpler
the agent does not need to think ahead
Sequential
the current action may affect all future decisions
Ex.: taxi driving and chess.
45. Properties of task environments
Static vs. dynamic
A dynamic environment keeps changing
over time
E.g., the number of people in the street
while a static environment does not
E.g., the destination
Semidynamic
the environment does not change over time,
but the agent’s performance score does
46. Properties of task environments
Discrete vs. continuous
If there are a limited number of distinct
states, clearly defined percepts and actions,
the environment is discrete
E.g., Chess game
Continuous: Taxi driving
47. Properties of task environments
Single agent vs. multiagent
Playing a crossword puzzle – single agent
Chess playing – two agents
Competitive multiagent environment
Chess playing
Cooperative multiagent environment
Automated taxi driver
Avoiding collision
48. Properties of task environments
Known vs. unknown
This distinction refers not to the environment itself but
to the agent’s (or designer’s) state of knowledge
about the environment.
In a known environment, the outcomes for all actions
are given (example: solitaire card games).
If the environment is unknown, the agent will have to
learn how it works in order to make good decisions
(example: a new video game).
51. Structure of agents
Agent = architecture + program
Architecture = some sort of computing
device (sensors + actuators)
(Agent) Program = some function that
implements the agent mapping = “?”
Writing the agent program is the job of AI
52. Agent programs
Input for the agent program:
only the current percept
Input for the agent function:
the entire percept sequence
the agent must remember all of it
One way to implement the agent program:
a lookup table (tabulating the agent function)
54. Agent Programs
P = the set of possible percepts
T = the lifetime of the agent
(the total number of percepts it receives)
Size of the lookup table: Σ (t = 1 to T) |P|^t
Consider playing chess:
|P| = 10, T = 150
will require a table of at least 10^150 entries
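As a quick sanity check of this formula, a small Python computation (the values |P| = 10 and T = 150 are the slide’s chess example):

    # Size of the lookup table: sum over t = 1..T of |P|^t.
    P = 10    # number of possible percepts (slide's example)
    T = 150   # lifetime of the agent, in percepts
    table_size = sum(P**t for t in range(1, T + 1))
    print(f"{table_size:.2e}")   # ~1.11e+150 entries, beyond any storage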
55. Agent programs
Despite its huge size, the lookup table does
what we want.
The key challenge of AI:
find out how to write programs that, to the
extent possible, produce rational behavior
from a small amount of code
rather than from a large number of table entries
E.g., a five-line program for Newton’s method
vs. huge tables of square roots, sines, cosines,
…
57. Simple reflex agents
It uses just condition-action rules
The rules are of the form “if … then …”
efficient, but with a narrow range of applicability
because knowledge sometimes cannot be
stated explicitly
Works only
if the environment is fully observable
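One common way to implement such rules (an illustrative sketch, not from the slides) is a list of (condition, action) pairs scanned in order, so the first matching rule fires:

    # Condition-action rules as (predicate, action) pairs.
    RULES = [
        (lambda p: p['status'] == 'Dirty', 'Suck'),
        (lambda p: p['location'] == 'A',   'Right'),
        (lambda p: p['location'] == 'B',   'Left'),
    ]

    def simple_reflex_agent(percept):
        for condition, action in RULES:
            if condition(percept):        # first matching rule fires
                return action
        return 'NoOp'                     # no rule matched

    print(simple_reflex_agent({'location': 'B', 'status': 'Clean'}))  # Left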
60. A Simple Reflex Agent in Nature
Percepts: (size, motion)
Rules:
(1) If small moving object, then activate SNAP
(2) If large moving object, then activate AVOID and inhibit SNAP
ELSE (not moving) then NOOP (needed for completeness)
Action: SNAP, AVOID, or NOOP
61. Model-based Reflex Agents
For a world that is partially observable
the agent has to keep track of an internal state
that depends on the percept history
reflecting some of the unobserved aspects
E.g., driving a car and changing lanes
This requires two types of knowledge:
how the world evolves independently of the
agent
how the agent’s actions affect the world
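A sketch of the bookkeeping this implies, in Python; world_model, action_model, and rules are placeholders for the two kinds of knowledge and the rule set, not functions defined on the slides:

    # Model-based reflex agent: keeps an internal state that is updated
    # from the old state, the last action, and the current percept.
    def make_model_based_agent(world_model, action_model, rules):
        state = {}            # internal best guess about the world
        last_action = None
        def agent(percept):
            nonlocal state, last_action
            state = world_model(state)                # how the world evolves
            state = action_model(state, last_action)  # what our action did
            state.update(percept)                     # fold in the sensors
            for condition, action in rules:
                if condition(state):
                    last_action = action
                    return action
            last_action = None
            return 'NoOp'
        return agent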
62. Example Table Agent
With Internal State
IF: saw an object ahead, and turned right, and it’s now clear ahead → THEN: Go straight
IF: saw an object on my right, turned right, and object ahead again → THEN: Halt
IF: see no objects ahead → THEN: Go straight
IF: see an object ahead → THEN: Turn randomly
63. Example Reflex Agent With Internal State:
Wall-Following
(Figure: a grid world with a marked start position.)
Actions: left, right, straight, open-door
Rules:
1. If open(left) and open(right) and open(straight) then
choose randomly between right and left
2. If wall(left) and open(right) and open(straight) then straight
3. If wall(right) and open(left) and open(straight) then straight
4. If wall(right) and open(left) and wall(straight) then left
5. If wall(left) and open(right) and wall(straight) then right
6. If wall(left) and door(right) and wall(straight) then open-door
7. If wall(right) and wall(left) and open(straight) then straight
8. (Default) Move randomly
66. Goal-based agents
The current state of the environment alone is
not always enough
the goal is another issue to achieve
and enters the judgment of rationality / correctness
Actions are chosen to achieve goals, based on
the current state
the current percept
67. Goal-based agents
Conclusion
Goal-based agents are less efficient
but more flexible
the same agent with different goals performs different tasks
Search and planning
two other subfields of AI
devoted to finding the action sequences that achieve the agent’s goals
69. Utility-based agents
Goals alone are not enough
to generate high-quality behavior
E.g., meals in the canteen: good or not?
Many action sequences achieve the goals
some are better and some worse
If goal means success,
then utility means the degree of success
(how successful it is)
71. Utility-based agents
If state A is preferred over others,
state A is said to have higher utility
Utility is therefore a function
that maps a state onto a real number:
the degree of success
72. Utility-based agents
Utility has several advantages:
When there are conflicting goals
(only some of the goals, but not all, can be
achieved)
utility describes the appropriate trade-off
When there are several goals,
none of which can be achieved with certainty,
utility provides a way for the decision-making
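A sketch of how a utility-based agent can choose actions in Python; transition(state, action), yielding (probability, next_state) pairs, and utility(state) are assumed to be given, not defined on the slides:

    # Pick the action whose expected outcome utility is highest.
    def utility_based_choice(state, actions, transition, utility):
        def expected_utility(action):
            return sum(p * utility(s2) for p, s2 in transition(state, action))
        return max(actions, key=expected_utility)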
73. Learning Agents
After an agent is programmed, can it
work immediately?
No, it still needs teaching
In AI,
once an agent is built,
we teach it by giving it a set of examples
and test it using another set of examples
We then say the agent learns:
a learning agent
74. Learning Agents
Four conceptual components
Learning element
Making improvements
Performance element
Selecting external actions
Critic
Tells the Learning element how well the agent is doing with
respect to fixed performance standard.
(Feedback from user or examples, good or not?)
Problem generator
Suggests actions that will lead to new and informative
experiences.
80. Holiday Planning
On holiday in Romania; Currently in Arad.
Flight leaves tomorrow from Bucharest.
Formulate Goal:
Be in Bucharest
Formulate Problem:
States: various cities
Actions: drive between cities
Find solution:
Sequence of cities: Arad, Sibiu, Fagaras, Bucharest
83. Problem-solving agent
Four general steps in problem solving:
Goal formulation
What are the successful world states?
Problem formulation
What actions and states to consider, given the goal
Search
Determine the possible sequences of actions that lead to
states of known value, then choose the best
sequence.
Execute
Given the solution, perform the actions.
84. Problem-solving agent
function SIMPLE-PROBLEM-SOLVING-AGENT(percept) returns an action
  static: seq, an action sequence
          state, some description of the current world state
          goal, a goal
          problem, a problem formulation
  state ← UPDATE-STATE(state, percept)
  if seq is empty then
    goal ← FORMULATE-GOAL(state)
    problem ← FORMULATE-PROBLEM(state, goal)
    seq ← SEARCH(problem)
  action ← FIRST(seq)
  seq ← REST(seq)
  return action
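A Python transcription of this pseudocode (illustrative; update_state, formulate_goal, formulate_problem, and search stand for the pieces developed on the following slides):

    class SimpleProblemSolvingAgent:
        def __init__(self):
            self.seq = []        # current action sequence (initially empty)
            self.state = None    # description of the current world state

        def __call__(self, percept):
            self.state = update_state(self.state, percept)
            if not self.seq:
                goal = formulate_goal(self.state)
                problem = formulate_problem(self.state, goal)
                self.seq = search(problem)    # a list of actions
            return self.seq.pop(0)            # FIRST(seq), then REST(seq)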
85. Assumptions Made (for now)
The environment is static
The environment is discretizable
The environment is observable
The actions are deterministic
86. Problem formulation
A problem is defined by:
An initial state, e.g. Arad
Successor function S(x) = set of action-state pairs
e.g. S(Arad) = {<Arad → Zerind, Zerind>, …}
Initial state + successor function = state space
Goal test, can be
Explicit, e.g. x = ‘at Bucharest’
Implicit, e.g. checkmate(x)
Path cost (additive)
e.g. sum of distances, number of actions executed, …
c(x, a, y) is the step cost, assumed to be ≥ 0
A solution is a sequence of actions from initial to goal state.
Optimal solution has the lowest path cost.
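These components can be captured directly in code. A minimal Python sketch, using a reduced Romania map containing only the cities mentioned on these slides and unit step costs for simplicity:

    # A problem = initial state + successor function + goal test + step cost.
    class Problem:
        def __init__(self, initial, successors, goal_test, step_cost):
            self.initial = initial
            self.successors = successors  # state -> iterable of (action, state)
            self.goal_test = goal_test    # state -> bool
            self.step_cost = step_cost    # (state, action, next state) -> cost

    # Reduced map: only the cities these slides mention.
    romania = {
        'Arad':      [('Arad->Zerind', 'Zerind'), ('Arad->Sibiu', 'Sibiu')],
        'Zerind':    [('Zerind->Arad', 'Arad')],
        'Sibiu':     [('Sibiu->Fagaras', 'Fagaras')],
        'Fagaras':   [('Fagaras->Bucharest', 'Bucharest')],
        'Bucharest': [],
    }
    problem = Problem('Arad',
                      successors=lambda s: romania[s],
                      goal_test=lambda s: s == 'Bucharest',
                      step_cost=lambda s, a, s2: 1)  # unit costs, for simplicity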
87. Selecting a state space
The real world is absurdly complex.
The state space must be abstracted for problem solving.
(Abstract) state = set of real states.
(Abstract) action = complex combination of real actions.
e.g. Arad → Zerind represents a complex set of possible
routes, detours, rest stops, etc.
The abstraction is valid if the path between two abstract states is
reflected in the real world.
(Abstract) solution = set of real paths that are solutions in the
real world.
Each abstract action should be “easier” than the real
problem.
89. Example: vacuum world
States?? Two agent locations, each square dirty or clean:
2 × 2^2 = 8 states.
Initial state?? Any state can be initial
Actions?? {Left, Right, Suck}
Goal test?? Check whether squares are clean.
Path cost?? Number of actions to reach goal.
91. Example: 8-puzzle
States?? Integer location of each tile
Initial state?? Any state can be initial
Actions?? {Left, Right, Up, Down}
Goal test?? Check whether goal configuration is
reached
Path cost?? Number of actions to reach goal
94. Example: 8-puzzle
Puzzle      State-space size       Time to search exhaustively
8-puzzle    9!/2 = 181,440         0.18 sec
15-puzzle   ≈ 0.65 × 10^12         6 days
24-puzzle   ≈ 0.5 × 10^25          12 billion years
(assuming 10 million states/sec)
95. Example: 8-queens
Place 8 queens in a chessboard so that no two queens
are in the same row, column, or diagonal.
(Figure: two example boards, one a solution and one not a solution.)
97. Example: 8-queens
Formulation #1:
• States: any arrangement of
0 to 8 queens on the board
• Initial state: 0 queens on the
board
• Actions: add a queen to any square
• Goal test: 8 queens on the
board, none attacked
• Path cost: none
64^8 ≈ 2.8 × 10^14 states with 8 queens
98. Example: 8-queens
Formulation #2:
• States: any arrangement of
k = 0 to 8 queens in the k
leftmost columns with none
attacked
• Initial state: 0 queens on the
board
• Successor function: add a
queen to any square in the
leftmost empty column such
that it is not attacked
by any other queen
• Goal test: 8 queens on the
board
Only 2,067 states
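A sketch of formulation #2 in Python: states are tuples of queen rows, one per filled leftmost column, and the successor function adds only non-attacked queens (a plain depth-first enumeration then suffices to find a solution):

    def attacks(r1, c1, r2, c2):
        """True if queens at (row r1, col c1) and (row r2, col c2) attack."""
        return r1 == r2 or abs(r1 - r2) == abs(c1 - c2)

    def successors(state):              # state = queen rows in columns 0..k-1
        k = len(state)                  # next leftmost empty column
        for row in range(8):
            if not any(attacks(row, k, r, c) for c, r in enumerate(state)):
                yield state + (row,)

    def solve(state=()):
        if len(state) == 8:             # goal test: 8 non-attacking queens
            return state
        for nxt in successors(state):
            result = solve(nxt)
            if result is not None:
                return result
        return None

    print(solve())   # first solution found: (0, 4, 7, 5, 2, 6, 1, 3)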
99. Real-world Problems
Route finding
Touring problems
VLSI layout
Robot Navigation
Automatic assembly sequencing
Drug design
Internet searching
…
100. Route Finding
states
locations
initial state
starting point
successor function (operators)
move from one location to another
goal test
arrive at a certain location
path cost
may be quite complex
money, time, travel comfort, scenery, ...
101. Traveling Salesperson
states
locations / cities
illegal states
each city may be visited only once
visited cities must be kept as state information
initial state
starting point
no cities visited
successor function (operators)
move from one location to another
goal test
all locations visited
agent at the initial location
path cost
distance between locations
102. VLSI Layout
states
positions of components, wires on a chip
initial state
incremental: no components placed
complete-state: all components placed (e.g. randomly,
manually)
successor function (operators)
incremental: place components, route wire
complete-state: move component, move wire
goal test
all components placed
components connected as specified
path cost
may be complex
distance, capacity, number of connections per component
103. Robot Navigation
states
locations
position of actuators
initial state
start position (dependent on the task)
successor function (operators)
movement, actions of actuators
goal test
task-dependent
path cost
may be very complex
distance, energy consumption
104. Assembly Sequencing
states
location of components
initial state
no components assembled
successor function (operators)
place component
goal test
system fully assembled
path cost
number of moves
105. Search Strategies
A strategy is defined by picking the order of node
expansion
Performance Measures:
Completeness – does it always find a solution if one exists?
Time complexity – number of nodes generated/expanded
Space complexity – maximum number of nodes in memory
Optimality – does it always find a least-cost solution?
Time and space complexity are measured in terms of
b – maximum branching factor of the search tree
d – depth of the least-cost solution
m – maximum depth of the state space (may be ∞)
106. Uninformed search strategies
(a.k.a. blind search) = use only information available
in problem definition.
When strategies can determine whether one non-goal
state is better than another → informed search.
Categories defined by expansion algorithm:
Breadth-first search
Uniform-cost search
Depth-first search
Depth-limited search
Iterative deepening search.
Bidirectional search
107–114. Breadth-First Strategy
Expand the shallowest unexpanded node
Implementation: fringe is a FIFO queue
New nodes are inserted at the end of the queue
(Figure: a search tree with root 1; children 2 and 3; grandchildren 4, 5 under node 2 and 6, 7 under node 3; node 4 has child 8 and node 6 has child 9.)
Expanding nodes in this order, the fringe evolves as:
FRINGE = (1) → (2, 3) → (3, 4, 5) → (4, 5, 6, 7) → (5, 6, 7, 8) → (6, 7, 8) → (7, 8, 9) → (8, 9)
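A sketch of this strategy in Python, written against the Problem class sketched after slide 86; the graph-search bookkeeping (explored set, fringe membership test) is this sketch’s own choice, not something shown on the slides:

    from collections import deque

    def breadth_first_search(problem):
        if problem.goal_test(problem.initial):
            return []                            # already at a goal
        fringe = deque([(problem.initial, [])])  # FIFO queue of (state, path)
        explored = set()
        while fringe:
            state, path = fringe.popleft()       # shallowest unexpanded node
            explored.add(state)
            for action, nxt in problem.successors(state):
                if nxt not in explored and all(nxt != s for s, _ in fringe):
                    if problem.goal_test(nxt):   # test at generation time
                        return path + [action]
                    fringe.append((nxt, path + [action]))
        return None                              # no solution

    # On the reduced Romania problem from slide 86 this returns
    # ['Arad->Sibiu', 'Sibiu->Fagaras', 'Fagaras->Bucharest'].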
115–118. Breadth-first search: evaluation
Completeness: does it always find a solution if one exists?
YES, if the shallowest goal node is at some finite depth d
Condition: b is finite
(the maximum number of successor nodes is finite)
Time complexity:
Assume a state space where every state has b successors
The root has b successors, each node at the next level again
has b successors (b^2 in total), …
Assume the solution is at depth d
Worst case: expand all but the last node at depth d
Total number of nodes generated:
1 + b + b^2 + … + b^d + b(b^d − 1) = O(b^(d+1))
Space complexity: O(b^(d+1)), since every generated node is kept in memory
Optimality: does it always find the least-cost solution?
In general YES,
unless actions have different costs.
119. Breadth-first search: evaluation
Lessons:
Memory requirements are a bigger problem than execution time.
Exponential-complexity search problems cannot be solved by
uninformed search methods for any but the smallest instances.

Depth   Nodes     Time           Memory
2       1,100     0.11 seconds   1 megabyte
4       111,100   11 seconds     106 megabytes
6       10^7      19 minutes     10 gigabytes
8       10^9      31 hours       1 terabyte
10      10^11     129 days       101 terabytes
12      10^13     35 years       10 petabytes
Assumptions: b = 10; 10,000 nodes/sec; 1000 bytes/node
120. Uniform-cost search
Extension of BF-search:
Expand node with lowest path cost
Implementation: fringe = queue ordered by path
cost.
UC-search is the same as BF-search when all step
costs are equal.
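A Python sketch, again against the Problem class from slide 86; the stale-entry check is one common way to handle re-queued states and is this sketch’s choice, not the slides’:

    import heapq

    def uniform_cost_search(problem):
        fringe = [(0, problem.initial, [])]   # (path cost g, state, actions)
        best = {problem.initial: 0}           # cheapest cost found per state
        while fringe:
            cost, state, path = heapq.heappop(fringe)
            if problem.goal_test(state):      # test at expansion time:
                return path                   # this is what makes UCS optimal
            if cost > best.get(state, float('inf')):
                continue                      # stale queue entry, skip it
            for action, nxt in problem.successors(state):
                c2 = cost + problem.step_cost(state, action, nxt)
                if c2 < best.get(nxt, float('inf')):
                    best[nxt] = c2
                    heapq.heappush(fringe, (c2, nxt, path + [action]))
        return None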
121. Uniform-cost search
Completeness:
YES, if every step cost ≥ ε (a small positive constant)
Time complexity:
Assume C* is the cost of the optimal solution
Assume that every action costs at least ε
Worst case: O(b^(C*/ε))
Space complexity:
Same as the time complexity
Optimality:
nodes are expanded in order of increasing path cost.
YES, if complete.
135–138. Depth-first search: evaluation
Completeness: does it always find a solution if one exists?
NO, unless the search space is finite and no loops are
possible.
Time complexity: O(b^m)
Terrible if m is much larger than d (the depth of the
optimal solution)
But if there are many solutions, it can be faster than BFS
Space complexity: O(bm + 1), i.e., linear in the maximum depth m
Backtracking search uses even less memory:
one successor at a time instead of all b
Optimality: NO
139. Depth-Limited Strategy
Depth-first with depth cutoff k (maximal depth below
which nodes are not expanded)
Three possible outcomes:
Solution
Failure (no solution)
Cutoff (no solution within cutoff)
Solves the infinite-path problem.
If k < d then incompleteness results.
If k > d then not optimal.
Time complexity: O(b^k)
Space complexity: O(bk)
140. Iterative Deepening Strategy
Repeat for k = 0, 1, 2, …:
Perform depth-first with depth cutoff k
Complete
Optimal if all step costs are 1
Time complexity:
(d+1)·1 + d·b + (d−1)·b^2 + … + 1·b^d = O(b^d)
Space complexity: O(bd)
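A sketch of both pieces in Python (illustrative; the CUTOFF sentinel separates “no solution within the cutoff” from outright failure, matching the three outcomes on slide 139):

    from itertools import count

    CUTOFF = object()   # sentinel: no solution found within the cutoff

    def depth_limited_search(problem, state, limit, path=()):
        if problem.goal_test(state):
            return list(path)
        if limit == 0:
            return CUTOFF
        cutoff_occurred = False
        for action, nxt in problem.successors(state):
            result = depth_limited_search(problem, nxt, limit - 1,
                                          path + (action,))
            if result is CUTOFF:
                cutoff_occurred = True
            elif result is not None:
                return result
        return CUTOFF if cutoff_occurred else None  # cutoff vs. plain failure

    def iterative_deepening_search(problem):
        # Assumes a solution exists at some finite depth;
        # otherwise this loop never terminates.
        for k in count():                  # k = 0, 1, 2, ...
            result = depth_limited_search(problem, problem.initial, k)
            if result is not CUTOFF:
                return result              # a solution, or None (failure)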
141. Comparison of Strategies
Breadth-first is complete and optimal,
but has high space complexity
Depth-first is space efficient, but neither
complete nor optimal
Iterative deepening combines benefits
of DFS and BFS and is asymptotically
optimal
142. Bidirectional Strategy
2 fringe queues: FRINGE1 and FRINGE2
Time and space complexity = O(b^(d/2)) << O(b^d)
The predecessor of each node should be efficiently computable.
143. Summary of algorithms

Criterion   Breadth-first  Uniform-cost  Depth-first  Depth-limited  Iterative deepening  Bidirectional search
Complete?   YES*           YES*          NO           YES, if l ≥ d  YES                  YES*
Time        b^(d+1)        b^(C*/ε)      b^m          b^l            b^d                  b^(d/2)
Space       b^(d+1)        b^(C*/ε)      bm           bl             bd                   b^(d/2)
Optimal?    YES*           YES*          NO           NO             YES                  YES*
(* = under the conditions given on the earlier slides)