2. General problem solvers
A general problem solver is a program that accepts high-level descriptions of problems and automatically computes their solutions.
4. What is planning?
Planning is a key area in Artificial Intelligence. In its
general form, planning is concerned with the
automatic synthesis of action strategies (plans)
from a description of actions, sensors, and goals.
5. Elements of Planning
1. Representation languages for describing problems conveniently.
2. Mathematical models for making the different planning tasks precise.
3. Algorithms for solving these models effectively.
6. Models
Models are needed to define the scope of a planner:
• What is a planning problem?
• What is a solution (plan)?
• What is an optimal solution?
7. Classical Planning
Classical planning can be understood in terms of deterministic state models characterized by the following elements:
• A finite and discrete state space S,
• An initial situation given by a state s0 ∈ S,
• A goal situation given by a non-empty set SG ⊆ S,
• Actions A(s) ⊆ A applicable in each state s ∈ S,
• A deterministic state transition function f(a, s) for a ∈ A(s),
• Positive action costs c(a, s) for doing action a in s.
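The deterministic state model above can be sketched as a small Python structure; all names here (ClassicalModel, is_plan) are illustrative, not from any planning library.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable

State = str
Action = str

@dataclass
class ClassicalModel:
    states: FrozenSet[State]                          # finite state space S
    init: State                                       # initial state s0
    goals: FrozenSet[State]                           # goal states SG
    applicable: Callable[[State], Iterable[Action]]   # A(s)
    transition: Callable[[Action, State], State]      # f(a, s)
    cost: Callable[[Action, State], float] = lambda a, s: 1.0

    def is_plan(self, plan):
        """A plan solves the problem iff applying its actions in
        order maps s0 into a goal state."""
        s = self.init
        for a in plan:
            if a not in self.applicable(s):
                return False
            s = self.transition(a, s)
        return s in self.goals
```

An optimal plan is then simply a solution minimizing the summed costs; with all costs equal to 1 this is the shortest plan.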
9. Planning with Uncertainty
Unlike in classical planning, here the initial state of the system and the state transitions may be uncertain, so we have to define how uncertainty is modeled. We also have to take sensing and feedback into account.
10. Modeling Uncertainty
• Pure non-determinism
Uncertainty about the state of the world is represented by the set of states S′ ⊆ S that are deemed possible.
• Probability
Uncertainty is represented by a probability distribution over S.
12. Planning under uncertainty
without feedback
In both cases above, the problem of planning under uncertainty without feedback reduces to a deterministic search problem in belief space, a space characterized by the following elements:
• A space B of belief states over S,
• An initial situation given by a belief state b0 ∈ B,
• A goal situation given by a set of target beliefs BG,
• Actions A(b) ⊆ A applicable in each belief state b,
• Deterministic transitions from b to ba for a ∈ A(b), given by (1) and (2) above,
• Positive action costs c(a, b).
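Under pure non-determinism, the deterministic belief transition from b to ba can be sketched as follows; F is a hypothetical non-deterministic successor function returning the set of states an action may lead to.

```python
def progress_belief(b, a, F):
    """Map a belief state b (set of states deemed possible) into
    ba = union of F(a, s) over all s in b: even though each outcome
    is uncertain, the belief transition itself is deterministic."""
    return frozenset(s2 for s in b for s2 in F(a, s))
```

A goal belief is then reached when the resulting set is contained in the target set of states, so ordinary deterministic search applies in belief space.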
13. Planning with Sensing
With the ability to sense the world, the choice of actions depends on the observations gathered, and thus the form of the plans changes.
14. Planning with Sensing cont.
Full-state Observability
In the presence of sensing, the choice of the action ai at time i depends on all observations o0, o1, . . . , oi−1 gathered up to that point.
Partial Observability
Observations reveal only partial information about the true state of the world, and it is necessary to model how the two are related. The solution then takes the form of functions mapping belief states into actions, as states are no longer known and belief states summarize all the information from previous belief states and partial observations.
16. Temporal Planning
Temporal models extend classical planning in a
different direction. This is a simple but general
model where actions have durations and their
execution can overlap in time.
• We assume a duration d(a) > 0 for each action a, and a predicate comp(A) that defines when a set of actions A can be executed concurrently.
17. Model for temporal planning
• We need to replace the single actions a in that model by legal sets of actions A0, A1, A2, ….
• Each set Ai starts its execution at the same time ti. The end or completion time of an action a in Ai is thus ti + d(a), where d(a) is the duration of a.
• t0 = 0, and ti+1 is given by the end time of the first action in A0, . . . , Ai that completes after ti.
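The timing rule above (t0 = 0, and ti+1 is the earliest completion after ti) can be sketched as follows; d is a hypothetical duration function and the plan is given as a list of action sets A0, A1, ….

```python
def event_times(plan, d):
    """Compute the start times t0, t1, ... of the action sets in a
    temporal plan: t0 = 0, and each next time point is the earliest
    completion, among all actions started so far, strictly after ti."""
    times = [0.0]
    ends = []                      # completion times of all actions started so far
    for A in plan:
        ti = times[-1]
        ends.extend(ti + d(a) for a in A)
        later = [e for e in ends if e > ti]
        if later:
            times.append(min(later))
    return times
```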
18. Model for temporal planning cont.
• The initial state s0 is given, while si+1 is a function of the state si at time ti and the set Ai of actions in the plan that complete exactly at time ti+1; i.e., si+1 = fT(Ai, si).
• The state transition function fT is obtained from the representation of the individual actions.
• A valid temporal plan is a sequence of legal sets of actions mapping the initial state into a goal state.
20. Temporal planning vs.
sequential planning
• Though the model for sequential planning and the model for temporal planning appear close from a mathematical point of view, they are quite different from a computational point of view.
• While heuristic search is probably the best current approach for optimal and non-optimal sequential planning, it does not represent the best approach for parallel planning.
21. Languages
In large problems, the state space and state
transitions need to be represented implicitly in a
logical action language, normally through a set of
(state) variables and action rules. A good action
language is one that supports compact encodings
of the models of interest.
22. Strips
• In AI Planning, the standard language for many years has been the Strips language, introduced in 1971 by Fikes & Nilsson.
• While from a logical point of view Strips is a very limited language, it is well known and helps to illustrate the relationship between planning languages and planning models, and to motivate some of the extensions that have been proposed.
23. Strips cont.
Strips has two parts:
• State language
• Operator language
24. Elements of Strips
• The Strips language L is a simple logical language made up of two types of symbols: relational and constant symbols. For example, in on(a, b), on is a relational symbol and a, b are constant symbols.
• In Strips, there are no functional symbols, and the constant symbols are the only terms.
25. Elements of Strips cont.
• Atoms
Combinations p(t1, . . . , tk) of a relational symbol p and a tuple of terms ti matching the arity of p.
• Operators
Defined over the set of atoms in L. Each operator op has precondition, add, and delete lists Prec(op), Add(op), and Del(op), given by sets of atoms.
26. Planning problems in Strips
P = <A, O, I, G>
• A stands for the set of all atoms in the domain,
• O is the set of operators,
• I and G are sets of atoms defining the initial and goal situations.
27. Planning problems in Strips cont.
The problem P defines a deterministic state model S(P) as follows:
• The states s are sets of atoms from A,
• The initial state s0 is I,
• The goal states are the states s such that G ⊆ s,
• A(s) is the set of operators o ∈ O such that Prec(o) ⊆ s,
• The state transition function f is such that f(a, s) = (s + Add(a)) − Del(a) for a ∈ A(s),
• Costs c(a, s) are all equal to 1.
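The state model S(P) above can be sketched directly in Python; the Operator fields mirror Prec, Add, and Del, and all names are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Operator:
    name: str
    prec: FrozenSet[str]    # Prec(op)
    add: FrozenSet[str]     # Add(op)
    delete: FrozenSet[str]  # Del(op)

def applicable(s, op):
    """op is applicable in state s iff Prec(op) is a subset of s."""
    return op.prec <= s

def progress(s, op):
    """f(op, s) = (s + Add(op)) - Del(op)."""
    return (s | op.add) - op.delete
```

With the coffee example from the notes: buying coffee requires already being at the shop, and adds the atom have-coffee.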
28. Advanced languages
• Domain-independent planners with expressive state and action languages, like GPT (Bonet & Geffner 2000) and MBP (Bertoli et al. 2001), have been introduced. Both provide additional constructs for expressing non-determinism and sensing.
• Knowledge-based planners have been introduced which provide very rich modeling languages, often including facilities for representing time and resources.
30. Heuristic Search
• Simple, powerful, and explains the success of recent approaches
• Maps planning problems into search problems
• Explicitly searches the state space with a heuristic h(s) that estimates the cost from s to the goal
• The heuristic h is extracted automatically from the problem representation
• Uses A*, IDA*, etc. to plan
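As a sketch of mapping planning into heuristic search, here is a plain A* over an implicit state space; succ and h are assumed callbacks, not part of any particular planner.

```python
import heapq, itertools, math

def astar(s0, is_goal, succ, h):
    """Plain A*: succ(s) yields (action, next_state, cost) triples,
    h(s) estimates the remaining cost to the goal.
    Returns a list of actions, or None if no plan exists."""
    tie = itertools.count()  # tie-breaker so states themselves are never compared
    frontier = [(h(s0), 0.0, next(tie), s0, [])]
    best = {s0: 0.0}         # cheapest known g-value per state
    while frontier:
        f, g, _, s, plan = heapq.heappop(frontier)
        if is_goal(s):
            return plan
        for a, s2, c in succ(s):
            g2 = g + c
            if g2 < best.get(s2, math.inf):
                best[s2] = g2
                heapq.heappush(frontier, (g2 + h(s2), g2, next(tie), s2, plan + [a]))
    return None
```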
31. Heuristic Functions
• Heuristic search depends on the choice of heuristic function
• Heuristics are derived as optimal cost functions of relaxed problems
• Simple relaxations used in planning:
– Ignore delete lists
– Reduce large goal sets to subsets
– Ignore certain atoms
33. Additive Heuristic
• Assumes atoms are independent
• The heuristic is not admissible (not a lower bound), so it is not suited for optimal planning
• Informative and fast
• The heuristic h+(s) of a state s is h+(G), where G is the goal, and is obtained by solving the above first equation with single shortest-path algorithms.
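Assuming unit action costs and the delete relaxation, the additive heuristic can be sketched as a shortest-path-style fixpoint over atoms; the (prec, add) operator encoding is illustrative.

```python
import math

def h_add(state, goal, operators):
    """Additive heuristic under the delete relaxation: h(p) = 0 for
    atoms p in the state, otherwise h(p) = min over operators adding p
    of 1 + sum of h(q) over the operator's preconditions; the value of
    a set of atoms is the SUM of its members (independence assumption).
    operators is a list of (prec, add) pairs; delete lists are ignored."""
    h = {p: 0.0 for p in state}
    changed = True
    while changed:                 # Bellman-Ford-style fixpoint
        changed = False
        for prec, add in operators:
            c = 1.0 + sum(h.get(q, math.inf) for q in prec)
            for p in add:
                if c < h.get(p, math.inf):
                    h[p] = c
                    changed = True
    return sum(h.get(p, math.inf) for p in goal)
```

Because costs of shared subgoals are counted more than once, the result can overestimate the true cost, which is exactly why the heuristic is not admissible.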
34. Fast Forward Heuristic
• Does not assume that atoms are independent
• Solves the relaxation suboptimally
• Extracts useful information for guiding hill-climbing search
36. Generalisation: hm heuristics
• For fixed m = 1, 2, …, assume the cost of achieving a set C is given by the cost of the most costly subset of size m
– For m = 1, hm = hmax
– For m = 2, hm = hG (Graphplan)
– For any m, hm is admissible and polynomial
37. Pattern databases
• Project the state space S of a problem into a smaller state space S′
• S′ can be solved optimally or exhaustively
• The heuristic h(s) for the original state space is obtained from the solution cost h′(s′) of the projected state s′ in the relaxed space.
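A minimal sketch of the idea: solve the projected space S′ exhaustively (backward BFS under unit costs) and store the costs for lookup. succ and the state encoding are assumptions for illustration.

```python
from collections import deque

def build_pdb(abstract_states, abstract_goals, succ):
    """Exhaustively solve the projected space S': backward breadth-first
    search from the abstract goal states under unit action costs.
    succ(s) -> list of abstract successor states.
    Returns dist, so dist[project(s)] serves as h(s) for original states."""
    pred = {s: [] for s in abstract_states}   # invert succ to get predecessors
    for s in abstract_states:
        for s2 in succ(s):
            pred[s2].append(s)
    dist = {g: 0 for g in abstract_goals}
    q = deque(abstract_goals)
    while q:
        s = q.popleft()
        for p in pred[s]:
            if p not in dist:
                dist[p] = dist[s] + 1
                q.append(p)
    return dist
```

Since every original plan maps to a plan in S′, the stored costs are lower bounds, making the resulting heuristic admissible.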
38. Branching Scheme
• A branch is an object that specifies a linear sequence of versions of an element
• Branches enable parallel development
• Seldom mentioned in texts
• But has a very strong influence on performance
39. Types of branching
• Forward Branching
– Start from the initial state and go forward until the goal state is found
• Backward Branching
– Start from the goal state and go backward until the initial state is reached
• Other
– Used when neither of the above methods works out
40. Classification in AI planners
• State-space planners
– Progression and regression planning
– Search in the space of states
– Build plans from the head or tail only
– The estimated cost consists of two parts:
• the accumulated cost of the plan g(p), which depends on p
• the estimated cost of the remaining plan h(s), which depends only on the state s obtained by progression or regression
– High branching factor
41. Classification in AI planning
• Partial-order planners
– Search in the space of plans
– The partial plan heads or tails can be suitably summarized
– Useful for computing the estimated cost f(p) of the best complete plans
42. Heuristics and branching
convergence
It should be possible to combine informative lower bounds and effective branching rules, allowing us to prune partial solutions p whose estimated completion cost f(p) exceeds a bound B.
43. Search in Non-Deterministic Spaces
• Heuristic and constraint-based approaches are not directly applicable to problems involving non-determinism and feedback
• The solution to these problems is not a sequence of actions but a function mapping states into actions
• Dynamic programming is used instead
47. DP features
• Works well for spaces containing a large number of states
• For larger spaces, the time and space requirements of pure DP methods become prohibitive
• In comparison, heuristic search methods for deterministic problems can deal with huge state spaces, provided a good heuristic function is available
48. Converging DP and Heuristic
methods
New strategies that integrate DP and heuristic search methods have been proposed.
• Real-time dynamic programming
– In every non-goal state s, the best action a according to the heuristic is selected
– The heuristic value h(s) of the state s is updated using the Bellman equation
– Then a random successor state is selected
– This process is repeated until the goal is reached
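One trial of the loop just described can be sketched as follows; the names are illustrative, and for simplicity the sketch treats all outcomes of an action as equally likely when taking the expectation.

```python
import random

def rtdp_trial(s0, goals, actions, succ, h, cost=lambda s, a: 1.0):
    """One RTDP trial: in each non-goal state pick the greedy action
    w.r.t. the current heuristic values h (a dict, updated in place),
    apply the Bellman update to h(s), then move to a sampled successor,
    until a goal state is reached. succ(s, a) -> list of possible successors."""
    s = s0
    while s not in goals:
        def q(a):  # one-step lookahead value of action a in state s
            nxt = succ(s, a)
            return cost(s, a) + sum(h.get(s2, 0.0) for s2 in nxt) / len(nxt)
        a = min(actions(s), key=q)     # best action according to current h
        h[s] = q(a)                    # Bellman update
        s = random.choice(succ(s, a))  # sample a successor state
    return h
```

Repeating such trials concentrates the updates on the states actually visited, which is what lets RTDP avoid sweeping the whole state space as pure DP does.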
Speaker notes
The development of general problem solvers has been one of the main goals in Artificial Intelligence. Slide…
A general problem solver has two main parts: a general modeling language for describing problems, and algorithms for solving them.
Planning is a form of general problem solving. It is concerned with the automatic synthesis of action strategies (plans) from a description of actions, sensors, and goals. AI Planning is general problem solving over a class of models. Models define the scope of a planner, the types of problems it is supposed to handle, the form of the solutions, and the solutions that are best or optimal.
When we talk about planning we can identify three main elements: we need a formal representation language to describe the problems, models to understand them, and algorithms to solve them.
Let's first study models in planning. Why do we need a model, and what can a model do? A model is used to define the scope of a planner; without a scope it would be impractical to implement one. It is also used to define what the solutions to a particular planning problem are, and what the optimal solution is.
Now we understand the importance of models. Our next step is to understand some planning models developed to address different categories of planning problems. First is classical planning. (Describe the model.)
Here are the definitions of solutions and optimal solutions in classical planning. In simple words, a solution is a sequence of applicable actions that maps s0 into SG. An optimal solution is a solution that minimizes the sum of action costs. In classical planning it is also assumed that all costs c(a, s) are equal, and thus that the optimal plans are the ones with minimal length.
Classical planning assumes that the initial state of the system is known and that state transitions are deterministic. But there are situations where these assumptions do not apply. In such situations we have to develop a model which includes uncertainty. To do that we have to take sensing and feedback into account: we need sensing to check what the states are, and we need feedback to assess our actions.
When we plan with uncertainty we can follow two approaches: pure non-determinism and probabilities. In pure non-determinism … In probabilities …
Now let's see how the state transitions happen in this model. There are two equations for the two cases. In either case, an action a deterministically maps a belief state b into a new belief state ba. Here a belief state represents the set of states deemed possible, and summarizes past actions and observations.
If we are going to handle a planning problem with uncertainty but without feedback, we can solve it in a deterministic manner.
Until now we talked about situations where no additional information is available at execution time. But when we plan with sensing, in other words with feedback, we have to deal with the information we receive at execution time. As the slide says … But there is an important fact here: sensing only makes sense in a state of uncertainty; if there is no uncertainty, sensing provides no useful information and can be ignored.
There are two categories of planning-with-sensing problems: full-state observability and partial observability. In full observability … In partial observability …
As in the earlier situation, belief states can be shown like this when planning with sensing.
Until this point we did not involve time durations in our planning problems. But sometimes that is essential. We can use temporal planning in those kinds of situations. Slide…
Second bullet point: this simply says we have to consider each action separately in the set Ai to obtain the state transition function.
Actually we can see that up to this point this coincides with the model for classical planning, if primitive actions are replaced by legal sets of actions. When defining the action costs we cannot simply add the cost of each action to get the total cost, because the contribution of the actions in Ai depends on the actions taken at the previous steps. So we have to calculate it as shown. The total cost to reach the goal can be given as the initial cost plus the sum of transition costs.
Here, parallel planning is temporal planning with actions of unit durations only. SAT and CSP are two approaches which have been introduced for parallel planning.
As we discussed earlier, we need a formal language to represent complex planning problems, so we can develop languages as described in the slide to address this. Basically, planning languages do two major tasks: they specify the model and reveal useful heuristic information.
Stanford Research Institute Problem Solver
State language: a language for describing the world. Operator language: a language for describing how the world changes. We consider the Strips language as used currently in planning, rather than the original version of Strips, which is more complex.
Strips has two kinds of symbols: relational and constant symbols. In the expression given in the slide, on is a relational symbol; we call it a relational symbol of arity two because it takes two constant symbols. A main difference between relational and constant symbols in Strips is that the former are used to keep track of aspects of the world that may change as a result of the actions (e.g., the symbol on in on(a, b)), while the latter are used to refer to objects in the domain (e.g., the symbols a and b in on(a, b)). In Strips, there are no functional symbols and the constant symbols are the only terms.
There is a construct called atoms in Strips; they act like the boolean variables of the domain. An atom is defined as a … Each operator has three lists: precondition, add, and delete. Here a precondition is something like this: if we need to buy coffee, we need to already be in the coffee shop.
We can represent a problem in Strips as a tuple like this. Here P is the problem …
This mapping defines the semantics of a Strips planning problem P, whose solution is given by the solution of the state model S(P). GPT (Bonet & Geffner 2000) and MBP (Bertoli et al. 2001).
The heuristic is not admissible (not a lower bound) but is informative and fast, so it is not suited for optimal planning. The heuristic h+(s) of a state s is h+(G), where G is the goal, and is obtained by solving the above first equation with single shortest-path algorithms.
Atoms are independent: the cost of achieving a set of atoms corresponds to the sum of the costs of achieving each atom in the set.
Heuristic admissible but not very informative
To find the heuristic function, a recent idea is relaxation: treating a set of possible values as a single value.
Every element has one main branch, which represents the principal line of development, and may have multiple subbranches, each of which represents a separate line of development. Parallel development: we can consider branches separately. This is different from the branching factor.
Other: special cases, no need to describe them here.
Now I am going to classify different planners according to how they operate. In AI planning the classification is slightly different. The estimated cost of the remaining plan h(s) depends only on the state s obtained by progression or regression, which summarizes the partial plan p completely. The branching factor is high in temporal planning, where the set of parallel macro actions is exponential in the number of primitive actions, and in a number of sequential domains like Sokoban (Junghanns & Schaeffer 1999), where the number of applicable actions is just too large.
Search over partially developed plans. If we have already identified heads and tails we can use this.
Heuristics and branching combined.
Heuristic and constraint-based approaches, so powerful in the deterministic setting, are not directly applicable to problems involving non-determinism and feedback, as the solution to these problems is not a sequence of actions but a function mapping states into actions. Dynamic programming (DP) methods compute a value function over all states, and use this function to define the policy.
The greedy policy πV(s) relative to a given value function V corresponds to the function that maps states s into actions a that minimize the worst cost or the expected cost of reaching the goal from s, according to whether state transitions are modeled non-deterministically or probabilistically.
Dynamic programming (DP) methods compute a value function over all states, and use this function to define the policy. The greedy policy πV(s) relative to a given value function V maps states s into actions a that minimize the worst cost or the expected cost of reaching the goal from s, according to whether state transitions are modeled non-deterministically or probabilistically. The greedy policy is optimal when V is the optimal cost function, and any heuristic function h determines a greedy policy πV for V = h.
The optimal cost function is the solution of the Bellman equation, for the non-deterministic and stochastic cases respectively, in both cases with V(s) = 0 for all goal states. Value iteration solves the Bellman equation by plugging an estimate Vi into the right-hand side and obtaining a new value function Vi+1 on the left-hand side. This process is iterated until a fixed point is reached (in the probabilistic case, convergence is defined in a slightly different way).
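The two Bellman equations referred to here did not survive in the slide text; in their standard forms (reconstructed, with V(s) = 0 for all goal states) they read:

```latex
% non-deterministic case (worst-case cost over possible outcomes F(a,s)):
V(s) = \min_{a \in A(s)} \Big[ c(a,s) + \max_{s' \in F(a,s)} V(s') \Big]
% stochastic case (expected cost under transition probabilities P_a):
V(s) = \min_{a \in A(s)} \Big[ c(a,s) + \sum_{s' \in S} P_a(s' \mid s)\, V(s') \Big]
```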
The heuristic function is used to limit the search, avoiding consideration of most states.
In the last few years, promising strategies that integrate DP and heuristic search methods have been proposed, e.g. real-time dynamic programming. More precisely, in every non-goal state s, the best action a according to the heuristic is selected (i.e., a = πh(s)) and the heuristic value h(s) of the state s is updated using the Bellman equation. Then a random successor state of s and a is selected using the transition function or transition probabilities, and this process is repeated until the goal is reached.