Distributed Constraint Handling
        and Optimization

Alessandro Farinelli1      Alex Rogers2            Meritxell Vinyals1

                  1 Computer Science Department

                     University of Verona, Italy
             2 Agents, Interaction and Complexity Group

            School of Electronics and Computer Science
                   University of Southampton, UK


             Tutorial EASSS 2012, Valencia
      https://sites.google.com/site/easss2012optimization/
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Constraints



• Pervade our everyday lives
• Are usually perceived as elements that limit solutions to the
  problems we face
Constraints




From a computational point of view, they:
  • Reduce the space of possible solutions
  • Encode knowledge about the problem at hand
  • Are key components for efficiently solving hard problems
Constraint Processing



Many different disciplines deal with hard computational problems that
 can be made tractable by carefully considering the constraints that
                 define the structure of the problem.




  Planning     Operational     Automated Reasoning       Computer
 Scheduling     Research         Decision Theory          Vision
Constraint Processing in Multi-Agent Systems

  Focus on how constraint processing can be used to address
  optimization problems in Multi-Agent Systems (MAS) where:

A set of agents must come to some agreement, typically via some
form of negotiation, about which action each agent should take in
   order to jointly obtain the best solution for the whole system.


[Figure: agents A1 and A2 negotiate over task M1, while A2 and A3 negotiate over task M2]
Distributed Constraint Optimization Problems (DCOPs)

We will consider Distributed Constraint Optimization Problems (DCOP)
where:


    Each agent negotiates locally with just a subset of other agents
  (usually called neighbors) that are those that can directly influence
                           his/her behavior.

[Figure: A2 negotiates locally only with its neighbors A1 and A3]
Distributed Constraint Optimization Problems (DCOPs)



After reading this chapter, you will understand:
  • The mathematical formulation of a DCOP
  • The main exact solution techniques for DCOPs
        • Key differences, benefits and limitations
  • The main approximate solution techniques for DCOPs
        • Key differences, benefits and limitations
  • The quality guarantees these approaches provide:
        • Types of quality guarantees
        • Frameworks and techniques
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Constraint Networks



A constraint network N is formally defined as a tuple ⟨X, D, C⟩ where:


  • X = {x1 , . . . , xn } is a set of discrete variables;
  • D = {D1 , . . . , Dn } is a set of variable domains, which enumerate
    all possible values of the corresponding variables; and
  • C = {C1 , . . . ,Cm } is a set of constraints; where a constraint Ci is
    defined on a subset of variables Si ⊆ X which comprise the
    scope of the constraint
        • r = |Si | is the arity of the constraint
       • Two types: hard or soft
Hard constraints




• A hard constraint Cih is a relation Ri that enumerates all the valid
  joint assignments of all variables in the scope of the constraint.

                        Ri ⊆ Di1 × . . . × Dir

                            Ri    xj   xk
                                  0     1
                                  1     0
Soft constraints



• A soft constraint Cis is a function Fi that maps every possible joint
  assignment of all variables in the scope to a real value.
                        Fi : Di1 × . . . × Dir → ℜ

                            Fi   xj   xk
                            2    0    0
                            0    0    1
                            0    1    0
                            1    1    1
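As a minimal illustration (not part of the tutorial), the hard and soft constraints tabulated above can be written directly in Python; the names `R`, `F` and `satisfies` are ours:

```python
# Hard constraint R_i over (x_j, x_k): the relation enumerating the
# valid joint assignments from the table above.
R = {(0, 1), (1, 0)}

def satisfies(assignment):
    """True if the joint assignment (x_j, x_k) is allowed by R."""
    return assignment in R

# Soft constraint F_i over (x_j, x_k): maps every joint assignment
# to a real value, as in the table above.
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

print(satisfies((0, 1)))  # True
print(F[(0, 0)])          # 2
```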
Binary Constraint Networks

• Binary constraint networks are those where:
    • Each constraint (soft or hard) is defined over two variables.
• Every constraint network can be mapped to a binary constraint
  network
    • requires the addition of variables and constraints
    • may add complexity to the model
• They can be represented by a constraint graph

[Figure: constraint graph over x1, x2, x3, x4 with binary constraints F1,2, F1,3, F1,4 and F2,4]
Different objectives, different problems



• Constraint Satisfaction Problem (CSP)

    • Objective: find an assignment for all the variables in the network
      that satisfies all constraints.


• Constraint Optimization Problem (COP)

    • Objective: find an assignment for all the variables in the network
      that satisfies all constraints and optimizes a global function.

    • Global function = aggregation (typically sum) of local functions.
      F(x) = ∑i Fi (xi )
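The global function F(x) = ∑i Fi(xi) can be evaluated with a short sketch (the helper name `global_value` is ours; the table and the four binary scopes reuse the example used later in these slides):

```python
# The identical soft-constraint table used in the examples of this tutorial.
F_table = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# Constraints as (scope, local function) pairs.
constraints = [
    (("x1", "x2"), F_table), (("x1", "x3"), F_table),
    (("x1", "x4"), F_table), (("x2", "x4"), F_table),
]

def global_value(assignment, constraints):
    """F(x): aggregate (sum) each local function applied to its scope."""
    return sum(f[(assignment[a], assignment[b])] for (a, b), f in constraints)

print(global_value({"x1": 1, "x2": 0, "x3": 0, "x4": 1}, constraints))  # 1
```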
Distributed Constraint Reasoning




                                           A1
When operating in a
decentralized context:
  • a set of agents control
    variables
                                      A2
  • agents interact to find a                    A4
    solution to the constraint
    network

                                 A3
Distributed Constraint Reasoning




Two types of decentralized problems:
  • distributed CSP (DCSP)
  • distributed COP (DCOP)



                     Here, we focus on DCOPs.
Distributed Constraint Optimization Problem (DCOP)




A DCOP consists of a constraint network N = ⟨X, D, C⟩ and a set of
agents A = {A1 , . . . , Ak } where each agent:
  • controls a subset of the variables Xi ⊆ X
  • is only aware of constraints that involve variables it controls
  • communicates only with its neighbours
Distributed Constraint Optimization Problem (DCOP)




 • Agents are assumed to be fully cooperative
     • Goal: find the assignment that optimizes the global function, not
        their local utilities.
 • Solving a COP is NP-Hard and DCOP is as hard as COP.
Motivation




Why distribute?
  • Privacy
  • Robustness
  • Scalability
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Real World Applications




Many standard benchmark problems in computer science can be
modeled using the DCOP framework:
  • graph coloring
As can many real world applications:
  • human-agent organizations (e.g. meeting scheduling)
  • sensor networks and robotics (e.g. target tracking)
Outline

Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems
  Graph coloring
  Meeting Scheduling
  Target Tracking

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Graph coloring




• Popular benchmark
• Simple formulation
• Complexity controlled with few parameters:
    • Number of available colors
    • Number of nodes
    • Density (#constraints/#nodes)
• Many versions of the problem:
    • CSP, MaxCSP, COP
Graph coloring - CSP


• Nodes can take k colors
• Any two adjacent nodes should have different colors
    • If two adjacent nodes take the same color, this is a conflict

[Figure: a conflict-free coloring (“Yes!”) and a coloring with a conflict (“No!”)]
Graph coloring - Max-CSP



• Minimize the number of conflicts

[Figure: a coloring with cost 0 (no conflicts) and one with cost -1 (one conflict)]
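The conflict count that Max-CSP minimizes can be sketched in a few lines (a toy instance of ours, not from the slides):

```python
def conflicts(coloring, edges):
    """Number of edges whose endpoints share a color (the Max-CSP cost)."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])

# A triangle with only 2 colors available: at least one conflict is inevitable.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(conflicts({"a": 0, "b": 1, "c": 0}, edges))  # 1, the optimum here
```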
Graph coloring - COP



• Different weights for violated constraints
• Preferences for different colors

[Figure: a weighted coloring instance with constraint weights -1, -2, -3 and color preferences; two candidate solutions with costs 0 and -2]
Graph coloring - DCOP


 • Each node is controlled by one agent
 • Each agent:
      • has preferences for different colors
      • communicates with its direct neighbours in the graph

[Figure: agents A1 . . . A4 on a coloring graph with weights -1, -3, -2, -1]
  • A1 and A2 exchange preferences and conflicts
  • A3 and A4 do not communicate
Outline

Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems
  Graph coloring
  Meeting Scheduling
  Target Tracking

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Meeting Scheduling




Motivation:
  • Privacy
  • Robustness
  • Scalability
Meeting Scheduling


In large organizations many people, possibly working in different
    departments, are involved in a number of work meetings.
Meeting Scheduling


People might have various private preferences on meeting start times




      Better after 12:00
Meeting Scheduling

Two meetings that share a participant cannot overlap


                               Window: 15:00-18:00
                                   Duration: 2h




                                Window: 15:00-17:00
                                    Duration: 1h
DCOP formalization for the meeting scheduling problem



   • A set of agents representing participants
   • A set of variables representing meeting start times, one per
     participant and meeting.
   • Hard Constraints:
        • The start times of the same meeting, as held by different
          agents, are equal
        • Meetings for the same agent are non-overlapping.
   • Soft Constraints:
        • Represent agent preferences on meeting starting times.


 Objective: find a valid schedule for the meetings while maximizing the
 sum of individuals’ preferences.
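The two hard constraints above can be sketched as predicates (a hypothetical encoding of ours; times are hours, not the tutorial's actual representation):

```python
def equality_ok(start_a, start_b):
    """Copies of the same meeting's start time held by different agents agree."""
    return start_a == start_b

def no_overlap(start1, dur1, start2, dur2):
    """Two meetings of the same agent must not overlap in time."""
    return start1 + dur1 <= start2 or start2 + dur2 <= start1

print(no_overlap(15, 2, 17, 1))  # True: 15:00-17:00 then 17:00-18:00
print(no_overlap(15, 2, 16, 1))  # False: the meetings overlap
```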
Outline

Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems
  Graph coloring
  Meeting Scheduling
  Target Tracking

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Target Tracking




Motivation:
  • Privacy
  • Robustness
  • Scalability
Target Tracking

A set of sensors tracking a set of targets in order to provide an
             accurate estimate of their positions.

[Figure: a field of sensors tracking targets T1 . . . T4]

     Crucial for surveillance and monitoring applications.
Target Tracking


Sensors can have different sensing modalities that impact on the
      accuracy of the estimation of the targets’ positions.

[Figure: each sensor chooses among several MODES while tracking targets T1 . . . T4]
Target Tracking


Collaboration among sensors is crucial to improve system
                     performance.

[Figure: sensors with overlapping ranges jointly covering targets T1 . . . T4]
DCOP formalization for the target tracking problem



• Agents represent sensors
• Variables encode the different sensing modalities of each sensor
• Constraints
     • relate to a specific target
      • represent how sensor modalities impact on the tracking
        performance
• Objective:
     • Maximize coverage of the environment
     • Provide accurate estimations of potentially dangerous targets
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Complete Algorithms




+ Always find an optimal solution

− Exhibit an exponentially increasing coordination overhead
− Very limited scalability on general problems.
Complete Algorithms



  • Completely decentralised
       • Search-based.
           • Synchronous: SyncBB, AND/OR search
           • Asynchronous: ADOPT, NCBB and AFB.
       • Dynamic programming.


  • Partially decentralised
       • OptAPO

Next, we focus on completely decentralised algorithms
Decentralised Complete Algorithms




Search-based                     Dynamic programming
  • Uses distributed search        • Uses distributed inference
  • Exchange individual values     • Exchange constraints
  • Small messages but             • Few messages but
    . . . exponentially many         . . . exponentially large
Representative: ADOPT [Modi      Representative: DPOP [Petcu
et al., 2005]                    and Faltings, 2005]
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs
  Search Based: ADOPT
  Dynamic Programming DPOP

Approximated Algorithms for DCOPs

Conclusions
ADOPT


ADOPT (Asynchronous Distributed OPTimization) [Modi et al., 2005]:

  • Distributed backtrack search using a best-first strategy

  • Best value based on local information:

       • Lower/upper bound estimates of each possible value of its
         variable

       • Backtrack thresholds used to speed up the search of previously
         explored solutions.

       • Termination conditions that check if the bound interval is less than
         a given valid error bound (0 if optimal)
ADOPT by example


             4 variables (4 agents): x1 , x2 , x3 , x4 with D = {0, 1}

             4 identical cost functions F1,2 , F1,3 , F1,4 , F2,4 :

                              Fi, j   xi   xj
                              2       0    0
                              0       0    1
                              0       1    0
                              1       1    1

[Figure: constraint graph over x1 , x2 , x3 , x4 ]

             Goal: find a variable assignment with minimal cost.
             Solution: x1 = 1, x2 = 0, x3 = 0 and x4 = 1,
             giving total cost 1.
DFS arrangement




• Before executing ADOPT, agents must be arranged in a depth
  first search (DFS) tree.
• DFS trees have been frequently used in optimization because
  they have two interesting properties:
    • Agents in different branches of the tree do not share any
      constraints;
    • Every constraint network admits a DFS tree.
ADOPT by example


[Figure: DFS arrangement — A1 is the root with children A2 and A3 ;
A4 is a child of A2 ; the back-edge F1,4 makes A1 a pseudo-parent of A4 ]
Cost functions




The local cost function for an agent Ai (δ (xi )) is the sum of the values
      of constraints involving only higher neighbors in the DFS.
ADOPT by example


                                            A1     δ (x1 ) = 0




δ (x1 , x2 ) = F1,2 (x1 , x2 )   A2                   A3   δ (x1 , x3 ) = F1,3 (x1 , x3 )




                           A4

  δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 )
Initialization


 Each agent initially chooses a random value for its variable and
initializes the lower and upper bounds to zero and infinity respectively.


           x1 = 0, LB = 0,UB = ∞       A1




    x2 = 0, LB = 0,UB = ∞       A2            A3    x3 = 0, LB = 0,UB = ∞




x4 = 0, LB = 0,UB = ∞      A4
ADOPT by example


  Value messages are sent by an agent to all its neighbors that are
                      lower in the DFS tree


[Figure: A1 sends x1 = 0 down to A2 , A3 and A4 ; A2 sends x2 = 0 down to A4 ]

A1 sends three value messages to A2 , A3 and A4 informing them that its
current value is 0.
ADOPT by example


    Current Context: a partial variable assignment maintained by each
  agent that records the assignment of all higher neighbours in the DFS.


[Figure: c2 : {x1 = 0}, c3 : {x1 = 0}, c4 : {x1 = 0, x2 = 0}]

  • Updated by each VALUE message
  • If the current context is not compatible with some child context,
    the latter is re-initialized (together with the stored child bounds)
ADOPT by example


                   Each agent Ai sends a cost message to its parent Ap

Each cost message [LB, UB, ci ] reports:
  • The minimum lower bound (LB)
  • The maximum upper bound (UB)
  • The context (ci )

[Figure: A2 and A3 send their cost messages to A1 ; A4 sends its cost message to A2 ]
Lower bound computation




Each agent evaluates for each possible value of its variable:
  • its local cost function with respect to the current context
  • adding all the compatible lower bound messages received from
    children.



              Analogous computation for upper bounds
ADOPT by example


           Consider the lower bound in the cost message sent by A4 :

  • Recall that A4 ’s local cost function is:
    δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 )
  • Restricted to the current context c4 = {(x1 = 0, x2 = 0)}:
    λ (0, 0, x4 ) = F1,4 (0, x4 ) + F2,4 (0, x4 ).
  • For x4 = 0: λ (0, 0, 0) = F1,4 (0, 0) + F2,4 (0, 0) = 2 + 2 = 4.
  • For x4 = 1: λ (0, 0, 1) = F1,4 (0, 1) + F2,4 (0, 1) = 0 + 0 = 0.

Then the minimum lower bound across variable values is LB = 0.
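The λ computation above can be reproduced in a few lines of Python (a sketch of ours, reusing the identical cost table F):

```python
# The identical binary cost table F_{i,j} from the example.
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def local_cost(x1, x2, x4):
    """A4's local cost: F14(x1, x4) + F24(x2, x4)."""
    return F[(x1, x4)] + F[(x2, x4)]

# Restrict to the current context c4 = {x1 = 0, x2 = 0}.
context = {"x1": 0, "x2": 0}
costs = {x4: local_cost(context["x1"], context["x2"], x4) for x4 in (0, 1)}
print(costs)                # {0: 4, 1: 0}
print(min(costs.values()))  # LB = 0
```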
ADOPT by example


       Each agent asynchronously chooses the value of its variable that
                        minimizes its lower bound.

A2 computes, for each possible value of its variable, its local function
restricted to the current context c2 = {(x1 = 0)}
(λ (0, x2 ) = F1,2 (0, x2 )), adding the lower bound message from A4 (lb):
  • For x2 = 0: LB(x2 = 0) = λ (0, x2 = 0) + lb(x2 = 0) = 2 + 0 = 2.
  • For x2 = 1: LB(x2 = 1) = λ (0, x2 = 1) + 0 = 0 + 0 = 0.

A2 changes its value to x2 = 1 with LB = 0.
Backtrack thresholds



               The search strategy is based on lower bounds

Problem
  • Values abandoned before proven to be
    suboptimal
  • Lower/upper bounds only stored for the
    current context

Solution
  • Backtrack thresholds: used to speed up
    the search of previously explored
    solutions.
ADOPT by example


     x1 = 0 → 1 → 0

A1 changes its value and the context with x1 = 0 is visited again.
  • Reconstructing from scratch is inefficient
  • Remembering solutions is expensive
Backtrack thresholds




Solution: Backtrack thresholds
  • Lower bound previously determined by children
  • Polynomial space
  • Control backtracking to efficiently search
  • Key point: do not change value until LB(current value) > threshold
A child agent will not change its variable value so long as cost is less
         than the backtrack threshold given to it by its parent.


      LB(x1 = 0) = 1

[Figure: A1 splits its lower bound, sending threshold t(x1 = 0) = 1/2 to
A2 and to A3 ; each child asks LB(x2 = 0) > 1/2 ? (resp. LB(x3 = 0) > 1/2 ?)
before changing its value]
Rebalance incorrect threshold




How to correctly subdivide threshold among children?

  • Parent distributes the accumulated bound among children
       • Arbitrarily/Using some heuristics

  • Correct subdivision as feedback is received from children
      • LB < t(CONTEXT)
      • t(CONTEXT) = ∑Ci tCi (CONTEXT) + δ
Backtrack Threshold Computation


[Figure: (1) A2 reports a new lower bound LB(x1 = 0) = 1 to A1 ;
(2) A1 rebalances and resends thresholds t(x1 = 0) to A2 and A3 ]

  • When A1 receives a new lower bound from A2 it rebalances the thresholds
  • A1 resends threshold messages to A2 and A3
ADOPT extensions




• BnB-ADOPT [Yeoh et al., 2008] reduces computation time by
  using depth-first search with branch and bound strategy

• [Ali et al., 2005] suggest the use of preprocessing techniques for
  guiding ADOPT search and show that this can result in a
  consistent increase in performance.
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs
  Search Based: ADOPT
  Dynamic Programming DPOP

Approximated Algorithms for DCOPs

Conclusions
DPOP




DPOP (Dynamic Programming Optimization Protocol) [Petcu and
Faltings, 2005]:
  • Based on the dynamic programming paradigm.
  • Special case of Bucket Tree Elimination Algorithm (BTE)
    [Dechter, 2003].
DPOP by example


The same four variables x1 , . . . , x4 with D = {0, 1}, now with five
identical functions F1,2 , F1,3 , F1,4 , F2,3 , F2,4 :

                 Fi, j   xi   xj
                 2       0    0
                 0       0    1
                 0       1    0
                 1       1    1

[Figure: DFS arrangement — x1 is the root with child x2 ; x3 and x4 are
children of x2 ; the back-edges F1,3 and F1,4 make x1 a pseudo-parent of
x3 and x4 ]

          Objective: find the assignment with maximal value
DPOP phases




Given a DFS tree structure, DPOP runs in two phases:
  • Util propagation: agents exchange util messages up the tree.
       • Aim: aggregate all info so that root agent can choose optimal
         value
  • Value propagation: agents exchange value messages down the
    tree.
       • Aim: propagate info so that all agents can make their choice given
         choices of ancestors
Sepi : set of agents preceding Ai in the pseudo-tree order that are
           connected with Ai or with a descendant of Ai .


[Figure: on the example pseudo-tree — Sep1 = ∅, Sep2 = {x1 },
Sep3 = {x1 , x2 }, Sep4 = {x1 , x2 }]
Util message



The Util message Ui→ j that agent Ai sends to its parent A j can be
computed as:

    Ui→ j (Sepi ) = max_{xi} [ ( ⊗_{Ak ∈ Ci} Uk→i ) ⊗ ( ⊗_{Ap ∈ Pi ∪ PPi} Fi,p ) ]

  • the message has size exponential in Sepi
  • the first join aggregates all incoming messages from children
  • the second join aggregates the constraints shared with parent and
    pseudoparents

The ⊗ operator is a join operator that sums up functions with different
but overlapping scopes consistently.
Join operator



F1,4 and F2,4 (from the Fi, j table) are joined by adding their values on
the shared variable x4 ; x4 is then projected out by maximization:

F1,4 ⊗ F2,4 (add)              max{x4 } (F1,4 ⊗ F2,4 ) (project out x4 )

 x1  x2  x4   value              x1  x2   value
 0   0   0    4                  0   0    4  = max(4, 0)
 0   0   1    0                  0   1    2  = max(2, 1)
 0   1   0    2                  1   0    2  = max(2, 1)
 0   1   1    1                  1   1    2  = max(0, 2)
 1   0   0    2
 1   0   1    1
 1   1   0    0
 1   1   1    2
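The join and max-projection just illustrated can be checked with a short sketch (dict-based encoding of ours; variable order is (x1, x2, x4)):

```python
# The identical binary cost table F_{i,j} from the example.
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# Join F14(x1, x4) ⊗ F24(x2, x4): sum on the shared variable x4.
joined = {(x1, x2, x4): F[(x1, x4)] + F[(x2, x4)]
          for x1 in (0, 1) for x2 in (0, 1) for x4 in (0, 1)}

# Project x4 out by maximization, leaving a function of (x1, x2).
projected = {(x1, x2): max(joined[(x1, x2, 0)], joined[(x1, x2, 1)])
             for x1 in (0, 1) for x2 in (0, 1)}

print(joined[(0, 0, 0)])  # 4, as in the table
print(projected)          # {(0, 0): 4, (0, 1): 2, (1, 0): 2, (1, 1): 2}
```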
Complexity is exponential in the largest Sepi .
  The largest Sepi equals the induced width of the DFS tree ordering used.


(Figure: Util propagation on the example. A4 sends
U4→2 = max_{x4}(F1,4 ⊗ F2,4) and A3 sends U3→2 = max_{x3}(F1,3 ⊗ F2,3),
both over Sep4 = Sep3 = {x1, x2}:

  U4→2 = U3→2:  (0,0)→4  (0,1)→2  (1,0)→2  (1,1)→2

A2 combines them with its own shared constraint and sends
U2→1 = max_{x2}(U3→2 ⊗ U4→2 ⊗ F1,2) to the root A1:

  U2→1:  x1 = 0 → 10,  x1 = 1 → 5 )
Value message

Keeping the values of its parent/pseudoparents fixed, each agent finds
the value that maximizes the cost function computed in the Util phase:

    xi* = argmax_{xi} [ Σ_{Aj ∈ Ci} Uj→i(xi, xp*) + Σ_{Aj ∈ Pi ∪ PPi} Fi,j(xi, xj*) ]

where xp* = ∪_{Aj ∈ Pi ∪ PPi} {xj*} is the set of optimal values for Ai's
parent and pseudoparents, received from Ai's parent.

It then propagates this value down the tree through its children:

    Vi→j = {xi = xi*} ∪ ∪_{xs ∈ Sepi ∩ Sepj} {xs = xs*}
(Figure: Value propagation on the example.
  A1 (root):  x1* = argmax_{x1} U2→1(x1) = 0 in the example (10 > 5)
  A2:         x2* = argmax_{x2} (U3→2(x1*, x2) ⊗ U4→2(x1*, x2) ⊗ F1,2(x1*, x2))
  A4:         x4* = argmax_{x4} (F1,4(x1*, x4) ⊗ F2,4(x2*, x4))
  A3:         x3* = argmax_{x3} (F1,3(x1*, x3) ⊗ F2,3(x2*, x3))
Each Vi→j message carries the optimal assignments down the tree.)
DPOP extensions




• MB-DPOP [Petcu and Faltings, 2007] trades off message size
  against the number of messages.
• A-DPOP trades off message size against solution quality [Petcu
  and Faltings, 2005(2)].
Conclusions




• Constraint processing
    • exploit problem structure to solve hard problems efficiently
• DCOP framework
    • applies constraint processing to solve decision making problems
      in Multi-Agent Systems
    • increasingly being applied within real world problems.
References I



•   [Modi et al., 2005] P. J. Modi, W. Shen, M. Tambe, and M. Yokoo. ADOPT: Asynchronous
    distributed constraint optimization with quality guarantees. Artificial Intelligence Journal,
    (161):149-180, 2005.
•   [Yeoh et al., 2008] W. Yeoh, A. Felner, and S. Koenig. BnB-ADOPT: An asynchronous
    branch-and-bound DCOP algorithm. In Proceedings of the Seventh International Joint
    Conference on Autonomous Agents and Multiagent Systems, pages 591-598, 2008.
•   [Ali et al., 2005] S. M. Ali, S. Koenig, and M. Tambe. Preprocessing techniques for
    accelerating the DCOP algorithm ADOPT. In Proceedings of the Fourth International Joint
    Conference on Autonomous Agents and Multiagent Systems, pages 1041-1048, 2005.
•   [Petcu and Faltings, 2005] A. Petcu and B. Faltings. DPOP: A scalable method for
    multiagent constraint optimization. In Proceedings of the Nineteenth International Joint
    Conference on Artificial Intelligence, pages 266-271, 2005.
•   [Dechter, 2003] R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
References II



•   [Petcu and Faltings, 2005(2)] A. Petcu and B. Faltings. A-DPOP: Approximations in
    distributed optimization. In Principles and Practice of Constraint Programming, pages
    802-806, 2005.
•   [Petcu and Faltings, 2007] A. Petcu and B. Faltings. MB-DPOP: A new memory-bounded
    algorithm for distributed optimization. In Proceedings of the Twentieth International Joint
    Conference on Artificial Intelligence, pages 1452-1457, 2007.
•   [S. Fitzpatrick and L. Meertens, 2003] S. Fitzpatrick and L. Meertens. Distributed Sensor
    Networks: A multiagent perspective, chapter Distributed coordination through anarchic
    optimization, pages 257-293. Kluwer Academic, 2003.
•   [R. T. Maheswaran et al., 2004] R. T. Maheswaran, J. P. Pearce, and M. Tambe.
    Distributed algorithms for DCOP: A graphical game-based approach. In Proceedings of
    the Seventeenth International Conference on Parallel and Distributed Computing
    Systems, pages 432-439, 2004.
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Approximate Algorithms: outline

• No guarantees
   – DSA-1, MGM-1 (exchange individual assignments)
   – Max-Sum (exchange functions)
• Off-Line guarantees
   – K-optimality and extensions
• On-Line Guarantees
   – Bounded max-sum
Why Approximate Algorithms
• Motivations
  – Optimality is often not achievable in practical applications
  – Fast, good-enough solutions are all we can have
• Example – Graph coloring
  – Medium size problem (about 20 nodes, three colors per
    node)
  – Number of states to visit for the optimal solution in the worst
    case: 3^20 ≈ 3.5 billion states
• Key problem
  – Providing guarantees on solution quality
Exemplar Application: Surveillance
• Event Detection
   – Vehicles passing on a road
• Energy Constraints
   – Sense/Sleep modes
   – Recharge when sleeping
• Coordination
   – Activity can be detected by a single sensor
   – Roads have different traffic loads
   (Figure: duty cycles over time — a good schedule keeps the road
   covered at all times, a bad schedule leaves gaps)
• Aim [Rogers et al. 10]
   – Focus on the road with the heavier traffic load
   (Figure: heavy traffic road vs. small road)
Surveillance demo
Guarantees on solution quality
• Key Concept: bound the optimal solution
   – Assume a maximization problem
   – x* the optimal solution, x~ a solution
   – alpha = V(x~) / V(x*), the percentage of optimality
       • in [0,1]
       • The higher the better
   – rho = V(x*) / V(x~), the approximation ratio
       • >= 1
       • The lower the better
   – rho (equivalently alpha) is the bound
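A tiny numeric example of the two quality measures, with assumed toy values V(x*) = 20 and V(x~) = 16 for a maximization problem:

```python
# Toy values (illustrative assumptions): optimum and an approximate solution
V_opt, V_approx = 20.0, 16.0

alpha = V_approx / V_opt   # percentage of optimality, in [0, 1]
rho = V_opt / V_approx     # approximation ratio, >= 1

# alpha = 0.8 (80% of optimal), rho = 1.25; note rho = 1 / alpha
```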
Types of Guarantees
(Diagram: accuracy vs. generality of guarantees)
• Instance-specific (high accuracy, less general — use instance-specific
  knowledge): Bounded Max-Sum, DaCSA
• Instance-generic (more general, looser bounds): K-optimality,
  T-optimality, Region Opt.
• No guarantees: DSA-1, MGM-1, Max-Sum
Centralized Local Greedy approaches
• Greedy local search
  – Start from a random solution
  – Make local changes if the global solution improves
  – Local: change the value of a subset of variables, usually one
(Figure: a graph-coloring instance where changing one node's color at a
time improves the global payoff step by step, e.g. from -4 towards 0)
Centralized Local Greedy approaches
• Problems
  – Local minima
  – Standard solutions: RandomWalk, Simulated Annealing
(Figure: an instance stuck in a local minimum, where no single-variable
change improves the solution)
Distributed Local Greedy approaches
• Local knowledge
• Parallel execution:
   – A greedy local move might be harmful/useless
   – Need coordination
(Figure: two neighboring agents moving simultaneously each look locally
beneficial, yet the global payoff stays the same or gets worse)
Distributed Stochastic Algorithm
• Greedy local search with activation probability to
  mitigate issues with parallel executions
• DSA-1: change value of one variable at time
• Initialize agents with a random assignment and
  communicate values to neighbors
• Each agent:
   – Generates a random number and executes only if it is less
     than the activation probability
   – When executing, changes its value to maximize the local gain
   – Communicates the possible variable change to its neighbors
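A minimal sketch of one DSA-1 step, assuming binary domains and a toy anti-coordination gain function (all names and payoffs are illustrative, not the tutorial's code):

```python
import random

def dsa1_step(value, neighbors_values, gain, domain, p):
    """One DSA-1 step for a single agent: with activation probability p,
    move to the value maximizing the local gain against the neighbors'
    last communicated values; otherwise keep the current value."""
    if random.random() >= p:
        return value                   # not activated this round
    return max(domain, key=lambda v: gain(v, neighbors_values))

# Toy anti-coordination constraint: reward 1 per neighbor with a different value
gain = lambda v, nbrs: sum(1 for n in nbrs if n != v)
new_value = dsa1_step(0, [0, 0], gain, [0, 1], p=1.0)   # moves to 1
```

With p = 1.0 the agent is always activated; tuning p down reduces harmful parallel moves, which is exactly the parameter [Zhang et al. 03] show must be tuned per domain.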
DSA-1: Execution Example

rnd > ¼ ?   rnd > ¼ ?        rnd > ¼ ?   rnd > ¼ ?
            -1      -1            -1
                                              P = 1/4
                        -1


0   -2
DSA-1: discussion
• Extremely “cheap” (computation/communication)
• Good performance in various domains
  – e.g. target tracking [Fitzpatrick Meertens 03, Zhang et al. 03],
  – Shows an anytime property (not guaranteed)
  – Benchmarking technique for coordination
• Problems
  – Activation probability must be tuned [Zhang et al. 03]
  – No general rule, hard to characterise results across domains
Maximum Gain Message (MGM-1)
• Coordinate to decide who is going to move
   – Compute and exchange possible gains
   – Agent with maximum (positive) gain executes
• Analysis [Maheswaran et al. 04]
   –   Empirically, similar to DSA
   –   More communication (but still linear)
   –   No Threshold to set
   –   Guaranteed to be monotonic (Anytime behavior)
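A minimal sketch of one MGM-1 round, assuming a synchronous model where gains have already been exchanged; names and the asymmetric toy payoffs (used to avoid a tie) are illustrative assumptions:

```python
def mgm1_round(values, domains, local_gain, neighbors):
    """One MGM-1 round: each agent computes its best local gain; only
    agents whose gain is positive and strictly maximal in their
    neighborhood move (ties are broken by agent id in practice)."""
    gains, moves = {}, {}
    for a in values:
        best_v = max(domains[a], key=lambda v: local_gain(a, v, values))
        gains[a] = local_gain(a, best_v, values) - local_gain(a, values[a], values)
        moves[a] = best_v
    new_values = dict(values)
    for a in values:
        if gains[a] > 0 and all(gains[a] > gains[b] for b in neighbors[a]):
            new_values[a] = moves[a]
    return new_values

# Two agents sharing one anti-coordination constraint (toy example)
neighbors = {'a1': ['a2'], 'a2': ['a1']}
domains = {'a1': [0, 1], 'a2': [0, 1]}
def payoff(a, v, vals):
    other = vals['a2'] if a == 'a1' else vals['a1']
    base = 2 if a == 'a1' else 1     # asymmetric toy rewards to break the tie
    return base if v != other else 0

result = mgm1_round({'a1': 0, 'a2': 0}, domains, payoff, neighbors)
# only a1 (gain 2 > a2's gain 1) changes its value
```

Because only the neighborhood's maximum-gain agent moves, the global value never decreases, which is the monotonicity (anytime) property noted above.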
MGM-1: Example
(Figure: agents exchange their possible gains, e.g. G = -2, G = 0,
G = 0, G = 2; only the agent with the maximum positive gain, G = 2,
changes its value)
Local greedy approaches
• Exchange local values for variables
   – Similar to search based methods (e.g. ADOPT)
• Consider only local information when maximizing
   – Values of neighbors
• Anytime behavior
• Could result in very bad solutions
Max-sum
Agents iteratively compute local functions zi(xi) that depend only on
the variable they control, and choose argmax_{xi} zi(xi).

(Figure: four agents X1..X4 connected by shared constraints. zi(xi) sums
all incoming messages; the message a variable sends towards a shared
constraint combines all incoming messages except the one received from
that constraint.)
Factor Graph and GDL
• Factor Graph
   – [Kschischang, Frey, Loeliger 01]
   – Computational framework to represent factored computation
   – Bipartite graph: variable nodes - factor nodes

    H(X1, X2, X3) = H(X1) + H(X2 | X1) + H(X3 | X1)

(Figure: the constraint graph over x1, x2, x3 and the corresponding
factor graph with factor nodes H(X1), H(X2 | X1), H(X3 | X1).)
Max-Sum on acyclic graphs
• Max-sum is optimal on acyclic graphs
   – Different branches are independent
   – Each agent can build a correct estimation of its contribution to the
     global problem (z functions)
• Message equations very similar to Util messages in DPOP
   – GDL generalizes DPOP [Vinyals et al. 2010a]
   – Each message sums up information from other nodes, followed by a
     local maximization step
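The two message types can be sketched on a tiny factor graph; the tabular encoding and helper names are illustrative assumptions (the tutorial's own implementation is the jMaxSum Java library):

```python
from itertools import product

def var_to_factor(incoming):
    """q message: sum of all factor->variable messages except the target's."""
    return {v: sum(m[v] for m in incoming) for v in [0, 1]}

def factor_to_var(factor, scope, target, q_msgs):
    """r message: maximize factor + incoming q messages over the other variables."""
    r = {v: float('-inf') for v in [0, 1]}
    for assign in product([0, 1], repeat=len(scope)):
        a = dict(zip(scope, assign))
        val = factor[assign] + sum(q_msgs[x][a[x]] for x in scope if x != target)
        r[a[target]] = max(r[a[target]], val)
    return r

# One factor over (x1, x2); x2 has a single zero incoming message
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}
q_x2 = var_to_factor([{0: 0, 1: 0}])
r = factor_to_var(F, ['x1', 'x2'], 'x1', {'x2': q_x2})
# r[0] = max(2, 0) = 2, r[1] = max(0, 1) = 1; x1 would choose 0
```

Note how `factor_to_var` is a per-factor version of DPOP's join-and-project step, which is the sense in which GDL generalizes DPOP.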
(Loopy) Max-sum Performance
• Good performance on loopy networks [Farinelli et al. 08]
   – When it converges, results are very good
      • Interesting results when there is only one cycle [Weiss 00]
   – We could remove cycles but pay an exponential price (see
     DPOP)
   – Java library for max-sum: http://code.google.com/p/jmaxsum/
Max-Sum for low power devices
• Low overhead
   – Msgs number/size
• Asynchronous computation
   – Agents take decisions whenever new messages arrive
• Robust to message loss
Max-sum on hardware
Max-Sum for UAVs
Task Assignment for UAVs [Delle Fave et al 12]
(Figure: UAVs stream video of interest points submitted by operators;
task assignment is coordinated with Max-Sum.)

Task utility
• Priority
• Urgency: task completion runs from when the first assigned UAV
  reaches the task until the last assigned UAV leaves it (considering
  battery life)

Factor Graph Representation
(Figure: each UAV i controls a variable Xi with utility factor Ui;
tasks T1, T2, T3 link the UAVs' factors to the PDAs that submit them.)
UAVs Demo
Quality guarantees for approximate techniques
• Key area of research
• Address trade-off between guarantees and
  computational effort
• Particularly important for many real world applications
   – Critical (e.g. Search and rescue)
– Constrained resources (e.g. Embedded devices)
   – Dynamic settings
Instance-generic guarantees
(Diagram: accuracy vs. generality, as before)
• Instance-specific: Bounded Max-Sum, DaCSA
• Instance-generic — characterise solution quality without running the
  algorithm: K-optimality, T-optimality, Region Opt.
• No guarantees: DSA-1, MGM-1, Max-Sum
K-Optimality framework
• Given a characterization of solution gives bound on
  solution quality [Pearce and Tambe 07]
• Characterization of solution: k-optimal
• K-optimal solution:
   – Corresponding value of the objective function can not be
     improved by changing the assignment of k or less
     variables.
K-Optimal solutions
(Figure: a constraint graph with rewards on edges. The shown assignment
is 2-optimal — no change of two or fewer variables improves it — but not
3-optimal, since changing three variables yields a better solution.)
Bounds for K-Optimality
For any DCOP with non-negative rewards [Pearce and Tambe 07], a
k-optimal solution x, with n agents and constraints of maximum arity m,
satisfies:

    V(x) >= [ C(n-m, k-m) / (C(n, k) - C(n-m, k)) ] · V(x*)

Binary network (m = 2):

    V(x) >= [ (k-1) / (2n-k-1) ] · V(x*)
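The slide's bound can be evaluated directly; the general combinatorial form is a reconstruction consistent with the binary case (k-1)/(2n-k-1) stated for [Pearce and Tambe 07], so treat the function below as a sketch of that result:

```python
from math import comb

def k_opt_bound(n, k, m=2):
    """Worst-case fraction of the optimum guaranteed by a k-optimal
    solution of a DCOP with n agents and constraints of arity m
    (reconstruction of the [Pearce and Tambe 07] bound)."""
    return comb(n - m, k - m) / (comb(n, k) - comb(n - m, k))

# Binary network, n = 10 agents, k = 3:
b = k_opt_bound(10, 3)   # (k-1)/(2n-k-1) = 2/16 = 0.125
```

Note how quickly the bound degrades with n, which is the "the higher the number of agents, the worse" observation on the next slide.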
K-Optimality Discussion
• Need algorithms for computing k-optimal solutions
   – DSA-1, MGM-1 k=1; DSA-2, MGM-2 k=2 [Maheswaran et al. 04]
   – DALO for generic k (and t-optimality) [Kiekintveld et al. 10]
• The higher k the more complex the computation
  (exponential)

Percentage of optimality:
• The higher k, the better
• The higher the number of agents, the worse
Trade-off between generality and solution quality
• K-optimality is based on worst-case analysis
• Assuming more knowledge gives much better bounds
• Knowledge on structure [Pearce and Tambe 07]

Trade-off between generality and solution quality
• Knowledge on rewards [Bowring et al. 08]
• Beta: ratio of the least minimum reward to the maximum
Off-Line Guarantees: Region Optimality
• k-optimality: use size as a criterion for optimality
• t-optimality: use distance to a central agent in the
  constraint graph
• Region Optimality: define regions based on general
  criteria (e.g. s-size bounded distance) [Vinyals et al 11]
• Ack: Meritxell Vinyals
(Figure: on a four-variable graph x0..x3, the 3-size regions, the
1-distance regions, and arbitrary C regions.)
Size-Bounded Distance
• Region optimality can explore new regions: s-size bounded distance
• One region per agent: the largest t-distance group whose size is less
  than s
• S-size-bounded distance
   – C-DALO: extension of DALO for general regions
   – Can provide better bounds and keep the size and number of regions
     under control
(Figure: 3-size bounded distance regions on the example graph, with
t = 0 or t = 1 per agent.)
Max-Sum and Region Optimality
• Can use region optimality to provide bounds for Max-Sum
  [Vinyals et al 10b]
• Upon convergence Max-Sum is optimal on SLT regions of
  the graph [Weiss 00]
• Single Loops and Trees (SLT): all groups of agents whose
  vertex-induced subgraph contains at most one cycle
(Figure: the SLT regions of a four-variable loopy graph.)
Bounds for Max-Sum
• Complete graphs: same as 3-size optimality
• Bipartite graphs and 2D grids: topology-specific bounds
• Variable-disjoint cycles: very high quality guarantees if the smallest
  cycle is large
Instance-specific guarantees
(Diagram: accuracy vs. generality, as before)
• Instance-specific — characterise solution quality after/while running
  the algorithm: Bounded Max-Sum, DaCSA
• Instance-generic: K-optimality, T-optimality, Region Opt.
• No guarantees: DSA-1, MGM-1, Max-Sum
Bounded Max-Sum
Aim: remove cycles from the factor graph while avoiding exponential
computation/communication (e.g. no junction tree).
Key idea: solve a relaxed problem instance [Rogers et al. 11]
(Figure: from the cyclic factor graph over X1..X3 and F1..F3, build a
spanning tree, compute the bound, then run Max-Sum to obtain the
optimal solution on the tree.)
Factor Graph Annotation
• Compute a weight for each edge
   – the maximum possible impact of the variable on the function
(Figure: each edge (xi, Fj) of the factor graph is annotated with a
weight wij.)

Factor Graph Modification
• Build a maximum spanning tree
   – Keep the higher weights
• Cut the remaining dependencies
• Modify the functions accordingly
• Compute the bound from the cut edges, e.g. W = w22 + w23
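The edge-weight computation can be sketched for tabular factors; the helper name and binary-domain assumption are illustrative, not the algorithm's reference implementation:

```python
from itertools import product

def edge_weight(factor, scope, var):
    """Maximum possible impact of `var` on `factor`: the largest change
    in the factor value obtainable by changing `var` alone, maximized
    over all assignments to the other variables in the scope."""
    others = [s for s in scope if s != var]
    w = 0.0
    for rest in product([0, 1], repeat=len(others)):
        fixed = dict(zip(others, rest))
        vals = []
        for v in [0, 1]:
            fixed[var] = v
            vals.append(factor[tuple(fixed[s] for s in scope)])
        w = max(w, max(vals) - min(vals))
    return w

F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}
w_x1 = edge_weight(F, ['x1', 'x2'], 'x1')   # max(|2-0|, |0-1|) = 2
```

If edges with total weight W are cut to obtain the spanning tree, the tree-optimal value Vtree bounds the true optimum as V* <= Vtree + W, which is the bound the slide computes.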
Results: Random Binary Network
(Figure: optimal value, approximate value, lower bound and upper bound.)
• The bound is significant
   – The approximation ratio is typically 1.23 (81%)
• Comparison with k-optimality with knowledge of the reward
  structure: much more accurate, less general
Discussion
• Comparison with other data-dependent techniques
   – BnB-ADOPT [Yeoh et al 09]
      • Fix an error bound and execute until the error bound is met
      • Worst case computation remains exponential
   – ADPOP [Petcu and Faltings 05b]
      • Can fix message size (and thus computation) or error bound and
        leave the other parameter free
• Divide and coordinate [Vinyals et al 10]
   – Divide problems among agents and negotiate agreement
     by exchanging utility
   – Provides anytime quality guarantees
Summary
• Approximation techniques crucial for practical applications:
  surveillance, rescue, etc.
• DSA, MGM, Max-Sum heuristic approaches
   – Low coordination overhead, acceptable performance
   – No guarantees (convergence, solution quality)
• Instance generic guarantees:
   – K-optimality framework
   – Loose bounds for large scale systems
• Instance specific guarantees
   – Bounded max-sum, ADPOP, BnB-ADOPT
   – Performance depend on specific instance
References I
DOCPs for MRS
•   [Delle Fave et al 12] A methodology for deploying the max-sum algorithm and a case study on
    unmanned aerial vehicles. In, IAAI 2012
•   [Taylor et al. 11] Distributed On-line Multi-Agent Optimization Under Uncertainty: Balancing
    Exploration and Exploitation, Advances in Complex Systems
MGM
•   [Maheswaran et al. 04] Distributed Algorithms for DCOP: A Graphical Game-Based Approach,
    PDCS-2004
DSA
•   [Fitzpatrick and Meertens 03] Distributed Coordination through Anarchic Optimization,
    Distributed Sensor Networks: a multiagent perspective.
•   [Zhang et al. 03] A Comparative Study of Distributed Constraint algorithms, Distributed
    Sensor Networks: a multiagent perspective.
Max-Sum
•   [Stranders at al 09] Decentralised Coordination of Mobile Sensors Using the Max-Sum
    Algorithm, AAAI 09
•   [Rogers et al. 10] Self-organising Sensors for Wide Area Surveillance Using the Max-sum
    Algorithm, LNCS 6090 Self-Organizing Architectures
•   [Farinelli et al. 08] Decentralised coordination of low-power embedded devices using the
    max-sum algorithm, AAMAS 08
References II
Instance-based Approximation
•   [Yeoh et al. 09] Trading off solution quality for faster computation in DCOP search algorithms,
    IJCAI 09
•   [Petcu and Faltings 05b] A-DPOP: Approximations in Distributed Optimization, CP 2005
•   [Rogers et al. 11] Bounded approximate decentralised coordination via the max-sum
    algorithm, Artificial Intelligence 2011.
Instance-generic Approximation
•   [Vinyals et al 10b] Worst-case bounds on the quality of max-product fixed-points, NIPS 10
•   [Vinyals et al 11] Quality guarantees for region optimal algorithms, AAMAS 11
•   [Pearce and Tambe 07] Quality Guarantees on k-Optimal Solutions for Distributed Constraint
    Optimization Problems, IJCAI 07
•   [Bowring et al. 08] On K-Optimal Distributed Constraint Optimization Algorithms: New
    Bounds and Algorithms, AAMAS 08
•   [Weiss 00] Correctness of local probability propagation in graphical models with loops, Neural
    Computation
•   [Kiekintveld et al. 10] Asynchronous Algorithms for Approximate Distributed Constraint
    Optimization with Quality Bounds, AAMAS 10

How to setup Pycharm environment for Odoo 17.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 

T12 Distributed search and constraint handling

  • 8. Distributed Constraint Optimization Problems (DCOPs) We will consider Distributed Constraint Optimization Problems (DCOPs) where: Each agent negotiates locally with just a subset of other agents (usually called neighbours), namely those that can directly influence its behavior.
  • 9. Distributed Constraint Optimization Problems (DCOPs) After reading this chapter, you will understand: • The mathematical formulation of a DCOP • The main exact solution techniques for DCOPs • Key differences, benefits and limitations • The main approximate solution techniques for DCOPs • Key differences, benefits and limitations • The quality guarantees these approaches provide: • Types of quality guarantees • Frameworks and techniques
  • 10. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 11. Constraint Networks A constraint network N is formally defined as a tuple X, D,C where: • X = {x1 , . . . , xn } is a set of discrete variables; • D = {D1 , . . . , Dn } is a set of variable domains, which enumerate all possible values of the corresponding variables; and • C = {C1 , . . . ,Cm } is a set of constraints; where a constraint Ci is defined on a subset of variables Si ⊆ X which comprise the scope of the constraint • r = |Si | is the arity of the constraint • Two types: hard or soft
  • 12. Hard constraints • A hard constraint Cih is a relation Ri that enumerates all the valid joint assignments of all variables in the scope of the constraint. Ri ⊆ Di1 × . . . × Dir Ri xj xk 0 1 1 0
  • 13. Soft constraints • A soft constraint Cis is a function Fi that maps every possible joint assignment of all variables in the scope to a real value. Fi : Di1 × . . . × Dir → ℜ Fi xj xk 2 0 0 0 0 1 0 1 0 1 1 1
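To make the two constraint types concrete, a minimal Python sketch (an illustrative encoding, not part of the original slides) representing the hard relation Ri and the soft function Fi from the tables above:

```python
# Hard constraint: the relation R_i enumerates the valid joint
# assignments of the variables in its scope (here: x_j != x_k).
R = {(0, 1), (1, 0)}

def satisfies(assignment):
    """True iff the joint assignment (x_j, x_k) belongs to R_i."""
    return assignment in R

# Soft constraint: the function F_i maps every joint assignment
# to a real value (table as on the slide).
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

print(satisfies((0, 1)), F[(0, 0)])   # True 2
```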
  • 14. Binary Constraint Networks • Binary constraint networks are those where each constraint (soft or hard) is defined over two variables • Every constraint network can be mapped to a binary constraint network • requires the addition of variables and constraints • may add complexity to the model • They can be represented by a constraint graph (figure: variables x1 , . . . , x4 with constraints F1,2 , F1,3 , F1,4 , F2,4 )
  • 15. Different objectives, different problems • Constraint Satisfaction Problem (CSP) • Objective: find an assignment for all the variables in the network that satisfies all constraints. • Constraint Optimization Problem (COP) • Objective: find an assignment for all the variables in the network that satisfies all constraints and optimizes a global function. • Global function = aggregation (typically sum) of local functions. F(x) = ∑i Fi (xi )
  • 16. Distributed Constraint Reasoning When operating in a decentralized context: • a set of agents control variables • agents interact to find a solution to the constraint network (figure: agents A1 , . . . , A4 controlling the variables)
  • 17. Distributed Constraint Reasoning Two types of decentralized problems: • distributed CSP (DCSP) • distributed COP (DCOP) Here, we focus on DCOPs.
  • 18. Distributed Constraint Optimization Problem (DCOP) A DCOP consists of a constraint network N = X, D,C and a set of agents A = {A1 , . . . , Ak } where each agent: • controls a subset of the variables Xi ⊆ X • is only aware of constraints that involve variables it controls • communicates only with its neighbours
  • 19. Distributed Constraint Optimization Problem (DCOP) • Agents are assumed to be fully cooperative • Goal: find the assignment that optimizes the global function, not their own local utilities. • Solving a COP is NP-hard, and a DCOP is as hard as a COP.
  • 20. Motivation Why distribute? • Privacy • Robustness • Scalability
  • 21. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 22. Real World Applications Many standard benchmark problems in computer science can be modeled using the DCOP framework: • graph coloring As can many real world applications: • human-agent organizations (e.g. meeting scheduling) • sensor networks and robotics (e.g. target tracking)
  • 23. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Graph coloring Meeting Scheduling Target Tracking Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 24. Graph coloring • Popular benchmark • Simple formulation • Complexity controlled with few parameters: • Number of available colors • Number of nodes • Density (#constraints/#nodes) • Many versions of the problem: • CSP, MaxCSP, COP
  • 25. Graph coloring - CSP • Nodes can take k colors • Any two adjacent nodes should have different colors • If it happens this is a conflict Yes! No!
  • 26. Graph coloring - Max-CSP • Minimize the number of conflicts 0 -1
  • 27. Graph coloring - COP • Different weights for violated constraints • Preferences for different colors (figure: example with edge weights and color preferences 0, −2, −1, −3, −2, −1)
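One possible way to evaluate such a weighted colouring is sketched below; the node preferences and edge weights are made-up values, only the structure of the objective (preferences plus penalties for violated constraints) follows the slides:

```python
# Global value of a colouring: colour preferences plus the (negative)
# weights of violated constraints, i.e. edges whose endpoints match.
edge_weight = {('a', 'b'): -2, ('b', 'c'): -1}   # hypothetical weights
preference = {'a': {'red': 0, 'blue': -1},
              'b': {'red': -1, 'blue': 0},
              'c': {'red': 0, 'blue': -2}}

def value(coloring):
    v = sum(preference[n][c] for n, c in coloring.items())
    v += sum(w for (n1, n2), w in edge_weight.items()
             if coloring[n1] == coloring[n2])     # conflict penalties
    return v

print(value({'a': 'red', 'b': 'blue', 'c': 'red'}))   # 0 (no conflicts)
```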
  • 28. Graph coloring - DCOP • Each node is controlled by one agent • Each agent: • has preferences for different colors • communicates with its direct neighbours in the graph • In the example, A1 and A2 exchange preferences and conflicts, while A3 and A4 do not communicate
  • 29. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Graph coloring Meeting Scheduling Target Tracking Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 30. Meeting Scheduling Motivation: • Privacy • Robustness • Scalability
  • 31. Meeting Scheduling In large organizations many people, possibly working in different departments, are involved in a number of work meetings.
  • 32. Meeting Scheduling People might have various private preferences on meeting start times Better after 12:00am
  • 33. Meeting Scheduling Two meetings that share a participant cannot overlap Window: 15:00-18:00 Duration: 2h Window: 15:00-17:00 Duration: 1h
  • 34. DCOP formalization for the meeting scheduling problem • A set of agents representing participants • A set of variables representing meeting starting times according to a participant. • Hard Constraints: • Starting meeting times across different agents are equal • Meetings for the same agent are non-overlapping. • Soft Constraints: • Represent agent preferences on meeting starting times. Objective: find a valid schedule for the meeting while maximizing the sum of individuals’ preferences.
  • 35. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Graph coloring Meeting Scheduling Target Tracking Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 36. Target Tracking Motivation: • Privacy • Robustness • Scalability
  • 37. Target Tracking A set of sensors tracking a set of targets in order to provide an accurate estimate of their positions. T4 T1 T3 T2 Crucial for surveillance and monitoring applications.
  • 38. Target Tracking Sensors can have different sensing modalities that impact on the accuracy of the estimation of the targets’ positions. MODES T4 MODES MODES T1 T3 MODES T2
  • 39. Target Tracking Collaboration among sensors is crucial to improve system performance T4 T1 T3 T2
  • 40. DCOP formalization for the target tracking problem • Agents represent sensors • Variables encode the different sensing modalities of each sensor • Constraints • relate to a specific target • represent how sensor modalities impacts on the tracking performance • Objective: • Maximize coverage of the environment • Provide accurate estimations of potentially dangerous targets
  • 41. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 42. Complete Algorithms Pros: always find an optimal solution. Cons: exhibit an exponentially increasing coordination overhead, hence very limited scalability on general problems.
  • 43. Complete Algorithms • Completely decentralised • Search-based. • Synchronous: SyncBB, AND/OR search • Asynchronous: ADOPT, NCBB and AFB. • Dynamic programming. • Partially decentralised • OptAPO Next, we focus on completely decentralised algorithms
  • 44. Decentralised Complete Algorithms Search-based Dynamic programming • Uses distributed search • Uses distributed inference • Exchange individual values • Exchange constraints • Small messages but • Few messages but . . . exponentially many . . . exponentially large Representative: ADOPT [Modi Representative: DPOP [Petcu et al., 2005] and Faltings, 2005]
  • 45. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Search Based: ADOPT Dynamic Programming DPOP Approximated Algorithms for DCOPs Conclusions
  • 46. ADOPT ADOPT (Asynchronous Distributed OPTimization) [Modi et al., 2005]: • Distributed backtrack search using a best-first strategy • Best value based on local information: • Lower/upper bound estimates of each possible value of its variable • Backtrack thresholds used to speed up the search of previously explored solutions. • Termination conditions that check if the bound interval is less than a given valid error bound (0 if optimal)
  • 47. ADOPT by example 4 variables (4 agents): x1 , x2 , x3 , x4 with D = {0, 1} 4 identical cost functions Fi, j on the constraints F1,2 , F1,3 , F1,4 , F2,4 , with Fi, j (0, 0) = 2, Fi, j (0, 1) = 0, Fi, j (1, 0) = 0, Fi, j (1, 1) = 1 Goal: find a variable assignment with minimal cost Solution: x1 = 1, x2 = 0, x3 = 0 and x4 = 1 giving total cost 1.
  • 48. DFS arrangement • Before executing ADOPT, agents must be arranged in a depth first search (DFS) tree. • DFS trees have been frequently used in optimization because they have two interesting properties: • Agents in different branches of the tree do not share any constraints; • Every constraint network admits a DFS tree.
  • 49. ADOPT by example DFS arrangement: A1 (root, x1 ) has children A2 (x2 ) and A3 (x3 ); A4 (x4 ) is a child of A2 and a pseudo-child of A1 (through the constraint F1,4 ).
  • 50. Cost functions The local cost function for an agent Ai (δ (xi )) is the sum of the values of constraints involving only higher neighbors in the DFS.
  • 51. ADOPT by example Local cost functions: δ (x1 ) = 0 for A1 ; δ (x1 , x2 ) = F1,2 (x1 , x2 ) for A2 ; δ (x1 , x3 ) = F1,3 (x1 , x3 ) for A3 ; δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 ) for A4 .
  • 52. Initialization Each agent initially chooses a random value for their variables and initialize the lower and upper bounds to zero and infinity respectively. x1 = 0, LB = 0,UB = ∞ A1 x2 = 0, LB = 0,UB = ∞ A2 A3 x3 = 0, LB = 0,UB = ∞ x4 = 0, LB = 0,UB = ∞ A4
  • 53. ADOPT by example Value messages are sent by an agent to all its neighbors that are lower in the DFS tree. A1 sends three value messages (x1 = 0) to A2 , A3 and A4 , informing them that its current value is 0.
  • 54. ADOPT by example Current Context: a partial variable assignment maintained by each agent that records the assignment of all higher neighbours in the DFS. A1 • Updated by each VALUE message c2 : {x1 = 0} A2 A3 • If current context is not c3 : {x1 = 0} compatible with some child context, the latter is re-initialized A4 (also the child bound) c4 : {x1 = 0, x2 = 0}
  • 55. ADOPT by example Each agent Ai sends a cost message [LB,UB, ci ] to its parent A p . Each cost message reports: • The minimum lower bound (LB) • The maximum upper bound (UB) • The context (ci ) In the example, A2 sends [0, 0, c2 ], A3 sends [0, ∞, c3 ] and A4 sends [0, 0, c4 ].
  • 56. Lower bound computation Each agent evaluates for each possible value of its variable: • its local cost function with respect to the current context • adding all the compatible lower bound messages received from children. Analogous computation for upper bounds
  • 57. ADOPT by example Consider the lower bound in the cost message sent by A4 : A1 • Recall that A4 local cost function is: δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 ) • Restricted to the current context c4 = {(x1 = 0, x2 = 0)}: λ (0, 0, x4 ) = F1,4 (0, x4 ) + F2,4 (0, x4 ). A2 A3 • For x4 = 0: λ (0, 0, 0) = F1,4 (0, 0) + F2,4 (0, 0) = 2 + 2 = 4. 4] c • For x4 = 1: [0, 0, λ (0, 0, 1) = F1,4 (0, 1) + F2,4 (0, 1) = 0 + 0 = 0. A4 Then the minimum lower bound across variable values is LB = 0.
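The slide's computation can be replayed directly; a small sketch using the cost tables of the running example:

```python
# A4's local cost under context c4 = {x1 = 0, x2 = 0}:
# delta(x1, x2, x4) = F14(x1, x4) + F24(x2, x4).
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # identical F14 and F24

def delta(x1, x2, x4):
    return F[(x1, x4)] + F[(x2, x4)]

# Evaluate both values of x4 and take the minimum as the lower bound LB.
costs = {x4: delta(0, 0, x4) for x4 in (0, 1)}
LB = min(costs.values())
print(costs, LB)   # {0: 4, 1: 0} 0
```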
  • 58. ADOPT by example Each agent asynchronously chooses the value of its variable that minimizes its lower bound. A2 computes, for each possible value of its variable, its local function restricted to the current context c2 = {(x1 = 0)} (λ (0, x2 ) = F1,2 (0, x2 )), adding the lower bound message from A4 (lb). • For x2 = 0: LB(x2 = 0) = λ (0, x2 = 0) + lb(x2 = 0) = 2 + 0 = 2. • For x2 = 1: LB(x2 = 1) = λ (0, x2 = 1) + 0 = 0 + 0 = 0. A2 changes its value to x2 = 1 with LB = 0.
  • 59. Backtrack thresholds The search strategy is based on lower bounds Problem • Values abandoned before proven to be suboptimal • Lower/upper bounds only stored for the current context Solution • Backtrack thresholds: used to speed up the search of previously explored solutions.
  • 60. ADOPT by example x1 = 0 → 1 → 0 A1 A1 changes its value and the context with x1 = 0 is visited again. • Reconstructing from scratch is inefficient A2 A3 • Remembering solutions is expensive A4
  • 61. Backtrack thresholds Solution: Backtrack thresholds • Lower bound previously determined by children • Polynomial space • Control backtracking to efficiently search • Key point: do not change value until LB(currentvalue)> threshold
  • 62. A child agent will not change its variable value so long as cost is less than the backtrack threshold given to it by its parent. In the example, LB(x1 = 0) = 1 and A1 splits this bound among its children: t(x1 = 0) = 1/2 for each, so A2 checks LB(x2 = 0) > 1/2 ? and A3 checks LB(x3 = 0) > 1/2 ?
  • 63. Rebalance incorrect threshold How to correctly subdivide threshold among children? • Parent distributes the accumulated bound among children • Arbitrarily/Using some heuristics • Correct subdivision as feedback is received from children • LB ≤ t(CONTEXT) • t(CONTEXT) = δ + ∑Ci ti (CONTEXT)
  • 64. Backtrack Threshold Computation • When A1 receives a new lower bound from A2 , it rebalances the thresholds among its children • A1 resends threshold messages to A2 and A3
  • 65. ADOPT extensions • BnB-ADOPT [Yeoh et al., 2008] reduces computation time by using depth-first search with branch and bound strategy • [Ali et al., 2005] suggest the use of preprocessing techniques for guiding ADOPT search and show that this can result in a consistent increase in performance.
  • 66. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Search Based: ADOPT Dynamic Programming DPOP Approximated Algorithms for DCOPs Conclusions
  • 67. DPOP DPOP (Dynamic Programming Optimization Protocol) [Petcu and Faltings, 2005]: • Based on the dynamic programming paradigm. • Special case of Bucket Tree Elimination Algorithm (BTE) [Dechter, 2003].
  • 68. DPOP by example Constraints F1,2 , F1,3 , F2,3 , F1,4 , F2,4 , all with Fi, j (0, 0) = 2, Fi, j (0, 1) = 0, Fi, j (1, 0) = 0, Fi, j (1, 1) = 1 DFS arrangement: x1 (root); x2 child of x1 ; x3 and x4 children of x2 and pseudochildren of x1 Objective: find the assignment with maximal value
  • 69. DPOP phases Given a DFS tree structure, DPOP runs in two phases: • Util propagation: agents exchange util messages up the tree. • Aim: aggregate all info so that root agent can choose optimal value • Value propagation: agents exchange value messages down the tree. • Aim: propagate info so that all agents can make their choice given choices of ancestors
  • 70. Sepi : set of agents preceding Ai in the pseudo-tree order that are connected with Ai or with a descendant of Ai . In the example: Sep1 = ∅, Sep2 = {x1 }, Sep3 = {x1 , x2 }, Sep4 = {x1 , x2 }
  • 71. Util message The Util message Ui→ j that agent Ai sends to its parent A j can be computed as: Ui→ j (Sepi ) = max_{xi} [ (⊗_{Ak ∈ Ci} Uk→i ) ⊗ (⊗_{A p ∈ Pi ∪ PPi} Fi,p ) ] i.e. the join of all incoming messages from children with the constraints shared with parents/pseudoparents, with xi projected out; the message size is exponential in |Sepi |. The ⊗ operator is a join operator that sums up functions with different but overlapping scopes consistently.
  • 72. Join operator Example: (F1,4 ⊗ F2,4 )(x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 ). Projecting x4 out by maximization, max_{x4} (F1,4 ⊗ F2,4 ), yields a function over (x1 , x2 ): (0, 0) → max(4, 0) = 4, (0, 1) → max(2, 1) = 2, (1, 0) → max(2, 1) = 2, (1, 1) → max(0, 2) = 2.
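The join and projection steps can be written in a few lines; a sketch using the cost tables as read from the running example:

```python
# DPOP's join (sum functions on overlapping scopes) and projection
# (eliminate a variable by maximisation), for F14 and F24.
from itertools import product

F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # both cost tables

# Join: (F14 ⊗ F24)(x1, x2, x4) = F14(x1, x4) + F24(x2, x4).
joined = {(x1, x2, x4): F[(x1, x4)] + F[(x2, x4)]
          for x1, x2, x4 in product((0, 1), repeat=3)}

# Projection: max out x4, leaving a function over (x1, x2).
projected = {(x1, x2): max(joined[(x1, x2, x4)] for x4 in (0, 1))
             for x1, x2 in product((0, 1), repeat=2)}

print(projected)   # {(0, 0): 4, (0, 1): 2, (1, 0): 2, (1, 1): 2}
```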
  • 73. Complexity is exponential in the largest Sepi . The largest Sepi equals the induced width of the DFS tree ordering used. In the example, A3 and A4 send U3→2 = max_{x3} (F1,3 ⊗ F2,3 ) and U4→2 = max_{x4} (F1,4 ⊗ F2,4 ), both with values (0, 0) → 4, (0, 1) → 2, (1, 0) → 2, (1, 1) → 2; A2 then sends U2→1 = max_{x2} (U3→2 ⊗ U4→2 ⊗ F1,2 ) to the root A1 , with U2→1 (x1 = 0) = 10 and U2→1 (x1 = 1) = 5.
  • 74. Value message Keeping fixed the values of parent/pseudoparents, each agent finds the value that maximizes the cost function computed in the util phase: xi* = arg max_{xi} [ ∑_{A j ∈ Ci} U j→i (xi , x*p ) + ∑_{A j ∈ Pi ∪ PPi} Fi, j (xi , x*j ) ] where x*p = ∪_{A j ∈ Pi ∪ PPi} {x*j } is the set of optimal values of Ai 's parent and pseudoparents, received from Ai 's parent. It then propagates these values down the tree through its children: Vi→ j = {xi = xi* } ∪ {xs = xs* : xs ∈ Sepi ∩ Sep j }
  • 75. In the example: x1* = arg max_{x1} U2→1 (x1 ); x2* = arg max_{x2} (U3→2 (x1* , x2 ) ⊗ U4→2 (x1* , x2 ) ⊗ F1,2 (x1* , x2 )); x3* = arg max_{x3} (F1,3 (x1* , x3 ) ⊗ F2,3 (x2* , x3 )); x4* = arg max_{x4} (F1,4 (x1* , x4 ) ⊗ F2,4 (x2* , x4 )).
  • 76. DPOP extensions • MB-DPOP [Petcu and Faltings, 2007] trades-off message size against the number of messages. • A-DPOP trades-off message size against solution quality [Petcu and Faltings, 2005(2)].
  • 77. Conclusions • Constraint processing • exploit problem structure to solve hard problems efficiently • DCOP framework • applies constraint processing to solve decision making problems in Multi-Agent Systems • increasingly being applied within real world problems.
  • 78. References I • [Modi et al., 2005] P. J. Modi, W. Shen, M. Tambe, and M. Yokoo. ADOPT: Asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence Journal, (161):149-180, 2005. • [Yeoh et al., 2008] W. Yeoh, A. Felner, and S. Koenig. BnB-ADOPT: An asynchronous branch-and-bound DCOP algorithm. In Proceedings of the Seventh International Joint Conference on Autonomous Agents and Multiagent Systems, pages 591-598, 2008. • [Ali et al., 2005] S. M. Ali, S. Koenig, and M. Tambe. Preprocessing techniques for accelerating the DCOP algorithm ADOPT. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 1041-1048, 2005. • [Petcu and Faltings, 2005] A. Petcu and B. Faltings. DPOP: A scalable method for multiagent constraint optimization. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, pages 266-271, 2005. • [Dechter, 2003] R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
  • 79. References II • [Petcu and Faltings, 2005(2)] A. Petcu and B. Faltings. A-DPOP: Approximations in distributed optimization. In Principles and Practice of Constraint Programming, pages 802-806, 2005. • [Petcu and Faltings, 2007] A. Petcu and B. Faltings. MB-DPOP: A new memory-bounded algorithm for distributed optimization. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, pages 1452-1457, 2007. • [S. Fitzpatrick and L. Meertens, 2003] S. Fitzpatrick and L. Meertens. Distributed Sensor Networks: A multiagent perspective, chapter Distributed coordination through anarchic optimization, pages 257-293. Kluwer Academic, 2003. • [R. T. Maheswaran et al., 2004] R. T. Maheswaran, J. P. Pearce, and M. Tambe. Distributed algorithms for DCOP: A graphical game-based approach. In Proceedings of the Seventeenth International Conference on Parallel and Distributed Computing Systems, pages 432-439, 2004.
  • 80. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 81. Approximate Algorithms: outline • No guarantees – DSA-1, MGM-1 (exchange individual assignments) – Max-Sum (exchange functions) • Off-Line guarantees – K-optimality and extensions • On-Line Guarantees – Bounded max-sum
  • 82. Why Approximate Algorithms • Motivations – Often optimality in practical applications is not achievable – Fast good-enough solutions are all we can have • Example – Graph coloring – Medium size problem (about 20 nodes, three colors per node) – Number of states to visit for the optimal solution in the worst case: 3^20 ≈ 3.5 billion states • Key problem – Providing guarantees on solution quality
  • 83. Exemplar Application: Surveillance • Event Detection – Vehicles passing on a road • Energy Constraints – Sense/Sleep modes – Recharge when sleeping • Coordination – Activity can be detected by a single sensor's duty cycle – Roads have different traffic loads over time • Aim – Focus duty cycles on the roads with more traffic load [Rogers et al. 10]
  • 85. Guarantees on solution quality • Key Concept: bound the optimal solution – Assume a maximization problem – x* the optimal solution, x̃ a solution – α = V (x̃)/V (x* ): percentage of optimality • in [0,1] • The higher the better – ρ = V (x* )/V (x̃): approximation ratio • ρ >= 1 • The lower the better – the guaranteed value of α (or ρ) is the bound
  • 86. Types of Guarantees (accuracy vs. generality) • Instance-specific (highest accuracy): Bounded Max-Sum, DaCSA • Instance-generic: K-optimality, T-optimality, Region Opt. • No guarantees (highest generality, least use of instance-specific knowledge): MGM-1, DSA-1, Max-Sum
  • 87. Centralized Local Greedy approaches • Greedy local search – Start from random solution – Do local changes if global solution improves – Local: change the value of a subset of variables, usually one
  • 88. Centralized Local Greedy approaches • Problems – Local minima – Standard solutions: RandomWalk, Simulated Annealing
  • 89. Distributed Local Greedy approaches • Local knowledge • Parallel execution: – A greedy local move might be harmful/useless – Need coordination
  • 90. Distributed Stochastic Algorithm • Greedy local search with an activation probability to mitigate issues with parallel executions • DSA-1: change value of one variable at a time • Initialize agents with a random assignment and communicate values to neighbors • Each agent: – Generates a random number and executes only if rnd is less than the activation probability – When executing, changes value maximizing local gain – Communicates possible variable change to neighbors
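A single DSA-1 step from one agent's point of view could look as follows (a synchronous sketch; the gain function and the rng stub are illustrative, not part of the tutorial):

```python
import random

def dsa_step(my_value, domain, neighbour_values, local_gain, p=0.25, rng=random):
    """With probability p, move to the value maximising the local gain;
    otherwise keep the current value for this round."""
    if rng.random() >= p:
        return my_value
    return max(domain, key=lambda v: local_gain(v, neighbour_values))

# Example gain for graph colouring: minimise conflicts with neighbours.
gain = lambda v, nbrs: -sum(v == n for n in nbrs)

class Always:               # stub rng that always activates the agent
    def random(self):
        return 0.0

print(dsa_step('red', ['red', 'blue'], ['red', 'red'], gain, rng=Always()))  # blue
```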
  • 91. DSA-1: Execution Example Each of the four agents draws a random number rnd and compares it with the activation probability P = 1/4; only the agents that activate change their value.
  • 92. DSA-1: discussion • Extremely “cheap” (computation/communication) • Good performance in various domains – e.g. target tracking [Fitzpatrick Meertens 03, Zhang et al. 03] – Shows an anytime property (not guaranteed) – Benchmarking technique for coordination • Problems – Activation probability must be tuned [Zhang et al. 03] – No general rule, hard to characterise results across domains
  • 93. Maximum Gain Message (MGM-1) • Coordinate to decide who is going to move – Compute and exchange possible gains – Agent with maximum (positive) gain executes • Analysis [Maheswaran et al. 04] – Empirically, similar to DSA – More communication (but still linear) – No Threshold to set – Guaranteed to be monotonic (Anytime behavior)
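The gain exchange and the move rule can be sketched as follows (illustrative utilities; tie-breaking and neighbour synchronisation are omitted):

```python
def best_gain(current, domain, utility):
    """Best improvement obtainable by changing only this agent's value."""
    base = utility(current)
    gains = {v: utility(v) - base for v in domain}
    best = max(gains, key=gains.get)
    return best, gains[best]

def may_move(my_gain, neighbour_gains):
    """MGM-1 rule: move only with a positive gain that is maximal
    among the gains reported by the neighbours."""
    return my_gain > 0 and my_gain >= max(neighbour_gains, default=0)

utility = {'red': -2, 'blue': 0}.get      # hypothetical local utilities
v, g = best_gain('red', ['red', 'blue'], utility)
print(v, g, may_move(g, [1, 0]))   # blue 2 True
```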
  • 94. MGM-1: Example Agents compute and exchange their gains (G = -2, G = 0, G = 0, G = 2); only the agent with the maximum positive gain (G = 2) changes its value.
  • 95. Local greedy approaches • Exchange local values for variables – Similar to search based methods (e.g. ADOPT) • Consider only local information when maximizing – Values of neighbors • Anytime behaviors • Could result in very bad solutions
  • 96. Max-sum Agents iteratively compute local functions that depend only on the variable they control, exchanging messages along the shared constraints; a message to a neighbour aggregates all incoming messages except the one received from that neighbour, and each agent finally chooses the arg max of the sum of all its incoming messages.
  • 97. Factor Graph and GDL • Factor Graph – [Kschischang, Frey, Loeliger 01] – Computational framework to represent factored computation – Bipartite graph, Variable - Factor • Example: H(x1 , x2 , x3 ) = H(x1 ) + H(x2 | x1 ) + H(x3 | x1 ), with variable nodes x1 , x2 , x3 connected to the factors H(x1 ), H(x2 | x1 ) and H(x3 | x1 )
• 98. Max-Sum on acyclic graphs • Max-sum is optimal on acyclic graphs – Different branches are independent – Each agent can build a correct estimate of its contribution to the global problem (z functions) • Message equations very similar to Util messages in DPOP – GDL generalizes DPOP [Vinyals et al. 2010a] (diagram: each message sums up information from other nodes, followed by a local maximization step)
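The two message equations (variable-to-factor: sum of incoming factor messages excluding the recipient; factor-to-variable: maximize the factor plus the other variables' messages) can be written down directly. A brute-force sketch of synchronous max-sum; the representation (`factors` mapping a name to a `(scope, fn)` pair) is an assumption for illustration, not the API of any of the cited libraries:

```python
from itertools import product

def max_sum(domains, factors, iterations=5):
    """Synchronous max-sum on a factor graph (illustrative sketch).

    domains: dict var -> list of values; factors: dict name -> (scope, fn),
    where fn maps a tuple of values (ordered as scope) to a real reward.
    On acyclic graphs the z functions are exact once messages have
    propagated across the tree (no normalization needed for finitely
    many iterations).
    """
    q = {(v, f): {x: 0.0 for x in domains[v]}          # variable -> factor
         for f, (scope, _) in factors.items() for v in scope}
    r = {(f, v): {x: 0.0 for x in domains[v]}          # factor -> variable
         for f, (scope, _) in factors.items() for v in scope}
    var_factors = {v: [f for f, (s, _) in factors.items() if v in s]
                   for v in domains}

    for _ in range(iterations):
        # variable -> factor: sum incoming factor messages, excluding f
        q = {(v, f): {x: sum(r[(g, v)][x] for g in var_factors[v] if g != f)
                      for x in domains[v]}
             for (v, f) in q}
        # factor -> variable: maximize factor + other variables' messages
        new_r = {}
        for f, (scope, fn) in factors.items():
            for v in scope:
                msg = {}
                for x in domains[v]:
                    best = float('-inf')
                    for combo in product(*(domains[u] for u in scope)):
                        if combo[scope.index(v)] != x:
                            continue
                        val = fn(combo) + sum(q[(u, f)][combo[scope.index(u)]]
                                              for u in scope if u != v)
                        best = max(best, val)
                    msg[x] = best
                new_r[(f, v)] = msg
        r = new_r

    # z_v: each variable picks the arg max of the sum of incoming messages
    return {v: max(domains[v],
                   key=lambda x: sum(r[(f, v)][x] for f in var_factors[v]))
            for v in domains}
```

On the chain x1—F1—x2—F2—x3 below, every agent recovers the globally optimal assignment from purely local messages, as the acyclic-case discussion above states.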
• 99. (Loopy) Max-sum Performance • Good performance on loopy networks [Farinelli et al. 08] – When it converges, solution quality is very good • Interesting results when there is only one cycle [Weiss 00] – We could remove cycles, but at an exponential price (see DPOP) • Java library for max-sum: http://code.google.com/p/jmaxsum/
• 100. Max-Sum for low power devices • Low overhead – Small number/size of messages • Asynchronous computation – Agents make decisions whenever new messages arrive • Robust to message loss
• 102. Max-Sum for UAVs • Task assignment for UAVs [Delle Fave et al 12] (diagram: UAVs streaming video of interest points, coordinated via Max-Sum)
• 103. Task utility • Depends on task completion, priority and urgency • Time window: from when the first assigned UAV reaches the task to when the last assigned UAV leaves it (considering battery life)
• 104. Factor Graph Representation (diagram: factor graph with UAV variables X1, X2, task utility factors U1, U2, U3 for tasks T1, T2, T3, and PDA1–PDA3)
• 106. Quality guarantees for approximate techniques • Key area of research • Addresses the trade-off between guarantees and computational effort • Particularly important for many real-world applications – Critical domains (e.g. search and rescue) – Resource-constrained settings (e.g. embedded devices) – Dynamic settings
• 107. Instance-generic guarantees • Characterise solution quality without running the algorithm (chart, accuracy vs. generality: instance-specific — Bounded Max-Sum, DaCSA; instance-generic — K-optimality, T-optimality, Region Opt.; no guarantees — MGM-1, DSA-1, Max-Sum)
• 108. K-Optimality framework • Given a characterization of the solution, gives a bound on its quality [Pearce and Tambe 07] • Characterization of the solution: k-optimality • K-optimal solution: – the value of the objective function cannot be improved by changing the assignment of k or fewer variables
• 109. K-Optimal solutions (example: on a small constraint graph with rewards, an assignment that is 2-optimal — no change of 2 or fewer variables improves it — but not 3-optimal)
• 110. Bounds for K-Optimality • For any DCOP with non-negative rewards, n agents and constraints of maximum arity m, a k-optimal solution x satisfies [Pearce and Tambe 07]: R(x) ≥ [C(n−m, k−m) / (C(n, k) − C(n−m, k))] · R(x*) • Binary networks (m = 2): R(x) ≥ [(k−1) / (2n−k−1)] · R(x*)
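The bound is easy to evaluate numerically. A small sketch (the general formula is my reconstruction of the [Pearce and Tambe 07] result from the binary case; `k_optimal_bound` is an illustrative name):

```python
from math import comb

def k_optimal_bound(n, k, m=2):
    """Guaranteed fraction of the optimal reward achieved by any
    k-optimal solution of a DCOP with n agents, constraints of maximum
    arity m, and non-negative rewards [Pearce and Tambe 07].
    For binary networks (m = 2) this reduces to (k-1) / (2n - k - 1).
    """
    return comb(n - m, k - m) / (comb(n, k) - comb(n - m, k))
```

For example, with n = 10 binary-constrained agents a 2-optimal solution (DSA-2/MGM-2) is only guaranteed 1/17 of the optimum, while k = n recovers the full optimum, which matches the "loose bounds for large-scale systems" remark later in the summary.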
• 111. K-Optimality Discussion • Need algorithms for computing k-optimal solutions – DSA-1, MGM-1: k=1; DSA-2, MGM-2: k=2 [Maheswaran et al. 04] – DALO for generic k (and t-optimality) [Kiekintveld et al. 10] • The higher k, the more complex the computation (exponential) • Percentage of optimal: the higher k, the better; the higher the number of agents, the worse
• 112. Trade-off between generality and solution quality • K-optimality is based on worst-case analysis • Assuming more knowledge gives much better bounds – Knowledge of the graph structure [Pearce and Tambe 07]
• 113. Trade-off between generality and solution quality • Knowledge of rewards [Bowring et al. 08] – Beta: ratio of the least minimum reward to the maximum reward
• 114. Off-Line Guarantees: Region Optimality • k-optimality: uses size as the criterion for optimality • t-optimality: uses distance to a central agent in the constraint graph • Region optimality: defines regions based on general criteria (e.g. s-size bounded distance) [Vinyals et al 11] • Ack: Meritxell Vinyals (diagram: 3-size regions, 1-distance regions and C regions over agents x0–x3)
• 115. Size-Bounded Distance • Region optimality can explore new regions: s-size bounded distance – One region per agent: the largest t-distance group whose size is at most s • S-size-bounded distance – C-DALO: extension of DALO for general regions – Can provide better bounds and keep the size and number of regions under control (diagram: 3-size bounded distance regions over x0–x3 for t = 0 and t = 1)
• 116. Max-Sum and Region Optimality • Region optimality can be used to provide bounds for Max-Sum [Vinyals et al 10b] • Upon convergence, Max-Sum is optimal on the SLT regions of the graph [Weiss 00] • Single Loops and Trees (SLT): all groups of agents whose vertex-induced subgraph contains at most one cycle (diagram: SLT regions over agents x0–x3)
• 117. Bounds for Max-Sum • Complete graphs: same as 3-size optimality • Bipartite graphs • 2D grids (table of per-topology bounds omitted)
• 118. Variable Disjoint Cycles • Very high quality guarantees if the smallest cycle is large
• 119. Instance-specific guarantees • Characterise solution quality after/while running the algorithm (chart, accuracy vs. generality: instance-specific — Bounded Max-Sum, DaCSA; instance-generic — K-optimality, T-optimality, Region Opt.; no guarantees — MGM-1, DSA-1, Max-Sum)
• 120. Bounded Max-Sum • Aim: remove cycles from the factor graph while avoiding exponential computation/communication (e.g. no junction tree) • Key idea: solve a relaxed problem instance [Rogers et al. 11] (diagram: build a spanning tree of the factor graph, run Max-Sum to obtain the optimal solution on the tree, and compute the bound)
• 121. Factor Graph Annotation • Compute a weight for each edge – the maximum possible impact of the variable on the function (diagram: factor graph with edge weights w11, w12, w21, w22, w23, w32, w33)
• 122. Factor Graph Modification • Build a maximum spanning tree – keep the edges with the highest weights • Cut the remaining dependencies – modify the functions • Compute the bound from the weights of the cut edges (in the example, W = w22 + w23)
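The annotation and spanning-tree steps can be sketched concretely. Same assumed representation as before (`factors`: name → `(scope, fn)`); the function-modification step and the tree solve are omitted, so this sketch only computes which edges are cut and the resulting bound W (the sum of the cut weights), under my reading of the weight definition in [Rogers et al. 11] as the maximum impact of the variable on the factor:

```python
from itertools import product

def edge_weight(scope, fn, domains, v):
    """Maximum impact of variable v on factor fn: the largest spread
    (max - min over v's values) over all settings of the other variables."""
    others = [u for u in scope if u != v]
    worst = 0.0
    for combo in product(*(domains[u] for u in others)):
        fixed = dict(zip(others, combo))
        vals = [fn(tuple(fixed[u] if u != v else x for u in scope))
                for x in domains[v]]
        worst = max(worst, max(vals) - min(vals))
    return worst

def bounded_max_sum_relaxation(domains, factors):
    """Annotate edges, keep a maximum spanning tree (Kruskal with a
    simple union-find), and return (kept edges, cut edges, bound W)."""
    edges = sorted(((edge_weight(s, fn, domains, v), f, v)
                    for f, (s, fn) in factors.items() for v in s),
                   reverse=True)
    parent = {n: n for n in list(domains) + list(factors)}
    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n
    kept, cut = [], []
    for w, f, v in edges:
        rf, rv = find(f), find(v)
        if rf != rv:              # heaviest edges are kept first
            parent[rf] = rv
            kept.append((f, v))
        else:                     # edge closes a cycle: cut it
            cut.append((f, v, w))
    W = sum(w for _, _, w in cut)
    return kept, cut, W
```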
• 123. Results: Random Binary Network • The bound is significant – the approximation ratio is typically 1.23 (81%) (plot: optimal, approximate, lower-bound and upper-bound values) • Comparison with k-optimality with knowledge of the reward structure: much more accurate, but less general
• 124. Discussion • Comparison with other data-dependent techniques – BnB-ADOPT [Yeoh et al 09] • Fix an error bound and execute until it is met • Worst-case computation remains exponential – ADPOP [Petcu and Faltings 05b] • Can fix the message size (and thus computation) or the error bound, leaving the other parameter free • Divide and coordinate [Vinyals et al 10] – Divides the problem among agents, which negotiate an agreement by exchanging utilities – Provides anytime quality guarantees
• 125. Summary • Approximation techniques are crucial for practical applications: surveillance, rescue, etc. • DSA, MGM, Max-Sum: heuristic approaches – Low coordination overhead, acceptable performance – No guarantees (convergence, solution quality) • Instance-generic guarantees: – K-optimality framework – Loose bounds for large-scale systems • Instance-specific guarantees: – Bounded Max-Sum, ADPOP, BnB-ADOPT – Performance depends on the specific instance
• 126. References I DCOPs for MRS • [Delle Fave et al 12] A methodology for deploying the max-sum algorithm and a case study on unmanned aerial vehicles. IAAI 2012 • [Taylor et al. 11] Distributed On-line Multi-Agent Optimization Under Uncertainty: Balancing Exploration and Exploitation. Advances in Complex Systems MGM • [Maheswaran et al. 04] Distributed Algorithms for DCOP: A Graphical Game-Based Approach. PDCS 2004 DSA • [Fitzpatrick and Meertens 03] Distributed Coordination through Anarchic Optimization. Distributed Sensor Networks: A Multiagent Perspective • [Zhang et al. 03] A Comparative Study of Distributed Constraint Algorithms. Distributed Sensor Networks: A Multiagent Perspective Max-Sum • [Stranders et al. 09] Decentralised Coordination of Mobile Sensors Using the Max-Sum Algorithm. AAAI 09 • [Rogers et al. 10] Self-organising Sensors for Wide Area Surveillance Using the Max-sum Algorithm. LNCS 6090 Self-Organizing Architectures • [Farinelli et al. 08] Decentralised coordination of low-power embedded devices using the max-sum algorithm. AAMAS 08
  • 127. References II Instance-based Approximation • [Yeoh et al. 09] Trading off solution quality for faster computation in DCOP search algorithms, IJCAI 09 • [Petcu and Faltings 05b] A-DPOP: Approximations in Distributed Optimization, CP 2005 • [Rogers et al. 11] Bounded approximate decentralised coordination via the max-sum algorithm, Artificial Intelligence 2011. Instance-generic Approximation • [Vinyals et al 10b] Worst-case bounds on the quality of max-product fixed-points, NIPS 10 • [Vinyals et al 11] Quality guarantees for region optimal algorithms, AAMAS 11 • [Pearce and Tambe 07] Quality Guarantees on k-Optimal Solutions for Distributed Constraint Optimization Problems, IJCAI 07 • [Bowring et al. 08] On K-Optimal Distributed Constraint Optimization Algorithms: New Bounds and Algorithms, AAMAS 08 • [Weiss 00] Correctness of local probability propagation in graphical models with loops, Neural Computation • [Kiekintveld et al. 10] Asynchronous Algorithms for Approximate Distributed Constraint Optimization with Quality Bounds, AAMAS 10