Distributed Constraint Handling
        and Optimization

Alessandro Farinelli1      Alex Rogers2            Meritxell Vinyals1

                  1 Computer Science Department

                     University of Verona, Italy
             2 Agents, Interaction and Complexity Group

            School of Electronics and Computer Science
                   University of Southampton, UK


             Tutorial EASSS 2012, Valencia
      https://sites.google.com/site/easss2012optimization/
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Constraints



• Pervade our everyday lives
• Are usually perceived as elements that limit solutions to the
  problems we face
Constraints




From a computational point of view, they:
  • Reduce the space of possible solutions
  • Encode knowledge about the problem at hand
  • Are key components for efficiently solving hard problems
Constraint Processing



Many different disciplines deal with hard computational problems that
 can be made tractable by carefully considering the constraints that
                 define the structure of the problem.




  Planning     Operational     Automated Reasoning       Computer
 Scheduling     Research         Decision Theory          Vision
Constraint Processing in Multi-Agent Systems

  Focus on how constraint processing can be used to address
  optimization problems in Multi-Agent Systems (MAS) where:

A set of agents must come to some agreement, typically via some
form of negotiation, about which action each agent should take in
   order to jointly obtain the best solution for the whole system.


[Figure: agents A1 and A2 negotiate over task M1, while A2 and A3 negotiate over task M2]
Distributed Constraint Optimization Problems (DCOPs)

We will consider Distributed Constraint Optimization Problems (DCOP)
where:


    Each agent negotiates locally with just a subset of other agents
  (usually called neighbors) that are those that can directly influence
                           his/her behavior.

[Figure: A2 negotiates locally only with its neighbors A1 and A3]
Distributed Constraint Optimization Problems (DCOPs)



After reading this chapter, you will understand:
  • The mathematical formulation of a DCOP
  • The main exact solution techniques for DCOPs
        • Key differences, benefits and limitations
  • The main approximate solution techniques for DCOPs
        • Key differences, benefits and limitations
  • The quality guarantees these approaches provide:
        • Types of quality guarantees
        • Frameworks and techniques
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Constraint Networks



A constraint network N is formally defined as a tuple ⟨X, D, C⟩ where:


  • X = {x1 , . . . , xn } is a set of discrete variables;
  • D = {D1 , . . . , Dn } is a set of variable domains, which enumerate
    all possible values of the corresponding variables; and
  • C = {C1 , . . . ,Cm } is a set of constraints; where a constraint Ci is
    defined on a subset of variables Si ⊆ X which comprise the
    scope of the constraint
        • r = |Si | is the arity of the constraint
       • Two types: hard or soft
Hard constraints




• A hard constraint Cih is a relation Ri that enumerates all the valid
  joint assignments of all variables in the scope of the constraint.

                        Ri ⊆ Di1 × . . . × Dir

                            Ri    xj   xk
                                  0     1
                                  1     0
Soft constraints



• A soft constraint Cis is a function Fi that maps every possible joint
  assignment of all variables in the scope to a real value.
                        Fi : Di1 × . . . × Dir → ℜ

                            Fi   xj   xk
                            2    0    0
                            0    0    1
                            0    1    0
                            1    1    1
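As a minimal illustration (not part of the tutorial), the hard and soft constraints tabulated above can be written directly in Python; the names `R`, `F` and `satisfies` are ours:

```python
# Hard constraint R_i over (x_j, x_k): the relation enumerating the
# valid joint assignments from the table above.
R = {(0, 1), (1, 0)}

def satisfies(assignment):
    """True if the joint assignment (x_j, x_k) is allowed by R."""
    return assignment in R

# Soft constraint F_i over (x_j, x_k): maps every joint assignment
# to a real value, as in the table above.
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

print(satisfies((0, 1)))  # True
print(F[(0, 0)])          # 2
```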
Binary Constraint Networks

• Binary constraint networks are those where:
    • Each constraint (soft or hard) is defined over two variables.
• Every constraint network can be mapped to a binary constraint
  network
    • requires the addition of variables and constraints
    • may add complexity to the model
• They can be represented by a constraint graph

[Figure: constraint graph over x1, x2, x3, x4 with binary constraints F1,2, F1,3, F1,4 and F2,4]
Different objectives, different problems



• Constraint Satisfaction Problem (CSP)

    • Objective: find an assignment for all the variables in the network
      that satisfies all constraints.


• Constraint Optimization Problem (COP)

    • Objective: find an assignment for all the variables in the network
      that satisfies all constraints and optimizes a global function.

    • Global function = aggregation (typically sum) of local functions.
      F(x) = ∑i Fi (xi )
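The global function F(x) = ∑i Fi(xi) can be evaluated with a short sketch (the helper name `global_value` is ours; the table and the four binary scopes reuse the example used later in these slides):

```python
# The identical soft-constraint table used in the examples of this tutorial.
F_table = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# Constraints as (scope, local function) pairs.
constraints = [
    (("x1", "x2"), F_table), (("x1", "x3"), F_table),
    (("x1", "x4"), F_table), (("x2", "x4"), F_table),
]

def global_value(assignment, constraints):
    """F(x): aggregate (sum) each local function applied to its scope."""
    return sum(f[(assignment[a], assignment[b])] for (a, b), f in constraints)

print(global_value({"x1": 1, "x2": 0, "x3": 0, "x4": 1}, constraints))  # 1
```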
Distributed Constraint Reasoning




                                           A1
When operating in a
decentralized context:
  • a set of agents control
    variables
                                      A2
  • agents interact to find a                    A4
    solution to the constraint
    network

                                 A3
Distributed Constraint Reasoning




Two types of decentralized problems:
  • distributed CSP (DCSP)
  • distributed COP (DCOP)



                     Here, we focus on DCOPs.
Distributed Constraint Optimization Problem (DCOP)




A DCOP consists of a constraint network N = ⟨X, D, C⟩ and a set of
agents A = {A1 , . . . , Ak } where each agent:
  • controls a subset of the variables Xi ⊆ X
  • is only aware of constraints that involve variables it controls
  • communicates only with its neighbours
Distributed Constraint Optimization Problem (DCOP)




 • Agents are assumed to be fully cooperative
     • Goal: find the assignment that optimizes the global function, not
        their local utilities.
 • Solving a COP is NP-Hard and DCOP is as hard as COP.
Motivation




Why distribute?
  • Privacy
  • Robustness
  • Scalability
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Real World Applications




Many standard benchmark problems in computer science can be
modeled using the DCOP framework:
  • graph coloring
As can many real world applications:
  • human-agent organizations (e.g. meeting scheduling)
  • sensor networks and robotics (e.g. target tracking)
Outline

Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems
  Graph coloring
  Meeting Scheduling
  Target Tracking

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Graph coloring




• Popular benchmark
• Simple formulation
• Complexity controlled with few parameters:
    • Number of available colors
    • Number of nodes
    • Density (#constraints/#nodes)
• Many versions of the problem:
    • CSP, MaxCSP, COP
Graph coloring - CSP


• Nodes can take k colors
• Any two adjacent nodes should have different colors
    • If two adjacent nodes take the same color, this is a conflict

[Figure: a conflict-free coloring (“Yes!”) and a coloring with a conflict (“No!”)]
Graph coloring - Max-CSP



• Minimize the number of conflicts

[Figure: a coloring with cost 0 (no conflicts) and one with cost -1 (one conflict)]
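The conflict count that Max-CSP minimizes can be sketched in a few lines (a toy instance of ours, not from the slides):

```python
def conflicts(coloring, edges):
    """Number of edges whose endpoints share a color (the Max-CSP cost)."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])

# A triangle with only 2 colors available: at least one conflict is inevitable.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(conflicts({"a": 0, "b": 1, "c": 0}, edges))  # 1, the optimum here
```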
Graph coloring - COP



• Different weights for violated constraints
• Preferences for different colors

[Figure: a weighted coloring instance with constraint weights -1, -2, -3 and color preferences; two candidate solutions with costs 0 and -2]
Graph coloring - DCOP


 • Each node is controlled by one agent
 • Each agent:
      • has preferences for different colors
      • communicates with its direct neighbours in the graph

[Figure: agents A1 . . . A4 on a coloring graph with weights -1, -3, -2, -1]
  • A1 and A2 exchange preferences and conflicts
  • A3 and A4 do not communicate
Outline

Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems
  Graph coloring
  Meeting Scheduling
  Target Tracking

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Meeting Scheduling




Motivation:
  • Privacy
  • Robustness
  • Scalability
Meeting Scheduling


In large organizations many people, possibly working in different
    departments, are involved in a number of work meetings.
Meeting Scheduling


People might have various private preferences on meeting start times




      Better after 12:00
Meeting Scheduling

Two meetings that share a participant cannot overlap


                               Window: 15:00-18:00
                                   Duration: 2h




                                Window: 15:00-17:00
                                    Duration: 1h
DCOP formalization for the meeting scheduling problem



   • A set of agents representing participants
   • A set of variables representing meeting start times, one per
     participant and meeting.
   • Hard Constraints:
        • The start times of the same meeting, as held by different
          agents, are equal
        • Meetings for the same agent are non-overlapping.
   • Soft Constraints:
        • Represent agent preferences on meeting starting times.


 Objective: find a valid schedule for the meetings while maximizing the
 sum of individuals’ preferences.
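The two hard constraints above can be sketched as predicates (a hypothetical encoding of ours; times are hours, not the tutorial's actual representation):

```python
def equality_ok(start_a, start_b):
    """Copies of the same meeting's start time held by different agents agree."""
    return start_a == start_b

def no_overlap(start1, dur1, start2, dur2):
    """Two meetings of the same agent must not overlap in time."""
    return start1 + dur1 <= start2 or start2 + dur2 <= start1

print(no_overlap(15, 2, 17, 1))  # True: 15:00-17:00 then 17:00-18:00
print(no_overlap(15, 2, 16, 1))  # False: the meetings overlap
```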
Outline

Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems
  Graph coloring
  Meeting Scheduling
  Target Tracking

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Target Tracking




Motivation:
  • Privacy
  • Robustness
  • Scalability
Target Tracking

A set of sensors tracking a set of targets in order to provide an
             accurate estimate of their positions.

[Figure: a field of sensors tracking targets T1 . . . T4]

     Crucial for surveillance and monitoring applications.
Target Tracking


Sensors can have different sensing modalities that impact on the
      accuracy of the estimation of the targets’ positions.

[Figure: each sensor chooses among several MODES while tracking targets T1 . . . T4]
Target Tracking


Collaboration among sensors is crucial to improve system
                     performance.

[Figure: sensors with overlapping ranges jointly covering targets T1 . . . T4]
DCOP formalization for the target tracking problem



• Agents represent sensors
• Variables encode the different sensing modalities of each sensor
• Constraints
     • relate to a specific target
      • represent how sensor modalities impact on the tracking
        performance
• Objective:
     • Maximize coverage of the environment
     • Provide accurate estimations of potentially dangerous targets
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Complete Algorithms




+ Always find an optimal solution

− Exhibit an exponentially increasing coordination overhead
− Very limited scalability on general problems.
Complete Algorithms



  • Completely decentralised
       • Search-based.
           • Synchronous: SyncBB, AND/OR search
           • Asynchronous: ADOPT, NCBB and AFB.
       • Dynamic programming.


  • Partially decentralised
       • OptAPO

Next, we focus on completely decentralised algorithms
Decentralised Complete Algorithms




Search-based                     Dynamic programming
  • Uses distributed search        • Uses distributed inference
  • Exchange individual values     • Exchange constraints
  • Small messages but             • Few messages but
    . . . exponentially many         . . . exponentially large
Representative: ADOPT [Modi      Representative: DPOP [Petcu
et al., 2005]                    and Faltings, 2005]
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs
  Search Based: ADOPT
  Dynamic Programming DPOP

Approximated Algorithms for DCOPs

Conclusions
ADOPT


ADOPT (Asynchronous Distributed OPTimization) [Modi et al., 2005]:

  • Distributed backtrack search using a best-first strategy

  • Best value based on local information:

       • Lower/upper bound estimates of each possible value of its
         variable

       • Backtrack thresholds used to speed up the search of previously
         explored solutions.

       • Termination conditions that check if the bound interval is less than
         a given valid error bound (0 if optimal)
ADOPT by example


             4 variables (4 agents): x1 , x2 , x3 , x4 with D = {0, 1}

             4 identical cost functions F1,2 , F1,3 , F1,4 , F2,4 :

                              Fi, j   xi   xj
                              2       0    0
                              0       0    1
                              0       1    0
                              1       1    1

[Figure: constraint graph over x1 , x2 , x3 , x4 ]

             Goal: find a variable assignment with minimal cost.
             Solution: x1 = 1, x2 = 0, x3 = 0 and x4 = 1,
             giving total cost 1.
DFS arrangement




• Before executing ADOPT, agents must be arranged in a depth
  first search (DFS) tree.
• DFS trees have been frequently used in optimization because
  they have two interesting properties:
    • Agents in different branches of the tree do not share any
      constraints;
    • Every constraint network admits a DFS tree.
ADOPT by example


[Figure: DFS arrangement — A1 is the root with children A2 and A3 ;
A4 is a child of A2 ; the back-edge F1,4 makes A1 a pseudo-parent of A4 ]
Cost functions




The local cost function for an agent Ai (δ (xi )) is the sum of the values
      of constraints involving only higher neighbors in the DFS.
ADOPT by example


                                            A1     δ (x1 ) = 0




δ (x1 , x2 ) = F1,2 (x1 , x2 )   A2                   A3   δ (x1 , x3 ) = F1,3 (x1 , x3 )




                           A4

  δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 )
Initialization


 Each agent initially chooses a random value for its variable and
initializes the lower and upper bounds to zero and infinity respectively.


           x1 = 0, LB = 0,UB = ∞       A1




    x2 = 0, LB = 0,UB = ∞       A2            A3    x3 = 0, LB = 0,UB = ∞




x4 = 0, LB = 0,UB = ∞      A4
ADOPT by example


  Value messages are sent by an agent to all its neighbors that are
                      lower in the DFS tree


[Figure: A1 sends x1 = 0 down to A2 , A3 and A4 ; A2 sends x2 = 0 down to A4 ]

A1 sends three value messages to A2 , A3 and A4 informing them that its
current value is 0.
ADOPT by example


    Current Context: a partial variable assignment maintained by each
  agent that records the assignment of all higher neighbours in the DFS.


[Figure: c2 : {x1 = 0}, c3 : {x1 = 0}, c4 : {x1 = 0, x2 = 0}]

  • Updated by each VALUE message
  • If the current context is not compatible with some child context,
    the latter is re-initialized (together with the stored child bounds)
ADOPT by example


                   Each agent Ai sends a cost message to its parent Ap

Each cost message [LB, UB, ci ] reports:
  • The minimum lower bound (LB)
  • The maximum upper bound (UB)
  • The context (ci )

[Figure: A2 and A3 send their cost messages to A1 ; A4 sends its cost message to A2 ]
Lower bound computation




Each agent evaluates for each possible value of its variable:
  • its local cost function with respect to the current context
  • adding all the compatible lower bound messages received from
    children.



              Analogous computation for upper bounds
ADOPT by example


           Consider the lower bound in the cost message sent by A4 :

  • Recall that A4 ’s local cost function is:
    δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 )
  • Restricted to the current context c4 = {(x1 = 0, x2 = 0)}:
    λ (0, 0, x4 ) = F1,4 (0, x4 ) + F2,4 (0, x4 ).
  • For x4 = 0: λ (0, 0, 0) = F1,4 (0, 0) + F2,4 (0, 0) = 2 + 2 = 4.
  • For x4 = 1: λ (0, 0, 1) = F1,4 (0, 1) + F2,4 (0, 1) = 0 + 0 = 0.

Then the minimum lower bound across variable values is LB = 0.
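The λ computation above can be reproduced in a few lines of Python (a sketch of ours, reusing the identical cost table F):

```python
# The identical binary cost table F_{i,j} from the example.
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def local_cost(x1, x2, x4):
    """A4's local cost: F14(x1, x4) + F24(x2, x4)."""
    return F[(x1, x4)] + F[(x2, x4)]

# Restrict to the current context c4 = {x1 = 0, x2 = 0}.
context = {"x1": 0, "x2": 0}
costs = {x4: local_cost(context["x1"], context["x2"], x4) for x4 in (0, 1)}
print(costs)                # {0: 4, 1: 0}
print(min(costs.values()))  # LB = 0
```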
ADOPT by example


       Each agent asynchronously chooses the value of its variable that
                        minimizes its lower bound.

A2 computes, for each possible value of its variable, its local function
restricted to the current context c2 = {(x1 = 0)}
(λ (0, x2 ) = F1,2 (0, x2 )), adding the lower bound message from A4 (lb):
  • For x2 = 0: LB(x2 = 0) = λ (0, x2 = 0) + lb(x2 = 0) = 2 + 0 = 2.
  • For x2 = 1: LB(x2 = 1) = λ (0, x2 = 1) + 0 = 0 + 0 = 0.

A2 changes its value to x2 = 1 with LB = 0.
Backtrack thresholds



               The search strategy is based on lower bounds

Problem
  • Values abandoned before proven to be
    suboptimal
  • Lower/upper bounds only stored for the
    current context

Solution
  • Backtrack thresholds: used to speed up
    the search of previously explored
    solutions.
ADOPT by example


     x1 = 0 → 1 → 0

A1 changes its value and the context with x1 = 0 is visited again.
  • Reconstructing from scratch is inefficient
  • Remembering solutions is expensive
Backtrack thresholds




Solution: Backtrack thresholds
  • Lower bound previously determined by children
  • Polynomial space
  • Control backtracking to efficiently search
  • Key point: do not change value until LB(current value) > threshold
A child agent will not change its variable value so long as cost is less
         than the backtrack threshold given to it by its parent.


      LB(x1 = 0) = 1

[Figure: A1 splits its lower bound, sending threshold t(x1 = 0) = 1/2 to
A2 and to A3 ; each child asks LB(x2 = 0) > 1/2 ? (resp. LB(x3 = 0) > 1/2 ?)
before changing its value]
Rebalance incorrect threshold




How to correctly subdivide threshold among children?

  • Parent distributes the accumulated bound among children
       • Arbitrarily/Using some heuristics

  • Correct subdivision as feedback is received from children
      • LB < t(CONTEXT)
      • t(CONTEXT) = ∑Ci tCi (CONTEXT) + δ
Backtrack Threshold Computation


[Figure: (1) A2 reports a new lower bound LB(x1 = 0) = 1 to A1 ;
(2) A1 rebalances and resends thresholds t(x1 = 0) to A2 and A3 ]

  • When A1 receives a new lower bound from A2 it rebalances the thresholds
  • A1 resends threshold messages to A2 and A3
ADOPT extensions




• BnB-ADOPT [Yeoh et al., 2008] reduces computation time by
  using depth-first search with branch and bound strategy

• [Ali et al., 2005] suggest the use of preprocessing techniques for
  guiding ADOPT search and show that this can result in a
  consistent increase in performance.
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs
  Search Based: ADOPT
  Dynamic Programming DPOP

Approximated Algorithms for DCOPs

Conclusions
DPOP




DPOP (Dynamic Programming Optimization Protocol) [Petcu and
Faltings, 2005]:
  • Based on the dynamic programming paradigm.
  • Special case of Bucket Tree Elimination Algorithm (BTE)
    [Dechter, 2003].
DPOP by example


The same four variables x1 , . . . , x4 with D = {0, 1}, now with five
identical functions F1,2 , F1,3 , F1,4 , F2,3 , F2,4 :

                 Fi, j   xi   xj
                 2       0    0
                 0       0    1
                 0       1    0
                 1       1    1

[Figure: DFS arrangement — x1 is the root with child x2 ; x3 and x4 are
children of x2 ; the back-edges F1,3 and F1,4 make x1 a pseudo-parent of
x3 and x4 ]

          Objective: find the assignment with maximal value
DPOP phases




Given a DFS tree structure, DPOP runs in two phases:
  • Util propagation: agents exchange util messages up the tree.
       • Aim: aggregate all info so that root agent can choose optimal
         value
  • Value propagation: agents exchange value messages down the
    tree.
       • Aim: propagate info so that all agents can make their choice given
         choices of ancestors
Sepi : set of agents preceding Ai in the pseudo-tree order that are
           connected with Ai or with a descendant of Ai .


[Figure: on the example pseudo-tree — Sep1 = ∅, Sep2 = {x1 },
Sep3 = {x1 , x2 }, Sep4 = {x1 , x2 }]
Util message



The Util message Ui→ j that agent Ai sends to its parent A j can be
computed as:

    Ui→ j (Sepi ) = max_{xi} [ ( ⊗_{Ak ∈ Ci} Uk→i ) ⊗ ( ⊗_{Ap ∈ Pi ∪ PPi} Fi,p ) ]

  • the message has size exponential in Sepi
  • the first join aggregates all incoming messages from children
  • the second join aggregates the constraints shared with parent and
    pseudoparents

The ⊗ operator is a join operator that sums up functions with different
but overlapping scopes consistently.
Join operator



F1,4 and F2,4 (from the Fi, j table) are joined by adding their values on
the shared variable x4 ; x4 is then projected out by maximization:

F1,4 ⊗ F2,4 (add)              max{x4 } (F1,4 ⊗ F2,4 ) (project out x4 )

 x1  x2  x4   value              x1  x2   value
 0   0   0    4                  0   0    4  = max(4, 0)
 0   0   1    0                  0   1    2  = max(2, 1)
 0   1   0    2                  1   0    2  = max(2, 1)
 0   1   1    1                  1   1    2  = max(0, 2)
 1   0   0    2
 1   0   1    1
 1   1   0    0
 1   1   1    2
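The join and max-projection just illustrated can be checked with a short sketch (dict-based encoding of ours; variable order is (x1, x2, x4)):

```python
# The identical binary cost table F_{i,j} from the example.
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# Join F14(x1, x4) ⊗ F24(x2, x4): sum on the shared variable x4.
joined = {(x1, x2, x4): F[(x1, x4)] + F[(x2, x4)]
          for x1 in (0, 1) for x2 in (0, 1) for x4 in (0, 1)}

# Project x4 out by maximization, leaving a function of (x1, x2).
projected = {(x1, x2): max(joined[(x1, x2, 0)], joined[(x1, x2, 1)])
             for x1 in (0, 1) for x2 in (0, 1)}

print(joined[(0, 0, 0)])  # 4, as in the table
print(projected)          # {(0, 0): 4, (0, 1): 2, (1, 0): 2, (1, 1): 2}
```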
Complexity is exponential in the largest Sepi .
  The largest Sepi equals the induced width of the DFS tree ordering used.


(Figure: Util propagation on the example. A4 sends
U4→2 = max_{x4}(F1,4 ⊗ F2,4) and A3 sends U3→2 = max_{x3}(F1,3 ⊗ F2,3),
both over Sep4 = Sep3 = {x1, x2}:

  U4→2 = U3→2:  (0,0)→4  (0,1)→2  (1,0)→2  (1,1)→2

A2 combines them with its own shared constraint and sends
U2→1 = max_{x2}(U3→2 ⊗ U4→2 ⊗ F1,2) to the root A1:

  U2→1:  x1 = 0 → 10,  x1 = 1 → 5 )
Value message

Keeping the values of its parent/pseudoparents fixed, each agent finds
the value that maximizes the cost function computed in the Util phase:

    xi* = argmax_{xi} [ Σ_{Aj ∈ Ci} Uj→i(xi, xp*) + Σ_{Aj ∈ Pi ∪ PPi} Fi,j(xi, xj*) ]

where xp* = ∪_{Aj ∈ Pi ∪ PPi} {xj*} is the set of optimal values for Ai's
parent and pseudoparents, received from Ai's parent.

It then propagates this value down the tree through its children:

    Vi→j = {xi = xi*} ∪ ∪_{xs ∈ Sepi ∩ Sepj} {xs = xs*}
(Figure: Value propagation on the example.
  A1 (root):  x1* = argmax_{x1} U2→1(x1) = 0 in the example (10 > 5)
  A2:         x2* = argmax_{x2} (U3→2(x1*, x2) ⊗ U4→2(x1*, x2) ⊗ F1,2(x1*, x2))
  A4:         x4* = argmax_{x4} (F1,4(x1*, x4) ⊗ F2,4(x2*, x4))
  A3:         x3* = argmax_{x3} (F1,3(x1*, x3) ⊗ F2,3(x2*, x3))
Each Vi→j message carries the optimal assignments down the tree.)
DPOP extensions




• MB-DPOP [Petcu and Faltings, 2007] trades off message size
  against the number of messages.
• A-DPOP trades off message size against solution quality [Petcu
  and Faltings, 2005(2)].
Conclusions




• Constraint processing
    • exploit problem structure to solve hard problems efficiently
• DCOP framework
    • applies constraint processing to solve decision making problems
      in Multi-Agent Systems
    • increasingly being applied within real world problems.
References I



•   [Modi et al., 2005] P. J. Modi, W. Shen, M. Tambe, and M. Yokoo. ADOPT: Asynchronous
    distributed constraint optimization with quality guarantees. Artificial Intelligence Journal,
    (161):149-180, 2005.
•   [Yeoh et al., 2008] W. Yeoh, A. Felner, and S. Koenig. BnB-ADOPT: An asynchronous
    branch-and-bound DCOP algorithm. In Proceedings of the Seventh International Joint
    Conference on Autonomous Agents and Multiagent Systems, pages 591-598, 2008.
•   [Ali et al., 2005] S. M. Ali, S. Koenig, and M. Tambe. Preprocessing techniques for
    accelerating the DCOP algorithm ADOPT. In Proceedings of the Fourth International Joint
    Conference on Autonomous Agents and Multiagent Systems, pages 1041-1048, 2005.
•   [Petcu and Faltings, 2005] A. Petcu and B. Faltings. DPOP: A scalable method for
    multiagent constraint optimization. In Proceedings of the Nineteenth International Joint
    Conference on Artificial Intelligence, pages 266-271, 2005.
•   [Dechter, 2003] R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
References II



•   [Petcu and Faltings, 2005(2)] A. Petcu and B. Faltings. A-DPOP: Approximations in
    distributed optimization. In Principles and Practice of Constraint Programming, pages
    802-806, 2005.
•   [Petcu and Faltings, 2007] A. Petcu and B. Faltings. MB-DPOP: A new memory-bounded
    algorithm for distributed optimization. In Proceedings of the Twentieth International Joint
    Conference on Artificial Intelligence, pages 1452-1457, 2007.
•   [S. Fitzpatrick and L. Meertens, 2003] S. Fitzpatrick and L. Meertens. Distributed Sensor
    Networks: A multiagent perspective, chapter Distributed coordination through anarchic
    optimization, pages 257-293. Kluwer Academic, 2003.
•   [R. T. Maheswaran et al., 2004] R. T. Maheswaran, J. P. Pearce, and M. Tambe.
    Distributed algorithms for DCOP: A graphical game-based approach. In Proceedings of
    the Seventeenth International Conference on Parallel and Distributed Computing
    Systems, pages 432-439, 2004.
Outline


Introduction

Distributed Constraint Reasoning

Applications and Exemplar Problems

Complete algorithms for DCOPs

Approximated Algorithms for DCOPs

Conclusions
Approximate Algorithms: outline

• No guarantees
   – DSA-1, MGM-1 (exchange individual assignments)
   – Max-Sum (exchange functions)
• Off-Line guarantees
   – K-optimality and extensions
• On-Line Guarantees
   – Bounded max-sum
Why Approximate Algorithms
• Motivations
  – Optimality is often not achievable in practical applications
  – Fast, good-enough solutions are all we can have
• Example – Graph coloring
  – Medium size problem (about 20 nodes, three colors per
    node)
  – Number of states to visit for the optimal solution in the worst
    case: 3^20 ≈ 3.5 billion states
• Key problem
  – Providing guarantees on solution quality
Exemplar Application: Surveillance
• Event Detection
   – Vehicles passing on a road
• Energy Constraints
   – Sense/Sleep modes
   – Recharge when sleeping
• Coordination
   – Activity can be detected by a single sensor
   – Roads have different traffic loads
   (Figure: duty cycles over time — a good schedule keeps the road
   covered at all times, a bad schedule leaves gaps)
• Aim [Rogers et al. 10]
   – Focus on the road with the heavier traffic load
   (Figure: heavy traffic road vs. small road)
Surveillance demo
Guarantees on solution quality
• Key Concept: bound the optimal solution
   – Assume a maximization problem
   – x* the optimal solution, x~ a solution
   – alpha = V(x~) / V(x*), the percentage of optimality
       • in [0,1]
       • The higher the better
   – rho = V(x*) / V(x~), the approximation ratio
       • >= 1
       • The lower the better
   – rho (equivalently alpha) is the bound
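A tiny numeric example of the two quality measures, with assumed toy values V(x*) = 20 and V(x~) = 16 for a maximization problem:

```python
# Toy values (illustrative assumptions): optimum and an approximate solution
V_opt, V_approx = 20.0, 16.0

alpha = V_approx / V_opt   # percentage of optimality, in [0, 1]
rho = V_opt / V_approx     # approximation ratio, >= 1

# alpha = 0.8 (80% of optimal), rho = 1.25; note rho = 1 / alpha
```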
Types of Guarantees
(Diagram: accuracy vs. generality of guarantees)
• Instance-specific (high accuracy, less general — use instance-specific
  knowledge): Bounded Max-Sum, DaCSA
• Instance-generic (more general, looser bounds): K-optimality,
  T-optimality, Region Opt.
• No guarantees: DSA-1, MGM-1, Max-Sum
Centralized Local Greedy approaches
• Greedy local search
  – Start from a random solution
  – Make local changes if the global solution improves
  – Local: change the value of a subset of variables, usually one
(Figure: a graph-coloring instance where changing one node's color at a
time improves the global payoff step by step, e.g. from -4 towards 0)
Centralized Local Greedy approaches
• Problems
  – Local minima
  – Standard solutions: RandomWalk, Simulated Annealing
(Figure: an instance stuck in a local minimum, where no single-variable
change improves the solution)
Distributed Local Greedy approaches
• Local knowledge
• Parallel execution:
   – A greedy local move might be harmful/useless
   – Need coordination
(Figure: two neighboring agents moving simultaneously each look locally
beneficial, yet the global payoff stays the same or gets worse)
Distributed Stochastic Algorithm
• Greedy local search with activation probability to
  mitigate issues with parallel executions
• DSA-1: change value of one variable at time
• Initialize agents with a random assignment and
  communicate values to neighbors
• Each agent:
   – Generates a random number and executes only if it is less
     than the activation probability
   – When executing, changes its value to maximize the local gain
   – Communicates the possible variable change to its neighbors
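A minimal sketch of one DSA-1 step, assuming binary domains and a toy anti-coordination gain function (all names and payoffs are illustrative, not the tutorial's code):

```python
import random

def dsa1_step(value, neighbors_values, gain, domain, p):
    """One DSA-1 step for a single agent: with activation probability p,
    move to the value maximizing the local gain against the neighbors'
    last communicated values; otherwise keep the current value."""
    if random.random() >= p:
        return value                   # not activated this round
    return max(domain, key=lambda v: gain(v, neighbors_values))

# Toy anti-coordination constraint: reward 1 per neighbor with a different value
gain = lambda v, nbrs: sum(1 for n in nbrs if n != v)
new_value = dsa1_step(0, [0, 0], gain, [0, 1], p=1.0)   # moves to 1
```

With p = 1.0 the agent is always activated; tuning p down reduces harmful parallel moves, which is exactly the parameter [Zhang et al. 03] show must be tuned per domain.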
DSA-1: Execution Example

rnd > ¼ ?   rnd > ¼ ?        rnd > ¼ ?   rnd > ¼ ?
            -1      -1            -1
                                              P = 1/4
                        -1


0   -2
DSA-1: discussion
• Extremely “cheap” (computation/communication)
• Good performance in various domains
  – e.g. target tracking [Fitzpatrick Meertens 03, Zhang et al. 03],
  – Shows an anytime property (not guaranteed)
  – Benchmarking technique for coordination
• Problems
  – Activation probability must be tuned [Zhang et al. 03]
  – No general rule, hard to characterise results across domains
Maximum Gain Message (MGM-1)
• Coordinate to decide who is going to move
   – Compute and exchange possible gains
   – Agent with maximum (positive) gain executes
• Analysis [Maheswaran et al. 04]
   –   Empirically, similar to DSA
   –   More communication (but still linear)
   –   No Threshold to set
   –   Guaranteed to be monotonic (Anytime behavior)
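A minimal sketch of one MGM-1 round, assuming a synchronous model where gains have already been exchanged; names and the asymmetric toy payoffs (used to avoid a tie) are illustrative assumptions:

```python
def mgm1_round(values, domains, local_gain, neighbors):
    """One MGM-1 round: each agent computes its best local gain; only
    agents whose gain is positive and strictly maximal in their
    neighborhood move (ties are broken by agent id in practice)."""
    gains, moves = {}, {}
    for a in values:
        best_v = max(domains[a], key=lambda v: local_gain(a, v, values))
        gains[a] = local_gain(a, best_v, values) - local_gain(a, values[a], values)
        moves[a] = best_v
    new_values = dict(values)
    for a in values:
        if gains[a] > 0 and all(gains[a] > gains[b] for b in neighbors[a]):
            new_values[a] = moves[a]
    return new_values

# Two agents sharing one anti-coordination constraint (toy example)
neighbors = {'a1': ['a2'], 'a2': ['a1']}
domains = {'a1': [0, 1], 'a2': [0, 1]}
def payoff(a, v, vals):
    other = vals['a2'] if a == 'a1' else vals['a1']
    base = 2 if a == 'a1' else 1     # asymmetric toy rewards to break the tie
    return base if v != other else 0

result = mgm1_round({'a1': 0, 'a2': 0}, domains, payoff, neighbors)
# only a1 (gain 2 > a2's gain 1) changes its value
```

Because only the neighborhood's maximum-gain agent moves, the global value never decreases, which is the monotonicity (anytime) property noted above.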
MGM-1: Example
(Figure: agents exchange their possible gains, e.g. G = -2, G = 0,
G = 0, G = 2; only the agent with the maximum positive gain, G = 2,
changes its value)
Local greedy approaches
• Exchange local values for variables
   – Similar to search based methods (e.g. ADOPT)
• Consider only local information when maximizing
   – Values of neighbors
• Anytime behavior
• Could result in very bad solutions
Max-sum
Agents iteratively compute local functions zi(xi) that depend only on
the variable they control, and choose argmax_{xi} zi(xi).

(Figure: four agents X1..X4 connected by shared constraints. zi(xi) sums
all incoming messages; the message a variable sends towards a shared
constraint combines all incoming messages except the one received from
that constraint.)
Factor Graph and GDL
• Factor Graph
   – [Kschischang, Frey, Loeliger 01]
   – Computational framework to represent factored computation
   – Bipartite graph: variable nodes - factor nodes

    H(X1, X2, X3) = H(X1) + H(X2 | X1) + H(X3 | X1)

(Figure: the constraint graph over x1, x2, x3 and the corresponding
factor graph with factor nodes H(X1), H(X2 | X1), H(X3 | X1).)
Max-Sum on acyclic graphs
• Max-sum is optimal on acyclic graphs
   – Different branches are independent
   – Each agent can build a correct estimation of its contribution to the
     global problem (z functions)
• Message equations very similar to Util messages in DPOP
   – GDL generalizes DPOP [Vinyals et al. 2010a]
   – Each message sums up information from other nodes, followed by a
     local maximization step
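The two message types can be sketched on a tiny factor graph; the tabular encoding and helper names are illustrative assumptions (the tutorial's own implementation is the jMaxSum Java library):

```python
from itertools import product

def var_to_factor(incoming):
    """q message: sum of all factor->variable messages except the target's."""
    return {v: sum(m[v] for m in incoming) for v in [0, 1]}

def factor_to_var(factor, scope, target, q_msgs):
    """r message: maximize factor + incoming q messages over the other variables."""
    r = {v: float('-inf') for v in [0, 1]}
    for assign in product([0, 1], repeat=len(scope)):
        a = dict(zip(scope, assign))
        val = factor[assign] + sum(q_msgs[x][a[x]] for x in scope if x != target)
        r[a[target]] = max(r[a[target]], val)
    return r

# One factor over (x1, x2); x2 has a single zero incoming message
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}
q_x2 = var_to_factor([{0: 0, 1: 0}])
r = factor_to_var(F, ['x1', 'x2'], 'x1', {'x2': q_x2})
# r[0] = max(2, 0) = 2, r[1] = max(0, 1) = 1; x1 would choose 0
```

Note how `factor_to_var` is a per-factor version of DPOP's join-and-project step, which is the sense in which GDL generalizes DPOP.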
(Loopy) Max-sum Performance
• Good performance on loopy networks [Farinelli et al. 08]
   – When it converges, results are very good
      • Interesting results when there is only one cycle [Weiss 00]
   – We could remove cycles but pay an exponential price (see
     DPOP)
   – Java library for max-sum: http://code.google.com/p/jmaxsum/
Max-Sum for low power devices
• Low overhead
   – Msgs number/size
• Asynchronous computation
   – Agents take decisions whenever new messages arrive
• Robust to message loss
Max-sum on hardware
Max-Sum for UAVs
Task Assignment for UAVs [Delle Fave et al 12]
(Figure: UAVs stream video of interest points submitted by operators;
task assignment is coordinated with Max-Sum.)

Task utility
• Priority
• Urgency: task completion runs from when the first assigned UAV
  reaches the task until the last assigned UAV leaves it (considering
  battery life)

Factor Graph Representation
(Figure: each UAV i controls a variable Xi with utility factor Ui;
tasks T1, T2, T3 link the UAVs' factors to the PDAs that submit them.)
UAVs Demo
Quality guarantees for approximate techniques
• Key area of research
• Address trade-off between guarantees and
  computational effort
• Particularly important for many real world applications
   – Critical (e.g. Search and rescue)
– Constrained resources (e.g. Embedded devices)
   – Dynamic settings
Instance-generic guarantees
(Diagram: accuracy vs. generality, as before)
• Instance-specific: Bounded Max-Sum, DaCSA
• Instance-generic — characterise solution quality without running the
  algorithm: K-optimality, T-optimality, Region Opt.
• No guarantees: DSA-1, MGM-1, Max-Sum
K-Optimality framework
• Given a characterization of solution gives bound on
  solution quality [Pearce and Tambe 07]
• Characterization of solution: k-optimal
• K-optimal solution:
   – Corresponding value of the objective function can not be
     improved by changing the assignment of k or less
     variables.
K-Optimal solutions
(Figure: a constraint graph with rewards on edges. The shown assignment
is 2-optimal — no change of two or fewer variables improves it — but not
3-optimal, since changing three variables yields a better solution.)
Bounds for K-Optimality
For any DCOP with non-negative rewards [Pearce and Tambe 07], a
k-optimal solution x, with n agents and constraints of maximum arity m,
satisfies:

    V(x) >= [ C(n-m, k-m) / (C(n, k) - C(n-m, k)) ] · V(x*)

Binary network (m = 2):

    V(x) >= [ (k-1) / (2n-k-1) ] · V(x*)
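The slide's bound can be evaluated directly; the general combinatorial form is a reconstruction consistent with the binary case (k-1)/(2n-k-1) stated for [Pearce and Tambe 07], so treat the function below as a sketch of that result:

```python
from math import comb

def k_opt_bound(n, k, m=2):
    """Worst-case fraction of the optimum guaranteed by a k-optimal
    solution of a DCOP with n agents and constraints of arity m
    (reconstruction of the [Pearce and Tambe 07] bound)."""
    return comb(n - m, k - m) / (comb(n, k) - comb(n - m, k))

# Binary network, n = 10 agents, k = 3:
b = k_opt_bound(10, 3)   # (k-1)/(2n-k-1) = 2/16 = 0.125
```

Note how quickly the bound degrades with n, which is the "the higher the number of agents, the worse" observation on the next slide.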
K-Optimality Discussion
• Need algorithms for computing k-optimal solutions
   – DSA-1, MGM-1 k=1; DSA-2, MGM-2 k=2 [Maheswaran et al. 04]
   – DALO for generic k (and t-optimality) [Kiekintveld et al. 10]
• The higher k the more complex the computation
  (exponential)

Percentage of optimality:
• The higher k, the better
• The higher the number of agents, the worse
Trade-off between generality and solution quality
• K-optimality is based on worst-case analysis
• Assuming more knowledge gives much better bounds
• Knowledge on structure [Pearce and Tambe 07]

Trade-off between generality and solution quality
• Knowledge on rewards [Bowring et al. 08]
• Beta: ratio of the least minimum reward to the maximum
Off-Line Guarantees: Region Optimality
• k-optimality: use size as a criterion for optimality
• t-optimality: use distance to a central agent in the
  constraint graph
• Region Optimality: define regions based on general
  criteria (e.g. s-size bounded distance) [Vinyals et al 11]
• Ack: Meritxell Vinyals
(Figure: on a four-variable graph x0..x3, the 3-size regions, the
1-distance regions, and arbitrary C regions.)
Size-Bounded Distance
• Region optimality can explore new regions: s-size bounded distance
• One region per agent: the largest t-distance group whose size is less
  than s
• S-size-bounded distance
   – C-DALO: extension of DALO for general regions
   – Can provide better bounds and keep the size and number of regions
     under control
(Figure: 3-size bounded distance regions on the example graph, with
t = 0 or t = 1 per agent.)
Max-Sum and Region Optimality
• Can use region optimality to provide bounds for Max-Sum
  [Vinyals et al 10b]
• Upon convergence Max-Sum is optimal on SLT regions of
  the graph [Weiss 00]
• Single Loops and Trees (SLT): all groups of agents whose
  vertex-induced subgraph contains at most one cycle
(Figure: the SLT regions of a four-variable loopy graph.)
Bounds for Max-Sum
• Complete graphs: same as 3-size optimality
• Bipartite graphs and 2D grids: topology-specific bounds
• Variable-disjoint cycles: very high quality guarantees if the smallest
  cycle is large
Instance-specific guarantees
(Diagram: accuracy vs. generality, as before)
• Instance-specific — characterise solution quality after/while running
  the algorithm: Bounded Max-Sum, DaCSA
• Instance-generic: K-optimality, T-optimality, Region Opt.
• No guarantees: DSA-1, MGM-1, Max-Sum
Bounded Max-Sum
Aim: remove cycles from the factor graph while avoiding exponential
computation/communication (e.g. no junction tree).
Key idea: solve a relaxed problem instance [Rogers et al. 11]
(Figure: from the cyclic factor graph over X1..X3 and F1..F3, build a
spanning tree, compute the bound, then run Max-Sum to obtain the
optimal solution on the tree.)
Factor Graph Annotation
• Compute a weight for each edge
   – the maximum possible impact of the variable on the function
(Figure: each edge (xi, Fj) of the factor graph is annotated with a
weight wij.)

Factor Graph Modification
• Build a maximum spanning tree
   – Keep the higher weights
• Cut the remaining dependencies
• Modify the functions accordingly
• Compute the bound from the cut edges, e.g. W = w22 + w23
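The edge-weight computation can be sketched for tabular factors; the helper name and binary-domain assumption are illustrative, not the algorithm's reference implementation:

```python
from itertools import product

def edge_weight(factor, scope, var):
    """Maximum possible impact of `var` on `factor`: the largest change
    in the factor value obtainable by changing `var` alone, maximized
    over all assignments to the other variables in the scope."""
    others = [s for s in scope if s != var]
    w = 0.0
    for rest in product([0, 1], repeat=len(others)):
        fixed = dict(zip(others, rest))
        vals = []
        for v in [0, 1]:
            fixed[var] = v
            vals.append(factor[tuple(fixed[s] for s in scope)])
        w = max(w, max(vals) - min(vals))
    return w

F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}
w_x1 = edge_weight(F, ['x1', 'x2'], 'x1')   # max(|2-0|, |0-1|) = 2
```

If edges with total weight W are cut to obtain the spanning tree, the tree-optimal value Vtree bounds the true optimum as V* <= Vtree + W, which is the bound the slide computes.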
Results: Random Binary Network
(Figure: optimal value, approximate value, lower bound and upper bound.)
• The bound is significant
   – The approximation ratio is typically 1.23 (81%)
• Comparison with k-optimality with knowledge of the reward
  structure: much more accurate, less general
Discussion
• Comparison with other data-dependent techniques
   – BnB-ADOPT [Yeoh et al 09]
      • Fix an error bound and execute until the error bound is met
      • Worst case computation remains exponential
   – ADPOP [Petcu and Faltings 05b]
      • Can fix message size (and thus computation) or error bound and
        leave the other parameter free
• Divide and coordinate [Vinyals et al 10]
   – Divide problems among agents and negotiate agreement
     by exchanging utility
   – Provides anytime quality guarantees
Summary
• Approximation techniques crucial for practical applications:
  surveillance, rescue, etc.
• DSA, MGM, Max-Sum heuristic approaches
   – Low coordination overhead, acceptable performance
   – No guarantees (convergence, solution quality)
• Instance generic guarantees:
   – K-optimality framework
   – Loose bounds for large scale systems
• Instance specific guarantees
   – Bounded max-sum, ADPOP, BnB-ADOPT
   – Performance depend on specific instance
References I
DOCPs for MRS
•   [Delle Fave et al 12] A methodology for deploying the max-sum algorithm and a case study on
    unmanned aerial vehicles. In, IAAI 2012
•   [Taylor et al. 11] Distributed On-line Multi-Agent Optimization Under Uncertainty: Balancing
    Exploration and Exploitation, Advances in Complex Systems
MGM
•   [Maheswaran et al. 04] Distributed Algorithms for DCOP: A Graphical Game-Based Approach,
    PDCS-2004
DSA
•   [Fitzpatrick and Meertens 03] Distributed Coordination through Anarchic Optimization,
    Distributed Sensor Networks: a multiagent perspective.
•   [Zhang et al. 03] A Comparative Study of Distributed Constraint algorithms, Distributed
    Sensor Networks: a multiagent perspective.
Max-Sum
•   [Stranders at al 09] Decentralised Coordination of Mobile Sensors Using the Max-Sum
    Algorithm, AAAI 09
•   [Rogers et al. 10] Self-organising Sensors for Wide Area Surveillance Using the Max-sum
    Algorithm, LNCS 6090 Self-Organizing Architectures
•   [Farinelli et al. 08] Decentralised coordination of low-power embedded devices using the
    max-sum algorithm, AAMAS 08
References II
Instance-based Approximation
•   [Yeoh et al. 09] Trading off solution quality for faster computation in DCOP search algorithms,
    IJCAI 09
•   [Petcu and Faltings 05b] A-DPOP: Approximations in Distributed Optimization, CP 2005
•   [Rogers et al. 11] Bounded approximate decentralised coordination via the max-sum
    algorithm, Artificial Intelligence 2011.
Instance-generic Approximation
•   [Vinyals et al 10b] Worst-case bounds on the quality of max-product fixed-points, NIPS 10
•   [Vinyals et al 11] Quality guarantees for region optimal algorithms, AAMAS 11
•   [Pearce and Tambe 07] Quality Guarantees on k-Optimal Solutions for Distributed Constraint
    Optimization Problems, IJCAI 07
•   [Bowring et al. 08] On K-Optimal Distributed Constraint Optimization Algorithms: New
    Bounds and Algorithms, AAMAS 08
•   [Weiss 00] Correctness of local probability propagation in graphical models with loops, Neural
    Computation
•   [Kiekintveld et al. 10] Asynchronous Algorithms for Approximate Distributed Constraint
    Optimization with Quality Bounds, AAMAS 10

How to setup Pycharm environment for Odoo 17.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 

T12 Distributed search and constraint handling

  • 8. Distributed Constraint Optimization Problems (DCOPs) We will consider Distributed Constraint Optimization Problems (DCOPs) where: Each agent negotiates locally with just a subset of other agents (usually called neighbours), namely those that can directly influence its behavior.
  • 9. Distributed Constraint Optimization Problems (DCOPs) After reading this chapter, you will understand: • The mathematical formulation of a DCOP • The main exact solution techniques for DCOPs • Key differences, benefits and limitations • The main approximate solution techniques for DCOPs • Key differences, benefits and limitations • The quality guarantees these approaches provide: • Types of quality guarantees • Frameworks and techniques
  • 10. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 11. Constraint Networks A constraint network N is formally defined as a tuple X, D,C where: • X = {x1 , . . . , xn } is a set of discrete variables; • D = {D1 , . . . , Dn } is a set of variable domains, which enumerate all possible values of the corresponding variables; and • C = {C1 , . . . ,Cm } is a set of constraints; where a constraint Ci is defined on a subset of variables Si ⊆ X which comprise the scope of the constraint • r = |Si | is the arity of the constraint • Two types: hard or soft
  • 12. Hard constraints • A hard constraint Cih is a relation Ri that enumerates all the valid joint assignments of all variables in the scope of the constraint. Ri ⊆ Di1 × . . . × Dir Ri xj xk 0 1 1 0
  • 13. Soft constraints • A soft constraint Cis is a function Fi that maps every possible joint assignment of all variables in the scope to a real value. Fi : Di1 × . . . × Dir → ℜ Fi xj xk 2 0 0 0 0 1 0 1 0 1 1 1
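To make the two constraint types concrete, a minimal Python sketch (an illustrative encoding, not part of the original slides) representing the hard relation Ri and the soft function Fi from the tables above:

```python
# Hard constraint: the relation R_i enumerates the valid joint
# assignments of the variables in its scope (here: x_j != x_k).
R = {(0, 1), (1, 0)}

def satisfies(assignment):
    """True iff the joint assignment (x_j, x_k) belongs to R_i."""
    return assignment in R

# Soft constraint: the function F_i maps every joint assignment
# to a real value (table as on the slide).
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}

print(satisfies((0, 1)), F[(0, 0)])   # True 2
```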
  • 14. Binary Constraint Networks • Binary constraint networks are those where each constraint (soft or hard) is defined over two variables • Every constraint network can be mapped to a binary constraint network • requires the addition of variables and constraints • may add complexity to the model • They can be represented by a constraint graph (figure: variables x1 , . . . , x4 with constraints F1,2 , F1,3 , F1,4 , F2,4 )
  • 15. Different objectives, different problems • Constraint Satisfaction Problem (CSP) • Objective: find an assignment for all the variables in the network that satisfies all constraints. • Constraint Optimization Problem (COP) • Objective: find an assignment for all the variables in the network that satisfies all constraints and optimizes a global function. • Global function = aggregation (typically sum) of local functions. F(x) = ∑i Fi (xi )
  • 16. Distributed Constraint Reasoning When operating in a decentralized context: • a set of agents control variables • agents interact to find a solution to the constraint network (figure: agents A1 , . . . , A4 controlling the variables)
  • 17. Distributed Constraint Reasoning Two types of decentralized problems: • distributed CSP (DCSP) • distributed COP (DCOP) Here, we focus on DCOPs.
  • 18. Distributed Constraint Optimization Problem (DCOP) A DCOP consists of a constraint network N = X, D,C and a set of agents A = {A1 , . . . , Ak } where each agent: • controls a subset of the variables Xi ⊆ X • is only aware of constraints that involve variables it controls • communicates only with its neighbours
  • 19. Distributed Constraint Optimization Problem (DCOP) • Agents are assumed to be fully cooperative • Goal: find the assignment that optimizes the global function, not their own local utilities. • Solving a COP is NP-hard, and a DCOP is as hard as a COP.
  • 20. Motivation Why distribute? • Privacy • Robustness • Scalability
  • 21. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 22. Real World Applications Many standard benchmark problems in computer science can be modeled using the DCOP framework: • graph coloring As can many real world applications: • human-agent organizations (e.g. meeting scheduling) • sensor networks and robotics (e.g. target tracking)
  • 23. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Graph coloring Meeting Scheduling Target Tracking Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 24. Graph coloring • Popular benchmark • Simple formulation • Complexity controlled with few parameters: • Number of available colors • Number of nodes • Density (#constraints/#nodes) • Many versions of the problem: • CSP, MaxCSP, COP
  • 25. Graph coloring - CSP • Nodes can take k colors • Any two adjacent nodes should have different colors • If it happens this is a conflict Yes! No!
  • 26. Graph coloring - Max-CSP • Minimize the number of conflicts 0 -1
  • 27. Graph coloring - COP • Different weights for violated constraints • Preferences for different colors (figure: example with edge weights and color preferences 0, −2, −1, −3, −2, −1)
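One possible way to evaluate such a weighted colouring is sketched below; the node preferences and edge weights are made-up values, only the structure of the objective (preferences plus penalties for violated constraints) follows the slides:

```python
# Global value of a colouring: colour preferences plus the (negative)
# weights of violated constraints, i.e. edges whose endpoints match.
edge_weight = {('a', 'b'): -2, ('b', 'c'): -1}   # hypothetical weights
preference = {'a': {'red': 0, 'blue': -1},
              'b': {'red': -1, 'blue': 0},
              'c': {'red': 0, 'blue': -2}}

def value(coloring):
    v = sum(preference[n][c] for n, c in coloring.items())
    v += sum(w for (n1, n2), w in edge_weight.items()
             if coloring[n1] == coloring[n2])     # conflict penalties
    return v

print(value({'a': 'red', 'b': 'blue', 'c': 'red'}))   # 0 (no conflicts)
```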
  • 28. Graph coloring - DCOP • Each node is controlled by one agent • Each agent: • has preferences for different colors • communicates with its direct neighbours in the graph • In the example, A1 and A2 exchange preferences and conflicts, while A3 and A4 do not communicate
  • 29. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Graph coloring Meeting Scheduling Target Tracking Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 30. Meeting Scheduling Motivation: • Privacy • Robustness • Scalability
  • 31. Meeting Scheduling In large organizations many people, possibly working in different departments, are involved in a number of work meetings.
  • 32. Meeting Scheduling People might have various private preferences on meeting start times Better after 12:00am
  • 33. Meeting Scheduling Two meetings that share a participant cannot overlap Window: 15:00-18:00 Duration: 2h Window: 15:00-17:00 Duration: 1h
  • 34. DCOP formalization for the meeting scheduling problem • A set of agents representing participants • A set of variables representing meeting starting times according to a participant. • Hard Constraints: • Starting meeting times across different agents are equal • Meetings for the same agent are non-overlapping. • Soft Constraints: • Represent agent preferences on meeting starting times. Objective: find a valid schedule for the meeting while maximizing the sum of individuals’ preferences.
  • 35. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Graph coloring Meeting Scheduling Target Tracking Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 36. Target Tracking Motivation: • Privacy • Robustness • Scalability
  • 37. Target Tracking A set of sensors tracking a set of targets in order to provide an accurate estimate of their positions. T4 T1 T3 T2 Crucial for surveillance and monitoring applications.
  • 38. Target Tracking Sensors can have different sensing modalities that impact on the accuracy of the estimation of the targets’ positions. MODES T4 MODES MODES T1 T3 MODES T2
  • 39. Target Tracking Collaboration among sensors is crucial to improve system performance T4 T1 T3 T2
  • 40. DCOP formalization for the target tracking problem • Agents represent sensors • Variables encode the different sensing modalities of each sensor • Constraints • relate to a specific target • represent how sensor modalities impacts on the tracking performance • Objective: • Maximize coverage of the environment • Provide accurate estimations of potentially dangerous targets
  • 41. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 42. Complete Algorithms Pros: always find an optimal solution. Cons: exhibit an exponentially increasing coordination overhead, hence very limited scalability on general problems.
  • 43. Complete Algorithms • Completely decentralised • Search-based. • Synchronous: SyncBB, AND/OR search • Asynchronous: ADOPT, NCBB and AFB. • Dynamic programming. • Partially decentralised • OptAPO Next, we focus on completely decentralised algorithms
  • 44. Decentralised Complete Algorithms Search-based Dynamic programming • Uses distributed search • Uses distributed inference • Exchange individual values • Exchange constraints • Small messages but • Few messages but . . . exponentially many . . . exponentially large Representative: ADOPT [Modi Representative: DPOP [Petcu et al., 2005] and Faltings, 2005]
  • 45. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Search Based: ADOPT Dynamic Programming DPOP Approximated Algorithms for DCOPs Conclusions
  • 46. ADOPT ADOPT (Asynchronous Distributed OPTimization) [Modi et al., 2005]: • Distributed backtrack search using a best-first strategy • Best value based on local information: • Lower/upper bound estimates of each possible value of its variable • Backtrack thresholds used to speed up the search of previously explored solutions. • Termination conditions that check if the bound interval is less than a given valid error bound (0 if optimal)
  • 47. ADOPT by example 4 variables (4 agents): x1 , x2 , x3 , x4 with D = {0, 1} 4 identical cost functions Fi, j on the constraints F1,2 , F1,3 , F1,4 , F2,4 , with Fi, j (0, 0) = 2, Fi, j (0, 1) = 0, Fi, j (1, 0) = 0, Fi, j (1, 1) = 1 Goal: find a variable assignment with minimal cost Solution: x1 = 1, x2 = 0, x3 = 0 and x4 = 1 giving total cost 1.
  • 48. DFS arrangement • Before executing ADOPT, agents must be arranged in a depth first search (DFS) tree. • DFS trees have been frequently used in optimization because they have two interesting properties: • Agents in different branches of the tree do not share any constraints; • Every constraint network admits a DFS tree.
  • 49. ADOPT by example DFS arrangement: A1 (root, x1 ) has children A2 (x2 ) and A3 (x3 ); A4 (x4 ) is a child of A2 and a pseudo-child of A1 (through the constraint F1,4 ).
  • 50. Cost functions The local cost function for an agent Ai (δ (xi )) is the sum of the values of constraints involving only higher neighbors in the DFS.
  • 51. ADOPT by example Local cost functions: δ (x1 ) = 0 for A1 ; δ (x1 , x2 ) = F1,2 (x1 , x2 ) for A2 ; δ (x1 , x3 ) = F1,3 (x1 , x3 ) for A3 ; δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 ) for A4 .
  • 52. Initialization Each agent initially chooses a random value for their variables and initialize the lower and upper bounds to zero and infinity respectively. x1 = 0, LB = 0,UB = ∞ A1 x2 = 0, LB = 0,UB = ∞ A2 A3 x3 = 0, LB = 0,UB = ∞ x4 = 0, LB = 0,UB = ∞ A4
  • 53. ADOPT by example Value messages are sent by an agent to all its neighbors that are lower in the DFS tree. A1 sends three value messages (x1 = 0) to A2 , A3 and A4 , informing them that its current value is 0.
  • 54. ADOPT by example Current Context: a partial variable assignment maintained by each agent that records the assignment of all higher neighbours in the DFS. A1 • Updated by each VALUE message c2 : {x1 = 0} A2 A3 • If current context is not c3 : {x1 = 0} compatible with some child context, the latter is re-initialized A4 (also the child bound) c4 : {x1 = 0, x2 = 0}
  • 55. ADOPT by example Each agent Ai sends a cost message [LB,UB, ci ] to its parent A p . Each cost message reports: • The minimum lower bound (LB) • The maximum upper bound (UB) • The context (ci ) In the example, A2 sends [0, 0, c2 ], A3 sends [0, ∞, c3 ] and A4 sends [0, 0, c4 ].
  • 56. Lower bound computation Each agent evaluates for each possible value of its variable: • its local cost function with respect to the current context • adding all the compatible lower bound messages received from children. Analogous computation for upper bounds
  • 57. ADOPT by example Consider the lower bound in the cost message sent by A4 : A1 • Recall that A4 local cost function is: δ (x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 ) • Restricted to the current context c4 = {(x1 = 0, x2 = 0)}: λ (0, 0, x4 ) = F1,4 (0, x4 ) + F2,4 (0, x4 ). A2 A3 • For x4 = 0: λ (0, 0, 0) = F1,4 (0, 0) + F2,4 (0, 0) = 2 + 2 = 4. 4] c • For x4 = 1: [0, 0, λ (0, 0, 1) = F1,4 (0, 1) + F2,4 (0, 1) = 0 + 0 = 0. A4 Then the minimum lower bound across variable values is LB = 0.
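The slide's computation can be replayed directly; a small sketch using the cost tables of the running example:

```python
# A4's local cost under context c4 = {x1 = 0, x2 = 0}:
# delta(x1, x2, x4) = F14(x1, x4) + F24(x2, x4).
F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # identical F14 and F24

def delta(x1, x2, x4):
    return F[(x1, x4)] + F[(x2, x4)]

# Evaluate both values of x4 and take the minimum as the lower bound LB.
costs = {x4: delta(0, 0, x4) for x4 in (0, 1)}
LB = min(costs.values())
print(costs, LB)   # {0: 4, 1: 0} 0
```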
  • 58. ADOPT by example Each agent asynchronously chooses the value of its variable that minimizes its lower bound. A2 computes, for each possible value of its variable, its local function restricted to the current context c2 = {(x1 = 0)} (λ (0, x2 ) = F1,2 (0, x2 )), adding the lower bound message from A4 (lb). • For x2 = 0: LB(x2 = 0) = λ (0, x2 = 0) + lb(x2 = 0) = 2 + 0 = 2. • For x2 = 1: LB(x2 = 1) = λ (0, x2 = 1) + 0 = 0 + 0 = 0. A2 changes its value to x2 = 1 with LB = 0.
  • 59. Backtrack thresholds The search strategy is based on lower bounds Problem • Values abandoned before proven to be suboptimal • Lower/upper bounds only stored for the current context Solution • Backtrack thresholds: used to speed up the search of previously explored solutions.
  • 60. ADOPT by example x1 = 0 → 1 → 0 A1 A1 changes its value and the context with x1 = 0 is visited again. • Reconstructing from scratch is inefficient A2 A3 • Remembering solutions is expensive A4
  • 61. Backtrack thresholds Solution: Backtrack thresholds • Lower bound previously determined by children • Polynomial space • Control backtracking to efficiently search • Key point: do not change value until LB(currentvalue)> threshold
  • 62. A child agent will not change its variable value so long as cost is less than the backtrack threshold given to it by its parent. In the example, LB(x1 = 0) = 1 and A1 splits this bound among its children: t(x1 = 0) = 1/2 for each, so A2 checks LB(x2 = 0) > 1/2 ? and A3 checks LB(x3 = 0) > 1/2 ?
  • 63. Rebalance incorrect threshold How to correctly subdivide threshold among children? • Parent distributes the accumulated bound among children • Arbitrarily/Using some heuristics • Correct subdivision as feedback is received from children • LB ≤ t(CONTEXT) • t(CONTEXT) = δ + ∑Ci ti (CONTEXT)
  • 64. Backtrack Threshold Computation • When A1 receives a new lower bound from A2 , it rebalances the thresholds among its children • A1 resends threshold messages to A2 and A3
  • 65. ADOPT extensions • BnB-ADOPT [Yeoh et al., 2008] reduces computation time by using depth-first search with branch and bound strategy • [Ali et al., 2005] suggest the use of preprocessing techniques for guiding ADOPT search and show that this can result in a consistent increase in performance.
  • 66. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Search Based: ADOPT Dynamic Programming DPOP Approximated Algorithms for DCOPs Conclusions
  • 67. DPOP DPOP (Dynamic Programming Optimization Protocol) [Petcu and Faltings, 2005]: • Based on the dynamic programming paradigm. • Special case of Bucket Tree Elimination Algorithm (BTE) [Dechter, 2003].
  • 68. DPOP by example Constraints F1,2 , F1,3 , F2,3 , F1,4 , F2,4 , all with Fi, j (0, 0) = 2, Fi, j (0, 1) = 0, Fi, j (1, 0) = 0, Fi, j (1, 1) = 1 DFS arrangement: x1 (root); x2 child of x1 ; x3 and x4 children of x2 and pseudochildren of x1 Objective: find the assignment with maximal value
  • 69. DPOP phases Given a DFS tree structure, DPOP runs in two phases: • Util propagation: agents exchange util messages up the tree. • Aim: aggregate all info so that root agent can choose optimal value • Value propagation: agents exchange value messages down the tree. • Aim: propagate info so that all agents can make their choice given choices of ancestors
  • 70. Sepi : set of agents preceding Ai in the pseudo-tree order that are connected with Ai or with a descendant of Ai . In the example: Sep1 = ∅, Sep2 = {x1 }, Sep3 = {x1 , x2 }, Sep4 = {x1 , x2 }
  • 71. Util message The Util message Ui→ j that agent Ai sends to its parent A j can be computed as: Ui→ j (Sepi ) = max_{xi} [ (⊗_{Ak ∈ Ci} Uk→i ) ⊗ (⊗_{A p ∈ Pi ∪ PPi} Fi,p ) ] i.e. the join of all incoming messages from children with the constraints shared with parents/pseudoparents, with xi projected out; the message size is exponential in |Sepi |. The ⊗ operator is a join operator that sums up functions with different but overlapping scopes consistently.
  • 72. Join operator Example: (F1,4 ⊗ F2,4 )(x1 , x2 , x4 ) = F1,4 (x1 , x4 ) + F2,4 (x2 , x4 ). Projecting x4 out by maximization, max_{x4} (F1,4 ⊗ F2,4 ), yields a function over (x1 , x2 ): (0, 0) → max(4, 0) = 4, (0, 1) → max(2, 1) = 2, (1, 0) → max(2, 1) = 2, (1, 1) → max(0, 2) = 2.
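The join and projection steps can be written in a few lines; a sketch using the cost tables as read from the running example:

```python
# DPOP's join (sum functions on overlapping scopes) and projection
# (eliminate a variable by maximisation), for F14 and F24.
from itertools import product

F = {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1}   # both cost tables

# Join: (F14 ⊗ F24)(x1, x2, x4) = F14(x1, x4) + F24(x2, x4).
joined = {(x1, x2, x4): F[(x1, x4)] + F[(x2, x4)]
          for x1, x2, x4 in product((0, 1), repeat=3)}

# Projection: max out x4, leaving a function over (x1, x2).
projected = {(x1, x2): max(joined[(x1, x2, x4)] for x4 in (0, 1))
             for x1, x2 in product((0, 1), repeat=2)}

print(projected)   # {(0, 0): 4, (0, 1): 2, (1, 0): 2, (1, 1): 2}
```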
  • 73. Complexity is exponential in the largest Sepi . The largest Sepi equals the induced width of the DFS tree ordering used. In the example, A3 and A4 send U3→2 = max_{x3} (F1,3 ⊗ F2,3 ) and U4→2 = max_{x4} (F1,4 ⊗ F2,4 ), both with values (0, 0) → 4, (0, 1) → 2, (1, 0) → 2, (1, 1) → 2; A2 then sends U2→1 = max_{x2} (U3→2 ⊗ U4→2 ⊗ F1,2 ) to the root A1 , with U2→1 (x1 = 0) = 10 and U2→1 (x1 = 1) = 5.
  • 74. Value message Keeping fixed the values of parent/pseudoparents, each agent finds the value that maximizes the cost function computed in the util phase: xi* = arg max_{xi} [ ∑_{A j ∈ Ci} U j→i (xi , x*p ) + ∑_{A j ∈ Pi ∪ PPi} Fi, j (xi , x*j ) ] where x*p = ∪_{A j ∈ Pi ∪ PPi} {x*j } is the set of optimal values of Ai 's parent and pseudoparents, received from Ai 's parent. It then propagates these values down the tree through its children: Vi→ j = {xi = xi* } ∪ {xs = xs* : xs ∈ Sepi ∩ Sep j }
  • 75. In the example: x1* = arg max_{x1} U2→1 (x1 ); x2* = arg max_{x2} (U3→2 (x1* , x2 ) ⊗ U4→2 (x1* , x2 ) ⊗ F1,2 (x1* , x2 )); x3* = arg max_{x3} (F1,3 (x1* , x3 ) ⊗ F2,3 (x2* , x3 )); x4* = arg max_{x4} (F1,4 (x1* , x4 ) ⊗ F2,4 (x2* , x4 )).
  • 76. DPOP extensions • MB-DPOP [Petcu and Faltings, 2007] trades-off message size against the number of messages. • A-DPOP trades-off message size against solution quality [Petcu and Faltings, 2005(2)].
  • 77. Conclusions • Constraint processing • exploit problem structure to solve hard problems efficiently • DCOP framework • applies constraint processing to solve decision making problems in Multi-Agent Systems • increasingly being applied within real world problems.
  • 78. References I • [Modi et al., 2005] P. J. Modi, W. Shen, M. Tambe, and M. Yokoo. ADOPT: Asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence Journal, (161):149-180, 2005. • [Yeoh et al., 2008] W. Yeoh, A. Felner, and S. Koenig. BnB-ADOPT: An asynchronous branch-and-bound DCOP algorithm. In Proceedings of the Seventh International Joint Conference on Autonomous Agents and Multiagent Systems, pages 591-598, 2008. • [Ali et al., 2005] S. M. Ali, S. Koenig, and M. Tambe. Preprocessing techniques for accelerating the DCOP algorithm ADOPT. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 1041-1048, 2005. • [Petcu and Faltings, 2005] A. Petcu and B. Faltings. DPOP: A scalable method for multiagent constraint optimization. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, pages 266-271, 2005. • [Dechter, 2003] R. Dechter. Constraint Processing. Morgan Kaufmann, 2003.
  • 79. References II • [Petcu and Faltings, 2005(2)] A. Petcu and B. Faltings. A-DPOP: Approximations in distributed optimization. In Principles and Practice of Constraint Programming, pages 802-806, 2005. • [Petcu and Faltings, 2007] A. Petcu and B. Faltings. MB-DPOP: A new memory-bounded algorithm for distributed optimization. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, pages 1452-1457, 2007. • [S. Fitzpatrick and L. Meertens, 2003] S. Fitzpatrick and L. Meertens. Distributed Sensor Networks: A multiagent perspective, chapter Distributed coordination through anarchic optimization, pages 257-293. Kluwer Academic, 2003. • [R. T. Maheswaran et al., 2004] R. T. Maheswaran, J. P. Pearce, and M. Tambe. Distributed algorithms for DCOP: A graphical game-based approach. In Proceedings of the Seventeenth International Conference on Parallel and Distributed Computing Systems, pages 432-439, 2004.
  • 80. Outline Introduction Distributed Constraint Reasoning Applications and Exemplar Problems Complete algorithms for DCOPs Approximated Algorithms for DCOPs Conclusions
  • 81. Approximate Algorithms: outline • No guarantees – DSA-1, MGM-1 (exchange individual assignments) – Max-Sum (exchange functions) • Off-Line guarantees – K-optimality and extensions • On-Line Guarantees – Bounded max-sum
  • 82. Why Approximate Algorithms • Motivations – Often optimality in practical applications is not achievable – Fast good-enough solutions are all we can have • Example – Graph coloring – Medium size problem (about 20 nodes, three colors per node) – Number of states to visit for the optimal solution in the worst case: 3^20 ≈ 3.5 billion states • Key problem – Providing guarantees on solution quality
  • 83. Exemplar Application: Surveillance • Event Detection – Vehicles passing on a road • Energy Constraints – Sense/Sleep modes – Recharge when sleeping • Coordination – Activity can be detected by a single sensor's duty cycle – Roads have different traffic loads over time • Aim – Focus duty cycles on the roads with more traffic load [Rogers et al. 10]
  • 85. Guarantees on solution quality • Key Concept: bound the optimal solution – Assume a maximization problem – x* the optimal solution, x̃ a solution – α = V (x̃)/V (x* ): percentage of optimality • in [0,1] • The higher the better – ρ = V (x* )/V (x̃): approximation ratio • ρ >= 1 • The lower the better – the guaranteed value of α (or ρ) is the bound
  • 86. Types of Guarantees (accuracy vs. generality) • Instance-specific (highest accuracy): Bounded Max-Sum, DaCSA • Instance-generic: K-optimality, T-optimality, Region Opt. • No guarantees (highest generality, least use of instance-specific knowledge): MGM-1, DSA-1, Max-Sum
  • 87. Centralized Local Greedy approaches • Greedy local search – Start from random solution – Do local changes if global solution improves – Local: change the value of a subset of variables, usually one
  • 88. Centralized Local Greedy approaches • Problems – Local minima – Standard solutions: RandomWalk, Simulated Annealing
  • 89. Distributed Local Greedy approaches • Local knowledge • Parallel execution: – A greedy local move might be harmful/useless – Need coordination
  • 90. Distributed Stochastic Algorithm • Greedy local search with an activation probability to mitigate issues with parallel executions • DSA-1: change value of one variable at a time • Initialize agents with a random assignment and communicate values to neighbors • Each agent: – Generates a random number and executes only if rnd is less than the activation probability – When executing, changes value maximizing local gain – Communicates possible variable change to neighbors
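A single DSA-1 step from one agent's point of view could look as follows (a synchronous sketch; the gain function and the rng stub are illustrative, not part of the tutorial):

```python
import random

def dsa_step(my_value, domain, neighbour_values, local_gain, p=0.25, rng=random):
    """With probability p, move to the value maximising the local gain;
    otherwise keep the current value for this round."""
    if rng.random() >= p:
        return my_value
    return max(domain, key=lambda v: local_gain(v, neighbour_values))

# Example gain for graph colouring: minimise conflicts with neighbours.
gain = lambda v, nbrs: -sum(v == n for n in nbrs)

class Always:               # stub rng that always activates the agent
    def random(self):
        return 0.0

print(dsa_step('red', ['red', 'blue'], ['red', 'red'], gain, rng=Always()))  # blue
```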
  • 91. DSA-1: Execution Example Each of the four agents draws a random number rnd and compares it with the activation probability P = 1/4; only the agents that activate change their value.
  • 92. DSA-1: discussion • Extremely “cheap” (computation/communication) • Good performance in various domains – e.g. target tracking [Fitzpatrick Meertens 03, Zhang et al. 03] – Shows an anytime property (not guaranteed) – Benchmarking technique for coordination • Problems – Activation probability must be tuned [Zhang et al. 03] – No general rule, hard to characterise results across domains
  • 93. Maximum Gain Message (MGM-1) • Coordinate to decide who is going to move – Compute and exchange possible gains – Agent with maximum (positive) gain executes • Analysis [Maheswaran et al. 04] – Empirically, similar to DSA – More communication (but still linear) – No Threshold to set – Guaranteed to be monotonic (Anytime behavior)
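The gain exchange and the move rule can be sketched as follows (illustrative utilities; tie-breaking and neighbour synchronisation are omitted):

```python
def best_gain(current, domain, utility):
    """Best improvement obtainable by changing only this agent's value."""
    base = utility(current)
    gains = {v: utility(v) - base for v in domain}
    best = max(gains, key=gains.get)
    return best, gains[best]

def may_move(my_gain, neighbour_gains):
    """MGM-1 rule: move only with a positive gain that is maximal
    among the gains reported by the neighbours."""
    return my_gain > 0 and my_gain >= max(neighbour_gains, default=0)

utility = {'red': -2, 'blue': 0}.get      # hypothetical local utilities
v, g = best_gain('red', ['red', 'blue'], utility)
print(v, g, may_move(g, [1, 0]))   # blue 2 True
```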
  • 94. MGM-1: Example Agents compute and exchange their gains (G = -2, G = 0, G = 0, G = 2); only the agent with the maximum positive gain (G = 2) changes its value.
  • 95. Local greedy approaches • Exchange local values for variables – Similar to search based methods (e.g. ADOPT) • Consider only local information when maximizing – Values of neighbors • Anytime behaviors • Could result in very bad solutions
  • 96. Max-sum Agents iteratively compute local functions that depend only on the variable they control, exchanging messages along the shared constraints; a message to a neighbour aggregates all incoming messages except the one received from that neighbour, and each agent finally chooses the arg max of the sum of all its incoming messages.
  • 97. Factor Graph and GDL • Factor Graph – [Kschischang, Frey, Loeliger 01] – Computational framework to represent factored computation – Bipartite graph, Variable - Factor • Example: H(x1 , x2 , x3 ) = H(x1 ) + H(x2 | x1 ) + H(x3 | x1 ), with variable nodes x1 , x2 , x3 connected to the factors H(x1 ), H(x2 | x1 ) and H(x3 | x1 )
• 98. Max-Sum on acyclic graphs • Max-sum is optimal on acyclic graphs – Different branches are independent – Each agent can build a correct estimate of its contribution to the global problem (z functions) • Message equations very similar to Util messages in DPOP – GDL generalizes DPOP [Vinyals et al. 2010a] (diagram: each message sums up information from other nodes, followed by a local maximization step)
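The two message equations (variable-to-factor: sum of incoming factor messages excluding the recipient; factor-to-variable: maximize the factor plus the other variables' messages) can be written down directly. A brute-force sketch of synchronous max-sum; the representation (`factors` mapping a name to a `(scope, fn)` pair) is an assumption for illustration, not the API of any of the cited libraries:

```python
from itertools import product

def max_sum(domains, factors, iterations=5):
    """Synchronous max-sum on a factor graph (illustrative sketch).

    domains: dict var -> list of values; factors: dict name -> (scope, fn),
    where fn maps a tuple of values (ordered as scope) to a real reward.
    On acyclic graphs the z functions are exact once messages have
    propagated across the tree (no normalization needed for finitely
    many iterations).
    """
    q = {(v, f): {x: 0.0 for x in domains[v]}          # variable -> factor
         for f, (scope, _) in factors.items() for v in scope}
    r = {(f, v): {x: 0.0 for x in domains[v]}          # factor -> variable
         for f, (scope, _) in factors.items() for v in scope}
    var_factors = {v: [f for f, (s, _) in factors.items() if v in s]
                   for v in domains}

    for _ in range(iterations):
        # variable -> factor: sum incoming factor messages, excluding f
        q = {(v, f): {x: sum(r[(g, v)][x] for g in var_factors[v] if g != f)
                      for x in domains[v]}
             for (v, f) in q}
        # factor -> variable: maximize factor + other variables' messages
        new_r = {}
        for f, (scope, fn) in factors.items():
            for v in scope:
                msg = {}
                for x in domains[v]:
                    best = float('-inf')
                    for combo in product(*(domains[u] for u in scope)):
                        if combo[scope.index(v)] != x:
                            continue
                        val = fn(combo) + sum(q[(u, f)][combo[scope.index(u)]]
                                              for u in scope if u != v)
                        best = max(best, val)
                    msg[x] = best
                new_r[(f, v)] = msg
        r = new_r

    # z_v: each variable picks the arg max of the sum of incoming messages
    return {v: max(domains[v],
                   key=lambda x: sum(r[(f, v)][x] for f in var_factors[v]))
            for v in domains}
```

On the chain x1—F1—x2—F2—x3 below, every agent recovers the globally optimal assignment from purely local messages, as the acyclic-case discussion above states.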
• 99. (Loopy) Max-sum Performance • Good performance on loopy networks [Farinelli et al. 08] – When it converges, solution quality is very good • Interesting results when there is only one cycle [Weiss 00] – We could remove cycles, but at an exponential price (see DPOP) • Java library for max-sum: http://code.google.com/p/jmaxsum/
• 100. Max-Sum for low power devices • Low overhead – Small number/size of messages • Asynchronous computation – Agents make decisions whenever new messages arrive • Robust to message loss
• 102. Max-Sum for UAVs • Task assignment for UAVs [Delle Fave et al 12] (diagram: UAVs streaming video of interest points, coordinated via Max-Sum)
• 103. Task utility • Depends on task completion, priority and urgency • Time window: from when the first assigned UAV reaches the task to when the last assigned UAV leaves it (considering battery life)
• 104. Factor Graph Representation (diagram: factor graph with UAV variables X1, X2, task utility factors U1, U2, U3 for tasks T1, T2, T3, and PDA1–PDA3)
• 106. Quality guarantees for approximate techniques • Key area of research • Addresses the trade-off between guarantees and computational effort • Particularly important for many real-world applications – Critical domains (e.g. search and rescue) – Resource-constrained settings (e.g. embedded devices) – Dynamic settings
• 107. Instance-generic guarantees • Characterise solution quality without running the algorithm (chart, accuracy vs. generality: instance-specific — Bounded Max-Sum, DaCSA; instance-generic — K-optimality, T-optimality, Region Opt.; no guarantees — MGM-1, DSA-1, Max-Sum)
• 108. K-Optimality framework • Given a characterization of the solution, gives a bound on its quality [Pearce and Tambe 07] • Characterization of the solution: k-optimality • K-optimal solution: – the value of the objective function cannot be improved by changing the assignment of k or fewer variables
• 109. K-Optimal solutions (example: on a small constraint graph with rewards, an assignment that is 2-optimal — no change of 2 or fewer variables improves it — but not 3-optimal)
• 110. Bounds for K-Optimality • For any DCOP with non-negative rewards, n agents and constraints of maximum arity m, a k-optimal solution x satisfies [Pearce and Tambe 07]: R(x) ≥ [C(n−m, k−m) / (C(n, k) − C(n−m, k))] · R(x*) • Binary networks (m = 2): R(x) ≥ [(k−1) / (2n−k−1)] · R(x*)
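The bound is easy to evaluate numerically. A small sketch (the general formula is my reconstruction of the [Pearce and Tambe 07] result from the binary case; `k_optimal_bound` is an illustrative name):

```python
from math import comb

def k_optimal_bound(n, k, m=2):
    """Guaranteed fraction of the optimal reward achieved by any
    k-optimal solution of a DCOP with n agents, constraints of maximum
    arity m, and non-negative rewards [Pearce and Tambe 07].
    For binary networks (m = 2) this reduces to (k-1) / (2n - k - 1).
    """
    return comb(n - m, k - m) / (comb(n, k) - comb(n - m, k))
```

For example, with n = 10 binary-constrained agents a 2-optimal solution (DSA-2/MGM-2) is only guaranteed 1/17 of the optimum, while k = n recovers the full optimum, which matches the "loose bounds for large-scale systems" remark later in the summary.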
• 111. K-Optimality Discussion • Need algorithms for computing k-optimal solutions – DSA-1, MGM-1: k=1; DSA-2, MGM-2: k=2 [Maheswaran et al. 04] – DALO for generic k (and t-optimality) [Kiekintveld et al. 10] • The higher k, the more complex the computation (exponential) • Percentage of optimal: the higher k, the better; the higher the number of agents, the worse
• 112. Trade-off between generality and solution quality • K-optimality is based on worst-case analysis • Assuming more knowledge gives much better bounds – Knowledge of the graph structure [Pearce and Tambe 07]
• 113. Trade-off between generality and solution quality • Knowledge of rewards [Bowring et al. 08] – Beta: ratio of the least minimum reward to the maximum reward
• 114. Off-Line Guarantees: Region Optimality • k-optimality: uses size as the criterion for optimality • t-optimality: uses distance to a central agent in the constraint graph • Region optimality: defines regions based on general criteria (e.g. s-size bounded distance) [Vinyals et al 11] • Ack: Meritxell Vinyals (diagram: 3-size regions, 1-distance regions and C regions over agents x0–x3)
• 115. Size-Bounded Distance • Region optimality can explore new regions: s-size bounded distance – One region per agent: the largest t-distance group whose size is at most s • S-size-bounded distance – C-DALO: extension of DALO for general regions – Can provide better bounds and keep the size and number of regions under control (diagram: 3-size bounded distance regions over x0–x3 for t = 0 and t = 1)
• 116. Max-Sum and Region Optimality • Region optimality can be used to provide bounds for Max-Sum [Vinyals et al 10b] • Upon convergence, Max-Sum is optimal on the SLT regions of the graph [Weiss 00] • Single Loops and Trees (SLT): all groups of agents whose vertex-induced subgraph contains at most one cycle (diagram: SLT regions over agents x0–x3)
• 117. Bounds for Max-Sum • Complete graphs: same as 3-size optimality • Bipartite graphs • 2D grids (table of per-topology bounds omitted)
• 118. Variable Disjoint Cycles • Very high quality guarantees if the smallest cycle is large
• 119. Instance-specific guarantees • Characterise solution quality after/while running the algorithm (chart, accuracy vs. generality: instance-specific — Bounded Max-Sum, DaCSA; instance-generic — K-optimality, T-optimality, Region Opt.; no guarantees — MGM-1, DSA-1, Max-Sum)
• 120. Bounded Max-Sum • Aim: remove cycles from the factor graph while avoiding exponential computation/communication (e.g. no junction tree) • Key idea: solve a relaxed problem instance [Rogers et al. 11] (diagram: build a spanning tree of the factor graph, run Max-Sum to obtain the optimal solution on the tree, and compute the bound)
• 121. Factor Graph Annotation • Compute a weight for each edge – the maximum possible impact of the variable on the function (diagram: factor graph with edge weights w11, w12, w21, w22, w23, w32, w33)
• 122. Factor Graph Modification • Build a maximum spanning tree – keep the edges with the highest weights • Cut the remaining dependencies – modify the functions • Compute the bound from the weights of the cut edges (in the example, W = w22 + w23)
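The annotation and spanning-tree steps can be sketched concretely. Same assumed representation as before (`factors`: name → `(scope, fn)`); the function-modification step and the tree solve are omitted, so this sketch only computes which edges are cut and the resulting bound W (the sum of the cut weights), under my reading of the weight definition in [Rogers et al. 11] as the maximum impact of the variable on the factor:

```python
from itertools import product

def edge_weight(scope, fn, domains, v):
    """Maximum impact of variable v on factor fn: the largest spread
    (max - min over v's values) over all settings of the other variables."""
    others = [u for u in scope if u != v]
    worst = 0.0
    for combo in product(*(domains[u] for u in others)):
        fixed = dict(zip(others, combo))
        vals = [fn(tuple(fixed[u] if u != v else x for u in scope))
                for x in domains[v]]
        worst = max(worst, max(vals) - min(vals))
    return worst

def bounded_max_sum_relaxation(domains, factors):
    """Annotate edges, keep a maximum spanning tree (Kruskal with a
    simple union-find), and return (kept edges, cut edges, bound W)."""
    edges = sorted(((edge_weight(s, fn, domains, v), f, v)
                    for f, (s, fn) in factors.items() for v in s),
                   reverse=True)
    parent = {n: n for n in list(domains) + list(factors)}
    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n
    kept, cut = [], []
    for w, f, v in edges:
        rf, rv = find(f), find(v)
        if rf != rv:              # heaviest edges are kept first
            parent[rf] = rv
            kept.append((f, v))
        else:                     # edge closes a cycle: cut it
            cut.append((f, v, w))
    W = sum(w for _, _, w in cut)
    return kept, cut, W
```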
• 123. Results: Random Binary Network • The bound is significant – the approximation ratio is typically 1.23 (81%) (plot: optimal, approximate, lower-bound and upper-bound values) • Comparison with k-optimality with knowledge of the reward structure: much more accurate, but less general
• 124. Discussion • Comparison with other data-dependent techniques – BnB-ADOPT [Yeoh et al 09] • Fix an error bound and execute until it is met • Worst-case computation remains exponential – ADPOP [Petcu and Faltings 05b] • Can fix the message size (and thus computation) or the error bound, leaving the other parameter free • Divide and coordinate [Vinyals et al 10] – Divides the problem among agents, which negotiate an agreement by exchanging utilities – Provides anytime quality guarantees
• 125. Summary • Approximation techniques are crucial for practical applications: surveillance, rescue, etc. • DSA, MGM, Max-Sum: heuristic approaches – Low coordination overhead, acceptable performance – No guarantees (convergence, solution quality) • Instance-generic guarantees: – K-optimality framework – Loose bounds for large-scale systems • Instance-specific guarantees: – Bounded Max-Sum, ADPOP, BnB-ADOPT – Performance depends on the specific instance
• 126. References I DCOPs for MRS • [Delle Fave et al 12] A methodology for deploying the max-sum algorithm and a case study on unmanned aerial vehicles. IAAI 2012 • [Taylor et al. 11] Distributed On-line Multi-Agent Optimization Under Uncertainty: Balancing Exploration and Exploitation. Advances in Complex Systems MGM • [Maheswaran et al. 04] Distributed Algorithms for DCOP: A Graphical Game-Based Approach. PDCS 2004 DSA • [Fitzpatrick and Meertens 03] Distributed Coordination through Anarchic Optimization. Distributed Sensor Networks: A Multiagent Perspective • [Zhang et al. 03] A Comparative Study of Distributed Constraint Algorithms. Distributed Sensor Networks: A Multiagent Perspective Max-Sum • [Stranders et al. 09] Decentralised Coordination of Mobile Sensors Using the Max-Sum Algorithm. AAAI 09 • [Rogers et al. 10] Self-organising Sensors for Wide Area Surveillance Using the Max-sum Algorithm. LNCS 6090 Self-Organizing Architectures • [Farinelli et al. 08] Decentralised coordination of low-power embedded devices using the max-sum algorithm. AAMAS 08
  • 127. References II Instance-based Approximation • [Yeoh et al. 09] Trading off solution quality for faster computation in DCOP search algorithms, IJCAI 09 • [Petcu and Faltings 05b] A-DPOP: Approximations in Distributed Optimization, CP 2005 • [Rogers et al. 11] Bounded approximate decentralised coordination via the max-sum algorithm, Artificial Intelligence 2011. Instance-generic Approximation • [Vinyals et al 10b] Worst-case bounds on the quality of max-product fixed-points, NIPS 10 • [Vinyals et al 11] Quality guarantees for region optimal algorithms, AAMAS 11 • [Pearce and Tambe 07] Quality Guarantees on k-Optimal Solutions for Distributed Constraint Optimization Problems, IJCAI 07 • [Bowring et al. 08] On K-Optimal Distributed Constraint Optimization Algorithms: New Bounds and Algorithms, AAMAS 08 • [Weiss 00] Correctness of local probability propagation in graphical models with loops, Neural Computation • [Kiekintveld et al. 10] Asynchronous Algorithms for Approximate Distributed Constraint Optimization with Quality Bounds, AAMAS 10