Control Techniques for Complex Systems
Department of Electrical & Computer Engineering
University of Florida

Sean P. Meyn
Coordinated Science Laboratory
and the Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign, USA

April 21, 2011
Outline

[Book covers: Control Techniques for Complex Networks (Sean Meyn), and Markov Chains and Stochastic Stability (S. P. Meyn and R. L. Tweedie).]

1  Control Techniques

2  Complex Networks

3  Architectures for Adaptation & Learning

4  Next Steps
Control Techniques

System model:
  dα/dt = µσ − Cα + ···
  dq/dt = ½ µ I⁻¹ (C − ···)
  dθ/dt = q

      ???

Control Techniques?
Control Techniques

Typical steps to control design

  Obtain a simple model that captures the essential structure
    – An equilibrium model if the goal is regulation
      (System model as on the previous slide.)

  Obtain a feedback design, using dynamic programming, LQG, loop shaping, ...
      Design for performance and reliability

  Test via simulations and experiments, and refine the design

  If these steps fail, we may have to re-engineer the system (e.g., introduce new sensors) and start over.
  This point of view is unique to control.
Control Techniques

Typical steps to scheduling

[Inventory model: controlled work-release, controlled routing, uncertain demand (demand 1, demand 2).]
A simplified model of a semiconductor manufacturing facility.
Similar demand-driven models can be used to model allocation of locational reserves in a power grid.

Difficulty: a Markov model is not simple enough!
  Obtain a simple model – frequently based on exponential statistics, to obtain a Markov model
  Obtain a feedback design based on heuristics, or dynamic programming
  Performance evaluation via computation (e.g., Neuts' matrix-geometric methods)

With the 16 buffers truncated to 0 ≤ x ≤ 10,
  policy synthesis reduces to a linear program of dimension 11^16!
Control Techniques

Control-theoretic approach to scheduling:  dq/dt = Bu + α

[Inventory model: controlled work-release, controlled routing, uncertain demand (demand 1, demand 2).]

  q: queue lengths, evolving on R^16_+
  u: scheduling/routing decisions (a convex relaxation)
  α: mean exogenous arrivals of work
  B: captures the network topology

Control-theoretic approach to scheduling:
dimension reduced from a linear program of dimension 11^16 ...
   to an HJB equation of dimension 16.
Does this solve the problem?
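To make the fluid-model notation concrete, here is a minimal simulation sketch of dq/dt = Bu + α for a hypothetical two-buffer tandem line (not the 16-buffer model on the slide); the matrix B, the rates, and the non-idling policy are illustrative assumptions.

```python
import numpy as np

# Hypothetical two-buffer tandem line: work arrives to buffer 1 at rate alpha[0];
# station 1 drains buffer 1 into buffer 2 at rate mu1*u1; station 2 drains
# buffer 2 at rate mu2*u2.  Then dq/dt = B u + alpha with
mu1, mu2 = 1.0, 0.8
B = np.array([[-mu1,  0.0],
              [ mu1, -mu2]])
alpha = np.array([0.6, 0.0])

def policy(q):
    """Illustrative non-idling policy: run each station whenever its buffer is nonempty."""
    return np.array([1.0 if q[0] > 0 else 0.0,
                     1.0 if q[1] > 0 else 0.0])

def simulate(q0, T=50.0, dt=0.01):
    """Euler integration of the fluid model, projecting q onto the nonnegative orthant."""
    q = np.array(q0, dtype=float)
    traj = [q.copy()]
    for _ in range(int(T / dt)):
        u = policy(q)
        q = np.maximum(q + dt * (B @ u + alpha), 0.0)
        traj.append(q.copy())
    return np.array(traj)

traj = simulate([10.0, 5.0])
print("final queue lengths:", traj[-1])   # both buffers drain toward 0 for these rates
```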
Complex Networks

[Figure: a large network with links marked Uncongested / Congested / Highly Congested.]

Complex Networks
First, a review of some control theory...
Complex Networks

Dynamic Programming Equations
Deterministic model:  ẋ = f(x, u)

Controlled generator:
  D_u h(x) = d/dt h(x(t)) |_{t=0, x(0)=x, u(0)=u} = f(x, u) · ∇h(x)

Minimal total cost:
  J*(x) = inf_U ∫_0^∞ c(x(t), u(t)) dt,   x(0) = x

HJB equation:
  min_u [ c(x, u) + D_u J*(x) ] = 0
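A worked scalar instance (an illustration I am adding, not the model on the slide): for a single fluid queue q̇ = −µu + α with u ∈ [0, 1], µ > α and cost c(x, u) = x, the non-idling policy u ≡ 1 gives J*(x) = x²/(2(µ − α)), and the HJB equation above can be checked numerically.

```python
import numpy as np

mu, alpha = 1.0, 0.6   # service and arrival rates (illustrative assumption)

def J_star(x):
    """Candidate total-cost value function for the scalar fluid queue."""
    return x**2 / (2.0 * (mu - alpha))

def total_cost(x0, dt=1e-3):
    """Integrate c(x(t)) = x(t) under u(t) = 1 until the state reaches 0."""
    x, cost = x0, 0.0
    while x > 0:
        cost += x * dt
        x += dt * (alpha - mu)      # dx/dt = -mu + alpha < 0
    return cost

def hjb_residual(x):
    """min over u in {0, 1} of c(x, u) + (alpha - mu*u) * dJ*/dx; should be ~0 for x > 0."""
    dJ = x / (mu - alpha)
    return min(x + (alpha - mu * u) * dJ for u in (0.0, 1.0))

for x0 in (1.0, 5.0, 10.0):
    print(x0, J_star(x0), total_cost(x0), hjb_residual(x0))
```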
Complex Networks

Dynamic Programming Equations
Diffusion model:  dX = f(X, U) dt + σ(X) dN

Controlled generator:
  D_u h(x) = d/dt E[h(X(t))] |_{t=0, x(0)=x, u(0)=u}
           = f(x, u) · ∇h(x) + ½ trace( σ(x)ᵀ ∇²h(x) σ(x) )

Minimal average cost:
  η* = inf_U lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt

ACOE (Average Cost Optimality Equation):
  min_u [ c(x, u) + D_u h*(x) ] = η*

  h* is the relative value function
Complex Networks

Dynamic Programming Equations
MDP model:  X(t+1) − X(t) = f(X(t), U(t), N(t+1))

Controlled generator:
  D_u h(x) = E[h(X(1)) − h(X(0))]
           = E[h(x + f(x, u, N))] − h(x)

Minimal average cost:
  η* = inf_U lim_{T→∞} (1/T) Σ_{t=0}^{T−1} c(X(t), U(t))

ACOE (Average Cost Optimality Equation):
  min_u [ c(x, u) + D_u h*(x) ] = η*

  h* is the relative value function
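A minimal sketch of relative value iteration for the ACOE on a small controlled queue (my own construction, not from the talk): a single buffer truncated to {0, ..., N}, a serve/idle decision, and cost c(x, u) = x + r·u so the control is not trivial. The parameters and the Bernoulli transition law are assumptions.

```python
import numpy as np

N = 20                          # buffer truncation (assumption)
alpha, mu, r = 0.4, 0.6, 2.0    # arrival prob., service prob., per-step service cost (assumptions)
states = np.arange(N + 1)

def transition(x, u):
    """Distribution of X(t+1) given X(t)=x, U(t)=u, with Bernoulli arrivals and services."""
    p = np.zeros(N + 1)
    for a, pa in ((0, 1 - alpha), (1, alpha)):
        for s, ps in ((0, 1 - mu), (1, mu)):
            y = min(max(x + a - s * u, 0), N)
            p[y] += pa * ps
    return p

P = {u: np.array([transition(x, u) for x in states]) for u in (0, 1)}
cost = {u: states + r * u for u in (0, 1)}

h = np.zeros(N + 1)
for _ in range(5000):                       # relative value iteration
    Th = np.minimum(cost[0] + P[0] @ h, cost[1] + P[1] @ h)
    eta = Th[0]                             # normalize at the reference state x = 0
    h_new = Th - eta
    if np.max(np.abs(h_new - h)) < 1e-10:
        break
    h = h_new

policy = np.argmin(np.vstack([cost[0] + P[0] @ h, cost[1] + P[1] @ h]), axis=0)
print("average cost eta* ~", eta)
print("policy (serve = 1):", policy)
```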
Complex Networks

Approximate Dynamic Programming
ODE model from the MDP model,  X(t+1) − X(t) = f(X(t), U(t), N(t+1))

Mean drift:  f(x, u) = E[X(t+1) − X(t) | X(t) = x, U(t) = u]

Fluid model:  ẋ(t) = f(x(t), u(t))

First-order Taylor series approximation:
  D_u h(x) = E[h(x + f(x, u, N))] − h(x) ≈ f(x, u) · ∇h(x)

A second-order Taylor series expansion leads to a diffusion model.
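The passage from the MDP model to its fluid model can be mimicked numerically: estimate the mean drift by Monte Carlo from a simulator of the random recursion, then integrate ẋ = f(x, u) by Euler. The single-buffer recursion below is an illustrative assumption, not the network on the earlier slides.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, mu = 0.4, 0.7   # arrival and service probabilities (assumptions)

def step(x, u):
    """One transition of the MDP model X(t+1) = X(t) + A(t+1) - S(t+1) U(t), kept nonnegative."""
    a = rng.random() < alpha
    s = rng.random() < mu
    return max(x + int(a) - int(s) * u, 0)

def mean_drift(x, u, n=20000):
    """Monte Carlo estimate of f(x, u) = E[X(t+1) - X(t) | X(t) = x, U(t) = u]."""
    return float(np.mean([step(x, u) - x for _ in range(n)]))

# Fluid model from the estimated drift, under u(t) = 1 (serve whenever possible)
drift = mean_drift(5, 1)        # for this model the drift does not depend on x when x > 0
x, dt, steps = 10.0, 0.5, 0
while x > 0:
    x = max(x + dt * drift, 0.0)
    steps += 1

print("estimated drift:", drift, " exact:", alpha - mu)
print("fluid trajectory reaches zero after ~", steps * dt, "time units")
```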
Complex Networks

ADP for Stochastic Networks
Conclusions as of April 21, 2011

  Stochastic model:  Q(t+1) − Q(t) = B(t+1) U(t) + A(t+1)
  Fluid model:       dq/dt = Bu + α
  Cost c(x, u) = |x|;  relative value function h*;  total-cost value function J*

  [Inventory model as before: q evolves on R^16_+; u are scheduling/routing decisions (convex relaxation); α is the mean exogenous arrival of work; B captures the network topology.]
Complex Networks

ADP for Stochastic Networks
Conclusions as of April 21, 2011

Key conclusions – analytical
  Stability of q implies stochastic stability of Q    [Dai 1995, Dai & M. 1995]
  h*(x) ≈ J*(x) for large |x|    [M. 1996–2011]
  In many cases, the translation of the optimal policy for q is approximately optimal, with logarithmic regret    [M. 2005 & 2009]
Complex Networks

ADP for Stochastic Networks
Conclusions as of April 21, 2011

Key conclusions – engineering
  Stability of q implies stochastic stability of Q
  Simple decentralized policies based on q (see the MaxWeight sketch below)    [Tassiulas, 1995 –]
  Workload relaxation for model reduction    [M. 2003 –, following "heavy traffic" theory: Laws, Kelly, Harrison, Dai, ...]
  Intuition regarding the structure of good policies
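The "simple decentralized policies based on q" are MaxWeight/back-pressure-type rules (Tassiulas; see also the generalized MaxWeight reference at the end of the talk). Below is a minimal MaxWeight sketch for a hypothetical system in which one server chooses between two queues; the arrival and service probabilities and the simulation loop are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
alphas = np.array([0.3, 0.3])   # arrival probabilities (assumptions)
mus = np.array([0.9, 0.5])      # service success probabilities (assumptions)

def maxweight(q):
    """Serve the queue maximizing mu_i * q_i (ties broken toward queue 0)."""
    return int(np.argmax(mus * q))

def average_queue_length(T=200000):
    q = np.zeros(2)
    total = 0.0
    for _ in range(T):
        i = maxweight(q)
        arrivals = rng.random(2) < alphas
        served = np.zeros(2)
        if q[i] > 0 and rng.random() < mus[i]:
            served[i] = 1.0
        q = q + arrivals - served
        total += q.sum()
    return total / T

print("average total queue length under MaxWeight ~", average_queue_length())
```

The policy uses only the local queue lengths and rates, which is the sense in which it is simple and decentralized.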
Complex Networks

ADP for Stochastic Networks
Workload Relaxations

[Figure: inventory model (controlled work-release, controlled routing, uncertain demand; demand 1, demand 2), and the workload plane (w1, w2) with regions R_STO and R*.]

Workload process: W evolves on R^2
Relaxation: only lower bounds on rates are preserved
Effective cost: c̄(w) is the minimum of c(x) over all x consistent with w
Optimal policy for the fluid relaxation: non-idling on region R*
Optimal policy for the stochastic relaxation: introduce hedging (see the sketch below)
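To illustrate "introduce hedging", here is a sketch of a one-dimensional make-to-stock example of my own construction: produce whenever the surplus is below a hedging point x̄, with a holding cost on positive surplus and a larger cost on backlog. The fluid-optimal rule corresponds to x̄ = 0; with variability, a positive hedging point does better. All parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
p_d, p_s = 0.45, 0.55          # demand and production-success probabilities (assumptions)
c_hold, c_back = 1.0, 10.0     # holding and backlog costs (assumptions)

def avg_cost(hedging_point, T=200000):
    """Average cost of the policy: produce whenever the surplus is below the hedging point."""
    x, total = 0, 0.0
    for _ in range(T):
        produce = x < hedging_point
        s = produce and (rng.random() < p_s)
        d = rng.random() < p_d
        x += int(s) - int(d)
        total += c_hold * max(x, 0) + c_back * max(-x, 0)
    return total / T

for hp in (0, 2, 5, 10):
    print("hedging point", hp, "-> average cost ~", round(avg_cost(hp), 2))
```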
Complex Networks

ADP for Stochastic Networks
Policy translation

[Figure: inventory model and the workload plane (w1, w2) with regions R_STO and R*, as on the previous slide.]

Complete Policy Synthesis
1. Optimal control of the relaxation
2. Translation to the physical system:
   2a. Achieve the approximation c(Q(t)) ≈ c̄(W(t))
   2b. Address boundary constraints ignored in fluid approximations; achieved using safety stocks.
Architectures for Adaptation & Learning

[Overview figure: mean-field games (ensemble state vs. individual state; Agent 4; Agent 5 barely controllable); singular perturbations and workload relaxations (the 16-buffer network q1–q16 with Stations 1–5 and demands d1, d2); fluid and diffusion models (average cost vs. iteration for the standard VIA, initialized with a quadratic vs. the optimal fluid value function); the workload plane with regions R_STO and R*; an optimal policy plot. Theme: Adaptation & Learning.]
Architectures for Adaptation & Learning

Reinforcement Learning
Approximating a value function: Q-learning

ACOE:  min_u [ c(x, u) + D_u h*(x) ] = η*
  h*: relative value function
  η*: minimal average cost
  "Q-function":  Q*(x, u) = c(x, u) + D_u h*(x)
      Watkins 1989 ...  "Machine Intelligence Lab" @ ece.ufl.edu

Q-learning: given a parameterized family {Q^θ : θ ∈ R^d},
Q^θ is an approximation of the Q-function, or Hamiltonian    [Mehta & M. 2009]

Compute θ* based on observations — without using a system model.
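A minimal Watkins-style Q-learning sketch on the same kind of small controlled queue (my construction). It is tabular and uses discounted cost, whereas the talk's setting is average cost with a parameterized Qθ, so this is a simplification; the model parameters are assumptions. The estimate is built from observed transitions only, with no use of the transition law.

```python
import numpy as np

rng = np.random.default_rng(3)
N, alpha, mu, r = 20, 0.4, 0.6, 2.0   # buffer size, arrival/service probs, service cost (assumptions)
gamma = 0.98                          # discount factor (assumption: discounted rather than average cost)

def step(x, u):
    """Simulator of the queue MDP: returns (cost, next state)."""
    a = rng.random() < alpha
    s = (u == 1) and (rng.random() < mu)
    x_next = min(max(x + int(a) - int(s), 0), N)
    return x + r * u, x_next

Q = np.zeros((N + 1, 2))
x = 0
for n in range(1, 200001):
    u = int(rng.integers(2)) if rng.random() < 0.1 else int(np.argmin(Q[x]))   # epsilon-greedy
    c, x_next = step(x, u)
    target = c + gamma * Q[x_next].min()
    Q[x, u] += (1.0 / (1 + n * 1e-3)) * (target - Q[x, u])                     # diminishing step size
    x = x_next

print("greedy policy (serve = 1):", np.argmin(Q, axis=1))
```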
Architectures for Adaptation & Learning

Reinforcement Learning
Approximating a value function: TD-learning

Value functions: for a given policy U(t) = φ(X(t)),
  η = lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt

Poisson's equation: h is again called a relative value function,
  c(x, u) + D_u h(x) |_{u=φ(x)} = η

TD-learning: given a parameterized family {h^θ : θ ∈ R^d},
  min { ‖h − h^θ‖ : θ ∈ R^d }    [Sutton 1988; Tsitsiklis & Van Roy, 1997]

Compute θ* based on observations — without using a system model.
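A sketch of TD(0) with a linear parameterization hθ = Σ θi ψi for a fixed policy, again on the illustrative single queue. A discounted-cost value function is estimated rather than the solution of Poisson's equation on the slide, and the basis (constant, linear, quadratic in x) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
N, alpha, mu = 20, 0.4, 0.7     # model parameters (assumptions)
gamma = 0.98                    # discounted-cost variant (assumption)

def psi(x):
    """Basis: constant, linear, and quadratic functions of the scaled queue length."""
    z = x / N
    return np.array([1.0, z, z * z])

def step(x):
    """Fixed non-idling policy: serve whenever the buffer is nonempty; cost c(x) = x."""
    u = 1 if x > 0 else 0
    a = rng.random() < alpha
    s = (u == 1) and (rng.random() < mu)
    return x, min(max(x + int(a) - int(s), 0), N)

theta = np.zeros(3)
x = 0
for n in range(1, 200001):
    c, x_next = step(x)
    td_error = c + gamma * theta @ psi(x_next) - theta @ psi(x)
    theta += (1.0 / (1 + n * 1e-3)) * td_error * psi(x)
    x = x_next

print("fitted values h_theta(x) at x = 0, 5, 10:",
      [round(float(theta @ psi(v)), 2) for v in (0, 5, 10)])
```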
Architectures for Adaptation & Learning

Reinforcement Learning
Approximating a value function: how do we choose a basis?

Basis selection:  h^θ(x) = Σ_i θ_i ψ_i(x)   (a small construction sketch follows the figure below)
  ψ1: linearize
  ψ2: fluid model with relaxation
  ψ3: diffusion model with relaxation
  ψ4: mean-field game

Examples: decentralized control, nonlinear control, processor speed-scaling

[Figure: three panels — mean-field game (Agent 4), linearization (optimal policy), and fluid model (approximate relative value function h, fluid value function J*, relative value function h*).]
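A small sketch of the basis construction hθ(x) = Σi θi ψi(x), using the fluid value function from the scalar example earlier as one basis element alongside a constant and a linear term; in practice the weights θ would be fit by the TD- or Q-learning updates on the preceding slides. All specifics are assumptions for illustration.

```python
import numpy as np

mu, alpha = 1.0, 0.6   # illustrative rates, matching the scalar fluid example above

def J_fluid(x):
    """Fluid value function for the single queue: a natural basis element."""
    return x * x / (2.0 * (mu - alpha))

def basis(x):
    """psi_1: constant, psi_2: fluid value function, psi_3: linear correction."""
    return np.array([1.0, J_fluid(x), x])

def h_theta(x, theta):
    """Approximate relative value function h_theta(x) = sum_i theta_i * psi_i(x)."""
    return float(theta @ basis(x))

theta = np.array([0.0, 1.0, 0.5])   # placeholder weights; to be fit by TD- or Q-learning
print([round(h_theta(x, theta), 2) for x in (0, 2, 5, 10)])
```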
Next Steps

[Figure: Nodal Power Prices in NZ ($/MWh) at Otahuhu and Stratford, 4am–7pm. March 25: prices on the order of $50–100. March 26: spikes near $20,000. Source: http://www.electricityinfo.co.nz/]

Next Steps
Next Steps

Complex Systems
Mainly energy

Entropic Grid: advances in systems theory...
  Complex systems: model reduction specialized to tomorrow's grid (short-term operations and long-term planning)
  Resource allocation: controlling supply, storage, and demand (resource allocation with shared constraints)
  Statistics and learning: for planning and forecasting (both rare and common events)
  Economics for an Entropic Grid: incorporate dynamics and uncertainty in a strategic setting.
  How can we create policies that protect participants on both sides of the market, while creating incentives for R&D on renewable energy?
Next Steps

Complex Systems
Mainly energy

How can we create policies that protect participants on both sides of the market, while creating incentives for R&D on renewable energy?
Our community must consider long-term planning and policy, along with traditional systems operations.

Planning and Policy includes Markets & Competition
  Evolution? Too slow!
  What we need is Intelligent Design.
Next Steps

Conclusions
The control community has created many techniques for understanding complex systems, and a valuable philosophy for thinking about control design.
In particular, stylized models can have great value:
  Insight in the formulation of control policies
  Analysis of closed-loop behavior, such as stability via ODE methods
  Architectures for learning algorithms
  Building bridges between the OR, CS, and control disciplines
The ideas surveyed here arose from partnerships with researchers in mathematics, economics, computer science, and operations research.

Besides the many technical open questions, my hope is to extend the application of these ideas to long-range planning, especially in applications to sustainable energy.
Next Steps

References

  S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007.
  S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Second edition, Cambridge University Press, Cambridge Mathematical Library, 2009.
  S. Meyn. Stability and asymptotic optimality of generalized MaxWeight policies. SIAM J. Control Optim., 47(6):3259–3294, 2009.
  V. S. Borkar and S. P. Meyn. The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447–469, 2000.
  S. P. Meyn. Sequencing and routing in multiclass queueing networks. Part II: Workload relaxations. SIAM J. Control Optim., 42(1):178–217, 2003.
  P. G. Mehta and S. P. Meyn. Q-learning and Pontryagin's minimum principle. In Proc. of the 48th IEEE Conf. on Dec. and Control, pp. 3598–3605, Dec. 2009.
  W. Chen, D. Huang, A. A. Kulkarni, J. Unnikrishnan, Q. Zhu, P. Mehta, S. Meyn, and A. Wierman. Approximate dynamic programming using fluid and diffusion approximations with applications to power management. In Proc. of the 48th IEEE Conf. on Dec. and Control, pp. 3575–3580, Dec. 2009.


Control Techniques for Complex Systems

  • 1. Control Techniques for Complex Systems Department of Electrical & Computer Engineering University of Florida Sean P. Meyn Coordinated Science Laboratory and the Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign, USA April 21, 2011 1 / 26
  • 2. Outline Control Techniques Markov Chains FOR and Complex Networks Stochastic Stability P n (x, · ) − π f →0 sup Ex [SτC (f )] < ∞ C π(f ) < ∞ 1 Control Techniques ∆V (x) ≤ −f (x) + bIC (x) Sean Meyn S. P. Meyn and R. L. Tweedie 2 Complex Networks 3 Architectures for Adaptation & Learning 4 Next Steps 2 / 26
  • 3. Control Techniques System model d α = µ σ −Cα + . . . dt d q = 1 µ I −1 (C − . . . 2 dt d θ=q dt ??? Control Techniques? 3 / 26
  • 4. Control Techniques Typical steps to control design Obtain simple model that captures System model essential structure d dt α = µ σ −Cα + . . . d – An equilibrium model if the goal is regulation dt q = 1 µ I −1 (C − . . . 2 d θ=q dt ??? 4 / 26
  • 5. Control Techniques Typical steps to control design Obtain simple model that captures System model essential structure d dt α = µ σ −Cα + . . . d – An equilibrium model if the goal is regulation dt q = 1 µ I −1 (C − . . . 2 d θ=q dt ??? Obtain feedback design, using dynamic programming, LQG, loop shaping, ... Design for performance and reliability Test via simulations and experiments, and refine design 4 / 26
  • 6. Control Techniques Typical steps to control design Obtain simple model that captures System model essential structure d dt α = µ σ −Cα + . . . d – An equilibrium model if the goal is regulation dt q = 1 µ I −1 (C − . . . 2 d θ=q dt ??? Obtain feedback design, using dynamic programming, LQG, loop shaping, ... Design for performance and reliability Test via simulations and experiments, and refine design If these steps fail, we may have to re-engineer the system (e.g., introduce new sensors), and start over. 4 / 26
  • 7. Control Techniques Typical steps to control design Obtain simple model that captures System model essential structure d dt α = µ σ −Cα + . . . d – An equilibrium model if the goal is regulation dt q = 1 µ I −1 (C − . . . 2 d θ=q dt ??? Obtain feedback design, using dynamic programming, LQG, loop shaping, ... Design for performance and reliability Test via simulations and experiments, and refine design If these steps fail, we may have to re-engineer the system (e.g., introduce new sensors), and start over. This point of view is unique to control 4 / 26
  • 8. Control Techniques. Typical steps to scheduling. Inventory model: controlled work-release, controlled routing, uncertain demand. A simplified model of a semiconductor manufacturing facility; similar demand-driven models can be used to model allocation of locational reserves in a power grid. [Figure: network fed by two demand streams, demand 1 and demand 2.] 5 / 26
  • 9. Control Techniques. Typical steps to scheduling. Inventory model: controlled work-release, controlled routing, uncertain demand. A simplified model of a semiconductor manufacturing facility; similar demand-driven models can be used to model allocation of locational reserves in a power grid. Obtain simple model – frequently based on simple statistics to obtain a Markov model. Obtain feedback design based on heuristics, or dynamic programming. Performance evaluation via computation (e.g., Neuts’ matrix-geometric methods). 5 / 26
  • 10. Control Techniques. Typical steps to scheduling. Inventory model: controlled work-release, controlled routing, uncertain demand. A simplified model of a semiconductor manufacturing facility; similar demand-driven models can be used to model allocation of locational reserves in a power grid. Difficulty: a Markov model is not simple enough! Obtain simple model – frequently based on exponential statistics to obtain a Markov model. Obtain feedback design based on heuristics, or dynamic programming. Performance evaluation via computation (e.g., Neuts’ matrix-geometric methods). With the 16 buffers truncated to 0 ≤ x ≤ 10, ... 6 / 26
  • 11. Control Techniques. Typical steps to scheduling. Inventory model: controlled work-release, controlled routing, uncertain demand. A simplified model of a semiconductor manufacturing facility; similar demand-driven models can be used to model allocation of locational reserves in a power grid. Difficulty: a Markov model is not simple enough! Obtain simple model – frequently based on exponential statistics to obtain a Markov model. Obtain feedback design based on heuristics, or dynamic programming. Performance evaluation via computation (e.g., Neuts’ matrix-geometric methods). With the 16 buffers truncated to 0 ≤ x ≤ 10, policy synthesis reduces to a linear program of dimension 11^16! 6 / 26
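
For scale, here is a quick check (a minimal Python sketch, assuming the 11^16 figure counts the 11 buffer levels 0 through 10 across all 16 buffers):

    # 16 buffers, each truncated to the 11 levels 0..10  ->  11**16 joint states
    print(f"{11**16:,}")      # 45,949,729,863,572,161, i.e. about 4.6e16
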
  • 12. Control Techniques. Control-theoretic approach to scheduling: d/dt q = Bu + α. Inventory model: controlled work-release, controlled routing, uncertain demand. q: queue length, evolving on R^16_+. u: scheduling/routing decisions (convex relaxation). α: mean exogenous arrivals of work. B: captures network topology. 7 / 26
  • 13. Control Techniques. Control-theoretic approach to scheduling: d/dt q = Bu + α. Inventory model: controlled work-release, controlled routing, uncertain demand. q: queue length, evolving on R^16_+. u: scheduling/routing decisions (convex relaxation). α: mean exogenous arrivals of work. B: captures network topology. Control-theoretic approach to scheduling: dimension reduced from a linear program of dimension 11^16 ... to an HJB equation of dimension 16. 7 / 26
  • 14. Control Techniques. Control-theoretic approach to scheduling: d/dt q = Bu + α. Inventory model: controlled work-release, controlled routing, uncertain demand. q: queue length, evolving on R^16_+. u: scheduling/routing decisions (convex relaxation). α: mean exogenous arrivals of work. B: captures network topology. Control-theoretic approach to scheduling: dimension reduced from a linear program of dimension 11^16 ... to an HJB equation of dimension 16. Does this solve the problem? 7 / 26
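
To make the fluid model concrete, the following is a minimal simulation sketch in Python. The two-buffer tandem line, the matrix B, and the rates are hypothetical illustrative values, not the 16-buffer network from the slides; the policy is a plain non-idling rule.

    import numpy as np

    # Hypothetical 2-buffer tandem line: work arrives to buffer 1 at rate alpha1,
    # station 1 drains buffer 1 into buffer 2 at rate mu1*u1, station 2 drains
    # buffer 2 at rate mu2*u2, so that dq/dt = B u + alpha.
    mu1, mu2, alpha1 = 2.0, 1.5, 1.0
    B = np.array([[-mu1,  0.0],
                  [ mu1, -mu2]])
    alpha = np.array([alpha1, 0.0])

    def policy(q):
        # non-idling: run each station at full rate while its buffer is non-empty
        return np.array([1.0 if q[0] > 0 else 0.0,
                         1.0 if q[1] > 0 else 0.0])

    q, dt = np.array([8.0, 3.0]), 1e-3
    for _ in range(20_000):                 # integrate 20 time units, forward Euler
        q = np.maximum(q + dt * (B @ policy(q) + alpha), 0.0)
    print("fluid state after 20 time units:", q)
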
  • 15. Complex Networks. [Figure: a network shown at three levels of loading: uncongested, congested, highly congested.] 8 / 26
  • 16. Complex Networks. [Figure: a network shown at three levels of loading: uncongested, congested, highly congested.] First, a review of some control theory... 8 / 26
  • 17. Complex Networks. Dynamic Programming Equations. Deterministic model: ẋ = f(x, u). 9 / 26
  • 18. Complex Networks. Dynamic Programming Equations. Deterministic model: ẋ = f(x, u). Controlled generator: D_u h(x) = d/dt h(x(t)) |_{t=0}, with x(0) = x, u(0) = u. 9 / 26
  • 19. Complex Networks. Dynamic Programming Equations. Deterministic model: ẋ = f(x, u). Controlled generator: D_u h(x) = d/dt h(x(t)) |_{t=0} = f(x, u) · ∇h(x), with x(0) = x, u(0) = u. 9 / 26
  • 20. Complex Networks. Dynamic Programming Equations. Deterministic model: ẋ = f(x, u). Controlled generator: D_u h(x) = d/dt h(x(t)) |_{t=0} = f(x, u) · ∇h(x), with x(0) = x, u(0) = u. Minimal total cost: J*(x) = inf_U ∫_0^∞ c(x(t), u(t)) dt, x(0) = x. HJB equation: min_u [ c(x, u) + D_u J*(x) ] = 0. 9 / 26
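
A minimal worked example (a single fluid buffer, not from the slides) shows how the HJB equation is satisfied: take ẋ = −μu + α with u ∈ [0, 1], μ > α > 0, and cost c(x, u) = x. Draining at full rate empties the buffer at time x/(μ − α), so J*(x) = ∫_0^{x/(μ−α)} (x − (μ − α)t) dt = x²/(2(μ − α)). Then D_u J*(x) = (−μu + α) · x/(μ − α), and the minimum over u of c(x, u) + D_u J*(x) is attained at u = 1, giving x − x = 0, exactly as the HJB equation requires.
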
  • 21. Complex Networks. Dynamic Programming Equations. Diffusion model: dX = f(X, U) dt + σ(X) dN. Controlled generator: D_u h(x) = d/dt E[h(X(t))] |_{t=0} = f(x, u) · ∇h(x) + (1/2) trace( σ(x)ᵀ ∇²h(x) σ(x) ), with X(0) = x, U(0) = u. 10 / 26
  • 22. Complex Networks. Dynamic Programming Equations. Diffusion model: dX = f(X, U) dt + σ(X) dN. Controlled generator: D_u h(x) = d/dt E[h(X(t))] |_{t=0} = f(x, u) · ∇h(x) + (1/2) trace( σ(x)ᵀ ∇²h(x) σ(x) ), with X(0) = x, U(0) = u. Minimal average cost: η* = inf_U lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt. 10 / 26
  • 23. Complex Networks. Dynamic Programming Equations. Diffusion model: dX = f(X, U) dt + σ(X) dN. Controlled generator: D_u h(x) = d/dt E[h(X(t))] |_{t=0} = f(x, u) · ∇h(x) + (1/2) trace( σ(x)ᵀ ∇²h(x) σ(x) ), with X(0) = x, U(0) = u. Minimal average cost: η* = inf_U lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt. ACOE (Average Cost Optimality Equation): min_u [ c(x, u) + D_u h*(x) ] = η*; h* is the relative value function. 10 / 26
  • 24. Complex Networks. Dynamic Programming Equations. MDP model: X(t+1) − X(t) = f(X(t), U(t), N(t+1)). Controlled generator: D_u h(x) = E[h(X(1)) − h(X(0))] = E[h(x + f(x, u, N))] − h(x). 11 / 26
  • 25. Complex Networks. Dynamic Programming Equations. MDP model: X(t+1) − X(t) = f(X(t), U(t), N(t+1)). Controlled generator: D_u h(x) = E[h(X(1)) − h(X(0))] = E[h(x + f(x, u, N))] − h(x). Minimal average cost: η* = inf_U lim_{T→∞} (1/T) Σ_{t=0}^{T−1} c(X(t), U(t)). ACOE (Average Cost Optimality Equation): min_u [ c(x, u) + D_u h*(x) ] = η*; h* is the relative value function. 11 / 26
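
A minimal sketch of solving the ACOE numerically by relative value iteration, on a hypothetical single-buffer example (the buffer size, arrival and service probabilities, and cost are illustrative, not from the slides):

    import numpy as np

    # Hypothetical single-buffer example: level x in {0,...,N}; one arrival per
    # slot w.p. p_arr; action u selects the service probability mu[u].
    N, p_arr = 30, 0.3
    mu = {0: 0.0, 1: 0.5}

    def cost(x, u):
        return x + 2.0 * u                  # holding cost plus control effort

    def transitions(x, u):
        """(probability, next state) pairs for one slot."""
        out = []
        for arr, pa in ((1, p_arr), (0, 1 - p_arr)):
            for srv, ps in ((1, mu[u]), (0, 1 - mu[u])):
                out.append((pa * ps, min(N, max(0, x + arr - srv))))
        return out

    # Relative value iteration for the ACOE  min_u [c(x,u) + D_u h(x)] = eta
    h = np.zeros(N + 1)
    for _ in range(2000):
        Th = np.array([min(cost(x, u) + sum(p * h[y] for p, y in transitions(x, u))
                           for u in (0, 1)) for x in range(N + 1)])
        eta, h = Th[0], Th - Th[0]          # normalize at the reference state x = 0
    print("estimated minimal average cost:", eta)
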
  • 26. Complex Networks. Approximate Dynamic Programming. ODE model from the MDP model X(t+1) − X(t) = f(X(t), U(t), N(t+1)). Mean drift: f(x, u) = E[ X(t+1) − X(t) | X(t) = x, U(t) = u ]. 12 / 26
  • 27. Complex Networks. Approximate Dynamic Programming. ODE model from the MDP model X(t+1) − X(t) = f(X(t), U(t), N(t+1)). Mean drift: f(x, u) = E[ X(t+1) − X(t) | X(t) = x, U(t) = u ]. Fluid model: ẋ(t) = f(x(t), u(t)). 12 / 26
  • 28. Complex Networks. Approximate Dynamic Programming. ODE model from the MDP model X(t+1) − X(t) = f(X(t), U(t), N(t+1)). Mean drift: f(x, u) = E[ X(t+1) − X(t) | X(t) = x, U(t) = u ]. Fluid model: ẋ(t) = f(x(t), u(t)). First-order Taylor series approximation: D_u h(x) = E[h(x + f(x, u, N))] − h(x) ≈ f(x, u) · ∇h(x). 12 / 26
  • 29. Complex Networks. Approximate Dynamic Programming. ODE model from the MDP model X(t+1) − X(t) = f(X(t), U(t), N(t+1)). Mean drift: f(x, u) = E[ X(t+1) − X(t) | X(t) = x, U(t) = u ]. Fluid model: ẋ(t) = f(x(t), u(t)). First-order Taylor series approximation: D_u h(x) = E[h(x + f(x, u, N))] − h(x) ≈ f(x, u) · ∇h(x). A second-order Taylor series expansion leads to a diffusion model. 12 / 26
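
The quality of the first-order approximation can be checked directly on a hypothetical single-buffer example (illustrative parameters, not from the slides). For h(x) = x², arrival probability 0.3 and service probability 0.5, the exact one-step drift of h under the serving action is −0.4x + 0.5 away from the boundary, while the fluid approximation gives −0.4x, so the relative error vanishes as x grows:

    # Compare the exact generator D_u h(x) = E[h(x + increment)] - h(x) with its
    # first-order fluid approximation  f(x,u) * h'(x)  for h(x) = x**2, u = serve.
    p_arr, mu = 0.3, 0.5

    def generator(x):
        out = 0.0
        for arr, pa in ((1, p_arr), (0, 1 - p_arr)):
            for srv, ps in ((1, mu), (0, 1 - mu)):
                out += pa * ps * (x + arr - srv) ** 2
        return out - x ** 2

    drift = p_arr - mu                      # mean drift away from the boundary
    for x in (5, 20, 80):
        print(x, generator(x), drift * 2 * x)
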
  • 30. Complex Networks. ADP for Stochastic Networks: conclusions as of April 21, 2011. Stochastic model: Q(t+1) − Q(t) = B(t+1) U(t) + A(t+1). Fluid model: d/dt q(t) = B u(t) + α. Cost c(x, u) = |x|. Relative value function h*; total-cost value function J*. 13 / 26
  • 31. Complex Networks. ADP for Stochastic Networks: conclusions as of April 21, 2011. Stochastic model: Q(t+1) − Q(t) = B(t+1) U(t) + A(t+1). Fluid model: d/dt q(t) = B u(t) + α. Cost c(x, u) = |x|. Relative value function h*; total-cost value function J*. Inventory model: controlled work-release, controlled routing, uncertain demand. q: queue length, evolving on R^16_+. u: scheduling/routing decisions (convex relaxation). α: mean exogenous arrivals of work. B: captures network topology. 13 / 26
  • 32. Complex Networks. ADP for Stochastic Networks: conclusions as of April 21, 2011. Stochastic model: Q(t+1) − Q(t) = B(t+1) U(t) + A(t+1). Fluid model: d/dt q(t) = B u(t) + α. Cost c(x, u) = |x|. Relative value function h*; total-cost value function J*. Key conclusions – analytical: Stability of q implies stochastic stability of Q (Dai, Dai & M. 1995). h*(x) ≈ J*(x) for large |x| (M. 1996–2011). In many cases, the translation of the optimal policy for q is approximately optimal, with logarithmic regret (M. 2005 & 2009). 14 / 26
  • 33. Complex Networks. ADP for Stochastic Networks: conclusions as of April 21, 2011. Stochastic model: Q(t+1) − Q(t) = B(t+1) U(t) + A(t+1). Fluid model: d/dt q(t) = B u(t) + α. Cost c(x, u) = |x|. Relative value function h*; total-cost value function J*. Key conclusions – engineering: Stability of q implies stochastic stability of Q. Simple decentralized policies based on q (Tassiulas, 1995–). Workload relaxation for model reduction (M. 2003–), following “heavy traffic” theory: Laws, Kelly, Harrison, Dai, ... Intuition regarding the structure of good policies. 15 / 26
  • 34. Complex Networks. ADP for Stochastic Networks: Workload Relaxations. [Figure: two-dimensional workload region with boundaries R_STO and R*, axes w1 and w2.] Inventory model: controlled work-release, controlled routing, uncertain demand. Workload process: W evolves on R². Relaxation: only lower bounds on rates are preserved. Effective cost: c̄(w) is the minimum of c(x) over all x consistent with w. 16 / 26
  • 35. Complex Networks. ADP for Stochastic Networks: Workload Relaxations. [Figure: two-dimensional workload region with boundaries R_STO and R*, axes w1 and w2.] Inventory model: controlled work-release, controlled routing, uncertain demand. Workload process: W evolves on R². Relaxation: only lower bounds on rates are preserved. Effective cost: c̄(w) is the minimum of c(x) over all x consistent with w. Optimal policy for the fluid relaxation: non-idling on the region R*. Optimal policy for the stochastic relaxation: introduce hedging. 16 / 26
  • 36. Complex Networks. ADP for Stochastic Networks: policy translation. [Figure: workload region with boundaries R_STO and R*.] Inventory model: controlled work-release, controlled routing, uncertain demand. Complete policy synthesis: 1. Optimal control of the relaxation. 2. Translation to the physical system: 2a. achieve the approximation c(Q(t)) ≈ c̄(W(t)); 2b. address boundary constraints ignored in fluid approximations. 17 / 26
  • 37. Complex Networks. ADP for Stochastic Networks: policy translation. [Figure: workload region with boundaries R_STO and R*.] Inventory model: controlled work-release, controlled routing, uncertain demand. Complete policy synthesis: 1. Optimal control of the relaxation. 2. Translation to the physical system: 2a. achieve the approximation c(Q(t)) ≈ c̄(W(t)); 2b. address boundary constraints ignored in fluid approximations, achieved using safety stocks. 17 / 26
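
As a toy illustration of hedging (this is the classic hedging-point idea for a single product, not the workload-relaxation synthesis described above; all values are made up): the surplus is kept near a target x̄, which acts as a safety stock against demand variability.

    import random

    # Illustrative hedging-point rule for a hypothetical single-product
    # make-to-stock system: surplus x = production minus demand; produce at
    # full rate mu while x is below the hedging point xbar, otherwise track demand.
    mu, d_rate, xbar = 1.2, 1.0, 3.0
    dt, T = 0.01, 2_000.0
    x, total_cost = 0.0, 0.0
    random.seed(1)
    for _ in range(int(T / dt)):
        u = mu if x < xbar else d_rate              # hedging-point policy
        d = 2.0 * d_rate * random.random()          # noisy demand, mean d_rate
        x += (u - d) * dt
        total_cost += (x if x >= 0 else -5.0 * x) * dt   # backlog costs 5x holding
    print("average cost with hedging point", xbar, ":", total_cost / T)
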
  • 38. Architectures for Adaptation & Learning. [Figure montage: singular perturbations; mean-field games (individual state and ensemble state); workload relaxations for a five-station network with barely controllable directions; fluid and diffusion models; average cost of standard value iteration initialized with a quadratic vs. with the optimal fluid value function; an optimal policy.] Adaptation & Learning 18 / 26
  • 39. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via Q-learning. ACOE: min_u [ c(x, u) + D_u h*(x) ] = η*. h*: relative value function; η*: minimal average cost. 19 / 26
  • 40. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via Q-learning. ACOE: min_u [ c(x, u) + D_u h*(x) ] = η*. h*: relative value function; η*: minimal average cost. “Q-function”: Q*(x, u) = c(x, u) + D_u h*(x). Watkins 1989 ... “Machine Intelligence Lab” @ ece.ufl.edu. 19 / 26
  • 41. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via Q-learning. ACOE: min_u [ c(x, u) + D_u h*(x) ] = η*. h*: relative value function; η*: minimal average cost. “Q-function”: Q*(x, u) = c(x, u) + D_u h*(x). Watkins 1989 ... “Machine Intelligence Lab” @ ece.ufl.edu. Q-learning: given a parameterized family {Qθ : θ ∈ R^d}, Qθ is an approximation of the Q-function, or Hamiltonian (Mehta & M. 2009). 19 / 26
  • 42. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via Q-learning. ACOE: min_u [ c(x, u) + D_u h*(x) ] = η*. h*: relative value function; η*: minimal average cost. “Q-function”: Q*(x, u) = c(x, u) + D_u h*(x). Watkins 1989 ... “Machine Intelligence Lab” @ ece.ufl.edu. Q-learning: given a parameterized family {Qθ : θ ∈ R^d}, Qθ is an approximation of the Q-function, or Hamiltonian (Mehta & M. 2009). Compute θ* based on observations — without using a system model. 19 / 26
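
For contrast with the parameterized, average-cost scheme cited above, here is a minimal sketch of the classical tabular, discounted Q-learning recursion of Watkins, run on the hypothetical single-buffer example used earlier (all parameters illustrative):

    import random

    N, p_arr, gamma = 30, 0.3, 0.95
    mu = {0: 0.0, 1: 0.5}

    def cost(x, u):
        return x + 2.0 * u

    def step(x, u):
        arr = 1 if random.random() < p_arr else 0
        srv = 1 if random.random() < mu[u] else 0
        return min(N, max(0, x + arr - srv))

    Q = [[0.0, 0.0] for _ in range(N + 1)]
    x = 0
    random.seed(0)
    for n in range(1, 200_001):
        if random.random() < 0.1:                       # epsilon-greedy exploration
            u = random.choice((0, 1))
        else:
            u = min((0, 1), key=lambda a: Q[x][a])      # cost-minimizing action
        y = step(x, u)
        target = cost(x, u) + gamma * min(Q[y])
        Q[x][u] += (target - Q[x][u]) / (1.0 + 0.001 * n)   # decaying step size
        x = y
    print("greedy policy:", [min((0, 1), key=lambda a: Q[s][a]) for s in range(N + 1)])
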
  • 43. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via TD-learning. Value functions: for a given policy U(t) = φ(X(t)), η = lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt. Poisson’s equation: h is again called a relative value function, [ c(x, u) + D_u h(x) ] |_{u=φ(x)} = η. 20 / 26
  • 44. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via TD-learning. Value functions: for a given policy U(t) = φ(X(t)), η = lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt. Poisson’s equation: h is again called a relative value function, [ c(x, u) + D_u h(x) ] |_{u=φ(x)} = η. TD-learning: given a parameterized family {hθ : θ ∈ R^d}, solve min{ ‖h − hθ‖ : θ ∈ R^d } (Sutton 1988; Tsitsiklis & Van Roy, 1997). 20 / 26
  • 45. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function via TD-learning. Value functions: for a given policy U(t) = φ(X(t)), η = lim_{T→∞} (1/T) ∫_0^T c(X(t), U(t)) dt. Poisson’s equation: h is again called a relative value function, [ c(x, u) + D_u h(x) ] |_{u=φ(x)} = η. TD-learning: given a parameterized family {hθ : θ ∈ R^d}, solve min{ ‖h − hθ‖ : θ ∈ R^d } (Sutton 1988; Tsitsiklis & Van Roy, 1997). Compute θ* based on observations — without using a system model. 20 / 26
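
A minimal sketch of TD(0) with a linear parameterization hθ(x) = θ1 ψ1(x) + θ2 ψ2(x), evaluating a fixed policy on the same hypothetical single-buffer example. For simplicity the sketch uses a discounted cost rather than the average-cost Poisson-equation form on the slide; the basis and step sizes are illustrative choices:

    import random

    N, p_arr, mu_serve, gamma = 30, 0.3, 0.5, 0.95

    def features(x):
        return (x / N, (x / N) ** 2)        # psi_1, psi_2: normalized polynomial basis

    theta = [0.0, 0.0]

    def h(x):
        f = features(x)
        return theta[0] * f[0] + theta[1] * f[1]

    x = 0
    random.seed(0)
    for n in range(1, 300_001):             # policy "always serve"
        arr = 1 if random.random() < p_arr else 0
        srv = 1 if random.random() < mu_serve else 0
        y = min(N, max(0, x + arr - srv))
        delta = x + gamma * h(y) - h(x)     # TD error with holding cost c(x) = x
        step = 0.5 / (1.0 + n / 50_000)
        f = features(x)
        theta = [theta[i] + step * delta * f[i] for i in (0, 1)]
        x = y
    print("fitted parameters theta:", theta)
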
  • 46. Architectures for Adaptation & Learning Reinforcement Learning Approximating a value function: How do we choose a basis? 21 / 26
  • 47. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function: how do we choose a basis? Basis selection: hθ(x) = Σ_i θ_i ψ_i(x). ψ1: linearize; ψ2: fluid model with relaxation; ψ3: diffusion model with relaxation; ψ4: mean-field game. 21 / 26
  • 48. Architectures for Adaptation & Learning. Reinforcement Learning: approximating a value function: how do we choose a basis? Basis selection: hθ(x) = Σ_i θ_i ψ_i(x). ψ1: linearize; ψ2: fluid model with relaxation; ψ3: diffusion model with relaxation; ψ4: mean-field game. Examples: decentralized control, nonlinear control, processor speed-scaling. [Figure: optimal policy; approximate relative value function h; fluid value function J*; relative value function h*; shown for mean-field game, linearization, and fluid-model examples.] 21 / 26
  • 49. Next Steps. [Figure: nodal power prices in New Zealand ($/MWh) at Otahuhu and Stratford, from 4am to 7pm; on March 25 the price scale runs to roughly $100/MWh, on March 26 to roughly $20,000/MWh. Source: http://www.electricityinfo.co.nz/] Next Steps 22 / 26
  • 51. Next Steps Complex Systems Mainly energy Entropic Grid: Advances in systems theory... Complex systems: Model reduction specialized to tomorrow’s grid Short term operations and long-term planning Resource allocation: Controlling supply, storage, and demand Resource allocation with shared constraints. Statistics and learning: For planning and forecasting Both rare and common events 23 / 26
  • 52. Next Steps Complex Systems Mainly energy Entropic Grid: Advances in systems theory... Complex systems: Model reduction specialized to tomorrow’s grid Short term operations and long-term planning Resource allocation: Controlling supply, storage, and demand Resource allocation with shared constraints. Statistics and learning: For planning and forecasting Both rare and common events Economics for an Entropic Grid: Incorporate dynamics and uncertainty in a strategic setting. How to create policies to protect participants on both sides of the market, while creating incentives for R&D on renewable energy? 23 / 26
  • 53. Next Steps Complex Systems Mainly energy How to create policies to protect participants on both sides of the market, while creating incentives for R&D on renewable energy? Our community must consider long-term planning and policy, along with traditional systems operations 24 / 26
  • 54. Next Steps Complex Systems Mainly energy How to create policies to protect participants on both sides of the market, while creating incentives for R&D on renewable energy? Our community must consider long-term planning and policy, along with traditional systems operations Planning and Policy, includes Markets & Competition 24 / 26
  • 55. Next Steps Complex Systems Mainly energy How to create policies to protect participants on both sides of the market, while creating incentives for R&D on renewable energy? Our community must consider long-term planning and policy, along with traditional systems operations Planning and Policy, includes Markets & Competition Evolution? 24 / 26
  • 56. Next Steps Complex Systems Mainly energy How to create policies to protect participants on both sides of the market, while creating incentives for R&D on renewable energy? Our community must consider long-term planning and policy, along with traditional systems operations Planning and Policy, includes Markets & Competition Evolution? Too slow! 24 / 26
  • 57. Next Steps Complex Systems Mainly energy How to create policies to protect participants on both sides of the market, while creating incentives for R&D on renewable energy? Our community must consider long-term planning and policy, along with traditional systems operations Planning and Policy, includes Markets & Competition Evolution? Too slow! What we need is Intelligent Design 24 / 26
  • 58. Next Steps Conclusions The control community has created many techniques for understanding complex systems, and a valuable philosophy for thinking about control design 25 / 26
  • 59. Next Steps Conclusions The control community has created many techniques for understanding complex systems, and a valuable philosophy for thinking about control design In particular, stylized models can have great value: Insight in formulation of control policies Analysis of closed loop behavior, such as stability via ODE methods Architectures for learning algorithms Building bridges between OR, CS, and control disciplines The ideas surveyed here arose from partnerships with researchers in mathematics, economics, computer science, and operations research. 25 / 26
  • 60. Next Steps Conclusions The control community has created many techniques for understanding complex systems, and a valuable philosophy for thinking about control design In particular, stylized models can have great value: Insight in formulation of control policies Analysis of closed loop behavior, such as stability via ODE methods Architectures for learning algorithms Building bridges between OR, CS, and control disciplines The ideas surveyed here arose from partnerships with researchers in mathematics, economics, computer science, and operations research. Besides the many technical open questions, my hope is to extend the application of these ideas to long-range planning, especially in applications to sustainable energy. 25 / 26
  • 61. Next Steps References S. P. Meyn. Control Techniques for Complex Networks. Cambridge University Press, Cambridge, 2007. S. P. Meyn and R. L. Tweedie. Markov chains and stochastic stability. Second edition, Cambridge University Press – Cambridge Mathematical Library, 2009. S. Meyn. Stability and asymptotic optimality of generalized MaxWeight policies. SIAM J. Control Optim., 47(6):3259–3294, 2009. V. S. Borkar and S. P. Meyn. The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447–469, 2000. S. P. Meyn. Sequencing and routing in multiclass queueing networks. Part II: Workload relaxations. SIAM J. Control Optim., 42(1):178–217, 2003. P. G. Mehta and S. P. Meyn. Q-learning and Pontryagin’s minimum principle. In Proc. of the 48th IEEE Conf. on Dec. and Control, pp. 3598–3605, Dec. 2009. W. Chen, D. Huang, A. A. Kulkarni, J. Unnikrishnan, Q. Zhu, P. Mehta, S. Meyn, and A. Wierman. Approximate dynamic programming using fluid and diffusion approximations with applications to power management. In Proc. of the 48th IEEE Conf. on Dec. and Control, pp. 3575–3580, Dec. 2009. 26 / 26