Solving Linear Optimization Problems with MOSEK

Bo Jensen and Erling D. Andersen
MOSEK ApS,
Fruebjergvej 3, Box 16, 2100 Copenhagen, Denmark.
Email: bo.jensen@mosek.com
http://www.mosek.com

INFORMS Annual Meeting, Seattle, Nov. 7, 2007
Introduction
Topics

- The problem:

      (P)  min  c^T x
           st   Ax = b,
                x ≥ 0.

- The linear optimizers:
  - Interior-point optimizer (not the main focus of this talk).
  - Simplex optimizer.
- What are the recent improvements?
- What is the (relative) performance?
The linear optimizer

The general flow:
- Presolve.
- Form the reduced primal or dual.
- Scale (optimizer specific).
- Optimize (interior-point or simplex).
- Basis identification (interior-point only).
- Undo scaling and dualizing.
- Postsolve.
The simplex optimizers
What makes a good simplex optimizer?

- Exploit sparsity (i.e. the LU, FTRAN, and BTRAN routines).
- Exploit problem-dependent structure.
- Choose the right path (i.e. a good pricing strategy).
- Long steps (i.e. avoid degeneracy).
- Numerical stability (i.e. reliable and consistent results).
- Fast hotstarts (i.e. MIP and other hotstart applications).
- Other tricks.
MOSEK simplex overview

- Primal and dual simplex optimizer.
  - Efficient cold start and warm start.
  - Crashes an initial basis.
  - Multiple pricing options:
    - Full (Dantzig).
    - Partial.
    - Approximate/exact steepest edge.
    - Hybrid.
  - Degeneracy handling.
- Revised simplex algorithm + many enhancements.
- Many enhancements still possible!
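Full (Dantzig) pricing picks the entering variable with the most negative reduced cost. A minimal sketch (the helper and the toy data below are hypothetical, not MOSEK's API): compute z_j = c_j - y^T A_j over the nonbasic columns and return the winner, or None if the basis is already optimal.

```python
def dantzig_pricing(y, c_N, N_cols, tol=1e-9):
    """Full (Dantzig) pricing: y = B^-T c_B, c_N = nonbasic costs,
    N_cols[j] = nonbasic column A_j. Returns the entering index or None."""
    best_j, best_z = None, -tol
    for j, (cj, col) in enumerate(zip(c_N, N_cols)):
        z = cj - sum(yi * aij for yi, aij in zip(y, col))
        if z < best_z:               # most negative reduced cost wins
            best_j, best_z = j, z
    return best_j

# Toy data (illustrative only): y already computed from the current basis.
y = [1.0, 0.0]
c_N = [2.0, 0.5]
N_cols = [[1.0, 1.0], [3.0, 2.0]]
print(dantzig_pricing(y, c_N, N_cols))   # column 1 has z = 0.5 - 3 = -2.5
```

Partial pricing would scan only a subset of the nonbasic columns per iteration, trading pricing quality for speed.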
Exploiting sparsity aggressively

- Simplex algorithms require the solution of the linear equation systems

      B f = A:j  and  B^T g = ei

  in each iteration.
- Assume a sparse LU factorization of the basis, B = LU.
- f can be computed as follows: solve

      L f̄ = A:j

  and then

      U f = f̄.

- A simple implementation requires O(nz(L) + nz(U)) flops.
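The two-stage solve can be illustrated with a minimal dense sketch (toy numbers, not MOSEK's sparse code): with B = LU, first forward-substitute L f̄ = A:j, then back-substitute U f = f̄.

```python
def solve_lu(L, U, b):
    """Solve (LU) x = b by forward then backward substitution."""
    n = len(b)
    # Forward substitution: L y = b (L lower triangular).
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: U x = y (U upper triangular).
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

# Hypothetical 2x2 basis B = LU, with L unit lower triangular.
L = [[1.0, 0.0], [2.0, 1.0]]
U = [[2.0, 1.0], [0.0, 3.0]]
b = [4.0, 11.0]        # the entering column A:j
print(solve_lu(L, U, b))   # → [1.5, 1.0]
```

A dense loop like this costs O(n^2); the point of the slide is that a sparse implementation should instead cost O(nz(L) + nz(U)), and ideally less when the right-hand side is sparse.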
Exploiting sparsity aggressively (continued)

- Consider the simple example:

      [ 1       ] [ f̄1 ]   [ 0 ]
      [ 0  1    ] [ f̄2 ] = [ x ]
      [ x  0  1 ] [ f̄3 ]   [ 0 ]

- Clearly sparsity in the RHS can be exploited! (done extensively in MOSEK).
- Gilbert and Peierls [GIL:88] demonstrate how to solve the triangular system in O(minimal number of flops).
- Aim: solves with L and U and updates to the LU should run in O(minimal number of flops), not in O(m) for instance.
- Drawback: both L and U must be stored row- and column-wise because solves with L^T and U^T are required too.
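The RHS-sparsity idea can be sketched in a Gilbert-Peierls style (a hedged pure-Python illustration, not MOSEK's implementation): a symbolic DFS from the nonzeros of the right-hand side finds exactly which entries of the solution can become nonzero, and the numeric phase then touches only those columns of L.

```python
def sparse_lower_solve(L_cols, b):
    """Solve L x = b for unit lower-triangular L with sparse b.
    L_cols[j]: list of (i, v) strictly-below-diagonal entries of column j.
    b: dict {row: value} of right-hand-side nonzeros.
    Returns x as a dict of its nonzero entries."""
    seen, order = set(), []

    def dfs(j):                      # post-order DFS over the pattern of L
        seen.add(j)
        for i, _ in L_cols.get(j, ()):
            if i not in seen:
                dfs(i)               # recursion: fine for a small sketch
        order.append(j)

    for j in b:                      # symbolic phase: reachability from b
        if j not in seen:
            dfs(j)

    x = dict(b)
    for j in reversed(order):        # topological order: dependencies first
        xj = x.get(j, 0.0)
        if xj:
            for i, v in L_cols.get(j, ()):
                x[i] = x.get(i, 0.0) - v * xj
    return {i: v for i, v in x.items() if v}

# The slide's 3x3 example: one off-diagonal entry in column 1, b = (0, x, 0);
# only f̄2 is ever touched, regardless of the matrix dimension.
print(sparse_lower_solve({0: [(2, 4.0)]}, {1: 7.0}))   # → {1: 7.0}
```

The work is proportional to the number of entries actually visited, which is the "O(minimal number of flops)" behaviour the slide aims for.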
Primal (dual) Degeneracy

The simplex optimizer may take very small or zero step sizes. Why?

- Primal step size δp:

      l_B ≤ x_B − δp B^-1 a_q ≤ u_B

- Basic variables on a bound may imply a zero primal step.
- Dual step size δd:

      c_j − y^T A_j − (+) δd (e_i^T B^-1 N)_j ≥ 0   ∀ j ∈ N_L
      c_j − y^T A_j − (+) δd (e_i^T B^-1 N)_j ≤ 0   ∀ j ∈ N_U

- Nonbasic variables with zero reduced cost may imply a zero dual step.

Degeneracy poses both a theoretical and a practical problem for the simplex optimizer!

What are our options? One approach is to perturb l_j and u_j (c_j).
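The primal ratio test can be made concrete with toy numbers (illustrative only): δp is the largest step keeping l_B ≤ x_B − δp d ≤ u_B, where d = B^-1 a_q, and a basic variable sitting exactly on a bound with d pointing outward forces δp = 0.

```python
def primal_step_size(xB, lB, uB, d, tol=1e-12):
    """Largest δp ≥ 0 with lB ≤ xB − δp*d ≤ uB (elementwise ratio test)."""
    step = float("inf")
    for xi, li, ui, di in zip(xB, lB, uB, d):
        if di > tol:                 # xi decreases; limited by its lower bound
            step = min(step, (xi - li) / di)
        elif di < -tol:              # xi increases; limited by its upper bound
            step = min(step, (ui - xi) / -di)
    return step

# Non-degenerate basis: strictly interior values allow a positive step.
print(primal_step_size([2.0, 3.0], [0.0, 0.0], [5.0, 5.0], [1.0, -1.0]))  # → 2.0
# Degenerate basis: the second variable is at its lower bound 0 with d > 0,
# so the ratio test returns δp = 0 and the iteration makes no progress.
print(primal_step_size([2.0, 0.0], [0.0, 0.0], [5.0, 5.0], [1.0, 1.0]))   # → 0.0
```

Perturbing l_j and u_j nudges such variables off their bounds, so the ratio test returns small positive steps instead of zeros.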
Primal (dual) Degeneracy (continued)

MOSEK 5 has been improved on degenerate problems:

- Better and more aggressive perturbation scheme.
- Sparsity issues are important (very tricky).
- Clean up perturbations with the dual (primal) simplex.
- Many examples where "tailed" solves are substantially reduced.
- Still room for improvement.
Dual bound flipping idea used more aggressively

Dual step size δd:

      c_j − y^T A_j − (+) δd (e_i^T B^-1 N)_j ≥ 0   ∀ j ∈ N_L
      c_j − y^T A_j − (+) δd (e_i^T B^-1 N)_j ≤ 0   ∀ j ∈ N_U

A ranged variable, i.e. −∞ < l_j < x_j < u_j < ∞, may not be binding in the dual min-ratio test if profitable.

- This involves flipping nonbasic variables to the opposite bound to remain dual feasible, and costs one extra solve.
- Longer dual step lengths.
- Reduces degeneracy.
- Fewer iterations.
- More flexibility in pivot choice (i.e. potentially more stable).
- Improves sparsity of the basis when degenerate! (i.e. if x_Bi becomes feasible, no basis exchange is needed).
Dual bound flipping idea used more aggressively (continued)

Bound flipping examples:

  Problem      Rows     Cols   Iter (NB)   Iter (WB)   Time (NB)   Time (WB)
  osa-60      10280   232966       6938        5111       58.12        8.84
  world       34506    32734      54566       32606      218.81       50.03
  pds-40      66844   212859      34274       26599       96.51       18.48
  ken-18     105127   154699     151203       51452      258.18       13.92
  client      27216    20567      80555       63660      208.40       84.09

  WB = MOSEK 5 dual simplex with bound flips
  NB = MOSEK 5 dual simplex with no bound flips
Numerical stability

Introduction
The simplex optimizers
What makes a good simplex optimizer ?
MOSEK simplex-overview
Exploiting sparsity aggressively
Primal (dual) Degeneracy
Dual bound flipping idea used more aggressively
Numerical stability
Network optimizer
Computational results
Conclusions

s   Improving numerical stability.
    x   Moved the LU update before the solution update.
        s   Saves one solve with L when computing eiᵀB⁻¹ [GOL:77].
        s   A more stable approach.
    x   Better handling of singularities (singular variables are
        temporarily fixed).
    x   Switch to a safe mode if the iteration is deemed unstable.

                                                                    14 / 26
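As a toy illustration of the two systems behind the remark above — the FTRAN solve B f = a_q and the BTRAN solve Bᵀ g = e_i, whose computation the LU-update reordering makes cheaper — here is a minimal dense sketch. The matrix and entering column are invented for illustration; a real simplex code keeps sparse LU factors of B and updates them rather than calling a dense solver each iteration.

```python
import numpy as np

# Toy basis matrix B (in a real code B is sparse and factorized as B = LU).
B = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 4.0]])

# FTRAN: solve B f = a_q for the entering column a_q.
a_q = np.array([1.0, 0.0, 2.0])
f = np.linalg.solve(B, a_q)

# BTRAN: the pivot row ei^T B^-1 is obtained from one solve with B^T,
# i.e. B^T g = e_i.
i = 1
e_i = np.zeros(3)
e_i[i] = 1.0
g = np.linalg.solve(B.T, e_i)
```

The row eiᵀB⁻¹ is then simply gᵀ, so a single transposed solve delivers the whole pivot row.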
Network optimizer

Introduction
The simplex optimizers
What makes a good simplex optimizer ?
MOSEK simplex-overview
Exploiting sparsity aggressively
Primal (dual) Degeneracy
Dual bound flipping idea used more aggressively
Numerical stability
Network optimizer
Computational results
Conclusions

MOSEK 5 features a network simplex optimizer.

s   Solves pure network flow problems (i.e. LPs whose columns have
    two non-zeros, one +1 and one -1).
s   Can extract an embedded network structure from a model (i.e. a
    network with side constraints).
s   Using the standard interface, only one parameter has to be set.
s   Huge problems can be solved in limited time; for instance, a
    problem with 8 million variables can be solved in less than
    200 seconds.

                                                                    15 / 26
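A pure network flow problem of the kind described above can be written down directly from its node-arc incidence matrix, whose columns have exactly one +1 and one -1. The sketch below builds a tiny min-cost flow LP and solves it with SciPy's `linprog`; the network data is invented for illustration, and MOSEK's own API is not shown.

```python
import numpy as np
from scipy.optimize import linprog

# Min-cost flow as the LP  min c^T x  s.t.  Ax = b, 0 <= x <= u,
# where A is the node-arc incidence matrix.
# Nodes: 0 (source, supply 4), 1 (transshipment), 2 (sink, demand 4).
arcs = [(0, 1), (0, 2), (1, 2)]
n_nodes = 3
A = np.zeros((n_nodes, len(arcs)))
for j, (tail, head) in enumerate(arcs):
    A[tail, j] = 1.0    # flow leaves the tail node
    A[head, j] = -1.0   # flow enters the head node

b = np.array([4.0, 0.0, -4.0])    # net supply at each node
c = np.array([1.0, 3.0, 1.0])     # arc costs
bounds = [(0, 3)] * len(arcs)     # arc capacities

res = linprog(c, A_eq=A, b_eq=b, bounds=bounds, method="highs")
print(res.x, res.fun)   # optimal flow [3, 1, 3], cost 9.0
```

The cheap path 0→1→2 is filled to its capacity of 3, and the remaining unit takes the direct arc.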
Computational results




                        16 / 26
Test setup

Introduction
The simplex optimizers
Computational results
Test setup
Network Vs. Standard simplex
Primal Simplex
Dual Simplex
Numerically difficult problems - primal simplex
Numerically difficult problems - dual simplex
Conclusions

s   577 problems (mixed size).
s   A Dual Core server with 4GB RAM running Windows 2003 (Intel CPU).
s   A Quad Core server with 8GB RAM running Windows 2003 (Intel CPU).
s   See [HM:07] for a benchmark comparing MOSEK with other solvers.

All results presented in a given table are obtained on one of the
two computers only.

                                                                    17 / 26
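The "G. avg." rows in the tables that follow are presumably geometric averages of solve times, the standard aggregate in LP benchmarking (cf. [HM:07]); a shifted variant is often used so that near-zero times do not dominate the average. A small sketch, with an illustrative shift value and made-up sample times:

```python
import math

def geometric_mean(times, shift=0.0):
    """(Shifted) geometric mean: exp(mean(log(t + shift))) - shift.
    A positive shift damps the influence of very small solve times."""
    logs = [math.log(t + shift) for t in times]
    return math.exp(sum(logs) / len(logs)) - shift

# Plain geometric mean of three hypothetical solve times (seconds):
print(geometric_mean([0.5, 2.0, 8.0]))   # -> 2.0
```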
Network Vs. Standard simplex

Introduction
The simplex optimizers
Computational results
Test setup
Network Vs. Standard simplex
Primal Simplex
Dual Simplex
Numerically difficult problems - primal simplex
Numerically difficult problems - dual simplex
Conclusions

                           small                      medium
               netw    psim    dsim      netw      psim     dsim
   Num.          30      30      30        43        43       43
   Firsts        30       0       1        43         0        0
   Total time  13.7   114.8    27.8     589.9   10676.6   3015.2
   G. avg.     0.39    2.42    0.70      6.30     91.74    19.70

                           large
               netw      psim     dsim
   Num.           2         2        2
   Firsts         2         0        0
   Total time 366.3    2905.8    968.9
   G. avg.   182.98   1115.71   468.76

Table 1: Performance of the network flow, primal simplex and
dual simplex optimizers on pure network problems.

                                                                    18 / 26
Primal Simplex

Introduction
The simplex optimizers
Computational results
Test setup
Network Vs. Standard simplex
Primal Simplex
Dual Simplex
Numerically difficult problems - primal simplex
Numerically difficult problems - dual simplex
Conclusions

                    small            medium                large
                   5      4         5        4          5         4
   Num.          399    399       148      148         30        30
   Firsts        329    245        91       62         22        11
   Total time  100.4  101.7    2425.3   8962.3    29905.2   39333.2
   G. avg.      0.06   0.07      7.49     9.24     591.39    746.01

Table 2: Performance of the version 4 and version 5 primal
simplex optimizer.

                                                                    19 / 26
Dual Simplex

Introduction
The simplex optimizers
Computational results
Test setup
Network Vs. Standard simplex
Primal Simplex
Dual Simplex
Numerically difficult problems - primal simplex
Numerically difficult problems - dual simplex
Conclusions

                    small            medium                large
                   5      4         5        4          5         4
   Num.          412    412       150      150         21        21
   Firsts        198    286       133       22         18         5
   Total time   84.8  106.4    1852.9   7611.3    23678.9   38994.3
   G. avg.      0.10   0.08      4.65     8.70     544.44   1065.24

Table 3: Performance of the version 4 and version 5 dual
simplex optimizer.

                                                                    20 / 26
Numerically difficult problems - primal simplex

Introduction
The simplex optimizers
Computational results
Test setup
Network Vs. Standard simplex
Primal Simplex
Dual Simplex
Numerically difficult problems - primal simplex
Numerically difficult problems - dual simplex
Conclusions

                    small          medium              large
                   5      4       5        4         5        4
   Num.            9      9      19       19         2        2
   Firsts          5      5      13        6         2        0
   Total time    2.7    2.8   235.9    319.6    1297.7   1503.3
   G. avg.      0.19   0.18    7.19     9.54    413.26   464.04
   Fails           0      0       0        3         0        3

Table 4: Performance of versions 4 and 5 of the primal simplex
optimizer on numerically difficult problems.

                                                                    21 / 26
Numerically difficult problems - dual simplex

Introduction
The simplex optimizers
Computational results
Test setup
Network Vs. Standard simplex
Primal Simplex
Dual Simplex
Numerically difficult problems - primal simplex
Numerically difficult problems - dual simplex
Conclusions

                    small           medium               large
                   5      4        5        4         5         4
   Num.           11     11       19       19         4         4
   Firsts          7      6       13        6         4         0
   Total time    3.9    6.6   3198.3    345.9    4736.3   12820.5
   G. avg.      0.24   0.31     8.44     9.67    802.24   2525.35
   Fails           0      0        0        1         0         1

Table 5: Performance of versions 4 and 5 of the dual simplex
optimizer on numerically difficult problems.

                                                                    22 / 26
Conclusions




              23 / 26
Conclusions

Introduction
The simplex optimizers
Computational results
Conclusions
A number of open issues exist
References

s   Simplex:
    x   MOSEK 5 is substantially faster than MOSEK 4.
    x   MOSEK 5 is more stable than MOSEK 4.
    x   The dual simplex is faster than the primal.

                                                                    24 / 26
A number of open issues exist

Introduction
The simplex optimizers
Computational results
Conclusions
A number of open issues exist
References

s   Simplex:
    x   Degeneracy (a non-perturbation method might be needed in
        extreme cases).
    x   Improve primal pricing.
    x   Better crashing on special problems.
    x   Choose a sparser path.

                                                                    25 / 26
References

Introduction
The simplex optimizers
Computational results
Conclusions
A number of open issues exist
References

[HM:07] H. Mittelmann, http://plato.la.asu.edu/bench.html

[GIL:88] J. R. Gilbert and T. Peierls, "Sparse partial pivoting in time
    proportional to arithmetic operations", SIAM J. Sci. Statist.
    Comput., 9, 1988, pp. 862-874.

[GOL:77] D. Goldfarb, "On the Bartels-Golub decomposition for
    linear programming bases", Mathematical Programming, 13,
    1977, pp. 272-279.

[KOS:02] E. Kostina, "The Long Step Rule in the Bounded-Variable
    Dual Simplex Method: Numerical Experiments", Mathematical
    Methods of Operations Research, 55, 2002, Issue 3.

[MAR:03] I. Maros, "A Generalized Dual Phase-2 Simplex
    Algorithm", European Journal of Operational Research, 149,
    2003, pp. 1-16.

                                                                    26 / 26


2007 : Solving Linear Problems with MOSEK (Seattle 2007)

  • 1. Solving Linear Optimization Problems with MOSEK. Bo Jensen ∗ MOSEK ApS, Fruebjergvej 3, Box 16, 2100 Copenhagen, Denmark. Email: bo.jensen@mosek.com INFORMS Annual Meeting Seattle Nov. 7, 2007 ∗ http://www.mosek.com Erling D. Andersen
  • 2. Introduction 2 / 26
  • 3. Topics Introduction Topics s The problem: The linear optimizer (P ) min cT x The simplex st Ax = b, optimizers Computational x ≥ 0. results Conclusions s The linear optimizers. x Interior-point optimizer (Not main focus in this talk). x Simplex optimizer. s What is the recent improvements? s What is the (relative) performance? 3 / 26
  • 4. The linear optimizer Introduction Topics The general flow : The linear optimizer s Presolve. The simplex optimizers s Form the reduced primal or dual. Computational s Scale (optimizer specific). results s Optimize (interior-point or simplex). Conclusions s Basis identification (interior-point only). s Undo scaling and dualizing. s Postsolve. 4 / 26
  • 6. What makes a good simplex optimizer ? Introduction s Exploit sparsity (i.e. LU and FTRAN and BTRAN The simplex optimizers routines). What makes a good simplex optimizer ? s Exploit problem dependent structure. MOSEK simplex-overview s Choose right path (i.e. good pricing strategy). Exploiting sparsity s Long steps (i.e. avoid degeneracy). aggressively Primal (dual) s Numerical stability (i.e. reliable and consistent results). Degeneracy Dual bound flipping s Fast hotstarts (i.e. MIP and other hotstart applications). idea used more aggressively s Other tricks. Numerical stability Network optimizer Computational results Conclusions 6 / 26
  • 7. MOSEK simplex-overview Introduction s Primal and dual simplex optimizer. The simplex optimizers What makes a good x Efficient cold start and warm start. simplex optimizer ? x Crashes an initial basis. MOSEK simplex-overview x Multiple pricing options: Exploiting sparsity aggressively s Full (Dantzig). Primal (dual) Degeneracy s Partial. Dual bound flipping idea used more s Approximate/exact steepest edge. aggressively Numerical stability s Hybrid. Network optimizer x Degeneration handling. Computational results s Revised simplex algorithm + many enhancements. Conclusions s Many enhancements still possible!. 7 / 26
  • 8. Exploiting sparsity aggressively Introduction s Simplex algs. require solution of the linear equation The simplex optimizers systems What makes a good simplex optimizer ? Bf = A:j and B T g = ei . MOSEK simplex-overview in each iteration. Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 8 / 26
  • 9. Exploiting sparsity aggressively Introduction s Simplex algs. require solution of the linear equation The simplex optimizers systems What makes a good simplex optimizer ? Bf = A:j and B T g = ei . MOSEK simplex-overview in each iteration. Exploiting sparsity aggressively s Assume a sparse LU factorization of the basis Primal (dual) Degeneracy Dual bound flipping B = LU. idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 8 / 26
  • 10. Exploiting sparsity aggressively Introduction s Simplex algs. require solution of the linear equation The simplex optimizers systems What makes a good simplex optimizer ? Bf = A:j and B T g = ei . MOSEK simplex-overview in each iteration. Exploiting sparsity aggressively s Assume a sparse LU factorization of the basis Primal (dual) Degeneracy Dual bound flipping B = LU. idea used more aggressively Numerical stability s f can be computed as follow. Solve Network optimizer Computational results ¯ Lf = A:j Conclusions and then ¯ Uf = f. 8 / 26
  • 11. Exploiting sparsity aggressively Introduction s Simplex algs. require solution of the linear equation The simplex optimizers systems What makes a good simplex optimizer ? Bf = A:j and B T g = ei . MOSEK simplex-overview in each iteration. Exploiting sparsity aggressively s Assume a sparse LU factorization of the basis Primal (dual) Degeneracy Dual bound flipping B = LU. idea used more aggressively Numerical stability s f can be computed as follow. Solve Network optimizer Computational results ¯ Lf = A:j Conclusions and then ¯ Uf = f. s Simple implementation requires O(nz(L) + nz(U )) flops. 8 / 26
  • 12. Exploiting sparsity aggressively (continued) Introduction s Consider the simple example: The simplex optimizers ¯      What makes a good 1 f1 0 simplex optimizer ? MOSEK  0 1 ¯   f2  =  x  simplex-overview x 0 1 ¯ f3 0 Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 9 / 26
  • 13. Exploiting sparsity aggressively (continued) Introduction s Consider the simple example: The simplex optimizers ¯      What makes a good 1 f1 0 simplex optimizer ? MOSEK  0 1 ¯   f2  =  x  simplex-overview x 0 1 ¯ f3 0 Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping s Clearly sparsity in the RHS can be exploited! (done idea used more aggressively extensively in MOSEK). Numerical stability Network optimizer Computational results Conclusions 9 / 26
  • 14. Exploiting sparsity aggressively (continued) Introduction s Consider the simple example: The simplex optimizers ¯      What makes a good 1 f1 0 simplex optimizer ? MOSEK  0 1 ¯   f2  =  x  simplex-overview x 0 1 ¯ f3 0 Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping s Clearly sparsity in the RHS can be exploited! (done idea used more aggressively extensively in MOSEK). Numerical stability s Gilbert and Peierls [GIL:88] demonstrate how to solve the Network optimizer triangular system in O(minimal number of flops). Computational results Conclusions 9 / 26
  • 15. Exploiting sparsity aggressively (continued) Introduction s Consider the simple example: The simplex optimizers ¯      What makes a good 1 f1 0 simplex optimizer ? MOSEK  0 1 ¯   f2  =  x  simplex-overview x 0 1 ¯ f3 0 Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping s Clearly sparsity in the RHS can be exploited! (done idea used more aggressively extensively in MOSEK). Numerical stability s Gilbert and Peierls [GIL:88] demonstrate how to solve the Network optimizer triangular system in O(minimal number of flops). Computational results s Aim: Solves with L and U and updates to the LU should Conclusions run in O(minimal number of flops) and not in O(m) for instance. 9 / 26
  • 16. Exploiting sparsity aggressively (continued) Introduction s Consider the simple example: The simplex optimizers ¯      What makes a good 1 f1 0 simplex optimizer ? MOSEK  0 1 ¯   f2  =  x  simplex-overview x 0 1 ¯ f3 0 Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping s Clearly sparsity in the RHS can be exploited! (done idea used more aggressively extensively in MOSEK). Numerical stability s Gilbert and Peierls [GIL:88] demonstrate how to solve the Network optimizer triangular system in O(minimal number of flops). Computational results s Aim: Solves with L and U and updates to the LU should Conclusions run in O(minimal number of flops) and not in O(m) for instance. s Drawback: Both L and U must be stored row and column wise because solves with LT and U T are required too. 9 / 26
  • 17. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? MOSEK simplex-overview Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 10 / 26
  • 18. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? s Primal step size δp : MOSEK simplex-overview lB ≤ xB − δp B −1 aq ≤ uB Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 10 / 26
  • 19. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? s Primal step size δp : MOSEK simplex-overview lB ≤ xB − δp B −1 aq ≤ uB Exploiting sparsity aggressively s Basic variables on a bound may imply a zero primal step. Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 10 / 26
  • 20. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? s Primal step size δp : MOSEK simplex-overview lB ≤ xB − δp B −1 aq ≤ uB Exploiting sparsity aggressively s Basic variables on a bound may imply a zero primal step. Primal (dual) Degeneracy s Dual step size δd : Dual bound flipping cj − y T Aj − (+)δd (ei B −1 N )j ≥ 0 ∀j ∈ NL idea used more aggressively cj − y T Aj − (+)δd (ei B −1 N )j ≤ 0 ∀j ∈ NU Numerical stability Network optimizer Computational results Conclusions 10 / 26
  • 21. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? s Primal step size δp : MOSEK simplex-overview lB ≤ xB − δp B −1 aq ≤ uB Exploiting sparsity aggressively s Basic variables on a bound may imply a zero primal step. Primal (dual) Degeneracy s Dual step size δd : Dual bound flipping cj − y T Aj − (+)δd (ei B −1 N )j ≥ 0 ∀j ∈ NL idea used more aggressively cj − y T Aj − (+)δd (ei B −1 N )j ≤ 0 ∀j ∈ NU Numerical stability Network optimizer s Non basic variables with zero reduced cost may imply a Computational results zero dual step. Conclusions Degeneration posses both a theoretical and a practical problem for the simplex optimizer ! 10 / 26
  • 22. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? s Primal step size δp : MOSEK simplex-overview lB ≤ xB − δp B −1 aq ≤ uB Exploiting sparsity aggressively s Basic variables on a bound may imply a zero primal step. Primal (dual) Degeneracy s Dual step size δd : Dual bound flipping cj − y T Aj − (+)δd (ei B −1 N )j ≥ 0 ∀j ∈ NL idea used more aggressively cj − y T Aj − (+)δd (ei B −1 N )j ≤ 0 ∀j ∈ NU Numerical stability Network optimizer s Non basic variables with zero reduced cost may imply a Computational results zero dual step. Conclusions Degeneration posses both a theoretical and a practical problem for the simplex optimizer ! What is our options ? 10 / 26
  • 23. Primal (dual) Degeneracy Introduction The simplex optimizer may take very small or zero step sizes, The simplex optimizers why ? What makes a good simplex optimizer ? s Primal step size δp : MOSEK simplex-overview lB ≤ xB − δp B −1 aq ≤ uB Exploiting sparsity aggressively s Basic variables on a bound may imply a zero primal step. Primal (dual) Degeneracy s Dual step size δd : Dual bound flipping cj − y T Aj − (+)δd (ei B −1 N )j ≥ 0 ∀j ∈ NL idea used more aggressively cj − y T Aj − (+)δd (ei B −1 N )j ≤ 0 ∀j ∈ NU Numerical stability Network optimizer s Non basic variables with zero reduced cost may imply a Computational results zero dual step. Conclusions Degeneration posses both a theoretical and a practical problem for the simplex optimizer ! What is our options ? One approach is to perturb lj and uj (cj ). 10 / 26
  • 24. Primal (dual) Degeneracy (continued) Introduction MOSEK 5 has been improved on degenerated problems: The simplex optimizers What makes a good s Better and more aggressive perturbation scheme. simplex optimizer ? MOSEK simplex-overview Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 11 / 26
  • 25. Primal (dual) Degeneracy (continued) Introduction MOSEK 5 has been improved on degenerated problems: The simplex optimizers What makes a good s Better and more aggressive perturbation scheme. simplex optimizer ? s Sparsity issues important (very tricky). MOSEK simplex-overview Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 11 / 26
  • 26. Primal (dual) Degeneracy (continued) Introduction MOSEK 5 has been improved on degenerated problems: The simplex optimizers What makes a good s Better and more aggressive perturbation scheme. simplex optimizer ? s Sparsity issues important (very tricky). MOSEK simplex-overview s Clean up perturbations with dual (primal) simplex. Exploiting sparsity aggressively Primal (dual) Degeneracy Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 11 / 26
  • 27. Primal (dual) Degeneracy (continued) Introduction MOSEK 5 has been improved on degenerated problems: The simplex optimizers What makes a good s Better and more aggressive perturbation scheme. simplex optimizer ? s Sparsity issues important (very tricky). MOSEK simplex-overview s Clean up perturbations with dual (primal) simplex. Exploiting sparsity aggressively s Many examples where ”tailed” solves are substantial Primal (dual) Degeneracy reduced. Dual bound flipping idea used more aggressively Numerical stability Network optimizer Computational results Conclusions 11 / 26
• 28. Primal (dual) Degeneracy (continued)
MOSEK 5 has been improved on degenerate problems:
s Better and more aggressive perturbation scheme.
s Sparsity issues are important (and very tricky).
s Clean up perturbations with the dual (primal) simplex.
s Many examples where "tailed" solves are substantially reduced.
s Still room for improvement.
• 35. Dual bound flipping idea used more aggressively
Dual step size δd:

    cj − yT Aj − (+) δd (eiT B−1 N)j ≥ 0   ∀ j ∈ NL
    cj − yT Aj − (+) δd (eiT B−1 N)j ≤ 0   ∀ j ∈ NU

A ranged variable, i.e. −∞ < lj < xj < uj < ∞, need not be binding in the dual min-ratio test if passing it is profitable.
s This involves flipping nonbasic variables to the opposite bound to remain dual feasible, at the cost of one extra solve.
s Longer dual step lengths.
s Reduces degeneracy.
s Fewer iterations.
s More flexibility in the pivot choice (i.e. potentially more stable).
s Improves sparsity of the basis when degenerate! (If xBi becomes feasible, no basis exchange is needed.)
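The long-step rule above (see [KOS:02] and [MAR:03]) can be sketched as follows. This is a simplified illustration with hypothetical helper names, not MOSEK's actual implementation: breakpoints are the step lengths at which nonbasic reduced costs would change sign, and a ranged variable reached by the dual step is bound-flipped (which lowers the slope of the dual objective by |alpha_j|·(u_j − l_j)) instead of entering the basis, for as long as the step remains profitable.

```python
import math

def long_step_dual_ratio_test(ratios, weights, boxes, slope):
    """Sketch of the long-step (bound-flipping) dual ratio test.

    ratios[j]  breakpoint step length d_j / alpha_j (assumed >= 0)
    weights[j] |alpha_j|, the pivot-row magnitude of candidate j
    boxes[j]   u_j - l_j; finite means j is ranged and may be flipped
    slope      initial rate of dual objective gain, i.e. the primal
               infeasibility of the leaving basic variable

    Returns (entering index or None, list of variables to bound-flip).
    """
    flips = []
    for j in sorted(range(len(ratios)), key=ratios.__getitem__):
        new_slope = slope - weights[j] * boxes[j]
        # A non-ranged blocker must enter the basis; a ranged one enters
        # only if passing it would stop the dual objective improving.
        if not math.isfinite(boxes[j]) or new_slope <= 0:
            return j, flips
        slope = new_slope   # flip j to its opposite bound and pass it
        flips.append(j)
    return None, flips      # every breakpoint passed: dual step unbounded
```

With a large initial infeasibility the test skips past the first (ranged) breakpoint by flipping it; with a small one the classical minimal ratio is chosen and nothing is flipped.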
• 36. Dual bound flipping idea used more aggressively (continued)
Bound flipping examples:

                                    Iter              Time
    Problem   Rows     Cols      NB      WB        NB      WB
    osa-60    10280    232966   6938    5111      58.12    8.84
    world     34506    32734    54566   32606    218.81   50.03
    pds-40    66844    212859   34274   26599     96.51   18.48
    ken-18    105127   154699   151203  51452    258.18   13.92
    client    27216    20567    80555   63660    208.40   84.09

WB = MOSEK 5 dual simplex with bound flips
NB = MOSEK 5 dual simplex with no bound flips
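A quick way to read the table above is as speedup ratios; the short script below recomputes them from the slide's numbers:

```python
# Per-problem improvements from the bound-flipping table
# (MOSEK 5 dual simplex; NB = no bound flips, WB = with bound flips).
rows = {
    # name:   (iter NB, iter WB, time NB, time WB)
    "osa-60": (6938,    5111,   58.12,   8.84),
    "world":  (54566,   32606,  218.81,  50.03),
    "pds-40": (34274,   26599,  96.51,   18.48),
    "ken-18": (151203,  51452,  258.18,  13.92),
    "client": (80555,   63660,  208.40,  84.09),
}

speedups = {name: (it_nb / it_wb, t_nb / t_wb)
            for name, (it_nb, it_wb, t_nb, t_wb) in rows.items()}
for name, (it_ratio, t_ratio) in speedups.items():
    print(f"{name:8s} iterations x{it_ratio:4.1f}   time x{t_ratio:5.1f}")
```

Note that the time speedups exceed the iteration speedups (on ken-18, roughly 18x in time versus 3x in iterations), consistent with the earlier point that bound flips also keep the basis sparser and thus make individual iterations cheaper.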
• 40. Numerical stability
s Improving numerical stability.
  x Moved the LU update before updating the solution.
    s Saves one solve with L in eiT B−1 [GOL:77].
    s More stable approach.
  x Better handling of singularities (singular variables are temporarily fixed).
  x Switch to a safe mode if deemed unstable.
• 41. Network optimizer
MOSEK 5 features a network simplex optimizer.
s Solves pure network flow problems (i.e. LPs where each column has two non-zeros, either 1 or -1).
s Can extract embedded network structure in a model (i.e. a network with side constraints).
s Using the standard interface, only one parameter has to be set.
s Huge problems can be solved in limited time; for instance, a problem with 8 million variables can be solved in less than 200 seconds.
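The structural condition in the first bullet, that every column is a node-arc incidence column, is easy to test. A minimal sketch follows; it is only a simple check for a pure network, not MOSEK's extraction heuristic, which can also find networks embedded in a larger model:

```python
def is_pure_network(columns):
    """True if every column has exactly two nonzeros, one +1 and one -1,
    i.e. the matrix is the node-arc incidence matrix of a directed graph."""
    for col in columns:
        nz = sorted(v for v in col if v != 0)
        if nz != [-1, 1]:
            return False
    return True

# Columns of the incidence matrix of a 3-node directed cycle
# (arcs 1->2, 2->3, 3->1), one column per arc.
cycle_cols = [[1, -1, 0], [0, 1, -1], [-1, 0, 1]]
print(is_pure_network(cycle_cols))
```

Any matrix passing this check is a min-cost-flow constraint matrix and can be handed to the network optimizer directly.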
• 43. Test setup
s 577 problems (mixed sizes).
s A dual-core server with 4GB RAM running Windows 2003 (Intel CPU).
s A quad-core server with 8GB RAM running Windows 2003 (Intel CPU).
s See [HM:07] for a benchmark comparing MOSEK with other solvers.
All results presented in any one table are obtained using only one of the two computers.
• 44. Network Vs. Standard simplex

                    small                     medium                   large
                netw   psim    dsim     netw   psim     dsim     netw    psim     dsim
    Num.        30     30      30       43     43       43       2       2        2
    Firsts      30     0       1        43     0        0        2       0        0
    Total time  13.7   114.8   27.8     589.9  10676.6  3015.2   366.3   2905.8   968.9
    G. avg.     0.39   2.42    0.70     6.30   91.74    19.70    182.98  1115.71  468.76

Table 1: Performance of the network flow, primal simplex and dual simplex optimizers on pure network problems.
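The "G. avg." rows in these tables report a geometric average of solve times, a common way to summarize LP benchmarks since it is less dominated by a few large instances than the arithmetic total. A minimal sketch follows; the optional shift is an assumption on my part (benchmark reports often use a shifted variant to damp the influence of near-zero times), the slides do not state one:

```python
import math

def geometric_average(times, shift=0.0):
    """(Shifted) geometric mean of a list of solve times in seconds.
    shift > 0 damps the influence of near-zero times."""
    n = len(times)
    return math.exp(sum(math.log(t + shift) for t in times) / n) - shift

print(geometric_average([2.0, 8.0]))   # plain geometric mean
```

For example, the geometric mean of 2 s and 8 s is 4 s, whereas their arithmetic mean is 5 s.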
• 45. Primal Simplex

                   small            medium              large
                5      4        5        4         5        4
    Num.        399    399      148      148       30       30
    Firsts      329    245      91       62        22       11
    Total time  100.4  101.7    2425.3   8962.3    29905.2  39333.2
    G. avg.     0.06   0.07     7.49     9.24      591.39   746.01

Table 2: Performance of the version 4 and version 5 primal simplex optimizers.
• 46. Dual Simplex

                   small            medium              large
                5      4        5        4         5        4
    Num.        412    412      150      150       21       21
    Firsts      198    286      133      22        18       5
    Total time  84.8   106.4    1852.9   7611.3    23678.9  38994.3
    G. avg.     0.10   0.08     4.65     8.70      544.44   1065.24

Table 3: Performance of the version 4 and version 5 dual simplex optimizers.
• 47. Numerically difficult problems - primal simplex

                   small            medium              large
                5      4        5        4         5        4
    Num.        9      9        19       19        2        2
    Firsts      5      5        13       6         2        0
    Total time  2.7    2.8      235.9    319.6     1297.7   1503.3
    G. avg.     0.19   0.18     7.19     9.54      413.26   464.04
    Fails       0      0        0        3         0        3

Table 4: Performance of versions 4 and 5 of the primal simplex optimizer on numerically difficult problems.
• 48. Numerically difficult problems - dual simplex

                   small            medium              large
                5      4        5        4         5        4
    Num.        11     11       19       19        4        4
    Firsts      7      6        13       6         4        0
    Total time  3.9    6.6      3198.3   345.9     4736.3   12820.5
    G. avg.     0.24   0.31     8.44     9.67      802.24   2525.35
    Fails       0      0        0        1         0        1

Table 5: Performance of versions 4 and 5 of the dual simplex optimizer on numerically difficult problems.
• 49. Conclusions
• 50. Conclusions
s Simplex:
  x MOSEK 5 is substantially faster than MOSEK 4.
  x MOSEK 5 is more stable than MOSEK 4.
  x The dual simplex is faster than the primal.
• 51. A number of open issues exist
s Simplex:
  x Degeneracy (a non-perturbation method might be needed in extreme cases).
  x Improve primal pricing.
  x Better crashing on special problems.
  x Choose a more sparse path.
• 52. References

[HM:07] H. Mittelmann, http://plato.la.asu.edu/bench.html

[GIL:88] J. R. Gilbert and T. Peierls, "Sparse partial pivoting in time proportional to arithmetic operations", SIAM J. Sci. Statist. Comput., 9, 1988, pp. 862–874.

[GOL:77] D. Goldfarb, "On the Bartels-Golub decomposition for linear programming bases", Mathematical Programming, 13, 1977, pp. 272–279.

[KOS:02] E. Kostina, "The Long Step Rule in the Bounded-Variable Dual Simplex Method: Numerical Experiments", Mathematical Methods of Operations Research, 55, 2002, Issue 3.

[MAR:03] I. Maros, "A Generalized Dual Phase-2 Simplex Algorithm", European Journal of Operational Research, 149, 2003, pp. 1–16.