This document summarizes Samuel Relton's presentation on Fréchet derivatives of matrix functions and their applications. Some key points:
1) It discusses how to define and compute Fréchet derivatives of matrix functions, which describe how small perturbations to a matrix affect the output of the function.
2) Applications include estimating sensitivity in nuclear activation models, predicting algebraic error in finite element methods, and analyzing condition numbers.
3) It presents algorithms for efficiently identifying the elements of a matrix to which the output of a function is most sensitive, with applications to finite elements.
1. Fréchet Derivatives of Matrix Functions and Applications
Samuel Relton
samuel.relton@maths.man.ac.uk @sdrelton
samrelton.com blog.samrelton.com
Joint work with Nicholas J. Higham
higham@maths.man.ac.uk @nhigham
www.maths.man.ac.uk/~higham nickhigham.wordpress.com
University of Manchester, UK
September 4, 2014
Sam Relton (UoM) Derivatives of matrix functions September 4, 2014 1 / 23
2. Outline
Matrix Functions, their Derivatives, and the Condition Number
Elementwise Sensitivity
Physics: Nuclear Activation Sensitivity Problem
Differential Equations: Predicting Algebraic Error in the FEM
3. Matrix Functions

We are interested in functions f : C^{n×n} → C^{n×n}, e.g.

Matrix exponential: e^A = \sum_{k=0}^{\infty} A^k / k!

Matrix cosine: cos(A) = \sum_{k=0}^{\infty} (−1)^k A^{2k} / (2k)!

Define f(A) by its Taylor series when f is analytic
If A = XDX^{−1} then f(A) = X f(D) X^{−1}
Differential equations: du/dt = Au(t), u(t) = e^{tA} u(0)
Use cos(A) and sin(A) for second order ODEs
7. Definition (Fréchet derivative)

The Fréchet derivative of f at A is the unique linear function L_f(A, ·) : C^{n×n} → C^{n×n} such that, for all E,

f(A + E) − f(A) − L_f(A, E) = o(‖E‖).

Applications include manifold optimization, Markov models, bladder cancer, image processing, and network analysis
Higher order derivatives recently analyzed (Higham & Relton, 2014)
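The defining property above can be checked numerically. A minimal sketch (assuming SciPy is available; `scipy.linalg.expm_frechet` computes L_exp): as the perturbation shrinks, the remainder f(A + tE) − f(A) − L_f(A, tE) should vanish faster than t.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(42)
n = 4
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# f(A + tE) - f(A) - L_f(A, tE) = o(t), so the scaled remainder
# ||f(A + tE) - f(A) - L_f(A, tE)|| / t should shrink with t.
for t in [1e-2, 1e-4]:
    fd = expm(A + t * E) - expm(A)                   # finite difference
    L = expm_frechet(A, t * E, compute_expm=False)   # linear term
    print(t, np.linalg.norm(fd - L) / t)
```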
10. Sensitivity of Matrix Functions

[Figure: two balls of perturbed inputs, S_A around A and S_X around X, and their images f(S_A) and f(S_X) under f.]

The function f is well conditioned at A and ill conditioned at X
12. The Norm-wise Condition Number

The two condition numbers for a matrix function are:

cond_abs(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖,

cond_rel(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖ · ‖A‖ / ‖f(A)‖.
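For the exponential these quantities can be explored directly in SciPy. A sketch (assuming `scipy.linalg.expm_cond`, which returns the exact Frobenius-norm relative condition number): sampling random unit directions gives a lower bound on the maximum over ‖E‖ = 1, which can never exceed the exact value.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet, expm_cond

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))

# Sampling random unit directions lower-bounds
# cond_abs(exp, A) = max_{||E||_F = 1} ||L_exp(A, E)||_F.
best = 0.0
for _ in range(50):
    E = rng.standard_normal((n, n))
    E /= np.linalg.norm(E)                        # ||E||_F = 1
    L = expm_frechet(A, E, compute_expm=False)
    best = max(best, np.linalg.norm(L))

# Scale by ||A||_F / ||exp(A)||_F to get a relative lower bound.
cond_rel_lb = best * np.linalg.norm(A) / np.linalg.norm(expm(A))
print(cond_rel_lb, expm_cond(A))
```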
13. Elementwise Sensitivity

If we change just one element A_ij, how is f(A) affected?

Let E_ij be the matrix with 1 in position (i, j) and zeros elsewhere; then the difference between f(A) and f(A + E_ij) satisfies

‖f(A) − f(A + E_ij)‖ ≈ ‖L_f(A, E_ij)‖.

‖L_f(A, E_ij)‖ gives the sensitivity in the (i, j) component
Sometimes we want the t most sensitive elements, for t = 5:20 say
14. A simple algorithm

To compute the most sensitive t entries of A:

1 for i = 1:n
2   for j = 1:n
3     if A_ij ≠ 0
4       Compute and store ‖L_f(A, E_ij)‖
5     end if
6   end for
7 end for
8 Take the largest t values of ‖L_f(A, E_ij)‖

Cost: up to O(n^5) flops, since computing L_f(A, E) costs O(n^3) flops

Trivially parallel, but still very expensive when A is large
Speed this up using block norm estimation (work in progress)
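A direct Python transcription of the algorithm above, specialized to f = exp via SciPy's `expm_frechet` (a sketch; the helper name and test matrix are illustrative, not from the talk):

```python
import numpy as np
from scipy.linalg import expm_frechet

def most_sensitive_entries(A, t=5):
    """Return the t index pairs (i, j) with the largest ||L_exp(A, E_ij)||_F,
    skipping structural zeros of A (as in the algorithm above)."""
    n = A.shape[0]
    scores = {}
    for i in range(n):
        for j in range(n):
            if A[i, j] != 0:
                E = np.zeros((n, n))
                E[i, j] = 1.0            # E_ij: one in position (i, j)
                L = expm_frechet(A, E, compute_expm=False)
                scores[(i, j)] = np.linalg.norm(L)
    return sorted(scores, key=scores.get, reverse=True)[:t]

# Illustrative matrix: bidiagonal, so only 2n - 1 entries are candidates.
A = np.diag([1.0, 2.0, 3.0, 4.0]) + np.eye(4, k=1)
print(most_sensitive_entries(A, t=3))
```

Each nonzero entry costs one O(n^3) Fréchet derivative, which is the O(n^5) total cost the slide quotes.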
16. The Nuclear Activation Sensitivity Problem

Chemical reactions: u′(t) = Au(t)
u(t) = e^{tA} u(0) tells us the concentration of each element at time t
q^T u(t) is the dosage at time t
A_ij represents the reaction between elements i and j (so ignore A_ij = 0)
A_ij is subject to measurement error: what happens to q^T u(t) when it changes?
Implications for safety in radiation exposure models etc.
17. Nuclear Activation Solution - 1

If A_ij is perturbed, this introduces a relative error in q^T u(t) of

|q^T (e^{tA + E_ij} − e^{tA}) u(0)| / |q^T e^{tA} u(0)| ≈ |q^T L_exp(tA, E_ij) u(0)| / |q^T e^{tA} u(0)|

We note that:
The denominator is the same for all perturbations
This requires computing a derivative in every direction E_ij with A_ij ≠ 0
Can we improve upon this?
19. Nuclear Activation Solution - 2

Using vec(AXB) = (B^T ⊗ A) vec(X) we see the sensitivity in direction E_ij is

|q^T L_exp(tA, E_ij) u(0)| = |(u(0)^T ⊗ q^T) K_exp(tA) vec(E_ij)|.

Therefore the sensitivities in ALL n^2 directions are

|[(u(0)^T ⊗ q^T) K_exp(tA)]^T| = |vec(L_exp(tA, unvec(u(0) ⊗ q)^T)^T)|.

Only 1 derivative needed for all sensitivities
Found 2 bugs in existing commercial software!
Extends to time dependent coefficients A = A(t)
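The one-derivative trick can be verified numerically. A sketch (assuming SciPy; the identity K_exp(A)^T = K_exp(A^T) means the whole sensitivity matrix S with S_ij = q^T L_exp(A, E_ij) u(0) equals L_exp(A^T, q u(0)^T), i.e. one Fréchet derivative evaluation):

```python
import numpy as np
from scipy.linalg import expm_frechet

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
q = rng.standard_normal(n)
u0 = rng.standard_normal(n)

# Direct route: one Frechet derivative per direction E_ij (n^2 derivatives).
S_direct = np.empty((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = 1.0
        L = expm_frechet(A, E, compute_expm=False)
        S_direct[i, j] = q @ L @ u0

# One derivative for all sensitivities: S = L_exp(A^T, q u0^T),
# since vec(q u0^T) = u0 (kron) q and K_exp(A)^T = K_exp(A^T).
S_fast = expm_frechet(A.T, np.outer(q, u0), compute_expm=False)

print(np.allclose(S_direct, S_fast))
```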
20. Predicting Algebraic Error in an ODE

Let's solve the model ODE

u″ = f(x), x ∈ (0, 1), u(0) = u(1) = 0

with the finite element method using piecewise linear basis functions φ_i.

Exact solution u(x) = e^{−5(x−0.5)^2} − e^{−5/4} determines f(x)
Generate a grid of n = 19 equally spaced points x_i
Generate the system Ax = b, where A_ij = ∫_0^1 φ_i′ φ_j′ and b_i = f(x_i)
A = tridiag(−1, 2, −1) in this case
Solve with CG iteration
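The setup above can be sketched in a few lines of Python. This is a simplified stand-in, not the talk's code: the right-hand side below is illustrative rather than the true load vector, and the CG loop is hand-rolled so the iteration count k is explicit.

```python
import numpy as np

n = 19
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1)
x = np.linspace(0, 1, n + 2)[1:-1]                     # interior grid points
b = np.exp(-5 * (x - 0.5) ** 2)                        # illustrative rhs

x_exact = np.linalg.solve(A, b)                        # "best" discrete solution

def cg(A, b, k):
    """Plain conjugate gradient, stopped after k iterations."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    for _ in range(k):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if r_new @ r_new < 1e-30:                      # converged early
            return x
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

# The algebraic error u_h - u_est^k shrinks as CG iterates;
# for this 19x19 SPD system CG is (numerically) exact by k = 19.
for k in [4, 8, 19]:
    print(k, np.linalg.norm(x_exact - cg(A, b, k)))
```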
23. The Finite Element Space (dimension 19)

Let u_h ∈ V_h be the best solution possible from the finite element space V_h
Let u_est^k be our numerical solution corresponding to k iterations of CG

The discretization error is u − u_h
The algebraic error is u_h − u_est^k
The total error is u − u_est^k = alg. err. + disc. err.

Sometimes the algebraic error dominates the total error; how do we detect this?
26. Discretization Error

[Plot: discretization error u − u_h over x ∈ [0, 1], on the order of 10^{−3}.]
27. Algebraic Error - 8 CG iterations

[Plot: algebraic error and total error for k = 8 over x ∈ [0, 1]; nodes 9-11 highlighted.]
28. Algebraic Error - 9 CG iterations

[Plot: algebraic error and total error for k = 9, on the order of 10^{−3}; nodes 9-11 highlighted.]
29. Elementwise Sensitivity Analysis

Taking f(A) = A^{−1} we can calculate the sensitivity of each element
L_f(A, E) = −A^{−1} E A^{−1}, so easily computed
Ignore A_ij = 0 since the two basis elements don't overlap
Results plotted on the following heat map
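The formula L_f(A, E) = −A^{−1} E A^{−1} is cheap to check against a finite difference. A sketch (random well-conditioned test matrix, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) + 10 * np.eye(n)   # safely nonsingular
E = rng.standard_normal((n, n))

# For f(A) = A^{-1} the Frechet derivative is L_f(A, E) = -A^{-1} E A^{-1}.
Ainv = np.linalg.inv(A)
L = -Ainv @ E @ Ainv

# Check against the finite difference ((A + tE)^{-1} - A^{-1}) / t.
t = 1e-6
fd = (np.linalg.inv(A + t * E) - Ainv) / t
print(np.linalg.norm(fd - L) / np.linalg.norm(L))   # small relative error
```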
30. Elementwise Sensitivity Analysis

[Heat map: most sensitive elements of A when computing A^{−1} in the 1-norm; values from 0 to about 0.6, concentrated around rows/columns 9-11 in the middle.]
31. 2D Peak Problem

[Surface plot of the 2D peak problem over [0, 1]^2, with peak height about 0.03.]
32. Algebraic Error Estimation

[Left: true algebraic error using 7 CG iterations (order 10^{−4}). Right: error in the estimated algebraic error using the 1st Fréchet derivative (order 10^{−7}).]
33. Higher Order Derivatives to Estimate Alg. Err.

[Plot: componentwise error using kth order derivatives, k = 1, 3, 5, ranging from about 10^{−6} down to 10^{−16}.]
34. Possible extensions
Can this be used to modify the discretization mesh to obtain better
accuracy? (See Papez, Liesen, and Strakos 2014)
Currently too expensive: can we estimate the sensitivities?
Can this be extended to f (A) = eA (exponential integrators)?
35. Conclusions
Explained elementwise sensitivity of matrix functions
New applications in nuclear physics and FEM analysis
Former is basically solved, latter needs to be cheaper
Future work:
Estimate sensitivities more efficiently (block norm estimation)
Further comparison of nuclear physics solution to commercial
alternative
Further analysis of ODE problem
37. Higher order derivatives are defined recursively:

L_f^{(k)}(A + E_{k+1}, E_1, ..., E_k) − L_f^{(k)}(A, E_1, ..., E_k) = L_f^{(k+1)}(A, E_1, ..., E_k, E_{k+1}) + o(‖E_{k+1}‖)

Also have a simple method to compute them. For example:

f( [ A  E_1  E_2  0
     0  A    0    E_2
     0  0    A    E_1
     0  0    0    A ] )
  =
   [ f(A)  L_f(A, E_1)  L_f(A, E_2)  L_f^{(2)}(A, E_1, E_2)
     0     f(A)         0            L_f(A, E_2)
     0     0            f(A)         L_f(A, E_1)
     0     0            0            f(A) ]

More info in Higham & Relton, SIMAX 35(4), 2014.
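The 2×2 case of the block construction above is the classical identity expm([[A, E], [0, A]]) = [[e^A, L_exp(A, E)], [0, e^A]], which can be checked directly (a sketch, assuming SciPy):

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# expm of the block upper triangular matrix [[A, E], [0, A]] carries
# the Frechet derivative in its (1, 2) block.
M = np.block([[A, E], [np.zeros((n, n)), A]])
L_block = expm(M)[:n, n:]

L_direct = expm_frechet(A, E, compute_expm=False)
print(np.allclose(L_block, L_direct))
```

The larger block matrix in the slide extends the same idea to second derivatives, at the cost of exponentiating a 4n × 4n matrix.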