This document summarizes Samuel Relton's presentation on Fréchet derivatives of matrix functions and their applications. Some key points:
1) It discusses how to define and compute Fréchet derivatives of matrix functions, which describe how small perturbations to a matrix affect the output of the function.
2) Applications include estimating sensitivity in nuclear activation models, predicting algebraic error in finite element methods, and analyzing condition numbers.
3) It presents algorithms for efficiently identifying the elements of a matrix to which the output of a function is most sensitive, with applications to finite elements.
1. Fréchet Derivatives of Matrix Functions and Applications
Samuel Relton
samuel.relton@maths.man.ac.uk @sdrelton
samrelton.com blog.samrelton.com
Joint work with Nicholas J. Higham
higham@maths.man.ac.uk @nhigham
www.maths.man.ac.uk/~higham nickhigham.wordpress.com
University of Manchester, UK
September 4, 2014
Sam Relton (UoM) Derivatives of matrix functions September 4, 2014 1 / 23
2. Outline
Matrix Functions, their Derivatives, and the Condition Number
Elementwise Sensitivity
Physics: Nuclear Activation Sensitivity Problem
Differential Equations: Predicting Algebraic Error in the FEM
3. Matrix Functions

We are interested in functions f : C^{n×n} → C^{n×n}, e.g.

Matrix exponential: e^A = \sum_{k=0}^{\infty} A^k / k!

Matrix cosine: cos(A) = \sum_{k=0}^{\infty} (−1)^k A^{2k} / (2k)!

Define f(A) by its Taylor series when f is analytic
If A = XDX^{−1} then f(A) = X f(D) X^{−1}
Differential equations: du/dt = Au(t), u(t) = e^{tA} u(0)
Use cos(A) and sin(A) for second order ODEs
7. Definition (Fréchet derivative)

The Fréchet derivative of f at A is the unique linear function L_f(A, ·) : C^{n×n} → C^{n×n} such that, for all E,

f(A + E) − f(A) − L_f(A, E) = o(‖E‖).

Applications include manifold optimization, Markov models, bladder cancer, image processing, and network analysis
Higher order derivatives recently analyzed (Higham & Relton, 2014)
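The defining property above can be checked numerically. A minimal sketch (assuming SciPy is available; `scipy.linalg.expm_frechet` computes L_exp): as the perturbation shrinks, the remainder f(A + tE) − f(A) − L_f(A, tE) should vanish faster than t.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(42)
n = 4
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# f(A + tE) - f(A) - L_f(A, tE) = o(t), so the scaled remainder
# ||f(A + tE) - f(A) - L_f(A, tE)|| / t should shrink with t.
for t in [1e-2, 1e-4]:
    fd = expm(A + t * E) - expm(A)                   # finite difference
    L = expm_frechet(A, t * E, compute_expm=False)   # linear term
    print(t, np.linalg.norm(fd - L) / t)
```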
10. Sensitivity of Matrix Functions

[Figure: two balls of perturbed inputs, S_A around A and S_X around X, and their images f(S_A) and f(S_X) under f.]

The function f is well conditioned at A and ill conditioned at X
12. The Norm-wise Condition Number

The two condition numbers for a matrix function are:

cond_abs(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖,

cond_rel(f, A) = max_{‖E‖=1} ‖L_f(A, E)‖ · ‖A‖ / ‖f(A)‖.
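For the exponential these quantities can be explored directly in SciPy. A sketch (assuming `scipy.linalg.expm_cond`, which returns the exact Frobenius-norm relative condition number): sampling random unit directions gives a lower bound on the maximum over ‖E‖ = 1, which can never exceed the exact value.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet, expm_cond

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))

# Sampling random unit directions lower-bounds
# cond_abs(exp, A) = max_{||E||_F = 1} ||L_exp(A, E)||_F.
best = 0.0
for _ in range(50):
    E = rng.standard_normal((n, n))
    E /= np.linalg.norm(E)                        # ||E||_F = 1
    L = expm_frechet(A, E, compute_expm=False)
    best = max(best, np.linalg.norm(L))

# Scale by ||A||_F / ||exp(A)||_F to get a relative lower bound.
cond_rel_lb = best * np.linalg.norm(A) / np.linalg.norm(expm(A))
print(cond_rel_lb, expm_cond(A))
```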
13. Elementwise Sensitivity

If we change just one element A_ij, how is f(A) affected?

Let E_ij be the matrix with 1 in position (i, j) and zeros elsewhere; then the difference between f(A) and f(A + E_ij) satisfies

‖f(A) − f(A + E_ij)‖ ≈ ‖L_f(A, E_ij)‖.

‖L_f(A, E_ij)‖ gives the sensitivity in the (i, j) component
Sometimes we want the t most sensitive elements, for t = 5:20 say
14. A simple algorithm

To compute the most sensitive t entries of A:

1 for i = 1:n
2   for j = 1:n
3     if A_ij ≠ 0
4       Compute and store ‖L_f(A, E_ij)‖
5     end if
6   end for
7 end for
8 Take the largest t values of ‖L_f(A, E_ij)‖

Cost: up to O(n^5) flops, since computing L_f(A, E) costs O(n^3) flops

Trivially parallel, but still very expensive when A is large
Speed this up using block norm estimation (work in progress)
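A direct Python transcription of the algorithm above, specialized to f = exp via SciPy's `expm_frechet` (a sketch; the helper name and test matrix are illustrative, not from the talk):

```python
import numpy as np
from scipy.linalg import expm_frechet

def most_sensitive_entries(A, t=5):
    """Return the t index pairs (i, j) with the largest ||L_exp(A, E_ij)||_F,
    skipping structural zeros of A (as in the algorithm above)."""
    n = A.shape[0]
    scores = {}
    for i in range(n):
        for j in range(n):
            if A[i, j] != 0:
                E = np.zeros((n, n))
                E[i, j] = 1.0            # E_ij: one in position (i, j)
                L = expm_frechet(A, E, compute_expm=False)
                scores[(i, j)] = np.linalg.norm(L)
    return sorted(scores, key=scores.get, reverse=True)[:t]

# Illustrative matrix: bidiagonal, so only 2n - 1 entries are candidates.
A = np.diag([1.0, 2.0, 3.0, 4.0]) + np.eye(4, k=1)
print(most_sensitive_entries(A, t=3))
```

Each nonzero entry costs one O(n^3) Fréchet derivative, which is the O(n^5) total cost the slide quotes.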
16. The Nuclear Activation Sensitivity Problem

Chemical reactions: u′(t) = Au(t)
u(t) = e^{tA} u(0) tells us the concentration of each element at time t
q^T u(t) is the dosage at time t
A_ij represents the reaction between elements i and j (so ignore A_ij = 0)
A_ij is subject to measurement error: what happens to q^T u(t) when it changes?
Implications for safety in radiation exposure models etc.
17. Nuclear Activation Solution - 1

If A_ij is perturbed, this introduces a relative error in q^T u(t) of

|q^T (e^{tA + E_ij} − e^{tA}) u(0)| / |q^T e^{tA} u(0)| ≈ |q^T L_exp(tA, E_ij) u(0)| / |q^T e^{tA} u(0)|

We note that:
The denominator is the same for all perturbations
This requires computing a derivative in every direction E_ij with A_ij ≠ 0
Can we improve upon this?
19. Nuclear Activation Solution - 2

Using vec(AXB) = (B^T ⊗ A) vec(X) we see the sensitivity in direction E_ij is

|q^T L_exp(tA, E_ij) u(0)| = |(u(0)^T ⊗ q^T) K_exp(tA) vec(E_ij)|.

Therefore the sensitivities in ALL n^2 directions are

|[(u(0)^T ⊗ q^T) K_exp(tA)]^T| = |vec(L_exp(tA, unvec(u(0) ⊗ q)^T)^T)|.

Only 1 derivative needed for all sensitivities
Found 2 bugs in existing commercial software!
Extends to time dependent coefficients A = A(t)
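The one-derivative trick can be verified numerically. A sketch (assuming SciPy; the identity K_exp(A)^T = K_exp(A^T) means the whole sensitivity matrix S with S_ij = q^T L_exp(A, E_ij) u(0) equals L_exp(A^T, q u(0)^T), i.e. one Fréchet derivative evaluation):

```python
import numpy as np
from scipy.linalg import expm_frechet

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
q = rng.standard_normal(n)
u0 = rng.standard_normal(n)

# Direct route: one Frechet derivative per direction E_ij (n^2 derivatives).
S_direct = np.empty((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = 1.0
        L = expm_frechet(A, E, compute_expm=False)
        S_direct[i, j] = q @ L @ u0

# One derivative for all sensitivities: S = L_exp(A^T, q u0^T),
# since vec(q u0^T) = u0 (kron) q and K_exp(A)^T = K_exp(A^T).
S_fast = expm_frechet(A.T, np.outer(q, u0), compute_expm=False)

print(np.allclose(S_direct, S_fast))
```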
20. Predicting Algebraic Error in an ODE

Let's solve the model ODE

u″ = f(x), x ∈ (0, 1), u(0) = u(1) = 0

with the finite element method using piecewise linear basis functions φ_i.

Exact solution u(x) = e^{−5(x−0.5)^2} − e^{−5/4} determines f(x)
Generate a grid of n = 19 equally spaced points x_i
Generate the system Ax = b, where A_ij = ∫_0^1 φ_i′ φ_j′ and b_i = f(x_i)
A = tridiag(−1, 2, −1) in this case
Solve with CG iteration
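The setup above can be sketched in a few lines of Python. This is a simplified stand-in, not the talk's code: the right-hand side below is illustrative rather than the true load vector, and the CG loop is hand-rolled so the iteration count k is explicit.

```python
import numpy as np

n = 19
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1)
x = np.linspace(0, 1, n + 2)[1:-1]                     # interior grid points
b = np.exp(-5 * (x - 0.5) ** 2)                        # illustrative rhs

x_exact = np.linalg.solve(A, b)                        # "best" discrete solution

def cg(A, b, k):
    """Plain conjugate gradient, stopped after k iterations."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    for _ in range(k):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if r_new @ r_new < 1e-30:                      # converged early
            return x
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x

# The algebraic error u_h - u_est^k shrinks as CG iterates;
# for this 19x19 SPD system CG is (numerically) exact by k = 19.
for k in [4, 8, 19]:
    print(k, np.linalg.norm(x_exact - cg(A, b, k)))
```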
23. The Finite Element Space (dimension 19)

Let u_h ∈ V_h be the best solution possible from the finite element space V_h
Let u_est^k be our numerical solution corresponding to k iterations of CG

The discretization error is u − u_h
The algebraic error is u_h − u_est^k
The total error is u − u_est^k = alg. err. + disc. err.

Sometimes the algebraic error dominates the total error; how do we detect this?
26. Discretization Error

[Plot: discretization error u − u_h over x ∈ [0, 1], on the order of 10^{−3}.]
27. Algebraic Error - 8 CG iterations

[Plot: algebraic error and total error for k = 8 over x ∈ [0, 1]; nodes 9-11 highlighted.]
28. Algebraic Error - 9 CG iterations

[Plot: algebraic error and total error for k = 9, on the order of 10^{−3}; nodes 9-11 highlighted.]
29. Elementwise Sensitivity Analysis

Taking f(A) = A^{−1} we can calculate the sensitivity of each element
L_f(A, E) = −A^{−1} E A^{−1}, so easily computed
Ignore A_ij = 0 since the two basis elements don't overlap
Results plotted on the following heat map
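The formula L_f(A, E) = −A^{−1} E A^{−1} is cheap to check against a finite difference. A sketch (random well-conditioned test matrix, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) + 10 * np.eye(n)   # safely nonsingular
E = rng.standard_normal((n, n))

# For f(A) = A^{-1} the Frechet derivative is L_f(A, E) = -A^{-1} E A^{-1}.
Ainv = np.linalg.inv(A)
L = -Ainv @ E @ Ainv

# Check against the finite difference ((A + tE)^{-1} - A^{-1}) / t.
t = 1e-6
fd = (np.linalg.inv(A + t * E) - Ainv) / t
print(np.linalg.norm(fd - L) / np.linalg.norm(L))   # small relative error
```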
30. Elementwise Sensitivity Analysis

[Heat map: most sensitive elements of A when computing A^{−1} in the 1-norm; values from 0 to about 0.6, concentrated around rows/columns 9-11 in the middle.]
31. 2D Peak Problem

[Surface plot of the 2D peak problem over [0, 1]^2, with peak height about 0.03.]
32. Algebraic Error Estimation

[Left: true algebraic error using 7 CG iterations (order 10^{−4}). Right: error in the estimated algebraic error using the 1st Fréchet derivative (order 10^{−7}).]
33. Higher Order Derivatives to Estimate Alg. Err.

[Plot: componentwise error using kth order derivatives, k = 1, 3, 5, ranging from about 10^{−6} down to 10^{−16}.]
34. Possible extensions
Can this be used to modify the discretization mesh to obtain better
accuracy? (See Papez, Liesen, and Strakos 2014)
Currently too expensive: can we estimate the sensitivities?
Can this be extended to f (A) = eA (exponential integrators)?
35. Conclusions
Explained elementwise sensitivity of matrix functions
New applications in nuclear physics and FEM analysis
Former is basically solved, latter needs to be cheaper
Future work:
Estimate sensitivities more efficiently (block norm estimation)
Further comparison of nuclear physics solution to commercial
alternative
Further analysis of ODE problem
37. Higher order derivatives are defined recursively:

L_f^{(k)}(A + E_{k+1}, E_1, ..., E_k) − L_f^{(k)}(A, E_1, ..., E_k) = L_f^{(k+1)}(A, E_1, ..., E_k, E_{k+1}) + o(‖E_{k+1}‖)

Also have a simple method to compute them. For example:

f( [ A  E_1  E_2  0
     0  A    0    E_2
     0  0    A    E_1
     0  0    0    A ] )
  =
   [ f(A)  L_f(A, E_1)  L_f(A, E_2)  L_f^{(2)}(A, E_1, E_2)
     0     f(A)         0            L_f(A, E_2)
     0     0            f(A)         L_f(A, E_1)
     0     0            0            f(A) ]

More info in Higham & Relton, SIMAX 35(4), 2014.
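The 2×2 case of the block construction above is the classical identity expm([[A, E], [0, A]]) = [[e^A, L_exp(A, E)], [0, e^A]], which can be checked directly (a sketch, assuming SciPy):

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))

# expm of the block upper triangular matrix [[A, E], [0, A]] carries
# the Frechet derivative in its (1, 2) block.
M = np.block([[A, E], [np.zeros((n, n)), A]])
L_block = expm(M)[:n, n:]

L_direct = expm_frechet(A, E, compute_expm=False)
print(np.allclose(L_block, L_direct))
```

The larger block matrix in the slide extends the same idea to second derivatives, at the cost of exponentiating a 4n × 4n matrix.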