SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Downloaden Sie, um offline zu lesen
Computing f-Divergences and Distances of
High-Dimensional Probability Density Functions
Alexander Litvinenko (RWTH Aachen, Germany),
joint work with
Youssef Marzouk, Hermann G. Matthies, Marco Scavino, Alessio Spantini
SIAM IS 2022
Plan
1. Motivating examples, working diagram
2. Basics: pdf, cdf, FFT
3. Theoretical background
4. Computation of moments and divergences
5. Tensor formats
6. Algorithms
7. Numerics
1 / 30
Motivation and settings:
Assume we use Karhunen-Loeve and (or) Polynomial Chaos
Expansion methods to solve
1. a PDE with uncertain coefficients,
2. a parameter inference problem,
3. a data assimilation task,
4. an optimal design of experiment task.
As a result, we obtain a functional representation (KLE, PCE) of
our uncertain QoI.
How to compute KLD and other divergences? (Classical result
is given only for pdfs, which are not known).
How to compute entropy, KLD and other divergences if pdfs are
not available or not exist?
2 / 30
Motivation: stochastic PDEs
−∇ · (κ(x,ω)∇u(x,ω)) = f(x,ω), x ∈ G ⊂ Rd
, ω ∈ Ω
Write first Karhunen-Loeve Expansion and then for
uncorrelated random variables the Polynomial Chaos
Expansion
u(x,ω) =
K
X
i=1
p
λiφi(x)ξi(ω) =
K
X
i=1
p
λiφi(x)
X
α∈J
ξ
(α)
i Hα(θ(ω))
(1)
where ξ
(α)
i is a tensor. Note that α = (α1,α2,...,αM,...) is a
multi-index.
X
α∈J
ξ
(α)
i Hα(θ(ω)) :=
p1
X
α1=1
...
pM
X
αM=1
ξ
(α1,...,αM)
i
M
Y
j=1
hαj
(θj) (2)
The same decomposition for κ(x,ω).
3 / 30
Final discretized stochastic PDE
Au = f, where
A:=
Ps
l=1 Ãl ⊗
NM
µ=1 ∆lµ

, Ãl ∈ RN×N, ∆lµ ∈ Rnµ×nµ,
u:=
Pr
j=1 uj ⊗
NM
µ=1 ujµ

, uj ∈ RN, ujµ ∈ Rnµ,
f:=
PR
k=1 f̃k ⊗
NM
µ=1 gkµ, f̃k ∈ RN and gkµ ∈ Rnµ.
Examples of stochastic Galerkin matrices:
4 / 30
How to compute KLE, divergences if pdf is not available?
Two ways to compute f-divergence, KLD, entropy.
5 / 30
Connection of pcf and pdf
The probability characteristic function (pcf ) ϕξ could be
defined:
ϕξ(t) := E(exp(ihξ|ti)) :=
Z
Rd
pξ(y)exp(ihy|ti) dy =: F [d]
(pξ)(t),
where t = (t1,t2,...,td) ∈ Rd is the dual variable to y ∈ Rd,
hy|ti =
Pd
j=1 yjtj is the canonical inner product on Rd, and F [d]
is the probabilist’s d-dimensional Fourier transform.
pξ(y) =
1
(2π)d
Z
Rd
exp(−iht|yi)ϕξ(t)dt = F [−d]
(ϕξ)(y). (3)
6 / 30
Discrete low-rank representation of pcf
We try to find an approximation
ϕξ(t) ≈ e
ϕξ(t) =
R
X
`=1
d
O
ν=1
ϕ`,ν(tν), (4)
where the ϕ`,ν(tν) are one-dimensional functions.
Then we can get
pξ(y) ≈ e
pξ(y) = F [−d]
(ϕξ)y =
R
X
`=1
d
O
ν=1
F −1
1 (ϕ`,ν)(yν),
where F −1
1 is the one-dimensional inverse Fourier transform.
7 / 30
Representation of a 3D tensor in the CP tensor format
A full tensor w ∈ Rn1×n2×n3 (on the left) is represented as a sum
of tensor products. The lines on the right denote vectors
wi,k ∈ Rnk , i = 1,...,r, k = 1,2,3.
8 / 30
CP tensor format linear algebra
α · w =
r
X
j=1
α
d
O
ν=1
wj,ν =
r
X
j=1
d
O
ν=1
ανwj,ν
where αν := d
√
|α| for all ν  1. and α1 := sign(α) d
√
|α|.
The sum of two tensors costs only O(1):
w = u + v =








ru
X
j=1
d
O
ν=1
uj,ν








+









rv
X
k=1
d
O
µ=1
vk,µ









=
ru+rv
X
j=1
d
O
ν=1
wj,ν
where wj,ν := uj,ν for j ≤ ru and wj,ν := vj,ν for ru  j ≤ ru + rv.
The sum w generally has rank ru + rv.
9 / 30
CP properties: Hadamard product
u
v =








ru
X
j=1
d
O
ν=1
uj,ν















rv
X
k=1
d
O
ν=1
vk,ν






 =
ru
X
j=1
rv
X
k=1
d
O
ν=1
(uj,ν
vk,ν)
The new rank can increase till rurv, and the computational cost
is O(ru rvn d).
The scalar product can be computed as follows:
hu|viT = h
ru
X
j=1
d
O
ν=1
uj,ν|
rv
X
k=1
d
O
ν=1
vk,νiT =
ru
X
j=1
rv
X
k=1
d
Y
ν=1
huj,ν|vk,νi
Cost is O(ru rv n d).
Rank truncation via the ALS-method or Gauss-Newton-method.
The scalar product above helps to compute the Frobenius norm
kuk2 :=
p
hu|viT .
10 / 30
Tensor Train data format
w := (wi1...id
) =
r0
X
j0=1
...
rd
X
jd=1
w
(1)
j0j1
[i1]···w
(ν)
jν−1jν
[iν]···w
(d)
jd−1jd
[id]
Alternatively, each TT-core W(ν)
may be seen as a vector of
rν−1 × rν matrices W
(ν)
iν
of length Mν, i.e.
W(ν)
= (W
(ν)
iν
) : 1 ≤ iν ≤ Mν. Then
w := (wi1...id
) =
d
Y
ν=1
W
(ν)
iν
.
Schema of the TT decomposition. The waggons denote the TT
cores and each wheel denotes the index iν. Each waggon is
connected with neighbours by indices jν−1 and jν. 11 / 30
Computation of QoIs
Z
p(x) dx ≈ S(P) :=
V
N
hP|iT .
E(f(ξ)) =
Z
Rd
f(x)pξ(x) dx ≈ S(F
P) =
V
N
hF|PiT
Differential entropy, requiring the point-wise logarithm of P:
h(p) := E(−log(p))p ≈ E(−log(P))P = −
V
N
hlog(P)|Pi,
Then the f-divergence of p from q and its discrete
approximation is defined as
Df(pkq) := E f
p
q
!!
q
≈ E

f(P
Q
−1
)

Q
=
V
N
hf(P
Q
−1
)|Qi.
12 / 30
List of some typical divergences and distances
Divergence D•(pkq)
KLD
Z
(log(p(x)/q(x)))p(x) dx
Hellinger dist.
1
2
Z q
p(x) −
q
q(x)
2
dx
Bregman div.
Z
[(φ(p(x)) − φ(q(x))) − (p(x) − q(x))φ0
(q(x))] dx
Bhattacharyya −log
Z q
(p(x)q(x)) dx
!
List of some typical divergences and distances.
13 / 30
Discrete approximations for divergences
Divergence Approx. D•(pkq)
KLD
V
N
(hlog(P)|Pi − hlog(Q)|Pi)
(DH)2:
V
2N
hP
1/2
− Q
1/2
|P
1/2
− Q
1/2
i
Dφ: S ((φ(P) − φ(Q)) − (P − Q)
φ0
(Q))
DBh: −log

V
N
hP
1/2
|Q
1/2
i

14 / 30
Computing pointwise inverse w

Weitere ähnliche Inhalte

Ähnlich wie Computing f-Divergences and Distances of\\ High-Dimensional Probability Density Functions

SLC 2015 talk improved version
SLC 2015 talk improved versionSLC 2015 talk improved version
SLC 2015 talk improved versionZheng Mengdi
 
Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics Alexander Litvinenko
 
The Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability DistributionThe Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability DistributionPedro222284
 
TwoLevelMedium
TwoLevelMediumTwoLevelMedium
TwoLevelMediumJohn Paul
 
Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...
Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...
Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...Shizuoka Inst. Science and Tech.
 
MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化Akira Tanimoto
 
Response Surface in Tensor Train format for Uncertainty Quantification
Response Surface in Tensor Train format for Uncertainty QuantificationResponse Surface in Tensor Train format for Uncertainty Quantification
Response Surface in Tensor Train format for Uncertainty QuantificationAlexander Litvinenko
 
Estimating structured vector autoregressive models
Estimating structured vector autoregressive modelsEstimating structured vector autoregressive models
Estimating structured vector autoregressive modelsAkira Tanimoto
 
A Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scaleA Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scaleOctavianPostavaru
 
Analysis of coupled inset dielectric guide structure
Analysis of coupled inset dielectric guide structureAnalysis of coupled inset dielectric guide structure
Analysis of coupled inset dielectric guide structureYong Heui Cho
 
Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02
Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02
Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02Luke Underwood
 

Ähnlich wie Computing f-Divergences and Distances of\\ High-Dimensional Probability Density Functions (20)

SLC 2015 talk improved version
SLC 2015 talk improved versionSLC 2015 talk improved version
SLC 2015 talk improved version
 
PCA on graph/network
PCA on graph/networkPCA on graph/network
PCA on graph/network
 
Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics
 
The Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability DistributionThe Multivariate Gaussian Probability Distribution
The Multivariate Gaussian Probability Distribution
 
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
2018 MUMS Fall Course - Statistical Representation of Model Input (EDITED) - ...
 
TwoLevelMedium
TwoLevelMediumTwoLevelMedium
TwoLevelMedium
 
kactl.pdf
kactl.pdfkactl.pdf
kactl.pdf
 
residue
residueresidue
residue
 
Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...
Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...
Talk at SciCADE2013 about "Accelerated Multiple Precision ODE solver base on ...
 
MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化MLP輪読スパース8章 トレースノルム正則化
MLP輪読スパース8章 トレースノルム正則化
 
02_AJMS_297_21.pdf
02_AJMS_297_21.pdf02_AJMS_297_21.pdf
02_AJMS_297_21.pdf
 
lec07_DFT.pdf
lec07_DFT.pdflec07_DFT.pdf
lec07_DFT.pdf
 
Response Surface in Tensor Train format for Uncertainty Quantification
Response Surface in Tensor Train format for Uncertainty QuantificationResponse Surface in Tensor Train format for Uncertainty Quantification
Response Surface in Tensor Train format for Uncertainty Quantification
 
Estimating structured vector autoregressive models
Estimating structured vector autoregressive modelsEstimating structured vector autoregressive models
Estimating structured vector autoregressive models
 
A Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scaleA Fibonacci-like universe expansion on time-scale
A Fibonacci-like universe expansion on time-scale
 
Analysis of coupled inset dielectric guide structure
Analysis of coupled inset dielectric guide structureAnalysis of coupled inset dielectric guide structure
Analysis of coupled inset dielectric guide structure
 
Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02
Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02
Electromagnetic Scattering from Objects with Thin Coatings.2016.05.04.02
 
Adaptive dynamic programming algorithm for uncertain nonlinear switched systems
Adaptive dynamic programming algorithm for uncertain nonlinear switched systemsAdaptive dynamic programming algorithm for uncertain nonlinear switched systems
Adaptive dynamic programming algorithm for uncertain nonlinear switched systems
 
DCT
DCTDCT
DCT
 
DCT
DCTDCT
DCT
 

Mehr von Alexander Litvinenko

litvinenko_Intrusion_Bari_2023.pdf
litvinenko_Intrusion_Bari_2023.pdflitvinenko_Intrusion_Bari_2023.pdf
litvinenko_Intrusion_Bari_2023.pdfAlexander Litvinenko
 
Density Driven Groundwater Flow with Uncertain Porosity and Permeability
Density Driven Groundwater Flow with Uncertain Porosity and PermeabilityDensity Driven Groundwater Flow with Uncertain Porosity and Permeability
Density Driven Groundwater Flow with Uncertain Porosity and PermeabilityAlexander Litvinenko
 
Uncertain_Henry_problem-poster.pdf
Uncertain_Henry_problem-poster.pdfUncertain_Henry_problem-poster.pdf
Uncertain_Henry_problem-poster.pdfAlexander Litvinenko
 
Litv_Denmark_Weak_Supervised_Learning.pdf
Litv_Denmark_Weak_Supervised_Learning.pdfLitv_Denmark_Weak_Supervised_Learning.pdf
Litv_Denmark_Weak_Supervised_Learning.pdfAlexander Litvinenko
 
Identification of unknown parameters and prediction of missing values. Compar...
Identification of unknown parameters and prediction of missing values. Compar...Identification of unknown parameters and prediction of missing values. Compar...
Identification of unknown parameters and prediction of missing values. Compar...Alexander Litvinenko
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Alexander Litvinenko
 
Identification of unknown parameters and prediction with hierarchical matrice...
Identification of unknown parameters and prediction with hierarchical matrice...Identification of unknown parameters and prediction with hierarchical matrice...
Identification of unknown parameters and prediction with hierarchical matrice...Alexander Litvinenko
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Alexander Litvinenko
 
Application of parallel hierarchical matrices for parameter inference and pre...
Application of parallel hierarchical matrices for parameter inference and pre...Application of parallel hierarchical matrices for parameter inference and pre...
Application of parallel hierarchical matrices for parameter inference and pre...Alexander Litvinenko
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Alexander Litvinenko
 
Propagation of Uncertainties in Density Driven Groundwater Flow
Propagation of Uncertainties in Density Driven Groundwater FlowPropagation of Uncertainties in Density Driven Groundwater Flow
Propagation of Uncertainties in Density Driven Groundwater FlowAlexander Litvinenko
 
Simulation of propagation of uncertainties in density-driven groundwater flow
Simulation of propagation of uncertainties in density-driven groundwater flowSimulation of propagation of uncertainties in density-driven groundwater flow
Simulation of propagation of uncertainties in density-driven groundwater flowAlexander Litvinenko
 
Approximation of large covariance matrices in statistics
Approximation of large covariance matrices in statisticsApproximation of large covariance matrices in statistics
Approximation of large covariance matrices in statisticsAlexander Litvinenko
 
Semi-Supervised Regression using Cluster Ensemble
Semi-Supervised Regression using Cluster EnsembleSemi-Supervised Regression using Cluster Ensemble
Semi-Supervised Regression using Cluster EnsembleAlexander Litvinenko
 
Talk Alexander Litvinenko on SIAM GS Conference in Houston
Talk Alexander Litvinenko on SIAM GS Conference in HoustonTalk Alexander Litvinenko on SIAM GS Conference in Houston
Talk Alexander Litvinenko on SIAM GS Conference in HoustonAlexander Litvinenko
 
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...Alexander Litvinenko
 
Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...
Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...
Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...Alexander Litvinenko
 

Mehr von Alexander Litvinenko (20)

litvinenko_Intrusion_Bari_2023.pdf
litvinenko_Intrusion_Bari_2023.pdflitvinenko_Intrusion_Bari_2023.pdf
litvinenko_Intrusion_Bari_2023.pdf
 
Density Driven Groundwater Flow with Uncertain Porosity and Permeability
Density Driven Groundwater Flow with Uncertain Porosity and PermeabilityDensity Driven Groundwater Flow with Uncertain Porosity and Permeability
Density Driven Groundwater Flow with Uncertain Porosity and Permeability
 
litvinenko_Gamm2023.pdf
litvinenko_Gamm2023.pdflitvinenko_Gamm2023.pdf
litvinenko_Gamm2023.pdf
 
Litvinenko_Poster_Henry_22May.pdf
Litvinenko_Poster_Henry_22May.pdfLitvinenko_Poster_Henry_22May.pdf
Litvinenko_Poster_Henry_22May.pdf
 
Uncertain_Henry_problem-poster.pdf
Uncertain_Henry_problem-poster.pdfUncertain_Henry_problem-poster.pdf
Uncertain_Henry_problem-poster.pdf
 
Litv_Denmark_Weak_Supervised_Learning.pdf
Litv_Denmark_Weak_Supervised_Learning.pdfLitv_Denmark_Weak_Supervised_Learning.pdf
Litv_Denmark_Weak_Supervised_Learning.pdf
 
Identification of unknown parameters and prediction of missing values. Compar...
Identification of unknown parameters and prediction of missing values. Compar...Identification of unknown parameters and prediction of missing values. Compar...
Identification of unknown parameters and prediction of missing values. Compar...
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...
 
Identification of unknown parameters and prediction with hierarchical matrice...
Identification of unknown parameters and prediction with hierarchical matrice...Identification of unknown parameters and prediction with hierarchical matrice...
Identification of unknown parameters and prediction with hierarchical matrice...
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...
 
Application of parallel hierarchical matrices for parameter inference and pre...
Application of parallel hierarchical matrices for parameter inference and pre...Application of parallel hierarchical matrices for parameter inference and pre...
Application of parallel hierarchical matrices for parameter inference and pre...
 
Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...Computation of electromagnetic fields scattered from dielectric objects of un...
Computation of electromagnetic fields scattered from dielectric objects of un...
 
Propagation of Uncertainties in Density Driven Groundwater Flow
Propagation of Uncertainties in Density Driven Groundwater FlowPropagation of Uncertainties in Density Driven Groundwater Flow
Propagation of Uncertainties in Density Driven Groundwater Flow
 
Simulation of propagation of uncertainties in density-driven groundwater flow
Simulation of propagation of uncertainties in density-driven groundwater flowSimulation of propagation of uncertainties in density-driven groundwater flow
Simulation of propagation of uncertainties in density-driven groundwater flow
 
Approximation of large covariance matrices in statistics
Approximation of large covariance matrices in statisticsApproximation of large covariance matrices in statistics
Approximation of large covariance matrices in statistics
 
Semi-Supervised Regression using Cluster Ensemble
Semi-Supervised Regression using Cluster EnsembleSemi-Supervised Regression using Cluster Ensemble
Semi-Supervised Regression using Cluster Ensemble
 
Talk Alexander Litvinenko on SIAM GS Conference in Houston
Talk Alexander Litvinenko on SIAM GS Conference in HoustonTalk Alexander Litvinenko on SIAM GS Conference in Houston
Talk Alexander Litvinenko on SIAM GS Conference in Houston
 
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
 
Talk litvinenko gamm19
Talk litvinenko gamm19Talk litvinenko gamm19
Talk litvinenko gamm19
 
Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...
Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...
Computation of electromagnetic_fields_scattered_from_dielectric_objects_of_un...
 

Kürzlich hochgeladen

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 

Kürzlich hochgeladen (20)

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 

Computing f-Divergences and Distances of\\ High-Dimensional Probability Density Functions

  • 1. Computing f-Divergences and Distances of High-Dimensional Probability Density Functions Alexander Litvinenko (RWTH Aachen, Germany), joint work with Youssef Marzouk, Hermann G. Matthies, Marco Scavino, Alessio Spantini SIAM IS 2022
  • 2. Plan 1. Motivating examples, working diagram 2. Basics: pdf, cdf, FFT 3. Theoretical background 4. Computation of moments and divergences 5. Tensor formats 6. Algorithms 7. Numerics 1 / 30
  • 3. Motivation and settings: Assume we use Karhunen-Loeve and (or) Polynomial Chaos Expansion methods to solve 1. a PDE with uncertain coefficients, 2. a parameter inference problem, 3. a data assimilation task, 4. an optimal design of experiment task. As a result, we obtain a functional representation (KLE, PCE) of our uncertain QoI. How to compute KLD and other divergences? (Classical result is given only for pdfs, which are not known). How to compute entropy, KLD and other divergences if pdfs are not available or not exist? 2 / 30
  • 4. Motivation: stochastic PDEs −∇ · (κ(x,ω)∇u(x,ω)) = f(x,ω), x ∈ G ⊂ Rd , ω ∈ Ω Write first Karhunen-Loeve Expansion and then for uncorrelated random variables the Polynomial Chaos Expansion u(x,ω) = K X i=1 p λiφi(x)ξi(ω) = K X i=1 p λiφi(x) X α∈J ξ (α) i Hα(θ(ω)) (1) where ξ (α) i is a tensor. Note that α = (α1,α2,...,αM,...) is a multi-index. X α∈J ξ (α) i Hα(θ(ω)) := p1 X α1=1 ... pM X αM=1 ξ (α1,...,αM) i M Y j=1 hαj (θj) (2) The same decomposition for κ(x,ω). 3 / 30
  • 5. Final discretized stochastic PDE Au = f, where A:= Ps l=1 Ãl ⊗ NM µ=1 ∆lµ , Ãl ∈ RN×N, ∆lµ ∈ Rnµ×nµ, u:= Pr j=1 uj ⊗ NM µ=1 ujµ , uj ∈ RN, ujµ ∈ Rnµ, f:= PR k=1 f̃k ⊗ NM µ=1 gkµ, f̃k ∈ RN and gkµ ∈ Rnµ. Examples of stochastic Galerkin matrices: 4 / 30
  • 6. How to compute KLE, divergences if pdf is not available? Two ways to compute f-divergence, KLD, entropy. 5 / 30
  • 7. Connection of pcf and pdf The probability characteristic function (pcf ) ϕξ could be defined: ϕξ(t) := E(exp(ihξ|ti)) := Z Rd pξ(y)exp(ihy|ti) dy =: F [d] (pξ)(t), where t = (t1,t2,...,td) ∈ Rd is the dual variable to y ∈ Rd, hy|ti = Pd j=1 yjtj is the canonical inner product on Rd, and F [d] is the probabilist’s d-dimensional Fourier transform. pξ(y) = 1 (2π)d Z Rd exp(−iht|yi)ϕξ(t)dt = F [−d] (ϕξ)(y). (3) 6 / 30
  • 8. Discrete low-rank representation of pcf We try to find an approximation ϕξ(t) ≈ e ϕξ(t) = R X `=1 d O ν=1 ϕ`,ν(tν), (4) where the ϕ`,ν(tν) are one-dimensional functions. Then we can get pξ(y) ≈ e pξ(y) = F [−d] (ϕξ)y = R X `=1 d O ν=1 F −1 1 (ϕ`,ν)(yν), where F −1 1 is the one-dimensional inverse Fourier transform. 7 / 30
  • 9. Representation of a 3D tensor in the CP tensor format A full tensor w ∈ Rn1×n2×n3 (on the left) is represented as a sum of tensor products. The lines on the right denote vectors wi,k ∈ Rnk , i = 1,...,r, k = 1,2,3. 8 / 30
  • 10. CP tensor format linear algebra α · w = r X j=1 α d O ν=1 wj,ν = r X j=1 d O ν=1 ανwj,ν where αν := d √ |α| for all ν 1. and α1 := sign(α) d √ |α|. The sum of two tensors costs only O(1): w = u + v =         ru X j=1 d O ν=1 uj,ν         +          rv X k=1 d O µ=1 vk,µ          = ru+rv X j=1 d O ν=1 wj,ν where wj,ν := uj,ν for j ≤ ru and wj,ν := vj,ν for ru j ≤ ru + rv. The sum w generally has rank ru + rv. 9 / 30
  • 14. vk,ν) The new rank can increase till rurv, and the computational cost is O(ru rvn d). The scalar product can be computed as follows: hu|viT = h ru X j=1 d O ν=1 uj,ν| rv X k=1 d O ν=1 vk,νiT = ru X j=1 rv X k=1 d Y ν=1 huj,ν|vk,νi Cost is O(ru rv n d). Rank truncation via the ALS-method or Gauss-Newton-method. The scalar product above helps to compute the Frobenius norm kuk2 := p hu|viT . 10 / 30
  • 15. Tensor Train data format w := (wi1...id ) = r0 X j0=1 ... rd X jd=1 w (1) j0j1 [i1]···w (ν) jν−1jν [iν]···w (d) jd−1jd [id] Alternatively, each TT-core W(ν) may be seen as a vector of rν−1 × rν matrices W (ν) iν of length Mν, i.e. W(ν) = (W (ν) iν ) : 1 ≤ iν ≤ Mν. Then w := (wi1...id ) = d Y ν=1 W (ν) iν . Schema of the TT decomposition. The waggons denote the TT cores and each wheel denotes the index iν. Each waggon is connected with neighbours by indices jν−1 and jν. 11 / 30
  • 16. Computation of QoIs Z p(x) dx ≈ S(P) := V N hP|iT . E(f(ξ)) = Z Rd f(x)pξ(x) dx ≈ S(F
  • 17. P) = V N hF|PiT Differential entropy, requiring the point-wise logarithm of P: h(p) := E(−log(p))p ≈ E(−log(P))P = − V N hlog(P)|Pi, Then the f-divergence of p from q and its discrete approximation is defined as Df(pkq) := E f p q !! q ≈ E f(P
  • 18. Q
  • 20. Q
  • 22. List of some typical divergences and distances Divergence D•(pkq) KLD Z (log(p(x)/q(x)))p(x) dx Hellinger dist. 1 2 Z q p(x) − q q(x) 2 dx Bregman div. Z [(φ(p(x)) − φ(q(x))) − (p(x) − q(x))φ0 (q(x))] dx Bhattacharyya −log Z q (p(x)q(x)) dx ! List of some typical divergences and distances. 13 / 30
  • 23. Discrete approximations for divergences Divergence Approx. D•(pkq) KLD V N (hlog(P)|Pi − hlog(Q)|Pi) (DH)2: V 2N hP
  • 27. 1/2 i Dφ: S ((φ(P) − φ(Q)) − (P − Q)
  • 32. −1. Let F(x) := w − x
  • 33. −1. Applying Newton’s method to F(x) for approximating the inverse of a given tensor w, one obtains the following iteration function Ψ
  • 34. −1 with the i.c. v0 = α · w to bring v0 close to va = : Ψ
  • 36. (2 ·  − w
  • 37. v). The iteration converges if the initial iterate v0 satisfies k − w
  • 38. v0k∞ 1. A possible candidate for the starting value is v0 = αw with α (1/kwk∞)2. For such a v0, the convergence initial condition k − αw
  • 39. 2k∞ 1 is always satisfied. 15 / 30
  • 41. 2 − w = 0. The Newton iteration Ψ √(v) = 1 2 · (v + v
  • 42. −1
  • 43. w). (5) with i.c. v0 = (w + )/2. An alternative is Newton-Schulz iteration, which computes v+ ∗ = √ w = w
  • 44. 1/2 and v− ∗ = ( √ w)
  • 46. −1/2. We set V 0 = [y0,z0] = [α · w,] ∈ T 2, and A(y,z) = 3 ·  − z
  • 50. Series of numerical examples 1. KLD is computed with the analytical formula and the amen cross algorithm from TT-toolbox 2. Hellinger distances is computed with well-known analytical formulas and the amen cross algorithm. 3. (pdf is not known analytically), the d-variate elliptically contoured α-stable distributions are chosen and accessed via their pcfs , 4. KLD and Hellinger distances for different value of d, n and the parameter α. 17 / 30
  • 51. Example 1: KLD for two Gaussian distributions N1 := N (µ1,C1) and N2 := N (µ2,C2), where C1 := σ2 1 I, C2 := σ2 2 I, µ1 = (1.1...,1.1) and µ2 = (1.4,...,1.4) ∈ Rd, d = {16,32,64}, and σ1 = 1.5, σ2 = 22.1. The well known analytical formula is 2DKL(N1kN2) = tr(C−1 2 C1) + (µ2 − µ1)T C−1 2 (µ2 − µ1) − d + log |C2| |C1| 18 / 30
  • 52. Comparison of KLDs computed via two methods DKL computed via TT tensors (AMEn algorithm) and the analytical formula for various values of d. TT tolerance = 10−6, the stopping difference between consecutive iterations. d 16 32 64 n 2048 2048 2048 DKL (exact) 35.08 70.16 140.32 e DKL 35.08 70.16 140.32 erra 4.0e-7 2.43e-5 1.4e-5 errr 1.1e-8 3.46e-8 8.1e-8 comp. time, sec. 1.0 5.0 18.7 19 / 30
  • 53. Example 2 — Hellinger distance (for Gaussian distributions) DH(N1,N2)2 = 1 − K1 2 (N1,N2), where K1 2 (N1,N2) = det(C1) 1 4 det(C2) 1 4 det C1+C2 2 1 2 · · exp − 1 8 (µ1 − µ2) C1 + C2 2 −1 (µ1 − µ2) ! 20 / 30
  • 54. Hellinger distance DH is computed via TT tensors (AMEn) and the analytical formula. TT tolerance = 10−6. d 16 32 64 n 2048 2048 2048 DH (exact) 0.99999 0.99999 0.99999 e DH 0.99992 0.99999 0.99999 erra 3.5e-5 7.1e-5 1.4e-4 errr 2.5e-5 5.0e-5 1.0e-4 comp. time, sec. 1.7 7.5 30.5 The AMEn algorithm is able to compute the Hellinger distance DH between two multiv. Gaussian distribs for d = {16,32,64}, and n = 2048. The exact and approximate values are identical, and the error is small. 21 / 30
  • 55. Example 3: α-stable distribution The pcf of a d-variate elliptically contoured α-stable distribution is given by ϕξ(t) = exp iht|µi − ht|Cti α 2 . AMEn tol.= 10−9. 22 / 30
  • 56. Example 3: KLD between two α-stable distributions with α1 = 2.0, α2 = 1.9 (µ1 = µ2 = 0, C1 = C2 = I). d 16 16 16 16 16 16 16 32 32 32 n 8 16 32 64 128 256 512 64 128 256 DKL(2.0,1.9) 0.016 0.06 0.06 0.062 0.06 0.06 0.06 0.09 0.14 0.12 time, sec. 0.8 3 8.9 14 22 61 207 46 100 258 maxTT rank 40 57 79 79 59 79 77 80 78 79 mem., MB 1.8 7 34 54 73 158 538 160 313 626 AMEn tol.= 10−9. 23 / 30
  • 57. Full storage vs low-rank d = 32 and n = 256 mean that the amount of data in full storage mode would be N = nd = 26532 ≈ 1.16E77 ≈ 1E78 bytes. In TT-low-rank approximation it is ca. 626MB, and fits on a laptop. Assuming 1GHz notebook, the KLD computation in full mode would require ca. 1.2E68sec, or more than 3E60 years, and even with a perfect speed-up on a parallel super-computer with say 1,000,000 processors, this would require still more than 3E54 years; compare this with the estimated age of the universe of ca. 1.4E10 years. 24 / 30
  • 58. Example 4: DKL(α1,α2) between two α-stable distributions for (α1,α2) and fixed d = 8 and n = 64. (α1,α2) (2.0,0.5) (2.0,1.0) (2.0,1.5) (2.0,1.9) (1.5,1.4) (1.0,0.4) DKL(α1,α2) 2.27 0.66 0.3 0.03 0.031 0.6 comp. time, sec. 8.4 7.8 7.5 8.5 11 8.7 max. TT rank 78 74 76 76 80 79 memory, MB 28.5 28.5 27.1 28.5 35 29.5 µ1 = µ2 = 0, C1 = C2 = I. AMEn tol.= 10−12. 25 / 30
  • 59. Example 3: Hellinger distance DH(α1,α2) for the d-variate elliptically contoured α-stable distribution for α = 1.5 and α = 0.9 for different d and n. µ1 = µ2 = 0, C1 = C2 = I. d 16 16 16 16 16 16 32 32 32 32 n 8 16 32 64 128 256 16 32 64 128 DH(1.5,0.9) 0.218 0.223 0.223 0.223 0.219 0.223 0.180 0.176 0.175 0.176 comp. time, sec. 2.8 3.7 7.5 19 53 156 11 21 62 117 max. TT rank 79 76 76 76 79 76 75 71 75 74 memory, MB 7.7 17 34 71 145 283 34 66 144 285 AMEn tolerance is 10−9. 26 / 30
  • 60. Example 6: DH vs. TT (AMEn) tolerances TT(AMEn) tolerance 10−7 10−8 10−9 10−10 10−14 DH(1.5,0.9) 0.1645 0.1817 0.176 0.1761 0.1802 comp. time, sec. 43 86 103 118 241 max. TT rank 64 75 75 78 77 memory, MB 126 255 270 307 322 Computation of DH(α1,α2) between two α-stable distributions (α = 1.5 and α = 0.9) for different AMEn tolerances. n = 128, d = 32, µ1 = µ2 = 0, C1 = C2 = I. 27 / 30
  • 61. Conclusion We demonstrated 1. numerical computation of characterising statistics of high-dimensional pdfs , as well as their divergences and distances. 2. pdfs and pcfs , and some functions (log(), √ P,...) of them can be approximated and represented in a low-rank tensor data format. 3. Utilisation of low-rank tensor techniques helps to reduce the complexity and storage from exponential O(nd) to linear, e.g. O dnr2 for the TT format. 28 / 30
  • 62. Literature 1. A. Litvinenko, Y. Marzouk, H.G. Matthies, M. Scavino, A. Spantini, Computing f-Divergences and Distances of High-Dimensional Probability Density Functions – Low-Rank Tensor Approximations, arXiv: 2111.07164, 2021, https://arxiv.org/abs/2111.07164 2. M. Espig, W. Hackbusch, A. Litvinenko, H.G. Matthies, E Zander, Iterative algorithms for the post-processing of high-dimensional data, JCP 410, 109396, 2020, https://doi.org/10.1016/j.jcp.2020.109396 3. S. Dolgov, A. Litvinenko, D. Liu, KRIGING IN TENSOR TRAIN DATA FORMAT, Conf. Proc., 3rd Int. Conf. on Uncertainty Quantification in CSE, https://files.eccomasproceedia. org/papers/e-books/uncecomp_2019.pdf, pp 309-329, 2019 29 / 30
  • 63. Literature 4. A. Litvinenko, D. Keyes, V. Khoromskaia, B.N. Khoromskij, H.G. Matthies, Tucker tensor analysis of Matérn functions in spatial statistics, Computational Methods in Applied Mathematics 19 (1), 101-122, 2019, https://www.degruyter.com/document/doi/10.1515/ cmam-2018-0022/pdf 5. A. Litvinenko, R. Kriemann, M.G. Genton, Y. Sun, D.E. Keyes, HLIBCov: Parallel hierarchical matrix approximation of large covariance matrices and likelihoods with applications in parameter identification, MethodsX 7, 100600, 2020, https://doi.org/10.1016/j.mex.2019.07.001 6. A. Litvinenko, Y. Sun, M.G. Genton, D.E. Keyes, Likelihood approximation with hierarchical matrices for large spatial datasets, Computational Statistics Data Analysis 137, pp 115-132, 2019, https://doi.org/10.1016/j.csda.2019.02.002 30 / 30