Mat-2.108 Independent Research Project in Applied Mathematics


     Perfusion Deconvolution
        via EM Algorithm
                      27th January 2004




             Helsinki University of Technology
     Department of Engineering Physics and Mathematics
               Systems Analysis Laboratory

               Helsinki Brain Research Center
               Functional Brain Imaging Unit

                       Tero Tuominen
                           51687J
Contents

   List of abbreviations and symbols

1 Introduction

2 Perfusion Model and Problem Description
  2.1 Discretization: 0th order approximation
  2.2 SVD Solution to Deconvolution
  2.3 Discretization: 1st order approximation

3 EM Algorithm
  3.1 Overview of EM Algorithm
  3.2 EM Algorithm applied to Perfusion Deconvolution
      3.2.1 Lange's Method in PET image reconstruction
      3.2.2 Vonken's Method

4 Improved application of EM

5 This Work
  5.1 Overview
  5.2 Detailed description and the parameters used

6 Results

7 Conclusions

   References




List of abbreviations and symbols
MRI         Magnetic Resonance Imaging
fMRI        Functional Magnetic Resonance Imaging
PWI         Perfusion Weighted Imaging
EM          Expectation Maximization
MLE         Maximum Likelihood Estimate
MTT         Mean Transit Time
CBV         Cerebral Blood Volume
CBF         Cerebral Blood Flow
SNR         Signal-to-Noise Ratio
TR          Time-to-Repeat
TE          Time-to-Echo
EPI         Echo-Planar Imaging
AIF  a(t)   Arterial Input Function
TCC  c(t)   Tissue Concentration Curve
     r(t)   Residue Function
     Ψ(t)   Impulse Response; Ψ(t) = CBF · r(t)
     a      vector or matrix, a ∈ ℝ^{n×m}, n, m > 1
     a      scalar, a ∈ ℝ
     A      random variable
     a      realization
     A      random vector or matrix
     a      realization of random vector or matrix




1    Introduction
Since its introduction in 1988, perfusion weighted fMRI has gained widespread
interest in the field of medical imaging. It offers an easy and, most importantly,
non-invasive method for monitoring brain perfusion and even its minor changes
in vivo. The general principles of perfusion weighted imaging (PWI) were introduced
by Villinger et al. in 1988 [1] and further developed by Rosen et al. in 1989 [2].
By injecting a bolus of an intravascular paramagnetic contrast agent and observing
its first-passage concentration-time curves in the brain they were able to gain
valuable insight into the functioning of the living organ.
    The theory of kinetics of intravascular tracers was developed by Meier and
Zierler in 1954 [3]. To extract all the knowledge methodologically possible one must
recover the so-called impulse response function for each volume of interest. This func-
tion characterizes the local perfusion properties. According to the work of Meier
and Zierler, however, in order to recover this function one must solve an integral
equation of the form

    c(t) = \int_0^t a(\tau)\,\Psi(t-\tau)\, d\tau.

This is a typical member of the class of equations known as Fredholm integral
equations. The integral also represents a so-called convolution; thus solving this
kind of equation is widely known as deconvolution.
    Deconvolution belongs to the class of inversion problems. That is, the theory of
Meier and Zierler (the equation above) describes the change in the input function a
as it experiences the changes resulting from the properties of the vasculature and
local perfusion (characterized by the impulse response Ψ). The result is a new func-
tion c. The inverse of this problem emerges when one measures the input function a
and its counterpart c and asks from what kind of mechanism these changes
originate, i.e. what is the impulse response Ψ.
    Several methods have been proposed to solve the inversion problem. Tradi-
tional methods such as Fourier and Laplace techniques fail in this case due to the
significant amount of noise present in the measurements. The noisy data
and the form of the problem as a typically hard-to-solve Fredholm equation
place an additional requirement on the method used to solve the problem: the
solution has to be recovered so that the effect of the noise is either cancelled out or
in some other way ignored, because an exact solution computed directly from the
noisy data is heavily biased and physiologically meaningless. This fact highlights
the significance of the physical model on which the solution method is based.
    The current standard method is based on an algebraic decomposition method
known as Singular Value Decomposition (SVD). It requires the discretization of
the equation and then regularises the ill-conditioned system of equations by cut-
ting off the smallest singular values. The method was introduced to the field by

Østergaard et al. [4].
     An alternative methodology for inversion is based on a probabilistic formula-
tion of the model for the problem and then solving it in terms of maximum likeli-
hood. Such a method was first introduced by Vonken et al. in 1999 [5]. It is based
on the Expectation-Maximization (EM) algorithm developed by Dempster et al. in
1977 [6]. The EM algorithm was introduced to the field of medical imaging in-
dependently by Shepp and Vardi in 1982 [7] and Lange and Carson in 1984 [8]
and further developed by Vardi, Shepp and Kaufman [9]. Vonken's work relies
heavily on that of Lange.
     There are four goals for this work. First, there is no comprehensive descrip-
tion of EM-based perfusion deconvolution; Vonken's paper is very dense and
brief when it comes to the theory. In some parts it is even inaccurate and falsely
justified. So here we try to offer a comprehensive and thorough description of the
EM algorithm and its application. We shall take particular care to formulate
our presentation in a mathematically precise form.
     Secondly, Vonken tries to base his version of the algorithm on the physical
model but fails to some extent. He simplifies at the expense of the physical model
by borrowing one result directly from Lange. The problem is that the result is de-
rived assuming a Poisson distribution for random variates which in reality follow
a normal distribution. In this work we correct this assumption and also the other
inaccurate parts of Vonken's work and see whether the results are affected.
     Third, we try to reproduce Vonken's results, and for this purpose a computer
program had to be created. We also implement the proposed changes and com-
pare their effects. These programs are to be created in such a manner that they can
later serve as research tools at the Helsinki Brain Research Center. The HBRC cur-
rently lacks such tools. The comparison of the methods is carried out by Monte
Carlo simulations. Since the main interest in this report, however, is in the the-
oretical aspects of the EM application, we do not concentrate too much on the
simulations and thus they are not meant to cover the subject fully.
     The fourth and last goal for this report is to fulfill the requirements of the course
Mat-2.108 Independent Research Project in Applied Mathematics at Helsinki Univer-
sity of Technology, Systems Analysis Laboratory.
     This report is organized as follows. First, in chapter 2, the perfusion model and
the problem description are presented. The SVD solution method and the dis-
cretization are also dealt with. Chapter 3 then describes the general EM algorithm. It
is followed by an introductory example of the use of EM in a typical problem; that
is, the EM complete-data embedding derived and used by Lange [8] and later
adopted by Vonken [5] is revisited. The aim is to offer a simple example and lay
the grounds for the later developments and the presentation of Vonken's work. Such
a derivation is not present even in Lange's original article. Chapter 4
is entirely devoted to the derivation of the corrected probabilistic model and the
EM algorithm based on it. Since the simplifications used by Vonken are omitted,
the derivation is tedious.
    The later chapters include the description of the simulations and their results.
The last chapter gives the conclusions.




2    Perfusion Model and Problem Description
Villinger and Rosen introduced the general principles of MR perfusion imaging
in 1988 and 1989 ([1],[2]). Using a paramagnetic intravascular contrast agent they
were able to detect a measurable change in the time series of the MR signal S(t). Assum-
ing a linear relationship between the concentration of a contrast agent c(t) and the change
in the transverse relaxation rate ∆R2, the concentration as a function of time can be
characterized as

    c(t) \propto \Delta R_2 = -\frac{1}{TE} \ln \frac{S(t)}{S_0},                (1)
where S0 is the baseline intensity of the signal.
    For intravascular tracers, i.e. tracers that remain strictly inside the vasculature,
a theoretical framework for mathematical analysis was developed by Meier and
Zierler in 1954 [3]. According to their work the concentration of a contrast agent
in the vasculature as a function of time can be represented as

    c(t) = F \int_0^t a(\tau)\, r(t-\tau)\, d\tau,                (2)

where a(t) is the concentration in a large artery (also called the Arterial Input Func-
tion, AIF) feeding the volume of interest (VOI). c(t) on the left hand side of
equation 2 typically refers to the concentration further in the tissue and is thus also
called the Tissue Concentration Curve or TCC. r(t) is the so-called residue function,
which is the fraction of tracer remaining in the system at time t. Formally it is defined as

    r(t) = 1 - \int_0^t h(s)\, ds,                (3)

where h(t) is the distribution of transit times, i.e. the time a plasma particle takes
to travel through the capillary vasculature detectable by dynamic susceptibility
contrast MRI (DSC-MRI). That is, h(t) is a probability density function. Hence
r(t) has the following properties: r(0) = 1 and r(∞) = 0. In practice it is also
possible that the TCC is delayed by some time td due to the non-zero distance
from where the AIF is measured to where the TCC is measured. In theory, this
shifts r(t) to the right. Hence, a more general form of the residue function is

    r_d(t) = \begin{cases} 0 & t < t_d \\ r(t - t_d) & t \ge t_d \end{cases}                (4)

From now on we will use the more general rd(t) without explicit statement and de-
note it simply as r(t).
   In perfusion weighted imaging the TCC c(t) and the AIF a(t) are measured. The
goal is to find the solution to the integral equation 2, i.e. to find the impulse
response Ψ(t) = F · r(t). This impulse response characterizes the properties of the
underlying vasculature to the extent that is methodologically possible.
   In practical PWI the main interest, however, is in the parameters MTT and CBF,
whose interdependency is characterized by the Central Volume Theorem [3]

    CBV = MTT \cdot CBF                (5)

MTT is the so-called Mean Transit Time, i.e. the expectation of h(t), and CBF is the
Cerebral Blood Flow, that is, F in equation 2. The CBV is simply the area under the
c(t) curve. In this work we concentrate on recovering only the CBF. Nevertheless, for
this purpose the whole impulse response has to be recovered.
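   As an illustration of how these quantities relate once an impulse response estimate is available, consider the following minimal MATLAB sketch. Reading CBF off as the maximum of Ψ and CBV as the area under c(t) follows the conventions used later in this report; the vectors t, c and Psi are assumed given.

```matlab
% Perfusion parameters from a recovered impulse response (a sketch).
% t, c and Psi are assumed given: time axis, TCC and impulse response.
CBF = max(Psi);        % Psi(t) = CBF * r(t), and r(t) peaks at r(0) = 1
CBV = trapz(t, c);     % area under the tissue concentration curve
MTT = CBV / CBF;       % Central Volume Theorem, eq. 5
```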


2.1    Discretization: 0th order approximation
The measurements of a(t) and c(t) are made at discrete time points {t0 , t1 , t2 , . . . , tn },
where the time between consecutive measurements is ∆t = TR. This represents a natural
discretization for problem 2. Traditionally eq. 2 is discretized directly with the
assumption that both a(t) and c(t) are constant over each time interval ∆t [4].
    This zeroth order (step function) approximation of the convolution integral 2
leads to the following linear formulation of the problem

    c(t_j) = c_j = \int_0^{t_j} a(\tau)\,\Psi(t_j - \tau)\, d\tau \approx \Delta t \sum_{i=0}^{j} a_i \Psi_{j-i}                (6)

where a(ti ) = ai and Ψ(tj ) = Ψj .
   By defining the matrix a^{0\circ} \in \mathbb{R}^{n\times n} as

    a^{0\circ} = \Delta t \begin{pmatrix} a_0 & 0 & \cdots & 0 \\ a_1 & a_0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ a_n & a_{n-1} & \cdots & a_0 \end{pmatrix}                (7)

and the discrete versions of Ψ(t) and c(t) as column vectors \Psi \in \mathbb{R}^{n\times 1} and c \in \mathbb{R}^{n\times 1},
it is possible to rewrite the approximated eq. 6 briefly as

    c = a^{0\circ} \cdot \Psi                (8)

    In practice, however, TR is on the order of seconds and a(t) varies over a
magnitude of 10 to 30 within a few seconds. This naturally gives rise to dis-
cretization errors.
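    As a concrete illustration, the operator of eq. 7 can be assembled in a few lines of MATLAB. This is only a sketch, with the AIF samples a (a column vector) and the repetition time TR assumed given:

```matlab
% Zeroth order (step function) convolution matrix of eq. 7 (a sketch).
% a : column vector of AIF samples a_0 ... a_n,  TR : sampling interval
n  = length(a);
A0 = TR * toeplitz(a, [a(1), zeros(1, n-1)]);  % lower triangular Toeplitz
```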


2.2   SVD Solution to Deconvolution
Traditionally in perfusion fMRI equation 8 is solved via Singular Value De-
composition (SVD) [4]. This regularises the typically ill-conditioned system of linear
equations 8. In general the SVD of a matrix a \in \mathbb{R}^{m\times n} is

    a = U \cdot D \cdot V^{T}                (9)

where U \in \mathbb{R}^{m\times m} and V \in \mathbb{R}^{n\times n} are orthogonal so that U^{T}U = V^{T}V = I. I is
an identity matrix. D is a diagonal matrix with the same dimensionality as a, and its
elements are the so-called singular values \{\sigma_i\}_{i=1}^{n}, i.e. D = \mathrm{diag}\{\sigma_i\}.
   The SVD's regularizing properties come up simply when inverting the decomposed
matrix a. From 9 it is easy to see that

    a^{-1} = V \cdot \mathrm{diag}\{1/\sigma_i\} \cdot U^{T}                (10)

Now, if some singular values are very small, i.e. \sigma_i \ll 1, the inversion becomes unstable
as the corresponding elements of the diagonal grow. Hence a pseudo-inversion is performed
in the case of small singular values: the large elements 1/\sigma_i corresponding to small
singular values \sigma_i are simply set to zero. In practice this requires a threshold un-
der which singular values are ignored. In the case of perfusion inversion this thresh-
old has been shown to be 0.2 × the largest singular value [4].
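    For concreteness, a minimal MATLAB sketch of this truncated-SVD pseudo-inverse might read as follows (the matrix A of eq. 8 and the measured vector c are assumed given):

```matlab
% Truncated-SVD deconvolution of c = A * Psi (eq. 8), a sketch.
% Singular values below 20% of the largest one are discarded [4].
[U, D, V] = svd(A);
s    = diag(D);
keep = s >= 0.2 * max(s);            % regularization by truncation
sinv = zeros(size(s));
sinv(keep) = 1 ./ s(keep);
Psi  = V * diag(sinv) * (U' * c);    % regularized pseudo-inverse, eq. 10
```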
    The SVD solution (pseudo-inverse) is not suitable for the approximation presented
in the next subsection because the trapezoidal approximation weights the individual
elements of a differently.


2.3   Discretization: 1st order approximation
The first order (trapezoidal) approximation of the convolution integral 2 is adopted
from Jacquez [10]. The measurements of a(t) and c(t) are made at the discrete time
points {t0 , t1 , t2 , . . . , tn }. Now 2 at time tj is approximated as

    c_j \approx \frac{\Delta t}{2} \sum_{i=1}^{j} (a_{j-i}\Psi_i + a_{j-i+1}\Psi_{i-1})                (11)

Assuming a0 = 0 and defining a^{1\circ} as

    a^{1\circ} = \frac{\Delta t}{2} \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ a_2 & 2a_1 & \cdots & 0 \\ a_3 & 2a_2 & 2a_1 & 0 \\ \vdots & & \ddots & \vdots \\ a_n & 2a_{n-1} & \cdots & 2a_1 \end{pmatrix}                (12)

we can write 11 briefly in vector notation as

    c = a^{1\circ} \cdot \Psi                (13)

This does not help in the case of the SVD solution but might be of assistance where a
direct discrete convolution is needed. EM is one of these cases.
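    Again for concreteness, a minimal MATLAB sketch of the operator of eq. 12 (the AIF samples a, with a(1) corresponding to a_1, and TR assumed given):

```matlab
% First order (trapezoidal) convolution matrix of eq. 12 (a sketch; a_0 = 0).
n  = length(a);
A1 = toeplitz(2*a, [2*a(1), zeros(1, n-1)]);  % 2*a_(i-j+1) on and below diagonal
A1(:, 1) = a;                                 % first column is not doubled
A1 = (TR / 2) * A1;
```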




3     EM Algorithm
McLachlan encapsulates the essence of EM algorithm as [11]
      The Expectation-Maximization (EM) algorithm is a broadly applicable
      approach to the iterative computation of maximum likelihood (ML)
      estimates, useful in a variety of incomplete-data problems [. . . ] On
      each iteration of the EM algorithm, there are two steps – called the
      expectation step or the E-Step and the maximization step or the M-step.
      [. . . ] The notion of ’incomplete-data’ includes the conventional sense
      of missing data, but it also applies to situations where the complete
      data represents what would be available from some hypothetical ex-
      periment. [. . . ] even when a problem does not at first appear to be an
      incomplete-data one, computation of MLE is often greatly facilitated
      by artificially formulating it be as such.
The first general treatment of the EM algorithm was published by Dempster et
al. in 1977 [6]. Since then it has been applied in numerous different fields. In per-
fusion fMRI it was first used by Vonken et al. in 1999 [5]. Vonken's work relies
heavily on that of Lange in 1984 [8]. Lange, however, applied EM to PET image
reconstruction.
     In this chapter a brief overview of the EM algorithm is offered first. It culmi-
nates in the statements of both the E- and M-steps in eqs. 18 and 19. This is followed
by an introductory overview of Lange's method [8], which is meant to offer a compre-
hensive example of the use of EM in a typical problem. Next Vonken's method [5]
is introduced. Particular care has been taken to formulate the assumptions made
in a mathematically precise form.

3.1   Overview of EM Algorithm
Here we offer a brief recap of the EM theory following McLachlan's book [11].
    Let Y be the random vector corresponding to the observed data y, that is,
y is Y's realization. Y has probability density function (pdf) g(y; Ψ), where Ψ
is the vector containing the unknown parameters to be estimated. The complete-data
random vector will respectively be denoted by X and its realization by x. X has
the pdf f (x; Ψ).
    The complete-data log likelihood function that could be formed for Ψ if x
were fully observable is

    \ln L(\Psi) = \ln f(x; \Psi)                (14)

   Define h as a many-to-one mapping from the complete-data sample space \mathcal{X} to
the incomplete-data sample space \mathcal{Y}

    h : \mathcal{X} \to \mathcal{Y}                (15)

Now we do not observe the complete data x in \mathcal{X} but instead the incomplete data
y = h(x) in \mathcal{Y}. Thus,

    g(y; \Psi) = \int_{\mathcal{X}(y)} f(x; \Psi)\, dx,                (16)

where \mathcal{X}(y) is the subset of the complete-data sample space \mathcal{X} determined by the
equation y = h(x).
   Eq. 16 in discrete form is

    g(y; \Psi) = \sum_{x : h(x) = y} f(x; \Psi)                (17)

    The problem here is to solve the incomplete-data (observable-data) log likelihood
maximization. The main idea of EM is to solve it in terms of the complete-data rep-
resentation L(Ψ) = f (x; Ψ). As the complete data are unobservable, the likelihood is
replaced by its conditional expectation given y and the current fit for Ψ, which at
iteration n is denoted by Ψ(n). In other words, the entire likelihood function is
replaced by its conditional expectation, not merely the complete-data variates.
    To crystallize the heuristic EM approach into concrete steps we have the follow-
ing.
    First choose an initial value (guess) Ψ(0) for the iteration to begin with.
    Next carry out the E-step, i.e. calculate the conditional expectation of the
log likelihood function given the current parameter estimate Ψ(n) and the obser-
vations y

    Q(\Psi; \Psi^{(n)}) = E_{\Psi^{(n)}}[\, \ln L(\Psi) \mid y, \Psi^{(n)} ]                (18)

    Finally the M-step: maximize Q(Ψ; Ψ(n)) with respect to the parameters Ψ

    \Psi^{(n+1)} = \arg\max_{\Psi} Q(\Psi; \Psi^{(n)})                (19)

    Now, if there are terms independent of Ψ in eq. 19, they do not contribute to
the new Ψ(n+1) because they drop out in the differentiation (i.e. maximization) with
respect to Ψ. In some cases this eases the derivation.

3.2     EM Algorithm applied to Perfusion Deconvolution
3.2.1 Lange’s Method in PET image reconstruction
Here we review Lange's derivation of his version of the physically based EM
algorithm. It is meant to serve as an introductory example and to clarify the use
of EM in practice.
    The idea in PET is to recover the values of the emission intensities Ψj when one
sees only sums of emissions over a finite time interval. Let the number of
emissions from pixel j during projection i be the random variate Xij,

    X_{ij} \sim Poisson(c_{ij}\Psi_j)                (20)

where the cij are assumed to be known constants. Next define the observable quan-
tity, i.e. their sum, the number of emissions recorded for projection i, as the
random variate Yi

    Y_i = \sum_j X_{ij}                (21)

Hence

    Y_i \sim Poisson\Big(\sum_j c_{ij}\Psi_j\Big)                (22)

From 20 it follows that

    P[X_{ij} = x_{ij}] = \frac{(c_{ij}\Psi_j)^{x_{ij}}}{x_{ij}!}\, e^{-c_{ij}\Psi_j}                (23)

and so

    f(x; \Psi) = \prod_i \prod_j P[X_{ij} = x_{ij}]                (24)

Thus with 14 we have

    \ln L(\Psi) = \sum_i \sum_j \{\, x_{ij} \ln(c_{ij}\Psi_j) - c_{ij}\Psi_j - \ln x_{ij}! \,\}                (25)

and eq. 18 yields, based on the linearity of the expectation,

    Q(\Psi; \Psi^{(n)}) = E_{\Psi^{(n)}}[\, \ln L(\Psi) \mid y, \Psi^{(n)} ]
                        = \sum_i \sum_j \{\, E[X_{ij} \mid y, \Psi^{(n)}] \ln(c_{ij}\Psi_j) - c_{ij}\Psi_j \,\} + R                (26)


R does not depend on Ψ. It includes the term E[ ln Xij ! | y, Ψ(n) ], which would
be difficult to calculate.
   The conditional expectation can be derived as follows

    E[X_{ij} \mid y, \Psi^{(n)}] = \sum_{k=0}^{y_i} k \cdot P[X_{ij} = k \mid y, \Psi^{(n)}]                (27)

where

    P[X_{ij} = k \mid y, \Psi^{(n)}] = \frac{P[X_{ij} = k,\, Y_i = y_i]}{P[Y_i = y_i]}
                                     = \frac{P[X_{ij} = k,\, \sum_{p\ne j} X_{ip} = y_i - k]}{P[Y_i = y_i]}
                                     = \binom{y_i}{k} \frac{(c_{ij}\Psi_j^{(n)})^k \big(\sum_{p\ne j} c_{ip}\Psi_p^{(n)}\big)^{y_i - k}}{\big(\sum_p c_{ip}\Psi_p^{(n)}\big)^{y_i}}                (28)

because Ψ(n) is a parameter vector and Xij is independent of the other Yk except
the Yi to which it itself contributes. Substituting this into eq. 27 and using

    \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} = (a+b)^n                (29)

and

    \binom{y_i}{k} k = y_i \binom{y_i - 1}{k - 1}, \quad y_i \ge k \ge 1                (30)

we finally get the conditional expectation of Xij and denote it by Nij

    N_{ij} = E[X_{ij} \mid y, \Psi^{(n)}] = \frac{y_i\, c_{ij}\Psi_j^{(n)}}{\sum_p c_{ip}\Psi_p^{(n)}}                (31)


Now, if the initial guess Ψ(0) is positive then the Nij are all positive. Hence the E-step
is completed and yields

    Q(\Psi; \Psi^{(n)}) = \sum_i \sum_j \{\, N_{ij} \ln(c_{ij}\Psi_j) - c_{ij}\Psi_j \,\} + R                (32)

    Now the M-step is performed by differentiating eq. 32 with respect to Ψ and equating
its derivatives to zero. Differentiation yields

    \frac{\partial}{\partial \Psi_j} Q(\Psi; \Psi^{(n)}) = \sum_i \frac{N_{ij}}{\Psi_j} - \sum_i c_{ij}                (33)

and setting this to zero and solving for Ψj yields the new estimate Ψj(n+1)

    \Psi_j^{(n+1)} = \frac{\sum_i N_{ij}}{\sum_i c_{ij}} = \frac{\Psi_j^{(n)}}{\sum_i c_{ij}} \sum_i \frac{y_i\, c_{ij}}{\sum_p c_{ip}\Psi_p^{(n)}}                (34)

This solution truly maximizes Q, which can be seen as follows. Q's second derivative
is

    \frac{\partial^2}{\partial \Psi_j^2} Q(\Psi; \Psi^{(n)}) = -\sum_i \frac{N_{ij}}{\Psi_j^2}                (35)

and the mixed second derivatives are zero. Thus the quadratic form \Psi^{T} H \Psi, where H
denotes the Hessian matrix of Q, is strictly negative for all Ψ ≠ 0 with Ψj > 0. That is,
eq. 34 represents the point of a concave function where its gradient equals the zero
vector.
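    In MATLAB the whole iteration of eq. 34 condenses to a few lines. The following is only a sketch, assuming the coefficients cij are collected in a matrix C, the counts yi in a vector y, and the iteration count in nIter:

```matlab
% Lange's EM update, eq. 34, in matrix form (a sketch).
Psi = ones(size(C, 2), 1);          % positive initial guess Psi^(0)
for n = 1:nIter
    ratio = y ./ (C * Psi);         % y_i / sum_p c_ip Psi_p^(n)
    Psi   = Psi .* (C' * ratio) ./ sum(C, 1)';   % eq. 34
end
```

Note how the update is multiplicative: starting from a positive initial guess, all iterates stay positive, as observed above for the Nij.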




3.2.2 Vonken’s Method
Here we review the application of the EM algorithm to perfusion weighted fMRI
published by Vonken in 1999 [5]. First the article is briefly referred and then some
of its flaws are pointed out. The notation is changed to correspond this document
but no changes beyond this have been made. In the next section we try to offer
more exact and thorough treatment of the subject and correct the contradictions
in Vonken’s work.
    Vonken starts by defining the convolution operator a as a square matrix whose
elements are defined as
                                          Ai−j if i − j ≥ 0
                              aij =                                                                   (36)
                                          0    otherwise
where Ai−j denotes AIF at time ti − tj , i.e. A(ti − tj ). Thus the operator a corre-
sponds to the zeroth order approximation for the convolution integral, i.e. eq 8 in
page 5.
    The next two steps are responsible for cleverly formulating the complete-data
embedding. For this purpose Vonken assumes two distributions, one for the com-
plete and one for the observed data. The first one has the pdf f (X; Ψ) and is
assumed to follow the normal distribution. The observed data are also assumed to
follow the normal distribution, with pdf g(C; Ψ). These normality assumptions are
satisfactorily justified; especially the normality of C is treated thoroughly. First
Vonken defines the elements of the complete-data matrix as

    x_{ij} = a_{ij}\Psi_j                (37)

and then naturally the linkage to the observed (incomplete) data as

    c_i = \sum_k x_{ik} = \sum_k a_{ik}\Psi_k                (38)

The notation for the current estimate of the ci based on the current estimate Ψ(n) is

    \tilde{c}_i = \sum_k a_{ik}\Psi_k^{(n)}                (39)

     Next Vonken moves on to define the complete-data log likelihood func-
tion based on the assumption that the complete-data xij are distributed normally,
i.e. Xij ∼ N(aij Ψj , σij²). The variances σij² are later taken to be equal, and in the end
they cancel out in the M-step. He says:

       ". . . using Eq. 38 and the expectancy E[X_{ij} \mid c, \Psi^{(n)}] = c_i \cdot a_{ij}\Psi_j^{(n)} / \sum_j a_{ij}\Psi_j^{(n)}
       = a_{ij}\Psi_j^{(n)} c_i/\tilde{c}_i \equiv N_{ij}. This gives

       E[\ln f(X; \Psi) \mid c, \Psi^{(n)}] = \sum_i \sum_j \ln P[X_{ij}] = -\sum_i \sum_j (a_{ij}\Psi_j - N_{ij})^2 / 2\sigma_{ij}^2

       with P [Xij ] the probability of Xij and σij the standard deviation in the
       complete-data representation."

From this Vonken proceeds to the M-step. He takes the derivative of the condi-
tional expectation above and equates it to zero. This yields a set of equations,

    \sum_i a_{ij}(a_{ij}\Psi_j - N_{ij}) = 0                (40)

i.e. an equation for each Ψj(n+1). To finish, Vonken states: "A program has been
implemented that numerically solves Eq. 40 using a Newton-Raphson scheme."
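    As eq. 40 is linear in each Ψj, the update can in fact be written in closed form. The following MATLAB sketch does exactly that (A, c and the current estimate Psi are assumed given; columns of A that are entirely zero would need a safeguard in practice):

```matlab
% One iteration of Vonken's M-step, eq. 40 (a sketch; a closed-form solve
% is used here in place of the Newton-Raphson scheme mentioned in [5]).
ctilde = A * Psi;                            % current estimate of c, eq. 39
N      = (c ./ ctilde) .* (A .* Psi');       % N_ij = c_i a_ij Psi_j^(n) / ctilde_i
Psi    = (sum(A .* N, 1) ./ sum(A.^2, 1))';  % root of eq. 40 for each j
```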
     The above summary is not meant to be a complete description of
Vonken's article; rather it tries to describe the essential points of his derivation in
order to illustrate the facts that are to be changed here. The following are the points
that seem to need changes.
     First, Vonken's notation could be more exact. He does not make a notational
distinction between random variates and their realizations. This might be a con-
sequence of Lange's work being the reference point throughout his work.
     Secondly, a more explicit statement of the assumptions used might clarify the
derivation. In particular, even though Vonken lets the reader believe that the en-
tire derivation is faithfully based on the normality assumptions, there is one point
where this is not the case. Namely, when Vonken takes the conditional expecta-
tion E[Xij |c, Ψ(n) ] he does not mention its origins. In fact it is taken directly from
Lange [8]. The result, however, is derived based on the assumption of the Poisson
distribution Xij ∼ Poisson(aij Ψj(n)). This may serve as a satisfactory approxima-
tion but is clearly incorrect and ungrounded here. Vonken's obvious goal is to
ground his work on the physical model like Lange, but here he deviates from
this without any explanation.
     Finally, the calculation of the log likelihood of the complete data is question-
able. In EM theory the conditional expectation is taken of the entire log likeli-
hood function ln L(Ψ), as stated in eq. 18. If the log likelihood function is linear
in x in the terms containing the parameter Ψj, the result looks just as if the xij had
simply been replaced by their conditional expectations; for an example see eq. 25.
Here, however, the normality assumption leads to the non-linear term
(aij Ψj − xij )², whose conditional expectation, with the notation E[Xij |c, Ψ(n) ] ≡ Nij,
is not (aij Ψj − Nij )² as derived by Vonken: since E[X²] ≠ (E[X])², the correct
expectation also contains the conditional variance of Xij. This might be the
explanation for the fast and sometimes unstable convergence of the algorithm.




4    Improved application of EM
Vonken's reasoning in the complete-data embedding is adopted, and the convolution
operator a is defined as

    a_{ij} = \begin{cases} A_{i-j} & \text{if } i - j \ge 0 \\ 0 & \text{otherwise} \end{cases}                (41)

where Ai−j denotes the AIF at time ti − tj , i.e. A(ti − tj ). Thus a represents a zeroth
order approximation of the convolution integral, as eq. 8 shows.
    The distribution of the measured values of the time series of the AIF and the TCC
is assumed to be normal, as Vonken argued. This is also intuitively appealing, as the
values at issue are measurement values of a physical quantity after an almost linear
transformation.
    Now A refers both to the convolution matrix, which is treated as a random
variate (matrix), and to the random vector of AIF values. After measurement
A is realized as a: first as the AIF and then, after transformation 41, also as the
convolution operator a. These two differ only at the notational level: aj refers to
an element of the AIF whereas aij is an element of the operator 41.
    Based on the previous reasoning, the AIF values Aij are assumed to be normally
distributed around their means, which will be denoted here by the parameters µij, i.e.
E[Aij ] = µij. Later, when the actual measurements are made and the developed
algorithm is used to recover the residue, this parameter will be replaced by
the measured aij, i.e. Aij's realization. The variance associated with the parameter
is naturally σAIF². Explicitly,

    A_{ij} \sim N(\mu_{ij}, \sigma_{AIF}^2)                (42)

From this the distribution of the complete-data elements Xij can easily be de-
rived. They are defined as Xij = Aij Ψj and thus

    X_{ij} \sim N(\mu_{ij}\Psi_j, (\Psi_j\sigma_{AIF})^2)                (43)

Thus the complete-data pdf is of the familiar exponential form and is from now
on denoted by fX (x; Ψ).
   Now, as the observed data are defined as

    C_i = \sum_k X_{ik}                (44)

we have

    C_i \sim N\Big(\sum_k \mu_{ik}\Psi_k,\ \sum_k (\Psi_k\sigma_{AIF})^2\Big)                (45)

From now on the pdf of the random observed-data vector C is denoted by gC (c; Ψ).

From eq. 43 one can easily formulate the complete-data log likelihood function,
which is needed in the E-step,

    \ln L(\Psi) = \sum_i \sum_j \Big\{ -\ln\big(\sqrt{2\pi(\Psi_j\sigma_{AIF})^2}\big) - \frac{(\mu_{ij}\Psi_j - x_{ij})^2}{2(\Psi_j\sigma_{AIF})^2} \Big\}                (46)

Expanding the square and denoting by R the terms independent of Ψ, the
conditional expectation of the log likelihood can be written as

    Q(\Psi; \Psi^{(n)}) = E_{\Psi^{(n)}}[\, \ln L(\Psi) \mid c, \Psi^{(n)} ]
                        = \sum_i \sum_j \Big\{ -\ln\big(\sqrt{2\pi(\Psi_j\sigma_{AIF})^2}\big)
                          + \frac{\mu_{ij}}{\Psi_j\sigma_{AIF}^2} E[\, X_{ij} \mid c, \Psi^{(n)} ]
                          - \frac{1}{2(\Psi_j\sigma_{AIF})^2} E[\, X_{ij}^2 \mid c, \Psi^{(n)} ] \Big\} + R                (47)
From eq. 47 it is clear that two different conditional expectations are needed:

    E[\, X_{ij} \mid c, \Psi^{(n)} ] = \int x_{ij}\, f_{X|C,\Psi^{(n)}}(x_{ij} \mid c_i, \Psi^{(n)})\, dx_{ij}                (48)

    E[\, X_{ij}^2 \mid c, \Psi^{(n)} ] = \int x_{ij}^2\, f_{X|C,\Psi^{(n)}}(x_{ij} \mid c_i, \Psi^{(n)})\, dx_{ij}                (49)

where fX|C,Ψ(n) (xij |ci , Ψ(n) ) refers to the current conditional pdf of Xij given c and
Ψ(n). This can be found using Bayes' rule, familiar from probability theory,

    f_{X|C,\Psi^{(n)}}(x_{ij} \mid c_i, \Psi^{(n)}) = g_{C|X,\Psi^{(n)}}(c_i \mid x_{ij}, \Psi^{(n)}) \frac{f_{X|\Psi^{(n)}}(x_{ij} \mid \Psi^{(n)})}{g_{C|\Psi^{(n)}}(c_i \mid \Psi^{(n)})}                (50)

where gC|X,Ψ(n) (ci |xij , Ψ(n) ) respectively refers to the conditional pdf of Ci given
xij and the current Ψ(n). fX (xij ) and gC (ci ) are merely the pdfs of Xij and Ci.
    The functions in eq. 50 expressed explicitly are:

    f_{X|\Psi^{(n)}}(x_{ij} \mid \Psi^{(n)}) = \frac{1}{\sqrt{2\pi(\Psi_j^{(n)}\sigma_{AIF})^2}} \exp\Big(-\frac{(\mu_{ij}\Psi_j^{(n)} - x_{ij})^2}{2(\Psi_j^{(n)}\sigma_{AIF})^2}\Big)                (51)

and

    g_{C|\Psi^{(n)}}(c_i \mid \Psi^{(n)}) = \frac{1}{\sqrt{2\pi\sum_k(\Psi_k^{(n)}\sigma_{AIF})^2}} \exp\Big(-\frac{(\sum_k \mu_{ik}\Psi_k^{(n)} - c_i)^2}{2\sum_k(\Psi_k^{(n)}\sigma_{AIF})^2}\Big)                (52)

and

    g_{C|X,\Psi^{(n)}}(c_i \mid x_{ij}, \Psi^{(n)}) = \frac{1}{\sqrt{2\pi\sum_{k\ne j}(\Psi_k^{(n)}\sigma_{AIF})^2}} \exp\Big(-\frac{(\sum_{k\ne j} \mu_{ik}\Psi_k^{(n)} + x_{ij} - c_i)^2}{2\sum_{k\ne j}(\Psi_k^{(n)}\sigma_{AIF})^2}\Big)                (53)

The notation \sum_{k\ne j} means that the sum is taken over all k except j; in other
words \sum_{k\ne j} z_k = \sum_k z_k - z_j.
   From equations 50 through 53 it is obvious that the results will get messy.
Therefore we define the following short-hand notations:

    \gamma_{ij} = \mu_{ij}\Psi_j^{(n)}
    \alpha_j = (\Psi_j^{(n)}\sigma_{AIF})^2
    \Sigma\alpha = \sum_k (\Psi_k^{(n)}\sigma_{AIF})^2
    \Sigma\beta_j = \sum_{k\ne j} (\Psi_k^{(n)}\sigma_{AIF})^2
    \Sigma\gamma_i = \sum_k \mu_{ik}\Psi_k^{(n)}
    \Sigma\delta_{ij} = \sum_{k\ne j} \mu_{ik}\Psi_k^{(n)}

   One must not confuse Ψ and Ψ(n): the maximization in the M-step re-
quires differentiation of Q with respect to each Ψj, and the iterates Ψj(n) are treated
as constant parameters. Since eq. 52 has no dependency on xij, it will be denoted
merely by gC (ci ) in what follows.
   With the defined notation the conditional expectations become

    E[\, X_{ij} \mid c, \Psi^{(n)} ] = \frac{c_i\alpha_j + \gamma_{ij}\Sigma\beta_j - \alpha_j\Sigma\delta_{ij}}{\sqrt{2\pi}\, g_C(c_i)\, (\Sigma\alpha)^{3/2}} \exp\Big(-\frac{(\Sigma\gamma_i - c_i)^2}{2\Sigma\alpha}\Big)                (54)
and

    E[\, X_{ij}^2 \mid c, \Psi^{(n)} ] = \frac{1}{\sqrt{2\pi}\, g_C(c_i)\, (\Sigma\alpha)^{5/2}} \Big[ (c_i\alpha_j)^2 + (\gamma_{ij}\Sigma\beta_j)^2
                          + 2c_i\alpha_j(\gamma_{ij}\Sigma\beta_j - \alpha_j\Sigma\delta_{ij})
                          + \alpha_j\Sigma\beta_j(\Sigma\beta_j - 2\gamma_{ij}\Sigma\delta_{ij})
                          + \alpha_j^2(\Sigma\beta_j + \Sigma\delta_{ij}^2) \Big] \exp\Big(-\frac{(\Sigma\gamma_i - c_i)^2}{2\Sigma\alpha}\Big)                (55)

Substituting these into Q (eq. 47) completes the E-step.
   Now, Q is of the form

    Q(\Psi; \Psi^{(n)}) = \sum_i \sum_j K_{ij}(\Psi_j)                (56)

thus differentiation with respect to each Ψj yields

    \frac{\partial Q(\Psi; \Psi^{(n)})}{\partial \Psi_j} = \sum_i \frac{\partial K_{ij}(\Psi_j)}{\partial \Psi_j}                (57)

where the derivative of Kij (Ψj ) can be written as

    \frac{\partial K_{ij}(\Psi_j)}{\partial \Psi_j} = \Lambda_{ij}\Psi_j^{-3} - \Omega_{ij}\Psi_j^{-2} - \Psi_j^{-1}                (58)

where we have again defined the following short-hand notations

    \Lambda_{ij} = \frac{1}{\sqrt{2\pi}\,\sigma_{AIF}^2\, g_C(c_i)\, (\Sigma\alpha)^{5/2}} \Big[ (c_i\alpha_j)^2 + (\gamma_{ij}\Sigma\beta_j)^2
                          + 2c_i\alpha_j(\gamma_{ij}\Sigma\beta_j - \alpha_j\Sigma\delta_{ij})
                          + \alpha_j\Sigma\beta_j(\Sigma\beta_j - 2\gamma_{ij}\Sigma\delta_{ij})
                          + \alpha_j^2(\Sigma\beta_j + \Sigma\delta_{ij}^2) \Big] \exp\Big(-\frac{(\Sigma\gamma_i - c_i)^2}{2\Sigma\alpha}\Big)                (59)

and

    \Omega_{ij} = \frac{\mu_{ij}(c_i\alpha_j + \gamma_{ij}\Sigma\beta_j - \alpha_j\Sigma\delta_{ij})}{\sqrt{2\pi}\,\sigma_{AIF}^2\, g_C(c_i)\, (\Sigma\alpha)^{3/2}} \exp\Big(-\frac{(\Sigma\gamma_i - c_i)^2}{2\Sigma\alpha}\Big)                (60)
Hence, after the summation over i in eq. 57 and multiplication by Ψj³, we have the
equation for the root of the derivative

    \Psi_j^2 \sum_i 1 + \Psi_j \sum_i \Omega_{ij} - \sum_i \Lambda_{ij} = 0                (61)

This second-degree equation can easily be solved for Ψj. Choosing the positive
root we have

    \Psi_j^{(n+1)} = \Psi_j = \frac{-\sum_i \Omega_{ij} + \sqrt{\big(\sum_i \Omega_{ij}\big)^2 + 4\sum_i 1 \sum_i \Lambda_{ij}}}{2\sum_i 1}                (62)
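    Putting eqs. 54-62 together, one iteration of the corrected algorithm can be sketched in MATLAB as follows. This is only a sketch: the convolution matrix mu built from the measured AIF, the measured TCC c as a column vector, the noise level sAIF and the current estimate Psi are all assumed given.

```matlab
% One iteration of the corrected EM algorithm, eqs. 54-62 (a sketch).
nMeas = length(c);
gam = mu .* Psi.';                  % gamma_ij = mu_ij * Psi_j^(n)
alph = (Psi * sAIF).^2;             % alpha_j (column vector)
Sa  = sum(alph);                    % Sigma-alpha
Sb  = (Sa - alph).';                % Sigma-beta_j  (row vector)
Sg  = sum(gam, 2);                  % Sigma-gamma_i (column vector)
Sd  = Sg - gam;                     % Sigma-delta_ij (matrix)
gC  = exp(-(Sg - c).^2 / (2*Sa)) / sqrt(2*pi*Sa);               % eq. 52
w   = exp(-(Sg - c).^2 / (2*Sa)) ./ (sqrt(2*pi) * sAIF^2 * gC); % shared factor
Om  = mu .* (c.*alph.' + gam.*Sb - alph.'.*Sd) .* w / Sa^(3/2); % eq. 60
Lam = ((c.*alph.').^2 + (gam.*Sb).^2 ...
      + 2*(c.*alph.') .* (gam.*Sb - alph.'.*Sd) ...
      + alph.'.*Sb .* (Sb - 2*gam.*Sd) ...
      + alph.'.^2 .* (Sb + Sd.^2)) .* w / Sa^(5/2);             % eq. 59
sO  = sum(Om, 1).';   sL = sum(Lam, 1).';
Psi = (-sO + sqrt(sO.^2 + 4*nMeas*sL)) / (2*nMeas);             % eq. 62
```

Note that gC appears in a denominator; as discussed in section 5, this is where numerical underflow problems arise when gC(ci) becomes very small.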




5     This Work
5.1   Overview
The two main goals of this report are to describe the EM-based deconvolution
method published by Vonken [5], to try to improve it, and then to evaluate the
changes made by simulations. For this purpose both Vonken's method and
the new method were implemented on the MATLAB platform. Both methods
were also implemented using both the 0th and the 1st order approximations
of the convolution integral. Therefore in total four different methods were to be
evaluated.
    As stated in the introduction, however, this report concentrates mainly on
the theoretical aspects and a full evaluation is not included. Instead only the
reproducibility of the CBF was studied.
    The methods are evaluated using Monte Carlo simulation. For this purpose
the true values of the AIF, the TCC and the impulse response Ψ have to be known.
This was achieved by creating a numerical integrator which computes the "true" TCC
based on a given AIF and impulse response using eq. 2. This avoids the effect of
discretization errors arising from the discretized eq. 8, for example. This method also
enables us to change easily all the parameters affecting the impulse response;
most importantly, the delay is not bound to multiples of TR.
    After the true functions are known, Gaussian noise is added to the TCC through
eq. 1. This noisy TCC is then used when performing the deconvolution with the
methods to be tested. The numerical values used in this work were S0 = 300 and
k = 1. The signal-to-noise ratio was set to the clinically interesting value SNR = 35.
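    A sketch of how such noise can be generated through the signal model of eq. 1; the value of TE and the convention SNR = S0/σnoise are assumptions here, as the text does not spell them out:

```matlab
% Adding measurement noise to a simulated TCC via the signal model, eq. 1
% (a sketch; TE and the SNR convention below are assumptions).
S0 = 300;  k = 1;  SNR = 35;  TE = 1;       % TE is not specified in the text
S      = S0 * exp(-c * TE / k);             % invert eq. 1: concentration -> signal
Snoisy = S + (S0 / SNR) * randn(size(S));   % Gaussian noise at the signal level
cNoisy = -(k / TE) * log(Snoisy / S0);      % back to concentration, eq. 1
```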
    Vonken reported difficulties in deciding the optimal number of iterations needed.
In his clinical experiment he used four iterations. This number is adopted here as
well, and is used without further investigation for both the zeroth and the first
order approximations.
    However, the convergence properties of the algorithm change dramatically
when the proposed changes are implemented. Empirically (by trial and error) the fol-
lowing iteration numbers were found: the zeroth order approximation was iter-
ated 100 times, whereas in the case of the first order approximation the maximum
number of iterations was set to 400.
    Another problematic area not described by Vonken was the tendency of the
recovered impulse response to "upraise its tail". In other words, the convergence
nearly always produced an impulse response whose last and sometimes even
second-to-last elements were clearly and incorrectly large. This, however, did not
seem to affect the preceding elements. The same was observed in the case of the
new EM version. This may result in an erroneously determined CBF if the tail rises
higher than the true maximum of the impulse response. To overcome this diffi-
culty in the CBF estimation, the last four elements of the estimated impulse response
were simply set to zero.
    The new algorithm was found to suffer from minor numerical instabilities. The
values of eq. 52 are typically very small, and in the case of an initial guess that differs
greatly from the measured data the values of eq. 52 become too small for the available
accuracy. Therefore a good initial guess is needed. To guarantee equal treatment of
all methods, a common initial guess was set to a constant function of value 0.02. The
insufficient numerical accuracy, however, was in some cases so severe that some-
times (very rarely; in the present simulations the occurrence frequency was 5 times
out of 13 · 512 = 6656 simulations) the algorithm could not proceed, and in such
cases it was set to produce a NaN (Not a Number) result. The means and standard
deviations of the estimates were calculated ignoring these NaN values.
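    For completeness, a sketch of the NaN-tolerant averaging (the name cbfEst for the vector of 512 estimates per flow level is hypothetical):

```matlab
% Summary statistics ignoring the rare NaN results (a sketch).
ok      = ~isnan(cbfEst);
cbfMean = mean(cbfEst(ok));
cbfStd  = std(cbfEst(ok));
```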


5.2   Detailed description and the parameters used
There were two different sets of simulations: one with zero delay (td = 0) and the
other with a 2.7 second delay, i.e. td = 2.7 in eq. 4. Both were carried out in a sim-
ilar manner. The CBF was varied between 0.01 and 0.13 [arbitrary units] in 0.01
intervals. At each flow level 512 different noisy TCCs were generated and each
of them was deconvolved with every method. The average CBF estimate and its
standard deviation were then calculated. The original residue function (see eq. 2)
was generated from an h(t) of the form

    h(t) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} (t_1 - t_0)^{1-\alpha-\beta} (t - t_0)^{\alpha-1} (t_1 - t)^{\beta-1}

which empirically seems to be a reasonable model for h(t) [12]. The numerical values
were set to t0 = 0, t1 = 8, α = 2.3 and β = 3.8, corresponding to a physiologically
typical MTT ≈ 3 s. The AIF was modeled as a gamma-variate function of the
form

    AIF(t) = a(t - t_0)^b e^{-\frac{t - t_0}{c}}

where now a = 2, b = 4 and c = 1.1. TR was kept at 1.5 s throughout. For
comparison the SVD solution was also calculated.
    The simulations were very time consuming. Each of the two sets described
above took nearly two days to complete on a 2.4 GHz AMD platform.
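    The simulation inputs above can be reproduced with a short MATLAB sketch (taking the AIF's t0 to be 0, which the text does not state explicitly; h is only valid on [t0, t1], and r is written for scalar t):

```matlab
% Residue function and AIF used in the simulations (a sketch).
t0 = 0; t1 = 8; al = 2.3; be = 3.8;                % h(t) parameters
h   = @(s) gamma(al+be)/(gamma(al)*gamma(be)) ...
        * (t1-t0)^(1-al-be) .* (s-t0).^(al-1) .* (t1-s).^(be-1);
r   = @(t) 1 - integral(h, t0, min(t, t1));        % residue function, eq. 3
a = 2; b = 4; cpar = 1.1;
AIF = @(t) a * t.^b .* exp(-t / cpar);             % gamma-variate AIF, t0 = 0
```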


6     Results
The simulation results are depicted in figures 1 and 2. The first
one depicts the normal case whereas in the latter the TCC is delayed by 2.7 seconds.
The corresponding results for the standard SVD deconvolution method are shown
in figure 3. Each figure has four panels corresponding to the four different
versions of the EM-based deconvolution: the first two (upper row) depict the perfor-
mance of the new EM algorithm using the zeroth order and the first order
approximations of the convolution integral, respectively. The lower row depicts
the performance of Vonken's EM algorithm in the zeroth and first
order cases.
    The two most eye-catching features are the enormous standard deviation of
the traditional Vonken EM-based CBF estimate and the tendency of Vonken's
original algorithm to yield dramatically overestimated CBF estimates at low CBF
values. Standard deviations of such magnitude were not reported in Vonken's
original paper. Neither was the obviously incorrect convergence at low CBF
values. Since the last elements of the impulse responses recovered here were set
to zero, this huge variation in the CBF estimates has to originate from the physically
meaningful part of the impulse responses.
    The principal differences between the results obtained by the different methods
are as follows. In the case where no delay is present, Vonken's original algorithm
seems to provide results equal to those of the new zeroth order version developed here.
Despite the major difference in the standard deviation, the means of the results
seem equal. The simultaneous appearance of the huge change in the standard
deviation and a smaller change in the mean CBF value may indicate the existence
of a few major outliers.
    The first order approximation of the convolution integral results
in a more faithful estimate of the CBF. In both cases - the traditional and the new EM
deconvolution - the estimated CBF seems to follow the true value well. The new
version, however, is prone to overestimation. The original version of the EM deconvolu-
tion equipped with the more accurate approximation, however, yields very good
results. The constant overestimation of the new algorithm may be a result of a poor
selection of the maximum number of iterations.
    The presence of the 2.7 second delay in general deteriorates the performance of
both methods. The standard deviations are not affected but the CBF estimates are
lower throughout the range than before. The new algorithm with the higher order
approximation (upper right corner in figure 2), however, gives extremely
good results with modest standard deviation. However, the biased estimation in the
no-delay situation and the behaviour of the original algorithm with the higher
order approximation suggest that here one bias is compensated by another.




[Figure 1 appears here: four panels titled “new EM−d0 CBF”, “new EM−d1 CBF”,
“trad. EM−d0 CBF” and “trad. EM−d1 CBF”, each plotting estimated CBF [arb.units]
against true CBF [arb.units].]

Figure 1: Simulation results in the case of no delay (td = 0). The panels in the upper
row correspond to the new version of the EM algorithm, whereas the lower row
corresponds to Vonken’s original version. The left panels are computed with the
original zeroth order approximation of the convolution integral; in the right ones the
linear approximation is used. The thicker lines give the mean of the deconvolved
CBF estimate vs. the true CBF. The dashed lines are the mean of the CBF ± its
standard deviation. The dotted line corresponds to a perfect match.
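   The summary curves of Figures 1–3 can be reproduced along the following lines.
This is a sketch with placeholder data: `est` stands in for one method’s matrix of
deconvolved CBF estimates (13 flow levels × 512 noisy realizations, failed runs
stored as NaN), and the synthetic array below is an assumption for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

true_cbf = np.arange(0.01, 0.135, 0.01)       # 13 flow levels, 0.01 ... 0.13
rng = np.random.default_rng(1)
# Placeholder standing in for one method's deconvolution output:
# shape (13 flow levels, 512 noisy realizations), failed runs as NaN.
est = rng.normal(true_cbf[:, None], 0.01, size=(13, 512))

mean = np.nanmean(est, axis=1)                # NaN results ignored
std = np.nanstd(est, axis=1)

plt.plot(true_cbf, mean, 'k-', linewidth=2)   # thicker line: mean estimate
plt.plot(true_cbf, mean - std, 'k--')         # dashed: mean - std
plt.plot(true_cbf, mean + std, 'k--')         # dashed: mean + std
plt.plot(true_cbf, true_cbf, 'k:')            # dotted: perfect match
plt.xlabel('true CBF [arb.units]')
plt.ylabel('estimated CBF [arb.units]')
plt.show()
```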




[Figure 2 appears here: four panels titled “new EM−d0 CBF”, “new EM−d1 CBF”,
“trad. EM−d0 CBF” and “trad. EM−d1 CBF”, each plotting estimated CBF [arb.units]
against true CBF [arb.units].]

Figure 2: Simulation results in the case of a 2.7 second delay (td = 2.7). The panels
in the upper row correspond to the new version of the EM algorithm, whereas the
lower row corresponds to Vonken’s original version. The left panels are computed
with the original zeroth order approximation of the convolution integral; in the
right ones the linear approximation is used. The thicker lines give the mean of the
deconvolved CBF estimate vs. the true CBF. The dashed lines are the mean of the
CBF ± its standard deviation. The dotted line corresponds to a perfect match.




[Figure 3 appears here: a single panel titled “SVD−CBF norm. & delayed”, plotting
estimated CBF [arb.units] against true CBF [arb.units].]


        Figure 3: SVD results with and without delay. The dashed line corresponds
        to the delayed case.



7    Conclusions
In this work the EM-based deconvolution method developed by Vonken et al. [5]
was reviewed. Some theoretical background was also given, with attention paid
especially to discretization accuracy. Several flaws of Vonken’s article were
pointed out and corrected, which resulted in an entirely new version of the EM
deconvolution algorithm.
    The new EM-based algorithm was tedious to derive. The first major change with
respect to Vonken’s algorithm was implementing the more natural and better
grounded normality assumption concerning the distribution of the complete-data
variates. The second, more fundamental change amended Vonken’s conditional
expectation of the complete-data log likelihood function. This is likely to be the
source of the new algorithm’s different convergence properties.
    After implementing the first order approximation and the new version of the
algorithm, there were four different versions of the algorithm to be tested. Sim-
ulations were carried out with and without delay between the AIF and the TCC.
For comparison, the traditional SVD deconvolution was also carried out.
    The results were surprising. First of all, the strange behaviour of Vonken’s
original algorithm is in contrast to that reported in his original article. It seems
to be prone to dramatically overestimating low CBF values, and in addition it
suffers from a large standard deviation. Both effects are likely to originate from
the incorrectly derived equation 40.
    The new version of the algorithm converges much more slowly and hence re-
quires more iterations. Neither the optimal number of iterations nor the initial
guess was investigated here. Even so, the results were promising. The standard
deviation was of the same magnitude as that of the SVD; in fact, the zeroth order
approximation yielded results almost identical to those of the SVD. The use of the
first order approximation resulted in a minor overestimation of the CBF, but no-
tably the magnitude of this bias does not change as the CBF does. The higher order
approximation, however, results in a somewhat greater standard deviation of the
estimate. The absence of any improvement due to the higher order approximation
in the case of a delayed TCC, moreover, suggests that the excellent performance
of the new algorithm with the higher order approximation results from one bias
being compensated by another.
    Overall, the developments described in this report seem promising. They
guaranteed nearly certain convergence with a modest spread of CBF estimates,
a clear improvement with respect to Vonken’s original algorithm. The price paid
was slower convergence and longer computation time. Further research still has
to be carried out: the reproducibility of the full impulse response is of great
importance in some applications, and the effects of different delays and especially
of different shapes of the residue function also remain to be investigated.




References
 [1] A. Villringer, B. Rosen, J. Belliveau, J. Ackerman, R. Lauffer, R. Buxton,
     Y. Chao, V. Wedeen, and T. Brady, “Dynamic imaging with lanthanide
     chelates in normal brain: contrast due to magnetic susceptibility effects,”
     Magnetic Resonance in Medicine, vol. 6, no. 2, pp. 164–174, 1988.

 [2] B. R. Rosen, J. W. Belliveau, and D. Chien, “Perfusion Imaging by Nuclear
     Magnetic Resonance,” Magnetic Resonance Quarterly, vol. 5, no. 4, pp. 263–281,
     1989.

 [3] P. Meier and K. L. Zierler, “On the Theory of the Indicator-Dilution Method
     for Measurement of Blood Flow and Volume,” Journal of Applied Physiology,
     vol. 6, no. 12, pp. 731–744, 1954.

 [4] L. Østergaard, R. M. Weisskoff, D. A. Chesler, C. Gyldensted, and B. R.
     Rosen, “High Resolution Measurement of Cerebral Blood Flow using In-
     travascular Tracer Bolus Passages. Part 1: Mathematical Approach and Sta-
     tistical Analysis,” Magnetic Resonance in Medicine, vol. 36, pp. 715–725, 1996.

 [5] E.-J. P. Vonken, F. J. Beekman, C. J. Bakker, and M. A. Viergever, “Maxi-
     mum Likelihood Estimation of Cerebral Blood Flow in Dynamic Suscepti-
     bility Contrast MRI,” Magnetic Resonance in Medicine, vol. 41, pp. 343–350,
     1999.

 [6] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum Likelihood from
     Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society.
     Series B (Methodological), vol. 39, no. 1, pp. 1–38, 1977.

 [7] L. A. Shepp and Y. Vardi, “Maximum Likelihood Reconstruction for Emis-
     sion Tomography,” IEEE Transactions on Medical Imaging, vol. 1, pp. 113–122,
     1982.

 [8] K. Lange and R. Carson, “EM Reconstruction Algorithms for Emission and
     Transmission Tomography,” Journal of Computer Assisted Tomography, vol. 8,
     no. 2, pp. 306–316, 1984.

 [9] Y. Vardi, L. A. Shepp, and L. Kaufman, “A Statistical Model for Positron
     Emission Tomography,” Journal of the American Statistical Association, vol. 80,
     no. 389, pp. 8–20, 1985.

[10] J. A. Jacquez, Compartmental Analysis in Biology and Medicine. The University
     of Michigan Press, 2 ed., 1985.



[11] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions. Wiley
     Series in Probability and Statistics, Wiley, 1997.

[12] L. Østergaard, D. A. Chesler, R. M. Weisskoff, A. G. Sorensen, and B. R.
     Rosen, “Modeling Cerebral Blood Flow and Flow Heterogeneity From Mag-
     netic Resonance Residue Data,” Journal of Cerebral Blood Flow and Metabolism,
     vol. 19, pp. 690–699, 1999.





Weitere ähnliche Inhalte

Was ist angesagt?

Acoustics and vibrations mechanical measurements - structural testing part ...
Acoustics and vibrations   mechanical measurements - structural testing part ...Acoustics and vibrations   mechanical measurements - structural testing part ...
Acoustics and vibrations mechanical measurements - structural testing part ...Indian Institute of Technology, Kanpur
 
Fuzzy and entropy facial recognition [pdf]
Fuzzy and entropy facial recognition  [pdf]Fuzzy and entropy facial recognition  [pdf]
Fuzzy and entropy facial recognition [pdf]ijfls
 
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint RecognitionExtended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint RecognitionCSCJournals
 
Intellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic AlgorithmIntellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic AlgorithmIJCSIS Research Publications
 
Paper id 26201482
Paper id 26201482Paper id 26201482
Paper id 26201482IJRAT
 
Paper id 24201464
Paper id 24201464Paper id 24201464
Paper id 24201464IJRAT
 
Detection of Carotid Artery from Pre-Processed Magnetic Resonance Angiogram
Detection of Carotid Artery from Pre-Processed Magnetic Resonance AngiogramDetection of Carotid Artery from Pre-Processed Magnetic Resonance Angiogram
Detection of Carotid Artery from Pre-Processed Magnetic Resonance AngiogramIDES Editor
 
Face Recognition Using Sign Only Correlation
Face Recognition Using Sign Only CorrelationFace Recognition Using Sign Only Correlation
Face Recognition Using Sign Only CorrelationIDES Editor
 
Image segmentation Based on Chan-Vese Active Contours using Finite Difference...
Image segmentation Based on Chan-Vese Active Contours using Finite Difference...Image segmentation Based on Chan-Vese Active Contours using Finite Difference...
Image segmentation Based on Chan-Vese Active Contours using Finite Difference...ijsrd.com
 
Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)Tatsuya Yokota
 
fMRI Segmentation Using Echo State Neural Network
fMRI Segmentation Using Echo State Neural NetworkfMRI Segmentation Using Echo State Neural Network
fMRI Segmentation Using Echo State Neural NetworkCSCJournals
 
A new hybrid method for the segmentation of the brain mris
A new hybrid method for the segmentation of the brain mrisA new hybrid method for the segmentation of the brain mris
A new hybrid method for the segmentation of the brain mrissipij
 
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHMVIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHMijcsa
 

Was ist angesagt? (17)

Acoustics and vibrations mechanical measurements - structural testing part ...
Acoustics and vibrations   mechanical measurements - structural testing part ...Acoustics and vibrations   mechanical measurements - structural testing part ...
Acoustics and vibrations mechanical measurements - structural testing part ...
 
Fuzzy and entropy facial recognition [pdf]
Fuzzy and entropy facial recognition  [pdf]Fuzzy and entropy facial recognition  [pdf]
Fuzzy and entropy facial recognition [pdf]
 
Report
ReportReport
Report
 
1 sati
1 sati1 sati
1 sati
 
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint RecognitionExtended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition
 
Intellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic AlgorithmIntellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
Intellectual Person Identification Using 3DMM, GPSO and Genetic Algorithm
 
Paper id 26201482
Paper id 26201482Paper id 26201482
Paper id 26201482
 
Paper id 24201464
Paper id 24201464Paper id 24201464
Paper id 24201464
 
Detection of Carotid Artery from Pre-Processed Magnetic Resonance Angiogram
Detection of Carotid Artery from Pre-Processed Magnetic Resonance AngiogramDetection of Carotid Artery from Pre-Processed Magnetic Resonance Angiogram
Detection of Carotid Artery from Pre-Processed Magnetic Resonance Angiogram
 
Face Recognition Using Sign Only Correlation
Face Recognition Using Sign Only CorrelationFace Recognition Using Sign Only Correlation
Face Recognition Using Sign Only Correlation
 
main
mainmain
main
 
Image segmentation Based on Chan-Vese Active Contours using Finite Difference...
Image segmentation Based on Chan-Vese Active Contours using Finite Difference...Image segmentation Based on Chan-Vese Active Contours using Finite Difference...
Image segmentation Based on Chan-Vese Active Contours using Finite Difference...
 
Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)
 
fMRI Segmentation Using Echo State Neural Network
fMRI Segmentation Using Echo State Neural NetworkfMRI Segmentation Using Echo State Neural Network
fMRI Segmentation Using Echo State Neural Network
 
A new hybrid method for the segmentation of the brain mris
A new hybrid method for the segmentation of the brain mrisA new hybrid method for the segmentation of the brain mris
A new hybrid method for the segmentation of the brain mris
 
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHMVIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
 
Cq4201618622
Cq4201618622Cq4201618622
Cq4201618622
 

Andere mochten auch

El aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religiónEl aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religiónAna Albero
 
Information model of an electricity procurement planning system
Information model of an electricity procurement planning systemInformation model of an electricity procurement planning system
Information model of an electricity procurement planning systemT T
 
El aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religiónEl aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religiónAna Albero
 
Jogos no Design de Experiências de Aprendizagem de Programação Engajadoras
Jogos no Design de Experiências de Aprendizagem de Programação EngajadorasJogos no Design de Experiências de Aprendizagem de Programação Engajadoras
Jogos no Design de Experiências de Aprendizagem de Programação EngajadorasTancicleide Gomes
 
Campus+2+ methode+de+francais
Campus+2+ methode+de+francaisCampus+2+ methode+de+francais
Campus+2+ methode+de+francaisHong Luu
 
English as an Universal Language
English as an Universal LanguageEnglish as an Universal Language
English as an Universal LanguageMane Díaz
 

Andere mochten auch (13)

Analisi tg1
Analisi tg1Analisi tg1
Analisi tg1
 
Muovo adv
Muovo advMuovo adv
Muovo adv
 
El aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religiónEl aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religión
 
Popular culture
Popular culturePopular culture
Popular culture
 
Qg1
Qg1Qg1
Qg1
 
Information model of an electricity procurement planning system
Information model of an electricity procurement planning systemInformation model of an electricity procurement planning system
Information model of an electricity procurement planning system
 
Muovo adv
Muovo advMuovo adv
Muovo adv
 
El aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religiónEl aprendizaje cooperativo en el área de religión
El aprendizaje cooperativo en el área de religión
 
Analisi tg1
Analisi tg1Analisi tg1
Analisi tg1
 
Giao trinh c_can_ban
Giao trinh c_can_banGiao trinh c_can_ban
Giao trinh c_can_ban
 
Jogos no Design de Experiências de Aprendizagem de Programação Engajadoras
Jogos no Design de Experiências de Aprendizagem de Programação EngajadorasJogos no Design de Experiências de Aprendizagem de Programação Engajadoras
Jogos no Design de Experiências de Aprendizagem de Programação Engajadoras
 
Campus+2+ methode+de+francais
Campus+2+ methode+de+francaisCampus+2+ methode+de+francais
Campus+2+ methode+de+francais
 
English as an Universal Language
English as an Universal LanguageEnglish as an Universal Language
English as an Universal Language
 

Ähnlich wie Perfusion deconvolution via em algorithm

Calculus Research Lab 3: Differential Equations!
Calculus Research Lab 3: Differential Equations!Calculus Research Lab 3: Differential Equations!
Calculus Research Lab 3: Differential Equations!A Jorge Garcia
 
M2 Internship report rare-earth nickelates
M2 Internship report rare-earth nickelatesM2 Internship report rare-earth nickelates
M2 Internship report rare-earth nickelatesYiteng Dang
 
An Improved Empirical Mode Decomposition Based On Particle Swarm Optimization
An Improved Empirical Mode Decomposition Based On Particle Swarm OptimizationAn Improved Empirical Mode Decomposition Based On Particle Swarm Optimization
An Improved Empirical Mode Decomposition Based On Particle Swarm OptimizationIJRES Journal
 
Compiled Report
Compiled ReportCompiled Report
Compiled ReportSam McStay
 
Integral Equation Formalism for Electromagnetic Scattering from Small Particles
Integral Equation Formalism for Electromagnetic Scattering from Small ParticlesIntegral Equation Formalism for Electromagnetic Scattering from Small Particles
Integral Equation Formalism for Electromagnetic Scattering from Small ParticlesHo Yin Tam
 
Bifurcation analysis of a semiconductor laser with two filtered optical feedb...
Bifurcation analysis of a semiconductor laser with two filtered optical feedb...Bifurcation analysis of a semiconductor laser with two filtered optical feedb...
Bifurcation analysis of a semiconductor laser with two filtered optical feedb...mpiotr
 
Mom slideshow
Mom slideshowMom slideshow
Mom slideshowashusuzie
 
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...IRJET Journal
 
A New Method Based on MDA to Enhance the Face Recognition Performance
A New Method Based on MDA to Enhance the Face Recognition PerformanceA New Method Based on MDA to Enhance the Face Recognition Performance
A New Method Based on MDA to Enhance the Face Recognition PerformanceCSCJournals
 
Molecular dynamics and namd simulation
Molecular dynamics and namd simulationMolecular dynamics and namd simulation
Molecular dynamics and namd simulationShoaibKhan488
 
morten_bakkedal_dissertation_final
morten_bakkedal_dissertation_finalmorten_bakkedal_dissertation_final
morten_bakkedal_dissertation_finalMorten Bakkedal
 
Petar Petrov MSc thesis defense
Petar Petrov MSc thesis defensePetar Petrov MSc thesis defense
Petar Petrov MSc thesis defensePetar Petrov
 
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINES
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINESAPPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINES
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINEScseij
 
Application of particle swarm optimization to microwave tapered microstrip lines
Application of particle swarm optimization to microwave tapered microstrip linesApplication of particle swarm optimization to microwave tapered microstrip lines
Application of particle swarm optimization to microwave tapered microstrip linescseij
 
Algorithms for Sparse Signal Recovery in Compressed Sensing
Algorithms for Sparse Signal Recovery in Compressed SensingAlgorithms for Sparse Signal Recovery in Compressed Sensing
Algorithms for Sparse Signal Recovery in Compressed SensingAqib Ejaz
 

Ähnlich wie Perfusion deconvolution via em algorithm (20)

12098
1209812098
12098
 
Calculus Research Lab 3: Differential Equations!
Calculus Research Lab 3: Differential Equations!Calculus Research Lab 3: Differential Equations!
Calculus Research Lab 3: Differential Equations!
 
M2 Internship report rare-earth nickelates
M2 Internship report rare-earth nickelatesM2 Internship report rare-earth nickelates
M2 Internship report rare-earth nickelates
 
An Improved Empirical Mode Decomposition Based On Particle Swarm Optimization
An Improved Empirical Mode Decomposition Based On Particle Swarm OptimizationAn Improved Empirical Mode Decomposition Based On Particle Swarm Optimization
An Improved Empirical Mode Decomposition Based On Particle Swarm Optimization
 
Compiled Report
Compiled ReportCompiled Report
Compiled Report
 
Integral Equation Formalism for Electromagnetic Scattering from Small Particles
Integral Equation Formalism for Electromagnetic Scattering from Small ParticlesIntegral Equation Formalism for Electromagnetic Scattering from Small Particles
Integral Equation Formalism for Electromagnetic Scattering from Small Particles
 
Bifurcation analysis of a semiconductor laser with two filtered optical feedb...
Bifurcation analysis of a semiconductor laser with two filtered optical feedb...Bifurcation analysis of a semiconductor laser with two filtered optical feedb...
Bifurcation analysis of a semiconductor laser with two filtered optical feedb...
 
Mom slideshow
Mom slideshowMom slideshow
Mom slideshow
 
Sheikh-Bagheri_etal
Sheikh-Bagheri_etalSheikh-Bagheri_etal
Sheikh-Bagheri_etal
 
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...
Development, Optimization, and Analysis of Cellular Automaton Algorithms to S...
 
A New Method Based on MDA to Enhance the Face Recognition Performance
A New Method Based on MDA to Enhance the Face Recognition PerformanceA New Method Based on MDA to Enhance the Face Recognition Performance
A New Method Based on MDA to Enhance the Face Recognition Performance
 
1108.1170
1108.11701108.1170
1108.1170
 
Molecular dynamics and namd simulation
Molecular dynamics and namd simulationMolecular dynamics and namd simulation
Molecular dynamics and namd simulation
 
morten_bakkedal_dissertation_final
morten_bakkedal_dissertation_finalmorten_bakkedal_dissertation_final
morten_bakkedal_dissertation_final
 
Petar Petrov MSc thesis defense
Petar Petrov MSc thesis defensePetar Petrov MSc thesis defense
Petar Petrov MSc thesis defense
 
Richard Allen Thesis
Richard Allen ThesisRichard Allen Thesis
Richard Allen Thesis
 
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINES
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINESAPPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINES
APPLICATION OF PARTICLE SWARM OPTIMIZATION TO MICROWAVE TAPERED MICROSTRIP LINES
 
Application of particle swarm optimization to microwave tapered microstrip lines
Application of particle swarm optimization to microwave tapered microstrip linesApplication of particle swarm optimization to microwave tapered microstrip lines
Application of particle swarm optimization to microwave tapered microstrip lines
 
rkdiss
rkdissrkdiss
rkdiss
 
Algorithms for Sparse Signal Recovery in Compressed Sensing
Algorithms for Sparse Signal Recovery in Compressed SensingAlgorithms for Sparse Signal Recovery in Compressed Sensing
Algorithms for Sparse Signal Recovery in Compressed Sensing
 

Perfusion deconvolution via em algorithm

  • 1. Mat-2.108 Independent Research Project in Applied Mathematics Perfusion Deconvolution via EM Algorithm 27th January 2004 Helsinki University of Technology Department of Engineering Physics and Mathematics Systems Analysis Laboratory Helsinki Brain Research Center Functional Brain Imaging Unit Tero Tuominen 51687J
  • 2. Contents List of abbreviations and symbols ii 1 Introduction 1 2 Perfusion Model and Problem Description 4 2.1 Discretization: 0th order approximation . . . . . . . . . . . . . . . . 5 2.2 SVD Solution to Deconvolution . . . . . . . . . . . . . . . . . . . . . 6 2.3 Discretization: 1st order approximation . . . . . . . . . . . . . . . . 6 3 EM Algorithm 8 3.1 Overview of EM Algorithm . . . . . . . . . . . . . . . . . . . . . . . 8 3.2 EM Algorithm applied to Perfusion Deconvolution . . . . . . . . . 9 3.2.1 Lange’s Method in PET image reconstruction . . . . . . . . . 9 3.2.2 Vonken’s Method . . . . . . . . . . . . . . . . . . . . . . . . . 12 4 Improved application of EM 14 5 This Work 18 5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 5.2 Detailed description and the parameters used . . . . . . . . . . . . . 19 6 Results 19 7 Conclusions 23 References 26 i
  • 3. List of abbreviations and symbols MRI Magnetic Resonance Imaging fMRI Functional Magnetic Resonance Imaging PWI Perfusion Weighted Imaging EM Expectation Maximum MLE Maximum Likelihood Estimate MTT Mean Transit Time CBV Cerebral Blood Volume CBF Cerebral Blood Flow SNR Signal-to-Noise Ratio TR Time-to-Repeat TE Time-to-Echo EPI Echo-Planar Imaging AIF a(t) Arterial Input Function TCC c(t) Tissue Concentration Curve r(t) Residue Function Ψ(t) Impulse Response; Ψ(t) = CBF · r(t) a vector or matrix, a ∈ n×m , n, m > 1 a scalar, a ∈ A random variable a realization A random vector or matrix a realization of random vector or matrix ii
  • 4. 1 Introduction Since its introduction in 1988 perfusion weighted fMRI has gained widespread interest in the field of medical imaging. It offers an easy and - most importantly - a non-invasive method for monitoring brain perfusion and even its minor changes in vivo. General principles of perfusion weighted imaging (PWI) were introduced by Villinger et al. in 1988 [1] and further developed by Rosen et al. in 1989 [2]. By injecting a bolus of intravascular paramagnetic contrast agent and observing its first passage concentration-time curves in the brain they were able to gain a valuable insight to functioning of the living organ. The theory of kinetics of intravascular tracers was developed by Meier and Zierler in 1954 [3]. To gain all the knowledge methodologically possible one must recover so called impulse response function for each volume of interest. This func- tion characterises the local perfusion properties. According to the work of Meier and Zierler, however, in order to recover this function one must solve an integral equation of the form t c(t) = a(τ )Ψ(t − τ ) dτ, 0 This is a typical equation of class of equatiations known as Fredholm’s integral equations. The integral also represent so called convolution; thus solving this kind of equation is widely known as deconvolution. Deconvolution belongs to a class of inversion problems. That is, the theory of Meier and Zierler (equation above) describes the change in the input function a as it experiences the changes resulting from the properties of the vasculature and local perfusion (charactirezied by impulse response Ψ). The result is a new func- tion c. The inverse of this problem emerges when one measures input function a and its counterpart c and asks from what kind of mechanism do these changes originate from, i.e. what is the impulse response Ψ. Several methods have been proposed to solve the inversion problem. Tradi- tional methods such as Fourier and Laplace techniques fail in this case due to the significant amount of noise that is present in the measurements. The noisy data and the form of the problem as a typically hard-to-solve Fredholm’s equation make an additional requirement for the method used to solve the problem: the solution has to be recovered so that the effect of noise is either cancelled out or in some other way ignored because an exact solution computed directly from the noisy data is heavily biased and physiologically meaningless. This fact highlights the significance of the physical model which the solution method is based on. The current standard method is based on an algebraic decomposition method known as Singular Value Decomposition (SVD). It requires the discretization of the equation and then reqularises the ill-conditioned system of equations by cut- ting off the smallest singular values. The method was introduced to the field by 1
  • 5. Østergaard et al. [4]. An alternative methodology for inversion is based on probabilistic formula- tion of the model for the problem and then solving it in term of maximum likeli- hood. Such a method was first introduced by Vonken et al. in 1999 [5]. It is based on the Expectation-Maximum (EM) algorithm developed by Dempster et al. in 1977 [6]. The EM algorithm was introduced to the field of medical imaging in- dependently by Shepp and Vardi in 1982 [7] and Lange and Carson in 1984 [8] and further developed by Vardi, Shepp and Kaufman [9]. Vonken’s work relies heavily on that of Lange’s. There are four goals for this work. First, there is no comprehensive descrip- tion of the EM-based perfusion deconvolution; Vonken’s paper is very dense and brief in what comes to the theory. In some parts it is even inaccurate and falsely justified. So here we try to offer a comprehensive and thorough desription of the EM algorithm and its application. We shall take an excessive care to formulate our presentation in a mathematically fluent form. Secondly, Vonken tries to base his version of the algorithm on the physical model but fails to some extent. He simplifies on the expense of the physical model by borrowing one result directly from Lange. The problem is that the result is de- rived assuming Poisson distribution for random variates which in reality follow normal distribution. In this work we correct this assumpition and also the other inaccuarte parts of Vonken’s work and see wether the results are affected. Third, we try to repeat Vonken’s results and for this purpose a computer pro- gram had to created. We also implement the proposed changes and try to com- pare their effects. These programs are to be created in such a manner that they can later serve as research tools at the Helsinki Brain Research Center. The HBRC cur- rently lacks such tools. The comparison of the methods is carried out by Monte Carlo simulations. Since the main interest in this report, however, is in the the- oretical aspects of the EM application we do not concentrate too much on the simulations and thus they are not meant to fully cover the subject. The fourth and the last goal for this report is to fulfill to requirements of course Mat-2.108 Independent Research Project in Applied Mathematics at Helsinki Univer- sity of Technology in Systems Analysis Laboratory. This report is organized as follows. First in chapter 2 the perfusion model and the problem description are represented. Also the SVD solution method and dis- cretization is dealt with. Then the chapter 3 describes the general EM algorithm. It is followed by in introductory example of the use of EM in typical problem, that is, the EM complete-data embedding derived and used by Lange [8] and later adopted by Vonken [5] is revisited. The aim is to offer a simple example and lay grounds for the later developements and representation of Vonken’s work. Such derivation is not present even in the original Lange’s article. The next chapter 4 is entirely devoted to the derivation of the corrected probabilistic model and the 2
  • 6. EM algorithm based on it. Since the simplifications used by Vonken are omitted the derivation is tedious. The later chapter include the description of the simulation and their results. The last chapter gives the conlusions. 3
  • 7. 2 Perfusion Model and Problem Description Villinger and Rosen introduced the general principles of MR perfusion imaging in 1988 and 1989 ([1],[2]). Using paramagnetic intravascular contrast agent he was able to detect measurable change in time series of MR signal S(t). Assum- ing linear relatioship between concentration of a contrast agent c(t) and change in transverse relaxation rate ∆R2 the concentration as a function of time can be characterized as 1 S(t) c(t) ∝ ∆R2 = − ln , (1) TE S0 where S0 is the baseline intensity of the signal. For intravascular tracer, i.e. tracers that remain strictly inside the vasculature, theoretical framework for mathematical analysis was developed by Meier and Zierler in 1954 [3]. According to their work the concentration of a contrast agent in vasculature as a function of time can be represented as t c(t) = F a(τ )r(t − τ ) dτ, (2) 0 where a(t) is the concentration in large artery (also called Arterial Input Func- tion, AIF) feeding the volyme of interest (VOI). c(t) on the left hand side of the equation 2 typically refers to concentration further in tissue and is thus also called Tissue Concentration Curve or TCC. r(t) is so called residue function which is the fraction of tracer remaining in the system at time t. Formally it is defined as t r(t) = 1 − h(s) ds, (3) 0 where h(t) is the distribution of transit times, i.e. the time a plasma particle takes to travel through the capillary vasculature detectable by dynamic susceptibility contranst MRI (DSC-MRI). That is, h(t) is a probability density function. Hence r(t) has the following properties: r(0) = 1 and r(∞) = 0. In practice it is also possible that the TCC is delayed be some time td due to the non-zero distance from where the AIF was measured to where the TCC is measured. In theory, this shifts r(t) to right. Hence, more general form of the residue function is 0 t < td rd (t) = (4) r(t − td ) t ≥ td From now on we will use more general rd (t) without explicit statement and de- note it simply as r(t). In perfusion weighted imaging the TCC c(t) and AIF a(t) are measured. The goal is in finding the solution to integral equation 2, i.e finding out the impulse 4
  • 8. response Ψ(t) = F · r(t). This impulse response characterizes the prorerties of the underlying vasculature to the extent that is methodologically possible. In practical PWI the main interest, however, are the parameters MTT and CBF, whos interdependency is characterized by the Central Volume Theorem [3] CBV = M T T · CBF (5) MTT is so called Mean Transit Time, i.e. the expectancy of h(t) and CBF is Cerebral Blood Flow, that is, F in equation 2. The CBV is simply the area under the c(t) curve. In this work we concentrate on recovering only the CBF. Anyway, for this purpose the whole impulse response has to be recovered. 2.1 Discretization: 0th order approximation The measurements of a(t) and c(t) are made in discrete time intervals {t0 , t1 , t2 , . . . , tn } where time between each measurement is ∆t = T R. This represents natural dis- cretization for the problem 2. Traditionally eq. 2 is discretized directly with an assumption that both the a(t) and the c(t) are constants over the time interval ∆t [4]. This zeroth order (step function) approximation of the convolution integral 2 leads to following linear formulation for the problem tj j c(tj ) = cj = a(τ )Ψ(tj − τ )dτ ≈ ∆t ai Ψj−i (6) 0 i=0 where a(ti ) = ai and Ψ(tj ) = Ψj . By defining matrix a0◦ ∈ n×n as ···   a0 0 0   a1 a0 ··· 0   a0◦ = ∆t  . . ... . .  (7) . .     an an−1 · · · a0 n×1 n×1 and discrete versions of Ψ(t) and c(t) as column vectors Ψ ∈ and c ∈ it is possible to rewrite approximated eq. 6 briefly as c = a0◦ · Ψ (8) In practice, however, T R is of magnitude of seconds and a(t) varies between magnitude of 10 to 30 within a few seconds. This naturally gives rise to a dis- cretization errors. 5
  • 9. 2.2 SVD Solution to Deconvolution Traditionally in perfusion fMRI the equation 8 is solved via Singular Value De- composition (SVD) [4]. This regularises typically ill-conditined system of linear equations 8. In general SVD of matrix a ∈ m×n is a=U·D·V (9) where U ∈ m×m and V ∈ n×n are orthogonal so that U · U = V · V = I. I is an identity matrix. D is a diagonal matrix with same dimensionality as a and its elements are so called singular values {σ i }n , i.e. D = diag{σ i }. i=1 SVD’s regularizing properties come up simply in inverting the decomposed matrix a. From 9 it is easy to see that a−1 = V · diag{1/σ i } · U (10) Now, if singular value are very small, i.e. σ i << 1 the inversion becomes instable as the elements in the diagonal grow. Hence a pseudo-inversion is performed in case of small singular values, that is, large elements 1/σ i corresponding to small singular values σ i are simply set to zero. In practise this requires a threshold un- der which singular values are ingored. In case of perfusion inversion this thresh- old has been shown to be 0.2×the largest singular value [4]. SVD solution (pseudo-inverse) is not suitable for approximation represented in next subsection because trapezoidal approximation weightes separate elements of a differently. 2.3 Discretization: 1st order approximation The first order (trapezoidal) approximation for the convolution integral 2 is adopted from Jacquez [10]. The measurements of a(t) and c(t) are made in discrete time intervals {t0 , t1 , t2 , . . . , tn }. Now 2 at time tj is approximated as ∆t j cj ≈ (aj−i Ψi + aj−i+1 Ψi−1 ) (11) 2 i=1 Assuming a0 = 0 and defining a1◦ as   a1 0 ··· 0 a2 2a1 ··· 0     ∆t  a3 2a2 2a1 0  a1◦ = (12)     2  . . .. . .    . . .   an 2an−1 · · · 2a1 6
  • 10. we can write 11 briefly in vector notation as c = a1◦ · Ψ (13) This does not help in case of SVD solution but might be of assistance where direct discrete convolution is needed. EM is one of these. 7
  • 11. 3 EM Algorithm McLachlan encapsulates the essence of EM algorithm as [11] The Expectation-Maximization (EM) algorithm is a broadly applicable approach to the iterative computation of maximum likelihood (ML) estimates, useful in a variety of incomplete-data problems [. . . ] On each iteration of the EM algorithm, there are two steps – called the expectation step or the E-Step and the maximization step or the M-step. [. . . ] The notion of ’incomplete-data’ includes the conventional sense of missing data, but it also applies to situations where the complete data represents what would be available from some hypothetical ex- periment. [. . . ] even when a problem does not at first appear to be an incomplete-data one, computation of MLE is often greatly facilitated by artificially formulating it be as such. The first general treatment of the EM algorithm was published by Dempster et al. in 1977 [6]. Since then it has been applied in numerous different fields. In per- fusion fMRI it was first used by Vonken et al. in 1999 [5]. Vonken’s work relies heavily on that of Lange’s in 1984 [8]. Lange, however, applied EM to PET image reconstruction. In this chapter first a brief overview of the EM algorithm is offered. It culmi- nates to statemenst of both the E- and M-steps in eqs. 18 and 19. This is followed by introductory overview of Lange’s method [8] which is meant to offer a compre- hensive example of the use of EM in typical problem. Next Vonken’s method [5] is introduced. Excessive care has been taken to formulate the made assumption in mathematically fluent form. 3.1 Overview of EM Algorithm Here we offer a brief recap of the EM theory imitating McLachlan’s book [11]. Let Y be the random vector corresponding to the observed data y, that is, y is Y’s realization. Y has probability density function (pdf) g(y; Ψ) where Ψ is the vector containing the unknown parameters to be estimated. Respectively complete-data random vector will be denoted by X and respectively its realiza- tion as x. X has the pdf f (x; Ψ). The complete-data log likelihood function that could be formed for Ψ if x were fully observable is ln L(Ψ) = ln f (x; Ψ) (14) Define h as many-to-one mapping from complete-data sample space X to incomplete-data sample space Y h:X →Y (15) 8
  • 12. Now we do not observe complete-data x in X but instead incomplete-data y = h(x) in Y. Thus, g(y; Ψ) = f (x; Ψ) dx, (16) X (y) where X (y) is the subset of the complete-data sample space X determined by the equation y = h(x). The eq. 16 in discrete form is g(y; Ψ) = f (x; Ψ) (17) x:h(x)=y Problem here is to solve incomplete-data (observable-data) log likelihood max- imization. The main idea of EM is to solve it in terms of the complete-data rep- resentation L(Ψ) = f (x; Ψ). As it is unobservable it is replaced by its conditional expectation given y and current fit for Ψ which at iteration n is denoted by Ψ(n) . In other words, the entire likelihood function is replaced by its conditional expec- tation, not merely complete-data variates. To crystallize the heuristic EM approach to concrete steps we have the follow- ing: First choose an initial value/guess Ψ(0) for the iteration to begin with. Next carry out the the E-step i.e. calculate the conditional expectation of the log likelihood function given the current parameter estimate Ψ(n) and the obser- vations y Q(Ψ; Ψ(n) ) = EΨ(n) [ ln L(Ψ) | y, Ψ(n) ] (18) Finally the M-step: maximize Q(Ψ; Ψ(n) ) with respect to the parameters Ψ Ψ(n+1) = arg max Q(Ψ; Ψ(n) ) (19) Ψ Now, if there are terms independent of Ψ in eq. 19 they do not contribute to new Ψ(n+1) because they drop out in derivation (i.e. maximization) with respect to Ψ. In some cases this eases the derivation. 3.2 EM Algorithm applied to Perfusion Deconvolution 3.2.1 Lange’s Method in PET image reconstruction Here we review Lange’s derivation of his version of the physically based EM algorithm. It is meant to serve as an introductory example and to clarify the use of EM in practise. The idea in PET is to recover the values of the emission intensity Ψj when one sees only the sum of the emission over a finite time interval. Let the number of emissions from pixel j during projection i be the random variate Xij Xij ∼ P oisson(cij Ψj ) (20) 9
  • 13. where cij ’s are assumed to be known constants. Next define the observable quan- tity, i.e. their sum, be the number of emission recorded for projection i as the random variate Yi Yi = Xij (21) j Hence Yij ∼ P oisson( cij Ψj ) (22) j From 20 it follows that (cij Ψj )xij −cij Ψj P [Xij = xij ] = e (23) xij ! and so f (x; Ψ) = P [Xij = xij ] (24) i j Thus with 14 we have ln L(Ψ) = { xij ln(cij Ψj ) − cij Ψj − ln xij ! } (25) i j and eq. 18 yields based on the linearity of the expectation Q(Ψ; Ψ(n) ) = EΨ(n) [ ln L(Ψ) | y, Ψ(n) ] = { E[ Xij | y, Ψ(n) ] ln(cij Ψj ) − cij Ψj } + R (26) i j R does not depend on Ψ. It includes the term E[ ln Xij ! | y, Ψ(n) ] which would be difficult to calculate. Conditional expectation can be derived as follows yi (n) E[ Xij | y, Ψ ]= k · P [ Xij = k | y, Ψ(n) ] (27) k=0 where P [Xij = k, Yi = yi ] P [ Xij = k | y, Ψ(n) ] = P [Yi = yi ] P [Xij = k, pj Xip = yi − k] = P [Yi = yi ] (n) yi (cij Ψj )k ( pj cip Ψ(n) )yi −k p = (n) yi (28) k ( p cip Ψp ) 10
  • 14. because Ψ(n) is a parameter vector and Xij is independent of other Yj s except of the Yi to which itself contributes. Substituting this to eq. 27 and using n n k n−k a b = (a + b)n (29) k=0 k and yi yi − 1 k = yi , yi ≥ k > 1 (30) k k−1 we finally get the conditional expectation for Xij and denote it by Nij (n) (n) yi cij Ψj Nij = E[ Xij | y, Ψ ]= (n) (31) p cip Ψp Now, if the initial guess Ψ(0) is positive then Nij s are all positive. Hence E-step is completed and yields Q(Ψ; Ψ(n) ) = { Nij ln(cij Ψj ) − cij Ψj } + R (32) i j Now M-step is performed by derivating eq. 32 with respect to Ψ and equating its derivatives to zero. Derivation yields ∂ Nij Q(Ψ; Ψ(n) ) = − cij (33) ∂Ψj i Ψj i (n+1) and setting it to zero and solving for Ψj yields the new estimate Ψj (n) (n+1) iNij Ψj yi cij Ψj = = (n) (34) i cij i cij i p cip Ψp This solution truly maximizes Q. It can be seen as follows. Q’s second derivative is ∂2 Nij Q(Ψ; Ψ(n) ) = − 2 (35) ∂Ψi ∂Ψj j Ψj when i = j and zero otherwise. Thus the quadratic form Ψ H(Ψ)Ψ, where H denotes the Hessian matrix of Q, is strictly negative for all Ψj ≥ 0. That is, the eq. 34 represents the point of concave function where its gradient is equal to zero vector. 11
  • 15. 3.2.2 Vonken’s Method Here we review the application of the EM algorithm to perfusion weighted fMRI published by Vonken in 1999 [5]. First the article is briefly referred and then some of its flaws are pointed out. The notation is changed to correspond this document but no changes beyond this have been made. In the next section we try to offer more exact and thorough treatment of the subject and correct the contradictions in Vonken’s work. Vonken starts by defining the convolution operator a as a square matrix whose elements are defined as Ai−j if i − j ≥ 0 aij = (36) 0 otherwise where Ai−j denotes AIF at time ti − tj , i.e. A(ti − tj ). Thus the operator a corre- sponds to the zeroth order approximation for the convolution integral, i.e. eq 8 in page 5. The next two steps are responsible for cleverly formulating the complete-data embedding. For this purpose Vonken assumes two distributions, one for the com- plete and one for the observed data. The first one has the pdf f (X; Ψ) and it is assumed to follow the normal distribution. The observed data is also assumed to follow the normal distribution. Its pdf is g(C; Ψ). These normality assumption are satisfactorily justified; especially the normality of C is treated thoroughly. First Vonken defines the elements of complete-data matrix as xij = aij Ψj (37) and then naturally the linkage to the observed (incomplete-) data as ci = xik = aik Ψk (38) k k The notation for current estimate of ci s based on the current estimate Ψ(n) is (n) ci = ˜ aik Ψk (39) k Next Vonken moves onwards to define the complete-data log likelihood func- tion based on the assumption that the complete-data xij are distributed normally, 2 2 i.e. Xij ∼ N (aij Ψj , σij )· The variances σij are later taken to be equal and after all in the M-step they cancel out. He says: (n) (n) " . . . using Eq. 38 and the expectancy E[Xij |c, Ψ(n) ] = ci ·aij Ψj / j aij Ψj = ci /˜i ≡ Nij . This gives c (n) E[ln f (X; Ψ)|c, Ψ(n) ] = ln P [Xij ] = − (aij Ψj − Nij )2 /2σij 2 i j i j 12
  • 16. with P [Xij ] the probability of Xij and σij the standard deviation in the complete-data representation." From this Vonken proceeds to the M-step. He takes the derivative of the condi- tional expectation above and equates this to zero. This yield a set of equations (n) aij (aij Ψj − Nij ) = 0 (40) i (n+1) i.e. an equation for each Ψj . To finish, Vonken says: "A program has been implemented that numerically solves Eq. 40 using a Newton-Raphson scheme." The above summarization is not meant to be a complete description of the Vonken’s article; rather it tries to describe the essential points of his derivation in order to illustrate the the facts that are to be changed here. Here are the points that seem to need changes. First, Vonken’s notation could be more exact. He does not make notational difference between random variates and their realizations. This might be a con- sequence of the Lange’s work being the reference point throughout his work. Secondly, more explicit expression of the assumpition used migth clarify the derivation. Especially, even though Vonken let’s the reader to believe that the en- tire derivation is faithfully based on the normality assumptions there is one point where this is not the case. Namely when Vonken takes the conditional expecta- tion E[Xij |c, Ψ(n) ] he does not mention its origins. In fact it is taken directly from Lange [8]. The result, however, is derived based on the assumption of Poisson (n) distribution Xij ∼ P oisson(aij Ψj ). This may serve as a satisfactory approxima- tion but is clearly incorrect and ungrounded here. Vonken’s obvious goal is to try to ground his work on the physical model like Lange but here he deviated from this without any explanation. Finally, the calculation of the log likelihood of the complete-data is guestion- albe. In EM theory the conditional expectation is taken from the entire log likeli- hood function ln L(Ψ) as stated in eq. 18. If the log likelihood function is linear in x in terms containing the parameter Ψj the result looks just like the xij s had simply been replaced by their conditional expectations. For an example see eq. 25 in page 10. Here, however, the normality assumption leads to non-linear term (n) (aij Ψj − xij )2 whose conditional expectation with notation E[Xij |c, Ψ(n) ] ≡ Nij (n) is not (aij Ψj − Nij )2 as derived by Vonken. This might be the explanation for the fast and sometimes instable convergence of the algorithm. 13
  • 17. 4 Improved application of EM Vonken’s reasoning in complete-data embedding is adopted and the convolution operator a is defined as Ai−j if i − j ≥ 0 aij = (41) 0 otherwise where Ai−j denotes AIF at time ti − tj , i.e. A(ti − tj ). Thus a represents a zeroth order approximation for the convolution integral as eq. 8 on page 5 shows. The distribution of the measured values of time-series of AIF and TCC is as- sumed the be normal as Vonken argued. This is also intuitively appealing as the values at issue are measurement values of a physical quantity after almost linear transformation. Now A refers to both the convolution matrix which is treated as a random variate (matrix) and also to the random vector of AIF values. After measurement A is realized as a; first as the AIF and then after transfomation 41 also as the convolution operator a. These two differ only at the notational level: aj refers to the element of AIF whereas aij is an element of the operator 41. Based on the previous reasoning the AIF values Ai are assumed to be normally distributed around its mean which will be notated here with parameter µi , i.e. E[Aij ] = µij . Later when the actual measurements are made and the developed algorithm will be used to recover the residual this parameter will be replaced by the measured aij , i.e Aij ’s realization. The variance associated with the parameter 2 naturally is σAIF . Explicitly Aij ∼ N (µij , σ 2 ) AIF (42) From this the distribution of the complete-data elements Xij can be easily de- rived. They are defined as Xij = Aij Ψj and thus Xij ∼ N (µij Ψj , (Ψj σ AIF )2 ) (43) Thus the complete-data pdf is of the familiar exponential form and is from now on denoted by fX (x; Ψ). Now as the observed-data are defined as Ci = Xik (44) k we have Ci ∼ N ( µik Ψk , (Ψk σAIF )2 ) (45) k k From now on the pdf of random observed data vector C is denoted by gC (c; Ψ). 14
  • 18. From eq. 43 one can easily formulate the complete-data log likelihood function which is needed in the E-step √ (µij Ψj − xij )2 ln L(Ψ) = { − ln( 2π (Ψj σAIF )2 ) − } (46) i j 2(Ψj σAIF )2 Writing the binomial open and denoting by R the terms independent of Ψ the conditional expectation of the log likelihood can be written as Q(Ψ; Ψ(n) ) = EΨ(n) [ ln L(Ψ) | c, Ψ(n) ] √ µij = − ln( 2π (Ψj σAIF )2 ) + 2 E[ Xij | c, Ψ(n) ] i j Ψj σAIF 1 − E[ Xij | c, Ψ(n) ] + R 2 (47) 2(Ψj σAIF )2 From eq. 47 it is clear that two different conditional expectations are needed: E[ Xij | c, Ψ(n) ] = xij fX|C,Ψ(n) (xij |ci , Ψ(n) ) dxij (48) E[ Xij | c, Ψ(n) ] = 2 x2 fX|C,Ψ(n) (xij |ci , Ψ(n) ) dxij ij (49) where fX|C,Ψ(n) (xij |ci , Ψ(n) ) refers to the current conditional pdf of Xij given c and Ψ(n) . This can be found using basic property familiar from the probability theory (n) (n) fX|Ψ(n) (xij |Ψ(n) ) fX|C,Ψ(n) (xij |ci , Ψ ) = gC|X,Ψ(n) (ci |xij , Ψ ) (50) gC|Ψ(n) (ci |Ψ(n) ) where gC|X,Ψ(n) (ci |xij , Ψ(n) ) refers respectively to the conditional pdf of Ci given xij and current Ψ(n) . fX (xij ) and gC (ci ) are merely the pdfs of Xij and Ci . The functions in eq. 50 expressed explicitly are: (n) 1 (µij Ψj − xij )2 fX|Ψ(n) (xij |Ψ(n) ) = √ exp(− (n) ) (51) (n) 2π (Ψj σAIF )2 2(Ψj σAIF )2 and (n) (n) 1 ( k µik Ψk − ci )2 gC|Ψ(n) (ci |Ψ )= √ exp(− (n) ) (52) (n) 2 2 2π k (Ψk σAIF ) 2 k (Ψk σAIF ) and (n) 1 ( kj µik Ψk + xij − ci )2 gC|X,Ψ(n) (ci |xij , Ψ(n) ) = √ (n) exp(− (n) ) 2 2 kj (Ψk σAIF ) 2π 2 kj (Ψk σAIF ) (53) 15
  • 19. The notation kj means that the sum is taken over all k exept j, in other words kj zk = k zk − zj . From equations 50 through 53 it is obvious that the results will get messy. Therefore we define the following short-hand notations (n) µij Ψj = γij (n) (Ψj σAIF )2 = αj (n) (Ψk σAIF )2 = α k (n) (Ψk σAIF )2 = βj kj (n) (µik Ψk ) = γi k (n) (µik Ψk ) = δij kj One must not confuse Ψ and Ψ(n) because the maximization in the M-step re- (n) quires derivation of Q with respect to each Ψj and iterates Ψj are treated as con- stant parameters. Hence 52 has no dependency on xij it will be denoted merely by gC (ci ) in the future. The conditional expectations yield with defined notation ci αj + γ βj − αj δij ( γi − ci )2 E[ Xij | c, Ψ(n) ] = √ exp(− ) (54) 2πgC (ci ) α−3/2 2 α and 1 E[ Xij | c, Ψ(n) ] = √ 2 (ci αj )2 + (γij βj )2 + 2πgC (ci ) α5/2 +2ci αj (γij βj − αj δij ) + +αj βj ( βj − 2γij δij ) + 2 2 ( γi − ci )2 +αj ( βj + δij ) exp(− ) (55) 2 α Substituting these to Q (eq. 47) have have completed the E-step. Now, Q is of the form Q(Ψ; Ψ(n) ) = Kij (Ψj ) (56) i j thus the derivation with respect to each Ψj yields ∂Q(Ψ; Ψ(n) ) ∂Kij (Ψj ) = (57) ∂Ψj i ∂Ψj 16
Now, Q is of the form

    Q(\Psi; \Psi^{(n)}) = \sum_i \sum_j K_{ij}(\Psi_j)    (56)

and thus differentiation with respect to each Ψ_j yields

    \frac{\partial Q(\Psi; \Psi^{(n)})}{\partial \Psi_j} = \sum_i \frac{\partial K_{ij}(\Psi_j)}{\partial \Psi_j}    (57)

where the derivative of K_{ij}(Ψ_j) can be written as

    \frac{\partial K_{ij}(\Psi_j)}{\partial \Psi_j} = \Lambda_{ij}\Psi_j^{-3} - \Omega_{ij}\Psi_j^{-2} - \Psi_j^{-1}    (58)

and where we have again defined the following shorthand notations:

    \Lambda_{ij} = \frac{1}{\sqrt{2\pi}\, \sigma_{AIF}^2\, g_C(c_i)\, \alpha^{5/2}} \Big[ (c_i\alpha_j)^2 + (\gamma_{ij}\beta_j)^2 + 2c_i\alpha_j(\gamma_{ij}\beta_j - \alpha_j\delta_{ij}) + \alpha_j\beta_j(\beta_j - 2\gamma_{ij}\delta_{ij}) + \alpha_j^2(\beta_j + \delta_{ij}^2) \Big] \exp\!\left( -\frac{(\gamma_i - c_i)^2}{2\alpha} \right)    (59)

and

    \Omega_{ij} = \frac{\mu_{ij}(c_i\alpha_j + \gamma_{ij}\beta_j - \alpha_j\delta_{ij})}{\sqrt{2\pi}\, \sigma_{AIF}^2\, g_C(c_i)\, \alpha^{3/2}} \exp\!\left( -\frac{(\gamma_i - c_i)^2}{2\alpha} \right)    (60)

Hence, after setting eq. 57 to zero, summing over i and multiplying by Ψ_j^3, we have the equation for the root of the derivative:

    \Psi_j^2 \sum_i 1 + \Psi_j \sum_i \Omega_{ij} - \sum_i \Lambda_{ij} = 0    (61)

This second-degree equation can easily be solved for Ψ_j. Choosing the positive root, we have

    \Psi_j^{(n+1)} = \frac{-\sum_i \Omega_{ij} + \sqrt{\big(\sum_i \Omega_{ij}\big)^2 + 4\big(\sum_i 1\big)\big(\sum_i \Lambda_{ij}\big)}}{2\sum_i 1}    (62)
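To make the M-step concrete, the following MATLAB sketch implements one iteration of the update (62), computing eqs. 54 and 55 through the Gaussian posterior moments noted above, which avoids dividing by the very small values of g_C(c_i). The function and variable names are illustrative; this is a sketch under the stated assumptions, not the exact implementation evaluated in the next section. It relies on implicit expansion (MATLAB R2016b or later).

    function Psi = em_update(Psi, mu, c, sigma2)
    % One iteration of the update of eq. 62 (sketch only).
    % mu     : N-by-N operator of measured AIF values (eq. 41), mu_ij = a_{i-j}
    % c      : N-by-1 measured tissue concentration curve
    % Psi    : N-by-1 current iterate Psi^(n)
    % sigma2 : AIF noise variance sigma_AIF^2
    N       = numel(c);
    alpha_j = sigma2 * Psi(:)'.^2;            % (Psi_j sigma_AIF)^2, 1-by-N
    alpha   = sum(alpha_j);                   % sum over all k
    beta_j  = alpha - alpha_j;                % sum over k ~= j
    gam_i   = mu * Psi(:);                    % gamma_i, N-by-1
    gam_ij  = mu .* Psi(:)';                  % gamma_ij = mu_ij Psi_j, N-by-N
    del_ij  = gam_i - gam_ij;                 % delta_ij = gamma_i - gamma_ij
    % Eqs. 54-55 written as posterior moments (the exponential factor
    % cancels g_C(c_i) exactly):
    m = (c(:) .* alpha_j + gam_ij .* beta_j - alpha_j .* del_ij) / alpha;
    s = m.^2 + alpha_j .* beta_j / alpha;     % E[X^2|c] = mean^2 + variance
    Lam = sum(s, 1) / sigma2;                 % sum over i of Lambda_ij (eq. 59)
    Om  = sum(mu .* m, 1) / sigma2;           % sum over i of Omega_ij  (eq. 60)
    % Positive root of eq. 61 gives the new iterate (62); sum_i 1 = N:
    Psi = ((-Om + sqrt(Om.^2 + 4 * N * Lam)) / (2 * N))';
    end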
5 This Work

5.1 Overview

The two main goals of this report are to describe the EM-based deconvolution method published by Vonken [5], to try to improve it, and then to evaluate the changes made by simulations. For this purpose both Vonken's method and the new method were implemented on the MATLAB platform. Both methods were also implemented using both the 0th and the 1st order approximations of the convolution integral, so that in total four different methods were to be evaluated. As stated in the introduction, however, this report concentrates mainly on the theoretical aspects and a full evaluation is not included; only the reproducibility of CBF was studied.

The methods are evaluated using Monte Carlo simulation. For this purpose the true values of the AIF, the TCC and the impulse response Ψ have to be known. This was achieved by creating a numerical integrator which computes the "true" TCC from a given AIF and impulse response using eq. 2. This avoids the effect of discretization errors arising, for example, from the discretized eq. 8. It also makes it easy to change all the parameters affecting the impulse response; most importantly, the delay is not bound to multiples of TR. After the true functions are known, Gaussian noise is added to the TCC using eq. 1. This noisy TCC is then used when performing the deconvolution with the methods to be tested. The numerical values used in this work were S0 = 300 and k = 1. The signal-to-noise ratio was set to the clinically interesting value SNR = 35.
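For concreteness, a minimal sketch of the noise step follows. Since eq. 1 appears earlier in the report and is not reproduced here, the sketch assumes the usual DSC-MRI signal equation S(t) = S0 exp(-k c(t)) with additive Gaussian noise of standard deviation S0/SNR on the signal; this form, and the function and variable names, are assumptions for illustration only:

    function c_noisy = add_tcc_noise(c_true, S0, k, SNR)
    % Assumed form of eq. 1: S(t) = S0 * exp(-k * c(t)), noise added to S.
    S       = S0 * exp(-k * c_true);             % noise-free MR signal
    S_noisy = S + (S0 / SNR) * randn(size(S));   % additive Gaussian noise
    c_noisy = -log(S_noisy / S0) / k;            % back to concentration units
    end

With the values above, this would be called as c_noisy = add_tcc_noise(c_true, 300, 1, 35).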
Vonken reported difficulties in deciding the optimal number of iterations needed. In his clinical experiment he used four iterations, and that number is adopted here as well; it is used, without any further investigation, for both the zeroth and the first order approximations. However, the convergence properties of the algorithm change dramatically when the proposed changes are implemented. Empirically (by trial and error) the following iteration numbers were found: the zeroth order approximation was iterated 100 times, whereas in the case of the first order approximation the maximum number of iterations was set to 400.

Another problematic area not described by Vonken was the tendency of the recovered impulse response to "raise its tail". In other words, the convergence almost always produced an impulse response whose last, and sometimes even second-to-last, elements were clearly and incorrectly large. This, however, did not seem to affect the preceding elements. The same was observed with the new EM version. This may result in an erroneously determined CBF if the tail rises higher than the true maximum of the impulse response. To overcome this difficulty in CBF estimation, the last four elements of the estimated impulse response were simply set to zero.

The new algorithm was found to suffer from minor numerical instabilities. The values of eq. 52 are typically very small, and for an initial guess that differs greatly from the measured data they become too small for the available accuracy. Therefore a good initial guess is needed. To guarantee equal treatment of all methods, a common initial guess was set to a constant function of value 0.02. The insufficient numerical accuracy was, however, in some cases so severe that the algorithm could not proceed. This was very rare (in the present simulations it occurred 5 times out of 13 · 512 = 6656 runs), and in such cases the algorithm was set to produce a NaN (Not a Number) result. The means and standard deviations of the estimates were calculated ignoring these NaN values.

5.2 Detailed description and the parameters used

There were two different sets of simulations: one with zero delay (t_d = 0) and the other with a 2.7 second delay, i.e. t_d = 2.7 in eq. 4. Both were carried out in a similar manner. The CBF was varied between 0.01 and 0.13 [arbitrary units] in steps of 0.01. At each flow level 512 different noisy TCCs were generated, and each of them was deconvolved with every method. The average CBF estimate and its standard deviation were then calculated.

The original residue function (see eq. 2) was generated from an h(t) of the form

    h(t) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} (t_1 - t_0)^{1-\alpha-\beta} (t - t_0)^{\alpha-1} (t_1 - t)^{\beta-1}

which empirically seems to be a reasonable model for h(t) [12]. The numerical values were set to t_0 = 0, t_1 = 8, α = 2.3 and β = 3.8, corresponding to a physiologically typical MTT ≈ 3 s. The AIF was modeled as a gamma-variate function of the form

    AIF(t) = a\,(t - t_0)^b\, e^{-(t - t_0)/c}

where now a = 2, b = 4 and c = 1.1. Throughout, TR was kept at 1.5 s. For comparison, the SVD solution was also calculated.

The simulations were very time consuming: each of the two sets described above took nearly two days to complete on a 2.4 GHz AMD platform.
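For reference, the ground-truth curves of this section can be generated with a short script. The h(t) and AIF(t) below use exactly the forms and parameter values given above; the relations r(t) = 1 - \int_0^t h(s)\,ds and c(t) = CBF \cdot (AIF * r)(t) are the standard ones behind eq. 2, here assumed rather than quoted, and the fine integration grid stands in for the numerical integrator. Grid choices and variable names are illustrative:

    % Ground truth for the simulations of section 5.2 (sketch).
    dt = 0.01;  t = (0:dt:60)';               % fine grid for the integrator
    t0 = 0;  t1 = 8;  al = 2.3;  be = 3.8;    % parameters of h(t)
    h  = zeros(size(t));
    in = t > t0 & t < t1;                     % h(t) vanishes outside (t0, t1)
    h(in) = gamma(al + be) / (gamma(al) * gamma(be)) * (t1 - t0)^(1 - al - be) ...
          .* (t(in) - t0).^(al - 1) .* (t1 - t(in)).^(be - 1);
    r = 1 - cumsum(h) * dt;                   % residue function, r(0) = 1
    a = 2;  b = 4;  cpar = 1.1;               % gamma-variate AIF parameters
    aif = a * t.^b .* exp(-t / cpar);         % here t0 = 0
    CBF = 0.05;                               % one of the simulated flow levels
    tcc = CBF * conv(aif, r) * dt;            % c(t) = CBF (AIF conv r)(t)
    tcc = tcc(1:numel(t));                    % keep the causal part
    TR  = 1.5;                                % sampling interval [s]
    idx = 1 + round((0:TR:t(end)) / dt);      % sample at multiples of TR
    aif_s = aif(idx);  tcc_s = tcc(idx);      % "measured" AIF and true TCC
    % A delay t_d (e.g. 2.7 s) can be applied by shifting tcc on the fine
    % grid before sampling, so it is not tied to multiples of TR.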
6 Results

The simulation results are depicted in figures 1 and 2. The first depicts the normal case, whereas in the latter the TCC is delayed by 2.7 seconds. The corresponding results for the standard SVD deconvolution method are shown in figure 3. Figures 1 and 2 each contain four pictures corresponding to the four different versions of the EM-based deconvolution: the first two (upper row) depict the performance of the new EM algorithm using the zeroth order and the first order approximations of the convolution integral, and the lower row respectively depicts the performance of Vonken's EM algorithm in the zeroth and first order cases.

The two most eye-catching features are the enormous standard deviation of the traditional Vonken EM-based CBF estimate and the tendency of Vonken's original algorithm to yield dramatically overestimated CBF estimates at low CBF values. Standard deviations of this magnitude were not reported in Vonken's original paper, and neither was the obviously incorrect convergence at low CBF values. Since the last elements of the impulse responses recovered here were set to zero, this huge variation in the CBF estimates has to originate from the physically meaningful part of the impulse responses.

The principal differences between the results obtained by the different methods are as follows. When no delay is present, Vonken's original algorithm seems to provide results equal to those of the new zeroth order version developed here: despite the major difference in the standard deviation, the means of the results appear equal. The simultaneous appearance of a huge change in the standard deviation and a smaller change in the mean CBF value may indicate the existence of a few major outliers.

The first order approximation of the convolution integral results in a more faithful estimate of the CBF. In both cases, traditional and new EM deconvolution, the estimated CBF seems to follow the true value well. The new version, however, is prone to overestimation, whereas the original version of the EM deconvolution equipped with the more accurate approximation yields very good results. The constant overestimation of the new algorithm may be a result of a poor selection of the maximum number of iterations.

The presence of the 2.7 second delay in general deteriorates the performance of both methods. The standard deviations are not affected, but the CBF estimates are lower throughout the range than before. The new algorithm with the higher order approximation (upper right panel in figure 2), however, gives extremely good results with modest standard deviation. Still, the biased estimation in the no-delay situation and the behaviour of the original algorithm with the higher order approximation suggest that here one bias is compensated by another.
[Figure 1: four panels, "new EM-d0 CBF" and "new EM-d1 CBF" (upper row), "trad. EM-d0 CBF" and "trad. EM-d1 CBF" (lower row); axes: estimated CBF [arb. units] vs. true CBF [arb. units]]

Figure 1: Simulation results in the case of no delay (t_d = 0). The pictures in the upper row correspond to the new version of the EM algorithm, whereas the lower row corresponds to Vonken's original version. The left pictures are computed with the original zeroth order approximation of the convolution integral, while in the right ones the linear approximation is used. The thicker lines give the mean of the deconvolved CBF estimate vs. the true CBF. The dashed lines are the mean of the CBF ± its standard deviation. The dotted line corresponds to a perfect match.
[Figure 2: four panels, "new EM-d0 CBF" and "new EM-d1 CBF" (upper row), "trad. EM-d0 CBF" and "trad. EM-d1 CBF" (lower row); axes: estimated CBF [arb. units] vs. true CBF [arb. units]]

Figure 2: Simulation results in the case of a 2.7 second delay (t_d = 2.7). The pictures in the upper row correspond to the new version of the EM algorithm, whereas the lower row corresponds to Vonken's original version. The left pictures are computed with the original zeroth order approximation of the convolution integral, while in the right ones the linear approximation is used. The thicker lines give the mean of the deconvolved CBF estimate vs. the true CBF. The dashed lines are the mean of the CBF ± its standard deviation. The dotted line corresponds to a perfect match.
[Figure 3: single panel, "SVD-CBF norm. & delayed"; axes: estimated CBF [arb. units] vs. true CBF [arb. units]]

Figure 3: SVD deconvolution with and without delay. The dashed line is the case with delay.

7 Conclusions

In this work the EM-based deconvolution method developed by Vonken et al. [5] was reviewed. Some theoretical background was also given, and attention was paid especially to the discretization accuracy. Some flaws of Vonken's article were pointed out and corrected, which resulted in an entirely new version of the EM deconvolution algorithm.

The new EM-based algorithm was tedious to derive. The first major change with respect to Vonken's method was implementing the more natural and better grounded normality assumption concerning the distribution of the complete-data variates. The second, more fundamental change was amending Vonken's conditional expectation of the complete-data log-likelihood function. This is likely the source of the different convergence properties of the new algorithm.

After implementing the first order approximation and the new version of the algorithm, there were four different versions of the algorithm to be tested. Simulations were carried out with and without a delay between the AIF and the TCC. For comparison, the traditional SVD deconvolution was also carried out.

The results were surprising. First of all, the strange behaviour of Vonken's original algorithm is in contrast to that reported in his original article.
It seems to be prone to dramatically overestimating low CBF values and, in addition, it suffers from a large standard deviation. These problems are likely to originate from the wrongly derived equation 40.

The new version of the algorithm converges much more slowly and hence requires more iterations. Neither the optimal number of iterations nor the initial guess were studied here. Regardless of that, the results were promising. The standard deviation was of the same magnitude as that of the SVD; in fact, the zeroth order approximation yielded almost identical results to the SVD. The use of the first order approximation resulted in a minor overestimation of the CBF, but it is notable that the magnitude of the bias does not change as the CBF does. The higher order approximation, however, results in a somewhat greater standard deviation of the estimate. The absence of improvement due to the higher order approximation in the case of a delayed TCC, however, suggests that the excellent performance of the new algorithm with the higher order approximation results from one bias being compensated by another.

All in all, the developments described in this report seem promising. They were able to guarantee nearly certain convergence with a modest spread of CBF estimates, and a clear improvement with respect to Vonken's original algorithm was recorded. The price paid was slower convergence and longer computation time. Further research still has to be carried out: the reproducibility of the full impulse response is of great importance in some applications, and the effect of different delays, and especially of different shapes of the residue function, also remains to be investigated.
References

[1] A. Villringer, B. Rosen, J. Belliveau, J. Ackerman, R. Lauffer, R. Buxton, Y. Chao, V. Wedeen, and T. Brady, "Dynamic imaging with lanthanide chelates in normal brain: contrast due to magnetic susceptibility effects," Magnetic Resonance in Medicine, vol. 6, no. 2, pp. 164–174, 1988.

[2] B. R. Rosen, J. W. Belliveau, and D. Chien, "Perfusion Imaging by Nuclear Magnetic Resonance," Magnetic Resonance Quarterly, vol. 5, no. 4, pp. 263–281, 1989.

[3] P. Meier and K. L. Zierler, "On the Theory of the Indicator-Dilution Method for Measurement of Blood Flow and Volume," Journal of Applied Physiology, vol. 6, no. 12, pp. 731–744, 1954.

[4] L. Østergaard, R. M. Weisskoff, D. A. Chesler, C. Gyldensted, and B. R. Rosen, "High Resolution Measurement of Cerebral Blood Flow using Intravascular Tracer Bolus Passages. Part I: Mathematical Approach and Statistical Analysis," Magnetic Resonance in Medicine, vol. 36, pp. 715–725, 1996.

[5] E.-J. P. Vonken, F. J. Beekman, C. J. Bakker, and M. A. Viergever, "Maximum Likelihood Estimation of Cerebral Blood Flow in Dynamic Susceptibility Contrast MRI," Magnetic Resonance in Medicine, vol. 41, pp. 343–350, 1999.

[6] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the Royal Statistical Society, Series B (Methodological), vol. 39, no. 1, pp. 1–38, 1977.

[7] L. A. Shepp and Y. Vardi, "Maximum Likelihood Reconstruction for Emission Tomography," IEEE Transactions on Medical Imaging, vol. 1, pp. 113–122, 1982.

[8] K. Lange and R. Carson, "EM Reconstruction Algorithms for Emission and Transmission Tomography," Journal of Computer Assisted Tomography, vol. 8, no. 2, pp. 306–316, 1984.

[9] Y. Vardi, L. A. Shepp, and L. Kaufman, "A Statistical Model for Positron Emission Tomography," Journal of the American Statistical Association, vol. 80, no. 389, pp. 8–20, 1985.

[10] J. A. Jacquez, Compartmental Analysis in Biology and Medicine. The University of Michigan Press, 2nd ed., 1985.
[11] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions. Wiley Series in Probability and Statistics, Wiley, 1997.

[12] L. Østergaard, D. A. Chesler, R. M. Weisskoff, A. G. Sorensen, and B. R. Rosen, "Modeling Cerebral Blood Flow and Flow Heterogeneity From Magnetic Resonance Residue Data," Journal of Cerebral Blood Flow and Metabolism, vol. 19, pp. 690–699, 1999.