August 6-8, 2002
Topics Overview
Overview
   Mathematics Overview

     – Linear Algebra and Linear Systems
     – Probability and Hypothesis Testing
     – State Estimation

   Filtering Fundamentals

     – Linear and Non-linear Filtering
     – Multiple Model Filtering

   Tracking Basics

     – Track Maintenance
     – Data Association Techniques
     – Activity Control
Mathematics Review
Mathematics Review
   Linear Algebra and Linear Systems

     –   Definitions, Notations, Jacobians and Matrix Inversion Lemma
     –   State-Space Representation (Continuous and Discrete) and Observability

   Probability Basics

      –   Probability, Conditional Probability, Bayes' and Total Probability Theorem
     –   Random Variables, Gaussian Mixture, and Covariance Matrices

   Bayesian Hypothesis Testing

     –   Neyman-Pearson Lemma and Wald’s Theorem
     –   Chi-Square Distribution

   Estimation Basics

     –   Maximum Likelihood (ML) and Maximum A Posteriori (MAP) Estimators
     –   Least Squares (LS) and Minimum Mean Square Error (MMSE) Estimators
     –   Cramer-Rao Lower Bound, Fisher Information, Consistency and Efficiency
Vector and Matrix Basics
Definitions and Notations

$$\vec{a} = [a_i] = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}, \qquad \vec{a}^{\,T} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}$$

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}, \qquad A^T = [a_{ji}] = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{bmatrix}$$
Basic Matrix and Vector Properties
   Symmetric and Skew Symmetric Matrix:

$$A = A^T \text{ (symmetric)} \qquad\qquad A = -A^T \text{ (skew symmetric)}$$

   Matrix Product (NxS = [NxM] [MxS]):

$$C = [c_{ij}] = AB, \qquad c_{ij} = \sum_{k=1}^{m} a_{ik}\, b_{kj}$$

   Transpose of Matrix Product:

$$C^T = [c_{ji}] = (AB)^T = B^T A^T, \qquad c_{ji} = \sum_{k=1}^{m} b_{jk}\, a_{ki}$$

   Matrix Inverse:

$$A A^{-1} = I$$
Basic Matrix and Vector Properties
   Inner Product (vectors must have equal length):

$$\langle \vec{a}, \vec{b} \rangle = \vec{a}^{\,T} \vec{b} = \sum_{i=1}^{n} a_i b_i$$

   Outer Product (NxM = [N] [M]):

$$\vec{a}\,\vec{b}^{\,T} = C = [c_{ij}] = [a_i b_j]$$

   Matrix Trace:

$$\mathrm{Tr}(A) = \sum_{i=1}^{n} a_{ii} = \mathrm{Tr}(A^T)$$

   Trace of Matrix Product:

$$\mathrm{Tr}(AB) = \mathrm{Tr}(BA), \qquad \frac{\partial}{\partial A}\,\mathrm{Tr}(ABA^T) = A(B + B^T), \qquad \frac{\partial\,\mathrm{Tr}(AB)}{\partial A} = B^T$$
Matrix Inversion Lemma
   In Estimation Theory, the following complicated inverse appears:


$$\left( P^{-1} + H^T R^{-1} H \right)^{-1}$$

   The Matrix Inversion Lemma yields an alternative expression which does
    not depend on the inverses of the matrices in the above expression:

$$P - P H^T \left( H P H^T + R \right)^{-1} H P$$

   An alternative form of the Matrix Inversion Lemma is:

$$\left( A + B C B^T \right)^{-1} = A^{-1} - A^{-1} B \left( B^T A^{-1} B + C^{-1} \right)^{-1} B^T A^{-1}$$
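As an aside (not part of the original slides), the lemma is easy to verify numerically. In the minimal NumPy sketch below, P, R, and H are arbitrary illustrative values, with P and R chosen symmetric positive definite.

```python
# Numerical sanity check of the Matrix Inversion Lemma (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2
P = np.diag([2.0, 1.0, 3.0, 0.5])        # example "prior covariance"
R = np.diag([0.4, 0.9])                  # example "measurement noise covariance"
H = rng.standard_normal((m, n))          # example observation matrix

# Left-hand side: (P^-1 + H^T R^-1 H)^-1
lhs = np.linalg.inv(np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H)

# Right-hand side: P - P H^T (H P H^T + R)^-1 H P
rhs = P - P @ H.T @ np.linalg.inv(H @ P @ H.T + R) @ H @ P

print(np.allclose(lhs, rhs))             # True
```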
The Gradient
   The Gradient operator with respect to an n-dimensional vector “x” is:
$$\nabla_x = \left[ \frac{\partial}{\partial x_1} \;\cdots\; \frac{\partial}{\partial x_n} \right]^T$$

   Thus the gradient of a scalar function "f" is:

$$\nabla_x f = \left[ \frac{\partial f}{\partial x_1} \;\cdots\; \frac{\partial f}{\partial x_n} \right]^T$$

   The gradient of an m-dimensional vector-valued function is the n x m matrix:

$$\nabla_x \vec{f}^{\,T} = \left[ \frac{\partial}{\partial x_1} \;\cdots\; \frac{\partial}{\partial x_n} \right]^T \left[ f_1(\vec{x}) \;\cdots\; f_m(\vec{x}) \right]$$
The Jacobian Matrix
   The Jacobian Matrix is a matrix of derivatives describing a linear mapping from
    one set of coordinates to another. This is the transpose of the gradient of a
    vector-valued function (p. 24):
$$J(\vec{x}, \vec{x}\,') = \frac{\partial \vec{x}}{\partial \vec{x}\,'} = \frac{\partial (x_1, x_2, \ldots, x_m)}{\partial (x_1', x_2', \ldots, x_n')} = \begin{bmatrix} \dfrac{\partial x_1}{\partial x_1'} & \cdots & \dfrac{\partial x_1}{\partial x_n'} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_m}{\partial x_1'} & \cdots & \dfrac{\partial x_m}{\partial x_n'} \end{bmatrix}$$

   This is typically used as part of a Vector Taylor Expansion for approximating a
    transformation.

$$\vec{x} = \vec{x}(\vec{x}_o') + \left. \frac{\partial \vec{x}}{\partial \vec{x}\,'} \right|_{\vec{x}_o'} \cdot (\vec{x}\,' - \vec{x}_o') + \cdots$$
The Jacobian Matrix: An Example
      The conversion from Spherical to Cartesian coordinates yields:

$$\vec{x} = \begin{bmatrix} r \sin(b)\cos(e) & r \cos(b)\cos(e) & r \sin(e) \end{bmatrix}, \qquad \vec{x}\,' = \begin{bmatrix} r & b & e \end{bmatrix}$$

$$J(\vec{x}, \vec{x}\,') = \begin{bmatrix} \sin(b)\cos(e) & r\cos(b)\cos(e) & -r\sin(b)\sin(e) \\ \cos(b)\cos(e) & -r\sin(b)\cos(e) & -r\cos(b)\sin(e) \\ \sin(e) & 0 & r\cos(e) \end{bmatrix}$$

$$J(\vec{x}, \vec{x}\,')\, J(\vec{x}\,', \vec{x}) = \frac{\partial \vec{x}}{\partial \vec{x}\,'}\, \frac{\partial \vec{x}\,'}{\partial \vec{x}} = I$$
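As an added check (not part of the original slides), the analytic Jacobian above can be compared against central finite differences; the evaluation point in the sketch below is an assumed illustrative (range, bearing, elevation) triple.

```python
# Compare the analytic spherical-to-Cartesian Jacobian with finite differences.
import numpy as np

def spherical_to_cartesian(p):
    r, b, e = p
    return np.array([r*np.sin(b)*np.cos(e), r*np.cos(b)*np.cos(e), r*np.sin(e)])

def jacobian_analytic(p):
    r, b, e = p
    return np.array([
        [np.sin(b)*np.cos(e),  r*np.cos(b)*np.cos(e), -r*np.sin(b)*np.sin(e)],
        [np.cos(b)*np.cos(e), -r*np.sin(b)*np.cos(e), -r*np.cos(b)*np.sin(e)],
        [np.sin(e),            0.0,                    r*np.cos(e)],
    ])

def jacobian_numeric(f, p, h=1e-6):
    J = np.zeros((3, 3))
    for j in range(3):
        dp = np.zeros(3); dp[j] = h
        J[:, j] = (f(p + dp) - f(p - dp)) / (2*h)   # central difference column
    return J

p = np.array([1000.0, 0.3, 0.1])      # assumed (range, bearing, elevation)
print(np.allclose(jacobian_analytic(p), jacobian_numeric(spherical_to_cartesian, p)))
```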
Linear Systems Basics
Dirac Delta Function
   The Dirac Delta Function is defined by:

$$\delta(t - \tau) = 0 \quad \forall\, t \neq \tau$$

   This function is defined by its behavior under integration:

$$\int_a^b \delta(t - \tau)\, dt = 1, \qquad \tau \in [a, b]$$

   In general, the Dirac Delta Function has the following "sifting" behavior:

$$\int_a^b f(t)\, \delta(t - \tau)\, dt = f(\tau), \qquad \tau \in [a, b]$$

   The discrete version of this is called the Kronecker Delta:

$$\delta_{ij} = \begin{cases} 0 & i \neq j \\ 1 & i = j \end{cases}$$
State-Space Representation (Continuous)
   A Dynamic Equation is typically expressed in the standard form (p. 27):
$$\dot{\vec{x}}(t) = A(t)\,\vec{x}(t) + B(t)\,\vec{u}(t)$$

           x(t) is the state vector of dimension "nx"
           u(t) is the control input vector of dimension "ny"
           A(t) is the system matrix of dimension "nx x nx"
           B(t) is the input gain matrix of dimension "nx x ny"

   While the Measurement Equation is expressed in the standard form:

$$\vec{z}(t) = C(t)\,\vec{x}(t)$$

           z(t) is the measurement vector of dimension "nz"
           C(t) is the observation matrix of dimension "nz x nx"
Example State-Space System
   A typical (simple) example is the constant velocity system:

                                   ξ(t ) = 0
                                   
   This system is not yet in state-space form:

               ξ  0 1 ξ  0 0  u1 
                  
                 =   ξ  + 0 0 u 
               ξ  0 0          2 
                                               
                     (t ) = A(t ) x (t ) + B(t )u (t )
                    x
   And suppose that we only have position measurements available:

                                           ξ 
                        ξ   meas
                                   = [1 0]  
                                           ξ 
                                           
                            z (t ) = C (t ) x (t )
State-Space Representation (Discrete)
   A continuous state-space system can also be written in discrete form (p. 29):
$$\vec{x}_k = F_{k-1}\,\vec{x}_{k-1} + G_{k-1}\,\vec{u}_{k-1}$$

           x_k  is the state vector of dimension "nx" at time "k"
           u_k  is the control input vector of dimension "ny" at time "k"
           F_k  is the transition matrix of dimension "nx x nx" at time "k"
           G_k  is the input gain matrix of dimension "nx x ny" at time "k"

   While the Measurement Equation is expressed in the discrete form:

$$\vec{z}_k = H_k\,\vec{x}_k$$

           z_k  is the measurement vector of dimension "nz" at time "k"
           H_k  is the observation matrix of dimension "nz x nx" at time "k"
Example Revisited in Discrete Time
   The constant velocity discrete time model is given by:

$$\begin{bmatrix} \xi_k \\ \dot{\xi}_k \end{bmatrix} = \begin{bmatrix} 1 & t_k - t_{k-1} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \xi_{k-1} \\ \dot{\xi}_{k-1} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u_{1,k-1} \\ u_{2,k-1} \end{bmatrix} \qquad\Longleftrightarrow\qquad \vec{x}_k = F_{k-1}\,\vec{x}_{k-1} + G_{k-1}\,\vec{u}_{k-1}$$

   Since there is no time-dependence in the measurement equation, it is
    a trivial extension of the continuous example:

$$\xi_k^{\,meas} = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \xi_k \\ \dot{\xi}_k \end{bmatrix} \qquad\Longleftrightarrow\qquad \vec{z}_k = H_k\,\vec{x}_k$$
State Transition Matrix
      We wish to be able to convert a continuous linear system to a discrete
       time linear system. Most physical problems are easily expressible in the
       continuous form while most measurements are discrete. Consider the
       following time-invariant homogeneous linear system (pp. 180-182):
$$\dot{\vec{x}}(t) = A(t)\,\vec{x}(t) \qquad \text{where} \quad A(t) = A \quad \text{for } t \in [t_{k-1}, t_k]$$

       We have the solution:

$$\vec{x}(t) = F_{k-1}(t, t_{k-1})\, \vec{x}(t_{k-1}) = \mathcal{L}^{-1}\!\left\{ (sI - A)^{-1} \right\} \vec{x}(t_{k-1}) = e^{A(t - t_{k-1})}\, \vec{x}(t_{k-1}) \qquad \text{for } t \in [t_{k-1}, t_k]$$

       If we add a term, making an inhomogeneous linear system, we obtain:

$$\dot{\vec{x}}(t) = A(t)\,\vec{x}(t) + B(t)\,\vec{u}(t) \qquad \text{where} \quad B(t) = B \quad \text{for } t \in [t_{k-1}, t_k]$$
Matrix Superposition Integral
   Then, the state transition matrix is applied to the additive term and
    integration is performed to obtain the generalized solution:

$$\vec{x}(t) = F_{k-1}(t, t_{k-1})\, \vec{x}(t_{k-1}) + \int_{t_{k-1}}^{t} F_{k-1}(t, \tau)\, B(\tau)\, \vec{u}(\tau)\, d\tau \qquad \text{for } t \in [t_{k-1}, t_k]$$

   Consider the following example (block diagram: white noise u(t) drives a
    first-order lag with transfer function sqrt(2*sigma^2*beta)/(s + beta) whose
    output is x2, which is integrated by 1/s to give x1):

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & -\beta \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ \sqrt{2\sigma^2\beta} \end{bmatrix} u(t)$$
Observability Criteria
   A system is categorized as observable if the state can be determined
    from a finite number of observations, assuming that the state-space
    model is correct.

   For a time-invariant linear system, the observability matrix is given by:


$$\Omega = \begin{bmatrix} H \\ HF \\ \vdots \\ H F^{\,n_x - 1} \end{bmatrix}$$
   Thus, the system is observable if this matrix has a rank equal to “nx” (pp.
    25,28,30).
Observability Criteria: An Example
   For the nearly constant velocity model described above, we have:


                       [1 0]  1 0
                                          
                  Ω=      1 ∆t   = 
                    [1 0]                
                                   1 ∆t 
                    
                          0 1  

   The rank of this matrix is “2” only if the delta time interval is non-zero.
    Thus, we can only estimate position and velocity both (using only
    position measurements) if these position measurements are separated
    in time.

   The actual calculation of rank is a subject for a linear algebra course and
    leads to ideas such as linear independence and singularity (p. 25)
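As an added illustration, the rank test can be carried out numerically; the sketch below builds the observability matrix for the constant-velocity model and shows the rank dropping from 2 to 1 when the time interval is zero.

```python
# Observability check for the constant-velocity model (illustrative values).
import numpy as np

def observability_matrix(F, H):
    n = F.shape[0]
    return np.vstack([H @ np.linalg.matrix_power(F, i) for i in range(n)])

H = np.array([[1.0, 0.0]])                          # position-only measurement

F = np.array([[1.0, 1.0], [0.0, 1.0]])              # dt = 1
print(np.linalg.matrix_rank(observability_matrix(F, H)))    # 2 -> observable

F0 = np.array([[1.0, 0.0], [0.0, 1.0]])             # dt = 0
print(np.linalg.matrix_rank(observability_matrix(F0, H)))   # 1 -> unobservable
```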
Probability Basics
Axioms of Probability
   Suppose that “A” and “B” denote random events, then the
    following axioms hold true for probabilities:

      –    Probabilities are non-negative:

$$P\{A\} \geq 0 \quad \forall A$$

      –    The probability of a certain event is unity:

$$P\{S\} = 1$$

      –    Additive for mutually exclusive events:

$$\text{If } P\{A \cap B\} = 0 \text{ (mutually exclusive), then } P\{A \cup B\} = P\{A\} + P\{B\}$$
Conditional Probability
   The conditional probability of an event “A” given the event “B” is:

$$P\{A \mid B\} = \frac{P\{A \cap B\}}{P\{B\}}$$

   For example, we might ask the following tracking related questions:

     – Probability of observing the current measurement given the
       previous estimate of the track state

     – Probability of observing a target detection within a certain
       surveillance region given that a true target is present


   Formulating these conditional probabilities is the foundation of
    track initiation, deletion, data association, SNR detection
    schemes…
Total Probability Theorem
   Assume that we have a set of events “Bi” which are mutually
    exclusive:


$$P\{B_i \cap B_j\} = 0 \quad \forall\, i \neq j$$

   And exhaustive:

$$\sum_{i=1}^{n} P\{B_i\} = 1$$

   Then the Total Probability Theorem states:

$$P\{A\} = \sum_{i=1}^{n} P\{A \cap B_i\} = \sum_{i=1}^{n} P\{A \mid B_i\}\, P\{B_i\}$$
Bayes' Theorem
           We can rework the conditional probability definition in order to obtain the
            reverse conditional probability:

$$P\{B_i \mid A\} = \frac{P\{B_i \cap A\}}{P\{A\}} = \frac{P\{A \mid B_i\}\, P\{B_i\}}{P\{A\}}$$

           This conditional probability of "Bi" is called the Posterior Probability, while the
            unconditional probability of "Bi" is called the Prior Probability.

           In the case of "Bi" being mutually exclusive and exhaustive, we have (p. 47):

$$\underbrace{P\{B_i \mid A\}}_{\text{Posterior Probability}} = \frac{\overbrace{P\{A \mid B_i\}}^{\text{Likelihood Function}}\;\; \overbrace{P\{B_i\}}^{\text{Prior Probability}}}{\sum_{j=1}^{n} P\{A \mid B_j\}\, P\{B_j\}}$$
Gaussian (Normal) Random Variables
   The Gaussian Random Variable is the most well-known, well-
    investigated type because of its wide application in the real world
    and its tractable mathematics.

   A Gaussian Random Variable is one which has the following
    probability density function (PDF) :
$$p(x) = N(x;\, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

   and is denoted:

$$x \sim N(\mu, \sigma^2)$$
Gaussian (Normal) Random Variables
     The Expectation and Second Central Moment of this distribution
      are:

$$E[x] = \int_{-\infty}^{\infty} \frac{x}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = \mu \qquad \text{(Mean)}$$

$$E[(x - E[x])^2] = E[x^2] - E[x]^2 = \underbrace{\left( \int_{-\infty}^{\infty} \frac{x^2}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx \right)}_{\text{Mean Square}} - \;\mu^2 = \underbrace{\sigma^2}_{\text{Variance}}$$



     These are only with respect to scalar random variables…what about
      vector random variables?
Vector Gaussian Random Variables
   The vector generalization is straightforward:

$$p(\vec{x}) = N(\vec{x};\, \vec{\mu}, P) = \frac{1}{\sqrt{|2\pi P|}}\, e^{-\frac{1}{2} (\vec{x} - \vec{\mu})^T P^{-1} (\vec{x} - \vec{\mu})}$$

   The Expectation and Second Central Moment of this distribution are:

$$E[\vec{x}] = \vec{\mu}, \qquad E\!\left[ (\vec{x} - E[\vec{x}])(\vec{x} - E[\vec{x}])^T \right] = P$$

   Notice that the Variance is now replaced with a matrix called a
    Covariance Matrix.

   If the vector "x" is a zero-mean error vector, then the covariance
    matrix is called the Mean Square Error.
Bayes' Theorem: Gaussian Case
   The "noise" of a device, denoted "x", is observed. Normal
    functionality is denoted by event "B1" while a defective device is
    denoted by event "B2":

$$B_1 :\; x \sim N(x;\, 0, \sigma_1^2) \qquad\qquad B_2 :\; x \sim N(x;\, 0, \sigma_2^2)$$

   The conditional probability of defect is (using Bayes' Theorem):

$$P\{B_2 \mid x\} = \frac{P\{x \mid B_2\}\, P\{B_2\}}{P\{x \mid B_1\}\, P\{B_1\} + P\{x \mid B_2\}\, P\{B_2\}} = \frac{1}{1 + \dfrac{P\{x \mid B_1\}\, P\{B_1\}}{P\{x \mid B_2\}\, P\{B_2\}}}$$

   Using the two distributions, we have:

$$P\{B_2 \mid x\} = \frac{1}{1 + \dfrac{\sigma_2\, P\{B_1\}}{\sigma_1\, P\{B_2\}}\, e^{-\frac{x^2}{2\sigma_1^2} + \frac{x^2}{2\sigma_2^2}}}$$
Bayes' Theorem: Gaussian Case
   If we assume the diffuse prior, that the probability of each event is
    equal, then we have a simplified formula:

$$P\{B_2 \mid x\} = \frac{1}{1 + \dfrac{\sigma_2}{\sigma_1}\, e^{-\frac{x^2}{2\sigma_1^2} + \frac{x^2}{2\sigma_2^2}}}$$

   If we further assume that $\sigma_2 = 4\sigma_1$ and that $x = \sigma_2$, then we have:

$$P\{B_2 \mid x\} \approx 0.998$$

   Note that the likelihood ratio largely dominates the result of this
    calculation. This quantity is crucial in inference and statistical
    decision theory and is often called "evidence from the data".

$$\Lambda(B_1, B_2) = \frac{P\{x \mid B_1\}}{P\{x \mid B_2\}}$$
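The 0.998 figure can be reproduced numerically; the short sketch below (an added illustration, not course material) evaluates the two Gaussian likelihoods with SciPy under the assumptions stated above (equal priors, sigma2 = 4*sigma1, x = sigma2).

```python
# Reproduce the defect-probability example under the slide's assumptions.
import numpy as np
from scipy.stats import norm

sigma1, P1 = 1.0, 0.5        # normal device: std dev and prior
sigma2, P2 = 4.0, 0.5        # defective device: std dev and prior (diffuse prior)
x = sigma2                   # observed noise value

num = norm.pdf(x, 0.0, sigma2) * P2
den = norm.pdf(x, 0.0, sigma1) * P1 + num
print(num / den)             # approximately 0.998
```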
Gaussian Mixture
   Suppose we have “n” possible events “Aj” which are mutually exclusive
    and exhaustive. And further suppose that each event has a Gaussian
    PDF as follows (pp. 55-56):


$$A_j \overset{\Delta}{=} \left\{ x \sim N(\bar{x}_j, P_j) \right\} \qquad \text{and} \qquad P\{A_j\} = p_j$$

   Then, the total PDF is given by the Total Probability Theorem:

$$p(x) = \sum_{j=1}^{n} p(x \mid A_j)\, P\{A_j\}$$

   This mixture can be approximated as another Gaussian once the mixed
    moments are computed.
Gaussian Mixture
   The first moment (mean) is easily derived using the Total Probability Theorem:

$$\bar{x} = E[x] = \sum_{j=1}^{n} E[x \mid A_j]\, P\{A_j\} = \sum_{j=1}^{n} p_j\, \bar{x}_j$$

   The covariance matrix is more complicated, but we simply apply the
    definition:

$$P = E\!\left[ (x - \bar{x})(x - \bar{x})^T \right] = \sum_{j=1}^{n} E\!\left[ (x - \bar{x})(x - \bar{x})^T \mid A_j \right] p_j$$

$$\quad = \sum_{j=1}^{n} E\!\left[ (x - \bar{x}_j + \bar{x}_j - \bar{x})(x - \bar{x}_j + \bar{x}_j - \bar{x})^T \mid A_j \right] p_j$$
Gaussian Mixture
   Continuing the insanity:


$$P = \sum_{j=1}^{n} E\!\left[ (x - \bar{x}_j)(x - \bar{x}_j)^T \mid A_j \right] p_j + \sum_{j=1}^{n} (\bar{x}_j - \bar{x})(\bar{x}_j - \bar{x})^T p_j$$

$$\quad = \sum_{j=1}^{n} P_j\, p_j + \underbrace{\sum_{j=1}^{n} (\bar{x}_j - \bar{x})(\bar{x}_j - \bar{x})^T p_j}_{\text{Spread of the Means}}$$

   The spread of the means term inflates the covariance of the final mixed
    random variable to account for the differences between each individual
    mean and the mixed mean.
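As an added illustration, the mixture mean and covariance formulas above are straightforward to compute; the component probabilities, means, and covariances in the sketch below are assumed example values.

```python
# Mix Gaussian components into a single mean/covariance, including the
# spread-of-the-means term (illustrative component values).
import numpy as np

p  = np.array([0.5, 0.3, 0.2])                                           # p_j
xs = [np.array([0.0, 1.0]), np.array([2.0, 0.0]), np.array([1.0, 3.0])]  # means x_j
Ps = [np.eye(2), 2.0*np.eye(2), np.diag([0.5, 1.5])]                     # covariances P_j

x_mix = sum(pj * xj for pj, xj in zip(p, xs))
P_mix = sum(pj * (Pj + np.outer(xj - x_mix, xj - x_mix))
            for pj, xj, Pj in zip(p, xs, Ps))
print(x_mix)
print(P_mix)
```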
Bayesian Hypothesis Testing
Bayesian Hypothesis Testing
   We consider two competing hypotheses about a parameter "θ", defined as:

$$\text{Null Hypothesis:} \quad H_0 : \theta = \theta_0 \qquad\qquad \text{Alternate Hypothesis:} \quad H_1 : \theta = \theta_1$$

   We also adopt the standard definitions concerning the decision errors:

$$\text{Type I Error (False Alarm):} \quad P_e^{I} \overset{\Delta}{=} P\{\text{accept } H_1 \mid H_0 \text{ true}\} = \alpha$$

$$\text{Type II Error (Miss):} \quad P_e^{II} \overset{\Delta}{=} P\{\text{accept } H_0 \mid H_1 \text{ true}\} = \beta$$
Neyman-Pearson Lemma
   The power of the hypothesis test is defined as:

$$\text{Test Power (Detection):} \quad \pi \overset{\Delta}{=} P\{\text{accept } H_1 \mid H_1 \text{ true}\} = 1 - \beta$$

   The Neyman-Pearson Lemma states that the optimal decision rule (most
    powerful test) subject to a fixed Type I Error (α) is the Likelihood
    Ratio Test, the ratio of the likelihood functions (pp. 72-73):

$$\Lambda(H_1, H_0) = \frac{P\{z \mid H_1\}}{P\{z \mid H_0\}} \;\gtrless\; \Lambda_0 \qquad \text{(choose } H_1 \text{ if } > \Lambda_0,\; H_0 \text{ if } < \Lambda_0\text{)}$$

   The threshold is set so that:

$$P\{\Lambda(H_1, H_0) > \Lambda_0 \mid H_0\} = P_e^{I} = \alpha$$
Sequential Probability Ratio Test
       Suppose we have a sequence of independent identically distributed (i.i.d.)
        measurements "Z = {z_i}" and we wish to perform a hypothesis test. We can
        formulate this in a recursive form as follows:

$$PR(H_1, H_0) = \frac{P\{H_1 \cap Z\}}{P\{H_0 \cap Z\}} = \frac{\overbrace{P\{Z \mid H_1\}}^{\text{Likelihood Function}}\;\; \overbrace{P_0\{H_1\}}^{\text{a priori Probability}}}{P\{Z \mid H_0\}\;\; P_0\{H_0\}}$$

$$PR_n(H_1, H_0) = \frac{P_0\{H_1\}}{P_0\{H_0\}} \prod_{i=1}^{n} \frac{P\{z_i \mid H_1\}}{P\{z_i \mid H_0\}} = PR_0(H_1, H_0) \prod_{i=1}^{n} \Lambda_i(H_1, H_0)$$

$$\ln\!\left( PR_n(H_1, H_0) \right) = \ln\!\left( PR_0(H_1, H_0) \right) + \sum_{i=1}^{n} \ln\!\left( \Lambda_i(H_1, H_0) \right)$$
Sequential Probability Ratio Test
   So, the recursive form of the SPRT is:

$$\ln\!\left( PR_k(H_1, H_0) \right) = \ln\!\left( PR_{k-1}(H_1, H_0) \right) + \ln\!\left( \Lambda_k(H_1, H_0) \right)$$

   Using Wald's Theorem, we continue to test this quantity against two
    thresholds until a decision is made:

$$\ln\!\left( PR_k(H_1, H_0) \right) \;\Rightarrow\; \begin{cases} \text{accept } H_1 & \text{if } > T_2 \\ \text{continue} & \text{if } > T_1 \text{ and } < T_2 \\ \text{accept } H_0 & \text{if } < T_1 \end{cases}$$

$$T_2 = \ln\!\left( \frac{1 - \beta}{\alpha} \right) \qquad \text{and} \qquad T_1 = \ln\!\left( \frac{\beta}{1 - \alpha} \right)$$
   Wald’s Theorem applies when the observations are an i.i.d. sequence.
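A minimal SPRT sketch is shown below as an added illustration: it tests two assumed Gaussian hypotheses (H0: zero mean, H1: unit mean, both unit variance) on i.i.d. samples, using Wald's thresholds and equal priors (so ln PR_0 = 0).

```python
# Sequential Probability Ratio Test sketch for two Gaussian hypotheses.
import numpy as np
from scipy.stats import norm

alpha, beta = 0.01, 0.01
T2 = np.log((1 - beta) / alpha)          # upper threshold
T1 = np.log(beta / (1 - alpha))          # lower threshold

def sprt(samples, mu0=0.0, mu1=1.0, sigma=1.0):
    log_pr = 0.0                         # ln PR_0 = 0 for equal priors
    for k, z in enumerate(samples, start=1):
        log_pr += norm.logpdf(z, mu1, sigma) - norm.logpdf(z, mu0, sigma)
        if log_pr > T2:
            return "accept H1", k
        if log_pr < T1:
            return "accept H0", k
    return "continue", len(samples)

rng = np.random.default_rng(1)
print(sprt(rng.normal(1.0, 1.0, size=100)))    # data generated under H1
```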
Chi-Square Distribution
   The chi-square distribution with “n” degrees of freedom has the following
    functional form:
$$\chi_n^2(x) = \frac{1}{2^{n/2}\, \Gamma\!\left(\frac{n}{2}\right)}\, x^{\frac{n-2}{2}}\, e^{-\frac{x}{2}}$$

   It is related to an "n" dimensional vector Gaussian distribution as follows:

$$(\vec{x} - \bar{x})^T P^{-1} (\vec{x} - \bar{x}) \sim \chi_n^2$$
   More generally, the sum of squares of “n” independent zero-mean, unity
    variance random variables is distributed as a chi-square with “n” degrees
    of freedom (pp.58-60).
Chi-Square Distribution
   The chi-square distribution with “n” degrees of freedom has the following
    statistical moments:

$$E[x] = n \qquad\qquad E[(x - E[x])^2] = 2n$$

   The sum of two independent random variables which are chi-square distributed is
    also chi-square distributed:

$$q_1 \sim \chi_{n_1}^2, \quad q_2 \sim \chi_{n_2}^2 \qquad\Longrightarrow\qquad q_1 + q_2 \sim \chi_{n_1 + n_2}^2$$
Estimation Basics
Parameter Estimator
   A parameter estimator is a function of the observations (measurements)
    that yields an estimate of a time-invariant quantity (parameter). This
    estimator is typically denoted as:


$$\hat{x}_k \overset{\Delta}{=} \hat{x}\!\left[ k, Z^k \right] \qquad \text{where} \quad Z^k \overset{\Delta}{=} \{ z_j \}_{j=1}^{k}$$

                     (estimate = estimator applied to the observations)

   We also denote the error in the estimate as:

$$\tilde{x}_k \overset{\Delta}{=} x - \hat{x}_k \qquad \text{(true value minus estimate)}$$
Estimation Paradigms
   Non-Bayesian (Non-Random):

     – There is no prior PDF incorporated
     – The Likelihood Function PDF is formed
     – This Likelihood Function PDF is used to estimate the parameter
$$\Lambda_Z(x) \overset{\Delta}{=} p(Z \mid x)$$

   Bayesian (Random):

      – Start with a prior PDF of the parameter
      – Use Bayes' Theorem to find the posterior PDF
      – This posterior PDF is used to estimate the parameter

$$\underbrace{p(x \mid Z)}_{\text{Posterior}} = \frac{\overbrace{p(Z \mid x)}^{\text{Likelihood}}\;\; \overbrace{p(x)}^{\text{Prior}}}{p(Z)} = \frac{1}{c}\, p(Z \mid x)\, p(x)$$
Estimation Methods
   Maximum Likelihood Estimator (Non-Random):


$$\hat{x}^{ML}(Z) = \arg\max_{x}\, \left[ p(Z \mid x) \right], \qquad \left. \frac{d\, p(Z \mid x)}{dx} \right|_{\hat{x}^{ML}} = 0$$

   Maximum A Posteriori Estimator (Random):

$$\hat{x}^{MAP}(Z) = \arg\max_{x}\, \left[ p(Z \mid x)\, p(x) \right]$$
Unbiased Estimators
   Non-Bayesian (Non-Random):


$$E\!\left[ \hat{x}_k(Z^k) \right]_{p(Z^k \mid x = x_0)} = x_0$$

   Bayesian (Random):

$$E\!\left[ \hat{x}_k(Z^k) \right]_{p(x \,\cap\, Z^k)} = E[x]_{p(x)}$$

   General Case:

$$E\!\left[ \tilde{x}_k(Z^k) \right] = 0$$
Estimation Comparison Example
   Consider a single measurement of an unknown parameter “x” which is
    susceptible to additive noise “w” that is zero-mean Gaussian:

$$z = x + w, \qquad w \sim N(0, \sigma^2)$$

   The ML approach yields:

$$\Lambda(x) = p(z \mid x) = N(z;\, x, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(z-x)^2}{2\sigma^2}}$$

$$\hat{x}^{ML} = \arg\max_{x}\, [\Lambda(x)] = z$$
   Thus, the MLE is the measurement itself because there is no prior
    knowledge.
Estimation Comparison Example
   The MAP, with a Gaussian prior, approach yields:

$$p(x) = N(x;\, \bar{x}, \sigma_0^2)$$

$$p(x \mid z) = \frac{p(z \mid x)\, p(x)}{p(z)} = \frac{e^{-\frac{(z-x)^2}{2\sigma^2} - \frac{(x-\bar{x})^2}{2\sigma_0^2}}}{2\pi\sigma\sigma_0\, p(z)} = \frac{e^{-\frac{(x - \xi(z))^2}{2\sigma_1^2}}}{\sqrt{2\pi\sigma_1^2}}$$

$$\xi(z) = \sigma_1^2 \left( \frac{\bar{x}}{\sigma_0^2} + \frac{z}{\sigma^2} \right) \qquad \text{and} \qquad \frac{1}{\sigma_1^2} = \frac{1}{\sigma^2} + \frac{1}{\sigma_0^2}$$

$$\hat{x}^{MAP} = \arg\max_{x}\, [\, p(x \mid z)\,] = \xi(z)$$

   Here the first term in ξ(z) carries the prior information and the second term
    carries the measurement information.

   Thus, the MAPE is a linear combination of the prior information and the
    observation and it is weighted based upon the variance of each.
    NOTE: The MLE and MAPE are equivalent for a diffuse prior !
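As an added illustration, the sketch below evaluates the ML and MAP estimates for one noisy measurement using the formulas above; the prior mean, variances, and measurement value are assumptions.

```python
# Compare ML and MAP estimates for a single measurement (illustrative values).
x_bar, sigma0 = 5.0, 2.0        # prior mean and prior std deviation
sigma = 1.0                     # measurement noise std deviation
z = 7.0                         # observed measurement

x_ml = z                                                  # MLE: the measurement itself
sigma1_sq = 1.0 / (1.0/sigma**2 + 1.0/sigma0**2)
x_map = sigma1_sq * (x_bar/sigma0**2 + z/sigma**2)        # MAP: variance-weighted blend

print(x_ml, x_map)              # 7.0 and 6.6 (pulled toward the prior mean 5.0)
# As sigma0 grows (diffuse prior), x_map approaches x_ml.
```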
Batch Estimation Paradigms
   Consider that we now have a set of observations available for
    estimating a parameter and that in general these observations are
    corrupted by measurement noise:

$$Z^k = \left\{ z_j = h_j(x) + w_j \right\}_{j=1,\ldots,k}$$

   Least Squares (Non-Random):

$$\hat{x}_k^{LS} = \arg\min_{x} \left\{ \sum_{j=1}^{k} \left[ z_j - h_j(x) \right]^2 \right\}$$

   Minimum Mean Square Error (Random):

$$\hat{x}_k^{MMSE} = \arg\min_{\hat{x}}\, E\!\left[ (x - \hat{x})^2 \mid Z^k \right]$$

$$\hat{x}_k^{MMSE} = E\!\left[ x \mid Z^k \right] \overset{\Delta}{=} \int_{-\infty}^{\infty} x\, p\!\left( x \mid Z^k \right) dx$$
Unbiasedness of ML and MAP Estimators

   Maximum Likelihood Estimate:


$$E[\hat{x}_k^{ML}] = E[z] = E[x_0 + w] = x_0 + E[w] = x_0$$

   Maximum A Posteriori Estimate:

$$E[\hat{x}_k^{MAP}] = E\!\left[ \frac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar{x} + \frac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, z \right] = \frac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar{x} + \frac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, E[z]$$

$$\quad = \frac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar{x} + \frac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, E[x + w] = \frac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar{x} + \frac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, (\bar{x} + E[w])$$

$$\quad = \frac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar{x} + \frac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, \bar{x} = \bar{x} = E[x]$$
Estimation Errors
   Non-Bayesian (Non-Random):


$$\mathrm{Var}[\hat{x}_k(Z^k)] = E\!\left[ \left\{ \hat{x}_k(Z^k) - E[\hat{x}_k(Z^k)] \right\}^2 \right] = E\!\left[ \left\{ \hat{x}_k(Z^k) - x_0 \right\}^2 \right]$$

   Bayesian (Random):

$$\mathrm{MSE}[\hat{x}_k(Z^k)] = E\!\left[ \left\{ \hat{x}_k(Z^k) - x \right\}^2 \right] = E\!\left[ E\!\left[ \left\{ \hat{x}_k(Z^k) - x \right\}^2 \mid Z^k \right] \right]$$

   General Case:

$$E\!\left[ \tilde{x}_k(Z^k)^2 \right] = \begin{cases} \mathrm{var}\!\left( \hat{x}_k(Z^k) \right) & \hat{x} \text{ unbiased and } x \text{ non-random} \\ \mathrm{MSE}\!\left( \hat{x}_k(Z^k) \right) & \text{all cases} \end{cases}$$
Variances of ML and MAP Estimators
   Maximum Likelihood Estimate:


$$\mathrm{var}[\hat{x}_k^{ML}] = E\!\left[ \left( \hat{x}_k^{ML} - x_0 \right)^2 \right] = E\!\left[ (z - x_0)^2 \right] = \sigma^2$$

   Maximum A Posteriori Estimate:

$$\mathrm{var}[\hat{x}_k^{MAP}] = E\!\left[ \left( \hat{x}_k^{MAP} - x \right)^2 \right] = \frac{\sigma^2 \sigma_0^2}{\sigma^2 + \sigma_0^2} \;<\; \sigma^2 = \mathrm{var}[\hat{x}_k^{ML}]$$

   The MAPE error is less than the MLE error since the MAPE incorporates
    prior information.
Cramer-Rao Lower Bound
   The Cramer-Rao Lower Bound places a limit on the ability to
    estimate a parameter:

$$\mathrm{MSE}\!\left[ \hat{x}_k(Z^k) \right] = E\!\left[ \left( \hat{x}_k(Z^k) - x \right)\left( \hat{x}_k(Z^k) - x \right)^T \right] \geq J_k^{-1}$$

   Not surprisingly, this lower limit is related to the likelihood function,
    which we recall as the "evidence from the data". The bound is the inverse of
    the Fisher Information Matrix:

$$J_k \overset{\Delta}{=} -E\!\left[ \nabla_x \nabla_x^T \ln p\!\left( Z^k \mid x \right) \right]_{x}$$

   When equality holds, the estimator is called efficient. An example of
    this is the MLE estimate we have been working with.
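As an added illustration of the bound, consider estimating the mean of a Gaussian from k i.i.d. measurements: the Fisher information is k/sigma^2, so the CRLB is sigma^2/k, and the sample mean (the ML estimate) attains it. The Monte Carlo settings below are assumptions.

```python
# Monte Carlo check that the sample mean attains the CRLB sigma^2/k.
import numpy as np

rng = np.random.default_rng(2)
x_true, sigma, k, trials = 3.0, 1.5, 10, 20000
z = rng.normal(x_true, sigma, size=(trials, k))
x_hat = z.mean(axis=1)                    # ML estimate (sample mean) per trial

crlb = sigma**2 / k
print(x_hat.var(), crlb)                  # both close to 0.225
```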
Filtering Fundamentals
Filtering Fundamentals

   Linear Filtering

     – Linear Gaussian Assumptions, Kalman Filter, Kalman Properties
     – Direct Discrete-Time, Discretized Continuous-Time, Steady State Gains

   Non-Linear Filtering

     – Non-Linear Dynamics & Measurements, Extended Kalman Filter
     – Iterated Extended Kalman Filter

   Multiple-Model Filtering

     – Need for Multiple Models, Adaptive Filtering
     – Switching Multiple Model & Interacting Multiple Model Filter
     – Variable Structure IMM
Linear Filtering
Kalman-Bucy Problem
   A stochastic discrete-time linear dynamic system:
$$\vec{x}_k = F_{k-1}\,\vec{x}_{k-1} + G_{k-1}\,\vec{u}_{k-1} + \Gamma_{k-1}\,\vec{\nu}_{k-1}$$

           x_k          is the state vector of dimension "nx" at time "k"
           G_k u_k      is the control input of dimension "nx" at time "k"
           F_k          is the transition matrix of dimension "nx x nx" at time "k"
           Gamma_k nu_k is the plant noise of dimension "nx" at time "k"

   The measurement equation is expressed in the discrete form:

$$\vec{z}_k = H_k\,\vec{x}_k + \vec{w}_k$$

           z_k    is the measurement vector of dimension "nz" at time "k"
           H_k    is the observation matrix of dimension "nz x nx" at time "k"
           w_k    is the measurement noise of dimension "nz" at time "k"
Kalman-Bucy Problem
   The Linear Gaussian Assumptions are:
$$E[\vec{\nu}_k] = 0, \qquad E\!\left[ \vec{\nu}_k \vec{\nu}_j^{\,T} \right] = Q_k\, \delta_{jk}$$

$$E[\vec{w}_k] = 0, \qquad E\!\left[ \vec{w}_k \vec{w}_j^{\,T} \right] = R_k\, \delta_{jk}$$

   The measurement and plant noises are uncorrelated:

$$E\!\left[ \vec{w}_k \vec{\nu}_k^{\,T} \right] = 0$$

   The conditional mean is:

$$\hat{x}_{j|k} \overset{\Delta}{=} E\!\left[ \vec{x}_j \mid Z^k \right], \qquad Z^k = \{ \vec{z}_i,\; i \leq k \}$$

      x̂_{k|k}    Filtered State Estimate          x̂_{k|k-1}    Extrapolated State Estimate

   The estimation error is denoted by:

$$\tilde{x}_{j|k} \overset{\Delta}{=} \vec{x}_j - \hat{x}_{j|k}$$
Kalman-Bucy Problem
     The estimate covariance is defined as:

$$P_{j|k} \overset{\Delta}{=} E\!\left[ \tilde{x}_{j|k}\, \tilde{x}_{j|k}^{\,T} \mid Z^k \right]$$

      P_{k|k}    Filtered Error Covariance          P_{k|k-1}    Extrapolated Error Covariance

      The predicted measurement is given by:

$$\hat{z}_{k|k-1} \overset{\Delta}{=} E\!\left[ \vec{z}_k \mid Z^{k-1} \right] = E\!\left[ H_k \vec{x}_k + \vec{w}_k \mid Z^{k-1} \right] = H_k\, E\!\left[ \vec{x}_k \mid Z^{k-1} \right] + E\!\left[ \vec{w}_k \mid Z^{k-1} \right] = H_k\, \hat{x}_{k|k-1}$$

      The measurement residual or innovation is denoted by:

$$\eta_k \overset{\Delta}{=} \vec{z}_k - \hat{z}_{k|k-1} = \vec{z}_k - H_k\, \hat{x}_{k|k-1}$$
Kalman-Bucy Approach
   Recall that the MMSE is equivalent to the MAPE in the Gaussian case.

   Recall that the MAPE, with a Gaussian prior, is a linear combination of
    the measurement and the prior information.

   Recall that the prior information was, more specifically, the expectation
    of the random variable prior to receiving the measurement.

   If we consider the Kalman Filter to be a recursive process which applies
    a static Bayesian estimation (MMSE) algorithm at each step, we are
    compelled to consider the following linear combination.
$$\hat{x}_{k|k} = K_k'\, \hat{x}_{k|k-1} + K_k\, \vec{z}_k$$

            (the first term carries the prior state information, the second the observation information)
Kalman Filter - Unbiasedness
      We start with the proposed linear combination:

$$\hat{x}_{k|k} = K_k'\, \hat{x}_{k|k-1} + K_k\, \vec{z}_k$$

      We wish to ensure that the estimate is unbiased, that is:

$$E[\tilde{x}_{k|k}] = 0$$

      Given the proposed linear combination, we determine the error to be:

$$\tilde{x}_{k|k} = \left[ K_k' + K_k H_k - I \right] \vec{x}_k + K_k'\, \tilde{x}_{k|k-1} + K_k\, \vec{w}_k$$

      Applying the unbiasedness constraint, we have:

$$E[\tilde{x}_{k|k}] = 0 = \left[ K_k' + K_k H_k - I \right] E[\vec{x}_k] + K_k'\, E[\tilde{x}_{k|k-1}] + K_k\, E[\vec{w}_k]$$

$$\Longrightarrow\quad K_k' = I - K_k H_k$$
Kalman Filter – Kalman Gain
   So, we have the following simplified linear combination:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\, \eta_k$$

   We also desire the filtered error covariance, so that it can be minimized:

$$P_{k|k} = E\!\left[ \tilde{x}_{k|k}\, \tilde{x}_{k|k}^{\,T} \right]$$

$$P_{k|k} = (I - K_k H_k)\, P_{k|k-1}\, (I - K_k H_k)^T + K_k R_k K_k^T$$

   If we minimize the trace of this expression with respect to the gain:

$$K_k = P_{k|k-1} H_k^T \left[ H_k P_{k|k-1} H_k^T + R_k \right]^{-1}$$
Kalman Filter - Recipe
   Extrapolation:

$$\hat{x}_{k|k-1} = F_{k-1}\, \hat{x}_{k-1|k-1} + G_{k-1}\, \vec{u}_{k-1}$$

$$P_{k|k-1} = F_{k-1}\, P_{k-1|k-1}\, F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T$$

   Update:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\, \eta_k$$

$$P_{k|k} = (I - K_k H_k)\, P_{k|k-1}\, (I - K_k H_k)^T + K_k R_k K_k^T$$
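To make the recipe concrete, here is a minimal NumPy sketch of the extrapolation and update equations above. This code is an illustration added for this write-up, not part of the original course; the CV model, noise levels, and measurements in the usage example are assumed values.

```python
# Minimal Kalman filter extrapolation/update, following the slide notation.
import numpy as np

def kf_extrapolate(x, P, F, Q, G=None, u=None):
    x_pred = F @ x if G is None else F @ x + G @ u
    P_pred = F @ P @ F.T + Q            # Q here plays the role of Gamma*Q*Gamma^T
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    eta = z - H @ x_pred                # innovation
    S = H @ P_pred @ H.T + R            # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S) # Kalman gain
    x = x_pred + K @ eta
    I_KH = np.eye(len(x_pred)) - K @ H
    P = I_KH @ P_pred @ I_KH.T + K @ R @ K.T    # Joseph form, as on the slide
    return x, P

# Usage with an assumed nearly-constant-velocity model (T = 1 s):
T = 1.0
F = np.array([[1.0, T], [0.0, 1.0]])
Q = 0.1 * np.array([[T**4/4, T**3/2], [T**3/2, T**2]])
H = np.array([[1.0, 0.0]])
R = np.array([[4.0]])
x, P = np.array([0.0, 0.0]), np.diag([100.0, 25.0])
for zk in [1.1, 2.3, 2.9, 4.2]:
    x, P = kf_extrapolate(x, P, F, Q)
    x, P = kf_update(x, P, np.array([zk]), H, R)
print(x)
```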
Kalman Filter – Innovations
   The innovations are zero-mean, uncorrelated (p. 213) and have covariance:


$$S_k = E\!\left[ \eta_k \eta_k^T \right] = H_k P_{k|k-1} H_k^T + R_k$$

   The normalized innovation squared or statistical distance is chi-square distributed:

$$d_k^2 = \eta_k^T S_k^{-1} \eta_k \sim \chi_{n_z}^2$$

   So, we expect that the innovations should have a mean and variance of:

$$E[d_i^2] = n_z, \qquad \mathrm{var}[d_i^2] = 2 n_z$$

   The Kalman Gain can now be written as:

$$K_k = P_{k|k-1} H_k^T \left[ H_k P_{k|k-1} H_k^T + R_k \right]^{-1} = P_{k|k-1} H_k^T S_k^{-1}$$

   The state errors are correlated:

$$E\!\left[ \tilde{x}_{k|k}\, \tilde{x}_{k-1|k-1}^{\,T} \right] = \left[ I - K_k H_k \right] F_{k-1}\, P_{k-1|k-1}$$
Kalman Filter – Likelihood Function
         We wish to compute the likelihood function given the dynamics model used:


$$p\!\left( \vec{z}_k \mid Z^{k-1} \right) = p\!\left( \vec{z}_k \mid \hat{x}_{k|k-1} \right) = N(\vec{z}_k;\, \hat{z}_{k|k-1}, S_k) = N(\vec{z}_k - \hat{z}_{k|k-1};\, 0, S_k) = N(\eta_k;\, 0, S_k)$$

          Which has the explicit form:

$$\Lambda_k = p\!\left( \vec{z}_k \mid Z^{k-1} \right) = \frac{\exp\!\left[ -\frac{1}{2}\, \eta_k^T S_k^{-1} \eta_k \right]}{\sqrt{\det[2\pi S_k]}}$$

          Alternatively, we can write:

$$\Lambda_k = p\!\left( \vec{z}_k \mid Z^{k-1} \right) = \frac{\exp\!\left[ -\frac{1}{2} d_k^2 \right]}{\sqrt{\det[2\pi S_k]}} \quad\Rightarrow\quad \ln \Lambda_k = -\frac{1}{2} d_k^2 - \frac{1}{2} \ln\!\left( \det[2\pi S_k] \right)$$
Kalman Filter – Measurement Validation
   Suppose our Kalman filter has the following output at a given time step:

                          5 0               1 0                 10
               Pk +1|k   =          H k +1 =           ˆ k +1|k =  
                                                         x
                           0 16
                                              0 1
                                                                   15
   Suppose that we now receive 3 measurements of unknown origin:

                      4 0       1        7  2      16  3      19 
           R i
                    =            z k +1 =  , z k +1 =  , z k +1 =  
                       0 9
             k +1
                                         20          5           25
   Evaluate the consistency of these measurements for this Kalman filter model.
    This procedure is called gating and is the basis for data association.


                    ( )
                    1                
                                           ( )          
                                                            ( )
             d k2+1 z k +1 = 2 d k2+1 z k2+1 = 8 d k2+1 zk3+1 = 13
                         χ 2 ( 95% ) = 6
                           2
                                              χ 2 ( 99% ) = 9.2
                                                2
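The gating numbers on this slide can be reproduced directly; the short NumPy/SciPy sketch below (an added illustration) computes the normalized innovation squared for each candidate measurement and the chi-square thresholds.

```python
# Reproduce the gating example: d^2 = eta^T S^-1 eta for each candidate.
import numpy as np
from scipy.stats import chi2

P = np.diag([5.0, 16.0]); H = np.eye(2); x_pred = np.array([10.0, 15.0])
R = np.diag([4.0, 9.0])
S = H @ P @ H.T + R                        # innovation covariance = diag(9, 25)

for z in (np.array([7.0, 20.0]), np.array([16.0, 5.0]), np.array([19.0, 25.0])):
    eta = z - H @ x_pred
    print(eta @ np.linalg.inv(S) @ eta)    # 2, 8, 13

print(chi2.ppf([0.95, 0.99], df=2))        # about [5.99, 9.21]
```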
Kalman Filter – Initialization
   The true initial state is a random variable distributed as:

$$\vec{x}_0 \sim N(\hat{x}_{0|0}, P_{0|0})$$

   It is just as important that the initial covariance and estimate realistically
    reflect the actual accuracy. Thus, the initial estimate should satisfy:

$$\tilde{x}_{0|0}^{\,T}\, P_{0|0}^{-1}\, \tilde{x}_{0|0} \leq \chi_{n_x}^2(95\%)$$


   If the initial covariance is too small, then the Kalman gain will initially be
    small and the filter will take a longer time to converge.

   Ideally, the initial state estimate should be within one standard deviation
    (indicated by the initial covariance) of the true value. This will lead to
    optimal convergence time.
Kalman Filter – Initialization
   In general, a batch weighted least-squares curve fit can be used (Chapter 3):


$$\hat{x}_{0|0} = \left[ H_{init}^T R_{init}^{-1} H_{init} \right]^{-1} H_{init}^T R_{init}^{-1}\, \vec{z}_{init}, \qquad P_{0|0} = \left[ H_{init}^T R_{init}^{-1} H_{init} \right]^{-1}$$

$$\vec{z}_{init} = \left[ \vec{z}_0, \ldots, \vec{z}_{n_x - 1} \right]^T, \qquad H_{init} = \left[ H_0, \ldots, H_{n_x - 1}\!\left( F_{n_x - 2} \right)^{n_x - 1} \right]^T$$

$$R_{init} = \begin{bmatrix} R_0 & & \\ & \ddots & \\ & & R_{n_x - 1} \end{bmatrix}$$

   This initialization will always be statistically consistent so long as the
    measurement errors are properly characterized.
Kalman Filter – Summary
   The Kalman Gain:

     – Proportional to the Predicted Error
     – Inversely Proportional to the Innovation Error

   The Covariance Matrix:

     – Independent of measurements
     – Indicates the error in the state estimate assuming that all of the
       assumptions/models are correct

   The Kalman Estimator:

     – Optimal MMSE state estimator (Gaussian)
     – Best Linear MMSE state estimator (Non-Gaussian)
     – The state and covariance completely summarize the past
Kalman Filter:
             Direct Discrete Time Example
   Consider the simplest example of the nearly constant velocity (CV)
    dynamics model:

$$\vec{x}_k = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} \vec{x}_{k-1} + \begin{bmatrix} T^2/2 \\ T \end{bmatrix} \nu_{k-1}, \qquad \vec{x}_k = \begin{bmatrix} \xi_k \\ \dot{\xi}_k \end{bmatrix}, \qquad E[\nu_k^2] = q$$

   This is the Discrete White Noise Acceleration (DWNA) model.

$$\vec{z}_k = \begin{bmatrix} 1 & 0 \end{bmatrix} \vec{x}_k + w_k, \qquad E[w_k^2] = r$$

   The recursive estimation process is given by the Kalman equations
    derived above.

   How do we select "q"?

$$q \cong a_{max}^2$$
Kalman Filter:
              Other Direct Discrete Time Models
     For nearly constant acceleration (CA) models, the Discrete Wiener
      Process Acceleration (DWPA) model is commonly used:

$$\vec{x}_k = \begin{bmatrix} 1 & T & T^2/2 \\ 0 & 1 & T \\ 0 & 0 & 1 \end{bmatrix} \vec{x}_{k-1} + \begin{bmatrix} T^2/2 \\ T \\ 1 \end{bmatrix} \nu_{k-1}, \qquad \vec{x}_k = \begin{bmatrix} \xi_k \\ \dot{\xi}_k \\ \ddot{\xi}_k \end{bmatrix}, \qquad E[\nu_k^2] = q$$

$$Q_k = q\, \Gamma_k \Gamma_k^T = q \begin{bmatrix} T^4/4 & T^3/2 & T^2/2 \\ T^3/2 & T^2 & T \\ T^2/2 & T & 1 \end{bmatrix}, \qquad q \cong \Delta a_{max}^2$$

     Notice the simple relationship between the "q-value" and the physical
      parameter that is one derivative higher than that which is estimated.
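As an added illustration, the helper below builds the DWNA (CV) and DWPA (CA) matrices for a given sample period T and scalar q, following the formulas above; the numerical values in the usage line are assumptions.

```python
# Build direct discrete-time model matrices F and Q = q * Gamma * Gamma^T.
import numpy as np

def dwna(T, q):
    F = np.array([[1.0, T], [0.0, 1.0]])
    Gamma = np.array([[T**2/2], [T]])
    return F, q * (Gamma @ Gamma.T)

def dwpa(T, q):
    F = np.array([[1.0, T, T**2/2], [0.0, 1.0, T], [0.0, 0.0, 1.0]])
    Gamma = np.array([[T**2/2], [T], [1.0]])
    return F, q * (Gamma @ Gamma.T)

F, Q = dwpa(T=1.0, q=0.5)
print(Q)    # q * [[T^4/4, T^3/2, T^2/2], [T^3/2, T^2, T], [T^2/2, T, 1]]
```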
Kalman Filter:
            Discretized Continuous-Time Models
   These models are derived from continuous time representations using the
    matrix superposition integral. Ignoring the control input:
                
      x ( t ) = Ax (t ) + Dv (t )
                           ~          E [ v (t )v (τ )] = q δ ( t − τ )
                                           ~ ~             ~
                                     
                           xk = Fk −1 xk −1 + vk −1
                                              where
             F =e     AT
                                    and
                                                       T
                                                                 ~
                                              vk = ∫ e A(T −τ ) Dv (τ ) dτ
                                                       0

    Thus, the process noise covariance is found by:

         Q_k = E[v_k v_k^T]
             = \int_0^T \!\!\int_0^T F_k(T-\tau_1)\,D\,E[\tilde{v}(\tau_1)\tilde{v}(\tau_2)]\,D^T F_k^T(T-\tau_2)\,d\tau_1\,d\tau_2

             = \int_0^T F_k(T-\tau_1)\,D\,\tilde{q}\,D^T F_k^T(T-\tau_1)\,d\tau_1
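A sketch (assumed Python/NumPy/SciPy, not from the slides) that evaluates this discretization integral numerically for the constant-velocity model with A = [[0,1],[0,0]] and D = [0,1]^T; the result should approach the closed-form CWNA covariance shown on the next slide.

```python
import numpy as np
from scipy.linalg import expm

def discretized_Q(A, D, q_tilde, T, n_steps=2000):
    """Evaluate Q_k = int_0^T e^{A(T-tau)} D q~ D^T e^{A^T(T-tau)} dtau
    by a midpoint-rule Riemann sum (any quadrature would do)."""
    n = A.shape[0]
    Q = np.zeros((n, n))
    dtau = T / n_steps
    for i in range(n_steps):
        tau = (i + 0.5) * dtau
        Phi = expm(A * (T - tau))                 # transition over (T - tau)
        Q += Phi @ D @ q_tilde @ D.T @ Phi.T * dtau
    return Q

# Constant-velocity model: xdot = [[0,1],[0,0]] x + [0,1]^T v(t)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
D = np.array([[0.0], [1.0]])
q_tilde = np.array([[1.0]])                       # continuous noise intensity
print(discretized_Q(A, D, q_tilde, T=1.0))        # ~ [[T^3/3, T^2/2], [T^2/2, T]]
```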
Kalman Filter:
              Discretized Continuous-Time Models
   Continuous White Noise Acceleration (CWNA) for CV Model:

         \ddot{\xi}(t) = \tilde{v}(t),
         \qquad
         Q_k = \tilde{q} \begin{bmatrix} T^3/3 & T^2/2 \\ T^2/2 & T \end{bmatrix},
         \qquad
         \tilde{q} \approx a_{\max}^2\,T

    Continuous Wiener Process Acceleration (CWPA) for CA Model:

         \dddot{\xi}(t) = \tilde{v}(t),
         \qquad
         Q_k = \tilde{q} \begin{bmatrix} T^5/20 & T^4/8 & T^3/6 \\ T^4/8 & T^3/3 & T^2/2 \\ T^3/6 & T^2/2 & T \end{bmatrix},
         \qquad
         \tilde{q} \approx \Delta a_{\max}^2\,T
    Singer [IEEE-AES, 1970] developed the Exponentially Correlated
     Acceleration (ECA) for the CA model (p. 187 & pp. 321-324):

         \dddot{\xi}(t) = -\alpha\,\ddot{\xi}(t) + \tilde{v}(t)
Kalman Filter:
                    Time Consistent Extrapolation
    So, what is the difference between the Direct Discrete-Time and
     Discretized Continuous-Time models for CV or CA dynamics, and which one
     should be used?


     \left[ F P F^T + Q_{\mathrm{DWNA}} \right]_{T=2}
       \;\ne\;
       \left[ F \left\{ \left[ F P F^T + Q_{\mathrm{DWNA}} \right]_{T=1} \right\} F^T + Q_{\mathrm{DWNA}} \right]_{T=1}

     \left[ F P F^T + Q_{\mathrm{CWNA}} \right]_{T=2}
       \;=\;
       \left[ F \left\{ \left[ F P F^T + Q_{\mathrm{CWNA}} \right]_{T=1} \right\} F^T + Q_{\mathrm{CWNA}} \right]_{T=1}


    Thus, for the Continuous-Time model, two extrapolations of 1 second yield
     the same result as one extrapolation of 2 seconds.

    In general, the Continuous-Time models have this time-consistent
     property.

    This is because the process noise covariance is derived using the
     transition matrix, while the Direct Discrete-Time process noise is
     assigned arbitrarily. (A numerical check of this property follows.)
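A numerical check (assumed Python/NumPy, illustrative names), comparing one 2 s extrapolation against two 1 s extrapolations of the covariance for the DWNA and CWNA constant-velocity models:

```python
import numpy as np

def F_cv(T):
    return np.array([[1.0, T], [0.0, 1.0]])

def Q_dwna(T, q=1.0):          # direct discrete-time (DWNA)
    G = np.array([[T**2 / 2.0], [T]])
    return q * G @ G.T

def Q_cwna(T, q=1.0):          # discretized continuous-time (CWNA)
    return q * np.array([[T**3 / 3.0, T**2 / 2.0],
                         [T**2 / 2.0, T]])

def extrapolate(P, T, Qfun):
    F = F_cv(T)
    return F @ P @ F.T + Qfun(T)

P0 = np.diag([10.0, 1.0])
for name, Qfun in [("DWNA", Q_dwna), ("CWNA", Q_cwna)]:
    P_one_2s  = extrapolate(P0, 2.0, Qfun)                           # one 2 s step
    P_two_1s  = extrapolate(extrapolate(P0, 1.0, Qfun), 1.0, Qfun)   # two 1 s steps
    print(name, "time consistent:", np.allclose(P_one_2s, P_two_1s))
# Expected: DWNA time consistent: False, CWNA time consistent: True
```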
Kalman Filter:
                           Steady State Gains
   If we iterate the Kalman equations for the covariance indefinitely, the
    updated covariance (and thus the Kalman gain) will reach steady state.

   This is only true for Kalman models that have constant coefficients.

   In this case, the steady-state solution is found using the Algebraic Matrix
    Riccati Equation (pp. 211 & 350):


         P_{ss} = F \left[ P_{ss} - P_{ss} H^T \left( H P_{ss} H^T + R \right)^{-1} H P_{ss} \right] F^T + Q
   The steady state Kalman gain becomes:


         K_{ss} = P_{ss} H^T \left[ H P_{ss} H^T + R \right]^{-1}
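A minimal sketch (assumed Python/NumPy, not from the slides) that finds the steady-state gain by simply iterating the covariance recursion until it converges, instead of solving the algebraic Riccati equation in closed form. The example model and noise levels are illustrative.

```python
import numpy as np

def steady_state_gain(F, H, Q, R, n_iter=500):
    """Iterate the predicted-covariance recursion for a time-invariant model
    (constant F, H, Q, R); return the resulting steady-state gain."""
    n = F.shape[0]
    P = np.eye(n)                                    # arbitrary initialization
    for _ in range(n_iter):
        S = H @ P @ H.T + R                          # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
        P_upd = (np.eye(n) - K @ H) @ P              # updated covariance
        P = F @ P_upd @ F.T + Q                      # re-predict
    return K, P

# Example with the DWNA CV model from the earlier slide (T = 1 s, q = 1, r = 10)
T = 1.0
F = np.array([[1.0, T], [0.0, 1.0]])
Gamma = np.array([[T**2 / 2.0], [T]])
Q = 1.0 * Gamma @ Gamma.T
H = np.array([[1.0, 0.0]])
R = np.array([[10.0]])
K_ss, P_ss = steady_state_gain(F, H, Q, R)
print(K_ss)    # K_ss = [alpha, beta/T]^T of the equivalent alpha-beta filter
```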
Kalman Filter:
                              Steady State Biases
     If a Kalman filter has reached steady-state, then it is possible to predict
      the filter’s bias resulting from un-modeled dynamics.

     Consider the CV model with an un-modeled constant acceleration (p. 13):


         x_k = F x_{k-1} + \Gamma \nu_{k-1} + G\lambda,
         \qquad
         x_k = \begin{bmatrix} \xi_k \\ \dot{\xi}_k \end{bmatrix},
         \qquad
         G = \begin{bmatrix} T^2/2 \\ T \end{bmatrix},
         \qquad
         K_{ss} = \begin{bmatrix} \alpha \\ \beta/T \end{bmatrix}

     where \lambda is the un-modeled acceleration.

    The steady-state error is found to be:

         \tilde{x}_{ss} = ( I - K_{ss} H )\,F\,\tilde{x}_{ss} + ( I - K_{ss} H )\,G\lambda
         \;\Rightarrow\;
         \tilde{x}_{ss} = \begin{bmatrix} \dfrac{(1-\alpha)\,T^2}{\beta} \\[6pt] \dfrac{(2\alpha-\beta)\,T}{2\beta} \end{bmatrix} \lambda
Kalman Filter – Summary #2
   The Kalman Gain:

     – Reaches steady-state for constant coefficient models
     – Can determine steady-state errors for un-modeled dynamics

   The Covariance Matrix:

      – Is only consistent when the model matches the true dynamics
     – Has no knowledge of the residuals

   The Kalman Estimator:

     – We need modifications for more general models
     – What about non-linear dynamics?
Non-Linear Filtering
Nonlinear Estimation Problems
    Previously, all dynamics and measurement models were linear. Now, we
     consider a broader scope of estimation problems:

         \dot{x} = f(x,t) + D\,u(t) + \tilde{v}(t)

         z(t) = h(x,t) + w(t)
   Nonlinear Dynamics:

     –   Ballistic Dynamics (TBM exo-atmospheric, Satellites, etc…)
      –   Drag/Thrust Dynamics (TBM re-entry, TBM Boost, etc…)


   Nonlinear Measurements:

     –   Spherical Measurements
     –   Angular Measurements
     –   Doppler Measurements
EKF - Nonlinear Dynamics
        The state propagation can be done using numerical integration or a
         Taylor Series Expansion (linearization):

        However, the linearization is necessary in order to propagate the
         covariance:
         x_k \approx F_{k-1} x_{k-1} + G_{k-1} u_{k-1} + \Gamma_{k-1} v_{k-1}

         F_{k-1} = \left. e^{\,\frac{\partial f}{\partial x}\,(t_k - t_{k-1})} \right|_{x=\hat{x}_{k-1|k-1}}
                 \approx I
                   + (t_k - t_{k-1}) \left. \frac{\partial f}{\partial x} \right|_{x=\hat{x}_{k-1|k-1}}
                   + \frac{(t_k - t_{k-1})^2}{2} \left. \frac{\partial^2 f}{\partial x^2} \right|_{x=\hat{x}_{k-1|k-1}}
                   + \cdots

     (∂f/∂x is the Jacobian matrix; ∂²f/∂x² is the Hessian matrix.)

        The state and covariance propagation are precisely as before:

         \hat{x}_{k|k-1} = F_{k-1}\,\hat{x}_{k-1|k-1} + G_{k-1} u_{k-1}

         P_{k|k-1} = F_{k-1}\,P_{k-1|k-1}\,F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T
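A sketch (assumed Python/NumPy; the numerical-Jacobian helper is illustrative, not from the slides) of an EKF prediction step that forms the first-order transition matrix F ≈ I + Δt · ∂f/∂x at the current estimate and propagates state and covariance:

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f evaluated at x."""
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros(x.size)
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2.0 * eps)
    return J

def ekf_predict(x_hat, P, f, Q, dt):
    """First-order EKF prediction: F ~ I + dt * df/dx at x_hat."""
    F = np.eye(x_hat.size) + dt * numerical_jacobian(f, x_hat)
    x_pred = x_hat + dt * f(x_hat)           # Euler propagation (RK4 also common)
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred
```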
EKF - Nonlinear Measurements
    We compute the linearization of the observation function:

         h(x_k) = h(\hat{x}_{k|k-1})
                  + H_k\,(x_k - \hat{x}_{k|k-1})
                  + \ddot{H}_k\,(x_k - \hat{x}_{k|k-1})^2 + \cdots

         H_k = \left. \frac{\partial h}{\partial x} \right|_{x=\hat{x}_{k|k-1}}
         \;\;\text{(Jacobian matrix)},
         \qquad
         \ddot{H}_k = \left. \frac{\partial^2 h}{\partial x^2} \right|_{x=\hat{x}_{k|k-1}}
         \;\;\text{(Hessian matrix)}

    The residual is thus:

         \eta_k \triangleq z_k - \hat{z}_{k|k-1} = z_k - h(\hat{x}_{k|k-1}) = H_k\,\tilde{x}_{k|k-1} + w_k

    The covariance update and Kalman gain are precisely as before (pp. 381-386):

         \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \eta_k

         P_{k|k} = ( I - K_k H_k )\,P_{k|k-1}\,( I - K_k H_k )^T + K_k R_k K_k^T

         K_k = P_{k|k-1} H_k^T S_k^{-1}
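A minimal sketch (assumed Python/NumPy, not from the slides) of the EKF measurement update as written above, using the Joseph-form covariance update:

```python
import numpy as np

def ekf_update(x_pred, P_pred, z, h, H, R):
    """EKF measurement update in Joseph form.
    h is the nonlinear measurement function; H is its Jacobian at x_pred."""
    eta = z - h(x_pred)                            # residual (innovation)
    S = H @ P_pred @ H.T + R                       # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)            # Kalman gain
    x_upd = x_pred + K @ eta
    I_KH = np.eye(x_pred.size) - K @ H
    P_upd = I_KH @ P_pred @ I_KH.T + K @ R @ K.T   # Joseph form: robust to rounding
    return x_upd, P_upd
```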
Polar Measurements
    Previously, we dealt with unrealistic observation models that assumed
     that measurements were Cartesian. Polar measurements are more
     typical. In this case, the observation function is nonlinear:

         z_k = h(x_k) + w_k,
         \qquad
         h(x_k) = \begin{bmatrix} r \\ b \end{bmatrix}
                = \begin{bmatrix} \sqrt{x^2 + y^2} \\ \tan^{-1}(x/y) \end{bmatrix},
         \qquad
         x_k = \begin{bmatrix} x_k & \dot{x}_k & y_k & \dot{y}_k \end{bmatrix}^T

    The Kalman Gain and Covariance Update only require the Jacobian of
     this observation function:

         H = \frac{\partial h}{\partial x}
           = \begin{bmatrix}
               \dfrac{x}{\sqrt{x^2+y^2}} & 0 & \dfrac{y}{\sqrt{x^2+y^2}} & 0 \\[8pt]
               \dfrac{y}{x^2+y^2}        & 0 & \dfrac{-x}{x^2+y^2}       & 0
             \end{bmatrix}
    This Jacobian is evaluated at the extrapolated estimate.
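A sketch (assumed Python/NumPy, not from the slides; the [x, ẋ, y, ẏ] state ordering is an assumption consistent with the Jacobian above) of the polar measurement function and its Jacobian, evaluated at the extrapolated estimate:

```python
import numpy as np

def h_polar(x):
    """Range/bearing measurement of a [x, xdot, y, ydot] state;
    bearing is measured from the y-axis, matching tan^{-1}(x/y) above."""
    px, _, py, _ = x
    return np.array([np.hypot(px, py), np.arctan2(px, py)])

def H_polar(x):
    """Jacobian of h_polar at the extrapolated estimate."""
    px, _, py, _ = x
    r2 = px**2 + py**2
    r = np.sqrt(r2)
    return np.array([[px / r,  0.0,  py / r,   0.0],
                     [py / r2, 0.0, -px / r2,  0.0]])
```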
Ballistic Dynamics
    As a common example of nonlinear dynamics, consider the ballistic
     propagation equations specified in ECEF coordinates:

         \frac{d}{dt}\begin{bmatrix} x & y & z & \dot{x} & \dot{y} & \dot{z} \end{bmatrix}^T
         =
         \begin{bmatrix} \dot{x} & \dot{y} & \dot{z} &
                          2\omega\dot{y} + \omega^2 x + G_x &
                          -2\omega\dot{x} + \omega^2 y + G_y &
                          G_z \end{bmatrix}^T
         + D\,\tilde{v}(t)

    The gravitational acceleration components (to second order) are:

         G = \begin{bmatrix} G_x \\ G_y \\ G_z \end{bmatrix}
           = \begin{bmatrix}
               \dfrac{-\mu x}{R^3}\left[ 1 + \dfrac{3}{2} J_2 \left( \dfrac{R_e}{R} \right)^2 \left( 1 - 5\dfrac{z^2}{R^2} \right) \right] \\[10pt]
               \dfrac{-\mu y}{R^3}\left[ 1 + \dfrac{3}{2} J_2 \left( \dfrac{R_e}{R} \right)^2 \left( 1 - 5\dfrac{z^2}{R^2} \right) \right] \\[10pt]
               \dfrac{-\mu z}{R^3}\left[ 1 + \dfrac{3}{2} J_2 \left( \dfrac{R_e}{R} \right)^2 \left( 3 - 5\dfrac{z^2}{R^2} \right) \right]
             \end{bmatrix}
Thrust and Re-entry Dynamics
    As a further example of nonlinear dynamics, consider the ballistic ECEF
     propagation equations augmented with axial thrust/drag and mass-depletion
     states:

         \frac{d}{dt}\begin{bmatrix} x & y & z & \dot{x} & \dot{y} & \dot{z} & a & \beta \end{bmatrix}^T
         =
         \begin{bmatrix} \dot{x} & \dot{y} & \dot{z} &
                          \ddot{x}_{Ballistic} + a\,\dfrac{\dot{x}}{s} &
                          \ddot{y}_{Ballistic} + a\,\dfrac{\dot{y}}{s} &
                          G_z + a\,\dfrac{\dot{z}}{s} &
                          a\beta & -\beta^2 \end{bmatrix}^T
         + D\,\tilde{v}(t)

     (here s denotes the target speed, so \dot{x}/s, \dot{y}/s, \dot{z}/s are the
      components of the velocity unit vector)

    The new states are the relative axial acceleration “a” and the relative
     mass depletion rate “β”:

         a(t) = a_{thrust}(t) - a_{drag}(t)
              = \frac{T + \tfrac{1}{2}\, C_D A_C\,\rho\,( \vec{v} \cdot \vec{v} )}{m(t)},
         \qquad
         \beta(t) = \frac{\dot{m}(t)}{m(t)}
    The process noise matrix (if extended to second order) becomes a
     function of the speed. Thus, a more rapidly accelerating target will
     have more process noise injected into the filter.
Pseudo-Measurements
   In the case of the TBM dynamics, the ECEF coordinates are the most
    tractable coordinates.

   However, typically the measurements are in spherical coordinates.

   Furthermore, the Jacobian for the conversion from ECEF to RBE is
    extremely complicated.

    Instead, we can convert the measurements into ECEF as follows:

         z'_k = I_{3\times 3}\, x_k + w'_k,
         \qquad
         R'_k = J_{meas}\, R_k\, J_{meas}^T


   However, since this is a linearization, we must be careful to make sure
    that this approximation holds.
Pseudo-Measurements




    The linearization is valid so long as (pp. 397-402):

         \frac{r\,\sigma_b^2}{\sigma_r} < 0.4
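A sketch (assumed Python/NumPy, not from the slides) of the converted-measurement idea in a simplified 2-D range/bearing setting (the slides use the full RBE-to-ECEF conversion): map the measurement to Cartesian coordinates, map its covariance through the conversion Jacobian as R' = J R J^T, and apply the validity check quoted above. All names are illustrative.

```python
import numpy as np

def polar_to_cartesian_measurement(r, b, sigma_r, sigma_b):
    """Convert a range/bearing measurement to a Cartesian pseudo-measurement.
    Bearing is measured from the y-axis, as on the polar-measurement slide."""
    z_cart = np.array([r * np.sin(b), r * np.cos(b)])
    J = np.array([[np.sin(b),  r * np.cos(b)],
                  [np.cos(b), -r * np.sin(b)]])        # d(x, y) / d(r, b)
    R_polar = np.diag([sigma_r**2, sigma_b**2])
    R_cart = J @ R_polar @ J.T                         # R' = J R J^T
    # Linearization validity check from the slide: r * sigma_b^2 / sigma_r < 0.4
    valid = (r * sigma_b**2) / sigma_r < 0.4
    return z_cart, R_cart, valid
```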
Iterated EKF
   The IEKF iteratively computes the state “n” times during a single update.
    This recursion is based on re-linearization about the estimate:

         H_k^i = \left. \frac{\partial h}{\partial x} \right|_{\hat{x}_{k|k}^i}
         \quad\text{where}\quad
         H_k^0 = \left. \frac{\partial h}{\partial x} \right|_{\hat{x}_{k|k-1}},
         \qquad i = 0, 1, \ldots, n

    The state is updated iteratively with a re-linearized residual and gain:

         \hat{x}_{k|k}^{\,i+1} = \hat{x}_{k|k-1} + K_k^i\,\eta_k^i

         \eta_k^i = z_k - h(\hat{x}_{k|k}^i) - \left. H_k^i \right|_{\hat{x}_{k|k}^i} \left[ \hat{x}_{k|k-1} - \hat{x}_{k|k}^i \right]

         K_k^i = P_{k|k-1} \left( \left. H_k^i \right|_{\hat{x}_{k|k}^i} \right)^T
                 \left[ \left. H_k^i \right|_{\hat{x}_{k|k}^i}\, P_{k|k-1} \left( \left. H_k^i \right|_{\hat{x}_{k|k}^i} \right)^T + R_k \right]^{-1}
                                                                           
   Finally, the covariance is computed based upon the values of the final
    iteration:
         P_{k|k} = \left( I - K_k^n \left. H_k^n \right|_{\hat{x}_{k|k}^n} \right) P_{k|k-1}
                   \left( I - K_k^n \left. H_k^n \right|_{\hat{x}_{k|k}^n} \right)^T
                   + K_k^n R_k \left( K_k^n \right)^T
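A sketch of the IEKF recursion (assumed Python/NumPy, not from the slides; `H_of(x)` is an illustrative callback returning the measurement Jacobian at x). The covariance here reuses the gain and Jacobian from the last iteration, which is a common simplification of the slide's final-iteration expression.

```python
import numpy as np

def iekf_update(x_pred, P_pred, z, h, H_of, R, n_iter=3):
    """Iterated EKF update: re-linearize about the current iterate each pass."""
    x_i = x_pred.copy()
    for _ in range(n_iter):
        H = H_of(x_i)                               # re-linearize at x_i
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        eta = z - h(x_i) - H @ (x_pred - x_i)       # re-linearized residual
        x_i = x_pred + K @ eta
    # Covariance from the final gain/Jacobian (Joseph form)
    I_KH = np.eye(x_pred.size) - K @ H
    P_upd = I_KH @ P_pred @ I_KH.T + K @ R @ K.T
    return x_i, P_upd
```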
Multiple-Model Filtering
Why Multiple Models?
   When the target dynamics differ from the modeled dynamics, the state
    estimates are subject to:

     – Biases (Lags) in the state estimate

     – Inconsistent covariance output

     – In a tracking environment, this increases the probability of mis-
       association and track loss

   In most tracking applications, the true target dynamics have an unknown
    time dependence.

   To accommodate changing target dynamics, one can develop multiple
    target dynamics models and perform hypothesis testing.

   This approach is called hybrid state estimation.
Why Multiple Models?
    Assuming constant velocity (CV) target dynamics, the estimation errors
     become inconsistent during an acceleration: the true state estimate error
     exceeds the confidence interval given by the Kalman covariance.

     [Figure: 6-State Velocity Error (Meters/Sec); x = red, y = blue,
      z = green, True Error = black.]
Why Multiple Models?
    A Constant Acceleration model remains consistent; however, the
     steady-state estimation error is larger.

     [Figure: 9-State Velocity Error (Meters/Sec); x = red, y = blue,
      z = green, True Error = black.]
Adaptive Process Noise
    Since the normalized innovation squared indicates the consistency of
     the dynamics model, it can be monitored to detect deviations (pp. 424-426).

    At each update, perform the following threshold test:

         d_k^2 = \eta_k^T S_k^{-1} \eta_k \ge \varepsilon_{\max},
         \qquad
         P\left\{ d_k^2 \ge \varepsilon_{\max} \right\} = \alpha

   Then, the process noise value is adjusted such that the statistical distance
    is equal to this threshold value:
         \left. \eta_k^T S_k^{-1} \eta_k \right|_{q} = \varepsilon_{\max}

   The disadvantage is that false alarms result in sudden increases in error.

    We can use a sliding window average of these residuals; however, this can
     delay the detection of a maneuver (pp. 424-426).
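A minimal sketch (assumed Python/NumPy/SciPy, not from the slides) of the single-update threshold test: compare the normalized innovation squared against a chi-square threshold chosen so that P{d² ≥ ε_max} = α.

```python
import numpy as np
from scipy.stats import chi2

def nis_maneuver_test(eta, S, alpha=0.01):
    """Normalized innovation squared (NIS) test for a possible maneuver."""
    d2 = float(eta.T @ np.linalg.inv(S) @ eta)        # statistical distance
    eps_max = chi2.ppf(1.0 - alpha, df=eta.size)      # threshold for this alpha
    return d2, eps_max, d2 >= eps_max
```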
State Estimation – Multiple Models
   We can further assume that the true dynamics is one of “n” models:
         x_k^r = F_{k-1}^r x_{k-1}^r + v_{k-1}^r,
         \qquad r = 1, 2, \ldots, n

         z_k^r = H_k^r x_k^r + w_k^r,
         \qquad r = 1, 2, \ldots, n

    Using Kalman filter outputs, each model likelihood function can be
     computed:

         \Lambda_k^r = \frac{\exp\left[ -\tfrac{1}{2}\, (\eta_k^r)^T (S_k^r)^{-1}\, \eta_k^r \right]}
                            {\sqrt{ \det\left[ 2\pi S_k^r \right] }},
         \qquad r = 1, 2, \ldots, n
    At each filter update, the posterior model probabilities “\mu_k^r” are
     computed recursively using Bayes’ Theorem. The proper output can be
     selected using these probabilities (pp. 453-457); a sketch of this
     recursion follows.
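A sketch (assumed Python/NumPy, illustrative function names) of the per-model Gaussian likelihood and the Bayes recursion for the posterior model probabilities used by the static multiple-model estimator on the next slide:

```python
import numpy as np

def model_likelihood(eta, S):
    """Gaussian likelihood of one model's innovation (Lambda_k^r above)."""
    d2 = float(eta.T @ np.linalg.solve(S, eta))
    return np.exp(-0.5 * d2) / np.sqrt(np.linalg.det(2.0 * np.pi * S))

def update_model_probabilities(likelihoods, mu_prev):
    """Bayes recursion: mu_k^r is proportional to Lambda_k^r * mu_{k-1}^r."""
    mu = np.asarray(likelihoods) * np.asarray(mu_prev)
    return mu / mu.sum()
```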
State Estimation – SMM

     [Block diagram: the measurement feeds a bank of independent Kalman filters
      (Hypothesis 1, Hypothesis 2, ..., Hypothesis “n”), each producing an
      estimate \hat{x}_k^i. The model probabilities are updated as

         \mu_k^i = \frac{ \Lambda_k^i\, \mu_{k-1}^i }{ \sum_{j=1}^{n} \Lambda_k^j\, \mu_{k-1}^j }

      and hypothesis selection outputs the most probable state estimate \hat{x}_k.]
   Each Kalman filter is updated independently and has no knowledge about
    the performance of any other filter.

   This approach assumes that the target dynamics are time-independent
State Estimation – IMM
     [Block diagram: the previous-cycle estimates \hat{x}_{k-1}^1, \ldots, \hat{x}_{k-1}^n
      pass through a conditional probability update / state estimate interaction
      stage; each hypothesis filter (Hypothesis 1, ..., Hypothesis “n”) is then
      updated with the measurement to produce \hat{x}_k^i; the probability updates
      and estimate mixing yield the IMM estimate \hat{x}_k.]
    Each Kalman filter interacts with others just prior to an update

    This interaction allows for the possibility of a transition

     This approach assumes that the target dynamics change according to a
      Markov process (pp. 453-457).
State Estimation – IMM
     [Block diagram (two-model example): the prior estimates
      \hat{x}^1_{k-1|k-1}, P^1_{k-1|k-1} and \hat{x}^2_{k-1|k-1}, P^2_{k-1|k-1}
      are combined in the interaction stage using the probabilities
      \mu^1_{k-1}, \mu^2_{k-1} to form the mixed initial conditions
      \hat{x}^{0,1}_{k-1|k-1}, P^{0,1}_{k-1|k-1} and \hat{x}^{0,2}_{k-1|k-1}, P^{0,2}_{k-1|k-1}.
      Each Kalman filter processes z_k and produces \hat{x}^r_{k|k}, P^r_{k|k} and a
      likelihood \Lambda^r_k; the probability updates and estimate mixing then
      yield the combined output \hat{x}_{k|k}, P_{k|k}.]
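A sketch of the IMM interaction (mixing) step under the usual IMM equations (assumed Python/NumPy; names and the Markov matrix convention are illustrative, not taken verbatim from the slides):

```python
import numpy as np

def imm_mixing(x_list, P_list, mu_prev, Pi):
    """IMM interaction step: mixed initial conditions for each model filter.
    x_list[j], P_list[j]: model j's previous estimate and covariance.
    mu_prev: previous model probabilities.  Pi[j, i]: transition prob j -> i."""
    n = len(x_list)
    c_bar = Pi.T @ mu_prev                               # predicted model probabilities
    x0, P0 = [], []
    for i in range(n):
        w = Pi[:, i] * mu_prev / c_bar[i]                # mixing weights mu_{j|i}
        x_mix = sum(w[j] * x_list[j] for j in range(n))
        P_mix = sum(w[j] * (P_list[j]
                    + np.outer(x_list[j] - x_mix, x_list[j] - x_mix))
                    for j in range(n))
        x0.append(x_mix)
        P0.append(P_mix)
    return x0, P0, c_bar
```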
State Estimation – Applied IMM
    The IMM adapts to changes in target dynamics and provides a consistent
     covariance during these transitions.

     [Figure: 6-State Velocity Error (Meters/Sec); x = red, y = blue,
      z = green, True Error = black.]
State Estimation – IMM Markov Matrix
   The particular choice of the Markov Matrix is somewhat of an art.

   Just like any filter tuning process, one can choose a Markov Matrix
    simply based upon observed performance.

   Alternatively, this transition matrix has a physical relationship to the
    Mean Sojourn Time of a given dynamics state.

         E[N_i] = \frac{1}{1 - p_{ii}},
         \qquad
         p_{ii} = 1 - \frac{1}{E[N_i]} = 1 - \frac{T_{scan}}{E[\tau_i]}
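A one-line sketch (assumed Python, illustrative names) of the relationship above between the diagonal Markov entry and the mean sojourn time of a dynamics mode:

```python
def self_transition_probability(T_scan, mean_sojourn_time):
    """p_ii = 1 - T_scan / E[tau_i], from E[N_i] = 1 / (1 - p_ii)."""
    return 1.0 - T_scan / mean_sojourn_time

# e.g. a 1 s scan and a 20 s expected mode duration give p_ii = 0.95
print(self_transition_probability(T_scan=1.0, mean_sojourn_time=20.0))
```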
State Estimation – VSIMM
     Air Targets: Adaptive Grid Coordinated Turning Model

     TBM Targets: Constant Axial Thrust, Ballistic, Singer ECA

     [Block diagram: the measurement drives an Air IMM (output \hat{x}_k^{Air})
      and a TBM IMM (output \hat{x}_k^{TBM}); an SPRT on the mode probabilities
      drives hypothesis selection of the final output \hat{x}_k.]
     The SPRT is performed on the ratio of the mode probabilities:

         \frac{\mu_k^{Air}}{\mu_k^{TBM}}
         \;\Rightarrow\;
         \begin{cases}
           \hat{x}_k^{Air} & \text{if } > T_2 \\
           \text{mixed}    & \text{if } > T_1 \text{ and } < T_2 \\
           \hat{x}_k^{TBM} & \text{if } < T_1
         \end{cases}
         \qquad
         T_2 = \frac{1-\beta}{\alpha},
         \quad
         T_1 = \frac{\beta}{1-\alpha}
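A minimal sketch (assumed Python, illustrative names) of this SPRT decision between the Air and TBM hypotheses with Wald's thresholds:

```python
def sprt_decision(mu_air, mu_tbm, alpha=0.01, beta=0.01):
    """Wald SPRT on the Air/TBM mode-probability ratio."""
    T2 = (1.0 - beta) / alpha
    T1 = beta / (1.0 - alpha)
    ratio = mu_air / mu_tbm
    if ratio > T2:
        return "air"        # select the Air IMM output
    if ratio < T1:
        return "tbm"        # select the TBM IMM output
    return "mixed"          # remain undecided: output the mixed estimate
```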
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering
2002 Technical Conference on Tracking and Filtering

Weitere ähnliche Inhalte

Was ist angesagt?

Signal Processing Course : Convex Optimization
Signal Processing Course : Convex OptimizationSignal Processing Course : Convex Optimization
Signal Processing Course : Convex OptimizationGabriel Peyré
 
Signal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems RegularizationSignal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems RegularizationGabriel Peyré
 
Bouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo CameraBouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo CameraYuji Oyamada
 
Low Complexity Regularization of Inverse Problems - Course #1 Inverse Problems
Low Complexity Regularization of Inverse Problems - Course #1 Inverse ProblemsLow Complexity Regularization of Inverse Problems - Course #1 Inverse Problems
Low Complexity Regularization of Inverse Problems - Course #1 Inverse ProblemsGabriel Peyré
 
Advanced algebra (some terminologies)
Advanced algebra (some terminologies)Advanced algebra (some terminologies)
Advanced algebra (some terminologies)aufpaulalonzo
 
2003 Ames.Models
2003 Ames.Models2003 Ames.Models
2003 Ames.Modelspinchung
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Leonid Zhukov
 
An implicit partial pivoting gauss elimination algorithm for linear system of...
An implicit partial pivoting gauss elimination algorithm for linear system of...An implicit partial pivoting gauss elimination algorithm for linear system of...
An implicit partial pivoting gauss elimination algorithm for linear system of...Alexander Decker
 
Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...
Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...
Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...Gabriel Peyré
 
Signal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsSignal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsGabriel Peyré
 
03 convexfunctions
03 convexfunctions03 convexfunctions
03 convexfunctionsSufyan Sahoo
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorizationrubyyc
 
Numerical analysis convexity, concavity
Numerical analysis  convexity, concavityNumerical analysis  convexity, concavity
Numerical analysis convexity, concavitySHAMJITH KM
 
Model Selection with Piecewise Regular Gauges
Model Selection with Piecewise Regular GaugesModel Selection with Piecewise Regular Gauges
Model Selection with Piecewise Regular GaugesGabriel Peyré
 
Mesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic SamplingMesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic SamplingGabriel Peyré
 
On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...Nikita V. Artamonov
 

Was ist angesagt? (20)

Signal Processing Course : Convex Optimization
Signal Processing Course : Convex OptimizationSignal Processing Course : Convex Optimization
Signal Processing Course : Convex Optimization
 
Signal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems RegularizationSignal Processing Course : Inverse Problems Regularization
Signal Processing Course : Inverse Problems Regularization
 
Bouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo CameraBouguet's MatLab Camera Calibration Toolbox for Stereo Camera
Bouguet's MatLab Camera Calibration Toolbox for Stereo Camera
 
Low Complexity Regularization of Inverse Problems - Course #1 Inverse Problems
Low Complexity Regularization of Inverse Problems - Course #1 Inverse ProblemsLow Complexity Regularization of Inverse Problems - Course #1 Inverse Problems
Low Complexity Regularization of Inverse Problems - Course #1 Inverse Problems
 
Mo u quantified
Mo u   quantifiedMo u   quantified
Mo u quantified
 
Advanced algebra (some terminologies)
Advanced algebra (some terminologies)Advanced algebra (some terminologies)
Advanced algebra (some terminologies)
 
2003 Ames.Models
2003 Ames.Models2003 Ames.Models
2003 Ames.Models
 
Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.Numerical Linear Algebra for Data and Link Analysis.
Numerical Linear Algebra for Data and Link Analysis.
 
YSC 2013
YSC 2013YSC 2013
YSC 2013
 
Digital fiiter
Digital fiiterDigital fiiter
Digital fiiter
 
1533 game mathematics
1533 game mathematics1533 game mathematics
1533 game mathematics
 
An implicit partial pivoting gauss elimination algorithm for linear system of...
An implicit partial pivoting gauss elimination algorithm for linear system of...An implicit partial pivoting gauss elimination algorithm for linear system of...
An implicit partial pivoting gauss elimination algorithm for linear system of...
 
Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...
Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...
Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guaran...
 
Signal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse ProblemsSignal Processing Course : Sparse Regularization of Inverse Problems
Signal Processing Course : Sparse Regularization of Inverse Problems
 
03 convexfunctions
03 convexfunctions03 convexfunctions
03 convexfunctions
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorization
 
Numerical analysis convexity, concavity
Numerical analysis  convexity, concavityNumerical analysis  convexity, concavity
Numerical analysis convexity, concavity
 
Model Selection with Piecewise Regular Gauges
Model Selection with Piecewise Regular GaugesModel Selection with Piecewise Regular Gauges
Model Selection with Piecewise Regular Gauges
 
Mesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic SamplingMesh Processing Course : Geodesic Sampling
Mesh Processing Course : Geodesic Sampling
 
On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...
 

Ähnlich wie 2002 Technical Conference on Tracking and Filtering

Engg maths k notes(4)
Engg maths k notes(4)Engg maths k notes(4)
Engg maths k notes(4)Ranjay Kumar
 
Markov Tutorial CDC Shanghai 2009
Markov Tutorial CDC Shanghai 2009Markov Tutorial CDC Shanghai 2009
Markov Tutorial CDC Shanghai 2009Sean Meyn
 
Linear Algebra and Matrix
Linear Algebra and MatrixLinear Algebra and Matrix
Linear Algebra and Matrixitutor
 
CBSE Class 12 Mathematics formulas
CBSE Class 12 Mathematics formulasCBSE Class 12 Mathematics formulas
CBSE Class 12 Mathematics formulasParth Kshirsagar
 
Inverse Matrix & Determinants
Inverse Matrix & DeterminantsInverse Matrix & Determinants
Inverse Matrix & Determinantsitutor
 
Discussion of Matti Vihola's talk
Discussion of Matti Vihola's talkDiscussion of Matti Vihola's talk
Discussion of Matti Vihola's talkChristian Robert
 
On the Stick and Rope Problem - Draft 1
On the Stick and Rope Problem - Draft 1On the Stick and Rope Problem - Draft 1
On the Stick and Rope Problem - Draft 1Iwan Pranoto
 
Rational Exponents
Rational ExponentsRational Exponents
Rational ExponentsPhil Saraspe
 
Determinants - Mathematics
Determinants - MathematicsDeterminants - Mathematics
Determinants - MathematicsDrishti Bhalla
 
Lecture 4 chapter 1 review section 2-1
Lecture 4   chapter 1 review section 2-1Lecture 4   chapter 1 review section 2-1
Lecture 4 chapter 1 review section 2-1njit-ronbrown
 
Weatherwax cormen solutions
Weatherwax cormen solutionsWeatherwax cormen solutions
Weatherwax cormen solutionskirankoushik
 
Integral table
Integral tableIntegral table
Integral tablebags07
 
How to design a linear control system
How to design a linear control systemHow to design a linear control system
How to design a linear control systemAlireza Mirzaei
 
Mathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdf
Mathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdfMathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdf
Mathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdf9866560321sv
 

Ähnlich wie 2002 Technical Conference on Tracking and Filtering (20)

Engg maths k notes(4)
Engg maths k notes(4)Engg maths k notes(4)
Engg maths k notes(4)
 
3.-Matrix.pdf
3.-Matrix.pdf3.-Matrix.pdf
3.-Matrix.pdf
 
Markov Tutorial CDC Shanghai 2009
Markov Tutorial CDC Shanghai 2009Markov Tutorial CDC Shanghai 2009
Markov Tutorial CDC Shanghai 2009
 
Linear Algebra and Matrix
Linear Algebra and MatrixLinear Algebra and Matrix
Linear Algebra and Matrix
 
CBSE Class 12 Mathematics formulas
CBSE Class 12 Mathematics formulasCBSE Class 12 Mathematics formulas
CBSE Class 12 Mathematics formulas
 
Inverse Matrix & Determinants
Inverse Matrix & DeterminantsInverse Matrix & Determinants
Inverse Matrix & Determinants
 
Discussion of Matti Vihola's talk
Discussion of Matti Vihola's talkDiscussion of Matti Vihola's talk
Discussion of Matti Vihola's talk
 
On the Stick and Rope Problem - Draft 1
On the Stick and Rope Problem - Draft 1On the Stick and Rope Problem - Draft 1
On the Stick and Rope Problem - Draft 1
 
Rational Exponents
Rational ExponentsRational Exponents
Rational Exponents
 
Determinants - Mathematics
Determinants - MathematicsDeterminants - Mathematics
Determinants - Mathematics
 
Lecture 4 chapter 1 review section 2-1
Lecture 4   chapter 1 review section 2-1Lecture 4   chapter 1 review section 2-1
Lecture 4 chapter 1 review section 2-1
 
Matrices 1.pdf
Matrices 1.pdfMatrices 1.pdf
Matrices 1.pdf
 
Weatherwax cormen solutions
Weatherwax cormen solutionsWeatherwax cormen solutions
Weatherwax cormen solutions
 
Assignment6
Assignment6Assignment6
Assignment6
 
Matlab/R Dictionary
Matlab/R DictionaryMatlab/R Dictionary
Matlab/R Dictionary
 
Future CMB Experiments
Future CMB ExperimentsFuture CMB Experiments
Future CMB Experiments
 
Integral table
Integral tableIntegral table
Integral table
 
How to design a linear control system
How to design a linear control systemHow to design a linear control system
How to design a linear control system
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Mathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdf
Mathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdfMathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdf
Mathematical-Formula-Handbook.pdf-76-watermark.pdf-68.pdf
 

Kürzlich hochgeladen

Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 

Kürzlich hochgeladen (20)

Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 

2002 Technical Conference on Tracking and Filtering

  • 3. Overview  Mathematics Overview – Linear Algebra and Linear Systems – Probability and Hypothesis Testing – State Estimation  Filtering Fundamentals – Linear and Non-linear Filtering – Multiple Model Filtering  Tracking Basics – Track Maintenance – Data Association Techniques – Activity Control
  • 5. Mathematics Review  Linear Algebra and Linear Systems – Definitions, Notations, Jacobians and Matrix Inversion Lemma – State-Space Representation (Continuous and Discrete) and Observability  Probability Basics – Probability, Conditional Probability, Baye’s and Total Probability Theorem – Random Variables, Gaussian Mixture, and Covariance Matrices  Bayesian Hypothesis Testing – Neyman-Pearson Lemma and Wald’s Theorem – Chi-Square Distribution  Estimation Basics – Maximum Likelihood (ML) and Maximum A Posteriori (MAP) Estimators – Least Squares (LS) and Minimum Mean Square Error (MMSE) Estimators – Cramer-Rao Lower Bound, Fisher Information, Consistency and Efficiency
  • 7. Definitions and Notations  a1  a  T  a = [ ai ] =  2  a = [ a1 a2  an ]    an   a11 a12  a1m   a11 a21  an1  a  a2 m  a  an 2  [ ] A = aij =  21   a22      [ ] AT = a ji =  12   a22          an1 an 2  anm  a1m a2 m  anm 
  • 8. Basic Matrix and Vector Properties  Symmetric and Skew Symmetric Matrix A = AT A = − AT  Matrix Product (NxS = [NxM] [MxS]): [ ] m C = cij = AB = ∑ aik bkj k =1  Transpose of Matrix Product [ ] m C = c ji = ( AB ) = B A = ∑ b jk aki T T T T k =1  Matrix Inverse AA−1 = I
  • 9. Basic Matrix and Vector Properties  Inner Product (Vectors must have equal length) n 〈 a, b〉 = a b = ∑ ai bi T i =1  Outer Product (NxM = [N] [M]) [ ] [ abT = C = cij = ai b j ]  Matrix Trace n Tr ( A) = ∑ aii = Tr ( AT ) i =1  Trace of Matrix Product ∂ ∂ ( AB ) Tr ( AB) = Tr ( BA) ∂A (Tr ( ABAT ) ) = A( B + BT ) = BT ∂A
  • 10. Matrix Inversion Lemma  In Estimation Theory, the following complicated inverse appears: (P −1 +H R H T −1 ) −1  The Matrix Inversion Lemma yields an alternative expression which does not depend on the inverses of the matrices in the above expression: ( P − PH T HPH T + R ) −1 HP  An alternative form of the Matrix Inversion Lemma is: ( A + BCB )T −1 −1 = A − A B B A B+C−1 ( T −1 ) −1 −1 B T A−1
  • 11. The Gradient  The Gradient operator with respect to an n-dimensional vector “x” is: T  ∂ ∂  ∇x =      ∂x1 ∂xn   Thus the gradient of a scalar function “f” is: T  ∂f ∂f  ∇x f =      ∂x1 ∂xn   The gradient of an m-dimensional vector-valued function is: T T  ∂ ∂    ∇x f =     [ f1 ( x )  f m ( x )] = NxM  ∂x1 ∂xn 
  • 12. The Jacobian Matrix  The Jacobian Matrix is a matrix of derivatives describing a linear mapping from one set of coordinates to another. This is the transpose of the gradient of a vector-valued function (p. 24):  ∂x1 ∂x1   ∂x′   ∂xn  ′   ∂x ∂ ( x1 , x2 ,..., xm )  1  J ( x , x ′) =  = =     ∂x ′ ∂ ( x1 , x′ ,..., xn ) ′ 2 ′  ∂xm  ∂xm   ∂x1 ′ ∂xn  ′    This is typically used as part of a Vector Taylor Expansion for approximating a transformation.     ∂x   x = x ( xo ) +  ⋅ ( x ′ − xo ) +  ′ ′ ∂x ′ xo′ 
  • 13. The Jacobian Matrix: An Example  The conversion from Spherical to Cartesian coordinates yields:  x = [ r Sin(b) Cos (e) r Cos (b) Cos (e) r Sin(e)]  x ′ = [ r b e]  Sin(b) Cos (e) r Cos (b) Cos (e) − r Sin(b) Sin(e)  J ( x , x ′) = Cos (b) Cos (e) − r Sin(b) Cos (e) − r Cos (b) Sin(e)       Sin(e) 0 r Cos (e)         ∂x ∂x ′ J ( x , x ′) J ( x ′, x ) =   = I ∂x ′ ∂x
  • 15. Dirac Delta Function  The Dirac Delta Function is defined by: δ (t − τ ) = 0 ∀t ≠ τ  This function is defined by its behavior under integration: τ ∈ [ a, b ] b ∫ δ (t − τ )dt = 1 a  In general, the Dirac Delta Function has the following “sifting” behavior: f (t ) δ (t − τ )dt = f (τ ) τ ∈ [ a, b] b ∫a  The discrete version of this is called the Kronnecker Delta: 0 ∀i ≠ j δ ij =   1 i= j
• 16. State-Space Representation (Continuous)
   A dynamic equation is typically expressed in the standard form (p. 27):  $\dot{\vec x}(t) = A(t)\vec x(t) + B(t)\vec u(t)$
     – $\vec x(t)$ is the state vector of dimension "nx"
     – $\vec u(t)$ is the control input vector of dimension "ny"
     – $A(t)$ is the system matrix of dimension "nx x nx"
     – $B(t)$ is the input gain matrix of dimension "nx x ny"
   While the measurement equation is expressed in the standard form:  $\vec z(t) = C(t)\vec x(t)$
     – $\vec z(t)$ is the measurement vector of dimension "nz"
     – $C(t)$ is the observation matrix of dimension "nz x nx"
• 17. Example State-Space System
   A typical (simple) example is the constant velocity system:  $\ddot{\xi}(t) = 0$
   Written in state-space form, this becomes:
     $\begin{bmatrix} \dot{\xi} \\ \ddot{\xi} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \xi \\ \dot{\xi} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$,  i.e. $\dot{\vec x}(t) = A(t)\vec x(t) + B(t)\vec u(t)$
   And suppose that we only have position measurements available:
     $\xi_{meas} = [\,1 \;\; 0\,] \begin{bmatrix} \xi \\ \dot{\xi} \end{bmatrix}$,  i.e. $\vec z(t) = C(t)\vec x(t)$
• 18. State-Space Representation (Discrete)
   A continuous state-space system can also be written in discrete form (p. 29):  $\vec x_k = F_{k-1}\vec x_{k-1} + G_{k-1}\vec u_{k-1}$
     – $\vec x_k$ is the state vector of dimension "nx" at time "k"
     – $\vec u_k$ is the control input vector of dimension "ny" at time "k"
     – $F_k$ is the transition matrix of dimension "nx x nx" at time "k"
     – $G_k$ is the input gain matrix of dimension "nx x ny" at time "k"
   While the measurement equation is expressed in the discrete form:  $\vec z_k = H_k \vec x_k$
     – $\vec z_k$ is the measurement vector of dimension "nz" at time "k"
     – $H_k$ is the observation matrix of dimension "nz x nx" at time "k"
• 19. Example Revisited in Discrete Time
   The constant velocity discrete time model is given by:
     $\begin{bmatrix} \xi_k \\ \dot{\xi}_k \end{bmatrix} = \begin{bmatrix} 1 & t_k - t_{k-1} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \xi_{k-1} \\ \dot{\xi}_{k-1} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u_{1,k-1} \\ u_{2,k-1} \end{bmatrix}$,  i.e. $\vec x_k = F_{k-1}\vec x_{k-1} + G_{k-1}\vec u_{k-1}$
   Since there is no time-dependence in the measurement equation, it is a trivial extension of the continuous example:
     $\xi_{meas,k} = [\,1 \;\; 0\,] \begin{bmatrix} \xi_k \\ \dot{\xi}_k \end{bmatrix}$,  i.e. $\vec z_k = H_k \vec x_k$
• 20. State Transition Matrix
   We wish to be able to convert a continuous linear system to a discrete time linear system. Most physical problems are easily expressed in continuous form while most measurements are discrete. Consider the following time-invariant homogeneous linear system (pp. 180-182):
     $\dot{\vec x}(t) = A(t)\vec x(t)$  where  $A(t) = A$ for $t \in [t_{k-1}, t_k]$
   We have the solution:
     $\vec x(t) = F_{k-1}(t, t_{k-1})\, \vec x(t_{k-1})$,  where  $F_{k-1}(t, t_{k-1}) = \mathcal{L}^{-1}\{(sI - A)^{-1}\} = e^{A(t - t_{k-1})}$ for $t \in [t_{k-1}, t_k]$
   If we add a term, making an inhomogeneous linear system, we obtain:
     $\dot{\vec x}(t) = A(t)\vec x(t) + B(t)\vec u(t)$  where  $B(t) = B$ for $t \in [t_{k-1}, t_k]$
• 21. Matrix Superposition Integral
   Then, the state transition matrix is applied to the additive term and integration is performed to obtain the generalized solution:
     $\vec x(t) = F_{k-1}(t, t_{k-1})\, \vec x(t_{k-1}) + \int_{t_{k-1}}^{t} F_{k-1}(t, \tau)\, B(\tau)\, \vec u(\tau)\, d\tau$ for $t \in [t_{k-1}, t_k]$
   Consider the following example: white noise $u(t)$ drives the cascade $\sqrt{2\sigma^2\beta} \rightarrow \dfrac{1}{s+\beta} \rightarrow x_2 \rightarrow \dfrac{1}{s} \rightarrow x_1$, i.e.
     $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & -\beta \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ \sqrt{2\sigma^2\beta} \end{bmatrix} u(t)$
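A minimal sketch of discretizing a continuous-time system with the matrix exponential, assuming SciPy is available; the constant-velocity A matrix and the sample interval T are illustrative.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])      # constant-velocity dynamics: xi_ddot = 0
T = 0.5                          # sample interval t_k - t_{k-1}

F = expm(A * T)                  # state transition matrix e^{AT}
print(F)                         # expected: [[1, T], [0, 1]]
```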
• 22. Observability Criteria
   A system is categorized as observable if the state can be determined from a finite number of observations, assuming that the state-space model is correct.
   For a time-invariant linear system, the observability matrix is given by:
     $\Omega = \begin{bmatrix} H \\ HF \\ \vdots \\ HF^{\,n_x - 1} \end{bmatrix}$
   Thus, the system is observable if this matrix has rank equal to "nx" (pp. 25, 28, 30).
• 23. Observability Criteria: An Example
   For the nearly constant velocity model described above, we have:
     $\Omega = \begin{bmatrix} [\,1 \;\; 0\,] \\ [\,1 \;\; 0\,]\begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 1 & \Delta t \end{bmatrix}$
   The rank of this matrix is "2" only if the delta time interval is non-zero. Thus, we can only estimate both position and velocity (using only position measurements) if these position measurements are separated in time.
   The actual calculation of rank is a subject for a linear algebra course and leads to ideas such as linear independence and singularity (p. 25).
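A small sketch of the rank test above, assuming NumPy; the sample interval dt is an illustrative value.

```python
import numpy as np

def observability_matrix(F, H):
    n = F.shape[0]
    return np.vstack([H @ np.linalg.matrix_power(F, i) for i in range(n)])

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])

Omega = observability_matrix(F, H)
print(np.linalg.matrix_rank(Omega))   # 2 when dt != 0, 1 when dt == 0
```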
• 25. Axioms of Probability
   Suppose that "A" and "B" denote random events; then the following axioms hold true for probabilities:
     – Probabilities are non-negative:  $P\{A\} \geq 0 \;\; \forall A$
     – The probability of a certain event is unity:  $P\{S\} = 1$
     – Additivity for mutually exclusive events:  if $P\{A \cap B\} = 0$ then $P\{A \cup B\} = P\{A\} + P\{B\}$
• 26. Conditional Probability
   The conditional probability of an event "A" given the event "B" is:  $P\{A \mid B\} = \dfrac{P\{A \cap B\}}{P\{B\}}$
   For example, we might ask the following tracking related questions:
     – Probability of observing the current measurement given the previous estimate of the track state
     – Probability of observing a target detection within a certain surveillance region given that a true target is present
   Formulating these conditional probabilities is the foundation of track initiation, deletion, data association, SNR detection schemes…
• 27. Total Probability Theorem
   Assume that we have a set of events "Bi" which are mutually exclusive:  $P\{B_i \cap B_j\} = 0 \;\; \forall\, i \neq j$
   And exhaustive:  $\sum_{i=1}^{n} P\{B_i\} = 1$
   Then the Total Probability Theorem states:  $P\{A\} = \sum_{i=1}^{n} P\{A \cap B_i\} = \sum_{i=1}^{n} P\{A \mid B_i\}\, P\{B_i\}$
• 28. Bayes' Theorem
   We can rework the conditional probability definition in order to obtain the reverse conditional probability:
     $P\{B_i \mid A\} = \dfrac{P\{B_i \cap A\}}{P\{A\}} = \dfrac{P\{A \mid B_i\}\, P\{B_i\}}{P\{A\}}$
   This conditional probability of "Bi" is called the posterior probability, while the unconditional probability of "Bi" is called the prior probability.
   In the case of "Bi" being mutually exclusive and exhaustive, we have (p. 47):
     $P\{B_i \mid A\} = \dfrac{P\{A \mid B_i\}\, P\{B_i\}}{\sum_{j=1}^{n} P\{A \mid B_j\}\, P\{B_j\}}$
   where $P\{A \mid B_i\}$ is the likelihood function, $P\{B_i\}$ the prior probability, and $P\{B_i \mid A\}$ the posterior probability.
• 29. Gaussian (Normal) Random Variables
   The Gaussian random variable is the most well-known, well-investigated type because of its wide application in the real world and its tractable mathematics.
   A Gaussian random variable is one which has the following probability density function (PDF):
     $p(x) = N(x; \mu, \sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
   and is denoted:  $x \sim N(\mu, \sigma^2)$
• 30. Gaussian (Normal) Random Variables
   The expectation and second central moment of this distribution are:
     Mean:  $E[x] = \int_{-\infty}^{\infty} \dfrac{x}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = \mu$
     Variance:  $E[(x - E[x])^2] = E[x^2] - E[x]^2 = \left[ \int_{-\infty}^{\infty} \dfrac{x^2}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx \right] - \mu^2 = \sigma^2$
   These are only with respect to scalar random variables… what about vector random variables?
• 31. Vector Gaussian Random Variables
   The vector generalization is straightforward:
     $p(\vec x) = N(\vec x; \vec\mu, P) = \dfrac{1}{\sqrt{|2\pi P|}}\, e^{-\frac{1}{2}(\vec x - \vec\mu)^T P^{-1} (\vec x - \vec\mu)}$
   The expectation and second central moment of this distribution are:
     $E[\vec x] = \vec\mu$  and  $E[(\vec x - E[\vec x])(\vec x - E[\vec x])^T] = P$
   Notice that the variance is now replaced with a matrix called a covariance matrix.
   If the vector "x" is a zero-mean error vector, then the covariance matrix is called the mean square error.
• 32. Bayes' Theorem: Gaussian Case
   The "noise" of a device, denoted "x", is observed. Normal functionality is denoted by event "B1" while a defective device is denoted by event "B2":
     $p\{x \mid B_1\} = N(x; 0, \sigma_1^2)$,  $p\{x \mid B_2\} = N(x; 0, \sigma_2^2)$
   The conditional probability of defect is (using Bayes' Theorem):
     $P\{B_2 \mid x\} = \dfrac{P\{x \mid B_2\}\, P\{B_2\}}{P\{x \mid B_1\}\, P\{B_1\} + P\{x \mid B_2\}\, P\{B_2\}} = \dfrac{1}{1 + \dfrac{P\{x \mid B_1\}\, P\{B_1\}}{P\{x \mid B_2\}\, P\{B_2\}}}$
   Using the two distributions, we have:
     $P\{B_2 \mid x\} = \dfrac{1}{1 + \dfrac{\sigma_2}{\sigma_1}\, \dfrac{P\{B_1\}}{P\{B_2\}}\, e^{-\frac{x^2}{2\sigma_1^2} + \frac{x^2}{2\sigma_2^2}}}$
• 33. Bayes' Theorem: Gaussian Case
   If we assume the diffuse prior, i.e. that the probability of each event is equal, then we have a simplified formula:
     $P\{B_2 \mid x\} = \dfrac{1}{1 + \dfrac{\sigma_2}{\sigma_1}\, e^{-\frac{x^2}{2\sigma_1^2} + \frac{x^2}{2\sigma_2^2}}}$
   If we further assume that $\sigma_2 = 4\sigma_1$ and that $x = \sigma_2$, then we have:  $P\{B_2 \mid x\} \approx 0.998$
   Note that the likelihood ratio largely dominates the result of this calculation. This quantity is crucial in inference and statistical decision theory and is often called the "evidence from the data":
     $\Lambda(B_1, B_2) = \dfrac{P\{x \mid B_1\}}{P\{x \mid B_2\}}$
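A quick numerical check of the worked example above, assuming SciPy is available and equal priors; the slide's choices sigma2 = 4*sigma1 and x = sigma2 are used.

```python
from scipy.stats import norm

sigma1, sigma2 = 1.0, 4.0
x = sigma2

like1 = norm.pdf(x, loc=0.0, scale=sigma1)   # P{x | B1}
like2 = norm.pdf(x, loc=0.0, scale=sigma2)   # P{x | B2}

posterior_B2 = like2 / (like1 + like2)       # equal priors cancel
print(round(posterior_B2, 3))                # expected: ~0.998
```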
• 34. Gaussian Mixture
   Suppose we have "n" possible events "Aj" which are mutually exclusive and exhaustive, and further suppose that each event has a Gaussian PDF as follows (pp. 55-56):
     $A_j = \{\, x \sim N(\bar x_j, P_j) \,\}$  and  $P\{A_j\} \triangleq p_j$
   Then, the total PDF is given by the Total Probability Theorem:
     $p(x) = \sum_{j=1}^{n} p(x \mid A_j)\, P\{A_j\}$
   This mixture can be approximated as another Gaussian once the mixed moments are computed.
• 35. Gaussian Mixture
   The first moment (mean) is easily derived as:
     $\bar x = E[x] = \sum_{j=1}^{n} E[x \mid A_j]\, p_j = \sum_{j=1}^{n} p_j\, \bar x_j$
   The covariance matrix is more complicated, but we simply apply the definition:
     $P = E[(x - \bar x)(x - \bar x)^T] = \sum_{j=1}^{n} E[(x - \bar x)(x - \bar x)^T \mid A_j]\, p_j = \sum_{j=1}^{n} E[(x - \bar x_j + \bar x_j - \bar x)(x - \bar x_j + \bar x_j - \bar x)^T \mid A_j]\, p_j$
• 36. Gaussian Mixture
   Continuing the insanity:
     $P = \sum_{j=1}^{n} E[(x - \bar x_j)(x - \bar x_j)^T \mid A_j]\, p_j + \sum_{j=1}^{n} (\bar x_j - \bar x)(\bar x_j - \bar x)^T p_j = \sum_{j=1}^{n} P_j\, p_j + \underbrace{\sum_{j=1}^{n} (\bar x_j - \bar x)(\bar x_j - \bar x)^T p_j}_{\text{Spread of the Means}}$
   The spread-of-the-means term inflates the covariance of the final mixed random variable to account for the differences between each individual mean and the mixed mean.
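A minimal sketch of moment-matching a Gaussian mixture to a single Gaussian, assuming NumPy; the component means, covariances, and weights below are illustrative.

```python
import numpy as np

means = [np.array([0.0, 0.0]), np.array([5.0, 1.0])]
covs  = [np.eye(2), 2.0 * np.eye(2)]
probs = [0.7, 0.3]                       # mutually exclusive, exhaustive weights

# Mixed mean: weighted sum of the component means
x_bar = sum(p * m for p, m in zip(probs, means))

# Mixed covariance: weighted covariances plus the spread-of-the-means term
P = sum(p * (C + np.outer(m - x_bar, m - x_bar))
        for p, C, m in zip(probs, covs, means))

print(x_bar)
print(P)
```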
• 38. Bayesian Hypothesis Testing
   We consider two competing hypotheses about a parameter $\theta$ defined as:
     Null hypothesis:  $H_0: \theta = \theta_0$
     Alternate hypothesis:  $H_1: \theta = \theta_1$
   We also adopt standard definitions concerning the decision errors:
     Type I Error (False Alarm):  $P_e^{I} \triangleq P\{\text{accept } H_1 \mid H_0 \text{ true}\} = \alpha$
     Type II Error (Miss):  $P_e^{II} \triangleq P\{\text{accept } H_0 \mid H_1 \text{ true}\} = \beta$
• 39. Neyman-Pearson Lemma
   The power of the hypothesis test is defined as:
     Test Power (Detection):  $\pi \triangleq P\{\text{accept } H_1 \mid H_1 \text{ true}\} = 1 - \beta$
   The Neyman-Pearson Lemma states that the optimal decision rule (most powerful test) subject to a fixed Type I Error ($\alpha$) is the Likelihood Ratio Test (pp. 72-73):
     $\Lambda(H_1, H_0) = \dfrac{P\{z \mid H_1\}}{P\{z \mid H_0\}} \;\; \begin{cases} > \Lambda_0 & \Rightarrow H_1 \\ < \Lambda_0 & \Rightarrow H_0 \end{cases}$
   where the numerator and denominator are the likelihood functions and the threshold is set so that  $P\{\Lambda(H_1, H_0) > \Lambda_0 \mid H_0\} = P_e^{I} = \alpha$
• 40. Sequential Probability Ratio Test
   Suppose we have a sequence of independent identically distributed (i.i.d.) measurements $Z = \{z_i\}$ and we wish to perform a hypothesis test. We can formulate this in a recursive form as follows:
     $PR(H_1, H_0) = \dfrac{P\{H_1 \cap Z\}}{P\{H_0 \cap Z\}} = \dfrac{P\{Z \mid H_1\}\, P_0\{H_1\}}{P\{Z \mid H_0\}\, P_0\{H_0\}}$  (likelihood functions and a priori probabilities)
     $PR_n(H_1, H_0) = \dfrac{P_0\{H_1\}}{P_0\{H_0\}} \prod_{i=1}^{n} \dfrac{P\{z_i \mid H_1\}}{P\{z_i \mid H_0\}} = PR_0(H_1, H_0) \prod_{i=1}^{n} \Lambda_i(H_1, H_0)$
     $\ln\big(PR_n(H_1, H_0)\big) = \ln\big(PR_0(H_1, H_0)\big) + \sum_{i=1}^{n} \ln\big(\Lambda_i(H_1, H_0)\big)$
• 41. Sequential Probability Ratio Test
   So, the recursive form of the SPRT is:
     $\ln\big(PR_k(H_1, H_0)\big) = \ln\big(PR_{k-1}(H_1, H_0)\big) + \ln\big(\Lambda_k(H_1, H_0)\big)$
   Using Wald's Theorem, we continue to test this quantity against two thresholds until a decision is made:
     $\ln\big(PR_k(H_1, H_0)\big) \;\; \begin{cases} > T_2 & \Rightarrow H_1 \\ > T_1 \text{ and } < T_2 & \Rightarrow \text{continue} \\ < T_1 & \Rightarrow H_0 \end{cases}$,  with  $T_2 = \ln\!\left(\dfrac{1 - \beta}{\alpha}\right)$  and  $T_1 = \ln\!\left(\dfrac{\beta}{1 - \alpha}\right)$
   Wald's Theorem applies when the observations are an i.i.d. sequence.
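A minimal SPRT sketch for two Gaussian hypotheses, assuming SciPy/NumPy and i.i.d. scalar measurements; the two distributions and the alpha/beta values are illustrative choices, not from the course material.

```python
import math
import numpy as np
from scipy.stats import norm

alpha, beta = 0.05, 0.05
T1 = math.log(beta / (1.0 - alpha))      # lower threshold: accept H0
T2 = math.log((1.0 - beta) / alpha)      # upper threshold: accept H1

h0 = norm(loc=0.0, scale=1.0)            # H0: z ~ N(0, 1)
h1 = norm(loc=1.0, scale=1.0)            # H1: z ~ N(1, 1)

def sprt(measurements, log_pr0=0.0):
    """Recursively accumulate the log probability ratio until a decision."""
    log_pr = log_pr0
    for k, z in enumerate(measurements, start=1):
        log_pr += h1.logpdf(z) - h0.logpdf(z)   # ln Lambda_k
        if log_pr > T2:
            return "accept H1", k
        if log_pr < T1:
            return "accept H0", k
    return "continue", len(measurements)

rng = np.random.default_rng(1)
print(sprt(rng.normal(1.0, 1.0, size=100)))     # data generated under H1
```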
• 42. Chi-Square Distribution
   The chi-square distribution with "n" degrees of freedom has the following functional form:
     $\chi_n^2(x) = \dfrac{1}{2^{n/2}\, \Gamma\!\left(\tfrac{n}{2}\right)}\, x^{\frac{n-2}{2}}\, e^{-\frac{x}{2}}$
   It is related to an "n"-dimensional vector Gaussian distribution as follows:
     $(\vec x - \bar{\vec x})^T P^{-1} (\vec x - \bar{\vec x}) \sim \chi_n^2$
   More generally, the sum of squares of "n" independent zero-mean, unit-variance Gaussian random variables is distributed as a chi-square with "n" degrees of freedom (pp. 58-60).
• 43. Chi-Square Distribution
   The chi-square distribution with "n" degrees of freedom has the following statistical moments:
     $E[x] = n$  and  $E[(x - E[x])^2] = 2n$
   The sum of two independent chi-square random variables is also chi-square:
     $q_1 \sim \chi_{n_1}^2$, $q_2 \sim \chi_{n_2}^2$  $\Rightarrow$  $q_1 + q_2 \sim \chi_{n_1 + n_2}^2$
• 45. Parameter Estimator
   A parameter estimator is a function of the observations (measurements) that yields an estimate of a time-invariant quantity (parameter). This estimator is typically denoted as:
     $\hat x_k \triangleq \hat x\big[k, Z^k\big]$  where  $Z^k \triangleq \{z_j\}_{j=1}^{k}$
   (estimate, estimator, and observations, respectively)
   We also denote the error in the estimate as:
     $\tilde x_k \triangleq x - \hat x_k$  (true value minus estimate)
• 46. Estimation Paradigms
   Non-Bayesian (Non-Random):
     – There is no prior PDF incorporated
     – The likelihood function PDF is formed
     – This likelihood function PDF is used to estimate the parameter
     $\Lambda_Z(x) \triangleq p(Z \mid x)$
   Bayesian (Random):
     – Start with a prior PDF of the parameter
     – Use Bayes' Theorem to find the posterior PDF
     – This posterior PDF is used to estimate the parameter
     $p(x \mid Z) = \dfrac{p(Z \mid x)\, p(x)}{p(Z)} = \dfrac{1}{c}\, p(Z \mid x)\, p(x)$  (posterior = likelihood × prior / normalization)
• 47. Estimation Methods
   Maximum Likelihood Estimator (Non-Random):
     $\hat x^{ML}(Z) = \arg\max_x \big[ p(Z \mid x) \big]$,  found from  $\left. \dfrac{d\, p(Z \mid x)}{dx} \right|_{\hat x^{ML}} = 0$
   Maximum A Posteriori Estimator (Random):
     $\hat x^{MAP}(Z) = \arg\max_x \big[ p(Z \mid x)\, p(x) \big]$
• 48. Unbiased Estimators
   Non-Bayesian (Non-Random):
     $E\big[\hat x_k(Z^k)\big]_{p(Z^k \mid x = x_0)} = x_0$
   Bayesian (Random):
     $E\big[\hat x_k(Z^k)\big]_{p(x \cap Z^k)} = E[x]_{p(x)}$
   General Case:
     $E\big[\tilde x_k(Z^k)\big] = 0$
• 49. Estimation Comparison Example
   Consider a single measurement of an unknown parameter "x" which is susceptible to additive noise "w" that is zero-mean Gaussian:
     $z = x + w$,  $w \sim N(0, \sigma^2)$
   The ML approach yields:
     $\Lambda(x) = p(z \mid x) = N(z; x, \sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(z-x)^2}{2\sigma^2}}$,  so  $\hat x^{ML} = \arg\max_x [\Lambda(x)] = z$
   Thus, the MLE is the measurement itself because there is no prior knowledge.
• 50. Estimation Comparison Example
   The MAP approach, with a Gaussian prior, yields:
     $p(x) = N(x; \bar x, \sigma_0^2)$
     $p(x \mid z) = \dfrac{p(z \mid x)\, p(x)}{p(z)} = \dfrac{e^{-\frac{(z-x)^2}{2\sigma^2}}\, e^{-\frac{(x-\bar x)^2}{2\sigma_0^2}}}{2\pi\sigma\sigma_0\, p(z)} = \dfrac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{(x - \xi(z))^2}{2\sigma_1^2}}$
     with  $\xi(z) = \sigma_1^2 \left( \dfrac{\bar x}{\sigma_0^2} + \dfrac{z}{\sigma^2} \right)$  and  $\dfrac{1}{\sigma_1^2} = \dfrac{1}{\sigma^2} + \dfrac{1}{\sigma_0^2}$
     $\hat x^{MAP} = \arg\max_x [p(x \mid z)] = \xi(z)$
   Thus, the MAPE is a linear combination of the prior information ($\bar x$) and the measurement information ($z$), weighted according to the variance of each. NOTE: The MLE and MAPE are equivalent for a diffuse prior!
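A short sketch comparing the ML and MAP estimates for a single noisy measurement with a Gaussian prior, assuming NumPy-style scalar arithmetic; all numerical values are illustrative.

```python
sigma  = 2.0      # measurement noise standard deviation
sigma0 = 1.0      # prior standard deviation
x_bar  = 0.0      # prior mean
z      = 3.0      # observed measurement

x_ml = z                                            # ML estimate: the measurement itself

var1  = 1.0 / (1.0 / sigma**2 + 1.0 / sigma0**2)    # posterior variance sigma_1^2
x_map = var1 * (x_bar / sigma0**2 + z / sigma**2)   # precision-weighted combination

print(x_ml, x_map)    # the MAP estimate is pulled toward the prior mean
```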
• 51. Batch Estimation Paradigms
   Consider that we now have a set of observations available for estimating a parameter and that, in general, these observations are corrupted by measurement noise:
     $Z^k = \{\, z_j = h_j(x) + w_j \,\}_{j=1,\ldots,k}$
   Least Squares (Non-Random):
     $\hat x_k^{LS} = \arg\min_x \left[ \sum_{j=1}^{k} \big( z_j - h_j(x) \big)^2 \right]$
   Minimum Mean Square Error (Random):
     $\hat x_k^{MMSE} = \arg\min_{\hat x} E\big[ (x - \hat x)^2 \mid Z^k \big] = E\big[ x \mid Z^k \big] = \int_{-\infty}^{\infty} x\, p(x \mid Z^k)\, dx$
• 52. Unbiasedness of ML and MAP Estimators
   Maximum Likelihood Estimate:
     $E[\hat x_k^{ML}] = E[z] = E[x_0 + w] = x_0 + E[w] = x_0$
   Maximum A Posteriori Estimate:
     $E[\hat x_k^{MAP}] = E\!\left[ \dfrac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar x + \dfrac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, z \right] = \dfrac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar x + \dfrac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, E[x + w] = \dfrac{\sigma^2}{\sigma^2 + \sigma_0^2}\, \bar x + \dfrac{\sigma_0^2}{\sigma^2 + \sigma_0^2}\, \bar x = \bar x = E[x]$
• 53. Estimation Errors
   Non-Bayesian (Non-Random):
     $\mathrm{Var}[\hat x_k(Z^k)] = E\big[ \{\hat x_k(Z^k) - E[\hat x_k(Z^k)]\}^2 \big] = E\big[ \{\hat x_k(Z^k) - x_0\}^2 \big]$
   Bayesian (Random):
     $\mathrm{MSE}[\hat x_k(Z^k)] = E\big[ \{\hat x_k(Z^k) - x\}^2 \big] = E\Big[ E\big[ \{\hat x_k(Z^k) - x\}^2 \mid Z^k \big] \Big]$
   General Case:
     $E\big[ \tilde x_k(Z^k)^2 \big] = \begin{cases} \mathrm{var}\big(\hat x_k(Z^k)\big) & \hat x \text{ unbiased and } x \text{ non-random} \\ \mathrm{MSE}\big(\hat x_k(Z^k)\big) & \text{all cases} \end{cases}$
• 54. Variances of ML and MAP Estimators
   Maximum Likelihood Estimate:
     $\mathrm{var}[\hat x_k^{ML}] = E\big[ (\hat x_k^{ML} - x_0)^2 \big] = E\big[ (z - x_0)^2 \big] = \sigma^2$
   Maximum A Posteriori Estimate:
     $\mathrm{var}[\hat x_k^{MAP}] = E\big[ (\hat x_k^{MAP} - x)^2 \big] = \dfrac{\sigma^2 \sigma_0^2}{\sigma^2 + \sigma_0^2} < \sigma^2 = \mathrm{var}[\hat x_k^{ML}]$
   The MAPE error is less than the MLE error since the MAPE incorporates prior information.
• 55. Cramer-Rao Lower Bound
   The Cramer-Rao Lower Bound places a lower limit on the ability to estimate a parameter:
     $\mathrm{MSE}\big[\hat x_k(Z^k)\big] = E\big[ \big(\hat x_k(Z^k) - x\big)\big(\hat x_k(Z^k) - x\big)^T \big] \geq J_k^{-1}$
   Not surprisingly, this lower limit is related to the likelihood function, which we recall as the "evidence from the data". The matrix $J_k$ is called the Fisher Information Matrix:
     $J_k \triangleq -E\big[ \nabla_x \nabla_x^T \ln p(Z^k \mid x) \big]$
   When equality holds, the estimator is called efficient. An example of this is the MLE estimate we have been working with.
• 57. Filtering Fundamentals
   Linear Filtering
     – Linear Gaussian Assumptions, Kalman Filter, Kalman Properties
     – Direct Discrete-Time, Discretized Continuous-Time, Steady State Gains
   Non-Linear Filtering
     – Non-Linear Dynamics & Measurements, Extended Kalman Filter
     – Iterated Extended Kalman Filter
   Multiple-Model Filtering
     – Need for Multiple Models, Adaptive Filtering
     – Switching Multiple Model & Interacting Multiple Model Filter
     – Variable Structure IMM
• 59. Kalman-Bucy Problem
   A stochastic discrete-time linear dynamic system:
     $\vec x_k = F_{k-1}\vec x_{k-1} + G_{k-1}\vec u_{k-1} + \Gamma_{k-1}\vec\nu_{k-1}$
     – $\vec x_k$ is the state vector of dimension "nx" at time "k"
     – $G_k \vec u_k$ is the control input of dimension "nx" at time "k"
     – $F_k$ is the transition matrix of dimension "nx x nx" at time "k"
     – $\Gamma_k \vec\nu_k$ is the plant noise of dimension "nx" at time "k"
   The measurement equation is expressed in the discrete form:
     $\vec z_k = H_k \vec x_k + \vec w_k$
     – $\vec z_k$ is the measurement vector of dimension "nz" at time "k"
     – $H_k$ is the observation matrix of dimension "nz x nx" at time "k"
     – $\vec w_k$ is the measurement noise of dimension "nz" at time "k"
• 60. Kalman-Bucy Problem
   The Linear Gaussian assumptions are:
     $E[\vec\nu_k] = 0$, $E[\vec\nu_k \vec\nu_j^T] = Q_k \delta_{jk}$;  $E[\vec w_k] = 0$, $E[\vec w_k \vec w_j^T] = R_k \delta_{jk}$
   The measurement and plant noises are uncorrelated:  $E[\vec w_k \vec\nu_k^T] = 0$
   The conditional mean is:
     $\hat x_{j|k} \triangleq E[\vec x_j \mid Z^k]$,  where  $Z^k = \{\vec z_i,\, i \leq k\}$
     – $\hat x_{k|k}$ is the filtered state estimate and $\hat x_{k|k-1}$ is the extrapolated state estimate
   The estimation error is denoted by:  $\tilde x_{j|k} \triangleq \hat x_{j|k} - \vec x_j$
• 61. Kalman-Bucy Problem
   The estimate covariance is defined as:
     $P_{j|k} \triangleq E[\tilde x_{j|k} \tilde x_{j|k}^T \mid Z^k]$
     – $P_{k|k}$ is the filtered error covariance and $P_{k|k-1}$ is the extrapolated error covariance
   The predicted measurement is given by:
     $\hat z_{k|k-1} \triangleq E[\vec z_k \mid Z^{k-1}] = E[H_k \vec x_k + \vec w_k \mid Z^{k-1}] = H_k E[\vec x_k \mid Z^{k-1}] + E[\vec w_k \mid Z^{k-1}] = H_k \hat x_{k|k-1}$
   The measurement residual or innovation is denoted by:
     $\vec\eta_k \triangleq \vec z_k - \hat z_{k|k-1} = \vec z_k - H_k \hat x_{k|k-1}$
• 62. Kalman-Bucy Approach
   Recall that the MMSE is equivalent to the MAPE in the Gaussian case.
   Recall that the MAPE, with a Gaussian prior, is a linear combination of the measurement and the prior information.
   Recall that the prior information was, more specifically, the expectation of the random variable prior to receiving the measurement.
   If we consider the Kalman Filter to be a recursive process which applies a static Bayesian estimation (MMSE) algorithm at each step, we are compelled to consider the following linear combination of the prior state information and the observation:
     $\hat x_{k|k} = K_k' \hat x_{k|k-1} + K_k \vec z_k$
• 63. Kalman Filter - Unbiasedness
   We start with the proposed linear combination:
     $\hat x_{k|k} = K_k' \hat x_{k|k-1} + K_k \vec z_k$
   We wish to ensure that the estimate is unbiased, that is:  $E[\tilde x_{k|k}] = 0$
   Given the proposed linear combination, we determine the error to be:
     $\tilde x_{k|k} = [K_k' + K_k H_k - I]\, \vec x_k + K_k' \tilde x_{k|k-1} + K_k \vec w_k$
   Applying the unbiasedness constraint, we have:
     $E[\tilde x_{k|k}] = 0 = [K_k' + K_k H_k - I]\, E[\vec x_k] + K_k' E[\tilde x_{k|k-1}] + K_k E[\vec w_k] \;\;\Rightarrow\;\; K_k' = I - K_k H_k$
• 64. Kalman Filter – Kalman Gain
   So, we have the following simplified linear combination:
     $\hat x_{k|k} = \hat x_{k|k-1} + K_k \vec\eta_k$
   We also desire the filtered error covariance, so that it can be minimized:
     $P_{k|k} = E[\tilde x_{k|k} \tilde x_{k|k}^T] = (I - K_k H_k)\, P_{k|k-1}\, (I - K_k H_k)^T + K_k R_k K_k^T$
   If we minimize the trace of this expression with respect to the gain:
     $K_k = P_{k|k-1} H_k^T \big[ H_k P_{k|k-1} H_k^T + R_k \big]^{-1}$
• 65. Kalman Filter - Recipe
   Extrapolation:
     $\hat x_{k|k-1} = F_{k-1}\, \hat x_{k-1|k-1} + G_{k-1}\vec u_{k-1}$
     $P_{k|k-1} = F_{k-1}\, P_{k-1|k-1}\, F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T$
   Update:
     $\hat x_{k|k} = \hat x_{k|k-1} + K_k \vec\eta_k$
     $P_{k|k} = (I - K_k H_k)\, P_{k|k-1}\, (I - K_k H_k)^T + K_k R_k K_k^T$
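A compact sketch of the extrapolation/update recursion above, assuming NumPy; the function and variable names mirror the slide's notation (F, G, Gamma, Q, H, R, K) and are not from any particular library.

```python
import numpy as np

def kf_predict(x, P, F, Q, Gamma, G=None, u=None):
    """Extrapolation: propagate the state and covariance one step."""
    x_pred = F @ x + (G @ u if G is not None and u is not None else 0.0)
    P_pred = F @ P @ F.T + Gamma @ Q @ Gamma.T
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Update: fold in a measurement using the Joseph-form covariance update."""
    S = H @ P_pred @ H.T + R                     # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
    eta = z - H @ x_pred                         # innovation (residual)
    x = x_pred + K @ eta
    I_KH = np.eye(P_pred.shape[0]) - K @ H
    P = I_KH @ P_pred @ I_KH.T + K @ R @ K.T     # (I-KH)P(I-KH)^T + KRK^T
    return x, P
```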
• 66. Kalman Filter – Innovations
   The innovations are zero-mean, uncorrelated (p. 213) and have covariance:
     $S_k = E[\vec\eta_k \vec\eta_k^T] = H_k P_{k|k-1} H_k^T + R_k$
   The normalized innovation squared, or statistical distance, is chi-square distributed:
     $d_k^2 = \vec\eta_k^T S_k^{-1} \vec\eta_k \sim \chi_{n_z}^2$
   So, we expect that the innovations should have a mean and variance of:
     $E[d_i^2] = n_z$  and  $\mathrm{var}[d_i^2] = 2 n_z$
   The Kalman gain can now be written as:
     $K_k = P_{k|k-1} H_k^T \big[ H_k P_{k|k-1} H_k^T + R_k \big]^{-1} = P_{k|k-1} H_k^T S_k^{-1}$
   The state errors are correlated:
     $E[\tilde x_{k|k} \tilde x_{k-1|k-1}^T] = [I - K_k H_k]\, F_{k-1}\, P_{k-1|k-1}$
• 67. Kalman Filter – Likelihood Function
   We wish to compute the likelihood function given the dynamics model used:
     $p(\vec z_k \mid Z^{k-1}) = p(\vec z_k \mid \hat x_{k|k-1}) = N(\vec z_k; \hat z_{k|k-1}, S_k) = N(\vec z_k - \hat z_{k|k-1}; 0, S_k) = N(\vec\eta_k; 0, S_k)$
   Which has the explicit form:
     $\Lambda_k = p(\vec z_k \mid Z^{k-1}) = \dfrac{\exp\!\big( -\tfrac{1}{2} \vec\eta_k^T S_k^{-1} \vec\eta_k \big)}{\sqrt{\det[2\pi S_k]}}$
   Alternatively, we can write:
     $\Lambda_k = \dfrac{\exp\!\big( -\tfrac{1}{2} d_k^2 \big)}{\sqrt{\det[2\pi S_k]}} \;\;\Rightarrow\;\; \ln \Lambda_k = -\tfrac{1}{2} d_k^2 - \tfrac{1}{2} \ln\big( \det[2\pi S_k] \big)$
• 68. Kalman Filter – Measurement Validation
   Suppose our Kalman filter has the following output at a given time step:
     $P_{k+1|k} = \begin{bmatrix} 5 & 0 \\ 0 & 16 \end{bmatrix}$,  $H_{k+1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$,  $\hat x_{k+1|k} = \begin{bmatrix} 10 \\ 15 \end{bmatrix}$
   Suppose that we now receive 3 measurements of unknown origin:
     $R_{k+1} = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}$,  $\vec z_{k+1}^{\,1} = \begin{bmatrix} 7 \\ 20 \end{bmatrix}$,  $\vec z_{k+1}^{\,2} = \begin{bmatrix} 16 \\ 5 \end{bmatrix}$,  $\vec z_{k+1}^{\,3} = \begin{bmatrix} 19 \\ 25 \end{bmatrix}$
   Evaluate the consistency of these measurements for this Kalman filter model. This procedure is called gating and is the basis for data association:
     $d_{k+1}^2(\vec z_{k+1}^{\,1}) = 2$,  $d_{k+1}^2(\vec z_{k+1}^{\,2}) = 8$,  $d_{k+1}^2(\vec z_{k+1}^{\,3}) = 13$,  with gates  $\chi_2^2(95\%) = 6$  and  $\chi_2^2(99\%) = 9.2$
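A numerical walk-through of the gating example above, assuming NumPy; the matrices, predicted state, and candidate measurements are taken directly from the slide.

```python
import numpy as np

P_pred = np.diag([5.0, 16.0])
H = np.eye(2)
R = np.diag([4.0, 9.0])
x_pred = np.array([10.0, 15.0])

S = H @ P_pred @ H.T + R                  # innovation covariance
S_inv = np.linalg.inv(S)

for z in (np.array([7.0, 20.0]),
          np.array([16.0, 5.0]),
          np.array([19.0, 25.0])):
    eta = z - H @ x_pred
    d2 = eta @ S_inv @ eta                # normalized innovation squared
    print(d2, "inside 95% gate" if d2 <= 6.0 else "outside 95% gate")
# expected d2 values: 2, 8, 13 -> only the first measurement falls inside the gate
```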
• 69. Kalman Filter – Initialization
   The true initial state is a random variable distributed as:  $\vec x_0 \sim N(\hat x_{0|0}, P_{0|0})$
   It is just as important that the initial covariance and estimate realistically reflect the actual accuracy. Thus, the initial estimate should satisfy:
     $\tilde x_{0|0}^T\, P_{0|0}^{-1}\, \tilde x_{0|0} \leq \chi_{n_x}^2(95\%)$
   If the initial covariance is too small, then the Kalman gain will initially be small and the filter will take a longer time to converge.
   Ideally, the initial state estimate should be within one standard deviation (indicated by the initial covariance) of the true value. This will lead to optimal convergence time.
• 70. Kalman Filter – Initialization
   In general, a batch weighted least-squares curve fit can be used (Chapter 3):
     $\hat x_{0|0} = \big[ H_{init}^T R_{init}^{-1} H_{init} \big]^{-1} H_{init}^T R_{init}^{-1} \vec z_{init}$,  $P_{0|0} = \big[ H_{init}^T R_{init}^{-1} H_{init} \big]^{-1}$
   where the first $n_x$ measurements, the observation matrices (propagated via the transition matrix), and the block-diagonal measurement covariance are stacked as:
     $\vec z_{init} = \big[ \vec z_0^{\,T}, \ldots, \vec z_{n_x-1}^{\,T} \big]^T$,  $H_{init} = \big[ H_0^T, \ldots, (H_{n_x-1} F^{\,n_x-2})^T \big]^T$,  $R_{init} = \begin{bmatrix} R_0 & & 0 \\ & \ddots & \\ 0 & & R_{n_x-1} \end{bmatrix}$
   This initialization will always be statistically consistent so long as the measurement errors are properly characterized.
• 71. Kalman Filter – Summary
   The Kalman Gain:
     – Proportional to the predicted error
     – Inversely proportional to the innovation error
   The Covariance Matrix:
     – Independent of the measurements
     – Indicates the error in the state estimate assuming that all of the assumptions/models are correct
   The Kalman Estimator:
     – Optimal MMSE state estimator (Gaussian)
     – Best linear MMSE state estimator (Non-Gaussian)
     – The state and covariance completely summarize the past
  • 72. Kalman Filter – Summary
  • 73. Kalman Filter – Summary
• 74. Kalman Filter: Direct Discrete Time Example
   Consider the simplest example of the nearly constant velocity (CV) dynamics model, the Discrete White Noise Acceleration (DWNA) model:
     $\vec x_k = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} \vec x_{k-1} + \begin{bmatrix} T^2/2 \\ T \end{bmatrix} \nu_{k-1}$,  $\vec x_k = \begin{bmatrix} \xi_k \\ \dot\xi_k \end{bmatrix}$,  $E[\nu_k^2] = q$
     $z_k = [\,1 \;\; 0\,]\, \vec x_k + w_k$,  $E[w_k^2] = r$
   The recursive estimation process is given by the Kalman equations derived above.
   How do we select "q"?  $q \cong a_{max}^2$
• 75. Kalman Filter: Other Direct Discrete Time Models
   For nearly constant acceleration (CA) models, the Discrete Wiener Process Acceleration (DWPA) model is commonly used:
     $\vec x_k = \begin{bmatrix} 1 & T & T^2/2 \\ 0 & 1 & T \\ 0 & 0 & 1 \end{bmatrix} \vec x_{k-1} + \begin{bmatrix} T^2/2 \\ T \\ 1 \end{bmatrix} \nu_{k-1}$,  $\vec x_k = \begin{bmatrix} \xi_k \\ \dot\xi_k \\ \ddot\xi_k \end{bmatrix}$,  $E[\nu_k^2] = q$
     $Q_k = q\, \Gamma_k \Gamma_k^T = q \begin{bmatrix} T^4/4 & T^3/2 & T^2/2 \\ T^3/2 & T^2 & T \\ T^2/2 & T & 1 \end{bmatrix}$,  $q \cong \Delta a_{max}^2$
   Notice the simple relationship between the "q-value" and the physical parameter that is one derivative higher than that which is estimated.
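A small sketch that builds the two direct discrete-time models above, assuming NumPy; the sample time T and noise intensities q are illustrative values.

```python
import numpy as np

def dwna(T, q):
    """Discrete White Noise Acceleration (nearly constant velocity)."""
    F = np.array([[1.0, T],
                  [0.0, 1.0]])
    Gamma = np.array([[T**2 / 2.0],
                      [T]])
    return F, q * Gamma @ Gamma.T

def dwpa(T, q):
    """Discrete Wiener Process Acceleration (nearly constant acceleration)."""
    F = np.array([[1.0, T, T**2 / 2.0],
                  [0.0, 1.0, T],
                  [0.0, 0.0, 1.0]])
    Gamma = np.array([[T**2 / 2.0],
                      [T],
                      [1.0]])
    return F, q * Gamma @ Gamma.T

F_cv, Q_cv = dwna(T=1.0, q=1.0)
F_ca, Q_ca = dwpa(T=1.0, q=0.1)
print(Q_ca)   # q * Gamma Gamma^T for the CA model
```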
• 76. Kalman Filter: Discretized Continuous-Time Models
   These models are derived from continuous time representations using the matrix superposition integral. Ignoring the control input:
     $\dot{\vec x}(t) = A \vec x(t) + D \tilde v(t)$,  $E[\tilde v(t)\, \tilde v(\tau)] = \tilde q\, \delta(t - \tau)$
     $\vec x_k = F_{k-1} \vec x_{k-1} + \vec v_{k-1}$  where  $F = e^{AT}$  and  $\vec v_k = \int_0^T e^{A(T-\tau)} D\, \tilde v(\tau)\, d\tau$
   Thus, the process noise covariance is found by:
     $Q_k = E[\vec v_k \vec v_k^T] = \int_0^T \!\!\! \int_0^T F_k(T - \tau_1)\, D\, E[\tilde v(\tau_1) \tilde v(\tau_2)]\, D^T F_k^T(T - \tau_2)\, d\tau_1 d\tau_2 = \int_0^T F_k(T - \tau_1)\, D\, \tilde q\, D^T F_k^T(T - \tau_1)\, d\tau_1$
• 77. Kalman Filter: Discretized Continuous-Time Models
   Continuous White Noise Acceleration (CWNA) for the CV model:
     $\ddot\xi(t) = \tilde v(t)$,  $Q_k = \tilde q \begin{bmatrix} T^3/3 & T^2/2 \\ T^2/2 & T \end{bmatrix}$,  $\tilde q \cong a_{max}^2 T$
   Continuous Wiener Process Acceleration (CWPA) for the CA model:
     $\dddot\xi(t) = \tilde v(t)$,  $Q_k = \tilde q \begin{bmatrix} T^5/20 & T^4/8 & T^3/6 \\ T^4/8 & T^3/3 & T^2/2 \\ T^3/6 & T^2/2 & T \end{bmatrix}$,  $\tilde q \cong \Delta a_{max}^2 T$
   Singer [IEEE-AES, 1970] developed the Exponentially Correlated Acceleration (ECA) CA model (p. 187 & pp. 321-324):
     $\dddot\xi(t) = -\alpha\, \ddot\xi(t) + \tilde v(t)$
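A sketch that evaluates the superposition integral numerically for the CWNA model and compares it with the closed form above, assuming NumPy and SciPy; the noise intensity and sample time are illustrative.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
D = np.array([[0.0],
              [1.0]])          # white noise enters through the acceleration
q_tilde = 2.0
T = 0.5

def integrand(tau, i, j):
    Phi_D = expm(A * (T - tau)) @ D
    M = q_tilde * Phi_D @ Phi_D.T          # F(T-tau) D q D^T F^T(T-tau)
    return M[i, j]

Q_num = np.array([[quad(integrand, 0.0, T, args=(i, j))[0] for j in range(2)]
                  for i in range(2)])

Q_closed = q_tilde * np.array([[T**3 / 3.0, T**2 / 2.0],
                               [T**2 / 2.0, T]])
print(np.allclose(Q_num, Q_closed))        # expected: True
```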
• 78. Kalman Filter: Time Consistent Extrapolation
   So, what is the difference between the Direct Discrete-Time and Discretized Continuous-Time models for CV or CA dynamics? Which one should be used?
     $\big[ F P F^T + Q^{DWNA} \big]_{T=2} \;\neq\; \Big[ F \big\{ [\, F P F^T + Q^{DWNA} \,]_{T=1} \big\} F^T + Q^{DWNA} \Big]_{T=1}$
     $\big[ F P F^T + Q^{CWNA} \big]_{T=2} \;=\; \Big[ F \big\{ [\, F P F^T + Q^{CWNA} \,]_{T=1} \big\} F^T + Q^{CWNA} \Big]_{T=1}$
   Thus, for the Continuous-Time model, 2 extrapolations of 1 second yield the same result as 1 extrapolation of 2 seconds.
   In general, the Continuous-Time models have this time-consistency property.
   This is because their process noise covariance is derived using the transition matrix, while the Direct Discrete-Time covariance is assigned arbitrarily.
• 79. Kalman Filter: Steady State Gains
   If we iterate the Kalman equations for the covariance indefinitely, the updated covariance (and thus the Kalman gain) will reach steady state.
   This is only true for Kalman models that have constant coefficients.
   In this case, the steady-state solution is found using the algebraic matrix Riccati equation (pp. 211 & 350):
     $P_{ss} = F \big[ P_{ss} - P_{ss} H^T ( H P_{ss} H^T + R )^{-1} H P_{ss} \big] F^T + Q$
   The steady state Kalman gain becomes:
     $K_{ss} = P_{ss} H^T \big[ H P_{ss} H^T + R \big]^{-1}$
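A small sketch that finds the steady-state (predicted) covariance by simply iterating the Riccati recursion until it converges, assuming NumPy; the CV model with position-only measurements and the numeric values are illustrative.

```python
import numpy as np

T = 1.0
F = np.array([[1.0, T], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Gamma = np.array([[T**2 / 2.0], [T]])
Q = 0.1 * Gamma @ Gamma.T
R = np.array([[1.0]])

P = np.eye(2)                                  # predicted covariance iterate
for _ in range(500):
    S = H @ P @ H.T + R
    P_upd = P - P @ H.T @ np.linalg.inv(S) @ H @ P   # measurement update
    P_new = F @ P_upd @ F.T + Q                      # extrapolation
    if np.allclose(P_new, P, atol=1e-12):
        break
    P = P_new

K_ss = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
print(K_ss)    # steady-state Kalman gain
```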
• 80. Kalman Filter: Steady State Biases
   If a Kalman filter has reached steady state, then it is possible to predict the filter's bias resulting from un-modeled dynamics.
   Consider the CV model with an un-modeled constant acceleration $\lambda$ (p. 13):
     $\vec x_k = F \vec x_{k-1} + \Gamma \nu_{k-1} + G \lambda$,  $\vec x_k = \begin{bmatrix} \xi_k \\ \dot\xi_k \end{bmatrix}$,  $G = \begin{bmatrix} T^2/2 \\ T \end{bmatrix}$,  $K_{ss} = \begin{bmatrix} \alpha \\ \beta/T \end{bmatrix}$
   The steady-state error is found to be:
     $\tilde x_{ss} = (I - K_{ss} H) F\, \tilde x_{ss} + (I - K_{ss} H) G \lambda \;\;\Rightarrow\;\; \tilde x_{ss} = \begin{bmatrix} \dfrac{(1-\alpha)\, T^2}{\beta} \\[6pt] \dfrac{(2\alpha - \beta)\, T}{2\beta} \end{bmatrix} \lambda$
• 81. Kalman Filter – Summary #2
   The Kalman Gain:
     – Reaches steady state for constant coefficient models
     – Can be used to determine steady-state errors for un-modeled dynamics
   The Covariance Matrix:
     – Is only consistent when the model matches the true dynamics
     – Has no knowledge of the residuals
   The Kalman Estimator:
     – We need modifications for more general models
     – What about non-linear dynamics?
• 83. Nonlinear Estimation Problems
   Previously, all dynamics and measurement models were linear. Now, we consider a broader scope of estimation problems:
     $\dot{\vec x} = f(\vec x, t) + D \vec u(t) + \tilde v(t)$
     $\vec z(t) = h(\vec x, t) + \vec w(t)$
   Nonlinear Dynamics:
     – Ballistic dynamics (TBM exo-atmospheric, satellites, etc…)
     – Drag/Thrust dynamics (TBM re-entry, TBM boost, etc…)
   Nonlinear Measurements:
     – Spherical measurements
     – Angular measurements
     – Doppler measurements
• 84. EKF - Nonlinear Dynamics
   The state propagation can be done using numerical integration or a Taylor series expansion (linearization). However, the linearization is necessary in order to propagate the covariance:
     $\vec x_k \cong F_{k-1} \vec x_{k-1} + G_{k-1} \vec u_{k-1} + \Gamma_{k-1} \vec v_{k-1}$
     $F_{k-1} = e^{\left[ \frac{\partial f}{\partial \vec x} \right]_{\vec x = \hat x_{k-1|k-1}} (t_k - t_{k-1})} \cong I + (t_k - t_{k-1}) \left. \dfrac{\partial f}{\partial \vec x} \right|_{\hat x_{k-1|k-1}} + \dfrac{(t_k - t_{k-1})^2}{2} \left. \dfrac{\partial^2 f}{\partial \vec x^2} \right|_{\hat x_{k-1|k-1}} + \cdots$
   (the first-order term involves the Jacobian matrix; the second-order term, the Hessian)
   The state and covariance propagation are precisely as before:
     $\hat x_{k|k-1} = F_{k-1}\, \hat x_{k-1|k-1} + G_{k-1} \vec u_{k-1}$,  $P_{k|k-1} = F_{k-1}\, P_{k-1|k-1}\, F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T$
• 85. EKF - Nonlinear Measurements
   We compute the linearization of the observation function:
     $h(\vec x_k) = h(\hat x_{k|k-1}) + H_k (\vec x_k - \hat x_{k|k-1}) + \tilde H_k (\vec x_k - \hat x_{k|k-1})^2 + \cdots$
     $H_k = \left. \dfrac{\partial h}{\partial \vec x} \right|_{\vec x = \hat x_{k|k-1}}$ (Jacobian matrix),  $\tilde H_k = \left. \dfrac{\partial^2 h}{\partial \vec x^2} \right|_{\vec x = \hat x_{k|k-1}}$ (Hessian matrix)
   The residual is thus:
     $\vec\eta_k \triangleq \vec z_k - \hat z_{k|k-1} = \vec z_k - h(\hat x_{k|k-1}) \approx H_k \tilde x_{k|k-1} + \vec w_k$
   The covariance update and Kalman gain are precisely as before (pp. 381-386):
     $\hat x_{k|k} = \hat x_{k|k-1} + K_k \vec\eta_k$,  $P_{k|k} = (I - K_k H_k) P_{k|k-1} (I - K_k H_k)^T + K_k R_k K_k^T$,  $K_k = P_{k|k-1} H_k^T S_k^{-1}$
• 86. Polar Measurements
   Previously, we dealt with unrealistic observation models that assumed that measurements were Cartesian. Polar measurements are more typical. In this case, the observation function is nonlinear:
     $\vec x_k = [\, x_k \;\; y_k \;\; \dot x_k \;\; \dot y_k \,]^T$,  $\vec z_k = h(\vec x_k) + \vec w_k$,  $h(\vec x_k) = \begin{bmatrix} r \\ b \end{bmatrix} = \begin{bmatrix} \sqrt{x^2 + y^2} \\ \tan^{-1}(x/y) \end{bmatrix}$
   The Kalman gain and covariance update only require the Jacobian of this observation function:
     $H = \dfrac{\partial h}{\partial \vec x} = \begin{bmatrix} \dfrac{x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} & 0 & 0 \\[6pt] \dfrac{y}{x^2+y^2} & \dfrac{-x}{x^2+y^2} & 0 & 0 \end{bmatrix}$
   This Jacobian is evaluated at the extrapolated estimate.
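A sketch of the range/bearing measurement function and its Jacobian, assuming NumPy and a state ordering [x, y, xdot, ydot] consistent with the zeros in the slide's Jacobian; the bearing is measured from the y-axis, matching tan⁻¹(x/y) above, and the numeric state is illustrative.

```python
import numpy as np

def h_polar(state):
    x, y = state[0], state[1]
    r = np.hypot(x, y)
    b = np.arctan2(x, y)          # bearing from the y-axis, per the slide
    return np.array([r, b])

def H_polar(state):
    x, y = state[0], state[1]
    r2 = x**2 + y**2
    r = np.sqrt(r2)
    return np.array([[x / r,   y / r,   0.0, 0.0],
                     [y / r2, -x / r2,  0.0, 0.0]])

state = np.array([3000.0, 4000.0, 10.0, -5.0])
print(h_polar(state))             # range 5000, bearing atan2(3000, 4000)
print(H_polar(state))             # Jacobian evaluated at the (extrapolated) estimate
```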
• 87. Ballistic Dynamics
   As a common example of nonlinear dynamics, consider the ballistic propagation equations specified in ECEF coordinates:
     $\dfrac{d}{dt} [\, x \;\; y \;\; z \;\; \dot x \;\; \dot y \;\; \dot z \,]^T = [\, \dot x \;\; \dot y \;\; \dot z \;\; 2\omega\dot y + \omega^2 x + G_x \;\; -2\omega\dot x + \omega^2 y + G_y \;\; G_z \,]^T + D\tilde v(t)$
   The gravitational acceleration components (to second order) are:
     $\vec G = \begin{bmatrix} G_x \\ G_y \\ G_z \end{bmatrix} = \begin{bmatrix} -\dfrac{\mu x}{R^3}\left[ 1 + \dfrac{3}{2} J_2 \left(\dfrac{R_e}{R}\right)^2 \left( 1 - 5\dfrac{z^2}{R^2} \right) \right] \\[8pt] -\dfrac{\mu y}{R^3}\left[ 1 + \dfrac{3}{2} J_2 \left(\dfrac{R_e}{R}\right)^2 \left( 1 - 5\dfrac{z^2}{R^2} \right) \right] \\[8pt] -\dfrac{\mu z}{R^3}\left[ 1 + \dfrac{3}{2} J_2 \left(\dfrac{R_e}{R}\right)^2 \left( 3 - 5\dfrac{z^2}{R^2} \right) \right] \end{bmatrix}$
• 88. Thrust and Re-entry Dynamics
   As a common example of nonlinear dynamics, consider augmenting the ballistic ECEF propagation equations with a thrust/drag acceleration along the velocity direction (speed $s$):
     $\dfrac{d}{dt} [\, x \;\; y \;\; z \;\; \dot x \;\; \dot y \;\; \dot z \;\; a \;\; \beta \,]^T = \left[\, \dot x \;\; \dot y \;\; \dot z \;\; \ddot x_{Ballistic} + a\dfrac{\dot x}{s} \;\; \ddot y_{Ballistic} + a\dfrac{\dot y}{s} \;\; G_z + a\dfrac{\dot z}{s} \;\; a\beta \;\; -\beta^2 \,\right]^T + D\tilde v(t)$
   The new states are the relative axial acceleration "a" and the relative mass depletion rate "β":
     $a(t) = a_{thrust}(t) - a_{drag}(t) = \dfrac{T + \tfrac{1}{2} C_D A_C\, \rho\, (\vec v \cdot \vec v)}{m(t)}$,  $\beta(t) = \dfrac{\dot m(t)}{m(t)}$
   The process noise matrix (if extended to second order) becomes a function of the speed. Thus, a more rapidly accelerating target will have more process noise injected into the filter.
• 89. Pseudo-Measurements
   In the case of the TBM dynamics, the ECEF coordinates are the most tractable coordinates.
   However, typically the measurements are in spherical coordinates.
   Furthermore, the Jacobian for the conversion from ECEF to RBE is extremely complicated.
   Instead, we can convert the measurements into ECEF as follows:
     $\vec z_k' = I_{3\times3}\, \vec x_k + \vec w_k'$,  $R_k' = J_{meas}\, R_k\, J_{meas}^T$
   However, since this is a linearization, we must be careful to make sure that this approximation holds.
• 90. Pseudo-Measurements
   The linearization is valid so long as (pp. 397-402):
     $\dfrac{r\, \sigma_b^2}{\sigma_r} < 0.4$
• 91. Iterated EKF
   The IEKF iteratively computes the state "n" times during a single update. This recursion is based on re-linearization about the estimate:
     $H_k^i = \left. \dfrac{\partial h}{\partial \vec x} \right|_{\hat x_{k|k}^i}$,  where  $H_k^0 = \left. \dfrac{\partial h}{\partial \vec x} \right|_{\hat x_{k|k-1}}$,  $i = 0, 1, \ldots, n$
   The state is updated iteratively with a re-linearized residual and gain:
     $\hat x_{k|k}^{i+1} = \hat x_{k|k-1} + K_k^i \vec\eta_k^{\,i}$,  $\vec\eta_k^{\,i} = \vec z_k - h(\hat x_{k|k}^i) - H_k^i \big[ \hat x_{k|k-1} - \hat x_{k|k}^i \big]$,  $K_k^i = P_{k|k-1} (H_k^i)^T \big[ H_k^i P_{k|k-1} (H_k^i)^T + R_k \big]^{-1}$
   Finally, the covariance is computed based upon the values of the final iteration:
     $P_{k|k} = \big( I - K_k^n H_k^n \big) P_{k|k-1} \big( I - K_k^n H_k^n \big)^T + K_k^n R_k (K_k^n)^T$
• 93. Why Multiple Models?
   When the target dynamics differ from the modeled dynamics, the state estimates are subject to:
     – Biases (lags) in the state estimate
     – Inconsistent covariance output
     – In a tracking environment, this increases the probability of mis-association and track loss
   In most tracking applications, the true target dynamics have an unknown time dependence.
   To accommodate changing target dynamics, one can develop multiple target dynamics models and perform hypothesis testing.
   This approach is called hybrid state estimation.
• 94. Why Multiple Models?
   [Plot: 6-State Velocity Error (Meters/Sec), x=red, y=blue, z=green, True Error=black; true state estimate error vs. the confidence interval given by the Kalman covariance]
   Assuming constant velocity target dynamics, the estimation errors become inconsistent during an acceleration.
• 95. Why Multiple Models?
   [Plot: 9-State Velocity Error (Meters/Sec), x=red, y=blue, z=green, True Error=black; true state estimate error vs. the confidence interval given by the Kalman covariance]
   A constant acceleration model remains consistent; however, the steady-state estimation error is larger.
• 96. Adaptive Process Noise
   Since the normalized innovation squared indicates the consistency of the dynamics model, it can be monitored to detect deviations (pp. 424-426).
   At each update, perform the following threshold test:
     $d_k^2 = \vec\eta_k^T S_k^{-1} \vec\eta_k \geq \varepsilon_{max}$,  where  $P\{ d_k^2 \geq \varepsilon_{max} \} = \alpha$
   Then, the process noise value "q" is adjusted such that the statistical distance is equal to this threshold value:
     $\vec\eta_k^T\, S_k^{-1}(q)\, \vec\eta_k = \varepsilon_{max}$
   The disadvantage is that false alarms result in sudden increases in error.
   We can use a sliding window average of these residuals; however, this can delay the detection of a maneuver (pp. 424-426).
• 97. State Estimation – Multiple Models
   We can further assume that the true dynamics follow one of "n" models:
     $\vec x_k^r = F_{k-1}^r \vec x_{k-1}^r + \vec v_{k-1}^r$,  $\vec z_k^r = H_k^r \vec x_k^r + \vec w_{k-1}^r$,  $r = 1, 2, \ldots, n$
   Using Kalman filter outputs, each model likelihood function can be computed:
     $\Lambda_k^r = \dfrac{\exp\!\big( -\tfrac{1}{2} (\vec\eta_k^r)^T (S_k^r)^{-1} \vec\eta_k^r \big)}{\sqrt{\det[2\pi S_k^r]}}$,  $r = 1, 2, \ldots, n$
   At each filter update, the posterior model probabilities $\mu_k^i$ are computed recursively using Bayes' Theorem. The proper output can be selected using these probabilities (pp. 453-457).
• 98. State Estimation – SMM
   Each hypothesis (model) is carried by its own Kalman filter; the measurement updates every filter, producing estimates $\hat x_k^1, \hat x_k^2, \ldots, \hat x_k^n$, and the model probabilities are updated recursively:
     $\mu_k^i = \dfrac{\Lambda_k^i\, \mu_{k-1}^i}{\sum_{j=1}^{n} \Lambda_k^j\, \mu_{k-1}^j}$
   Hypothesis selection then outputs $\hat x_k$, the most probable state estimate.
   Each Kalman filter is updated independently and has no knowledge about the performance of any other filter.
   This approach assumes that the target dynamics are time-independent.
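A minimal sketch of the static multiple-model probability update, assuming NumPy; the per-model likelihoods below are illustrative stand-ins for the outputs of the n Kalman filters.

```python
import numpy as np

def smm_update(mu_prev, likelihoods):
    """Recursive Bayes update of the model probabilities."""
    unnormalized = likelihoods * mu_prev
    return unnormalized / unnormalized.sum()

mu = np.array([0.5, 0.5])        # prior model probabilities
lam = np.array([0.02, 0.30])     # Lambda_k^r from each filter's innovation
mu = smm_update(mu, lam)
print(mu)                        # the better-matching model gains probability
```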
• 99. State Estimation – IMM
   The measurement again updates a bank of model-matched filters (hypotheses 1 through "n"), but a conditional probability update and state estimate interaction step mixes the previous estimates $\hat x_{k-1}^1, \ldots, \hat x_{k-1}^n$ before each update; probability updates and estimate mixing then produce the IMM estimate $\hat x_k$.
   Each Kalman filter interacts with the others just prior to an update.
   This interaction allows for the possibility of a transition between models.
   This approach assumes that the target dynamics will change according to a Markov process (pp. 453-457).
• 100. State Estimation – IMM
   One IMM cycle (shown for two models):
     – Interaction: the previous estimates $\hat x_{k-1|k-1}^1, P_{k-1|k-1}^1$ and $\hat x_{k-1|k-1}^2, P_{k-1|k-1}^2$ are mixed using the probabilities $\mu_{k-1}^1, \mu_{k-1}^2$ to form the mixed initial conditions $\hat x_{k-1|k-1}^{0,1}, P_{k-1|k-1}^{0,1}$ and $\hat x_{k-1|k-1}^{0,2}, P_{k-1|k-1}^{0,2}$
     – Kalman filter updates: each filter processes the measurement $\vec z_k$, yielding $\hat x_{k|k}^1, P_{k|k}^1$ and $\hat x_{k|k}^2, P_{k|k}^2$ along with the likelihoods $\Lambda_k^1, \Lambda_k^2$
     – Probability updates: $\Lambda_k^1, \Lambda_k^2$ update the model probabilities $\mu_k^1, \mu_k^2$
     – Estimate mixing: the model-conditioned outputs are combined into $\hat x_{k|k}, P_{k|k}$
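A sketch of the IMM interaction (mixing) and combination steps, assuming NumPy; the Markov transition matrix, model states, covariances, and likelihoods are illustrative stand-ins for the per-model Kalman filter outputs, and for brevity the same illustrative states are reused as the post-update outputs.

```python
import numpy as np

def imm_mix(mu_prev, Pi, states, covs):
    """Compute mixed initial conditions for each model-matched filter."""
    n = len(mu_prev)
    c_bar = Pi.T @ mu_prev                              # predicted model probabilities
    mix = (Pi * mu_prev[:, None]) / c_bar[None, :]      # mixing probabilities mu_{i|j}
    mixed_states, mixed_covs = [], []
    for j in range(n):
        xj = sum(mix[i, j] * states[i] for i in range(n))
        Pj = sum(mix[i, j] * (covs[i] + np.outer(states[i] - xj, states[i] - xj))
                 for i in range(n))
        mixed_states.append(xj)
        mixed_covs.append(Pj)
    return mixed_states, mixed_covs, c_bar

def imm_combine(mu, states, covs):
    """Combine per-model outputs into the overall IMM estimate (with spread of means)."""
    x = sum(m * s for m, s in zip(mu, states))
    P = sum(m * (C + np.outer(s - x, s - x)) for m, s, C in zip(mu, states, covs))
    return x, P

Pi = np.array([[0.95, 0.05],               # Markov model-transition matrix
               [0.10, 0.90]])
mu_prev = np.array([0.6, 0.4])
states = [np.array([0.0, 1.0]), np.array([0.2, 1.5])]
covs = [np.eye(2), 2.0 * np.eye(2)]

mixed_states, mixed_covs, c_bar = imm_mix(mu_prev, Pi, states, covs)
lam = np.array([0.3, 0.05])                # per-model likelihoods after the update
mu = lam * c_bar / np.sum(lam * c_bar)     # updated model probabilities
x_imm, P_imm = imm_combine(mu, states, covs)
print(mu, x_imm)
```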
• 101. State Estimation – Applied IMM
   [Plot: 6-State Velocity Error (Meters/Sec), x=red, y=blue, z=green, True Error=black; true state estimate error vs. the confidence interval given by the Kalman covariance]
   The IMM adapts to changes in target dynamics and provides a consistent covariance during these transitions.
• 102. State Estimation – IMM Markov Matrix
   The particular choice of the Markov matrix is somewhat of an art.
   Just like any filter tuning process, one can choose a Markov matrix simply based upon observed performance.
   Alternatively, this transition matrix has a physical relationship to the mean sojourn time of a given dynamics state:
     $E[N_i] = \dfrac{1}{1 - p_{ii}} = \dfrac{E[\tau_i]}{T_{scan}}$  $\;\Rightarrow\;$  $p_{ii} = 1 - \dfrac{1}{E[N_i]} = 1 - \dfrac{T_{scan}}{E[\tau_i]}$
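A tiny sketch of setting the Markov matrix diagonal from the mean sojourn time relation above; the sojourn times and scan interval are illustrative values.

```python
# Expected time spent in each dynamics mode (seconds) and the update interval.
sojourn_s = [30.0, 10.0]
t_scan = 2.0

# p_ii = 1 - T_scan / E[tau_i]
p_diag = [1.0 - t_scan / tau for tau in sojourn_s]
print(p_diag)    # e.g. [0.933..., 0.8]
```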
• 103. State Estimation – VSIMM
   Air Targets: Adaptive Grid Coordinated Turning Model
   TBM Targets: Constant Axial Thrust, Ballistic, Singer ECA
   An Air IMM and a TBM IMM each process the measurement, producing $\hat x_k^{Air}$ and $\hat x_k^{TBM}$; an SPRT-based hypothesis selection produces the output $\hat x_k$.
   The SPRT is performed as follows:
     $\dfrac{\mu_k^{Air}}{\mu_k^{TBM}} \;\; \begin{cases} > T_2 & \Rightarrow \hat x_k^{Air} \\ > T_1 \text{ and } < T_2 & \Rightarrow \text{mixed} \\ < T_1 & \Rightarrow \hat x_k^{TBM} \end{cases}$,  with  $T_2 = \dfrac{1-\beta}{\alpha}$  and  $T_1 = \dfrac{\beta}{1-\alpha}$