ADSP-21K Optimized DSP Library User’s Manual




CHAPTER 5   Function Descriptions For
            The ADSP-21K Optimized
            DSP Library




            Each function description in the following pages covers these topics, in this
            order:
            •   Name
            •   Description of the function's operation
            •   The algorithm as applicable
            •   Synopsis of function prototype
            •   Domain valid for arguments
            •   Accuracy of the returned value(s)
            •   Execution time in machine cycles
            •   Notes applicable to this function




                    Wideband Computers, Inc.                                                   5-55




 acort ( a, c, m, n )
                 NAME   Auto-correlation (Time Domain)

          DESCRIPTION   Computes the time domain auto-correlation of the real elements stored in input vector
                        a[ ]. Values m and n define the number of auto-correlation values to compute. The
                        resulting auto-correlation values are stored in output vector c[ ].
           ALGORITHM    $C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot A_j, \quad i = \{0, 1, 2, \ldots, m-1\}$
             SYNOPSIS   void acort ( a, c, m, n )
                        float     *a ; /* Pointer to input vector a[ ]                                    */
                        float     *c ; /* Pointer to output vector c[ ]                                   */
                        int         m ; /* Lag count m                                                    */
                        int         n ; /* Number of elements in vector a[ ]                              */


              DOMAIN    -3.4E+38 to 3.4E+38

           ACCURACY     7.75 decimal digits

       EXECUTION TIME   31 + 9*M + (M+1) (2*N-M)

                NOTES   The file tacort.c included in the distribution tape provides an example of this
                        function’s use.

                        Note that the lag count m must be less than or equal to the number of floating-point
                        elements (i.e. m ≤ n ).
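
The summation in the ALGORITHM entry can be sketched in portable C as shown here. This is a minimal, unoptimized reference and not the library's implementation; the name acort_ref is hypothetical and is used only for illustration.

```c
/* Reference auto-correlation: c[i] = sum over j = 0..n-i-1 of a[i+j] * a[j].
   acort_ref is a hypothetical name, not a library routine. */
void acort_ref(const float *a, float *c, int m, int n)
{
    for (int i = 0; i < m; i++) {          /* one output per lag i */
        float sum = 0.0f;
        for (int j = 0; j < n - i; j++)    /* j runs from 0 to n-i-1 */
            sum += a[i + j] * a[j];
        c[i] = sum;
    }
}
```

For the input {1, 2, 3, 4} with m = 3, lag 0 is the sum of squares and each later lag overlaps one fewer pair of samples.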








acos_wci ( x )
                 NAME    Arc Cosine

        DESCRIPTION      This function computes the arc cosine of a floating-point number, x. The computed
                         value returned from this function is in the range [0 to π ] radians. A domain error is
                         returned if x is not in the range [-1 to +1].

        ALGORITHM        return = $\cos^{-1}(x)$
           SYNOPSIS      float acos_wci ( float x )

            DOMAIN       -1.0 < x < +1.0

         ACCURACY        7.75 decimal digits

    EXECUTION TIME       If A <= 0.5 then 55 cycles, Else if A >0.5 then 75 cycles

                 NOTES   The file tacos.c included in the distribution tape provides an example of this function's
                         use.




acosh_wci ( x )
                 NAME    Inverse Hyperbolic Cosine

        DESCRIPTION      This function computes the inverse hyperbolic cosine of a floating-point number, x.

        ALGORITHM        return = $\cosh^{-1}(x)$
           SYNOPSIS      float acosh_wci ( float x )

            DOMAIN       1.0 to 3.4E+38

         ACCURACY        7.75 decimal digits

    EXECUTION TIME       72 cycles

                 NOTES   The file tacosh.c included in the distribution tape provides an example of this
                         function's use.








 alawc ( a, i, c, k, n )
                 NAME      a-Law Compression

          DESCRIPTION      This routine performs an a-law compression on the elements in input vector a and
                           outputs the compressed results to output vector c.


           ALGORITHM       $C_{mk} = \text{a-law compression of } A_{mi}, \quad m = \{0, 1, 2, \ldots, n-1\}$
              SYNOPSIS     void alawc ( a, i, c, k, n )
                           int        *a ; /* Pointer to input vector a                                    */
                           int         i ; /* Element stride for vector a                                  */
                           int        *c ; /* Pointer to output vector c                                   */
                           int         k ; /* Element stride for vector c                                  */
                           int         n ; /* Number of elements                                           */


               DOMAIN      0 to 255

            ACCURACY       7.75 decimal digits

       EXECUTION TIME      49 + 12 * ( N-1 )

                 NOTES     The file talawc.c included in the distribution tape provides an example of this
                           function’s use.

                           The alawc() routine takes a linear 13-bit signed speech sample and compresses it
                           according to CCITT (now ITU) recommendation G.711. The 8-bit compressed sample
                           is output to vector c.

                           The same companding function is also available in the serial port hardware of the
                           ADSP-2106x DSP processors.
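
As an illustration of the compression step, the sketch below encodes 13-bit signed samples using the segment and quantization layout commonly used for G.711 A-law, including the inversion of the even bits (XOR with 0x55). It is an assumption that the library applies the same bit conventions; the names alaw_compress_sample and alawc_ref are hypothetical.

```c
/* Hypothetical helper: compress one 13-bit signed linear sample
   (-4096..4095) to an 8-bit G.711 A-law code word. */
static int alaw_compress_sample(int pcm)
{
    static const int seg_end[8] =
        {0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF};
    int mask, seg, code;

    if (pcm >= 0) {
        mask = 0xD5;              /* sign bit set plus even-bit inversion */
    } else {
        mask = 0x55;              /* even-bit inversion only */
        pcm = -pcm - 1;           /* magnitude of the negative sample */
    }
    if (pcm > 0xFFF)
        pcm = 0xFFF;              /* clip to the 12-bit magnitude range */

    for (seg = 0; seg < 7; seg++) /* find the segment holding the magnitude */
        if (pcm <= seg_end[seg])
            break;

    code = seg << 4;              /* 3-bit segment number */
    if (seg < 2)
        code |= (pcm >> 1) & 0x0F;    /* the two linear segments */
    else
        code |= (pcm >> seg) & 0x0F;  /* the logarithmic segments */
    return code ^ mask;
}

/* Strided vector form, mirroring the argument order in the synopsis. */
void alawc_ref(const int *a, int i, int *c, int k, int n)
{
    for (int m = 0; m < n; m++)
        c[m * k] = alaw_compress_sample(a[m * i]);
}
```

With these conventions a zero sample encodes to 0xD5 and the largest positive sample (4095) encodes to 0xAA.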








alawe ( a, i, c, k, n )
                NAME      a-Law Expansion

         DESCRIPTION      This routine performs an a-law expansion on the elements in input vector a and
                          outputs the expanded results to output vector c.


          ALGORITHM       $C_{mk} = \text{a-law expansion of } A_{mi}, \quad m = \{0, 1, 2, \ldots, n-1\}$
             SYNOPSIS     void alawe ( a, i, c, k, n )
                          int         *a ; /* Pointer to input vector a                                   */
                          int          i ; /* Element stride for vector a                                 */
                          int         *c ; /* Pointer to output vector c                                  */
                          int          k ; /* Element stride for vector c                                 */
                          int          n ; /* Number of elements                                          */


              DOMAIN      0 to 255

           ACCURACY       7.75 decimal digits

    EXECUTION TIME        46 + 17 * ( N-1 )

                NOTES     The file talawe.c included in the distribution tape provides an example of this
                          function’s use.

                          The alawe() routine takes an 8-bit compressed speech sample and expands it
                          according to CCITT (now ITU) recommendation G.711. The 13-bit signed sample is output
                          to vector c.

                          The same companding function is also available in the serial port hardware of the
                          ADSP-2106x DSP processors.
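
As an illustration of the expansion step, the sketch below decodes 8-bit A-law code words to 13-bit signed samples using the commonly used G.711 layout (even bits inverted, 3-bit segment, 4-bit step, each code decoded to the mid-point of its step). It is an assumption that the library uses the same bit conventions and output scaling; the names alaw_expand_sample and alawe_ref are hypothetical.

```c
/* Hypothetical helper: expand one 8-bit G.711 A-law code word to a
   13-bit signed linear sample. */
static int alaw_expand_sample(int code)
{
    int q, seg, t;

    code = (code & 0xFF) ^ 0x55;          /* undo the even-bit inversion */
    q    = code & 0x0F;                   /* 4-bit quantization step */
    seg  = (code & 0x70) >> 4;            /* 3-bit segment number */

    t = (q << 1) + ((seg == 0) ? 1 : 33); /* mid-point of the step */
    if (seg > 1)
        t <<= (seg - 1);                  /* scale the upper segments */

    return (code & 0x80) ? t : -t;        /* sign bit set means positive */
}

/* Strided vector form, mirroring the argument order in the synopsis. */
void alawe_ref(const int *a, int i, int *c, int k, int n)
{
    for (int m = 0; m < n; m++)
        c[m * k] = alaw_expand_sample(a[m * i]);
}
```

Under these assumptions the code words 0xD5 and 0x55 decode to the smallest positive and negative levels, and 0xAA and 0x2A decode to the largest magnitudes (4032 on the 13-bit scale).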








 alpha ( df, a, &al, &n )
               NAME    Kaiser-Bessel Window Shape Parameter

         DESCRIPTION   Computes a Kaiser-Bessel window shape parameter for later use by the kaiser( )
                       window multiply library function. The computation is based on the attenuation
                       specified in input scalar a and the transition width specified in real input scalar df.
                       From these, the function computes a count of floating-point elements (output scalar n)
                       and a window shape parameter (output scalar al).


          ALGORITHM     $al = 0$ if $A \le 21$
                        $al = 0.5842 \cdot (A - 21)^{0.4} + 0.07886 \cdot (A - 21)$ if $21 < A < 50$
                        $al = 0.1102 \cdot (A - 8.7)$ otherwise

                        Number of elements n is computed as follows:
                        $d = (A - 7.95) / 14.36$ if $A > 21$, else $d = 0.922$
                        $n = 1 + \mathrm{ceiling}(d / df)$
                        $n = n + 1 - \mathrm{remainder}(n / 2)$

            SYNOPSIS   void alpha ( df, a, &al, &n )
                       float dm *df ; /* Input transition width in fs units                             */
                       float dm       a      ; /* Input ripple attenuation in dB                        */
                       float dm &al ; /* Output alpha window shape parameter */
                       int          &n       ; /* Output floating-point element count */

              DOMAIN   -3.4E+38 to 3.4E+38

          ACCURACY     7.75 decimal digits




    EXECUTION TIME       If A >= 50 then 143 cycles

                         If 21 < A < 50 then 221 cycles

                         If A <= 21 then 124 cycles

                 NOTES   The file talpha.c included in the distribution tape provides an example of this
                         function’s use.

                         $df = \Delta f / f_s$, $A$ = ripple attenuation in dB, $\delta = 10^{-A/20}$




asin_wci ( x )
                 NAME    Arc Sine

        DESCRIPTION      This function computes the arc sine of a floating-point number, x. The computed
                         value returned from this function is in the range [-π/2 to π/2] radians. A domain error
                         is returned if x is not in the range [-1 to +1].

         ALGORITHM       return = $\sin^{-1}(x)$

             SYNOPSIS     float asin_wci ( float x )

              DOMAIN      - 1.0 < x < +1.0

           ACCURACY       7.75 decimal digits

       EXECUTION TIME     If A <= 0.5 then 55 cycles, Else if A >0.5 then 73 cycles

                  NOTES   The file tasin.c included in the distribution tape provides an example of this function's
                          use.




 asinh_wci ( x )
                  NAME    Inverse Hyperbolic Sine

          DESCRIPTION     This function computes the inverse hyperbolic sine of a floating-point number, x.

           ALGORITHM      return = $\sinh^{-1}(x)$
             SYNOPSIS     float asinh_wci ( float x )

              DOMAIN      -3.4E+38 to 3.4E+38

           ACCURACY       7.75 decimal digits

       EXECUTION TIME     57 cycles

                  NOTES   The file tasinh.c included in the distribution tape provides an example of this
                          function's use.








aspec ( a, c, n )
                NAME   Accumulating Auto-spectrum

        DESCRIPTION    Computes the auto-spectrum of complex input vector a by multiplying vector a by its
                       complex conjugate and adding the resulting real number to the current value of vector
                       c. Vector c must be initialized prior to invoking a series of accumulating
                       auto-spectrum calls.

         ALGORITHM     $C_m \Leftarrow C_m + \mathrm{Re}\{A_m\}^2 + \mathrm{Im}\{A_m\}^2, \quad m = \{0, 1, 2, \ldots, n-1\}$

            SYNOPSIS   void aspec ( a, c, n )
                       complex *a ; /* Pointer to input vector a                                     */
                       float       *c ; /* Pointer to output vector c                                */
                       int           n ; /* Element count for vector c                               */

             DOMAIN    -3.4E+38 to 3.4E+38

          ACCURACY     7.75 decimal digits

    EXECUTION TIME     28 + 6*N cycles

               NOTES   The file taspec.c included in the distribution tape provides an example of this
                       function’s use.

                       The stride of vectors a and c must always be 1.

                       To start a new accumulation, clear output vector c with the vclr( ) function before
                       the first call. If c is not cleared, each call adds its auto-spectrum results to the
                       values already in c, thus computing an accumulating auto-spectrum.

                       Note that input vector a is of type complex, and data arguments supplied to this routine
                       will be treated as interleaved real and imaginary data.
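
The accumulate-in-place behavior can be sketched in portable C as follows. The struct complex_ref is an assumed layout (interleaved real and imaginary floats, as the note above describes), and aspec_ref is a hypothetical name; neither is taken from the library headers.

```c
/* Assumed complex layout: real part followed by imaginary part. */
typedef struct {
    float re;
    float im;
} complex_ref;

/* Hypothetical reference: c[m] += Re(a[m])^2 + Im(a[m])^2 for unit strides. */
void aspec_ref(const complex_ref *a, float *c, int n)
{
    for (int m = 0; m < n; m++)
        c[m] += a[m].re * a[m].re + a[m].im * a[m].im;
}
```

Because the routine adds into c, calling it twice on the same input doubles the result; clearing c first yields a single-block auto-spectrum.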








 atan_wci ( x )
                  NAME    Arc Tangent

          DESCRIPTION     This function computes the arc tangent of a floating-point number x. The computed
                          value returned from this function is in the range [-π/2 to +π/2] radians.

           ALGORITHM      return = $\tan^{-1}(x)$
             SYNOPSIS     float atan_wci ( float x )

              DOMAIN      - 4.2E+37 < x < +4.2E+37

           ACCURACY       7.75 decimal digits

       EXECUTION TIME     59 cycles

                  NOTES   The file tatan.c included in the distribution tape provides an example of this function's
                          use.








atan2_wci ( y, x )
               NAME   Arc Tangent 2 Arguments

        DESCRIPTION   This function computes the arc tangent of the quotient y/x of two floating-point
                      numbers. The computed value returned from this function is in the range [-π to +π] radians.


         ALGORITHM    return = $\tan^{-1}(y / x)$

           SYNOPSIS   float atan2_wci ( y, x )
                      float dm y ;          /*   Input value y                               */
                      float dm x ;          /*   Input value x                               */



            DOMAIN    - 4.2E+37 < y/x < +4.2E+37, except x = 0.0

          ACCURACY    7.75 decimal digits

    EXECUTION TIME    76 cycles

              NOTES   The file tatan2.c included in the distribution tape provides an example of this
                      function's use.








 atanh_wci ( x )
                NAME    Inverse Hyperbolic Tangent

          DESCRIPTION   This function computes the inverse hyperbolic tangent of a floating-point number, x.

           ALGORITHM    return = $\tanh^{-1}(x)$
             SYNOPSIS   float atanh_wci ( float x )

              DOMAIN    -1.0 to +1.0

           ACCURACY     7.75 decimal digits

       EXECUTION TIME   59 cycles

               NOTES    The file tatanh.c included in the distribution tape provides an example of this
                        function's use.








bartlett ( a, i, c, k, n )
                 NAME        Bartlett Window

         DESCRIPTION         This function performs a Bartlett window multiply on the elements of input vector a
                             and places the results in output vector c.

          ALGORITHM          $C_{mk} = A_{mi} \cdot \left( 1 - \frac{\left| m - \frac{1}{2} n \right|}{\frac{1}{2} n} \right), \quad m = \{0, 1, 2, \ldots, n-1\}$
             SYNOPSIS        void bartlett ( a, i, c, k, n )
                             float *a ; /*         Pointer to input vector a                                    */
                             int      i ; /*       Address stride in words for input vector a                   */
                             float *c ; /*         Pointer to output vector c                                   */
                             int      k ; /*       Address stride in words for output vector c                  */
                             int      n ; /*       Element count                                                */

              DOMAIN         -3.4E+38 to +3.4E+38

           ACCURACY          7.75 decimal digits

     EXECUTION TIME          44 + 17 * ( N-1 ) cycles

                NOTES        The file tbartlett.c included in the distribution diskette provides an example of this
                             function’s use.

                             The Bartlett window is also known as a triangular window.
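
A triangular window multiply can be sketched in portable C as follows. The sketch assumes the weight 1 - |m - n/2| / (n/2), which peaks at the middle of the block and falls linearly to zero at m = 0; the name bartlett_ref is hypothetical and this is not the library's implementation.

```c
#include <math.h>

/* Hypothetical reference: multiply strided input a by a triangular window
   and store the result to strided output c. */
void bartlett_ref(const float *a, int i, float *c, int k, int n)
{
    float half = 0.5f * (float)n;   /* n/2, the centre of the window */

    for (int m = 0; m < n; m++) {
        float w = 1.0f - fabsf((float)m - half) / half;
        c[m * k] = a[m * i] * w;
    }
}
```

For n = 4 and an input of all ones, the weights are 0, 0.5, 1, 0.5.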








 biquad ( x, d, c, y, n )
                 NAME       Bi-Quad IIR Filter

         DESCRIPTION        Using a bi-quad implementation, this function computes an IIR ( Infinite Impulse
                            Response ) filter using coefficients stored in input vector c, delay node points stored in
                            input buffer d, and applied to the elements of input vector x. The results are stored in
                            output vector y.

          ALGORITHM         $H(z) = \frac{B_0 + B_1 z^{-1} + B_2 z^{-2}}{1 - A_1 z^{-1} - A_2 z^{-2}}$

                            where

                            $D_m = A_2 \cdot D_{m-2} + A_1 \cdot D_{m-1} + x_m$
                            $Y_m = B_2 \cdot D_{m-2} + B_1 \cdot D_{m-1} + B_0 \cdot D_m$

                            $m = \{0, 1, 2, \ldots, n-1\}$
             SYNOPSIS       void biquad ( x, d, c, y, n )
                            float       *x ; /* Pointer to input buffer vector x of length n                        */
                            float       *d ; /* Pointer to input delay node buff vector d of length 2               */
                            float       *c ; /* Pointer to input coeff buffer vector c of length 5                  */
                            float       *y ; /* Pointer to output buffer vector y of length n                       */
                            int           n ; /* Number of input/output samples to compute                          */


              DOMAIN        -3.4E+38 to 3.4E+38

           ACCURACY         7.75 decimal digits




    EXECUTION TIME         65 + 13*N

               NOTES       This is a single bi-quad form of an infinite impulse response (IIR) filter, defined by the
                           first equation shown above. It is implemented using a delay node buffer d, as shown in
                           the second and third equations above. The coefficients a[ ] and b[ ] are passed in a
                           single array c[ ] given by the following:

                              c[0] = A2, c[1] = B2, c[2] = A1, c[3] = B1, c[4] = B0

                           Prior to executing the filter loop, the two “oldest” delay node values are loaded from
                           buffer d[ ]. When the filter loop has completed (n samples have been processed) the
                           two “newest” delay node values are written to d[ ]. In this way the filter delay node
                           states are retained between calls, allowing filtering on blocks of contiguous samples.
                           The user is responsible for allocating the delay node array and for initializing its
                           elements to zero prior to the first call to biquad( ).

                           Defining

                              $d_0 = D_m$, $d_1 = D_{m-1}$, $d_2 = D_{m-2}$

                           then, for $m = \{0, 1, 2, \ldots, n-1\}$:

                              $d_0 = c_0 \cdot d_2 + c_2 \cdot d_1 + x_m$
                              $y_m = c_1 \cdot d_2 + c_3 \cdot d_1 + c_4 \cdot d_0$
                              $d_2 = d_1$
                              $d_1 = d_0$


                           The coefficient buffer length is defined symbolically in the file dsppac.h as
                           DSP_BIQUAD_NCOEFF. The delay node buffer length is defined symbolically in
                           the file dsppac.h as DSP_BIQUAD_NDELAY.

                           The number of input samples n must be greater than or equal to 5.

                           The file tbiquad.c included in the distribution tape provides an example of this
                           function’s use.
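
The delay-node recurrence described in the notes can be sketched in portable C as follows. This is an unoptimized illustration, not the library routine: the name biquad_ref is hypothetical, and the order in which the two delay values are kept in d[ ] (newest first) is an assumption.

```c
/* Hypothetical reference for a single bi-quad section.
   c[0] = A2, c[1] = B2, c[2] = A1, c[3] = B1, c[4] = B0.
   Assumed state layout: d[0] = D[m-1] (newest), d[1] = D[m-2] (oldest). */
void biquad_ref(const float *x, float *d, const float *c, float *y, int n)
{
    float d1 = d[0];
    float d2 = d[1];

    for (int m = 0; m < n; m++) {
        float d0 = c[0] * d2 + c[2] * d1 + x[m];     /* new delay node */
        y[m] = c[1] * d2 + c[3] * d1 + c[4] * d0;    /* filter output */
        d2 = d1;                                     /* age the state */
        d1 = d0;
    }

    d[0] = d1;   /* save state so the next block continues the filter */
    d[1] = d2;
}
```

With A1 = 0.5, B0 = 1 and all other coefficients zero, an impulse input produces the decaying response 1, 0.5, 0.25, and so on, and the saved state lets a following block continue that response.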






 blkman ( a, i, c, k, w, h, n )
                 NAME    Blackman Window Multiply

         DESCRIPTION     Multiplies the input vector a[ ] by a Blackman window and stores the result to vector
                         c[ ].

          ALGORITHM       $C_{mk} = A_{mi} \cdot \left( 0.42 - 0.50 \cdot \cos\frac{2\pi m i}{N} + 0.08 \cdot \cos\frac{4\pi m i}{N} \right)$
                          $m = \{0, 1, 2, \ldots, n-1\}$

             SYNOPSIS    void blkman ( a, i, c, k, w, h, n )
                         float       dm *a ; /* Pointer to input vector a                                  */
                         int                   i ; /* Element stride for vector a                          */
                         float       dm *c ; /* Pointer to output vector c                                 */
                         int                   k ; /* Element stride for vector c                          */
                         float       pm *w ; /* Pointer to cosine weights array                            */
                         int                   h ; /* Element stride for weights array                     */
                         int                   n ; /* Element count for vector c                           */

              DOMAIN     -3.4E+38 to 3.4E+38

           ACCURACY      7.75 decimal digits




    EXECUTION TIME      41 + 4*(N-1) cycles

               NOTES    The file tblkman.c included in the distribution tape provides an example of this
                        function’s use.

                        For real-time applications, the Blackman window can be computed once, and a simple
                        multiply used to window data as shown in the variable W ml . The Blackman window
                        is computed using the winwts( ) function found in the DSP Pac library. The
                        winwts( ) function computes the weights array using the sine and cosine functions. This
                        array is pointed to by variable w listed in the synopsis section above.

                        The blkman( ) function is a vector function. You may therefore use the stride
                        arguments i, k and h to decimate both the input and output for data congruence. For
                        example, suppose you use winwts( ) to compute the FFT weights for a 16K FFT. This would
                        result in an fftwts array whose length would be 16,384 points. If you were to later
                        decide to compute an FFT of length 1,024 and run a Blackman window on the results,
                        you would not need to rerun the winwts( ) function to generate new weights. Simply
                        use the old weights and stride by 16 (16,384/1,024 = 16) on stride element h to obtain
                        the correct Blackman window FFT weights. In this manner you need only compute
                        winwts( ) once and later use the weights for varying length FFTs and windowing functions.

                        The cosine arguments are held in input vector w[ ] and can be computed from the
                        winwts( ) function. Note that larger vector sizes of w[ ] can be used by changing the stride
                        for w[ ]. For example, if w[ ] were computed for a window of size 2,048, but a
                        Blackman window of 1,024 was needed, use a stride of 2,048/1,024 = 2.

                        Note that the Blackman window has a passband ripple of 0.0017 dB, a maximum
                        stopband attenuation of 74 dB, and a 57 dB main lobe relative to side lobe.








 blkmanh ( a, i, c, k, w, h, n )
                NAME     Blackman-Harris Window Multiply

         DESCRIPTION     Multiplies the input vector a[ ] by a Blackman-Harris window and stores the result to
                         output vector c[ ].
          ALGORITHM      $C_{mk} = A_{mi} \cdot \left( 0.35875 - 0.48829 \cdot \cos\frac{2\pi m i}{N} + 0.14128 \cdot \cos\frac{4\pi m i}{N} - 0.01168 \cdot \cos\frac{6\pi m i}{N} \right)$
                         $m = \{0, 1, 2, \ldots, n-1\}$

             SYNOPSIS    void blkmanh ( a, i, c, k, w, h, n )
                         float        dm *a ; /* Pointer to input vector a                                                     */
                         int                  i ; /* Element stride for vector a                                               */
                         float        dm *c ; /* Pointer to output vector c                                                    */
                         int                  k ; /* Element stride for vector c                                               */
                         float        pm *w ; /* Pointer to cosine weights array                                               */
                         int                  h ; /* Element stride for weights array                                          */
                         int                  n ; /* Element count for vector c                                                */

              DOMAIN     -3.4E+38 to 3.4E+38

           ACCURACY      7.75 decimal digits




    EXECUTION TIME      54 + 6*(N-1) cycles

               NOTES    The file tblkmanh.c included in the distribution tape provides an example of this
                        function’s use.

                        For real-time applications, the Blackman-Harris window can be computed once, and a
                        simple multiply used to window data, as shown in the variable W ml . The
                        Blackman-Harris window is computed using the winwts( ) function found in the DSP Pac
                        library. The winwts( ) function computes the weights array using the sine and cosine
                        functions. This array is pointed to by variable w listed in the synopsis section above.

                        The blkmanh( ) function is a vector function. You may therefore use the stride
                        arguments i, k and h to decimate both the input and output for data congruence. For
                        example, suppose you use winwts( ) to compute the FFT weights for a 16K point FFT. This
                        would result in an fftwts array whose length would be 16,384 points. If you were to
                        later decide to compute an FFT of length 1,024 and run a Blackman-Harris window on
                        the results, you would not need to rerun the winwts( ) function to generate new
                        weights. Simply use the old weights and stride by 16 (16,384/1,024 = 16) on stride
                        element h to obtain the correct window FFT weights. In this manner you need only
                        compute winwts( ) once and later use the weights for varying length FFTs and windowing
                        functions.

                        The cosine arguments are held in input vector w[ ] and can be computed from the
                        winwts( ) function. Note that larger vector sizes of w[ ] can be used by changing the stride
                        for w[ ]. For example, if w[ ] were computed for a window of size 2,048, but a
                        Blackman-Harris window of 1,024 was needed, use a stride of 2,048/1,024 = 2.

                        Note that the Blackman-Harris window has a passband ripple of 0.0017 dB, a maximum
                        stopband attenuation of 74 dB, and a 57 dB main lobe relative to side lobe.








 cacort ( a, c, m, n )
                 NAME    Complex Auto-Correlation (Time Domain)

          DESCRIPTION    Computes the time domain auto-correlation of the complex elements stored in input
                         vector a[ ]. Values m and n define the number of auto-correlation values to compute.
                         The resulting auto-correlation values are stored in output complex vector c[ ].
           ALGORITHM     $C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot A_j, \quad i = \{0, 1, 2, \ldots, m-1\}$
             SYNOPSIS    void cacort ( a, c, m, n )
                         complex dm *a ; /* Pointer to input vector a[ ]                                         */
                         complex dm *c ; /* Pointer to output vector c[ ]                                        */
                         int               m ; /* Lag count m                                                    */
                         int               n ; /* Number of elements in vector a[ ]                              */


              DOMAIN     -3.4E+38 to 3.4E+38

           ACCURACY      7.75 decimal digits

       EXECUTION TIME    39 + ( 9 + 5 * n ) * n

                NOTES    The file tacort.c included in the distribution tape provides an example of this func-
                         tion’s use.

                         Note that the lag count m must be less than or equal to the number of floating-point
                         elements (i.e. m ≤ n ).

                         The strides of vectors a[ ] and c [ ] must be 1.




ccdotpr ( a, i, b, j, c, k, n )
                NAME     Complex Dot Product Multiply by Conjugate

        DESCRIPTION      This function computes the complex dot product of complex input vector a by the
                         complex conjugate of input vector b and stores the results in complex output vector c.
                         This can be alternatively expressed as C=AB*.

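         ALGORITHM      $\mathrm{Re}\{C\} = \sum_{m=0}^{n-1} \left( \mathrm{Re}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} + \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} \right)$

                        $\mathrm{Im}\{C\} = \sum_{m=0}^{n-1} \left( -\mathrm{Re}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} + \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} \right)$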
            SYNOPSIS     void ccdotpr ( a, i, b, j, c, k, n )
                         complex *a ; /*       Pointer to complex input vector a                    */
                         int        i ; /*     Address stride in words for input vector a           */
                         complex *b ; /*       Pointer to complex input vector b                    */
                         int        j ; /*     Address stride in words for input vector b           */
                         complex *c ; /*       Pointer to complex output vector c                   */
                         int        k ; /*     Address stride in words for output vector c          */
                         int        n ; /*     Element count                                        */


             DOMAIN      -3.4 x 10^38 to +3.4 x 10^38

          ACCURACY       7.75 decimal digits

    EXECUTION TIME        64 + 4*(N-1) cycles

               NOTES     The file tccdotpr.c included in the distribution diskette provides an example of this
                         function’s use.
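
                         The sums above amount to accumulating A times the conjugate of B. A minimal
                         reference sketch follows; it is not the library routine. The name ccdotpr_ref is
                         hypothetical, the C99 float complex type stands in for the library's complex type,
                         and the strides are assumed to count complex elements rather than memory words.

```c
#include <assert.h>
#include <complex.h>

/* Reference sketch (not the library routine):
 * returns sum over m of a[m*i] * conj(b[m*j]).
 * Assumption: strides i and j count complex elements, not memory words. */
float complex ccdotpr_ref(const float complex *a, int i,
                          const float complex *b, int j, int n)
{
    assert(n > 0 && i > 0 && j > 0);
    float complex acc = 0.0f;
    for (int m = 0; m < n; m++)
        acc += a[m * i] * conjf(b[m * j]);
    return acc;
}
```

                         Dropping the conjf call gives the plain complex dot product C = A • B.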




 ccmmul ( a, x, y, b, z, c )
                NAME    Complex Matrix Multiply By Conjugate of Complex Matrix

         DESCRIPTION    This function computes the multiplication of the conjugate of complex input matrix
                        a [ ] [ ] times the elements of complex input matrix b[ ] [ ]. The dimensions of com-
                        plex input matrix a[ ] [ ] are x and y, while the dimensions of complex input matrix
                        b[ ] [ ] are defined by input scalars y and z. The results are stored in complex output
                        matrix c[ ] [ ], which is of dimensions x and z.

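          ALGORITHM     $\mathrm{Re}(C_{ij}) = \sum_{k=0}^{y-1} \left[ \mathrm{Re}(A_{ik}) \cdot \mathrm{Re}(B_{kj}) + \mathrm{Im}(A_{ik}) \cdot \mathrm{Im}(B_{kj}) \right]$

                        $\mathrm{Im}(C_{ij}) = \sum_{k=0}^{y-1} \left[ \mathrm{Re}(A_{ik}) \cdot \mathrm{Im}(B_{kj}) - \mathrm{Re}(B_{kj}) \cdot \mathrm{Im}(A_{ik}) \right]$

                        for i = { 0, 1, …, x – 1 }
                        for j = { 0, 1, …, z – 1 }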

            SYNOPSIS    void ccmmul( a, x, y, b, z, c )
                        complex dm *a ; /*             Pointer to complex input matrix a[ ][ ]               */
                        int                x ; /*      Number of rows in complex matrix a[ ][ ]              */
                        int                y ; /*      Number of columns in matrix a[ ][ ] And               */
                                                  /*   Number of rows in complex matrix b[ ][ ]              */
                        complex dm *b ; /*             Pointer to complex input matrix b[ ][ ]               */
                        int                z ; /*      Number of columns in matrix b[ ][ ]                   */
                        complex dm *c ; /*             Pointer to complex output matrix c[ ][ ]              */

              DOMAIN    -3.4 x 10^38 to +3.4 x 10^38

           ACCURACY     7.75 decimal digits




    EXECUTION TIME     62 + ( 6 + ( 12 + 7 * Y ) * Z ) * X cycles

              NOTES    The file tccmmul.c included in the distribution diskette provides an example of this
                       function’s use.

                       Each entry below is a ( real, imaginary ) pair.

                       a[x][y] =   1, 1     2, 2     3, 3     4, 4
                                   5, 5     6, 6     7, 7     8, 8
                                   9, 9    10, 10   11, 11   12, 12

                       b[y][z] =   1, 2     3, 4     5, 6
                                   7, 8     9, 10   11, 12
                                  13, 14   15, 16   17, 18
                                  19, 20   21, 22   23, 24

                       x = 3, y = 4, z = 3 ;

                       ccmmul ( a, x, y, b, z, c ) ;

                       The resulting values in output matrix c[ ][ ] would be as follows:

                       c[x][z] = 270, 10    310, 10    350, 10
                                 606, 26    710, 26    814, 26
                                 942, 42   1110, 42   1278, 42

                       The storage methodology for matrices is by rows. Matrices can be thought of as one
                       long array (vector) where the beginning of each row is offset by the number of
                       columns.
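
                       The row-major storage and the conjugate product can be cross-checked with a short
                       reference sketch. It is not the library routine: ccmmul_ref is a hypothetical name,
                       C99 float complex stands in for the library's complex type, and the dm memory
                       qualifier is omitted.

```c
#include <assert.h>
#include <complex.h>

/* Reference sketch (not the library routine): c = conj(a) * b, where a is
 * x-by-y, b is y-by-z and c is x-by-z, all stored by rows in flat arrays.
 * Element (i, k) of a matrix with y columns sits at index i * y + k. */
void ccmmul_ref(const float complex *a, int x, int y,
                const float complex *b, int z, float complex *c)
{
    assert(x > 0 && y > 0 && z > 0);
    for (int i = 0; i < x; i++) {
        for (int j = 0; j < z; j++) {
            float complex acc = 0.0f;
            for (int k = 0; k < y; k++)
                acc += conjf(a[i * y + k]) * b[k * z + j];
            c[i * z + j] = acc;
        }
    }
}
```

                       Running the sketch on the a[ ][ ] and b[ ][ ] values of the example is one way to
                       check the tabulated results independently.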




 ccmsmul ( a, x, y, b, c )
                NAME    Complex Scalar-Complex Conjugate Matrix Multiplication

         DESCRIPTION    This function computes the multiplication of the conjugate of the complex input
                        matrix a[ ] [ ] times complex input scalar b. The dimensions of complex input matrix
                        a[ ] [ ] are x and y. The results are stored in complex output matrix c[ ] [ ], which is of
                        dimensions x and y.

         ALGORITHM      $C_{xy} = B \cdot A_{xy}^{*}$   ( * denotes the complex conjugate )

            SYNOPSIS    void ccmsmul( a, x, y, b, c )
                        complex dm *a ; /*           Pointer to complex input matrix a[ ][ ]                    */
                        int             x ; /*       Number of rows in complex matrix a[ ][ ]                   */
                        int             y ; /*       Number of columns in matrix a[ ][ ]                        */
                        complex dm *b ; /*           Pointer to complex input scalar b                          */
                        complex dm *c ; /*           Pointer to complex output matrix c[ ][ ]                   */

             DOMAIN     -3.4 x 10^38 to +3.4 x 10^38

          ACCURACY      7.75 decimal digits




   EXECUTION TIME      46 + 2 * X * Y cycles

              NOTES    The file tccmsmul.c included in the distribution diskette provides an example of this
                       function’s use.

                       Each entry below is a ( real, imaginary ) pair.

                       a[x][y] =    1, 2      3, 4      5, 6
                                    7, 8      9, 10    11, 12
                                   13, 14    15, 16    17, 18
                                   19, 20    21, 22    23, 24
                                   25, 26    27, 28    29, 30

                       b = { 8, 2 }

                       x = 5, y = 3 ;

                       ccmsmul ( a, x, y, b, c ) ;

                       The resulting values in output matrix c[ ][ ] would be as follows:

                       c[x][y] =   12, -14      32, -26      52, -38
                                   72, -50      92, -62     112, -74
                                  132, -86     152, -98     172, -110
                                  192, -122    212, -134    232, -146
                                  252, -158    272, -170    292, -182

                       The storage methodology for matrices is by rows. Matrices can be thought of as one
                       long array (vector) where the beginning of each row is offset by the number of
                       columns.
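
                       Because the matrix is stored by rows as one long vector, the operation reduces to a
                       single loop over x • y elements. The sketch below illustrates this under stated
                       assumptions: ccmsmul_ref is a hypothetical name, C99 float complex stands in for
                       the library's complex type, and this is not the library routine.

```c
#include <assert.h>
#include <complex.h>

/* Reference sketch (not the library routine): c = conj(a) * b, element by
 * element, where a and c are x-by-y matrices stored by rows and b points to
 * one complex scalar. */
void ccmsmul_ref(const float complex *a, int x, int y,
                 const float complex *b, float complex *c)
{
    assert(x > 0 && y > 0);
    for (int k = 0; k < x * y; k++)
        c[k] = conjf(a[k]) * (*b);
}
```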




 cccort ( a, b, c, m, n )
                NAME        Complex Cross-Correlation (Time Domain)

          DESCRIPTION       Computes the time domain cross-correlation of the complex elements stored in
                            complex input vectors a[ ] and b[ ]. The result is stored in complex output
                            vector c[ ]. Values m and n define the number of cross-correlation values to compute.
                            The implementation uses a time domain technique.

           ALGORITHM        $C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot B_j$,   i = { 0, 1, 2, …, m – 1 }


             SYNOPSIS       void cccort ( a, b, c, m, n )
                            complex dm *a ; /* Pointer to input vector a[ ]                                      */
                            complex dm *b ; /* Pointer to input vector b[ ]                                      */
                            complex dm *c ; /* Pointer to output vector c[ ]                                     */
                            int              m ; /* Lag count m                                                  */
                            int              n ; /* Number of elements in vectors a[ ] and b[ ]                  */

              DOMAIN        -3.4E+38 to 3.4E+38

           ACCURACY         7.75 decimal digits

       EXECUTION TIME       41 + ( 9 + 5 * n ) * m

                NOTES       The file tcccort.c included in the distribution tape provides an example of this func-
                            tion’s use.

                            Note that the lag count m must be less than or equal to the number of
                            elements n (i.e. m ≤ n ).

                            The strides of vectors a[ ], b[ ], and c[ ] must always be 1.




ccort ( a, b, c, m, n )
               NAME       Cross-Correlation (Time Domain)

        DESCRIPTION       Computes the time domain cross-correlation of the real elements stored in input
                          vectors a[ ] and b[ ]. The result is stored in real output vector c[ ]. Values m
                          and n define the number of cross-correlation values to compute. The implementation
                          uses a time domain technique.

         ALGORITHM        $C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot B_j$,   i = { 0, 1, 2, …, m – 1 }

            SYNOPSIS      void ccort ( a, b, c, m, n )
                          float      *a ; /* Pointer to input vector a[ ]                                        */
                          float      *b ; /* Pointer to input vector b[ ]                                        */
                          float      *c ; /* Pointer to output vector c[ ]                                       */
                          int         m ; /* Lag count m                                                         */
                          int         n ; /* Number of elements in vectors a[ ] and b[ ]                         */

             DOMAIN       -3.4E+38 to 3.4E+38

          ACCURACY        7.75 decimal digits

    EXECUTION TIME        32 + 9 * M + (M+1)(2*N-M)

               NOTES      The file tccort.c included in the distribution tape provides an example of this func-
                          tion’s use.

                          Note that the lag count m must be less than or equal to the number of
                          floating-point elements n (i.e. m ≤ n ).

                          The strides of vectors a, b, and c must always be 1.
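
                          The summation in the ALGORITHM entry can be written as two nested loops. The
                          sketch below is a plain reference for checking small cases, not the optimized
                          library routine; ccort_ref is a hypothetical name and unit strides are assumed,
                          as the note above requires.

```c
#include <assert.h>

/* Reference sketch (not the library routine): time domain cross-correlation
 * c[i] = sum over j = 0 .. n-i-1 of a[i + j] * b[j], for lags i = 0 .. m-1.
 * a and b hold n elements each; c receives m values. */
void ccort_ref(const float *a, const float *b, float *c, int m, int n)
{
    assert(m > 0 && m <= n);
    for (int i = 0; i < m; i++) {
        float acc = 0.0f;
        for (int j = 0; j < n - i; j++)
            acc += a[i + j] * b[j];
        c[i] = acc;
    }
}
```

                          Passing the same vector for a and b yields the auto-correlation computed by
                          acort( ).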




 cdesamp ( data, coeff, output, d, n, p )
                NAME    Complex Decimating Finite Impulse Response (FIR) Filter

         DESCRIPTION   The function computes the convolution of complex vectors data[ ] and coeff[ ],
                       placing the results in complex vector output[ ]. The number of output samples n and
                       the number of coefficients p may differ. n elements will be written to output[ ].

                       Complex vector data[ ] holds the real and imaginary (I and Q) components of the
                       input data. Likewise, complex vector coeff[ ] holds the real and imaginary (I and Q)
                       components of the coefficient data. A complex multiply and add is performed to
                       compute each convolution output. The decimation factor d is the number of elements
                       by which the starting point in data[ ] advances from one output sample to the next.

          ALGORITHM    $\mathrm{output}[i] = \sum_{j=0}^{p-1} \mathrm{data}[i \cdot d + j] \cdot \mathrm{coeff}[p - j - 1]$,   i = { 0, 1, 2, …, n – 1 }
            SYNOPSIS   void cdesamp ( data, coeff, output, d, n, p )
                       complex dm *data         ; /* Complex input data ( len (n-1)*d+p )                */
                       complex pm *coeff        ; /* Complex coefficients ( len p )                      */
                       complex dm *output ; /* Complex output data ( len n )                             */
                       int           d          ; /* Decimation factor                                   */
                       int            n         ; /* Number of output samples                            */
                       int            p         ; /* Number of coefficients                              */



             DOMAIN    -3.4E+38 to 3.4E+38

          ACCURACY     7.75 decimal digits




    EXECUTION TIME    36 + ( 7 + 5 * p ) * n cycles

              NOTES   The file tcdesamp.c included in the distribution tape provides an example of this func-
                      tion’s use.

                      The number of filter output samples to generate can be obtained as follows:

                        $n = \lfloor ( ndata - p ) / d \rfloor + 1$

                      where ndata is the number of elements in data[ ].

                      A complex correlation can be performed by reversing the order of the coefficients
                      vector.
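
                      The output count and the decimating convolution can be sketched together. This is
                      a reference for small test cases, not the library routine: cdesamp_outputs and
                      cdesamp_ref are hypothetical names, C99 float complex stands in for the library's
                      complex type, and the dm/pm memory qualifiers are omitted.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical helper: number of output samples available from ndata input
 * elements, p coefficients and decimation factor d (integer division). */
int cdesamp_outputs(int ndata, int p, int d)
{
    assert(d > 0 && p > 0 && ndata >= p);
    return (ndata - p) / d + 1;
}

/* Reference sketch (not the library routine):
 * output[i] = sum over j = 0 .. p-1 of data[i*d + j] * coeff[p - j - 1].
 * The highest input index read is (n-1)*d + p - 1. */
void cdesamp_ref(const float complex *data, const float complex *coeff,
                 float complex *output, int d, int n, int p)
{
    assert(d > 0 && n > 0 && p > 0);
    for (int i = 0; i < n; i++) {
        float complex acc = 0.0f;
        for (int j = 0; j < p; j++)
            acc += data[i * d + j] * coeff[p - j - 1];
        output[i] = acc;
    }
}
```

                      With ndata = 10, p = 3 and d = 2, for example, the helper gives n = 4 and the last
                      output reads data[ ] up to index 8.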




 cdotpr ( a, i, b, j, c, k, n )
                 NAME     Complex Dot Product

          DESCRIPTION     This function computes the complex dot product of complex input vector a and
                          complex input vector b and stores the results in complex output vector c. This can
                          alternatively be thought of as C = A • B.

           ALGORITHM      $\mathrm{Re}\{C\} = \sum_{m=0}^{n-1} \left( \mathrm{Re}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} - \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} \right)$

                          $\mathrm{Im}\{C\} = \sum_{m=0}^{n-1} \left( \mathrm{Re}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} + \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} \right)$
             SYNOPSIS     void cdotpr ( a, i, b, j, c, k, n )
                          complex *a ; /*       Pointer to complex input vector a                     */
                          int        i ; /*     Address stride in words for input vector a            */
                          complex *b ; /*       Pointer to complex input vector b                     */
                          int        j ; /*     Address stride in words for input vector b            */
                          complex *c ; /*       Pointer to complex output vector c                    */
                          int        k ; /*     Address stride in words for output vector c           */
                          int        n ; /*     Element count                                         */


               DOMAIN     -3.4 x 10^38 to +3.4 x 10^38

           ACCURACY       7.75 decimal digits

       EXECUTION TIME      64 + 4*(N-1) cycles

                 NOTES    The file tcdotpr.c included in the distribution diskette provides an example of this
                          function’s use.




ceil_wci ( x )
                 NAME    Round Up to Nearest Integer

        DESCRIPTION      This function computes the smallest integral value greater than or equal to the
                         floating-point number x. A floating-point representation of this integer value is returned.

         ALGORITHM       return = smallest int ≥ x
            SYNOPSIS     float ceil_wci ( float x )

             DOMAIN      -3.4E+38 to 3.40E+38

          ACCURACY       7.75 decimal digits

    EXECUTION TIME       20 cycles

                 NOTES   The file tceil.c included in the distribution tape provides an example of this function's
                         use.




 cfft ( xr, xi, wr, wi, wstr, yr, yi, n )
                  NAME     Fast Fourier Transform Of Complex Input Data

          DESCRIPTION      Computes the Fast Fourier Transform of the complex input elements stored in input
                           vectors xr and xi. The results are stored in output vectors yr and yi.

           ALGORITHM       $Y_m = \sum_{k=0}^{n-1} X_k \, e^{-i 2 \pi m k / n}$,   m = { 0, 1, 2, …, n – 1 }

                           where $X_k = xr[k] + i \cdot xi[k]$ and $Y_m = yr[m] + i \cdot yi[m]$.

              SYNOPSIS     void cfft ( xr, xi, wr, wi, wstr, yr, yi, n )
                           float     dm    *xr    ; /* Pointer to real input data                         */
                           float     dm   *xi     ; /* Pointer to imaginary input data                    */
                           float     pm    *wr    ; /* Pointer to cosine table                            */
                           float     dm    *wi    ; /* Pointer to sine table                              */
                           int              wstr ; /* Cosine/sine table stride                            */
                           float     dm    *yr    ; /* Pointer to real output data                        */
                           float     pm   *yi     ; /* Pointer to imaginary output data                   */
                           int              n     ; /* FFT Size (In Complex Elements)                     */

               DOMAIN      -3.4E+38 to 3.4E+38

            ACCURACY       7.75 decimal digits

       EXECUTION TIME      See the table of performance timings for complex FFTs below




                NOTES     This is a radix-2 Fast Fourier Transform using parallel data memory/program memory
                          data accesses to maximize the throughput on the 21020/60/62 processor. The number
                          of elements n supplied to the algorithm must be an integral power of two and a
                          minimum of 32.

                          The complex input data is separated into real and imaginary parts, xr and xi. Both
                          vectors must be in data memory, and each must be aligned on an address which is an
                          integer multiple of the FFT size n, as required for 21K bit-reverse addressing. The
                          imaginary data is bit-reversed into program memory at the beginning of the routine.

                          The complex output is separated into real and imaginary parts, yr and yi. These
                          vectors may have arbitrary address alignment; however, yr is in data memory and yi
                          is in program memory.

                          Vectors wr and wi are in program memory and data memory respectively and are
                          given the values:

                                  $wr[k] = \cos( 2 \pi k / ( wstr \cdot n ) )$,   k = { 0, 1, …, wstr • n ⁄ 2 – 1 }   (program memory)

                                  $wi[k] = \sin( 2 \pi k / ( wstr \cdot n ) )$,   k = { 0, 1, …, wstr • n ⁄ 2 – 1 }   (data memory)

                          The weight stride wstr allows cfft() to be called with varying sizes n from a single set
                          of weights. These weights are generated using the fftwts() function.

                          This precomputed FFT weight approach was implemented in order to ensure accurate
                          results and boost the available cfft() dynamic range to approximately 130 dB for
                          longer length (>16K) FFTs. This is accomplished by using an implementation that
                          does not rely on a recursive call to a sin/cosine approximation routine, as found in
                          other implementations. Rather, the FFT weights are precomputed accurately using the
                          fftwts() function. This is sufficient for A/D converters with bit lengths up to 22 bits.

                          Vectors yr and yi must each have a minimum size of n.

                          The file tcfft.c included in the distribution tape provides an example of this function’s
                          use.
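
                          For checking results on small inputs, the transform in the ALGORITHM entry can be
                          evaluated directly. The sketch below is a plain O(n^2) reference, not the library's
                          radix-2 routine. The name dft_ref is hypothetical, the dm/pm memory spaces and
                          alignment rules are ignored, and the weight tables are assumed to hold half a
                          circle of cosines and sines for a table of wstr • n points, as described above.

```c
#include <assert.h>

/* Reference sketch (not the library routine): direct DFT
 *   Y[m] = sum over k of X[k] * exp(-i*2*pi*m*k/n)
 * using tables wr[t] = cos(2*pi*t/(wstr*n)), wi[t] = sin(2*pi*t/(wstr*n)),
 * t = 0 .. wstr*n/2 - 1. */
void dft_ref(const float *xr, const float *xi,
             const float *wr, const float *wi, int wstr,
             float *yr, float *yi, int n)
{
    assert(n >= 2 && n % 2 == 0 && wstr >= 1);
    for (int m = 0; m < n; m++) {
        float sr = 0.0f, si = 0.0f;
        for (int k = 0; k < n; k++) {
            /* Position of the twiddle factor on the n-point circle. */
            int t = (int)(((long long)m * k) % n);
            float sign = 1.0f;
            if (t >= n / 2) {      /* second half-circle: adding pi negates */
                t -= n / 2;        /* both cosine and sine                  */
                sign = -1.0f;
            }
            float c = sign * wr[t * wstr];   /* cos(2*pi*m*k/n) */
            float s = sign * wi[t * wstr];   /* sin(2*pi*m*k/n) */
            /* (xr + i*xi) * (c - i*s) */
            sr += xr[k] * c + xi[k] * s;
            si += xi[k] * c - xr[k] * s;
        }
        yr[m] = sr;
        yi[m] = si;
    }
}
```

                          The table lookup with stride wstr mirrors how one set of weights can serve several
                          transform sizes.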




       SPECIAL NOTES       Users have sometimes reported problems when interrupt service routines (ISRs) are
                           used in conjunction with the FFT routines ( cfft( ), cffti( ), rfft( ), rffti( ) ).
                           Reports to the Wideband technical staff typically describe the Wideband routine
                           executing perfectly, but being unable to return to its exact state after being
                           interrupted by the ISR ( what is described as a “tumble into the weeds” ).

                           The Wideband Fast Fourier Transforms, both complex and real, forward and inverse,
                           use the built-in bit reversing and circular addressing capabilities of the SHARC
                           architecture. Other routines, such as some of the FIR filters, also use the SHARC’s
                           internal circular addressing capabilities.

                           End users are usually aware that their ISR is responsible for saving and restoring
                           the registers used by the Wideband routines. However, end users sometimes forget to
                           save and restore ( push and pop ) the Mode 1 register, which is associated with bit
                           reversing, and the B ( base ) and L ( length ) registers, which are associated with
                           circular addressing. When the ISR does not save and restore these registers, the
                           interrupted routine resumes without the proper length parameter ( L register ) used
                           for circular addressing or the proper mode ( Mode 1 register ) used in bit reversing.
                           This results in the strange manifestations users sometimes report.

                           To properly save and restore the above mentioned registers in an ISR, refer to page
                           4-21, section 4.3 of the Analog Devices ADSP-21000 Family C Tools Manual
                           (#31-000005-08, dated August 95), which references examples of inline assembly code
                           within C code to save and restore registers.

                           For a detailed review of the relationships between the various FFT functions and how
                           to use them with one another, see the final section of Chapter 4.




Performance Issues

The timings shown below for the 32-point to 4,096-point FFTs were measured using the
Analog Devices simulator.

Performance Timings For Complex FFTs

    Number of Points      Processor Cycles
             8            See cfft8( ) function
            16            See cfft16( ) function
            32                  771 Cycles
            64                1,274 Cycles
           128                2,368 Cycles
           256                4,724 Cycles
           512               10,060 Cycles
         1,024               21,618 Cycles
         2,048               46,744 Cycles
         4,096              101,054 Cycles
         8,192              217,828 Cycles
        16,384              467,722 Cycles
        32,768            1,000,240 Cycles
        65,536            2,130,774 Cycles




 cfft2d ( xr, xi, wr, wi, wstr, tmpdm, tmppm, n )
               NAME    Complex 2-Dimensional Fast Fourier Transform

         DESCRIPTION   Computes a 2-dimensional Fast Fourier Transform of the n x n complex input matrix
                       stored in vectors xr and xi. The results replace the input in xr and xi.

         ALGORITHM     $F_{R,C} = \sum_{r=0}^{n-1} \sum_{c=0}^{n-1} x_{r,c} \, e^{-2 \pi j ( r \cdot R + c \cdot C ) / n}$

                       R = { 0, 1, …, n – 1 }
                       C = { 0, 1, …, n – 1 }
            SYNOPSIS   void cfft2d ( xr, xi, wr, wi, wstr, tmpdm, tmppm, n )
                       float    dm *xr        ; /* Pointer to real input/output data                   */
                       float    dm *xi        ; /* Pointer to imaginary input/output data              */
                       float    pm *wr        ; /* Pointer to cosine table                             */
                       float    dm *wi        ; /* Pointer to sine table                               */
                       int          wstr      ; /* Cosine/sine table stride                            */
                       float    dm *tmpdm     ; /* Pointer to temporary vector in data memory          */
                       float    pm *tmppm     ; /* Pointer to temporary vector in program memory       */
                       int          n         ; /* CFFT2D size (complex elements n x n)                */

             DOMAIN    -3.4E+38 to 3.4E+38

          ACCURACY     7.75 decimal digits




    EXECUTION TIME    32 x 32 Pts.            44,532 cycles

                      64 x 64 Pts.           165,364 cycles

                      128 x 128 Pts.         659,572 cycles




               NOTES   The input data is an nxn complex matric x separated into real and imaginary parts xr
                       and xi stored as follows:

                        Re ( x r, c ) = xr [ r • n + c ]
                       r = { 0, 1, …, n – 1 }       c = { 0, 1, …, n – 1 }

                        Im ( x r, c ) = xi [ r • n + c ]
                       r = { 0, 1, …, n – 1 }       c = { 0, 1, …, n – 1 }

                       Variables r and c are the row and column numbers.

                       The DFT output replaces the input, and is stored as follows:

                       Re ( F R, C ) = xr [ R • n + C ]
                       R = { 0, 1, …, n – 1 }     C = { 0, 1, …, n – 1 }

                       Im ( F R, C ) = xi [ R • n + C ]
                       R = { 0, 1, …, n – 1 }     C = { 0, 1, …, n – 1 }

                       A radix-2 Fast Fourier Transform (FFT) algorithm is used to compute the individual
                       row and column DFTs.

                       The number of elements n must be an integral power of two and a minimum of 32.

                       Vectors xr and xi must be in data memory and are adress-aligned to an integral multi-
                       ple of n.

                       Vectors wr and wi must be in program memory and data memory respectively and are
                       pre-computed to be:

                       wr [ k ] = cos [ 2πk ⁄ wst*n ]                 k = ( 0, 1, …, wstn ⁄ 2 – 1 )

                       wi [ k ] = sin [ 2πk ⁄ wst*n ]                k = ( 0, 1, …, wstn ⁄ 2 – 1 )

                       Vector tmpdm must be in data memory, having a minimum size of n, and be address-
                       aligned to an integral multiple of n.

                       Vector tmppm must be in program memory, have a minimum size of n, and be
                       address-aligned to an integral multiple of n.

                       The file tcfft2d.c included in the distribution tape provides an example of this
                       function’s use.






cfft8 ( xr, xi, yr, yi )
                 NAME      8-Point Complex Fast Fourier Transform (Inline)

         DESCRIPTION       Computes the Fast Fourier Transform of the complex input elements stored in input
                           vectors xr and xi. The results are stored in output vectors yr and yi.



          ALGORITHM        $Y_m = \sum_{k=0}^{7} X_k \, e^{-2\pi j (m \cdot k / 8)}$        $m = \{0, 1, 2, \ldots, 7\}$

             SYNOPSIS      void cfft8 ( xr, xi, yr, yi )
                           float     dm    *xr     ; /* Pointer to real input data                         */
                           float     dm   *xi     ; /* Pointer to imaginary input data                     */
                           float     dm    *yr    ; /* Pointer to real output data                         */
                           float     pm   *yi     ; /* Pointer to imaginary output data                    */

               DOMAIN      -3.4E+38 to 3.4E+38

           ACCURACY        7.75 decimal digits








 cfft8 ( xr, xi, yr, yi )
       EXECUTION TIME       184 Cycles

                  NOTES     This is an 8-point radix-2 Fast Fourier Transform using parallel data memory/program
                            memory data accesses to maximize the throughput on the 21020/60/62 processor.

                            The complex input data is separated into real and imaginary parts, xr and xi. These
                            vectors must be aligned on an address which is an integer multiple of the FFT
                            size, as required for 21K bit-reverse addressing. The input vectors are both in data
                            memory; the imaginary data is bit-reversed into program memory at the beginning of
                            the routine.

                            This algorithm uses a decimation-in-time approach. Because the cffti( ) function requires
                            a minimum of 32 points as input, there is no corresponding inverse algorithm for this
                            routine. The complex output is separated into real and imaginary parts, yr and yi.
                            These vectors may have arbitrary address alignment; however, yr is in data memory
                            and yi is in program memory.
                                • Vectors xr and xi are defined in cfft8dta.asm using the dm_align segment to
                                  ensure address alignment.

                            For a detailed review of the relationships between the various FFT functions and how
                            to use them with one another, see the final section of Chapter 4.

                            The file tcfft8.c included in the distribution tape provides an example of this
                            function’s use.








cfft16 ( xr, xi, yr, yi )
                 NAME       16-Point Complex Fast Fourier Transform (Inline)

         DESCRIPTION        Computes the Fast Fourier Transform of the complex input elements stored in input
                            vectors xr and xi. The results are stored in output vectors yr and yi.

          ALGORITHM         $Y_m = \sum_{k=0}^{15} X_k \, e^{-2\pi j (m \cdot k / 16)}$        $m = \{0, 1, 2, \ldots, 15\}$

             SYNOPSIS       void cfft16 ( xr, xi, yr, yi )
                            float     dm    *xr     ; /* Pointer to real input data                         */
                            float     dm    *xi    ; /* Pointer to imaginary input data                     */
                            float     dm    *yr    ; /* Pointer to real output data                         */
                            float     pm    *yi    ; /* Pointer to imaginary output data                    */

              DOMAIN        -3.4E+38 to 3.4E+38

           ACCURACY         7.75 decimal digits








 cfft16 ( xr, xi, yr, yi )
       EXECUTION TIME        388 Cycles

                 NOTES       This is a 16-point radix-2 Fast Fourier Transform using parallel data memory/program
                             memory data accesses to maximize the throughput on the 21020/60/62 processor.

                             The complex input data is separated into real and imaginary parts, xr and xi. These
                             vectors must be aligned on an address which is an integer multiple of the FFT
                             size, as required for 21K bit-reverse addressing. The input vectors are both in data
                             memory; the imaginary data is bit-reversed into program memory at the beginning of
                             the routine.

                             This algorithm uses a decimation-in-time approach. Because the cffti( ) function requires
                             a minimum of 32 points as input, there is no corresponding inverse algorithm for this
                             routine. The complex output is separated into real and imaginary parts, yr and yi.
                             These vectors may have arbitrary address alignment; however, yr is in data memory
                             and yi is in program memory.
                                 • Vectors xr and xi are defined in cfft16dt.asm using the dm_align segment to
                                   ensure address alignment.

                             For a detailed review of the relationships between the various FFT functions and how
                             to use them with one another, see the final section of Chapter 4.

                             The file tcfft16.c included in the distribution tape provides an example of this
                             function’s use.








cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
                 NAME     Inverse Complex FFT

         DESCRIPTION      Computes the Inverse Fast Fourier Transform of the input elements stored in vectors
                          xr and xi. The results are stored in complex output vector c (real part in yr, imaginary
                          part in yi). Note that the Inverse FFT is the same as the Forward FFT except that the
                          sign of the imaginary components of the twiddle factors is negated. The Inverse FFT
                          swaps the real and imaginary input data, performs the Forward FFT with the same weights
                          table, and swaps the real and imaginary output data. Scaling by 1/n is then performed.

          ALGORITHM       $C_m = \frac{1}{n} \sum_{k=0}^{n-1} A_k \, e^{i 2\pi m k / n}$        $m = \{0, 1, 2, \ldots, n-1\}$

             SYNOPSIS     void cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
                          float     dm    *xr     ; /* Pointer to real input data                           */
                          float     dm    *xi     ; /* Pointer to imaginary input data                      */
                          float     pm    *wr     ; /* Pointer to cosine table                              */
                          float     dm    *wi     ; /* Pointer to sine table                                */
                          int              wstr ; /* Cosine/sine table stride                               */
                          float     dm    *yr     ; /* Pointer to real output data                          */
                          float     pm    *yi     ; /* Pointer to imaginary output data                     */
                          int              n      ; /* FFT Size (In Complex Elements)                       */

              DOMAIN      -3.4E+38 to 3.4E+38

           ACCURACY       7.75 decimal digits

     EXECUTION TIME       22,650 Cycles @ 1,024 Points - Data and Program In On-Board Cache








 cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
                 NOTES     This is a radix-2 inverse Fast Fourier Transform using parallel DM/PM data accesses
                           to maximize the throughput on the 21020 processor. The complex input data is
                           separated into real and imaginary parts, xr and xi. These vectors must be aligned on an
                           address which is an integer multiple of the FFT size, as required for 21K
                           bit-reverse addressing. The input vectors are both in DM; the imaginary data is
                           bit-reversed into PM at the beginning of the routine. The number of elements n must be
                           an integral power of two and a minimum of 32.

                           The complex output is separated into real and imaginary parts, yr and yi. These
                           vectors may have arbitrary address alignment; however, yr is in DM and yi is in
                           PM. Vectors xr and xi must be in data memory and each must be aligned to an integral
                           multiple of n.

                           Vectors wr and wi are in program memory and data memory respectively and are
                           given the values:

                             $wr[k] = \cos\left( 2\pi k / (wst \cdot n) \right)$      $k = \{0, 1, \ldots, wst \cdot n / 2 - 1\}$

                             $wi[k] = \sin\left( 2\pi k / (wst \cdot n) \right)$      $k = \{0, 1, \ldots, wst \cdot n / 2 - 1\}$

                           The weight stride, wst, allows for calling cfft() with varying sizes n from a single set
                           of weights. These weights are generated using the fftwts( ) function.

                           Vector yr is in data memory and has a minimum size of n. Vector yi is in program
                           memory and has a minimum size of n.

                           The file tcfft.c included in the distribution tape provides an example of this function’s
                           use.








cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
       SPECIAL NOTES      Users have occasionally reported problems when interrupt service routines (ISRs)
                          are used in conjunction with the FFT routines ( cfft( ), cffti( ), rfft( ), rffti( ) ).
                          Reports made to the Wideband technical staff typically describe the Wideband routine
                          executing perfectly but being unable to return to its exact state after being
                          interrupted by the ISR ( a “tumble into the weeds” ).

                          The Wideband Fast Fourier Transforms, both complex and real, forward and inverse,
                          use the built-in bit-reversing and circular addressing capabilities of the SHARC
                          architecture. Other routines, such as some of the FIR filters, also use the SHARC’s
                          internal circular addressing capabilities.

                          End users are usually aware that their ISR is responsible for saving and restoring
                          the registers used by the Wideband routines. However, end users sometimes forget to
                          save and restore ( push and pop ) the Mode 1 register, which is associated with
                          bit reversing, and the B ( base ) and L ( length ) registers, which are associated
                          with circular addressing. When these registers are not saved and restored by the ISR,
                          the interrupted routine resumes without the proper length parameter ( L register )
                          used for circular addressing or the proper mode ( Mode 1 register ) used for bit
                          reversing. This results in the strange manifestations users sometimes report.

                          To properly save and restore the above-mentioned registers in an ISR, refer to page
                          4-21, section 4.3, of the Analog Devices ADSP-21000 Family C Tools Manual
                          (#31-000005-08, dated August 95), which references examples of inline assembly code
                          within C code to save and restore registers.

                          For a detailed review of the relationships between the various FFT functions and how
                          to use them with one another, see the final section of Chapter 4.








TABLE 8   Table of Inverse Complex FFT Timing

          Number          Processor
          of Points       Cycles
              32                868
              64              1,435
             128              2,657
             256              5,319
             512             11,117
           1,024             23,699
           2,048             50,873
           4,096            109,281
           8,192            234,244
          16,384            500,525
          32,768          1,072,560
          65,536          2,288,128








cfir ( ii, iq, ci, cq, oi, oq, d, n, p )
                 NAME     Complex Finite Impulse Response Filter

         DESCRIPTION      The function cfir( ) convolves the complex input data held in ii[ ] and iq[ ] with
                          the complex coefficients held in ci[ ] and cq[ ], placing the real and imaginary
                          results in oi[ ] and oq[ ] respectively. The number of output samples n and the
                          number of coefficients p may be dissimilar. n elements will be written to oi[ ] and
                          oq[ ].

                          The vectors ii[ ] and iq[ ] represent the real and imaginary (I and Q) components of the
                          input data respectively. Likewise, the vectors ci[ ] and cq[ ] represent the real and
                          imaginary (I and Q) components of the coefficient data. A complex multiply and add
                          is performed to compute the convolutional output. The decimation factor d is used to
                          stride the next starting ii[ ] and iq[ ] data.

          ALGORITHM       $c[m] = \sum_{j=0}^{p-1} a[m \cdot d + j] \cdot b[p - j - 1]$        $m = \{0, 1, 2, \ldots, n-1\}$
                          where
                          a[ ] comprises complex components ii[ ] and iq[ ]
                          b[ ] comprises complex components ci[ ] and cq[ ]
                          c[ ] comprises complex components oi[ ] and oq[ ]
             SYNOPSIS     void cfir ( ii, iq, ci, cq, oi, oq, d, n, p )
                          float     dm    *ii ; /* Input samples for I data  ( len n+p-1 )                  */
                          float     dm    *iq ; /* Input samples for Q data  ( len n+p-1 )                  */
                          float     pm    *ci ; /* Coefficients for I data   ( len p     )                  */
                          float     pm    *cq ; /* Coefficients for Q data   ( len p     )                  */
                          float     dm    *oi ; /* Output samples for I data ( len n     )                  */
                          float     dm    *oq ; /* Output samples for Q data ( len n     )                  */
                          int              d  ; /* Decimation factor                                        */
                          int              n  ; /* Number of output samples                                 */
                          int              p  ; /* Number of coefficients                                   */








 cfir ( ii, iq, ci, cq, oi, oq, d, n, p )
               DOMAIN      -3.4E+38 to 3.4E+38

            ACCURACY       7.75 decimal digits

        EXECUTION TIME     59 + ( 9 + 5 * p ) * n cycles

                 NOTES     The file tfir.c included in the distribution tape provides an example of this function’s
                           use.

                           The number of filter output samples to generate can be obtained as follows:

                           $n = \frac{ndata - p}{d + 1}$

                           where ndata is the number of elements in ii[ ] and iq[ ].

                           A correlation can be performed by reversing the order of the coefficients vector.
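
                           The complex multiply-and-add described for cfir( ) can be sketched in portable C
                           as below. The name ref_cfir( ) is hypothetical, and the way the decimation factor
                           d enters the input index (sample m starts at offset m • d) is an assumption taken
                           from the ALGORITHM entry; the library's exact convention should be confirmed
                           against its example file. The dm/pm qualifiers are omitted.

```c
#include <assert.h>

/* Hypothetical reference routine for a decimating complex FIR:
 * c[m] = sum over j of a[m*d + j] * b[p - j - 1], with
 * a = ii + j*iq, b = ci + j*cq, c = oi + j*oq.
 * The caller must supply at least (n - 1) * d + p input samples. */
static void ref_cfir(const float *ii, const float *iq,
                     const float *ci, const float *cq,
                     float *oi, float *oq, int d, int n, int p)
{
    int m, j;

    assert(d >= 1 && n >= 0 && p >= 1);
    for (m = 0; m < n; m++) {
        float acc_i = 0.0f, acc_q = 0.0f;
        for (j = 0; j < p; j++) {
            float ar = ii[m * d + j], ai = iq[m * d + j];
            float br = ci[p - j - 1], bi = cq[p - j - 1];
            /* Complex multiply (ar + j*ai) * (br + j*bi), accumulated */
            acc_i += ar * br - ai * bi;
            acc_q += ar * bi + ai * br;
        }
        oi[m] = acc_i;
        oq[m] = acc_q;
    }
}
```

                           Because the coefficients are indexed as b[p - j - 1], the sum is a convolution;
                           passing the coefficients in reversed order turns it into a correlation, as noted
                           above.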








chksum ( a, i, type, n )
              NAME    Perform Checksum

       DESCRIPTION    This function performs a checksum on a memory block. The memory block is defined
                      by the start address a offset by n. The type flag determines whether dm or pm memory
                      is tested ( 1 = dm, 0 = pm).

        ALGORITHM     Return ⇐ Checksum

           SYNOPSIS   void chksum ( a, i, type, n )
                      int        a ; /*     Start address of memory                                      */
                      int        i ; /*     Memory Stride                                                */
                      int    type ; /*      Type of memory to test ( dm or pm )                          */
                      int        n ; /*     Length of block to be checked                                */



            DOMAIN    -3.4E+38 to +3.4E+38

         ACCURACY     7.75 decimal digits

   EXECUTION TIME     17 + 2 * N cycles

              NOTES   The file tchksum.c included in the distribution diskette provides an example of this
                      function’s use.

                      chksum( ) performs a two’s complement on the sum of the elements within the memory
                      block. The checksum value is returned.
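
                      A host-side model of this computation might look like the sketch below. It is an
                      illustration only: ref_chksum( ) is a hypothetical name, the words are assumed to be
                      32 bits wide with wraparound addition, and the dm/pm type flag is dropped because a
                      host has a single address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the checksum: sums n words taken every
 * `stride` words from the block, then returns the two's complement
 * of that sum.  Unsigned arithmetic wraps modulo 2^32. */
static uint32_t ref_chksum(const uint32_t *block, int stride, int n)
{
    uint32_t sum = 0;
    int k;

    assert(block != NULL && stride >= 1 && n >= 0);
    for (k = 0; k < n; k++)
        sum += block[(size_t)k * (size_t)stride];
    return (uint32_t)(0u - sum);   /* two's complement of the sum */
}
```

                      Under these assumptions, adding the returned value to the sum of the checked words
                      gives zero, which is the usual way such a checksum is verified.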








 cmadd ( a, b, x, y, c )
                NAME       Complex Matrix Addition

         DESCRIPTION       This function computes the addition of complex input matrix a with complex input
                           matrix b and stores the results to complex output matrix c.

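          ALGORITHM
                            $$
                            \begin{bmatrix} C_{ri11} & C_{ri12} & C_{ri13} \\ C_{ri21} & C_{ri22} & C_{ri23} \end{bmatrix}
                            =
                            \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix}
                            +
                            \begin{bmatrix} B_{ri11} & B_{ri12} & B_{ri13} \\ B_{ri21} & B_{ri22} & B_{ri23} \end{bmatrix}
                            $$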


                           where ri indicates a real and imaginary component

            SYNOPSIS       void cmadd ( a, b, x, y, c )
                           complex dm *a ; /*           Pointer to input matrix a [ ][ ]                      */
                           complex dm *b ; /*           Pointer to input matrix b [ ][ ]                      */
                           int            x ; /*    Number of rows in matrices a[ ][ ] and b[ ][ ]            */
                           int            y ; /*    Number of columns in matrices a[ ][ ] and b[ ][ ]         */
                           complex dm *c ; /*           Pointer to output matrix c [ ][ ]                     */

             DOMAIN        -3.4E+38 to +3.4E+38

          ACCURACY         7.75 decimal digits








cmadd ( a, b, x, y, c )
    EXECUTION TIME        32 + 3*X*Y cycles

              NOTES       The file tcmadd.c included in the distribution diskette provides an example of this
                          function’s use.

                          The addition of a complex matrix is mathematically expressed as follows:

                           Real C [ x ] [ y ] = A [ x ] [ y ] Real + B [ x ] [ y ] Real

                           Imaginary C [ x ] [ y ] = A [ x ] [ y ] Imaginary + B [ x ] [ y ] Imaginary

                          An example of the addition of one complex matrix to another is as follows:

                          A [x] [y] =
                                        1, 2          3.8, 1.7      8.8, 5.5       9.9, 14
                                      7.1, 5          9.3, 1.6      0.4, 1          51, 3.3
                                      0.9, 1            8, 5        2.1, 6        -3.1, -1
                                      9.3, 1          2.5, 1.5      6.9, 9          10, 22.1
                                      1.3, 1.4        0.2, 4.5      0.9, 51.4      1.5, 4.4
                                      9.2, 4          7.8, 1.7       61, 3.4      14.3, 1.4

                          B [x] [y] =
                                      3.2, 1          8.8, 2        9.9, 3        44.3, 13.3
                                      8.1, 4          6.5, 5        3.2, 6        -2.3, -9.9
                                      8.9, 7          2.8, 8        1.7, 9        -8.1, -2.2
                                      6.4, 10          11, 1.3       12, 4.5      22.9, -5.4
                                      6.5, 7          2.1, 8        2.2, 9          32, 9.8
                                      1.1, 4          7.7, 5        4.4, 6        -2.1, -0.3

                          x=6, y=4

                          C [x] [y] =
                                      4.2, 3         12.6, 3.7     18.7, 8.5      54.2, 27.3
                                     15.2, 9         15.8, 6.6      3.6, 7        48.7, -6.6
                                      9.8, 8         10.8, 13       3.8, 15      -11.2, -3.2
                                     15.7, 11        13.5, 2.8     18.9, 13.5     32.9, 16.7
                                      7.8, 8.4        2.3, 12.5     3.1, 60.4     33.5, 14.2
                                     10.3, 8         15.5, 6.7     65.4, 9.4      12.2, 1.1
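
                          The element-wise addition shown in the example can be sketched in portable C as
                          follows. The type complex_t and the name ref_cmadd( ) are hypothetical; the layout
                          of the library's own complex type (assumed here to be a real part followed by an
                          imaginary part) is an assumption, and the dm qualifier is omitted.

```c
#include <assert.h>

/* Assumed layout for a complex element: real part, then imaginary part. */
typedef struct { float re; float im; } complex_t;

/* Hypothetical reference routine: c = a + b for x-by-y complex
 * matrices stored by rows as one contiguous array of x * y elements. */
static void ref_cmadd(const complex_t *a, const complex_t *b,
                      int x, int y, complex_t *c)
{
    int k, count = x * y;

    assert(a && b && c && x > 0 && y > 0);
    for (k = 0; k < count; k++) {
        c[k].re = a[k].re + b[k].re;   /* real parts add      */
        c[k].im = a[k].im + b[k].im;   /* imaginary parts add */
    }
}
```

                          Because addition is element-wise, the routine never needs the row and column
                          counts separately, only their product.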








 cmmov ( a, x, y, b )
                 NAME    Complex Matrix Move

           DESCRIPTION   This function moves a source complex input matrix a to a destination complex output
                         matrix b.

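            ALGORITHM
                         $$
                         \begin{bmatrix} B_{ri11} & B_{ri12} & B_{ri13} \\ B_{ri21} & B_{ri22} & B_{ri23} \end{bmatrix}
                         \Leftarrow
                         \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix}
                         $$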


                         where ri indicates a real and imaginary component

              SYNOPSIS   void cmmov ( a, x, y, b )
                         complex dm *a ; /*           Pointer to input matrix a [ ][ ]                         */
                         int            x ; /*        Number of rows in matrix a[ ][ ]                         */
                         int            y ; /*        Number of columns in matrix a[ ][ ]                      */
                         complex dm *b ; /*           Pointer to output matrix b [ ][ ]                        */



               DOMAIN    -3.4E+38 to +3.4E+38

            ACCURACY     7.75 decimal digits

        EXECUTION TIME   13 + ( 2 * X * Y ) cycles

                NOTES    The file tcmmov.c included in the distribution diskette provides an example of this
                         function’s use.

                         The storage methodology for matrices is by rows. Matrices can be thought of as one
                         long array (vector) in which each row begins one row length (the number of columns,
                         y) after the start of the previous row.
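
                         The row-by-row storage described above can be illustrated with the sketch below,
                         which maps a (row, column) pair to a position in the long array and copies a
                         matrix element by element. The names complex_t, row_major_index( ) and
                         ref_cmmov( ) are hypothetical, and the layout of the complex type is an
                         assumption.

```c
#include <assert.h>

/* Assumed layout for a complex element: real part, then imaginary part. */
typedef struct { float re; float im; } complex_t;

/* Position of element (r, c) in a matrix with y columns stored by rows. */
static int row_major_index(int r, int c, int y)
{
    return r * y + c;
}

/* Hypothetical reference routine: copies the x-by-y matrix a into b. */
static void ref_cmmov(const complex_t *a, int x, int y, complex_t *b)
{
    int r, c;

    assert(a && b && x > 0 && y > 0);
    for (r = 0; r < x; r++)
        for (c = 0; c < y; c++)
            b[row_major_index(r, c, y)] = a[row_major_index(r, c, y)];
}
```

                         For a matrix with 2 rows and 3 columns, element (1, 2) sits at position 5, the last
                         slot of the six-element array.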








cmmul ( a, x, y, b, z, c )
               NAME    Complex Matrix Multiplication

        DESCRIPTION    This function computes the multiplication of complex input matrix a times complex
                       input matrix b and stores the results to complex output matrix c. Complex input
                       matrix a [ ] [ ] has dimension x by y and complex input matrix b [ ] [ ] has
                       dimension y by z. The resulting complex output matrix c [ ] [ ] has dimension x by z.

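         ALGORITHM
                         $$
                         \begin{bmatrix} C_{ri11} & C_{ri12} \\ C_{ri21} & C_{ri22} \end{bmatrix}
                         =
                         \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix}
                         \bullet
                         \begin{bmatrix} B_{ri11} & B_{ri12} \\ B_{ri21} & B_{ri22} \\ B_{ri31} & B_{ri32} \end{bmatrix}
                         $$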
                        where ri indicates a real and imaginary component

           SYNOPSIS    void cmmul ( a, x, y, b, z, c )
                       complex dm *a ; /*           Pointer to input matrix a [ ][ ]                        */
                       int            x ; /*        Number of rows in matrix a[ ][ ]                        */
                       int            y ; /*        Number of columns in matrix a[ ][ ]                     */
                                             /*   Number of rows in matrix b[ ][ ]                          */
                       complex dm *b ; /*           Pointer to input matrix b [ ][ ]                        */
                       int            z ; /*        Number of columns in matrix b[ ][ ]                     */
                       complex dm *c ; /*           Pointer to output matrix c [ ][ ]                       */

             DOMAIN    -3.4E+38 to +3.4E+38

          ACCURACY     7.75 decimal digits








 cmmul ( a, x, y, b, z, c )
        EXECUTION TIME   45 + (4 + ( 10 + 5 * Y) * Z) * X cycles

                NOTES    The file tcmmul.c included in the distribution diskette provides an example of this
                         function’s use.

                         The multiplication of complex matrices is as follows:

                          $C[x][z] = \sum_{k=1}^{y} ( \text{Real Sum} + \text{Imaginary Sum} )$       where

                          $\text{Real Sum} = A_{Real} \cdot B_{Real} - A_{Imaginary} \cdot B_{Imaginary}$

                          $\text{Imaginary Sum} = A_{Real} \cdot B_{Imaginary} + B_{Real} \cdot A_{Imaginary}$


                         The storage methodology for matrices is by rows. Matrices can be thought of as one
                         long array (vector) in which each row begins one row length (the number of columns)
                         after the start of the previous row.

                         The first row of a [ ] [ ] times the first column of b [ ] [ ] is the first element of c [ ] [ ]
                         (row 1, column 1). The first row of a [ ] [ ] times the second column of b [ ] [ ] is the
                         second element of c [ ] [ ] (row 1, column 2 ) ... etc.

                         This algorithm follows the general law of matrix multiplication whereby the number
                         of columns of input matrix a must equal the number of rows of input matrix b.

                         ri indicates that each component of the matrix is composed of a complex number
                         which has both a real and imaginary component.

                         An example of the multiplication of one complex matrix by another is as follows:

                          A [x] [y] =    1, 1     2, 2      3, 3      4, 4
                                         5, 5     6, 6      7, 7      8, 8
                                         9, 9    10, 10    11, 11    12, 12

                          B [y] [z] =    1, 2      3, 4      5, 6
                                         7, 8      9, 10    11, 12
                                        13, 14    15, 16    17, 18
                                        19, 20    21, 22    23, 24

                         x=3, y=4, z=3

                          C [x] [z] =   -10, 270    -10, 310     -10, 350
                                        -26, 606    -26, 710     -26, 814
                                        -42, 942    -42, 1110    -42, 1278
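
                         The example above can be reproduced on a host machine with a plain triple loop, as
                         sketched below. The names complex_t and ref_cmmul( ) are hypothetical, the layout
                         of the complex type is an assumption, and the loop is a direct transcription of
                         the real and imaginary sums rather than the library's optimized code.

```c
#include <assert.h>

/* Assumed layout for a complex element: real part, then imaginary part. */
typedef struct { float re; float im; } complex_t;

/* Hypothetical reference routine: c (x by z) = a (x by y) times
 * b (y by z), with all three matrices stored by rows. */
static void ref_cmmul(const complex_t *a, int x, int y,
                      const complex_t *b, int z, complex_t *c)
{
    int r, col, k;

    assert(a && b && c && x > 0 && y > 0 && z > 0);
    for (r = 0; r < x; r++) {
        for (col = 0; col < z; col++) {
            float sum_re = 0.0f, sum_im = 0.0f;
            for (k = 0; k < y; k++) {
                complex_t p = a[r * y + k];      /* row r of a      */
                complex_t q = b[k * z + col];    /* column col of b */
                sum_re += p.re * q.re - p.im * q.im;   /* Real Sum      */
                sum_im += p.re * q.im + q.re * p.im;   /* Imaginary Sum */
            }
            c[r * z + col].re = sum_re;
            c[r * z + col].im = sum_im;
        }
    }
}
```

                         Called with the A and B matrices of the example (x=3, y=4, z=3), the first output
                         element is -10, 270 and the last is -42, 1278.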








cmmul_dpd ( a, x, y, b, z, c )
              NAME    Complex Matrix Multiplication (Data Memory x Program Memory to Data Memory)

       DESCRIPTION    This function computes the multiplication of complex input matrix a[ ] (in data
                      memory) times complex input matrix b[ ] (in program memory) and stores the results to
                      complex output matrix c[ ] (in data memory). Complex input matrix a [ ] [ ] has
                      dimension x by y and complex input matrix b [ ] [ ] has dimension y by z. The resulting
                      complex output matrix c [ ] [ ] has dimension x by z.

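        ALGORITHM
                       $$
                       \begin{bmatrix} C_{ri11} & C_{ri12} \\ C_{ri21} & C_{ri22} \end{bmatrix}
                       =
                       \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix}
                       \bullet
                       \begin{bmatrix} B_{ri11} & B_{ri12} \\ B_{ri21} & B_{ri22} \\ B_{ri31} & B_{ri32} \end{bmatrix}
                       $$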
                      where ri indicates a real and imaginary component

           SYNOPSIS   void cmmul_dpd ( a, x, y, b, z, c )
                      complex dm *a ;   /* Pointer to input matrix a[ ][ ]       */
                      int         x ;   /* Number of rows in matrix a[ ][ ]      */
                      int         y ;   /* Number of columns in matrix a[ ][ ]   */
                                        /* Number of rows in matrix b[ ][ ]      */
                      complex pm *b ;   /* Pointer to input matrix b[ ][ ]       */
                      int         z ;   /* Number of columns in matrix b[ ][ ]   */
                      complex dm *c ;   /* Pointer to output matrix c[ ][ ]      */

            DOMAIN    -3.4 x 10^38 to +3.4 x 10^38

         ACCURACY     7.75 decimal digits
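
                      The multiplication described above can be sketched in portable C as a reference
                      for checking results. This is not the optimized library routine: the name
                      cmmul_ref and the type cplx are hypothetical, row-major storage with interleaved
                      real and imaginary parts is an assumption, and the dm and pm memory qualifiers
                      are omitted because they are specific to the ADSP-21K compiler.

```c
/* Assumed element layout: interleaved real and imaginary parts. */
typedef struct {
    float re;
    float im;
} cplx;

/* Hypothetical reference for c = a * b.
 * a has x rows and y columns, b has y rows and z columns,
 * c has x rows and z columns. Row-major storage is assumed. */
static void cmmul_ref(const cplx *a, int x, int y,
                      const cplx *b, int z, cplx *c)
{
    for (int row = 0; row < x; ++row) {
        for (int col = 0; col < z; ++col) {
            float re = 0.0f;
            float im = 0.0f;
            for (int k = 0; k < y; ++k) {
                cplx p = a[row * y + k];   /* element of a at (row, k) */
                cplx q = b[k * z + col];   /* element of b at (k, col) */
                /* Complex multiply-accumulate: (p.re + j p.im)(q.re + j q.im) */
                re += p.re * q.re - p.im * q.im;
                im += p.re * q.im + p.im * q.re;
            }
            c[row * z + col].re = re;
            c[row * z + col].im = im;
        }
    }
}
```

                      As a usage check, calling cmmul_ref with x = 3, y = 4, z = 3, a 3 by 4 input
                      whose elements run from (1, 1) to (12, 12), and a 4 by 3 input whose elements
                      run from (1, 2) to (23, 24) gives -10, 270 for the first output element and
                      -42, 1278 for the last.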





Chap5 - ADSP 21K Manual

  • 1. ADSP-21K Optimized DSP Library User’s Manual CHAPTER 5 Function Descriptions For The ADSP-21K Optimized DSP Library Each function described in the following pages includes the following topics in order to better understand its use: • Name • Description of the function's operation • The algorithm as applicable • Synopsis of function prototype • Domain valid for arguments • Accuracy of the returned value(s) • Execution time in machine cycles • Notes applicable to this function Wideband Computers, Inc. 5-55
  • 2. ADSP-21K Optimized DSP Library User’s Manual acort ( a, c, m, n ) NAME Auto-correlation (Time Domain) DESCRIPTION Computes the time domain auto-correlation of the real elements stored in input vector a[ ]. Values m and n define the number of auto-correlation values to compute. The resulting auto-correlation values are stored in output vector c[ ]. n–i–1 ALGORITHM Ci= ∑ Ai + j • Aj i = { 0, 1, 2, …m – 1 } j= 0 SYNOPSIS void acort ( a, c, m, n ) float *a ; /* Pointer to input vector a[ ] */ float *c ; /* Pointer to output vector c[ ] */ int m ; /* Lag count m */ int n ; /* Number of elements in vector a[ ] */ DOMAIN -3.4E+38 to 3.4E+38 ACCURACY 7.75 decimal digits EXECUTION TIME 31 + 9*M + (M+1) (2*N-M) NOTES The file tacort.c included in the distribution tape provides an example of this func- tion’s use. Note that the lag count m must be less than or equal to the number of floating-point elements (i.e. m ≤ n ). 5-56 Wideband Computers, Inc.
  • 3. ADSP-21K Optimized DSP Library User’s Manual acos_wci ( x ) NAME Arc Cosine DESCRIPTION This function computes the arc cosine of a floating-point number, x. The computed value returned from this function is in the range [0 to π ] radians. A domain error is returned if x is not in the range [-1 to +1]. ALGORITHM return = cos –1( x ) SYNOPSIS float acos_wci ( float x ) DOMAIN -1.0 < x < +1.0 ACCURACY 7.75 decimal digits EXECUTION TIME If A <= 0.5 then 55 cycles, Else if A >0.5 then 75 cycles NOTES The file tacos.c included in the distribution tape provides an example of this function's use. acosh_wci ( x ) NAME Inverse Hyperbolic Cosine DESCRIPTION This function computes the inverse hyperbolic cosine of a floating-point number, x. ALGORITHM return = cosh – 1( x ) SYNOPSIS float acosh_wci ( float x ) DOMAIN 1.0 to 3.4E+38 ACCURACY 7.75 decimal digits EXECUTION TIME 72 cycles NOTES The file tacosh.c included in the distribution tape provides an example of this func- tion's use. Wideband Computers, Inc. 5-57
  • 4. ADSP-21K Optimized DSP Library User’s Manual alawc ( a, i, c, k, n ) NAME a-Law Compression DESCRIPTION This routine performs an a-law compression on the elements in input vector a and out- puts the compressed results to output vector c. C mk = alaw compression of A mi ALGORITHM m = { 0, 1, 2, …n – 1 } SYNOPSIS void alawc ( a, i, c, k, n ) int *a ; /* Pointer to input vector a */ int i ; /* Element stride for vector i */ int *c ; /* Pointer to output vector c */ int k ; /* Element stride for vector c */ int n ; /* Number of floating-point elements */ DOMAIN 0 to 255 ACCURACY 7.75 decimal digits EXECUTION TIME 49 + 12 * ( N-1 ) NOTES The file talawc.c included in the distribution tape provides an example of this func- tion’s use. The alawc() routine takes a linear 13-bit signed speech sample and compresses it according to CCITT (now ITU) recommendation G.711. The 8-bit compressed sample is output to vector c. This function is found on the serial port hardware for the ADSP-2106x DSP proces- sors. 5-58 Wideband Computers, Inc.
  • 5. ADSP-21K Optimized DSP Library User’s Manual alawe ( a, i, c, k, n ) NAME a-Law Expansion DESCRIPTION This routine performs an a-law expansion on the elements in input vector a and out- puts the expanded results to output vector c. C mk = alaw expansion of A mi ALGORITHM m = { 0, 1, 2, …n – 1 } SYNOPSIS void alawe ( a, i, c, k, n ) int *a ; /* Pointer to input vector a */ int i ; /* Element stride for vector i */ int *c ; /* Pointer to output vector c */ int k ; /* Element stride for vector c */ int n ; /* Number of floating-point elements */ DOMAIN 0 to 255 ACCURACY 7.75 decimal digits EXECUTION TIME 46 + 17 * ( N-1 ) NOTES The file talawe.c included in the distribution tape provides an example of this func- tion’s use. The alawe() routine takes an 8-bit compressed speech sample and expands it accord- ing to CCITT (now ITU) recommendation G.711. The 13-bit signed sample is output to vector c. This function is found on the serial port hardware for the ADSP-2106x DSP proces- sors. Wideband Computers, Inc. 5-59
  • 6. ADSP-21K Optimized DSP Library User’s Manual alpha ( df, a, &al, &n ) NAME Kaiser-Bessel Window Shape Parameter DESCRIPTION Computes a Kaiser-Bessel window shape parameter for later use by the kaiser( ) win- dow mutiply library function. The computation is based on the input attenutation specified in input scalar a and the transition width specified in real input scalar df. From this, a count of floating-point elements (output scalar n) and an output window shape parameter (output scalar al) is computed. If A ≤ 21 then al = 0 ALGORITHM Else If 0.4 A < 50 then al = 0.5842 • ( A – 21 ) + 0.07886 • ( A – 21 ) Else If al = 0.1102 • ( A – 8.7 ) Number of Elements n is computed as follows: ( A – 7.95 ) If A > 21 then d = ------------------------ else d = 0.922 - 14.36 n = 1 + ceiling ( d ⁄ df ) n = n + 1 – remainder ( n ⁄ 2 ) SYNOPSIS void alpha ( df, a, &al, &n ) float dm *df ; /* Input transition width in fs units */ float dm a ; /* Input ripple attenutation in dB */ float dm &al ; /* Output alpha window shape parameter */ int &n ; /* Output floating-point element count */ -3.4E+38 to 3.4E+38 ACCURACY 7.75 decimal digits 5-60 Wideband Computers, Inc.
  • 7. ADSP-21K Optimized DSP Library User’s Manual alpha ( df, a, &al, &n ) EXECUTION TIME If a >= 50 then 143 Cycles If 21 < a < 50 then 221 Cycles If A <= 21 then 124 Cycles NOTES The file talpha.c included in the distribution tape provides an example of this func- tion’s use. – A ⁄ 20 df = ∆f ⁄ f s, A = ripple attentuation in dB, δ = 10 asin_wci ( x ) NAME Arc Sine DESCRIPTION This function computes the arc sine of a floating-point number, x. The computed value returned from this function is in the range [-π/2 to π/2] radians. A domain error is returned if x is not in the range [-1 to +1]. ALGORITHM return = sin – 1( x ) Wideband Computers, Inc. 5-61
  • 8. ADSP-21K Optimized DSP Library User’s Manual asin_wci ( x ) SYNOPSIS float asin_wci ( float x ) DOMAIN - 1.0 < x < +1.0 ACCURACY 7.75 decimal digits EXECUTION TIME If A <= 0.5 then 55 cycles, Else if A >0.5 then 73 cycles NOTES The file tasin.c included in the distribution tape provides an example of this function's use. asinh_wci ( x ) NAME Inverse Hyperbolic Sine DESCRIPTION This function computes the inverse hyperbolic sine of a floating-point number, x. ALGORITHM return = sinh –1( x ) SYNOPSIS float asinh_wci ( float x ) DOMAIN -3.4E+38 to 3.4E+38 ACCURACY 7.75 decimal digits EXECUTION TIME 57 cycles NOTES The file tasinh.c included in the distribution tape provides an example of this func- tion's use. 5-62 Wideband Computers, Inc.
  • 9. ADSP-21K Optimized DSP Library User’s Manual aspec ( a, c, n ) NAME Accumulating Auto-spectrum DESCRIPTION Computes the auto-spectrum of complex input vector a by multiplying vector a by its complex conjugate and adding the resulting real number to the current value of vector c. Vector c must be initialized prior to invoking a series of accumulating auto-spec- trum calls. 2 2 ALGORITHM C m ⇐ C m + Re Am + Im Am m = { 0, 1, 2, …n – 1 } SYNOPSIS void aspec ( a, c, n ) complex *a ; /* Pointer to input vector a */ float *c ; /* Pointer to output vector c */ int n ; /* Element count for vector c */ DOMAIN -3.4E+38 to 3.4E+38 ACCURACY 7.75 decimal digits EXECUTION TIME 28 + 6*N cycles NOTES The file taspec.c included in the distribution tape provides an example of this func- tion’s use. The stride of vectors a and c must always be 1. If you wish to clear the auto-spectrum results before they are added to output vector c use the vclr( ) function. If the results are not cleared using vclr( ), autospectrum results are added to output vector c, thus computing an accumulating autospectrum. Note that input vector a is of type complex, and data arguments supplied to this routine will be treated as interleaved real and imaginary data. Wideband Computers, Inc. 5-63
  • 10. ADSP-21K Optimized DSP Library User’s Manual atan_wci ( x ) NAME Arc Tangent DESCRIPTION This function computes the arc tangent of a floating-point number x. The computed value returned from this function is in the range [-π/2 to +π/2] radians. ALGORITHM return = tan –1( x ) SYNOPSIS float atan_wci ( float x ) DOMAIN - 4.2E+37 < x < +4.2E+37 ACCURACY 7.75 decimal digits EXECUTION TIME 59 cycles NOTES The file tatan.c included in the distribution tape provides an example of this function's use. 5-64 Wideband Computers, Inc.
  • 11. ADSP-21K Optimized DSP Library User’s Manual atan2_wci ( y, x ) NAME Arc Tangent 2 Arguments DESCRIPTION This function computes the arc tangent of a floating-point number x. The computed value returned from this function is in the range [-π to +π] radians. –1 y return = tan  --  ALGORITHM - x SYNOPSIS float atan2_wci ( y, x ) float dm y ; /* Input value y */ float dm x ; /* Input value x */ DOMAIN - 4.2E+37 < y/x < +4.2E+37, except x = 0.0 ACCURACY 7.75 decimal digits EXECUTION TIME 76 cycles NOTES The file tatan2.c included in the distribution tape provides an example of this func- tion's use. Wideband Computers, Inc. 5-65
  • 12. ADSP-21K Optimized DSP Library User’s Manual atanh_wci ( x ) NAME Inverse Hyperbolic Tangent DESCRIPTION This function computes the inverse hyperbolic tangent of a floating-point number, x. ALGORITHM return = tanh – 1( x ) SYNOPSIS float atanh_wci ( float x ) DOMAIN -1.0 to +1.0 ACCURACY 7.75 decimal digits EXECUTION TIME 59 cycles NOTES The file tatanh.c included in the distribution tape provides an example of this func- tion's use. 5-66 Wideband Computers, Inc.
  • 13. ADSP-21K Optimized DSP Library User’s Manual bartlett ( a, i, c, k, n ) NAME Bartlett Window DESCRIPTION This function generates a Bartlett window multiply on the elements of input vector a and places the results in output vector c. ALGORITHM  1   m – -- n  2 - C mk = Ami • 1 – ----------------   -- n  1 -  2  m = { 0, 1, 2, …n – 1 } SYNOPSIS void bartlett ( a, i, c, k, n ) float *a ; /* Pointer to input vector a */ int i ; /* Address stride in words for input vector a */ float *c ; /* Pointer to output vector c */ int k ; /* Address stride in words for output vector c */ int n ; /* Element count */ DOMAIN -3.4 x 1038 to +3.4 x 1038 ACCURACY 7.75 decimal digits EXECUTION TIME 44 + 17 * ( N-1 ) cycles NOTES The file tbartlett.c included in the distribution diskette provides an example of this function’s use. The Bartlett window is also known as a triangular window. Wideband Computers, Inc. 5-67
  • 14. ADSP-21K Optimized DSP Library User’s Manual biquad ( x, d, c, y, n ) NAME Bi-Quad IIR Filter DESCRIPTION Using a bi-quad implementation, this function computes an IIR ( Infinite Impulse Response ) filter using coefficients stored in input vector c, delay node points stored in input buffer d, and applied to the elements of input vector x. The results are stored in output vector y. –1 –2 B0 + B1 z + B2 z ALGORITHM H ( z ) = ----------------------------------------------- - –1 –2 1 – A1 z – A2 z where Dm = A2 • Dm – 2 + A1 • Dm – 1 + xm Y m = B2 • Dm – 2 + B1 • Dm – 1 + Dm m = { 0, 1 , 2 , …, n – 1 } SYNOPSIS void biquad ( x, d, c, y, n ) float *x ; /* Pointer to input buffer vector x of length n */ float *d ; /* Pointer to input delay node buff vector d of length 2 */ float *c ; /* Pointer to input coeff buffer vector c of length 5 */ float *y ; /* Pointer to output buffer vector y of length n */ int n ; /* Number of input/output samples to compute */ DOMAIN -3.4E+38 to 3.4E+38 ACCURACY 7.75 decimal digits 5-68 Wideband Computers, Inc.
  • 15. ADSP-21K Optimized DSP Library User’s Manual biquad ( x, d, c, y, n ) EXECUTION TIME 65 + 13*N NOTES This is a single bi-quad form of an infinite impulse response filter (IIR), defined by the first equation shown above. It is implemented using a delay node buffer d shown in the second and third equation shown above. The coefficients a[ ] and b[ ] are passed in a single array c[ ] given by the following: c [ 0 ] = A2 c [ 1 ] = B2 c [ 2 ] = A1 c [ 3 ] = B1 c [ 4 ] = B0 Prior to executing the filter loop, the two “oldest” delay node values are loaded from buffer d[ ]. When the filter loop has completed (n samples have been processed) the two “newest” delay node values are written to d[ ]. In this way the filter delay node states are retained between calls, allowing filtering on blocks of contiguous samples. The user is responsible for allocating the delay node array and for initializing its ele- ments to zero prior to the first call to biquad( ). Defining d0 = D m d1 = D m – 1 d2 = D m – 2 Then d0 = c0 • d2 + c2 • d1 + xm ym = c1 • d2 + c 3 • d1 + c 4 • d0 d2 = d1 d1 = d0 m = { 0 , 1 , 2, … , n – 1 } The coefficient buffer length is defined symbolically in the file dsppac.h as DSP_BIQUAD_NCOEFF. The delay node buffer length is defined symbolically in the file dsppac.h as DSP_BIQUAD_NDELAY. The number of input samples n must be greater than or equal to 5. The file tbiquad.c included in the distribution tape provides an example of this func- tion’s use. Wideband Computers, Inc. 5-69
  • 16. blkman ( a, i, c, k, w, h, n )
    NAME  Blackman Window Multiply
    DESCRIPTION  Multiplies the input vector a[ ] by a Blackman window and stores the result to vector c[ ].
    ALGORITHM
    $$C_{mk} = A_{mi} \cdot \left[ 0.42 - 0.50 \cos\left(\frac{2\pi m i}{N}\right) + 0.08 \cos\left(\frac{4\pi m i}{N}\right) \right], \qquad m = \{0, 1, 2, \ldots, n-1\}$$
    SYNOPSIS
    void blkman ( a, i, c, k, w, h, n )
    float dm *a ; /* Pointer to input vector a */
    int i ; /* Element stride for vector a */
    float dm *c ; /* Pointer to output vector c */
    int k ; /* Element stride for vector c */
    float pm *w ; /* Pointer to cosine weights array */
    int h ; /* Element stride for weights array */
    int n ; /* Element count for vector c */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
  • 17. blkman ( a, i, c, k, w, h, n )
    EXECUTION TIME  41 + 4*(N-1) cycles
    NOTES  The file tblkman.c included in the distribution tape provides an example of this function's use.
    For real-time applications, the Blackman window can be computed once and a simple multiply used to window the data, as shown in the variable W ml. The Blackman window is computed using the winwts( ) function found in the DSP Pac library. The winwts( ) function computes the weights array using the sine and cosine functions. This array is pointed to by variable w listed in the synopsis section above.
    The blkman( ) function is a vector function. You may therefore use the stride arguments i, k and h to decimate both the input and output for data congruence. For example, suppose you use winwts( ) to compute the FFT weights for a 16K FFT. This would result in an fftwts array of 16,384 points. If you later decide to compute an FFT of length 1,024 and run a Blackman window on the results, you do not need to rerun winwts( ) to generate new weights. Simply use the old weights with a stride of 16 (16,384/1,024 = 16) on stride element h to obtain the correct Blackman window weights. In this manner you need only compute winwts( ) once and can later use the result for FFTs and windowing functions of varying length.
    The cosine arguments are held in input vector w[ ] and can be computed with the winwts( ) function. Note that a larger w[ ] can be used by changing the stride for w[ ]. For example, if w[ ] was computed for a window of size 2,048 but a Blackman window of 1,024 is needed, use a stride of 2,048/1,024 = 2.
    Note that the Blackman window has a passband ripple of 0.0017 dB, a maximum stopband attenuation of 74 dB, and a 57 dB main lobe relative to side lobe.
  • 18. blkmanh ( a, i, c, k, w, h, n )
    NAME  Blackman-Harris Window Multiply
    DESCRIPTION  Multiplies the input vector a[ ] by a Blackman-Harris window and stores the result to output vector c[ ].
    ALGORITHM
    $$C_{mk} = A_{mi} \cdot \left[ 0.35875 - 0.48829 \cos\left(\frac{2\pi m i}{N}\right) + 0.14128 \cos\left(\frac{4\pi m i}{N}\right) - 0.01168 \cos\left(\frac{6\pi m i}{N}\right) \right], \qquad m = \{0, 1, 2, \ldots, n-1\}$$
    SYNOPSIS
    void blkmanh ( a, i, c, k, w, h, n )
    float dm *a ; /* Pointer to input vector a */
    int i ; /* Element stride for vector a */
    float dm *c ; /* Pointer to output vector c */
    int k ; /* Element stride for vector c */
    float pm *w ; /* Pointer to cosine weights array */
    int h ; /* Element stride for weights array */
    int n ; /* Element count for vector c */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
  • 19. blkmanh ( a, i, c, k, w, h, n )
    EXECUTION TIME  54 + 6*(N-1) cycles
    NOTES  The file tblkmanh.c included in the distribution tape provides an example of this function's use.
    For real-time applications, the Blackman-Harris window can be computed once and a simple multiply used to window the data, as shown in the variable W ml. The Blackman-Harris window is computed using the winwts( ) function found in the DSP Pac library. The winwts( ) function computes the weights array using the sine and cosine functions. This array is pointed to by variable w listed in the synopsis section above.
    The blkmanh( ) function is a vector function. You may therefore use the stride arguments i, k and h to decimate both the input and output for data congruence. For example, suppose you use winwts( ) to compute the FFT weights for a 16K-point FFT. This would result in an fftwts array of 16,384 points. If you later decide to compute an FFT of length 1,024 and run a Blackman-Harris window on the results, you do not need to rerun winwts( ) to generate new weights. Simply use the old weights with a stride of 16 (16,384/1,024 = 16) on stride element h to obtain the correct window weights. In this manner you need only compute winwts( ) once and can later use the result for FFTs and windowing functions of varying length.
    The cosine arguments are held in input vector w[ ] and can be computed with the winwts( ) function. Note that a larger w[ ] can be used by changing the stride for w[ ]. For example, if w[ ] was computed for a window of size 2,048 but a Blackman window of 1,024 is needed, use a stride of 2,048/1,024 = 2.
    Note that the Blackman-Harris window has a passband ripple of 0.0017 dB, a maximum stopband attenuation of 74 dB, and a 57 dB main lobe relative to side lobe.
  • 20. cacort ( a, c, m, n )
    NAME  Complex Auto-Correlation (Time Domain)
    DESCRIPTION  Computes the time domain auto-correlation of the complex elements stored in input vector a[ ]. Values m and n define the number of auto-correlation values to compute. The resulting auto-correlation values are stored in complex output vector c[ ].
    ALGORITHM
    $$C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot A_j, \qquad i = \{0, 1, 2, \ldots, m-1\}$$
    SYNOPSIS
    void cacort ( a, c, m, n )
    complex dm *a ; /* Pointer to input vector a[ ] */
    complex dm *c ; /* Pointer to output vector c[ ] */
    int m ; /* Lag count m */
    int n ; /* Number of elements in vector a[ ] */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  39 + ( 9 + 5 * n ) * n
    NOTES  The file tacort.c included in the distribution tape provides an example of this function's use.
    Note that the lag count m must be less than or equal to the number of floating-point elements (i.e. m ≤ n).
    The strides of vectors a[ ] and c[ ] must be 1.
  • 21. ccdotpr ( a, i, b, j, c, k, n )
    NAME  Complex Dot Product Multiply by Conjugate
    DESCRIPTION  This function computes the complex dot product of complex input vector a by the complex conjugate of input vector b and stores the results in complex output vector c. This can be alternatively expressed as C = AB*.
    ALGORITHM
    $$\mathrm{Re}\{C\} = \sum_{m=0}^{n-1} \left[ \mathrm{Re}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} + \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} \right]$$
    $$\mathrm{Im}\{C\} = \sum_{m=0}^{n-1} \left[ -\mathrm{Re}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} + \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} \right]$$
    SYNOPSIS
    void ccdotpr ( a, i, b, j, c, k, n )
    complex *a ; /* Pointer to complex input vector a */
    int i ; /* Address stride in words for input vector a */
    complex *b ; /* Pointer to complex input vector b */
    int j ; /* Address stride in words for input vector b */
    complex *c ; /* Pointer to complex output vector c */
    int k ; /* Address stride in words for output vector c */
    int n ; /* Element count */
    DOMAIN  -3.4 x 10^38 to +3.4 x 10^38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  64 + 4*(N-1) cycles
    NOTES  The file tccdotpr.c included in the distribution diskette provides an example of this function's use.
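The two sums above translate directly into a short loop. The sketch below is an illustrative portable model only: the `cplx` struct, the name `ccdotpr_ref`, returning the result by value, and counting strides in complex elements (the manual counts them in words) are all assumptions.

```c
/* Hypothetical complex type; the library's own "complex" layout is not
   shown in this section. */
typedef struct { float re, im; } cplx;

/* Reference model of C = sum over m of A[m*i] * conj(B[m*j]).
   ASSUMPTION: strides i and j count complex elements, not words. */
cplx ccdotpr_ref(const cplx *a, int i, const cplx *b, int j, int n)
{
    cplx c = {0.0f, 0.0f};
    for (int m = 0; m < n; ++m) {
        const cplx am = a[m * i];
        const cplx bm = b[m * j];
        c.re +=  am.re * bm.re + am.im * bm.im;   /* real part of A*conj(B) */
        c.im += -am.re * bm.im + am.im * bm.re;   /* imag part of A*conj(B) */
    }
    return c;
}
```

Dropping the conjugate (flipping the sign of every bm.im term) gives the plain complex dot product computed by cdotpr( ).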
  • 22. ccmmul ( a, x, y, b, z, c )
    NAME  Complex Matrix Multiply By Conjugate of Complex Matrix
    DESCRIPTION  This function computes the multiplication of the conjugate of complex input matrix a[ ][ ] times the elements of complex input matrix b[ ][ ]. The dimensions of complex input matrix a[ ][ ] are x and y, while the dimensions of complex input matrix b[ ][ ] are defined by input scalars y and z. The results are stored in complex output matrix c[ ][ ], which is of dimensions x and z.
    ALGORITHM
    $$\mathrm{Re}(C_{ij}) = \sum_{k=0}^{y-1} \left[ \mathrm{Re}(A_{ik}) \cdot \mathrm{Re}(B_{kj}) + \mathrm{Im}(A_{ik}) \cdot \mathrm{Im}(B_{kj}) \right]$$
    $$\mathrm{Im}(C_{ij}) = \sum_{k=0}^{y-1} \left[ \mathrm{Re}(A_{ik}) \cdot \mathrm{Im}(B_{kj}) - \mathrm{Re}(B_{kj}) \cdot \mathrm{Im}(A_{ik}) \right]$$
    for i = { 0, 1, …, x - 1 } and j = { 0, 1, …, z - 1 }
    SYNOPSIS
    void ccmmul( a, x, y, b, z, c )
    complex dm *a ; /* Pointer to complex input matrix a[ ][ ] */
    int x ; /* Number of rows in complex matrix a[ ][ ] */
    int y ; /* Number of columns in matrix a[ ][ ] and */
            /* number of rows in complex matrix b[ ][ ] */
    complex dm *b ; /* Pointer to complex input matrix b[ ][ ] */
    int z ; /* Number of columns in matrix b[ ][ ] */
    complex dm *c ; /* Pointer to complex output matrix c[ ][ ] */
    DOMAIN  -3.4 x 10^38 to +3.4 x 10^38
    ACCURACY  7.75 decimal digits
  • 23. ccmmul ( a, x, y, b, z, c )
    EXECUTION TIME  62 + ( 6 + ( 12 + 7 * Y ) * Z ) * X cycles
    NOTES  The file tccmmul.c included in the distribution diskette provides an example of this function's use.
    a[x][y] =
         1, 1     2, 2     3, 3     4, 4
         5, 5     6, 6     7, 7     8, 8
         9, 9    10, 10   11, 11   12, 12
    b[y][z] =
         1, 2     3, 4     5, 6
         7, 8     9, 10   11, 12
        13, 14   15, 16   17, 18
        19, 20   21, 22   23, 24
    x = 3, y = 4, z = 3 ;
    ccmmul ( a, x, y, b, z, c ) ;
    The resulting values in output matrix c[ ][ ] would be as follows:
    c[x][z] =
        270, 10    310, 10    350, 10
        606, 26    710, 26    814, 26
        942, 42   1110, 42   1278, 42
    Matrices are stored by rows. A matrix can be thought of as one long array (vector) in which the beginning of each row is offset by the number of columns.
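The row-major storage rule and the conjugate product can be checked with a plain C model run on the example matrices above. This is a sketch under stated assumptions: `cplx` and `ccmmul_ref` are hypothetical names, and the code is a straightforward triple loop rather than the optimized routine.

```c
/* Hypothetical complex type used only for this illustration. */
typedef struct { float re, im; } cplx;

/* Reference model of C = conj(A) * B for row-major matrices:
   A is x-by-y, B is y-by-z, C is x-by-z.
   Element (row, col) of a matrix with ncols columns lives at
   index row * ncols + col. */
void ccmmul_ref(const cplx *a, int x, int y, const cplx *b, int z, cplx *c)
{
    for (int i = 0; i < x; ++i) {
        for (int j = 0; j < z; ++j) {
            float re = 0.0f, im = 0.0f;
            for (int k = 0; k < y; ++k) {
                const cplx p = a[i * y + k];
                const cplx q = b[k * z + j];
                re += p.re * q.re + p.im * q.im;   /* Re{conj(p) * q} */
                im += p.re * q.im - q.re * p.im;   /* Im{conj(p) * q} */
            }
            c[i * z + j].re = re;
            c[i * z + j].im = im;
        }
    }
}
```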
  • 24. ccmsmul ( a, x, y, b, c )
    NAME  Complex Scalar-Complex Conjugate Matrix Multiplication
    DESCRIPTION  This function computes the multiplication of the conjugate of the complex input matrix a[ ][ ] times complex input scalar b. The dimensions of complex input matrix a[ ][ ] are x and y. The results are stored in complex output matrix c[ ][ ], which is of dimensions x and y.
    ALGORITHM
    $$C_{xy} = B \cdot A_{xy}^{*}$$
    SYNOPSIS
    void ccmsmul( a, x, y, b, c )
    complex dm *a ; /* Pointer to complex input matrix a[ ][ ] */
    int x ; /* Number of rows in complex matrix a[ ][ ] */
    int y ; /* Number of columns in matrix a[ ][ ] */
    complex dm *b ; /* Pointer to complex input scalar b */
    complex dm *c ; /* Pointer to complex output matrix c[ ][ ] */
    DOMAIN  -3.4 x 10^38 to +3.4 x 10^38
    ACCURACY  7.75 decimal digits
  • 25. ccmsmul ( a, x, y, b, c )
    EXECUTION TIME  46 + 2 * X * Y cycles
    NOTES  The file tccmsmul.c included in the distribution diskette provides an example of this function's use.
    a[x][y] = 1, 2   3, 4   5, 6   7, 8   9, 10   11, 12   13, 14   15, 16   17, 18   19, 20   21, 22   23, 24   25, 26   27, 28   29, 30
    b = {8,2}
    x = 8, y = 7 ;
    ccmsmul ( a, x, y, b, c ) ;
    The resulting values in output matrix c[ ][ ] would be as follows:
    c[x][y] = 12, -14   32, -26   52, -38   72, -50   92, -62   112, -74   132, -86   152, -98   172, -110   192, -122   212, -134   232, -146   252, -158   272, -170   292, -182
    Matrices are stored by rows. A matrix can be thought of as one long array (vector) in which the beginning of each row is offset by the number of columns.
  • 26. cccort ( a, b, c, m, n )
    NAME  Complex Cross-Correlation (Time Domain)
    DESCRIPTION  Computes the time domain (real) cross-correlation of the time domain (real) elements stored in complex input vectors a[ ] and b[ ]. The result is stored in complex output vector c[ ]. Values m and n define the number of cross-correlation values to compute. The implementation uses a time domain technique.
    ALGORITHM
    $$C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot B_j, \qquad i = \{0, 1, 2, \ldots, m-1\}$$
    SYNOPSIS
    void cccort ( a, b, c, m, n )
    complex dm *a ; /* Pointer to input vector a[ ] */
    complex dm *b ; /* Pointer to input vector b[ ] */
    complex dm *c ; /* Pointer to output vector c[ ] */
    int m ; /* Lag count m */
    int n ; /* Number of elements in vector c[ ] */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  41 + ( 9 + 5 * n ) * m
    NOTES  The file tcccort.c included in the distribution tape provides an example of this function's use.
    Note that the lag count must be less than or equal to the number of floating-point elements (i.e. m ≤ n).
    The strides of vectors a[ ], b[ ], and c[ ] must always be 1.
  • 27. ccort ( a, b, c, m, n )
    NAME  Cross-Correlation (Time Domain)
    DESCRIPTION  Computes the time domain (real) cross-correlation of the time domain (real) elements stored in input vectors a[ ] and b[ ]. The result is stored in real output vector c[ ]. Values m and n define the number of cross-correlation values to compute. The implementation uses a time domain technique.
    ALGORITHM
    $$C_i = \sum_{j=0}^{n-i-1} A_{i+j} \cdot B_j, \qquad i = \{0, 1, 2, \ldots, m-1\}$$
    SYNOPSIS
    void ccort ( a, b, c, m, n )
    float *a ; /* Pointer to input vector a[ ] */
    float *b ; /* Pointer to input vector b[ ] */
    float *c ; /* Pointer to output vector c[ ] */
    int m ; /* Lag count m */
    int n ; /* Number of elements in vector c[ ] */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  32 + 9 * M + (M+1)(2*N-M)
    NOTES  The file tccort.c included in the distribution tape provides an example of this function's use.
    Note that the lag count must be less than or equal to the number of floating-point elements (i.e. m ≤ n).
    The strides of vectors a, b, and c must always be 1.
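The correlation sum is easiest to see as two nested loops, with the inner loop shrinking as the lag grows. The following is a minimal portable sketch, assuming n is the length of the input vectors; `ccort_ref` is a hypothetical name and not the library routine.

```c
/* Reference model of time-domain cross-correlation:
   c[i] = sum over j = 0 .. n-i-1 of a[i+j] * b[j], for lags i = 0 .. m-1.
   ASSUMPTION: a[] and b[] each hold n elements and m <= n. */
void ccort_ref(const float *a, const float *b, float *c, int m, int n)
{
    for (int i = 0; i < m; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < n - i; ++j)   /* fewer terms at larger lags */
            sum += a[i + j] * b[j];
        c[i] = sum;
    }
}
```

Passing the same vector for a and b gives the real auto-correlation computed by acort( ).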
  • 28. cdesamp ( data, coeff, output, d, n, p )
    NAME  Complex Decimating Finite Impulse Response (FIR) Filter
    DESCRIPTION  The function computes the convolution of complex vectors data[ ] and coeff[ ], placing the results in complex vector output[ ]. The number of output samples n and the number of coefficients p may be dissimilar. n elements will be written to output[ ].
    Complex vector data[ ] represents the real and imaginary (I and Q) components of the input data respectively. Likewise, complex vector coeff[ ] represents the real and imaginary (I and Q) components of the coefficient data. A complex multiply and add is performed to compute the convolutional output.
    The decimation factor d is used to stride the next starting point in data[ ].
    ALGORITHM
    $$\mathrm{Output}[i] = \sum_{j=0}^{p-1} \mathrm{data}[i \cdot d + j] \cdot \mathrm{coeff}[p - j - 1], \qquad i = \{0, 1, 2, \ldots, n-1\}$$
    SYNOPSIS
    void cdesamp ( data, coeff, output, d, n, p )
    complex dm *data ; /* Complex input data ( len n+p-1 ) */
    complex pm *coeff ; /* Complex coefficients ( len p ) */
    complex dm *output ; /* Complex output data ( len n ) */
    int d ; /* Decimation factor */
    int n ; /* Number of output samples */
    int p ; /* Number of coefficients */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
  • 29. cdesamp ( data, coeff, output, d, n, p )
    EXECUTION TIME  36 + ( 7 + 5 * p ) * n cycles
    NOTES  The file tcdesamp.c included in the distribution tape provides an example of this function's use.
    The number of filter output samples to generate can be obtained as follows:
    $$n = \frac{n_{data} - p}{d} + 1$$
    where ndata is the number of elements in data[ ].
    A complex correlation can be performed by reversing the order of the coefficients vector.
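The output-count formula and the decimating convolution can be sketched together in portable C. The names `cplx`, `cdesamp_count` and `cdesamp_ref` are hypothetical, and the use of truncating integer division in the count is an assumption; the manual does not say how a non-integer quotient is handled.

```c
/* Hypothetical complex type used only for this illustration. */
typedef struct { float re, im; } cplx;

/* Number of output samples for ndata inputs, p taps and decimation d.
   ASSUMPTION: the division truncates toward zero. */
int cdesamp_count(int ndata, int p, int d)
{
    return (ndata - p) / d + 1;
}

/* Reference model of the documented sum:
   out[i] = sum over j = 0 .. p-1 of data[i*d + j] * coeff[p - j - 1]. */
void cdesamp_ref(const cplx *data, const cplx *coeff, cplx *out,
                 int d, int n, int p)
{
    for (int i = 0; i < n; ++i) {
        cplx acc = {0.0f, 0.0f};
        for (int j = 0; j < p; ++j) {
            const cplx x = data[i * d + j];
            const cplx h = coeff[p - j - 1];      /* taps read in reverse */
            acc.re += x.re * h.re - x.im * h.im;  /* complex multiply-add */
            acc.im += x.re * h.im + x.im * h.re;
        }
        out[i] = acc;
    }
}
```

For example, 8 input samples with p = 2 taps and d = 2 give (8 - 2) / 2 + 1 = 4 output samples, and the last output reads data[6] and data[7].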
  • 30. cdotpr ( a, i, b, j, c, k, n )
    NAME  Complex Dot Product
    DESCRIPTION  This function computes the complex dot product of complex input vector a and complex input vector b and stores the results in complex output vector c. This can alternatively be thought of as C = A • B.
    ALGORITHM
    $$\mathrm{Re}\{C\} = \sum_{m=0}^{n-1} \left[ \mathrm{Re}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} - \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} \right]$$
    $$\mathrm{Im}\{C\} = \sum_{m=0}^{n-1} \left[ \mathrm{Re}\{A_{mi}\} \cdot \mathrm{Im}\{B_{mj}\} + \mathrm{Im}\{A_{mi}\} \cdot \mathrm{Re}\{B_{mj}\} \right]$$
    SYNOPSIS
    void cdotpr ( a, i, b, j, c, k, n )
    complex *a ; /* Pointer to complex input vector a */
    int i ; /* Address stride in words for input vector a */
    complex *b ; /* Pointer to complex input vector b */
    int j ; /* Address stride in words for input vector b */
    complex *c ; /* Pointer to complex output vector c */
    int k ; /* Address stride in words for output vector c */
    int n ; /* Element count */
    DOMAIN  -3.4 x 10^38 to +3.4 x 10^38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  64 + 4*(N-1) cycles
    NOTES  The file tcdotpr.c included in the distribution diskette provides an example of this function's use.
  • 31. ceil_wci ( x )
    NAME  Round Up to Nearest Integer
    DESCRIPTION  This function computes the smallest integral value greater than or equal to the floating-point number x. A floating-point representation of this integer value is returned.
    ALGORITHM  return = smallest int ≥ x
    SYNOPSIS  float ceil_wci ( float x )
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  20 cycles
    NOTES  The file tceil.c included in the distribution tape provides an example of this function's use.
  • 32. cfft ( xr, xi, wr, wi, wstr, yr, yi, n )
    NAME  Fast Fourier Transform Of Complex Input Data
    DESCRIPTION  Computes the Fast Fourier Transform of the complex input elements stored in complex input vector a (real part xr, imaginary part xi). The results are stored in complex output vector c (real part yr, imaginary part yi).
    ALGORITHM
    $$C_m = \sum_{k=0}^{n-1} A_k \, e^{-i 2\pi m k / n}, \qquad m = \{0, 1, 2, \ldots, n-1\}$$
    SYNOPSIS
    void cfft ( xr, xi, wr, wi, wstr, yr, yi, n )
    float dm *xr ; /* Pointer to real input data */
    float dm *xi ; /* Pointer to imaginary input data */
    float pm *wr ; /* Pointer to cosine table */
    float dm *wi ; /* Pointer to sine table */
    int wstr ; /* Cosine/sine table stride */
    float dm *yr ; /* Pointer to real output data */
    float pm *yi ; /* Pointer to imaginary output data */
    int n ; /* FFT Size (In Complex Elements) */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  See the table below
  • 33. cfft ( xr, xi, wr, wi, wstr, yr, yi, n )
    NOTES  This is a radix-2 Fast Fourier Transform using parallel data memory/program memory accesses to maximize the throughput on the 21020/60/62 processor. The complex input data is separated into real and imaginary parts, xr and xi. Both vectors must be in data memory, and each must be aligned on an address that is an integer multiple of the FFT size n, as required for 21K bit-reverse addressing. The imaginary data is bit-reversed into program memory at the beginning of the routine. The number of elements n supplied to the algorithm must be an integral power of two and a minimum of 32.
    The complex output is separated into real and imaginary parts, yr and yi. These vectors may have arbitrary address alignment; however, yr is in data memory and yi is in program memory. Each has a minimum size of n.
    Vectors wr and wi are in program memory and data memory respectively and are given the values:
    $$wr[k] = \cos\left(\frac{2\pi k}{wstr \cdot n}\right), \qquad k = 0, 1, \ldots, \frac{wstr \cdot n}{2} - 1 \quad \text{(program memory)}$$
    $$wi[k] = \sin\left(\frac{2\pi k}{wstr \cdot n}\right), \qquad k = 0, 1, \ldots, \frac{wstr \cdot n}{2} - 1 \quad \text{(data memory)}$$
    The weight stride wstr allows cfft() to be called with varying sizes n from a single set of weights. These weights are generated using the fftwts() function. This precomputed FFT weight approach was implemented in order to ensure accurate results and boost the available cfft() dynamic range to approximately 130 dB for longer (>16K) FFTs. This is accomplished by an implementation that does not rely on a recursive call to a sine/cosine approximation routine, as found in other implementations. Rather, the FFT weights are precomputed accurately using the fftwts() function. This is sufficient for A/D converters with bit lengths up to 22 bits.
    The file tcfft.c included in the distribution tape provides an example of this function's use.
  • 34. cfft ( xr, xi, wr, wi, wstr, yr, yi, n )
    SPECIAL NOTES  Users have sometimes reported problems when implementing interrupt service routines (ISRs) in conjunction with the FFT routines ( cfft( ), cffti( ), rfft( ), rffti( ) ). Reports to the Wideband technical staff typically describe the Wideband routine executing perfectly, but being unable to return to its exact state after being interrupted by the ISR (what is described as a "tumble into the weeds").
    The Wideband Fast Fourier Transforms, both complex and real, forward and inverse, use the built-in bit-reversing and circular addressing capabilities of the SHARC architecture. Other routines, such as some of the FIR filters, also use the SHARC's internal circular addressing capabilities.
    End users are usually aware that their ISR is responsible for saving and restoring the registers used by the Wideband routines. However, end users sometimes forget to save and restore (push and pop) the Mode 1 register, which is associated with bit reversing, and the B (base) and L (length) registers, which are associated with circular addressing. When these registers are not saved and restored by the ISR, the interrupted routine resumes without the proper length parameter (L register) used for circular addressing or the proper mode (Mode 1 register) used for bit reversing. This produces the strange behavior users sometimes report.
    To properly save and restore these registers in an ISR, refer to page 4-21, section 4.3 of the Analog Devices ADSP-21000 Family C Tools Manual (#31-000005-08, dated August 95), which references examples of inline assembly code within C code to save and restore registers.
    For a detailed review of the relationships between the various FFT functions and how to use them with one another, see the final section of Chapter 4.
  • 35. Performance Issues
    The initial timings shown below for the 32-point to 4,096-point FFTs were measured using the Analog Devices simulator.
    Performance Timings For Complex FFTs
        Number of Points    Processor Cycles
        8                   See cfft8( ) function
        16                  See cfft16( ) function
        32                  771 Cycles
        64                  1,274 Cycles
        128                 2,368 Cycles
        256                 4,724 Cycles
        512                 10,060 Cycles
        1,024               21,618 Cycles
        2,048               46,744 Cycles
        4,096               101,054 Cycles
        8,192               217,828 Cycles
        16,384              467,722 Cycles
        32,768              1,000,240 Cycles
        65,536              2,130,774 Cycles
  • 36. cfft2d ( xr, xi, wr, wi, wstr, tmpdm, tmppm, n )
    NAME  Complex 2-Dimensional Fast Fourier Transform
    DESCRIPTION  Computes a 2-Dimensional Fast Fourier Transform of the complex input elements stored in vector a[ ]. The results are stored in complex output vector c[ ].
    ALGORITHM
    $$C_{R,C} = \sum_{r=0}^{n-1} \sum_{c=0}^{n-1} A_{r,c} \, e^{-2\pi j (r \cdot R + c \cdot C) / n}, \qquad R = \{0, 1, \ldots, n-1\}, \quad C = \{0, 1, \ldots, n-1\}$$
    SYNOPSIS
    void cfft2d ( xr, xi, wr, wi, wstr, tmpdm, tmppm, n )
    float dm *xr ; /* Pointer to real input/output data */
    float dm *xi ; /* Pointer to imaginary input/output data */
    float pm *wr ; /* Pointer to cosine table */
    float dm *wi ; /* Pointer to sine table */
    int wstr ; /* Cosine/sine table stride */
    float dm *tmpdm ; /* Pointer to real output data */
    float pm *tmppm ; /* Pointer to imag output data */
    int n ; /* CFFT2D Size (Complex Elements n x n) */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
  • 37. cfft2d ( xr, xi, wr, wi, wstr, tmpdm, tmppm, n )
    EXECUTION TIME
        32 x 32 Pts.      44,532 cycles
        64 x 64 Pts.      165,364 cycles
        128 x 128 Pts.    659,572 cycles
  • 38. cfft2d ( xr, xi, wr, wi, wstr, tmpdm, tmppm, n )
    NOTES  The input data is an n x n complex matrix x separated into real and imaginary parts xr and xi, stored as follows:
    $$\mathrm{Re}(x_{r,c}) = xr[r \cdot n + c], \qquad \mathrm{Im}(x_{r,c}) = xi[r \cdot n + c], \qquad r, c = \{0, 1, \ldots, n-1\}$$
    Variables r and c are the row and column numbers. The DFT output replaces the input, and is stored as follows:
    $$\mathrm{Re}(F_{R,C}) = xr[R \cdot n + C], \qquad \mathrm{Im}(F_{R,C}) = xi[R \cdot n + C], \qquad R, C = \{0, 1, \ldots, n-1\}$$
    A radix-2 Fast Fourier Transform (FFT) algorithm is used to compute the individual row and column DFTs.
    The number of elements n must be an integral power of two and a minimum of 32.
    Vectors xr and xi must be in data memory and address-aligned to an integral multiple of n.
    Vectors wr and wi must be in program memory and data memory respectively and are pre-computed to be:
    $$wr[k] = \cos\left(\frac{2\pi k}{wstr \cdot n}\right), \qquad wi[k] = \sin\left(\frac{2\pi k}{wstr \cdot n}\right), \qquad k = 0, 1, \ldots, \frac{wstr \cdot n}{2} - 1$$
    Vector tmpdm must be in data memory, have a minimum size of n, and be address-aligned to an integral multiple of n. Vector tmppm must be in program memory, have a minimum size of n, and be address-aligned to an integral multiple of n.
    The file tcfft2d.c included in the distribution tape provides an example of this function's use.
  • 39. cfft8 ( xr, xi, yr, yi )
    NAME  8-Point Complex Fast Fourier Transform (Inline)
    DESCRIPTION  Computes the Fast Fourier Transform of the complex input elements stored in input vectors xr and xi. The results are stored in output vectors yr and yi.
    ALGORITHM
    $$Y_m = \sum_{k=0}^{7} X_k \, e^{-2\pi j (m \cdot k / 8)}, \qquad m = \{0, 1, 2, \ldots, 7\}$$
    SYNOPSIS
    void cfft8 ( xr, xi, yr, yi )
    float dm *xr ; /* Pointer to real input data */
    float dm *xi ; /* Pointer to imaginary input data */
    float dm *yr ; /* Pointer to real output data */
    float pm *yi ; /* Pointer to imaginary output data */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
  • 40. cfft8 ( xr, xi, yr, yi )
    EXECUTION TIME  184 Cycles
    NOTES  This is an 8-point radix-2 Fast Fourier Transform using parallel data memory/program memory accesses to maximize the throughput on the 21020/60/62 processor. The complex input data is separated into real and imaginary parts, xr and xi. These vectors must be aligned on an address that is an integer multiple of the FFT size, as required for 21K bit-reverse addressing. The input vectors are both in data memory; the imaginary data is bit-reversed into program memory at the beginning of the routine. This algorithm uses a decimation-in-time approach. As the cffti( ) function requires a minimum of 32 points as input, there is no corresponding inverse algorithm for this routine.
    The complex output is separated into real and imaginary parts, yr and yi. These vectors may have arbitrary address alignment; however, yr is in data memory and yi is in program memory.
    Vectors xr and xi are defined in cfft8dta.asm using the dm_align segment to ensure address alignment.
    For a detailed review of the relationships between the various FFT functions and how to use them with one another, see the final section of Chapter 4.
    The file tcfft8.c included in the distribution tape provides an example of this function's use.
  • 41. cfft16 ( xr, xi, yr, yi )
    NAME  16-Point Complex Fast Fourier Transform (Inline)
    DESCRIPTION  Computes the Fast Fourier Transform of the complex input elements stored in input vectors xr and xi. The results are stored in output vectors yr and yi.
    ALGORITHM
    $$Y_m = \sum_{k=0}^{15} X_k \, e^{-2\pi j (m \cdot k / 16)}, \qquad m = \{0, 1, 2, \ldots, 15\}$$
    SYNOPSIS
    void cfft16 ( xr, xi, yr, yi )
    float dm *xr ; /* Pointer to real input data */
    float dm *xi ; /* Pointer to imaginary input data */
    float dm *yr ; /* Pointer to real output data */
    float pm *yi ; /* Pointer to imaginary output data */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
  • 42. cfft16 ( xr, xi, yr, yi )
    EXECUTION TIME  388 Cycles
    NOTES  This is a 16-point radix-2 Fast Fourier Transform using parallel data memory/program memory accesses to maximize the throughput on the 21020/60/62 processor. The complex input data is separated into real and imaginary parts, xr and xi. These vectors must be aligned on an address that is an integer multiple of the FFT size, as required for 21K bit-reverse addressing. The input vectors are both in data memory; the imaginary data is bit-reversed into program memory at the beginning of the routine. This algorithm uses a decimation-in-time approach. As the cffti( ) function requires a minimum of 32 points as input, there is no corresponding inverse algorithm for this routine.
    The complex output is separated into real and imaginary parts, yr and yi. These vectors may have arbitrary address alignment; however, yr is in data memory and yi is in program memory.
    Vectors xr and xi are defined in cfft16dt.asm using the dm_align segment to ensure address alignment.
    For a detailed review of the relationships between the various FFT functions and how to use them with one another, see the final section of Chapter 4.
    The file tcfft16.c included in the distribution tape provides an example of this function's use.
  • 43. cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
    NAME  Inverse Complex FFT
    DESCRIPTION  Computes the Inverse Fast Fourier Transform of the input elements stored in vectors xr and xi. The results are stored in complex output vector c. Note that the Inverse FFT is the same as the Forward FFT except that the sign of the imaginary components of the twiddle factors is negated. The Inverse FFT swaps the real and imaginary input data, performs the Forward FFT with the same weights table, and swaps the real and imaginary output data. Scaling by 1/N is then performed.
    ALGORITHM
    $$C_m = \frac{1}{n} \sum_{k=0}^{n-1} A_k \, e^{i 2\pi m k / n}, \qquad m = \{0, 1, 2, \ldots, n-1\}$$
    SYNOPSIS
    void cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
    float dm *xr ; /* Pointer to real input data */
    float dm *xi ; /* Pointer to imaginary input data */
    float pm *wr ; /* Pointer to cosine table */
    float dm *wi ; /* Pointer to sine table */
    int wstr ; /* Cosine/sine table stride */
    float dm *yr ; /* Pointer to real output data */
    float pm *yi ; /* Pointer to imaginary output data */
    int n ; /* FFT Size (In Complex Elements) */
    DOMAIN  -3.4E+38 to 3.4E+38
    ACCURACY  7.75 decimal digits
    EXECUTION TIME  22,650 Cycles @ 1,024 Points - Data and Program In On-Board Cache
  • 44. cffti ( xr, xi, wr, wi, wstr, yr, yi, n )
    NOTES  This is a radix-2 inverse Fast Fourier Transform using parallel DM/PM data accesses to maximize the throughput on the 21020 processor. The complex input data is separated into real and imaginary parts, xr and xi. Both vectors must be in DM, and each must be aligned on an address that is an integer multiple of the FFT size n, as required for 21K bit-reverse addressing. The imaginary data is bit-reversed into PM at the beginning of the routine. The number of elements n must be an integral power of two and a minimum of 32.
    The complex output is separated into real and imaginary parts, yr and yi. These vectors may have arbitrary address alignment; however, yr is in DM and yi is in PM. Each has a minimum size of n.
    Vectors wr and wi are in program memory and data memory respectively and are given the values:
    $$wr[k] = \cos\left(\frac{2\pi k}{wstr \cdot n}\right), \qquad wi[k] = \sin\left(\frac{2\pi k}{wstr \cdot n}\right), \qquad k = 0, 1, \ldots, \frac{wstr \cdot n}{2} - 1$$
    The weight stride wstr allows for calling cfft() with varying sizes n from a single set of weights. These weights are generated using the fftwts( ) function.
    The file tcfft.c included in the distribution tape provides an example of this function's use.
  • 45. cffti ( xr, xi, wr, wi, wstr, yr, yi, n )

    SPECIAL NOTES   Users have sometimes reported problems when interrupt service routines (ISRs) are used in conjunction with the FFT routines ( cfft ( ), cffti ( ), rfft ( ), rffti ( ) ). The observations reported to the Wideband technical staff typically describe the Wideband routine executing perfectly but being unable to return to its exact state after being interrupted by the ISR ( a "tumble into the weeds" ).

                    The Wideband Fast Fourier Transforms, both complex and real, forward and inverse, use the built-in bit-reversing and circular-addressing capabilities of the SHARC architecture. Other routines, such as some of the FIR filters, also use the SHARC's internal circular-addressing capabilities.

                    End users are usually aware that their ISR is responsible for saving and restoring the registers used by the Wideband routines. However, end users sometimes forget to save and restore ( push and pop ) the Mode 1 register, which is associated with bit reversing, and the B ( base ) and L ( length ) registers, which are associated with circular addressing. When the ISR does not save and restore these registers, the interrupted routine resumes without the proper length parameter ( L register ) used for circular addressing or the proper mode ( Mode 1 register ) used for bit reversing. This results in the strange manifestations users sometimes report.

                    To properly save and restore the above-mentioned registers in an ISR, refer to page 4-21, section 4.3 of the Analog Devices ADSP-21000 Family C Tools Manual (#31-000005-08, dated August 95), which references examples of inline assembly code within C code to save and restore registers.

                    For a detailed review of the relationships between the various FFT functions and how to use them with one another, see the final section of Chapter 4.
  • 46. TABLE 8   Table of Inverse Complex FFT Timing

    Number of Points    Processor Cycles
    ----------------    ----------------
                  32                 868
                  64               1,435
                 128               2,657
                 256               5,319
                 512              11,117
               1,024              23,699
               2,048              50,873
               4,096             109,281
               8,192             234,244
              16,384             500,525
              32,768           1,072,560
              65,536           2,288,128
  • 47. cfir ( ii, iq, ci, cq, oi, oq, d, n, p )

    NAME            Complex Finite Impulse Response Filter

    DESCRIPTION     The function cfir( ) computes the convolution of the complex input data ( ii[ ], iq[ ] ) with the complex coefficients ( ci[ ], cq[ ] ), placing the results in oi[ ] and oq[ ] respectively. The number of output samples n and the number of coefficients p may be dissimilar. n elements will be written to oi[ ] and oq[ ].

                    The vectors ii[ ] and iq[ ] represent the real and imaginary (I and Q) components of the input data respectively. Likewise, the vectors ci[ ] and cq[ ] represent the real and imaginary (I and Q) components of the coefficient data. A complex multiply and add is performed to compute the convolutional output. The decimation factor d is the stride, in input samples, between the starting ii[ ] and iq[ ] data of successive outputs.

    ALGORITHM       $$c[i] = \sum_{j=0}^{p-1} a[i \cdot d + j] \cdot b[p - j - 1], \qquad i = \{ 0, 1, 2, \ldots, n - 1 \}$$

                    where
                    a[ ] comprises complex components ii[ ] and iq[ ]
                    b[ ] comprises complex components ci[ ] and cq[ ]
                    c[ ] comprises complex components oi[ ] and oq[ ]

    SYNOPSIS        void cfir ( ii, iq, ci, cq, oi, oq, d, n, p )
                    float dm *ii ;   /* Input samples for I data ( len n+p-1 )   */
                    float dm *iq ;   /* Input samples for Q data ( len n+p-1 )   */
                    float pm *ci ;   /* Coefficients for I data ( len p )        */
                    float pm *cq ;   /* Coefficients for Q data ( len p )        */
                    float dm *oi ;   /* Output samples for I data ( len n )      */
                    float dm *oq ;   /* Output samples for Q data ( len n )      */
                    int      d ;     /* Decimation factor                        */
                    int      n ;     /* Number of output samples                 */
                    int      p ;     /* Number of coefficients                   */
  • 48. cfir ( ii, iq, ci, cq, oi, oq, d, n, p )

    DOMAIN          -3.4E+38 to +3.4E+38

    ACCURACY        7.75 decimal digits

    EXECUTION TIME  59 + ( 9 + 5 * p ) * n cycles

    NOTES           The file tfir.c included in the distribution tape provides an example of this function's use.

                    The number of filter output samples to generate can be obtained as follows:

                    $$n = \frac{ndata - p}{d} + 1$$

                    where ndata is the number of elements in ii[ ] and iq[ ]. With d = 1 this gives ndata = n + p - 1, the input length listed in the synopsis.

                    A correlation can be performed by reversing the order of the coefficients vector.
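The cfir( ) algorithm and output-count formula can be sketched in portable C as follows. This is an illustrative reference under stated assumptions, not the optimized library routine: the names cfir_ref and cfir_num_outputs are hypothetical, the dm/pm qualifiers are omitted, the output count is read as n = (ndata − p)/d + 1 (chosen because it reproduces the input length n + p − 1 at d = 1), and the integer division rounding down is also an assumption.

```c
#include <assert.h>

/* Number of outputs available from ndata inputs, p taps, decimation d.
 * Assumes the reading n = (ndata - p) / d + 1 with integer division. */
int cfir_num_outputs(int ndata, int p, int d)
{
    assert(d >= 1 && p >= 1 && ndata >= p);
    return (ndata - p) / d + 1;
}

/* Reference complex FIR: o[i] = sum over j of a[i*d + j] * b[p - j - 1],
 * with a = ii + i*iq and b = ci + i*cq. Hypothetical checking aid. */
void cfir_ref(const float *ii, const float *iq,
              const float *ci, const float *cq,
              float *oi, float *oq, int d, int n, int p)
{
    for (int i = 0; i < n; i++) {
        float acc_i = 0.0f, acc_q = 0.0f;
        for (int j = 0; j < p; j++) {
            float ar = ii[i * d + j], ai = iq[i * d + j];
            float br = ci[p - j - 1], bi = cq[p - j - 1];
            acc_i += ar * br - ai * bi;   /* real part of a * b      */
            acc_q += ar * bi + ai * br;   /* imaginary part of a * b */
        }
        oi[i] = acc_i;
        oq[i] = acc_q;
    }
}
```

Because the coefficients are indexed as b[p − j − 1], passing them in reversed order turns the convolution into a correlation, as the NOTES mention.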
  • 49. chksum ( a, i, type, n )

    NAME            Perform Checksum

    DESCRIPTION     This function performs a checksum on a memory block. The memory block is defined by the start address a and the length n. The type flag determines whether dm or pm memory is tested ( 1 = dm, 0 = pm ).

    ALGORITHM       Return ⇐ Checksum

    SYNOPSIS        void chksum ( a, i, type, n )
                    int a ;      /* Start address of memory                */
                    int i ;      /* Memory stride                          */
                    int type ;   /* Type of memory to test ( dm or pm )    */
                    int n ;      /* Length of block to be checked          */

    DOMAIN          -3.4E+38 to +3.4E+38

    ACCURACY        7.75 decimal digits

    EXECUTION TIME  17 + 2 * N cycles

    NOTES           The file tchksum.c included in the distribution diskette provides an example of this function's use.

                    chksum( ) performs a two's complement on the sum of the elements within the memory block. The checksum value is returned.
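The chksum( ) notes describe a two's complement of the sum of the block's elements. A minimal portable sketch of that idea follows. Several details are assumptions not stated on the page: 32-bit unsigned words, n counted in elements visited, and wrap-around addition. The name checksum_ref is hypothetical, and the dm/pm type flag is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two's complement of the wrapped sum of n words, visiting every
 * stride-th word starting at a. Hypothetical reference, assuming
 * 32-bit words; it does not model the dm/pm memory type flag. */
uint32_t checksum_ref(const uint32_t *a, int stride, int n)
{
    uint32_t sum = 0;
    assert(stride >= 1 && n >= 0);
    for (int k = 0; k < n; k++)
        sum += a[(size_t)k * (size_t)stride];   /* wraps modulo 2^32 */
    return (uint32_t)(0u - sum);                /* two's complement  */
}
```

A useful property of this form is that adding the checksum back to the sum of the block gives zero, so a block can be verified with a single comparison.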
  • 50. cmadd ( a, b, x, y, c )

    NAME            Complex Matrix Addition

    DESCRIPTION     This function computes the addition of complex input matrix a with complex input matrix b and stores the results to complex output matrix c.

    ALGORITHM       $$\begin{bmatrix} C_{ri11} & C_{ri12} & C_{ri13} \\ C_{ri21} & C_{ri22} & C_{ri23} \end{bmatrix} = \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix} + \begin{bmatrix} B_{ri11} & B_{ri12} & B_{ri13} \\ B_{ri21} & B_{ri22} & B_{ri23} \end{bmatrix}$$

                    where ri indicates a real and imaginary component

    SYNOPSIS        void cmadd ( a, b, x, y, c )
                    complex dm *a ;   /* Pointer to input matrix a[ ][ ]                 */
                    complex dm *b ;   /* Pointer to input matrix b[ ][ ]                 */
                    int        x ;    /* Number of rows in matrix a[ ][ ] & b[ ][ ]      */
                    int        y ;    /* Number of columns in matrix a[ ][ ] & b[ ][ ]   */
                    complex dm *c ;   /* Pointer to output matrix c[ ][ ]                */

    DOMAIN          -3.4E+38 to +3.4E+38

    ACCURACY        7.75 decimal digits
  • 51. cmadd ( a, b, x, y, c )

    EXECUTION TIME  32 + 3*X*Y cycles

    NOTES           The file tcmadd.c included in the distribution diskette provides an example of this function's use.

                    The addition of a complex matrix is mathematically expressed as follows:

                    $$C[x][y]_{Real} = A[x][y]_{Real} + B[x][y]_{Real}$$

                    $$C[x][y]_{Imaginary} = A[x][y]_{Imaginary} + B[x][y]_{Imaginary}$$

                    An example of the addition of one complex matrix to another is as follows, with x = 6 and y = 4. Each element is written as ( real, imaginary ).

                    A[x][y] =
                        (1, 2)        (3.8, 1.7)    (8.8, 5.5)    (9.9, 14)
                        (7.1, 5)      (9.3, 1.6)    (0.4, 1)      (51, 3.3)
                        (0.9, 1)      (8, 5)        (2.1, 6)      (-3.1, -1)
                        (9.3, 1)      (2.5, 1.5)    (6.9, 9)      (10, 22.1)
                        (1.3, 1.4)    (0.2, 4.5)    (0.9, 51.4)   (1.5, 4.4)
                        (9.2, 4)      (7.8, 1.7)    (61, 3.4)     (14.3, 1.4)

                    B[x][y] =
                        (3.2, 1)      (8.8, 2)      (9.9, 3)      (44.3, 13.3)
                        (8.1, 4)      (6.5, 5)      (3.2, 6)      (-2.3, -9.9)
                        (8.9, 7)      (2.8, 8)      (1.7, 9)      (-8.1, -2.2)
                        (6.4, 10)     (11, 1.3)     (12, 4.5)     (22.9, -5.4)
                        (6.5, 7)      (2.1, 8)      (2.2, 9)      (32, 9.8)
                        (1.1, 4)      (7.7, 5)      (4.4, 6)      (-2.1, -0.3)

                    C[x][y] =
                        (4.2, 3)      (12.6, 3.7)   (18.7, 8.5)   (54.2, 27.3)
                        (15.2, 9)     (15.8, 6.6)   (3.6, 7)      (48.7, -6.6)
                        (9.8, 8)      (10.8, 13)    (3.8, 15)     (-11.2, -3.2)
                        (15.7, 11)    (13.5, 2.8)   (18.9, 13.5)  (32.9, 16.7)
                        (7.8, 8.4)    (2.3, 12.5)   (3.1, 60.4)   (33.5, 14.2)
                        (10.3, 8)     (15.5, 6.7)   (65.4, 9.4)   (12.2, 1.1)
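The element-wise rule for cmadd( ) can be sketched in portable C. The struct name cplx and its { re, im } layout are assumptions standing in for the library's complex type, whose definition is not shown on this page; cmadd_ref is a hypothetical name and the dm qualifier is omitted.

```c
#include <assert.h>

/* Assumed layout for a complex element: real part, then imaginary part. */
typedef struct { float re; float im; } cplx;

/* Element-wise complex addition of two x-by-y matrices stored by rows.
 * Hypothetical reference, not the optimized cmadd(). */
void cmadd_ref(const cplx *a, const cplx *b, int x, int y, cplx *c)
{
    assert(x > 0 && y > 0);
    for (int k = 0; k < x * y; k++) {
        c[k].re = a[k].re + b[k].re;
        c[k].im = a[k].im + b[k].im;
    }
}
```

Since the matrices are stored by rows as one long vector, the addition needs only a single loop over x * y elements, which matches the shape of the 32 + 3*X*Y cycle count.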
  • 52. cmmov ( a, x, y, b )

    NAME            Complex Matrix Move

    DESCRIPTION     This function moves a source complex input matrix a to a destination complex output matrix b.

    ALGORITHM       $$\begin{bmatrix} B_{ri11} & B_{ri12} & B_{ri13} \\ B_{ri21} & B_{ri22} & B_{ri23} \end{bmatrix} \Leftarrow \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix}$$

                    where ri indicates a real and imaginary component

    SYNOPSIS        void cmmov ( a, x, y, b )
                    complex dm *a ;   /* Pointer to input matrix a[ ][ ]         */
                    int        x ;    /* Number of rows in matrix a[ ][ ]        */
                    int        y ;    /* Number of columns in matrix a[ ][ ]     */
                    complex dm *b ;   /* Pointer to output matrix b[ ][ ]        */

    DOMAIN          -3.4E+38 to +3.4E+38

    ACCURACY        7.75 decimal digits

    EXECUTION TIME  13 + ( 2 * X * Y ) cycles

    NOTES           The file tcmmov.c included in the distribution diskette provides an example of this function's use.

                    The storage methodology for matrices is by rows. Matrices can be thought of as one long array (vector) in which each row begins y elements after the start of the previous row, y being the number of columns.
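The row-wise storage described in the cmmov( ) notes can be made concrete with a small portable C sketch. The cplx struct layout and the names rm_index and cmmov_ref are assumptions for illustration; the library's complex type and dm qualifier are not reproduced.

```c
#include <assert.h>
#include <string.h>

/* Assumed layout for a complex element: real part, then imaginary part. */
typedef struct { float re; float im; } cplx;

/* Linear index of element (r, c), zero-based, in a matrix stored by rows
 * with y columns: each row starts y elements after the previous one. */
int rm_index(int r, int c, int y)
{
    return r * y + c;
}

/* Copy an x-by-y complex matrix; with row storage the whole matrix is
 * one contiguous vector of x * y elements. Hypothetical reference. */
void cmmov_ref(const cplx *a, int x, int y, cplx *b)
{
    assert(x > 0 && y > 0);
    memcpy(b, a, (size_t)x * (size_t)y * sizeof *a);
}
```

For a 2-by-3 matrix, element (1, 0), the first element of the second row, sits at linear index 3.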
  • 53. cmmul ( a, x, y, b, z, c )

    NAME            Complex Matrix Multiplication

    DESCRIPTION     This function computes the multiplication of complex input matrix a times complex input matrix b and stores the results to complex output matrix c. The dimension of complex matrix a[ ][ ] is x and y and the dimension of complex input matrix b[ ][ ] is y and z. The resulting complex output matrix c[ ][ ] is of dimension x and z.

    ALGORITHM       $$\begin{bmatrix} C_{ri11} & C_{ri12} \\ C_{ri21} & C_{ri22} \end{bmatrix} = \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix} \cdot \begin{bmatrix} B_{ri11} & B_{ri12} \\ B_{ri21} & B_{ri22} \\ B_{ri31} & B_{ri32} \end{bmatrix}$$

                    where ri indicates a real and imaginary component

    SYNOPSIS        void cmmul ( a, x, y, b, z, c )
                    complex dm *a ;   /* Pointer to input matrix a[ ][ ]         */
                    int        x ;    /* Number of rows in matrix a[ ][ ]        */
                    int        y ;    /* Number of columns in matrix a[ ][ ]     */
                                      /* Number of rows in matrix b[ ][ ]        */
                    complex dm *b ;   /* Pointer to input matrix b[ ][ ]         */
                    int        z ;    /* Number of columns in matrix b[ ][ ]     */
                    complex dm *c ;   /* Pointer to output matrix c[ ][ ]        */

    DOMAIN          -3.4E+38 to +3.4E+38

    ACCURACY        7.75 decimal digits
  • 54. cmmul ( a, x, y, b, z, c )

    EXECUTION TIME  45 + ( 4 + ( 10 + 5 * Y ) * Z ) * X cycles

    NOTES           The file tcmmul.c included in the distribution diskette provides an example of this function's use.

                    The multiplication of a complex matrix is as follows, for each output row r = 1, ..., x and output column c = 1, ..., z:

                    $$C[r][c] = \sum_{k=1}^{y} \left( \text{Real Sum} + i \cdot \text{Imaginary Sum} \right)$$

                    where, with A = A[r][k] and B = B[k][c],

                    $$\text{Real Sum} = A_{Real} \cdot B_{Real} - A_{Imaginary} \cdot B_{Imaginary}$$

                    $$\text{Imaginary Sum} = A_{Real} \cdot B_{Imaginary} + B_{Real} \cdot A_{Imaginary}$$

                    The storage methodology for matrices is by rows. Matrices can be thought of as one long array (vector) in which each row begins one row length (the number of columns) after the start of the previous row.

                    The first row of a[ ][ ] times the first column of b[ ][ ] is the first element of c[ ][ ] ( row 1, column 1 ). The first row of a[ ][ ] times the second column of b[ ][ ] is the second element of c[ ][ ] ( row 1, column 2 ), and so on.

                    This algorithm follows the general law of matrix multiplication whereby the number of columns of input matrix a must equal the number of rows of input matrix b.

                    ri indicates that each component of the matrix is a complex number which has both a real and an imaginary component.

                    An example of the multiplication of one complex matrix by another is as follows, with x = 3, y = 4 and z = 3. Each element is written as ( real, imaginary ).

                    A[x][y] =
                        (1, 1)      (2, 2)      (3, 3)      (4, 4)
                        (5, 5)      (6, 6)      (7, 7)      (8, 8)
                        (9, 9)      (10, 10)    (11, 11)    (12, 12)

                    B[y][z] =
                        (1, 2)      (3, 4)      (5, 6)
                        (7, 8)      (9, 10)     (11, 12)
                        (13, 14)    (15, 16)    (17, 18)
                        (19, 20)    (21, 22)    (23, 24)

                    C[x][z] =
                        (-10, 270)    (-10, 310)     (-10, 350)
                        (-26, 606)    (-26, 710)     (-26, 814)
                        (-42, 942)    (-42, 1110)    (-42, 1278)
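The cmmul( ) row-by-column rule can be sketched in portable C as a plain triple loop. This is an illustrative reference, not the optimized routine: the cplx struct layout and the name cmmul_ref are assumptions, and the dm/pm qualifiers are omitted. The argument order follows the synopsis.

```c
#include <assert.h>

/* Assumed layout for a complex element: real part, then imaginary part. */
typedef struct { float re; float im; } cplx;

/* c (x-by-z) = a (x-by-y) times b (y-by-z), all stored by rows.
 * Hypothetical reference for checking small cases. */
void cmmul_ref(const cplx *a, int x, int y, const cplx *b, int z, cplx *c)
{
    assert(x > 0 && y > 0 && z > 0);
    for (int r = 0; r < x; r++) {
        for (int col = 0; col < z; col++) {
            float sum_re = 0.0f, sum_im = 0.0f;
            for (int k = 0; k < y; k++) {
                cplx p = a[r * y + k];      /* row r of a, element k    */
                cplx q = b[k * z + col];    /* column col of b, element k */
                sum_re += p.re * q.re - p.im * q.im;   /* Real Sum      */
                sum_im += p.re * q.im + q.re * p.im;   /* Imaginary Sum */
            }
            c[r * z + col].re = sum_re;
            c[r * z + col].im = sum_im;
        }
    }
}
```

Feeding this sketch the 3-by-4 and 4-by-3 matrices from the example should reproduce the listed result, for instance (-10, 270) in row 1, column 1 and (-42, 1278) in row 3, column 3.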
  • 55. cmmul_dpd ( a, x, y, b, z, c )

    NAME            Complex Matrix Multiplication (Data Memory x Program Memory to Data Memory)

    DESCRIPTION     This function computes the multiplication of complex input matrix a[ ] (in data memory) times complex input matrix b[ ] (in program memory) and stores the results to complex output matrix c[ ] (in data memory). The dimension of complex matrix a[ ][ ] is x and y and the dimension of complex input matrix b[ ][ ] is y and z. The resulting complex output matrix c[ ][ ] is of dimension x and z.

    ALGORITHM       $$\begin{bmatrix} C_{ri11} & C_{ri12} \\ C_{ri21} & C_{ri22} \end{bmatrix} = \begin{bmatrix} A_{ri11} & A_{ri12} & A_{ri13} \\ A_{ri21} & A_{ri22} & A_{ri23} \end{bmatrix} \cdot \begin{bmatrix} B_{ri11} & B_{ri12} \\ B_{ri21} & B_{ri22} \\ B_{ri31} & B_{ri32} \end{bmatrix}$$

                    where ri indicates a real and imaginary component

    SYNOPSIS        void cmmul_dpd ( a, x, y, b, z, c )
                    complex dm *a ;   /* Pointer to input matrix a[ ][ ]         */
                    int        x ;    /* Number of rows in matrix a[ ][ ]        */
                    int        y ;    /* Number of columns in matrix a[ ][ ]     */
                                      /* Number of rows in matrix b[ ][ ]        */
                    complex pm *b ;   /* Pointer to input matrix b[ ][ ]         */
                    int        z ;    /* Number of columns in matrix b[ ][ ]     */
                    complex dm *c ;   /* Pointer to output matrix c[ ][ ]        */

    DOMAIN          -3.4E+38 to +3.4E+38

    ACCURACY        7.75 decimal digits