SlideShare ist ein Scribd-Unternehmen logo
1 von 22
Downloaden Sie, um offline zu lesen
Smart Processor Picks for Consumer Media Applications




         Optimized DSP Software • Independent DSP Analysis




                  Introduction to Video Compression
                                                      (Workshop 210)



                                            Berkeley Design Technology, Inc.

                                                            info@BDTI.com
                                                         http://www.BDTI.com



                                                    © 2004 Berkeley Design Technology, Inc.




            Outline
             • Motivation and scope
             • Still-image compression techniques
             • Motion estimation and compensation
             • Reducing artifacts
             • Color conversion
             • Conclusions




          © 2004 Berkeley Design Technology, Inc.                                             2



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                       Page 1                          April 2004
Smart Processor Picks for Consumer Media Applications




            Motivation and Scope
             •   Consumer video products increasingly rely on video
                 compression
                  •   DVDs, digital TV, personal video recorders, Internet video,
                      multimedia jukeboxes, video-capable cell phones and PDAs,
                      camcorders…
             •   Video product developers need to understand the
                 operation of video “codecs”
                  •   To select codecs, processors, software modules
                  •   To optimize software
             •   This presentation covers:
                  •   Operation of video codecs and post-processing
                  •   Computational and memory demands of key codec and post-
                      processing components

          © 2004 Berkeley Design Technology, Inc.                                   3




            Outline
             • Motivation and scope
             • Still-image compression techniques
             • Motion estimation and compensation
             • Reducing artifacts
             • Color conversion
             • Conclusions




          © 2004 Berkeley Design Technology, Inc.                                   4



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                          Page 2                             April 2004
Smart Processor Picks for Consumer Media Applications




            Still-Image Compression
             •   Still-image compression
                  •   Still-image techniques provide a basis for video
                      compression
                  •   Video can be compressed using still-image
                      compression individually on each frame
                  •   E.g., “Motion JPEG” or MJPEG
             •   But modern video codecs go well beyond this
                  •   Start with still-image compression techniques
                  •   Add motion estimation/compensation
                       •   Takes advantage of similarities between frames in a video
                           sequence



          © 2004 Berkeley Design Technology, Inc.                                        5




            Still-Image Compression
                                                           Transform

                                                           Quantization

                                                        Run-length coding

                                                      Variable length coding


                      Frame                                                     Frame
                      Header                                                    Header




                           Typical Still-Image Compression Data Flow

          © 2004 Berkeley Design Technology, Inc.                                        6



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                          Page 3                                  April 2004
Smart Processor Picks for Consumer Media Applications




            Block Transform: 8x8 DCT
                                                       • 8x8 DCT blocks applied on Y, U,
                                                         and V planes individually
                                                       • The energy is concentrated in the
                                                         low frequencies
                                                       • Perceptual information also
                            Y values                     concentrated in low frequencies
                                                                               High energy

                                             8x8 DCT                           Low energy



                                            8x8 IDCT


                Spatial domain                           Frequency domain
          © 2004 Berkeley Design Technology, Inc.                                            7




            Block Transform: Resource Reqt’s
            •   Compute load:
                 • Up to 30% of total video decoder processor cycles
                 • MPEG-4 CIF (352x288) @ 30 fps:
                     •   71,280 DCTs/s
                     •   ~40 MHz on a TMS320C55x DSP
                     •   ~10 MHz if using TMS320C55x DCT accelerator
                 •   Many implementation and optimization options
                     •   Can make compute requirements hard to predict
            •   Memory usage: negligible




          © 2004 Berkeley Design Technology, Inc.                                            8



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                              Page 4                                  April 2004
Smart Processor Picks for Consumer Media Applications




                         Quantization
         Original




                                                                       Quantizer
         Reconstructed




                                                                     •Removes perceptually
                                                                      less significant data
                             Spatial domain              Frequency
                                                         domain      •Reduces the number of
                                                                      bits per DCT coefficient
               © 2004 Berkeley Design Technology, Inc.                                         9




                         Quantization: Resource Reqt’s
                         • Quantization (encoder) and dequantization
                           (decoder and encoder) have similar compute
                           loads
                         • Compute load:
                            • From 3% to about 15% of total decoder
                              processor cycles
                                 •   Typically near the lower end of this range
                             •   MPEG-4 CIF (352x288) @ 30 fps:
                                 •   ~10 MHz on a TMS320C55x DSP (estimated)
                         •   Memory usage: negligible


               © 2004 Berkeley Design Technology, Inc.                                        10



                                            © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                Page 5                                  April 2004
Smart Processor Picks for Consumer Media Applications




            Coding Quantized DCT Coefficients
               Goal: Reduce the number of bits required to
               transmit the quantized coefficients
                     Occurrence




                                                Quantized DCT values


               Observation: Unequal distribution of quantized
               DCT coefficient values
          © 2004 Berkeley Design Technology, Inc.                                11




            Variable Length Coding (VLC/VLD)
                 •   Allocates fewer bits to the most frequent symbols
                     (e.g., using Huffman)
                 •   Integer number of bits per symbol
                      •   Not the most efficient coding method
                      •   Arithmetic coding more efficient, but expensive
                      •   Run-length coding improves efficiency of VLC/VLD for
                          image and video coding

                          Symbol                    Frequency       Code
                           A                         22             1
                           B                         16             011
                           C                         9              0101
                           D                         7              0100
                           E                         4              0011
                           F                         2              0010
                           ...                       ...            ...
          © 2004 Berkeley Design Technology, Inc.                                12



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                Page 6                     April 2004
Smart Processor Picks for Consumer Media Applications




            Run-Length Coding
            Encodes value and number of successive
             occurrences
            Takes advantage of the high number of
             recurring zeros




                                         20,1 ; 18,1 ; 20,1 ; 18,4 ; ... ; 0, 37
          © 2004 Berkeley Design Technology, Inc.                                  13




            Variable Length Coding: Processing
            Reqt’s
            •   VL decoding much more computationally demanding
                than VL encoding
            •   VLD compute load:
                 • Up to 25% of total video decoder processor cycles
                 • MPEG-4 CIF (352x288) @ 30 fps, 700 kbps:
                     •   ~15-25 MHz on a TMS320C55x DSP (estimated)
                 •About 11 operations per bit on average
            •   Memory usage
                • A few KB of memory for lookup tables
                • More for speed optimizations


          © 2004 Berkeley Design Technology, Inc.                                  14



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                            Page 7                           April 2004
Smart Processor Picks for Consumer Media Applications




            Resynchronization Markers
             •   Without markers, a single bit error in the coded
                 bitstream prevents decoding of the rest of the frame
                  •   Size of a corrupted variable-length code word is unknown
                  •   Therefore, the start of the next code word (and all following
                      code words) is unknown
             •   Resynchronization markers help the decoder recover
                 from bitstream errors
                  •   Provide a known bit pattern interspersed throughout the
                      bitstream
                  •   In case of an error, decoder searches for next marker, then
                      continues decoding

                                  Corrupted codeword            Un-decodable data
                        Frame                                                         Frame
                        Header                                                        Header



                                                                   Resynchronization markers
          © 2004 Berkeley Design Technology, Inc.                                              15




            Reversible VLC
             •   Reversible VLC, in conjunction with resynchronization
                 markers, further assists error recovery
                  •   Code words can be decoded forward and backward
                  •   In case of error, data can be decoded backward from the
                      next resynchronization marker
                  •   More data is recovered compared to resynchronization
                      markers alone


                                                       Data decoded       Normal decoding
                                                       backward           continues forward
                                 Corrupted codeword
                      Frame                                                         Frame
                      Header                                                        Header




                                                                  Resynchronization markers

          © 2004 Berkeley Design Technology, Inc.                                              16



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                            Page 8                                       April 2004
Smart Processor Picks for Consumer Media Applications




            AC-DC Prediction                                 Spatial        Frequency
                                                             domain         domain




                   Current block
                   Candidates not selected
                   for prediction
                   Candidate selected for
                   prediction
                                         AC-DC prediction: encode first
                                         differential row and column
                                                    DC prediction: encode
          © 2004 Berkeley Design Technology, Inc.
                                                    differential DC value               17




            AC-DC Prediction

            • AC-DC    prediction cannot be used in
              conjunction with motion compensation
               • ⇒ Used mostly for compressing a
                 single image
               • DC prediction used in JPEG
            • AC-DC prediction often uses simple
              filters to predict each coefficient value
              from one or more adjacent blocks

          © 2004 Berkeley Design Technology, Inc.                                       18



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                Page 9                            April 2004
Smart Processor Picks for Consumer Media Applications




            AC-DC Prediction: Processing Reqt’s
            •   Compute load:
                 • DC prediction has negligible load
                 • AC-DC prediction used in about 8% of frames in
                   typical video
                        •   Negligible average load (~2% of processor cycles in
                            decoder)
                        •   Substantial per-frame load (~20-30% of cycles required to
                            decode a frame that uses AC-DC prediction)
            •   Memory usage:
                • Under 2 KB for CIF (352x288) resolution
                        •   But more memory (up to 10 KB) can result in faster code
                    •   May be overlapped with other memory structures
                        not in use during prediction

          © 2004 Berkeley Design Technology, Inc.                                       19




            Outline
                • Motivation and scope
                • Still-image compression techniques
                • Motion estimation and compensation
                • Reducing artifacts
                • Color conversion
                • Conclusions




          © 2004 Berkeley Design Technology, Inc.                                       20



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 10                                  April 2004
Smart Processor Picks for Consumer Media Applications




            Motion Estimation and Compensation
             •   Still image compression ignores the
                 correlation between frames of video
                  •   JPEG achieves ~10:1 compression ratio
                  •   Wavelet transform-based image coding reaches
                      compression ratios up to ~30:1
             •   Adding motion estimation and compensation
                 results in much higher compression ratios
                  •   Good video quality at compression ratios as high
                      as ~200:1



          © 2004 Berkeley Design Technology, Inc.                                             21




            Motion Estimation and Compensation
                                                          •   I frame is encoded as a still
             •   Requires at least one                        image and doesn’t depend on
                 “reference frame”                            any reference frame
                  •   Reference frame must be
                      encoded before the                        I
                      current frame
                  •   But, reference frame can            •   P frame depends on previously
                                                              displayed reference frame
                      be a future frame in the
                      display sequence
             •   Three kinds of frames:                                            P
                 I, P, and B                              •   B frame depends on previous
                                                              and future reference frames


                                                                      B

          © 2004 Berkeley Design Technology, Inc.                                             22



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 11                                        April 2004
Smart Processor Picks for Consumer Media Applications




            Typical Sequence of I, P, B frames
                                                           Display order (left to right)
                                                       I
                      Encoding order (top to bottom)
                                                                           P
                                                             B
                                                                   B
                                                                       B
                                                                                               P
                                                                               B
                                                                                     B
                                                                                          B
                                                                                                                  P
                                                                                                   B
                                                                                                       B
                                                                                                              B
                                                                                                                       I



          © 2004 Berkeley Design Technology, Inc.                                                                              23




            Motion Estimation
             •   Predict the contents of each macroblock
                 based on motion relative to reference frame
                  •   Search reference frame for a 16x16 region that
                      matches the macroblock
                  •   Encode motion vectors
                  •   Encode difference between predicted and actual
                      macroblock pixels



                                                                                                                      MV=
                                                                                         MV                           Motion
                                                                                                                      Vector

                                                            Reference frame                   Current frame
          © 2004 Berkeley Design Technology, Inc.                                                                              24



                                                                 © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                                        Page 12                                          April 2004
Smart Processor Picks for Consumer Media Applications




             Video Encoder Block Diagram
          (Current MB)
                              +
           Input                                8x8 DCT                  Quantize                       VLC + scan
            *MB          Diff block   -         (3.4%)                   (0.4%)                         (7.3%)
                         (2.5%)

                                                                         Inv-Quantize
                                                                         (0.5%)
                                                Intra/Inter
                                                mode

                                                                         8x8 IDCT
                                0                                        (1.7%)                          Bitstream
                                                                                                         MUX
                                                                                                                            Bitstream Out
                                                                                    +
                               Motion                                      +          Add block
                               Compensation                                           (2.9%)                                Encode Intra MB
                               (1%)
                                                                                                                            (1.8%)
                                                                           Frame
                                                                           Store                                            Encode Inter MB
                                                                                    (Reference MB)                          control code
                   Motion Estimation (63.4%):
                                                                                                                            (3.9%)
                   -     Fast Integer ME: 37%
                   -     Half-Pel + 4MV refine: 26%                                                                         Misc functions
                                                                                                         MV Coder
                                                                                                                            (4.1%)

           © 2004 Berkeley Design Technology, Inc.                                                                                            25
                                                                            *MB – Macro-Block        Source: Motorola SPS




             Motion Estimation: The Problem
                                                                                                                     Search area




          • Search on 16x16 blocks                                      • Sub-pixel interpolation required
                                                                          for non-integer motion vectors
          • Typically on luminance only
                                                                  or

                                                Σ|Y[i]|       2
                                                                   Σ|Y[i]|       • SAD more often used
                                              SSD                 SAD


           © 2004 Berkeley Design Technology, Inc.                                                                                            26



                                          © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                            Page 13                                                                     April 2004
Smart Processor Picks for Consumer Media Applications




            Motion Estimation: Efficient Motion
            Vector Search
                                       • Exhaustive search is impractical
                                       • Evaluate only promising candidate
                                                                           ~
                                         motion vectors
                                            •   Don’t have to find absolute best match
                                                    •   Trade video quality for computational load
                                            •   Many methods in use                                   ~Σ
                                                    •   Often proprietary
                                                    •   Refine candidate vector selection in stages
                                                    •   Predict candidate vectors from surrounding
                                                        macroblocks and/or previous frames


         Motion vector search approach is a key differentiator
         between video encoder implementations
          © 2004 Berkeley Design Technology, Inc.                                                      27




            Motion Estimation: Processing
            Reqt’s
             •   Compute load
                  •   Most demanding task in video compression
                       •   Up to 80% of total encoder processor cycles
                       •   Many search methods exist; requirements vary by method
                       •   May vary with video program content
                       •   Makes encoder computational demand several times
                           greater than that of the decoder
                       •   Dominated by SAD computation
             •   Memory usage
                  •   Motion estimation requires reference frame buffers
                       •   Frame buffers dominate the memory requirements of the
                           encoder
                       •   E.g., 152,064 bytes per frame @ CIF (352x288) resolution
                  •   High memory bandwidth required
          © 2004 Berkeley Design Technology, Inc.                                                      28



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                                   Page 14                                       April 2004
Smart Processor Picks for Consumer Media Applications




            Motion Compensation: Processing
            Reqt’s
             •   Motion compensation copies pixels from reference
                 frame to predict current macroblock
                  •   Requires interpolation for non-integer motion vector values
             •   Compute load
                  •   Varies with video program content
                  •   Can require from 5% to 40% of total decoder processor
                      cycles
                  •   MPEG-4 CIF (352x288) @ 30 fps:
                       •   Roughly 15-25 MHz on a TMS320C55x DSP (estimated)
             •   Memory usage
                  •   Requires reference frame buffers
                       •   Frame buffers dominate decoder memory requirements
                  •   Good memory bandwidth is desirable, but less critical
                      compared to motion estimation

          © 2004 Berkeley Design Technology, Inc.                                   29




            Outline
             • Motivation and scope
             • Still-image compression techniques
             • Motion estimation and compensation
             • Reducing artifacts
             • Color conversion
             • Conclusions




          © 2004 Berkeley Design Technology, Inc.                                   30



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 15                              April 2004
Smart Processor Picks for Consumer Media Applications




            Artifacts: Blocking and Ringing
             •   Blocking: Borders of 8x8 blocks become
                 visible in reconstructed frame




             •   Ringing: Distortions near edges of image
                 features
                                                                     Reconstructed
                                      Original                       image
                                      image                          (with ringing
                                                                     Artifacts)

          © 2004 Berkeley Design Technology, Inc.                                    31




            Deblocking and Deringing Filters
               Low-pass filters are used to smooth the
               image where artifacts occur
             • Deblocking:
                  •   Low-pass filter the pixels at borders of 8x8 blocks
                  •   One-dimensional filter applied perpendicular to 8x8
                      block borders
             •   Deringing:
                  •   Detect edges of image features
                  •   Adaptively apply 2D filter to smooth out areas near
                      edges
                  •   Little or no filtering applied to edge pixels in order
                      to avoid blurring

          © 2004 Berkeley Design Technology, Inc.                                    32



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 16                               April 2004
Smart Processor Picks for Consumer Media Applications




            Artifact Reduction
            Example: Deblocking




                 Original MPEG Still Frame                 Horizontally & Vertically
                                                            Deblocked Still Frame

          © 2004 Berkeley Design Technology, Inc.                                           33




            Artifact Reduction: Post-processing
            vs. In-loop
             •   Deblocking/deringing often applied after                 Video     Reference
                 the decoder (post-processing)                            Decoder   frame

                  •   Reference frames are not filtered
                                                                          Deblock
                  •   Developers free to select best filters for the      Dering
                      application or not filter at all

             •   Deblocking/deringing can be incorporated
                 in the compression algorithm (in-loop                    Video
                 filtering)                                               Decoder
                                                                                    Reference
                  •   Reference frames are filtered                       Deblock
                                                                                    frame

                  •   Same filters must be applied in encoder and         Dering
                      decoder
                  •   Better image quality at very low bit-rates

          © 2004 Berkeley Design Technology, Inc.                                           34



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 17                                      April 2004
Smart Processor Picks for Consumer Media Applications




            Artifact Reduction: Processing Reqt’s
             •   Deblocking and deringing filters can require
                 more processor cycles than the video decoder
                  •   Example: MPEG-4 Simple Profile, Level 1
                      (176x144, 15 fps) decoding requires 14 MIPS on
                      ARM’s ARM9E for a relatively complex video
                      sequence
                  •   With deblocking and deringing added, load
                      increases to 39 MIPS
                       •   Nearly 3x increase compared to MPEG-4 decoding alone!
             •   Post-processing may require an additional
                 frame buffer

          © 2004 Berkeley Design Technology, Inc.                                  35




            Outline
             • Motivation and scope
             • Still-image compression techniques
             • Motion estimation and compensation
             • Reducing artifacts
             • Color conversion
             • Conclusions




          © 2004 Berkeley Design Technology, Inc.                                  36



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 18                             April 2004
Smart Processor Picks for Consumer Media Applications




            Color Space Conversion
                 •   Need for color conversion
                     •   Capture and display video equipment: RGB...
                     •   ...while codecs use YUV
                 •   Computational demand
                     •   12 operations per pixel ⇒ 36 million
                         operations/second for CIF (352x288) @ 30 fps
                          •   About 36 MHz on a TMS320C55x DSP
                          •   Not including interpolation of chrominance planes
                     •   Roughly 1/3 to 2/3 as many processor cycles as
                         the video decoder


          © 2004 Berkeley Design Technology, Inc.                                 37




            Outline
             • Motivation and scope
             • Still-image compression techniques
             • Motion estimation and compensation
             • Reducing artifacts
             • Color conversion
             • Conclusions




          © 2004 Berkeley Design Technology, Inc.                                 38



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 19                            April 2004
Smart Processor Picks for Consumer Media Applications




            Conclusions
            Understanding the computational and memory
            requirements of video compression is critical but
            challenging
                  •   Application design choices are driven by computational and
                      memory requirements
                       •   Algorithm selection, processor selection, software optimization
                       •   Video processing often stresses processing resources
                  •   But video applications combine many different signal-
                      processing techniques
                       •   Transforms, prediction, quantization, entropy coding, image
                           filtering, etc.
                  •   And there is large variation in computational and memory
                      requirements among different applications
                       •   E.g., digital camcorder has vastly different requirements from a
                           video-enabled cell phone, even when using the same
                           compression standard


          © 2004 Berkeley Design Technology, Inc.                                             39




            Conclusions, cont.
             •   Understanding computational load
                  •   Computational load of encoder is several times greater than
                      that of decoder due to motion estimation
                  •   Computational load proportional to frame size and rate for
                      most functions
                       •   Note: VLD computational load is proportional to bit rate
                  •   Post-processing steps—deblocking, deringing, color space
                      conversion—add considerably to the computational load
             •   Computational load can be difficult to predict
                  •   Many different approaches to motion estimation
                  •   Computational load of some tasks can fluctuate wildly
                      depending on video program content
                       •   E.g., motion compensation


          © 2004 Berkeley Design Technology, Inc.                                             40



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 20                                        April 2004
Smart Processor Picks for Consumer Media Applications




            Conclusions, cont.
             •   Understanding memory requirements:
                  •   Memory requirements dominated by frame buffers
                      •   A decoder that supports only I and P frames requires two frame
                          buffers (current and reference)
                      •   A decoder that supports I, P, and B frames requires three
                          buffers (current and two reference)
                      •   Deblocking/deringing/color conversion may require an additional
                          buffer
                  •   Program memory, tables, other data comprise a non-
                      negligible portion of memory use
                      •   But this portion is still several times smaller than frame buffers
             •   High memory use often necessitates off-chip memory
                  •   Off-chip memory bandwidth can be a performance bottleneck



          © 2004 Berkeley Design Technology, Inc.                                              41




            Conclusions, cont.
             •   Video compression used in many products
                  •   DVDs, digital TV, personal video recorders, Internet video,
                      multimedia jukeboxes, video-capable cell phones and PDAs,
                      camcorders…
             •   Different products have different needs
                  •   Wide range of frame sizes and rates, video quality, bit rates,
                      post-processing options, etc.
                  •   Result in wide range of computational and memory
                      requirements
             •   Need to understand the operation of video codecs
                  •   To understand computational and memory requirements
                  •   To select codecs, processors, software modules
                  •   To optimize software

          © 2004 Berkeley Design Technology, Inc.                                              42



                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                          Page 21                                        April 2004
Smart Processor Picks for Consumer Media Applications




            For More Information…
            www.BDTI.com
            Free Information
             • White papers/presentation slides on
                  •   DSP software optimization
                  •   Streaming media implementation
                  •   Processor architectures and
                      performance
                  •   Digital audio compression
             •   Article reprints on DSP-oriented
                 processors and applications
                  •   EE Times
                  •   IEEE Spectrum
                  •   IEEE Computer and others
             •   comp.dsp FAQ


                                                                           2004 Edition
          © 2004 Berkeley Design Technology, Inc.                                         43




                                      © 2004 Berkeley Design Technology, Inc.

Embedded Systems Conference                         Page 22                                    April 2004

Weitere ähnliche Inhalte

Was ist angesagt?

Power Optimization Through Manycore Multiprocessing
Power Optimization Through Manycore MultiprocessingPower Optimization Through Manycore Multiprocessing
Power Optimization Through Manycore Multiprocessing
chiportal
 
Ergo - Managed Desktop Services Brochure
Ergo - Managed Desktop Services BrochureErgo - Managed Desktop Services Brochure
Ergo - Managed Desktop Services Brochure
ffurlong
 
Video Conferencing
Video ConferencingVideo Conferencing
Video Conferencing
Videoguy
 
Managing change in the data center network
Managing change in the data center networkManaging change in the data center network
Managing change in the data center network
Interop
 
Smarter Computing: Expert Integrated System
Smarter Computing: Expert Integrated SystemSmarter Computing: Expert Integrated System
Smarter Computing: Expert Integrated System
IBM Danmark
 
Unified Conferencing
Unified ConferencingUnified Conferencing
Unified Conferencing
Videoguy
 
AG-HPX371
AG-HPX371AG-HPX371
AG-HPX371
AVNed
 
Panasonic AG-HPX370
Panasonic AG-HPX370Panasonic AG-HPX370
Panasonic AG-HPX370
AV ProfShop
 
Panasonic AG-HPX371E
Panasonic AG-HPX371EPanasonic AG-HPX371E
Panasonic AG-HPX371E
AV ProfShop
 

Was ist angesagt? (19)

Power Optimization Through Manycore Multiprocessing
Power Optimization Through Manycore MultiprocessingPower Optimization Through Manycore Multiprocessing
Power Optimization Through Manycore Multiprocessing
 
Ergo - Managed Desktop Services Brochure
Ergo - Managed Desktop Services BrochureErgo - Managed Desktop Services Brochure
Ergo - Managed Desktop Services Brochure
 
Video Conferencing
Video ConferencingVideo Conferencing
Video Conferencing
 
Managing change in the data center network
Managing change in the data center networkManaging change in the data center network
Managing change in the data center network
 
Pos vs pc
Pos vs pcPos vs pc
Pos vs pc
 
CIBER creates next-generation IT infrastructure solution for SAP hosting serv...
CIBER creates next-generation IT infrastructure solution for SAP hosting serv...CIBER creates next-generation IT infrastructure solution for SAP hosting serv...
CIBER creates next-generation IT infrastructure solution for SAP hosting serv...
 
IBM BP Kickoff 2013 VDI Solutions
IBM BP Kickoff 2013    VDI SolutionsIBM BP Kickoff 2013    VDI Solutions
IBM BP Kickoff 2013 VDI Solutions
 
Smarter Computing: Expert Integrated System
Smarter Computing: Expert Integrated SystemSmarter Computing: Expert Integrated System
Smarter Computing: Expert Integrated System
 
Verdens bedste BPM-platform leveret som cloud, Christian A. Givskov, IBM
Verdens bedste BPM-platform leveret som cloud, Christian A. Givskov, IBMVerdens bedste BPM-platform leveret som cloud, Christian A. Givskov, IBM
Verdens bedste BPM-platform leveret som cloud, Christian A. Givskov, IBM
 
Unified Conferencing
Unified ConferencingUnified Conferencing
Unified Conferencing
 
Brochure quiterian DDWeb
Brochure quiterian DDWebBrochure quiterian DDWeb
Brochure quiterian DDWeb
 
Open iT Software Usage Metering Toolset for IT Asset Managers
Open iT Software Usage Metering Toolset for IT Asset Managers Open iT Software Usage Metering Toolset for IT Asset Managers
Open iT Software Usage Metering Toolset for IT Asset Managers
 
AG-HPX371
AG-HPX371AG-HPX371
AG-HPX371
 
Novell Success Stories: Collaboration in Travel and Hospitality
Novell Success Stories: Collaboration in Travel and HospitalityNovell Success Stories: Collaboration in Travel and Hospitality
Novell Success Stories: Collaboration in Travel and Hospitality
 
Panasonic AG-HPX370
Panasonic AG-HPX370Panasonic AG-HPX370
Panasonic AG-HPX370
 
Panasonic AG-HPX371E
Panasonic AG-HPX371EPanasonic AG-HPX371E
Panasonic AG-HPX371E
 
Stream 3 - Cloud Computing
Stream 3 - Cloud ComputingStream 3 - Cloud Computing
Stream 3 - Cloud Computing
 
Estimating Packaged Software - The first part of a framework
Estimating Packaged Software - The first part of a frameworkEstimating Packaged Software - The first part of a framework
Estimating Packaged Software - The first part of a framework
 
User Experience Monitoring presented at CA World 2011
User Experience Monitoring   presented at CA World 2011User Experience Monitoring   presented at CA World 2011
User Experience Monitoring presented at CA World 2011
 

Ähnlich wie Introduction to video compression

video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
Aniruddh Tyagi
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
aniruddh Tyagi
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
aniruddh Tyagi
 
“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...
“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...
“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...
Edge AI and Vision Alliance
 
HEVC Main Main10 Profile Encoding
HEVC Main Main10 Profile EncodingHEVC Main Main10 Profile Encoding
HEVC Main Main10 Profile Encoding
Logic Fruit Technologies
 
Compact Descriptors for Visual Search
Compact Descriptors for Visual SearchCompact Descriptors for Visual Search
Compact Descriptors for Visual Search
Antonio Capone
 

Ähnlich wie Introduction to video compression (20)

L51 w
L51 wL51 w
L51 w
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
 
“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...
“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...
“Using a Neural Processor for Always-sensing Cameras,” a Presentation from Ex...
 
OpenEye IP Video Basics
OpenEye IP Video BasicsOpenEye IP Video Basics
OpenEye IP Video Basics
 
HEVC Main Main10 Profile Encoding
HEVC Main Main10 Profile EncodingHEVC Main Main10 Profile Encoding
HEVC Main Main10 Profile Encoding
 
Quiterian analytics
Quiterian analyticsQuiterian analytics
Quiterian analytics
 
LTE Testing - Network Performance, Security, and Stability at Massive Scale
LTE Testing - Network Performance, Security, and Stability at Massive ScaleLTE Testing - Network Performance, Security, and Stability at Massive Scale
LTE Testing - Network Performance, Security, and Stability at Massive Scale
 
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
Achieving genuine elastic multitenancy with the Waratek Cloud VM for Java : J...
 
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
SpursEngine A High-performance Stream Processor Derived from Cell/B.E. for Me...
 
“Comparing ML-Based Audio with ML-Based Vision: An Introduction to ML Audio f...
“Comparing ML-Based Audio with ML-Based Vision: An Introduction to ML Audio f...“Comparing ML-Based Audio with ML-Based Vision: An Introduction to ML Audio f...
“Comparing ML-Based Audio with ML-Based Vision: An Introduction to ML Audio f...
 
Compact Descriptors for Visual Search
Compact Descriptors for Visual SearchCompact Descriptors for Visual Search
Compact Descriptors for Visual Search
 
Mobixell pipeline webinar_june_20_2012
Mobixell pipeline webinar_june_20_2012Mobixell pipeline webinar_june_20_2012
Mobixell pipeline webinar_june_20_2012
 
An Overview of High Efficiency Video Codec HEVC (H.265)
An Overview of High Efficiency Video Codec HEVC (H.265)An Overview of High Efficiency Video Codec HEVC (H.265)
An Overview of High Efficiency Video Codec HEVC (H.265)
 
Real Time Video Processing in FPGA
Real Time Video Processing in FPGA Real Time Video Processing in FPGA
Real Time Video Processing in FPGA
 
Xpt
XptXpt
Xpt
 
Xpt
XptXpt
Xpt
 
XPT Corporate Capability
XPT Corporate Capability XPT Corporate Capability
XPT Corporate Capability
 
Built In Self Testing(BIST) Architecture for Motin Estimation and Computing A...
Built In Self Testing(BIST) Architecture for Motin Estimation and Computing A...Built In Self Testing(BIST) Architecture for Motin Estimation and Computing A...
Built In Self Testing(BIST) Architecture for Motin Estimation and Computing A...
 

Mehr von Sumathi Gnanasekaran (13)

ATTRIBUTES RELATED TO SOFTWARE AGING.docx
ATTRIBUTES RELATED TO SOFTWARE AGING.docxATTRIBUTES RELATED TO SOFTWARE AGING.docx
ATTRIBUTES RELATED TO SOFTWARE AGING.docx
 
ATTRIBUTES RELATED TO SOFTWARE AGING.docx
ATTRIBUTES RELATED TO SOFTWARE AGING.docxATTRIBUTES RELATED TO SOFTWARE AGING.docx
ATTRIBUTES RELATED TO SOFTWARE AGING.docx
 
ATTRIBUTES RELATED TO SOFTWARE AGING.docx
ATTRIBUTES RELATED TO SOFTWARE AGING.docxATTRIBUTES RELATED TO SOFTWARE AGING.docx
ATTRIBUTES RELATED TO SOFTWARE AGING.docx
 
Pcd question bank
Pcd question bank Pcd question bank
Pcd question bank
 
Tqm qb
Tqm qbTqm qb
Tqm qb
 
Oops qb cse
Oops qb cseOops qb cse
Oops qb cse
 
Ss tools
Ss toolsSs tools
Ss tools
 
Ss tools
Ss toolsSs tools
Ss tools
 
System software 5th unit
System software 5th unitSystem software 5th unit
System software 5th unit
 
Compiler notes--unit-iii
Compiler notes--unit-iiiCompiler notes--unit-iii
Compiler notes--unit-iii
 
If202 cloud computing important questions
If202 cloud computing important questions If202 cloud computing important questions
If202 cloud computing important questions
 
Sw engg two mark question
Sw engg two mark questionSw engg two mark question
Sw engg two mark question
 
Se syllabus
Se syllabusSe syllabus
Se syllabus
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 

Introduction to video compression

  • 1. Smart Processor Picks for Consumer Media Applications Optimized DSP Software • Independent DSP Analysis Introduction to Video Compression (Workshop 210) Berkeley Design Technology, Inc. info@BDTI.com http://www.BDTI.com © 2004 Berkeley Design Technology, Inc. Outline • Motivation and scope • Still-image compression techniques • Motion estimation and compensation • Reducing artifacts • Color conversion • Conclusions © 2004 Berkeley Design Technology, Inc. 2 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 1 April 2004
  • 2. Smart Processor Picks for Consumer Media Applications Motivation and Scope • Consumer video products increasingly rely on video compression • DVDs, digital TV, personal video recorders, Internet video, multimedia jukeboxes, video-capable cell phones and PDAs, camcorders… • Video product developers need to understand the operation of video “codecs” • To select codecs, processors, software modules • To optimize software • This presentation covers: • Operation of video codecs and post-processing • Computational and memory demands of key codec and post- processing components © 2004 Berkeley Design Technology, Inc. 3 Outline • Motivation and scope • Still-image compression techniques • Motion estimation and compensation • Reducing artifacts • Color conversion • Conclusions © 2004 Berkeley Design Technology, Inc. 4 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 2 April 2004
  • 3. Smart Processor Picks for Consumer Media Applications Still-Image Compression • Still-image compression • Still-image techniques provide a basis for video compression • Video can be compressed using still-image compression individually on each frame • E.g., “Motion JPEG” or MJPEG • But modern video codecs go well beyond this • Start with still-image compression techniques • Add motion estimation/compensation • Takes advantage of similarities between frames in a video sequence © 2004 Berkeley Design Technology, Inc. 5 Still-Image Compression Transform Quantization Run-length coding Variable length coding Frame Frame Header Header Typical Still-Image Compression Data Flow © 2004 Berkeley Design Technology, Inc. 6 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 3 April 2004
  • 4. Smart Processor Picks for Consumer Media Applications Block Transform: 8x8 DCT • 8x8 DCT blocks applied on Y, U, and V planes individually • The energy is concentrated in the low frequencies • Perceptual information also Y values concentrated in low frequencies High energy 8x8 DCT Low energy 8x8 IDCT Spatial domain Frequency domain © 2004 Berkeley Design Technology, Inc. 7 Block Transform: Resource Reqt’s • Compute load: • Up to 30% of total video decoder processor cycles • MPEG-4 CIF (352x288) @ 30 fps: • 71,280 DCTs/s • ~40 MHz on a TMS320C55x DSP • ~10 MHz if using TMS320C55x DCT accelerator • Many implementation and optimization options • Can make compute requirements hard to predict • Memory usage: negligible © 2004 Berkeley Design Technology, Inc. 8 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 4 April 2004
  • 5. Smart Processor Picks for Consumer Media Applications Quantization Original Quantizer Reconstructed •Removes perceptually less significant data Spatial domain Frequency domain •Reduces the number of bits per DCT coefficient © 2004 Berkeley Design Technology, Inc. 9 Quantization: Resource Reqt’s • Quantization (encoder) and dequantization (decoder and encoder) have similar compute loads • Compute load: • From 3% to about 15% of total decoder processor cycles • Typically near the lower end of this range • MPEG-4 CIF (352x288) @ 30 fps: • ~10 MHz on a TMS320C55x DSP (estimated) • Memory usage: negligible © 2004 Berkeley Design Technology, Inc. 10 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 5 April 2004
  • 6. Smart Processor Picks for Consumer Media Applications Coding Quantized DCT Coefficients Goal: Reduce the number of bits required to transmit the quantized coefficients Occurrence Quantized DCT values Observation: Unequal distribution of quantized DCT coefficient values © 2004 Berkeley Design Technology, Inc. 11 Variable Length Coding (VLC/VLD) • Allocates fewer bits to the most frequent symbols (e.g., using Huffman) • Integer number of bits per symbol • Not the most efficient coding method • Arithmetic coding more efficient, but expensive • Run-length coding improves efficiency of VLC/VLD for image and video coding Symbol Frequency Code A 22 1 B 16 011 C 9 0101 D 7 0100 E 4 0011 F 2 0010 ... ... ... © 2004 Berkeley Design Technology, Inc. 12 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 6 April 2004
  • 7. Smart Processor Picks for Consumer Media Applications Run-Length Coding Encodes value and number of successive occurrences Takes advantage of the high number of recurring zeros 20,1 ; 18,1 ; 20,1 ; 18,4 ; ... ; 0, 37 © 2004 Berkeley Design Technology, Inc. 13 Variable Length Coding: Processing Reqt’s • VL decoding much more computationally demanding than VL encoding • VLD compute load: • Up to 25% of total video decoder processor cycles • MPEG-4 CIF (352x288) @ 30 fps, 700 kbps: • ~15-25 MHz on a TMS320C55x DSP (estimated) •About 11 operations per bit on average • Memory usage • A few KB of memory for lookup tables • More for speed optimizations © 2004 Berkeley Design Technology, Inc. 14 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 7 April 2004
  • 8. Smart Processor Picks for Consumer Media Applications Resynchronization Markers • Without markers, a single bit error in the coded bitstream prevents decoding of the rest of the frame • Size of a corrupted variable-length code word is unknown • Therefore, the start of the next code word (and all following code words) is unknown • Resynchronization markers help the decoder recover from bitstream errors • Provide a known bit pattern interspersed throughout the bitstream • In case of an error, decoder searches for next marker, then continues decoding Corrupted codeword Un-decodable data Frame Frame Header Header Resynchronization markers © 2004 Berkeley Design Technology, Inc. 15 Reversible VLC • Reversible VLC, in conjunction with resynchronization markers, further assists error recovery • Code words can be decoded forward and backward • In case of error, data can be decoded backward from the next resynchronization marker • More data is recovered compared to resynchronization markers alone Data decoded Normal decoding backward continues forward Corrupted codeword Frame Frame Header Header Resynchronization markers © 2004 Berkeley Design Technology, Inc. 16 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 8 April 2004
  • 9. Smart Processor Picks for Consumer Media Applications AC-DC Prediction Spatial Frequency domain domain Current block Candidates not selected for prediction Candidate selected for prediction AC-DC prediction: encode first differential row and column DC prediction: encode © 2004 Berkeley Design Technology, Inc. differential DC value 17 AC-DC Prediction • AC-DC prediction cannot be used in conjunction with motion compensation • ⇒ Used mostly for compressing a single image • DC prediction used in JPEG • AC-DC prediction often uses simple filters to predict each coefficient value from one or more adjacent blocks © 2004 Berkeley Design Technology, Inc. 18 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 9 April 2004
  • 10. Smart Processor Picks for Consumer Media Applications AC-DC Prediction: Processing Reqt’s • Compute load: • DC prediction has negligible load • AC-DC prediction used in about 8% of frames in typical video • Negligible average load (~2% of processor cycles in decoder) • Substantial per-frame load (~20-30% of cycles required to decode a frame that uses AC-DC prediction) • Memory usage: • Under 2 KB for CIF (352x288) resolution • But more memory (up to 10 KB) can result in faster code • May be overlapped with other memory structures not in use during prediction © 2004 Berkeley Design Technology, Inc. 19 Outline • Motivation and scope • Still-image compression techniques • Motion estimation and compensation • Reducing artifacts • Color conversion • Conclusions © 2004 Berkeley Design Technology, Inc. 20 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 10 April 2004
  • 11. Smart Processor Picks for Consumer Media Applications Motion Estimation and Compensation • Still image compression ignores the correlation between frames of video • JPEG achieves ~10:1 compression ratio • Wavelet transform-based image coding reaches compression ratios up to ~30:1 • Adding motion estimation and compensation results in much higher compression ratios • Good video quality at compression ratios as high as ~200:1 © 2004 Berkeley Design Technology, Inc. 21 Motion Estimation and Compensation • I frame is encoded as a still • Requires at least one image and doesn’t depend on “reference frame” any reference frame • Reference frame must be encoded before the I current frame • But, reference frame can • P frame depends on previously displayed reference frame be a future frame in the display sequence • Three kinds of frames: P I, P, and B • B frame depends on previous and future reference frames B © 2004 Berkeley Design Technology, Inc. 22 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 11 April 2004
  • 12. Smart Processor Picks for Consumer Media Applications Typical Sequence of I, P, B frames Display order (left to right) I Encoding order (top to bottom) P B B B P B B B P B B B I © 2004 Berkeley Design Technology, Inc. 23 Motion Estimation • Predict the contents of each macroblock based on motion relative to reference frame • Search reference frame for a 16x16 region that matches the macroblock • Encode motion vectors • Encode difference between predicted and actual macroblock pixels MV= MV Motion Vector Reference frame Current frame © 2004 Berkeley Design Technology, Inc. 24 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 12 April 2004
  • 13. Smart Processor Picks for Consumer Media Applications Video Encoder Block Diagram (Current MB) + Input 8x8 DCT Quantize VLC + scan *MB Diff block - (3.4%) (0.4%) (7.3%) (2.5%) Inv-Quantize (0.5%) Intra/Inter mode 8x8 IDCT 0 (1.7%) Bitstream MUX Bitstream Out + Motion + Add block Compensation (2.9%) Encode Intra MB (1%) (1.8%) Frame Store Encode Inter MB (Reference MB) control code Motion Estimation (63.4%): (3.9%) - Fast Integer ME: 37% - Half-Pel + 4MV refine: 26% Misc functions MV Coder (4.1%) © 2004 Berkeley Design Technology, Inc. 25 *MB – Macro-Block Source: Motorola SPS Motion Estimation: The Problem Search area • Search on 16x16 blocks • Sub-pixel interpolation required for non-integer motion vectors • Typically on luminance only or Σ|Y[i]| 2 Σ|Y[i]| • SAD more often used SSD SAD © 2004 Berkeley Design Technology, Inc. 26 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 13 April 2004
  • 14. Smart Processor Picks for Consumer Media Applications Motion Estimation: Efficient Motion Vector Search • Exhaustive search is impractical • Evaluate only promising candidate ~ motion vectors • Don’t have to find absolute best match • Trade video quality for computational load • Many methods in use ~Σ • Often proprietary • Refine candidate vector selection in stages • Predict candidate vectors from surrounding macroblocks and/or previous frames Motion vector search approach is a key differentiator between video encoder implementations © 2004 Berkeley Design Technology, Inc. 27 Motion Estimation: Processing Reqt’s • Compute load • Most demanding task in video compression • Up to 80% of total encoder processor cycles • Many search methods exist; requirements vary by method • May vary with video program content • Makes encoder computational demand several times greater than that of the decoder • Dominated by SAD computation • Memory usage • Motion estimation requires reference frame buffers • Frame buffers dominate the memory requirements of the encoder • E.g., 152,064 bytes per frame @ CIF (352x288) resolution • High memory bandwidth required © 2004 Berkeley Design Technology, Inc. 28 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 14 April 2004
  • 15. Smart Processor Picks for Consumer Media Applications Motion Compensation: Processing Reqt’s • Motion compensation copies pixels from reference frame to predict current macroblock • Requires interpolation for non-integer motion vector values • Compute load • Varies with video program content • Can require from 5% to 40% of total decoder processor cycles • MPEG-4 CIF (352x288) @ 30 fps: • Roughly 15-25 MHz on a TMS320C55x DSP (estimated) • Memory usage • Requires reference frame buffers • Frame buffers dominate decoder memory requirements • Good memory bandwidth is desirable, but less critical compared to motion estimation © 2004 Berkeley Design Technology, Inc. 29 Outline • Motivation and scope • Still-image compression techniques • Motion estimation and compensation • Reducing artifacts • Color conversion • Conclusions © 2004 Berkeley Design Technology, Inc. 30 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 15 April 2004
  • 16. Smart Processor Picks for Consumer Media Applications Artifacts: Blocking and Ringing • Blocking: Borders of 8x8 blocks become visible in reconstructed frame • Ringing: Distortions near edges of image features Reconstructed Original image image (with ringing Artifacts) © 2004 Berkeley Design Technology, Inc. 31 Deblocking and Deringing Filters Low-pass filters are used to smooth the image where artifacts occur • Deblocking: • Low-pass filter the pixels at borders of 8x8 blocks • One-dimensional filter applied perpendicular to 8x8 block borders • Deringing: • Detect edges of image features • Adaptively apply 2D filter to smooth out areas near edges • Little or no filtering applied to edge pixels in order to avoid blurring © 2004 Berkeley Design Technology, Inc. 32 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 16 April 2004
  • 17. Smart Processor Picks for Consumer Media Applications Artifact Reduction Example: Deblocking Original MPEG Still Frame Horizontally & Vertically Deblocked Still Frame © 2004 Berkeley Design Technology, Inc. 33 Artifact Reduction: Post-processing vs. In-loop • Deblocking/deringing often applied after Video Reference the decoder (post-processing) Decoder frame • Reference frames are not filtered Deblock • Developers free to select best filters for the Dering application or not filter at all • Deblocking/deringing can be incorporated in the compression algorithm (in-loop Video filtering) Decoder Reference • Reference frames are filtered Deblock frame • Same filters must be applied in encoder and Dering decoder • Better image quality at very low bit-rates © 2004 Berkeley Design Technology, Inc. 34 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 17 April 2004
  • 18. Smart Processor Picks for Consumer Media Applications Artifact Reduction: Processing Reqt’s • Deblocking and deringing filters can require more processor cycles than the video decoder • Example: MPEG-4 Simple Profile, Level 1 (176x144, 15 fps) decoding requires 14 MIPS on ARM’s ARM9E for a relatively complex video sequence • With deblocking and deringing added, load increases to 39 MIPS • Nearly 3x increase compared to MPEG-4 decoding alone! • Post-processing may require an additional frame buffer © 2004 Berkeley Design Technology, Inc. 35 Outline • Motivation and scope • Still-image compression techniques • Motion estimation and compensation • Reducing artifacts • Color conversion • Conclusions © 2004 Berkeley Design Technology, Inc. 36 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 18 April 2004
  • 19. Smart Processor Picks for Consumer Media Applications Color Space Conversion • Need for color conversion • Capture and display video equipment: RGB... • ...while codecs use YUV • Computational demand • 12 operations per pixel ⇒ 36 million operations/second for CIF (352x288) @ 30 fps • About 36 MHz on a TMS320C55x DSP • Not including interpolation of chrominance planes • Roughly 1/3 to 2/3 as many processor cycles as the video decoder © 2004 Berkeley Design Technology, Inc. 37 Outline • Motivation and scope • Still-image compression techniques • Motion estimation and compensation • Reducing artifacts • Color conversion • Conclusions © 2004 Berkeley Design Technology, Inc. 38 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 19 April 2004
  • 20. Smart Processor Picks for Consumer Media Applications Conclusions Understanding the computational and memory requirements of video compression is critical but challenging • Application design choices are driven by computational and memory requirements • Algorithm selection, processor selection, software optimization • Video processing often stresses processing resources • But video applications combine many different signal- processing techniques • Transforms, prediction, quantization, entropy coding, image filtering, etc. • And there is large variation in computational and memory requirements among different applications • E.g., digital camcorder has vastly different requirements from a video-enabled cell phone, even when using the same compression standard © 2004 Berkeley Design Technology, Inc. 39 Conclusions, cont. • Understanding computational load • Computational load of encoder is several times greater than that of decoder due to motion estimation • Computational load proportional to frame size and rate for most functions • Note: VLD computational load is proportional to bit rate • Post-processing steps—deblocking, deringing, color space conversion—add considerably to the computational load • Computational load can be difficult to predict • Many different approaches to motion estimation • Computational load of some tasks can fluctuate wildly depending on video program content • E.g., motion compensation © 2004 Berkeley Design Technology, Inc. 40 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 20 April 2004
  • 21. Smart Processor Picks for Consumer Media Applications Conclusions, cont. • Understanding memory requirements: • Memory requirements dominated by frame buffers • A decoder that supports only I and P frames requires two frame buffers (current and reference) • A decoder that supports I, P, and B frames requires three buffers (current and two reference) • Deblocking/deringing/color conversion may require an additional buffer • Program memory, tables, other data comprise a non- negligible portion of memory use • But this portion is still several times smaller than frame buffers • High memory use often necessitates off-chip memory • Off-chip memory bandwidth can be a performance bottleneck © 2004 Berkeley Design Technology, Inc. 41 Conclusions, cont. • Video compression used in many products • DVDs, digital TV, personal video recorders, Internet video, multimedia jukeboxes, video-capable cell phones and PDAs, camcorders… • Different products have different needs • Wide range of frame sizes and rates, video quality, bit rates, post-processing options, etc. • Result in wide range of computational and memory requirements • Need to understand the operation of video codecs • To understand computational and memory requirements • To select codecs, processors, software modules • To optimize software © 2004 Berkeley Design Technology, Inc. 42 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 21 April 2004
  • 22. Smart Processor Picks for Consumer Media Applications For More Information… www.BDTI.com Free Information • White papers/presentation slides on • DSP software optimization • Streaming media implementation • Processor architectures and performance • Digital audio compression • Article reprints on DSP-oriented processors and applications • EE Times • IEEE Spectrum • IEEE Computer and others • comp.dsp FAQ 2004 Edition © 2004 Berkeley Design Technology, Inc. 43 © 2004 Berkeley Design Technology, Inc. Embedded Systems Conference Page 22 April 2004