EE 5359 PROPOSAL
H.264 to VC-1 TRANSCODING




                 Vidhya Vijayakumar
               Student I.D.: 1000-622152
                Date: September 24, 2009




H.264 to VC-1 TRANSCODER
OBJECTIVE:

       The objective of the thesis is to implement an H.264 bitstream to VC-1
transcoder for progressive compression.

MOTIVATION:

        High definition video adoption has grown rapidly over the last five
years. The high definition disc format Blu-ray has mandated MPEG-2 [3], H.264 [2]
and VC-1 [1] as video compression formats. The coexistence of these different video
coding standards creates a need for transcoding. As more and more end products use
these standards, transcoding from one format to another adds value to a
product's capability. While there has been recent work on MPEG-2 to H.264
transcoding [3] and VC-1 to H.264 transcoding [4], published work on H.264 to VC-1
transcoding is nearly non-existent. This motivates the development of a
transcoder that can efficiently convert an H.264 bitstream to a VC-1 bitstream.

DETAILS:

       Video transcoding is the operation of converting video from one format to
another [5]. A format is defined by characteristics such as bit-rate and spatial
resolution. One of the earliest applications of transcoding was to adapt the bit-rate
of a compressed stream to the channel bandwidth, for universal multimedia access over
channels such as wireless networks, the Internet and dial-up networks. Changes in
the characteristics of an encoded stream, such as bit-rate, spatial resolution and
quality, can also be achieved by scalable video coding [5]. However, in cases where the
available network bandwidth is insufficient or fluctuates with time, it may be difficult
to set the base layer bit-rate. In addition, scalable video coding adds
complexity at both the encoder and the decoder.

       The basic architecture for converting an H.264 bitstream into a VC-1
elementary stream is complete decoding of the H.264 stream followed by re-encoding
into a VC-1 stream. However, this involves significant computational
complexity [6]. Hence there is a need for low-complexity transcoding.

        Transcoding can in general be implemented in the spatial domain or in the
transform domain or in a combination of the two domains. The common transcoding
architectures [5] are:

Open loop transform domain transcoding




                  Fig. 1 Open loop transform domain transcoder architecture [5]


Open loop transcoders are computationally efficient (Fig. 1). They operate in the DCT
domain. However, they are subject to drift error, which arises from rounding,
quantization loss and clipping.
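
As a minimal sketch of why an open loop transcoder is cheap yet introduces error, the following Python fragment requantizes already-quantized transform levels with a coarser step and no feedback loop. The function name, the plain uniform quantizer and the step sizes are illustrative assumptions and do not reproduce the quantizer of any particular codec.

```python
def requantize(levels, q_in, q_out):
    """Open loop requantization of quantized transform levels.

    Assumes a plain uniform quantizer (an illustrative simplification):
    each level is reconstructed with step q_in, then quantized again
    with the coarser step q_out. No prediction loop is involved.
    """
    out = []
    for level in levels:
        coeff = level * q_in                      # inverse quantization
        sign = -1 if coeff < 0 else 1
        # round the magnitude to the nearest multiple of q_out
        out.append(sign * ((abs(coeff) + q_out // 2) // q_out))
    return out


levels_in = [12, -7, 3, 1, 0]
levels_out = requantize(levels_in, q_in=4, q_out=10)

# Requantization error per coefficient. When the affected picture is used as
# a reference, the decoder's reference differs from the one the original
# encoder used, and this mismatch accumulates as drift.
errors = [a * 4 - b * 10 for a, b in zip(levels_in, levels_out)]
print(levels_out, errors)
```

The per-coefficient work is a multiply and a divide, which is why this architecture is computationally light; the non-zero entries in `errors` are what the closed loop architectures compensate for.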

      Cascaded Pixel Domain Architecture (CPDT)




                  Fig. 2 Cascaded pixel domain transcoder architecture [5]

This is the most basic transcoding architecture (Fig. 2). The motion vectors from the
incoming bitstream are extracted and reused. This eliminates the motion
estimation block, which accounts for about 60% of the encoder computation.
Compared with the open loop architecture, CPDT is drift free. Hence, even though it is
slightly more complex, it is suited to heterogeneous transcoding between different
standards, where basic parameters such as mode decisions and motion vectors have to
be re-derived.

Simplified DCT Domain transcoders (SDDT)




                Fig. 3 Simplified transform domain transcoder architecture [5]

This transcoder is based on the assumption that DCT, IDCT and motion compensation
are linear processes (Fig. 3). This architecture requires that motion compensation be
performed in the DCT domain, which is a computationally intensive operation
[3]. For instance, as shown in Fig. 4, the goal is to compute the DCT
coefficients of the target block B from the four overlapping blocks B1, B2, B3 and
B4.




                    Fig. 4   Transform domain motion compensation illustration [5]



         Also, clipping functions and rounding operations performed for interpolation
in fractional pixel motion compensation lead to a drift in the transcoded video.


Cascaded DCT Domain transcoders (CDDT)




                 Fig. 5 Cascaded transform domain transcoder architecture [5]

This is used for spatial/temporal resolution downscaling and other coding parameter
changes (Fig. 5). Compared with SDDT, greater flexibility is achieved by
introducing another transform domain motion compensation block; however, it is far
more computationally intensive and requires more memory [3]. It is often applied to
downscaling applications, where the memory cost at the encoder end is modest because
of the downscaled resolution.




Choice of basic transcoder architecture:

        DCT domain transcoders have the main drawback that motion compensation
in the transform domain is computationally intensive. DCT domain transcoders are
also less flexible than pixel domain transcoders. For instance, the SDDT
architecture can only be used for bit rate reduction transcoding: it assumes that the
spatial and temporal resolutions stay the same and that the output video uses the same
frame types, mode decisions and motion vectors as the input video.

        H.264 to VC-1 transcoding requires several changes in
order to accommodate the mismatches between the two standards. For instance, for
motion estimation and compensation, H.264 supports 16x16, 16x8, 8x16, 8x8, 8x4,
4x8 and 4x4 macroblock partitions (Fig. 6), but VC-1 supports 16x16 and 8x8 only
(Fig. 7). The transform sizes and types also differ (8x8 and 4x4 in H.264; 8x8, 4x8,
8x4 and 4x4 in VC-1), which makes transform domain transcoding prohibitively complex.
Hence, DCT domain transcoders are not well suited to this task.




                 Fig.6 Segmentations of the macroblock for motion compensation in H.264
                Top: segmentation of macroblocks, bottom: segmentation of 8x8 partitions. [2]




                Fig.7 Segmentations of the macroblock for motion compensation in VC-1 [2]

       From Fig. 8, it can be inferred that the cascaded pixel domain architecture
outperforms the DCT domain transcoders. Also, for larger GOP sizes, the drift in DCT
domain transcoders becomes more significant.




Fig.8 PSNR vs Bit-rate graph for the Foreman sequence transcoded with a GOP size 15, using
                  different transcoding architectures as described in Figs. 1, 2, 3 and 5. [5]



             Hence, heterogeneous transcoding in the pixel domain is preferred for
      standards transcoding.

      Standards transcoding:

             When transcoding between two different standards, the main factor involved is
      compatibility between the profile and level of the input stream and those of the output
      stream for a specific purpose. The goal here is to transcode an H.264 bitstream of
      Baseline profile to a VC-1 bitstream of Simple profile.

      Table 1 compares the characteristics of the two standards.

                                        H.264 High Profile                        VC-1 Main Profile
Chroma format                           4:2:0                                     4:2:0
Picture coding type                     I, P, B                                   I, P, B
Transform size                          4x4, 8x8                                  8x8, 4x8, 8x4, 4x4
Intra prediction                        Directional predictors                    None
Block sizes for motion compensation     16x16, 16x8, 8x16, 8x8, 4x8, 8x4, 4x4     16x16, 8x8

                  Table 1 Main characteristics of H.264 High profile and VC-1 Main profile

      Overview of H.264:

              H.264 [2] is a standard for video compression, and is equivalent to
      MPEG-4 Part 10, or MPEG-4 AVC (for advanced video coding) (Fig. 9). As of 2008,
      it is the latest block-oriented motion-compensation-based video standard developed
      by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC
      Moving Picture Experts Group (MPEG), and it was the product of a partnership effort
      known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC
      MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are jointly maintained so
      that they have identical technical content.




                                 Fig 9 H.264 Encoder [32]




                                Fig 10. H.264 Decoder [32]

        The standardization of the first version of H.264/AVC was completed in May
2003. The JVT then developed extensions to the original standard that are known as
the Fidelity Range Extensions (FRExt) [29]. These extensions enable higher quality
video coding by supporting increased sample bit depth precision and higher-resolution
color information, including sampling structures known as YUV 4:2:2 and YUV
4:4:4. Several other features are also included in the Fidelity Range Extensions
project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-
specified perceptual-based quantization weighting matrices, efficient inter-picture
lossless coding, and support of additional color spaces. The design work on the
Fidelity Range Extensions was completed in July 2004, and the drafting work on them
was completed in September 2004.

        Scalable video coding (SVC) [30] as specified in Annex G of H.264/AVC
allows the construction of bitstreams that contain sub-bitstreams that conform to
H.264/AVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream
with a smaller temporal sampling rate than the bitstream, complete access units are
removed from the bitstream when deriving the sub-bitstream. In this case, high-level
syntax and inter prediction reference pictures in the bitstream are constructed
accordingly. For spatial and quality bitstream scalabilities, i.e. the presence of a sub-
bitstream with lower spatial resolution or quality than the bitstream, network
abstraction layer (NAL) units are removed from the bitstream when deriving the sub-
bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial
resolution or quality signal by data of the lower spatial resolution or quality signal, is
typically used for efficient coding. The Scalable Video Coding extension was
completed in November 2007.
Some of the features adopted in H.264 for enhancement of prediction, improved
coding efficiency and robustness to data errors/losses are listed as follows.

Features for enhancement of prediction

         •   Directional spatial prediction for intra coding

         •   Variable block-size motion compensation with small block size




                       Figure 11 – Various block sizes in H.264

         •   Quarter-sample-accurate motion compensation
         •   Motion vectors over picture boundaries
         •   Multiple reference picture motion compensation
         •   Decoupling of referencing order from display order
         •   Decoupling of picture representation methods from picture referencing
             capability
         •   Weighted prediction
         •   Improved “skipped” and “direct” motion inference
         •   In-the-loop deblocking filtering

Features for improved coding efficiency

         •   Small block-size transform
         •   Exact-match inverse transform




Figure – Forward 4x4 and 8x8 integer transform

         •   Short word-length transform
         •   Hierarchical block transform
         •   Arithmetic entropy coding
         •   Context-adaptive entropy coding

Features for robustness to data errors/losses

         •   Parameter set structure
         •   NAL unit syntax structure
         •   Flexible slice size
         •   Flexible macroblock ordering (FMO)
         •   Arbitrary slice ordering (ASO)
         •   Redundant pictures
         •   Data partitioning
         •   SP/SI synchronization/switching pictures

Profiles in H.264

The H.264 standard defines numerous profiles:

        •    Constrained baseline profile
        •    Baseline profile
        •    Main profile
        •    Extended profile
        •    High profile
        •    High 10 profile
        •    High 4:2:2 profile
        •    High 4:4:4 predictive profile
        •    High stereo profile
        •    High 10 intra profile
        •    High 4:2:2 intra profile
        •    High 4:4:4 intra profile
        •    CAVLC 4:4:4 intra profile
        •    Scalable baseline profile
        •    Scalable high profile
        •    Scalable high intra profile




        Table Features in baseline, main and extended profile




                    Table Features in high profile




      [Figure: profile diagram. Profile regions: Baseline Profile, Main Profile,
      Extended Profile, High Profiles. Feature labels: I slice, P slice, CAVLC,
      arbitrary slice order, flexible macroblock order, redundant slice, B slice,
      weighted prediction, CABAC, data partition, SI slice, SP slice, adaptive
      transform block size, quantization scaling matrices.]

      Figure 12 Comparison of H.264 baseline, main, extended and high profile

Overview of VC-1

          VC-1 [1] is the informal name of the SMPTE 421M video codec standard
initially developed by Microsoft. It was released on April 3, 2006 by SMPTE. It is
now a supported standard for Blu-ray Discs and Windows Media Video 9.

         VC-1 is an evolution of the conventional DCT-based video codec design also
found in H.261 [31], H.263 [27], MPEG-1 [40] and MPEG-2 [3]. It is widely
characterized as an alternative to the latest ITU-T and MPEG video codec standard
known as H.264/MPEG-4 AVC. VC-1 contains coding tools for interlaced video
sequences as well as progressive encoding. The main goal of VC-1 development and
standardization is to support the compression of interlaced content without first
converting it to progressive, making it more attractive to broadcast and video industry
professionals.

        The VC-1 codec is designed to achieve state-of-the-art compressed video
quality at bit rates that may range from very low to very high. The codec can easily
handle 1920 pixel × 1080 pixel resolution at 6 to 30 megabits per second (Mbps) for
high-definition video. VC-1 is capable of higher resolutions such as 2048 pixels ×
1536 pixels for digital cinema, and of a maximum bit rate of 135 Mbps. An example
of very low bit rate video would be 160 pixel × 120 pixel resolution at 10 kilobits per
second (kbps) for modem applications.


The basic functionality of VC-1 involves a block-based motion compensation
and spatial transform scheme similar to that used in other video compression
standards such as MPEG-1 and H.261 [31]. However, VC-1 includes a number of
innovations and optimizations that make it distinct from the basic compression
scheme, resulting in excellent quality and efficiency. VC-1 Advanced Profile is also
transport independent. This provides even greater flexibility for device manufacturers
and content services.




                               Fig. 11 VC-1 Codec [32]

Profiles in VC-1

VC-1 defines three profiles
  1. Simple
  2. Main
  3. Advanced


                                              Simple     Main     Advanced

  Baseline intra frame compression            Yes        Yes      Yes
  Variable-sized transform                    Yes        Yes      Yes
  16-bit transform                            Yes        Yes      Yes
  Overlapped transform                        Yes        Yes      Yes
  4 motion vectors per macroblock             Yes        Yes      Yes
  ¼ pixel luminance motion compensation       Yes        Yes      Yes
  ¼ pixel chrominance motion compensation     No         Yes      Yes
  Start codes                                 No         Yes      Yes
  Extended motion vectors                     No         Yes      Yes
  Loop filter                                 No         Yes      Yes
  Dynamic resolution change                   No         Yes      Yes
  Adaptive macroblock quantisation            No         Yes      Yes
  B frames                                    No         Yes      Yes
  Intensity compensation                      No         Yes      Yes
  Range adjustment                            No         Yes      Yes
  Field and frame coding modes                No         No       Yes
  GOP layer                                   No         No       Yes
  Display metadata                            No         No       Yes


                              Table – Features in VC-1 profiles [49]




Innovations




VC-1 includes a number of innovations that enable it to produce high quality
content. This section provides brief descriptions of some of these features.

Adaptive Block Size Transform

        Traditionally, 8 × 8 transforms have been used for image and video coding.
However, there is evidence to suggest that 4 × 4 transforms can reduce ringing
artifacts at edges and discontinuities. VC-1 is capable of coding an 8 × 8 block using
either an 8 × 8 transform, two 8 × 4 transforms, two 4 × 8 transforms, or four 4 × 4
transforms. This feature enables coding that takes advantage of the different transform
sizes as needed for optimal image quality.
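
The four transform choices for one 8 × 8 block can be sketched as a small lookup that lists the sub-blocks each choice produces. This is only an illustration: the names are hypothetical, and the reading of each size as width × height (so that "8x4" means a top half and a bottom half) is an assumption of this sketch.

```python
# Sub-block size for each of the four transform choices of one 8x8 block.
# Assumption: sizes are read as width x height, so "8x4" gives a top and a
# bottom half and "4x8" gives a left and a right half.
TRANSFORM_LAYOUTS = {
    "8x8": (8, 8),
    "8x4": (8, 4),
    "4x8": (4, 8),
    "4x4": (4, 4),
}


def sub_blocks(transform):
    """Return (x, y, width, height) for each sub-block of an 8x8 block."""
    w, h = TRANSFORM_LAYOUTS[transform]
    return [(x, y, w, h) for y in range(0, 8, h) for x in range(0, 8, w)]


for name in TRANSFORM_LAYOUTS:
    print(name, sub_blocks(name))
```

Each choice tiles the same 64 samples, so an encoder is free to pick the tiling per block without changing the block structure around it.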




                              Figure – VC-1 transform sizes [4]

16-Bit Transforms
        In order to minimize the computational complexity of the decoder, VC-1 uses
16-bit transforms. This also has the advantage of easy implementation on the large
amount of digital signal processing (DSP) hardware built with 16-bit processors.
Among the constraints put on transforms specified in VC-1 is the requirement that the
16-bit values used produce results that can fit in 16 bits. The constraints on transforms
ensure that decoding is as efficient as possible on a wide range of devices.

Motion Compensation
         Motion compensation is the process of generating a prediction of a video
frame by displacing the reference frame. Typically, the prediction is formed for a
block (an 8 × 8 pixel tile) or a macroblock (a 16 × 16 pixel tile) of data. The
displacement of data due to motion is defined by a motion vector, which captures the
shift along both the x- and y-axes.




                         Figure VC-1 motion compensation sizes [4]


The efficiency of the codec is affected by the size of the predicted block, the
granularity of sub-pixel data that can be captured, and the type of filter used for
generating sub-pixel predictors. VC-1 uses 16 × 16 blocks for prediction, with the
ability to generate mixed frames of 16 × 16 and 8 × 8 blocks. The finest granularity of
sub-pixel information supported by VC-1 is 1/4 pixel. Two sets of filters are used by
VC-1 for motion compensation. The first is an approximate bicubic filter with four
taps. The second is a bilinear filter with two taps. The four-tap bicubic filters used in
VC-1 for ¼ and ½ pixel shifts are: [-4 53 18 -3]/64 and [-1 9 9 -1]/16.
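
To make the two four-tap filters concrete, the sketch below applies them to four neighbouring integer samples in one dimension. The tap values are the ones quoted above; the function name, the simple round-to-nearest and the clipping to the 8-bit range are assumptions made for illustration and do not model the rounding control defined by the standard.

```python
# Four-tap filters quoted in the text, stored as (taps, divisor).
QUARTER_PEL = ((-4, 53, 18, -3), 64)
HALF_PEL = ((-1, 9, 9, -1), 16)


def interpolate(samples, filt):
    """Interpolate one sub-pixel value from four neighbouring integer samples.

    Assumptions for illustration: plain round-to-nearest and clipping to
    the 8-bit range; the standard's own rounding control is not modelled.
    """
    taps, divisor = filt
    acc = sum(t * s for t, s in zip(taps, samples))
    value = (acc + divisor // 2) // divisor
    return max(0, min(255, value))


row = [100, 100, 200, 200]            # an edge between the 2nd and 3rd samples
print(interpolate(row, HALF_PEL))     # half-pel position between them
print(interpolate(row, QUARTER_PEL))  # quarter-pel position, nearer the 2nd
```

Both tap sets sum to their divisor (64 and 16), so a flat region is reproduced unchanged and only edges are reshaped by the filter.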




                  Figure – Integer, half and quarter pel positions [2]
                     (A-Q Integer, aa-hh half, a-s quarter pel positions)

           VC-1 combines the motion vector settings defined by the block size, sub-
pixel resolution, and filter type into modes. The result is four motion compensation
modes that suit a range of different situations. This classification of settings into
modes also helps compact decoder implementations.

Loop Filtering

       VC-1 uses an in-loop deblocking filter that attempts to remove block-
boundary discontinuities introduced by quantization errors in interpolated frames.
These discontinuities can cause visible artifacts in the decompressed video frames and
can impact the quality of the frame as a predictor for future interpolated frames.



Figure – Loop filtering in VC-1 [4] (Only pixel p4 and p5 are filtered)

        The loop filter takes into account the adaptive block size transforms. The filter
is also optimized to reduce the number of operations required.

Interlaced Coding

        Interlaced video content is widely used in television broadcasting. When
encoding interlaced content, the VC-1 codec can take advantage of the characteristics
of interlaced frames to improve compression. This is achieved by using data from
both fields to predict motion compensation in interpolated frames.

Advanced B Frame Coding

        A bi-directional or B frame is a frame that is interpolated from data both in
previous and subsequent frames. B frames are distinct from I frames (also called key
frames), which are encoded without reference to other frames. B frames are also
distinct from P frames, which are interpolated from previous frames only. VC-1
includes several optimizations that make B frames more efficient. VC-1 does not have
a fixed group of pictures (GOP) structure and the number of pictures in a GOP can
vary.

Fading Compensation

        Due to the nature of compression that uses motion compensation, encoding of
video frames that contain fades to or from black is very inefficient. With a uniform
fade, every macroblock needs adjustments to luminance. VC-1 includes fading
compensation, which detects fades and uses alternate methods to adjust luminance.
This feature improves compression efficiency for sequences with fading and other
global illumination changes.


Differential Quantization

       Differential quantization, or dquant, is an encoding method in which multiple
quantization steps are used within a single frame. Rather than quantizing the entire
frame with a single quantization level, the encoder identifies macroblocks within the
frame that might benefit from lower quantization levels and a greater number of
preserved AC coefficients. Such macroblocks are then encoded at lower quantization
levels than the one used for the remaining macroblocks in the frame. The simplest and
typically most efficient form of differential quantization involves only two quantizer
levels (bi-level dquant), but VC-1 also supports multiple levels.

MAPPING DIFFERENCES BETWEEN THE TWO STANDARDS:

        The transcoding algorithm considered in this research assumes full H.264
decoding down to the pixel level, followed by a reduced complexity VC-1 encoding.
The data gathered during the H.264 decoding stage is used to accelerate the VC-1
encoding stage. It is assumed that the H.264 encoded bitstream is generated with an
R-D optimized encoder. The picture coding types used are similar in both
standards. The transform size and type are different, which makes transform domain
transcoding prohibitively complex. The semantics of intra MBs are similar except for
the intra directional prediction allowed in H.264 and the mixed MBs in VC-1. The
inter prediction has significant differences, including the block size of MC, the block
size of the transform, and the reference frames used. The similarities between the
codecs can be exploited to reduce the transcoding complexity.

Intra MB Mode Mapping:

        An intra MB in the incoming H.264 bitstream is coded as a VC-1 intra MB. An
H.264 intra MB can be coded as Intra 4x4 (9 different directional modes) or Intra
16x16 (4 different modes), whereas a VC-1 intra MB has four 8x8 blocks and has no
prediction modes. Since an intra MB in VC-1 uses the 8x8 transform irrespective of the
block size (16x16 or 4x4) in H.264, the H.264 intra prediction type need not be
carried over. Table 2 shows the proposed intra MB mapping.

                     H.264 Intra MB            VC-1 Intra MB
                     Intra 16x16 (any mode)    Intra MB 8x8
                     Intra 4x4 (any mode)      Intra MB 8x8

                         Table 2 H.264 and VC-1 Intra MB mapping




         Figure – Matrix for one-dimensional 8-point inverse transform [32]

Inter MB Mode Mapping:


An inter coded MB in the incoming H.264 bitstream is coded as an inter MB in
VC-1. The inter MB in H.264 has 7 different motion compensation sizes: 16x16,
16x8, 8x16, 8x8, 4x8, 8x4 and 4x4. The inter MB in VC-1 has 2 different motion
compensation sizes, 16x16 and 8x8. Another significant difference is that H.264 uses
4x4 (and 8x8 in the fidelity range extensions) transform sizes, whereas VC-1 uses 4
different transform sizes: 8x8, 4x8, 8x4 and 4x4.

        The 16x16, 8x16 and 16x8 motion compensation sizes are usually selected in
H.264 for areas that are relatively uniform. Such MBs are mapped to an inter 16x16 MB
in VC-1, and the selected H.264 MC block size is used as a measure of homogeneity
to choose the transform size to be applied in VC-1.

       The 8x8, 8x4, 4x8 and 4x4 modes are usually selected in H.264 for areas that
have non-uniform motion. The 16x16 mode in VC-1 is eliminated for such non-
uniform MBs. The MB is then mapped to 8x8 block size in VC-1 with the H.264
block size determining the transform size to be used in VC-1.

Table 3 describes the decision making for mapping the inter MBs and the type of
transform to be used in VC-1.

           H.264 Inter MB VC-1 Inter MB Transform size in VC-1
             Inter 16x16   Inter 16x16           8x8
              Inter 16x8   Inter 16x16           8x4
              Inter 8x16   Inter 16x16           4x8
               Inter 8x8    Inter 8x8            8x8
               Inter 4x8    Inter 8x8            4x8
               Inter 8x4    Inter 8x8            8x4
               Inter 4x4    Inter 8x8            4x4

             Table 3 H.264 and VC-1 Inter MB mapping and VC-1 transform type
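
Table 3 can be written directly as a lookup. The sketch below is one possible encoding of the proposed mapping; the dictionary and function names are hypothetical, and the string labels simply mirror the table entries.

```python
# Table 3 as a lookup: H.264 inter partition -> (VC-1 inter MB mode, VC-1 transform).
INTER_MODE_MAP = {
    "16x16": ("16x16", "8x8"),
    "16x8": ("16x16", "8x4"),
    "8x16": ("16x16", "4x8"),
    "8x8": ("8x8", "8x8"),
    "4x8": ("8x8", "4x8"),
    "8x4": ("8x8", "8x4"),
    "4x4": ("8x8", "4x4"),
}


def map_inter_mb(h264_partition):
    """Return the proposed (VC-1 MB mode, VC-1 transform size) for a partition."""
    try:
        return INTER_MODE_MAP[h264_partition]
    except KeyError:
        raise ValueError(f"unknown H.264 inter partition: {h264_partition!r}")


for partition in INTER_MODE_MAP:
    mode, transform = map_inter_mb(partition)
    print(f"H.264 {partition:>5} -> VC-1 inter {mode:>5}, transform {transform}")
```

Because the decision is a table lookup on information already present in the H.264 bitstream, the VC-1 encoder does not need to search over MB modes or transform sizes for these macroblocks.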




Motion vector mapping:

Re-use of motion vectors selected in H.264 can significantly reduce the complexity of
VC-1 encoding. Table 4 describes the re-use of motion vectors.

          H.264 Inter MB VC-1 Inter MB             Motion Vector Re-use
            Inter 16x16   Inter 16x16               Same motion vectors
             Inter 16x8   Inter 16x16             Average of motion vectors
             Inter 8x16   Inter 16x16             Average of motion vectors
              Inter 8x8    Inter 8x8                Same motion vectors
              Inter 4x8    Inter 8x8              Average of motion vectors
              Inter 8x4    Inter 8x8              Average of motion vectors
              Inter 4x4    Inter 8x8              Average of motion vectors

                  Table 4 H.264 and VC-1 Inter MB motion vector mapping
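
The "average of motion vectors" entries in Table 4 can be sketched as follows. The function name is hypothetical, vectors are assumed to be integer (x, y) pairs in quarter-pel units, and round-half-up is an assumed rounding rule; the proposal does not specify one.

```python
def average_mvs(mvs):
    """Average a list of (x, y) motion vectors, rounding half up.

    Vectors are assumed to be integers in quarter-pel units. Integer
    arithmetic is used so the result stays on the quarter-pel grid.
    """
    if not mvs:
        raise ValueError("at least one motion vector is required")
    n = len(mvs)
    sx = sum(x for x, _ in mvs)
    sy = sum(y for _, y in mvs)
    return ((2 * sx + n) // (2 * n), (2 * sy + n) // (2 * n))


# An H.264 16x8 macroblock carries two vectors; VC-1 inter 16x16 needs one.
print(average_mvs([(4, -2), (6, -4)]))

# Four H.264 4x4 vectors inside one 8x8 block collapse to one VC-1 8x8 vector.
print(average_mvs([(1, 1), (2, 2), (3, 3), (4, 4)]))
```

For the "same motion vectors" rows the H.264 vector is passed through unchanged, which is the single-element case of the same function.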



Reference Pictures:

  The H.264/AVC standard allows up to sixteen reference pictures for motion
  estimation, while VC-1 uses only one or two, for P and B pictures
  respectively. Reusing motion vectors implies using the same reference pictures, so
  that the vectors keep their meaning. The motion vector conversion assumes that motion
  vector length is proportional to the reference picture distance [39]. The source motion
  vectors are scaled, as illustrated in Fig. 12, so that they point to valid VC-1 reference
  pictures. This conversion assumes constant motion between the H.264/AVC and VC-1
  reference pictures: each vector is scaled by the ratio of the temporal
  distances to the two reference pictures.
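
Under the constant-motion assumption, the scaling amounts to multiplying each vector component by the ratio of the two temporal distances. The sketch below shows this; the function name, the parameter names and the round-half-up rule are assumptions made for illustration.

```python
def scale_mv(mv, d_src, d_dst):
    """Scale an (x, y) motion vector from temporal distance d_src to d_dst.

    d_src: distance (in pictures) to the H.264 reference actually used.
    d_dst: distance (in pictures) to the reference available in VC-1.
    Assumes constant motion between the two references; rounds half up.
    """
    if d_src <= 0:
        raise ValueError("d_src must be positive")
    x, y = mv
    return ((2 * x * d_dst + d_src) // (2 * d_src),
            (2 * y * d_dst + d_src) // (2 * d_src))


# A vector pointing three pictures back is mapped to the previous picture.
print(scale_mv((12, -6), d_src=3, d_dst=1))
```

When the H.264 reference is already the picture VC-1 would use, `d_src` equals `d_dst` and the vector is returned unchanged.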

                            Fig 12 Motion vector scaling [38] (rows: H.264, VC-1)

  Skipped Macroblock:
  When a skipped macroblock is signaled in the bitstream, no further data is sent for
  that macroblock. Converting H.264 skip macroblocks to VC-1 skip is
  straightforward: since the skip macroblock definitions of the two standards are
  fully compatible, a direct conversion is possible.

  OPEN LOOP TRANSCODER:

  The open loop transcoder is built by cascading an H.264 encoder [44], an H.264
  decoder [44], a VC-1 encoder [45] and a VC-1 decoder [45].




YUV  →  H.264 Encoder  →  H.264 Decoder  →  VC-1 Encoder  →  VC-1 Decoder  →  YUV


                              Fig 13 Open loop transcoder

  Performance of open loop transcoder

  Mean square error (MSE), peak signal-to-noise ratio (PSNR) and structural
  similarity index measure (SSIM) are calculated for Foreman QCIF (3 frames) using the
  open loop transcoder.
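
For reference, MSE and PSNR between an original and a transcoded frame can be computed as in the sketch below. It treats a frame as a flat sequence of 8-bit luma samples; the function names and the tiny sample data are illustrative only, and SSIM is omitted because it needs windowed statistics.

```python
import math


def mse(ref, test):
    """Mean square error between two equally sized sample sequences."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=255):
    """PSNR in dB for samples with the given peak value (255 for 8-bit)."""
    err = mse(ref, test)
    if err == 0:
        return math.inf          # identical frames
    return 10 * math.log10(peak ** 2 / err)


original = [10, 20, 30, 40]      # illustrative luma samples, not real data
decoded = [12, 18, 30, 44]
print(mse(original, decoded), psnr(original, decoded))
```

In practice the same two functions would be applied per frame to the luma plane of the source YUV and of the YUV produced by the VC-1 decoder.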




Fig 14 MSE of open loop transcoder – Foreman sequence




Fig 15 PSNR of open loop transcoder – Foreman sequence




Fig 16 SSIM of open loop transcoder – Foreman sequence


CONCLUSIONS:

        As mentioned earlier, it is proposed to transcode an H.264 bitstream to a VC-1
stream in the pixel domain (CPDT) and to compare the results (MSE, PSNR, SSIM,
complexity, bit rates) against an open loop transcoder. Since the motion vectors are
not re-estimated, the complexity on the encoder side is expected to
fall by about 40-50%. The road map ahead is to extract re-usable information from
the H.264 bitstream for use in VC-1 encoding.

REFERENCES:

[1] VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE
   421M-2006), SMPTE Standard, 2006.
[2] T. Wiegand et al., “Overview of the H.264/AVC video coding standard,” IEEE
   Trans. CSVT, vol. 13, pp. 560-576, July 2003.
[3] C. Chen, P.-H. Wu and H. Chen, “MPEG-2 to H.264 transcoding,” Picture Coding
   Symposium, pp. 15-17 Dec, 2004.
[4] Jae-Beom Lee and H. Kalva, “An efficient algorithm for VC-1 to H.264 video
   transcoding in progressive compression,” IEEE International Conference on
   Multimedia and Expo, pp. 53-56, July 2006.
[5] J. Xin, C. W. Lin and M. T. Sun, “Digital video transcoding,” Proceedings of the
   IEEE, vol. 93, pp. 84-97, Jan. 2005.
[6] A. Vetro, C. Christopoulos and H. Sun, “Video transcoding architectures and
   techniques: An overview,” IEEE Signal Processing Magazine, vol. 20, pp. 18-29,
   March 2003.
[7] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 /
   ISO / IEC 14496-10, Mar. 2005.
[8] S. Srinivasan and S. L. Regunathan, “An overview of VC-1,” Proc. SPIE, vol.
   5960, pp. 720–728, 2005.
[9] P. List et al., “Adaptive deblocking filter,” IEEE Trans. Circuits Syst. Video
   Technol., vol. 13, pp. 614–619, Jun. 2003.
[10] T. D. Tran, J. Liang and C. Tu, “Lapped transform via time-domain pre- and post-
   filtering,” IEEE Trans. Signal Proc., vol. 51, pp. 1557–1571, Jun. 2003.




[11]C. C. Cheng, T. S. Chang, and K. B. Lee, “An in-place architecture for the
   deblocking filter in H.264/AVC,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol.
   53, pp. 530–534, Jul. 2006.
[12]T. C. Chen et al “Analysis and architecture design of an HDTV720p 30 frames/s
   H.264/AVC encoder,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 673
   – 688, Jun. 2006.
[13]Y.-W. Huang et al “Architecture design for deblocking filter in H.264 / JVT /
   AVC,” in IEEE Proc. Int. Conf. Multimedia and Expo, pp. 693–696, July 2003.
[14]S.-C. Chang et al “A platform based bus-interleaved architecture for de-blocking
   filter in H.264/MPEG-4 AVC,” IEEE Trans. Consumer Electron., vol. 51, pp.
   249–255, Feb 2005.
[15]M. Sima, Y. Zhou, and W. Zhang, “An efficient architecture for adaptive
   deblocking filter of H.264/AVC video coding,” IEEE Trans. Consumer
   Electronics, vol. 50, pp. 292–296, Feb. 2004.
[16]S.-Y. Shih, C.-R. Chang and Y.-L. Lin, “A near optimal deblocking filter for
   H.264 advanced video coding” in Proc. Asia and South Pacific Design
   Automation Conf., pp. 170–175, Jan 2006.
[17]T.-M. Liu et al, “A memory-efficient deblocking filter for H.264/AVC video
   coding,” in Proc. IEEE Int. Symp. Circuits Syst., pp. 2140–2143, May 2005.
[18]T.-M. Liu et al, “A 125 µ W fully scalable MPEG-2 and H.264/AVC video
   decoder for mobile applications,” IEEE J. Solid-State Circuits, vol. 42, pp. 161–
   169, Jan. 2007.
[19]L. Li, S. Goto and T. Ikenaga, “An efficient deblocking filter architecture with 2-
   dimensional parallel memory for H.264/AVC,” in Proc. Asia and South Pacific
   Design Automation Conf., pp.623–626, 2005
[20]H.-Y. Lin et al “Efficient deblocking filter architecture for H.264 video coders,”
   in IEEE ISCAS, pp 4, May 2006
[21]T.-M. Liu, W.-P. Lee and C.-Y. Lee, “An in/post-loop deblocking filter with
   hybrid filtering schedule” IEEE Trans. Circuits Syst. for Video Technol., vol. 17,
   pp. 937–943, Jul. 2007.
[22]I. Ahmad et al, “Video transcoding: An overview of various techniques and
   research Issues”, IEEE Trans. on Multimedia, vol. 7, pp. 793-8, Oct. 2005




[23]Y.L Lee and T.Q Nguyen, "Analysis and efficient architecture design for VC-1
   overlap smoothing and in-loop deblocking Filter," IEEE Trans Circuits and Syst.
   for Video Technol, vol.18, pp 1786-1796, Dec. 2008
[24]G. Fernandez-Escribano et al, “Speeding-up the macroblock partition mode
   decision for MPEG-2 to H.264 transcoding,” Proceedings of IEEE ICIP 2006,
   Atlanta, pp 869-872, Sept 2006.
[25]Z. Zhou et al "Motion information and coding mode reuse for MPEG-2 to H.264
   transcoding", Proceedings of the IEEE ISCAS 2005, pp 1230-1233, May 2005.
[26]B. Petljanski and H. Kalva, “DCT domain intra MB mode decision for MPEG-2
   to H.264 transcoding” Proceedings of the IEEE ICCE 2006, pp. 419-420, Jan
   2006.
[27]J. Bialkowski, A. Kaup and K. Illgner, “Fast transcoding of intra frames between
   H.263 and H.264,” IEEE ICIP, vol.4, pp. 2785- 2788, Oct 2004.
[28]Y.-K. Lee, S.-S. Lee, and Y.-L. Lee, “MPEG-4 to H.264 transcoding using
   macroblock statistics,” Proceedings of the IEEE ICME 2006, pp.57-60, Toronto,
   Canada, July 2006.
[29]G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC video coding
   standard: overview and introduction to the fidelity range extensions”, SPIE
   Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp.
   53-74 Aug 2004.
[30]T. Wiegand et al, “Introduction to the Special Issue on Scalable Video Coding—
   Standardization and Beyond” IEEE Trans on Circuits and Systems for Video
   Technology, Vol 17, pp 1034, Sept 2007.
[31]Von Roden and T. Praktische, “H.261 and MPEG1- A comparison” Conference
   Proceedings of the 1996 IEEE Fifteenth Annual International Phoenix Conference
   on Computers and Communications, pp.65-71, Mar 1996
[32]S. Srinivasan et al, “Windows Media Video 9: overview and applications” Signal
   Processing: Image Communication, Vol 19, pp 851-875, Oct 2004.
[33]S. K. Kwon, A. Tamhankar and K.R. Rao, "An overview of H.264/MPEG-4 Part
   10," Special issue of Journal of Visual Communication and Image
   Representation, vol. 17, pp 186-216, April 2006.
[34]G.A Davidson et al, “ATSC video and audio coding”, Proc. IEEE, vol 94, pp
   60-76, Jan 2006.


[35]J. Bialkowski, M. Barkowsky and A. Kaup, “Overview of low complexity video
   transcoding from H.263 to H.264” IEEE ICME, pp 49-52, 2006.
[36]T. D. Nguyen et al, “Efficient MPEG-4 to H.264/AVC transcoding with spatial
   downscaling”, ETRI Journal, vol.29, no.6, pp 826-828, Dec. 2007.
[37]H. Kalva, G.F. Escribano and K Kunzelmann, “Reduced resolution MPEG-2 to
   H.264 transcoder” Proc. SPIE, Vol. 7257, 72571V Jan 2009.
[38]S Moiron et al, "H.264/AVC to MPEG-2 video transcoding architecture", Proc
   Conf. on Telecommunications - ConfTele, Peniche, Portugal, Vol. 1, pp. 449 -
   452, May, 2007.
[39]S Moiron et al, “Video transcoding from H.264/AVC to MPEG-2 with reduced
   computational complexity”, Signal Processing: Image Communication, vol 24, pp
   637-650, September 2009
[40]Mei-Juan Chen, Ming-Chung Chu and Chih-Wei Pan, “Efficient motion-
   estimation algorithm for reduced frame-rate video transcoder”, IEEE Trans on
   Circuits and Systems for Video Technology, vol. 12, pp. 269–275, Apr. 2002.
[41]ISO/IEC 11172-2:1993 Information technology -- Coding of moving pictures and
   associated audio for digital storage media at up to about 1,5 Mbits/s -- Part 2:
   Video
[42]H. Kalva and J.B. Lee, "The VC-1 Video Coding Standard," IEEE Multimedia,
   vol. 14, pp. 88-91, Oct.-Dec. 2007
[43]P. Bordes, A. Orhand, “Improved Algorithm for fast transcoding H.264”
   EUSIPCO 2007.




REFERENCE BOOKS:


[44]K. Sayood, “Introduction to Data Compression”, III edition, Morgan
   Kaufmann publishers, 2006.
[45]I.E.G. Richardson, “H.264 and MPEG-4 video compression: video coding for
   next-generation multimedia”, Wiley, 2003.




[46]K. R. Rao and P. C. Yip, “The transform and data compression handbook”,
       Boca Raton, FL: CRC press, 2001.
[47]K.R. Rao and J.J. Hwang “Techniques and Standards for Image, Video, and
       Audio Coding” - Prentice Hall, 1996.
[48]J.B. Lee and H. Kalva, The VC-1 and H.264 Video Compression Standards
       for Broadband Video Services, Springer, 2008.


REFERENCE WEBSITES:


[49]JM software: http://iphome.hhi.de/suehring/tml/
[50]VC-1 software: http://www.smpte.org/home
[51]Microsoft website - VC-1 Technical Overview:
       http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx#VC1ComparedtoOtherCodecs
[52]VC-1 Wikipedia site: http://en.wikipedia.org/wiki/VC-1




ACRONYMS:

ASO                               Arbitrary slice ordering
AVC                               Advanced Video Coding
B MB                              Bi-predicted MB
CDDT                              Cascaded DCT Domain Transcoder
CPDT                              Cascaded Pixel Domain Transcoder
DCT                               Discrete Cosine Transform
DSP                               Digital Signal Processing
DVD                               Digital Versatile Disc
FMO                               Flexible macroblock ordering
FRExt                             Fidelity Range Extensions
GOP                               Group Of Pictures
I MB                              Intra Predicted MB
IEC                               International Electrotechnical Commission
ISO                               International Organization for Standardization
ITU-T                             International Telecommunication Union – Telecommunication
                                  Standardization Sector
JVT                               Joint Video Team
P MB                              Inter Predicted MB
IDCT                              Inverse Discrete Cosine Transform
IQ                                Inverse Quantizer
MB                                Macroblock


ME      Motion Estimation
MC      Motion Compensation
MV      Motion Vector
MPEG    Moving Picture Experts Group
MSE     Mean Square Error
PSNR    Peak Signal-to-Noise Ratio
Q       Quantizer
R-D     Rate - Distortion
SDDT    Simplified DCT Domain Transcoder
SP/SI   Switched P / Switched I
SMPTE   Society of Motion Picture and Television Engineers
SSIM    Structural Similarity Index Measure
SVC     Scalable Video Coding
VCEG    Video Coding Experts Group
VLC     Variable Length Coding
VLD     Variable Length Decoder
YUV     Y- Luminance and UV- Chrominance





proposal

  • 1. EE 5359 PROPOSAL H.264 to VC-1 TRANSCODING Vidhya Vijayakumar Student I.D.: 1000-622152 Date: September 24, 2009 1
  • 2. H.264 to VC-1 TRANSCODER
OBJECTIVE:
The objective of the thesis is to implement an H.264 bitstream to VC-1 transcoder for progressive compression.
MOTIVATION:
High definition video adoption has been growing rapidly for the last five years. The high definition DVD format Blu-ray has mandated MPEG-2 [3], H.264 [2] and VC-1 [1] as video compression formats. The coexistence of these different video coding standards creates a need for transcoding. As more and more end products use the above standards, transcoding from one format to another adds value to the product’s capability. While there has been recent work on MPEG-2 to H.264 transcoding [3] and VC-1 to H.264 transcoding [4], the published work on H.264 to VC-1 transcoding is nearly non-existent. This has created the motivation to develop a transcoder that can efficiently transcode an H.264 bitstream to a VC-1 bitstream.
DETAILS:
Video transcoding is the operation of converting video from one format to another [5]. A format is defined by characteristics such as bit-rate, spatial resolution etc. One of the earliest applications of transcoding is to adapt the bit-rate of a compressed stream to the channel bandwidth for universal multimedia access in all kinds of channels like wireless networks, Internet, dial-up networks etc. Changes in the characteristics of an encoded stream like bit-rate, spatial resolution, quality etc can also be achieved by scalable video coding [5]. However, in cases where the available network bandwidth is insufficient or if it fluctuates with time, it may be difficult to set the base layer bit-rate. In addition, scalable video coding demands additional complexities at both the encoder and the decoder.
The basic architecture for converting an H.264 bitstream into a VC-1 elementary stream arises from complete decoding of the H.264 stream and then re-encoding into a VC-1 stream. However, this involves significant computational complexity [6]. Hence there is also a need to transcode at low complexity. Transcoding can in general be implemented in the spatial domain, in the transform domain, or in a combination of the two domains. The common transcoding architectures [5] are:
Open loop transform domain transcoding
Fig. 1 Open loop transform domain transcoder architecture [5]
  • 3. Open loop transcoders are computationally efficient (Fig 1). They operate in the DCT domain. However, they are subject to drift error, which arises from rounding, quantization loss and clipping functions.
Cascaded Pixel Domain Architecture (CPDT)
Fig. 2 Cascaded pixel domain transcoder architecture [5]
This is the most basic transcoding architecture (Fig 2). The motion vectors from the incoming bit stream are extracted and reused. Thus the motion estimation block, which accounts for 60% of the encoder computation, is eliminated. As compared to the previous architecture, CPDT is drift free. Hence, even though it is slightly more complex, it is suited for heterogeneous transcoding between different standards where the basic parameters like mode decisions, motion vectors etc are to be re-derived.
Simplified DCT Domain transcoders (SDDT)
Fig. 3 Simplified transform domain transcoder architecture [5]
This transcoder is based on the assumption that DCT, IDCT and motion compensation are linear processes (Fig 3). This architecture requires that motion compensation be performed in the DCT domain, which is a major computationally intensive operation [3]. For instance, as shown in Fig. 4, the goal is to compute the DCT coefficients of the target block B from the four overlapping blocks B1, B2, B3 and B4.
  • 4. Fig. 4 Transform domain motion compensation illustration [5]
Also, clipping functions and rounding operations performed for interpolation in fractional pixel motion compensation lead to a drift in the transcoded video.
Cascaded DCT Domain transcoders (CDDT)
Fig. 5 Cascaded transform domain transcoder architecture [5]
This is used for spatial/temporal resolution downscaling and other coding parameter changes (Fig 5). As compared with SDDT, greater flexibility is achieved by introducing another transform domain motion compensation block; however, it is far more computationally intensive and requires more memory [3]. It is often applied to downscaling applications, where the encoder-side memory cost is modest because of the downscaled resolution.
  • 5. Choice of basic transcoder architecture:
DCT domain transcoders have the main drawback that motion compensation in the transform domain is very computationally intensive. DCT domain transcoders are also less flexible than pixel domain transcoders. For instance, the SDDT architecture can only be used for bit rate reduction transcoding: it assumes that the spatial and temporal resolutions stay the same and that the output video uses the same frame types, mode decisions and motion vectors as the input video. For H.264 to VC-1 transcoding, several changes are required to accommodate the mismatches between the two standards. For instance, for motion estimation and compensation, H.264 supports 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4 macroblock partitions (Fig 6), but VC-1 supports 16x16 and 8x8 only (Fig 7). The transform size and type (8x8 and 4x4 in H.264 and 8x8, 4x8, 8x4 and 4x4 in VC-1) are different and make transform domain transcoding prohibitively complex. Hence, DCT domain transcoders are not well suited to this task.
Fig.6 Segmentations of the macroblock for motion compensation in H.264. Top: segmentation of macroblocks, bottom: segmentation of 8x8 partitions. [2]
Fig.7 Segmentations of the macroblock for motion compensation in VC-1 [2]
From Fig. 8, it can be inferred that the cascaded pixel domain architecture outperforms the DCT domain transcoders. Also, for larger GOP sizes, the drift in DCT domain transcoders becomes more significant.
  • 6. Fig.8 PSNR vs Bit-rate graph for the Foreman sequence transcoded with a GOP size 15, using different transcoding architectures as described in Figs. 1, 2, 3 and 5. [5]
Hence, heterogeneous transcoding in the pixel domain is preferred for standards transcoding.
Standards transcoding:
When transcoding between two different standards, the main factor involved is compatibility between the profile and level of the input stream and that of the output stream for a specific purpose. The goal here is to transcode an H.264 bitstream of Baseline profile to a VC-1 bitstream of Simple profile. Table 1 compares and contrasts the characteristics of both standards.

                                       H.264 High Profile                         VC-1 Main Profile
Chroma format                          4:2:0                                      4:2:0
Picture coding type                    I, P, B                                    I, P, B
Transform size                         4x4, 8x8                                   8x8, 4x8, 8x4, 4x4
Intra prediction                       Directional predictors                     None
Block sizes for motion compensation    16x16, 16x8, 8x16, 8x8, 4x8, 8x4, 4x4      16x16, 8x8

Table 1 Main characteristics of H.264 Main profile and VC-1 Main profile
Overview of H.264:
H.264 [2] is a standard for video compression, and is equivalent to MPEG-4 Part 10, or MPEG-4 AVC (for advanced video coding) (Fig 9). As of 2008, it is the latest block-oriented motion-compensation-based video standard developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC
  • 7. Moving Picture Experts Group (MPEG), and it was the product of a partnership effort known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are jointly maintained so that they have identical technical content. Fig 9 H.264 Encoder [32] Fig 10. H.264 Decoder [32] The standardization of the first version of H.264/AVC was completed in May 2003. The JVT then developed extensions to the original standard that are known as the Fidelity Range Extensions (FRExt) [29]. These extensions enable higher quality video coding by supporting increased sample bit depth precision and higher-resolution color information, including sampling structures known as YUV 4:2:2 and YUV 4:4:4. Several other features are also included in the Fidelity Range Extensions project, such as adaptive switching between 4×4 and 8×8 integer transforms, encoder- specified perceptual-based quantization weighting matrices, efficient inter-picture lossless coding, and support of additional color spaces. The design work on the Fidelity Range Extensions was completed in July 2004, and the drafting work on them was completed in September 2004. Scalable video coding (SVC) [30] as specified in Annex G of H.264/AVC allows the construction of bitstreams that contain sub-bitstreams that conform to H.264/AVC. For temporal bitstream scalability, i.e., the presence of a sub-bitstream with a smaller temporal sampling rate than the bitstream, complete access units are removed from the bitstream when deriving the sub-bitstream. In this case, high-level syntax and inter prediction reference pictures in the bitstream are constructed accordingly. For spatial and quality bitstream scalabilities, i.e. the presence of a sub- bitstream with lower spatial resolution or quality than the bitstream, network 7
  • 8. abstraction layer (NAL) units are removed from the bitstream when deriving the sub- bitstream. In this case, inter-layer prediction, i.e., the prediction of the higher spatial resolution or quality signal by data of the lower spatial resolution or quality signal, is typically used for efficient coding. The Scalable Video Coding extension was completed in November 2007. Some of the features adopted in H.264 for enhancement of prediction, improved coding efficiency and robustness to data errors/losses are listed as follows. Features for enhancement of prediction • Directional spatial prediction for intra coding • Variable block-size motion compensation with small block size Figure 11 – Various block sizes in H.264 • Quarter-sample-accurate motion compensation • Motion vectors over picture boundaries • Multiple reference picture motion compensation • Decoupling of referencing order from display order • Decoupling of picture representation methods from picture referencing capability • Weighted prediction • Improved “skipped” and “direct” motion inference • In-the-loop deblocking filtering Features for improved coding efficiency • Small block-size transform • Exact-match inverse transform 8
  • 9. Figure – Forward 4x4 and 8x8 integer transform • Short word-length transform • Hierarchical block transform • Arithmetic entropy coding • Context-adaptive entropy coding Features for robustness to data errors/losses • Parameter set structure • NAL unit syntax structure • Flexible slice size • Flexible macroblock ordering (FMO) • Arbitrary slice ordering (ASO) • Redundant pictures • Data partitioning • SP/SI synchronization/switching pictures Profiles in H.264 H.264 standard defines numerous profiles. • Constrained baseline profile • Baseline • Main profile • Extended profile • High profile 9
  • 10. High 10 profile • High 4:2:2 profile • High 4:4:4 predictive profile • High stereo profile • High 10 intra profile • High 4:2:2 intra profile • High 4:4:4 intra profile • CAVLC 4:4:4 intra profile • Scalable baseline profile • Scalable high profile • Scalable high intra profile Table Features in baseline, main and extended profile Table Features in high profile 10
  • 11. Figure 12 Comparison of H.264 baseline, main, extended and high profile (nested profile diagram; labelled features: adaptive transform block size, quantization scaling matrices, CABAC, data partition, B slice, SI slice, weighted prediction, SP slice, I slice, P slice, CAVLC, arbitrary slice order, flexible macroblock order, redundant slice)
Overview of VC-1
VC-1 [1] is the informal name of the SMPTE 421M video codec standard initially developed by Microsoft. It was released on April 3, 2006 by SMPTE. It is now a supported standard for Blu-ray Discs and Windows Media Video 9.
VC-1 is an evolution of the conventional DCT-based video codec design also found in H.261 [31], H.263 [27], MPEG-1 [41] and MPEG-2 [3]. It is widely characterized as an alternative to the latest ITU-T and MPEG video codec standard known as H.264/MPEG-4 AVC. VC-1 contains coding tools for interlaced video sequences as well as progressive encoding. The main goal of VC-1 Advanced Profile development and standardization was to support the compression of interlaced content without first converting it to progressive, making it more attractive to broadcast and video industry professionals.
The VC-1 codec is designed to achieve state-of-the-art compressed video quality at bit rates that may range from very low to very high. The codec can easily handle 1920 pixel × 1080 pixel resolution at 6 to 30 megabits per second (Mbps) for high-definition video. VC-1 is capable of higher resolutions such as 2048 pixels × 1536 pixels for digital cinema, and of a maximum bit rate of 135 Mbps. An example of very low bit rate video would be 160 pixel × 120 pixel resolution at 10 kilobits per second (kbps) for modem applications.
  • 12. The basic functionality of VC-1 involves a block-based motion compensation and spatial transform scheme similar to that used in other video compression standards such as MPEG-1 and H.261 [31]. However, VC-1 includes a number of innovations and optimizations that make it distinct from the basic compression scheme, resulting in excellent quality and efficiency. VC-1 Advanced Profile is also transport independent. This provides even greater flexibility for device manufacturers and content services.
Fig. 11 VC-1 Codec [32]
Profiles in VC-1
VC-1 defines three profiles:
1. Simple
2. Main
3. Advanced

                                           Simple   Main   Advanced
Baseline intra frame compression           Yes      Yes    Yes
Variable-sized transform                   Yes      Yes    Yes
16-bit transform                           Yes      Yes    Yes
Overlapped transform                       Yes      Yes    Yes
4 motion vector per macroblock             Yes      Yes    Yes
  • 13.
                                           Simple   Main   Advanced
¼ pixel luminance motion compensation      Yes      Yes    Yes
¼ pixel chrominance motion compensation    No       Yes    Yes
Start codes                                No       Yes    Yes
Extended motion vectors                    No       Yes    Yes
Loop filter                                No       Yes    Yes
Dynamic resolution change                  No       Yes    Yes
Adaptive macroblock quantisation           No       Yes    Yes
B frames                                   No       Yes    Yes
Intensity compensation                     No       Yes    Yes
Range adjustment                           No       Yes    Yes
Field and frame coding modes               No       No     Yes
GOP Layer                                  No       No     Yes
Display metadata                           No       No     Yes

Table – Features in VC-1 profiles [49]
Innovations
  • 14. VC-1 includes a number of innovations that enable it to produce high quality content. This section provides brief descriptions of some of these features. Adaptive Block Size Transform Traditionally, 8 × 8 transforms have been used for image and video coding. However, there is evidence to suggest that 4 × 4 transforms can reduce ringing artifacts at edges and discontinuities. VC-1 is capable of coding an 8 × 8 block using either an 8 × 8 transform, two 8 × 4 transforms, two 4 × 8 transforms, or four 4 × 4 transforms. This feature enables coding that takes advantage of the different transform sizes as needed for optimal image quality. Figure – VC-1 transform sizes [4] 16-Bit Transforms In order to minimize the computational complexity of the decoder, VC-1 uses 16-bit transforms. This also has the advantage of easy implementation on the large amount of digital signal processing (DSP) hardware built with 16-bit processors. Among the constraints put on transforms specified in VC-1 is the requirement that the 16-bit values used produce results that can fit in 16 bits. The constraints on transforms ensure that decoding is as efficient as possible on a wide range of devices. Motion Compensation Motion compensation is the process of generating a prediction of a video frame by displacing the reference frame. Typically, the prediction is formed for a block (an 8 × 8 pixel tile) or a macroblock (a 16 × 16 pixel tile) of data. The displacement of data due to motion is defined by a motion vector, which captures the shift along both the x- and y-axes. Figure VC-1 motion compensation sizes [4] 14
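The motion compensation step described above (a block prediction formed by displacing the reference frame by a motion vector) can be illustrated with a minimal Python sketch. It assumes integer-pel motion vectors, a frame stored as a list of rows, and edge clamping for vectors that point outside the picture; the name `predict_block` and these choices are illustrative assumptions, not part of the VC-1 specification or of the proposed transcoder.

```python
from typing import List, Tuple

Frame = List[List[int]]  # luma samples, indexed frame[row][col]


def predict_block(ref: Frame, x: int, y: int, mv: Tuple[int, int], size: int = 8) -> Frame:
    """Return the size x size prediction for the block whose top-left corner is (x, y).

    mv is (dx, dy) in whole pixels (assumption: no sub-pixel interpolation here).
    """
    dx, dy = mv
    height, width = len(ref), len(ref[0])
    block = []
    for r in range(size):
        # Clamp coordinates so vectors pointing outside the picture repeat edge samples.
        row_idx = min(max(y + dy + r, 0), height - 1)
        row = []
        for c in range(size):
            col_idx = min(max(x + dx + c, 0), width - 1)
            row.append(ref[row_idx][col_idx])
        block.append(row)
    return block
```

With `size=8` this yields a block prediction and with `size=16` a macroblock prediction, matching the two tile sizes named in the text.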
  • 15. The efficiency of the codec is affected by the size of the predicted block, the granularity of sub-pixel data that can be captured, and the type of filter used for generating sub-pixel predictors. VC-1 uses 16 × 16 blocks for prediction, with the ability to generate mixed frames of 16 × 16 and 8 × 8 blocks. The finest granularity of sub-pixel information supported by VC-1 is 1/4 pixel. Two sets of filters are used by VC-1 for motion compensation. The first is an approximate bicubic filter with four taps. The second is a bilinear filter with two taps. The four-tap bicubic filters used in VC-1 for ¼ and ½ pixel shifts are: [-4 53 18 -3]/64 and [-1 9 9 -1]/16. Figure – Integer, half and quarter pel positions [2] (A-Q Integer, aa-hh half, a-s quarter pel positions) VC-1 combines the motion vector settings defined by the block size, sub- pixel resolution, and filter type into modes. The result is four motion compensation modes that suit a range of different situations. This classification of settings into modes also helps compact decoder implementations. Loop Filtering VC-1 uses an in-loop deblocking filter that attempts to remove block- boundary discontinuities introduced by quantization errors in interpolated frames. These discontinuities can cause visible artifacts in the decompressed video frames and can impact the quality of the frame as a predictor for future interpolated frames. 15
  • 16. Figure – Loop filtering in VC-1 [4] (Only pixel p4 and p5 are filtered) The loop filter takes into account the adaptive block size transforms. The filter is also optimized to reduce the number of operations required. Interlaced Coding Interlaced video content is widely used in television broadcasting. When encoding interlaced content, the VC-1 codec can take advantage of the characteristics of interlaced frames to improve compression. This is achieved by using data from both fields to predict motion compensation in interpolated frames. Advanced B Frame Coding A bi-directional or B frame is a frame that is interpolated from data both in previous and subsequent frames. B frames are distinct from I frames (also called key frames), which are encoded without reference to other frames. B frames are also distinct from P frames, which are interpolated from previous frames only. VC-1 includes several optimizations that make B frames more efficient. VC-1 does not have a fixed group of pictures (GOP) structure and the number of pictures in a GOP can vary. Fading Compensation Due to the nature of compression that uses motion compensation, encoding of video frames that contain fades to or from black is very inefficient. With a uniform fade, every macroblock needs adjustments to luminance. VC-1 includes fading compensation, which detects fades and uses alternate methods to adjust luminance. This feature improves compression efficiency for sequences with fading and other global illumination changes. Differential Quantization Differential quantization, or dquant, is an encoding method in which multiple quantization steps are used within a single frame. Rather than quantize the entire frame with a single quantization level, macroblocks are identified within the frame that might benefit from lower quantization levels and greater number of preserved AC 16
  • 17. coefficients. Such macroblocks are then encoded at lower quantization levels than the one used for the remaining macroblocks in the frame. The simplest and typically most efficient form of differential quantization involves only two quantizer levels (bi-level dquant), but VC-1 also supports multiple levels.
MAPPING DIFFERENCES BETWEEN THE TWO STANDARDS:
The transcoding algorithm considered in this research assumes full H.264 decoding down to the pixel level, followed by a reduced complexity VC-1 encoding. The data gathered during the H.264 decoding stage is used to accelerate the VC-1 encoding stage. It is assumed that the H.264 encoded bitstream is generated with an R-D optimized encoder. The picture coding types used are similar in both the standards. The transform size and type are different and make transform domain transcoding prohibitively complex. The semantics of intra MBs are similar except for the intra directional prediction allowed in H.264 and the mixed MBs in VC-1. The inter prediction has significant differences including the block size of MC, block size of transform, and reference frames used. The similarities that remain between the codecs can be exploited in reducing the transcoding complexity.
Intra MB Mode Mapping:
An intra MB in the incoming H.264 bitstream is coded as a VC-1 intra MB. An H.264 intra MB can be coded as Intra 4x4 (9 different directional modes) or Intra 16x16 (4 different modes), but a VC-1 intra MB has four 8x8 blocks and has no prediction modes. Since an intra MB in VC-1 uses the 8x8 transform irrespective of the block size (16x16 or 4x4) in H.264, the intra prediction type used in H.264 need not be carried over. Table 2 shows the proposed intra MB mapping.

H.264 Intra MB              VC-1 Intra MB
Intra 16x16 (Any mode)      Intra MB 8x8
Intra 4x4 (Any mode)        Intra MB 8x8

Table 2 H.264 and VC-1 Intra MB mapping
Figure – Matrix for one-dimensional 8-point inverse transform [32]
Inter MB Mode Mapping:
  • 18. An inter coded MB in the incoming H.264 bitstream is coded as an inter MB in VC-1. The inter MB in H.264 has 7 different motion compensation sizes: 16x16, 16x8, 8x16, 8x8, 4x8, 8x4 and 4x4. The inter MB in VC-1 has 2 different motion compensation sizes, 16x16 and 8x8. Another significant difference is that H.264 uses 4x4 (and 8x8 in fidelity range extensions) transform sizes, whereas VC-1 uses 4 different transform sizes: 8x8, 4x8, 8x4 and 4x4.
The 16x16, 8x16 and 16x8 motion compensation sizes are usually selected in H.264 for areas that are relatively uniform, and they will be mapped to an inter 16x16 MB in VC-1. The selected H.264 MC block size serves as a measure of homogeneity in the block and is used to choose the transform size to be applied in VC-1. The 8x8, 8x4, 4x8 and 4x4 modes are usually selected in H.264 for areas that have non-uniform motion. The 16x16 mode in VC-1 is eliminated for such non-uniform MBs. The MB is then mapped to the 8x8 block size in VC-1, with the H.264 block size determining the transform size to be used in VC-1. Table 3 describes the decision making for mapping the inter MBs and the type of transform to be used in VC-1.

H.264 Inter MB    VC-1 Inter MB    Transform size in VC-1
Inter 16x16       Inter 16x16      8x8
Inter 16x8        Inter 16x16      8x4
Inter 8x16        Inter 16x16      4x8
Inter 8x8         Inter 8x8        8x8
Inter 4x8         Inter 8x8        4x8
Inter 8x4         Inter 8x8        8x4
Inter 4x4         Inter 8x8        4x4

Table 3 H.264 and VC-1 Inter MB mapping and VC-1 transform type
Motion vector mapping:
Re-use of motion vectors selected in H.264 can significantly reduce the complexity of VC-1 encoding. Table 4 describes the re-use of motion vectors.

H.264 Inter MB    VC-1 Inter MB    Motion Vector Re-use
Inter 16x16       Inter 16x16      Same motion vectors
Inter 16x8        Inter 16x16      Average of motion vectors
Inter 8x16        Inter 16x16      Average of motion vectors
Inter 8x8         Inter 8x8        Same motion vectors
Inter 4x8         Inter 8x8        Average of motion vectors
Inter 8x4         Inter 8x8        Average of motion vectors
Inter 4x4         Inter 8x8        Average of motion vectors

Table 4 H.264 and VC-1 Inter MB motion vector mapping
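The mappings in Tables 3 and 4 can be combined into a small lookup, sketched below in Python. The table contents are taken from the proposal; the function names, the representation of a motion vector as an integer `(dx, dy)` pair, and the round-half-away-from-zero rule used when averaging are assumptions for illustration only. The caller is assumed to pass the H.264 vectors that cover one target VC-1 block (the whole macroblock for a 16x16 target, one 8x8 block for an 8x8 target).

```python
from typing import Dict, List, Tuple

MotionVector = Tuple[int, int]  # (dx, dy), assumed to be in quarter-pel units

# Table 3: H.264 partition -> (VC-1 motion compensation size, VC-1 transform size)
MODE_MAP: Dict[str, Tuple[str, str]] = {
    "16x16": ("16x16", "8x8"),
    "16x8": ("16x16", "8x4"),
    "8x16": ("16x16", "4x8"),
    "8x8": ("8x8", "8x8"),
    "4x8": ("8x8", "4x8"),
    "8x4": ("8x8", "8x4"),
    "4x4": ("8x8", "4x4"),
}


def _round_div(total: int, n: int) -> int:
    """Divide total by n, rounding to nearest with ties away from zero (assumption)."""
    q = (2 * abs(total) + n) // (2 * n)
    return q if total >= 0 else -q


def average_mv(mvs: List[MotionVector]) -> MotionVector:
    """Table 4: average the H.264 vectors; a single vector is returned unchanged."""
    n = len(mvs)
    return (_round_div(sum(mv[0] for mv in mvs), n),
            _round_div(sum(mv[1] for mv in mvs), n))


def map_inter_partition(h264_partition: str, mvs: List[MotionVector]):
    """Return (VC-1 MC size, VC-1 transform size, re-used motion vector)."""
    vc1_size, transform = MODE_MAP[h264_partition]
    return vc1_size, transform, average_mv(mvs)
```

For example, an Inter 16x8 macroblock with vectors `(4, 2)` and `(6, -2)` maps to a VC-1 Inter 16x16 macroblock with the 8x4 transform and the averaged vector `(5, 0)`.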
  • 19. Reference Pictures:
The H.264/AVC standard defines the use of up to sixteen reference pictures for motion estimation, while VC-1 uses only one or two, according to the slice type P or B respectively. The reuse of motion vectors implies using the same reference pictures to maintain their meaning. The motion vector conversion assumes that motion vector length is related to the reference image distance [39]. The source motion vectors are scaled, as shown in Fig 12, in order to use valid VC-1 reference pictures. This conversion assumes constant motion between the H.264/AVC and VC-1 reference pictures. The motion vector conversion is performed by scaling the vector with the temporal distance between the two reference pictures.
Fig 12 Motion vector scaling, H.264 to VC-1 [38]
Skipped Macroblock:
When a skipped macroblock is signaled in the bit stream, no further data is sent for that macroblock. The mode conversion of H.264 skip macroblocks to VC-1 skip is a straightforward process. Since the skip macroblock definition of both standards is fully compatible, a direct conversion is possible.
OPEN LOOP TRANSCODER:
The open loop transcoder is designed by cascading an H.264 encoder [44], H.264 decoder [44], VC-1 encoder [45] and a VC-1 decoder [45].
YUV → H.264 Encoder → H.264 Decoder → VC-1 Encoder → VC-1 Decoder → YUV
Fig 13 Open loop transcoder
Performance of open loop transcoder
Mean square error (MSE), peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) for Foreman QCIF (3 frames) are calculated using the open loop transcoder.
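The MSE and PSNR figures reported for the open loop transcoder follow the standard definitions, which can be sketched in Python as below. The sketch assumes 8-bit samples (peak value 255) and frames flattened into equal-length sequences; the function names are illustrative and this is not the measurement tool used in the proposal.

```python
import math
from typing import Sequence


def mse(reference: Sequence[int], decoded: Sequence[int]) -> float:
    """Mean square error between two equally sized sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB: 10 * log10(peak^2 / MSE); infinite for identical inputs."""
    error = mse(reference, decoded)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)
```

A uniform error of one grey level gives an MSE of 1 and a PSNR of about 48.13 dB for 8-bit video.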
  • 20. Fig 14 MSE of open loop transcoder – Foreman sequence Fig 15 PSNR of open loop transcoder – Foreman sequence 20
  • 21. Fig 16 SSIM of open loop transcoder – Foreman sequence
CONCLUSIONS:
As mentioned earlier, it is proposed to transcode an H.264 bitstream to a VC-1 stream in the pixel domain (CPDT) and compare the results (MSE, PSNR, SSIM, complexity, bit rates) against an open loop transcoder. Since the motion vectors are not re-estimated, the complexity on the encoder side reduces by about 40-50%. The road map ahead is to extract re-usable information from the H.264 bitstream to be used in VC-1 encoding.
REFERENCES:
[1] VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE 421M-2006), SMPTE Standard, 2006.
[2] T. Wiegand et al, “Overview of the H.264/AVC video coding standard,” IEEE Trans. CSVT, Vol. 13, pp. 560-576, July 2003.
[3] C. Chen, P-H. Wu and H. Chen, “MPEG-2 to H.264 transcoding,” Picture Coding Symposium, pp. 15-17 Dec, 2004.
[4] Jae-Beom Lee and H. Kalva, "An efficient algorithm for VC-1 to H.264 video transcoding in progressive compression," IEEE International Conference on Multimedia and Expo, pp. 53-56, July 2006.
[5] J. Xin, C.W. Lin and M.T. Sun, “Digital video transcoding”, Proceedings of the IEEE, Vol. 93, pp 84-97, Jan 2005.
[6] A. Vetro, C. Christopoulos and H. Sun, “Video transcoding architectures and techniques: An overview”, IEEE Signal Processing Magazine, Vol. 20, pp 18-29, March 2003.
[7] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 / ISO / IEC 14496-10, Mar 2005.
[8] S. Srinivasan and S. L. Regunathan, “An overview of VC-1” Proc. SPIE, vol. 5960, pp. 720–728, 2005.
[9] P. List et al, “Adaptive deblocking filter,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 614–619, Jun. 2003.
[10] T. D. Tran, J. Liang and C. Tu, “Lapped transform via time-domain pre- and post-filtering,” IEEE Trans. Signal Proc., vol. 51, pp. 1557–1571, Jun. 2003.
  • 22. [11] C. C. Cheng, T. S. Chang, and K. B. Lee, “An in-place architecture for the deblocking filter in H.264/AVC,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, pp. 530–534, Jul. 2006.
[12] T. C. Chen et al “Analysis and architecture design of an HDTV720p 30 frames/s H.264/AVC encoder,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, pp. 673–688, Jun. 2006.
[13] Y.-W. Huang et al “Architecture design for deblocking filter in H.264 / JVT / AVC,” in IEEE Proc. Int. Conf. Multimedia and Expo, pp. 693–696, July 2003.
[14] S.-C. Chang et al “A platform based bus-interleaved architecture for de-blocking filter in H.264/MPEG-4 AVC,” IEEE Trans. Consumer Electron., vol. 51, pp. 249–255, Feb 2005.
[15] M. Sima, Y. Zhou, and W. Zhang, “An efficient architecture for adaptive deblocking filter of H.264/AVC video coding,” IEEE Trans. Consumer Electronics, vol. 50, pp. 292–296, Feb. 2004.
[16] S.-Y. Shih, C.-R. Chang and Y.-L. Lin, “A near optimal deblocking filter for H.264 advanced video coding” in Proc. Asia and South Pacific Design Automation Conf., pp. 170–175, Jan 2006.
[17] T.-M. Liu et al, “A memory-efficient deblocking filter for H.264/AVC video coding,” in Proc. IEEE Int. Symp. Circuits Syst., pp. 2140–2143, May 2005.
[18] T.-M. Liu et al, “A 125 µW fully scalable MPEG-2 and H.264/AVC video decoder for mobile applications,” IEEE J. Solid-State Circuits, vol. 42, pp. 161–169, Jan. 2007.
[19] L. Li, S. Goto and T. Ikenaga, “An efficient deblocking filter architecture with 2-dimensional parallel memory for H.264/AVC,” in Proc. Asia and South Pacific Design Automation Conf., pp. 623–626, 2005.
[20] H.-Y. Lin et al “Efficient deblocking filter architecture for H.264 video coders,” in IEEE ISCAS, pp 4, May 2006.
[21] T.-M. Liu, W.-P. Lee and C.-Y. Lee, “An in/post-loop deblocking filter with hybrid filtering schedule” IEEE Trans. Circuits Syst. for Video Technol., vol. 17, pp. 937–943, Jul. 2007.
[22] I. Ahmad et al, “Video transcoding: An overview of various techniques and research issues”, IEEE Trans. on Multimedia, vol. 7, pp. 793-8, Oct. 2005.
  • 23. [23] Y.L. Lee and T.Q. Nguyen, "Analysis and efficient architecture design for VC-1 overlap smoothing and in-loop deblocking filter," IEEE Trans Circuits and Syst. for Video Technol, vol. 18, pp 1786-1796, Dec. 2008.
[24] G. Fernandez-Escribano et al, “Speeding-up the macroblock partition mode decision for MPEG-2 to H.264 transcoding,” Proceedings of IEEE ICIP 2006, Atlanta, pp 869-872, Sept 2006.
[25] Z. Zhou et al "Motion information and coding mode reuse for MPEG-2 to H.264 transcoding", Proceedings of the IEEE ISCAS 2005, pp 1230-1233, May 2005.
[26] B. Petljanski and H. Kalva, “DCT domain intra MB mode decision for MPEG-2 to H.264 transcoding” Proceedings of the IEEE ICCE 2006, pp. 419-420, Jan 2006.
[27] J. Bialkowski, A. Kaup and K. Illgner, “Fast transcoding of intra frames between H.263 and H.264,” IEEE ICIP, vol. 4, pp. 2785-2788, Oct 2004.
[28] Y.-K. Lee, S.-S. Lee, and Y.-L. Lee, “MPEG-4 to H.264 transcoding using macroblock statistics,” Proceedings of the IEEE ICME 2006, pp. 57-60, Toronto, Canada, July 2006.
[29] G. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC video coding standard: overview and introduction to the fidelity range extensions”, SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug 2004.
[30] T. Wiegand et al, “Introduction to the Special Issue on Scalable Video Coding - Standardization and Beyond” IEEE Trans on Circuits and Systems for Video Technology, Vol 17, pp 1034, Sept 2007.
[31] Von Roden and T. Praktische, “H.261 and MPEG1- A comparison” Conference Proceedings of the 1996 IEEE Fifteenth Annual International Phoenix Conference on Computers and Communications, pp. 65-71, Mar 1996.
[32] S. Srinivasan et al, “Windows Media Video 9: overview and applications” Signal Processing: Image Communication, Vol 19, pp 851-875, Oct 2004.
[33] S. K. Kwon, A. Tamhankar and K.R. Rao, "An overview of H.264/MPEG-4 Part 10," Special issue of Journal of Visual Communication and Image Representation, vol. 17, pp 186-216, April 2006.
[34] G.A. Davidson et al, “ATSC video and audio coding”, Proc. IEEE, vol 94, pp 60-76, Jan 2006.
  • 24. [35] J. Bialkowski, M. Barkowsky and A. Kaup, “Overview of low complexity video transcoding from H.263 to H.264” IEEE ICME, pp 49-52, 2006.
[36] T. D. Nguyen et al, “Efficient MPEG-4 to H.264/AVC transcoding with spatial downscaling”, ETRI Journal, vol. 29, no. 6, pp 826-828, Dec. 2007.
[37] H. Kalva, G.F. Escribano and K. Kunzelmann, “Reduced resolution MPEG-2 to H.264 transcoder” Proc. SPIE, Vol. 7257, 72571V, Jan 2009.
[38] S. Moiron et al, "H.264/AVC to MPEG-2 video transcoding architecture", Proc Conf. on Telecommunications - ConfTele, Peniche, Portugal, Vol. 1, pp. 449-452, May 2007.
[39] S. Moiron et al, “Video transcoding from H.264/AVC to MPEG-2 with reduced computational complexity”, Signal Processing: Image Communication, vol 24, pp 637-650, September 2009.
[40] Mei-Juan Chen, Ming-Chung Chu and Chih-Wei Pan, “Efficient motion-estimation algorithm for reduced frame-rate video transcoder”, IEEE Trans on Circuits and Systems for Video Technology, vol. 12, pp. 269–275, Apr. 2002.
[41] ISO/IEC 11172-2:1993 Information technology -- Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbits/s -- Part 2: Video.
[42] H. Kalva and J.B. Lee, "The VC-1 Video Coding Standard," IEEE Multimedia, vol. 14, pp. 88-91, Oct.-Dec. 2007.
[43] P. Bordes, A. Orhand, “Improved Algorithm for fast transcoding H.264” EUSIPCO 2007.
REFERENCE BOOKS:
[44] K. Sayood, “Introduction to Data compression”, III edition, Morgan Kaufmann publishers, 2006.
[45] I.E.G. Richardson, “H.264 and MPEG-4 video compression: video coding for next-generation multimedia”, Wiley, 2003.
  • 25. [46] K. R. Rao and P. C. Yip, “The transform and data compression handbook”, Boca Raton, FL: CRC press, 2001.
[47] K.R. Rao and J.J. Hwang “Techniques and Standards for Image, Video, and Audio Coding”, Prentice Hall, 1996.
[48] J.B. Lee and H. Kalva, The VC-1 and H.264 Video Compression Standards for Broadband Video Services, Springer, 2008.
REFERENCE WEBSITES:
[49] JM software: http://iphome.hhi.de/suehring/tml/
[50] VC-1 Software: http://www.smpte.org/home
[51] Microsoft website - VC-1 Technical Overview: http://www.microsoft.com/windows/windowsmedia/howto/articles/vc1techoverview.aspx#VC1ComparedtoOtherCodecs
[52] VC-1 Wikipedia site - http://en.wikipedia.org/wiki/VC-1
[53]
ACRONYMS:
ASO     Arbitrary slice ordering
AVC     Advanced Video Coding
B MB    Bi-predicted MB
CDDT    Cascaded DCT Domain Transcoder
CPDT    Cascaded Pixel Domain Transcoder
DCT     Discrete Cosine Transform
DSP     Digital Signal Processing
DVD     Digital Versatile Disc
FMO     Flexible macroblock ordering
FRExt   Fidelity Range Extensions
GOP     Group Of Pictures
I MB    Intra Predicted MB
IEC     International Electrotechnical Commission
ISO     International Organization for Standardization
ITU-T   International Telecommunication Union – Telecommunication Standardization Sector
JVT     Joint Video Team
P MB    Inter Predicted MB
IDCT    Inverse Discrete Cosine Transform
IQ      Inverse Quantizer
MB      Macroblock
  • 26.
ME      Motion Estimation
MC      Motion Compensation
MV      Motion Vector
MPEG    Moving Picture Experts Group
MSE     Mean Square Error
PSNR    Peak Signal to Noise Ratio
Q       Quantizer
R-D     Rate - Distortion
SDDT    Simplified DCT Domain Transcoder
SP/SI   Switched P / Switched I
SMPTE   Society of Motion Picture and Television Engineers
SSIM    Structural Similarity Index Measure
SVC     Scalable Video Coding
VCEG    Video Coding Experts Group
VLC     Variable Length Coding
VLD     Variable Length Decoder
YUV     Y- Luminance and UV- Chrominance