Multimedia basic video compression techniques

• A video consists of a time-ordered sequence of frames, i.e., images.
• An obvious solution to video compression would be predictive coding based on
previous frames.
• Compression proceeds by subtracting images: subtract in time order and code the
residual error.
• It can be done even better by searching for just the right parts of the image to
subtract from the previous frame.
• This scheme will likely work well for a fixed scene background, but would it
work with moving objects across a frame sequence?
• The solution is based on motion estimation and motion compensation.

• Consecutive frames in a video are similar — temporal redundancy exists.
• A video can be viewed as a sequence of images stacked in the temporal dimension.
• Temporal redundancy is often significant, and it is exploited so that not every frame of the video needs to be
coded independently as a new image. Instead, the difference between the current frame and other frame(s) in the
sequence is coded. If the redundancy between them is great enough, the difference images consist mainly of small values
and have low entropy, which is good for compression.
• All modern digital video compression algorithms (including H.264 and H.265) adopt this Hybrid coding approach,
i.e., predicting and compensating for the differences between video frames to remove the temporal redundancy, and
then transform coding on the residual signal (the differences) to reduce the spatial redundancy.
• Steps of Video compression based on Motion Compensation (MC):
1. Motion Estimation (motion vector search).
2. MC-based Prediction.
3. Derivation of the prediction error, i.e., the difference.

Motion Compensation
• Each image is divided into macroblocks of size N × N.
• By default, N = 16 for luminance images, and N = 8 for chrominance images if 4:2:0 chroma subsampling
is adopted.
• Motion compensation is performed at the macroblock level.
• The current image frame is referred to as the Target frame.
• A match is sought between the macroblock in the Target frame and the most similar macroblock
in previous and/or future frame(s), referred to as Reference frame(s), so the Target
macroblock is predicted from the Reference macroblock(s).
• The displacement of the reference macroblock to the target macroblock is called a motion vector
(MV).
• The difference of the two corresponding macroblocks is the prediction error.
• For video compression based on motion compensation, after the first frame, only
the motion vectors and difference macroblocks need be coded.

• MV search is usually limited to a small immediate neighborhood: both
horizontal and vertical displacements are in the range [−p, p], where p is a
positive integer with a relatively small value. This makes a search window of
size (2p + 1) × (2p + 1). The upper left corner (x, y) is used as the origin of the
macroblock in the Target frame.
[Figure: Macroblocks and Motion Vector in Video Compression]

Search for Motion Vectors
• The search for motion vectors MV(u, v) as defined above is a matching problem, also called a
correspondence problem.
• The difference between two macroblocks can then be measured by their Mean Absolute Difference
(MAD):
$$\mathrm{MAD}(i, j) = \frac{1}{N^2} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \left| C(x + k,\, y + l) - R(x + i + k,\, y + j + l) \right| \qquad (1)$$
N: size of the macroblock.
k and l: indices for pixels in the macroblock.
i and j: horizontal and vertical displacements.
C(x + k, y + l): pixels in the macroblock in the Target frame.
R(x + i + k, y + j + l): pixels in the macroblock in the Reference frame.
• The goal of the search is to find a vector (i, j) as the motion vector MV = (u, v), such that MAD(i, j)
is minimum:
(u, v) = [ (i, j) | MAD(i, j) is minimum, i ∊ [−p, p], j ∊ [−p, p] ]

Sequential Search
• Sequential search: sequentially search the whole (2p+1)× (2p+1) window in the
Reference frame (also referred to as Full search).
• A macroblock centered at each of the positions within the window is compared to the
macroblock in the Target frame pixel by pixel, and their respective MAD is then derived
using Eq. (1).
• The vector (i, j) that offers the least MAD is designated as the MV (u, v) for the
macroblock in the Target frame.
• The sequential search method is very costly. Assuming each pixel comparison requires
three operations (subtraction, absolute value, addition), the cost for obtaining a
motion vector for a single macroblock is
(2p + 1) · (2p + 1) · N² · 3 ⇒ O(p²N²).
• Example :

PROCEDURE Motion-vector: sequential-search
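
The procedure body is not reproduced on this slide. A minimal Python sketch of a full search that follows Eq. (1) might look as follows. The function names `mad` and `sequential_search`, the list-of-rows frame layout (`frame[row][column]`), and the choice to skip candidates that fall outside the Reference frame are assumptions made for illustration, not part of the original procedure.

```python
def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference of Eq. (1) for displacement (i, j).

    Frames are lists of rows, so pixel (x, y) is frame[y][x] (assumed layout).
    """
    total = 0
    for k in range(n):          # horizontal pixel index inside the macroblock
        for l in range(n):      # vertical pixel index inside the macroblock
            total += abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
    return total / (n * n)


def sequential_search(target, reference, x, y, n, p):
    """Full search of the (2p+1) x (2p+1) window; returns ((u, v), min_mad)."""
    height, width = len(reference), len(reference[0])
    best_mv, best_mad = (0, 0), float("inf")
    for i in range(-p, p + 1):
        for j in range(-p, p + 1):
            # Assumption: candidates that leave the Reference frame are skipped.
            if not (0 <= x + i <= width - n and 0 <= y + j <= height - n):
                continue
            current = mad(target, reference, x, y, i, j, n)
            if current < best_mad:
                best_mad, best_mv = current, (i, j)
    return best_mv, best_mad
```

Calling `sequential_search(target, reference, x, y, 16, 15)` would examine up to 31 × 31 candidate positions for one 16 × 16 macroblock, which is why the method is costly.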

2D Logarithmic Search
• Logarithmic search: a cheaper version that is suboptimal but still usually effective.
• The procedure for 2D Logarithmic Search of motion vectors takes several iterations
and is akin to a binary search:
[Figure: 2D logarithmic search; the locations examined in iterations 1, 2, and 3 are marked '1', '2', and '3'.]
• Initially, only nine locations in the search window are used as seeds for a MAD-based
search; they are marked as '1'.
• After the one that yields the minimum MAD is located, the center of the new search region is
moved to it and the step-size ("offset") is reduced to half.
• In the next iteration, the nine new locations are marked as '2', and so on.

• Instead of sequentially comparing with (2p + 1)² macroblocks from the Reference
frame, the 2D Logarithmic Search will compare with only 9 · (log₂ p + 1) macroblocks.
In fact, it would be 8 · (log₂ p + 1) + 1, since the comparison that
yielded the least MAD from the last iteration can be reused. Therefore, the complexity
drops to O(log p · N²). Since p is usually of the same order of magnitude as N, the
saving is substantial compared to O(p²N²). Using the same example as in the previous
subsection, the total operations per second drop to
(Note: please rearrange this together with the example so that the difference can be seen.)
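
To make the difference concrete, the two per-macroblock operation counts can be computed directly. This is only a sketch: the function names are hypothetical, three operations per pixel comparison are taken from the sequential-search discussion, and rounding log₂ p up when p is not a power of two is an assumption.

```python
import math


def sequential_ops(p, n, ops_per_pixel=3):
    """Operations for one macroblock with full search: (2p+1)^2 * N^2 * 3."""
    return (2 * p + 1) ** 2 * n * n * ops_per_pixel


def logarithmic_ops(p, n, ops_per_pixel=3):
    """Operations for one macroblock with 8 * (log2 p + 1) + 1 comparisons."""
    # Assumption: log2(p) is rounded up when p is not a power of two.
    iterations = math.ceil(math.log2(p)) + 1
    return (8 * iterations + 1) * n * n * ops_per_pixel


if __name__ == "__main__":
    p, n = 15, 16  # illustrative values: H.261 range p = 15, macroblock N = 16
    print(sequential_ops(p, n), logarithmic_ops(p, n))
```

With these illustrative values the full search needs 961 block comparisons per macroblock, while the logarithmic search needs 41.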

PROCEDURE Motion-vector: 2D-logarithmic-search
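
The procedure body is not reproduced on this slide. One possible Python sketch of the iteration described above is given here. The starting offset of ⌈p/2⌉, stopping once the offset reaches 1, and skipping out-of-frame candidates are assumptions; `mad` and `logarithmic_search` are illustrative names.

```python
import math


def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference of Eq. (1); frames are lists of rows."""
    total = sum(abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
                for k in range(n) for l in range(n))
    return total / (n * n)


def logarithmic_search(target, reference, x, y, n, p):
    """2D logarithmic search; returns ((u, v), mad). Requires p >= 1."""
    height, width = len(reference), len(reference[0])

    def cost(i, j):
        # Assumption: positions outside the Reference frame are never chosen.
        inside = 0 <= x + i <= width - n and 0 <= y + j <= height - n
        return mad(target, reference, x, y, i, j, n) if inside else float("inf")

    center, offset = (0, 0), math.ceil(p / 2)   # assumed initial offset
    while True:
        # Nine locations: the center and its eight neighbours at this offset.
        candidates = [(center[0] + di * offset, center[1] + dj * offset)
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)]
        candidates = [(i, j) for i, j in candidates
                      if abs(i) <= p and abs(j) <= p]
        center = min(candidates, key=lambda c: cost(*c))
        if offset == 1:
            return center, cost(*center)
        offset = math.ceil(offset / 2)          # halve the step size
```

Because only nine positions are examined per iteration, the result can be a local minimum rather than the true best match, which is the sense in which the method is suboptimal. For simplicity this sketch recomputes the MAD of the current center instead of reusing it.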

Hierarchical Search
• The search can benefit from a hierarchical
(multiresolution) approach in which initial
estimation of the motion vector can be
obtained from images with a significantly
reduced resolution.
• A three-level hierarchical search in which the
original image is at Level 0, images at Levels
1 and 2 are obtained by down-sampling from
the previous levels by a factor of 2, and the
initial search is conducted at Level 2.
• Since the size of the macroblock is smaller and p can also be proportionally reduced, the number
of operations required is greatly reduced.
[Figure: three-level hierarchical search. Level 0 is downsampled by a factor of 2 to give Level 1, and again to give Level 2; motion estimation is performed at each level to produce the motion vector.]
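
A rough Python sketch of the three-level idea follows. The 2 × 2 averaging filter, the refinement of the doubled vector within ±1 pixel at each finer level, and all function names are assumptions for illustration; the slides only state that the initial estimate comes from the lowest-resolution level with a smaller macroblock and a proportionally smaller p.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2 x 2 pixel group (assumed filter)."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * r][2 * c] + frame[2 * r][2 * c + 1]
              + frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) / 4
             for c in range(cols)] for r in range(rows)]


def mad(target, reference, x, y, i, j, n):
    """Mean Absolute Difference of Eq. (1); frames are lists of rows."""
    total = sum(abs(target[y + l][x + k] - reference[y + j + l][x + i + k])
                for k in range(n) for l in range(n))
    return total / (n * n)


def local_search(target, reference, x, y, n, center, radius):
    """Full search of all displacements within `radius` of `center`."""
    height, width = len(reference), len(reference[0])
    best, best_cost = center, float("inf")
    for i in range(center[0] - radius, center[0] + radius + 1):
        for j in range(center[1] - radius, center[1] + radius + 1):
            if 0 <= x + i <= width - n and 0 <= y + j <= height - n:
                cost = mad(target, reference, x, y, i, j, n)
                if cost < best_cost:
                    best, best_cost = (i, j), cost
    return best


def hierarchical_search(target, reference, x, y, n, p, levels=3):
    """Estimate the MV at the coarsest level, then refine it level by level."""
    pyramid = [(target, reference)]
    for _ in range(levels - 1):
        t, r = pyramid[-1]
        pyramid.append((downsample(t), downsample(r)))
    top = levels - 1
    scale = 2 ** top
    t, r = pyramid[top]
    # Macroblock size and search range shrink with the resolution.
    mv = local_search(t, r, x // scale, y // scale, n // scale,
                      (0, 0), -(-p // scale))
    for level in range(top - 1, -1, -1):
        scale = 2 ** level
        t, r = pyramid[level]
        # Assumption: refine around the doubled vector within +/- 1 pixel.
        mv = local_search(t, r, x // scale, y // scale, n // scale,
                          (2 * mv[0], 2 * mv[1]), 1)
    return mv
```

In this sketch the widest search happens on the smallest images, and each finer level only checks nine candidates around the scaled-up estimate.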

• Please insert Table (10.1) here to compare the previous methods.

H.261
• H.261: an earlier digital video compression standard. Its principle of MC-based
compression is retained in all later video compression standards.
• The standard was designed for videophone, video conferencing, and other
audiovisual services over ISDN (Integrated Services Digital Network).
• The video codec supports bit-rates of p × 64 kbps, where p ranges from 1
to 30 (hence it is also known as p ∗ 64, pronounced "p star 64").
• The standard requires that the delay of the video encoder be less than 150 msec (milliseconds) so that
the video can be used for real-time bidirectional video conferencing.

• The following table shows the video formats supported by H.261:

Video format | Luminance image resolution | Chrominance image resolution | Bit-rate (Mbps) (if 30 fps and uncompressed) | H.261 support
QCIF         | 176 × 144                  | 88 × 72                      | 9.1                                          | required
CIF          | 352 × 288                  | 176 × 144                    | 36.5                                         | optional
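
The uncompressed bit-rates in the table can be checked with a short calculation. This sketch assumes 8 bits per sample, which the table does not state explicitly, and two chrominance planes (Cb and Cr) at half the luminance resolution in each direction; `uncompressed_mbps` is a hypothetical helper name.

```python
def uncompressed_mbps(width, height, fps=30, bits_per_sample=8):
    """Uncompressed bit-rate in Mbps for a 4:2:0 frame of the given luminance size."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)   # Cb and Cr planes
    return (luma + chroma) * bits_per_sample * fps / 1e6


if __name__ == "__main__":
    print(round(uncompressed_mbps(176, 144), 1))   # QCIF
    print(round(uncompressed_mbps(352, 288), 1))   # CIF
```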

ITU Recommendations & H.261 Video Formats
• H.261 belongs to the following set of ITU (International Telecommunication Union)
recommendations for visual telephony systems:
1. H.221 — Frame structure for an audiovisual channel supporting 64 to 1,920 kbps.
2. H.230 — Frame control signals for audiovisual systems.
3. H.242 — Audiovisual communication protocols.
4. H.261 — Video encoder/decoder for audiovisual services at p × 64 kbps.
5. H.320 — Narrow-band audiovisual terminal equipment for p × 64 kbps transmission.

H.261 Frame Sequence
• Two types of image frames are defined: Intra-frames (I-frames) and Inter-frames (P-frames):
– I-frames are treated as independent images. A transform coding method similar to JPEG is applied
within each I-frame, hence "Intra".
– P-frames are not independent: they are coded by a forward predictive coding method (prediction from a
previous P-frame is allowed, not just from a previous I-frame), in which current macroblocks are
predicted from similar macroblocks in the preceding I- or P-frame, and differences between the
macroblocks are coded.

H.261 Frame Sequence Cont.
• Temporal redundancy removal is included in P-frame coding, whereas I-frame
coding performs only spatial redundancy removal.
• To avoid propagation of coding errors, an I-frame is usually sent a couple of times in each second of the video.
• Motion vectors in H.261 are always measured in units of full pixels, and they have a
limited range of ± 15 pixels, i.e., p = 15.

Intra-frame (I-frame) Coding
• Macroblocks are of size 16 × 16 pixels for the Y frame, and 8 × 8 for Cb and Cr frames, since
4:2:0 chroma subsampling is employed.
• A macroblock consists of four Y, one Cb, and one Cr 8 × 8 blocks.
• For each 8 × 8 block a DCT transform is applied; the DCT coefficients then go through
quantization, zigzag scan, and entropy coding.
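
As an illustration of the zigzag scan step only (DCT, quantization, and entropy coding are omitted), the following sketch generates a JPEG-style zigzag order for an 8 × 8 block and reads a block in that order. The function names are hypothetical, and the exact scan table used by H.261 is assumed to match this conventional pattern.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in conventional zigzag order."""
    def key(rc):
        r, c = rc
        diagonal = r + c
        # Odd anti-diagonals run top-right to bottom-left, even ones the reverse.
        return (diagonal, r if diagonal % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block of coefficients into a zigzag-ordered list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

The scan places low-frequency coefficients first, so the many zero-valued high-frequency coefficients that remain after quantization end up in long runs near the end of the list.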

Inter-frame (P-frame) Predictive Coding
• The H.261 P-frame coding scheme is based on motion compensation.
• For each macroblock in the Target frame, a motion vector is allocated by
one of the search methods discussed earlier.
• After the prediction, a difference macroblock is derived to measure the
prediction error. It is also carried in the form of four Y blocks, one Cb, and one
Cr block.
• Each of these 8 × 8 blocks goes through DCT, quantization, zigzag scan, and
entropy coding procedures.

Inter-frame (P-frame) Predictive Coding Cont.
• The P-frame coding encodes the difference macroblock (not the Target macroblock itself).
• Since the difference macroblock usually has a much smaller entropy than the Target
macroblock, a large compression ratio is attainable.
• Sometimes, a good match cannot be found, i.e., the prediction error exceeds a certain
acceptable level.
• The MB itself is then encoded (treated as an Intra MB), and in this case it is termed a
non-motion-compensated MB.
• For the motion vector, the difference MVD is sent for entropy coding:
MVD = MV_Preceding − MV_Current

10.4.3 Quantization in H.261
• The quantization in H.261 does not use 8 × 8
quantization matrices, as in JPEG and MPEG.
• The quantization in H.261 uses a constant step_size
for all DCT coefficients within a macroblock. If
we use DCT and QDCT to denote the DCT
coefficients before and after the quantization, then
for DC coefficients in Intra mode:
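
The formula itself is missing from this slide. A common textbook statement of H.261 quantization, given here as an assumption rather than as the slide's own content, is QDCT = round(DCT / 8) for the Intra DC coefficient and QDCT = ⌊DCT / (2 × scale)⌋ for all other coefficients, with scale an integer from 1 to 31. The sketch below uses hypothetical function names.

```python
import math


def quantize_intra_dc(dct):
    """Assumed rule: the Intra DC coefficient uses a fixed step_size of 8."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return round(dct / 8)


def quantize_other(dct, scale):
    """Assumed rule: step_size = 2 * scale, one value per macroblock."""
    if not 1 <= scale <= 31:
        raise ValueError("scale is assumed to lie in the range 1..31")
    return math.floor(dct / (2 * scale))
```

The point of the contrast with JPEG is that a single `scale` applies to every coefficient of the macroblock instead of a per-frequency 8 × 8 table.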

10.4.4 H.261 Encoder and Decoder
• A scenario is used where frames I, P1, and P2 are encoded and then
decoded.
• Note: decoded frames (not the original frames) are used as
reference frames in motion estimation.
• The data that go through the observation points indicated by the
circled numbers are summarized.

H.263
• H.263 is an improved video coding standard for videoconferencing and other audiovisual
services transmitted on Public Switched Telephone Networks (PSTN).
• It aims at low bitrate communications at bitrates of less than 64 kbps.
• It was adopted by the ITU-T Study Group 15 in 1995. Similar to H.261, it uses predictive
coding for inter-frames, to reduce temporal redundancy, and transform coding for the
remaining signal, to reduce spatial redundancy (for both intra-frames and difference
macroblocks from inter-frame prediction).
• It supports more standard frame sizes (SQCIF, QCIF, CIF, 4CIF, 16CIF).
• It also works well at higher rates. H.263 Version 2 (H.263+) followed from ITU-T (1/98).

Motion compensation in H.263
• The process of motion compensation in H.263 is similar to that of H.261.
• The motion vector (MV) is, however, not simply derived from the current macroblock. The
horizontal and vertical components of the MV are predicted from the median values of
the horizontal and vertical components, respectively, of MV1, MV2, and MV3 from the "previous",
"above", and "above and right" macroblocks. Namely, for the macroblock with MV(u, v):
u_p = median(u1, u2, u3),
v_p = median(v1, v2, v3).
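
The median prediction can be sketched in a few lines of Python. The function names are hypothetical, and the second helper rests on an assumption not stated on this slide: that what gets coded is the error vector (u − u_p, v − v_p) rather than the MV itself.

```python
from statistics import median


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of the three neighbouring motion vectors."""
    up = median([mv1[0], mv2[0], mv3[0]])
    vp = median([mv1[1], mv2[1], mv3[1]])
    return up, vp


def mv_error(mv, mv1, mv2, mv3):
    """Assumed coded quantity: difference between the MV and its prediction."""
    up, vp = predict_mv(mv1, mv2, mv3)
    return mv[0] - up, mv[1] - vp
```

For example, with neighbours (2, 0), (5, −3), and (3, 1), the predicted vector is (3, 0), so a current MV of (4, 1) would leave only a small error vector to code.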

Optional H.263 coding modes
• Besides its core coding algorithm, H.263 specifies many negotiable coding options in its various
Annexes. Four of the common options are as follows:
1. Unrestricted motion vector mode. The pixels referenced are no longer restricted to within the
boundary of the image. When the motion vector points outside the image boundary, the value of the
boundary pixel geometrically closest to the referenced pixel is used. This is beneficial when image
content is moving across the edge of the image, often caused by object and/or camera movements.
This mode also allows an extension of the range of motion vectors. The maximum range of motion
vectors is [−31.5, 31.5], which enables efficient coding of fast-moving objects.
2. Syntax-based arithmetic coding mode.
Like H.261, H.263 uses Variable-Length Coding (VLC) as a default coding method for the DCT
coefficients. Variable-length coding implies that each symbol must be coded into a fixed, integral
number of bits. By employing arithmetic coding, this restriction is removed, and a higher
compression ratio can be achieved.
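
The boundary rule of the unrestricted motion vector mode (option 1) amounts to clamping the referenced coordinates to the image. A minimal sketch, with a hypothetical function name and a list-of-rows frame layout assumed:

```python
def reference_pixel(frame, x, y):
    """Return frame[y][x], using the closest boundary pixel when (x, y) is outside."""
    height, width = len(frame), len(frame[0])
    clamped_x = min(max(x, 0), width - 1)
    clamped_y = min(max(y, 0), height - 1)
    return frame[clamped_y][clamped_x]
```

With this rule a macroblock prediction can still be formed when the motion vector points partly or fully outside the picture, for instance when an object enters from the edge.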

Optional H.263 coding modes
3. Advanced prediction mode.
• In this mode, the macroblock size for motion compensation is reduced from 16 to 8. Four motion vectors
(one from each of the 8 × 8 blocks) are generated for each macroblock in the luminance image. Afterwards,
each pixel in the 8 × 8 luminance prediction block takes a weighted sum of three predicted values based on
the motion vector of the current luminance block and two out of the four motion vectors from the
neighboring blocks, that is, one from the block at the left or right side of the current luminance block and
one from the block above or below. Although sending four motion vectors incurs some additional
overhead, the use of this mode generally yields better prediction and hence considerable gain in
compression.
4. PB-frames mode.
• As shown by MPEG (detailed discussions in Chap. 11), the introduction of a B-frame, which is
predicted bidirectionally from both the previous frame and the future frame, can often improve the
quality of prediction and hence the compression ratio without sacrificing picture quality. In H.263, a
PB-frame consists of two pictures coded as one unit: one P-frame, predicted from the previous
decoded I-frame or P-frame (or P-frame part of a PB-frame), and one B-frame, predicted from both
the previous decoded I- or P-frame and the P-frame currently being decoded.

H.263+ and H.263++
• H.263+. The aim of H.263+ is to broaden the potential applications and offer
additional flexibility in terms of custom source formats, different pixel aspect ratios,
and clock frequencies. H.263+ includes numerous recommendations to improve coding
efficiency and error resilience [9]. It also provides 12 new negotiable modes, in
addition to the four optional modes in H.263.
• H.263++. The development of H.263 continued beyond its second version. The third
version, H.263 v3, also known as H.263++, was initially approved in the year 2000.
A further developed version was approved in 2005. H.263++ includes the baseline
coding methods of H.263 and additional recommendations such as enhanced reference
picture selection (ERPS) and data partition slice (DPS).
• ERPS mode operates by managing a multi-frame buffer for stored frames, enhancing
coding efficiency and error resilience.
• DPS mode provides additional enhancement.

Moving Picture Experts Group (MPEG)
• The MPEG-1 audio/video digital compression standard [2,3] was approved by the
International Organization for Standardization/International Electrotechnical Commission
(ISO/IEC) MPEG group in November 1991 for Coding of Moving Pictures and Associated Audio
for Digital Storage Media at up to about 1.5 Mbit/s [4].
• Common digital storage media include compact discs (CDs) and video compact discs (VCDs).
Out of the specified 1.5 Mbps, 1.2 Mbps is intended for coded video, and 256 kbps (kilobits per second)
can be used for stereo audio. This yields a picture quality comparable to VHS cassettes and a
sound quality equal to CD audio. In general, MPEG-1 adopts the CCIR601 digital TV format,
also known as Source Input Format (SIF). MPEG-1 supports only noninterlaced video. Normally,
its picture resolution is 352 × 240 for NTSC video at 30 fps, or 352 × 288 for PAL video at 25 fps. It
uses 4:2:0 chroma subsampling.

Motion compensation in MPEG-1
• As discussed in the last chapter, motion-compensation-based video encoding in
H.261 works as follows: in motion estimation, each macroblock of the target P-frame
is assigned a best matching macroblock from the previously coded I- or P-frame.
This is called a prediction. The difference between the macroblock and its matching
macroblock is the prediction error, which is sent to DCT and its subsequent encoding
steps.
• MPEG introduces a third frame type, B-frames, and their accompanying
bidirectional motion compensation. Figure 11.2 illustrates the motion-compensation-based B-frame coding idea. In
addition to the forward prediction, a backward prediction is also performed, in which the matching macroblock is
obtained from a future I- or P-frame in the video sequence. Consequently, each macroblock from a
B-frame will specify up to two motion vectors, one from the forward and one from
the backward prediction.

Other Major Differences from H.261
• Source formats: H.261 supports only CIF (352 × 288) and QCIF (176 × 144)
source formats. MPEG-1 supports SIF (352 × 240 for NTSC, 352 × 288 for PAL).
• Slices: Instead of GOBs, as in H.261, an MPEG-1 picture can be divided into one or
more slices, which are more flexible than GOBs.
• Quantization: MPEG-1 quantization uses different quantization tables for its intra-
and inter-coding. The quantizer numbers for intra-coding vary within a macroblock.
This is different from H.261, where all quantizer numbers for AC coefficients are
constant within a macroblock.

MPEG-1 video Bitstream
• 1. Sequence layer: A video sequence consists of one or more groups of pictures (GOPs).
• 2. GOPs layer: A GOP contains one or more pictures, one of which must be an I-picture.
The GOP header contains information such as time_code to indicate hour-minute-second-frame
from the start of the sequence.
• 3. Picture layer: The three common MPEG-1 picture types are I-picture (intra-coding),
P-picture (predictive coding), and B-picture (bidirectional predictive coding).
• 4. Slice layer: As mentioned earlier, MPEG-1 introduced the slice notion for bitrate control
and for recovery and synchronization after lost or corrupted bits.
• 5. Macroblock layer: Each macroblock consists of four Y blocks, one Cb block, and one Cr
block. All blocks are 8 × 8.
• 6. Block layer: If the blocks are intra-coded, the differential DC coefficient (DPCM of
DCs, as in JPEG) is sent first.

MPEG-2
• MPEG-2 is aimed at coding high-quality video at 4-15 Mbps for video on demand (VOD), standard
definition (SD) and high-definition (HD) digital TV broadcasting, and for storing
video on digital storage media such as DVD.
• MPEG-2 should have scalable coding and should include error resilience
techniques.
• MPEG-2 should provide good NTSC quality video at 4-6 Mbps and transparent
NTSC quality video at 8-10 Mbps.
• MPEG-2 should provide random access to frames.
• MPEG-2 should be compatible with MPEG-1 (an MPEG-2 decoder should be able
to decode an MPEG-1 bitstream).
• MPEG-2 should allow low-cost decoders.

Supporting Interlaced video
• MPEG-1 supports only noninterlaced (progressive) video.
• Since MPEG-2 is adopted by digital broadcast TV, it must also support interlaced video, because this
is one of the options for digital broadcast TV and HDTV.
• As mentioned earlier, in interlaced video each frame consists of two fields, referred
to as the top-field and the bottom-field. In a frame-picture,
all scanlines from both fields are interleaved to form a single frame. This is then divided into 16 × 16
macroblocks and coded using motion compensation. On the other hand, if each field is treated as a
separate picture, then it is called a field picture. As Fig. 11.6a shows,
each frame-picture can be split into two field pictures. The figure shows 16 scanlines from a
frame-picture on the left, as opposed to 8 scanlines in each of the two field portions of a field picture on the
right.

Five Modes of Prediction in MPEG-2
• 1. Frame prediction for frame-pictures. This is identical to MPEG-1
motion-compensation-based prediction methods in both P-frames and
B-frames. Frame prediction works well for videos containing only slow
and moderate object and camera motions.
• 2. Field prediction for field pictures (see Fig. 11.6b). This mode uses a
macroblock size of 16 × 16 from field pictures. For P-field pictures (the
rightmost ones shown in the figure), predictions are made from the two
most recently encoded fields. Macroblocks in the top-field picture are
forward-predicted from the top-field or bottom-field pictures of the
preceding I- or P-frame.

• 3. Field prediction for frame-pictures. This mode treats the top-field
and bottom-field of a frame-picture separately. Accordingly, each 16 × 16
macroblock from the target frame-picture is split into two 16 × 8 parts,
each coming from one field. Field prediction is carried out for these
16 × 8 parts in a manner similar to that shown in Fig. 11.6b. Besides the
smaller block size, the only difference is that the bottom-field will not
be predicted from the top-field of the same frame, since we are dealing
with frame-pictures now.
• 4. 16 × 8 MC for field pictures. Each 16 × 16 macroblock from the target field picture
is now split into top and bottom 16 × 8 halves, that is, the first eight rows and the
next eight rows. Field prediction is performed on each half. As a result, two motion
vectors will be generated for each 16 × 16 macroblock in the P-field picture and up to
four motion vectors for each macroblock in the B-field picture. This mode is good
for finer motion compensation when motion is rapid and irregular.

• 5. Dual-prime for P-pictures. This is the only mode that can be used
for either frame-pictures or field pictures. At first, field prediction
from each previous field with the same parity (top or bottom) is made.
Each motion vector MV is then used to derive a calculated motion
vector CV in the field with the opposite parity, taking into account the
temporal scaling and vertical shift between lines in the top and bottom
fields. In this way, the pair MV and CV yields two preliminary
predictions for each macroblock. Their prediction errors are averaged
and used as the final prediction error. This mode is aimed at
mimicking B-picture prediction for P-pictures without adopting
backward prediction (and hence with less encoding delay).

MPEG-2 supports the following scalabilities:
• SNR scalability: The enhancement layer provides higher SNR.
• Spatial scalability: The enhancement layer provides higher spatial
resolution.
• Temporal scalability: The enhancement layer facilitates higher
frame rate.
• Hybrid scalability: This combines any two of the above three
scalabilities.
• Data partitioning: Quantized DCT coefficients are split into
partitions.

1-SNR Scalability
• Figure 11.8 illustrates how SNR scalability works in the MPEG-2 encoder
and decoder.
• The MPEG-2 SNR scalable encoder generates output bitstreams
Bits_base and Bits_enhance at two layers.
• The base-layer data, after variable-length coding, forms the bitstream called Bits_base.
• The base-layer coefficients are also inversely quantized (Q⁻¹) and fed to the enhancement layer.
• The enhancement-layer data, after variable-length coding, becomes the bitstream called
Bits_enhance.

2-Spatial Scalability
• The base and enhancement layers for MPEG-2 spatial
scalability are not as tightly coupled as in SNR scalability;
hence, this type of scalability is somewhat less complicated.

3-Temporal Scalability
• Temporally scalable coding has both the base and enhancement layers of
video at a reduced temporal rate (frame rate).

4-Hybrid Scalability
• Any two of the above three scalabilities can be combined to form
hybrid scalability. These combinations are:
• Spatial and temporal hybrid scalability
• SNR and spatial hybrid scalability
• SNR and temporal hybrid scalability
• Usually, a three-layer hybrid coder will be adopted, consisting of
base layer, enhancement layer 1, and enhancement layer 2. For
example, for spatial and temporal hybrid scalability, the base
layer and enhancement layer 1 will provide spatial scalability,
and enhancement layers 1 and 2 will provide temporal
scalability, in which enhancement layer 1 is effectively serving as
a base layer.

5- Data Partitioning
• The compressed video stream is divided into two partitions. The base
partition contains lower-frequency DCT coefficients, and the
enhancement partition contains high-frequency DCT coefficients.
Although the partitions are sometimes also referred to as layers (base
layer and enhancement layer), strictly speaking, data partitioning does
not conduct the same type of layered coding, since a single stream of
video data is simply divided up and does not depend further on the base
partition in generating the enhancement partition. Nevertheless, data
partitioning can be useful for transmission over noisy channels and for
progressive transmission.
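
The split into partitions can be illustrated with a small sketch. It assumes the quantized coefficients of a block are already in zigzag order (low frequencies first) and that a single break point separates the two partitions; `break_point` and both function names are hypothetical.

```python
def partition_coefficients(zigzag_coeffs, break_point):
    """Split zigzag-ordered coefficients into (base, enhancement) partitions."""
    return zigzag_coeffs[:break_point], zigzag_coeffs[break_point:]


def merge_partitions(base, enhancement=None, block_len=64):
    """Rebuild a block; a missing enhancement partition is replaced by zeros."""
    coeffs = list(base) + list(enhancement or [])
    return coeffs + [0] * (block_len - len(coeffs))
```

If only the base partition arrives over a noisy channel, `merge_partitions(base)` still yields a usable low-frequency approximation of the block, which is the property that makes the scheme attractive for progressive transmission.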

Other Major Differences from MPEG-1
• 1. Better resilience to bit errors.
• 2. Support of 4:2:2 and 4:4:4 chroma subsampling.
• 3. Nonlinear quantization.
• 4. More restricted slice structure.
• 5. More flexible video formats.

MPEG-4
• Objective
◦ Standardize algorithms for audiovisual coding in multimedia
applications allowing for
• Interactivity
• High compression
• Scalability of audio and video content
• Support for natural and synthetic audio and video
• The Idea
◦ An audiovisual scene is a coded representation of audiovisual objects
related in space and time

Five levels of the hierarchical description of a scene in
MPEG-4 Visual

Motion Compensation
• This section addresses issues of VOP-based motion compensation in
MPEG-4. Since I-VOP coding is relatively straightforward, our discussions will
concentrate on coding for P-VOP and/or B-VOP unless I-VOP is explicitly mentioned.
• As before, motion-compensation-based VOP coding in MPEG-4 again
involves three steps: motion estimation, motion-compensation-based prediction,
and coding of the prediction error.

MPEG-4: Coding
• Texture coding
◦ Intra-VOPs and residual errors from motion compensation are DCT coded as in MPEG-1
◦ 4 luminance and 2 chrominance blocks in a macroblock
◦ P-VOPs (prediction error blocks) may not conform to VOP boundary
• Pixels outside the active area are set to a constant value
• Standard compression
• Efficient prediction of DC and AC components from intra and inter coded blocks
◦ Multiplexing
• Shape, motion, and texture coded data
• Motion and DCT coefficients can be jointly (H.263) or individually coded

MPEG-4: Coding
• Shape coding
◦ Shape information in alpha planes
◦ Transparency of shape encoded
◦ Inter and intra shape coding functions
◦ After shape coding each VOP in a VO is partitioned into non-overlapping
macroblocks
• Motion coding
◦ Shift parameter with respect to the reference window
◦ Standard macroblock
◦ Contour macroblock

Chronology of Video Standards
[Figure: timeline from 1990 to 2002 of ITU-T standards (H.261, H.263, H.263+, H.263++, H.263L) and ISO standards (MPEG 1, MPEG 2, MPEG 4, MPEG 7).]
