Overview of Presentation
• Introduction – Donald’s Background
• H.265 Has Achieved 2X Increase in Video Coding
Efficiency Over H.264
• Practical Aspects For Developing Image
Technology for the Motion Picture Industry:
What I Learned From Developing Technology for
Hollywood
• Comparison Techniques For Image & Video Compression/Processing (C/P)
Donald’s Background
• As an MIT grad student, I worked at the MIT Media Lab during its
infancy, doing research in Advanced Television Broadcast Systems.
• Worked Over 20 Years Developing Digital Multimedia
(Video, Image, Speech, Audio) Encoders/Decoders (Codecs) and
Processing Technology, Primarily for Entertainment Applications
• Image Codecs: JPEG, JPEG2000*, and Qualcomm ABSDCT Codec*
• Video Codecs: MPEG-1, MPEG-2*, MPEG-4, H.264/AVC, and
H.265/HEVC
• Pervasive Theme: Reduce Image/Video Encoded
Bit Rate and Improve Reconstructed Image/Video
Quality
*Codecs Marked with an Asterisk (Shown in Red on the Slide) Were Used For Digital Cinema
H.265 Overview
• Introduction and Technical Overview of HEVC
• HEVC Coding Tools
• Picture Partitioning
• How Does H.265 Achieve a Further 50% Reduction in Bit Rate
Over H.264?
Introduction to HEVC
• A new video compression standard
• An evolution of AVC (H.264 | MPEG-4 Part 10)
• HEVC Standardization
• A Joint Collaborative Team on Video Coding (JCT-VC) of MPEG &
VCEG
• Aim: To deliver same picture quality for half the bitrate of AVC
• Could require up to 10× more computational complexity (encode) and 2×-3× (decode)
• Plan: To become ISO/IEC 23008-2 MPEG-H Part 2 & ITU-T Rec. H.265
• technically final (Part 1 Jan 2013)
From Presentation at TECHCON12 Technology
Conference by Matt Goldman, Ericsson
HEVC Coding Design and Feature Highlights
• Multiple Goals
• Improved Coding Efficiency
• Ease of Transport and Data Loss Resilience
• Faster Processing using parallel processing architectures & Simplifications
• Key Elements
A. Video Coding Layer
B. High-Level Syntax Architecture
C. Parallel Decoding Syntax and Modified Slice Structuring
Acronyms and Terms
Acronym          Definition
AIF              adaptive interpolation filter
ALF              adaptive loop filter
AVC              Advanced Video Coding
BD-rate          Bjøntegaard-delta bit rate
CfP              Call for Proposal
C/P              Compression/Processing
CU               coding unit (new name for macroblock)
CTU              Coding Tree Unit
HEVC or H.265    High Efficiency Video Coding
HM               HEVC Test Model
JCT-VC or JCTVC  Joint Collaborative Team on Video Coding
KTA              Key Technical Area
LCU              largest coding unit, max is 64x64
PU               prediction unit
SAO              Sample Adaptive Offset
TMuC             test model under consideration
TU               transform unit
Overview of H.265/HEVC Video Coding Tools
Similar Structure to Prior Hybrid Video Codecs Such as H.264/AVC,
but with Enhancements in Each Stage
• Motion-Compensated Prediction with variable block size and
fractional-pel motion vectors
• Integer transformation and scalar quantization
• Quantized Coefficients are entropy encoded using either
arithmetic coding or variable length coding.
• In-Loop Deblock filter is applied to the reconstructed image
HEVC: 2 Main Features That Differentiate It from H.264
1. New Coding Structure (Coding Unit (CU), with a Largest Coding Unit (LCU) of 64x64)
That Replaces the Macroblock (16x16) Coding Structure
of H.264
64x64 LCU Potentially Improves Video Coding Efficiency
2. The inclusion of two new filters that are applied after the
deblocking filter: (i) Adaptive Loop Filter (ALF) and
(ii) Sample Adaptive Offset (SAO) Filter
New Coding Structure for HEVC
• Coding Units (CUs) that contain one or more prediction units (PUs)
and transform units (TUs)
• Each Frame is divided into non-overlapping Largest Coding Units
(LCUs)
• Each LCU can be recursively split into smaller CUs using a generic
quad-tree segmentation structure
• PU is the basic unit for prediction where each PU can contain
several partitions of variable size
• TU is the basic unit of transform which also can have its own
partitions
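The recursive LCU-to-CU split described above can be sketched in a few lines of Python. This is only an illustration, not the HM encoder's mode decision: `should_split` is a hypothetical callback standing in for the encoder's rate-distortion choice, and the 64×64 LCU and 8×8 minimum CU sizes are the figures quoted elsewhere in this presentation.

```python
def split_lcu(x, y, size, should_split, min_size=8):
    """Recursively split a square block into CUs using a quad-tree.

    Returns a list of (x, y, size) leaf CUs. `should_split` is a
    hypothetical callback standing in for the encoder's decision.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        cus = []
        # Visit the four quadrants in z-scan order:
        # top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                cus.extend(split_lcu(x + dx, y + dy, half, should_split, min_size))
        return cus
    return [(x, y, size)]


def demo_rule(x, y, size):
    # Illustrative rule: split the 64x64 LCU once, then split only
    # its top-left 32x32 quadrant one more time.
    return size == 64 or (size == 32 and (x, y) == (0, 0))


leaves = split_lcu(0, 0, 64, demo_rule)
for cu in leaves:
    print(cu)
```

With the demo rule, the LCU ends up as four 16×16 CUs plus three 32×32 CUs, which together still tile the full 64×64 area.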
H.265/HEVC Coding Tools
http://en.wikipedia.org/wiki/High_Efficiency_Video_Coding#Coding_tools
• Prediction block size
• Coding Tree Blocks or Coding Units (CUs): 64×64 (LCU),
32×32, 16×16 pixel regions
• CUs can be hierarchically subdivided down to 8x8 using
Quadtrees
• Intra Prediction
• 33 different intra prediction directions
• Planar and DC mode
• Chroma can be predicted by a best-fit linear transform
• Higher Bit Depths and Higher Resolutions 8K UHD (Up to
8192H x 4320V)
H.265/HEVC Coding Tools (Cont)
• Parallel Processing Tools
  • Tiles Can Be Processed Independently
  • Wavefront Parallel Processing (WPP): Each row of Coding Tree Blocks can be decoded by a separate thread (as long as it does not get ahead of the row above)
  • Slices Similar to H.264/AVC
• Entropy Coding
  • Uses a CABAC Algorithm Similar to AVC's
  • Fewer Context States Than AVC -> Faster Execution
  • More Bins Get Encoded in Bypass Mode
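The WPP constraint (a row of Coding Tree Blocks must not get ahead of the row above) can be illustrated with a small scheduling sketch. It assumes one thread per CTB row and one time unit per CTB. The `lag` parameter, defaulting to 2 CTBs, is an assumption of this sketch; the slide only states that a row must stay behind the row above.

```python
def wpp_schedule(rows, cols, lag=2):
    """Earliest decode time step for each CTB under wavefront parallelism.

    Assumes one thread per CTB row, one time unit per CTB, and that CTB
    (r, c) may start only after CTB (r-1, c+lag-1) in the row above is done.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Wait for the previous CTB in the same row.
            t = start[r][c - 1] + 1 if c > 0 else 0
            if r > 0:
                # Wait for the CTB in the row above that must already be finished.
                above = min(c + lag - 1, cols - 1)
                t = max(t, start[r - 1][above] + 1)
            start[r][c] = t
    return start


schedule = wpp_schedule(rows=3, cols=5)
total_steps = schedule[-1][-1] + 1
for row in schedule:
    print(row)
print("time steps with WPP:", total_steps, "vs serial:", 3 * 5)
```

Each row starts a fixed number of steps after the row above, so the rows advance as a diagonal wavefront and finish in fewer steps than serial decoding.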
Picture Partitioning
1. Pictures are Divided Into Coding Tree Units (CTUs)
   a) A coding tree block of luma samples, two corresponding coding
   tree blocks of chroma samples
   b) Max Size of 64×64 (LCU)
2. Coding Unit (CU): a coding block of luma samples & 2
corresponding coding blocks of chroma samples of a
picture that has three sample arrays.
3. Coding Tree Block: An N×N block of samples
4. Coding Block: An N×N block of samples. The division of
a coding tree block into coding blocks is a partitioning
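As a quick arithmetic check of the CTU partitioning, the sketch below counts how many CTUs cover a picture. It assumes the 64×64 maximum CTU size quoted above and that partial CTUs at the right and bottom edges still count as CTUs; the function name is illustrative.

```python
def ctu_grid(width, height, ctu_size=64):
    """Return (columns, rows, total) of CTUs needed to cover a picture."""
    cols = (width + ctu_size - 1) // ctu_size   # ceiling division
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows, cols * rows


print(ctu_grid(2560, 1600))  # (40, 25, 1000)
print(ctu_grid(1920, 1080))  # (30, 17, 510): the last CTU row is only partly filled
```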
Picture Partitioning (cont)
5. Slice Structure
a) Slice: an integer number of coding tree blocks ordered
consecutively in the tile scan.
b) Slice is specified as a unit of packetization of coded video data
for transmission purposes
c) Independently decodable: entropy coding is restarted between slices
6. Tile: A sequence of an integer number of coding tree blocks co-occurring
in one column and one row. Tile is always rectangular.
7. One or both of the following conditions shall be fulfilled for each
slice and tile:
a) All coding tree blocks in a slice belong to the same tile
b) All coding tree blocks in a tile belong to the same slice
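The slice/tile rule above can be expressed as a small consistency check. This is a sketch under one reading of the rule: wherever a slice and a tile share coding tree blocks, either the slice lies inside a single tile or the tile lies inside a single slice. The `slice_of` and `tile_of` dictionaries (CTB address to slice or tile id) are hypothetical inputs, not structures from the HM software.

```python
def slices_and_tiles_consistent(slice_of, tile_of):
    """Check conditions (a)/(b) for every slice and tile that share CTBs."""
    tiles_in_slice = {}
    slices_in_tile = {}
    for ctb, s in slice_of.items():
        t = tile_of[ctb]
        tiles_in_slice.setdefault(s, set()).add(t)
        slices_in_tile.setdefault(t, set()).add(s)
    for ctb, s in slice_of.items():
        t = tile_of[ctb]
        slice_in_one_tile = len(tiles_in_slice[s]) == 1  # condition (a)
        tile_in_one_slice = len(slices_in_tile[t]) == 1  # condition (b)
        if not (slice_in_one_tile or tile_in_one_slice):
            return False
    return True


tiles = {0: 0, 1: 0, 2: 1, 3: 1}            # two tiles of two CTBs each
one_slice = {0: 0, 1: 0, 2: 0, 3: 0}        # one slice spanning both tiles
straddling = {0: 0, 1: 1, 2: 1, 3: 1}       # slice 1 starts mid-tile and crosses tiles

print(slices_and_tiles_consistent(one_slice, tiles))   # True
print(slices_and_tiles_consistent(straddling, tiles))  # False
```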
Picture Partitioning (cont)
8. Prediction Unit (PU) Structure: Can Be Non-Square
9. Transform Unit (TU) Structure: TU can be (non-)square if PU is
(non-)square. TUs can be arranged in a quad-tree structure
10. Blocks and associated syntax structures are encapsulated in a
“unit” as follows:
→ 1 or 3 prediction blocks are encapsulated in a PU
→ 1 or 3 transform blocks are encapsulated in a TU
→ 1 or 3 coding blocks and the associated PUs and TUs are encapsulated in a CU.
→ 1 or 3 coding tree blocks and the associated coding tree syntax
structures and associated CUs are encapsulated in a CTU.
Luminance Coding, Prediction, & Transform Block Sizes
for H.265/HEVC Main Profile
(Many Square & Non-Square Block Sizes)

Block Type                 Block Sizes (Square)          Block Sizes (Non-Square)
Coding Block               64×64, 32×32, 16×16, 8×8      -
Prediction Block (INTRA)   2N×2N & N×N                   -
Prediction Block (INTER)   2N×2N & N×N                   2N×N, N×2N,
                           (4×4 block size is            2N×nD*: 2N×3N/2 & 2N×N/2,
                           not allowed)                  2N×nU*: 2N×N/2 & 2N×3N/2,
                                                         nL×2N*: N/2×2N & 3N/2×2N,
                                                         nR×2N*: 3N/2×2N & N/2×2N
Transform Block            32×32, 16×16, 8×8 & 4×4       32×8, 8×32, 16×4 & 4×16
                                                         (“TU can be (non-)square
                                                         if CU is (non-)square”)

*Asymmetric Motion Partition (amp_enabled_flag = 1)
From References [1] & [3].
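To make the inter prediction partition notation concrete, the sketch below lists the luma PU sizes (width, height) for a 2N×2N CU under each partition mode named in the table. The function name and mode strings are illustrative only and are not identifiers from the HM source code.

```python
def inter_pu_sizes(cu_size, mode):
    """Return the (width, height) of each PU for a CU of side cu_size (= 2N)."""
    two_n = cu_size
    n = cu_size // 2
    quarter = cu_size // 4               # N/2
    three_quarter = cu_size - quarter    # 3N/2
    table = {
        "2Nx2N": [(two_n, two_n)],
        "2NxN": [(two_n, n), (two_n, n)],
        "Nx2N": [(n, two_n), (n, two_n)],
        "NxN": [(n, n)] * 4,
        # Asymmetric motion partitions (AMP): upper/lower or left/right PUs.
        "2NxnU": [(two_n, quarter), (two_n, three_quarter)],
        "2NxnD": [(two_n, three_quarter), (two_n, quarter)],
        "nLx2N": [(quarter, two_n), (three_quarter, two_n)],
        "nRx2N": [(three_quarter, two_n), (quarter, two_n)],
    }
    return table[mode]


for mode in ("2NxN", "2NxnU", "2NxnD", "nLx2N", "nRx2N"):
    print(mode, inter_pu_sizes(32, mode))
```

For a 32×32 CU, for example, 2N×nU gives a 32×8 PU above a 32×24 PU, and every mode's PUs add up to the full CU area.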
Why H.265/HEVC Reduces Bit Rate by 50% and
Achieves Better Perceived Video Quality vs. H.264/AVC
• H.265 Allowed to Have Significantly Higher Complexity
• Increased Prediction Flexibility: More Prediction Directions
• Wider Range of Block Sizes
H.265/HEVC versus H.264/AVC
• Larger H.265 CUs (64x64) Than H.264 Macroblocks (16×16)
• TUs (32×32,16×16,8×8 & 4×4) vs Transform Sizes (8×8 & 4×4)
• Larger and Smaller Prediction Units (PUs)
• H.265 Main: 64×64 down to 4×4 vs. H.264: 16×16 down to 4×4
Summary of H.265/HEVC
• H.265/HEVC Has Achieved Its Goal of 50% Bit Rate Reduction At
The Same Perceptual Video Quality As H.264/AVC
• Parallel Processing Architectures and Entropy Coding (CABAC)
Simplifications Have Significantly Reduced Encode and Decode
Processing Time With Very Little Reduction in H.265/HEVC Coding
Efficiency
References
[1] “HM7: High Efficiency Video Coding (HEVC) Test Model 7 Encoder
Description”, JCT-VC 9th Meeting, Geneva, CH, Doc. JCTVC-I1002,
Apr-May 2012.
[2] “High efficiency video coding (HEVC) text specification draft 8”,
Doc. JCTVC-J1003_d7, JCT-VC 10th Meeting, Stockholm, SE,
July 2012.
[3] HEVC HM-8.0 Software Source Code (TypeDef.h).
Practical Aspects For Developing Image
Technology for the Motion Picture Industry:
Borrowing Movie Clips For Experimental Testing
• Get a Sample of the Latest Movie Content from a Hollywood Studio
  • Movie trailers are usually easy to obtain
  • Longer clips may require knowing someone at the Hollywood studio and will usually require a written agreement on how the movie content can be used, e.g. allowing public display at trade shows.
  • Follow the written agreement if you would like to continue borrowing movie content in the future.
• Hollywood Studios Know Their Movie Content Well and Should Be
Able to Notice Any Differences in the Appearance of Their Content.
This Makes Good Test Movie Content.
• “Golden Eyes” and “Golden Ears” Are Trained To Look For These
Differences
• Tech Demos Should Be Viewed as An Opportunity to “Put on a
Show”. Try to make it entertaining or tell a good story.
Preserve High Quality of Motion Picture Content
• The Director’s Vision Drives the Look, Feel, and Tone of the Movie.
  • http://www.thx.com/test-bench-blog/the-what-and-why-of-artistic-intent/
• Preserve The Director’s Original Artistic Intent
• Hollywood Studios Know Their Movie Content Well and Should Be
Able to Notice Any Differences in the Appearance of Their Content
• Know The Image Content Capture Parameters: Color Space and
Gamma. Don’t Mess Up the Colors.
  • Large Software Company Example: Linear versus non-linear image signal.
• For Example, Don’t Remove the Noise Without The Director’s
Consent
  • Many Directors Still Prefer the Film Look and May Use the Visibility of Film Grain To Depict a Scene in the Past
High Quality Should Be Maintained Throughout the
Entire Image Chain
• It Can Be Difficult or Next to Impossible to Recover From a Bad
Image Content Capture.
• It’s Challenging to Get Great Imagery From a Web Cam With Poor
Lighting Conditions
How To “Compare” Compressed/Processed
Content
• For Compression and Processing, Typically the Original Is
Compared to the Compressed/Processed (C/P) Version
[Diagram boxes: Original Movie Content; Compress/Process; “Compare”]
• Let’s Explore the Possible Realizations of
the “Compare” Rounded Box
Realizations of “Compare” Rounded Box
Objective Comparisons
• PSNR vs EncBitRate “Rate-Distortion” Curves
• Structural Similarity (SSIM) Index (Bovik)
Subjective Comparisons
1. Alternate Viewing Original and Compressed/Processed Version
2. Viewing Original and Compressed/Processed Version Side-By-Side (2
Monitors)
3. Viewing Original and Compressed/Processed Version Butterflied Side-By-Side
(One Monitor)
4. Alternate Viewing Difference of Original and C/P Version and C/P Version
(One Monitor)
5. View Difference of Original and C/P Version Butterflied with C/P Version
(One Monitor)
For Compression, the Objective “Compare” Can Be
PSNR vs. EncBitRate Plotted on a “Rate-Distortion
Curve”
[Figure: PSNR-Y (dB) vs. Video Enc Bit Rate (kbps) for H.265 (hm-8.0) and EV-P5 MBCtrk (Traffic 2560Hx1600V). Curves: hm-8.0 Pref1mainQPoff0; EVP5 (MBC Tracker). Axis ranges: PSNR-Y 35 to 45 dB; bit rate 0 to 100000 kbps.]
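Each point on a rate-distortion curve pairs an encoded bit rate with a PSNR value. As a minimal sketch, the PSNR-Y of one frame can be computed as below, assuming 8-bit luma samples (peak value 255) passed in as flat lists; the sample values are made up for illustration and are not taken from the Traffic sequence.

```python
import math


def psnr(original, processed, peak=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(original) != len(processed):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error between the original and the C/P samples.
    mse = sum((o - p) ** 2 for o, p in zip(original, processed)) / len(original)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(peak ** 2 / mse)


orig_y = [52, 55, 61, 66, 70, 61, 64, 73]  # illustrative luma samples
cp_y = [54, 55, 60, 66, 69, 62, 64, 71]    # same samples after compress/process
print(round(psnr(orig_y, cp_y), 2), "dB")
```

Repeating this at several encoder bit rates and plotting PSNR-Y against bit rate gives one curve per codec.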
1. Alternate Viewing Original and
Compressed/Processed Version
Playback Original, Then Playback C/P, repeat…
2. Viewing Original and Compressed/Processed
Version Side-By-Side (2 Monitors)
Original Compressed/Processed
3. Viewing Original and C/P Version Butterflied
Side-By-Side (One Monitor)
Original vs C/P Butterflied
Original Example
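The slide does not define "butterflied". The sketch below assumes it means a mirrored split screen: the left half of the original is shown next to a mirror image of the same half of the C/P picture, so identical picture content meets at the center seam. Pictures are plain lists of rows here, and the function name is illustrative.

```python
def butterfly(original, processed):
    """Join the left half of `original` with the mirrored left half of `processed`.

    Both pictures are lists of rows of sample values with the same dimensions.
    This mirrored-split layout is an assumed reading of "butterflied".
    """
    out = []
    for row_o, row_p in zip(original, processed):
        half = len(row_o) // 2
        # Mirror the C/P half so the same picture columns meet at the seam.
        out.append(row_o[:half] + row_p[:half][::-1])
    return out


original = [[1, 2, 3, 4], [5, 6, 7, 8]]
processed = [[11, 12, 13, 14], [15, 16, 17, 18]]
for row in butterfly(original, processed):
    print(row)
```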
4. Alternate Viewing Difference of (Original and
C/P Version) and C/P Version (One Monitor)
Play C/P Version,
Play Difference of C/P & Original, repeat.
5. View Difference of Original and C/P Version
Butterflied with C/P Version
Scale * (Original – C/P) + 128 : C/P Butterflied
C/P Example
Use Scaled Difference Left Image As A Guide To
Look For Artifacts on C/P Right Image
Look for C/P artifacts here
Scale * (Original – C/P) + 128 : C/P Butterflied
Comments on Comparing Compression/Processing
Methods
• Scale the Difference Picture (Typically by 4 or 8) and Add an Offset of 128
(mid gray-level value): Scale * (Original – C/P) + 128.
• Scaling Makes It Easier for the Observer to Detect Differences in
C/P Methods.
• After Detecting Differences, One Can Focus on those picture areas
(in the non-differenced pictures) where the differences are most
noticeable (as shown in the example on the previous slide).
• If Using 2 Monitors, They Should Be the Same Model and Display Size
and Be Calibrated to the Same Display Settings.
• Large White Poster Board Used to View Output From Digital
Cinema Projector
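The scaled difference picture can be sketched as follows. This version scales only the difference and then adds the mid-gray offset, so identical pixels map to 128. That ordering, the clipping to the 8-bit range, and the function name are assumptions of this sketch.

```python
def scaled_difference(original, processed, scale=4, offset=128, peak=255):
    """Scale the per-pixel difference, add a mid-gray offset, clip to [0, peak]."""
    out = []
    for row_o, row_p in zip(original, processed):
        out.append([
            min(peak, max(0, scale * (o - p) + offset))
            for o, p in zip(row_o, row_p)
        ])
    return out


original = [[100, 100, 100]]
processed = [[100, 98, 140]]
print(scaled_difference(original, processed))  # [[128, 136, 0]]
```

Unchanged pixels stay at mid-gray, small errors become visible as lighter or darker gray, and large errors saturate to black or white.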
• H.265 Has Achieved Goal of 50% Bit Rate Reduction for Same
Perceived Video Quality
• When Developing Imaging Technology, Make Sure to Preserve the
High Quality Motion Picture Content
• Good Comparison Techniques Can Make It Easier to Distinguish
Subtle Differences Between C/P Techniques
• Movie Content Is Very Important to the Hollywood Studios. How Do
We Protect It From Movie Pirates?