ICME 2016 - High Efficiency Video Coding - Coding Tools and Specification: HEVC V3 and Coming Developments

High Efficiency Video Coding – Coding Tools and
Specification: HEVC V3 and Coming Developments
Mathias Wien
Institut für Nachrichtentechnik
RWTH Aachen University
ICME2016
Mathias Wien (RWTH) HEVC V3 and Beyond ICME2016 1 / 235

Outline Part I
1 Video Coding Systems
2 Structure of a Video Sequence
3 Specification

Outline Part II
4 Coding Structures
5 Reference Pictures
6 High-Level Syntax

Outline Part III
7 Intra Prediction
8 Inter Prediction
9 Residual Coding
10 Loop Filtering
11 Entropy Coding
12 Profiles in HEVC Edition 1

Outline Part IV
13 Range Extensions
14 Annex F: Common specifications for multi-layer extensions
15 Annex G: Multiview High Efficiency Video Coding
16 Annex H: Scalable High Efficiency Video Coding
17 Annex I: 3D High Efficiency Video Coding

Outline Part V
18 Screen Content Coding
19 Free Viewpoint Television
20 Wide Color Gamut / High Dynamic Range Coding
21 Future Video Coding

Outline Part VI
22 Summary and Outlook
23 Books and Tools
24 Resources
25 Acronyms
26 References

Part I
Introduction

Outline
3 Speciﬁcation

Video Coding Systems
source → pre-processing → encoding → transmission → decoding → post-processing → display
Generalized overview of the processing chain
Various realizations of the chain
Communication (e. g. video conferencing)
Broadcast (e. g. TV, streaming)
Storage (DVD, Blu-Ray, . . . )
Transcoding may be part of transmission

Hybrid Coding Scheme: Encoder
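[Figure: block diagram of the hybrid encoder with the embedded decoder loop. The input picture is partitioned into CTBs; the prediction (intra or inter PB) is subtracted and the residual is transformed and quantized per TB (TR+Q), then entropy coded into the bitstream. The decoder loop applies iTR+iQ, adds the prediction, applies deblocking and loop filtering per slice, and stores the reconstructed picture in a buffer of n pictures, which feeds motion estimation (ME) and inter prediction.]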
CTB – Coding Tree Block
ME – Motion Estimation
PB – Prediction Block
Q – Quantization
TB – Transform Block
TR – Transform

Outline
Representation of Color
3 Speciﬁcation

Structure of a Video Sequence
Sequence of pictures successively captured or rendered
Progressive and interlaced formats
Picture rate measured in pictures per second, unit Hertz (Hz)
Minimum picture rate at 24 Hz for fluent motion [1]
Standard Definition TV at 50/60 Hz interlaced
High Definition (HD) video at 50/60 Hz progressive
Ultra HD (UHD) video up to 120 Hz
Up to 300 Hz considered

Chroma Sub-Sampling
Human visual system less sensitive to color than to structure and texture
⇒ full resolution luma, lower resolution chroma
Chroma sub-sampling types commonly specified by relation between
number of luma and chroma samples
YCbCr Y:X1:X2
With Y: number of luma pixels
Sub-sampling format of chroma components specified by X1 and X2
X1: horizontal sub-sampling
X2 = 0: vertical sub-sampling identical to horizontal sub-sampling
X2 = X1: no vertical sub-sampling
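The Y:X1:X2 rule above can be turned into a small calculation of the chroma plane size. The following is a minimal sketch; the function name `chroma_plane_size` and the integer-division handling of odd picture sizes are assumptions made for illustration, not part of the slides.

```python
def chroma_plane_size(width, height, fmt):
    """Return (chroma_width, chroma_height) for a 'Y:X1:X2' format string."""
    y, x1, x2 = (int(v) for v in fmt.split(":"))
    if x1 <= 0:
        raise ValueError("this sketch only handles formats with chroma (X1 > 0)")
    horizontal_factor = y // x1           # X1: horizontal sub-sampling
    if x2 == 0:
        vertical_factor = horizontal_factor   # X2 = 0: same as horizontal
    elif x2 == x1:
        vertical_factor = 1                   # X2 = X1: no vertical sub-sampling
    else:
        raise ValueError("format not covered by the rule on this slide")
    return width // horizontal_factor, height // vertical_factor


print(chroma_plane_size(1920, 1080, "4:2:0"))
print(chroma_plane_size(1920, 1080, "4:2:2"))
```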

Chroma Sub-Sampling
Example:
Assumed location of chroma samples relative to luma samples in HEVC
luma
chroma
(a) YCbCr 4:2:0 (b) YCbCr 4:2:2

Outline
3 Speciﬁcation
Speciﬁcation Scope

Standardization
Driving factor for standardization: Interoperability
Desire to enable devices and applications from different manufacturers
and sources to interoperate
Definition of requirements
Application needs
Extensibility options

Specification Scope
source → pre-processing → encoding → transmission → decoding → post-processing → display
Parsing process
Bitstream syntax
Decoding process

Part II
HEVC Coding Structures and High-Level
Syntax

Outline
4 Coding Structures
Random Access Points
Leading and Trailing Pictures
Temporal Sub-Layer Access
Blocks, Units, and Slices
Tiles
Block Types
6 High-Level Syntax

Picture Types: Intra Random Access Points
Instantaneous Decoder Refresh (IDR)
Start of new CVS, reset of decoding process, DPB emptied
Only reference to pictures following IDR in coding order
No reference ’across’ IDR picture
Clean Random Access (CRA)
DPB left intact if within coded video sequence
Pictures after CRA in coding and output order without reference to
pictures before CRA
If used as starting point: Conformingly decodable pictures following CRA
in output order
Broken Link Access (BLA)
Renaming of CRA after splicing operation
Indicates removal of pictures containing broken / unavailable references
CVS = Coded Video Sequence, DPB = Decoded Picture Buffer

Picture Types: Leading and Trailing Pictures
Leading Pictures
Precede associated IRAP in output order
Follow associated IRAP in coding order
Random Access Decodable Leading Picture (RADL)
Correctly decodable if decoding starts at corresponding IRAP
No reference to pictures prior to IRAP in coding and output order
Random Access Skipped Leading Picture (RASL)
May contain references to unavailable pictures if decoding starts at IRAP
Shall not be output if associated IRAP is a BLA
Trailing Pictures
Follow the associated IRAP in coding and output order
Follow all leading pictures associated to the IRAP in coding order

Picture Types: Leading and Trailing Pictures – Example
2 70 1 3 4 5 6 8 9output ord.
coding ord. 01 2 34 56 7 9 8
pic. type RADL RADL IDR TRAIL TRAIL RADL RADL CRA TRAIL TRAIL
NUT 7 6 19 0 1 7 6 21 0 1
(a)
2 70 1 3 4 5 6 8 9 10output ord.
coding ord. 01 2 34 56 7 89
pic. type RADL RADL IDR TRAIL TRAIL RASL RASL CRA TRAIL TRAIL
NUT 7 6 19 0 1 9 8 21 0 1
(b)
NUT: NAL unit type, NAL = Network Abstraction Layer
Coding structure for demonstration purpose only

Picture Types: Leading and Trailing Pictures – Splicing
SEQUENCE A
2 0 1 4 3 7 5 6 9 8 12 10 11 1output ord.
coding ord. 0 1 2 3 4 5 6 7 8 9
pic. type IDR RADL RADL TRAIL TRAIL CRA RASL RASL TRAIL TRAIL
NUT 19 7 6 1 0 21 9 8 1 0
SEQUENCE B
0 1 2 3 × × 5 4 8 6 7 10 9output ord.
coding ord. 0 1 2 3 4 5 6 7 8 9
pic. type IDR TRAIL TRAIL BLA RASL RASL TRAIL TRAIL CRA RASL
NUT 19 1 0 16 9 8 1 0 21 9

Picture Types: Temporal Sub-Layer Access
Switching temporal layers
Temporal nesting: at any picture to higher or lower tid
General: switching of temporal layer only at tid = 0
Temporal sub-layer access: additional option
Temporal Sub-Layer Access (TSA)
Switch to any higher tid at TSA picture
No reference to higher tid by TSA picture
Stepwise Temporal Sub-Layer Access (STSA)
Switch to tid of STSA picture possible
Switch to higher temporal layers not possible

Spatial Coding Structures
Blocks and Units
Block: Square or rectangular area in a color component array
Unit: Collocated blocks of the (three) color components, associated
syntax elements and prediction data (e. g. motion vectors)
Picture partitioning
Coding Tree Blocks / Coding Tree Units (CTBs / CTUs)
Each CTU in exactly one slice segment
Independent slice segment: Full header, independently decodable
Dependent slice segment: very short header, relies on corresponding
independent slice, inherits CABAC state
Slice types
I-slice: Intra prediction only
P-slice: Intra prediction and motion compensation with one reference
picture list
B-slice: Intra prediction and motion compensation with one or two
reference picture lists
CABAC: Context-based Adaptive Binary Arithmetic Coding

Wavefront Parallel Processing (WPP)
Storage of CABAC states for synchronization
Two CTUs offset per row (availability of top-right CTU)
Entry points coded in the slice segment header
CTU CTU CTU CTU CTU CTU
0 1 2 3 4 5
CTU CTU CTU CTU
Nc Nc+1 Nc+2 Nc+3
CTU CTU
2Nc 2Nc+1
decoder1
decoder2
decoder3
slice seg.
header
CTU CTU CTU CTU CTU CTU
0 1 Nc Nc+1 2Nc 2Nc+1
··· ··· ···
ep0 ep1 ep2
Bitstream
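The two-CTU offset between rows leads to a simple wavefront schedule. The sketch below assumes that every CTU takes one time step and that one decoder thread is available per CTU row; the function names are hypothetical.

```python
def wpp_start_step(row, col, offset=2):
    """Earliest time step at which CTU (row, col) can be processed."""
    # Each row lags the row above by `offset` CTUs.
    return col + offset * row


def wpp_schedule(num_rows, num_cols, offset=2):
    """Map each time step to the list of CTUs (row, col) processed in it."""
    schedule = {}
    for r in range(num_rows):
        for c in range(num_cols):
            schedule.setdefault(wpp_start_step(r, c, offset), []).append((r, c))
    return schedule


schedule = wpp_schedule(num_rows=3, num_cols=6)
for step in sorted(schedule):
    print(step, schedule[step])
```

With three rows of six CTUs, the third row starts at step 4, when the second row has finished its first two CTUs.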

Tiles
Change scanning order of CTBs in picture
Slices in tiles, or tiles in slices
Reset of prediction and entropy coding → parallel processing
(entry points like WPP)
Slice 1
Slice 2
Slice 3
Slice 4
Slice 5
Slice 6

Tiles
Slice 1
Slice 2
Slice 3
Slice 4
Slice 5
Slice 6
Tile 1 Tile 2 Tile 3

Tiles
Slice 1
Slice 2 Slice 3 Slice 4
Slice 5

Coding Tree Blocks and Coding Blocks (CBs)
Quadtree partitioning of CTB into CBs
If picture size not integer multiple of CTB size:
Implicit CTB partitioning to meet picture size (must be multiple of 8×8
pixels)
0
1 2
3 4
5 6
7
8 9
10 11
12 13
14 15
16
17 18
19 20
21
0
1 2
3 4 5 6
7 8 9 10 11
12 13 14 15
16
17 18 19 20
21
(a) (b)

Prediction Blocks (PBs) and Transform Blocks (TBs)
Prediction block partitioning of a 2N×2N CB
INTER
2N×2N 2N×N N×2N N×N
2N×nU 2N×nD nL×2N nR×2N
INTRA
2N×2N
4×4
Transform block partitioning of a CB
Quadtree partitioning of CB → Residual Quad Tree (RQT)
Transform block (TB) size 4×4 to 32×32; larger CBs are split implicitly
PB boundaries inside TBs allowed

Outline
4 Coding Structures
Reference Picture Set
Reference Picture List
6 High-Level Syntax

Reference Picture Sets (RPS)
Reference Picture Set
Set of previously decoded pictures
To be used as reference for inter prediction
Identified by POC value
Picture marking
Use in current or following pictures
Unused for reference (can be removed from DPB)
Construction
Short-term before (B)
Short-term after (A)
Long-term
spec further reading: [4]

Short-Term RPS – Example
POC
RPS
0 8B0 *B1B2B3
0 842 61 3 5 7-1-2-3-4-5-6-7-8
1 9B0 A0*B1
2 10B0 A1A0*B1
3B0 A2A1A0*
4B1 A1A0B0 *
5B2 A0B0B1 *
6B1 A1B0 A0*
7B2 A0B1 B0 *
0 842 61 3 5 7-1-2-3-4-5-6-7-8
RPS of random access conﬁguration from the JCT-VC
common testing conditions JCTVC-K1100 [8]

Reference Picture List (RPL)
Reference picture lists constructed from available RPS
Size of RPL signaled in PPS or slice segment header
One list in P-slices (List0)
Two lists in B-slices (List0, List1)
Construction
List0
Short-term before
Short-term after
Long-term
List1
Short-term after
Short-term before
Long-term
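The construction order above can be sketched as a concatenation of the three RPS subsets. The function name `build_ref_lists` and the cyclic repetition of entries when the requested list size exceeds the number of available pictures are assumptions of this sketch; list modification is not modeled.

```python
from itertools import cycle, islice


def build_ref_lists(st_before, st_after, long_term, num_active_l0, num_active_l1):
    """Return (List0, List1) as lists of POC values."""
    l0_order = st_before + st_after + long_term   # List0: before, after, long-term
    l1_order = st_after + st_before + long_term   # List1: after, before, long-term
    if not l0_order:
        return [], []
    # Assumption: entries repeat cyclically up to the signaled list size.
    list0 = list(islice(cycle(l0_order), num_active_l0))
    list1 = list(islice(cycle(l1_order), num_active_l1))
    return list0, list1


# Hypothetical example: current POC 8, short-term refs 7 and 6 before, 9 after,
# long-term ref 0.
print(build_ref_lists([7, 6], [9], [0], num_active_l0=2, num_active_l1=2))
```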

Outline
4 Coding Structures
6 High-Level Syntax
Network Abstraction Layer
Parameter Sets
Picture Order Count
Hypothetical Reference Decoder
Supplemental Enhancement Information
Video Usability Information

Network Abstraction Layer (NAL)
Coded Video Sequence (CVS)
Starts with a ’new’ IRAP (associated RASLs to be discarded)
One or more CVSs in a bitstream
→ Coded Video Sequence Group (CVSG)
Network Abstraction Layer
Encapsulation of coded video sequence for transport and storage
Video coding layer (VCL) NAL units
All video data, i. e.
Slices with CTUs, PUs, TUs
Non-VCL NAL units
Parameter sets
Supplemental enhancement information
. . .
ScHaWaWe12 [9]

NAL Unit Structure
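[Figure: NAL unit of NU bytes (byte 0 to NU−1). The two-byte NALU header is followed by the RBSP. The RBSP carries the SODB, then the RBSP stop bit ('1') and zero bits up to the next byte boundary. Bits within a byte run from MSB to LSB.]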
RBSP: Raw byte sequence payload
Sequence of bytes comprising the coded NAL unit payload
RBSP stop bit (=’1’) plus zero bits for byte alignment
SODB: String of data bits
Concatenation of bits in the RBSP bytes from MSB to LSB
All bits needed for the decoding process
Only the bits needed for the decoding process

NAL Unit Header (2 bytes)
“0” NAL unit type NUH layer id temporal id
byte 1 byte 2
NAL unit type: characterize content of VCL/ non-VCL NAL unit
NAL unit header (NUH) layer id: equal to 0 for HEVC Version 1, for use in
scalable extension (spatial layers, quality layers), multi-view extension
(view id)
temporal id (tid): temporal sub-layer identifier
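The two-byte header layout can be unpacked with a few shifts and masks. The field widths used below (1, 6, 6 and 3 bits, with the temporal id coded as tid + 1) are taken from the HEVC syntax as commonly documented and are not stated on the slide; the function name is hypothetical.

```python
def parse_nal_unit_header(data: bytes) -> dict:
    """Unpack the two-byte HEVC NAL unit header."""
    if len(data) < 2:
        raise ValueError("NAL unit header needs two bytes")
    header = (data[0] << 8) | data[1]
    temporal_id_plus1 = header & 0x7          # 3 bits, coded as tid + 1
    return {
        "forbidden_zero_bit": header >> 15,   # 1 bit, the leading "0"
        "nal_unit_type": (header >> 9) & 0x3F,   # 6 bits
        "nuh_layer_id": (header >> 3) & 0x3F,    # 6 bits
        "temporal_id": temporal_id_plus1 - 1,
    }


# 0x26 0x01: NAL unit type 19 (IDR), layer id 0, tid 0
print(parse_nal_unit_header(bytes([0x26, 0x01])))
```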

Access Units
start
AUD
(preﬁx)
SEI
CSS
sufﬁx
SEI
EOS EOB
end
Access Unit (AU)
Set of all NAL units associated to exactly one picture
All NAL units of AU share output time of included picture
Not depicted
Parameter sets
Other NAL units types
Decoding Units
Decoder operation on sub-picture level
Operation of the HRD on sub-picture basis
(HRD = Hypothetical Reference Decoder, see below)
Enables sub-picture output before complete decoding of full picture
AUD Access unit delimiter
SEI Supplemental enhancement information
EOS End of sequence
EOB End of bitstream

Parameter Sets
Hierarchical structure
Separation of information for different hierarchy levels
Highest-priority information
In-band or out-of-band transmission
Available parameter sets
Video parameter set (VPS)
Sequence parameter set (SPS)
Picture parameter set (PPS)

Parameter Sets
VPS1
VPS2
SPS1
SPS2
SPS3
PPS1
PPS2
PPS3
PPS4
PPS5
slice headers in coded video seq. A
slice headers in coded video seq. B
slice headers in coded video seq. M
slice headers in coded video seq. N
...
...
bitstream I
bitstream II

Coded Video Sequence Group
CVS1
CVS2
CVSG
VPS1
SPS1 PPS2 SSH lid = 1

Video Parameter Set
Introduced for handling of multi-layer bitstreams
General information, activated for all layers
Coded video sequence(s), included layers, available operation points
Highest layer parameter set: sets global constraints
HEVC Version 1: not needed, copy of SPS information

Sequence Parameter Set
Activated for the coded video sequence
Proﬁle, tier, and level
Usage of tools
Video usability information
. . .

Picture Parameter Set
Activated per picture
May change from picture to picture (only one per picture)
Tool conﬁguration
CABAC
Quantizers,
Loop ﬁlters
Tiles
. . .

Slice Segment Header
Activates the chain of parameter sets (once per picture)
Slice segment headers must refer to a PPS with identical content (but not
necessarily identical PPS id)
Includes picture order count
Includes all information needed to decode the independent slice segment
and associated dependent slice segments
Entry points for wavefront parallel decoding / tiles

Picture Order Count (POC)
POC-2
POC-1
POC
POC+1
Identifier of picture in the DPB
Indicates output order of the pictures, strictly increasing
POC of IDR pictures always 0: start of new coded video sequence
Used for derivation of picture distance in inter prediction, scaling of
motion vectors
Constant POC delta not necessary
VPS: indication of POC relation to picture time difference possible

Hypothetical Reference Decoder (HRD)
hypothetical
stream
scheduler
(HSS)
coded
picture
buffer
(CPB)
decoding
process
(instan-
taneous)
decoded
picture
buffer
(DPB)
output
cropping
Encoder: control of buffer states
Introduction of timing
Conformance testing
Parameter signalling in VPS or SPS
Sub-picture operation possible
AU operation → DU operation (new in HEVC)
Speciﬁed in HEVC Annex C spec , further reading: [10]

HRD: Decoded Picture Buffer
hypothetical
stream
scheduler
(HSS)
coded
picture
buffer
(CPB)
decoding
process
(instan-
taneous)
decoded
picture
buffer
(DPB)
output
cropping
Reconstructed pictures
Reference pictures (short-term / long-term)
Pictures to be displayed
Picture output timing: Delay relative to CPB removal time
Note: Prevention of picture output possible
Signalled in the bitstream
Indicated by BLA picture type

Supplemental Enhancement Information (SEI)
SEI message content
Not required for the decoding process
Potentially useful for the decoding process
Attached and integrated into bitstream, sent with coded video sequence
Error recovery, testing for integrity, . . .
Prefix SEI: Cannot occur after the NAL units of the coded picture
Suffix SEI: Cannot occur before the NAL units of the coded picture
Concept of SEI from H.264|AVC, payload types aligned for inherited SEI
messages
Specified in Annex D spec

Video Usability Information (VUI)
Not required by the decoding process
Information on the interpretation of the decoded video sequence
Signaling in-band (as part of SPS) or out-of-band
Parameters
Geometric relations → sample aspect ratio (SAR), overscan indication
Video signal type and color information
Frame / ﬁeld indication
Default display window (suggested area within the cropping window)
HRD/ Timing information (including clock tick for picture rate)
Restrictions: Tiles, motion vectors, RPL, spatial segmentation, maximum
byte cost (picture / CTU)
Speciﬁed in Annex E spec

Part III
HEVC Coding Tools

Outline
7 Intra Prediction
Intra Prediction Modes
Intra Coding Example
8 Inter Prediction
9 Residual Coding
10 Loop Filtering
11 Entropy Coding

Intra Prediction

Intra Prediction Modes
[Figure: intra prediction directions. 0: Planar, 1: DC, angular modes 2 to 34]
Planar prediction: mode 0
DC intra prediction: mode 1
Numbering from diagonal-up to diagonal-down
Horizontal: mode 10, vertical: mode 26

Intra Prediction Block size
Intra prediction mode coded per CU
Prediction block size derived from residual quadtree
Boundary samples of neighboring block used for prediction
Efficient representation
Local update of prediction source

[Figure: intra coding example with three panels (original, prediction, residual); blocks are labelled DC or P according to the selected prediction mode]

Outline
7 Intra Prediction
8 Inter Prediction
Motion Compensated Prediction
Motion Vector Representation
Inter Coding Examples
9 Residual Coding
10 Loop Filtering
11 Entropy Coding

Inter Prediction

Motion Compensated Prediction
POC-2
POC-1
POC
POC+1
Prediction from reference picture lists
Uni-prediction
P-slices only with List0, B-slices with List0 or List1
Minimum PB size 8×4 or 4×8
Bi-prediction, only in B-slices
One predictor from List0, one predictor from List1
Minimum PB size 8×8

Motion Vector Representation
Merge mode
Motion vector (MV) derived from candidate set (spatial and temporal
neighborhood)
Merge mode candidate index coded
No motion vector difference encoded
Advanced motion vector prediction
Predictor derived from candidate set (spatial and temporal neighborhood)
Predictor index coded
Motion vector difference encoded
HeOuBrMaBi12 [12]

Merging Motion Vectors
Merge candidate list
List with up to five different motion merge candidates (list length indicated
in the slice segment header)
Unavailable candidates are ignored
List filled to specified length → Improved loss robustness:
Available list length independent of derivation process
Additional combination of reference picture list 0 and 1 candidates for
B-slices
Merge mode granularity
PU grid size configured in the PPS
PUs below grid size share merge candidate list
Adjustable computational complexity

Merging Motion Vectors – Spatial Candidates
A0
A1
B0B1
B2
(xP,yP)
PU
(xP,yP) (xP,yP)
(a) Merge candidates (b) 4×8 PUs
Processing order: A1,B1,B0,A0,B2
Candidates A1,B1 only if not right or bottom PU in CU with two partitions,
respectively
Maximum four spatial merge candidates
Reduced line buffer storage requirement: shifted neighbor locations for
blocks of width 4
spec

Merging Motion Vectors – Temporal Candidates
collocated region in
reference picture
C0
C1
n −5 n −4 n −3 n −2 n −1 n
current picture
POC
td tb
Location C1 if C0 unavailable, or PU at bottom right CTU boundary
(complexity)
Applicable reference picture selected on slice basis
B-slices: Reference picture list 0 or 1 indicated by flag
Reference index of candidate always set to 0 (error resilience, complexity)
MV scaling according to POC distance
$mv_{PU} = \frac{t_b}{t_d}\cdot mv_{col}$
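The POC-based scaling of the collocated motion vector can be sketched directly from the ratio tb/td. This is an idealized floating-point version; the fixed-point arithmetic and clipping of the specification are not reproduced, and the function name is hypothetical.

```python
def scale_mv(mv_col, tb, td):
    """Scale a collocated MV (mvx, mvy) by the POC distance ratio tb / td."""
    if td == 0:
        raise ValueError("td must be non-zero")
    # Idealized scaling; the normative process uses integer arithmetic.
    return tuple(round(component * tb / td) for component in mv_col)


# Collocated MV spans td = 2 pictures, the current PU references a picture
# at distance tb = 1: the vector is halved.
print(scale_mv((8, -4), tb=1, td=2))
```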

Predictive Motion Vector Coding
A0
A1
B0B1
B2
(xP,yP)
PU
collocated region in
reference picture
C0
C1
Advanced motion vector prediction
Reference index and motion vector difference coded!
Selection of predictor by flag (only two options)
Derivation process for each reference picture list
Locations from spatial neighborhood as shown for merge mode
Candidate MVs scaled based on POC difference
Candidate A first of A0,A1, candidate B first of B0,B1,B2
Optional: Additional temporal candidate C if A or B not available
spec

Motion Vector Signaling: Skip spec
CU Skip
Indication via flag at beginning of CU syntax
Based on Merge mode
Only merge index signalled
No further syntax elements for CU
→ Residual inferred to be zero
→ Very cheap coding mode

Legend: List 0 = diagonal-down hatch, List 1 = diagonal-up hatch, ref.idx=0: gray, ref.idx=1, dark gray

(a) mvL0 (b) mvL1

Outline
7 Intra Prediction
8 Inter Prediction
9 Residual Coding
Core Transforms
Quantization
Coded Representation of Transform Blocks
Special Modes
10 Loop Filtering
11 Entropy Coding

Residual Coding

Core Transforms
Transform block sizes 4×4, 8×8, 16×16, and 32×32
Integer approximations of the DCT-II transform matrix
Additionally, integer approximation of the DST-VII transform matrix
’Single-norm’ design per transform block size → simple quantizer
implementation
Not all perfectly orthogonal, leakage below normalization threshold
BuFuBjSzSa13 [13]

Quantizer Step Size
[Figure: uniform quantizer characteristic, level nq over input x (levels −4 to 4), with step size ∆q]
Quantizer step size ∆q derived from quantization parameter QP
Logarithmic relation of quantizer step sizes
Double step size every 6 QP
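$\Delta_q(QP+1) = \sqrt[6]{2}\cdot\Delta_q(QP)$
Definition: $\Delta_q = 1$ for $QP = 4$, thereby
$\Delta_{q,0} \in \{2^{-4/6},\, 2^{-3/6},\, 2^{-2/6},\, 2^{-1/6},\, 1,\, 2^{1/6}\}$
Quantizer step sizes for QP > 5
$\Delta_q(QP) = \Delta_{q,0}(QP \bmod 6)\cdot 2^{\lfloor QP/6 \rfloor}$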
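The relation between QP and step size can be checked numerically. The sketch below tabulates the six base step sizes and applies the doubling every 6 QP; the names `BASE_STEPS` and `quantizer_step_size` are chosen for illustration only.

```python
# Base step sizes for QP = 0..5, normalized so that the step size is 1 at QP = 4.
BASE_STEPS = [2 ** ((k - 4) / 6) for k in range(6)]


def quantizer_step_size(qp):
    """Step size for a given QP: base value times 2 for every 6 QP steps."""
    return BASE_STEPS[qp % 6] * 2 ** (qp // 6)


for qp in (0, 4, 10, 51):
    print(qp, round(quantizer_step_size(qp), 3))
```

For QP = 0 and QP = 51 this reproduces the range of roughly 0.630 to 228.1 quoted for 8 bit video.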

Quantizer Step Size
Quantizer range
QP = 0,...,51
Resulting quantizer step sizes
0.630 ≤ ∆q ≤ 228.1
Covering value range of an 8 bit input signal
Higher input bit depth:
Extension towards finer quantization
Range extended by 6 QP steps per additional bit

Transform Sub-Blocks (TSBs)
[Figure: 16×16 transform block partitioned into 4×4 transform sub-blocks, with the last significant coefficient position (xlsc, ylsc) marked]
Scan of transform sub-blocks
Last significant coefficient position
Determination of transform sub-blocks with non-zero coefficients
Coded sub-block flag
spec

Transform Sub-Block Scanning
[Figure: scan patterns. (a) 4×4 diagonal, (b) 4×4 horizontal, (c) 4×4 vertical, (d) 8×8 TSB scans (diagonal, vertical, horizontal)]
Partitioning of transform block into 4×4 transform sub-blocks (TSBs)
Scan direction selectable for 4×4 and 8×8 blocks, diagonal otherwise
Scan direction in TSB depending on (intra) prediction mode
Level distribution: ’trailing ones’ expected towards higher frequencies
Scan used in inverse direction
Start with expected ’1’ values
Level: quantized transform coefficient level nq
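The three 4×4 scan patterns can be generated programmatically. The sketch below returns positions as (x, y) pairs; the up-right direction of the diagonal scan (each anti-diagonal traversed from bottom-left to top-right) is an assumption about the HEVC convention, and the function names are hypothetical.

```python
def diagonal_scan(size=4):
    """Up-right diagonal scan: each anti-diagonal from bottom-left to top-right."""
    order = []
    x, y = 0, 0
    while len(order) < size * size:
        while y >= 0:
            if x < size and y < size:
                order.append((x, y))
            y -= 1
            x += 1
        y, x = x, 0   # start of the next anti-diagonal
    return order


def horizontal_scan(size=4):
    return [(x, y) for y in range(size) for x in range(size)]


def vertical_scan(size=4):
    return [(x, y) for x in range(size) for y in range(size)]


# Coefficients are coded with the scan used in inverse direction.
print(list(reversed(diagonal_scan()))[:4])
```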

Special Modes
Transform skip, applicable to 4×4 TUs
Omit transform step
Inverse quantization operation
Additional bit shift to compensate missing transform
Transform and quantization bypass
No inverse transform, no quantizer scaling
Perfect reconstruction of residual (useful e. g. for graphics content)
PCM mode
Direct encoding of pixel levels, lossless representation option for CU
Configurable bit depth (SPS)
Maximum PCM block size 32×32

Outline
7 Intra Prediction
8 Inter Prediction
9 Residual Coding
10 Loop Filtering
Deblocking Filter
Sample Adaptive Offset
11 Entropy Coding

Loop Filtering

Deblocking Filter
Design considerations
Reduction of visible blocking artifacts induced from block-wise processing
Computational complexity
Parallel processing
Deblocking processing
Operation on slice / tile basis
Slice-based control of deblocking filter configuration possible
Picture-based or block-based implementations possible
Line buffer for block-based processing
[14]

Deblocking Filter Neighborhood
[Figure: (a) vertical edges, (b) horizontal edges. Edge sections between neighboring blocks A, B, C are marked with their sample pairs p0, q0 on either side of the boundary.]
Neighborhood determination
q0,p0: reference samples in current block and neighboring block
Filtering only on 8×8 sample grid!
Edge sections of 4 sample length

Boundary Strength
Boundary strength parameter bS
Neighboring blocks determined by q0,p0
Possible boundary strength values: 0, 1, 2
Determination of bS
At least one intra coded block: bS = 2
Transform block boundary and non-zero coefficients: bS = 1
Motion information conditions: bS = 1
Different reference pictures or different number of motion vectors
Same number of MVs and |mvx,p −mvx,q| ≥ 1 or |mvy,p −mvy,q| ≥ 1
(full-sample resolution)
Otherwise: bS = 0
spec
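The bS decision rules can be written as a short function. This is a sketch under stated assumptions: the `Block` data class is hypothetical, motion vectors are stored in quarter-sample units (so one full sample corresponds to a difference of 4), and reference pictures are identified by POC.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class Block:
    intra: bool = False
    has_nonzero_coeffs: bool = False
    # List of (ref_poc, (mvx, mvy)); one entry for uni-, two for bi-prediction.
    motion: list = field(default_factory=list)


def boundary_strength(p, q, is_transform_edge):
    if p.intra or q.intra:
        return 2
    if is_transform_edge and (p.has_nonzero_coeffs or q.has_nonzero_coeffs):
        return 1
    if len(p.motion) != len(q.motion):
        return 1   # different number of motion vectors
    if sorted(ref for ref, _ in p.motion) != sorted(ref for ref, _ in q.motion):
        return 1   # different reference pictures
    # bS = 0 if some pairing of MVs with equal references differs by less
    # than one full sample (4 quarter samples) in both components.
    for q_perm in permutations(q.motion):
        if all(rp == rq and abs(mp[0] - mq[0]) < 4 and abs(mp[1] - mq[1]) < 4
               for (rp, mp), (rq, mq) in zip(p.motion, q_perm)):
            return 0
    return 1


print(boundary_strength(Block(motion=[(0, (4, 0))]), Block(motion=[(0, (9, 0))]), False))
```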

Deblocking Filter Operation
Deblocking filter operation
Operation on a 4-sample edge basis
Luma: Deblocking if bS ≥ 1
Chroma: Deblocking if bS = 2
First filtering of vertical edges, then filtering of horizontal edges
Independent operation on 8×8 block grid → parallel processing!
spec

Deblocking Filter Example
(a) Original (b) Reconstruction with deblocking

(c) Structure, deblocked samples (d) Reconstruction without deblocking

(e) Normal deblocking (f) Strong deblocking

Sample Adaptive Offset (SAO)
New filter type in ITU-T / MPEG video coding specifications
Local processing of samples
Depending on local neighborhood (edge offset), or
Depending on sample value (band offset)
Operation independent of processed samples → parallel processing
Local filter parameter adaptation
Four different offset values available (plus SAO off)
Dedicated SAO parameters for each Y, Cb, Cr component
Common SAO mode for chroma components
further reading: [15]
spec

SAO: Edge Offset
[Figure: positions of the neighbors p0 and p1 around the current sample pc for the four directions: (a) horizontal, (b) vertical, (c) diagonal-down, (d) diagonal-up]
Neighborhood
Two samples from 8-connected neighborhood considered
Direction of neighborhood signaled on CTU basis
Relation of sample values determines edge offset index ie

SAO: Edge Offset
[Figure: sample configurations of p0, pc, p1 for the edge offset index: (a) ie = 0, (b) ie = 1, (c) ie = 2, (d) ie = 4, (e) ie = 3]
Offset categories
Edge offset index identifies category
Smoothing only, direction of offset predefined
Sign of SAO offsets not signaled
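The edge offset classification can be sketched as a comparison of the current sample with its two neighbors along the signaled direction. The category numbering used here (1: local minimum, 2 and 3: one-sided edges, 4: local maximum, 0: no offset) and the positive sign for categories 1 and 2 are assumptions about the HEVC convention; function names are hypothetical.

```python
def sign(v):
    return (v > 0) - (v < 0)


def edge_index(p0, pc, p1):
    """Edge offset index ie from the relation of pc to its neighbors p0, p1."""
    s = sign(pc - p0) + sign(pc - p1)
    return {-2: 1, -1: 2, 0: 0, 1: 3, 2: 4}[s]


def apply_edge_offset(p0, pc, p1, offsets, bit_depth=8):
    """offsets: four non-negative magnitudes for ie = 1..4 (sign is implied)."""
    ie = edge_index(p0, pc, p1)
    if ie == 0:
        return pc
    magnitude = offsets[ie - 1]
    # Smoothing only: raise local minima, lower local maxima.
    value = pc + magnitude if ie in (1, 2) else pc - magnitude
    return min(max(value, 0), (1 << bit_depth) - 1)


print(apply_edge_offset(10, 5, 10, offsets=[3, 2, 2, 3]))
```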

SAO: Band Offset
[Figure: intensity axis for 8 bit samples (0 to 255) divided into 32 bands of width 8, indexed 0 to 31; the four consecutive transition bands start at transition band position io, separating the low band from the high band]
Correction of sample intensity values for four transition bands
Signaling of transition band position
Offset values for transition bands freely configurable
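Band offset can be sketched as a lookup on the five most significant bits of the sample value. The function name is hypothetical, and wrap-around of the four transition bands past band 31 is not modeled in this sketch.

```python
def apply_band_offset(sample, band_position, offsets, bit_depth=8):
    """Add one of four signed offsets if the sample lies in a transition band."""
    band = sample >> (bit_depth - 5)      # 32 bands over the value range
    k = band - band_position              # index within the transition bands
    if 0 <= k < 4:
        sample += offsets[k]
    return min(max(sample, 0), (1 << bit_depth) - 1)


# Transition bands 8..11 cover the intensity values 64..95 for 8 bit video.
print(apply_band_offset(70, band_position=8, offsets=[2, -1, 3, -4]))
```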

SAO Filter Example
(a) Reconstructed region (b) Samples with band offset active

SAO Filter Example
(c) Reconstructed region (d) Samples with edge offset active

SAO Filter Example
CTBs with SAO edge (lines) and band offset (gray squares)

Outline
7 Intra Prediction
8 Inter Prediction
9 Residual Coding
10 Loop Filtering
11 Entropy Coding
Fixed Length and Variable Length Coding
Context-Based Adaptive Binary Arithmetic Coding

Entropy Coding

Entropy Coding
Fixed length and variable length codes (FLC, VLC)
High-level syntax
Parameter sets, slice segment header
SEI messages
Fixed-length codes, Exp-Golomb codes
Arithmetic coding
Slice level, CTUs
Context-based adaptive coding
Bypass coding (complexity, throughput)
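As an illustration of the variable length codes used in the high-level syntax, the following sketch encodes and decodes unsigned 0th-order Exp-Golomb codes on bit strings. The function names and the string-based bit representation are choices of this sketch.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb code: leading zeros, then value + 1 in binary."""
    code_num = value + 1
    return "0" * (code_num.bit_length() - 1) + format(code_num, "b")


def ue_decode(bits, pos=0):
    """Decode one code word starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


for v in range(5):
    print(v, ue_encode(v))
```

Small values get short code words, which suits syntax elements whose typical value is near zero.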

Variable Length and Arithmetic Coding
[Figure: VCL NAL unit layout. NALU header (FLC), slice segment header (FLC, VLC), CABAC-coded CTUs 0 to NC−1 and NC to NS−1, with byte alignment (ba) and entry point ep0 marked]
FLC, VLC for header information
CABAC for CTUs
Byte alignment in case of multiple tiles, or with wavefront parallel
processing (not present otherwise)
CABAC = Context-based Adaptive Binary Arithmetic Coding
ba = byte alignment

Context-Based Adaptive Binary Arithmetic Coding – CABAC
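[Figure: CABAC block diagram. A syntax element is mapped by the binarizer to a bin string (binary values pass directly). Each bin value is coded by the binary arithmetic coder, either in the adaptive engine, using the context modeler with context update, or in the bypass engine. The coded bits form the bitstream.]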
Binarization
Context model selection
Binary arithmetic coding
Optimized binarization design, reduced number of non-bypass bins
compared to H.264|AVC
MaScWi03 [16]
SzBu12 [17]

Outline
7 Intra Prediction
8 Inter Prediction
9 Residual Coding
10 Loop Filtering
11 Entropy Coding
Profiles
Tiers and Levels

Profiles, Tiers, and Levels in HEVC Edition 1
Main Still Picture
Main
Main 10
Annex A

HEVC Edition 1 Profiles
Main Profile
YCbCr 4:2:0 8 bit video only
CTB block sizes 16×16, 32×32, and 64×64
Either tiles or wavefront parallel processing
Minimum tile size 256×64 pixels
Maximum number of CABAC coded bits in N×N CTU: $\frac{5}{3}\, b_{raw}$
with $b_{raw} = (N^2 + N^2/2)\cdot B_d$ for YCbCr 4:2:0, bit depth $B_d$
Main 10 Profile
Additional support of bit depths of 9 and 10 bit
Main Still Picture Profile
Like Main, but only one picture in the bitstream
No timing constraints for the decoder
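The CTU bit limit of the Main profile can be evaluated for a given CTU size. The sketch assumes the reading of the raw bit count as N² luma samples plus N²/2 chroma samples for 4:2:0; the function name is hypothetical.

```python
def max_cabac_bits_per_ctu(n, bit_depth=8):
    """Upper bound of 5/3 times the raw bit count of an N×N CTU in 4:2:0."""
    raw_bits = (n * n + n * n // 2) * bit_depth   # luma + two quarter-size chroma planes
    return 5 * raw_bits // 3


for n in (16, 32, 64):
    print(n, max_cabac_bits_per_ctu(n))
```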

Tiers and Levels
Profiles
Defined tool (sub-)set of specification
Tools determined from application space
Tiers and levels
Levels: Restrictions on parameters which determine decoding and
buffering capabilities (13 levels defined)
Tiers: Grouping of level limits for different application spaces (currently:
Main Tier for consumer, High Tier for professional applications)
Decoder capable of decoding Profile@Tier/Level must be able to decode
all lower levels of same and lower tier

Extensions of HEVC
Range extensions (HEVC V.2/Ed. 2, 10/2014); V.x = ITU-T version, Ed.y = ISO/IEC edition
Extended color formats (4:2:2, 4:4:4)
Extended bit depth
Scalable extensions (HEVC V.2/Ed. 2, 10/2014)
Simple, multiloop approach, no modifications on tool-level
Supports spatial, SNR scalability
Multi-view (HEVC V.2/Ed. 2, 10/2014)
3D extensions (HEVC V.3/Ed. 3, 04/2015)
Stereo / multi-view coding
Multi-view with depth coding
Screen Content Coding (V.4/Ed. 3, 2016)
Dedicated tools for this content type

Part IV
Range and Multi-Layer Extensions

Outline
13 Range Extensions
Extended Color Formats
RExt Tools
RExt Profiles
RExt Levels

Range Extensions Applications
Professional Requirements
Highest quality, very high bitrates
Increased chroma resolution
Increased precision: higher bit depth
Applications
Professional: contribution, distribution
Medical applications
. . .

4:2:2 Residual Quadtree
Y Cb Cr
YCbCr 4:2:2
Chroma shares vertical luma resolution, half horizontal resolution
Residual quadtree kept for the three components
Chroma includes two stacked $\frac{N}{2}\times\frac{N}{2}$ transform blocks with N×N luma
transform block

4:4:4 Residual Quadtree
Y Cb Cr
YCbCr or GBR or YZX 4:4:4
Chroma shares vertical and horizontal luma resolution
Two options, depending on separate_colour_plane_flag
0: Residual quadtree kept for the three components
1: Each color component treated as independent monochrome plane

RExt Tools: Transform Skip Rotation
0 1 2 3            F E D C
4 5 6 7    =⇒      B A 9 8
8 9 A B   180°     7 6 5 4
C D E F            3 2 1 0
Rotation of residual in transform block
Signaled in SPS
Applicable to intra 4×4 transform blocks using transform skip
Only if transform and quantization bypass enabled
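The 180° rotation of the residual block amounts to reversing both the row order and the sample order within each row. A minimal sketch with a hypothetical function name, using the block from the slide:

```python
def rotate_180(block):
    """Rotate a 2-D block (list of rows) by 180 degrees."""
    return [row[::-1] for row in block[::-1]]


block = [list("0123"), list("4567"), list("89AB"), list("CDEF")]
for row in rotate_180(block):
    print(" ".join(row))
```

Applying the rotation twice returns the original block, so encoder and decoder can use the same operation.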

RExt Tools: Implicit/Explicit Residual DPCM Coding
Implicit Residual DPCM Coding
Usage signaled in SPS
Activated only for blocks using horizontal and vertical intra prediction
Transform skip or transform and quantization bypass to be enabled
Explicit Residual DPCM Coding
Inter blocks with transform bypass
Usage and direction explicitly signaled
Example: horizontal DPCM
$r(x,y) \mathrel{+}= r(x-1,y)$ for $x = 1,\dots,bS-1$, $y = 0,\dots,bS-1$ (here bS denotes the block size)
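The horizontal DPCM reconstruction is a running sum along each row. The sketch below also includes the matching forward differencing for a round-trip check; both function names are hypothetical, and blocks are stored as lists of rows so that r(x, y) is `block[y][x]`.

```python
def rdpcm_horizontal_reconstruct(residual):
    """Decoder side: accumulate differences along each row."""
    out = [row[:] for row in residual]
    for row in out:
        for x in range(1, len(row)):
            row[x] += row[x - 1]
    return out


def rdpcm_horizontal_forward(block):
    """Encoder side: replace each sample by the difference to its left neighbor."""
    return [[row[0]] + [row[x] - row[x - 1] for x in range(1, len(row))]
            for row in block]


print(rdpcm_horizontal_reconstruct([[1, 1, 1, 1], [2, 0, -1, 3]]))
```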

RExt Tools: Inter-Component Residual Prediction
Tool for 4:4:4 content
Prediction of chroma component residual block from luma residual block
RC = RC +α ·RY
with α ∈ {0,±1,±2,±4,±8}
Additional decorrelation of residual signal
Gains on RGB content reported [18]
Activated in PPS

RExt Profiles
[Figure: RExt profile diagram relating the Version 1 profiles (Main, Main 10, Main Still Picture) to the range extensions profiles: Monochrome, Monochrome 12, Monochrome 16; Main 12; Main 4:2:2 10, Main 4:2:2 12; Main 4:4:4, Main 4:4:4 10, Main 4:4:4 12; Main Intra, Main 10 Intra, Main 12 Intra; Main 4:2:2 10 Intra, Main 4:2:2 12 Intra; Main 4:4:4 Intra, Main 4:4:4 10 Intra, Main 4:4:4 12 Intra, Main 4:4:4 16 Intra; Main 4:4:4 Still Picture, Main 4:4:4 16 Still Picture; High Throughput 4:4:4 16 Intra]
Range extensions profiles: general profile idc = 4
High throughput profiles: general profile idc = 5

RExt Levels
Level limits specified in Version 1 insufficient for RExt profiles
Introduction of scaling factors
HbrFactor: High bitrate factor
Controlled by low bit rate flag spec
Values can be 1, 2, or 12, 24 for High Throughput profiles
CpbVclFactor / CpbNalFactor
Scaled by HbrFactor to apply for table A.4 spec
FormatCapabilityFactor
Scaling factor for number of bytes in an Access Unit
MinCrScaleFactor
Scaling factor for minimum compression ratio

RExt Performance
Comparison of HEVC RExt in HM 16.2 relative to AVC FRExt in JM 18.6
Lossy coding [BD-rate %] Lossless coding, rate saving [%]
Figure from FlMaNaNgRoShSoXu16 [19]

Outline
13 Range Extensions
Common Speciﬁcation Structure
Parameter Sets
Bitstream Subsets
SEI messages

Common Specification Structure for Multilayer Video
Scalable and multiview coding share concept of layers
Scalable layers: different picture resolutions, different reconstruction quality
Multiview layers: different views on the same scene
Specification approach in HEVC extensions
Unify common parts of extensions in a joint specification annex
Only separate tools in separate annexes
Chosen extension structure
Annex F: “Common syntax, semantics and decoding processes for
multilayer video coding extensions”
Annex G: “Multiview coding”
Annex H: “Syntax, semantics and decoding processes for scalable
extension”
Annex I: “3D High Efficiency Video Coding”

Annex F: Main Features
High level syntax
Introduction of layer identification and layer reference mechanisms
Speciﬁcation of layer structure and layer dependencies
Usability and scalability information
Extended Concepts
Introduction of auxiliary layers (not for output)
Depth
Alpha-plane
Option for non-HEVC baselayer (H.264|AVC)

Video Parameter Set
VPS1
Specification of available layers in the bitstream
Scalable layers
Multiple views
Auxiliary pictures (e. g. depth or alpha planes)
Specification of layer dependencies
Prediction relations
Layer sets
Available operation points, output layer sets
Available profiles and levels

NAL Unit Header: Layer ID
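[Figure: two-byte NAL unit header. Fields in order: "0" (forbidden zero bit, 1 bit) | NAL unit type (6 bits) | NUH layer id (6 bits) | temporal id plus 1 (3 bits)]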
Each layer has unique layer identifier lid
Scalability types defined in VPS
Indication of texture or depth layer
View order index for multi-view coding voidx
Spatial / quality scalability → dependency identifier did
Auxiliary pictures (not used for coding of primary picture)
Auxiliary picture identifier aid
No prediction between pictures with different value of aid
Meaning of the NUH bits in identifier configurable
Combination of different scalability types possible → splitting flag
Additional identifier for each layer encoded in VPS
Output-flags: control of presentation of layers
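Reading lid and tid from the two header bytes can be sketched as below. The field widths (1, 6, 6, 3 bits) follow the HEVC NAL unit header; the function name and the returned dictionary keys are hypothetical.

```python
def parse_nal_unit_header(data: bytes) -> dict:
    """Split the 16-bit HEVC NAL unit header into its four fields."""
    if len(data) < 2:
        raise ValueError("NAL unit header needs two bytes")
    hdr = (data[0] << 8) | data[1]
    return {
        "forbidden_zero_bit": hdr >> 15,
        "nal_unit_type": (hdr >> 9) & 0x3F,
        "lid": (hdr >> 3) & 0x3F,   # NUH layer id
        "tid": (hdr & 0x07) - 1,    # header carries temporal id plus 1
    }


vps_header = parse_nal_unit_header(bytes([0x40, 0x01]))
layer1_slice_header = parse_nal_unit_header(bytes([0x02, 0x09]))
```

The first example is the typical header of a VPS NAL unit (type 32) in the base layer; the second is a slice NAL unit of type 1 in layer 1.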

VPS Extension: Layer Dependencies
[Figure: layers lid = 0, 1, 2 in access units poc−1, poc, poc+1; legend: target layer, direct reference layer, indirect reference layer]

Layer Dependencies – Adaptation in Network
Strict lid hierarchy
Layer hierarchy → enable scalability
Simple MANEs: achieve sub-bitstream extraction by cutting at some lid, tid combination
Advanced MANEs: analyze dependencies and make use of advanced knowledge
MANE = Media Aware Network Element
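The behaviour of a simple MANE can be sketched as a filter over NAL units. The `NalUnit` record and the function name are hypothetical; the sketch relies on the strict lid hierarchy stated above, so that dropping everything above a given lid and tid leaves a decodable sub-bitstream.

```python
from typing import List, NamedTuple


class NalUnit(NamedTuple):
    lid: int        # layer identifier from the NAL unit header
    tid: int        # temporal identifier from the NAL unit header
    payload: bytes


def extract_sub_bitstream(nal_units: List[NalUnit], max_lid: int, max_tid: int) -> List[NalUnit]:
    """Keep only NAL units at or below the chosen (lid, tid) cut."""
    return [n for n in nal_units if n.lid <= max_lid and n.tid <= max_tid]


stream = [
    NalUnit(0, 0, b"a"), NalUnit(1, 0, b"b"), NalUnit(2, 0, b"c"),
    NalUnit(0, 1, b"d"), NalUnit(1, 1, b"e"), NalUnit(2, 1, b"f"),
]
sub = extract_sub_bitstream(stream, max_lid=1, max_tid=0)
```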

VPS Extension: Layer Sets
Layer set
set of lid values which can be extracted from the bitstream for decoding
(e. g. different spatial resolutions)
Output layer set (OLS)
Set of layers which are output if ﬂag indicates OLS to be a target OLS
output_layer_flag: indicates if layer is to be output or not
Example: Depth layer is not an output layer, but the associated texture layer is an output layer
alt_output_layer_flag: highest non-output layer is output if the output layer is not present
Example: Quality scalability, dropping of highest quality layer

VPS: Layer Video Usability Information
Information on characteristics of layer (optional)
Cross layer picture type alignment
Cross layer IRAP alignment
Cross layer IDR alignment
Information on average bitrate, maximum bitrate
Indication of constant or variable picture rate
Hints for use of tiles, loop ﬁlters, WPP, cross layer tile alignment
Indication of restrictions on inter-layer prediction
. . .

Sequence Parameter Set Extensions
Proﬁle/Tier/Level information conveyed in VPS for non-baselayer SPS
Reference to layer representation format in VPS, index in SPS
Multilayer extension
Constraint: Vertical component of inter-view MVs ≤ 56 luma samples
⇒ Reduced memory access (within CTB-row above collocated CTB in
inter-view reference picture)

Picture Parameter Set Extension
Speciﬁcation of spatial relation of layers
Scaled reference layer offset
Offset of luma pixels collocated to corner luma pixels of reference layer
region
Reference region offset
Offset of reference layer region corner luma pixels to corner luma pixels of
reference layer picture
Resample phase information
Spatial phase shift between reference layer and current layer

PPS Extension: Color mapping table
Support of different color spaces, different layer bitdepths
Look-up table: map reference layer YCbCr values to target layer
Only increasing bitdepth between layers allowed
Figure from BoYeChRa15 [20]

Random Access Pictures and POC
IRAP pictures not necessarily aligned over layers
Different temporal resolution
Dependencies / independencies of layers
POC reset
POC reset indication
0: No reset
1: Only MSB reset
2: MSB and LSB reset
3: More information signaled
RASL, RADL, pictures with tid > 0, discardable pictures:
Identical POC reset handling for all pictures in AU required
Clean handling after potential bitstream extraction process
If at least one IDR picture in AU: only reset types 1 or 2
If picture with lid = 0 is non-IDR: reset types 1 or 2 not allowed

Slice Segment Header Extension
Extension at beginning of slice segment header
Reserved bits before slice type syntax element (number from PPS)
Version 1: number of reserved bits equal to 0
Version 2/3: number is set to max. 2
Decoders shall accept any value
Activation of inter-layer prediction
Number of inter-layer reference pictures
Identifiers of inter-layer prediction reference pictures (lid)
Slice segment header extension
POC modiﬁcation information (if applicable)

Independent Non-Baselayer Rewriting
Capability of bitstream rewriting
Indication of independent non-baselayer decodability of layer lid at
temporal sub-layer tid
Proﬁle-level syntax structure
Formal processing steps
Removal of all NAL units with lid ≠ target lid that are not SPS, PPS, or EOB
Removal of SPS and PPS NAL units unless lid = target lid or lid = 0
Removal of VPS NAL units
Removal of NAL units with tid > target tid
Rewriting of lid to lid = 0 for all NAL units
→ Decodable HEVC V1 bitstream
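The formal processing steps can be sketched as a single pass over the NAL units. The sketch reads the first step as removing NAL units whose lid differs from the target lid unless they are SPS, PPS, or EOB. The `Nal` record and the function name are hypothetical; the numeric NAL unit types (VPS 32, SPS 33, PPS 34, EOB 37) are the HEVC values.

```python
from typing import List, NamedTuple

VPS_NUT, SPS_NUT, PPS_NUT, EOB_NUT = 32, 33, 34, 37


class Nal(NamedTuple):
    nut: int   # NAL unit type
    lid: int
    tid: int


def rewrite_as_base_layer(nals: List[Nal], target_lid: int, target_tid: int) -> List[Nal]:
    out = []
    for n in nals:
        # Step 1: drop other layers, except parameter sets and end of bitstream.
        if n.lid != target_lid and n.nut not in (SPS_NUT, PPS_NUT, EOB_NUT):
            continue
        # Step 2: keep SPS/PPS only from the target layer or from layer 0.
        if n.nut in (SPS_NUT, PPS_NUT) and n.lid not in (0, target_lid):
            continue
        # Step 3: drop all VPS NAL units.
        if n.nut == VPS_NUT:
            continue
        # Step 4: drop temporal sub-layers above the target.
        if n.tid > target_tid:
            continue
        # Step 5: rewrite the layer identifier to 0.
        out.append(n._replace(lid=0))
    return out


stream = [
    Nal(VPS_NUT, 0, 0), Nal(SPS_NUT, 0, 0), Nal(SPS_NUT, 1, 0), Nal(SPS_NUT, 2, 0),
    Nal(PPS_NUT, 1, 0), Nal(1, 0, 0), Nal(1, 1, 0), Nal(1, 1, 1), Nal(EOB_NUT, 0, 0),
]
rewritten = rewrite_as_base_layer(stream, target_lid=1, target_tid=0)
```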

Annex F SEI messages
New SEI messages
plt | Name | Summary | Persistence
160 | Layers not present | Indicate layers of VPS missing in bitstream | persists from SEI to change SEI
161 | Inter-layer constrained tile sets | Indicate decodability of tiles/tile sets | associated CLVS
162 | Bitstream partition nesting | Carries SEI messages applicable to defined OLS | persistence depending on nested SEIs
163 | Bitstream partition initial arrival time | Initial arrival times for CPB operation | remainder of bitstream partition
164 | Sub-bitstream property | Bit rate information for a sub-bitstream | CVS containing SEI
165 | Alpha channel information | Control alpha channel picture persistence | persistence specified by SEI
166 | Overlay information | Identify alpha and content of overlay pictures | persistence specified by SEI
167 | Temporal MV prediction constraints | Indicate storage needs for MVs | persistence specified by SEI
168 | Frame-field information | How to display pictures | associated AU
Annex D SEI messages
Syntax of previous SEI messages unchanged
Semantics partly adapted to meet requirements of application with
multi-layer bitstreams

Outline
13 Range Extensions
Multiview Coding Concept
Multiview Proﬁles
Multiview SEI Messages
Multiview Performance

Multiview Scenario
[Figure: scene point P1 projected onto the picture plane of cameras C0, C1, C2]

Multiview Coding: View Arrangement
[Figure: pictures of views C0, C1, C2 at time instances poc−2, poc−1, poc]

Multiview Concept: View-Id and View Order Index
[Figure: cameras C0, C1, C2, each labeled with a view ID (vid = 0, 1, 2) and a view order index (voidx = 0, 1, 2)]
Identiﬁcation of views
Relative arrangement of views for presentation: view ID vid
Ordering of views for coding: view order index voidx

MV-HEVC Layer Terminology
Base View, Independent View
View order index voidx = 0, lid = 0
Conforming to HEVC version 1
Independent from other views
Dependent Views
Predict from reference views

Multiview Main Profile
Multiview Main Profile: general_profile_idc = 6
Multiview Profile based on Main profile
Scalability ID: only view ID or AUX pictures allowed
Aux picture may carry depth information
(not used in decoding process for texture view)
Inter view MV vertical constraint flag active
→ vertical MV component restricted
All view pictures identical size; no ref local offset allowed
(all views aligned)
Number of ref layers (direct or indirect) ≤ 4
Number of pictures in DPB ≤ 8

Annex G SEI Messages
plt | Name | Summary | Persistence
176 | 3D reference displays information | Information on recommended display | persistence specified by SEI
177 | Depth representation information | Information on depth and disparity ranges for auxiliary pictures |
178 | Multiview scene information | Minimum and maximum disparity in AU | associated CVS
179 | Multiview acquisition information | Information on acquisition environment | associated CVS
180 | Multiview view position SEI message | View position from left to right | associated CVS

Results of subjective evaluation from MV-HEVC veriﬁcation test
JCT3V-N1001 [21]: Comparison of
MVC: AVC-based multi-view video coding (in which the non-base view is
coded using inter-view prediction)
Simulcast HEVC, in which each view is coded independently
MV-HEVC: HEVC-based multi-view video coding (in which the non-base
view is coded using inter-view prediction)
⇒ Gain of approx. 33 % compared to HEVC simulcast
⇒ Gain of approx. 50 % compared to H.264|AVC MVC

Figure from JCT3V-N1001 [21]

Outline
13 Range Extensions
SHVC: Coding Concept
SHVC Tools
Scalable Proﬁles
SHVC Coding Performance

SHVC Layer Terminology
Base layer (BL)
Conforming to HEVC version 1
Lowest available layer in bitstream
Alternative option: external base layer (H.264|AVC)
Enhancement layers (EL)
Hierarchical organization
Dependent or independent from lower layers
Lower layers used for prediction in enhancement layer: reference layers

SHVC Scalable Layers
[Figure: layers lid = 0, 1, 2 in access units poc−1, poc, poc+1; legend: target layer (EL2), direct reference layer (EL1), indirect reference layer (BL)]

SHVC Specification
Approach
Simple high-level syntax approach (“simple solution”)
No changes on CTU level
Inter-layer prediction via reference picture list
Multi-loop coding: Each layer with motion compensation, loop filter, . . .
H.264|AVC Scalable Video Coding (SVC) [22]
Single loop design
Motion compensation, loop filter only on target layer
Key picture concept: Limit drift if not all layers decoded
Claim: Performance only 10% below single layer coding
Issues
High description complexity
Performance claim only fulfilled by
very well optimized encoders

SHVC Tools: Resampling for Inter-Layer Prediction
Size and offset of reference layer picture to target layer picture define
scaling for interpolation
Reference layer sample location derivation (16th-sample precision)
Interpolation operation
16 dedicated interpolation filters
DCTIF design: integrate with motion compensation interpolation filters
Phase alignment specified in VPS
(a) Center alignment (b) Top left alignment
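The 16th-sample reference location can be illustrated with a simplified fixed-point sketch. It assumes zero region offsets and zero phase, and uses a 16-bit scale factor with a final shift by 12; these constants and the function name are assumptions for illustration, not a transcription of the specification text.

```python
def ref_sample_position(x: int, ref_width: int, scaled_width: int):
    """Map a target-layer column x to the reference layer.

    Returns (integer sample position, phase in 1/16 sample units).
    Simplification: no region offsets, no phase offset.
    """
    # Scale factor in 16-bit fixed point, rounded to nearest.
    scale = ((ref_width << 16) + (scaled_width >> 1)) // scaled_width
    # Position in 1/16 sample units: 16 - 4 = 12 fractional bits to drop.
    x_ref16 = (x * scale + (1 << 11)) >> 12
    return x_ref16 >> 4, x_ref16 & 15


pos_2x = ref_sample_position(3, 960, 1920)     # 2x spatial scalability
pos_1p5x = ref_sample_position(1, 1280, 1920)  # 1.5x spatial scalability
```

The phase value selects one of the 16 interpolation filters; phase 0 means the sample coincides with a reference layer sample.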

SHVC Tools: Resampling for Inter-Layer Prediction
[Figure: coefficients of the resampling interpolation filters for Phase 0 to Phase 8; horizontal axis: tap position (−2 ... 4), vertical axis: coefficient value (−16 ... 64)]

SHVC Tools: Coding Standard Scalability
Conventional scalable scheme
Base layer HEVC version 1
Enhancement layer with different spatial resolution, different quality
Coding standard scalability
Base layer coded with different standard (H.264|AVC)
Enhancement layer using HEVC tools
Approach
Only use access to reconstructed samples (standard agnostic)
Possible alternative: Specify means to access information regarding
partitioning, prediction modes, motion vectors (not further pursued)

SHVC Tools: Coding Standard Scalability
[Figure: access units poc−2, poc−1, poc, each with an AVC BL picture (lid = 0) and an HEVC EL picture (lid = 1)]

Scalable Main Profiles
Scalable Main Profile and Scalable Main 10 Profile: general_profile_idc = 7
Based on Main and Main 10 profiles, respectively.
Chroma format 4:2:0 for all layers
Maximum of 16 different representation formats in VPSs
Aux pictures allowed (not used in decoding process)
Layer ID lid → Dependency ID did

SHVC Coding Performance
Results of subjective evaluation from SHVC veriﬁcation test
JCTVC-W1004 [23]: Comparison of SHVC vs. HEVC simulcast
Figure from JCTVC-W1004 [23]

Outline
13 Range Extensions
3D Video Coding Concept
3D Coding Tools
3D SEI Messages
3D Proﬁles
3D-HEVC Coding Performance

3D Video Coding
Multiple views with depth information
Video → “texture view”
Depth information: represent object distance from camera
Coding of texture and depth
Ability to render additional virtual views
Exploit depth information for improved compression efﬁciency
Auxiliary picture carrying depth
Decoding process changed for lid > 0 (not compatible to HEVC Version 1)
Dedicated depth coding tools
Tools exploiting the relation between texture and depth

3D Coding Scenario with Depth
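[Figure: two cameras with focal length f and baseline s∆ observe scene points P1 and P2; P1 lies at distance zP1 (between zmin and zmax) and at lateral offset xd; its projections are displaced by d1 and d2]
Intercept theorem:
\[ \frac{d_1}{x_d} = \frac{f}{z_{P1}}, \qquad \frac{d_2}{s_\Delta - x_d} = \frac{f}{z_{P1}} \]
\[ d = d_1 + d_2 = \frac{f \cdot x_d}{z_{P1}} + \frac{f \cdot (s_\Delta - x_d)}{z_{P1}} \quad\Rightarrow\quad d = \frac{f \cdot s_\Delta}{z_{P1}} \]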
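A small numeric check of the disparity relation d = f · s∆ / zP1 follows. The function names and the example values (focal length in pixels, baseline and distance in metres) are hypothetical.

```python
def disparity(f: float, baseline: float, z: float) -> float:
    """Total disparity d = f * s_delta / z for a point at distance z."""
    return f * baseline / z


def partial_disparities(f: float, baseline: float, z: float, x_d: float):
    """The two contributions d1 and d2 from the intercept theorem."""
    d1 = f * x_d / z
    d2 = f * (baseline - x_d) / z
    return d1, d2


f, s_delta, z_p1, x_d = 1000.0, 0.1, 5.0, 0.03
d = disparity(f, s_delta, z_p1)
d1, d2 = partial_disparities(f, s_delta, z_p1, x_d)
```

The sum d1 + d2 does not depend on xd, which is why only the baseline and the distance enter the final expression.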

3D Coding Scenario with Depth
[Figure: three-camera arrangement with baselines s∆1 and s∆2; scene points P1 and P2, distance zP1 between zmin and zmax]

Depth Coding
World coordinate distance zP1
between camera system and object
Range between znear and zfar
Representation by depth value vd
Reciprocal relation between depth and distance
Representation by vd = 0...vmax, e. g. vmax = 255 for 8 bit representation
depth array with same resolution as associated picture
→ depth value for each pixel

Depth Representation
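[Figure: camera arrangement as before, with depth value vd = 255 assigned to zmin and vd = 0 assigned to zmax]
Depth value vd:
\[ z_{P1} = \frac{1}{\frac{v_d}{255} \cdot \left( \frac{1}{z_{near}} - \frac{1}{z_{far}} \right) + \frac{1}{z_{far}}} \]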
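The reciprocal mapping from depth value to distance can be sketched directly from the formula. The function name and the example range are hypothetical; vmax defaults to 255 for an 8 bit representation.

```python
def depth_value_to_distance(v_d: int, z_near: float, z_far: float, v_max: int = 255) -> float:
    """Convert a depth value v_d in 0..v_max to a world distance.

    v_d = v_max maps to z_near, v_d = 0 maps to z_far, and the
    mapping is linear in 1/z rather than in z.
    """
    inv_z = (v_d / v_max) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z


z_near, z_far = 2.0, 50.0
z_at_max = depth_value_to_distance(255, z_near, z_far)
z_at_zero = depth_value_to_distance(0, z_near, z_far)
z_mid = depth_value_to_distance(128, z_near, z_far)
```

Because the mapping is linear in 1/z, the mid-range value 128 lies much closer to znear than to zfar, giving finer depth resolution for near objects.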

Extension of multiview approach by depth information
Enabling view synthesis at the receiver side
Generation of additional views as needed by the (N-view) display
Adaptation of the depth impression
(e. g. ’correct’ stereo on large/small screens)
Figure by Fabian Jäger, RWTH internal report

Inter-component / -view dependencies → improved prediction
Disparity compensated prediction
Inter-view prediction of motion
Inter-view prediction of residual
View synthesis prediction
Figure by Fabian Jäger, RWTH internal report

Coding of Depth Information
Characteristics of depth map differ from texture
Homogeneous areas
Depth edges at (or close to) object boundaries

Coding of Dependent Views
Coding tools
Disparity compensated prediction
Inter-view motion prediction
Advanced residual prediction
Illumination compensation
View synthesis prediction
Depth-based block partitioning

Virtual Depth Estimate
Depth of reference view is available
Determine ’virtual depth’ for current view
PU grid for depth view
8×8 grid for texture view
Maximum depth value to derive disparity estimate
Use disparity estimate for prediction tools
Figure from JCT3V-K1003 [24]

Advanced Residual Prediction (ARP)
Determination of collocated block by estimated depth
Application of weighted residual as additional predictor for residual of
current block
Prediction from temporal reference or reference view
Figure from JCT3V-K1003,JCT3V-C0049 [24, 25]

Illumination Compensation
Calibration of cameras may vary over views
Compensate for changes by linear model
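A linear model of the form a · p + b can be sketched as below. The slide does not say how the parameters are obtained; the sketch assumes a least-squares fit between neighbouring samples of the reference block and of the current block, in floating point. All names are hypothetical.

```python
def fit_linear_model(ref_neighbors, cur_neighbors):
    """Least-squares fit of cur ≈ a * ref + b (assumed derivation)."""
    n = len(ref_neighbors)
    sx = sum(ref_neighbors)
    sy = sum(cur_neighbors)
    sxx = sum(x * x for x in ref_neighbors)
    sxy = sum(x * y for x, y in zip(ref_neighbors, cur_neighbors))
    denom = n * sxx - sx * sx
    if denom == 0:
        # Flat neighbourhood: fall back to a pure offset.
        return 1.0, (sy - sx) / n
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


def compensate(pred_block, a, b):
    """Apply the linear model to every sample of the inter-view prediction."""
    return [[a * p + b for p in row] for row in pred_block]


ref_nb = [10, 20, 30, 40]
cur_nb = [25, 45, 65, 85]          # equals 2 * ref + 5
a, b = fit_linear_model(ref_nb, cur_nb)
compensated = compensate([[10, 20], [30, 40]], a, b)
```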

View Synthesis Prediction (VSP)
Use estimated depth of reference view
Warp texture of reference view to current view

Depth-Based Block Partitioning
Two motion partitions
Derive texture segmentation from depth information
Integration based on conventional motion information
Ja13 [26]

Coding of Depth Maps
Modiﬁcations to basic coding tools
Depth modelling
Segment-wise DC coding
Depth Look-Up Table

Depth Modelling: Wedgelets
Explicit coding of non-rectangular partitioning

Depth Modelling: Contour Partitions
Shape of partitions estimated from corresponding luma component

Depth: Segment-Wise DC Coding
DC-like structure often observed in depth component
Coding of Constant Partition Values (CPV)
Single value for conventional HEVC intra prediction modes
two values for segmented blocks
Prediction from neighborhood
Coding using depth look-up table
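The idea behind a depth look-up table can be sketched as follows: depth maps often use only a few distinct values, so values are coded as indices into a table of the values that actually occur. How the table is built and signaled is not covered by the slide; the construction below and all names are assumptions.

```python
def build_depth_lut(depth_samples):
    """Collect the depth values that occur and index them in ascending order."""
    values = sorted(set(depth_samples))
    index_of = {v: i for i, v in enumerate(values)}
    return values, index_of


def code_values(samples, index_of):
    """Replace depth values by their (smaller) table indices."""
    return [index_of[v] for v in samples]


def decode_values(indices, values):
    return [values[i] for i in indices]


depth = [50, 50, 200, 120, 50, 200]
values, index_of = build_depth_lut(depth)
indices = code_values(depth, index_of)
restored = decode_values(indices, values)
```

Differences between indices are typically smaller than differences between the depth values themselves, which is what makes the table useful for coding the constant partition values.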

Depth Intra Coding
Horizontal or vertical prediction only
Prediction value from center depth pixel at boundary
No residual coded

Annex I SEI Messages
SEI messages of Annexes D, F, and G apply
Additional SEI message: Alternative depth information
Global View and Depth (GVD)
Warp map

3D Main Profile
3D Main Profile: general_profile_idc = 8
3D Profile based on Main profile
Aux picture to carry depth information
Scalability ID: only view ID or depth layer allowed
Inter view MV vertical constraint flag active
→ vertical MV component restricted
All view pictures identical size; no ref local offset allowed
(all views aligned)
Number of pictures in DPB ≤ 8

Results of subjective evaluation from 3D-HEVC veriﬁcation test
JCT3V-M1001 [27]: Comparison of
MV-HEVC: anchor codec for multiview and depth without block-level
changes to decoding process
3D-HEVC: enhanced compression of multiview and depth with
modiﬁcations to block-level decoding process for dependent texture views
⇒ Gain of approx. 20 % for 3D-HEVC compared to MV-HEVC

Figure from JCT3V-M1001 [27]

Part V
Recent Extensions and Developments

Outline
Application Scenario
Screen Content Coding Tools
SCC Proﬁles
SCC Performance

Screen Content Coding (SCC)
Characteristics of screen content sequences differ
from camera captured video
Applications
Gaming
Remote desktops
. . .

Screen Content Coding Tools
Additional set of tools on top of RExt proﬁle
Intra block copy
Palette mode coding
Adaptive cross-component transformation
Adaptive motion vector resolution

Intra Block Copy
Displaced prediction within the same picture
Displaced prediction of current PB/CB from available parts of picture
Harmonized with inter prediction under study, JCTVC-T1000 [28]
Figure from JCTVC-S1014 [29]
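Displaced prediction within the same picture can be sketched as a block copy driven by a block vector. The availability rule used here (reference block entirely above, or entirely to the left and not below the current block) is a deliberate simplification and an assumption; the function name is hypothetical.

```python
def intra_block_copy(recon, x0, y0, size, bv):
    """Predict the block at (x0, y0) from already reconstructed samples.

    recon is a list of rows; bv = (bvx, bvy) is the block vector in samples.
    """
    bvx, bvy = bv
    rx, ry = x0 + bvx, y0 + bvy
    if rx < 0 or ry < 0 or rx + size > len(recon[0]) or ry + size > len(recon):
        raise ValueError("reference block outside picture")
    above = ry + size <= y0
    left = rx + size <= x0 and ry <= y0
    if not (above or left):
        raise ValueError("reference block not yet reconstructed (simplified rule)")
    return [[recon[ry + j][rx + i] for i in range(size)] for j in range(size)]


# 4x4 picture with sample value 10 * y + x, current block at (2, 2), size 2.
picture = [[10 * y + x for x in range(4)] for y in range(4)]
pred_left = intra_block_copy(picture, 2, 2, 2, (-2, 0))
pred_above = intra_block_copy(picture, 2, 2, 2, (0, -2))
```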

Palette mode
Pixel value prediction instead of ’regular’ prediction and transform
Palette predictor, size conﬁgurable in SPS
Horizontal or vertical traversal scan of block
Differential coding of new palette entries
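Reconstruction of a palette-coded block can be sketched as an index map that is filled along a horizontal traverse scan. The scan is assumed here to alternate direction from row to row; escape samples and the palette predictor are left out. All names are hypothetical.

```python
def horizontal_traverse_scan(size):
    """Yield (x, y) positions row by row, reversing direction on odd rows."""
    for y in range(size):
        xs = range(size) if y % 2 == 0 else range(size - 1, -1, -1)
        for x in xs:
            yield x, y


def reconstruct_palette_block(palette, indices, size):
    """Place palette entries into the block in scan order."""
    block = [[None] * size for _ in range(size)]
    for idx, (x, y) in zip(indices, horizontal_traverse_scan(size)):
        block[y][x] = palette[idx]
    return block


palette = [(0, 0, 0), (255, 255, 255)]
block = reconstruct_palette_block(palette, [0, 1, 1, 0], 2)
```

Because the scan turns around at the end of each row, neighbouring scan positions stay spatially adjacent, which favours long runs of identical indices.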

Adaptive cross-component transformation
SCC content in 4:4:4 color format – redundancy among components
Selectable cross-component transform for decorrelation
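The slide does not name the transform. As an illustration of cross-component decorrelation, the sketch below uses the reversible YCoCg-R lifting transform, a well-known integer colour transform; treating it as the transform of this tool is an assumption. Function names are hypothetical.

```python
def rgb_to_ycocg_r(r: int, g: int, b: int):
    """Forward reversible lifting transform (integer, lossless)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y: int, co: int, cg: int):
    """Inverse transform: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


grey = rgb_to_ycocg_r(128, 128, 128)
samples = [(255, 0, 0), (0, 255, 0), (12, 200, 77), (0, 0, 0)]
round_trip = [ycocg_r_to_rgb(*rgb_to_ycocg_r(*s)) for s in samples]
```

For grey samples both chroma-like components are zero, which shows how the redundancy among R, G, and B is concentrated in one component.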

Adaptive motion vector resolution
Motion in screen content sequences
Integer pixel displacements not unlikely (e. g. when moving windows on a
desktop)
Quarter-sample motion vectors not needed in such cases
Solution
Analysis of the observed motion in the coded picture at encoder
Indication of quarter-sample or integer-sample motion vector precision
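The encoder-side decision can be sketched as follows. The criterion (all observed motion vectors are integer-valued) and the function names are assumptions; motion vectors are given in quarter-sample units.

```python
def choose_integer_precision(mvs_quarter_sample) -> bool:
    """Return True if every observed MV component is a whole-sample displacement."""
    return all(mvx % 4 == 0 and mvy % 4 == 0 for mvx, mvy in mvs_quarter_sample)


def to_coded_units(mv, integer_precision: bool):
    """Express an MV in the signaled precision (fewer bits for integer MVs)."""
    mvx, mvy = mv
    return (mvx >> 2, mvy >> 2) if integer_precision else (mvx, mvy)


def from_coded_units(mv, integer_precision: bool):
    """Convert a coded MV back to quarter-sample units."""
    mvx, mvy = mv
    return (mvx << 2, mvy << 2) if integer_precision else (mvx, mvy)


window_move = [(8, 0), (8, 0), (-12, 4)]   # e.g. a window dragged on a desktop
camera_motion = [(5, -3), (8, 0)]
use_int = choose_integer_precision(window_move)
coded = [to_coded_units(mv, use_int) for mv in window_move]
decoded = [from_coded_units(mv, use_int) for mv in coded]
```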

SCC Development
Amendment for Screen Content Coding finalized, publication to follow
Tool specification integrated into main body of HEVC specification text
⇒ Closer integration, usage with other extensions easier, like RExt
Seven dedicated profiles specified

SCC Profiles
Screen Content Coding Profiles: general_profile_idc = 9
Screen-Extended Main and Screen-Extended Main 10 profiles
Screen-Extended Main 4:4:4 and Screen-Extended Main 4:4:4 10 profiles
Screen-Extended High Throughput 4:4:4, Screen-Extended High Throughput 4:4:4 10, and Screen-Extended High Throughput 4:4:4 14 profiles

SCC Performance
Comparison of HM-16.4 vs. HM-16.4+SCM-4.0 for SCC CTC sequences,
lossy coding
Figure from JCTVC-U0051 [30]

Free Viewpoint Television
Extension to 3D video coding
Restriction to co-planar camera arrangement relaxed
New application scenarios
Super-Multiview
Figure from N16128 [31]

Extension to 3D video coding
Restriction to co-planar camera arrangement relaxed
New application scenarios
Free Navigation

Status of FTV Development
MPEG Ad-Hoc Group on FTV
June 2015: Call for Evidence on Free-Viewpoint Television:
Super-Multiview and Free Navigation, N15348 [32]
February 2016: Results of the Call for Evidence on Free-Viewpoint
Television: Super-Multiview and Free Navigation, N16128 [31]
Outcome
1 proposal for Super-Multiview, 3 proposals for Free Navigation
Evaluation of results not yet conclusive

Wide Color Gamut / High Dynamic Range Coding
Colour space
Standard Dynamic Range (SDR) video
Contrast approx. 1000 : 1
ITU-R BT.709 colour space, [33]
High Dynamic Range (HDR) video
Contrast approx. 1000000 : 1
ITU-R BT.2020 colour space, [34]

Wide Color Gamut / High Dynamic Range Coding
Dynamic Range

Standardization Activities on HDR/WCG
ISO/IEC JTC 1/SC 29/WG 11 (MPEG)
02/2015: Requirements and Use Cases for HDR and WCG Content
Coding N15083 [36]
02/2015: Call for Evidence for HDR and WCG Video Coding N15084 [35]
06/2015: Test Results of Call for Evidence (CfE) for HDR and WCG Video
Coding N15350 [37]
10/2015: Assignment of HDR/WCG as work item to JCT-VC
JCT-VC
02/2016: Decision: no new proﬁle, no new coding tools
02/2016: Technical report: Conversion and coding practices for
HDR/WCG video (ISO/IEC TR 23008-14), N16063 [38]
06/2016: Conversion and Coding Practices for HDR/WCG Y’CbCr 4:2:0
Video with PQ Transfer Characteristics, JCTVC-X1017 [39]

HDR/WCG Conversion Practices: Scope
Figure from JCTVC-X1017 [39]

Future Video Coding Development
Standardization Requirements, e.g. MPEG N16359 [40]
50 % bitrate saving (30 % might suffice for important use cases)
Tentative target timeline: new standard by 2020
Variety of content (captured, computer generated, mixed, 3D, various
projections (→ 360VR), multispectral, . . . )
Licensing model(s)?

Joint Video Exploration Team (JVET)
Exploration activities
Future Video Coding AhG in MPEG
Exploration and requirements discussions, workshops
Coding Efﬁciency Improvements AhG in VCEG
Key Technical Areas software, VCEG-AZ01 [41]
Foundation of JVET of MPEG and VCEG, similar to JCT-VC, Oct. 2015
Publicly available document site
http://phenix.int-evry.fr/jvet/
Publicly available software repository
https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/

JVET Status
Joint Exploration Model (JEM)
JVET-A1001,JVET-B1001,JVET-C1001 [42, 43, 44]
Definition of Common Testing Conditions, JVET-B1010 [45]
Definition of Ad-Hoc Groups, JVET-B1000 [46]
Definition of Exploration Experiments, JVET-B1011 [47]
Collection of new test material
JVET-A1002,JVET-B1002,JVET-C1002 [48, 49, 50]

JEM: New Coding Tools – Block Partitioning
QTBT - Quadtree plus binary tree partitioning
Replacement of HEVC CU/PU/TU structure
→ Adaptive block size transforms, Wi03 [51]
Maximum CTU size 256×256
Figure from JVET-C1001 [44]

JEM: New Coding Tools – Intra Coding
67 intra prediction modes, 65 directional modes
Intra prediction mode coding: 6 MPM candidates
4-tap interpolation ﬁlters for directional predictors
Enhanced boundary prediction, . . .

JEM: New Coding Tools – Inter Coding
Sub-PU level motion vector prediction
→ Advanced temporal motion vector prediction (ATMVP)
Adaptive motion vector resolution (AMVR)
Overlapped block motion compensation (OBMC) at left and top block
boundary possible
Affine motion prediction (MVs for 4×4 sub-blocks)
Local illumination compensation (LIC), Pattern-matched motion vector
derivation, bi-directional optical ﬂow (BIO, pixel dense)

JEM: New Coding Tools – Transform Coding
More core transforms: additional
DST-VII, DCT-VIII, DST-I and DCT-V
Mode dependent non-separable
secondary transforms
Signal dependent transform (SDT)

JEM: New Coding Tools – Loop Filters
Re-introduction of Adaptive Loop Filters (ALF)

JEM: New Coding Tools – Entropy Coding
Modiﬁed context modeling for transform coefﬁcients
Multi-hypothesis probability estimation with context-dependent updating
speed
Adaptive initialization for context models

JEM: Performance

Part VI
Summary and Outlook

Outline
Summary
Video Coding Development: Outlook
23 Books and Tools
24 Resources
25 Acronyms
26 References

Summary
Coding tools in HEVC version 1
Range extensions
Multilayer extensions
- Multiview
- Scalability
- 3D
Screen Content coding
High Dynamic Range and Wide Color Gamut
Future Video Coding - Status of JVET

Video Coding Development: Outlook
Towards Immersive Media
360 Virtual Reality
JVET AhG8: 360 video test conditions, JVET-C1000d [53]
MPEG FTV AhG
Lightfields
MPEG AhG on lightfield compression
Joint AhG (JPEG and MPEG) for digital representations of light/sound
fields for immersive media applications, WG1N72033 [54]
Left: 1st frame of BearAttacks sequence, Nokia. Right:
https://s3.amazonaws.com/lytro-corp-assets/blog/camera_array.png

Further Reading
Books published
Wien, Mathias. High Efficiency Video Coding – Coding Tools and
Specification, Springer Berlin Heidelberg, 2015.
Sze, Vivienne, Budagavi, Madhukar, Sullivan, Gary J. (Eds.). High
Efficiency Video Coding (HEVC) – Algorithms and Architectures, Springer
International Publishing, 2014.

YUView Player
YUV player supporting display of statistics information
https://github.com/IENT/YUView
Play back of raw YUV ﬁles, playlists
Comparison / side-by-side views of two ﬁles
Visualization of statistics in xml-format
HEVC bitstream decoding and mode visualization

HEVC Resources (1/2)
JCT-VC Mailing List
http://mailman.rwth-aachen.de/mailman/listinfo/jct-vc
JCT-VC Document Repository
http://phenix.it-sudparis.eu/jct/index.php
http://wftp3.itu.int/av-arch/jctvc-site/
JCT-VC HM Software Repository
SVN:
https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware
Trac:
https://hevc.hhi.fraunhofer.de/trac/hevc

HEVC Resources (2/2)
JCT-3V Mailing List
http://mailman.rwth-aachen.de/mailman/listinfo/jct-3v
JCT-3V Document Repository
http://phenix.int-evry.fr/jct-3v/
http://wftp3.itu.int/av-arch/jct3v-site/
JCT-3V HM Software Repository
SVN:
https://hevc.hhi.fraunhofer.de/svn/svn_3DVCSoftware/
Trac:
https://hevc.hhi.fraunhofer.de/trac/3d-hevc

JVET Resources
JVET Mailing List
http://mailman.rwth-aachen.de/mailman/listinfo/jvet
JVET Document Repository
http://phenix.int-evry.fr/jvet/
http://wftp3.itu.int/av-arch/jvet-site/
JVET JEM Software Repository
SVN:
https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/
Trac:
https://hevc.hhi.fraunhofer.de/trac/jem/

Acronyms
3D-HEVC – 3D high efficiency video coding
AAC – Advanced audio coding
AAP – Alternative Approval Process
AHG – Ad-hoc group
AI – All intra (JCT-VC CTC)
AIF∗ – Adaptive interpolation filter
ALF∗ – Adaptive loop filter
AMD – Amendment (for ISO standard)
AMP – Asymmetric motion partitioning
AMVP – Advanced motion vector prediction
ANG∗ – Angular intra prediction
ASO – Arbitrary slice ordering (H.264|AVC)
AP – Aggregation packet
AU – Access unit
AUD – Access unit delimiter
AVC – Advanced video coding
BDSNR – Bjøntegaard Delta PSNR
BD-rate – Bjøntegaard Delta rate
BL – Base layer
BLA – Broken link access
BO – SAO band offset

Acronyms II
BoG – Break-out group
BP – Bandpass [ﬁlter]
BPB – Bitstream partition buffer
CABAC – Context adaptive binary arithmetic coding
CAVLC – Context adaptive variable length coding
CB – Coding block
CBF∗ – Coded block ﬂag
CBP – Coded block pattern (H.264|AVC)
CBR – Constant bitrate
CD – Committee draft
CD – Compact disk
CE – Core experiment
CfE – Call for evidence
CfP – Call for proposals
CIE – Commission internationale de l’éclairage
CIF – Common intermediate format 352×288
CPB – Coded picture buffer
CRA – Clean random access
CRFB∗ – Compressed reference frame buffer
CRT – Cathode ray tube
CS – Constraint set (in CfP)

Acronyms III
CSS – Coded slice segment
CT – Collaborative team
CTB – Coding tree block
CTC – Common testing conditions
CTU – Coding tree unit
CTX – CABAC context
CU – Coding unit
CVS – Coded Video Sequence
CVSG – Coded Video Sequence Group
DAM – Draft amendment
DC – Direct current
DCT – Discrete cosine transform
DST – Discrete sine transform
DON – Decoding order number
DPB – Decoded picture buffer
DU – Decoding unit
DCTIF – DCT interpolation filter
DIF∗ – DCT interpolation filter, syn. DCTIF
DIF∗ – Directional interpolation filter
DIS – Draft international standard

Acronyms IV
DLP∗ – Decodable leading picture, syn. RADL
DMVD – Decoder side motion vector derivation
DPCM – Differential Pulse Code Modulation
DUT∗ – Directional uniﬁed transform
DVD – Digital Versatile Disk
EG – Exp-Golomb code
EL – Enhancement layer
EO – SAO edge offset
EOB – End of bitstream
EOS – End of sequence
EOTF – Electro-optical transfer function
FCD – Final committee draft
FD – Filler data
FDAM – Final draft amendment
FDIS – Final draft international standard
FLC – Fixed length code
FMO – Flexible macroblock ordering (H.264|AVC)
fps – Frames per second

Acronyms V
FRExt – Fidelity range extensions
FU – Fragmentation unit
GBR – Green blue red, color format (see also RGB)
GOP – Group of pictures
GDR – Gradual Decoder Refresh (H.264|AVC)
HBPS – Hypothetical bitstream partition scheduler
HRD – Hypothetical reference decoder
HD – High definition
HDR – High dynamic range
HDTV – High definition television
HEVC – High efficiency video coding
HM – HEVC test model
HP – Highpass [filter]
HSS – Hypothetical stream scheduler
HTM – 3D-HEVC test model
HVC∗ – High performance video coding (name of HEVC pre-project in MPEG)
HVS – Human visual system
IBDI∗ – Internal bit-depth increase
IDCT – Inverse discrete cosine transform
IDR – Instantaneous decoder refresh

Acronyms VI
IEC – International Electrotechnical Commission
IEEE – Institute of electrical and electronics engineers
IRAP – Intra random access point, see also RAP
IS – International standard
ISDN – Integrated services digital network
ISO – International Organization for Standardization
ITU – International telecommunication union
JCT – Joint collaborative team (of ISO and ITU)
JCT-VC – Joint collaborative team on video coding
JCT-3V – Joint collaborative team on 3D video coding extension development
JEM – Joint exploration model (JVET test model)
JM – Joint model (AVC test model)
JPEG – Joint photographic experts group
JTC – Joint technical committee
JVT – Joint video team
JVET – Joint video exploration team
KLT – Karhunen-Loève transform
KTA – Key Technical Areas (H.264 based exploration software of VCEG)
LCTB∗ – Largest coding tree block, syn. CTB
LCTU∗ – Largest coding tree unit, syn. CTU
LCU∗ – Largest coding unit, syn. CTU

Acronyms VII
LD – Low delay (JCT-VC CTC)
LP – Lowpass [filter]
LPS – Least probable symbol
LSB – Least Significant Bit
MAC – Multiplexed analog components
MBAFF – Macroblock adaptive frame/field coding (H.264|AVC)
MC – Motion compensation
MDDT∗ – Mode dependent directional transform
ME – Motion estimation
Merge – Merge Mode (MV prediction)
MMCO – Memory management control operation (H.264|AVC)
MP3 – MPEG-2 audio layer III
MPEG – Moving picture experts group
MPM – Most probable mode
MPS – Most probable symbol
MRST – Multiple RTP streams over a single media transport
MRMT – Multiple RTP streams over multiple media transports
MSB – Most Significant Bit
MSE – Mean squared error

Acronyms VIII
MV – Motion vector
MVC – Multiview video coding (H.264|AVC)
MVD – Motion vector difference
MV-HEVC – Multiview high efﬁciency video coding
NAL – Network abstraction layer
MANE – Media aware network element
MB – Macroblock (H.264|AVC)
NALU – NAL unit
NB – National body (in ISO)
NGVC∗ – Next generation video coding (name of HEVC pre-project in VCEG)
NTSC – National television systems committee
NUH – NAL unit header
NUT – NAL unit type
PACI – Payload Content Information (packet)
OLS – Output layer set
PAFF – Picture adaptive frame/ﬁeld coding (H.264|AVC)
PAL – Phase alternating line
PB – Prediction block
PCM – Pulse code modulation
PDAM – Proposed draft amendment
POC – Picture order count

Acronyms IX
PPS – Picture parameter set
PSNR – Peak signal to noise ratio
PU – Prediction unit
QCIF – Quarter common intermediate format 176×144
QHD – Quarter High Definition 960×540
QP – Quantization parameter
RA – Random access (JCT-VC CTC)
RADL – Random access decodable leading picture
RAP – Random access point
RASL – Random access skipped leading picture
RBSP – Raw byte sequence payload
RD – Rate-distortion
RDO – Rate-distortion optimization
RDOQ – Rate-distortion optimized quantization
RExt – HEVC Range extensions
RGB – Red green blue, color format
RPL – Reference picture list
RPS – Reference picture set
RQT – Residual quad-tree
RTP – Real-time transport protocol
SAD – Sum of absolute differences

Acronyms X
SAO – Sample adaptive offset
SAR – Sample aspect ratio
SATD – Sum of absolute transformed differences
SC – Sub-committee
SCC – Screen Content Coding
SD – Standard definition (TV)
SDH – Sign data hiding
SDP – Session description protocol
SECAM – Séquentiel couleur à mémoire
SEI – Supplemental enhancement information
SG – Study group
SHVC – Scalable high efficiency video coding
SODB – String of data bits
SOP – Structure of pictures
SPS – Sequence parameter set
SRST – Single RTP stream over a single media transport
SSD – Sum of squared differences
SSE – Sum of squared errors
STSA – Stepwise temporal sub-layer access
SVC – Scalable video coding (H.264|AVC)

Acronyms XI
TB – Transform block
TE – Tool experiment
TFD∗ – Tagged for discard [picture], syn. RASL
TMuC – Test model under consideration
TMVP – Temporal motion vector predictor
TR – Truncated Rice [binarization]
TSA – Temporal sub-layer access
TSB – Transform sub-block
TSCI – Temporal scalability control information
TU – Transform unit
TV – Television
UHD – Ultra High Definition
VBR – Variable bitrate
VCEG – Visual coding experts group
VCL – Video coding layer
VGA – Video Graphics Array 640×480
VLC – Variable length code
VPS – Video parameter set
VUI – Video usability information
WCG – Wide colour gamut
WD – Working draft
WG – Working group
WPP – Wavefront parallel processing

Acronyms XII
XGA – Extended Graphics Array 1024×768
XYZ – XYZ color space, also color format
YCbCr – Color format with luma and two chroma components
YUV – Color format with luma (Y) and two chroma (U, V) components
(Acronyms marked with ∗ are deprecated)

Outline
23 Books and Tools
24 Resources
25 Acronyms
26 References

References I
[1] Charles Poynton. Digital Video and HD: Algorithms and Interfaces. Waltham, MA, USA: Morgan Kaufman Publishers, 2012.
[2] High efficiency video coding. ITU-T, Apr. 2013. URL: http://www.itu.int/rec/T-REC-H.265/en.
[3] Gary J. Sullivan et al. “Overview of the High Efficiency Video Coding (HEVC) Standard”. In: IEEE Transactions on Circuits and
Systems for Video Technology 22.12 (Dec. 2012), pp. 1649–1668. DOI: 10.1109/TCSVT.2012.2221191.
[4] Rickard Sjöberg et al. “Overview of HEVC high-level syntax and reference picture management”. In: IEEE Transactions on Circuits and
[5] Advanced video coding for generic audiovisual services. ITU-T, Jan. 2012. URL:
http://www.itu.int/rec/T-REC-H.264/en.
[6] Stephan Wenger. “H.264/AVC over IP”. In: IEEE Transactions on Circuits and Systems for Video Technology 13.7 (July 2003),
pp. 645–656. DOI: 10.1109/TCSVT.2003.814966.
[7] Thomas Wiegand et al. “Overview of the H.264/AVC Video Coding Standard”. In: IEEE Transactions on Circuits and Systems for Video
Technology 13.7 (July 2003), pp. 560–576.
[8] Frank Bossen. Common test conditions and software reference configurations. Doc. JCTVC-K1100. Shanghai, CN, 11th meeting: Joint
Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Oct. 2012.
[9] Thomas Schierl et al. “System Layer Integration of HEVC”. In: IEEE Transactions on Circuits and Systems for Video Technology 22.12
(Dec. 2012), pp. 1871–1884. DOI: 10.1109/TCSVT.2012.2223054.
[10] Jordi Ribas-Corbera, Philip A. Chou, and Shankar Regunathan. “A Generalized Hypothetical Reference Decoder for H.264/AVC”. In:
IEEE Transactions on Circuits and Systems for Video Technology 13.7 (July 2003), pp. 674–687. DOI:
10.1109/TCSVT.2003.814965.
[11] Jani Lainema et al. “Intra Coding of the HEVC Standard”. In: IEEE Transactions on Circuits and Systems for Video Technology 22.12
(Dec. 2012), pp. 1792–1801. DOI: 10.1109/TCSVT.2012.2221525.
[12] Philipp Helle et al. “Block Merging for Quadtree-based Partitioning in HEVC”. In: IEEE Transactions on Circuits and Systems for Video
Technology 22.12 (Dec. 2012), pp. 1720–1731. DOI: 10.1109/TCSVT.2012.2223051.

References II
[13] Madhukar Budagavi et al. “Core Transform Design in the High Efficiency Video Coding (HEVC) Standard”. In: IEEE Journal of Selected
Topics in Signal Processing tbd (2013), tbd. DOI: 10.1109/JSTSP.2013.2270429.
[14] Andrey Norkin et al. “HEVC Deblocking Filter”. In: IEEE Transactions on Circuits and Systems for Video Technology 22.12 (Dec. 2012),
pp. 1746–1754. DOI: 10.1109/TCSVT.2012.2223053.
[15] Chih-Ming Fu et al. “Sample Adaptive Offset in the HEVC Standard”. In: IEEE Transactions on Circuits and Systems for Video
Technology 22.12 (Dec. 2012), pp. 1755–1764. DOI: 10.1109/TCSVT.2012.2221529.
[16] Detlev Marpe, Heiko Schwarz, and Thomas Wiegand. “Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video
Compression Standard”. In: IEEE Transactions on Circuits and Systems for Video Technology 13.7 (July 2003), pp. 620–637. DOI:
10.1109/TCSVT.2003.815173.
[17] Vivienne Sze and Madhukar Budagavi. “High Throughput CABAC Entropy Coding in HEVC”. In: IEEE Transactions on Circuits and
[18] Wei Pu et al. RCE1: Descriptions and Results for Experiments 1, 2, 3, and 4. Doc. JCTVC-O0202. Geneva, CH, 15th meeting: Joint
Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Oct. 2013.
[19] David Flynn et al. “Overview of the Range Extensions for the HEVC Standard: Tools, Profiles, and Performance”. In: IEEE Transactions
on Circuits and Systems for Video Technology 26.1 (Jan. 2016), pp. 4–19. DOI: 10.1109/TCSVT.2015.2478707.
[20] Jill Boyce et al. “Overview of SHVC: Scalable Extensions of the High Efficiency Video Coding (HEVC) Standard”. In: IEEE Transactions
on Circuits and Systems for Video Technology PP.99 (2015). DOI: 10.1109/TCSVT.2015.2461951.
[21] Vittrorio Baroncini, Karsten Mueller, and Shinya Shimizu. MV-HEVC Verification Test Report. Tech. rep. JCT3V-N1001. San Diego, CA,
USA, 14th meeting, Feb. 2016. URL: http://phenix.int-evry.fr/jct-
3v/doc_end_user/documents/14_San%20Diego/wg11/JCT3V-N1001-v2.zip.
[22] Heiko Schwarz, Detlev Marpe, and Thomas Wiegand. “Overview of the Scalable Video Coding Extension of the H.264/AVC Standard”.
In: IEEE Transactions on Circuits and Systems for Video Technology 17.9 (Sept. 2007), pp. 1103–1120. DOI:
10.1109/TCSVT.2007.905532.

References III
[23] Yan Ye, Vittrorio Baroncini, and Ye-Kui Wang. SHVC verification test report. Doc. JCTVC-W1004. San Diego, CA, USA, 23rd meeting:
Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Feb. 2016. URL: http:
//phenix.it-sudparis.eu/jct/doc_end_user/documents/23_San%20Diego/wg11/JCTVC-W1004-v1.zip
(visited on 07/03/2017).
[24] Ying Chen et al. Test Model 11 of 3D-HEVC and MV-HEC. Tech. rep. JCT3V-K1003. Geneva, CH, 11th meeting, Feb. 2015.
[25] Li Zhang et al. 3D-CE4: Advanced residual prediction for multiview coding. Tech. rep. JCT3V-C0049. Geneva, CH, 3rd meeting, Jan.
2013. URL:
http://phenix.int-evry.fr/jct-3v/doc_end_user/documents/3_Geneva/wg11/JCT3V-C0049-v2.zip.
[26] Fabian Jäger. “Depth-based Block Partitioning for 3D Video Coding”. In: Proc. of International Picture Coding Symposium PCS ’13.
San Jose, USA: IEEE, Piscataway, Dec. 2013.
[27] Vittrorio Baroncini, Karsten Mueller, and Shinya Shimizu. 3D-HEVC Verification Test Report. Tech. rep. JCT3V-M1001. Geneva, CH,
13th meeting, Oct. 2015. URL: http://phenix.int-evry.fr/jct-
3v/doc_end_user/documents/13_Geneva/wg11/JCT3V-M1001-v2.zip (visited on 07/03/2016).
[28] Gary J. Sullivan and Jens-Rainer Ohm. Meeting report of the 20th meeting of the Joint Collaborative Team on Video Coding (JCT-VC),
Geneva, CH, 10–1x Feb. 2015. Doc. JCTVC-T1000. 20th meeting, Geneva, CH: Joint Collaborative Team on Video Coding (JCT-VC)
of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Feb. 2015.
[29] Rajan Joshi et al. Screen content coding test model 3 (SCM 3). Doc. JCTVC-S1014. 19th meeting, Strasbourg, F: Joint Collaborative
Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Oct. 2014.
[30] Bin Li, Jizheng Xu, and Gary Sullivan. Comparison of Compression Performance of HEVC Test Model 16.4 and HEVC Screen Content
Coding Extensions Test Model 4 with AVC High 4:4:4 Predictive profile. Doc. JCTVC-U0051. 21st meeting, Warsaw, PL: Joint
Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, June 2015.
[31] Vittorio Baroncini, Masayuki Tanimoto, and Olgierd Stankiewicz. Results of the Call for Evidence on Free-Viewpoint Television:
Super-Multiview and Free Navigation. Tech. rep. N16128. San Diego, CA, USA, 114th meeting: ISO/IEC JTC1/SC29/WG11, Feb.
2016. URL: http://phenix.it-
sudparis.eu/mpeg/doc_end_user/documents/114_San%20Diego/wg11/w16128-v2-w16128.zip.