Trends and Recent Developments in Video Coding
Standardization
ICME 2018 Tutorial, San Diego, 23.07.2018
Jens-Rainer Ohm, Institute of Communication Engineering, RWTH Aachen University, Germany, ohm@ient.rwth-aachen.de
Mathias Wien, Institute of Imaging and Computer Vision, RWTH Aachen University, Germany, wien@lfb.rwth-aachen.de
Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA |
Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018
1. Introduction and history of video coding standardization (Jens)
2. Source formats and resolutions (Mathias)
3. State of the art in video compression (Mathias)
4. Versatile Video Coding (Jens)
5. Exploratory trends and perspectives (Jens)
6. Coding tools for multi-camera captures (Jens)
7. Summary and outlook
Outline
Part I: Introduction and history of video coding
standardization
Video coding standardization organisations
• ISO/IEC MPEG = “Moving Picture Experts Group”
(ISO/IEC JTC 1/SC 29/WG 11 = International Organization for Standardization and International Electrotechnical Commission,
Joint Technical Committee 1, Subcommittee 29, Working Group 11)
• ITU-T VCEG = “Video Coding Experts Group”
(ITU-T SG16/Q6 = International Telecommunication Union – Telecommunication Standardization Sector (ITU-T,
a United Nations organization, formerly CCITT),
Study Group 16, Working Party 3, Question 6)
• JVT = “Joint Video Team”, the collaborative team of MPEG & VCEG responsible for developing AVC
(discontinued in 2009)
• JCT-VC = “Joint Collaborative Team on Video Coding”, the team of MPEG & VCEG responsible for
developing HEVC (established January 2010)
• JVET = “Joint Video Experts Team”, exploring the potential for new technology beyond HEVC (established Oct.
2015 as Joint Video Exploration Team, renamed Apr. 2018)
History of international video coding standardization (1985 → 2020)
[Figure: timeline with ISO/IEC and ITU-T tracks]
• H.120 (1984-1988)
• H.261 (1990+)
• MPEG-1 (1993)
• H.262 / 13818-2 (MPEG-2) (1994/95-1998+)
• H.263/+/++ (1995-2000+)
• MPEG-4 Visual (1998-2001+)
• H.264 / 14496-10 AVC (2003-2018+) (Advanced Video Coding, developed by JVT)
• H.265 / 23008-2 HEVC (2013-2018+) (High Efficiency Video Coding, developed by JCT-VC)
• H.26x / 23090-3 VVC (2020-...) (Versatile Video Coding, to be developed by JVET); 8K, 360, ...
• Application labels along the timeline: Videotelephony, Computer, SD, HD, 4K UHD
The scope of video standardization
• Only Specifications of the Bitstream, Syntax, and Decoder are standardized:
• Permits optimization beyond the obvious
• Permits complexity reduction for implementability
• Provides no guarantees of quality
[Figure: processing chain Source → Pre-Processing → Encoding → Decoding → Post-Processing & Error Recovery → Destination, with a box labelled "Scope of Standard"]
Hybrid Coding Concept
Basis of every standard since H.261
Hybrid video coding concept
Used since the early days of video compression standards, e.g. H.261, MPEG-1/-2/-4, H.263, AVS,
H.264/AVC, HEVC, and also in most proprietary codecs (VC-1, VP8, etc.)
[Figure sequence: step-by-step build-up of the hybrid coding loop]
• Current stage: Input Signal
• Input Signal → DCT → Quantized → bitstream (010011101001…) → Inverse DCT
• Next Input Signal vs. Reconstruction
• Input Signal – MC Prediction = Residual (compared with the Residual w/o MC)
• Residual → DCT → Quantized → bitstream (010011101001…) → Inverse DCT
• Residual + MC Prediction = Reconstruction, etc.
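The loop shown in the figure sequence can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not a codec: pictures are short 1-D sample lists, the "MC prediction" is simply the previous reconstruction (zero motion, no motion search), the first picture is predicted from zeros instead of being intra coded, and entropy coding is omitted. All function names are made up for this sketch.

```python
import math


def dct(block):
    """Orthonormal DCT-II of a 1-D block."""
    n = len(block)
    return [
        (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
        * sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def encode(pictures, step):
    """Return quantized transform levels for each picture."""
    reference = [0.0] * len(pictures[0])  # assumption: zero reference for the first picture
    all_levels = []
    for picture in pictures:
        residual = [s - p for s, p in zip(picture, reference)]
        levels = [round(c / step) for c in dct(residual)]  # transform + quantization
        all_levels.append(levels)
        # The encoder repeats the decoder's reconstruction so both use the same reference.
        rec_residual = idct([level * step for level in levels])
        reference = [p + r for p, r in zip(reference, rec_residual)]
    return all_levels


def decode(all_levels, step):
    """Reconstruct the pictures from the quantized levels."""
    reference = [0.0] * len(all_levels[0])
    decoded = []
    for levels in all_levels:
        rec_residual = idct([level * step for level in levels])
        reference = [p + r for p, r in zip(reference, rec_residual)]
        decoded.append(reference)
    return decoded
```

Because the encoder predicts from its own reconstruction rather than from the original previous picture, the quantization error does not accumulate from picture to picture in this sketch.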
Performance history of standard generations
[Figure: PSNR (dB, axis 28 to 40) versus bit rate (kbit/s, axis 0 to 300) for the sequence Foreman (QCIF, 10 Hz, 100 frames); curves labelled JPEG, H.261, H.262/MPEG-2, H.263 + MPEG-4 Visual, AVC, and HEVC; annotation near 35 dB: Bit-rate Reduction: 50%]
What made this happen over the years?
• Improvements of motion compensation
▪ Variable partitions & merged partitions
▪ Flexible frame referencing & combined prediction
▪ Sub-sample precision and high performance sub-sample interpolation
▪ More efficient vector prediction & coding, supporting large vector ranges
• Improvements of 2D coding
▪ Efficient intra prediction and intra mode coding
▪ Design of transform bases and variable transform block sizes
• Loop filtering for artifact reduction
▪ Deblocking, sample-adaptive offset
• Improvements of entropy coding
▪ Flexible binarization of syntax elements
▪ Arithmetic coding
▪ Adaptation and usage of context information
• These are coupled with encoder optimization
▪ Rate distortion optimization – spend bits where they give best benefit in terms of distortion reduction
▪ Adaptive rate control and perceptually tuned quantization
Reference picture structures
• Group of Pictures (GoP) structures allowing random access (used since MPEG-1)
• Bi-(directional) prediction for better compression performance (used since MPEG-1)
[Figure: (a) uni-directional prediction over pictures 1 to 7 (labels: I|P, B B B B B B B; previous and pre-previous picture references); (b) bi-directional prediction over pictures 1 to 7 (labels: I|P B B P B B P)]
Reference picture structures
• Hierarchical prediction structures for frame rate scalability and further improved compression performance
(used in AVC and HEVC)
[Figure: hierarchical prediction structures with temporal levels 0 to 3 and prediction layers L0 to L3; (a) with P pictures (I0/P0, P1, P2, P3), (b) with B pictures (I0/P0, B1, B2, B3)]
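A hierarchical structure of this kind gives frame rate scalability because the highest temporal level can be dropped without affecting the remaining pictures. The sketch below assumes a dyadic structure with a GoP size of 8 (four temporal levels, as suggested by the level indices 0 to 3 in the figure); the function names and the use of the picture index as display order are choices made for this illustration.

```python
def temporal_level(picture_index: int, gop_size: int = 8) -> int:
    """Temporal level of a picture in a dyadic hierarchical GoP (0 = key picture)."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    max_level = gop_size.bit_length() - 1
    position = picture_index % gop_size
    if position == 0:
        return 0
    # Number of trailing zero bits: more zeros means a lower (more important) level.
    trailing_zeros = (position & -position).bit_length() - 1
    return max_level - trailing_zeros


def pictures_up_to_level(num_pictures: int, max_level: int, gop_size: int = 8):
    """Indices of the pictures that remain when higher temporal levels are dropped."""
    return [i for i in range(num_pictures) if temporal_level(i, gop_size) <= max_level]
```

With these assumptions, keeping only levels 0 to 2 halves the picture rate, and keeping only levels 0 and 1 quarters it.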
Coder control
• Coder control is a non-normative part of video codecs
• Choose coding parameters at encoder side
“What part of the video signal should be coded using what method and parameter settings?”
• Constrained problem:
$$\mathbf{p}_{\mathrm{opt}} = \arg\min_{\mathbf{p}} D(\mathbf{p}) \quad \text{s.t.} \quad R(\mathbf{p}) \le R_{\mathrm{Target}}$$
• Unconstrained Lagrangian formulation:
$$\mathbf{p}_{\mathrm{opt}} = \arg\min_{\mathbf{p}} \left[ D(\mathbf{p}) + \lambda \cdot R(\mathbf{p}) \right]$$
with D: distortion, R: rate, p: parameter vector
• λ depends on the slope of the rate-distortion function:
  ▪ Small value: high rate, low distortion
  ▪ High value: low rate, high distortion
• Can be applied in motion parameter estimation, mode decision, transform coefficient
quantization, …; typically a fixed relationship between λ and the QP value is set
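As a minimal sketch of the Lagrangian formulation applied to mode decision: each candidate parameter choice has a distortion D and a rate R, and the encoder picks the candidate with the smallest cost D + λ·R. The candidate names and the numbers are invented for illustration; a real encoder would measure D and R per block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str          # hypothetical mode label
    distortion: float  # D(p), e.g. sum of squared errors
    rate: float        # R(p), in bits


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Lagrangian cost D + lambda * R."""
    return candidate.distortion + lam * candidate.rate


def choose(candidates, lam: float) -> Candidate:
    """Return the candidate that minimizes the Lagrangian cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Invented example values: lower distortion costs more bits.
CANDIDATES = [
    Candidate("skip", distortion=900.0, rate=1.0),
    Candidate("inter", distortion=300.0, rate=40.0),
    Candidate("intra", distortion=100.0, rate=120.0),
]
```

With these numbers a small λ selects the high-rate, low-distortion candidate and a large λ selects the low-rate, high-distortion one, matching the two cases listed on the slide.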
Motivation for improved video compression
• Video resolution is continually increasing
▪ HD existing, UHD (4Kx2K, 8Kx4K) appearing
▪ Mobile services going towards HD/UHD
▪ Stereo, multi-view, 360° video
• Devices available to record and display ultra-high resolutions
▪ Becoming affordable for home and mobile consumers
• Video has multiple dimensions along which the data rate grows
▪ Frame resolution, temporal resolution
▪ Color resolution, bit depth
▪ Multi-view
→ Visible distortion still an issue with existing networks
• Necessary video data rate grows faster than feasible network transport capacities
→ Better video compression (than current HEVC) needed in the next decade, even after availability of 5G
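To make the growth along these dimensions concrete, the uncompressed data rate can be computed from frame resolution, picture rate, bit depth, and chroma format. The helper below is an illustrative calculation only; the two example formats are chosen here and are not taken from the slides.

```python
def raw_bit_rate(width, height, picture_rate, bit_depth, chroma_samples_per_luma=0.5):
    """Uncompressed bit rate in bit/s.

    chroma_samples_per_luma is the total for both chroma components:
    0.5 for 4:2:0, 1.0 for 4:2:2, 2.0 for 4:4:4.
    """
    samples_per_picture = width * height * (1 + chroma_samples_per_luma)
    return samples_per_picture * bit_depth * picture_rate


hd = raw_bit_rate(1920, 1080, 50, 8)     # HD, 50 Hz, 8 bit, 4:2:0
uhd = raw_bit_rate(3840, 2160, 60, 10)   # UHD, 60 Hz, 10 bit, 4:2:0
```

Under these assumptions the HD example needs about 1.24 Gbit/s uncompressed and the UHD example about 7.46 Gbit/s, a factor of 6 from resolution, picture rate, and bit depth combined.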
Part II: Source formats and resolutions
Structure of a Video Sequence
• Sequence of pictures successively captured or rendered
• Progressive and interlaced formats
• Picture rate measured in pictures per second, unit Hertz (Hz)
• Minimum picture rate of 24 Hz for the impression of fluent motion [Po12]
▪ Standard Definition TV at 50/60 Hz interlaced
▪ High Definition (HD) video at 50/60 Hz progressive
▪ Ultra HD (UHD) video up to 120 Hz
▪ Up to 300 Hz considered
[Po12] Charles Poynton. Digital Video and HD: Algorithms and Interfaces. Waltham, MA, USA: Morgan Kaufmann Publishers, 2012.
Pictures, Frames, and Fields
• Picture
▪ Set of arrays or a single array of samples with intensity values
▪ Monochrome picture: single intensity array
▪ Color video: usually three intensity arrays
⇒ three color components representing the color
▪ Color sample (all three components) also referred to as a pixel
(derived from picture element, sometimes also denoted as pel)
▪ Optional alpha channel to indicate opaqueness (transparency) for mixing applications
Pixel Shape
• Picture
▪ Set of pixel lines, defined number of pixels per line
▪ Shape of pixels not necessarily square, depends on picture format
▪ Examples: [Figure: example pixel shapes]
Chroma Sub-Sampling
• Human visual system less sensitive to color than to structure and texture
⇒ full resolution luma, lower resolution chroma
• Chroma sub-sampling types commonly specified by the relation between the
number of luma and chroma samples
YCbCr Y : X1 : X2
• With Y: number of luma pixels
• Sub-sampling format of chroma components specified by X1 and X2
• X1: horizontal sub-sampling
• X2 = 0: vertical sub-sampling identical to horizontal sub-sampling
• X2 = X1: no vertical sub-sampling
• Examples: 4:4:4 (no sub-sampling), 4:2:2 (horizontal only), 4:2:0 (horizontal and vertical)
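The Y : X1 : X2 rule above can be turned into a small helper that returns the size of one chroma plane. This is a sketch of the rule exactly as stated on the slide; rounding up for odd luma dimensions is an assumption of this example, and the function name is hypothetical.

```python
def chroma_plane_size(width: int, height: int, fmt: str):
    """Return (chroma_width, chroma_height) for a 'Y:X1:X2' format string."""
    y, x1, x2 = (int(part) for part in fmt.split(":"))
    if x1 <= 0 or y % x1:
        raise ValueError("X1 must be a positive divisor of Y")
    horizontal_factor = y // x1            # X1: horizontal sub-sampling
    if x2 == 0:
        vertical_factor = horizontal_factor  # X2 = 0: same as horizontal
    elif x2 == x1:
        vertical_factor = 1                  # X2 = X1: no vertical sub-sampling
    else:
        raise ValueError("format not covered by the rule on the slide")
    # Ceiling division so that odd luma sizes still get a covering chroma plane.
    return -(-width // horizontal_factor), -(-height // vertical_factor)
```

For a 1920×1080 picture this gives 960×540 chroma planes in 4:2:0 and 960×1080 in 4:2:2.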
Representation of Color
• Color Impression
▪ Visible range of the spectrum: 380 nm to 780 nm
▪ Impression of color: intensity density distribution over the visible spectral range
▪ Colors corresponding to a single wavelength:
  ▪ spectral colors or primary colors
▪ Human visual system has three color receptors (cone cells)
  ▪ Maximum sensitivity in the wavelength areas of red, green and blue
▪ Additional ’gray-scale’ receptors (rod cells): responsive in low lighting conditions
Picture source: Wikipedia, artwork by Holly Fischer
The CIE Standard Observer
• Visual perception split into perception of brightness (light and dark) and
chromaticity (color impression)
▪ Brightness is driven by the summed intensity of the observed spectrum
▪ Color impression is driven by the shape of the intensity distribution
• Functional expression to represent perceived color by a mathematical
description first standardized in the CIE 1931 Standard Observer
• Color as a point in a three-dimensional XYZ space
• X,Y,Z values derived from the observed spectrum
• Three color matching functions
CIE: Commission internationale de l’éclairage, http://www.cie.co.at
Standard Observer specified in ISO 11664-1
The CIE Standard Observer
• Normalization to express the chromaticity independently of the observed brightness:
$$x = \frac{X}{X+Y+Z}, \quad y = \frac{Y}{X+Y+Z}, \quad z = \frac{Z}{X+Y+Z}$$
• Since $x + y + z = 1$, it follows that $z = 1 - x - y$
• Chromaticity specified by the (x,y)-pair
• Definition of a standardized white point, e.g. ’white C’, ’white D65’
[Po12] Charles Poynton. Digital Video and HD: Algorithms and Interfaces. Waltham, MA, USA: Morgan Kaufmann Publishers, 2012.
[Hu04] Robert G.W. Hunt. The Reproduction of Colour. 6th ed. Chichester, West Sussex, England: Wiley-VCH, 2004.
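The normalization from XYZ tristimulus values to an (x,y) chromaticity pair is a one-line computation. The sketch below uses the commonly tabulated D65 tristimulus values for the CIE 1931 observer (normalized to Y = 1) as example input; the function name is chosen for this illustration.

```python
def xy_chromaticity(X: float, Y: float, Z: float):
    """Project XYZ tristimulus values onto the (x, y) chromaticity plane."""
    total = X + Y + Z
    if total <= 0:
        raise ValueError("X + Y + Z must be positive")
    return X / total, Y / total


# D65 white point, CIE 1931 observer, normalized to Y = 1 (tabulated values).
D65_XYZ = (0.95047, 1.0, 1.08883)
x_d65, y_d65 = xy_chromaticity(*D65_XYZ)
```

Scaling X, Y, and Z by a common factor changes the brightness but leaves (x,y) unchanged, which is the point of the normalization.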
Color Spaces: Standard and High Dynamic Range / Wide Color Gamut
• Colour space
▪ Standard Dynamic Range (SDR) video
  ▪ Contrast approx. 1000 : 1
  ▪ ITU-R BT.709 colour space
▪ High Dynamic Range (HDR) video
  ▪ Contrast approx. 1000000 : 1
  ▪ ITU-R BT.2100 colour space
Figure from N15084: Ajay Luthra, Edouard Francois, and Walt Husak (Eds.). Requirements and Use Cases for HDR and
WCG Content Coding. Doc. N15084. Geneva, CH, 111th meeting: MPEG, Feb. 2015.
ITU-R BT.709: Parameter values for the HDTV standards for production and international programme exchange. ITU-R,
Apr. 2004. URL: http://www.itu.int/rec/R-REC-BT.709/en
ITU-R BT.2020: Parameter values for ultra-high definition television systems for production and international programme
exchange. ITU-R, Oct. 2015. URL: http://www.itu.int/rec/R-REC-BT.2020/en
ITU-R BT.2100: Image parameter values for high dynamic range television for use in production and international
programme exchange. ITU-R, Jun. 2017. URL: http://www.itu.int/rec/R-REC-BT.2100-1-201706-I/en
Color Spaces: Standard and High Dynamic Range / Wide Color Gamut
Figure from: Ajay Luthra, Edouard Francois, and Walt Husak (Eds.). Requirements and Use Cases for HDR and WCG Content Coding. Doc. N15084. Geneva, CH, 111th meeting: MPEG, Feb. 2015.
HDR/WCG Conversion Practices: Scope
ITU-T H Suppl. 15 | ISO/IEC TR 23008-14, Conversion and Coding Practices for HDR/WCG Y′CbCr 4:2:0 Video with PQ Transfer Characteristics.
ITU-T H Suppl. 18 | ISO/IEC TR 23008-15, Signalling, backward compatibility and display adaptation for HDR/WCG video coding.
Figure from: Jonatan Samuelsson et al.: Conversion and Coding Practices for HDR/WCG Y′CbCr 4:2:0 Video with PQ Transfer Characteristics (Draft 4). Doc. JCTVC-Z1017. 26th meeting,
Geneva, CH: Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Jan 2017.
Part III: State of the Art in Video Compression
Outline and Concept for Part III
Comparison of HEVC and the Joint Exploration Test Model (JEM) of JVET
• A glimpse of high-level syntax (HEVC)
• Coding structures
• Walk-through of the coding loop
▪ Intra coding
▪ Inter coding
▪ Transform coding
▪ Loop filters
▪ Entropy coding
Network Abstraction Layer and Video Coding Layer
• Coded Video Sequence (CVS)
▪ Starts with a random access point (intra-coded picture)
▪ One or more CVSs in a bitstream
→ Coded Video Sequence Group (CVSG)
• Network Abstraction Layer (NAL)
▪ Encapsulation of coded video sequence for transport and storage
▪ Video coding layer (VCL) NAL units
  ▪ Information directly for reconstruction of samples and pictures
▪ Non-VCL NAL units
  ▪ Parameter sets
  ▪ Supplemental enhancement information
  ▪ ...
NAL Unit Structure
• RBSP: Raw byte sequence payload
▪ Sequence of bytes comprising the coded NAL unit payload
▪ RBSP stop bit (= ’1’) plus zero bits for byte alignment
• SODB: String of data bits
▪ Concatenation of bits in the RBSP bytes from MSB to LSB
▪ All bits needed for the decoding process
▪ Only the bits needed for the decoding process
[Figure: NAL unit structure, label: NAL unit header]
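The relation between SODB and RBSP described above can be sketched as follows: the RBSP is the SODB followed by a stop bit equal to 1 and as many zero bits as are needed to reach a byte boundary. The sketch represents the SODB as a string of '0'/'1' characters for readability and deliberately ignores everything else in a NAL unit (header, emulation prevention); the function names are made up for this example.

```python
def sodb_to_rbsp(sodb_bits: str) -> bytes:
    """Append the stop bit and alignment zero bits, then pack MSB first."""
    if set(sodb_bits) - {"0", "1"}:
        raise ValueError("sodb_bits must contain only '0' and '1'")
    bits = sodb_bits + "1"                 # RBSP stop bit
    bits += "0" * (-len(bits) % 8)         # zero bits for byte alignment
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def rbsp_to_sodb(rbsp: bytes) -> str:
    """Recover the SODB by cutting at the last '1' bit (the stop bit)."""
    bits = "".join(f"{byte:08b}" for byte in rbsp)
    stop = bits.rfind("1")
    if stop < 0:
        raise ValueError("no stop bit found")
    return bits[:stop]
```

A SODB that already ends on a byte boundary still receives a full extra byte, since the stop bit is always appended.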
HEVC Spatial Coding Structures
• Blocks and Units
▪ Block: Square or rectangular area in a color component array
▪ Unit: Collocated blocks of the (three) color components, associated syntax elements and
prediction data (e.g. motion vectors)
• Picture partitioning
▪ Coding Tree Blocks / Coding Tree Units (CTBs / CTUs)
▪ Each CTU in exactly one slice segment
▪ Independent slice segment: full header, independently decodable
▪ Dependent slice segment: very short header, relies on corresponding independent slice,
inherits CABAC state
• Slice types
▪ I-slice: Intra prediction only
▪ P-slice: Intra prediction and motion compensation with one reference picture list
▪ B-slice: Intra prediction and motion compensation with one or two reference picture lists
CABAC: Context-based Adaptive Binary Arithmetic Coding
Tiles in HEVC
• Change scanning order of CTBs in picture
• Slices in tiles, or tiles in slices
• Reset of prediction and entropy coding → parallel processing
HEVC: Coding Tree Blocks and Coding Blocks (CBs)
• Maximum CTU size: 64×64 pixels
• Quadtree partitioning of CTB into CBs
• If picture size not integer multiple of CTB size:
▪ Implicit CTB partitioning to meet picture size (picture size must be a multiple of 8×8 pixels)
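The implicit partitioning at the picture boundary can be illustrated with a short recursion: a block that crosses the boundary is split into four quadrants until every remaining block lies inside the picture. This is a simplified sketch of that idea, not the normative HEVC process; it applies only the forced splits, assumes luma sample coordinates, and takes the 64×64 CTB size and the 8×8 minimum from the slide.

```python
def ctu_grid(width: int, height: int, ctu_size: int = 64):
    """Number of CTU columns and rows needed to cover the picture."""
    return -(-width // ctu_size), -(-height // ctu_size)


def implicit_split(x, y, size, width, height, min_cb=8):
    """Blocks (x, y, size) of one CTB after forced boundary splits only."""
    if x >= width or y >= height:
        return []                      # completely outside the picture
    if x + size <= width and y + size <= height:
        return [(x, y, size)]          # completely inside: no forced split
    if size == min_cb:
        raise ValueError("picture size must be a multiple of the minimum CB size")
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += implicit_split(x + dx, y + dy, half, width, height, min_cb)
    return blocks
```

For a 1920×1080 picture, the last CTU row starts at luma line 1024 and only 56 of its 64 lines are inside the picture, so each CTB in that row is forced down to a mix of 32×32, 16×16, and 8×8 blocks.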
HEVC: Prediction Blocks (PBs) and Transform Blocks (TBs)
• Prediction block partitioning of a 2N×2N CB
• Transform block partitioning of a CB
▪ Quadtree partitioning of CB → Residual Quad Tree (RQT)
▪ Transform size 4×4 to 32×32
▪ TB size 4×4 to 64×64
▪ PB boundaries inside TBs allowed
JEM: Quad-Tree plus Binary Tree Partitioning (QTBT)
• QTBT structure removes the concept of multiple partition types (TU = PU = CU)
• Maximum CTU size: 256×256 pixels (128×128 used in common testing conditions)
• Binary trees starting from leaves of the quad-tree (with horizontal / vertical split indication)
→ CU can have either square or rectangular shape
• Configuration
▪ MinQTSize, MaxBTSize: minimum quadtree leaf node size / maximum binary tree root node size
▪ MaxBTDepth, MinBTSize: maximum binary tree depth / minimum binary tree leaf node size
[Figure: example QTBT partitioning with binary tree split indications (labels 1, 1, 0, 1, 0, 0)]
Figure from: Jianle Chen et al. Algorithm Description of Joint Exploration Test Model 7. Doc. JVET-G1001. Torino, IT, 7th meeting: Joint Video
Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Jul. 2017.
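How the four configuration parameters constrain the partitioning can be sketched as a function that lists the split options for a node. This is a simplified reading of the constraints named on the slide, not the JEM implementation: the default values other than the 128×128 CTU size are placeholders, the split names are hypothetical, and "vertical" here means a split that halves the width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtbtConfig:
    ctu_size: int = 128      # from the common testing conditions on the slide
    min_qt_size: int = 16    # placeholder value
    max_bt_size: int = 64    # placeholder value
    max_bt_depth: int = 4    # placeholder value
    min_bt_size: int = 4     # placeholder value


def allowed_splits(width: int, height: int, bt_depth: int, cfg: QtbtConfig = QtbtConfig()):
    """Split options for a node; bt_depth is 0 for quadtree nodes."""
    splits = ["NO_SPLIT"]
    # Quadtree splitting is only possible before any binary split has happened.
    if bt_depth == 0 and width == height and width > cfg.min_qt_size:
        splits.append("QT_SPLIT")
    # Binary splitting needs a small enough root and remaining binary tree depth.
    if max(width, height) <= cfg.max_bt_size and bt_depth < cfg.max_bt_depth:
        if width > cfg.min_bt_size:
            splits.append("BT_VERTICAL")    # halves the width
        if height > cfg.min_bt_size:
            splits.append("BT_HORIZONTAL")  # halves the height
    return splits
```

Once a node has been split by the binary tree, `bt_depth` is at least 1, so the quadtree option disappears; this reflects the statement that binary trees start from the leaves of the quad-tree.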
Intra Prediction
HEVC Intra Prediction Modes
Intra prediction modes
• Planar prediction: mode 0
• DC intra prediction: mode 1
• Numbering from diagonal-up to diagonal-down
▪ Modes 2 – 18: horizontal
▪ Modes 19 – 34: vertical
▪ Horizontal: mode 10
▪ Vertical: mode 26
Intra prediction block size
• Intra prediction mode coded per CU
• Prediction block size derived from residual quadtree
• Boundary samples of neighboring block used for prediction
• Efficient representation
• Local update of prediction source
JEM Intra Prediction Modes
• Concept of HEVC as basis
▪ Higher number of prediction modes
▪ Larger maximum block size
• Chroma
▪ Prediction modes from neighbors
▪ Derived modes from collocated luma
Figure from: Jianle Chen et al. Algorithm Description of Joint Exploration Test Model 7. Doc. JVET-G1001. Torino, IT, 7th meeting: Joint Video
Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Jul. 2017.
Interpolation Filters for Directional Intra Prediction Modes
• HEVC
▪ 2-tap filters
▪ Weight derived from prediction direction
• JEM
▪ 4-tap filters
▪ Cubic interpolation for blocks with ≤ 64 samples
▪ Gaussian interpolation filters elsewhere
▪ Parameters fixed according to block size
▪ Same filter for all predicted samples, all modes
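A 2-tap filter of the HEVC kind is a linear interpolation between two neighbouring reference samples, with the weight given by the fractional position at which the prediction direction hits the reference row or column. The sketch below assumes 1/32-sample accuracy with integer rounding, which is my understanding of the HEVC design; the position argument and both function names are choices made for this example. The second helper only mirrors the block-size rule stated for JEM on the slide.

```python
def interpolate_2tap(reference, position_32nds: int) -> int:
    """Linearly interpolate the reference at a position given in 1/32 samples."""
    index = position_32nds >> 5        # integer sample position
    fraction = position_32nds & 31     # fractional part, 0..31
    if fraction == 0:
        return reference[index]        # full-sample position: copy
    return ((32 - fraction) * reference[index]
            + fraction * reference[index + 1] + 16) >> 5


def jem_filter_type(width: int, height: int) -> str:
    """Filter family by block size, following the rule on the slide."""
    return "cubic" if width * height <= 64 else "gaussian"
```

The rounding offset of 16 followed by a right shift by 5 implements division by 32 with rounding, so the filter weights always sum to one.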
Intra Prediction Boundary Filtering
• HEVC
▪ Boundary sample filtering for intra prediction modes 10, 26
(horizontal / vertical)
▪ Local, 1-sample update at boundary perpendicular to prediction direction
• JEM
▪ Extended to directional modes
▪ Boundary samples up to four columns or rows
▪ 2-tap filter for intra modes 2 & 34
▪ 3-tap filter for intra modes 3–6 & 30–33
Figure from: JVET-G1001: Algorithm Description of Joint Exploration Test Model 7.
JEM: Cross-Component Linear Model Prediction (CCLM)
• Chroma samples predicted using corresponding reconstructed luma samples
$$\mathrm{pred}_{C}(i,j) = \alpha \cdot \mathrm{rec}_{L}'(i,j) + \beta$$
• Parameters $\alpha$ and $\beta$: minimize the regression error between
neighbouring reconstructed luma and chroma samples
around the current block
• Further prediction between chroma components with updated
parameters
$$\mathrm{pred}_{Cr}^{*}(i,j) = \mathrm{pred}_{Cr}(i,j) + \alpha \cdot \mathrm{resi}_{Cb}'(i,j)$$
Multiple model CCLM mode (MMLM)
• Neighbouring luma samples and neighbouring chroma samples classified
into two groups
• Linear model for each group
Figures from: JVET-G1001: Algorithm Description of Joint Exploration Test Model 7.
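The parameter derivation for the linear model can be sketched as an ordinary least-squares fit over the neighbouring sample pairs, followed by applying the model to the reconstructed luma block. This is a floating-point illustration of the regression idea only; JEM works with integer arithmetic and with down-sampled luma samples, neither of which is reproduced here, and the function names are hypothetical.

```python
def fit_linear_model(neighbour_luma, neighbour_chroma):
    """Least-squares fit of chroma = alpha * luma + beta."""
    n = len(neighbour_luma)
    if n == 0 or n != len(neighbour_chroma):
        raise ValueError("need equally many luma and chroma neighbours")
    sum_l = sum(neighbour_luma)
    sum_c = sum(neighbour_chroma)
    sum_ll = sum(l * l for l in neighbour_luma)
    sum_lc = sum(l * c for l, c in zip(neighbour_luma, neighbour_chroma))
    denominator = n * sum_ll - sum_l * sum_l
    if denominator == 0:
        return 0.0, sum_c / n          # flat luma: fall back to the chroma mean
    alpha = (n * sum_lc - sum_l * sum_c) / denominator
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta


def predict_chroma(rec_luma_block, alpha, beta):
    """Apply pred_C(i,j) = alpha * rec_L'(i,j) + beta to every sample."""
    return [[alpha * value + beta for value in row] for row in rec_luma_block]
```

For the multiple model variant, the same fit would be run separately on each of the two neighbour groups.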
JEM: Position Dependent Intra Prediction Combination for Planar Mode (PDPC)
• Combination of the unfiltered boundary reference samples and HEVC-style intra
prediction with filtered boundary reference samples
▪ Position-dependent weighting of filtered and unfiltered reference, configurable by
four weighting parameters (hor/ver + corner)
▪ Filtered reference: linear combination of unfiltered reference and lowpass, configurable
weight
▪ Three predefined lowpass filters selectable (3-tap, 5-tap, 7-tap)
▪ Prediction parameters stored per block size
Figure from: JVET-G1001: Algorithm Description of Joint Exploration Test Model 7.
• HEVC
 Bi-linear smoothing
 Depending on prediction block size
Mode-dependent Intra Reference Sample Smoothing (MDIS)
• Temporally adopted in JEM (removed in JEM7)
 Adaptive reference sample smoothing (ARSS)
 3-tap LPF with the coefficients of [1, 2, 1] / 4
 5-tap LPF with the coefficients of [2, 3, 6, 3, 2] / 16
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
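The two lowpass filters listed for ARSS can be applied to a line of reference samples as follows. This is a minimal sketch: the integer rounding and the clamping of indices at both ends of the reference line are assumptions for illustration, not taken from the slides.

```python
def smooth_reference(samples, taps=(1, 2, 1)):
    """Lowpass-filter a line of intra reference samples with integer taps."""
    norm = sum(taps)          # 4 for [1, 2, 1], 16 for [2, 3, 6, 3, 2]
    half = len(taps) // 2
    last = len(samples) - 1
    out = []
    for i in range(len(samples)):
        acc = 0
        for k, tap in enumerate(taps):
            # Clamp the index at both ends of the reference line (assumption).
            j = min(max(i + k - half, 0), last)
            acc += tap * samples[j]
        out.append((acc + norm // 2) // norm)  # divide with rounding offset
    return out


ref = [10, 10, 10, 50, 50, 50]          # hypothetical reference samples with an edge
smoothed3 = smooth_reference(ref)        # 3-tap [1, 2, 1] / 4
smoothed5 = smooth_reference(ref, (2, 3, 6, 3, 2))  # 5-tap [2, 3, 6, 3, 2] / 16
```

The 5-tap filter spreads the edge over more samples than the 3-tap filter, which is the trade-off an adaptive selection would weigh.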
Inter Prediction
Prediction from reference picture lists
• Uni-prediction
 P-slices only with List0, B-slices with List0 or List1
 HEVC: Minimum PB size 8×4 or 4×8
• Bi-prediction, only in B-slices
 One predictor from List0, one predictor from List1
 HEVC: Minimum prediction block size 8×8
Motion Compensated Prediction
• Merge mode
 Motion vector (MV) derived from candidate set
(spatial and temporal neighborhood)
 Merge mode candidate index coded
 No motion vector difference encoded
• Advanced motion vector prediction
 Predictor derived from candidate set
(spatial and temporal neighborhood)
 Predictor index coded
 Motion vector difference encoded
• Skip mode
 Only merge candidate signaled, no residual
HEVC: Motion Vector Representation
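The difference between the three modes comes down to what the decoder receives. The sketch below reconstructs a motion vector from a candidate list; it is simplified to the vector itself (reference indices and prediction direction are left out), and the function name and the tuple representation are assumptions for illustration.

```python
def reconstruct_mv(mode, candidates, index, mvd=(0, 0)):
    """Rebuild a motion vector from a candidate list and the coded syntax.

    mode: "merge", "skip" or "amvp"
    candidates: list of (mv_x, mv_y) from the spatial/temporal neighbourhood
    index: coded candidate (merge) or predictor (AMVP) index
    mvd: coded motion vector difference, only used for AMVP
    """
    cand = candidates[index]
    if mode in ("merge", "skip"):
        # Motion is inherited as-is; skip additionally implies no residual.
        return cand
    if mode == "amvp":
        # The predictor is corrected by the transmitted difference.
        return (cand[0] + mvd[0], cand[1] + mvd[1])
    raise ValueError("unknown mode: " + mode)


candidates = [(4, -2), (8, 0)]           # hypothetical candidate set
mv_merge = reconstruct_mv("merge", candidates, 1)
mv_amvp = reconstruct_mv("amvp", candidates, 0, mvd=(3, 1))
```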
• CU: at most one set of motion parameters for each prediction direction
• Option to split large CU into sub-CUs
 Alternative temporal motion vector prediction (ATMVP)
 Fetch multiple sets of motion information from multiple blocks in collocated reference picture
 Spatial-temporal motion vector prediction (STMVP)
 Derive recursively by temporal motion vector predictor and spatial
neighbouring motion vector
• ATMVP and STMVP: additional merge candidates (list extended to max 7)
JEM: Sub-CU based motion vector prediction
Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• Locally adaptive motion vector resolution (LAMVR)
motion vector difference (MVD) coded in units of
 quarter luma samples,
 integer luma samples, or
 four luma samples
• Higher motion vector storage accuracy
 Internal motion vector storage and merge candidate at 1/16 pel (skip and merge modes only)
 SHVC upsampling interpolation filters for the additional fractional pel positions
JEM Motion Vector Representation
SHVC: Scalable High Efficiency Video Coding, HEVC Annex G
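The three MVD resolutions differ only in the unit of the coded value. A minimal sketch of the conversion to a common quarter-sample grid follows; the mode names and the function are hypothetical, chosen here for illustration.

```python
# Left shift that converts a coded MVD to quarter-luma-sample units.
MVD_SHIFT = {"quarter": 0, "integer": 2, "four": 4}


def mvd_to_quarter_units(coded_mvd, precision):
    """Scale a coded MVD (x, y) to quarter-luma-sample units."""
    shift = MVD_SHIFT[precision]
    return (coded_mvd[0] << shift, coded_mvd[1] << shift)


mvd_int = mvd_to_quarter_units((3, -1), "integer")   # 3 samples right, 1 sample up
mvd_four = mvd_to_quarter_units((3, -1), "four")     # 12 samples right, 4 samples up
```

Coarser units cost fewer bits for large displacements, at the price of not being able to signal fractional corrections.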
• Overlapped Block Motion Compensation (OBMC) was previously used in ITU-T H.263
• Switchable on CU level
 Applied at motion compensation (MC) block boundaries except the right and bottom boundaries of the CU
 Applied for both the luma and chroma components
 Performed at sub-block level for all MC block boundaries
JEM: Overlapped Block Motion Compensation
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• Linear model for illumination changes, using a scaling factor a and an offset b (concept taken from 3D-HEVC)
• Enabled or disabled adaptively for each inter-mode coded coding unit (CU)
• Least-squares method employed to derive the parameters a and b
• CU in 2N×2N merge mode: LIC flag copied from neighbouring blocks (like merge)
• Otherwise: LIC flag signaled at CU level
JEM: Local Illumination Compensation (LIC)
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• Motion vector field (MVF) for CU, applicable MV derived for each
4×4 block at 1/16 pel resolution
 Control point motion vector (CPMV)
• AF_INTER mode
 Signaling CPMV difference from predictor
 Block width and height ≥ 8 required
• AF_MERGE mode
 Derivation of CPMV from neighborhood
JEM: Affine Motion Vector Derivation for MC
$v_x = \frac{v_{1x} - v_{0x}}{w} x - \frac{v_{1y} - v_{0y}}{w} y + v_{0x}$
$v_y = \frac{v_{1y} - v_{0y}}{w} x + \frac{v_{1x} - v_{0x}}{w} y + v_{0y}$
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
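The motion vector field of an affine CU can be sketched by evaluating the two-control-point model once per 4×4 sub-block. The sketch assumes that v0 and v1 are the control point motion vectors at the top-left and top-right corners of a CU of width w, that the model is evaluated at each sub-block centre, and that floating-point arithmetic is acceptable for illustration; function names are hypothetical.

```python
def affine_mv(x, y, v0, v1, w):
    """Motion vector at position (x, y) from control points v0 (top-left), v1 (top-right)."""
    ax = (v1[0] - v0[0]) / w
    ay = (v1[1] - v0[1]) / w
    vx = ax * x - ay * y + v0[0]
    vy = ay * x + ax * y + v0[1]
    return vx, vy


def subblock_mv_field(v0, v1, w, h, sub=4):
    """One MV per sub x sub block, evaluated at the block centre, rounded to 1/16 sample."""
    field = []
    for by in range(0, h, sub):
        row = []
        for bx in range(0, w, sub):
            vx, vy = affine_mv(bx + sub / 2, by + sub / 2, v0, v1, w)
            row.append((round(vx * 16) / 16, round(vy * 16) / 16))
        field.append(row)
    return field


# Pure translation: both control points carry the same vector.
translation = subblock_mv_field((1.0, 2.0), (1.0, 2.0), 16, 16)
# Zoom-like motion: the top-right control point moves one sample to the right.
zoom = subblock_mv_field((0.0, 0.0), (1.0, 0.0), 16, 16)
```

With identical control points the field degenerates to ordinary translational motion, which is a convenient sanity check for any implementation of the model.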
• Special merge mode based on Frame-Rate Up Conversion (FRUC) techniques
Options for
 Bilateral matching
 Template matching (applicable also for AMVP mode, CU level only)
• Motion vector derivation process
 Initial motion vector for CU of size W × H
 Sub-CU motion refinement for blocks of size M × M
$M = \max\{4, \min\{W / 2^D, H / 2^D\}\}$
JEM: Pattern Matched Motion Vector Derivation (PMMVD)
Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
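The sub-CU size rule M = max{4, min{W / 2^D, H / 2^D}} is easy to tabulate. In this sketch D is treated as a splitting depth with a default of 3; that default is an assumption for illustration and is not stated on the slide.

```python
def fruc_subcu_size(width, height, depth=3):
    """Sub-CU size M for FRUC refinement: max(4, min(W / 2^D, H / 2^D))."""
    return max(4, min(width >> depth, height >> depth))


# Hypothetical CU sizes (powers of two, as used for coding blocks).
sizes = {(w, h): fruc_subcu_size(w, h) for (w, h) in [(128, 128), (64, 64), (128, 64), (16, 32)]}
```

The lower bound of 4 keeps small CUs from being refined on sub-blocks finer than 4×4.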
• Sample-wise motion refinement on top of block-wise motion compensation for bi-prediction
• No extra signaling, applied on 4×4 block basis
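• MVF determined by minimizing difference Δ between points A and B on trajectory by Taylor expansion
$\Delta = I^{(0)} - I^{(1)} + v_x \left( \tau_1 \frac{\partial I^{(1)}}{\partial x} + \tau_0 \frac{\partial I^{(0)}}{\partial x} \right) + v_y \left( \tau_1 \frac{\partial I^{(1)}}{\partial y} + \tau_0 \frac{\partial I^{(0)}}{\partial y} \right)$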
• Limited search window
• Optimized search
 First vertical, then horizontal search
 Memory usage: only access samples
inside block
JEM: Bi-directional optical flow (BIO)
Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• MVs of bi-prediction refined by bilateral template matching process
• Search between bilateral template and reference pictures
⇒ refined MV without further signaling
• Applied only with reference pictures satisfying POC(Ref_i) < POC(curr) < POC(Ref_j), i.e. one reference picture before and one after the current picture in output order
• Not applied if enabled in CU:
 LIC,
 Affine motion,
 FRUC, or
 sub-CU merge candidate
JEM: Decoder-side Motion Vector Refinement (DMVR)
Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
Residual Coding
• Transform block sizes 4×4, 8×8, 16×16, and 32×32
 Integer approximations of the DCT-II transform matrix
• Additionally, integer approximation of 4×4 DST-VII transform matrix
• ’Single-norm’ design per transform block size → simple quantizer implementation
• Not all perfectly orthogonal, leakage below normalization threshold
HEVC Core Transforms
• Quantizer step size Δq derived from quantization parameter QP
• Exponential relation of quantizer step sizes
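• Double step size every 6 QP
$\Delta_q(\mathrm{QP} + 1) = \sqrt[6]{2} \cdot \Delta_q(\mathrm{QP})$
• Definition: Δq = 1 for QP = 4, thereby
$\Delta_{q,0} = \{ 2^{-4/6}, 2^{-3/6}, 2^{-2/6}, 2^{-1/6}, 1, 2^{1/6} \}$
• Quantizer step sizes for given QP
$\Delta_q(\mathrm{QP}) = \Delta_{q,0}(\mathrm{QP} \bmod 6) \cdot 2^{\lfloor \mathrm{QP} / 6 \rfloor}$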
Quantizer Implementation
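The step-size rule can be checked numerically: six base step sizes, selected by QP mod 6 and doubled once per full group of six QP values. This is a floating-point sketch of the relation only; actual codecs use integer scaling tables, and the names below are chosen for illustration.

```python
# Base step sizes for QP = 0..5, normalised so that the step size is 1 at QP = 4.
BASE_STEPS = [2 ** ((k - 4) / 6) for k in range(6)]


def quantizer_step(qp):
    """Quantizer step size for a given QP: base step times 2^floor(QP / 6)."""
    return BASE_STEPS[qp % 6] * 2 ** (qp // 6)


steps = {qp: quantizer_step(qp) for qp in (4, 10, 16, 22)}
```

Each increase of QP by 1 multiplies the step size by the sixth root of 2, roughly a 12% increase.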
• Large block-size transforms with high-frequency zeroing
 Maximum transform size up to 128 × 128
 Coefficients with column / row index > 32 set to 0 if block width > 64 / block height > 64, respectively
• Adaptive multiple core transform (AMT)
 Transform matrices quantized more accurately
 Applicable for block sizes ≤ 64 × 64
 Indicated by CU flag
 Mode-dependent transform-set selection
for intra prediction modes
JEM Transforms
Tables from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• Motivation
 Remaining correlation between coefficients after primary transform!
 Dependency on intra prediction mode!
• Approach: mode-dependent transforms (have been studied as a tool for HEVC)
• MDNSST Structure:
 35×3 non-separable secondary transforms for both 4×4 and 8×8 block size
 3 NSST candidates for each intra prediction mode
 Application of transposed transform blocks for modes > 34
JEM: Mode-Dependent Non-separable Secondary Transforms (MDNSST)
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• Only applied to the low frequency coefficients after the primary transform
 For blocks ≥ 8 × 8, application of 8 × 8 transform to lowest frequency coefficients of primary transform
 For blocks < 8 × 8, application of 4 × 4 transform to lowest frequency coefficients of primary transform
• Implementation by Hypercube-Givens Transform (HyGT)
• Two rounds for 4 × 4, four rounds for 8 × 8 secondary transforms
JEM: Mode-Dependent Non-separable Secondary Transforms (MDNSST)
Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
• Searching 𝑁 similar patches in reconstructed region of picture, based on template
• Scheme of KLT matrix derivation:
 Collection of N prediction residuals: U = (u_1, u_2, …, u_N)
 Covariance matrix Σ = U U^T
 Eigenvectors are KLT bases
• Application of proposed KLT on 4×4, 8×8, 16×16 and 32×32 coding blocks
• Note: Tool not activated in JVET Common Testing Conditions [JVET-G1010]
JEM: Signal dependent transform
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
Loop Filtering
• HEVC deblocking filter also used in JEM
 Filtering at prediction and transform block edges on an 8 × 8 grid
 Independent operation on 8 × 8 blocks possible → parallel processing enabled
• Deblocking filtering
 Boundary processed in 4-sample sections (edges)
 Filter strength determined based on analysis of top
and bottom rows of edge
 Normal: Filtering of maximum two samples into block
 Strong: Up to four samples into block
Deblocking Filter
• HEVC SAO filtering also used in JEM
• Local processing of samples
 Depending on local neighborhood (edge offset)
 Direction signaled, smoothing only
 Depending on sample value (band offset)
 Configurable correction of sample intensity
values for four transition bands
• Operation independent of processed samples
→ parallel processing
• Local filter parameter adaptation
• Four different offset values available (plus SAO off)
• Dedicated SAO parameters for Y, Cb, Cr
 Common SAO mode for chroma components
Sample Adaptive Offset Filter (SAO)
edge offset
band offset
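The band-offset branch of SAO can be sketched as a table lookup on the sample value. The sketch assumes the HEVC layout of 32 equal-width bands, with offsets applied to four consecutive bands starting at a signaled band position; the function and parameter names are hypothetical.

```python
def sao_band_offset(samples, band_position, offsets, bit_depth=8):
    """Add a per-band offset to samples that fall into the signaled bands."""
    shift = bit_depth - 5                  # 32 bands over the full sample range
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        k = (s >> shift) - band_position   # index into the signaled bands
        if 0 <= k < len(offsets):
            s = min(max(s + offsets[k], 0), max_val)  # clip to the valid range
        out.append(s)
    return out


# Hypothetical 8-bit samples; bands 2..5 cover the values 16..47.
corrected = sao_band_offset([15, 16, 24, 33, 40, 48], band_position=2, offsets=[1, -2, 3, 0])
```

Because each output depends only on its own input sample, all samples can be corrected in parallel.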
• First loop filter in the decoding process chain of JEM
• Each luma sample in reconstructed TU is replaced by weighted average of itself and its neighbours within TU
 Sample located at (i, j), neighbouring sample at (k, l)
 I(i, j) and I(k, l): reconstructed intensity values
 σ_d: spatial parameter (transform size, pred. mode)
 σ_r: range parameter (QP)
$\omega(i,j,k,l) = \exp\left( -\frac{(i-k)^2 + (j-l)^2}{2\sigma_d^2} - \frac{\left( I(i,j) - I(k,l) \right)^2}{2\sigma_r^2} \right)$
$I_F(i,j) = \frac{\sum_{k,l} I(k,l) \cdot \omega(i,j,k,l)}{\sum_{k,l} \omega(i,j,k,l)}$
 Integer implementation with look-up table
for division
JEM: Bilateral filter
Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
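The weighted average can be written out directly from the weight definition: a spatial term governed by σ_d and an intensity term governed by σ_r. The sketch below uses floating-point arithmetic instead of the look-up table, and restricts the neighbourhood to the sample itself plus its four direct neighbours inside the block; that neighbourhood shape is an assumption for illustration.

```python
import math


def bilateral_weight(block, i, j, k, l, sigma_d, sigma_r):
    """Weight of neighbour (k, l) when filtering sample (i, j)."""
    spatial = ((i - k) ** 2 + (j - l) ** 2) / (2 * sigma_d ** 2)
    intensity = (block[i][j] - block[k][l]) ** 2 / (2 * sigma_r ** 2)
    return math.exp(-spatial - intensity)


def bilateral_filter(block, sigma_d, sigma_r):
    """Filter every sample of a block; neighbours outside the block are ignored."""
    h, w = len(block), len(block[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            num = den = 0.0
            for k, l in ((i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= k < h and 0 <= l < w:
                    wgt = bilateral_weight(block, i, j, k, l, sigma_d, sigma_r)
                    num += block[k][l] * wgt
                    den += wgt
            out[i][j] = num / den
    return out


flat = bilateral_filter([[100, 100], [100, 100]], sigma_d=1.0, sigma_r=10.0)
edge = bilateral_filter([[10, 10, 200], [10, 10, 200]], sigma_d=1.0, sigma_r=10.0)
```

The intensity term makes the weight across a strong edge negligible, so the filter smooths noise without blurring the edge.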
• Luma component
 25 filters available for each 2×2 block, based on direction and activity of local gradients
 Diamond filter shapes (3 × 3, 5 × 5, 7 × 7)
 Classification into 25 classes, based on
 Activity index
 Directionality index
• Chroma components
 Diamond filter shape 5 × 5
 No classification
 Single set of filter coefficients
• Geometric transformations based on data from classification
 Transpose, vertical flip, rotation
• Filter coefficients signaled with 1st CTU, FIFO buffering for temporal prediction in inter pictures, 16 candidate
sets for intra pictures
JEM: Adaptive loop filter (ALF)
Entropy Coding
• Fixed length and variable length codes (FLC, VLC)
 High-level syntax
 Parameter sets, slice segment header
 SEI messages
 Fixed-length codes, Exp-Golomb codes
• Arithmetic coding
 Slice level, CTUs
 Context-based adaptive coding
 Bypass coding (complexity, throughput)
Entropy Coding
CTU = Coding Tree Unit
SEI = Supplemental Enhancement Information
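As an illustration of the variable-length codes used in the high-level syntax, the unsigned order-0 Exp-Golomb code can be written in a few lines. The sketch works on bit strings for readability; the function names are chosen for illustration.

```python
def exp_golomb_encode(value):
    """Unsigned order-0 Exp-Golomb code of a non-negative integer, as a bit string."""
    code = bin(value + 1)[2:]              # binary representation of value + 1
    return "0" * (len(code) - 1) + code    # prefix of leading zeros, then the code


def exp_golomb_decode(bits):
    """Decode one unsigned order-0 Exp-Golomb codeword from the start of a bit string."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1


codewords = [exp_golomb_encode(v) for v in range(5)]
```

Small values get short codewords, which suits syntax elements whose typical values are close to zero.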
• VCL NAL Unit
 FLC, VLC for header information
 CABAC for CTUs
 Byte alignment in case of multiple tiles, or with wavefront parallel
processing (not present otherwise)
Fixed and Variable Length Coding
NAL = Network Abstraction Layer
VCL = Video Coding Layer
CABAC = Context-based Adaptive Binary Arithmetic Coding
ba = byte alignment
• Arithmetic coding engine
 Binarization
 Context model selection
 Binary arithmetic coding
 Optimized binarization design
 Reduced number of non-bypass
bins compared to H.264 | AVC
• JEM
 Modified context modeling for transform coefficients
 Multi-hypothesis probability estimation with context-dependent updating speed
 Adaptive initialization for context models
Context-Based Adaptive Binary Arithmetic Coding (CABAC)
Part IV: Versatile Video Coding
• Experimental software "Joint Exploration Model" (JEM) developed by JVET
 Intended to investigate potential for better compression beyond HEVC
 Initially started by extending the HEVC software with additional compression tools, or by replacing existing tools
(see previous section)
• Substantial benefit was shown over HEVC, both in subjective quality and objective metrics
 Proven in "Call for Evidence" (July 2017)
 JEM was however not designed for becoming a standard (regarding all design tradeoffs)
 Call for Proposals was issued by MPEG and VCEG (October 2017)
• Call for Proposals very successful (responses received by April 2018)
 32 companies in 21 proponent groups responded
 46 category-specific submissions: 22 in SDR, 12 each in HDR and 360° video
 All responses clearly better than HEVC, some evidently better than JEM
 This marked the starting point for VVC development
Steps towards next generation standard – Versatile Video Coding (VVC)
• Document JVET-H1002
• Test categories
 Standard dynamic range (SDR): 5 UHD and 5 HD sequences
 High dynamic range (HDR): 3 HLG and 5 PQ sequences
 360° video (360): 5 sequences in ERP format
• Constraint sets
 Constraint set 1 (C1): Random access configuration
 Max 1.1s random access intervals, structural delay max 16 pictures
 Constraint set 2 (C2): Low delay configuration only evaluated for SDR HD sequences
 No picture reordering between input and output
• Encoding constraints
 No pre-processing, post-processing only within the coding loop
 Static quantizer setting with one-time change to meet target bitrate
 Relevant optimization methods to be reported
Joint Call for Proposals (CfP) on Video Compression with Capability beyond HEVC
UHD = Ultra High Definition, HD = High Definition, HLG = Hybrid Log Gamma, PQ = Perceptual Quantization (ITU-R BT.2100), ERP = Equirectangular Projection
• SDR-A: 3840×2160
 FoodMarket4 60p, CatRobot1 60p, DaylightRoad2 60p, ParkRunning3 50p, Campfire 30p
• SDR-B: 1920×1080
 BasketballDrive 50p, Cactus 50p, BQTerrace 60p, RitualDance 60p, MarketPlace 60p
• HDR (PQ HD, HLG 4K)
 PQ: Market3 HD50p, Hurdles HD50p, Starting HD50p, ShowGirls2 HD25p, Cosmos1 HD24p
 HLG: DayStreet 60p, PeopleInShop..., SunsetBeach 60p
• 360 Video (8K, 6K)
 ChairliftRide 30p, KiteFlite 30p, Harbor 30p, Trolley 30p, Balboa 60p
VVC CfP Test Sequences
• Category-specific submissions (total 46):
 SDR: 22 submissions (8 of which are registered only in this category)
 HDR: 12 submissions
 360°: 12 submissions (2 of which are registered only in this category)
For all categories: HEVC anchors (HM) and JEM anchors
• Proposals
 Described in JVET input documents JVET-J0011...JVET-J0033
 Participation of 32 institutions
VVC CfP Responses
JVET documents available at http://phenix.it-sudparis.eu/jvet
• Submissions had to provide coded/decoded sequences
 4 rate points each, two constraint conditions "low delay" (LD) and "random access" (RA)
 SDR: 5x HD (both LD and RA), 5x UHD-4K (only RA)
 HDR: 5x HD (PQ grading), 3x UHD-4K (HLG grading)
 360°: 5 sequences 6K/8K for the full panorama
• Double stimulus test with two hidden anchors HEVC-HM & JEM
 Rate points were defined such that the lowest rate typically gave less than "fair" quality for HEVC, but was still possible to code
 Quality was judged to be distinguishable when confidence intervals were non-overlapping
• Evaluation: Three ways of judging benefit:
 Mean MOS over all test cases (28x4 test points: 23x4 C1, 5x4 C2 )
 Count cases where a proposal was visually better/worse than JEM
 Count cases where a proposal was visually better than HEVC (HEVC at higher rate point)
• Reports: Input subjective test [JVET-J0080], output CfP results [JVET-J1003]
Performance
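The significance count against JEM can be sketched as follows: a test point earns +1 if the proposal's confidence interval lies entirely above that of JEM, -1 if it lies entirely below, and 0 if the intervals overlap. The data layout (mean opinion score plus confidence-interval half-width per test point) and all values are hypothetical.

```python
def significance_credit(proposal, anchor):
    """Credit for one test point; each argument is (mos, ci_half_width)."""
    p_lo, p_hi = proposal[0] - proposal[1], proposal[0] + proposal[1]
    a_lo, a_hi = anchor[0] - anchor[1], anchor[0] + anchor[1]
    if p_lo > a_hi:
        return 1      # visually better: intervals do not overlap
    if p_hi < a_lo:
        return -1     # visually worse: intervals do not overlap
    return 0          # not distinguishable


def total_credit(proposal_points, anchor_points):
    """Sum of credits over all test points (sequence and rate point pairs)."""
    return sum(significance_credit(p, a) for p, a in zip(proposal_points, anchor_points))


# Hypothetical MOS values for four test points of one proposal and the JEM anchor.
proposal_points = [(7.0, 0.4), (6.2, 0.4), (7.5, 0.3), (5.0, 0.3)]
jem_points = [(6.0, 0.4), (6.0, 0.4), (6.5, 0.3), (6.0, 0.3)]
credit = total_credit(proposal_points, jem_points)
```

With 28 sequences at 4 rate points each, the total credit over all categories can range from -112 to +112.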
• Measured by objective performance (PSNR), best performers report >40% bit rate reduction compared
to HEVC, >10% compared to JEM (for SDR case)
 Similar ranges for HDR and 360°
 Obviously, proposals with more elements show better performance
 Some proposals showed similar performance as JEM with significant complexity/run time reduction
 2 proposals used some degree of subjective optimization, not measurable by PSNR
• Results of subjective tests generally show similar (or even better) tendency
 Benefit over HEVC very clear
 Benefit over JEM visible at various points
 Proposals with subjective optimization also showing benefit in some cases
Performance
• JVET-J1003: Report of subjective evaluation contains 28 plots as shown, one per sequence
• Count significant cases of positive/negative benefit with non-overlapping confidence interval against JEM
Performance
[Figure: example plot from JVET-J1003 with HM and JEM anchors; proposals ranked by MOS (per rate point); annotations: +1 credit, -1 credit]
• "Mean" and "significance-count" method suggested at least 7 proposals that were obviously better than JEM
Performance SDR
SDR: proposals ranked separately per criterion (rows do not pair the same proposal across columns)

Rank  Mean MOS     Significance vs. JEM (-60 ... +60)
 1    Pxx 6.53     Pxx  10
 2    Pxx 6.46     Pxx   8
 3    Pxx 6.41     Pxx   8
 4    Pxx 6.37     Pxx   6
 5    Pxx 6.33     Pxx   6
 6    Pxx 6.33     Pxx   6
 7    Pxx 6.26     Pxx   6
 8    Pnn 6.23     Pnn   3
 9    Pnn 6.17     Pnn   3
10    Pnn 6.15     Pnn   2
11    Pnn 6.13     Pnn   2
12    Pnn 6.11     Pnn   1
13    Pnn 6.04     Pnn   1
14    Pnn 6.04     JEM   0
15    Pnn 6.03     Pnn   0
16    Pnn 6.03     Pnn  -1
17    Pnn 6.01     Pnn  -1
18    JEM 6.01     Pnn  -1
19    Pnn 6.00     Pnn  -2
20    Pnn 5.96     Pnn  -2
21    Pnn 5.94     Pnn  -2
22    Pnn 5.88     Pnn  -3
23    Pnn 5.86     Pnn  -4
24    HM  4.57     HM  -36
• Similar tendency in HDR and 360° categories
• Mostly same coding tools as in SDR provide good benefit
Performance HDR / 360°
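HDR: proposals ranked separately per criterion (rows do not pair the same proposal across columns)

Rank  Mean MOS     Signif. vs. JEM (-32 ... +32)
 1    Pxx 6.04     Pxx   7
 2    Pxx 6.00     Pxx   3
 3    Pxx 5.94     Pxx   2
 4    Pxx 5.93     Pxx   2
 5    Pxx 5.86     Pxx   2
 6    Pnn 5.85     Pnn   1
 7    Pnn 5.80     Pnn   1
 8    Pnn 5.67     JEM   0
 9    JEM 5.62     Pnn   0
10    Pnn 5.60     Pnn   0
11    Pnn 5.59     Pnn  -1
12    Pnn 5.45     Pnn  -1
13    Pnn 5.11     Pnn  -6
14    HM  4.14     HM  -20

360°: proposals ranked separately per criterion (rows do not pair the same proposal across columns)

Rank  Mean MOS     Signif. vs. JEM (-20 ... +20)
 1    Pxx 6.20     Pxx   9
 2    Pxx 6.19     Pxx   9
 3    Pxx 6.06     Pxx   8
 4    Pxx 6.03     Pnn   7
 5    Pxx 5.99     Pxx   7
 6    Pxx 5.96     Pxx   6
 7    Pxx 5.86     Pxx   5
 8    Pnn 5.69     Pxx   4
 9    Pnn 5.67     Pnn   2
10    Pnn 5.51     Pnn   1
11    Pnn 5.45     Pnn   1
12    JEM 5.11     JEM   0
13    HM  3.79     HM   -9
14    Pnn 3.45     Pnn -12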
• How often are best performing proposals better than HEVC at higher rate?
• Note: R11 Mbit/s; R2 1.6 Mbit/s; R3 2.8 Mbit/s; R4 4.6 Mbit/s
Performance compared to HEVC
Pbest vs HM    R1 vs R2   R1 vs R3   R1 vs R4   R2 vs R3   R2 vs R4   R3 vs R4
SDR UHD        60%        40%        0%         80%        0%         20%
SDR HD/RA      40%        0%         0%         20%        0%         20%
SDR HD/LD      40%        0%         0%         0%         0%         0%
HLG            67%        0%         0%         67%        0%         33%
PQ             40%        0%         0%         40%        0%         20%
360°           40%        20%        0%         20%        0%         60%
Rate saving    ≈ 37.5%    ≈ 65%      ≈ 78%      ≈ 43%      ≈ 65%      ≈ 39%
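The rate-saving row follows directly from the rate points given in the note: comparing a proposal at the lower rate against HEVC at the higher rate saves 1 minus the ratio of the two rates. A short check, using the four rate points as stated on the slide:

```python
# Rate points from the note, in Mbit/s.
RATES = {"R1": 1.0, "R2": 1.6, "R3": 2.8, "R4": 4.6}


def rate_saving(lower, higher):
    """Percentage of bit rate saved when 'lower' matches the quality of 'higher'."""
    return 100.0 * (1.0 - RATES[lower] / RATES[higher])


pairs = [("R1", "R2"), ("R1", "R3"), ("R1", "R4"), ("R2", "R3"), ("R2", "R4"), ("R3", "R4")]
savings = {a + " vs " + b: round(rate_saving(a, b), 1) for a, b in pairs}
```

The R2 vs R4 comparison yields about 65%, the same saving as R1 vs R3.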
• How often is HEVC better than best performing proposals at lower rate?
- Note: 100% - xx% is the share of cases in which the best performing proposal is equal or better
• Note: R1: 1 Mbit/s; R2: 1.6 Mbit/s; R3: 2.8 Mbit/s; R4: 4.6 Mbit/s
Performance compared to HEVC
HM vs Pbest    R1 vs R2   R1 vs R3   R1 vs R4   R2 vs R3   R2 vs R4   R3 vs R4
SDR UHD        0%         0%         60%        0%         0%         0%
SDR HD/RA      0%         60%        100%       0%         80%        0%
SDR HD/LD      0%         60%        80%        0%         80%        0%
HLG            0%         0%         100%       0%         67%        0%
PQ             0%         60%        100%       0%         60%        0%
360°           0%         40%        80%        0%         40%        0%
Rate saving    ≈ 37.5%    ≈ 65%      ≈ 78%      ≈ 43%      ≈ 65%      ≈ 39%
• The subjective quality of best performing proposals is always equal or sometimes better (~1/3 of cases) than
HEVC at next higher rate point, over all categories (with approx. 40% less rate)
• The subjective quality of best performing proposals is always equal or sometimes better (~1/5 of cases) than
HEVC at 2nd higher rate point, in SDR-UHD category (with approx. 65% less rate)
• Though it is not always the same proposal that performs best at a given rate point, it can be anticipated that
merits of different proposals can be combined
• 50% (or more) bit rate reduction with same quality will probably be achievable by the new standard
Performance compared to HEVC
• New elements (some come with high complexity):
 Decoder side estimation for mode/MV derivation and sample prediction both in intra and inter coding (JEM)
 Finer partitioning: Asymmetric, geometric
 Neural networks for prediction, loop filtering, upsampling, (encoder control)
 Additional elements using template matching
 Intra block copy / current picture referencing
 Additional non-linear, de-noising and statistics-based loop filters
 Additional linear and non-linear elements in prediction
• HDR specific:
 New adaptive reshaping and quantization, also in-loop
 HDR-specific modifications of existing tools, e.g. deblocking
• 360-video specific:
 Variants of projection formats, geometry-corrected face boundary padding
 Modification and disabling of existing tools at face boundaries
CfP analysis: What was proposed?
• VVC Working Draft 1 / Test Model 1 (VTM1): basic approach
built on "reduced HEVC" starting point
• VTM Block structure
 Unified tree (coding block unites prediction and transform)
 CTU size 128x128, rectangular blocks (dyadic sizes),
smallest luma size 4x4
 Maximum transform size 64x64
• VTM: Some removed elements of HEVC:
 Mode dependent transform (DST-VII), mode dependent scan
 Strong intra smoothing
 Sign data hiding in transform coding
 Unnecessary high-level syntax (e.g. VPS)
 Tiles and wavefront
 Quantization weighting
VVC Working Draft and Test Model 1
• Report of Results from the Call for Proposals on Video Compression with Capability beyond HEVC
[JVET-J1003]
 Documentation of results per sequence, marking HM and JEM anchors, not identifying individual proponents
 Assessment of qualitative (and as far as possible quantitative) benefit of submitted technology compared to
anchors
• Working Draft 1 of Versatile Video Coding [JVET-J1001]
 "Reduced" HEVC plus quad/binary/ternary tree structure
• Test Model 1 of Versatile Video Coding (VTM 1) [JVET-J1002]
 Corresponding encoder and algorithm description
Documents issued after CfP Results
• Benchmark Set (BMS) was defined in addition to VTM, including the following well-known JEM tools:
• 65 intra prediction modes
• Coefficient coding
• AMT + 4x4 NSST
• Affine motion
• Geometry based adaptive loop filter
• Subblock merge candidate (ATMVP)
• Adaptive motion vector precision
• Decoder motion vector refinement
• LM Chroma mode
• Purpose: Testing benefit of technology against better performing set
 Holds additional candidate features whose benefit is not yet certain
 Superset of VTM; should have significant gain over the VTM
 Reveals in CEs whether gains are independent, or how much gain remains when a tool is combined with a set of better-performing tools
 Can be a common basis for further CE tests of modified versions of features
 Not necessarily ultra-low complexity, but encoder needs to be runnable in reasonable amount of time
Benchmark Set and its role
• The only fundamental new element of version 1
• Simple multi-type tree split, can be alternated
Quad/binary/ternary partitioning
Example:
Figures from: JVET-J1001
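The split geometry can be sketched as follows. This is a minimal illustration, assuming binary splits halve the block and ternary splits use a 1:2:1 ratio; the function name `split_block` and the mode labels are hypothetical and not taken from the working draft.

```python
def split_block(x, y, w, h, mode):
    """Return the sub-blocks (x, y, w, h) produced by one split of a tree node."""
    if mode == "QT":    # quadtree: four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_V":  # binary split with a vertical split line
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "BT_H":  # binary split with a horizontal split line
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "TT_V":  # ternary 1:2:1 split with vertical split lines
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_H":  # ternary 1:2:1 split with horizontal split lines
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown split mode: {mode}")


# Alternating split types: quad split of a 128x128 CTU,
# then a ternary split of the first 64x64 quadrant.
quads = split_block(0, 0, 128, 128, "QT")
parts = split_block(*quads[0], "TT_V")
```

All resulting block sizes stay dyadic, which is why such splits can be nested freely.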
• PSNR-based Common Test Conditions (CTC) BD-Rate savings relative to HEVC reference software (10 bit)
• Note that the gain over HEVC under CTC is lower than with the CfP test set (other sequences, higher rates, lower resolutions)
Performance of VTM1 and initial BMS compared to HEVC
vs HM16.18     VTM     BMS
4k UHD         10%     28%
1080p           8%     22%
WVGA            6%     19%
Average         8%     23%
Decode time    0.8×    2×
Encode time    2×      9×
• Working Draft 2 of Versatile Video Coding [JVET-K1001]
 Normative text specification
 No descriptive text of building blocks "borrowed" from HEVC: These would anyway be placeholders which
are likely to be replaced later
 Starting from this meeting, precise specification of more substantial newly adopted building blocks is being
added (see subsequent slides)
• Test Model 2 of Versatile Video Coding (VTM 2) [JVET-K1002]
 Encoder and algorithm description
 Has corresponding software implementation
Latest status (from last week)
• QT/BT/TT no longer “placeholder”
• Remove unnecessary partitioning restrictions
• Implicit splitting at picture boundaries
• Separate trees for intra slices
• Position Dependent Prediction Combination
• Cross Component Linear Model
• 87 intra modes (wide angles included), 3 MPM, TU binarization
• Affine MC (4x4 fixed subblock size, 4/6 parameter model switching at CU level)
• Affine MV coding
 candidate list construction uses inheritance and spatial/temporal derivation
 improved difference coding
• Adaptive motion vector resolution (AMVR)
• Subblock MC (4x4) from ATMVP merge, 8x8 granularity motion vector storage [High precision]
Latest status (from last week): New elements of WD2 / VTM2
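The 4/6 parameter affine models can be sketched as below. This is an illustration under assumptions: motion vectors are evaluated at subblock centres in floating point, and the fixed-point precision and rounding of the draft are omitted. The function name `affine_subblock_mvs` and the control-point names `v0` (top-left), `v1` (top-right), `v2` (bottom-left) are hypothetical.

```python
def affine_subblock_mvs(v0, v1, width, height, v2=None, sub=4):
    """Derive one motion vector per sub x sub subblock from control-point MVs.

    v0, v1: MVs at the top-left and top-right block corners.
    v2: MV at the bottom-left corner (6-parameter model only).
    """
    ax = (v1[0] - v0[0]) / width   # horizontal gradient of mv_x
    ay = (v1[1] - v0[1]) / width   # horizontal gradient of mv_y
    if v2 is None:
        # 4-parameter model (rotation and zoom): the vertical gradients
        # are tied to the horizontal ones.
        bx, by = -ay, ax
    else:
        # 6-parameter model: independent vertical gradients.
        bx = (v2[0] - v0[0]) / height
        by = (v2[1] - v0[1]) / height
    mvs = {}
    for y in range(0, height, sub):
        for x in range(0, width, sub):
            cx, cy = x + sub / 2, y + sub / 2   # subblock centre
            mvs[(x, y)] = (ax * cx + bx * cy + v0[0],
                           ay * cx + by * cy + v0[1])
    return mvs


# Pure zoom on a 16x16 block with the 4-parameter model.
zoom = affine_subblock_mvs((0.0, 0.0), (8.0, 0.0), 16, 16)
```

With identical control-point vectors both models reduce to ordinary translational motion.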
• Multiple transform selection (all are DCT/DST types) for intra and inter
• Increase max QP from 51 to 63
• Modified entropy coding supporting dependent quantization
• Sign data hiding re-introduced from HEVC
• Adaptive loop filter
 classification on 4x4 blocks (gradient strength and orientation) for luma
 7x7 luma and 5x5 chroma filters
 enabling flag at CTU level
• Basic high-level syntax (SPS, PPS, slice)
• Update of BMS contains
 generalized Bi prediction (kind of local weighted prediction)
 Decoder-side estimation: BIO, simplified bilateral matching
 Current picture referencing (aka intra block copy)
Latest status (from last week): New elements of WD2 / VTM2
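To put the extended QP range into perspective, here is a small worked example. It assumes that the HEVC relation between QP and quantizer step size, Qstep ≈ 2^((QP − 4) / 6), carries over unchanged; the slide does not state this, and the helper name `qstep` is hypothetical.

```python
def qstep(qp):
    """Approximate quantizer step size, assuming the HEVC QP-to-step relation."""
    return 2.0 ** ((qp - 4) / 6.0)


# Raising the maximum QP from 51 to 63 adds 12 QP steps,
# i.e. two doublings of the step size.
ratio = qstep(63) / qstep(51)
```

Under this assumption the coarsest available quantizer becomes four times coarser, which extends the reachable range towards lower bit rates.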
• For rectangular blocks, prediction directions with angles beyond 45/135 degrees are reasonable
• This can be implemented by adding modes at both ends
• VTM2 uses a total of 85 directional intra modes now
(plus DC and planar)
Wide angular modes
Figures from JVET-K0500
• Alternating between two quantizers based on a state transition rule makes it possible to select an optimum sequence of reconstruction values (e.g. by trellis-like search)
• The decoder needs to implement the sequential state transition rule
• CABAC contexts need to be modified as well for this case
(greater than 0/1/2/... would have a different meaning depending on Q0/Q1)
Dependent quantization
[Figure: state machine with four states and a marked start state; Q0 is used in states 0 and 1, Q1 in states 2 and 3; transitions depend on the parity (k & 1) of the quantization index k]
State transition table:
current state    next state for (k & 1) == 0    next state for (k & 1) == 1
0                0                              2
1                2                              0
2                1                              3
3                3                              1
[Figure: reconstruction levels of the two quantizers Q0 and Q1 on a grid from -9Δ to 9Δ]
Figures from JVET-K0071
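The decoder-side state machine can be sketched as follows. The transition table is the one given above. The reconstruction rule (Q0 uses even multiples of Δ, Q1 uses odd multiples of Δ plus zero) and the start state 0 are assumptions for illustration, and the function name `dequantize` is hypothetical.

```python
# NEXT_STATE[state][k & 1], taken from the state transition table.
NEXT_STATE = [(0, 2), (2, 0), (1, 3), (3, 1)]


def dequantize(levels, delta, start_state=0):
    """Reconstruct coefficients from quantization indices k, in coding order."""
    state = start_state          # assumed start state
    recon = []
    for k in levels:
        sign = (k > 0) - (k < 0)
        if state < 2:
            # Q0 (states 0 and 1): even multiples of delta (assumed rule)
            multiple = 2 * k
        else:
            # Q1 (states 2 and 3): odd multiples of delta, plus zero (assumed rule)
            multiple = 2 * k - sign
        recon.append(multiple * delta)
        # The parity of the current index selects the next state.
        state = NEXT_STATE[state][k & 1]
    return recon


values = dequantize([1, 3, -2], delta=1)
```

Because the quantizer for each coefficient depends on all previous indices, an encoder cannot quantize coefficients independently, which is where the trellis-like search comes in.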
• Ongoing investigations on
 Improved merge, intra prediction, etc.
 Decoder-side estimation with low complexity
 Multi-hypothesis prediction and OBMC
 Diagonal and other geometric partitioning
 Secondary transforms
 New approaches of loop filtering, reconstruction and prediction filtering
(denoising, non-local, diffusion based, bilateral, etc.)
 Current picture referencing, template matching, palette mode
 Neural networks for loop filtering and prediction
• Core experiments (CE) process
 coordinated effort to investigate performance, complexity impact of proposed elements
 typically based on a specific technology proposed, or combination of several technologies
 allows detailed study / cross-checks by other interested parties
 allows identifying which elements of a proposal are useful, whether it is not useful at all, or whether further improvements are needed
Further promising fields
• Motivation: Towards object-oriented coding
 Follow object boundaries more closely
 Fewer coding artifacts where it matters
• Prediction, transform and coding driven by actual object
shape under RD-constraint
 Inter- and intra-predicted segments for handling of
disocclusions
 Overlapped wedge based filtering at partition boundary
 Shape-adaptive DCT for spatially localized transform
coding
Geometric Partitioning (GEO)
Source: M. Bläser, J. Sauer, and M. Wien, “Description of SDR and 360° video coding technology proposal by RWTH Aachen University,” Doc. JVET-J0023, Joint Video Experts Team of ITU-T VCEG and ISO/IEC MPEG, San Diego, USA, 10th meeting, Apr. 2018
• GEO available for all block sizes ≥ 8×8 luma samples
• Partitioning is represented by two coordinate points 𝑃0 and 𝑃1 on the block boundary
• Prediction of two coordinate points 𝑃0 and 𝑃1 from 16 pre-defined templates (scaled for non-square blocks)
 Alternative: Spatial or temporal prediction
 Refinement: block size dependent offset
• Integration with AMVP, MERGE, FRUC
(no AFFINE (yet))
GEO: Partitioning Coding and Prediction
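How a partition line through two boundary points can drive the prediction is sketched below. This is a simplified illustration: the linear ramp of width `blend` is a stand-in for the overlapped wedge-based filtering of the proposal, not its actual filter, and the function names `geo_weights` and `blend_predictions` are hypothetical.

```python
import math


def geo_weights(width, height, p0, p1, blend=1.0):
    """Per-sample weight of segment 0 for a partition line through p0 and p1."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    norm = math.hypot(dx, dy)
    weights = []
    for y in range(height):
        row = []
        for x in range(width):
            # Signed distance of the sample centre from the partition line.
            d = (dx * (y + 0.5 - p0[1]) - dy * (x + 0.5 - p0[0])) / norm
            # Linear ramp from 0 to 1 across the boundary (illustrative filter).
            row.append(min(1.0, max(0.0, 0.5 + d / (2.0 * blend))))
        weights.append(row)
    return weights


def blend_predictions(pred0, pred1, weights):
    """Combine the two segment predictions sample by sample."""
    return [[w * a + (1.0 - w) * b for a, b, w in zip(r0, r1, rw)]
            for r0, r1, rw in zip(pred0, pred1, weights)]


# Diagonal partition of an 8x8 block from the top-left to the bottom-right corner.
w = geo_weights(8, 8, (0, 0), (8, 8))
```

Samples far from the line take their value from a single segment; only samples near the boundary are mixed.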
Results for GEO
JEM 7.0 JEM 7.0 + GEO
• Visual improvements at object boundaries
 Sharper contours
 Less staircase-effect
 More background details
• Objective gains (BD-rate savings)
 Against HEVC: ~33% on C1, ~25% on C2
 Against JEM: ~0.8% for both, C1 and C2
• CE1: Partitioning
• CE2: Adaptive loop filter
• CE3: Intra prediction and mode coding
• CE4: Inter prediction and MV coding
• CE5: Arithmetic coding engine
• CE6: Transforms and transform signalling
• CE7: Quantization and coefficient coding
• CE8: Current picture referencing
• CE9: Decoder side MV derivation
• CE10: Combined and multi-hypothesis prediction
• CE11: Deblocking
• CE12: Mapping for HDR content
• CE13: Coding tools for omnidirectional video
• CE14: Post-reconstruction filtering
• CE15: Palette mode
Current Core Experiments
• Technically similar elements to HEVC/JEM/VVC or JVET study
 Partitioning: 128x128 "superblock" with equivalent to quad/binary sub-splits (no 1:2:1 ternary)
 Directional intra prediction, 56 directional modes, DC and "true motion" mode
 Chroma from luma prediction
 Intra block copy
 Up to 7 reference frames (allows similar structure to hierarchical B)
 Spatial/temporal motion vector referencing
 Affine motion compensation (pixel based)
 OBMC
 DCT/DST based transforms, and skip
 Adaptive arithmetic coder
 Context-based transform coefficient coding
 Film grain synthesis
 Adaptive loop filter (Wiener like)
 Deblocking
AOM's AV1
• Other elements
 Recursive-filtering intra predictor
 Prediction based on color palette
 Wedge-based prediction, 16 diagonal/asymmetric modes for square/rectangular blocks, similar to GEO
 Difference-modulated prediction (based on difference between two references)
 Contrast enhancement/deringing loop filter
 Self-guided filter (somewhat similar to bilateral & diffusion filters)
 Super-resolution coding mode (with coding at lower res.)
• Performance
 Owners report 20% average bit rate reduction (PSNR based)
compared to X.265-style HEVC encoder, set of full HD sequences
 Other reports indicate much less gain, or even losses compared
to HM encoder (using sequences from JVET's CTC)
 According to the same reports, JEM performs significantly better than AV1
 Some of those may not have used the newest JEM version, though
AOM's AV1
Part V: Exploratory trends and perspectives
• PSNR is mostly used for video quality assessment
 targets pixel fidelity, which does not necessarily reflect subjective quality
• Specific artifacts produced by video codecs:
 blockiness, blur and banding
 motion jerkiness
 time-varying edge noise ("mosquito effect")
• Alternative metrics may be clustered into
 full reference quality metrics
 reduced reference quality metrics
 no-reference quality metrics
• Note that also subjective testing methods require some reference (e.g. impairment compared to original or
another anchor)
 full reference metrics are most reliable and are also typically used for encoder decisions
• Note: Subsequent slide gives an example (SSIM) – not claimed that this is the best!
Quality metrics
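For reference, PSNR as a pixel-fidelity measure can be computed as in the sketch below. It assumes flat sequences of integer samples and a peak value of 2^bit_depth − 1; the function name `psnr` is hypothetical.

```python
import math


def psnr(ref, rec, bit_depth=8):
    """PSNR in dB between reference and reconstructed samples."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf              # identical signals
    peak = (1 << bit_depth) - 1      # 255 for 8 bit, 1023 for 10 bit
    return 10.0 * math.log10(peak * peak / mse)


value = psnr([100, 100], [101, 99])   # MSE of 1 at 8 bit
```

The measure depends only on the mean squared error, so two reconstructions with very different visual artifacts can obtain the same PSNR.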
• Example of another full-reference metric which better matches subjective quality, at least for images
• Structural SIMilarity Index (SSIM) [Wang et al. 2004] measures the structural distortion by exploring three components: luminance, contrast and structural changes.
 Luminance: $l(x,y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$
 Contrast: $c(x,y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$
 Structure comparison: $s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$
 Combination: $\mathrm{SSIM}(x,y) = [l(x,y)]^{\alpha} \cdot [c(x,y)]^{\beta} \cdot [s(x,y)]^{\gamma}$
• Numerous variants:
 Computation separately for regions
 Weighting by amount of motion and frame averaging for video
 Computation in complex wavelet domain for frequency weighting (MS-SSIM, multi-scale)
Perceptually adapted quality metrics example: SSIM
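The three SSIM components can be computed as in the following sketch. It is a simplification under assumptions: statistics are taken over the whole signal rather than over local windows, and the constants follow the common choice C1 = (0.01 L)^2, C2 = (0.03 L)^2, C3 = C2 / 2 for dynamic range L. The function name `ssim_global` is hypothetical.

```python
import math


def ssim_global(x, y, data_range=255.0, alpha=1.0, beta=1.0, gamma=1.0):
    """SSIM from global statistics of two equally long sample sequences."""
    n = len(x)
    c1 = (0.01 * data_range) ** 2    # assumed constants, see lead-in
    c2 = (0.03 * data_range) ** 2
    c3 = c2 / 2.0
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sigma_x, sigma_y = math.sqrt(var_x), math.sqrt(var_y)
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2 * sigma_x * sigma_y + c2) / (var_x + var_y + c2)
    # The structure term is negative for anti-correlated signals;
    # the exponents are kept at 1 here so the power stays real-valued.
    struct = (cov_xy + c3) / (sigma_x * sigma_y + c3)
    return (lum ** alpha) * (con ** beta) * (struct ** gamma)


ramp = list(range(64))
same = ssim_global(ramp, ramp)
flipped = ssim_global(ramp, ramp[::-1])
```

The flipped ramp has the same mean and variance as the original, so luminance and contrast terms are 1 and only the structure term reports the mismatch.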
• Textures with large amount of detail and/or motion are often extremely challenging for video codecs
• On the other hand, the exact pixel-wise appearance is largely irrelevant for human observers, whereas
degradation of visual quality is critical
• Textures in videos can be static or dynamically changing over time
 Static textures are basically rigid (but may be moving globally)
 Dynamic textures have a high amount of irregular local motion
 Examples: water, smoke, head-and-shoulder sequences
• Both categories should have some stationarity properties in space and/or time, for allowing modelling as
random process expressed by parametric description – examples:
 Spectral properties
 Moments (marginal statistics and covariance statistics)
 Random field models
• In case of dynamic texture, modelling the motion properties is relevant as well; the motion can also be understood as a random field with a certain amount of variation
Perceptual coding: Texture analysis and synthesis
• Example below is based on a parametric statistical description in complex wavelet domain (steerable
pyramid), with lowpass baseband and four directional orientations in bandpass layers
[Portilla, Simoncelli 2000]
• Efficient coding of parameters needed for synthesis by [Thakur, Ray 2016]
• Marginal statistics expressed as scalar values
• Auto and cross correlation statistics compressed via DCT
Static texture synthesis
Reference HEVC Intra Coding 0.223bpp Thakur et al. 0.213bpp
[Figure: block diagram of the analysis and synthesis stages. Analysis block labels: T original frames; dense OF between adjacent frames; MVF MV T(i,j); analyse motion distribution; MCM M; discard non-probable MV combinations; compressed MCM Mc; discard intermediate frames. Synthesis block labels: derive motion vectors; MVF MV T'(i,j); invert MVF; synthesized MVF; frame warping and blending; T-2 synthesized frames]
Source: Chubach et al. 2017
Dynamic texture synthesis method
[Figure panels: HEVC; 6 of 8 frames synthesized]
Dynamic texture synthesis vs. HEVC at same rate
• Recently, many signal processing tasks are solved by employing machine learning, deep learning and
convolutional neural networks (CNN)
• Advantages for video compression could be as follows:
• Systematic approach of optimizing with big data sets (rather than hand-crafted design)
• Detection and exploitation of nonlinear dependencies in images and video
• Inclusion of perceptual criteria by mimicking human observer behaviour
• On the downside, both training and running CNN algorithms, e.g. for encoder decisions or at the decoder, may be overly complex
• Types of NN that have been proposed for image/video compression
• Autoencoders
• Adversarial networks
• Recurrent networks, particularly based on LSTM (long short-term memory) elements
Learning based approaches: Overview
• An autoencoder is a deep (convolutional) neural network with a sparse hidden layer that represents the code
• The encoder typically performs subsequent filtering and downsampling steps on input x per layer (note
conceptual similarity with transform coding!)
• The decoder performs complementary upsampling steps and generates output y
• Encoder and decoder are trained jointly
such that
• Difference between x and y
is minimized w.r.t. some distortion
• Code z is as sparse (minimum amount
of information) as possible
• Use Bayes formula P(z|x) ∝ P(x|z)P(z) and minimize the Kullback-Leibler divergence of conditional probabilities to achieve the latter [Kingma, Welling 2014]
Convolutional Neural Networks: Autoencoders (AE)
Source: Wikipedia
x y
z=F(x) y=G(z)
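One common way to write the joint training objective, following the variational formulation of [Kingma, Welling 2014], is given below. The notation is assumed here rather than taken from the slide: q(z|x) denotes the code distribution produced by the encoder F, p(x|z) the likelihood modelled by the decoder G, and p(z) the prior on the code.

```latex
\mathcal{L}(x) =
  \underbrace{-\,\mathbb{E}_{q(z|x)}\big[\log p(x|z)\big]}_{\text{distortion between } x \text{ and } y}
  \;+\;
  \underbrace{D_{\mathrm{KL}}\big(q(z|x)\,\|\,p(z)\big)}_{\text{information carried by the code } z}
```

Minimizing the first term makes the output match the input; minimizing the second term keeps the code close to the prior, which limits the amount of information it carries.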
• Generator net G generates samples y from random variables z (G would be the decoder, z the code)
• Discriminator net D decides whether the samples could match with real-world images x which stem from an
unknown distribution P(x)
• Generator and discriminator nets are trained iteratively, optimizing a value function V
• Minimax optimization:
• Train D such that V is maximized
• Train G such that V is minimized
• Problem: There is no corresponding
mapping from x to z (no encoder)
• Solution (e.g. [Santurkar et al. 2017]):
Combination AE and GAN, i.e. train
F(x) from AE joint with G(z) and D(⋅)
Convolutional Neural Networks: Generative Adversarial Networks (GAN)
Source: Slideshare.net – K. McGuinness
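A minimal statement of the value function V, assuming the standard GAN formulation (which matches the minimax training described above; P(z) denotes the distribution of the random code and is notation assumed here):

```latex
\min_{G}\,\max_{D}\; V(D,G) =
  \mathbb{E}_{x \sim P(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim P(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is rewarded for assigning high scores to real images x and low scores to generated samples y = G(z), whereas G is rewarded when D cannot tell its samples from real ones.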
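[Figure: random code z is fed to the generator G(z), which outputs y; the discriminator evaluates D(x) for real images x or D(y) for generated samples y]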
• Variable-rate and variable-size coding not straightforward
• Option to operate over small patches / blocks
• Train separately for different content complexity
• Code residual differences
• Cost functions for rate distortion optimization not straightforward to implement
• Option to re-formulate rate constraint as energy minimization problem
• Hybrid solutions where conventional entropy coding is operated after network output at encoder
• None of these solutions is guaranteed to lead to a consistent optimum, and they may need to be driven by some external decision mechanism
Convolutional Neural Networks: General problems and possible solutions
• Autoencoder could be interpreted as a monolithic non-linear transform (though operating with local kernels)
– see previously used notation in light green below
• A similar approach is proposed in [Ballé et al. 2017], with additional criteria for rate distortion optimization
and quantization / entropy coding on the sparse representation (called y here)
• Perceptual optimization based on nonlinear "generalized divisive normalization" and L2 norm minimization in
nonlinear space
• Authors report significant improvement on detail structures, also improved MS-SSIM compared to conventional codecs – transform optimized based on cost criterion below:
Trained non-linear transforms
[Figure annotations mapping the previously used notation onto the diagram: (x), (y), (z), F(x), G(z'), (z')]
Source: Ballé et al. 2017
• All methods discussed so far were developed for still image coding, and could be used in intra coding for video
• Main problem: Motion compensation is a very effective tool, and can hardly be trained into a network (or would
be tremendously more complex than conventional motion estimation)
• Some work on using CNN for
 Sub-pel interpolation
 Resolution up-conversion
 Post-processing
 Texture synthesis and inpainting
• It is also not as simple to train for perceptual criteria in video
NN for video
• NN-based approaches were so far more successful in still image coding rather than video coding
 Perceptual criteria also better understood for images
• In video coding, motion compensation is a most effective key component
 Requires motion estimation for which "conventional" algorithms appear to be less complex
 Analogy: Eye tracking – the brain processes a motion compensated input
• CNN have been demonstrated to provide benefit in context of video coding for
 Resolution up-conversion
 Post-processing and loop filtering
 Intra coding
 Encoder optimization, in particular partitioning which is basically a segmentation problem
NN for video
• Switching to lower resolution is common (and necessary) when data rate is low
• Video is locally varying by detail, and may not require encoding at full resolution everywhere
• Lower resolution may also be useful with high motion, motion blur, etc.
• Coding less information in such less relevant areas can save data rate
• Tools "Reduced Resolution Update" or "Dynamic Resolution Conversion" were included in MPEG-4 part 2 and H.263+, but were not well understood at that time
• Requires tools for
 downsampling when generating prediction from reference
 signalling the coding with variable resolution
 upsampling for generating full-resolution picture
• Three examples shown subsequently:
 Down/Up-sampling using neural networks / conventional filters
 Coding B pictures of dynamic texture with low resolution
 Dictionary-based super-resolution upsampling
Variable-resolution coding
• Basic idea of dynamic resolution coding:
 Downsample and code at lower resolution (less bitrate cost)
 Upsample at decoder side to full resolution
 Encoder decides between full resolution and conventional or CNN-based down- and upsampling
 CNN-based upsampling could provide super-resolution, sharper edges, etc.
• Can be implemented in combination with intra and inter prediction coding
• Operated on block by block basis
CNN for resolution up-conversion
Figure from JVET-J0032
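The per-block encoder decision can be sketched as a rate-distortion comparison. The Lagrangian cost J = D + λR is an assumed decision criterion, since the slide does not specify how the encoder decides; the function name `choose_resolution_mode`, the mode labels and all numbers are hypothetical.

```python
def choose_resolution_mode(candidates, lam):
    """Pick the coding mode with the smallest cost J = D + lam * R.

    candidates maps a mode name to (distortion, rate_in_bits).
    """
    return min(candidates,
               key=lambda mode: candidates[mode][0] + lam * candidates[mode][1])


# Hypothetical measurements for one block (distortion as SSD, rate in bits).
block = {
    "full_res": (100.0, 400),
    "conventional_down_up": (180.0, 220),
    "cnn_down_up": (140.0, 230),
}
low_rate_choice = choose_resolution_mode(block, lam=1.0)
high_quality_choice = choose_resolution_mode(block, lam=0.1)
```

With a large λ (low target rate) the reduced-resolution modes win, whereas a small λ favours full-resolution coding, in line with the observation that lower resolution mainly pays off at low data rates.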
• Loop filtering is common in video coding
 removes compression artifacts from reconstruction
 improves prediction from reconstructed frames
• Generally, signal-adaptive and non-linear filters
 e.g., de-blocking, de-ringing, de-banding
 edge-adaptive & Wiener optimized
 bi-lateral filters
 ...
• CNN reconstruction provides additional gain (3-5% rate reduction) and might replace some conventional filters
• Can be operated on block basis, parallel processing possible
CNN for loop filtering
Figures from JVET-I0022
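Block-based operation with padding can be sketched as follows. The sketch assumes that each block is filtered from an input window extended by `pad` samples on every side (clipped at the picture border) and that only the core block is kept from the output; the function name `padded_windows` and the sizes are hypothetical.

```python
def padded_windows(pic_w, pic_h, block, pad):
    """Yield (core, window) rectangles as (x0, y0, x1, y1), end-exclusive."""
    for y in range(0, pic_h, block):
        for x in range(0, pic_w, block):
            core = (x, y, min(x + block, pic_w), min(y + block, pic_h))
            # Input window: core block plus padding, clipped to the picture.
            window = (max(core[0] - pad, 0), max(core[1] - pad, 0),
                      min(core[2] + pad, pic_w), min(core[3] + pad, pic_h))
            yield core, window


# A 160x128 picture split into 32x32 blocks gives 5 x 4 = 20 units.
units = list(padded_windows(160, 128, block=32, pad=4))
```

Since every unit reads only its own window, the units can be processed in parallel; an interior window spans the block size plus 2*pad in each dimension.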
[Figure: process unit of 20 blocks (Block1 to Block20, five per row) with margins of padding_size and 2*padding_size]
[Figure: network structure. Inputs: normalized Y/U/V and normalized QP map, concatenated. Layers: Conv1 (5, 5, 45), Conv2 (3, 3, 54), Conv3 (3, 3, 58), Conv4 (3, 3, 48), Conv5 (3, 3, 51), Conv6 (3, 3, 40), Conv7 (3, 3, 31), Convolution8 (3, 3, 1), summation. Legend: ConvL (M, N, KL) = ConvolutionL (M, N, KL) + ReLU; M: kernel width, N: kernel height, KL: kernel number]
• Neural networks were demonstrated to provide improved intra prediction, compared to conventional
directional and planar modes
• Mostly fully connected networks
have been used for this
purpose (no convolutional
layers)
• Average rate reductions
of 4-5% (for intra coding)
have been reported
• Examples of prediction
demonstrate the benefit
of non-linear processing
Neural networks for intra prediction
Figure from JVET-J0037
Figures from Li et al. IEEE-TCSVT, July 2018
• Key pictures coded with full resolution
• Non-key pictures coded with reduced resolution
• Upsampling based on motion-compensated steerable pyramid
Variable-resolution coding for dynamic texture (Thakur et al. 2017)
[Figure: original pictures; reconstructed key pictures (Ref pic L0, Ref pic L1); lowpass versions; predicting non-key pictures]
• Motion vectors initially estimated from downsampled lowpass key pictures, refined and applied in bandpass
and highpass components of non-key pictures
• Authors report significant bit rate saving (20-30% average) for dynamic texture content, whereas subjective
quality is preserved compared to full-resolution coding
Variable-resolution coding for dynamic texture (Thakur et al. 2017)
[Figure: lowpass, bandpass and highpass components of key picture and non-key picture; motion estimation between reference lowpass and current lowpass; motion compensation in bandpass and highpass components]
• Low- and high-resolution dictionaries are trained jointly with a sparsity constraint (large database)
• Up-converter searches for a small number of matching dictionary bases at low resolution, and applies the corresponding bases from the high-resolution dictionary
Low-resolution coding with dictionary-based up-conversion (Schneider et al. 2017)
• Scheme is run with overlapping blocks
• Provides sharp reconstruction of structures and edges
• Authors report 2-3% rate gain when used in upsampling for HEVC scalable coding
Low-resolution coding with dictionary-based up-conversion (Schneider et al. 2017)
Trends and Recent Developments in Video Coding Standardization

  • 1. Trends and Recent Developments in Video Coding Standardization ICME 2018 Tutorial, San Diego, 23.07.2018 Jens-Rainer Ohm Mathias Wien Institute of Communication Engineering Institute of Imaging and Computer Vision RWTH Aachen University, Germany RWTH Aachen University, Germany ohm@ient.rwth-aachen.de wien@lfb.rwth-aachen.de
  • 2. Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA | Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018 2 1. Introduction and history of video coding standardization (Jens) 2. Source formats and resolutions (Mathias) 3. State of the art in video compression (Mathias) 4. Versatile Video Coding (Jens) 5. Exploratory trends and perspectives (Jens) 6. Coding tools for multi-camera captures (Jens) 7. Summary and outlook Outline
  • 3. Part I: Introduction and history of video coding standardization ICME 2018 Tutorial: Trends and Recent Developments in Video Coding Standardization Jens-Rainer Ohm Mathias Wien Institute of Communication Engineering Institute of Imaging and Computer Vision RWTH Aachen University, Germany RWTH Aachen University, Germany ohm@ient.rwth-aachen.de wien@lfb.rwth-aachen.de
  • 4. Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA | Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018 4 Video coding standardization organisations • ISO/IEC MPEG = “Moving Picture Experts Group” (ISO/IEC JTC 1/SC 29/WG 11 = International Standardization Organization and International Electrotechnical Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11) • ITU-T VCEG = “Video Coding Experts Group” (ITU-T SG16/Q6 = International Telecommunications Union – Telecommunications Standardization Sector (ITU-T, a United Nations Organization, formerly CCITT), Study Group 16, Working Party 3, Question 6) • JVT = “Joint Video Team” collaborative team of MPEG & VCEG, responsible for developing AVC (discontinued in 2009) • JCT-VC = “Joint Collaborative Team on Video Coding” team of MPEG & VCEG , responsible for developing HEVC (established January 2010) • JVET = “Joint Video Experts Team” exploring potential for new technology beyond HEVC (established Oct. 2015 as Joint Video Exploration Team, renamed Apr. 2018)
  • 5. Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA | Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018 5 History of international video coding standardization (1985  2020) H.263/+/++ (1995-2000+) MPEG-4 Visual (1998-2001+) MPEG-1 (1993) ISO/IECITU-T H.120 (1984-1988) H.261 (1990+) H.262 / 13818-2 (1994/95-1998+) H.264 / 14496-10 AVC (2003-2018+) H.265 / 23008-2 HEVC (2013-2018+) Videotelephony Computer SD HD 4K UHD (Advanced Video Coding developed by JVT) (High Efficiency Video Coding developed by JCT-VC) (MPEG-2) H.26x / 23090-3 VVC (2020-...) 8K, 360, ... (Versatile Video Coding to be developed by JVET)
  • 6. Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA | Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018 6 The scope of video standardization • Only Specifications of the Bitstream, Syntax, and Decoder are standardized: • Permits optimization beyond the obvious • Permits complexity reduction for implementability • Provides no guarantees of quality Pre-Processing Encoding Source Destination Post-Processing & Error Recovery Decoding Scope of Standard
  • 7. Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA | Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018 7 Hybrid Coding Concept Basis of every standard since H.261
  • 8. Trends and Recent Developments in Video Coding Standardization | Tutorial at ICME 2018 | San Diego, CA, USA | Jens-Rainer Ohm and Mathias Wien | RWTH Aachen University | Institut für Nachrichtentechnik | Lehrstuhl für Bildverarbeitung | 23.07.2018 8 Input Signal Current Stage Used since early days of video compression standards, e.g. H.261, MPEG-1/-2/-4, H.263, AVS, H.264/AVC, HEVC and also in most proprietary codecs (VC1, VP8 etc.) Hybrid video coding concept
  • 9. Hybrid video coding concept [Figure: Input Signal → DCT]
  • 10. Hybrid video coding concept [Figure: Input Signal → DCT → Quantized → 010011101001…]
  • 11. Hybrid video coding concept [Figure: Input Signal → DCT → Quantized → 010011101001… → Inverse DCT]
  • 12. Hybrid video coding concept [Figure: Next Input Signal vs. Reconstruction]
  • 13. Hybrid video coding concept [Figure: Next Input Signal vs. Reconstruction; 010011101001…]
  • 14. Hybrid video coding concept [Figure: Input Signal − MC Prediction = Residual, shown next to the Residual w/o MC]
  • 15. Hybrid video coding concept [Figure: Residual → DCT]
  • 16. Hybrid video coding concept [Figure: Residual → DCT → Quantized → 010011101001…]
  • 17. Hybrid video coding concept [Figure: Residual → DCT → Quantized → Inverse DCT]
  • 18. Hybrid video coding concept [Figure: Residual + MC Prediction = Reconstruction, etc.]
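The loop shown in these build slides (predict, transform the residual, quantize, reconstruct, and reuse the reconstruction as the next prediction) can be sketched in a few lines. This is a minimal illustration only, under simplifying assumptions that are not from the slides: 1-D blocks of 8 samples, a floating-point orthonormal DCT, a uniform quantizer, and zero-motion prediction from the previous reconstruction. The function names are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D block."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of dct() above."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def code_block(block, prediction, step):
    """One pass through the hybrid loop for a single block."""
    residual = [s - p for s, p in zip(block, prediction)]
    levels = [round(c / step) for c in dct(residual)]    # would go to the entropy coder
    rec_residual = idct([lv * step for lv in levels])    # same operation as in the decoder
    reconstruction = [p + r for p, r in zip(prediction, rec_residual)]
    return levels, reconstruction

frames = [[10, 12, 14, 16, 18, 20, 22, 24],
          [11, 13, 15, 17, 19, 21, 23, 25]]
prediction = [0.0] * 8          # assumption: no reference for the first frame
results = []
for frame in frames:
    levels, reconstruction = code_block(frame, prediction, step=2.0)
    results.append((frame, levels, reconstruction))
    prediction = reconstruction  # zero-motion stand-in for MC prediction
```

The encoder predicts from its own reconstruction rather than from the original input, so encoder and decoder stay in sync despite quantization.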
  • 19. Performance history of standard generations [Figure: PSNR (dB), axis 28 to 40, versus bit rate (kbit/s), axis 0 to 300, for Foreman, QCIF, 10 Hz, 100 frames; curves labeled JPEG, H.261, H.262/MPEG-2, H.263 + MPEG-4 Visual, AVC, and HEVC; annotations: "35" and "Bit-rate Reduction: 50%"]
  • 20. What made this happen over the years?
      • Improvements of motion compensation
          - Variable partitions & merged partitions
          - Flexible frame referencing & combined prediction
          - Sub-sample precision and high performance sub-sample interpolation
          - More efficient vector prediction & coding, supporting large vector ranges
      • Improvements of 2D coding
          - Efficient intra prediction and intra mode coding
          - Design of transform bases and variable transform block sizes
      • Loop filtering for artifact reduction
          - Deblocking, sample-adaptive offset
      • Improvements of entropy coding
          - Flexible binarization of syntax elements
          - Arithmetic coding
          - Adaptation and usage of context information
      • These are coupled with encoder optimization
          - Rate distortion optimization: spend bits where they give the best benefit in terms of distortion reduction
          - Adaptive rate control and perceptually tuned quantization
  • 21. Reference picture structures
      • Group of Pictures (GoP) structures allowing random access (used since MPEG-1)
      • Bi-(directional) prediction for better compression performance (used since MPEG-1)
      [Figure: (a) uni-directional prediction and (b) bi-directional prediction over pictures 1 to 7, showing previous and pre-previous picture references]
  • 22. Reference picture structures
      • Hierarchical prediction structures for frame rate scalability and further improved compression performance (used in AVC and HEVC)
      [Figure: hierarchical prediction structures with prediction levels 0 to 3 between I/P0 key pictures, (a) using P pictures, (b) using B pictures]
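To make the link between hierarchy levels and frame rate scalability concrete, the following sketch assigns a temporal level to each picture. It assumes a dyadic hierarchy with a GoP size of 8 (four levels, 0 to 3, as in the figure labels); the function name and the use of the picture order count as input are illustrative choices, not taken from the slides.

```python
def temporal_level(poc, gop_size=8):
    """Temporal level in a dyadic hierarchy; level 0 holds the key pictures.

    Assumes gop_size is a power of two.
    """
    max_level = gop_size.bit_length() - 1   # log2(gop_size)
    pos = poc % gop_size
    if pos == 0:
        return 0
    level = max_level
    while pos % 2 == 0:   # each factor of two moves the picture one level up
        pos //= 2
        level -= 1
    return level

levels = [temporal_level(poc) for poc in range(9)]
# levels == [0, 3, 2, 3, 1, 3, 2, 3, 0]

# Dropping the highest level halves the picture rate.
half_rate = [poc for poc in range(9) if temporal_level(poc) < 3]
```

Because pictures only reference pictures of the same or a lower level, the highest level can be discarded without breaking the decoding of the rest.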
  • 23. Coder control
      • Coder control is a non-normative part of video codecs
      • Choose coding parameters at the encoder side: "What part of the video signal should be coded using what method and parameter settings?"
      • Constrained problem: $\mathbf{p}_{\mathrm{opt}} = \arg\min_{\mathbf{p}} D(\mathbf{p}) \quad \text{s.t.} \quad R(\mathbf{p}) \le R_{\mathrm{Target}}$
      • Unconstrained Lagrangian formulation: $\mathbf{p}_{\mathrm{opt}} = \arg\min_{\mathbf{p}} \left[ D(\mathbf{p}) + \lambda R(\mathbf{p}) \right]$
      • $\lambda$ depends on the slope of the rate-distortion function
          - Small value: high rate, low distortion
          - High value: low rate, high distortion
      • Can be applied in motion parameter estimation, mode decision, transform coefficient quantization, …; typically a fixed relationship between $\lambda$ and the QP value is set
      • Notation: D = distortion, R = rate, p = parameter vector
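A minimal sketch of a Lagrangian mode decision, assuming a small hypothetical candidate list with made-up distortion and rate values. It shows how the multiplier shifts the choice from low-distortion to low-rate candidates; it is not the decision logic of any particular encoder.

```python
def choose_mode(candidates, lam):
    """Return the candidate that minimizes the cost J = D + lam * R."""
    return min(candidates, key=lambda c: c["D"] + lam * c["R"])

# Hypothetical candidates: distortion D (e.g. SSD) and rate R (bits).
candidates = [
    {"name": "skip",  "D": 120.0, "R": 1},
    {"name": "inter", "D": 40.0,  "R": 30},
    {"name": "intra", "D": 25.0,  "R": 90},
]

low = choose_mode(candidates, lam=0.1)    # small multiplier favors low distortion
mid = choose_mode(candidates, lam=1.0)
high = choose_mode(candidates, lam=5.0)   # large multiplier favors low rate
```

With these numbers the choice moves from "intra" to "inter" to "skip" as the multiplier grows, matching the small-value and high-value behavior listed on the slide.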
  • 24. Motivation for improved video compression
      • Video resolution is continually increasing
          - HD is established, UHD (4K×2K, 8K×4K) is appearing
          - Mobile services are moving towards HD/UHD
          - Stereo, multi-view, 360° video
      • Devices are available to record and display ultra-high resolutions
          - Becoming affordable for home and mobile consumers
      • Video has multiple dimensions along which the data rate grows
          - Frame resolution, temporal resolution
          - Color resolution, bit depth
          - Multi-view
          - Visible distortion is still an issue with existing networks
      • Necessary video data rate grows faster than feasible network transport capacities
          - Better video compression (than current HEVC) is needed in the next decade, even after availability of 5G
  • 25. Part II: Source formats and resolutions
  • 26. Structure of a Video Sequence
      • Sequence of pictures successively captured or rendered
      • Progressive and interlaced formats
      • Picture rate measured in pictures per second, unit Hertz (Hz)
      • Minimum picture rate of 24 Hz for the impression of fluent motion [Po12]
          - Standard Definition TV at 50/60 Hz interlaced
          - High Definition (HD) video at 50/60 Hz progressive
          - Ultra HD (UHD) video up to 120 Hz
          - Up to 300 Hz considered
      [Po12] Charles Poynton. Digital Video and HD: Algorithms and Interfaces. Waltham, MA, USA: Morgan Kaufmann Publishers, 2012.
  • 27. Pictures, Frames, and Fields
      • Picture
          - Set of arrays or a single array of samples with intensity values
          - Monochrome picture: single intensity array
          - Color video: usually three intensity arrays ⇒ three color components representing the color
          - Color sample (all three components) also referred to as a pixel (derived from picture element, sometimes also denoted as pel)
          - Optional alpha channel to indicate opaqueness (transparency) for mixing applications
  • 28. Pixel Shape
      • Picture
          - Set of pixel lines, defined number of pixels per line
          - Shape of pixels not necessarily square, depends on picture format
          - Examples:
  • 29. Chroma Sub-Sampling
      • Human visual system is less sensitive to color than to structure and texture ⇒ full-resolution luma, lower-resolution chroma
      • Chroma sub-sampling types are commonly specified by the relation between the number of luma and chroma samples: YCbCr Y : X1 : X2
      • Y: number of luma pixels
      • Sub-sampling format of the chroma components specified by X1 and X2
      • X1: horizontal sub-sampling
      • X2 = 0: vertical sub-sampling identical to horizontal sub-sampling
      • X2 = X1: no vertical sub-sampling
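The Y : X1 : X2 rule can be turned into a small helper that derives the chroma array size from the luma size. This is a sketch that implements only the two cases stated on the slide (X2 = 0 and X2 = X1); the function name is hypothetical and other combinations are rejected rather than guessed.

```python
def chroma_size(luma_width, luma_height, y, x1, x2):
    """Chroma array size for a Y:X1:X2 format, following the rule on the slide."""
    if x1 <= 0 or y % x1 != 0:
        raise ValueError("X1 must be a positive divisor of Y")
    horizontal_factor = y // x1
    if x2 == 0:
        vertical_factor = horizontal_factor   # vertical identical to horizontal
    elif x2 == x1:
        vertical_factor = 1                   # no vertical sub-sampling
    else:
        raise ValueError("only X2 = 0 and X2 = X1 are covered by the rule")
    return luma_width // horizontal_factor, luma_height // vertical_factor

size_420 = chroma_size(1920, 1080, 4, 2, 0)   # (960, 540)
size_422 = chroma_size(1920, 1080, 4, 2, 2)   # (960, 1080)
size_444 = chroma_size(1920, 1080, 4, 4, 4)   # (1920, 1080)
```

For 4:2:0 each chroma component carries a quarter of the luma sample count, so the three components together amount to 1.5 times the luma data.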
  • 30. Representation of Color
      • Color impression
          - Visible range of the spectrum: 380 nm to 780 nm
          - Impression of color: intensity density distribution over the visible spectral range
          - Colors corresponding to a single wavelength: spectral colors or primary colors
          - Human visual system has three color receptors (cone cells)
          - Maximum sensitivity in the wavelength areas of red, green, and blue
          - Additional "gray-scale" receptors (rod cells): responsive in low lighting conditions
      Picture source: Wikipedia, artwork by Holly Fischer
  • 31. The CIE Standard Observer
      • Visual perception split into perception of brightness (light and dark) and chromaticity (color impression)
          - Brightness is driven by the summarized intensity of the observed spectrum
          - Color impression is driven by the shape of the intensity distribution
      • Functional expression to represent perceived color by a mathematical description, first standardized in the CIE 1931 Standard Observer
      • Color as a point in a three-dimensional XYZ space
      • X, Y, Z values derived from the observed spectrum
      • Three color matching functions
      CIE: Commission internationale de l’éclairage, http://www.cie.co.at
      Standard Observer specified in ISO 11664-1
  • 32. The CIE Standard Observer
  • 33. The CIE Standard Observer
      • Normalization to express the chromaticity independently of the observed brightness: $x = \frac{X}{X+Y+Z}$, $y = \frac{Y}{X+Y+Z}$, $z = \frac{Z}{X+Y+Z}$
      • Since $x + y + z = 1$, therefore $z = 1 - x - y$
      • Chromaticity specified by the (x, y) pair
      • Definition of a standardized white point, e.g. "white C", "white D65"
      [Po12] Charles Poynton. Digital Video and HD: Algorithms and Interfaces. Waltham, MA, USA: Morgan Kaufmann Publishers, 2012.
      [Hu04] Robert G.W. Hunt. The Reproduction of Colour. 6th ed. Chichester, West Sussex, England: Wiley-VCH, 2004.
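As a quick numerical illustration of the normalization, the sketch below maps XYZ tristimulus values to an (x, y) chromaticity pair. The D65 tristimulus values used here are the commonly quoted approximation with Y normalized to 1 and are an assumption of this example, not data from the slides.

```python
def chromaticity(X, Y, Z):
    """Project XYZ tristimulus values onto the (x, y) chromaticity plane."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined for X = Y = Z = 0")
    return X / total, Y / total

# Approximate XYZ of the D65 white point (assumed values, Y normalized to 1).
x_d65, y_d65 = chromaticity(0.95047, 1.0, 1.08883)

# Scaling the brightness leaves the chromaticity unchanged.
x_dim, y_dim = chromaticity(0.095047, 0.1, 0.108883)
```

The result lands near (0.3127, 0.3290), the chromaticity usually stated for D65, and the dimmed copy yields the same pair, which is the point of the normalization.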
  • 34. Color Spaces: Standard and High Dynamic Range / Wide Color Gamut
      • Colour space
          - Standard Dynamic Range (SDR) video: contrast approx. 1000 : 1, ITU-R BT.709 colour space
          - High Dynamic Range (HDR) video: contrast approx. 1000000 : 1, ITU-R BT.2100 colour space
      Figure from N15084: Ajay Luthra, Edouard Francois, and Walt Husak (Eds.). Requirements and Use Cases for HDR and WCG Content Coding. Doc. N15084. Geneva, CH, 111th meeting: MPEG, Feb. 2015.
      ITU-R BT.709: Parameter values for the HDTV standards for production and international programme exchange. ITU-R, Apr. 2004. URL: http://www.itu.int/rec/R-REC-BT.709/en
      ITU-R BT.2020: Parameter values for ultra-high definition television systems for production and international programme exchange. ITU-R, Oct. 2015. URL: http://www.itu.int/rec/R-REC-BT.2020/en
      ITU-R BT.2100: Image parameter values for high dynamic range television for use in production and international programme exchange. ITU-R, Jun. 2017. URL: http://www.itu.int/rec/R-REC-BT.2100-1-201706-I/en
  • 35. Color Spaces: Standard and High Dynamic Range / Wide Color Gamut
      Figure from: Ajay Luthra, Edouard Francois, and Walt Husak (Eds.). Requirements and Use Cases for HDR and WCG Content Coding. Doc. N15084. Geneva, CH, 111th meeting: MPEG, Feb. 2015.
  • 36. HDR/WCG Conversion Practices: Scope
      • ITU-T H Suppl. 15 | ISO/IEC TR 23008-14, Conversion and Coding Practices for HDR/WCG Y′CbCr 4:2:0 Video with PQ Transfer Characteristics.
      • ITU-T H Suppl. 18 | ISO/IEC TR 23008-15, Signalling, backward compatibility and display adaptation for HDR/WCG video coding.
      Figure from: Jonatan Samuelsson et al. Conversion and Coding Practices for HDR/WCG Y′CbCr 4:2:0 Video with PQ Transfer Characteristics (Draft 4). Doc. JCTVC-Z1017. 26th meeting, Geneva, CH: Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Jan. 2017.
  • 37. Part III: State of the Art in Video Compression
  • 38. Outline and Concept for Part III
      Comparison of HEVC and the Joint Exploration Test Model (JEM) of JVET
      • A glimpse of high-level syntax (HEVC)
      • Coding structures
      • Walk-through of the coding loop
          - Intra coding
          - Inter coding
          - Transform coding
          - Loop filters
          - Entropy coding
  • 39. Network Abstraction Layer and Video Coding Layer
      • Coded Video Sequence (CVS)
          - Starts with a random access point (intra-coded picture)
          - One or more CVSs in a bitstream → Coded Video Sequence Group (CVSG)
      • Network Abstraction Layer (NAL)
          - Encapsulation of the coded video sequence for transport and storage
          - Video coding layer (VCL) NAL units: information directly for reconstruction of samples and pictures
          - Non-VCL NAL units: parameter sets, supplemental enhancement information, ...
  • 40. NAL Unit Structure
      • RBSP: Raw byte sequence payload
          - Sequence of bytes comprising the coded NAL unit payload
          - RBSP stop bit (= '1') plus zero bits for byte alignment
      • SODB: String of data bits
          - Concatenation of bits in the RBSP bytes from MSB to LSB
          - All bits needed for the decoding process
          - Only the bits needed for the decoding process
      [Figure: NAL unit with NAL unit header]
  • 41. HEVC Spatial Coding Structures
      • Blocks and Units
          - Block: square or rectangular area in a color component array
          - Unit: collocated blocks of the (three) color components, associated syntax elements and prediction data (e.g. motion vectors)
      • Picture partitioning
          - Coding Tree Blocks / Coding Tree Units (CTBs / CTUs)
          - Each CTU in exactly one slice segment
          - Independent slice segment: full header, independently decodable
          - Dependent slice segment: very short header, relies on corresponding independent slice, inherits CABAC state
      • Slice types
          - I-slice: intra prediction only
          - P-slice: intra prediction and motion compensation with one reference picture list
          - B-slice: intra prediction and motion compensation with one or two reference picture lists
      CABAC: Context-based Adaptive Binary Arithmetic Coding
  • 42. Tiles in HEVC
      • Change scanning order of CTBs in picture
      • Slices in tiles, or tiles in slices
      • Reset of prediction and entropy coding → parallel processing
  • 43. HEVC: Coding Tree Blocks and Coding Blocks (CBs)
      • Maximum CTU size: 64×64 pixels
      • Quadtree partitioning of CTB into CBs
      • If picture size is not an integer multiple of the CTB size:
          - Implicit CTB partitioning to meet picture size (must be a multiple of 8×8 pixels)
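The case where the picture size is not a multiple of the CTB size is easy to see with numbers. The sketch below counts the CTU columns and rows and reports how much of the last row and column lies inside the picture; the helper name and return layout are illustrative only.

```python
def ctu_grid(width, height, ctu_size=64):
    """Number of CTU columns/rows and the picture area covered by the last ones."""
    cols = -(-width // ctu_size)    # ceiling division
    rows = -(-height // ctu_size)
    last_col_width = width - (cols - 1) * ctu_size
    last_row_height = height - (rows - 1) * ctu_size
    return cols, rows, last_col_width, last_row_height

grid_1080p = ctu_grid(1920, 1080)
# (30, 17, 64, 56): the bottom CTU row covers only 56 luma rows
```

For 1920×1080 the width is an exact multiple of 64, while the 17th CTU row is only partly inside the picture, which is where the implicit partitioning applies.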
  • 44. HEVC: Prediction Blocks (PBs) and Transform Blocks (TBs)
      • Prediction block partitioning of a 2N×2N CB
      • Transform block partitioning of a CB
          - Quadtree partitioning of CB → Residual Quad Tree (RQT)
          - Transform size 4×4 to 32×32
          - TB size 4×4 to 64×64
          - PB boundaries inside TBs allowed
  • 45. JEM: Quad-Tree plus Binary Tree Partitioning (QTBT)
      • QTBT structure removes the concept of multiple partition types (TU = PU = CU)
      • Maximum CTU size: 256×256 pixels (128×128 used in common testing conditions)
      • Binary trees starting from leaves of the quad-tree (with horizontal / vertical split indication) → CU can have either square or rectangular shape
      • Configuration
          - MinQTSize, MaxBTSize: minimum quadtree leaf node size / maximum binary tree root node size
          - MaxBTDepth, MinBTSize: maximum binary tree depth / minimum binary tree leaf node size
      Figure from: Jianle Chen et al. Algorithm Description of Joint Exploration Test Model 7. Doc. JVET-G1001. Torino, IT, 7th meeting: Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Jul. 2017.
  • 46. Intra Prediction
  • 47. HEVC Intra Prediction Modes
      Intra prediction modes
      • Planar prediction: mode 0
      • DC intra prediction: mode 1
      • Numbering from diagonal-up to diagonal-down
          - Modes 2 – 18: horizontal
          - Modes 19 – 34: vertical
      • Horizontal: mode 10, vertical: mode 26
      Intra prediction block size
      • Intra prediction mode coded per CU
      • Prediction block size derived from residual quadtree
      • Boundary samples of neighboring block used for prediction
      • Efficient representation
      • Local update of prediction source
  • 48. JEM Intra Prediction Modes
      • Concept of HEVC as basis
          - Higher number of prediction modes
          - Larger maximum block size
      • Chroma
          - Prediction modes from neighbors
          - Derived modes from collocated luma
      Figure from: Jianle Chen et al. Algorithm Description of Joint Exploration Test Model 7. Doc. JVET-G1001. Torino, IT, 7th meeting: Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Jul. 2017.
  • 49. Interpolation Filters for Directional Intra Prediction Modes
      • HEVC
          - 2-tap filters
          - Weight derived from prediction direction
      • JEM
          - 4-tap filters
          - Cubic interpolation for blocks with ≤ 64 samples
          - Gaussian interpolation filters elsewhere
          - Parameters fixed according to block size
          - Same filter for all predicted samples, all modes
  • 50. Intra Prediction Boundary Filtering
      • HEVC
          - Boundary sample filtering for intra prediction modes 10, 26 (horizontal / vertical)
          - Local, 1-sample update at boundary perpendicular to prediction direction
      • JEM
          - Extended to directional modes
          - Boundary samples up to four columns or rows
          - 2-tap filter for intra modes 2 & 34
          - 3-tap filter for intra modes 3–6 & 30–33
      Figure from: JVET-G1001: Algorithm Description of Joint Exploration Test Model 7.
  • 51. JEM: Cross-Component Linear Model Prediction (CCLM)
      • Chroma samples predicted using corresponding reconstructed luma samples: $\mathrm{pred}_C(i,j) = \alpha \cdot \mathrm{rec}_{L}'(i,j) + \beta$
      • Parameters $\alpha$ and $\beta$: minimize regression error between neighbouring reconstructed luma and chroma samples around the current block
      • Further prediction between chroma components with updated parameters: $\mathrm{pred}_{Cr}^{*}(i,j) = \mathrm{pred}_{Cr}(i,j) + \alpha \cdot \mathrm{resi}_{Cb}'(i,j)$
      Multiple model CCLM mode (MMLM)
      • Neighbouring luma samples and neighbouring chroma samples classified into two groups
      • Linear model for each group
      Figures from: JVET-G1001: Algorithm Description of Joint Exploration Test Model 7.
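The regression for the two CCLM parameters can be sketched with an ordinary least-squares fit over the neighbouring sample pairs. This is a floating-point illustration, not the integer derivation used in the JEM software; the function names are hypothetical, and the luma samples are assumed to be already at chroma resolution.

```python
def fit_linear_model(luma, chroma):
    """Least-squares fit of chroma ≈ alpha * luma + beta over neighbouring samples."""
    n = len(luma)
    sum_l, sum_c = sum(luma), sum(chroma)
    sum_ll = sum(l * l for l in luma)
    sum_lc = sum(l * c for l, c in zip(luma, chroma))
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:                       # flat luma neighbourhood: fall back to the mean
        return 0.0, sum_c / n
    alpha = (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta

def predict_chroma(rec_luma, alpha, beta):
    """Apply pred_C = alpha * rec_L' + beta to the luma samples of the block."""
    return [alpha * l + beta for l in rec_luma]

# Hypothetical neighbouring reconstructed samples around the current block.
alpha, beta = fit_linear_model([100, 120, 140, 160], [60, 70, 80, 90])
prediction = predict_chroma([110, 130], alpha, beta)
```

Because the fit uses only already reconstructed neighbours, the decoder can repeat it and no model parameters need to be transmitted. For the multiple-model variant, the same fit would be run once per sample group.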
  • 52. JEM: Position Dependent Intra Prediction Combination for Planar Mode (PDPC)
      • Combination of the unfiltered boundary reference samples and HEVC-style intra prediction with filtered boundary reference samples
          - Position-dependent weighting of filtered and unfiltered reference, configurable by four weighting parameters (hor/ver + corner)
          - Filtered reference: linear combination of unfiltered reference and lowpass, configurable weight
          - Three predefined lowpass filters selectable (3-tap, 5-tap, 7-tap)
          - Prediction parameters stored per block size
      Figure from: JVET-G1001: Algorithm Description of Joint Exploration Test Model 7.
  • 53. Mode-dependent Intra Reference Sample Smoothing (MDIS)
      • HEVC
          - Bi-linear smoothing
          - Depending on prediction block size
      • Temporarily adopted in JEM (removed in JEM7)
          - Adaptive reference sample smoothing (ARSS)
          - 3-tap LPF with the coefficients [1, 2, 1] / 4
          - 5-tap LPF with the coefficients [2, 3, 6, 3, 2] / 16
      Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 54. Inter Prediction
  • 55. Motion Compensated Prediction
      Prediction from reference picture lists
      • Uni-prediction
          - P-slices only with List0, B-slices with List0 or List1
          - HEVC: minimum PB size 8×4 or 4×8
      • Bi-prediction, only in B-slices
          - One predictor from List0, one predictor from List1
          - HEVC: minimum prediction block size 8×8
  • 56. HEVC: Motion Vector Representation
      • Merge mode
          - Motion vector (MV) derived from candidate set (spatial and temporal neighborhood)
          - Merge mode candidate index coded
          - No motion vector difference encoded
      • Advanced motion vector prediction
          - Predictor derived from candidate set (spatial and temporal neighborhood)
          - Predictor index coded
          - Motion vector difference encoded
      • Skip mode
          - Only merge candidate signaled, no residual
  • 57. JEM: Sub-CU based motion vector prediction
      • CU: at most one set of motion parameters for each prediction direction
      • Option to split large CU into sub-CUs
          - Alternative temporal motion vector prediction (ATMVP): fetch multiple sets of motion information from multiple blocks in the collocated reference picture
          - Spatial-temporal motion vector prediction (STMVP): derive recursively from the temporal motion vector predictor and spatial neighbouring motion vectors
      • ATMVP and STMVP: additional merge candidates (list extended to max. 7)
      Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 58. JEM Motion Vector Representation
      • Locally adaptive motion vector resolution (LAMVR): motion vector difference (MVD) coded in units of
          - quarter luma samples,
          - integer luma samples, or
          - four luma samples
      • Higher motion vector storage accuracy
          - Internal motion vector storage and merge candidates at 1/16 pel (skip and merge modes only)
          - SHVC upsampling interpolation filters for the additional fractional pel positions
      SHVC: Scalable High Efficiency Video Coding, HEVC Annex G
  • 59. JEM: Overlapped Block Motion Compensation
      • Overlapped Block Motion Compensation (OBMC) was previously used in ITU-T H.263
      • Switchable on CU level
          - Motion compensation block boundaries except the right and bottom boundaries of the CU
          - Applied for both the luma and chroma components
          - Performed at sub-block level for all MC block boundaries
      Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 60. JEM: Local Illumination Compensation (LIC)
      • Linear model for illumination changes, using a scaling factor a and an offset b
          - Concept taken from 3D-HEVC
      • Enabled or disabled adaptively for each inter-mode coded coding unit (CU)
      • Least square error method employed to derive the parameters a and b
      • CU in 2N×2N merge mode
          - LIC flag copied from neighbouring blocks (like merge)
          - Otherwise, LIC flag at CU level
      Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 61. JEM: Affine Motion Vector Derivation for MC
      • Motion vector field (MVF) for CU, applicable MV derived for each 4×4 block at 1/16 pel resolution
          - Control point motion vector (CPMV)
          - $v_x = \frac{v_{1x} - v_{0x}}{w}\,x - \frac{v_{1y} - v_{0y}}{w}\,y + v_{0x}$, $\quad v_y = \frac{v_{1y} - v_{0y}}{w}\,x + \frac{v_{1x} - v_{0x}}{w}\,y + v_{0y}$
      • AF_INTER mode
          - Signaling CPMV difference from predictor
          - Block width and height ≥ 8 required
      • AF_MERGE mode
          - Derivation of CPMV from neighborhood
      Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
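The affine model on this slide can be evaluated per 4×4 sub-block as sketched below. The reading of the symbols is an assumption taken from the usual 4-parameter formulation (v0 and v1 as the control point motion vectors at the top-left and top-right corners, w as the block width), and evaluating at the sub-block center is likewise assumed. Rounding to 1/16 pel is omitted, and the function names are hypothetical.

```python
def affine_mv(x, y, v0, v1, w):
    """Motion vector at position (x, y) from two control point motion vectors."""
    v0x, v0y = v0
    v1x, v1y = v1
    vx = (v1x - v0x) / w * x - (v1y - v0y) / w * y + v0x
    vy = (v1y - v0y) / w * x + (v1x - v0x) / w * y + v0y
    return vx, vy

def sub_block_mvs(width, height, v0, v1, sub=4):
    """One motion vector per sub-block, evaluated at the sub-block center."""
    return {
        (bx, by): affine_mv(bx + sub / 2, by + sub / 2, v0, v1, width)
        for by in range(0, height, sub)
        for bx in range(0, width, sub)
    }

# Identical control points reduce the model to pure translation.
translation = sub_block_mvs(16, 16, v0=(2.0, 1.0), v1=(2.0, 1.0))

# Different control points give a field that varies across the block.
field = sub_block_mvs(16, 16, v0=(0.0, 0.0), v1=(4.0, 2.0))
```

Evaluating the model at (0, 0) returns v0 and at (w, 0) returns v1, a quick consistency check on the signs of the two equations.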
  • 62. JEM: Pattern Matched Motion Vector Derivation (PMMVD)
      • Special merge mode based on Frame-Rate Up Conversion (FRUC) techniques, with options for
          - Bilateral matching
          - Template matching (applicable also for AMVP mode, CU level only)
      • Motion vector derivation process
          - Initial motion vector for CU of size $W \times H$
          - Sub-CU motion refinement for blocks of size $M \times M$, with $M = \max\{4, \min\{W / 2^{D}, H / 2^{D}\}\}$
      Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
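A small sketch of the sub-CU size rule. D is treated here as a splitting depth parameter, which is an assumption since the slide does not define it, and CU dimensions are assumed to be powers of two so that the divisions are exact.

```python
def sub_cu_size(width, height, depth):
    """M = max(4, min(W / 2^D, H / 2^D)) for power-of-two CU sizes."""
    return max(4, min(width >> depth, height >> depth))

m_square = sub_cu_size(64, 64, 3)     # 8
m_small = sub_cu_size(16, 8, 3)       # clamped to the 4×4 minimum
m_shallow = sub_cu_size(128, 64, 2)   # 16
```

The clamp to 4 keeps the refinement from going below the 4×4 granularity used for motion storage elsewhere in this part.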
  • 63. JEM: Bi-directional optical flow (BIO)
      • Sample-wise motion refinement on top of block-wise motion compensation for bi-prediction
      • No extra signaling, applied on 4×4 block basis
      • MVF determined by minimizing the difference $\Delta$ between points $A$ and $B$ on the trajectory by Taylor expansion: $\Delta = I^{(0)} - I^{(1)} + v_x \left( \tau_1 \frac{\partial I^{(1)}}{\partial x} + \tau_0 \frac{\partial I^{(0)}}{\partial x} \right) + v_y \left( \tau_1 \frac{\partial I^{(1)}}{\partial y} + \tau_0 \frac{\partial I^{(0)}}{\partial y} \right)$
      • Limited search window
      • Optimized search
          - First vertical, then horizontal search
          - Memory usage: only access samples inside block
      Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 64. JEM: Decoder-side Motion Vector Refinement (DMVR)
      • MVs of bi-prediction refined by a bilateral template matching process
      • Search between bilateral template and reference pictures ⇒ refined MV without further signaling
      • Applied only with reference pictures satisfying $\mathrm{poc}_{\mathrm{Ref}_i} < \mathrm{poc}_{\mathrm{curr}} < \mathrm{poc}_{\mathrm{Ref}_j}$
      • Not applied if any of the following is enabled in the CU:
          - LIC
          - Affine motion
          - FRUC
          - Sub-CU merge candidate
      • Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 65. Residual Coding
  • 66. HEVC Core Transforms
      • Transform block sizes 4×4, 8×8, 16×16, and 32×32
          - Integer approximations of the DCT-II transform matrix
      • Additionally, integer approximation of the 4×4 DST-VII transform matrix
      • "Single-norm" design per transform block size → simple quantizer implementation
      • Not all perfectly orthogonal; leakage stays below the normalization threshold
  • 67. Quantizer Implementation
      • Quantizer step size $\Delta_q$ derived from the quantization parameter QP
      • Exponential relation of quantizer step sizes
      • Step size doubles every 6 QP: $\Delta_q(\mathrm{QP} + 1) = \sqrt[6]{2} \cdot \Delta_q(\mathrm{QP})$
      • Definition: $\Delta_q = 1$ for $\mathrm{QP} = 4$, thereby $\Delta_{q,0} \in \{2^{-4/6}, 2^{-3/6}, 2^{-2/6}, 2^{-1/6}, 1, 2^{1/6}\}$
      • Quantizer step size for a given QP: $\Delta_q(\mathrm{QP}) = \Delta_{q,0}(\mathrm{QP} \bmod 6) \cdot 2^{\lfloor \mathrm{QP}/6 \rfloor}$
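The step-size rule on this slide can be verified with a short sketch. It uses floating-point values for clarity, whereas real codecs use integer scaling tables. The names `BASE_STEPS` and `quant_step` are hypothetical.

```python
# Six base step sizes for QP mod 6 = 0..5, normalized so that the step is 1 at QP = 4.
BASE_STEPS = [2.0 ** ((r - 4) / 6.0) for r in range(6)]


def quant_step(qp):
    """Quantizer step size for a non-negative QP: base step times 2^floor(QP / 6)."""
    if qp < 0:
        raise ValueError("this sketch only covers QP >= 0")
    return BASE_STEPS[qp % 6] * (1 << (qp // 6))


# The table lookup agrees with the closed form 2^((QP - 4) / 6).
for qp in (4, 10, 22, 51):
    print(qp, quant_step(qp), 2.0 ** ((qp - 4) / 6.0))
```

Splitting the computation into a six-entry table and a power of two is what keeps the per-QP lookup cheap: only the shift changes when QP moves by a multiple of 6.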
  • 68. JEM Transforms
      • Large block-size transforms with high-frequency zeroing
          - Maximum transform size up to 128×128
          - Coefficients with column / row index > 32 set to 0 if block width > 64 / block height > 64, respectively
      • Adaptive multiple core transform (AMT)
          - Transform matrices quantized more accurately
          - Applicable for block sizes ≤ 64×64
          - Indicated by CU flag
          - Mode-dependent transform-set selection for intra prediction modes
      • Tables from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 69. JEM: Mode-Dependent Non-separable Secondary Transforms (MDNSST)
      • Motivation
          - Remaining correlation between coefficients after the primary transform
          - Dependency on the intra prediction mode
      • Approach: mode-dependent transforms (have been studied as a tool for HEVC)
      • MDNSST structure:
          - 35×3 non-separable secondary transforms for both 4×4 and 8×8 block sizes
          - 3 NSST candidates for each intra prediction mode
          - Application of transposed transform blocks for modes > 34
      • Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 70. JEM: Mode-Dependent Non-separable Secondary Transforms (MDNSST), continued
      • Only applied to the low-frequency coefficients after the primary transform
          - For blocks ≥ 8×8: 8×8 transform applied to the lowest-frequency coefficients of the primary transform
          - For blocks < 8×8: 4×4 transform applied to the lowest-frequency coefficients of the primary transform
      • Implementation by Hypercube-Givens Transform (HyGT)
      • Two rounds for 4×4, four rounds for 8×8 secondary transforms
      • Figures from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 71. JEM: Signal dependent transform
      • Search for $N$ similar patches in the reconstructed region of the picture, based on a template
      • Scheme of KLT matrix derivation:
          - Collection of $N$ prediction residuals: $U = (u_1, u_2, \ldots, u_N)$
          - Covariance matrix $\Sigma = U U^T$
          - The eigenvectors are the KLT bases
      • Application of the proposed KLT to 4×4, 8×8, 16×16 and 32×32 coding blocks
      • Note: Tool not activated in JVET Common Testing Conditions [JVET-G1010]
      • Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
  • 72. Loop Filtering
  • 73. Deblocking Filter
      • HEVC deblocking filter also used in JEM
          - Filtering at prediction and transform block edges on an 8×8 grid
          - Independent operation on 8×8 blocks possible → parallel processing enabled
      • Deblocking filtering
          - Boundary processed in 4-sample sections (edges)
          - Filter strength determined based on analysis of the top and bottom rows of an edge
          - Normal: filtering of at most two samples into the block
          - Strong: up to four samples into the block
  • 74. Sample Adaptive Offset Filter (SAO)
      • HEVC SAO filtering also used in JEM
      • Local processing of samples
          - Depending on local neighborhood (edge offset)
              - Direction signaled, smoothing only
          - Depending on sample value (band offset)
              - Configurable correction of sample intensity values for four transition bands
      • Operation independent of processed samples → parallel processing
      • Local filter parameter adaptation
      • Four different offset values available (plus SAO off)
      • Dedicated SAO parameters for Y, Cb, Cr
          - Common SAO mode for chroma components
      • Figure panels: edge offset, band offset
  • 75. JEM: Bilateral filter
      • First loop filter in the decoding process chain of JEM
      • Each luma sample in a reconstructed TU is replaced by a weighted average of itself and its neighbours within the TU
          - Sample located at $(i, j)$, neighbouring sample at $(k, l)$
          - $I(i, j)$ and $I(k, l)$: reconstructed intensity values
          - $\sigma_d$: spatial parameter (transform size, prediction mode)
          - $\sigma_r$: range parameter (QP)
          - Weight: $\omega(i, j, k, l) = \exp\left( -\frac{(i - k)^2 + (j - l)^2}{2\sigma_d^2} - \frac{(I(i, j) - I(k, l))^2}{2\sigma_r^2} \right)$
          - Filtered sample: $I_F(i, j) = \frac{\sum_{k,l} I(k, l) \cdot \omega(i, j, k, l)}{\sum_{k,l} \omega(i, j, k, l)}$
          - Integer implementation with look-up table for division
      • Figure from: JVET-F1001: Algorithm Description of Joint Exploration Test Model 6.
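A floating-point sketch of the bilateral filter weights and the weighted average may help in reading the formulas on this slide. It is not the JEM integer implementation with its division look-up table. The neighbourhood (the sample itself plus its four direct neighbours inside the block) is an assumption made for illustration, and `sigma_d` and `sigma_r` are passed in directly rather than derived from transform size, prediction mode and QP.

```python
import math


def bilateral_weight(block, i, j, k, l, sigma_d, sigma_r):
    """Weight of neighbour (k, l) when filtering sample (i, j)."""
    spatial = ((i - k) ** 2 + (j - l) ** 2) / (2.0 * sigma_d ** 2)
    diff = block[i][j] - block[k][l]
    intensity = (diff * diff) / (2.0 * sigma_r ** 2)
    return math.exp(-spatial - intensity)


def bilateral_filter_block(block, sigma_d, sigma_r):
    """Filter every sample of a block (list of rows) using neighbours inside the block only."""
    rows, cols = len(block), len(block[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            num = den = 0.0
            # Assumed neighbourhood: the sample itself and its 4 direct neighbours.
            for k, l in ((i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= k < rows and 0 <= l < cols:
                    w = bilateral_weight(block, i, j, k, l, sigma_d, sigma_r)
                    num += block[k][l] * w
                    den += w
            out[i][j] = num / den
    return out
```

A small range parameter makes the intensity term dominate, so samples across a strong edge contribute almost nothing and the edge is preserved. A large range parameter turns the filter into a plain spatial smoother.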
  • 76. JEM: Adaptive loop filter (ALF)
      • Luma component
          - 25 filters available for each 2×2 block, based on direction and activity of local gradients
          - Diamond filter shapes (3×3, 5×5, 7×7)
          - Classification into 25 classes, based on
              - Activity index
              - Directionality index
      • Chroma components
          - Diamond filter shape 5×5
          - No classification
          - Single set of filter coefficients
      • Geometric transformations based on data from classification
          - Transpose, vertical flip, rotation
      • Filter coefficients signaled with the first CTU, FIFO buffering for temporal prediction in inter pictures, 16 candidate sets for intra pictures
  • 77. Entropy Coding
  • 78. Entropy Coding
      • Fixed length and variable length codes (FLC, VLC)
          - High-level syntax
          - Parameter sets, slice segment header
          - SEI messages
          - Fixed-length codes, Exp-Golomb codes
      • Arithmetic coding
          - Slice level, CTUs
          - Context-based adaptive coding
          - Bypass coding (complexity, throughput)
      • CTU = Coding Tree Unit, SEI = Supplemental Enhancement Information
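As an illustration of the variable length codes mentioned on this slide, the sketch below builds and parses unsigned 0th-order Exp-Golomb code words, the kind used for many high-level syntax elements. It works on strings of "0" and "1" for readability rather than on a real bit buffer, and the function names are hypothetical.

```python
def exp_golomb_ue(value):
    """Unsigned 0th-order Exp-Golomb code word for value >= 0, as a bit string."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb codes require value >= 0")
    bits = bin(value + 1)[2:]              # binary representation of value + 1
    return "0" * (len(bits) - 1) + bits    # prefix of len(bits) - 1 zeros


def decode_exp_golomb_ue(bitstring, pos=0):
    """Parse one code word starting at pos; returns (value, position after the code word)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bitstring[pos + zeros:end], 2) - 1, end


for v in range(5):
    print(v, exp_golomb_ue(v))
```

Small values get short code words (a single bit for 0), which suits header fields whose typical values are small but whose range is unbounded.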
  • 79. Fixed and Variable Length Coding
      • VCL NAL unit
          - FLC, VLC for header information
          - CABAC for CTUs
          - Byte alignment in case of multiple tiles or wavefront parallel processing (not present otherwise)
      • NAL = Network Abstraction Layer, VCL = Video Coding Layer, CABAC = Context-based Adaptive Binary Arithmetic Coding, ba = byte alignment
  • 80. Context-Based Adaptive Binary Arithmetic Coding (CABAC)
      • Arithmetic coding engine
          - Binarization
          - Context model selection
          - Binary arithmetic coding
          - Optimized binarization design
          - Reduced number of non-bypass bins compared to H.264 | AVC
      • JEM
          - Modified context modeling for transform coefficients
          - Multi-hypothesis probability estimation with context-dependent updating speed
          - Adaptive initialization for context models
  • 81. Part IV: Versatile Video Coding
  • 82. Steps towards next generation standard: Versatile Video Coding (VVC)
      • Experimental software "Joint Exploration Model" (JEM) developed by JVET
          - Intended to investigate the potential for better compression beyond HEVC
          - Initially started by extending the HEVC software with additional compression tools, or by replacing existing tools (see previous section)
      • Substantial benefit was shown over HEVC, both in subjective quality and objective metrics
          - Proven in "Call for Evidence" (July 2017)
          - JEM was, however, not designed to become a standard (regarding all design tradeoffs)
          - Call for Proposals was issued by MPEG and VCEG (October 2017)
      • Call for Proposals very successful (responses received by April 2018)
          - 32 companies in 21 proponent groups responded
          - 46 category-specific submissions: 22 in SDR, 12 each in HDR and 360° video
          - All responses clearly better than HEVC, some evidently better than JEM
          - This marked the starting point for VVC development
  • 83. Joint Call for Proposals (CfP) on Video Compression with Capability beyond HEVC
      • Document JVET-H1002
      • Test categories
          - Standard dynamic range (SDR): 5 UHD and 5 HD sequences
          - High dynamic range (HDR): 3 HLG and 5 PQ sequences
          - 360° video (360): 5 sequences in ERP format
      • Constraint sets
          - Constraint set 1 (C1): Random access configuration
              - Max. 1.1 s random access intervals, structural delay max. 16 pictures
          - Constraint set 2 (C2): Low delay configuration, only evaluated for SDR HD sequences
              - No picture reordering between input and output
      • Encoding constraints
          - No pre-processing, post-processing only within the coding loop
          - Static quantizer setting with one-time change to meet target bitrate
          - Relevant optimization methods to be reported
      • UHD = Ultra High Definition, HD = High Definition, HLG = Hybrid Log Gamma, PQ = Perceptual Quantization (ITU-R BT.2020), ERP = Equirectangular Projection
  • 84. VVC CfP Test Sequences
      • SDR-A (3840×2160): FoodMarket4 60p, CatRobot1 60p, DaylightRoad2 60p, ParkRunning3 50p, Campfire 30p
      • SDR-B (1920×1080): BasketballDrive 50p, Cactus 50p, BQTerrace 60p, RitualDance 60p, MarketPlace 60p
      • HDR (PQ HD, HLG 4K): Market3 HD50p, Hurdles HD50p, Starting HD50p, ShowGirls2 HD25p, Cosmos1 HD24p, DayStreet 60p, PeopleInShop..., SunsetBeach 60p
      • 360 Video (8K, 6K): ChairliftRide 30p, KiteFlite 30p, Harbor 30p, Trolley 30p, Balboa 60p
  • 85. VVC CfP Responses
      • Category-specific submissions (total 46):
          - SDR: 22 submissions (8 of which are registered only in this category)
          - HDR: 12 submissions
          - 360°: 12 submissions (2 of which are registered only in this category)
          - For all categories: HEVC anchors (HM) and JEM anchors
      • Proposals
          - Described in JVET input documents JVET-J0011...JVET-J0033
          - Participation of 32 institutions
      • JVET documents available at http://phenix.it-sudparis.eu/jvet
  • 86. Performance
      • Submissions had to provide coded/decoded sequences
          - 4 rate points each, two constraint conditions "low delay" (LD) and "random access" (RA)
          - SDR: 5× HD (both LD and RA), 5× UHD-4K (only RA)
          - HDR: 5× HD (PQ grading), 3× UHD-4K (HLG grading)
          - 360°: 5 sequences 6K/8K for the full panorama
      • Double stimulus test with two hidden anchors, HEVC-HM and JEM
          - Rate points were defined such that the lowest rate typically gave less than "fair" quality for HEVC, but was still possible to code
          - Quality was judged to be distinguishable when confidence intervals were non-overlapping
      • Evaluation: three ways of judging benefit
          - Mean MOS over all test cases (28×4 test points: 23×4 C1, 5×4 C2)
          - Count cases where a proposal was visually better/worse than JEM
          - Count cases where a proposal was visually better than HEVC (HEVC at higher rate point)
      • Reports: Input subjective test [JVET-J0080], output CfP results [JVET-J1003]
  • 87. Performance
      • Measured by objective performance (PSNR), best performers report >40% bit rate reduction compared to HEVC, >10% compared to JEM (for the SDR case)
          - Similar ranges for HDR and 360°
          - Obviously, proposals with more elements show better performance
          - Some proposals showed similar performance as JEM with significant complexity/run time reduction
          - 2 proposals used some degree of subjective optimization, not measurable by PSNR
      • Results of subjective tests generally show a similar (or even better) tendency
          - Benefit over HEVC very clear
          - Benefit over JEM visible at various points
          - Proposals with subjective optimization also showing benefit in some cases
  • 88. Performance
      • JVET-J1003: Report of subjective evaluation contains 28 plots as shown, one per sequence
      • Count significant cases of positive/negative benefit with non-overlapping confidence interval against JEM
      • Figure: HM, JEM and proposals ranked by MOS (per rate point); +1 credit for significantly better than JEM, -1 credit for significantly worse
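The significance count described on this slide can be sketched as follows. The sketch assumes that each test point provides a mean opinion score and the half-width of its confidence interval, for both the proposal and the JEM anchor. The function names and the data layout are hypothetical, not taken from JVET-J1003.

```python
def significance_credit(mos, ci, jem_mos, jem_ci):
    """+1 / -1 if the confidence intervals do not overlap, 0 otherwise.

    ci and jem_ci are half-widths of the confidence intervals (an assumption
    of this sketch).
    """
    if mos - ci > jem_mos + jem_ci:
        return 1    # proposal significantly better than JEM
    if mos + ci < jem_mos - jem_ci:
        return -1   # proposal significantly worse than JEM
    return 0        # intervals overlap: no credit either way


def significance_score(test_points):
    """Sum of credits over all (mos, ci, jem_mos, jem_ci) test points."""
    return sum(significance_credit(*point) for point in test_points)
```

With 4 rate points per sequence, the score of a proposal in a category with n sequences lies between -4n and +4n.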
  • 89. Performance SDR
      • "Mean" and "significance-count" methods suggested at least 7 proposals that were obviously better than JEM
      • Significance vs. JEM (range -60 ... +60), ranked: Pxx 10, Pxx 8, Pxx 8, Pxx 6, Pxx 6, Pxx 6, Pxx 6, Pnn 3, Pnn 3, Pnn 2, Pnn 2, Pnn 1, Pnn 1, JEM 0, Pnn 0, Pnn -1, Pnn -1, Pnn -1, Pnn -2, Pnn -2, Pnn -2, Pnn -3, Pnn -4, HM -36
      • Mean MOS, ranked: Pxx 6.53, Pxx 6.46, Pxx 6.41, Pxx 6.37, Pxx 6.33, Pxx 6.33, Pxx 6.26, Pnn 6.23, Pnn 6.17, Pnn 6.15, Pnn 6.13, Pnn 6.11, Pnn 6.04, Pnn 6.04, Pnn 6.03, Pnn 6.03, Pnn 6.01, JEM 6.01, Pnn 6.00, Pnn 5.96, Pnn 5.94, Pnn 5.88, Pnn 5.86, HM 4.57
  • 90. Performance HDR / 360°
      • Similar tendency in HDR and 360° categories
      • Mostly the same coding tools as in SDR provide good benefit
      • HDR
          - Mean MOS, ranked: Pxx 6.04, Pxx 6.00, Pxx 5.94, Pxx 5.93, Pxx 5.86, Pnn 5.85, Pnn 5.80, Pnn 5.67, JEM 5.62, Pnn 5.60, Pnn 5.59, Pnn 5.45, Pnn 5.11, HM 4.14
          - Significance vs. JEM (range -32 ... +32), ranked: Pxx 7, Pxx 3, Pxx 2, Pxx 2, Pxx 2, Pnn 1, Pnn 1, JEM 0, Pnn 0, Pnn 0, Pnn -1, Pnn -1, Pnn -6, HM -20
      • 360°
          - Mean MOS, ranked: Pxx 6.20, Pxx 6.19, Pxx 6.06, Pxx 6.03, Pxx 5.99, Pxx 5.96, Pxx 5.86, Pnn 5.69, Pnn 5.67, Pnn 5.51, Pnn 5.45, JEM 5.11, HM 3.79, Pnn 3.45
          - Significance vs. JEM (range -20 ... +20), ranked: Pxx 9, Pxx 9, Pxx 8, Pnn 7, Pxx 7, Pxx 6, Pxx 5, Pxx 4, Pnn 2, Pnn 1, Pnn 1, JEM 0, HM -9, Pnn -12
  • 91. Performance compared to HEVC
      • How often are the best performing proposals better than HEVC at a higher rate?
      • Note: R1: 1 Mbit/s; R2: 1.6 Mbit/s; R3: 2.8 Mbit/s; R4: 4.6 Mbit/s
      • Table (best proposal at the first rate point vs. HM at the second rate point):
          Pbest vs HM           R1 vs R2  R1 vs R3  R1 vs R4  R2 vs R3  R2 vs R4  R3 vs R4
          SDR UHD               60%       40%       0%        80%       0%        20%
          SDR HD/RA             40%       0%        0%        20%       0%        20%
          SDR HD/LD             40%       0%        0%        0%        0%        0%
          HLG                   67%       0%        0%        67%       0%        33%
          PQ                    40%       0%        0%        40%       0%        20%
          360°                  40%       20%       0%        20%       0%        60%
          Rate saving (approx.) 37.5%     65%       78%       43%       65%       39%
  • 92. Performance compared to HEVC
      • How often is HEVC better than the best performing proposals at a lower rate?
          - Note: 1 - xx% means that the best performing proposal is equal or better
      • Note: R1: 1 Mbit/s; R2: 1.6 Mbit/s; R3: 2.8 Mbit/s; R4: 4.6 Mbit/s
      • Table (best proposal at the first rate point vs. HM at the second rate point):
          HM vs Pbest           R1 vs R2  R1 vs R3  R1 vs R4  R2 vs R3  R2 vs R4  R3 vs R4
          SDR UHD               0%        0%        60%       0%        0%        0%
          SDR HD/RA             0%        60%       100%      0%        80%       0%
          SDR HD/LD             0%        60%       80%       0%        80%       0%
          HLG                   0%        0%        100%      0%        67%       0%
          PQ                    0%        60%       100%      0%        60%       0%
          360°                  0%        40%       80%       0%        40%       0%
          Rate saving (approx.) 37.5%     65%       78%       43%       65%       39%
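The rate-saving row of these tables follows directly from the four rate points given in the note. A short sketch reproduces the arithmetic, taking the rates in Mbit/s from the slide. The names `RATES` and `rate_saving` are hypothetical.

```python
# Rate points from the slide note, in Mbit/s.
RATES = {"R1": 1.0, "R2": 1.6, "R3": 2.8, "R4": 4.6}


def rate_saving(lower, higher):
    """Fraction of bit rate saved when coding at `lower` instead of `higher`."""
    return 1.0 - lower / higher


for low, high in [("R1", "R2"), ("R1", "R3"), ("R1", "R4"),
                  ("R2", "R3"), ("R2", "R4"), ("R3", "R4")]:
    saving = 100.0 * rate_saving(RATES[low], RATES[high])
    print(f"{low} vs {high}: {saving:.1f}% rate saving")
```

For example, a proposal at R2 that matches HEVC at R4 uses 1.6 instead of 4.6 Mbit/s, which is a saving of about 65%.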
  • 93. Performance compared to HEVC
      • Over all categories, the subjective quality of the best performing proposals is always equal to, and in about 1/3 of the cases better than, HEVC at the next higher rate point (with approx. 40% less rate)
      • In the SDR-UHD category, the subjective quality of the best performing proposals is always equal to, and in about 1/5 of the cases better than, HEVC at the second higher rate point (with approx. 65% less rate)
      • Though it is not always the same proposal that performs best at a given rate point, it can be anticipated that the merits of different proposals can be combined
      • 50% (or more) bit rate reduction at the same quality will probably be achievable by the new standard
  • 94. CfP analysis: What was proposed?
      • New elements (some come with high complexity):
          - Decoder-side estimation for mode/MV derivation and sample prediction, both in intra and inter coding (JEM)
          - Finer partitioning: asymmetric, geometric
          - Neural networks for prediction, loop filtering, upsampling, (encoder control)
          - Additional elements using template matching
          - Intra block copy / current picture referencing
          - Additional non-linear, de-noising and statistics-based loop filters
          - Additional linear and non-linear elements in prediction
      • HDR specific:
          - New adaptive reshaping and quantization, also in-loop
          - HDR-specific modifications of existing tools, e.g. deblocking
      • 360-video specific:
          - Variants of projection formats, geometry-corrected face boundary padding
          - Modification and disabling of existing tools at face boundaries
  • 95. VVC Working Draft and Test Model 1
      • VVC Working Draft 1 / Test Model 1 (VTM1): basic approach built on a "reduced HEVC" starting point
      • VTM block structure
          - Unified tree (coding block unites prediction and transform)
          - CTU size 128×128, rectangular blocks (dyadic sizes), smallest luma size 4×4
          - Maximum transform size 64×64
      • VTM: some removed elements of HEVC:
          - Mode dependent transform (DST-VII), mode dependent scan
          - Strong intra smoothing
          - Sign data hiding in transform coding
          - Unnecessary high-level syntax (e.g. VPS)
          - Tiles and wavefront
          - Quantization weighting
  • 96. Documents issued after CfP Results
      • Report of Results from the Call for Proposals on Video Compression with Capability beyond HEVC [JVET-J1003]
          - Documentation of results per sequence, marking HM and JEM anchors, not identifying individual proponents
          - Assessment of qualitative (and as far as possible quantitative) benefit of submitted technology compared to anchors
      • Working Draft 1 of Versatile Video Coding [JVET-J1001]
          - "Reduced" HEVC plus quad/binary/ternary tree structure
      • Test Model 1 of Versatile Video Coding (VTM 1) [JVET-J1002]
          - Corresponding encoder and algorithm description
  • 97. Benchmark Set and its role
      • Benchmark Set (BMS) was defined in addition to VTM, including the following well-known JEM tools:
          - 65 intra prediction modes
          - Coefficient coding
          - AMT + 4×4 NSST
          - Affine motion
          - Geometry based adaptive loop filter
          - Subblock merge candidate (ATMVP)
          - Adaptive motion vector precision
          - Decoder motion vector refinement
          - LM Chroma mode
      • Purpose: testing the benefit of technology against a better performing set
          - Holding extra potential features we aren't so sure about yet
          - Superset of VTM; should have significant gain over the VTM
          - Unveils in CEs whether gains are independent, or how much gain remains when a tool is combined with a set of more performant tools
          - Can be a common basis for further CE tests of modified versions of features
          - Not necessarily ultra-low complexity, but the encoder needs to be runnable in a reasonable amount of time
  • 98. Quad/binary/ternary partitioning
      • The only fundamentally new element of version 1
      • Simple multi-type tree split, can be alternated
      • Example: figures from JVET-J1001
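The three split types can be illustrated by the sub-block sizes they produce. The sketch below assumes the commonly described geometry: a quad split into four equal quadrants, binary splits into two halves, and ternary splits in a 1:2:1 ratio. The mode names and the function are hypothetical and are not taken from JVET-J1001.

```python
def split_block(width, height, mode):
    """Sizes (width, height) of the sub-blocks produced by one split.

    Assumed geometry: QT = four quadrants, BT = two halves, TT = 1:2:1 ratio.
    "_H" splits along horizontal lines, "_V" along vertical lines.
    """
    if mode == "QT":
        return [(width // 2, height // 2)] * 4
    if mode == "BT_H":
        return [(width, height // 2)] * 2
    if mode == "BT_V":
        return [(width // 2, height)] * 2
    if mode == "TT_H":
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    if mode == "TT_V":
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    raise ValueError(f"unknown split mode: {mode}")


# Alternating splits: quad split of a 128x128 CTU, then a vertical ternary split of one quadrant.
quadrant = split_block(128, 128, "QT")[0]
print(split_block(*quadrant, "TT_V"))
```

Because every split keeps dyadic block sizes, the resulting rectangles can be split again by any of the other types, which is what allows the split types to alternate along the tree.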
  • 99. Performance of VTM1 and initial BMS compared to HEVC
      • PSNR-based Common Test Conditions (CTC): BD-rate savings relative to the HEVC reference software (10 bit)
      • Note that the gain over HEVC with CTC is lower than with the CfP test set (other sequences, higher rates, lower resolutions)
      • Table:
          vs HM16.18    VTM    BMS
          4k UHD        10%    28%
          1080p         8%     22%
          WVGA          6%     19%
          Average       8%     23%
          Decode time   0.8×   2×
          Encode time   2×     9×
  • 100. Latest status (from last week)
      • Working Draft 2 of Versatile Video Coding [JVET-K1001]
          - Normative text specification
          - No descriptive text of building blocks "borrowed" from HEVC: these would anyway be placeholders which are likely to be replaced later
          - Starting from this meeting, precise specification of more substantial newly adopted building blocks is being added (see subsequent slides)
      • Test Model 2 of Versatile Video Coding (VTM 2) [JVET-K1002]
          - Encoder and algorithm description
          - Has corresponding software implementation
  • 101. Latest status (from last week): New elements of WD2 / VTM2
      • QT/BT/TT no longer "placeholder"
      • Remove unnecessary partitioning restrictions
      • Implicit splitting at picture boundaries
      • Separate trees for intra slices
      • Position Dependent Prediction Combination
      • Cross Component Linear Model
      • 87 intra modes (wide angles included), 3 MPM, TU binarization
      • Affine MC (4×4 fixed subblock size, 4/6 parameter model switching at CU level)
      • Affine MV coding
          - List construction contains inheritance and derivation spatial/temporal
          - Improved difference coding
      • Adaptive motion vector resolution (AMVR)
      • Subblock MC (4×4) from ATMVP merge, 8×8 granularity motion vector storage [High precision]
  • 102. Latest status (from last week): New elements of WD2 / VTM2
      • Multiple transform selection (all are DCT/DST types) for intra and inter
      • Increase max QP from 51 to 63
      • Modified entropy coding supporting dependent quantization
      • Sign data hiding reinstated from HEVC
      • Adaptive loop filter
          - 4×4 classification based (gradient strength and orientation) for luma
          - 7×7 luma, 5×5 chroma filters
          - Enabling flag at CTU level
      • Basic high-level syntax (SPS, PPS, slice)
      • Update of BMS contains
          - Generalized bi-prediction (a kind of local weighted prediction)
          - Decoder-side estimation: BIO, simplified bilateral matching
          - Current picture referencing (aka intra block copy)
  • 103. Wide angular modes
      • For rectangular blocks, prediction directions with angles beyond 45/135 degrees are reasonable
      • This can be implemented by adding modes at both ends
      • VTM2 now uses a total of 85 directional intra modes (plus DC and planar)
      • Figures from JVET-K0500
  • 104. Dependent quantization
      • Alternating between two quantizers based on a state transition rule allows selecting an optimum sequence of reconstruction values (e.g. by trellis-like search)
      • Decoder needs to implement the sequential state transition rule
      • CABAC contexts need to be modified as well for this case (greater than 0/1/2/... would have a different meaning depending on Q0/Q1)
      • State transition table (start state 0; k is the quantization index):
          current state   next state for (k & 1) == 0   next state for (k & 1) == 1
          0               0                             2
          1               2                             0
          2               1                             3
          3               3                             1
      • Figures from JVET-K0071: state transition diagram for the quantizers Q0 and Q1; reconstruction levels of Q0 and Q1 as multiples of Δ
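The decoder-side state machine on this slide can be sketched as follows. The transition table is taken from the slide. The mapping of states to quantizers (states 0 and 1 use Q0, states 2 and 3 use Q1) and the reconstruction rules (Q0 reconstructs even multiples of Δ, Q1 reconstructs odd multiples of Δ plus zero) are assumptions read off the figure residue, and the function names are hypothetical.

```python
# next state indexed by [current state][parity of quantization index k]
NEXT_STATE = {0: (0, 2), 1: (2, 0), 2: (1, 3), 3: (3, 1)}


def sgn(k):
    """Sign of k as -1, 0 or +1."""
    return (k > 0) - (k < 0)


def dequantize(indices, delta):
    """Reconstruct values from quantization indices in coding order, starting in state 0."""
    state = 0
    values = []
    for k in indices:
        if state < 2:
            values.append(2 * k * delta)             # assumed Q0: even multiples of delta
        else:
            values.append((2 * k - sgn(k)) * delta)  # assumed Q1: odd multiples of delta and zero
        state = NEXT_STATE[state][k & 1]             # parity of k drives the transition
    return values


print(dequantize([1, 1, 0, -2], 1))
```

Because the quantizer used for each coefficient depends on the parities of all previous indices, the decoder has to process the indices strictly in coding order, which is the sequential dependency the slide points out.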
  • 105. Further promising fields
      • Ongoing investigations on
          - Improved merge, intra prediction, etc.
          - Decoder-side estimation with low complexity
          - Multi-hypothesis prediction and OBMC
          - Diagonal and other geometric partitioning
          - Secondary transforms
          - New approaches to loop filtering, reconstruction and prediction filtering (denoising, non-local, diffusion based, bilateral, etc.)
          - Current picture referencing, template matching, palette mode
          - Neural networks for loop filtering and prediction
      • Core experiments (CE) process
          - Coordinated effort to investigate performance and complexity impact of proposed elements
          - Typically based on a specific technology proposed, or a combination of several technologies
          - Allows detailed study / cross-checks by other interested parties
          - Allows identifying which elements of a proposal are useful, whether it is not useful at all, or whether further improvements are needed
  • 106. Geometric Partitioning (GEO)
      • Motivation: Towards object-oriented coding
          ◦ Follow object boundaries more closely
          ◦ Fewer coding artifacts where it matters
      • Prediction, transform and coding driven by actual object shape under an RD constraint
          ◦ Inter- and intra-predicted segments for handling of disocclusions
          ◦ Overlapped wedge-based filtering at the partition boundary
          ◦ Shape-adaptive DCT for spatially localized transform coding
      • Source: M. Bläser, J. Sauer, and M. Wien, “Description of SDR and 360° video coding technology proposal by RWTH Aachen University,” Doc. JVET-J0023, Joint Video Experts Team of ITU-T VCEG and ISO/IEC MPEG, San Diego, USA, 10th meeting, Apr. 2018
  • 107. GEO: Partitioning Coding and Prediction
      • GEO available for all block sizes ≥ 8×8 luma samples
      • Partitioning is represented by two coordinate points P0 and P1 on the block boundary
      • Prediction of the two coordinate points P0 and P1 from 16 pre-defined templates (scaled for non-square blocks)
          ◦ Alternative: Spatial or temporal prediction
          ◦ Refinement: block-size-dependent offset
      • Integration with AMVP, MERGE, FRUC (no AFFINE (yet))
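How two boundary points P0 and P1 define a partition can be illustrated with a hard binary mask. This is only a sketch under simplifying assumptions: sample centres are taken at half-integer positions, samples exactly on the line are assigned to segment 0, and the overlapped filtering at the partition boundary mentioned in the proposal is not modelled. The names `geo_mask` and `blend` are hypothetical.

```python
def geo_mask(width, height, p0, p1):
    """Binary mask: 1 for samples on one side of the line P0 -> P1, else 0."""
    x0, y0 = p0
    x1, y1 = p1
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Sample centre (assumed at half-integer positions)
            px, py = x + 0.5, y + 0.5
            # Sign of the 2-D cross product tells on which side of the line we are
            cross = (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)
            row.append(1 if cross > 0 else 0)
        mask.append(row)
    return mask


def blend(pred_a, pred_b, mask):
    """Compose the block prediction from two segment predictions (hard switch)."""
    return [
        [a if m else b for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(pred_a, pred_b, mask)
    ]


mask = geo_mask(8, 8, p0=(0, 0), p1=(8, 8))  # diagonal split of an 8x8 block
```

With the diagonal above, the lower-left triangle (rows below the diagonal) forms segment 1 and the remaining samples form segment 0.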
  • 108. Results for GEO
      • Figure: JEM 7.0 vs. JEM 7.0 + GEO (picture shown: JEM 7.0)
      • Visual improvements at object boundaries
          ◦ Sharper contours
          ◦ Less staircase effect
          ◦ More background details
      • Objective gains (BD-rate savings)
          ◦ Against HEVC: ~33% on C1, ~25% on C2
          ◦ Against JEM: ~0.8% for both C1 and C2
  • 109. Results for GEO
      • Figure: JEM 7.0 vs. JEM 7.0 + GEO (picture shown: JEM 7.0 + GEO)
      • Visual improvements at object boundaries
          ◦ Sharper contours
          ◦ Less staircase effect
          ◦ More background details
      • Objective gains (BD-rate savings)
          ◦ Against HEVC: ~33% on C1, ~25% on C2
          ◦ Against JEM: ~0.8% for both C1 and C2
  • 110. Current Core Experiments
      • CE1: Partitioning
      • CE2: Adaptive loop filter
      • CE3: Intra prediction and mode coding
      • CE4: Inter prediction and MV coding
      • CE5: Arithmetic coding engine
      • CE6: Transforms and transform signalling
      • CE7: Quantization and coefficient coding
      • CE8: Current picture referencing
      • CE9: Decoder side MV derivation
      • CE10: Combined and multi-hypothesis prediction
      • CE11: Deblocking
      • CE12: Mapping for HDR content
      • CE13: Coding tools for omnidirectional video
      • CE14: Post-reconstruction filtering
      • CE15: Palette mode
  • 111. AOM's AV1
      • Elements technically similar to HEVC/JEM/VVC or JVET study
          ◦ Partitioning: 128×128 "superblock" with equivalent to quad/binary sub-splits (no 1:2:1 ternary)
          ◦ Directional intra prediction, 56 directional modes, DC and "true motion" mode
          ◦ Chroma from luma prediction
          ◦ Intra block copy
          ◦ Up to 7 reference frames (allows similar structure to hierarchical B)
          ◦ Spatial/temporal motion vector referencing
          ◦ Affine motion compensation (pixel based)
          ◦ OBMC
          ◦ DCT/DST based transforms, and skip
          ◦ Adaptive arithmetic coder
          ◦ Context-based transform coefficient coding
          ◦ Film grain synthesis
          ◦ Adaptive loop filter (Wiener like)
          ◦ Deblocking
  • 112. AOM's AV1
      • Other elements
          ◦ Recursive-filtering intra predictor
          ◦ Prediction based on color palette
          ◦ Wedge-based prediction, 16 diagonal/asymmetric modes for square/rectangular blocks, similar to GEO
          ◦ Difference-modulated prediction (based on the difference between two references)
          ◦ Contrast enhancement/deringing loop filter
          ◦ Self-guided filter (somewhat similar to bilateral and diffusion filters)
          ◦ Super-resolution coding mode (with coding at lower resolution)
      • Performance
          ◦ Owners report 20% average bit rate reduction (PSNR based) compared to an X.265-style HEVC encoder, on a set of full HD sequences
          ◦ Other reports indicate much less gain, or even losses compared to the HM encoder (using sequences from JVET's CTC)
          ◦ According to the same reports, JEM performs significantly better than AV1
          ◦ Some of those reports may not have used the newest JEM version, though
  • 113. Part V: Exploratory trends and perspectives
  • 114. Quality metrics
      • PSNR is mostly used for video quality assessment
          ◦ It targets pixel fidelity, which does not necessarily reflect subjective quality
      • Specific artifacts produced by video codecs:
          ◦ Blockiness, blur and banding
          ◦ Motion jerkiness
          ◦ Time-varying edge noise ("mosquito effect")
      • Alternative metrics may be clustered into
          ◦ Full-reference quality metrics
          ◦ Reduced-reference quality metrics
          ◦ No-reference quality metrics
      • Note that subjective testing methods also require some reference (e.g. impairment compared to the original or another anchor)
          ◦ Full-reference metrics are most reliable and are also typically used for encoder decisions
      • Note: The subsequent slide gives an example (SSIM); it is not claimed that this is the best!
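For reference, PSNR as a pixel-fidelity measure can be computed as below. This is a minimal sketch that assumes 8-bit samples (peak value 255) given as flat sequences; the function name `psnr` is chosen here for illustration only.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally long sample sequences."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


print(round(psnr([100, 100], [101, 99]), 2))  # mean squared error of 1
```

Since the measure depends only on the mean squared error, two reconstructions with very different visible artifacts can obtain the same PSNR, which is the limitation the slide points out.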
  • 115. Perceptually adapted quality metrics example: SSIM
      • Example of another full-reference metric which better matches subjective quality, at least for images
      • The Structural SIMilarity index (SSIM) [Wang et al. 2004] measures the structural distortion by exploring three components: luminance, contrast and structural changes.
          ◦ Luminance: $l(x,y) = \dfrac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$
          ◦ Contrast: $c(x,y) = \dfrac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$
          ◦ Structure comparison: $s(x,y) = \dfrac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}$
          ◦ Combination: $\mathrm{SSIM}(x,y) = [l(x,y)]^{\alpha} \cdot [c(x,y)]^{\beta} \cdot [s(x,y)]^{\gamma}$
      • Numerous variants:
          ◦ Computation separately for regions
          ◦ Weighting by amount of motion and frame averaging for video
          ◦ Computation in the complex wavelet domain or over multiple scales for frequency weighting (e.g. MS-SSIM, multi-scale)
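The three SSIM components can be evaluated directly from sample statistics. The sketch below computes a single global SSIM value over two flat sample sequences. The local windowing used in practice is omitted, and the constants (K1 = 0.01, K2 = 0.03, C3 = C2/2) and exponents (α = β = γ = 1) are the commonly used defaults, assumed here rather than taken from the slide. The function name `ssim_global` is hypothetical.

```python
from statistics import fmean


def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Global SSIM of two equally long sample sequences (no local windows)."""
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    c3 = c2 / 2  # common simplification

    n = len(x)
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sd_x, sd_y = var_x ** 0.5, var_y ** 0.5

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    # Exponents alpha = beta = gamma = 1 assumed
    return luminance * contrast * structure


x = [10, 50, 90, 130, 170, 210]
print(ssim_global(x, x))        # identical signals give a value of (about) 1
print(ssim_global(x, x[::-1]))  # reversed structure gives a much lower value
```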
  • 116. Perceptual coding: Texture analysis and synthesis
      • Textures with a large amount of detail and/or motion are often extremely challenging for video codecs
      • On the other hand, the exact pixel-wise appearance is largely irrelevant for human observers, whereas degradation of visual quality is critical
      • Textures in videos can be static or dynamically changing over time
          ◦ Static textures are basically rigid (but may be moving globally)
          ◦ Dynamic textures have a high amount of irregular local motion
          ◦ Examples: water, smoke, head-and-shoulder sequences
      • Both categories should have some stationarity properties in space and/or time, to allow modelling as a random process expressed by a parametric description, for example:
          ◦ Spectral properties
          ◦ Moments (marginal statistics and covariance statistics)
          ◦ Random field models
      • In the case of dynamic texture, modelling the motion properties is relevant as well; the motion can also be understood as a random field with a certain amount of variation
  • 117. Static texture synthesis
      • The example is based on a parametric statistical description in the complex wavelet domain (steerable pyramid), with lowpass baseband and four directional orientations in bandpass layers [Portilla, Simoncelli 2000]
      • Efficient coding of the parameters needed for synthesis by [Thakur, Ray 2016]
      • Marginal statistics expressed as scalar values
      • Auto- and cross-correlation statistics compressed via DCT
      • Figure: Reference; HEVC intra coding, 0.223 bpp; Thakur et al., 0.213 bpp
  • 118. Dynamic texture synthesis method
      • Block diagram with an analysis stage and a synthesis stage; recoverable block labels:
          ◦ T original frames; dense OF between adjacent frames; MVF MV T(i,j); analyse motion distribution; MCM M; discard non-probable MV combinations; compressed MCM Mc; discard intermediate frames
          ◦ Derive motion vectors; synthesized MVF MV T'(i,j); invert MVF; frame warping and blending; T-2 synthesized frames
      • Source: Chubach et al. 2017
  • 119. Dynamic texture synthesis vs. HEVC at same rate
      • Figure: HEVC vs. synthesis result with 6 of 8 frames synthesized
  • 120. Learning based approaches: Overview
      • Recently, many signal processing tasks have been solved by employing machine learning, deep learning and convolutional neural networks (CNN)
      • Advantages for video compression could be as follows:
          ◦ Systematic approach of optimizing with big data sets (rather than hand-crafted design)
          ◦ Detection and exploitation of nonlinear dependencies in images and video
          ◦ Inclusion of perceptual criteria by mimicking human observer behaviour
      • On the downside, both training and running CNN algorithms, e.g. for encoder decisions or at the decoder, may be overly complex
      • Types of NN that have been proposed for image/video compression
          ◦ Autoencoders
          ◦ Adversarial networks
          ◦ Recurrent networks, particularly based on LSTM (long short-term memory) elements
  • 121. Convolutional Neural Networks: Autoencoders (AE)
      • An autoencoder is a deep (convolutional) neural network with a sparse hidden layer that represents the code
      • The encoder typically performs successive filtering and downsampling steps on the input x, layer by layer (note the conceptual similarity with transform coding!)
      • The decoder performs complementary upsampling steps and generates the output y
      • Encoder and decoder are trained jointly such that
          ◦ The difference between x and y is minimized w.r.t. some distortion
          ◦ The code z is as sparse (minimum amount of information) as possible
          ◦ To achieve the latter, use Bayes' formula P(z|x) ∝ P(x|z)P(z) and minimize the Kullback-Leibler divergence of conditional probabilities [Kingma, Welling 2014]
      • Figure (source: Wikipedia): input x, code z = F(x), output y = G(z)
  • 122. Convolutional Neural Networks: Generative Adversarial Networks (GAN)
      • The generator net G generates samples y from random variables z (G would be the decoder, z the code)
      • The discriminator net D decides whether the samples could match real-world images x, which stem from an unknown distribution P(x)
      • Generator and discriminator nets are trained iteratively by optimizing a value function V
      • Minimax optimization:
          ◦ Train D such that V is maximized
          ◦ Train G such that V is minimized
      • Problem: There is no corresponding mapping from x to z (no encoder)
      • Solution (e.g. [Santurkar et al. 2017]): Combination of AE and GAN, i.e. train F(x) from the AE jointly with G(z) and D(⋅)
      • Figure (source: Slideshare.net, K. McGuinness): z, x, y, G(z), D(x) or D(y)
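The value function itself did not survive in this transcript. For orientation, the minimax objective commonly used for GANs (Goodfellow et al. 2014) is given below in the notation of the slide; that the slide showed exactly this form is an assumption, and P(z) denotes the assumed prior distribution of the code.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim P(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim P(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The first term rewards D for accepting real images x, the second for rejecting generated samples y = G(z); G is trained to make the second term small.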
  • 123. Convolutional Neural Networks: General problems and possible solutions
      • Variable-rate and variable-size coding is not straightforward
          ◦ Option to operate over small patches / blocks
          ◦ Train separately for different content complexity
          ◦ Code residual differences
      • Cost functions for rate-distortion optimization are not straightforward to implement
          ◦ Option to re-formulate the rate constraint as an energy minimization problem
          ◦ Hybrid solutions where conventional entropy coding is operated after the network output at the encoder
      • None of these solutions is guaranteed to lead to a consistent optimum, and they may need to be driven by some external decision mechanism
  • 124. Trained non-linear transforms
      • An autoencoder could be interpreted as a monolithic non-linear transform (though operating with local kernels); see the previously used notation, marked in light green in the figure
      • A similar approach is proposed in [Ballé et al. 2017], with additional criteria for rate-distortion optimization and quantization / entropy coding on the sparse representation (called y here)
      • Perceptual optimization based on nonlinear "generalized divisive normalization" and L2 norm minimization in the nonlinear space
      • The authors report significant improvement on detail structures, and also improved MS-SSIM compared to conventional codecs; the transform is optimized based on a cost criterion (formula shown on the slide)
      • Figure (source: Ballé et al. 2017): (x), (y), (z), F(x), G(z'), (z')
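The "generalized divisive normalization" (GDN) nonlinearity can be illustrated on a small vector of channel responses. The sketch assumes the commonly cited form y_i = x_i / sqrt(β_i + Σ_j γ_ij · x_j²); in the actual method β and γ are trained parameters, whereas the values used here are made up for illustration, and the function name `gdn` is hypothetical.

```python
def gdn(x, beta, gamma):
    """Divisive normalization across channels.

    y_i = x_i / sqrt(beta_i + sum_j gamma[i][j] * x_j**2)
    beta and gamma stand in for trained parameters (assumed non-negative).
    """
    return [
        xi / (beta[i] + sum(g * xj * xj for g, xj in zip(gamma[i], x))) ** 0.5
        for i, xi in enumerate(x)
    ]


# Illustrative parameters: each channel is normalized by the joint energy
x = [3.0, 4.0]
print(gdn(x, beta=[0.0, 0.0], gamma=[[1.0, 1.0], [1.0, 1.0]]))  # [0.6, 0.8]
```

Each response is divided by a measure of the joint activity of its neighbouring channels, so that strong co-occurring responses are compressed relative to isolated ones.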
  • 125. NN for video
      • All methods discussed so far were developed for still image coding, and could be used in intra coding for video
      • Main problem: Motion compensation is a very effective tool, and can hardly be trained into a network (or would be tremendously more complex than conventional motion estimation)
      • Some work on using CNN for
          ◦ Sub-pel interpolation
          ◦ Resolution up-conversion
          ◦ Post-processing
          ◦ Texture synthesis and inpainting
      • It is also not as simple to train for perceptual criteria in video
  • 126. NN for video
      • NN-based approaches have so far been more successful in still image coding than in video coding
          ◦ Perceptual criteria are also better understood for images
      • In video coding, motion compensation is a highly effective key component
          ◦ It requires motion estimation, for which "conventional" algorithms appear to be less complex
          ◦ Analogy: Eye tracking (the brain processes a motion-compensated input)
      • CNN have been demonstrated to provide benefit in the context of video coding for
          ◦ Resolution up-conversion
          ◦ Post-processing and loop filtering
          ◦ Intra coding
          ◦ Encoder optimization, in particular partitioning, which is basically a segmentation problem
  • 127. Variable-resolution coding
      • Switching to lower resolution is common (and necessary) when the data rate is low
      • Video varies locally in detail, and may not require encoding at full resolution everywhere
      • Lower resolution may also be useful with high motion, motion blur, etc.
      • Coding less information in such less relevant areas can save data rate
      • The tools "Reduced Resolution Update" and "Dynamic Resolution Conversion" were included in MPEG-4 part 2 and H.263+, but were not well understood at that time
      • Requires tools for
          ◦ Downsampling when generating the prediction from the reference
          ◦ Signalling the coding with variable resolution
          ◦ Upsampling for generating the full-resolution picture
      • Three examples shown subsequently:
          ◦ Down-/upsampling using neural networks / conventional filters
          ◦ Coding B pictures of dynamic texture with low resolution
          ◦ Dictionary-based super-resolution upsampling
  • 128. CNN for resolution up-conversion
      • Basic idea of dynamic resolution coding:
          ◦ Downsample and code at lower resolution (less bit rate cost)
          ◦ Upsample at the decoder side to full resolution
          ◦ The encoder decides between full resolution and conventional or CNN-based down- and upsampling
          ◦ The CNN-based variant could generate super-resolution upsampling, sharper edges, etc.
      • Can be implemented in combination with intra and inter prediction coding
      • Operated on a block-by-block basis
      • Figure from JVET-J0032
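The block-wise encoder decision between full-resolution and reduced-resolution coding can be sketched with a toy rate-distortion comparison. Everything below is a deliberately simplified illustration: 1-D blocks of even length, pair averaging and sample repetition instead of conventional or CNN-based filters, uniform quantization, and a hypothetical rate model with a fixed number of bits per coded sample. None of these choices is taken from JVET-J0032.

```python
def downsample2(samples):
    """Halve the resolution by averaging sample pairs (even length assumed)."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]


def upsample2(samples):
    """Double the resolution by sample repetition (stand-in for a real filter)."""
    return [v for v in samples for _ in range(2)]


def quantize(samples, step):
    """Uniform quantization followed by reconstruction."""
    return [round(v / step) * step for v in samples]


def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def choose_resolution(block, step, lam, bits_per_sample=8):
    """Pick the mode with the lower cost J = D + lam * R (hypothetical rate model)."""
    full = quantize(block, step)
    low = upsample2(quantize(downsample2(block), step))
    cost_full = mse(block, full) + lam * bits_per_sample * len(block)
    cost_low = mse(block, low) + lam * bits_per_sample * (len(block) // 2)
    return ("low", low) if cost_low < cost_full else ("full", full)


smooth = [10, 10, 20, 20, 30, 30, 40, 40]
detailed = [0, 100, 0, 100, 0, 100, 0, 100]
print(choose_resolution(smooth, step=1, lam=1.0)[0])    # low
print(choose_resolution(detailed, step=1, lam=1.0)[0])  # full
```

The smooth block survives down- and upsampling without loss, so the lower rate wins; the detailed block loses its structure at low resolution and is coded at full resolution.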
  • 129. CNN for loop filtering
      • Loop filtering is common in video coding
          ◦ Removes compression artifacts from the reconstruction
          ◦ Improves prediction from reconstructed frames
      • Generally, signal-adaptive and non-linear filters
          ◦ E.g., de-blocking, de-ringing, de-banding
          ◦ Edge-adaptive and Wiener optimized
          ◦ Bilateral filters
          ◦ ...
      • CNN reconstruction provides additional gain (3-5% rate reduction) and might replace some conventional filters
      • Can be operated on a block basis; parallel processing is possible
      • Figures from JVET-I0022:
          ◦ Block-based processing: the picture is divided into blocks (Block1 ... Block20); a process unit is a block extended by padding (padding_size, 2*padding_size)
          ◦ Network structure: normalized Y/U/V and normalized QP map are concatenated, followed by Conv1 (5, 5, 45), Conv2 (3, 3, 54), Conv3 (3, 3, 58), Conv4 (3, 3, 48), Conv5 (3, 3, 51), Conv6 (3, 3, 40), Conv7 (3, 3, 31), Convolution8 (3, 3, 1) and a summation
          ◦ Legend: ConvL (M, N, KL) = ConvolutionL (M, N, KL) + ReLU; M: kernel width, N: kernel height, KL: kernel number
  • 130. Neural networks for intra prediction
      • Neural networks were demonstrated to provide improved intra prediction, compared to conventional directional and planar modes
      • Mostly fully connected networks have been used for this purpose (no convolutional layers)
      • Average rate reductions of 4-5% (for intra coding) have been reported
      • Examples of prediction demonstrate the benefit of non-linear processing
      • Figure from JVET-J0037; figures from Li et al., IEEE-TCSVT, July 2018
  • 131. Variable-resolution coding for dynamic texture (Thakur et al. 2017)
      • Key pictures coded with full resolution
      • Non-key pictures coded with reduced resolution
      • Upsampling based on motion-compensated steerable pyramid
      • Figure labels: original pictures; reconstructed key pictures; predicting non-key pictures; Ref pic L0; Ref pic L1; lowpass
  • 132. Variable-resolution coding for dynamic texture (Thakur et al. 2017)
      • Motion vectors are initially estimated from downsampled lowpass key pictures, then refined and applied in the bandpass and highpass components of non-key pictures
      • The authors report significant bit rate savings (20-30% on average) for dynamic texture content, while subjective quality is preserved compared to full-resolution coding
      • Figure labels: motion estimation; motion compensation; key picture (reference lowpass, bandpass, highpass); non-key picture (current lowpass, bandpass, highpass)
  • 133. Low-resolution coding with dictionary-based up-conversion (Schneider et al. 2017)
      • Low- and high-resolution dictionaries are trained jointly with a sparsity constraint (large database)
      • The up-converter searches for a small number of matching dictionary bases in the low-resolution dictionary, and applies the corresponding bases from the high-resolution dictionary
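The principle of coupled dictionaries can be sketched as follows: a low-resolution patch is approximated by a few low-resolution atoms, and the same coefficients are then applied to the paired high-resolution atoms. Greedy matching pursuit is used here purely as a stand-in for the sparse solver, which the slide does not specify, and the two tiny hand-made dictionaries (with unit-norm low-resolution atoms) are hypothetical; real dictionaries are trained jointly on a large database.

```python
def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def matching_pursuit(patch, atoms, max_atoms=2, tol=1e-9):
    """Greedy sparse code as a list of (atom_index, coefficient); unit-norm atoms assumed."""
    residual = list(patch)
    code = []
    for _ in range(max_atoms):
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeff = scores[best]
        if abs(coeff) < tol:
            break  # residual is already (almost) fully explained
        code.append((best, coeff))
        residual = [r - coeff * v for r, v in zip(residual, atoms[best])]
    return code


def upconvert(patch_lo, dict_lo, dict_hi, max_atoms=2):
    """Apply the low-resolution sparse code to the paired high-resolution atoms."""
    out = [0.0] * len(dict_hi[0])
    for idx, coeff in matching_pursuit(patch_lo, dict_lo, max_atoms):
        out = [o + coeff * v for o, v in zip(out, dict_hi[idx])]
    return out


# Hypothetical coupled dictionaries: atom i in dict_lo pairs with atom i in dict_hi
dict_lo = [[1.0, 0.0], [0.0, 1.0]]
dict_hi = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
print(upconvert([2.0, 0.0], dict_lo, dict_hi))  # [2.0, 2.0, 0.0, 0.0]
```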
  • 134. Low-resolution coding with dictionary-based up-conversion (Schneider et al. 2017)
      • The scheme is run with overlapping blocks
      • Provides sharp reconstruction of structures and edges
      • The authors report 2-3% rate gain when used in upsampling for HEVC scalable coding