4. Perceptual Video Coding Technique
[Figure: (digital) video, codec (encoder + decoder), and the Human Visual System (HVS) as end recipient; dimensions of coder performance, with labels D and R]
Basic principle of perceptual coding
- Treat all data that humans cannot perceive as superfluous, and discard it.
5. Rate-Distortion Theory
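Quantization: $x \rightarrow Q \rightarrow \hat{x}$
Quantization noise: $e = X - \hat{X}$
Distortion: $D = \sum_{i=1}^{N} p_i (x_i - \hat{x}_i)^2$, where the $p_i$ are probabilities
If X is Gaussian, $N(0, \sigma^2)$: $D(R) = \sigma^2 2^{-2R}$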
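As a numerical illustration of the Gaussian bound D(R) = σ² · 2^(−2R), the sketch below quantizes Gaussian samples with a uniform scalar quantizer, measures the output entropy (used here as the rate) and the MSE, and compares the MSE with the bound at that rate. It is a toy under stated assumptions, not part of the original slides: the function names, the step size 0.5, and the sample count are all illustrative choices.

```python
import math
import random


def gaussian_rd_bound(sigma2, rate_bits):
    """Distortion-rate bound of a N(0, sigma2) source: D = sigma2 * 2^(-2R)."""
    return sigma2 * 2.0 ** (-2.0 * rate_bits)


def uniform_quantize(x, step):
    """Mid-tread uniform quantizer: map x to the nearest multiple of step."""
    return step * round(x / step)


def empirical_rate_distortion(samples, step):
    """Return (entropy of the quantizer indices in bits/sample, MSE)."""
    counts = {}
    sq_err = 0.0
    for x in samples:
        idx = round(x / step)
        counts[idx] = counts.get(idx, 0) + 1
        sq_err += (x - step * idx) ** 2
    n = len(samples)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy, sq_err / n


rng = random.Random(0)  # fixed seed so the run is repeatable
sigma = 1.0
samples = [rng.gauss(0.0, sigma) for _ in range(20000)]

rate, mse = empirical_rate_distortion(samples, step=0.5)
bound = gaussian_rd_bound(sigma ** 2, rate)
print(f"rate={rate:.2f} bits, measured D={mse:.4f}, Gaussian bound D={bound:.4f}")
```

For a source that really is Gaussian, the scalar quantizer's measured distortion stays above the bound at the same rate. The bound can only be undercut when the Gaussian model does not match the data.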
6. Gap between theory and real codec
SPIHT can beat the Shannon bound!
The Gaussian prior is not valid for images!
Rate-distortion curves achieved with the SPIHT coder (dashed line) and with the
Shannon R-D theoretical bounds (solid line) corresponding to an i.i.d. zero-mean
Gaussian model for each wavelet subband (Gaussian vector source)
[A. Ortega et al., IEEE Signal Processing Magazine, 1998]
7. HEVC: MSE vs MOS
                     Random Access    Low Delay
Class A              −36.9%
Class B              −39.4%           −40.3%
Class C              −30.1%           −31.5%
Class D              −28.3%           −29.2%
Class E                               −41.2%
Class F              −26.2%           −28.8%
Average              −32.5%           −34.2%
Average without F    −34.0%           −35.5%
[from: JCTVC-I0409, 2012] [from: JCT-VC Summary, 8th JCT-VC]
There is a >20% gap between MSE and MOS!
8. Ideal perceptual metric
Half a century of endeavor, and still an open problem!
Many metrics have been proposed: SSIM/M-SSIM/CW-SSIM, VIF, VQM, …
[Figure from: N. Jayant, Proceedings of the IEEE, 1993]
10. Outline
Introduction
Perceptual Cues in Video Coding
Recent Research
JND based RDO
SSIM based RDO
Analysis-Completion Framework
Summary & References
11. Where do we currently use perceptual models?
[Pourazad, IEEE Consumer Electronics Magazine, 2012]
12. Frequency Masking for JPEG
DCT-based encoder incorporating human
visual frequency weighting (L.-W. Chang, 2001)
Modulation Transfer Function (MTF) or Quantization Matrix (QM)
We can do better with a fine adjustment factor!
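To make the role of the quantization matrix concrete, here is a minimal sketch of frequency-weighted quantization of an 8x8 block of DCT coefficients. The matrix below is hypothetical (step size simply grows with i + j); it is not the JPEG table or the matrix from Chang's paper. The `factor` argument is an assumed stand-in for a global fine-adjustment scale.

```python
def make_weighting_matrix(size=8, base=16, slope=2):
    """Hypothetical QM: the step size grows with spatial frequency (i + j)."""
    return [[base + slope * (i + j) for j in range(size)] for i in range(size)]


def quantize(coeffs, qm, factor=1.0):
    """Quantize DCT coefficients; 'factor' scales every step size."""
    return [[int(round(c / (q * factor))) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qm)]


def dequantize(levels, qm, factor=1.0):
    """Reconstruct coefficients from quantization levels."""
    return [[lv * q * factor for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, qm)]


qm = make_weighting_matrix()
# Toy block: the same coefficient magnitude at every frequency.
coeffs = [[20.0] * 8 for _ in range(8)]
levels = quantize(coeffs, qm)
recon = dequantize(levels, qm)

nonzero = sum(1 for row in levels for lv in row if lv != 0)
print(f"{nonzero} of 64 coefficients survive quantization")
```

Because the step size is larger at high frequencies, equal-magnitude coefficients survive at low frequencies and are zeroed at high ones, which is the frequency-masking effect a perceptual QM exploits.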
13. HEVC QM Design
HEVC default quantization matrix
Intra 8x8 QM: Uses the same QM developed for JPEG in 1999.
Intra 4x4 QM: Sub-sampled from 8x8 Intra QM
Intra 16x16 QM and Intra 32x32 QM: Up-sampled from 8x8 Intra QM
Inter QMs: predicted from the intra QMs, using the linear relationship between
the intra QMs and the corresponding inter QMs in AVC/H.264
[JCT-VC I012] & [L.W. Chang 2001]
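The sub-sampling and up-sampling steps can be sketched as follows. This is an illustration under assumptions: the 8x8 matrix is a hypothetical placeholder rather than the HEVC default, sub-sampling is taken to mean keeping every second entry, and up-sampling is taken to mean replicating each entry. The slide does not specify the exact rules, and the inter-QM prediction is not covered.

```python
def subsample(qm, factor=2):
    """Keep every 'factor'-th entry in each dimension (8x8 -> 4x4 for factor 2)."""
    return [row[::factor] for row in qm[::factor]]


def upsample(qm, factor=2):
    """Replicate each entry into a factor x factor block (8x8 -> 16x16 for factor 2)."""
    out = []
    for row in qm:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 8x8 intra QM (placeholder values, not the standard's table).
qm8 = [[16 + i + j for j in range(8)] for i in range(8)]

qm4 = subsample(qm8)        # 4x4
qm16 = upsample(qm8, 2)     # 16x16
qm32 = upsample(qm8, 4)     # 32x32
print(len(qm4), len(qm16), len(qm32))
```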
15. JND in the classic DCT domain
$T_{JND}(n,i,j) = T_{basic}(n,i,j) \cdot F_{lum}(n) \cdot F_{contrast}(n,i,j) \cdot F_{temporal}(n,i,j)$
- $T_{basic}$: the basic threshold (spatial frequency)
- $F_{lum}$: the luminance adaptation factor (luminance sensitivity)
- $F_{contrast}$: the contrast masking factor (plane, edge, texture, etc.)
- $F_{temporal}$: the temporal modulation factor (motion, frame rate, etc.)
[Zhenyu Wei et al., IEEE T-CSVT, 2009]
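A minimal sketch of how the four factors combine into one threshold per DCT coefficient, and one way such a threshold might be used. The factor values are made up for illustration, and zeroing sub-threshold coefficients is an assumed application, not necessarily the procedure in the cited paper.

```python
def jnd_threshold(t_basic, f_lum, f_contrast, f_temporal):
    """T_JND = T_basic * F_lum * F_contrast * F_temporal for one coefficient."""
    return t_basic * f_lum * f_contrast * f_temporal


def suppress_below_jnd(coeffs, thresholds):
    """Zero every coefficient whose magnitude is below its JND threshold."""
    return [[0.0 if abs(c) < t else c for c, t in zip(crow, trow)]
            for crow, trow in zip(coeffs, thresholds)]


# Hypothetical factor values for a single (n, i, j) position.
t = jnd_threshold(t_basic=2.0, f_lum=1.5, f_contrast=1.2, f_temporal=0.5)

coeffs = [[1.0, -3.0], [0.5, 2.0]]
thresholds = [[t, t], [t, t]]
kept = suppress_below_jnd(coeffs, thresholds)
print(t, kept)
```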
26. SSIM motivated Perceptual Coding
Yi-Hsin Huang et al., "Perceptual Rate-Distortion
Optimization Using Structural Similarity Index as
Quality Metric", IEEE T-CSVT, vol. 20, no. 11,
pp. 1614-1624, Nov. 2010.
Replace PSNR with SSIM
Empirically estimate a Rate-SSIM model
Reuse the classical Lagrange multiplier method for
mode selection and motion estimation
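The idea of swapping the distortion term in Lagrangian mode selection can be sketched as below. This is not the algorithm of the cited paper: it assumes the distortion is 1 − SSIM, computes SSIM from the global statistics of one block rather than sliding windows, and uses invented candidate modes, rates, and λ values. The empirical Rate-SSIM model is not reproduced.

```python
def ssim_block(x, y, dynamic_range=255.0):
    """SSIM from global statistics of two equally sized lists of pixel values."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1 = (0.01 * dynamic_range) ** 2  # usual stabilizing constants
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


def select_mode(original, candidates, lam):
    """Return (name, cost) of the candidate minimizing J = (1 - SSIM) + lam * rate."""
    best = None
    for name, recon, rate_bits in candidates:
        cost = (1.0 - ssim_block(original, recon)) + lam * rate_bits
        if best is None or cost < best[1]:
            best = (name, cost)
    return best


original = [50, 60, 70, 80, 90, 100, 110, 120]
candidates = [
    # (hypothetical mode name, reconstructed block, rate in bits)
    ("skip", [85] * 8, 1),
    ("intra", [52, 59, 71, 79, 91, 99, 111, 119], 40),
]

low_lambda_choice = select_mode(original, candidates, lam=0.001)
high_lambda_choice = select_mode(original, candidates, lam=0.05)
print(low_lambda_choice[0], high_lambda_choice[0])
```

A small λ favors the mode that preserves structure, while a large λ favors the cheap mode. This is the same trade-off as with MSE, only measured on the SSIM scale.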
27. Improved SSIM Perceptual Coding
Shiqi Wang et al., "SSIM-Motivated Rate-
Distortion Optimization for Video Coding", IEEE
T-CSVT, vol. 22, no. 4, pp. 516-529, April 2012.
They attempt to derive an analytical model for the
Rate-SSIM relationship
ChuoHao Yeo et al., "On Rate Distortion Optimization using
SSIM", ICASSP 2012.
Abdul Rehman et al., "SSIM-Inspired Perceptual Video
Coding for HEVC", ICME 2012.
Xi Wang et al., "Motion Based Perceptual Distortion and
Rate Optimization for Video Coding", ICME 2012.
29. Abstract+Detail Framework
Key Frame (Abstract + Detail)
Abstract Only (Non-Key Frame)
Use bilateral filtering to remove details
Use ME to find the matching block to recover details
[Z. Yuan, H. Xiong and Li Song, ICASSP 2009]
30. Super-resolution Framework
[Figure: encoder and decoder block diagrams]
Symmetric coding complexity
5~10% bit savings at the same quality
[Q. Zhou and Li Song, IEEE PCM 2010]
32. Personal Perspective
Can we do much better than HEVC?
Yes, the new generation of video coding will probably
need more perception-related techniques.
Some preliminary work
“On Just Noticeable Distortion Quantization in the HEVC
Codec”, JCTVC-H0477, Feb. 2012
Claims 3%~25% bitrate savings at the same quality.
“A joint JND model based on luminance and frequency
masking for HEVC”, JCTVC-I0163, May 2012
Claims 3%~30% bitrate savings at the same quality.
33. Personal Perspective
Future research
Advanced computational HVS models
– Suprathreshold vs. subthreshold
– Other masking models, such as attention
Exploiting new distortion metrics
– Image statistical properties
– Learning from large-scale datasets
Generic R-D optimization
– R-D relationship and RDO for video coding.
34. References
Important papers
J. L. Mannos and D. J. Sakrison, “The Effects of a Visual Fidelity Criterion
on the Encoding of Images”, IEEE Trans. on Information Theory, Vol. 20,
No. 4, July 1974. (Cited by 776)
N. Jayant, J. Johnston and R. Safranek, “Signal Compression Based on
Models of Human Perception”, Proceedings of the IEEE, Vol. 81, No.10, Oct.,
1993 (Cited by 761)
A Ortega, K Ramchandran, Rate-distortion methods for image and video
compression, IEEE Signal Processing Magazine, Vol.15 (6), 23-50, 1998(Cited
by 597)
Z. Wang, A. C. Bovik, "Mean Squared Error: Love It or Leave It? A New Look at
Signal Fidelity Measures", IEEE Signal Processing Magazine, Vol. 26(1): 98-117,
Jan. 2009. (Cited by 353)
Ching Yang Wang, Shiuh Ming Lee, Long-Wen Chang, “Designing JPEG
quantization tables based on human visual system”, Sig. Proc.: Image Comm.
16(5): 501-506, 2001.
Wenjun Zeng, Scott Daly, Shawmin Lei, “An Overview of the Visual
Optimization Tools in JPEG 2000”, Sig. Proc.: Image Comm. 17: 85-104, 2002.
35. References
JND related
X. Yang, W. Lin, Z. Lu, E. Ong and S. Yao, “Motion-compensated Residue
Pre-processing in Video Coding Based on Just-noticeable-distortion
Profile”, IEEE Trans. Circuits and Systems for Video Technology,
vol.15(6), pp.742-750, June, 2005.
Z. Chen and C. Guillemot, "Perceptually-friendly H.264/AVC video coding
based on foveated Just-Noticeable-Distortion model," IEEE Trans. Circuits
Syst. Video Technol., vol. 20, no. 6, pp. 806-819, June 2010.
M. Naccari and F. Pereira, "Advanced H.264/AVC based perceptual video
coding: architecture, tools and assessment", IEEE Transactions on
Circuits and Systems for Video Technology, vol. 21, no. 6, pp. 766-782,
June 2011.
M. Naccari and M. Mrak, “On Just Noticeable Distortion Quantization in
the HEVC Codec”, JCTVC-H0477, JCT-VC 8th Meeting, San Jose, Feb.
2012.
Z. Luo, Li Song, S. Zheng, "Improving H.264/AVC Video Coding with
Adaptive Coefficient Suppression", IEEE International Symposium on
Circuits and Systems (ISCAS 2010), May 30-June 2, 2010, France.
36. References
SSIM or Other Metrics as Distortion:
Yi-Hsin Huang, Tao-Sheng Ou, Po-Yen Su, H. H. Chen, "Perceptual Rate-
Distortion Optimization Using Structural Similarity Index as Quality Metric",
IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 11,
pp. 1614-1624, Nov. 2010.
Yi-Hsin Huang, Tao-Sheng Ou, Po-Yen Su, H. H. Chen, "SSIM-Based
Perceptual Rate Control for Video Coding", IEEE Transactions on Circuits and
Systems for Video Technology, vol. 21, no. 5, pp. 682-691, May 2012.
Shiqi Wang, Rehman, A, Zhou Wang, Siwei Ma and Wen Gao, “SSIM-Motivated
Rate-Distortion Optimization for Video Coding”, IEEE Transactions on Circuits
and Systems for Video Technology, Vol.22, no. 4, pp.516-529, April, 2012
ChuoHao Yeo, Huili Tan, Yihhan Tan, “On Rate Distortion Optimization using
SSIM”, 2012 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), March 2012.
Abdul Rehman and Zhou Wang, “SSIM-Inspired Perceptual Video Coding for
HEVC”, IEEE International Conference on Multimedia and Expo, June 2012.
Xi Wang, Li Su, Qingming Huang, Chunxi Liu, Ling-yu Duan, “Motion Based
Perceptual Distortion and Rate Optimization for video Coding”, IEEE
International Conference on Multimedia and Expo, 2012.
37. References
Analysis-Completion Framework:
Minmin Shen, Ping Xue and Ci Wang, “Down-Sampling Based Video Coding
Using Super-Resolution Technique”, IEEE Transactions on Circuits and
Systems for Video Technology, vol. 21, no. 6, pp. 755-765, June 2011.
P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull, T. Wiegand,
"Perception-oriented video coding based on image analysis and completion: A
review", Signal Processing: Image Communication 27 (2012) 579–594.
F. Zhang, D. R. Bull, "A parametric framework for video compression using
region-based texture models", IEEE Journal of Selected Topics in Signal Processing,
Vol. 5(7): 1378–1392, 2011.
Q. Zhou, Li Song, W. Zhang, “Video Coding With Key Frames Guided Super
Resolution”, IEEE Pacific-Rim Conference on Multimedia (PCM 2010),
September 21-24, Shanghai, China.
Z Yuan, H. Xiong, Li Song, “Generic Video Coding With Abstraction And
Detail Completion”, IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP 2009), April 19-24,2009, Taipei, Taiwan.