Perceptual Video Coding
   Research Progress

          Dr. Li Song
   Associate Professor, SJTU
Visiting Associate Professor, SCU
            2012.09
Outline

- Introduction
   - Perceptual Cues in Video Coding
- Recent Research
   - JND based RDO
   - SSIM based RDO
   - Analysis-Completion Framework
- Summary & References
Perceptually Lossless Images

[Figure: PIC-coded image at 0.914 bits/pixel shown next to the original]

     [T. Pappas, Visual Signal Analysis and Compression, ICIP 2010]
Perceptual Video Coding Technique

  (Digital) Video -> Codec (Encoder + Decoder) -> Human Visual System (HVS), the end recipient

  Dimensions of coder performance: D (distortion) and R (rate)

Basic principle of perceptual coding:
       - Treat all data that humans cannot perceive as superfluous, and discard it.
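Rate-Distortion Theory

  Quantization: $x \rightarrow Q \rightarrow \hat{x}$

  Quantization noise: $e = X - \hat{X}$

  Distortion: $D = \sum_{i=1}^{N} p_i \,(x_i - \hat{x}_i)^2$, where the $p_i$ are probabilities

  If X is Gaussian distributed, $X \sim N(0, \sigma^2)$:

  $D = \sigma^2 \, 2^{-2R}$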
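As a rough numerical illustration of the Gaussian bound D = σ² · 2^(−2R), the sketch below evaluates the distortion at a few rates. The function names and the sample variances are illustrative assumptions, not part of the slides.

```python
import math


def gaussian_distortion(sigma2: float, rate_bits: float) -> float:
    """Distortion-rate bound for a memoryless Gaussian source N(0, sigma2) under MSE."""
    return sigma2 * 2.0 ** (-2.0 * rate_bits)


def snr_db(sigma2: float, rate_bits: float) -> float:
    """Signal-to-noise ratio in dB implied by the bound at the given rate."""
    return 10.0 * math.log10(sigma2 / gaussian_distortion(sigma2, rate_bits))


# Each extra bit per sample divides the distortion by 4 (about 6.02 dB of SNR).
for rate in (0, 1, 2, 3):
    print(rate, gaussian_distortion(1.0, rate), round(snr_db(1.0, rate), 2))
```

The bound only describes a Gaussian source; as the next slide notes, real image data does not follow that prior, which is why a practical coder can appear to beat it.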
Gap between theory and real codec
                SPIHT can beat the Shannon bound!

                The Gaussian prior is not valid for images!

[Figure: rate-distortion curves achieved with the SPIHT coder (dashed line) and with the
Shannon R-D theoretical bounds (solid line) corresponding to an i.i.d. zero-mean
Gaussian model for each wavelet subband (Gaussian vector source)]
       [A. Ortega et al., IEEE Signal Processing Magazine, 1998]
HEVC: MSE vs MOS

                     Random Access   Low Delay
Class A                 −36.9%
Class B                 −39.4%        −40.3%
Class C                 −30.1%        −31.5%
Class D                 −28.3%        −29.2%
Class E                               −41.2%
Class F                 −26.2%        −28.8%
Average                 −32.5%        −34.2%
Average without F       −34.0%        −35.5%

   [from: JCTVC-I0409, 2012]      [from: JCT-VC Summary, 8th JCT-VC]

            There is a >20% gap between MSE and MOS!
Ideal perceptual metric

- Half a century of effort, and still an open problem!
Many metrics have been proposed: SSIM/M-SSIM/CW-SSIM, VIF, VQM, …

      [Figure from: N. Jayant, Proceedings of the IEEE, 1993]
What about the popular SSIM?




                 [JCTVC-H0063,2012]
Outline

- Introduction
   - Perceptual Cues in Video Coding
- Recent Research
   - JND based RDO
   - SSIM based RDO
   - Analysis-Completion Framework
- Summary & References
Where do we currently use perceptual models?




     [Pourazad, IEEE Consumer Electronics Magazine, 2012]
Frequency Masking for JPEG
   - A DCT-based encoder incorporating human
     visual frequency weighting (L.W. Chang, 2001)

[Diagram label: Modulation Transfer Function (MTF) or Quantization Matrix (QM)]

        We can do better with a fine
           adjustment factor!
HEVC QM Design
- HEVC default quantization matrix
- Intra 8x8 QM: uses the same QM developed for JPEG in 1999.

- Intra 4x4 QM: sub-sampled from the 8x8 Intra QM
- Intra 16x16 QM and Intra 32x32 QM: up-sampled from the 8x8 Intra QM
- Inter QMs: predicted from the Intra QMs, using the linear relationship between
  the Intra QMs and the corresponding Inter QMs in AVC/H.264


                                               [JCT-VC I012] & [L.W. Chang 2001]
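The size relationships above can be sketched as follows. This is only an illustration under stated assumptions: sub-sampling is taken to be plain decimation, up-sampling is taken to be sample repetition, and the 8x8 matrix `qm8` is a made-up placeholder, not the actual default values. The slide does not specify the exact filters.

```python
def subsample(qm, factor=2):
    """Keep every `factor`-th entry in each dimension (8x8 -> 4x4 for factor 2)."""
    return [row[::factor] for row in qm[::factor]]


def upsample(qm, factor):
    """Repeat every entry `factor` times horizontally and vertically."""
    size = len(qm) * factor
    return [[qm[i // factor][j // factor] for j in range(size)] for i in range(size)]


# Hypothetical 8x8 matrix whose step size grows with frequency (NOT real HEVC values).
qm8 = [[16 + i + j for j in range(8)] for i in range(8)]

qm4 = subsample(qm8)       # 4x4 derived from the 8x8 matrix
qm16 = upsample(qm8, 2)    # 16x16 derived from the 8x8 matrix
qm32 = upsample(qm8, 4)    # 32x32 derived from the 8x8 matrix

print(len(qm4), len(qm16), len(qm32))
```

Deriving every size from one 8x8 matrix keeps the perceptual weighting consistent across transform sizes and avoids storing separate tables.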
Local spatial-temporal contrast sensitivity of luminance perception
JND in the classic DCT domain

$T_{JND}(n,i,j) = T_{basic}(n,i,j) \cdot F_{lum}(n) \cdot F_{contrast}(n,i,j) \cdot F_{temporal}(n,i,j)$

- $T_{basic}$: the basic threshold (spatial frequency)
- $F_{lum}$: the luminance adaptation factor (luminance sensitivity)
- $F_{contrast}$: the contrast masking factor (plane, edge, texture, etc.)
- $F_{temporal}$: the temporal modulation factor (motion, frame rate, etc.)

                      [Zhenyu Wei et al., IEEE T-CSVT, 2009]
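A minimal sketch of how such a threshold might be evaluated for one DCT coefficient, assuming the four terms combine multiplicatively (the usual product form for DCT-domain JND models). The function names and all numeric values are hypothetical; the actual factor models are defined in the cited paper and are not reproduced here.

```python
def jnd_threshold(t_basic: float, f_lum: float, f_contrast: float, f_temporal: float) -> float:
    """Combine the basic threshold with the three modulation factors (product form assumed)."""
    return t_basic * f_lum * f_contrast * f_temporal


def is_perceptible(coeff_error: float, threshold: float) -> bool:
    """A coefficient error is treated as visible only if it exceeds the JND threshold."""
    return abs(coeff_error) > threshold


# Hypothetical values for a single coefficient (n, i, j).
threshold = jnd_threshold(t_basic=2.0, f_lum=1.5, f_contrast=2.0, f_temporal=1.0)
print(threshold, is_perceptible(4.0, threshold), is_perceptible(-7.5, threshold))
```

A factor above 1 raises the threshold (more distortion can be hidden), while a factor below 1 lowers it.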
Different Embedded Schemes

             [X. Yang, TCSVT, 2005]



             [Ours, ISCAS 2010] &
             [TCSVT (accepted)]


             [Z. Chen, TCSVT, 2010] &
             [M. Naccari, TCSVT, 2011]
The Proposed Coding Framework

[Block diagram. Forward path: Input -> T -> Q -> Adaptive Suppression -> Entropy Coding -> Output.
 Reconstruction loop: Q^-1 -> T^-1 -> Frame Buffer -> Intra or Inter Prediction.
 Other blocks: JND Calculation and Translation, Adjustment Threshold Calculation,
 Lagrange Multiplier Adaptation, Motion Vector Scaling.]

Distortion in the framework: D = D1(Q) + D2(JND)
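The idea behind the "Adaptive Suppression" block can be illustrated with a simple rule: coefficients whose magnitude falls below the JND threshold are dropped, and larger ones are shrunk by the threshold. This rule is an assumption chosen for illustration; the slides do not state the exact suppression rule, and the function name and values below are hypothetical.

```python
def suppress(coeffs, jnd_thresholds):
    """Zero coefficients at or below their JND threshold; shrink the rest toward zero."""
    out = []
    for c, t in zip(coeffs, jnd_thresholds):
        if abs(c) <= t:
            out.append(0.0)      # change is assumed invisible, so spend no bits on it
        elif c > 0:
            out.append(c - t)    # keep only the part above the threshold
        else:
            out.append(c + t)
    return out


# Hypothetical residual coefficients and per-coefficient thresholds.
print(suppress([0.5, -3.0, 4.0], [1.0, 1.0, 1.5]))
```

Smaller coefficient magnitudes after suppression lead to fewer bits in entropy coding, which is where the bit savings reported on the following slides would come from.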
Bit Saving

                        Bitrate (kbps)                       Bitrate reduction vs. JM 14.2 (%)
Sequence   Preset QP    JM 14.2     Chen’s      Proposed     Chen’s     Proposed
Cyclists      20         7945.83     6889.50     5149.85     13.29      35.19
Cyclists      24         3165.17     2660.42     2436.40     15.95      23.02
Cyclists      28         1343.73     1103.82     1138.30     17.85      15.29
Cyclists      32          658.92      543.16      612.40     17.57       7.06
Harbour       20        25104.43    23734.86    15822.41      5.46      36.97
Harbour       24        13496.66    12290.08     8843.39      8.94      34.48
Harbour       28         6054.17     5336.50     4557.15     11.85      24.73
Harbour       32         2909.30     2607.64     2588.25     10.37      11.04
Night         20        20306.64    18749.84    11330.19      7.67      44.20
Night         24         9688.57     8714.15     6239.72     10.06      35.60
Night         28         4507.60     4036.23     3430.19     10.46      23.90
Night         32         2311.90     2088.36     2050.42      9.67      11.31
Bit Saving

                            Bitrate (kbps)                       Bitrate reduction vs. JM 14.2 (%)
Sequence       Preset QP    JM 14.2     Chen’s      Proposed     Chen’s     Proposed
Raven             20         7135.21     6568.93     4147.18      7.94      41.88
Raven             24         3193.59     2850.05     2201.83     10.76      31.05
Raven             28         1537.32     1346.20     1189.10     12.43      22.65
Raven             32          803.07      705.19      710.89     12.19      11.48
Sheriff           20        13951.79    12986.99     7317.07      6.92      47.55
Sheriff           24         6472.74     5838.45     3739.43      9.80      42.23
Sheriff           28         2665.81     2361.96     1817.07     11.40      31.84
Sheriff           32         1159.36     1032.24      963.12     10.96      16.93
SpinCalendar      20        25071.25    21394.72    11108.62     14.66      55.69
SpinCalendar      24         7878.49     5930.58     4548.43     24.72      42.27
SpinCalendar      28         2653.01     2194.53     2046.35     17.28      22.87
SpinCalendar      32         1315.22     1129.24     1177.62     14.14      10.46
Average                                                          12.18      28.32
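The reduction columns follow directly from the bitrates. The short sketch below recomputes them for the Cyclists sequence at QP 20, using the bitrates listed in the table; the function name is an illustrative choice.

```python
def bitrate_reduction_percent(anchor_kbps: float, test_kbps: float) -> float:
    """Percentage of bitrate saved relative to the anchor encoder (here JM 14.2)."""
    return 100.0 * (anchor_kbps - test_kbps) / anchor_kbps


# Cyclists, preset QP 20: JM 14.2 = 7945.83 kbps, Chen's = 6889.50, proposed = 5149.85.
chen = bitrate_reduction_percent(7945.83, 6889.50)
proposed = bitrate_reduction_percent(7945.83, 5149.85)
print(round(chen, 2), round(proposed, 2))
```

Note that these figures compare bitrate at the same preset QP, so they are meaningful only together with the claim that perceived quality is unchanged.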
Frame Differences

[Figure: JM 14.2, QP=20, 88th frame]
[Figure: Ours, QP=20, 88th frame]
[Figure: Differences, QP=20, 88th frame]
[Figure: JM 14.2, QP=20, 102nd frame]
[Figure: Ours, QP=20, 102nd frame]
[Figure: Differences, QP=20, 102nd frame]
SSIM motivated Perceptual Coding
- Yi-Hsin Huang et al., "Perceptual Rate-Distortion
  Optimization Using Structural Similarity Index as
  Quality Metric", IEEE T-CSVT, vol. 20, no. 11,
  pp. 1614-1624, Nov. 2010.
    - Replace PSNR with SSIM
    - Empirically estimate a Rate-SSIM model
    - Reuse the classical Lagrange multiplier method for
      mode selection and motion estimation
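The three points above can be sketched as a Lagrangian mode decision in which the distortion term is 1 − SSIM instead of squared error. This is a simplified illustration, not the cited method: SSIM is computed once over the whole block with biased variance estimates, and the candidate modes, their rates, and the value of the multiplier `lam` are hypothetical.

```python
def ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equally sized blocks given as flat lists."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx * mx + my * my + c1) * (vx + vy + c2))


def best_mode(original, candidates, lam):
    """Pick the candidate minimizing J = (1 - SSIM) + lam * R.

    `candidates` is a list of (mode_name, reconstructed_block, rate_in_bits).
    """
    def cost(candidate):
        _, recon, rate = candidate
        return (1.0 - ssim(original, recon)) + lam * rate
    return min(candidates, key=cost)[0]


original = [10, 20, 30, 40]
candidates = [
    ("skip", [25, 25, 25, 25], 1),     # cheap but loses all structure
    ("intra", [10, 20, 30, 40], 50),   # perfect reconstruction, expensive
]
print(best_mode(original, candidates, lam=0.0), best_mode(original, candidates, lam=1.0))
```

Because 1 − SSIM lives on a different scale than squared error, the multiplier has to be re-derived for it, which is what the Rate-SSIM model in the cited work is for.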
Improved SSIM Perceptual Coding
- Shiqi Wang et al., "SSIM-Motivated Rate-Distortion
  Optimization for Video Coding", IEEE
  T-CSVT, vol. 22, no. 4, pp. 516-529, April 2012.
    - They try to derive an analytical model for the
      Rate-SSIM relationship
- ChuoHao Yeo et al., "On Rate Distortion Optimization using
  SSIM", ICASSP 2012.
- Abdul Rehman et al., "SSIM-Inspired Perceptual Video
  Coding for HEVC", ICME 2012.
- Xi Wang et al., "Motion Based Perceptual Distortion and
  Rate Optimization for Video Coding", ICME 2012.
Basic Analysis-Completion Structure




  [P. Ndjiki-Nya, Signal Processing: Image Communication, 2012]
Abstract+Detail Framework                  [Z. Yuan, H. Xiong and Li Song, ICASSP 2009]

- Key frame (abstract + detail)
- Non-key frame (abstract only): use bilateral filtering to remove details
- Use motion estimation (ME) to find matching blocks to recover details
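To show what "removing details" with a bilateral filter means, here is a one-dimensional sketch: fine variations are smoothed away while strong edges survive. It is a textbook bilateral filter written for illustration only; the parameter values and the function name are assumptions, and the cited work operates on 2-D frames.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Weighted average where weights fall off with both distance and intensity difference."""
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for k in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = ((k - i) ** 2) / (2.0 * sigma_space ** 2)
            tonal = ((signal[k] - signal[i]) ** 2) / (2.0 * sigma_range ** 2)
            w = math.exp(-(spatial + tonal))
            num += w * signal[k]
            den += w
        out.append(num / den)
    return out


# Small texture (+/-1) on both sides of a large step edge.
noisy_step = [0, 1, 0, 1, 0, 100, 101, 100, 101, 100]
print([round(v, 2) for v in bilateral_filter_1d(noisy_step)])
```

The smoothed output is the "abstract" layer; the texture that was filtered out is the "detail" that the framework recovers from key frames instead of coding it.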
Super-resolution Framework

[Figure: encoder and decoder block diagrams]

            - Symmetric coding complexity
            - 5~10% bit saving at the same quality

         [Q. Zhou and Li Song, IEEE PCM 2010]
Outline

- Introduction
   - Perceptual Cues in Video Coding
- Recent Research
   - JND based RDO
   - SSIM based RDO
   - Analysis-Completion Framework
- Summary & References
Personal Perspective
- Can we do much better than HEVC?
   - Yes, the next generation of video coding will probably
     need more perception-related techniques.
- Some preliminary works
   - "On Just Noticeable Distortion Quantization in the HEVC
     Codec", JCTVC-H0477, Feb. 2012
       - Claims 3%~25% bitrate saving at the same quality.
   - "A joint JND model based on luminance and frequency
     masking for HEVC", JCTVC-I0163, May 2012
       - Claims 3%~30% bitrate saving at the same quality.
Personal Perspective
- Future research
   - Advanced computational HVS models
       - Suprathreshold vs. subthreshold
       - Other masking models, such as attention
   - Exploiting new distortion metrics
       - Image statistical properties
       - Learning from large-scale datasets
   - Generic R-D optimization
       - R-D relationship and RDO for video coding.
References
- Important papers
    - J. L. Mannos and D. J. Sakrison, "The Effects of a Visual Fidelity Criterion
      on the Encoding of Images", IEEE Trans. on Information Theory, Vol. 20,
      No. 4, July 1974. (Cited by 776)
    - N. Jayant, J. Johnston and R. Safranek, "Signal Compression Based on
      Models of Human Perception", Proceedings of the IEEE, Vol. 81, No. 10, Oct.
      1993. (Cited by 761)
    - A. Ortega and K. Ramchandran, "Rate-Distortion Methods for Image and Video
      Compression", IEEE Signal Processing Magazine, Vol. 15(6): 23-50, 1998. (Cited
      by 597)
    - Z. Wang and A. C. Bovik, "Mean Squared Error: Love It or Leave It? A New Look at
      Signal Fidelity Measures", IEEE Signal Processing Magazine, Vol. 26(1): 98-117,
      Jan. 2009. (Cited by 353)
    - Ching Yang Wang, Shiuh Ming Lee, Long-Wen Chang, "Designing JPEG
      Quantization Tables Based on Human Visual System", Sig. Proc.: Image Comm.
      16(5): 501-506, 2001.
    - Wenjun Zeng, Scott Daly, Shawmin Lei, "An Overview of the Visual
      Optimization Tools in JPEG 2000", Sig. Proc.: Image Comm. 17: 85-104, 2002.
References
- JND related
    - X. Yang, W. Lin, Z. Lu, E. Ong and S. Yao, "Motion-compensated Residue
      Pre-processing in Video Coding Based on Just-noticeable-distortion
      Profile", IEEE Trans. Circuits and Systems for Video Technology,
      vol. 15, no. 6, pp. 742-750, June 2005.
    - Z. Chen and C. Guillemot, "Perceptually-friendly H.264/AVC Video Coding
      Based on Foveated Just-Noticeable-Distortion Model", IEEE Trans. Circuits
      and Systems for Video Technology, vol. 20, no. 6, pp. 806-819, June 2010.
    - M. Naccari and F. Pereira, "Advanced H.264/AVC Based Perceptual Video
      Coding: Architecture, Tools and Assessment", IEEE Trans.
      Circuits and Systems for Video Technology, vol. 21, no. 6, pp. 766-782,
      June 2011.
    - M. Naccari and M. Mrak, "On Just Noticeable Distortion Quantization in
      the HEVC Codec", JCTVC-H0477, JCT-VC 8th Meeting, San Jose, Feb.
      2012.
    - Z. Luo, Li Song, S. Zheng, "Improving H.264/AVC Video Coding with
      Adaptive Coefficient Suppression", IEEE International Symposium on
      Circuits and Systems (ISCAS 2010), May 30-June 2, 2010, France.
References
- SSIM or other metrics as distortion:
    - Yi-Hsin Huang, Tao-Sheng Ou, Po-Yen Su, H. H. Chen, "Perceptual Rate-Distortion
      Optimization Using Structural Similarity Index as Quality Metric",
      IEEE Trans. Circuits and Systems for Video Technology, vol. 20, no. 11,
      pp. 1614-1624, Nov. 2010.
    - Yi-Hsin Huang, Tao-Sheng Ou, Po-Yen Su, H. H. Chen, "SSIM-Based
      Perceptual Rate Control for Video Coding", IEEE Trans. Circuits and
      Systems for Video Technology, vol. 21, no. 5, pp. 682-691, May 2011.
    - Shiqi Wang, A. Rehman, Zhou Wang, Siwei Ma and Wen Gao, "SSIM-Motivated
      Rate-Distortion Optimization for Video Coding", IEEE Trans. Circuits
      and Systems for Video Technology, vol. 22, no. 4, pp. 516-529, April 2012.
    - Yeo chuoHao, Tan Huili, Tan Yihhan, "On Rate Distortion Optimization using
      SSIM", 2012 IEEE International Conference on Acoustics, Speech and Signal
      Processing (ICASSP), March 2012.
    - Abdul Rehman and Zhou Wang, "SSIM-Inspired Perceptual Video Coding for
      HEVC", IEEE International Conference on Multimedia and Expo, June 2012.
    - Xi Wang, Li Su, Qingming Huang, Chunxi Liu, Ling-yu Duan, "Motion Based
      Perceptual Distortion and Rate Optimization for Video Coding", IEEE
      International Conference on Multimedia and Expo, 2012.
References
- Analysis-Completion Framework:
    - Minmin Shen, Ping Xue and Ci Wang, "Down-Sampling Based Video Coding
      Using Super-Resolution Technique", IEEE Trans. Circuits and
      Systems for Video Technology, vol. 21, no. 6, pp. 755-765, June 2011.
    - P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull, T. Wiegand,
      "Perception-oriented Video Coding Based on Image Analysis and Completion: A
      Review", Signal Processing: Image Communication 27 (2012) 579–594.
    - F. Zhang, D. R. Bull, "A Parametric Framework for Video Compression Using
      Region-based Texture Models", IEEE Journal of Selected Topics in Signal Processing,
      Vol. 5(7): 1378–1392, 2011.
    - Q. Zhou, Li Song, W. Zhang, "Video Coding With Key Frames Guided Super
      Resolution", IEEE Pacific-Rim Conference on Multimedia (PCM 2010),
      September 21-24, Shanghai, China.
    - Z. Yuan, H. Xiong, Li Song, "Generic Video Coding With Abstraction And
      Detail Completion", IEEE International Conference on Acoustics, Speech and
      Signal Processing (ICASSP 2009), April 19-24, 2009, Taipei, Taiwan.
Thanks!

Weitere ähnliche Inhalte

Was ist angesagt?

Compressive Light Field Displays
Compressive Light Field DisplaysCompressive Light Field Displays
Compressive Light Field DisplaysGordon Wetzstein
 
Introduction to wavelet transform
Introduction to wavelet transformIntroduction to wavelet transform
Introduction to wavelet transformRaj Endiran
 
Design Approach of Colour Image Denoising Using Adaptive Wavelet
Design Approach of Colour Image Denoising Using Adaptive WaveletDesign Approach of Colour Image Denoising Using Adaptive Wavelet
Design Approach of Colour Image Denoising Using Adaptive WaveletIJERD Editor
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)Nidhi Gopal
 
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error norm
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error normRobust Super-Resolution by minimizing a Gaussian-weighted L2 error norm
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error normTuan Q. Pham
 
Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...
Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...
Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...Wanmin Wu
 
Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...
Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...
Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...IDES Editor
 
An Optimized Transform for ECG Signal Compression
An Optimized Transform for ECG Signal CompressionAn Optimized Transform for ECG Signal Compression
An Optimized Transform for ECG Signal CompressionIDES Editor
 
Voice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringVoice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringTejus Adiga M
 
Continuous variable quantum key distribution finite key analysis of composabl...
Continuous variable quantum key distribution finite key analysis of composabl...Continuous variable quantum key distribution finite key analysis of composabl...
Continuous variable quantum key distribution finite key analysis of composabl...wtyru1989
 
Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...
Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...
Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...ijsrd.com
 
Experimental demonstration of continuous variable quantum key distribution ov...
Experimental demonstration of continuous variable quantum key distribution ov...Experimental demonstration of continuous variable quantum key distribution ov...
Experimental demonstration of continuous variable quantum key distribution ov...wtyru1989
 
Fundamentals of Digital Signal Processing - Question Bank
Fundamentals of Digital Signal Processing - Question BankFundamentals of Digital Signal Processing - Question Bank
Fundamentals of Digital Signal Processing - Question BankMathankumar S
 
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012MediaEval2012
 
Recent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-ResolutionRecent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-ResolutionHiroto Honda
 
Nishimoto Interspeech 2010 v3
Nishimoto Interspeech 2010 v3Nishimoto Interspeech 2010 v3
Nishimoto Interspeech 2010 v3Takuya Nishimoto
 

Was ist angesagt? (19)

Compressive Light Field Displays
Compressive Light Field DisplaysCompressive Light Field Displays
Compressive Light Field Displays
 
671 679
671 679671 679
671 679
 
Introduction to wavelet transform
Introduction to wavelet transformIntroduction to wavelet transform
Introduction to wavelet transform
 
Design Approach of Colour Image Denoising Using Adaptive Wavelet
Design Approach of Colour Image Denoising Using Adaptive WaveletDesign Approach of Colour Image Denoising Using Adaptive Wavelet
Design Approach of Colour Image Denoising Using Adaptive Wavelet
 
PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)PPT Image Analysis(IRDE, DRDO)
PPT Image Analysis(IRDE, DRDO)
 
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error norm
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error normRobust Super-Resolution by minimizing a Gaussian-weighted L2 error norm
Robust Super-Resolution by minimizing a Gaussian-weighted L2 error norm
 
D25014017
D25014017D25014017
D25014017
 
Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...
Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...
Color-plus-Depth Level-of-Detail in 3D Tele-immersive Video: A Psychophysical...
 
Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...
Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...
Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...
 
An Optimized Transform for ECG Signal Compression
An Optimized Transform for ECG Signal CompressionAn Optimized Transform for ECG Signal Compression
An Optimized Transform for ECG Signal Compression
 
Voice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringVoice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency Filtering
 
Continuous variable quantum key distribution finite key analysis of composabl...
Continuous variable quantum key distribution finite key analysis of composabl...Continuous variable quantum key distribution finite key analysis of composabl...
Continuous variable quantum key distribution finite key analysis of composabl...
 
Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...
Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...
Comparative Analysis of Dwt, Reduced Wavelet Transform, Complex Wavelet Trans...
 
Experimental demonstration of continuous variable quantum key distribution ov...
Experimental demonstration of continuous variable quantum key distribution ov...Experimental demonstration of continuous variable quantum key distribution ov...
Experimental demonstration of continuous variable quantum key distribution ov...
 
Fundamentals of Digital Signal Processing - Question Bank
Fundamentals of Digital Signal Processing - Question BankFundamentals of Digital Signal Processing - Question Bank
Fundamentals of Digital Signal Processing - Question Bank
 
Lightspeed SIGGRAPH talk
Lightspeed SIGGRAPH talkLightspeed SIGGRAPH talk
Lightspeed SIGGRAPH talk
 
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
Telefonica Research System for the Spoken Web Search task at Mediaeval 2012
 
Recent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-ResolutionRecent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-Resolution
 
Nishimoto Interspeech 2010 v3
Nishimoto Interspeech 2010 v3Nishimoto Interspeech 2010 v3
Nishimoto Interspeech 2010 v3
 

Andere mochten auch

Video summarization using clustering
Video summarization using clusteringVideo summarization using clustering
Video summarization using clusteringSahil Biswas
 
Gaining Colour Stability in Live Image Capturing
Gaining Colour Stability in Live Image CapturingGaining Colour Stability in Live Image Capturing
Gaining Colour Stability in Live Image CapturingGuy K. Kloss
 
Current developments in video quality: From the emerging HEVC standard to tem...
Current developments in video quality: From the emerging HEVC standard to tem...Current developments in video quality: From the emerging HEVC standard to tem...
Current developments in video quality: From the emerging HEVC standard to tem...Harilaos Koumaras
 
Howen CCTV System worldwide Application-201309
Howen CCTV System worldwide Application-201309Howen CCTV System worldwide Application-201309
Howen CCTV System worldwide Application-201309Berry Gao
 
Applying Media Content Analysis to the Production of Musical Videos as Summar...
Applying Media Content Analysis to the Production of Musical Videos as Summar...Applying Media Content Analysis to the Production of Musical Videos as Summar...
Applying Media Content Analysis to the Production of Musical Videos as Summar...Chris Huang
 
VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...
VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...
VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...Journal For Research
 
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Universitat Politècnica de Catalunya
 
"Image and Video Summarization," a Presentation from the University of Washin...
"Image and Video Summarization," a Presentation from the University of Washin..."Image and Video Summarization," a Presentation from the University of Washin...
"Image and Video Summarization," a Presentation from the University of Washin...Edge AI and Vision Alliance
 
Integrating Physical And Logical Security
Integrating Physical And Logical SecurityIntegrating Physical And Logical Security
Integrating Physical And Logical SecurityJorge Sebastiao
 

Andere mochten auch (12)

Video summarization using clustering
Video summarization using clusteringVideo summarization using clustering
Video summarization using clustering
 
Gaining Colour Stability in Live Image Capturing
Gaining Colour Stability in Live Image CapturingGaining Colour Stability in Live Image Capturing
Gaining Colour Stability in Live Image Capturing
 
Current developments in video quality: From the emerging HEVC standard to tem...
Current developments in video quality: From the emerging HEVC standard to tem...Current developments in video quality: From the emerging HEVC standard to tem...
Current developments in video quality: From the emerging HEVC standard to tem...
 
Howen CCTV System worldwide Application-201309
Howen CCTV System worldwide Application-201309Howen CCTV System worldwide Application-201309
Howen CCTV System worldwide Application-201309
 
Applying Media Content Analysis to the Production of Musical Videos as Summar...
Applying Media Content Analysis to the Production of Musical Videos as Summar...Applying Media Content Analysis to the Production of Musical Videos as Summar...
Applying Media Content Analysis to the Production of Musical Videos as Summar...
 
Content based video summarization into object maps
Content based video summarization into object mapsContent based video summarization into object maps
Content based video summarization into object maps
 
Paralleling Variable Block Size Motion Estimation of HEVC On CPU plus GPU Pla...
Paralleling Variable Block Size Motion Estimation of HEVC On CPU plus GPU Pla...Paralleling Variable Block Size Motion Estimation of HEVC On CPU plus GPU Pla...
Paralleling Variable Block Size Motion Estimation of HEVC On CPU plus GPU Pla...
 
Keyframe-based Video Summarization Designer
Keyframe-based Video Summarization DesignerKeyframe-based Video Summarization Designer
Keyframe-based Video Summarization Designer
 
VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...
VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...
VIDEO SUMMARIZATION: CORRELATION FOR SUMMARIZATION AND SUBTRACTION FOR RARE E...
 
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
 
"Image and Video Summarization," a Presentation from the University of Washin...
"Image and Video Summarization," a Presentation from the University of Washin..."Image and Video Summarization," a Presentation from the University of Washin...
"Image and Video Summarization," a Presentation from the University of Washin...
 
Integrating Physical And Logical Security
Integrating Physical And Logical SecurityIntegrating Physical And Logical Security
Integrating Physical And Logical Security
 

Ähnlich wie Perceptual Video Coding

Performance Evaluation of SAR Image Reconstruction on CPUs and GPUs
Performance Evaluation of SAR Image Reconstruction on CPUs and GPUsPerformance Evaluation of SAR Image Reconstruction on CPUs and GPUs
Performance Evaluation of SAR Image Reconstruction on CPUs and GPUsFisnik Kraja
 
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...
Parallelization Techniques for the 2D Fourier Matched Filtering and Interpola...Fisnik Kraja
 
CyberSec_JPEGcompressionForensics.pdf
CyberSec_JPEGcompressionForensics.pdfCyberSec_JPEGcompressionForensics.pdf
CyberSec_JPEGcompressionForensics.pdfMohammadAzreeYahaya
 
Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Matthias Trapp
 
Depth estimation do we need to throw old things away
Depth estimation do we need to throw old things awayDepth estimation do we need to throw old things away
Depth estimation do we need to throw old things awayNAVER Engineering
 
Design of Radio Frequency Integrated Circuits for UWB Communications
Design of Radio Frequency Integrated Circuits for UWB CommunicationsDesign of Radio Frequency Integrated Circuits for UWB Communications
Design of Radio Frequency Integrated Circuits for UWB CommunicationsRFIC-IUMA
 
lossy compression JPEG
lossy compression JPEGlossy compression JPEG
lossy compression JPEGMahmoud Hikmet
 
A Video Watermarking Scheme to Hinder Camcorder Piracy
A Video Watermarking Scheme to Hinder Camcorder PiracyA Video Watermarking Scheme to Hinder Camcorder Piracy
A Video Watermarking Scheme to Hinder Camcorder PiracyIOSR Journals
 
Ibtc dwt hybrid coding of digital images
Ibtc dwt hybrid coding of digital imagesIbtc dwt hybrid coding of digital images
Ibtc dwt hybrid coding of digital imagesZakaria Zubi
 
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time RaytracingSIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time RaytracingElectronic Arts / DICE
 

Perceptual Video Coding

  • 1. Perceptual Video Coding: Research Progress. Dr. Li Song, Associate Professor, SJTU; Visiting Associate Professor, SCU. 2012.09
  • 2. Outline: Introduction (Perceptual Cues in Video Coding); Recent Research (JND based RDO; SSIM based RDO; Analysis-Completion Framework); Summary & References
  • 3. Perceptually Lossless Images. Image pair: PIC at 0.914 bits/pixel vs. the original. [T. Pappas, Visual Signal Analysis and Compression, ICIP 2010]
  • 4. Perceptual Video Coding Technique. Components: (Digital) Video; Codec (Encoder + Decoder); Human Visual System (HVS), the end recipient. Dimensions of coder performance: D and R. Basic principle of perceptual coding: treat all data that humans cannot perceive as superfluous, and discard it.
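  • 5. Rate-Distortion Theory. Quantization: $x \rightarrow Q \rightarrow \hat{x}$. Quantization noise: $e = X - \hat{X}$. Distortion: $D = \sum_{i=1}^{N} p_i (x_i - \hat{x}_i)^2$, where $p_i$ are the probabilities. If $X$ follows a Gaussian distribution $N(0, \sigma^2)$: $D = \sigma^2 2^{-2R}$.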
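
The two distortion expressions on slide 5 can be evaluated directly. The following is a minimal sketch, not code from the talk: the function names and the sample values are illustrative assumptions, and the rate R is taken to be in bits per sample.

```python
def expected_distortion(probs, xs, xhats):
    """Mean squared quantization error: D = sum_i p_i * (x_i - xhat_i)^2."""
    return sum(p * (x - xh) ** 2 for p, x, xh in zip(probs, xs, xhats))


def gaussian_distortion(sigma2, rate_bits):
    """Distortion-rate function of a Gaussian source N(0, sigma^2):
    D(R) = sigma^2 * 2^(-2R), with R in bits per sample."""
    return sigma2 * 2.0 ** (-2.0 * rate_bits)


if __name__ == "__main__":
    # Illustrative values only: a unit-variance source at 0..3 bits per sample.
    for rate in range(4):
        print(rate, gaussian_distortion(1.0, rate))
```

Under this model each additional bit per sample divides the distortion by four, which is about 6.02 dB.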
  • 6. Gap between theory and real codec. SPIHT can beat the Shannon bound! The Gaussian prior is not valid for images. Rate-distortion curves achieved with the SPIHT coder (dashed line) and with the Shannon RD theoretical bounds (solid line) corresponding to an i.i.d. zero-mean Gaussian model for each wavelet subband (Gaussian vector source). [A. Ortega et al., IEEE Signal Processing Magazine, 1998]
  • 7. HEVC: MSE vs MOS

      Class               Random Access   Low Delay
      Class A             −36.9%          -
      Class B             −39.4%          −40.3%
      Class C             −30.1%          −31.5%
      Class D             −28.3%          −29.2%
      Class E             -               −41.2%
      Class F             −26.2%          −28.8%
      Average             −32.5%          −34.2%
      Average without F   −34.0%          −35.5%

    [from: JCTVC-I0409, 2012] [from: JCT-VC Summary, 8th JCT-VC] There is a >20% gap between MSE and MOS!
  • 8. Ideal perceptual metric. Half a century of endeavor, and still an open problem! Many metrics have been proposed: SSIM/M-SSIM/CW-SSIM, VIF, VQM, ... [Figure from: N. Jayant, Proceedings of the IEEE, 1993]
  • 9. What about the popular SSIM? [JCTVC-H0063, 2012]
  • 10. Outline: Introduction (Perceptual Cues in Video Coding); Recent Research (JND based RDO; SSIM based RDO; Analysis-Completion Framework); Summary & References
  • 11. Where do we currently use perceptual models? [Pourazad, IEEE Consumer Electronics Magazine, 2012]
  • 12. Frequency Masking for JPEG. A DCT-based encoder incorporating human visual frequency weighting (L.W. Chang, 2001): Modulation Transfer Function (MTF) or Quantization Matrix (QM). We can do better with a fine adjustment factor!
  • 13. HEVC QM Design. HEVC default quantization matrices. Intra 8x8 QM: uses the same QM developed for JPEG in 1999. Intra 4x4 QM: sub-sampled from the 8x8 Intra QM. Intra 16x16 QM and Intra 32x32 QM: up-sampled from the 8x8 Intra QM. Inter QMs: predicted from the Intra QMs, using the linear relationship between the Intra QMs and the corresponding Inter QMs in AVC/H.264. [JCT-VC I012] & [L.W. Chang 2001]
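
The sub-sampling and up-sampling relations on slide 13 can be sketched as below. This is an illustration under stated assumptions, not the standard's procedure: sub-sampling is assumed to be plain 2:1 decimation, up-sampling is assumed to be entry replication, and `QM8` is a made-up matrix rather than the actual default.

```python
def subsample_qm(qm8):
    """8x8 -> 4x4 by keeping every second row and column (assumed decimation)."""
    return [row[::2] for row in qm8[::2]]


def upsample_qm(qm8, factor):
    """8x8 -> (8*factor)x(8*factor) by repeating each entry `factor` times
    in both directions (assumed replication)."""
    size = 8 * factor
    return [[qm8[i // factor][j // factor] for j in range(size)]
            for i in range(size)]


# Hypothetical 8x8 matrix whose entries grow with the frequency index.
QM8 = [[16 + 2 * (i + j) for j in range(8)] for i in range(8)]

QM4 = subsample_qm(QM8)     # 4x4
QM16 = upsample_qm(QM8, 2)  # 16x16
QM32 = upsample_qm(QM8, 4)  # 32x32
```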
  • 14. Local Spatial-temporal contrast sensitivity of luminance perception
  • 15. JND in the classic DCT domain. $T_{JND}(n,i,j) = T_{basic}(n,i,j) \cdot F_{lum}(n) \cdot F_{contrast}(n,i,j) \cdot F_{temporal}(n,i,j)$. Terms: $T_{basic}$, the basic threshold (spatial frequency); $F_{lum}$, the luminance adaptation factor (luminance sensitivity); $F_{contrast}$, the contrast masking factor (plane, edge, texture, etc.); $F_{temporal}$, the temporal modulation factor (motion, frame rate, etc.). [Zhenyu Wei et al., IEEE T-CSVT, 2009]
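
The multiplicative structure of the JND threshold on slide 15 can be sketched as follows. The factor values are hypothetical placeholders, not outputs of the cited model, and `is_below_jnd` is an illustrative helper that applies the usual reading of a JND threshold: an error no larger than the threshold is treated as not noticeable.

```python
def jnd_threshold(t_basic, f_lum, f_contrast, f_temporal):
    """T_JND = T_basic * F_lum * F_contrast * F_temporal for one DCT
    coefficient (i, j) of block n."""
    return t_basic * f_lum * f_contrast * f_temporal


def is_below_jnd(coeff_error, threshold):
    """True if the coefficient error magnitude does not exceed the threshold."""
    return abs(coeff_error) <= threshold


# Hypothetical factor values for a single coefficient.
T = jnd_threshold(t_basic=2.0, f_lum=1.5, f_contrast=2.0, f_temporal=1.0)
```

Because the factors multiply, any masking effect larger than 1 raises the threshold and lets more distortion go unnoticed at that coefficient.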
  • 16. Different Embedded Schemes. [X. Yang, TCSVT, 2005]; [Ours, ISCAS 2010] & [TCSVT (accepted)]; [Z. Chen, TCSVT, 2010] & [M. Naccari, TCSVT, 2011]
  • 17. The Proposed Coding Framework. Block diagram (labels only): Input; Adjustment Threshold Calculation; JND Calculation and Translation; Adaptive Suppression; T; Q; Entropy Coding; Output; Q⁻¹; T⁻¹; Intra or Inter Prediction; Frame Buffer; Motion Vector Scaling; Lagrange Multiplier Adaptation, with D = D1(Q) + D2(JND).
  • 18. Bit Saving. Bitrate (kbps) and bitrate reduction against JM 14.2 (%):

      Sequence       Preset QP   JM 14.2 (kbps)   Chen's (kbps)   Proposed (kbps)   Chen's (%)   Proposed (%)
      Cyclists       20           7945.83          6889.50          5149.85         13.29        35.19
      Cyclists       24           3165.17          2660.42          2436.40         15.95        23.02
      Cyclists       28           1343.73          1103.82          1138.30         17.85        15.29
      Cyclists       32            658.92           543.16           612.40         17.57         7.06
      Harbour        20          25104.43         23734.86         15822.41          5.46        36.97
      Harbour        24          13496.66         12290.08          8843.39          8.94        34.48
      Harbour        28           6054.17          5336.50          4557.15         11.85        24.73
      Harbour        32           2909.30          2607.64          2588.25         10.37        11.04
      Night          20          20306.64         18749.84         11330.19          7.67        44.20
      Night          24           9688.57          8714.15          6239.72         10.06        35.60
      Night          28           4507.60          4036.23          3430.19         10.46        23.90
      Night          32           2311.90          2088.36          2050.42          9.67        11.31
  • 19. Bit Saving. Bitrate (kbps) and bitrate reduction against JM 14.2 (%):

      Sequence       Preset QP   JM 14.2 (kbps)   Chen's (kbps)   Proposed (kbps)   Chen's (%)   Proposed (%)
      Raven          20           7135.21          6568.93          4147.18          7.94        41.88
      Raven          24           3193.59          2850.05          2201.83         10.76        31.05
      Raven          28           1537.32          1346.20          1189.10         12.43        22.65
      Raven          32            803.07           705.19           710.89         12.19        11.48
      Sheriff        20          13951.79         12986.99          7317.07          6.92        47.55
      Sheriff        24           6472.74          5838.45          3739.43          9.80        42.23
      Sheriff        28           2665.81          2361.96          1817.07         11.40        31.84
      Sheriff        32           1159.36          1032.24           963.12         10.96        16.93
      SpinCalendar   20          25071.25         21394.72         11108.62         14.66        55.69
      SpinCalendar   24           7878.49          5930.58          4548.43         24.72        42.27
      SpinCalendar   28           2653.01          2194.53          2046.35         17.28        22.87
      SpinCalendar   32           1315.22          1129.24          1177.62         14.14        10.46
      Average                                                                       12.18        28.32
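
The reduction percentages in the Bit Saving tables follow from the bitrates as (reference − test) / reference. A minimal sketch that reproduces the Cyclists, QP 20 row; the function name is an illustrative assumption:

```python
def bitrate_reduction(ref_kbps, test_kbps):
    """Percentage bitrate reduction of a test encoder against a reference."""
    return 100.0 * (ref_kbps - test_kbps) / ref_kbps


# Cyclists, QP 20: JM 14.2 = 7945.83 kbps, Chen's = 6889.50, Proposed = 5149.85.
chen = round(bitrate_reduction(7945.83, 6889.50), 2)      # 13.29
proposed = round(bitrate_reduction(7945.83, 5149.85), 2)  # 35.19
```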
  • 20. Frame Differences. JM 14.2, QP=20, 88th frame
  • 23. Frame Differences. JM 14.2, QP=20, 102nd frame
  • 26. SSIM-motivated Perceptual Coding. Yi-Hsin Huang et al., "Perceptual Rate-Distortion Optimization Using Structural Similarity Index as Quality Metric", IEEE T-CSVT, vol. 20, no. 11, pp. 1614-1624, Nov. 2010. Key points: replace PSNR with SSIM; empirically estimate a Rate-SSIM model; reuse the classical Lagrange multiplier method for mode selection and motion estimation.
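
Lagrangian mode selection with an SSIM-based distortion term can be sketched as below. This is not the cited paper's algorithm: using 1 − SSIM as the distortion is an assumption made here for illustration, and the candidate modes, their SSIM scores, bit counts and the multiplier values are all hypothetical.

```python
def rd_cost(ssim, bits, lam):
    """Lagrangian cost J = D + lambda * R, with D = 1 - SSIM (assumed form)."""
    return (1.0 - ssim) + lam * bits


def select_mode(candidates, lam):
    """Return the name of the candidate (name, ssim, bits) with the lowest cost."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]


# Hypothetical candidates for one block: (mode name, SSIM, bits).
CANDIDATES = [("intra", 0.95, 120), ("inter", 0.93, 40), ("skip", 0.80, 2)]

best_low_lambda = select_mode(CANDIDATES, 0.0001)  # quality dominates
best_mid_lambda = select_mode(CANDIDATES, 0.001)
best_high_lambda = select_mode(CANDIDATES, 0.1)    # rate dominates
```

A small multiplier favors the highest-SSIM mode, while a large one favors the cheapest mode, which is why the Rate-SSIM model matters for choosing the multiplier.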
  • 27. Improved SSIM Perceptual Coding. Shiqi Wang et al., "SSIM-Motivated Rate-Distortion Optimization for Video Coding", IEEE T-CSVT, vol. 22, no. 4, pp. 516-529, April 2012 (they seek an analytical model for the Rate-SSIM relationship). ChuoHao Yeo et al., "On Rate Distortion Optimization using SSIM", ICASSP 2012. Abdul Rehman et al., "SSIM-Inspired Perceptual Video Coding for HEVC", ICME 2012. Xi Wang et al., "Motion Based Perceptual Distortion and Rate Optimization for Video Coding", ICME 2012.
  • 28. Basic Analysis-Completion Structure [P. Ndjiki-Nya, Signal Processing: Image Communication, 2012]
  • 29. Abstract+Detail Framework. Key frame: abstract + detail. Non-key frame: abstract only. Use bilateral filtering to remove details; use ME (motion estimation) to find the matching block to recover details. [Z. Yuan, H. Xiong and Li Song, ICASSP 2009]
  • 30. Super-resolution Framework. Encoder and decoder diagrams. Symmetric coding complexity; 5~10% bit saving at the same quality. [Q. Zhou and Li Song, IEEE PCM 2010]
  • 31. Outline: Introduction (Perceptual Cues in Video Coding); Recent Research (JND based RDO; SSIM based RDO; Analysis-Completion Framework); Summary & References
  • 32. Personal Perspective. Can we do much better than HEVC? Yes: new-generation video coding will probably need more perception-related techniques. Some preliminary works: "On Just Noticeable Distortion Quantization in the HEVC Codec", JCTVC-H0477, Feb. 2012 (claims 3%~25% bitrate saving at the same quality); "A joint JND model based on luminance and frequency masking for HEVC", JCTVC-I0163, May 2012 (claims 3%~30% bitrate saving at the same quality).
  • 33. Personal Perspective. Future research. Advanced computational HVS models: suprathreshold vs. subthreshold; other masking models, such as attention. Exploiting new distortion metrics: image statistical properties; learning from large-scale datasets. Generic R-D optimization: R-D relationship and RDO for video coding.
  • 34. References: important papers
      - J. L. Mannos and D. J. Sakrison, "The Effects of a Visual Fidelity Criterion on the Encoding of Images", IEEE Trans. on Information Theory, vol. 20, no. 4, July 1974. (Cited by 776)
      - N. Jayant, J. Johnston and R. Safranek, "Signal Compression Based on Models of Human Perception", Proceedings of the IEEE, vol. 81, no. 10, Oct. 1993. (Cited by 761)
      - A. Ortega and K. Ramchandran, "Rate-distortion methods for image and video compression", IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 23-50, 1998. (Cited by 597)
      - Z. Wang and A. C. Bovik, "Mean Squared Error: love it or leave it? A new look at Signal Fidelity Measures", IEEE Signal Processing Magazine, vol. 26, no. 1, pp. 98-117, Jan. 2009. (Cited by 353)
      - Ching Yang Wang, Shiuh Ming Lee and Long-Wen Chang, "Designing JPEG quantization tables based on human visual system", Sig. Proc.: Image Comm., vol. 16, no. 5, pp. 501-506, 2001.
      - Wenjun Zeng, Scott Daly and Shawmin Lei, "An Overview of the Visual Optimization Tools in JPEG 2000", Sig. Proc.: Image Comm., vol. 17, pp. 85-104, 2002.
  • 35. References: JND related
      - X. Yang, W. Lin, Z. Lu, E. Ong and S. Yao, "Motion-compensated Residue Pre-processing in Video Coding Based on Just-noticeable-distortion Profile", IEEE Trans. Circuits and Systems for Video Technology, vol. 15, no. 6, pp. 742-750, June 2005.
      - Z. Chen and C. Guillemot, "Perceptually-friendly H.264/AVC video coding based on foveated Just-Noticeable-Distortion model", IEEE Trans. Circuits and Systems for Video Technology, vol. 20, no. 6, pp. 806-819, June 2010.
      - M. Naccari and F. Pereira, "Advanced H.264/AVC based perceptual video coding: architecture, tools and assessment", IEEE Trans. Circuits and Systems for Video Technology, vol. 21, no. 6, pp. 766-782, June 2011.
      - M. Naccari and M. Mrak, "On Just Noticeable Distortion Quantization in the HEVC codec", JCTVC-H0477, JCT-VC 8th Meeting, San Jose, Feb. 2012.
      - Z. Luo, Li Song and S. Zheng, "Improving H.264/AVC Video Coding with Adaptive Coefficient Suppression", IEEE International Symposium on Circuits and Systems (ISCAS 2010), May 30-June 2, 2010, France.
  • 36. References: SSIM or other metrics as distortion
      - Yi-Hsin Huang, Tao-Sheng Ou, Po-Yen Su and H. H. Chen, "Perceptual Rate-Distortion Optimization Using Structural Similarity Index as Quality Metric", IEEE Trans. Circuits and Systems for Video Technology, vol. 20, no. 11, pp. 1614-1624, Nov. 2010.
      - Yi-Hsin Huang, Tao-Sheng Ou, Po-Yen Su and H. H. Chen, "SSIM-Based Perceptual Rate Control for Video Coding", IEEE Trans. Circuits and Systems for Video Technology, vol. 21, no. 5, pp. 682-691, May 2011.
      - Shiqi Wang, A. Rehman, Zhou Wang, Siwei Ma and Wen Gao, "SSIM-Motivated Rate-Distortion Optimization for Video Coding", IEEE Trans. Circuits and Systems for Video Technology, vol. 22, no. 4, pp. 516-529, April 2012.
      - Yeo ChuoHao, Tan Huili and Tan Yihhan, "On Rate Distortion Optimization using SSIM", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2012.
      - Abdul Rehman and Zhou Wang, "SSIM-Inspired Perceptual Video Coding for HEVC", IEEE International Conference on Multimedia and Expo, June 2012.
      - Xi Wang, Li Su, Qingming Huang, Chunxi Liu and Ling-yu Duan, "Motion Based Perceptual Distortion and Rate Optimization for Video Coding", IEEE International Conference on Multimedia and Expo, 2012.
  • 37. References: Analysis-Completion Framework
      - Minmin Shen, Ping Xue and Ci Wang, "Down-Sampling Based Video Coding Using Super-Resolution Technique", IEEE Trans. Circuits and Systems for Video Technology, vol. 21, no. 6, pp. 755-765, June 2011.
      - P. Ndjiki-Nya, D. Doshkov, H. Kaprykowsky, F. Zhang, D. Bull and T. Wiegand, "Perception-oriented video coding based on image analysis and completion: A review", Signal Processing: Image Communication, vol. 27, pp. 579-594, 2012.
      - F. Zhang and D. R. Bull, "A parametric framework for video compression using region-based texture models", IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, pp. 1378-1392, 2011.
      - Q. Zhou, Li Song and W. Zhang, "Video Coding With Key Frames Guided Super Resolution", IEEE Pacific-Rim Conference on Multimedia (PCM 2010), September 21-24, Shanghai, China.
      - Z. Yuan, H. Xiong and Li Song, "Generic Video Coding With Abstraction And Detail Completion", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009), April 19-24, 2009, Taipei, Taiwan.