SlideShare ist ein Scribd-Unternehmen logo
1 von 19
1
Motion Estimation methods
Review and comparison
2
Motion Prediction Models
Translational Model
the prediction signal for each block is a block
of same size MXN from other frames
the prediction block is specified by the
translational motion vector and reference
frame(s) index.
Affine Model
motion in 3-D is described by affine
transformations (a composition of scaling,
rotation, sheering and translation).
3
Translational Model
Translational model maps rectangle to rectangle of same size and it is non-
adequate for 3-D motion (e.g. imagine the case when a rectangular object
quickly approaching to the camera, its size is increasing). This model works
good for small motions:
Subdivide current frame into blocks.
Find one displacement vector for each block:
Within a search range, find a “best match” that minimizes an error
measure.
In Translational model all pixels with a current block are displaced by same
vector from the reference block. It’s like the reference block is displaced
(without rotation and scaling) to a new position.
4
Affine Model
5
Taxonomy of Motion Estimation Methods
Pixel Domain Methods
Matching algorithms
Block Matching (most popular): full-search, Three Step, diamond etc.
Feature Matching: Integral Projection matching, Successive Elimination
Gradient-based algorithms
pel-recursive
block-recursive
Frequency Domain Methods
Phase correlation
matching in wavelet domain
matching in DCT domain
6
Motion estimation Parameters
Search Area
in case of significant (or fast) motion large search
area impacts significantly on Motion Estimation
effectiveness. On the other hand ME complexity
increases.
Sub-pixel mode
Motion is not limited to pixel granularity, therefore
sub-pixel prediction (with accuracy up to 1/8 of
pixel) is applied
7
Motion estimation error measures
8
Block Matching Motion Estimation Parameters
Hierarchical Architecture:
To reduce complexity and/or to pipeline Motion Estimation
two hierarchical levels are commonly used:
First stage: coarse motion estimation (usually on
decimated search region)
Second stage: fine motion estimation tuning around
“best” coarse motion vectors obtained in the previous
stage.
Speed Up Techniques:
Early termination – exclude current candidate if its
preliminary cost exceeding the minimal cost (already
obtained).
Exclude candidates – not all candidates are checked
(e.g. logarithmic search schemas).
9
Inter Prediction Comparison: AVC/H.264, HEVC/H.265, VP9, AV1
Features AVC/H.264 HEVC/H.265 VP9 AV1
Square blocks
Only
yes no yes no
Weighted
prediction
yes yes no yes
Bi-Prediction yes yes Yes as
superframe*
Yes as
superframe*
Number of
references
Up to 16
(depending on
level)
Up to 16
(depending on
level)
3 Up to 7
Sub-pixel
Precision
¼-pel for luma
1/8-pel for
chroma
¼-pel for luma
1/8-pel for chroma
1/8-pel for luma
1/16-pel for
chroma
1/8-pel for luma
1/16-pel for
chroma
*To avoid patent infringements B-frame is coded as a couple of non-displayable
frame plus displayable frame consisting of skip blocks. This pair of frames is
called ‘superframe’
10
Rectangular Prediction Blocks in HEVC/H.265 and AV1
 HEVC/H.265
 AV1
2
0 1
Horizontal A
Split
0
1 2
Horizontal B
Split
Sub-blocks are not split further
0
1
2
Vertical A
Split
1
2
0
Vertical B
Split
0 1 2 3
Vertical 4:1
Split
0
1
2
3
Horizontal 4:1
Split
nLx2N nRx2
N
2Nxn
U
2Nxn
D
11
Rectangular Prediction Blocks in HEVC/H.265 and AV1 (cont.)
2NxnU 2NxnD nRx2NnLx2N
Benefits of rectangular partitioning (HEVC(
Benefits of rectangular partitioning (AV1(
12
Sub-pixel Precision in AVC/H.264, HEVC/H.265, VP9 and AV1
 AVC sub-pel precision:
 is ¼ for luma and ⅛ for chroma respectively (4:2:0(.
The interpolation filters for both luma and chroma are fixed (non-
adaptive(.
 For luma the interpolation is pipelined and it is executed in two non-balanced serial
stages for each direction (horizontal and vertical(:
6-tap filter for half-pels (high complex(
bilinear filter for quarter-pels (low complex(
 For chroma a fixed 4-tap filter is used for all fractional positions (similar to HEVC(.
 HEVC sub-pel precision: ¼ for luma and ⅛ for chroma respectively
The interpolation filters for generating sub-pel data for both luma and
chroma are fixed (non-adaptive(:
 For luma pixels a fixed 8-tap filter is applied for both half-pels and quarter-pels.
The luma interpolation process is pipelined, it consists of two stages: horizontal and
vertical
filtering.
 For chroma a fixed 4-tap filter is used for all fractional positions.
13
Sub-pixel Precision in H.264, H.265, VP9 & AV1
VP9 sub-pel precision
¼ for luma and ⅛ for chroma respectively (if 4:2:0(.
The interpolation filters for generating sub-pel data can be
adaptively chosen at frame-level, available filters kernels:
Normal
Smooth - slightly smooths or blurs the prediction block
Sharp - slightly sharpens the prediction block.
Interpolation filtering is pipelined: firstly a corresponding
horizontal filter is used to build up a temporary array, and
then at the second stage this array is vertically filtered to
obtain the final prediction.
Note: important advantage of HEVC over VP9 is a
separation of filters for half and quarter pel (can be realized
in stages, friendly for HW(.
14
Sub-pixel Precision in H.264, H.265, VP9 & AV1
AV1 sub-pel precision
Up-to 1/8-pel sub-pel precision for luma (1/8 and 1/16 precision for
chroma respectively due to 4:2:0(, the precision level is specified at
frame level.
There are four interpolation kernels (up to 8 taps(, filter can be block-
level adaptive:
EIGHTTAP, EIGHTTAP_SMOOTH, EIGHTTAP_SHARP,
BILINEAR
Each filter is separable (i.e. filtering process is pipelined(: firstly
horizontal filtering is performed and then vertical filtering.
Interpolation filter can be fixed within a frame, in such case one of
four kernels is selected at frame header.
Interpolation filter can be switchable at block-level
There is a special mode - dual filtering, where kernel for each
direction can be different. Justification for dual filtering - signals can
possess distinctive statistics in vertical and horizontal directions.
15
Use Case: HEVC/H.265 Motion Estimation Details
 Variable inter-prediction block sizes – from 8x8 to 64x64, including non-square sizes like
32x16 (actually 4x8 and 8x4 blocks are also permitted with some constraints(.
 Chroma block sizes mimic luma, for 4:2:0 case with the scaling factor 1/2 (although for small
luma blocks the scaling factor is 1(.
 Bi-directional prediction: two prediction blocks from previous and future pictures are mixed
(averaged( to produce the final prediction signal (it’s a kind of interpolation(.
 weighted prediction (e.g. to compensate fading(.
 Sub-pixel precision: up to 1/4-th for luma and up to 1/8 for chroma
16
Weighted Prediction in HEVC/H.265
Fwd Ref
Horizont
al Filter
Vertical
Filter
<<
6
<<
2
8bits per pixel
10bits per pix
Bwd Ref
Horizont
al Filter
Vertical
Filter
<<
6
<<
2
8bits per pixel
10bits per pix
Merge
Predicted signal
17
AV1 Motion Estimation Details
 AV1 supports Global Motion mode which is divided into the following
categories:
Translation (panning video)
Rotation
Zoom
Affine (suitable for 3D motion)
 AV1 supports OBMC (Overlapped Block Motion Compensation)
 AV1 supports Warped motion per superblock
Examples:
In case of translation a global Motion Vector is applied for the whole
frame.
In case of Zoom and Rotation Motion Vector is depending on block
location
18
AV1 Motion Estimation Details – General Idea of OBMC
Justification of OBMC - MV is most reliable in the center of the block (where
prediction errors tend to be smaller than those at the corners). For a block it’s better to
assign several MVs (its own and nearby blocks) and to blend reference samples:
Block
MV0 MV1
MV2
MV3
r
c
19
AV1 Motion Estimation Details – Technical Details of OBMC
In AV1 OBMC predicted block is associated with a single vector MV0
corresponding to the block’s center while corner MVs are taken from causal
(already decoded) neighbors.
Blending is executed in two separable stages: firstly according to vertical
direction and then according to horizontal direction (the filter coefficients are
pre-defined in the AV1 spec.)
shadow of block2
Block
1 block2
MV1
MV2
shadowofBlock1
MV0
Get prediction samples according
to MV0
Get prediction samples for
overlap area of block1 according
to MV1
Get prediction samples for
overlap area of block2 according
to MV2
block3
block4
Shadow of block3
Shadow of4
MV3
MV4
Get prediction samples for
overlap area of block3 according
to MV3
Get prediction samples for
overlap area of block4 according
to MV4
blending blending

Weitere ähnliche Inhalte

Was ist angesagt?

DIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTESDIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTESEzhilya venkat
 
Image feature extraction
Image feature extractionImage feature extraction
Image feature extractionRushin Shah
 
Image Representation & Descriptors
Image Representation & DescriptorsImage Representation & Descriptors
Image Representation & DescriptorsPundrikPatel
 
Application of edge detection
Application of edge detectionApplication of edge detection
Application of edge detectionNaresh Biloniya
 
Histogram Processing
Histogram ProcessingHistogram Processing
Histogram ProcessingAmnaakhaan
 
Frequency Domain Image Enhancement Techniques
Frequency Domain Image Enhancement TechniquesFrequency Domain Image Enhancement Techniques
Frequency Domain Image Enhancement TechniquesDiwaker Pant
 
Image Enhancement using Frequency Domain Filters
Image Enhancement using Frequency Domain FiltersImage Enhancement using Frequency Domain Filters
Image Enhancement using Frequency Domain FiltersKarthika Ramachandran
 
Chapter 3 image enhancement (spatial domain)
Chapter 3 image enhancement (spatial domain)Chapter 3 image enhancement (spatial domain)
Chapter 3 image enhancement (spatial domain)asodariyabhavesh
 
Image filtering in Digital image processing
Image filtering in Digital image processingImage filtering in Digital image processing
Image filtering in Digital image processingAbinaya B
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationMostafa G. M. Mostafa
 
Image degradation and noise by Md.Naseem Ashraf
Image degradation and noise by Md.Naseem AshrafImage degradation and noise by Md.Naseem Ashraf
Image degradation and noise by Md.Naseem AshrafMD Naseem Ashraf
 
Simultaneous Smoothing and Sharpening of Color Images
Simultaneous Smoothing and Sharpening of Color ImagesSimultaneous Smoothing and Sharpening of Color Images
Simultaneous Smoothing and Sharpening of Color ImagesCristina Pérez Benito
 
Image restoration and degradation model
Image restoration and degradation modelImage restoration and degradation model
Image restoration and degradation modelAnupriyaDurai
 
Chapter10 image segmentation
Chapter10 image segmentationChapter10 image segmentation
Chapter10 image segmentationasodariyabhavesh
 

Was ist angesagt? (20)

DIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTESDIGITAL IMAGE PROCESSING - LECTURE NOTES
DIGITAL IMAGE PROCESSING - LECTURE NOTES
 
Image feature extraction
Image feature extractionImage feature extraction
Image feature extraction
 
Digital image processing
Digital image processing  Digital image processing
Digital image processing
 
Image Representation & Descriptors
Image Representation & DescriptorsImage Representation & Descriptors
Image Representation & Descriptors
 
Application of edge detection
Application of edge detectionApplication of edge detection
Application of edge detection
 
Histogram Processing
Histogram ProcessingHistogram Processing
Histogram Processing
 
Psuedo color
Psuedo colorPsuedo color
Psuedo color
 
Frequency Domain Image Enhancement Techniques
Frequency Domain Image Enhancement TechniquesFrequency Domain Image Enhancement Techniques
Frequency Domain Image Enhancement Techniques
 
Image Enhancement using Frequency Domain Filters
Image Enhancement using Frequency Domain FiltersImage Enhancement using Frequency Domain Filters
Image Enhancement using Frequency Domain Filters
 
Chapter 3 image enhancement (spatial domain)
Chapter 3 image enhancement (spatial domain)Chapter 3 image enhancement (spatial domain)
Chapter 3 image enhancement (spatial domain)
 
Image filtering in Digital image processing
Image filtering in Digital image processingImage filtering in Digital image processing
Image filtering in Digital image processing
 
image enhancement
 image enhancement image enhancement
image enhancement
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
 
Image degradation and noise by Md.Naseem Ashraf
Image degradation and noise by Md.Naseem AshrafImage degradation and noise by Md.Naseem Ashraf
Image degradation and noise by Md.Naseem Ashraf
 
Image compression .
Image compression .Image compression .
Image compression .
 
Simultaneous Smoothing and Sharpening of Color Images
Simultaneous Smoothing and Sharpening of Color ImagesSimultaneous Smoothing and Sharpening of Color Images
Simultaneous Smoothing and Sharpening of Color Images
 
Data Redundacy
Data RedundacyData Redundacy
Data Redundacy
 
Image restoration and degradation model
Image restoration and degradation modelImage restoration and degradation model
Image restoration and degradation model
 
Morphological operations
Morphological operationsMorphological operations
Morphological operations
 
Chapter10 image segmentation
Chapter10 image segmentationChapter10 image segmentation
Chapter10 image segmentation
 

Ähnlich wie Motion estimation overview

Types Of Window Being Used For The Selected Granule
Types Of Window Being Used For The Selected GranuleTypes Of Window Being Used For The Selected Granule
Types Of Window Being Used For The Selected GranuleLeslie Lee
 
AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...
AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...
AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...VLSICS Design
 
Adaptive trilateral filter for hevc standard
Adaptive trilateral filter for hevc standardAdaptive trilateral filter for hevc standard
Adaptive trilateral filter for hevc standardijma
 
74 real time-image-processing-applied-to-traffic-queue-d
74 real time-image-processing-applied-to-traffic-queue-d74 real time-image-processing-applied-to-traffic-queue-d
74 real time-image-processing-applied-to-traffic-queue-dravi247272
 
Adaptive Trilateral Filter for In-Loop Filtering
Adaptive Trilateral Filter for In-Loop FilteringAdaptive Trilateral Filter for In-Loop Filtering
Adaptive Trilateral Filter for In-Loop Filteringcsandit
 
Multimedia basic video compression techniques
Multimedia basic video compression techniquesMultimedia basic video compression techniques
Multimedia basic video compression techniquesMazin Alwaaly
 
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...cscpconf
 
Optical Flow Based Navigation
Optical Flow Based NavigationOptical Flow Based Navigation
Optical Flow Based NavigationVincent Kee
 
Lane detection by use of canny edge
Lane detection by use of canny edgeLane detection by use of canny edge
Lane detection by use of canny edgebanz23
 
Aruna Ravi - M.S Thesis
Aruna Ravi - M.S ThesisAruna Ravi - M.S Thesis
Aruna Ravi - M.S ThesisArunaRavi
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingYu Huang
 
Repeat-Frame Selection Algorithm for Frame Rate Video Transcoding
Repeat-Frame Selection Algorithm for Frame Rate Video TranscodingRepeat-Frame Selection Algorithm for Frame Rate Video Transcoding
Repeat-Frame Selection Algorithm for Frame Rate Video TranscodingCSCJournals
 
De Interlacing Techniques
De Interlacing TechniquesDe Interlacing Techniques
De Interlacing TechniquesRamesh Prasad
 
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...CSCJournals
 
Implementation of OFDM System Using Various Channel Modulation Schemes
Implementation of OFDM System Using Various Channel Modulation SchemesImplementation of OFDM System Using Various Channel Modulation Schemes
Implementation of OFDM System Using Various Channel Modulation SchemesIJCSIS Research Publications
 

Ähnlich wie Motion estimation overview (20)

HEVC overview main
HEVC overview mainHEVC overview main
HEVC overview main
 
Types Of Window Being Used For The Selected Granule
Types Of Window Being Used For The Selected GranuleTypes Of Window Being Used For The Selected Granule
Types Of Window Being Used For The Selected Granule
 
AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...
AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...
AREA OPTIMIZED FPGA IMPLEMENTATION FOR GENERATION OF RADAR PULSE COM-PRESSION...
 
PPT
PPTPPT
PPT
 
Adaptive trilateral filter for hevc standard
Adaptive trilateral filter for hevc standardAdaptive trilateral filter for hevc standard
Adaptive trilateral filter for hevc standard
 
74 real time-image-processing-applied-to-traffic-queue-d
74 real time-image-processing-applied-to-traffic-queue-d74 real time-image-processing-applied-to-traffic-queue-d
74 real time-image-processing-applied-to-traffic-queue-d
 
Adaptive Trilateral Filter for In-Loop Filtering
Adaptive Trilateral Filter for In-Loop FilteringAdaptive Trilateral Filter for In-Loop Filtering
Adaptive Trilateral Filter for In-Loop Filtering
 
Multimedia basic video compression techniques
Multimedia basic video compression techniquesMultimedia basic video compression techniques
Multimedia basic video compression techniques
 
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
ADAPTIVE, SCALABLE, TRANSFORMDOMAIN GLOBAL MOTION ESTIMATION FOR VIDEO STABIL...
 
Deblocking_Filter_v2
Deblocking_Filter_v2Deblocking_Filter_v2
Deblocking_Filter_v2
 
Optical Flow Based Navigation
Optical Flow Based NavigationOptical Flow Based Navigation
Optical Flow Based Navigation
 
Lane detection by use of canny edge
Lane detection by use of canny edgeLane detection by use of canny edge
Lane detection by use of canny edge
 
Aruna Ravi - M.S Thesis
Aruna Ravi - M.S ThesisAruna Ravi - M.S Thesis
Aruna Ravi - M.S Thesis
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous Driving
 
Repeat-Frame Selection Algorithm for Frame Rate Video Transcoding
Repeat-Frame Selection Algorithm for Frame Rate Video TranscodingRepeat-Frame Selection Algorithm for Frame Rate Video Transcoding
Repeat-Frame Selection Algorithm for Frame Rate Video Transcoding
 
An734
An734An734
An734
 
De Interlacing Techniques
De Interlacing TechniquesDe Interlacing Techniques
De Interlacing Techniques
 
Temporal Segment Network
Temporal Segment NetworkTemporal Segment Network
Temporal Segment Network
 
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
Fast Motion Estimation for Quad-Tree Based Video Coder Using Normalized Cross...
 
Implementation of OFDM System Using Various Channel Modulation Schemes
Implementation of OFDM System Using Various Channel Modulation SchemesImplementation of OFDM System Using Various Channel Modulation Schemes
Implementation of OFDM System Using Various Channel Modulation Schemes
 

Mehr von Yoss Cohen

Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
open platform for swarm training
open platform for swarm training open platform for swarm training
open platform for swarm training Yoss Cohen
 
Deep Learning - system view
Deep Learning - system viewDeep Learning - system view
Deep Learning - system viewYoss Cohen
 
Dspip deep learning syllabus
Dspip deep learning syllabusDspip deep learning syllabus
Dspip deep learning syllabusYoss Cohen
 
IoT consideration selection
IoT consideration selectionIoT consideration selection
IoT consideration selectionYoss Cohen
 
Nvidia jetson nano bringup
Nvidia jetson nano bringupNvidia jetson nano bringup
Nvidia jetson nano bringupYoss Cohen
 
Autonomous car teleportation architecture
Autonomous car teleportation architectureAutonomous car teleportation architecture
Autonomous car teleportation architectureYoss Cohen
 
Computer Vision - Image Filters
Computer Vision - Image FiltersComputer Vision - Image Filters
Computer Vision - Image FiltersYoss Cohen
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learnYoss Cohen
 
DASH and HTTP2.0
DASH and HTTP2.0DASH and HTTP2.0
DASH and HTTP2.0Yoss Cohen
 
HEVC Definitions and high-level syntax
HEVC Definitions and high-level syntaxHEVC Definitions and high-level syntax
HEVC Definitions and high-level syntaxYoss Cohen
 
Introduction to HEVC
Introduction to HEVCIntroduction to HEVC
Introduction to HEVCYoss Cohen
 
FFMPEG on android
FFMPEG on androidFFMPEG on android
FFMPEG on androidYoss Cohen
 
Hands-on Video Course - "RAW Video"
Hands-on Video Course - "RAW Video" Hands-on Video Course - "RAW Video"
Hands-on Video Course - "RAW Video" Yoss Cohen
 
Video quality testing
Video quality testingVideo quality testing
Video quality testingYoss Cohen
 
HEVC / H265 Hands-On course
HEVC / H265 Hands-On courseHEVC / H265 Hands-On course
HEVC / H265 Hands-On courseYoss Cohen
 
Web video standards
Web video standardsWeb video standards
Web video standardsYoss Cohen
 
Product wise computer vision development
Product wise computer vision developmentProduct wise computer vision development
Product wise computer vision developmentYoss Cohen
 
3D Video Programming for Android
3D Video Programming for Android3D Video Programming for Android
3D Video Programming for AndroidYoss Cohen
 

Mehr von Yoss Cohen (20)

Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
open platform for swarm training
open platform for swarm training open platform for swarm training
open platform for swarm training
 
Deep Learning - system view
Deep Learning - system viewDeep Learning - system view
Deep Learning - system view
 
Dspip deep learning syllabus
Dspip deep learning syllabusDspip deep learning syllabus
Dspip deep learning syllabus
 
IoT consideration selection
IoT consideration selectionIoT consideration selection
IoT consideration selection
 
IoT evolution
IoT evolutionIoT evolution
IoT evolution
 
Nvidia jetson nano bringup
Nvidia jetson nano bringupNvidia jetson nano bringup
Nvidia jetson nano bringup
 
Autonomous car teleportation architecture
Autonomous car teleportation architectureAutonomous car teleportation architecture
Autonomous car teleportation architecture
 
Computer Vision - Image Filters
Computer Vision - Image FiltersComputer Vision - Image Filters
Computer Vision - Image Filters
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
 
DASH and HTTP2.0
DASH and HTTP2.0DASH and HTTP2.0
DASH and HTTP2.0
 
HEVC Definitions and high-level syntax
HEVC Definitions and high-level syntaxHEVC Definitions and high-level syntax
HEVC Definitions and high-level syntax
 
Introduction to HEVC
Introduction to HEVCIntroduction to HEVC
Introduction to HEVC
 
FFMPEG on android
FFMPEG on androidFFMPEG on android
FFMPEG on android
 
Hands-on Video Course - "RAW Video"
Hands-on Video Course - "RAW Video" Hands-on Video Course - "RAW Video"
Hands-on Video Course - "RAW Video"
 
Video quality testing
Video quality testingVideo quality testing
Video quality testing
 
HEVC / H265 Hands-On course
HEVC / H265 Hands-On courseHEVC / H265 Hands-On course
HEVC / H265 Hands-On course
 
Web video standards
Web video standardsWeb video standards
Web video standards
 
Product wise computer vision development
Product wise computer vision developmentProduct wise computer vision development
Product wise computer vision development
 
3D Video Programming for Android
3D Video Programming for Android3D Video Programming for Android
3D Video Programming for Android
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 

Kürzlich hochgeladen (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 

Motion estimation overview

  • 2. 2 Motion Prediction Models Translational Model the prediction signal for each block is a block of same size MXN from other frames the prediction block is specified by the translational motion vector and reference frame(s) index. Affine Model motion in 3-D is described by affine transformations (a composition of scaling, rotation, sheering and translation).
  • 3. 3 Translational Model Translational model maps rectangle to rectangle of same size and it is non- adequate for 3-D motion (e.g. imagine the case when a rectangular object quickly approaching to the camera, its size is increasing). This model works good for small motions: Subdivide current frame into blocks. Find one displacement vector for each block: Within a search range, find a “best match” that minimizes an error measure. In Translational model all pixels with a current block are displaced by same vector from the reference block. It’s like the reference block is displaced (without rotation and scaling) to a new position.
  • 5. 5 Taxonomy of Motion Estimation Methods Pixel Domain Methods Matching algorithms Block Matching (most popular): full-search, Three Step, diamond etc. Feature Matching: Integral Projection matching, Successive Elimination Gradient-based algorithms pel-recursive block-recursive Frequency Domain Methods Phase correlation matching in wavelet domain matching in DCT domain
  • 6. 6 Motion estimation Parameters Search Area in case of significant (or fast) motion large search area impacts significantly on Motion Estimation effectiveness. On the other hand ME complexity increases. Sub-pixel mode Motion is not limited to pixel granularity, therefore sub-pixel prediction (with accuracy up to 1/8 of pixel) is applied
  • 8. 8 Block Matching Motion Estimation Parameters Hierarchical Architecture: To reduce complexity and/or to pipeline Motion Estimation two hierarchical levels are commonly used: First stage: coarse motion estimation (usually on decimated search region) Second stage: fine motion estimation tuning around “best” coarse motion vectors obtained in the previous stage. Speed Up Techniques: Early termination – exclude current candidate if its preliminary cost exceeding the minimal cost (already obtained). Exclude candidates – not all candidates are checked (e.g. logarithmic search schemas).
  • 9. 9 Inter Prediction Comparison: AVC/H.264, HEVC/H.265, VP9, AV1 Features AVC/H.264 HEVC/H.265 VP9 AV1 Square blocks Only yes no yes no Weighted prediction yes yes no yes Bi-Prediction yes yes Yes as superframe* Yes as superframe* Number of references Up to 16 (depending on level) Up to 16 (depending on level) 3 Up to 7 Sub-pixel Precision ¼-pel for luma 1/8-pel for chroma ¼-pel for luma 1/8-pel for chroma 1/8-pel for luma 1/16-pel for chroma 1/8-pel for luma 1/16-pel for chroma *To avoid patent infringements B-frame is coded as a couple of non-displayable frame plus displayable frame consisting of skip blocks. This pair of frames is called ‘superframe’
  • 10. 10 Rectangular Prediction Blocks in HEVC/H.265 and AV1  HEVC/H.265  AV1 2 0 1 Horizontal A Split 0 1 2 Horizontal B Split Sub-blocks are not split further 0 1 2 Vertical A Split 1 2 0 Vertical B Split 0 1 2 3 Vertical 4:1 Split 0 1 2 3 Horizontal 4:1 Split nLx2N nRx2 N 2Nxn U 2Nxn D
  • 11. 11 Rectangular Prediction Blocks in HEVC/H.265 and AV1 (cont.) 2NxnU 2NxnD nRx2NnLx2N Benefits of rectangular partitioning (HEVC( Benefits of rectangular partitioning (AV1(
  • 12. 12 Sub-pixel Precision in AVC/H.264, HEVC/H.265, VP9 and AV1  AVC sub-pel precision:  is ¼ for luma and ⅛ for chroma respectively (4:2:0(. The interpolation filters for both luma and chroma are fixed (non- adaptive(.  For luma the interpolation is pipelined and it is executed in two non-balanced serial stages for each direction (horizontal and vertical(: 6-tap filter for half-pels (high complex( bilinear filter for quarter-pels (low complex(  For chroma a fixed 4-tap filter is used for all fractional positions (similar to HEVC(.  HEVC sub-pel precision: ¼ for luma and ⅛ for chroma respectively The interpolation filters for generating sub-pel data for both luma and chroma are fixed (non-adaptive(:  For luma pixels a fixed 8-tap filter is applied for both half-pels and quarter-pels. The luma interpolation process is pipelined, it consists of two stages: horizontal and vertical filtering.  For chroma a fixed 4-tap filter is used for all fractional positions.
  • 13. 13 Sub-pixel Precision in H.264, H.265, VP9 & AV1 VP9 sub-pel precision ¼ for luma and ⅛ for chroma respectively (if 4:2:0(. The interpolation filters for generating sub-pel data can be adaptively chosen at frame-level, available filters kernels: Normal Smooth - slightly smooths or blurs the prediction block Sharp - slightly sharpens the prediction block. Interpolation filtering is pipelined: firstly a corresponding horizontal filter is used to build up a temporary array, and then at the second stage this array is vertically filtered to obtain the final prediction. Note: important advantage of HEVC over VP9 is a separation of filters for half and quarter pel (can be realized in stages, friendly for HW(.
  • 14. 14 Sub-pixel Precision in H.264, H.265, VP9 & AV1 AV1 sub-pel precision Up-to 1/8-pel sub-pel precision for luma (1/8 and 1/16 precision for chroma respectively due to 4:2:0(, the precision level is specified at frame level. There are four interpolation kernels (up to 8 taps(, filter can be block- level adaptive: EIGHTTAP, EIGHTTAP_SMOOTH, EIGHTTAP_SHARP, BILINEAR Each filter is separable (i.e. filtering process is pipelined(: firstly horizontal filtering is performed and then vertical filtering. Interpolation filter can be fixed within a frame, in such case one of four kernels is selected at frame header. Interpolation filter can be switchable at block-level There is a special mode - dual filtering, where kernel for each direction can be different. Justification for dual filtering - signals can possess distinctive statistics in vertical and horizontal directions.
  • 15. 15 Use Case: HEVC/H.265 Motion Estimation Details  Variable inter-prediction block sizes – from 8x8 to 64x64, including non-square sizes like 32x16 (actually 4x8 and 8x4 blocks are also permitted with some constraints(.  Chroma block sizes mimic luma, for 4:2:0 case with the scaling factor 1/2 (although for small luma blocks the scaling factor is 1(.  Bi-directional prediction: two prediction blocks from previous and future pictures are mixed (averaged( to produce the final prediction signal (it’s a kind of interpolation(.  weighted prediction (e.g. to compensate fading(.  Sub-pixel precision: up to 1/4-th for luma and up to 1/8 for chroma
  • 16. 16 Weighted Prediction in HEVC/H.265 Fwd Ref Horizont al Filter Vertical Filter << 6 << 2 8bits per pixel 10bits per pix Bwd Ref Horizont al Filter Vertical Filter << 6 << 2 8bits per pixel 10bits per pix Merge Predicted signal
  • 17. 17 AV1 Motion Estimation Details  AV1 supports Global Motion mode which is divided into the following categories: Translation (panning video) Rotation Zoom Affine (suitable for 3D motion)  AV1 supports OBMC (Overlapped Block Motion Compensation)  AV1 supports Warped motion per superblock Examples: In case of translation a global Motion Vector is applied for the whole frame. In case of Zoom and Rotation Motion Vector is depending on block location
  • 18. 18 AV1 Motion Estimation Details – General Idea of OBMC Justification of OBMC - MV is most reliable in the center of the block (where prediction errors tend to be smaller than those at the corners). For a block it’s better to assign several MVs (its own and nearby blocks) and to blend reference samples: Block MV0 MV1 MV2 MV3 r c
  • 19. 19 AV1 Motion Estimation Details – Technical Details of OBMC In AV1 OBMC predicted block is associated with a single vector MV0 corresponding to the block’s center while corner MVs are taken from causal (already decoded) neighbors. Blending is executed in two separable stages: firstly according to vertical direction and then according to horizontal direction (the filter coefficients are pre-defined in the AV1 spec.) shadow of block2 Block 1 block2 MV1 MV2 shadowofBlock1 MV0 Get prediction samples according to MV0 Get prediction samples for overlap area of block1 according to MV1 Get prediction samples for overlap area of block2 according to MV2 block3 block4 Shadow of block3 Shadow of4 MV3 MV4 Get prediction samples for overlap area of block3 according to MV3 Get prediction samples for overlap area of block4 according to MV4 blending blending