2. RoI-AVC = Region-of-Interest Advanced Video Coding
For stationary-camera video applications
(e.g., video conferencing, video surveillance, news broadcast):
• Foreground moving objects are of crucial interest: they form the RoI for smart video processing
• RoI-AVC straddles computer vision and video coding: a jointly optimized design
• Battery-powered cameras in low-bandwidth scenarios: encoding efficiency and complexity are crucial
[Figure: RoI-AVC (RoI-based) positioned between frame-based and object-based video coding]
• A practical semantic video codec with coding-efficiency and complexity advantages over state-of-the-art H.264/AVC
• Strikes a sweet spot between the frame-based and object-based video coding paradigms
• Powered by our key competence in fast, reliable RoI detection and coding schemes
3. Outline: RoI-AVC framework and strength
A jointly optimized design bridging the two worlds
[Figure: RoI-AVC pipeline. Vision world: previous frame + current frame -> multi-scale motion RoI detection -> RoI bounding-box generation. Video world: H.264 video encoder -> H.264 video decoder -> reconstructed frame. Metadata of bounding boxes links the two worlds.]
• Avoids initial background training and online updating
• Reliable motion RoI detection
• 20 fps @ 352x288 w/o manual optimization on Intel Pentium 4
• Up to 34% bit-rate saving @ similar quality over H.264/AVC
• 2.x to 3.x times faster (including RoI overhead) than the H.264 reference encoder; similar for the decoder
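The "RoI bounding-box generation" stage of the pipeline can be illustrated with a minimal sketch. It assumes the detector outputs a binary motion mask and that one box is produced per 4-connected blob; the function name `bounding_boxes`, the `min_area` parameter, and the box layout `(left, top, right, bottom)` are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import deque

def bounding_boxes(mask, min_area=1):
    """Return one inclusive (left, top, right, bottom) box per 4-connected blob.

    `mask` is a list of rows of 0/1 values. This is an assumed, simplified
    stand-in for the bounding-box generation stage, not the published method.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            left, top, right, bottom, area = x, y, x, y, 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Drop blobs smaller than min_area (treated here as noise).
            if area >= min_area:
                boxes.append((left, top, right, bottom))
    return boxes

mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 1],
]
print(bounding_boxes(mask))  # [(1, 0, 2, 1), (4, 1, 5, 2)]
```

Only the box coordinates need to travel to the codec, which keeps the metadata small compared with a per-pixel object mask.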
To appear in IEEE ICASSP 2007
4. Vision world: multi-scale motion RoI detection
[Figure: multi-scale motion RoI detection. Previous frame + current frame -> pixel-level processing -> region-level processing -> detected motion RoI]
Multi-scale motion RoI detection:
• Multi-scale structural change aggregation is the key contribution
• An integrated, fast, and reliable motion RoI detection approach
• Applied directly to two successive video frames w/o a background (BG) model
• Robust to flickering lighting and camera noise, and less sensitive to thresholds
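The idea of aggregating change evidence across scales, using only two successive frames and no background model, might be sketched as follows. This is a deliberately simple stand-in (block-averaged absolute differences with per-scale voting), not the multi-scale structural change aggregation of the talk; the function names, the scales `(1, 2, 4)`, the threshold, and the voting rule are all assumptions made for illustration.

```python
def block_mean_abs_diff(prev, curr, scale):
    """Mean absolute frame difference over scale x scale blocks, per pixel."""
    height, width = len(curr), len(curr[0])
    out = [[0.0] * width for _ in range(height)]
    for by in range(0, height, scale):
        for bx in range(0, width, scale):
            ys = range(by, min(by + scale, height))
            xs = range(bx, min(bx + scale, width))
            total = sum(abs(curr[y][x] - prev[y][x]) for y in ys for x in xs)
            mean = total / (len(ys) * len(xs))
            # Write the block mean back to every pixel of the block.
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out

def multi_scale_motion_mask(prev, curr, scales=(1, 2, 4), threshold=20.0, min_votes=2):
    """Mark a pixel as motion when enough scales agree that it changed."""
    height, width = len(curr), len(curr[0])
    votes = [[0] * width for _ in range(height)]
    for scale in scales:
        diff = block_mean_abs_diff(prev, curr, scale)
        for y in range(height):
            for x in range(width):
                if diff[y][x] > threshold:
                    votes[y][x] += 1
    return [[1 if votes[y][x] >= min_votes else 0 for x in range(width)]
            for y in range(height)]

# Two synthetic 8x8 grayscale frames: a real 2x2 change plus one noisy pixel.
prev = [[100] * 8 for _ in range(8)]
curr = [row[:] for row in prev]
for y in range(2):
    for x in range(2):
        curr[y][x] = 200
curr[7][7] = 130  # isolated noise: exceeds the threshold only at scale 1

mask = multi_scale_motion_mask(prev, curr)
for row in mask:
    print(row)
```

In this toy case the 2x2 change is confirmed at every scale, while the isolated pixel gets a single vote and is discarded, which illustrates why agreement across scales can make a detector less dependent on any one threshold.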
6. Video world: flexible MB-based H.264/AVC coding
[Figure: flexible MB-based H.264/AVC codec. Detected motion RoI + metadata of bounding boxes -> flexible MB-based H.264 encoder (MB-based RoI coding) -> flexible MB-based H.264 decoder -> reconstructed frame]
Flexible organization of MBs
[Figure: numbered MB layouts for two motion RoIs within a frame. Legend: m = MB of 1st motion RoI, n = MB of 2nd motion RoI, MB of background.]
• Flexible MB-based codec: largely reduced coding bit-rate and complexity
• Data locality-preserving ordering w/o changing the MB-based pipeline
• Could be fully compliant with AVC
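One way to picture an RoI-first organization of MBs is the sketch below. It assumes pixel bounding boxes are expanded to the 16x16 MBs they touch and that each RoI is raster-scanned in turn, so neighboring MBs of one RoI stay adjacent in the order. The function names and the exact ordering are hypothetical (the slide shows the ordering only as a figure), and how background MBs are coded is left open here.

```python
MB_SIZE = 16  # H.264/AVC macroblock size in luma samples

def box_to_mb_rect(box, mb_cols, mb_rows):
    """Expand an inclusive pixel box (left, top, right, bottom) to MB units."""
    left, top, right, bottom = box
    mb_left = max(left // MB_SIZE, 0)
    mb_top = max(top // MB_SIZE, 0)
    mb_right = min(right // MB_SIZE, mb_cols - 1)
    mb_bottom = min(bottom // MB_SIZE, mb_rows - 1)
    return mb_left, mb_top, mb_right, mb_bottom

def roi_first_mb_order(width, height, boxes):
    """Return (roi_order, background) as lists of raster MB addresses.

    Each RoI is raster-scanned in turn; MBs already claimed by an earlier
    RoI are skipped. Assumed ordering for illustration only.
    """
    mb_cols = (width + MB_SIZE - 1) // MB_SIZE
    mb_rows = (height + MB_SIZE - 1) // MB_SIZE
    assigned = set()
    roi_order = []
    for box in boxes:
        mb_left, mb_top, mb_right, mb_bottom = box_to_mb_rect(box, mb_cols, mb_rows)
        for my in range(mb_top, mb_bottom + 1):
            for mx in range(mb_left, mb_right + 1):
                addr = my * mb_cols + mx  # raster address within the frame
                if addr not in assigned:
                    assigned.add(addr)
                    roi_order.append(addr)
    background = [a for a in range(mb_cols * mb_rows) if a not in assigned]
    return roi_order, background

# CIF frame (352x288 = 22x18 MBs) with two hypothetical RoI boxes.
roi_order, background = roi_first_mb_order(
    352, 288, [(20, 40, 60, 70), (300, 0, 340, 20)]
)
print(len(roi_order), len(background))  # 17 379
print(roi_order[:3])                    # [45, 46, 47]
```

In this example only 17 of 396 MBs fall inside an RoI, which gives a feel for how much per-frame work an RoI-based encoder could avoid when the background is static.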
7. Compared to the prior methods from different worlds
[Figure: visual comparison with prior methods]
Video world: current input frame | results of [CSVT’01] | results of [MM’01] | our results
Vision world: current input frame | Gaussian hypothesis test | the single-scale variant | our multi-scale scheme
8. Video demo 1: multi-scale motion RoI detection
Ballet @ 1024 x 768 from MSR camera-4
[Video panels: detected motion blobs | bounding boxes superimposed upon the original frames]
9. Video demo 2: perceptual quality of RoI-AVC
[Video panels: original video sequences | reconstructed video sequences, shown for indoor monitoring and news broadcast]