A Web Service for Video Smart-Cropping
Konstantinos Apostolidis, Vasileios Mezaris
CERTH-ITI
Thessaloniki, Greece
This work was supported by the H2020 project ReTV (grant agreement No 780656)
23rd IEEE International Symposium on Multimedia (ISM 2021)
Problem Statement
● Traditional TV and desktop computer monitors: landscape aspect ratios (16:9 or 4:3)
● Nowadays, mobile devices use different aspect ratios
● Video sharing platforms dictate the use of specific video aspect ratios
● Existing videos would have to be transformed to comply with their specifications
Problem Statement
● Straightforward approaches for transforming a video to a different aspect ratio:
○ Static cropping of content
➢ Significant loss of visual content that might even be in the center of attention
○ Padding the frames with black borders
➢ Shrinks the original video by introducing large borders in the output video
● The results of such simple approaches are often unsatisfactory
● Common video aspect ratio transformation methods of the literature:
○ Warping
○ Seam carving
➢ Both introduce distortions and may alter the semantics of the video
Problem Statement
● Several research works and some commercial software for video retargeting are available
● But no easy-to-use, free video retargeting tools exist!
● Motivated by this, we built a freely accessible Web application for video retargeting that consists of:
○ A REST service hosting the developed technologies for video cropping
○ An interactive user interface
● We used a modified version of the method in [1]
[1] Apostolidis, Konstantinos, and Vasileios Mezaris. "A Fast Smart-Cropping Method and Dataset for Video Retargeting." In 2021 IEEE International Conference on Image Processing (ICIP), pp. 2618-2622. IEEE, 2021.
Let’s quickly review how the method of [1] works!
Method of [1]
● Remove borders
● Calculate crop window dimensions
● Saliency detection
● Thresholding
● Filtering-through-clustering procedure
● Center of mass
● Shot detection
● Time-series smoothing
[Figure: crop window for an original frame with ow/oh = 16/9 and a final frame with fw/fh = 4/5: fh = oh, fw < ow]
[Figure: crop window for an original frame with ow/oh = 4/5 and a final frame with fw/fh = 16/9: fw = ow, fh < oh]
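The two crop-window cases above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `crop_window_size` and the choice to round down to whole pixels are assumptions; the symbols `ow`, `oh`, `fw`, `fh` follow the slides.

```python
import math
from fractions import Fraction


def crop_window_size(ow: int, oh: int, target: Fraction) -> tuple[int, int]:
    """Return (fw, fh), the largest window of aspect ratio `target` inside ow x oh.

    Hypothetical helper: rounding down to whole pixels is an assumption.
    """
    if Fraction(ow, oh) > target:
        # Original is wider than the target: keep the full height (fh = oh),
        # so the final width is smaller than the original (fw < ow).
        fh = oh
        fw = math.floor(oh * target)
    else:
        # Original is narrower than (or equal to) the target: keep the full
        # width (fw = ow), so the final height shrinks (fh <= oh).
        fw = ow
        fh = math.floor(ow / target)
    return fw, fh


# 16:9 source cropped to 4:5, then a 4:5 source cropped to 16:9.
print(crop_window_size(1920, 1080, Fraction(4, 5)))
print(crop_window_size(1080, 1350, Fraction(16, 9)))
```

Using `Fraction` keeps the aspect-ratio comparison exact, which avoids floating-point edge cases when the source already has the target ratio.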
[Figure: center of mass at coordinates (xi, yi)]
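As a rough illustration of the thresholding and center-of-mass steps, the sketch below computes a saliency-weighted centroid over the pixels that pass a threshold. It is only an assumption about how these steps might combine: the function name, the list-of-rows saliency format, and the weighting by saliency value are hypothetical, and the filtering-through-clustering step is skipped entirely.

```python
def center_of_mass(saliency, threshold):
    """Return the (x, y) centroid of pixels with saliency >= threshold.

    `saliency` is a list of rows of floats (hypothetical format).
    Returns None when no pixel carries any weight.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if value >= threshold:
                # Assumption: pixels are weighted by their saliency value.
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0:
        return None
    return sum_x / total, sum_y / total


frame = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.1, 0.0],
]
print(center_of_mass(frame, threshold=0.5))
```

Computing this per frame yields the sequence of (xi, yi) points that the later steps treat as a time-series.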
[Figure: time-series of center of mass displacement over time t]
[Figure: shot transitions along the time axis t split the video frames into Shot #1, Shot #2, Shot #3]
[Figure: smoothed time-series of center of mass displacement over time t (legend: smoothed time-series, inferred time-series)]
Proposed Method
● Read the first out of every five video frames
● Remove borders
● Calculate crop window dimensions
● Saliency detection
● Thresholding
● Filtering-through-clustering procedure
● Center of mass
● Shot detection
● Time-series smoothing
Our modifications:
● Optimized set of parameters
● Spatial sub-sampling
● “Focus stability” mechanism
● Replace LOESS with a Savitzky-Golay filter
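To make the smoothing step more concrete, here is a minimal sketch of Savitzky-Golay smoothing applied separately inside each shot, so that a jump at a shot transition is not blurred. Everything specific here is an assumption: the 5-point quadratic window (coefficients -3, 12, 17, 12, -3 over 35), the choice to leave the two samples at each shot edge untouched, and the names `savgol5` and `smooth_per_shot`. The actual parameters of the service are not given in the slides.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay coefficients (sum to 35).
SG5 = (-3, 12, 17, 12, -3)


def savgol5(values):
    """Smooth a sequence with a 5-point Savitzky-Golay window.

    Assumption: the first and last two samples are returned unchanged,
    and sequences shorter than the window are returned as-is.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5, window)) / 35
    return out


def smooth_per_shot(series, shot_starts):
    """Smooth each shot independently.

    `shot_starts` holds the index of the first frame of every shot,
    sorted and beginning with 0 (hypothetical representation).
    """
    bounds = list(shot_starts) + [len(series)]
    smoothed = []
    for start, end in zip(bounds, bounds[1:]):
        smoothed.extend(savgol5(series[start:end]))
    return smoothed


# Center-of-mass x displacement with a hard cut after frame 4.
xs = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]
print(smooth_per_shot(xs, shot_starts=[0, 5]))
```

Smoothing the whole series as a single shot would instead pull the values around the cut toward each other, which is why the shot boundaries are applied first in this sketch.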
Proposed Method
Deployed a Web application that:
1. Retrieves a video file
2. Analyzes the video
3. Transforms the video frames to the target aspect ratio
4. Renders the transformed video
Proposed Method
The REST service works through a 3-step process:
1. HTTP POST call to submit a video for analysis and initiate a corresponding session in the REST service
2. HTTP GET call to query the status of the initialized session and the progress of the analysis
3. HTTP GET call to retrieve the results of a successfully completed session
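The 3-step process could be driven by a small client along the following lines. This is a sketch under stated assumptions, not the service's documented API: the endpoint paths (`/sessions`, `/status`, `/results`), the JSON field names (`session_id`, `state`, `video_url`, `aspect_ratio`) and the state values are all hypothetical. The HTTP transport is passed in as a function so the flow can be exercised without a network.

```python
import time


def retarget(http, base_url, video_url, aspect_ratio,
             poll_seconds=2.0, max_polls=300):
    """Run the submit / poll / fetch flow and return the result payload.

    `http(method, url, payload)` is a caller-supplied function that performs
    the request and returns the decoded JSON body as a dict.
    All paths and field names below are hypothetical placeholders.
    """
    # Step 1: POST submits the video and initiates a session.
    session = http("POST", f"{base_url}/sessions",
                   {"video_url": video_url, "aspect_ratio": aspect_ratio})
    session_id = session["session_id"]

    # Step 2: GET polls the session status until the analysis finishes.
    for _ in range(max_polls):
        status = http("GET", f"{base_url}/sessions/{session_id}/status", {})
        if status["state"] == "COMPLETED":
            break
        if status["state"] == "FAILED":
            raise RuntimeError(f"session {session_id} failed")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"session {session_id} did not complete in time")

    # Step 3: GET retrieves the results of the completed session.
    return http("GET", f"{base_url}/sessions/{session_id}/results", {})
```

Injecting `http` keeps the polling logic independent of any particular HTTP library; a real caller would wrap whichever client it already uses.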
User interface
● User can submit videos and transform their aspect ratio
● Predefined list of target aspect ratios
● Videos can be either available on-line or locally stored
● The landing page includes 10 demo videos, and the ability to provide feedback
● Analysis procedure monitoring
● On-line inspection of the results through the UI of our tool, or download of the video file
Results
● We utilize the RetargetVid dataset and the evaluation protocol of [1] to compare:
○ Method of [1]
○ Method of [1] + modifications
Results (IoU; time as a percentage of the videos’ duration)
(↑: the higher the better; ↓: the lower the better)

Method                          Worst (↑)   Best (↑)   Mean (↑)   T (% ↓)
Results for 1:3 target aspect ratio
[1]                             48.6        50.9       49.9       19
[1] + proposed modifications    51.7        53.8       52.9       13
Results for 3:1 target aspect ratio
[1]                             70.1        73.6       71.4       20
[1] + proposed modifications    74.4        77.0       75.3       14
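The scores above are reported as IoU (intersection over union). For readers unfamiliar with the measure, the sketch below computes it for two axis-aligned crop windows. The `(x, y, w, h)` box format and the function name are assumptions for illustration; how per-frame scores are aggregated into the Worst, Best and Mean columns is defined by the evaluation protocol of [1] and is not reproduced here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)      # hypothetical predicted crop window
ground_truth = (5, 0, 10, 10)   # hypothetical annotated crop window
print(iou(predicted, ground_truth))
```

A value of 1.0 means the predicted crop window matches the annotation exactly, and 0.0 means the two do not overlap.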
Links
Instructional Video: https://youtu.be/_pdTDMWbIfs
Web Application: http://multimedia2.iti.gr/videosmartcropping/service/start.html
GitHub Repository: https://github.com/bmezaris/RetargetVid
● Source code of SmartVidCrop
● Source code of SmartVidCrop + our modifications
● Ground-truth annotations for the RetargetVid dataset
Try it yourself at:
http://multimedia2.iti.gr/videosmartcropping/service/start.html
Or ask us about the underlying algorithms and how these can be integrated in your system
Contacts:
Vasileios Mezaris, bmezaris@iti.gr
Konstantinos Apostolidis, kapost@iti.gr
This work was supported by the H2020 project ReTV (grant agreement No 780656)

More Related Content

Similar to Video smart cropping web application

CA-SUM Video Summarization
CA-SUM Video SummarizationCA-SUM Video Summarization
CA-SUM Video SummarizationVasileiosMezaris
 
Policy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player EnvironmentPolicy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player EnvironmentAlpen-Adria-Universität
 
Policy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player EnvironmentPolicy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player EnvironmentMinh Nguyen
 
LwTE: Light-weight Transcoding at the Edge
LwTE: Light-weight Transcoding at the EdgeLwTE: Light-weight Transcoding at the Edge
LwTE: Light-weight Transcoding at the EdgeAlpen-Adria-Universität
 
Arm html5 presentation
Arm html5 presentationArm html5 presentation
Arm html5 presentationIan Renyard
 
Video Recommendation Engines as a Service
Video Recommendation Engines as a ServiceVideo Recommendation Engines as a Service
Video Recommendation Engines as a ServiceKamil Sindi
 
Quality-delay Tradeoff Optimization in Multi-Bitrate Adaptive Streaming
Quality-delay Tradeoff Optimization in Multi-Bitrate Adaptive StreamingQuality-delay Tradeoff Optimization in Multi-Bitrate Adaptive Streaming
Quality-delay Tradeoff Optimization in Multi-Bitrate Adaptive StreamingDuc Nguyen
 
Video Compression Using Block By Block Basis Salience Detection
Video Compression Using Block By Block Basis Salience DetectionVideo Compression Using Block By Block Basis Salience Detection
Video Compression Using Block By Block Basis Salience DetectionIRJET Journal
 
Producing Effective Screencasts
Producing Effective ScreencastsProducing Effective Screencasts
Producing Effective ScreencastsRichard Harrington
 
Building turn-key recommendations for 5% of internet video
Building turn-key recommendations for 5% of internet videoBuilding turn-key recommendations for 5% of internet video
Building turn-key recommendations for 5% of internet videoNir Yungster
 
H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...
H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...
H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...Alpen-Adria-Universität
 
IEEEGlobecom'22-OL-RICHTER.pdf
IEEEGlobecom'22-OL-RICHTER.pdfIEEEGlobecom'22-OL-RICHTER.pdf
IEEEGlobecom'22-OL-RICHTER.pdfReza Farahani
 
Spatio-Temporal Summarization of 360-degrees Videos
Spatio-Temporal Summarization of 360-degrees VideosSpatio-Temporal Summarization of 360-degrees Videos
Spatio-Temporal Summarization of 360-degrees VideosVasileiosMezaris
 
2018 FiTCE congress
2018 FiTCE congress2018 FiTCE congress
2018 FiTCE congressSilvia Rossi
 
Монетизация сетевой инфраструктуры
Монетизация сетевой инфраструктурыМонетизация сетевой инфраструктуры
Монетизация сетевой инфраструктурыBAKOTECH
 
UVM_Full_Print_n.pptx
UVM_Full_Print_n.pptxUVM_Full_Print_n.pptx
UVM_Full_Print_n.pptxnikitha992646
 
RedisConf18 - Video Experience Operational Insights in Real Time.
RedisConf18 - Video Experience Operational Insights in Real Time.RedisConf18 - Video Experience Operational Insights in Real Time.
RedisConf18 - Video Experience Operational Insights in Real Time.Redis Labs
 

Similar to Video smart cropping web application (20)

NMSL_2017summer
NMSL_2017summerNMSL_2017summer
NMSL_2017summer
 
CA-SUM Video Summarization
CA-SUM Video SummarizationCA-SUM Video Summarization
CA-SUM Video Summarization
 
Policy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player EnvironmentPolicy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player Environment
 
Policy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player EnvironmentPolicy-Driven Dynamic HTTP Adaptive Streaming Player Environment
Policy-Driven Dynamic HTTP Adaptive Streaming Player Environment
 
LwTE: Light-weight Transcoding at the Edge
LwTE: Light-weight Transcoding at the EdgeLwTE: Light-weight Transcoding at the Edge
LwTE: Light-weight Transcoding at the Edge
 
Arm html5 presentation
Arm html5 presentationArm html5 presentation
Arm html5 presentation
 
Video Recommendation Engines as a Service
Video Recommendation Engines as a ServiceVideo Recommendation Engines as a Service
Video Recommendation Engines as a Service
 
Quality-delay Tradeoff Optimization in Multi-Bitrate Adaptive Streaming
Quality-delay Tradeoff Optimization in Multi-Bitrate Adaptive StreamingQuality-delay Tradeoff Optimization in Multi-Bitrate Adaptive Streaming
Quality-delay Tradeoff Optimization in Multi-Bitrate Adaptive Streaming
 
Video Compression Using Block By Block Basis Salience Detection
Video Compression Using Block By Block Basis Salience DetectionVideo Compression Using Block By Block Basis Salience Detection
Video Compression Using Block By Block Basis Salience Detection
 
Producing Effective Screencasts
Producing Effective ScreencastsProducing Effective Screencasts
Producing Effective Screencasts
 
Building turn-key recommendations for 5% of internet video
Building turn-key recommendations for 5% of internet videoBuilding turn-key recommendations for 5% of internet video
Building turn-key recommendations for 5% of internet video
 
First Steps to DevOps
First Steps to DevOpsFirst Steps to DevOps
First Steps to DevOps
 
H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...
H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...
H2BR: An HTTP/2-based Retransmission Technique to Improve the QoE of Adaptive...
 
IEEEGlobecom'22-OL-RICHTER.pdf
IEEEGlobecom'22-OL-RICHTER.pdfIEEEGlobecom'22-OL-RICHTER.pdf
IEEEGlobecom'22-OL-RICHTER.pdf
 
Spatio-Temporal Summarization of 360-degrees Videos
Spatio-Temporal Summarization of 360-degrees VideosSpatio-Temporal Summarization of 360-degrees Videos
Spatio-Temporal Summarization of 360-degrees Videos
 
2018 FiTCE congress
2018 FiTCE congress2018 FiTCE congress
2018 FiTCE congress
 
Монетизация сетевой инфраструктуры
Монетизация сетевой инфраструктурыМонетизация сетевой инфраструктуры
Монетизация сетевой инфраструктуры
 
Video Quality Control
Video Quality ControlVideo Quality Control
Video Quality Control
 
UVM_Full_Print_n.pptx
UVM_Full_Print_n.pptxUVM_Full_Print_n.pptx
UVM_Full_Print_n.pptx
 
RedisConf18 - Video Experience Operational Insights in Real Time.
RedisConf18 - Video Experience Operational Insights in Real Time.RedisConf18 - Video Experience Operational Insights in Real Time.
RedisConf18 - Video Experience Operational Insights in Real Time.
 

More from VasileiosMezaris

Multi-Modal Fusion for Image Manipulation Detection and Localization
Multi-Modal Fusion for Image Manipulation Detection and LocalizationMulti-Modal Fusion for Image Manipulation Detection and Localization
Multi-Modal Fusion for Image Manipulation Detection and LocalizationVasileiosMezaris
 
CERTH-ITI at MediaEval 2023 NewsImages Task
CERTH-ITI at MediaEval 2023 NewsImages TaskCERTH-ITI at MediaEval 2023 NewsImages Task
CERTH-ITI at MediaEval 2023 NewsImages TaskVasileiosMezaris
 
Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...
Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...
Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...VasileiosMezaris
 
Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022
Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022
Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022VasileiosMezaris
 
TAME: Trainable Attention Mechanism for Explanations
TAME: Trainable Attention Mechanism for ExplanationsTAME: Trainable Attention Mechanism for Explanations
TAME: Trainable Attention Mechanism for ExplanationsVasileiosMezaris
 
Explaining video summarization based on the focus of attention
Explaining video summarization based on the focus of attentionExplaining video summarization based on the focus of attention
Explaining video summarization based on the focus of attentionVasileiosMezaris
 
Combining textual and visual features for Ad-hoc Video Search
Combining textual and visual features for Ad-hoc Video SearchCombining textual and visual features for Ad-hoc Video Search
Combining textual and visual features for Ad-hoc Video SearchVasileiosMezaris
 
Explaining the decisions of image/video classifiers
Explaining the decisions of image/video classifiersExplaining the decisions of image/video classifiers
Explaining the decisions of image/video classifiersVasileiosMezaris
 
Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...VasileiosMezaris
 
Are all combinations equal? Combining textual and visual features with multi...
Are all combinations equal?  Combining textual and visual features with multi...Are all combinations equal?  Combining textual and visual features with multi...
Are all combinations equal? Combining textual and visual features with multi...VasileiosMezaris
 
Misinformation on the internet: Video and AI
Misinformation on the internet: Video and AIMisinformation on the internet: Video and AI
Misinformation on the internet: Video and AIVasileiosMezaris
 
PoR_evaluation_measure_acm_mm_2020
PoR_evaluation_measure_acm_mm_2020PoR_evaluation_measure_acm_mm_2020
PoR_evaluation_measure_acm_mm_2020VasileiosMezaris
 
GAN-based video summarization
GAN-based video summarizationGAN-based video summarization
GAN-based video summarizationVasileiosMezaris
 
Migration-related video retrieval
Migration-related video retrievalMigration-related video retrieval
Migration-related video retrievalVasileiosMezaris
 
Fractional step discriminant pruning
Fractional step discriminant pruningFractional step discriminant pruning
Fractional step discriminant pruningVasileiosMezaris
 
Icme2020 tutorial video_summarization_part1
Icme2020 tutorial video_summarization_part1Icme2020 tutorial video_summarization_part1
Icme2020 tutorial video_summarization_part1VasileiosMezaris
 
Video, AI and News: video analysis and verification technologies for supporti...
Video, AI and News: video analysis and verification technologies for supporti...Video, AI and News: video analysis and verification technologies for supporti...
Video, AI and News: video analysis and verification technologies for supporti...VasileiosMezaris
 
Subclass deep neural networks
Subclass deep neural networksSubclass deep neural networks
Subclass deep neural networksVasileiosMezaris
 

More from VasileiosMezaris (20)

Multi-Modal Fusion for Image Manipulation Detection and Localization
Multi-Modal Fusion for Image Manipulation Detection and LocalizationMulti-Modal Fusion for Image Manipulation Detection and Localization
Multi-Modal Fusion for Image Manipulation Detection and Localization
 
CERTH-ITI at MediaEval 2023 NewsImages Task
CERTH-ITI at MediaEval 2023 NewsImages TaskCERTH-ITI at MediaEval 2023 NewsImages Task
CERTH-ITI at MediaEval 2023 NewsImages Task
 
Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...
Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...
Masked Feature Modelling for the unsupervised pre-training of a Graph Attenti...
 
Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022
Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022
Cross-modal Networks and Dual Softmax Operation for MediaEval NewsImages 2022
 
TAME: Trainable Attention Mechanism for Explanations
TAME: Trainable Attention Mechanism for ExplanationsTAME: Trainable Attention Mechanism for Explanations
TAME: Trainable Attention Mechanism for Explanations
 
Gated-ViGAT
Gated-ViGATGated-ViGAT
Gated-ViGAT
 
Explaining video summarization based on the focus of attention
Explaining video summarization based on the focus of attentionExplaining video summarization based on the focus of attention
Explaining video summarization based on the focus of attention
 
Combining textual and visual features for Ad-hoc Video Search
Combining textual and visual features for Ad-hoc Video SearchCombining textual and visual features for Ad-hoc Video Search
Combining textual and visual features for Ad-hoc Video Search
 
Explaining the decisions of image/video classifiers
Explaining the decisions of image/video classifiersExplaining the decisions of image/video classifiers
Explaining the decisions of image/video classifiers
 
Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...Learning visual explanations for DCNN-based image classifiers using an attent...
Learning visual explanations for DCNN-based image classifiers using an attent...
 
Are all combinations equal? Combining textual and visual features with multi...
Are all combinations equal?  Combining textual and visual features with multi...Are all combinations equal?  Combining textual and visual features with multi...
Are all combinations equal? Combining textual and visual features with multi...
 
Misinformation on the internet: Video and AI
Misinformation on the internet: Video and AIMisinformation on the internet: Video and AI
Misinformation on the internet: Video and AI
 
LSTM Structured Pruning
LSTM Structured PruningLSTM Structured Pruning
LSTM Structured Pruning
 
PoR_evaluation_measure_acm_mm_2020
PoR_evaluation_measure_acm_mm_2020PoR_evaluation_measure_acm_mm_2020
PoR_evaluation_measure_acm_mm_2020
 
GAN-based video summarization
GAN-based video summarizationGAN-based video summarization
GAN-based video summarization
 
Migration-related video retrieval
Migration-related video retrievalMigration-related video retrieval
Migration-related video retrieval
 
Fractional step discriminant pruning
Fractional step discriminant pruningFractional step discriminant pruning
Fractional step discriminant pruning
 
Icme2020 tutorial video_summarization_part1
Icme2020 tutorial video_summarization_part1Icme2020 tutorial video_summarization_part1
Icme2020 tutorial video_summarization_part1
 
Video, AI and News: video analysis and verification technologies for supporti...
Video, AI and News: video analysis and verification technologies for supporti...Video, AI and News: video analysis and verification technologies for supporti...
Video, AI and News: video analysis and verification technologies for supporti...
 
Subclass deep neural networks
Subclass deep neural networksSubclass deep neural networks
Subclass deep neural networks
 

Recently uploaded

Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPirithiRaju
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPirithiRaju
 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxnoordubaliya2003
 
OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptJoemSTuliba
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 

Recently uploaded (20)

Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx

Video smart cropping web application

  • 1. A Web Service for Video Smart-Cropping
    Konstantinos Apostolidis, Vasileios Mezaris (CERTH-ITI, Thessaloniki, Greece)
    23rd IEEE International Symposium on Multimedia (ISM 2021)
    This work was supported by the H2020 project ReTV (grant agreement No 780656)
  • 2. Problem Statement
    ● Traditional TV and desktop computer monitors use landscape aspect ratios (16:9 or 4:3)
    ● Mobile devices now use different aspect ratios
    ● Video sharing platforms dictate the use of specific video aspect ratios
    ● Existing videos have to be transformed to comply with these specifications
  • 3. Problem Statement
    ● Straightforward approaches for transforming a video to a different aspect ratio:
      ○ Static cropping of content
        ➢ Significant loss of visual content that might even be in the center of attention
      ○ Padding the frames with black borders
        ➢ Shrinks the original video by introducing large borders in the output video
    ● The results of such simple approaches are often unsatisfactory
    ● Common video aspect ratio transformation methods in the literature:
      ○ Warping
      ○ Seam carving
      ➢ Both introduce distortions and may alter the semantics of the video
  • 4. Problem Statement
    ● Several research works and some commercial software for video retargeting are available
    ● No easy-to-use and free video retargeting tools!
    ● Motivated by this, we built a freely accessible Web application for video retargeting that consists of:
      ○ A REST service hosting the developed technologies for video cropping
      ○ An interactive user interface
    ● We used a modified version of the method in [1]
    [1] Apostolidis, Konstantinos, and Vasileios Mezaris. "A Fast Smart-Cropping Method and Dataset for Video Retargeting." In 2021 IEEE International Conference on Image Processing (ICIP), pp. 2618-2622. IEEE, 2021.
  • 5. Let’s quickly recall how the method of [1] works!
  • 6. Method of [1]
    ● Remove borders
    ● Calculate crop window dimensions
    ● Saliency detection
    ● Thresholding
    ● Filtering-through-clustering procedure
    ● Center of mass
    ● Shot detection
    ● Time-series smoothing
  • 7. Method of [1]: calculate crop window dimensions (landscape to portrait)
    (ow, oh: width and height of the original frame; fw, fh: width and height of the final frame)
    Original frame: ow/oh = 16/9. Final frame: fw/fh = 4/5.
    The final frame keeps the full height and is narrower: fh = oh, fw < ow.
  • 8. Method of [1]: calculate crop window dimensions (portrait to landscape)
    Original frame: ow/oh = 4/5. Final frame: fw/fh = 16/9.
    The final frame keeps the full width and is shorter: fw = ow, fh < oh.
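The two cases above can be sketched in a few lines of Python. This is only an illustration of the aspect-ratio arithmetic, not the code of [1]: the function name `crop_window_size` and the choice to round down to whole pixels are assumptions made here.

```python
from fractions import Fraction


def crop_window_size(ow, oh, target_w, target_h):
    """Return (fw, fh), the largest window with aspect ratio
    target_w:target_h that fits inside an ow x oh frame.

    Hypothetical helper for illustration; sizes are rounded down.
    """
    original = Fraction(ow, oh)
    target = Fraction(target_w, target_h)
    if target < original:
        # Target is narrower than the source: keep the height, cut the width.
        fh = oh
        fw = int(oh * target)
    else:
        # Target is wider than (or equal to) the source: keep the width.
        fw = ow
        fh = int(ow / target)
    return fw, fh


# 16:9 source cropped to 4:5 keeps the full height (fh = oh, fw < ow).
print(crop_window_size(1920, 1080, 4, 5))   # (864, 1080)
# 4:5 source cropped to 16:9 keeps the full width (fw = ow, fh < oh).
print(crop_window_size(1080, 1350, 16, 9))  # (1080, 607)
```

Using `Fraction` avoids floating-point comparison issues when the source and target ratios are equal.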
  • 13. Method of [1]: center of mass
    The center of mass gives a point (xi, yi) in the frame.
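As a rough illustration of the thresholding and center-of-mass steps, the sketch below computes a saliency-weighted center of mass over the pixels that pass a threshold. It assumes the saliency map is a plain list of rows with values in [0, 1]; the function name, the threshold value, and the omission of the filtering-through-clustering step are simplifications made here, not details taken from [1].

```python
def saliency_center_of_mass(saliency, threshold):
    """Return the (x, y) center of mass of all saliency values that are
    at least `threshold`, weighted by saliency, or None if no pixel passes.

    Hypothetical helper; the clustering-based filtering is not modeled.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_x += x * value
                sum_y += y * value
    if total == 0:
        return None
    return sum_x / total, sum_y / total


# Toy 4x3 saliency map: two salient pixels side by side in the middle row.
frame = [
    [0.0, 0.125, 0.0, 0.0],
    [0.0, 0.75, 0.75, 0.0],
    [0.0, 0.0, 0.25, 0.0],
]
print(saliency_center_of_mass(frame, threshold=0.5))  # (1.5, 1.0)
```

The crop window would then be positioned around this point, clamped so that it stays inside the frame.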
  • 14. Method of [1]: time-series of center of mass displacement over time t
  • 15. Method of [1]: shot detection
    Shot transitions split the video frames (along time t) into Shot #1, Shot #2, Shot #3.
  • 16. Method of [1]: time-series smoothing
    Smoothed time-series of center of mass displacement over time t (shown: smoothed time-series and inferred time-series).
  • 20. Proposed Method
    ● Read the first out of every five video frames
    ● Remove borders
    ● Calculate crop window dimensions
    ● Saliency detection
    ● Thresholding
    ● Filtering-through-clustering procedure
    ● Center of mass
    ● Shot detection
    ● Time-series smoothing
    Our modifications:
    ○ Optimized set of parameters
    ○ Spatial sub-sampling
    ○ “Focus stability” mechanism
    ○ Replace LOESS with a Savitzky-Golay filter
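To give a feel for the smoothing step, here is a minimal pure-Python sketch of a Savitzky-Golay filter applied separately inside each shot, so that the crop window can still jump at a shot transition. The slides do not state the filter parameters, so the 5-point window with a quadratic fit (the standard coefficients -3, 12, 17, 12, -3 over 35) is an assumption, as are the function names and the simplification of leaving the first and last two samples of each shot unchanged. In practice a library routine such as `scipy.signal.savgol_filter` would be the more usual choice.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def smooth_shot(values):
    """Smooth one shot's time-series; shots shorter than the window
    and the two samples at each edge are left as they are."""
    n = len(values)
    out = list(values)
    if n < 5:
        return out
    for i in range(2, n - 2):
        window = values[i - 2 : i + 3]
        out[i] = sum(c * v for c, v in zip(SG_COEFFS, window))
    return out


def smooth_per_shot(values, shot_starts):
    """Smooth `values` shot by shot.

    `shot_starts` holds the sorted index of the first frame of each
    shot (the first entry is 0), as a shot detector might report it.
    """
    bounds = list(shot_starts) + [len(values)]
    smoothed = []
    for start, end in zip(bounds, bounds[1:]):
        smoothed.extend(smooth_shot(values[start:end]))
    return smoothed


# Horizontal center-of-mass positions for two shots of five frames each.
xs = [100, 102, 98, 101, 99, 300, 302, 298, 301, 299]
print(smooth_per_shot(xs, shot_starts=[0, 5]))
```

Because each shot is filtered on its own, the large jump between frame 4 and frame 5 in the example is not blurred across the cut.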
  • 21. Proposed Method
    Deployed a Web application that:
    1. Retrieves a video file
    2. Analyzes the video
    3. Transforms the video frames to the target aspect ratio
    4. Renders the transformed video
  • 22. Proposed Method
    The REST service works through a 3-step process:
    1. HTTP POST call to submit a video for analysis and initiate a corresponding session in the REST service
    2. HTTP GET call to query the status of the initiated session and the progress of the analysis
    3. HTTP GET call to retrieve the results of a successfully completed session
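A client for this submit, poll, retrieve pattern might look like the sketch below. Only the three-step flow comes from the slides. The base URL, endpoint paths, JSON field names (`session_id`, `state`, `video_url`, `aspect_ratio`) and state values are all hypothetical placeholders, since the slides do not document the actual API.

```python
import json
import time
import urllib.request

# Hypothetical placeholder, not the address of the real service.
BASE_URL = "http://example.org/smartcrop"


def http_json(method, url, payload=None):
    """Send an HTTP request with an optional JSON body and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def smart_crop(video_url, aspect_ratio, call=http_json,
               poll_seconds=5.0, max_polls=120):
    """Run the 3-step process; all paths and field names are assumed."""
    # Step 1: POST the video and open a session.
    session = call("POST", f"{BASE_URL}/sessions",
                   {"video_url": video_url, "aspect_ratio": aspect_ratio})
    session_id = session["session_id"]

    # Step 2: GET the session status until the analysis finishes.
    for _ in range(max_polls):
        status = call("GET", f"{BASE_URL}/sessions/{session_id}/status")
        if status["state"] == "completed":
            break
        if status["state"] == "failed":
            raise RuntimeError(f"session {session_id} failed")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"session {session_id} did not complete in time")

    # Step 3: GET the results of the completed session.
    return call("GET", f"{BASE_URL}/sessions/{session_id}/results")
```

The `call` parameter makes the transport replaceable, so the polling logic can be exercised with a stub instead of a live server.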
  • 29. User interface
    ● Users can submit videos and transform their aspect ratio
    ● Predefined list of target aspect ratios
    ● Videos can be either available on-line or locally stored
    ● The landing page includes 10 demo videos and the ability to provide feedback
    ● Monitoring of the analysis procedure
    ● Results can be inspected on-line through the UI of our tool, or the video file can be downloaded
  • 30. Results
    ● We use the RetargetVid dataset and the evaluation protocol of [1] to compare:
      ○ Method of [1]
      ○ Method of [1] + modifications
  • 31. Results (IoU; T: time as a percentage of the videos’ duration)
    (↑: the higher the better; ↓: the lower the better)

    Method                          Worst (↑)  Best (↑)  Mean (↑)  T (% ↓)
    Results for 1:3 target aspect ratio
    [1]                             48.6       50.9      49.9      19
    [1] + proposed modifications    51.7       53.8      52.9      13
    Results for 3:1 target aspect ratio
    [1]                             70.1       73.6      71.4      20
    [1] + proposed modifications    74.4       77.0      75.3      14
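For readers unfamiliar with the metric, the sketch below shows how an IoU (intersection over union) score between two crop windows can be computed. The `(x, y, w, h)` box layout and the function name are assumptions made here; how the evaluation protocol of [1] aggregates per-frame scores into the Worst, Best, and Mean columns is not described on the slides and is not modeled.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as
    (x, y, w, h), with (x, y) the top-left corner. Hypothetical helper."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis, clipped at zero when the boxes are disjoint.
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# A predicted 864x1080 crop window shifted 100 px from a reference window.
predicted = (100, 0, 864, 1080)
reference = (200, 0, 864, 1080)
print(round(iou(predicted, reference), 3))  # 0.793
```

An IoU of 1.0 means the two windows coincide, and 0.0 means they do not overlap at all; the table reports the scores as percentages.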
  • 32. Links
    Instructional Video: https://youtu.be/_pdTDMWbIfs
    Web Application: http://multimedia2.iti.gr/videosmartcropping/service/start.html
    GitHub Repository: https://github.com/bmezaris/RetargetVid
      ● Source code of SmartVidCrop
      ● Source code of SmartVidCrop + our modifications
      ● Ground-truth annotations for the RetargetVid dataset
  • 33. Try it yourself at: http://multimedia2.iti.gr/videosmartcropping/service/start.html
    Or ask us about the underlying algorithms and how these can be integrated in your system
    Contacts: Vasileios Mezaris, bmezaris@iti.gr; Konstantinos Apostolidis, kapost@iti.gr