Predicting Media Interestingness Task
Overview
Claire-Hélène Demarty – Technicolor
Mats Sjöberg – University of Helsinki
Bogdan Ionescu – University Politehnica of Bucharest
Thanh-Toan Do – University of Adelaide
Michael Gygli – ETH & Gifs.com
Ngoc Q.K. Duong – Technicolor
MediaEval 2017 Workshop
Dublin, 13-16 September 2017
In its second year
Task definition
9/13/2017
Derives from a use case at Technicolor
• Helping professionals to illustrate a Video on Demand (VOD) web site by selecting some interesting frames and/or video excerpts for the posted movies.
Definition (emphasized in 2017): the frames and excerpts should help a user decide whether he/she is interested in watching the underlying movie.
• Two subtasks -> Image and Video
  • Image subtask: given a set of keyframes extracted from a movie, …
  • Video subtask: given a set of video segments extracted from a movie, …
  … automatically identify those images/segments that viewers report to be interesting.
• Binary classification task on a per-movie basis, but confidence values are also required.
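The required output, one binary label plus one confidence value per item, computed movie by movie, can be sketched roughly as follows. This is only an illustration and not the task's submission format: the function name `label_movie_items`, the score dictionary, and the `top_fraction` cutoff are hypothetical choices made for this example.

```python
def label_movie_items(scores, top_fraction=0.1):
    """Turn per-item scores for ONE movie into (label, confidence) pairs.

    scores: dict mapping item id -> predicted interestingness score.
    top_fraction: hypothetical cutoff; the top share of items is labelled 1.
    """
    if not scores:
        return {}
    # Rank the items of this movie by decreasing score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Always label at least one item as interesting.
    n_interesting = max(1, round(top_fraction * len(ranked)))
    interesting = set(ranked[:n_interesting])
    # Keep the raw score as the confidence value next to the binary label.
    return {item: (1 if item in interesting else 0, scores[item]) for item in ranked}


predictions = {"shot_1": 0.82, "shot_2": 0.10, "shot_3": 0.47, "shot_4": 0.05}
print(label_movie_items(predictions, top_fraction=0.25))
```

Applying the cutoff per movie rather than over the whole collection mirrors the "per-movie basis" wording on the slide; the cutoff rule itself is an assumption.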
Dataset & additional features
• From Hollywood-like movie trailers or full-length movie extracts
• Manual segmentation of shots/longer segments with a semantic meaning
• Extraction of the middle key-frame of each shot/segment
              Development Set         Test Set                Test Set
              78 trailers             26 trailers             4 movie extracts (ca. 15 min)
              Total   % interesting   Total   % interesting   Total   % interesting
Shot #        7,396   9.0             2,192   11.3            243     11.5
Key-frame #   7,396   11.6            2,192   11.9            243     22.6
(Dataset modified in 2017)
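The "middle key-frame" rule above can be illustrated with a small sketch. It assumes each shot is described by inclusive start and end frame indices; that representation and the function name `middle_keyframe_index` are hypothetical, since the slides do not describe the extraction tooling.

```python
def middle_keyframe_index(start_frame, end_frame):
    """Return the index of the frame halfway through a shot (inclusive bounds)."""
    if end_frame < start_frame:
        raise ValueError("end_frame must not precede start_frame")
    # Integer midpoint; rounds down for shots with an even number of frames.
    return start_frame + (end_frame - start_frame) // 2


# Hypothetical shot boundaries, given as (start, end) frame indices.
shots = [(0, 120), (121, 300), (301, 301)]
print([middle_keyframe_index(start, end) for start, end in shots])  # [60, 210, 301]
```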
• Precomputed content descriptors:
  • Low-level: denseSIFT, HoG, LBP, GIST, HSV color histograms, MFCC, fc7 and prob layers from AlexNet
  • Mid-level: face detection and tracking-by-detection
  • Segment-based: C3D from the fc6 layer, averaged over each segment
(Slide callouts: "Modified in 2017", "Added in 2017")
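Averaging a segment-based descriptor over a segment, as mentioned for the C3D feature, could look like the sketch below. It assumes the network has already produced one feature vector per clip of the segment; the toy 3-dimensional vectors and the helper name `average_over_segment` are illustrative only.

```python
def average_over_segment(clip_features):
    """Average equal-length feature vectors (one per clip) into one vector."""
    if not clip_features:
        raise ValueError("segment has no clip features")
    dim = len(clip_features[0])
    if any(len(vec) != dim for vec in clip_features):
        raise ValueError("all feature vectors must have the same length")
    n = len(clip_features)
    # Element-wise mean across all clips of the segment.
    return [sum(vec[i] for vec in clip_features) / n for i in range(dim)]


# Two toy clip descriptors standing in for real fc6 activations.
clips = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(average_over_segment(clips))  # [2.0, 3.0, 4.0]
```

The result is a single fixed-length vector per segment, whatever the segment's duration.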
Manual annotations
Pair comparison protocol -> pairs -> ONE SINGLE aggregation into ranking -> ranking -> binary decision (manual thresholding)
Annotators:
>252 persons for video
>188 persons for image
From 22 countries
(Modified in 2017)
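The annotation flow (pairwise comparisons, one aggregation into a ranking, then a manual threshold) can be sketched as below. The slides do not name the aggregation model, so a plain win-rate count is used here purely as a stand-in; `rank_from_pairs`, `binarize`, and the `n_interesting` cut are hypothetical.

```python
from collections import Counter


def rank_from_pairs(comparisons):
    """Aggregate (winner, loser) judgements into one ranking, best item first.

    Stand-in aggregation: items are ordered by win rate (wins / appearances).
    """
    wins, seen = Counter(), Counter()
    for winner, loser in comparisons:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    # Ties are broken by item id so the output is deterministic.
    return sorted(seen, key=lambda item: (-wins[item] / seen[item], item))


def binarize(ranking, n_interesting):
    """Label the first n_interesting items of the ranking 1, the rest 0."""
    top = set(ranking[:n_interesting])
    return {item: int(item in top) for item in ranking}


pairs = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
ranking = rank_from_pairs(pairs)
print(ranking)               # ['a', 'b', 'c']
print(binarize(ranking, 1))  # {'a': 1, 'b': 0, 'c': 0}
```

In the task itself the threshold was set manually by the organizers; the fixed `n_interesting` above only imitates that step.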
Required runs
Up to 5 runs per subtask!
Image subtask: visual information; external data allowed
Video subtask: BOTH audio and visual information; external data allowed
(Modified in 2017)
Evaluation metrics
• 2017 official measure (modified in 2017):
  ➢ Mean Average Precision at 10 (MAP@10), over all movies
• Additional metrics are computed:
  • 2016 measure: Mean Average Precision (MAP)
  • False alarm rate, miss detection rate, precision, recall, F-measure, etc.
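A rough sketch of MAP@10 as described above: average precision is computed over the top 10 ranked items of each movie, then averaged over all movies. The normalisation by min(k, number of interesting items) is an assumption; the official evaluation tool may normalise differently, so scores from this sketch should not be compared with the official figures.

```python
def average_precision_at_k(ranked_labels, k=10):
    """AP@k for one movie.

    ranked_labels: 0/1 ground-truth labels, ordered by decreasing confidence.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumed normalisation: best achievable number of hits within the top k.
    return precision_sum / min(k, total_relevant)


def mean_average_precision_at_k(labels_per_movie, k=10):
    """Mean of AP@k over all movies (dict: movie id -> ranked 0/1 labels)."""
    scores = [average_precision_at_k(labels, k) for labels in labels_per_movie.values()]
    return sum(scores) / len(scores)


runs = {"movie_a": [1, 0, 1, 0], "movie_b": [0, 0]}
print(mean_average_precision_at_k(runs, k=10))
```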
Task participation
• Registrations:
  • 32 teams
  • 18 countries
• Submissions:
  • 10 teams
  • 7 'experienced' teams
[Bar chart "Task participation", 2016 vs. 2017: Registrations, Returned agreements, Submitting teams, Experienced teams, Workshop]
Official results – Image subtask – 33 runs
* organizers
Run MAP@10 MAP Official ranking
me17in_DUT-MMSR_image_run2.txt 0.1385 0.3075 DUT_MMSR
me17in_HKBU_image_5.txt 0.1369 0.291 HKBU
me17in_DUT-MMSR_image_run3.txt 0.1349 0.3052 DUT_MMSR
me17in_HKBU_image_3.txt 0.1332 0.2898 HKBU
me17in_HKBU_image_2.txt 0.132 0.2916 HKBU
me17in_HKBU_image_4.txt 0.1315 0.2884 HKBU
me17in_DUT-MMSR_image_run1.txt 0.131 0.3002 DUT_MMSR
me17in_DUT-MMSR_image_run4.txt 0.1213 0.2887 DUT_MMSR
me17in_HKBU_image_1.txt 0.1184 0.2812 HKBU
me17in_gibis_image_run1-required.txt 0.1129 0.271 GIBIS
me17in_technicolor_image_run2.txt 0.1054 0.2525 Technicolor*
me17in_gibis_image_run2.txt 0.1029 0.2645 GIBIS
me17in_technicolor_image_run1.txt 0.1028 0.2615 Technicolor*
me17in_RUC_image_run1-required.txt 0.094 0.2655 RUC
me17in_gibis_image_run5.txt 0.0939 0.2531 GIBIS
me17in_gibis_image_run3.txt 0.0924 0.2502 GIBIS
me17in_gibis_image_run4.txt 0.0916 0.2525 GIBIS
me17in_IITB_image_run2-required.txt 0.0911 0.257 IITB
me17in_technicolor_image_run4.txt 0.0875 0.2382 Technicolor*
me17in_technicolor_image_run5.txt 0.0861 0.2347 Technicolor*
2016 BEST RUN 0.2336
me17in_technicolor_image_run3.txt 0.0693 0.2244 Technicolor*
me17in_DUT-MMSR_image_run5histface.txt 0.0649 0.2105 DUT_MMSR
me17in_Eurecom_image_run1-required.txt 0.0587 0.2029 Eurecom
me17in_Eurecom_image_run2.txt 0.0579 0.2016 Eurecom
me17in_LAPI_image_run3.txt 0.0555 0.1873 LAPI*
me17in_LAPI_image_run4.txt 0.0529 0.1851 LAPI*
me17in_IITB_image_run4-required.txt 0.0521 0.2054 IITB
me17in_IITB_image_run1-required.txt 0.05 0.1886 IITB
Baseline 0.0495 0.1731
me17in_IITB_image_run3-required.txt 0.0494 0.2038 IITB
me17in_LAPI_image_run1.txt 0.0463 0.1791 LAPI*
me17in_LAPI_image_run2.txt 0.0442 0.1789 LAPI*
me17in_DAIICT_image_run1-required.txt 0.0406 0.1824 DAIICT
me17in_TCNJ-CS_image_run1-required.txt 0.0126 0.1331 TCNJ-CS
2017 Best run: MAP@10=0.1385 ; MAP=0.3075
2016 Best run: MAP=0.2336
Baseline: MAP@10=0.0495 ; MAP=0.1731
Official results – Image subtask – best runs
Run MAP@10 MAP Official ranking
me17in_DUT-MMSR_image_run2.txt 0.1385 0.3075 DUT_MMSR
me17in_HKBU_image_5.txt 0.1369 0.291 HKBU
me17in_gibis_image_run1-required.txt 0.1129 0.271 GIBIS
me17in_technicolor_image_run2.txt 0.1054 0.2525 Technicolor*
me17in_RUC_image_run1-required.txt 0.094 0.2655 RUC
me17in_IITB_image_run2-required.txt 0.0911 0.257 IITB
2016 BEST RUN 0.2336
me17in_Eurecom_image_run1-required.txt 0.0587 0.2029 Eurecom
me17in_LAPI_image_run3.txt 0.0555 0.1873 LAPI*
Baseline 0.0495 0.1731
me17in_DAIICT_image_run1-required.txt 0.0406 0.1824 DAIICT
me17in_TCNJ-CS_image_run1-required.txt 0.0126 0.1331 TCNJ-CS
* organizers
Official results – Video subtask – 42 runs
* organizers
Run MAP@10 MAP Official ranking
me17in_Eurecom_video_run4.txt 0.0827 0.2094 Eurecom
me17in_Eurecom_video_run5.txt 0.0774 0.2002 Eurecom
me17in_LAPI_video_run4.txt 0.0732 0.2028 LAPI*
me17in_Eurecom_video_run2.txt 0.0732 0.196 Eurecom
me17in_Eurecom_video_run1-required.txt 0.0717 0.2034 Eurecom
me17in_technicolor_video_run4.txt 0.0641 0.1878 Technicolor*
me17in_Eurecom_video_run3.txt 0.064 0.1964 Eurecom
me17in_DAIICT_video_run4.txt 0.064 0.1885 DAIICT
me17in_RUC_video_run2.txt 0.0637 0.1897 RUC
me17in_DAIICT_video_run1-required.txt 0.0636 0.1867 DAIICT
me17in_gibis_video_run5.txt 0.0628 0.183 GIBIS
me17in_gibis_video_run4.txt 0.0624 0.1836 GIBIS
me17in_LAPI_video_run1.txt 0.0619 0.1937 LAPI*
me17in_LAPI_video_run3.txt 0.0619 0.1937 LAPI*
me17in_gibis_video_run3.txt 0.0614 0.1877 GIBIS
me17in_technicolor_video_run5.txt 0.0609 0.1918 Technicolor*
me17in_technicolor_video_run1.txt 0.0589 0.1856 Technicolor*
me17in_RUC_video_run1-required.txt 0.0589 0.183 RUC
me17in_DAIICT_video_run3.txt 0.0585 0.1839 DAIICT
me17in_LAPI_video_run5.txt 0.0571 0.1843 LAPI*
me17in_DAIICT_video_run5.txt 0.0571 0.1838 DAIICT
me17in_LAPI_video_run2.txt 0.0564 0.1819 LAPI*
2016 BEST RUN 0.1815
Baseline 0.0564 0.1716
me17in_technicolor_video_run3.txt 0.0563 0.1825 Technicolor*
me17in_HKBU_video_1.txt 0.0556 0.1813 HKBU
me17in_DAIICT_video_run2.txt 0.0553 0.1812 DAIICT
me17in_gibis_video_run2-required.txt 0.053 0.1807 GIBIS
me17in_IITB_video_run1-required.txt 0.0525 0.1795 IITB
me17in_TCNJ-CS_video_run1-required.txt 0.0524 0.1774 TCNJ-CS
me17in_DUT-MMSR_video_run5histface.txt 0.0516 0.1791 DUT-MMSR
me17in_DUT-MMSR_video_run4.txt 0.0482 0.1783 DUT-MMSR
me17in_DUT-MMSR_video_run3.txt 0.0478 0.177 DUT-MMSR
me17in_IITB_video_run3-required.txt 0.0474 0.17 IITB
me17in_HKBU_video_2.txt 0.0468 0.1761 HKBU
me17in_HKBU_video_3.txt 0.0468 0.1761 HKBU
me17in_technicolor_video_run2.txt 0.0465 0.1768 Technicolor*
me17in_DUT-MMSR_video_run2.txt 0.0465 0.1748 DUT-MMSR
me17in_HKBU_video_4.txt 0.0463 0.1742 HKBU
me17in_HKBU_video_5.txt 0.0445 0.1746 HKBU
me17in_IITB_video_run4-required.txt 0.0445 0.1678 IITB
me17in_IITB_video_run2-required.txt 0.0445 0.1675 IITB
me17in_DUT-MMSR_video_run1.txt 0.0443 0.1734 DUT-MMSR
me17in_gibis_video_run1.txt 0.0396 0.1667 GIBIS
2017 Best run: MAP@10=0.0827 ; MAP=0.2094
2016 Best run: MAP=0.1815
Baseline: MAP@10=0.0564 ; MAP=0.1716
Official results – Video subtask – best runs
* organizers
Run MAP@10 MAP Official ranking
me17in_Eurecom_video_run4.txt 0.0827 0.2094 Eurecom
me17in_LAPI_video_run4.txt 0.0732 0.2028 LAPI*
me17in_technicolor_video_run4.txt 0.0641 0.1878 Technicolor*
me17in_DAIICT_video_run4.txt 0.064 0.1885 DAIICT
me17in_RUC_video_run2.txt 0.0637 0.1897 RUC
me17in_gibis_video_run5.txt 0.0628 0.183 GIBIS
2016 BEST RUN 0.1815
Baseline 0.0564 0.1716
me17in_HKBU_video_1.txt 0.0556 0.1813 HKBU
me17in_IITB_video_run1-required.txt 0.0525 0.1795 IITB
me17in_TCNJ-CS_video_run1-required.txt 0.0524 0.1774 TCNJ-CS
me17in_DUT-MMSR_video_run5histface.txt 0.0516 0.1791 DUT-MMSR
What we have learned on the TASK itself
• Reconfirmed that image interestingness is NOT video interestingness
• Some significant improvement, especially for the image subtask
• Dataset quality improved:
  • Increase in the number of iterations/annotations per sample
  • Increase in dataset size
  • Longer movie extracts
➢ Image subtask: all teams did better: best MAP@10=0.2105, best MAP=0.4343
➢ Video subtask: 1 team clearly improved, 5 teams depending on their runs: best MAP@10=0.1678, best MAP=0.2637
What we have learned on the participants' systems
• This year's trend?
  • DNN as (last) classifying step is not the majority choice
  • Dataset size…
  • Multimodal equals audio+video ONLY (text was used only once)
  • (Mostly) no temporal approaches
  • (Mostly) no use of external data
  • Late fusion, dimension reduction
• Adding semantic/affect in the approaches
  • Genre recognition pre-step
  • Aesthetics-related features
  • Movie context (contextual feature, textual description)
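Late fusion, listed among the trends above, combines the output scores of separately trained models instead of merging their input features. The sketch below shows one simple variant, a weighted average of per-item scores; the slides do not say which fusion rule any team used, and the names `late_fusion`, `visual`, and `audio` as well as the weights are hypothetical.

```python
def late_fusion(score_sets, weights=None):
    """Weighted average of per-item scores from several models.

    score_sets: list of dicts (item id -> score), one dict per model/modality.
    weights: optional list of non-negative weights, one per model.
    """
    if not score_sets:
        raise ValueError("need at least one set of scores")
    if weights is None:
        weights = [1.0] * len(score_sets)
    if len(weights) != len(score_sets):
        raise ValueError("one weight per score set is required")
    items = set(score_sets[0])
    if any(set(scores) != items for scores in score_sets):
        raise ValueError("all models must score the same items")
    total = sum(weights)
    # Fuse at the decision level: combine scores, not features.
    return {
        item: sum(w * scores[item] for w, scores in zip(weights, score_sets)) / total
        for item in items
    }


visual = {"seg_1": 0.8, "seg_2": 0.2}
audio = {"seg_1": 0.4, "seg_2": 0.6}
print(late_fusion([visual, audio], weights=[3.0, 1.0]))
```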
• Insights
  • What works for the images does not work for the videos
  • Monomodal systems (no audio) did as well as multimodal systems
  • Adding semantic/affect/context in the approaches is promising!
Thank you!

MediaEval 2017 - Interestingness Task: MediaEval 2017 Predicting Media Interestingness Task (Overview)

  • 1. Predicting Media Interestingness Task Overview Claire-Hélène Demarty – Technicolor Mats Sjöberg – University of Helsinki Bogdan Ionescu – University Polytehnica of Bucharest Thanh-Toan Do – University of Adelaide Michael Gygli, ETH & Gifs.com Ngoc Q.K. Duong, Technicolor MediaEval 2017 Workshop Dublin, 13-16th 2016 In its second year
  • 2. Derives from a use case at Technicolor  Helping professionals to illustrate a Video on Demand (VOD) web site by selecting some interesting frames and/or video excerpts for the posted movies. 2 Task definition 9/13/2017
  • 3. Derives from a use case at Technicolor  Helping professionals to illustrate a Video on Demand (VOD) web site by selecting some interesting frames and/or video excerpts for the posted movies. 3 Task definition 9/13/2017
  • 4. Derives from a use case at Technicolor  Helping professionals to illustrate a Video on Demand (VOD) web site by selecting some interesting frames and/or video excerpts for the posted movies. 4 Task definition 9/13/2017 Definition: The frames and excerpts should be suitable in terms of helping a user to make his/her decision about whether he/she is interested in watching the underlying movie. Emphasized in 2017
  • 5.  Two subtasks -> Image and Video  Image subtask: given a set of keyframes extracted from a movie, …  Video subtask: given a set of video segments extracted from a movie, … … automatically identify those images/segments that viewers report to be interesting.  Binary classification task on a per movie basis… … but confidence values are also required. 5 Task definition 9/13/2017
  • 6.  From Hollywood-like movie trailers or full-length movie extracts  Manual segmentation of shots/longer segments with a semantic meaning  Extraction of middle key-frame of each shot/segment 6 Dataset & additional features 9/13/2017 Development Set Test Set 78 trailers 26 trailers 4 movie extracts (ca.15min) Total % interesting Total % interesting Total % interesting Shot # 7,396 9.0 2,192 11.3 243 11.5 Key-frame # 7,396 11.6 2,192 11.9 243 22.6 Modified in 2017
  • 7.  From Hollywood-like movie trailers or full-length movie extracts  Manual segmentation of shots/longer segments with a semantic meaning  Extraction of middle key-frame of each shot/segment 7 Dataset & additional features 9/13/2017 Development Set Test Set 78 trailers 26 trailers 4 movie extracts (ca.15min) Total % interesting Total % interesting Total % interesting Shot # 7,396 9.0 2,192 11.3 243 11.5 Key-frame # 7,396 11.6 2,192 11.9 243 22.6  Precomputed content descriptors:  Low-level: denseSift, HoG, LBP, GIST, HSV color histograms, MFCC, fc7 and prob layers from AlexNet  Mid-level: face detection and tracking-by-detection  Segment-based: C3D from fc6 layer and averaged over each segment Modified in 2017 Added in 2017
  • 8. 8 Manual annotations 9/13/2017 Binary decision (manual thresholding) Pair comparison protocol ONE SINGLE aggregation into ranking pairs ranking Annotators: >252 persons for video >188 persons for image From 22 countries Modified in 2017
  • 9. Up to 5 runs per subtask! Image subtask: Visual information, external data allowed Video subtask: BOTH audio and visual information, external data allowed 9 Required runs 9/13/2017 Modified in 2017 Modified in 2017
  • 10. Evaluation metrics
      - 2017 official measure: Mean Average Precision at 10 (MAP@10), over all movies.
      - Additional metrics are computed:
          - Mean Average Precision (MAP), as used in 2016
          - False alarm rate, miss detection rate, precision, recall, F-measure, etc.
      - Slide callout: "Modified in 2017".
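As an illustration of the official measure, here is a minimal Python sketch of MAP@10 computed per movie and averaged over movies. It assumes one common definition of AP@k, normalized by min(k, number of interesting items); the official evaluation tool may normalize differently. The function names and the input layout (movie id mapped to a list of confidence/label pairs) are assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple


def average_precision_at_k(ranked_labels: Sequence[int], k: int = 10) -> float:
    """AP@k for one movie.

    ranked_labels: ground-truth labels (1 = interesting, 0 = not), already
    sorted by descending system confidence.
    Assumption: normalization by min(k, number of interesting items).
    """
    n_relevant = sum(ranked_labels)
    if n_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / min(k, n_relevant)


def mean_average_precision_at_k(
    run: Dict[str, List[Tuple[float, int]]], k: int = 10
) -> float:
    """MAP@k over all movies.

    run maps a movie id to a list of (confidence, ground-truth label) pairs.
    """
    aps = []
    for items in run.values():
        ordered = sorted(items, key=lambda pair: pair[0], reverse=True)
        aps.append(average_precision_at_k([label for _, label in ordered], k))
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    toy_run = {
        "movie_1": [(0.9, 1), (0.1, 0)],
        "movie_2": [(0.2, 1), (0.8, 0)],
    }
    print(mean_average_precision_at_k(toy_run, k=10))
```

Because the cut-off only looks at the top 10 items per movie, this measure rewards systems that place interesting items at the very top, which matches the use case of picking a few frames or excerpts per movie.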
  • 11. Task participation
      - Registrations: 32 teams, 18 countries.
      - Submissions: 10 teams, 7 'experienced' teams.
      - [Bar chart: task participation, 2016 vs. 2017. Categories: registrations, returned agreements, submitting teams, experienced teams, workshop. Axis: 0 to 40.]
  • 12. Official results: image subtask (33 runs), ranked by MAP@10 (official ranking). * = organizers
      Run                                       MAP@10  MAP     Team
      me17in_DUT-MMSR_image_run2.txt            0.1385  0.3075  DUT-MMSR
      me17in_HKBU_image_5.txt                   0.1369  0.291   HKBU
      me17in_DUT-MMSR_image_run3.txt            0.1349  0.3052  DUT-MMSR
      me17in_HKBU_image_3.txt                   0.1332  0.2898  HKBU
      me17in_HKBU_image_2.txt                   0.132   0.2916  HKBU
      me17in_HKBU_image_4.txt                   0.1315  0.2884  HKBU
      me17in_DUT-MMSR_image_run1.txt            0.131   0.3002  DUT-MMSR
      me17in_DUT-MMSR_image_run4.txt            0.1213  0.2887  DUT-MMSR
      me17in_HKBU_image_1.txt                   0.1184  0.2812  HKBU
      me17in_gibis_image_run1-required.txt      0.1129  0.271   GIBIS
      me17in_technicolor_image_run2.txt         0.1054  0.2525  Technicolor*
      me17in_gibis_image_run2.txt               0.1029  0.2645  GIBIS
      me17in_technicolor_image_run1.txt         0.1028  0.2615  Technicolor*
      me17in_RUC_image_run1-required.txt        0.094   0.2655  RUC
      me17in_gibis_image_run5.txt               0.0939  0.2531  GIBIS
      me17in_gibis_image_run3.txt               0.0924  0.2502  GIBIS
      me17in_gibis_image_run4.txt               0.0916  0.2525  GIBIS
      me17in_IITB_image_run2-required.txt       0.0911  0.257   IITB
      me17in_technicolor_image_run4.txt         0.0875  0.2382  Technicolor*
      me17in_technicolor_image_run5.txt         0.0861  0.2347  Technicolor*
      2016 BEST RUN                             -       0.2336
      me17in_technicolor_image_run3.txt         0.0693  0.2244  Technicolor*
      me17in_DUT-MMSR_image_run5histface.txt    0.0649  0.2105  DUT-MMSR
      me17in_Eurecom_image_run1-required.txt    0.0587  0.2029  Eurecom
      me17in_Eurecom_image_run2.txt             0.0579  0.2016  Eurecom
      me17in_LAPI_image_run3.txt                0.0555  0.1873  LAPI*
      me17in_LAPI_image_run4.txt                0.0529  0.1851  LAPI*
      me17in_IITB_image_run4-required.txt       0.0521  0.2054  IITB
      me17in_IITB_image_run1-required.txt       0.05    0.1886  IITB
      Baseline                                  0.0495  0.1731
      me17in_IITB_image_run3-required.txt       0.0494  0.2038  IITB
      me17in_LAPI_image_run1.txt                0.0463  0.1791  LAPI*
      me17in_LAPI_image_run2.txt                0.0442  0.1789  LAPI*
      me17in_DAIICT_image_run1-required.txt     0.0406  0.1824  DAIICT
      me17in_TCNJ-CS_image_run1-required.txt    0.0126  0.1331  TCNJ-CS
      2017 best run: MAP@10=0.1385; MAP=0.3075. 2016 best run: MAP=0.2336. Baseline: MAP@10=0.0495; MAP=0.1731.
  • 13. Official results: image subtask, best run per team, ranked by MAP@10 (official ranking). * = organizers
      Run                                       MAP@10  MAP     Team
      me17in_DUT-MMSR_image_run2.txt            0.1385  0.3075  DUT-MMSR
      me17in_HKBU_image_5.txt                   0.1369  0.291   HKBU
      me17in_gibis_image_run1-required.txt      0.1129  0.271   GIBIS
      me17in_technicolor_image_run2.txt         0.1054  0.2525  Technicolor*
      me17in_RUC_image_run1-required.txt        0.094   0.2655  RUC
      me17in_IITB_image_run2-required.txt       0.0911  0.257   IITB
      2016 BEST RUN                             -       0.2336
      me17in_Eurecom_image_run1-required.txt    0.0587  0.2029  Eurecom
      me17in_LAPI_image_run3.txt                0.0555  0.1873  LAPI*
      Baseline                                  0.0495  0.1731
      me17in_DAIICT_image_run1-required.txt     0.0406  0.1824  DAIICT
      me17in_TCNJ-CS_image_run1-required.txt    0.0126  0.1331  TCNJ-CS
  • 14. Official results: video subtask (42 runs), ranked by MAP@10 (official ranking). * = organizers
      Run                                       MAP@10  MAP     Team
      me17in_Eurecom_video_run4.txt             0.0827  0.2094  Eurecom
      me17in_Eurecom_video_run5.txt             0.0774  0.2002  Eurecom
      me17in_LAPI_video_run4.txt                0.0732  0.2028  LAPI*
      me17in_Eurecom_video_run2.txt             0.0732  0.196   Eurecom
      me17in_Eurecom_video_run1-required.txt    0.0717  0.2034  Eurecom
      me17in_technicolor_video_run4.txt         0.0641  0.1878  Technicolor*
      me17in_Eurecom_video_run3.txt             0.064   0.1964  Eurecom
      me17in_DAIICT_video_run4.txt              0.064   0.1885  DAIICT
      me17in_RUC_video_run2.txt                 0.0637  0.1897  RUC
      me17in_DAIICT_video_run1-required.txt     0.0636  0.1867  DAIICT
      me17in_gibis_video_run5.txt               0.0628  0.183   GIBIS
      me17in_gibis_video_run4.txt               0.0624  0.1836  GIBIS
      me17in_LAPI_video_run1.txt                0.0619  0.1937  LAPI*
      me17in_LAPI_video_run3.txt                0.0619  0.1937  LAPI*
      me17in_gibis_video_run3.txt               0.0614  0.1877  GIBIS
      me17in_technicolor_video_run5.txt         0.0609  0.1918  Technicolor*
      me17in_technicolor_video_run1.txt         0.0589  0.1856  Technicolor*
      me17in_RUC_video_run1-required.txt        0.0589  0.183   RUC
      me17in_DAIICT_video_run3.txt              0.0585  0.1839  DAIICT
      me17in_LAPI_video_run5.txt                0.0571  0.1843  LAPI*
      me17in_DAIICT_video_run5.txt              0.0571  0.1838  DAIICT
      me17in_LAPI_video_run2.txt                0.0564  0.1819  LAPI*
      2016 BEST RUN                             -       0.1815
      Baseline                                  0.0564  0.1716
      me17in_technicolor_video_run3.txt         0.0563  0.1825  Technicolor*
      me17in_HKBU_video_1.txt                   0.0556  0.1813  HKBU
      me17in_DAIICT_video_run2.txt              0.0553  0.1812  DAIICT
      me17in_gibis_video_run2-required.txt      0.053   0.1807  GIBIS
      me17in_IITB_video_run1-required.txt       0.0525  0.1795  IITB
      me17in_TCNJ-CS_video_run1-required.txt    0.0524  0.1774  TCNJ-CS
      me17in_DUT-MMSR_video_run5histface.txt    0.0516  0.1791  DUT-MMSR
      me17in_DUT-MMSR_video_run4.txt            0.0482  0.1783  DUT-MMSR
      me17in_DUT-MMSR_video_run3.txt            0.0478  0.177   DUT-MMSR
      me17in_IITB_video_run3-required.txt       0.0474  0.17    IITB
      me17in_HKBU_video_2.txt                   0.0468  0.1761  HKBU
      me17in_HKBU_video_3.txt                   0.0468  0.1761  HKBU
      me17in_technicolor_video_run2.txt         0.0465  0.1768  Technicolor*
      me17in_DUT-MMSR_video_run2.txt            0.0465  0.1748  DUT-MMSR
      me17in_HKBU_video_4.txt                   0.0463  0.1742  HKBU
      me17in_HKBU_video_5.txt                   0.0445  0.1746  HKBU
      me17in_IITB_video_run4-required.txt       0.0445  0.1678  IITB
      me17in_IITB_video_run2-required.txt       0.0445  0.1675  IITB
      me17in_DUT-MMSR_video_run1.txt            0.0443  0.1734  DUT-MMSR
      me17in_gibis_video_run1.txt               0.0396  0.1667  GIBIS
      2017 best run: MAP@10=0.0827; MAP=0.2094. 2016 best run: MAP=0.1815. Baseline: MAP@10=0.0564; MAP=0.1716.
  • 15. Official results: video subtask, best run per team, ranked by MAP@10 (official ranking). * = organizers
      Run                                       MAP@10  MAP     Team
      me17in_Eurecom_video_run4.txt             0.0827  0.2094  Eurecom
      me17in_LAPI_video_run4.txt                0.0732  0.2028  LAPI*
      me17in_technicolor_video_run4.txt         0.0641  0.1878  Technicolor*
      me17in_DAIICT_video_run4.txt              0.064   0.1885  DAIICT
      me17in_RUC_video_run2.txt                 0.0637  0.1897  RUC
      me17in_gibis_video_run5.txt               0.0628  0.183   GIBIS
      2016 BEST RUN                             -       0.1815
      Baseline                                  0.0564  0.1716
      me17in_HKBU_video_1.txt                   0.0556  0.1813  HKBU
      me17in_IITB_video_run1-required.txt       0.0525  0.1795  IITB
      me17in_TCNJ-CS_video_run1-required.txt    0.0524  0.1774  TCNJ-CS
      me17in_DUT-MMSR_video_run5histface.txt    0.0516  0.1791  DUT-MMSR
  • 16. What we have learned on the TASK itself
      - Reconfirmed that image interestingness is NOT video interestingness.
      - Some significant improvement, especially for the image subtask.
      - Dataset quality improved:
          - Increased number of iterations/annotations per sample
          - Increased dataset size
          - Longer movie extracts:
              - Image subtask: all teams did better (best MAP@10=0.2105, best MAP=0.4343).
              - Video subtask: 1 team clearly improved, 5 teams depending on their runs (best MAP@10=0.1678, best MAP=0.2637).
  • 18. What we have learned on the participants' systems
      - This year's trend?
          - DNN as (last) classifying step is not the majority choice
              - Dataset size...
          - Multimodal equals audio+video ONLY (text was used only once)
          - (Mostly) no temporal approaches
          - (Mostly) no use of external data
          - Late fusion, dimension reduction
          - Adding semantic/affect in the approaches
              - Genre recognition pre-step
              - Aesthetics-related features
              - Movie context (contextual feature, textual description)
      - Insights
          - What works for the images does not work for the videos
          - Monomodal systems (no audio) did as well as multimodal systems
          - Adding semantic/affect/context in the approaches is promising!