MediaEval 2015 - GTM-UVigo Systems for Person Discovery Task at MediaEval 2015

In this paper, we present the systems developed by the GTM-UVigo team for the Multimedia Person Discovery in Broadcast TV task at MediaEval 2015. The systems propose two different strategies for person discovery in audio through speaker diarization (one based on an online clustering strategy with error correction using OCR information, the other based on agglomerative hierarchical clustering), as well as intrashot and intershot strategies for face clustering. http://ceur-ws.org/Vol-1436/ http://www.multimediaeval.org

GTM-UVigo Systems for Person Discovery Task
at MediaEval 2015
Paula López Otero, Rosalía Barros, Laura Docío Fernández,
Elisardo González Agulla, José Luis Alba Castro, Carmen García
Mateo
López Otero, Barros et al. — GTM-UVigo Systems for Person Discovery Task at MediaEval 2015 1/6

Main contributions
Error correction in speaker diarization using written names
Face tracking correction using quality scores
Visual voice activity detection

Speaker diarization + written names
Speech activity detection
Speaker segmentation
Speaker clustering
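
The slides do not spell out how the agglomerative hierarchical clustering is configured, so the following is only a minimal sketch of the general technique. It assumes each speech segment is already summarized by a fixed-length vector, and it uses cosine distance, average linkage, and a hypothetical stopping threshold; none of these choices are taken from the GTM-UVigo system.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (assumed segment embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative_clustering(segments, threshold):
    """Merge the two closest clusters until the closest pair exceeds `threshold`.

    `segments` is a list of vectors, one per speech segment.
    `threshold` is a hypothetical stopping criterion on average-linkage distance.
    Returns a list of clusters, each a sorted list of segment indices.
    """
    clusters = [[i] for i in range(len(segments))]

    def average_linkage(c1, c2):
        total = sum(cosine_distance(segments[i], segments[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = average_linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d > threshold:
            break  # remaining clusters are treated as different speakers
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy example: two segments per (made-up) speaker.
segments = [
    [1.0, 0.1], [0.9, 0.2],  # segments resembling speaker A
    [0.1, 1.0], [0.2, 0.9],  # segments resembling speaker B
]
speakers = agglomerative_clustering(segments, threshold=0.2)
print(speakers)
```

With a low threshold the toy segments stay in two groups, while a threshold at the maximum cosine distance merges everything into one cluster; in practice the stopping point would have to be tuned on development data.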

Face diarization + shot segmentation
Face detection and tracking
Quality Filter
Visual Voice Activity Detection
Face recognition

Results
            REPERE                          INA
            EwMAP     MAP       C           EwMAP     MAP       C
fusion      75.76 %   77.10 %   78.03 %     80.34 %   80.61 %   92.42 %
audio       69.37 %   70.90 %   78.48 %     89.38 %   89.76 %   97.34 %
video       73.94 %   75.29 %   78.03 %     80.66 %   80.94 %   92.46 %
baseline    63.58 %   63.93 %   71.75 %     78.35 %   78.64 %   92.71 %
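
As a reading aid, the EwMAP figures from the table above can be compared against the baseline with a few lines of Python. This is only a sketch: the dictionary layout and the function names are assumptions made for illustration, and the numbers are copied from the table.

```python
# EwMAP scores (%) copied from the results table.
ewmap = {
    "REPERE": {"fusion": 75.76, "audio": 69.37, "video": 73.94, "baseline": 63.58},
    "INA": {"fusion": 80.34, "audio": 89.38, "video": 80.66, "baseline": 78.35},
}


def gains_over_baseline(scores):
    """Absolute EwMAP gain (percentage points) of each system over the baseline."""
    base = scores["baseline"]
    return {
        name: round(value - base, 2)
        for name, value in scores.items()
        if name != "baseline"
    }


def best_system(scores):
    """Name of the system with the highest EwMAP."""
    return max(scores, key=scores.get)


for dataset, scores in ewmap.items():
    print(dataset, best_system(scores), gains_over_baseline(scores))
```

On these numbers the video system outperforms the audio system on REPERE, while the audio system is the strongest on INA, and the fusion system is the best overall only on REPERE.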

Conclusions
Difficult scenarios:
Audio: background music, noise.
Video: face pose and distance to the camera, video quality.
Face-based approaches work better on REPERE, whereas the
speech-based approach works better on INA.
Future work: finding a smarter way to combine speech and
video.

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...

Activity 01 - Artificial Culture (1).pdf

Activity 01 - Artificial Culture (1).pdf

Activity 01 - Artificial Culture (1).pdf

BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...

BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...

BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...

Software Engineering Methodologies (overview)

Software Engineering Methodologies (overview)

Software Engineering Methodologies (overview)

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi

Interactive Powerpoint_How to Master effective communication

Interactive Powerpoint_How to Master effective communication

Interactive Powerpoint_How to Master effective communication

Accessible design: Minimum effort, maximum impact

Accessible design: Minimum effort, maximum impact

Accessible design: Minimum effort, maximum impact

9548086042 for call girls in Indira Nagar with room service

9548086042 for call girls in Indira Nagar with room service

9548086042 for call girls in Indira Nagar with room service

Sanyam Choudhary Chemistry practical.pdf

Sanyam Choudhary Chemistry practical.pdf

Sanyam Choudhary Chemistry practical.pdf

Grant Readiness 101 TechSoup and Remy Consulting

Grant Readiness 101 TechSoup and Remy Consulting

Grant Readiness 101 TechSoup and Remy Consulting

Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...

Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...

Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...

Beyond the EU: DORA and NIS 2 Directive's Global Impact

Beyond the EU: DORA and NIS 2 Directive's Global Impact

Beyond the EU: DORA and NIS 2 Directive's Global Impact

fourth grading exam for kindergarten in writing

fourth grading exam for kindergarten in writing

fourth grading exam for kindergarten in writing

Advanced Views - Calendar View in Odoo 17

Advanced Views - Calendar View in Odoo 17

Advanced Views - Calendar View in Odoo 17

SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx

SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx

SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx

Unit-IV- Pharma. Marketing Channels.pptx

Unit-IV- Pharma. Marketing Channels.pptx

Unit-IV- Pharma. Marketing Channels.pptx

Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...

Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...

Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...

MediaEval 2015 - GTM-UVigo Systems for Person Discovery Task at MediaEval 2015

1. GTM-UVigo Systems for Person Discovery Task at MediaEval 2015. Paula López Otero, Rosalía Barros, Laura Docío Fernández, Elisardo González Agulla, José Luis Alba Castro, Carmen García Mateo

2. Main contributions: error correction in speaker diarization using written names; face tracking correction using quality scores; visual voice activity detection.

3. Speaker diarization + written names: speech activity detection

6. Speaker diarization + written names: speaker segmentation

7. Speaker diarization + written names: speaker clustering

20. Face diarization + shot segmentation: face detection and tracking

22. Face diarization + shot segmentation: quality filter

29. Face diarization + shot segmentation: visual voice activity detection

32. Face diarization + shot segmentation: face recognition

33. Results (all values in %)

            REPERE                     INA
            EwMAP    MAP      C        EwMAP    MAP      C
fusion      75.76    77.10    78.03    80.34    80.61    92.42
audio       69.37    70.90    78.48    89.38    89.76    97.34
video       73.94    75.29    78.03    80.66    80.94    92.46
baseline    63.58    63.93    71.75    78.35    78.64    92.71

34. Conclusions. Difficult scenarios: in audio, background music and noise; in video, face pose, distance to the camera, and video quality. Face approaches work better on REPERE, but the speech approach works better on INA. Future work: finding a smarter way to combine speech and video.
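
The modality comparison in the conclusions can be checked against the EwMAP figures on the results slide. The sketch below is only an illustration: the scores are copied from that slide, while the dictionary layout and the helper names `better_modality` and `gain_over_baseline` are assumptions introduced here, not part of the authors' systems.

```python
# EwMAP scores (%) copied from the results slide.
# The dictionary layout is an assumption made for this illustration.
ewmap = {
    "REPERE": {"fusion": 75.76, "audio": 69.37, "video": 73.94, "baseline": 63.58},
    "INA": {"fusion": 80.34, "audio": 89.38, "video": 80.66, "baseline": 78.35},
}


def better_modality(scores):
    """Return whichever of 'audio' or 'video' has the higher score (hypothetical helper)."""
    return max(("audio", "video"), key=lambda name: scores[name])


def gain_over_baseline(scores, system):
    """Absolute gain of a system over the baseline, in percentage points (hypothetical helper)."""
    return round(scores[system] - scores["baseline"], 2)


for corpus, scores in ewmap.items():
    print(corpus, better_modality(scores), gain_over_baseline(scores, "fusion"))
```

On these figures the video system scores higher than audio on REPERE and the audio system scores higher on INA, which matches the conclusion stated on the slide.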