VISUAL SEARCH
FOR MUSICAL PERFORMANCES
AND ENDOSCOPIC VIDEOS
Degree’s Final Project Dissertation
Telecommunications Engineering
Jennifer Roldán
Supervisors:
Assoc. Prof. Mathias Lux
Assoc. Prof. Xavier Giró
Outline of the Thesis
1. Introduction
i. Motivation
ii. Gantt chart. Work Plan
2. Overview. Existing Demo-Application
3. Methods
i. Global features using Late Fusion Methods
ii. Local features: SIMPLE descriptor
4. Data sets
i. Musical Performances
ii. Endoscopic Videos
5. Experiments
i. Quantitative evaluation
ii. Qualitative evaluation. Thinking-aloud test
6. Conclusions and Further Work
Sep 2014 – May 2015
Slide 2
Motivation
• An application that covers surgeons’ needs and automates data processing
• Endoscopic videos (confidential data)
• Focus of the project
• Video retrieval on demand for surgeons
• Musical performances (free data set)
• Reproducible results for evaluation
• Quantitative and qualitative studies
Slide 3
Introduction · Overview · Methods · Data sets · Experiments · Conclusions
Gantt Chart. Work Plan
Slide 4
• Use of existing tools and definition of the thesis statement
• Experiments with endoscopic videos
• Two papers submitted to the 13th CBMI workshop
• Project development with the Jiku Mobile data set
Existing Demo Application
Slide 5
Fig. All results are presented in HTML5 and can be viewed in recent versions of common browsers.
Existing Demo Application
Slide 6
Published at the ACM Multimedia Open Source Competition [1]
• Open source library for content-based image retrieval (CBIR)
• Based on Lucene
• Java text retrieval framework
• Indexing and search
• Supports global and local features
(integrates up to 20 descriptors)
[1] Mathias Lux. LIRE: Open source image retrieval in Java. In Proceedings of the 21st ACM International Conference on Multimedia, pages 843–846. ACM, 2013.
Methodology
Slide 7
1. Previous methods in the demo application:
• Global features
i. CEDD: Color and Edge Directivity Descriptor
ii. Color Histogram
iii. PHOG: Pyramid Histogram of Oriented Gradients
• Late fusion methods
2. Extend the methods to local features for retrieval
• Use an existing tool to test whether local features give better results
• SIMPLE descriptor
Method 1
Global features using Late Fusion
Feature extraction and indexing → Similarity measure → Fusion
Fig. System Architecture
Slide 8
Global descriptors for each IRM:
1. CEDD
2. Color Histogram
3. PHOG
Slide 9
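Normalization:
• Two different approaches
• Result lists limited to N images
1. rank: $R_k(n) = \frac{N + 1 - r_k(n)}{N}$
2. score: $R_k(n) = \frac{r_k(n) - \min(r_k)}{\max(r_k) - \min(r_k)}$
Here $r_k(n)$ is the raw rank (1) or score (2) that result list $k$ gives to image $n$, and $R_k(n)$ is its normalized value.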
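The two normalizations can be sketched in Python as follows. This is a minimal illustration and not the thesis code: the function names, the dictionary-based result lists and the handling of a list with constant scores are assumptions, and ranks are assumed to be 1-based. The slides do not say whether a higher raw score is better, so the score variant only rescales the values to [0, 1].

```python
def rank_normalize(ranks, n_limit):
    """Rank normalization: (N + 1 - rank) / N for a list limited to N images.

    `ranks` maps an image id to its 1-based rank (hypothetical layout).
    Rank 1 maps to 1.0 and rank N maps to 1/N.
    """
    return {img: (n_limit + 1 - r) / n_limit for img, r in ranks.items()}


def score_normalize(scores):
    """Min-max score normalization: (s - min) / (max - min)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: a list with identical scores is mapped to 1.0 everywhere.
        return {img: 1.0 for img in scores}
    return {img: (s - lo) / (hi - lo) for img, s in scores.items()}


normalized_ranks = rank_normalize({"a": 1, "b": 2, "c": 4}, n_limit=4)
normalized_scores = score_normalize({"a": 2.0, "b": 4.0, "c": 6.0})
```

Both functions return values in the same range, so lists produced by different descriptors become comparable before fusion.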
Slide 10
Fusion Methods:
a. Sum:
$R_t(n) = \sum_{k} R_k(n) = R_1(n) + R_2(n) + \cdots + R_K(n)$
b. Sum with combMNZ:
sum × number of IRMs that returned image n
Final Ranked Lists:
1. Sum (ranks)
2. Sum (scores)
3. Sum with combMNZ (ranks)
4. Sum with combMNZ (scores)
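A small Python sketch of the two fusion rules may help here. It assumes each result list is a dictionary from image id to an already normalized value where higher is better; the function names and the toy data are hypothetical and not taken from the thesis.

```python
from collections import defaultdict


def comb_sum(result_lists):
    """Sum fusion: add the normalized values an image gets across all lists."""
    fused = defaultdict(float)
    for result in result_lists:
        for img, value in result.items():
            fused[img] += value
    return dict(fused)


def comb_mnz(result_lists):
    """combMNZ fusion: the sum multiplied by the number of lists returning the image."""
    sums = comb_sum(result_lists)
    hits = defaultdict(int)
    for result in result_lists:
        for img in result:
            hits[img] += 1
    return {img: sums[img] * hits[img] for img in sums}


def ranked(fused):
    """Image ids sorted by fused value, best first."""
    return sorted(fused, key=fused.get, reverse=True)


# Three toy result lists, e.g. one per global descriptor.
lists = [{"a": 1.0, "b": 0.5}, {"a": 0.5, "c": 1.0}, {"b": 0.25}]
by_sum = ranked(comb_sum(lists))
by_mnz = ranked(comb_mnz(lists))
```

In this toy case image "b" is returned by two lists and "c" by only one, so combMNZ moves "b" ahead of "c" although the plain sum ranks "c" higher.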
Slide 11
Method 2
Local features. SIMPLE descriptor
“Searching Images with MPEG-7 (& MPEG-7-like) Powered Localized dEscriptors (SIMPLE)” [2]
SURF detector + CEDD descriptor
• Global features extracted as local ones (at image key points)
• Codebook of 512 visual words (VW) built with the Bag-of-Visual-Words (BoVW) model
• K-means clustering with a vocabulary of 512 words
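The bag-of-visual-words step can be illustrated with a toy Python sketch: local descriptors are clustered with k-means into a codebook, and each image is then described by a histogram of its nearest visual words. This is a simplified stand-in, not the SIMPLE or LIRE implementation; the function names, the plain Lloyd iteration, the random toy descriptors and the small codebook size (5 instead of 512) are assumptions made to keep the example short.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(desc, codebook):
    """Index of the visual word closest to a local descriptor."""
    return min(range(len(codebook)), key=lambda j: sq_dist(desc, codebook[j]))


def build_codebook(descriptors, k, iters=10, seed=0):
    """Plain k-means (Lloyd iterations) over local descriptors."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest_word(d, codebook)].append(d)
        for j, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centre
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bovw_histogram(descriptors, codebook):
    """Count how many local descriptors of one image fall on each visual word."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1
    return hist


# Toy data: 60 random 4-dimensional "descriptors".
toy_rng = random.Random(1)
training = [[toy_rng.random() for _ in range(4)] for _ in range(60)]
codebook = build_codebook(training, k=5)
histogram = bovw_histogram(training[:20], codebook)
```

The histogram has one bin per visual word, so every image gets a fixed-length vector regardless of how many key points were detected in it.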
Slide 12
[2] Chryssanthi Iakovidou, Nektarios Anagnostopoulos, Athanasios Ch. Kapoutsis, Yiannis Boutalis, and Savvas A. Chatzichristofis. Searching images with MPEG-7 (& MPEG-7-like) powered localized descriptors: the SIMPLE answer to effective content based image retrieval. In 12th International Workshop on Content-Based Multimedia Indexing (CBMI), pages 1–6. IEEE, 2014.
Data sets
Video Retrieval for two different cases
1. Musical Performances
2. Endoscopic Videos
Slide 13
Musical Performances
• Freely available data set, which allows us to compare results
• Jiku Mobile data set
  • 473 video clips
  • Mobile devices
  • Multiple users
  • 5 events and several performances
• Test
  • 356 videos randomly selected
  • Based on 1 frame per second
  • 412 query images
Fig. Query images, event domain
Slide 14
Endoscopic Videos
• Confidential and anonymized data
• Live video stream data set
  • Surgeons’ recordings in HQ
  • Recorded inside the patients
  • Roughly 33 hours of video
  • 54 laparoscopy procedures
• Test
  • 1,276 videos randomly selected
  • Based on 5 frames per second
  • 600 query images
Fig. Query images, medical domain
Slide 15
Experiments
Video Retrieval tested by two different evaluations
1. Quantitative evaluation
2. Qualitative evaluation (Thinking-aloud Test)
Slide 16
Evaluation Social Study, at AAU
1. Quantitative study:
• Find the position of the query image’s source video in the result list
• Results for global features
• Results for local features
2. Qualitative study. Thinking-aloud Test
• Interface: semi-interactive web page
• Participants are researchers and non-researchers within the CODE-MM Project
• 6 volunteers for the Musical Performances test
• 2 volunteers for the Endoscopic Videos test
Slide 17
Evaluation Social Study, at AAU
Thinking-aloud Test
• Interface: semi-interactive web page with three search engines blindly labeled A, B and C
i. Sum of ranks method and global features → Search Engine A
ii. Sum of scores method and global features → Search Engine B
iii. SIMPLE (SURF detector + CEDD descriptor) → Search Engine C
• Participants must voice their thoughts aloud
• Sessions are recorded
Slide 18
Evaluation Social Study, at AAU
Thinking-aloud Test
Slide 19
Fig. Screenshots of the different movements of the first volunteer
Fig. Screenshot from the thinking aloud test
Fig. Interface for the thinking aloud test
Experiments
Video Retrieval tested by two different evaluations
1. Musical Performances
2. Endoscopic Videos
Slide 20
Musical Performances
Quantitative Evaluation
Benchmarking based on the set of 412 queries:
Table I. Position of the query’s source video in the result list. The first four columns give the four tested feature fusion approaches; the fifth gives the results for the SIMPLE-CEDD descriptors.
Slide 21
Source video of the query image ranked in the first position of the result list
• Global features: 96.6% of the queries
• Local features: 91.5% of the queries
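The figures above count the queries whose source video is ranked first in the result list. A minimal Python sketch of that measure is given below; the function names, the tuple layout of a query run and the toy video ids are hypothetical and do not come from the benchmark itself.

```python
def source_rank(result_videos, source_video):
    """1-based position of the source video in a ranked result list, or None."""
    for position, video in enumerate(result_videos, start=1):
        if video == source_video:
            return position
    return None


def top1_rate(runs):
    """Fraction of queries whose source video is ranked first.

    `runs` is a list of (ranked_result_videos, source_video) pairs.
    """
    hits = sum(1 for results, source in runs if source_rank(results, source) == 1)
    return hits / len(runs)


# Four toy queries: two hits at rank 1, one at rank 2, one miss.
runs = [
    (["v1", "v2"], "v1"),
    (["v3", "v1"], "v1"),
    (["v2"], "v2"),
    (["v4", "v5"], "v9"),
]
rate = top1_rate(runs)
```

The same helper also gives the position of the source video for queries that are not answered at rank 1, which is the quantity the results table reports.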
Musical Performances
Qualitative Evaluation
Fig. Most used query images in the user test (left to right)
Overall impression
Global features (A, B)
• Search model: abstract exploratory
• Different sub-events, same viewpoint
Local features (C)
• Search model: semantically similar content
• Same performance, different viewpoints
• Good results in earlier video’s position
Slide 22
Global Features using Late Fusion / SIMPLE: SURF detector + CEDD
Musical Performances
Qualitative Evaluation
Slide 23
Experiments
Video Retrieval tested by two different evaluations
1. Musical Performances
2. Endoscopic Videos
Slide 24
Endoscopic Videos
Quantitative Evaluation
Benchmarking based on the set of 600 queries:
Table II. Position of the query’s source video in the result list. The first four columns give the four tested feature fusion approaches; the fifth gives the results for the SIMPLE-CEDD descriptors.
Slide 25
Source video of the query image ranked in the first position of the result list
• Global features: 78.3% of the queries
• Local features: 79.8% of the queries
Endoscopic Videos
Qualitative Evaluation
Overall impression
Global features (A, B)
• Search model: abstract exploratory
• Relevant shots in the top results (semantically dissimilar)
Local features (C)
• Search model: semantically similar content
• Same movements in surgeries
• Good results for finding the query’s source video
Fig. Shots (photos) taken manually by the surgeon in the course of the procedure.
Slide 26
Qualitative Evaluation
Fig. Screenshots of the result presentation showing the three top videos and the query image. All results
are presented in HTML5 and can be viewed in recent browsers supporting HTML5 videos and JavaScript.
Best matching frames are indicated by triangles in the red and grey time line below the video player.
SIMPLE: SURF detector + CEDD descriptor
Slide 27
Conclusions and Further Work
An existing tool is adapted and extended for content-based video retrieval
• Global features: exploratory search mode
• Local features: semantically similar content
Slide 28
Further work:
• Ad-hoc search within surgery procedures
• Faster indexing strategies
• Fusion of local and global features
• A different implementation of the SIMPLE descriptor (Random Detector + modified-CEDD descriptor)
Appendix
Two papers were presented in the Special Session on Medical Multimedia Processing [3] [4] (acceptance rate for special sessions = 55%)
[3] Roldan-Carlos J, Lux M, Giró-i-Nieto X, Muñoz-Trallero P, Anagnostopoulos N. Event Video Retrieval using Global and Local Descriptors in Visual Domain. In: IEEE/ACM International Workshop on Content-Based Multimedia Indexing - CBMI 2015.
[4] Roldan-Carlos J, Lux M, Giró-i-Nieto X, Muñoz-Trallero P, Anagnostopoulos N. Visual Information Retrieval in Endoscopic Video Archives. In: IEEE/ACM International Workshop on Content-Based Multimedia Indexing - CBMI 2015. Prague, Czech Republic. In press. http://arxiv.org/abs/1504.07874
Slide 29
Thank you for your attention
Do you have any questions?
7 May 2015
Visual Search for
Musical Performances
and Endoscopic Videos
Jennifer Roldán

Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Visual Search for Musical Performances and Endoscopic Videos

  • 1. VISUAL SEARCH FOR MUSICAL PERFORMANCES AND ENDOSCOPIC VIDEOS Degree’s Final Project Dissertation Telecommunications Engineering Jennifer Roldán Supervisors: Assoc. Prof. Mathias Lux Assoc. Prof. Xavier Giró
  • 2. Outline of the Thesis 1. Introduction i. Motivation ii. Gantt chart. Work Plan 2. Overview. Existing Demo-Application 3. Methods i. Global features using Late Fusion Methods ii. Local features: SIMPLE descriptor 4. Data sets i. Musical Performances ii. Endoscopic Videos 5. Experiments i. Quantitative evaluation ii. Qualitative evaluation. Thinking-aloud test 6. Conclusions and Further Work Sep 2014 – May 2015
  • 3. Motivation • Application for covering the surgeons’ needs and automating data processing • Endoscopic videos (confidential data) • Focus of the project • Video retrieval on demand for surgeons • Musical performances (free data set) • Reproducible results for evaluation • Quantitative and qualitative studies Introduction · Overview · Methods · Data sets · Experiments · Conclusions
  • 4. Gantt Chart. Work Plan • Use of existing tools and definition of the thesis statements • Experiments with endoscopic videos • Two papers submitted to the 13th CBMI Congress • Project development with the Jiku Mobile data set
  • 5. Existing Demo Application Fig. All results are presented in HTML5 and can be viewed in a recent version of common browsers.
  • 6. Existing Demo Application Published at the ACM Multimedia Open Source Competition [1] • Open source library for CBIR • Based on Lucene • Java text retrieval framework • Indexing and Search • Supporting Global and Local features (integrates up to 20 descriptors) [1] Mathias Lux. LIRE: Open source image retrieval in Java. In Proceedings of the 21st ACM International Conference on Multimedia, pages 843-846. ACM, 2013.
  • 7. Methodology 1. Previous methods in demo application: • Global Features i. CEDD. Color and Edge Directivity Descriptor ii. Color Histogram iii. PHOG. Pyramid Histogram of Oriented Gradients • Late Fusion Methods 2. Extend the methods to local features for retrieval • Use an existing tool to study better results • SIMPLE descriptor
  • 8. Method 1: Global features using Late Fusion (feature extraction and indexing, similarity measure, fusion) Fig. System Architecture
  • 9. Method 1: Global features using Late Fusion (feature extraction and indexing, similarity measure, fusion) Global descriptors for each IRM: 1. CEDD 2. Color Histogram 3. PHOG
  • 10. Method 1: Global features using Late Fusion (feature extraction and indexing, similarity measure, fusion) Normalization: • Two different approaches • Result lists limited to N images: 1. rank: $\bar{R}_k(n) = \frac{N + 1 - R_k(n)}{N}$ 2. score: $\bar{R}_k(n) = \frac{R_k(n) - \min(R_k)}{\max(R_k) - \min(R_k)}$
  • 11. Method 1: Global features using Late Fusion (feature extraction and indexing, similarity measure, fusion) Fusion Methods: a. Sum: $R_t(n) = \sum_{k} R_k(n) = R_1(n) + R_2(n) + \dots + R_K(n)$ b. Sum with combMNZ: the sum multiplied by the number of IRMs that returned image n. Four Final Ranked Lists: 1. Sum (ranks) 2. Sum (scores) 3. Sum with combMNZ (ranks) 4. Sum with combMNZ (scores)
  • 12. Method 2: Local features. SIMPLE descriptor “Searching Images with MPEG-7 (& MPEG-7-like) Powered Localized dEscriptors (SIMPLE)” [2] SURF detector + CEDD descriptor • Extraction of global features as local ones (image key points) • Codebook of 512 visual words (VW) using the Bag-of-Visual-Words (BoVW) model • K-means clustering algorithm with a vocabulary of 512 words. [2] Chryssanthi Iakovidou, Nektarios Anagnostopoulos, Athanasios Ch. Kapoutsis, Yiannis Boutalis, and Savvas A. Chatzichristofis. Searching images with MPEG-7 (& MPEG-7-like) powered localized descriptors: the SIMPLE answer to effective content based image retrieval. In 12th International Workshop on Content-Based Multimedia Indexing (CBMI), pages 1-6. IEEE, 2014.
  • 13. Data sets Video Retrieval for two different cases: 1. Musical Performances 2. Endoscopic Videos
  • 14. Data set 1: Musical Performances • Freely available data set. It allows us to compare results • Jiku Mobile data set: 473 video clips; mobile devices; multiple users; 5 events and several performances • Test: 356 videos randomly selected; based on 1 frame per second; 412 query images Fig. Query images event domain
  • 15. Data set 2: Endoscopic Videos • Confidential and anonymized data • Live video stream data set: surgeons’ recordings in HQ; inside of their subjects; roughly 33 hours covered; 54 laparoscopy procedures • Test: 1,276 videos randomly selected; based on 5 frames per second; 600 query images Fig. Query images medical domain
  • 16. Experiments Video Retrieval tested by two different evaluations: 1. Quantitative evaluation 2. Qualitative evaluation (Thinking-aloud Test)
  • 17. Evaluation Social Study, at AAU 1. Quantitative study: • Find the position in the result list of the video the query image belongs to • Results Global Features • Results Local Features 2. Qualitative study. Thinking-aloud Test • Semi-interactive web page interface • Participants are researchers and non-researchers within the CODE-MM Project • 6 Volunteers for Musical Performances Test • 2 Volunteers for Endoscopic Videos Test
  • 18. Evaluation Social Study, at AAU Thinking-aloud Test • Semi-interactive web page interface, blindly labeled with 3 Search Engines (A, B, C) i. sum of ranks method and global features → Search Engine A ii. sum of scores method and global features → Search Engine B iii. SIMPLE (SURF detector + CEDD descriptor) → Search Engine C • Participants must voice their thoughts aloud • Sessions are recorded
  • 19. Evaluation Social Study, at AAU Thinking-aloud Test Fig. Screenshots of the different movements of the first volunteer Fig. Screenshot from the thinking aloud test Fig. Interface for the thinking aloud test
  • 20. Experiments Video Retrieval tested on two data sets: 1. Musical Performances 2. Endoscopic Videos
  • 21. Musical Performances Quantitative Evaluation Benchmarking based on the set of 412 queries: Table I. Results of the tests on where the actual video can be found in the results. The first four columns give the four different tested feature fusion approaches, the fifth one gives the results on the use of the SIMPLE-CEDD descriptors. Source video of the query image ranked in the first position of the result list • Global features: 96.6% of the queries • Local features: 91.5% of the queries
  • 22. Musical Performances Qualitative Evaluation Fig. Most used query images in the user test (left to right) Overall impression: Global features (A, B) • Search Model: Abstract exploratory • Different sub-events, same view point Local features (C) • Search Model: Semantically similar content • Same performance, different viewpoints • Good results in earlier video’s position
  • 23. Musical Performances Qualitative Evaluation Fig. panels: Global Features using Late Fusion; SIMPLE: SURF detector + CEDD
  • 24. Experiments Video Retrieval tested on two data sets: 1. Musical Performances 2. Endoscopic Videos
  • 25. Endoscopic Videos Quantitative Evaluation Benchmarking based on the set of 600 queries: Table II. Results of the tests on where the actual video can be found in the results. The first four columns give the four different tested feature fusion approaches, the fifth one gives the results on the use of the SIMPLE-CEDD descriptors. Source video of the query image ranked in the first position of the result list • Global features: 78.3% of the queries • Local features: 79.8% of the queries
  • 26. Endoscopic Videos Qualitative Evaluation Overall impression: Global features (A, B) • Search Model: Abstract exploratory • Relevant shots in the top results (semantically dissimilar) Local features (C) • Search Model: Semantically similar content • Same movements in surgeries • Good results for finding the query’s video source Fig. Shots (photos) manually created by the surgeon in the course of the procedure.
  • 27. Qualitative Evaluation SIMPLE: SURF detector + CEDD descriptor Fig. Screenshots of the result presentation showing the three top videos and the query image. All results are presented in HTML5 and can be viewed in recent browsers supporting HTML5 videos and JavaScript. Best matching frames are indicated by triangles in the red and grey time line below the video player.
  • 28. Conclusions and Further Work An existing tool is adapted and extended for content-based video retrieval • Global features: exploratory search mode • Local features: semantically similar content Further work: • Ad-hoc search within surgery procedures • Faster indexing strategies • Fusion of local and global features • Different implementation of SIMPLE descriptor (Random Detector + modified-CEDD descriptor)
  • 29. Appendix Two papers were presented in the Special Session on Medical Multimedia Processing [3] [4] (acceptance rate for special sessions = 55%) [3] Roldan-Carlos J, Lux M, Giró-i-Nieto X, Muñoz-Trallero P, Anagnostopoulos N. Event Video Retrieval using Global and Local Descriptors in Visual Domain. In: IEEE/ACM International Workshop on Content-Based Multimedia Indexing - CBMI 2015. [4] Roldan-Carlos J, Lux M, Giró-i-Nieto X, Muñoz-Trallero P, Anagnostopoulos N. Visual Information Retrieval in Endoscopic Video Archives. In: IEEE/ACM International Workshop on Content-Based Multimedia Indexing - CBMI 2015. Prague, Czech Republic: In press. http://arxiv.org/abs/1504.07874
  • 30. Thank you for your attention Do you have any questions? 7 May 2015 Visual Search for Musical Performances and Endoscopic Videos Jennifer Roldán
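
The late fusion steps described on slides 10 and 11 (normalization, then sum or combMNZ) can be illustrated with a minimal Python sketch. This is not the thesis or LIRE implementation: the function names (`normalize_by_rank`, `normalize_by_score`, `fuse`) and the toy image identifiers are hypothetical, and the sketch assumes that a larger normalized value means a better match and that each result list is already limited to N images.

```python
from typing import Dict, List


def normalize_by_rank(ranked_ids: List[str]) -> Dict[str, float]:
    """Rank normalization, (N + 1 - rank) / N, with ranks starting at 1."""
    n = len(ranked_ids)
    return {img: (n + 1 - rank) / n
            for rank, img in enumerate(ranked_ids, start=1)}


def normalize_by_score(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalization, (score - min) / (max - min).

    Assumption: a larger raw score means a more similar image.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; avoid dividing by zero.
        return {img: 1.0 for img in scores}
    return {img: (s - lo) / (hi - lo) for img, s in scores.items()}


def fuse(lists: List[Dict[str, float]], comb_mnz: bool = False) -> List[str]:
    """Fuse normalized result lists by summing the values per image.

    With comb_mnz=True, each sum is multiplied by the number of lists
    that returned the image.
    """
    totals: Dict[str, float] = {}
    hits: Dict[str, int] = {}
    for normalized in lists:
        for img, value in normalized.items():
            totals[img] = totals.get(img, 0.0) + value
            hits[img] = hits.get(img, 0) + 1
    if comb_mnz:
        totals = {img: total * hits[img] for img, total in totals.items()}
    # Best first; ties are broken by identifier so the order is deterministic.
    return sorted(totals, key=lambda img: (-totals[img], img))


# Toy result lists from three descriptors (hypothetical image identifiers).
cedd = normalize_by_rank(["a", "b", "c"])
color_histogram = normalize_by_rank(["b", "a", "d"])
phog = normalize_by_rank(["b", "c", "a"])

fused_sum = fuse([cedd, color_histogram, phog])
fused_mnz = fuse([cedd, color_histogram, phog], comb_mnz=True)
```

Running both normalizations through `fuse` with and without `comb_mnz` gives the four final ranked lists named on slide 11. combMNZ favors images returned by several descriptors, so it can reorder results relative to the plain sum when an image scores highly in only one list.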

Editor's notes

  1. HOW IT WORKS: Finding particular scenes within the videos recorded during a procedure is a challenge in the daily work of surgeons. The goals are to re-find these shots easily within video streams by visual queries, with similar frames shown on the time line where the images were taken, and to summarize the video content of the surgeries.
  2. http://es.slideshare.net/dermotte/lire-27544341?related=2
  3. https://imatge.upc.edu/web/publications/visual-information-retrieval-endoscopic-video-archives https://imatge.upc.edu/web/publications/event-video-retrieval-using-global-and-local-descriptors-visual-domain