SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
11Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Analysis of Visual Similarity in News VideosAnalysis of Visual Similarity in News Videos
with Robust and Memorywith Robust and Memory--EfficientEfficient
Image RetrievalImage Retrieval
David Chen, Peter Vajda, Sam Tsai, Maryam Daneshi,
Matt Yu, Huizhong Chen, Andre Araujo, Bernd Girod
Image, Video, and Multimedia Systems Group
Stanford University
22Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
33Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
44Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
55Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Plays 30 second clip around
query phrase match
Would benefit from accurate
segmentation of stories
Would benefit from reliable
generation of summary clips
66Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Applications of Anchor DetectionApplications of Anchor Detection
1. Provide strong cues for story segmentation
2. Extract news story summaries/previews
3. Identify anchors for general person recognition
TURNING TO TECH, SHARES OF RESEARCH IN MOTION REBOUNDED FROM A ONE MONTH LOW. THE
COMPANY'S NEXT GENERATION BLACKBERRY-10 PRODUCT LINE IS EXPECTED TO BE UNVEILED IN
JUST A FEW WEEKS. YOU MAY REMEMBER SHARES SOLD OFF LAST WEEK AFTER THE COMPANY
ISSUED A CAUTIOUS OUTLOOK FOR ITS FOURTH QUARTER RESULTS. BUT TODAY SHARES
BOUNCED BACK: UP 11.5% TO A UNDER $12.
Anchor Brian
Williams
Anchor Susie
Gharib
Don’t confuse anchors with
other people in the videos
77Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Applications of Preview MatchingApplications of Preview Matching
1. Provide strong cues for story segmentation
2. Extract news story summaries/previews
3. Indicate the most important stories in a broadcast
JUST A MESS. IN WASHINGTON, LAWMAKERS LEAVE TOWN FOR THE HOLIDAYS. THE CLOCK TICKS
DOWN TO THE SO-CALLED FISCAL CLIFF. LATE TODAY, THE PRESIDENT HASTILY APPEARS TO ASK IF
SOME OF THIS BUSINESS CAN BE FINISHED SOON.
88Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
OutlineOutline
• Related work in news video analysis
• Long-range visual similarity
• Anchor detection algorithm
• Preview matching algorithm
• Experimental results
99Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Related Work in News Video AnalysisRelated Work in News Video Analysis
• Model-based anchor detection
[Zhang et al., 1998] [Hanjalic et al., 1998] [Liu et al., 2000]
• Model-free anchor detection
[Gao et al., 2002] [De Santo et al., 2006] [D’Anna et al., 2007]
[Ma et al., 2008] [Broilo et al., 2011]
• Spatio-temporal slices for reporter detection
[Liu et al., 2007] [Zheng et al., 2010]
• Classification of news video shots
[Bertini et al., 2001] [Xiao et al., 2010] [Lee et al., 2011]
1010Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
Frame Number
FrameNumber
1 501 1001 1501 2001 2501 3001
1
501
1001
1501
2001
2501
3001
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
1111Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
Frame Number
FrameNumber
1 501 1001 1501 2001 2501 3001 3501
1
501
1001
1501
2001
2501
3001
3501
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
What causes these long-
range visual similarities?
What causes these long-
range visual similarities?
1212Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
NBC Nightly News on Dec. 21, 2012
1313Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual SimilarityAnchor:
Brian
Williams
Analyst:
David
Gregory
Reporter:
Kelly
O’Donnell
Reporter:
Andrea
Mitchell
1414Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
LongLong--Range Visual SimilarityRange Visual Similarity
1515Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Anchor Detection PipelineAnchor Detection Pipeline
Exclude
Frames
Without Faces
Extract Image
Signatures
Compare
Image
Signatures
Form Initial
Anchor
Candidates
Prune Away
False
Candidates
Include
Temporally
Nearby
Candidates
Keyframes
Detections
Similarity
Matrix
Count number of long-range local peaks in the
current row of the similarity matrix and pick initial
candidates from high-count rows
Compare initial candidates to one another and
prune out candidates which are not very similar to
the other initial candidates
From pruned set of candidates, expand to include
temporally nearby candidates which are also very
similar in appearance
1616Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
IntraIntra--Episode vs. InterEpisode vs. Inter--EpisodeEpisode
• Intra-episode: compare frames within a single
episode of a news program
• Inter-episode: compare frames between different
episodes of a news program
1717Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Preview Matching PipelinePreview Matching Pipeline
Detect and
Recognize
Text
Adaptively
Crop to
Preview
Region
Extract Image
Signature
Compare
Image
Signatures
Verify
Geometry in
Shortlist
JUST A
MESS
JUST A
MESS
COMING
UP
COMING
UP
Database of Image
Signatures
Frame
Matches
1818Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
REVV: Residual Enhanced Visual VectorREVV: Residual Enhanced Visual Vector
Extract Local
Features
Vector
Quantize to
Visual Words
Visual Codebook
Perform Mean
Aggregation
of Residuals
Query
Image
……
Regularize
with Power
Law
Reduce
Dimensions
by LDA
Binarize
Components
from Sign
Compute
Weighted
Correlations
Database
Signatures
Ranked List
1.74
1.75
1.79
1.80
1.83
1.84
…
1919Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Experimental SetupExperimental Setup
• Anchor detection
– Training on 12 episodes of NBC Nightly News (1
anchor/episode), ABC World News (1 anchor/episode),
Nightly Business Report (2 anchors/episode)
– Testing on 21 episodes of same three programs
– Measure precision / recall / F-score
• Preview matching
– Testing on 10 episodes of NBC Nightly News and ABC
World News
– Measure precision / recall / F-score
• Comparison of two memory-efficient signatures
– GIST: 66 MB/episode [Oliva et al., 2001] [Douze et al., 2009]
– REVV: 10 MB/episode [Chen et al., 2013]
2020Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Anchor Detection ResultsAnchor Detection Results
Recall Precision F-Score
GIST Intra 0.53 0.84 0.65
REVV Intra 0.87 0.90 0.88
2121Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Anchor Detection ResultsAnchor Detection Results
Recall Precision F-Score
REVV Intra 0.87 0.90 0.88
REVV
Intra + Inter
0.90 0.91 0.90
2222Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Preview Matching ResultsPreview Matching Results
Recall Precision F-Score
GIST 0.48 1.00 0.65
REVV 0.90 1.00 0.95
Type A: Preview occurs
at beginning of broadcast
2323Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
Preview Matching ResultsPreview Matching Results
Recall Precision F-Score
GIST 0.62 1.00 0.77
REVV 0.93 1.00 0.96
Type B: Preview occurs
prior to a commercial
2424Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
ConclusionsConclusions
• Long-range visual similarity in news videos provides
a general and effective method for anchor detection
and preview matching
• A robust image signature is required to handle
challenging appearance variations throughout a
newscast
• The image signature should be memory-efficient to
enable parallelized processing of large video
archives
Thank YouThank YouThank You
dmchen@stanford.edudmchen@stanford.edu

Weitere ähnliche Inhalte

Ähnlich wie Analysis of visual similarity in news videos with robust and memory efficient image retrieval

"How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"...
"How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"..."How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"...
"How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"...Edge AI and Vision Alliance
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...CSCJournals
 
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...IJECEIAES
 
論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions
論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions
論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future DirectionsToru Tamaki
 
REVIEW PPT.pptx
REVIEW PPT.pptxREVIEW PPT.pptx
REVIEW PPT.pptxSaravanaD2
 
A Comparison of People Counting Techniques via Video Scene Analysis
A Comparison of People Counting Techniques viaVideo Scene AnalysisA Comparison of People Counting Techniques viaVideo Scene Analysis
A Comparison of People Counting Techniques via Video Scene AnalysisPoo Kuan Hoong
 
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersNear-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersSymeon Papadopoulos
 
TVSum: Summarizing Web Videos Using Titles
TVSum: Summarizing Web Videos Using TitlesTVSum: Summarizing Web Videos Using Titles
TVSum: Summarizing Web Videos Using TitlesNEERAJ BAGHEL
 
Design and Analysis of Quantization Based Low Bit Rate Encoding System
Design and Analysis of Quantization Based Low Bit Rate Encoding SystemDesign and Analysis of Quantization Based Low Bit Rate Encoding System
Design and Analysis of Quantization Based Low Bit Rate Encoding Systemijtsrd
 
How to prepare a perfect video abstract for your research paper – Pubrica.pdf
How to prepare a perfect video abstract for your research paper – Pubrica.pdfHow to prepare a perfect video abstract for your research paper – Pubrica.pdf
How to prepare a perfect video abstract for your research paper – Pubrica.pdfPubrica
 
Parking Surveillance Footage Summarization
Parking Surveillance Footage SummarizationParking Surveillance Footage Summarization
Parking Surveillance Footage SummarizationIRJET Journal
 
ppt icitisee 2022_without_recording.pptx
ppt icitisee 2022_without_recording.pptxppt icitisee 2022_without_recording.pptx
ppt icitisee 2022_without_recording.pptxssusera4da91
 
How to prepare a perfect video abstract for your research paper – Pubrica.pptx
How to prepare a perfect video abstract for your research paper – Pubrica.pptxHow to prepare a perfect video abstract for your research paper – Pubrica.pptx
How to prepare a perfect video abstract for your research paper – Pubrica.pptxPubrica
 
IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...
IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...
IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...IRJET Journal
 
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...IJCSIS Research Publications
 
Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?klschoef
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachCSCJournals
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A SurveyIRJET Journal
 

Ähnlich wie Analysis of visual similarity in news videos with robust and memory efficient image retrieval (20)

"How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"...
"How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"..."How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"...
"How Image Sensor and Video Compression Parameters Impact Vision Algorithms,"...
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
 
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
MTVRep: A movie and TV show reputation system based on fine-grained sentiment ...
 
論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions
論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions
論文紹介:Temporal Sentence Grounding in Videos: A Survey and Future Directions
 
REVIEW PPT.pptx
REVIEW PPT.pptxREVIEW PPT.pptx
REVIEW PPT.pptx
 
A Comparison of People Counting Techniques via Video Scene Analysis
A Comparison of People Counting Techniques viaVideo Scene AnalysisA Comparison of People Counting Techniques viaVideo Scene Analysis
A Comparison of People Counting Techniques via Video Scene Analysis
 
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersNear-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
 
TVSum: Summarizing Web Videos Using Titles
TVSum: Summarizing Web Videos Using TitlesTVSum: Summarizing Web Videos Using Titles
TVSum: Summarizing Web Videos Using Titles
 
Design and Analysis of Quantization Based Low Bit Rate Encoding System
Design and Analysis of Quantization Based Low Bit Rate Encoding SystemDesign and Analysis of Quantization Based Low Bit Rate Encoding System
Design and Analysis of Quantization Based Low Bit Rate Encoding System
 
How to prepare a perfect video abstract for your research paper – Pubrica.pdf
How to prepare a perfect video abstract for your research paper – Pubrica.pdfHow to prepare a perfect video abstract for your research paper – Pubrica.pdf
How to prepare a perfect video abstract for your research paper – Pubrica.pdf
 
Video + Language 2019
Video + Language 2019Video + Language 2019
Video + Language 2019
 
Video + Language
Video + LanguageVideo + Language
Video + Language
 
Parking Surveillance Footage Summarization
Parking Surveillance Footage SummarizationParking Surveillance Footage Summarization
Parking Surveillance Footage Summarization
 
ppt icitisee 2022_without_recording.pptx
ppt icitisee 2022_without_recording.pptxppt icitisee 2022_without_recording.pptx
ppt icitisee 2022_without_recording.pptx
 
How to prepare a perfect video abstract for your research paper – Pubrica.pptx
How to prepare a perfect video abstract for your research paper – Pubrica.pptxHow to prepare a perfect video abstract for your research paper – Pubrica.pptx
How to prepare a perfect video abstract for your research paper – Pubrica.pptx
 
IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...
IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...
IRJET- A Survey on the Enhancement of Video Action Recognition using Semi-Sup...
 
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
 
Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?Interactive Video Search: Where is the User in the Age of Deep Learning?
Interactive Video Search: Where is the User in the Age of Deep Learning?
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A Survey
 

Mehr von MediaMixerCommunity

VideoLecturesMashup: using media fragments and semantic annotations to enable...
VideoLecturesMashup: using media fragments and semantic annotations to enable...VideoLecturesMashup: using media fragments and semantic annotations to enable...
VideoLecturesMashup: using media fragments and semantic annotations to enable...MediaMixerCommunity
 
Re-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutRe-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutMediaMixerCommunity
 
Remixing Media on the Web: Media Fragment Specification and Semantics
Remixing Media on the Web: Media Fragment Specification and SemanticsRemixing Media on the Web: Media Fragment Specification and Semantics
Remixing Media on the Web: Media Fragment Specification and SemanticsMediaMixerCommunity
 
Re-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and AnnotationRe-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and AnnotationMediaMixerCommunity
 
Re-using Media on the Web Tutorial: Introduction and Examples
Re-using Media on the Web Tutorial: Introduction and ExamplesRe-using Media on the Web Tutorial: Introduction and Examples
Re-using Media on the Web Tutorial: Introduction and ExamplesMediaMixerCommunity
 
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking TaskSemantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking TaskMediaMixerCommunity
 
Opening up audiovisual archives for media professionals and researchers
Opening up audiovisual archives for media professionals and researchersOpening up audiovisual archives for media professionals and researchers
Opening up audiovisual archives for media professionals and researchersMediaMixerCommunity
 
The Sensor Web - New Opportunities for MediaMixing
The Sensor Web - New Opportunities for MediaMixingThe Sensor Web - New Opportunities for MediaMixing
The Sensor Web - New Opportunities for MediaMixingMediaMixerCommunity
 
Building a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ ArchivesBuilding a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ ArchivesMediaMixerCommunity
 
Media Mixing in the broadcast TV industry
Media Mixing in the broadcast TV industryMedia Mixing in the broadcast TV industry
Media Mixing in the broadcast TV industryMediaMixerCommunity
 
Building a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ ArchivesBuilding a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ ArchivesMediaMixerCommunity
 
Semantic technologies for copyright management
Semantic technologies for copyright managementSemantic technologies for copyright management
Semantic technologies for copyright managementMediaMixerCommunity
 
Intelligent tools-mitja-jermol-2013-bali-7 may2013
Intelligent tools-mitja-jermol-2013-bali-7 may2013Intelligent tools-mitja-jermol-2013-bali-7 may2013
Intelligent tools-mitja-jermol-2013-bali-7 may2013MediaMixerCommunity
 

Mehr von MediaMixerCommunity (14)

VideoLecturesMashup: using media fragments and semantic annotations to enable...
VideoLecturesMashup: using media fragments and semantic annotations to enable...VideoLecturesMashup: using media fragments and semantic annotations to enable...
VideoLecturesMashup: using media fragments and semantic annotations to enable...
 
Re-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playoutRe-using Media on the Web: Media fragment re-mixing and playout
Re-using Media on the Web: Media fragment re-mixing and playout
 
Remixing Media on the Web: Media Fragment Specification and Semantics
Remixing Media on the Web: Media Fragment Specification and SemanticsRemixing Media on the Web: Media Fragment Specification and Semantics
Remixing Media on the Web: Media Fragment Specification and Semantics
 
Re-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and AnnotationRe-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and Annotation
 
Re-using Media on the Web Tutorial: Introduction and Examples
Re-using Media on the Web Tutorial: Introduction and ExamplesRe-using Media on the Web Tutorial: Introduction and Examples
Re-using Media on the Web Tutorial: Introduction and Examples
 
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking TaskSemantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
 
Opening up audiovisual archives for media professionals and researchers
Opening up audiovisual archives for media professionals and researchersOpening up audiovisual archives for media professionals and researchers
Opening up audiovisual archives for media professionals and researchers
 
The Sensor Web - New Opportunities for MediaMixing
The Sensor Web - New Opportunities for MediaMixingThe Sensor Web - New Opportunities for MediaMixing
The Sensor Web - New Opportunities for MediaMixing
 
Building a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ ArchivesBuilding a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ Archives
 
Media Mixing in the broadcast TV industry
Media Mixing in the broadcast TV industryMedia Mixing in the broadcast TV industry
Media Mixing in the broadcast TV industry
 
Building a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ ArchivesBuilding a linked data based content discovery service for the RTÉ Archives
Building a linked data based content discovery service for the RTÉ Archives
 
Semantic multimedia remixing
Semantic multimedia remixingSemantic multimedia remixing
Semantic multimedia remixing
 
Semantic technologies for copyright management
Semantic technologies for copyright managementSemantic technologies for copyright management
Semantic technologies for copyright management
 
Intelligent tools-mitja-jermol-2013-bali-7 may2013
Intelligent tools-mitja-jermol-2013-bali-7 may2013Intelligent tools-mitja-jermol-2013-bali-7 may2013
Intelligent tools-mitja-jermol-2013-bali-7 may2013
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Analysis of visual similarity in news videos with robust and memory efficient image retrieval

  • 1. 11Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Analysis of Visual Similarity in News VideosAnalysis of Visual Similarity in News Videos with Robust and Memorywith Robust and Memory--EfficientEfficient Image RetrievalImage Retrieval David Chen, Peter Vajda, Sam Tsai, Maryam Daneshi, Matt Yu, Huizhong Chen, Andre Araujo, Bernd Girod Image, Video, and Multimedia Systems Group Stanford University
  • 2. 22Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
  • 3. 33Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
  • 4. 44Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval
  • 5. 55Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Plays 30 second clip around query phrase match Would benefit from accurate segmentation of stories Would benefit from reliable generation of summary clips
  • 6. 66Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Applications of Anchor DetectionApplications of Anchor Detection 1. Provide strong cues for story segmentation 2. Extract news story summaries/previews 3. Identify anchors for general person recognition TURNING TO TECH, SHARES OF RESEARCH IN MOTION REBOUNDED FROM A ONE MONTH LOW. THE COMPANY'S NEXT GENERATION BLACKBERRY-10 PRODUCT LINE IS EXPECTED TO BE UNVEILED IN JUST A FEW WEEKS. YOU MAY REMEMBER SHARES SOLD OFF LAST WEEK AFTER THE COMPANY ISSUED A CAUTIOUS OUTLOOK FOR ITS FOURTH QUARTER RESULTS. BUT TODAY SHARES BOUNCED BACK: UP 11.5% TO A UNDER $12. Anchor Brian Williams Anchor Susie Gharib Don’t confuse anchors with other people in the videos
  • 7. 77Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Applications of Preview MatchingApplications of Preview Matching 1. Provide strong cues for story segmentation 2. Extract news story summaries/previews 3. Indicate the most important stories in a broadcast JUST A MESS. IN WASHINGTON, LAWMAKERS LEAVE TOWN FOR THE HOLIDAYS. THE CLOCK TICKS DOWN TO THE SO-CALLED FISCAL CLIFF. LATE TODAY, THE PRESIDENT HASTILY APPEARS TO ASK IF SOME OF THIS BUSINESS CAN BE FINISHED SOON.
  • 8. 88Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval OutlineOutline • Related work in news video analysis • Long-range visual similarity • Anchor detection algorithm • Preview matching algorithm • Experimental results
  • 9. 99Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Related Work in News Video AnalysisRelated Work in News Video Analysis • Model-based anchor detection [Zhang et al., 1998] [Hanjalic et al., 1998] [Liu et al., 2000] • Model-free anchor detection [Gao et al., 2002] [De Santo et al., 2006] [D’Anna et al., 2007] [Ma et al., 2008] [Broilo et al., 2011] • Spatio-temporal slices for reporter detection [Liu et al., 2007] [Zheng et al., 2010] • Classification of news video shots [Bertini et al., 2001] [Xiao et al., 2010] [Lee et al., 2011]
  • 10. 1010Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval LongLong--Range Visual SimilarityRange Visual Similarity Frame Number FrameNumber 1 501 1001 1501 2001 2501 3001 1 501 1001 1501 2001 2501 3001 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5
  • 11. 1111Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval LongLong--Range Visual SimilarityRange Visual Similarity Frame Number FrameNumber 1 501 1001 1501 2001 2501 3001 3501 1 501 1001 1501 2001 2501 3001 3501 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 What causes these long- range visual similarities? What causes these long- range visual similarities?
  • 12. 1212Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval LongLong--Range Visual SimilarityRange Visual Similarity NBC Nightly News on Dec. 21, 2012
  • 13. 1313Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval LongLong--Range Visual SimilarityRange Visual SimilarityAnchor: Brian Williams Analyst: David Gregory Reporter: Kelly O’Donnell Reporter: Andrea Mitchell
  • 14. 1414Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval LongLong--Range Visual SimilarityRange Visual Similarity
  • 15. 1515Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Anchor Detection PipelineAnchor Detection Pipeline Exclude Frames Without Faces Extract Image Signatures Compare Image Signatures Form Initial Anchor Candidates Prune Away False Candidates Include Temporally Nearby Candidates Keyframes Detections Similarity Matrix Count number of long-range local peaks in the current row of the similarity matrix and pick initial candidates from high-count rows Compare initial candidates to one another and prune out candidates which are not very similar to the other initial candidates From pruned set of candidates, expand to include temporally nearby candidates which are also very similar in appearance
  • 16. 1616Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval IntraIntra--Episode vs. InterEpisode vs. Inter--EpisodeEpisode • Intra-episode: compare frames within a single episode of a news program • Inter-episode: compare frames between different episodes of a news program
  • 17. 1717Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Preview Matching PipelinePreview Matching Pipeline Detect and Recognize Text Adaptively Crop to Preview Region Extract Image Signature Compare Image Signatures Verify Geometry in Shortlist JUST A MESS JUST A MESS COMING UP COMING UP Database of Image Signatures Frame Matches
  • 18. 1818Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval REVV: Residual Enhanced Visual VectorREVV: Residual Enhanced Visual Vector Extract Local Features Vector Quantize to Visual Words Visual Codebook Perform Mean Aggregation of Residuals Query Image …… Regularize with Power Law Reduce Dimensions by LDA Binarize Components from Sign Compute Weighted Correlations Database Signatures Ranked List 1.74 1.75 1.79 1.80 1.83 1.84 …
  • 19. 1919Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Experimental SetupExperimental Setup • Anchor detection – Training on 12 episodes of NBC Nightly News (1 anchor/episode), ABC World News (1 anchor/episode), Nightly Business Report (2 anchors/episode) – Testing on 21 episodes of same three programs – Measure precision / recall / F-score • Preview matching – Testing on 10 episodes of NBC Nightly News and ABC World News – Measure precision / recall / F-score • Comparison of two memory-efficient signatures – GIST: 66 MB/episode [Oliva et al., 2001] [Douze et al., 2009] – REVV: 10 MB/episode [Chen et al., 2013]
  • 20. 2020Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Anchor Detection ResultsAnchor Detection Results Recall Precision F-Score GIST Intra 0.53 0.84 0.65 REVV Intra 0.87 0.90 0.88
  • 21. 2121Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Anchor Detection ResultsAnchor Detection Results Recall Precision F-Score REVV Intra 0.87 0.90 0.88 REVV Intra + Inter 0.90 0.91 0.90
  • 22. 2222Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Preview Matching ResultsPreview Matching Results Recall Precision F-Score GIST 0.48 1.00 0.65 REVV 0.90 1.00 0.95 Type A: Preview occurs at beginning of broadcast
  • 23. 2323Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval Preview Matching ResultsPreview Matching Results Recall Precision F-Score GIST 0.62 1.00 0.77 REVV 0.93 1.00 0.96 Type B: Preview occurs prior to a commercial
  • 24. 2424Chen et al., Analysis of Visual Similarity in News Videos with Robust and Memory-Efficient Image Retrieval ConclusionsConclusions • Long-range visual similarity in news videos provides a general and effective method for anchor detection and preview matching • A robust image signature is required to handle challenging appearance variations throughout a newscast • The image signature should be memory-efficient to enable parallelized processing of large video archives
  • 25. Thank YouThank YouThank You dmchen@stanford.edudmchen@stanford.edu