Processing Video Content and Transcript for Key-Topic Identification
Daniela Cretu
2570710
Problem Statement
The large number of devices that can take pictures and videos has led to an increase in uploaded multimedia content
300 hours of video uploaded to YouTube each hour
3.25 billion hours of YouTube videos watched every month
By 2020, Cisco forecasts it would take 5 million years for a person to watch every online video
Difference between video topic and description
Recipe : relevant for video
Subscriber channels : not related to video subject
Same video type - different categories
Both cooking-related videos, yet appear in different categories
Tagging issues
No tags!
Many relevant tags
Perhaps irrelevant tags?
Findability Problem
Problem : content becomes less and less findable
How can we fix this?
Annotating the videos could improve findability
User annotation
2 problems :
● Small number of tags (average 9 per video)
● May contain irrelevant tags - to gain more views
Alternative : automatically annotate the videos
Solution : Automatic annotation
● Process video streams (Google Video Intelligence, Clarifai API)
● Process video subtitles (Alchemy API, Google Natural Language)
● Tools processing the same type of data are likely to yield different results, so combine them
● Disadvantage : video tools can only provide content information, text tools can only provide context information
● Best approach - combine tools which process different video dimensions
Solution Approach
Related work
● Concept detection in video relies mostly on low-level image attributes, e.g. color histograms (Lin et al., Chang et al.)
● Detecting concepts in subtitles - used to assign categories to videos (Katsiouli et al.) or as a basis for finding other relevant entities (Garcia et al.)
● Crowdsourcing concepts - encourage users to play games and draw outlines of objects in the video (Di Salvo et al., Kavasidis et al.)
Color histograms for similar images
Research Question
Main Research Question : How can we identify key topics in a video through processing
of the video stream and its textual description?
Two sub-research questions:
● How can we determine if certain concepts are more relevant than others? (RQ1)
● How can we best align the concepts from the input sources (video stream and transcript)? (RQ2)
Dataset
● YouTube videos
● Of various types and lengths
● Aimed to select videos which do
not fit in more than one category
● In total 519 videos
Tools
● 1 subtitle processing tool - Google Natural Language [1]
○ Outputs detected concepts in the order of appearance
● 2 video processing tools - Clarifai [2] and Google Video Intelligence (GVI) [3]
○ We chose these tools because the alternatives break down the video into keyframes and perform concept detection on images, rather than video

Feature                                      Clarifai   GVI
Output format                                JSON       JSON
Tags per second                              yes        no
Tags ordered alphabetically                  no         yes
Occurrences of same tag grouped together     no         yes
Confidence score for tag                     yes        yes
'Video relevant' label for tags              no         yes

[1] https://cloud.google.com/natural-language/
[2] https://www.clarifai.com/
[3] https://cloud.google.com/video-intelligence/
GVI Sample Output
Tag with a single occurrence
Multiple occurrences of the same tag are grouped
Tag relevant at video level
Clarifai Sample Output - list of vectors
Vector of seconds
Vector of concepts for each second
Vector of probabilities for each concept
After running the tools on the dataset...
● Clarifai - highest number of tags
● Subtitles - lowest number of tags
● Large number of tags unique to a single tool - thus overlap between the tools is low
Research Methodology
Aim : find key topics in the processed dataset
Research Methodology
Tag Processing (for each tool separately)
Step 1 : calculate number of occurrences and
longest time interval of each tag
In Figure 6:
Black
● Number of occurrences = 1
● Longest time interval = 5 (last - first)
Classroom
● Number of occurrences = 3
● Longest time interval = 5 (last - first)
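Step 1 can be sketched in a few lines. This is only an illustration under one reading of the example above: an "occurrence" is taken to be a run of consecutive seconds in which the tag is detected, and the longest time interval is last - first for the longest run. The function name and the input layout (a list of second indices per tag) are assumptions, not the tools' actual output format.

```python
def occurrences_and_longest_interval(seconds):
    """Return (number of runs, longest run length as last - first).

    `seconds` is a hypothetical list of second indices at which one tag
    was detected. Consecutive seconds are merged into a single occurrence.
    """
    secs = sorted(set(seconds))
    if not secs:
        return 0, 0
    runs = 1
    start = prev = secs[0]
    longest = 0
    for s in secs[1:]:
        if s != prev + 1:          # gap found: the current run ends here
            longest = max(longest, prev - start)
            runs += 1
            start = s
        prev = s
    longest = max(longest, prev - start)  # close the final run
    return runs, longest


# Hypothetical detections mirroring the example: one long run for "black",
# three separate runs for "classroom".
black = occurrences_and_longest_interval([0, 1, 2, 3, 4, 5])
classroom = occurrences_and_longest_interval([0, 1, 2, 3, 4, 5, 8, 10, 11])
```

Under this reading, a tag seen once for a long stretch and a tag that keeps reappearing can share the same longest interval while differing in occurrence count.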
Research Methodology
Step 2 : Rescale the confidence scores so that the tag with the highest confidence score ends up with confidence = 1 (the highest confidence score becomes the divisor)
a. Recalculate the confidence of all other tags using the same divisor
In the example to the right : text has the highest confidence score - use that as divisor
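A minimal sketch of the rescaling in Step 2, assuming the confidences of one tool are held in a dictionary keyed by tag. The function name and the example values are hypothetical.

```python
def normalize_confidences(confidences):
    """Divide every confidence by the highest one, so the top tag gets 1.0."""
    if not confidences:
        return {}
    top = max(confidences.values())
    if top == 0:
        # Nothing to rescale if every score is zero.
        return dict(confidences)
    return {tag: score / top for tag, score in confidences.items()}


# Hypothetical scores: "text" has the highest confidence and becomes the divisor.
scaled = normalize_confidences({"text": 0.8, "board": 0.4, "chalk": 0.2})
```

Rescaling this way puts the scores of different tools on a common 0 to 1 scale before they are combined.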
Research Methodology
Step 3 : calculate relevance score for each tag
Relevance = sum of confidence scores / video length in seconds
Step 4 : combine tags from the three different outputs
● Use the average formula
● If a tag was detected by a tool, use its relevance; if not, use 0
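Steps 3 and 4 can be illustrated as follows. This is a sketch, not the original implementation: the function names, the dictionary layout and the sample numbers are assumptions. The relevance is the sum of a tag's confidence scores divided by the video length, and the combined score is the plain average over the tools, with 0 for any tool that did not detect the tag.

```python
def relevance(confidence_scores, video_length_s):
    """Step 3: sum of a tag's confidence scores / video length in seconds."""
    return sum(confidence_scores) / video_length_s


def combine(per_tool_relevance):
    """Step 4: average each tag's relevance over all tools (0 if not detected).

    `per_tool_relevance` maps a tool name to a {tag: relevance} dictionary.
    """
    tools = list(per_tool_relevance)
    all_tags = set().union(*(per_tool_relevance[t] for t in tools))
    return {
        tag: sum(per_tool_relevance[t].get(tag, 0.0) for t in tools) / len(tools)
        for tag in all_tags
    }


# Hypothetical values for a 10 second video and three tool outputs.
food_relevance = relevance([1.0, 0.5, 0.5], 10)
combined = combine({
    "clarifai": {"food": 0.6, "bowl": 0.3},
    "gvi": {"food": 0.9},
    "subtitles": {},
})
```

Because missing detections count as 0, a tag found by a single tool is pulled down by the average, while a tag found by several tools keeps a higher score.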
Combining tags from the three tools
Evaluation Goals
We have identified 4 evaluation goals
● Confirm our computations (EV1)
● Check for bias towards one of the tools (EV2)
● Check for any correlation between bias and video characteristics (EV3)
● Check if the automatic tools may have missed something (EV4)
Evaluate using crowdsourcing
Strategy for Selecting Videos to Evaluate
Choose a sample of videos which have high overlap between the 3 tools
Because it was concluded that shorter videos are more suitable for crowdsourcing (workers tend to lose focus on longer videos), we decided to show 10-second segments of video
From the sample of videos - pick 10-second segments to evaluate
Pick those segments from each video in which highly relevant tags occur (relevance as computed after combining the outputs of the tools)
In total, 2169 segments to be evaluated, from 213 videos
Selecting Tags to display
For each video segment - compose a list of most relevant, maybe relevant and not so relevant tags from the tags for the overall video
At most 10 tags in each category
3 variables help to assign tags to categories:
1. Max relevance score for segment (MaxConf)
2. Tag’s relevance score (Rel)
3. A relevance threshold (Thresh = 0.2 if MaxConf > 0.2, and Thresh = 0.02 if MaxConf <= 0.2)
Assign tags in categories:
1. If MaxConf - Thresh < Rel < MaxConf AND fewer than 10 tags in category => put tag in that category
2. Repeat until the rule no longer holds or there are 10 tags in the category
3. MaxConf = MaxConf - Thresh
4. Repeat until categories full or no more tags
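The assignment rule can be sketched as below. Several details are assumptions made only for this illustration: the upper bound is treated as inclusive (so the tag whose relevance equals MaxConf is not lost), Thresh is computed once from the initial MaxConf, and the function and category names are hypothetical.

```python
def assign_categories(tag_relevance, max_conf, cap=10,
                      names=("most relevant", "maybe relevant", "not so relevant")):
    """Place tags into relevance bands of width Thresh, top band first."""
    # Threshold rule from the slide: 0.2 for MaxConf > 0.2, otherwise 0.02.
    thresh = 0.2 if max_conf > 0.2 else 0.02
    # Highest relevance first, so the cap keeps the strongest tags.
    ranked = sorted(tag_relevance.items(), key=lambda kv: kv[1], reverse=True)
    categories = {name: [] for name in names}
    upper = max_conf
    for name in names:
        lower = upper - thresh
        for tag, rel in ranked:
            # Assumption: inclusive upper bound, strict lower bound.
            if lower < rel <= upper and len(categories[name]) < cap:
                categories[name].append(tag)
        upper = lower  # MaxConf = MaxConf - Thresh, then fill the next category
    return categories


# Hypothetical relevance scores for one segment.
result = assign_categories(
    {"pasta": 0.9, "kitchen": 0.75, "bowl": 0.6, "table": 0.45, "wall": 0.1},
    max_conf=0.9,
)
```

Tags that fall below the lowest band, such as "wall" in this example, are not shown to the workers at all.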
Crowdsourcing Task
● Ask users to watch 10 seconds of video
● Users can then select tags related to the video from the list
● Users can add any other tags they think are relevant to the segment
● Each task is evaluated by 15 workers
● Each worker gets 2 cents for each completed task
● Workers cannot submit the results without watching all 10 seconds
Evaluation Strategy
Watch video (step 1)
Select tags (step 2)
Add additional tags if desired (step 3)
Evaluation Results - EV1
At segment level, an average of 41.74% of highly relevant tags (as evaluated by the crowdsourcing
workers) were correctly detected by the algorithm
Maybe relevant tags - smallest overlap of all
Additional subtitle tags (not detected by any tool other than subtitles) have highest overlap - BUT we
counted each tag chosen by at least one worker in the same category (perhaps relevance is low ?)
Evaluation Results - EV1
At video level, an average of 46.19% of the tags which were evaluated as being highly relevant
by the workers were also detected by the algorithm as being highly relevant
Same as segment level, medium relevance tags have lowest overlap
Low relevance tags have slightly higher overlap than high relevance tags - for very short videos, there is higher overlap for highly relevant tags
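The overlap figures reported for EV1 can be read as the share of crowd-selected tags that the algorithm placed in the same category. A minimal sketch under that assumption is shown below; the function name and the example tags are hypothetical, and the real evaluation may have aggregated worker choices differently.

```python
def overlap_percentage(crowd_tags, algorithm_tags):
    """Percentage of the crowd's tags that the algorithm also produced."""
    crowd = set(crowd_tags)
    if not crowd:
        return 0.0
    return 100.0 * len(crowd & set(algorithm_tags)) / len(crowd)


# Hypothetical highly relevant tags for one video.
score = overlap_percentage(
    {"pasta", "kitchen", "chef", "pan"},   # chosen by workers
    {"pasta", "pan", "wall"},              # detected by the algorithm
)
```

Averaging this value over all segments, or over all videos, would give one overall percentage per level.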
Evaluation Results - EV2
● Clarifai - mainly low relevance tags
● Most high relevance tags were
detected by both visual processing
tools
● Bias towards choosing tags
detected by more than one tool
Evaluation Results - EV3
By assigning numerical values to the time distribution and tag category, we were able to calculate the correlation with the help of the corresponding Excel function.
Assignment : {100, 200, 300, 400} corresponds to {under 3 min, 3-5 min, 5-10 min, 10-15 min}
{10, 20, 30, 40, 50} corresponds to {clarifai + gvi, clarifai + gvi + sub, clarifai, gvi, sub}
High correlation score for cooking between time distribution and processing tool.
No correlation, or only a weak one, for the other 4 categories (for nature there is a correlation, but a very weak one)
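The numeric coding above can be reproduced outside a spreadsheet. The sketch below assumes the Excel function was a Pearson correlation (as computed by CORREL, for instance), which the slides do not state; the observations are made up purely to show the mechanics.

```python
import math

# Codes taken from the assignment above.
TIME_CODE = {"under 3 min": 100, "3-5 min": 200, "5-10 min": 300, "10-15 min": 400}
TOOL_CODE = {"clarifai + gvi": 10, "clarifai + gvi + sub": 20,
             "clarifai": 30, "gvi": 40, "sub": 50}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical (time bucket, detecting tools) pairs, one per chosen tag.
observations = [
    ("under 3 min", "clarifai + gvi"),
    ("3-5 min", "clarifai + gvi + sub"),
    ("5-10 min", "clarifai"),
    ("10-15 min", "gvi"),
]
r = pearson([TIME_CODE[t] for t, _ in observations],
            [TOOL_CODE[tool] for _, tool in observations])
```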
Evaluation Results - EV3
Using the same assignment as in the previous slide for the tag detection tools, we assigned {1, 2, 3, 4, 5} as aliases for {cooking, culture, nature, travel, other}
For all time distributions, the correlation factor is negative
No apparent correlation between categories and detection tools in any time distribution.
Evaluation Results - EV4
● Only about 20% additional tags found in our lists
● Most of them with low relevance
Evaluation Result - RQ1
● Identified a bias towards choosing tags detected by more than one tool
● These should be higher up in the list
● Better alignment strategy : instead of simple average, use a weighted average
● Assign higher weight to tags detected by more than one tool
Evaluation Result - RQ2
● Current alignment detects 46.19% of highly relevant tags for the sampled videos
(comparison between the highly relevant tags detected by our algorithm and the highly
relevant tags chosen by crowdsourcing workers)
● There is a percentage of tags detected as being of medium relevance which have been
promoted to high relevance after crowdsourcing
● Find a better relevance threshold
Evaluation result - RQ2
Examined users' choice behaviour for each category (the other three categories on next slide) to see whether combining tools gives more accurate results
● For each category, tags detected by GVI + Clarifai are chosen more often than tags detected by either Clarifai or GVI separately
● Adding subtitles does not make much of a difference (the highest overlap score for highly relevant tags occurs for tags detected by GVI + Clarifai)
● Subtitles have the fewest chosen tags (remember that the subtitle tags included here are not detected by any other tool)
Combining visual tools - better than using them individually
Combining visual tags with subtitle - better than using just subtitles
Linear increase in tags - as relevance decreases, the number of tags increases
Conclusion and Future Work
● Our alignment strategy correctly detects around 46% of relevant tags for sampled videos
● Wanted to find out whether combining tools would yield better results
○ Tags from GVI are chosen more often than Clarifai tags
○ Most tags for sampled videos come from GVI + Clarifai - more relevant
○ Adding subtitles to visual tags - better than using just subtitle tags
● There are few differences between video categories - can use them as one single dataset
● Related work deals mostly with one source of information, whereas we deal with
information from 3 different sources
○ Also mostly concerned with aligning tags to parts of the video, whereas we tried to find tags
relevant to the whole video.
● Our algorithm can be improved
● Include crowdsourcing to identify better threshold, not just for confirmation
● Use weighted average as part of alignment
Questions?
References
Lin, C. Y. et al. : VideoAL: A Novel End-To-End MPEG-7 Video Automatic Labeling System (2003)
Chang, S. F. and Ellis, D. and Jiang, W. and Lee, K. and Yanagawa, A. and Loui, A. C. and Luo, J. : Large-Scale Multimodal Semantic Concept Detection for Consumer Video (2007)
Katsiouli, P. and Tsetsos, V. and Hadjiefthymiades, S. : Semantic Video Classification Based on Subtitles and Domain Terminologies (2007)
Garcia, J. L. R. and Vocht, L. and Troncy, R. and Mannens, E. and Van de Walle, R. : Describing and contextualizing events in TV news shows (2014)
Di Salvo, R. and Giordano, D. and Kavasidis, I. : A Crowdsourcing Approach to Support Video Annotation (2014)
Kavasidis, I. and Palazzo, S. and Di Salvo, R. and Giordano, D. and Spampinato, C. : An innovative web-based collaborative platform for video annotation (2013)

More Related Content

Recently uploaded

Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...HyderabadDolls
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdfkhraisr
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themeitharjee
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...HyderabadDolls
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfSayantanBiswas37
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1ranjankumarbehera14
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...kumargunjan9515
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...Bertram Ludäscher
 

Recently uploaded (20)

Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 

Featured

Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 

Featured (20)

Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 

Processing Video Content and Transcript for Key-Topic Identification

  • 1. Processing Video Content and Transcript for Key-Topic Identification Daniela Cretu 2570710
  • 2. Problem Statement Large number of devices which can take pictures and videos lead to an increase in uploaded multimedia content 300 hours of video uploaded to YouTube each hour 3.25 billion hours of YouTube videos watched every month By 2020, Cisco forecasts it would take 5 million years for a person to watch every online video
  • 3. Difference between video topic and description Recipe : relevant for video Subscriber channels - not related to video subject
  • 4. Same video type - different categories Both cooking-related videos, yet appear in different categories
  • 6. Findability Problem ● Problem: content becomes less and less findable ● How can we fix this? Annotating the videos could improve findability
  • 7. User annotation 2 problems : ● Small number of tags (average 9 per video) ● May contain irrelevant tags - to gain more views Alternative : automatically annotate the videos
  • 8. Solution : Automatic annotation ● Process video streams (Google Video Intelligence, Clarifai API) ● Process video subtitles (Alchemy API, Google Natural Language) ● Tools processing the same type of data - likely yield different results - COMBINE THEM ● Disadvantage : video tools only able to provide content information, text tools only able to provide context information ● Best approach - Combine tools which process different video dimensions
  • 10. Related work ● Concept detection in video relies mostly on low-level image attributes, e.g. color histograms (Lin et al; Chang et al) ● Detecting concepts in subtitles - used to assign categories to videos (Katsiouli et al) or as a basis for finding other relevant entities (Garcia et al) ● Crowdsourcing concepts - encourage users to play games and draw outlines of objects in the video (Di Salvo et al; Kavasidis et al) ● Figure: color histograms for similar images
  • 11. Research Question Main research question: How can we identify key topics in a video by processing the video stream and its textual description? Two sub-research questions: ● How can we determine whether certain concepts are more relevant than others? (RQ1) ● How can we best align the concepts from the input sources (video stream and transcript)? (RQ2)
  • 12. Dataset ● YouTube videos ● Of various types and lengths ● Aimed to select videos which do not fit in more than one category ● In total 519 videos
  • 13. Tools ● 1 subtitle processing tool - Google Natural Language [1] ○ Outputs detected concepts in order of appearance ● 2 video processing tools - Clarifai [2] and Google Video Intelligence (GVI) [3] ○ We chose these tools because the alternatives break the video down into keyframes and perform concept detection on images rather than on video
      Comparison of the two video tools:
      Feature                                        Clarifai   GVI
      Output format                                  JSON       JSON
      Tags per second                                yes        no
      Tags ordered alphabetically                    no         yes
      Occurrences of same tag grouped together       no         yes
      Confidence score for tag                       yes        yes
      'Video relevant' label for tags                no         yes
      [1] https://cloud.google.com/natural-language/
      [2] https://www.clarifai.com/
      [3] https://cloud.google.com/video-intelligence/
  • 14. GVI Sample Output ● Tag with a single occurrence ● Multiple occurrences of the same tag are grouped ● Tag relevant at video level
  • 15. Clarifai Sample Output - list of vectors ● Vector of seconds ● Vector of concepts for each second ● Vector of probabilities for each concept
  • 16. After running the tools on the dataset... ● Clarifai - highest number of tags ● Subtitles - lowest number of tags ● Many tags are unique to a single tool, so the overlap between tools is low
  • 17. Research Methodology Aim : find key topics in the processed dataset
  • 18. Research Methodology Tag Processing (for each tool separately) Step 1: calculate the number of occurrences and the longest time interval of each tag. In Figure 6: ● Black: number of occurrences = 1, longest time interval = 5 (last - first) ● Classroom: number of occurrences = 3, longest time interval = 5 (last - first)
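Step 1 can be sketched in Python as below. This is only an illustration, not the author's code: it assumes each occurrence of a tag is available as a (tag, first second, last second) segment, and the timestamps in the example are invented so that the counts match the "Black" and "Classroom" numbers quoted from Figure 6. The function name `tag_statistics` is hypothetical.

```python
def tag_statistics(segments):
    """Return {tag: (number_of_occurrences, longest_interval_in_seconds)}.

    `segments` is an iterable of (tag, first_s, last_s) tuples; treating
    each occurrence as one such segment is an assumption of this sketch.
    """
    stats = {}
    for tag, first_s, last_s in segments:
        count, longest = stats.get(tag, (0, 0))
        # One more occurrence; keep the longest (last - first) seen so far.
        stats[tag] = (count + 1, max(longest, last_s - first_s))
    return stats


# Invented timestamps, chosen only to reproduce the counts on the slide.
example_segments = [
    ("black", 0, 5),
    ("classroom", 0, 5),
    ("classroom", 7, 9),
    ("classroom", 12, 13),
]
stats = tag_statistics(example_segments)
```

For a tool that reports tags per second rather than per segment, the per-second hits would first have to be merged into segments; how the original work did that is not stated on the slide.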
  • 19. Research Methodology Step 2: Rescale the confidence scores so that the tag with the highest confidence score ends up with confidence = 1 (the highest confidence score becomes the divisor), then recalculate the confidence of all other tags with the same divisor. In the example to the right, "text" has the highest confidence score, so it is used as the divisor
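A minimal sketch of the Step 2 rescaling, assuming the confidences of one tool are held in a plain dictionary. The function name and the numeric values are hypothetical; the slide only states that "text" had the highest score in its example.

```python
def normalize_confidences(confidences):
    """Divide every confidence by the highest one, so the top tag becomes 1."""
    if not confidences:
        return {}
    divisor = max(confidences.values())
    if divisor <= 0:
        raise ValueError("highest confidence must be positive")
    return {tag: score / divisor for tag, score in confidences.items()}


# Hypothetical raw scores; "text" is the highest, so it becomes the divisor.
raw = {"text": 0.8, "font": 0.4, "line": 0.2}
normalized = normalize_confidences(raw)
```

Dividing by the maximum puts the outputs of different tools on a comparable 0 to 1 scale before their scores are combined.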
  • 20. Research Methodology Step 3: calculate a relevance score for each tag: sum of confidence scores / video length in seconds. Step 4: combine the tags from the three different outputs ● Use the average of the three relevance scores ● If a tag was detected by a tool, use its relevance; if not, use 0 ● Figure: combining tags from the three tools
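Steps 3 and 4 can be sketched as follows. The data layout (one dictionary per tool) and the function names are assumptions made for illustration; the formulas themselves are the ones stated on the slide.

```python
def relevance_scores(confidences_by_tag, video_length_s):
    """Step 3: relevance = sum of confidence scores / video length in seconds."""
    if video_length_s <= 0:
        raise ValueError("video length must be positive")
    return {
        tag: sum(scores) / video_length_s
        for tag, scores in confidences_by_tag.items()
    }


def combine_tools(tool_relevances):
    """Step 4: simple average over all tools, using 0 where a tool missed a tag."""
    n_tools = len(tool_relevances)
    all_tags = set().union(*tool_relevances)
    return {
        tag: sum(scores.get(tag, 0.0) for scores in tool_relevances) / n_tools
        for tag in all_tags
    }


# Hypothetical inputs: normalized confidences for one tool on a 10 s video.
clarifai = relevance_scores({"food": [1.0, 0.5, 0.5], "pan": [0.5]}, 10)
gvi = {"food": 0.4}
subtitles = {"food": 0.3, "recipe": 0.6}
combined = combine_tools([clarifai, gvi, subtitles])
```

Because a missing tag counts as 0, a tag found by only one tool is pulled down by the other two, which already favours tags that several tools agree on.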
  • 21. Evaluation Goals We have identified 4 evaluation goals ● Confirm our computations (EV1) ● Check for bias towards one of the tools (EV2) ● Check for any correlation between bias and video characteristics (EV3) ● Check if the automatic tools may have missed something (EV4) Evaluate using crowdsourcing
  • 22. Strategy for Selecting Videos to Evaluate ● Choose a sample of videos with high overlap between the 3 tools ● Because it was concluded that shorter videos are more suitable for crowdsourcing (workers tend to lose focus on longer videos), we decided to show 10-second segments of video ● From each sampled video, pick the 10-second segments in which highly relevant tags occur (relevance as obtained after combining the outputs of the tools) ● In total, 2169 segments to be evaluated, from 213 videos
  • 23. Selecting Tags to Display For each video segment, compose lists of most relevant, maybe relevant and not so relevant tags from the tags for the overall video, with at most 10 tags in each category. 3 variables help to assign tags to categories: 1. Max relevance score for the segment (MaxConf) 2. The tag's relevance score (Rel) 3. A relevance threshold (Thresh = 0.2 if MaxConf > 0.2, and Thresh = 0.02 if MaxConf <= 0.2) Assigning tags to categories: 1. If MaxConf - Thresh < Rel < MaxConf AND there are fewer than 10 tags in the category => put the tag in that category 2. Repeat until the rule no longer holds or there are 10 tags in the category 3. MaxConf = MaxConf - Thresh 4. Repeat until the categories are full or there are no more tags
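One possible reading of this banding rule, sketched in Python. Several details are assumptions rather than facts from the slide: the upper bound is treated as inclusive (so the tag whose relevance equals MaxConf is not lost), Thresh is computed once from the initial MaxConf, each band of width Thresh maps to exactly one of the three categories, and tags below the third band are not shown. The function name is hypothetical.

```python
def assign_categories(tag_relevance, max_per_category=10):
    """Split tags into three relevance bands of width Thresh below MaxConf."""
    categories = {"most relevant": [], "maybe relevant": [], "not so relevant": []}
    if not tag_relevance:
        return categories

    max_conf = max(tag_relevance.values())
    thresh = 0.2 if max_conf > 0.2 else 0.02
    # Visit tags from most to least relevant so the cap keeps the best ones.
    ranked = sorted(tag_relevance.items(), key=lambda item: item[1], reverse=True)

    upper = max_conf
    for name in categories:  # dicts preserve insertion order
        lower = upper - thresh
        for tag, rel in ranked:
            # Assumption: upper bound inclusive, lower bound exclusive.
            if lower < rel <= upper and len(categories[name]) < max_per_category:
                categories[name].append(tag)
        upper = lower  # MaxConf = MaxConf - Thresh
    return categories


# Hypothetical relevance scores for one segment.
segment_tags = {"food": 0.9, "kitchen": 0.75, "pan": 0.6, "table": 0.45, "wall": 0.1}
result = assign_categories(segment_tags)
```

With these example values "food" and "kitchen" land in the first band, "pan" in the second and "table" in the third, while "wall" falls below all three bands and is left out.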
  • 24. Crowdsourcing Task ● Ask users to watch 10 seconds of video ● Users can then select tags related to the video from the list ● Users can add any other tags they think are relevant to the segment ● Each task is evaluated by 15 workers ● Each worker gets 2 cents for each completed task ● Workers cannot submit the results without watching all 10 seconds
  • 25. Evaluation Strategy ● Watch video (step 1) ● Select tags (step 2) ● Add additional tags if desired (step 3)
  • 26. Evaluation Results - EV1 ● At segment level, an average of 41.74% of highly relevant tags (as evaluated by the crowdsourcing workers) were correctly detected by the algorithm ● Maybe relevant tags have the smallest overlap of all ● Additional subtitle tags (not detected by any tool other than subtitles) have the highest overlap, BUT we counted each tag chosen by at least one worker in the same category (perhaps relevance is low?)
  • 27. Evaluation Results - EV1 ● At video level, an average of 46.19% of the tags evaluated as highly relevant by the workers were also detected as highly relevant by the algorithm ● As at segment level, medium relevance tags have the lowest overlap ● Low relevance tags have slightly higher overlap than high relevance tags; for very short videos, there is higher overlap for highly relevant tags
  • 28. Evaluation Results - EV2 ● Clarifai - mainly low relevance tags ● Most high relevance tags were detected by both visual processing tools ● Bias towards choosing tags detected by more than one tool
  • 29. Evaluation Results - EV3 By assigning numerical values to the time distribution and tag category, we were able to calculate the correlation with the corresponding Excel function. Assignment: ● {100, 200, 300, 400} corresponds to {under 3 min, 3-5 min, 5-10 min, 10-15 min} ● {10, 20, 30, 40, 50} corresponds to {clarifai + gvi, clarifai + gvi + sub, clarifai, gvi, sub} Results: ● High correlation score for cooking between time distribution and processing tool ● Nonexistent or weak correlation for the other 4 categories (for nature there is a correlation, but a very weak one)
  • 30. Evaluation Results - EV3 Using the same assignment as in the previous slide for the tag detection tools, we assigned {1, 2, 3, 4, 5} as aliases of {cooking, culture, nature, travel, other} ● For all time distributions, the correlation factor is negative ● No apparent correlation between categories and detection tools in any time distribution
  • 31. Evaluation Results - EV4 ● Only about 20% additional tags found in our lists ● Most of them with low relevance
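The correlation step can be reproduced outside a spreadsheet. The sketch below assumes the Excel function in question is CORREL, which computes the Pearson correlation coefficient; the slides do not name it. The numeric codes are the ones given on the slides, while the observation pairs are invented purely to show the mechanics.

```python
from math import sqrt

# Numeric codes as assigned on the slides.
TIME_CODES = {"under 3 min": 100, "3-5 min": 200, "5-10 min": 300, "10-15 min": 400}
TOOL_CODES = {
    "clarifai + gvi": 10,
    "clarifai + gvi + sub": 20,
    "clarifai": 30,
    "gvi": 40,
    "sub": 50,
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented (time bucket, detecting tools) pairs, one per chosen tag.
observations = [
    ("under 3 min", "clarifai + gvi"),
    ("3-5 min", "clarifai + gvi + sub"),
    ("5-10 min", "clarifai"),
    ("10-15 min", "gvi"),
]
xs = [TIME_CODES[time_bucket] for time_bucket, _ in observations]
ys = [TOOL_CODES[tools] for _, tools in observations]
r = pearson(xs, ys)
```

Note that the coefficient depends on the order in which the categories are numbered, so results for nominal values such as tool combinations or video categories should be read with care.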
  • 32. Evaluation Result - RQ1 ● Identified a bias towards choosing tags detected by more than one tool ● These should be higher up in the list ● Better alignment strategy : instead of simple average, use a weighted average ● Assign higher weight to tags detected by more than one tool
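The slides propose a weighted average but give no formula, so the following is only one hypothetical way to realise it: the simple average is scaled by a weight that grows with the number of tools that detected the tag. The weight values and the names `AGREEMENT_WEIGHT` and `weighted_combine` are invented for illustration, and the sketch assumes exactly three tools.

```python
# Hypothetical weights, keyed by how many tools detected the tag.
AGREEMENT_WEIGHT = {1: 1.0, 2: 1.5, 3: 2.0}


def weighted_combine(tool_relevances):
    """Average the per-tool relevance, then boost tags found by several tools."""
    n_tools = len(tool_relevances)
    all_tags = set().union(*tool_relevances)
    combined = {}
    for tag in all_tags:
        detected = [scores[tag] for scores in tool_relevances if tag in scores]
        simple_average = sum(detected) / n_tools  # missing tools count as 0
        combined[tag] = simple_average * AGREEMENT_WEIGHT[len(detected)]
    return combined


# Hypothetical relevance scores from the three tools.
clarifai = {"food": 0.6, "pan": 0.3}
gvi = {"food": 0.4}
subtitles = {"recipe": 0.3}
weighted = weighted_combine([clarifai, gvi, subtitles])
ranking = sorted(weighted, key=weighted.get, reverse=True)
```

In this example "food", detected by two tools, moves further ahead of the single-tool tags than it would under the plain average. Suitable weights would have to be tuned, for instance against the crowdsourcing results.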
  • 33. Evaluation Result - RQ2 ● Current alignment detects 46.19% of highly relevant tags for the sampled videos (comparison between the highly relevant tags detected by our algorithm and the highly relevant tags chosen by crowdsourcing workers) ● There is a percentage of tags detected as being of medium relevance which have been promoted to high relevance after crowdsourcing ● Find a better relevance threshold
  • 34. Evaluation Result - RQ2 Examined users' choice behaviour for each category (the other three categories are on the next slide) to see whether combining tools gives more accurate results ● For each category, tags detected by GVI + Clarifai are chosen more often than tags detected by either Clarifai or GVI separately ● Adding subtitles does not make much of a difference (the highest overlap score for highly relevant tags occurs for tags detected by GVI + Clarifai) ● Subtitles have the fewest chosen tags (remember that the subtitle tags included here are not detected by any other tool)
  • 35. ● Combining visual tools - better than using them individually ● Combining visual tags with subtitles - better than using just subtitles ● Linear increase in tags - as relevance decreases, the number of tags increases
  • 36. Conclusion and Future Work ● Our alignment strategy correctly detects around 46% of relevant tags for the sampled videos ● Wanted to find out whether combining tools would yield better results ○ Tags from GVI are chosen more often than Clarifai tags ○ Most tags for the sampled videos come from GVI + Clarifai - more relevant ○ Adding subtitles to visual tags - better than using just subtitle tags ● There are few differences between video categories - they can be used as one single dataset ● Related work deals mostly with one source of information, whereas we deal with information from 3 different sources ○ It is also mostly concerned with aligning tags to parts of the video, whereas we tried to find tags relevant to the whole video ● Our algorithm can be improved ○ Include crowdsourcing to identify a better threshold, not just for confirmation ○ Use a weighted average as part of the alignment
  • 38. References
      ● Lin, C. Y. et al: VideoAL: A novel End-To-End MPEG-7 Video Automatic Labeling System (2003)
      ● Chang, S. F. and Ellis, D. and Jiang, W. and Lee, K. and Yanagawa, A. and Loui, A. C. and Luo, J.: Large-Scale Multimodal Semantic Concept Detection for Consumer Video (2007)
      ● Katsiouli, P. and Tsetsos, V. and Hadjiefthymiades, S.: Semantic Video Classification Based on Subtitles and Domain Terminologies (2007)
      ● Garcia, J. L. R. and Vocht, L. and Troncy, R. and Mannens, E. and Van de Walle, R.: Describing and contextualizing events in TV news shows (2014)
      ● Di Salvo, R. and Giordano, D. and Kavasidis, I.: A Crowdsourcing Approach to Support Video Annotation (2014)
      ● Kavasidis, I. and Palazzo, S. and Di Salvo, R. and Giordano, D. and Spampinato, C.: An innovative web-based collaborative platform for video annotation (2013)