Deep learning for semantic
analysis and annotation of
conventional and 360° video
Hannes Fassold
Who we are
• Smart Media Solutions Team
• CCM research group @ DIGITAL, JOANNEUM RESEARCH, Graz, Austria
• Content-based quality analysis & restoration of film and video
• http://vidicert.com
• http://www.hs-art.com
• Semantic video analysis
• Extraction of semantic information from a video
(with deep learning and classical methods)
• Shot & cadence detection
• Brand monitoring
• Object detection & recognition (faces, persons, …)
• Most components are real-time capable
Presentation overview
• Deep learning in a nutshell
• Face detection & recognition
• State of the art & issues
• Object detection & tracking
• State of the art & issues
• Applications
• Semi-automatic annotation of video
• Generating a non-interactive version
of 360° video
Deep learning in a nutshell
• Deep neural networks (DNNs)
• Mimic the structure of the human brain
• Training
• Learn the weights for all layers
• A huge annotated (“ground truth”) dataset
is needed for training “from scratch”
• Inference
• Run the network (classify / detect / …) for one image
• Both training and inference are usually done on graphics cards (GPUs)
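
The difference between training and inference can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not a deep network: it trains a single neuron (logistic regression) on made-up one-dimensional data in pure Python, and the names `train` and `infer` are chosen here for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=2000, lr=0.5):
    """Training: learn the weight and bias from annotated (ground truth) data."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(w * x + b) - y  # derivative of the log loss w.r.t. w*x + b
            grad_w += err * x
            grad_b += err
        # Plain gradient descent step on the averaged gradient.
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    return w, b

def infer(w, b, x):
    """Inference: run the trained model on one input."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Toy annotated dataset (made up): the label is 1 for the larger values.
samples = [0.0, 0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 1.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train(samples, labels)
print(infer(w, b, 0.1), infer(w, b, 0.9))
```

A real DNN has many layers and millions of weights, which is why both phases are typically run on GPUs, but the split into a costly learning phase and a cheap per-image forward pass is the same.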
Face detection
• State-of-the-art approaches
• Multi-Task CNN, RetinaFace, …
• Face detection is more or less “solved”
• Works great even for small faces
and profile views of faces
• Accuracy of > 90 % (mAP)
• Real-time capable
(on the GPU)
Result of our face detection algorithm on a region of an image from a 360° video.
Content provided by Mediaset for Hyper360 project.
Face recognition
• Most algorithms rely on the “closed-world assumption”
• All faces occurring in the videos are known, meaning that the face recognition
algorithm has been trained on them
• State-of-the-art approaches
• FaceNet, ArcFace, SphereFace, …
• Accuracy of > 98 % on the standard databases, processing in real-time
• Factors that degrade the recognition result
• Small face (or low resolution video)
• Profile view
• Bad lighting conditions
Face recognition – challenges & issues
• The “closed-world assumption” is difficult to satisfy in practice
• You do not want to retrain your DNN if you want to recognize a new person,
as training takes quite some time …
• Incremental training can help here
• Not easy: you first have to identify that a person is “new” and then retrain the DNN on the fly
• We have added incremental training in our in-house face framework
• You may not have enough annotations (samples) for each person
• The standard databases usually provide 50–100 annotations of each person’s face
• Training with less data is an active research area (“few-shot learning”)
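
One common way to sidestep retraining for every new person is sketched below. It is an assumption, not the incremental training of the in-house framework: a pretrained network is presumed to map each face to an embedding vector, and recognition is a nearest-neighbour lookup in a gallery of reference embeddings. The class `FaceGallery`, the threshold of 0.8 and the three-dimensional toy embeddings are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class FaceGallery:
    """Maps person names to reference embeddings (hypothetical helper)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value; has to be tuned on real data
        self.references = {}        # name -> list of embeddings

    def enroll(self, name, embedding):
        """Add a person (or one more sample) without retraining any network."""
        self.references.setdefault(name, []).append(embedding)

    def identify(self, embedding):
        """Return the best-matching name, or None if the face looks 'new'."""
        best_name, best_score = None, -1.0
        for name, refs in self.references.items():
            for ref in refs:
                score = cosine_similarity(embedding, ref)
                if score > best_score:
                    best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None

gallery = FaceGallery()
gallery.enroll("person_a", [1.0, 0.0, 0.0])
gallery.enroll("person_b", [0.0, 1.0, 0.0])

known = gallery.identify([0.9, 0.1, 0.0])    # close to person_a
unknown = gallery.identify([0.0, 0.0, 1.0])  # matches nobody in the gallery
if unknown is None:
    gallery.enroll("person_c", [0.0, 0.0, 1.0])
```

With this design a single annotated sample is enough to enroll a person, at the cost of depending entirely on the quality of the embedding network and on a well-chosen threshold.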
Face recognition – challenges & issues
• Class imbalance
• Some classes are under-represented in the dataset used for training the DNN
• Ethnic bias
• Publicly available face datasets contain mostly faces of Caucasian people
• Error rates for African people are about twice as high as for Caucasian people [1]
• Most face datasets contain few faces with glasses, although many Asian people wear glasses
• Active research on methods to mitigate class imbalance
• Better data augmentation strategies
• Data crawling
• Synthetic generation of additional training samples (“face synthesis”)
• Domain adaptation & unsupervised learning
[1] https://arxiv.org/pdf/1812.00194.pdf
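
Bias of this kind only becomes visible when the error rate is reported per group rather than as one overall number. The following minimal sketch shows such a breakdown; the group names and the evaluation outcomes are made up for illustration and are not taken from [1].

```python
from collections import defaultdict

def error_rate_per_group(results):
    """results: iterable of (group, is_correct) pairs from an evaluation run."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, is_correct in results:
        totals[group] += 1
        if not is_correct:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Made-up evaluation outcomes; group_b is under-represented in this toy set.
results = (
    [("group_a", True)] * 95 + [("group_a", False)] * 5
    + [("group_b", True)] * 18 + [("group_b", False)] * 2
)
rates = error_rate_per_group(results)
overall = sum(1 for _, ok in results if not ok) / len(results)
print(rates, overall)
```

Because the smaller group contributes few samples, the overall error rate stays close to that of the larger group and hides the gap.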
Object detection & tracking
• Task
• Detect an object of a certain class (e.g. person, dog, car, …)
and track it through its lifetime (each object gets a unique ID)
• State-of-the-art approaches
• RetinaNet, YoloV3, Faster R-CNN, …
• Usually detect 80 classes from MS COCO
• Our in-house algorithm
• Detects & tracks general objects,
faces, text and logos in real time
• << Demo video >>
Result of our object detection & tracking algorithm
Object detection & tracking – challenges & issues
• Current state
• Algorithms are really usable in practice: robust (mAP > 60 %) and fast (real-time)
• Remaining issues
• Re-identification of objects is challenging
• E.g. persons who get occluded and then reappear (crowded scene)
• One can use the object’s appearance, but what if all objects look the same (e.g. soccer players)?
• Simple strategy used in our framework: newly appearing objects get a new ID
• Limited number of object classes
• E.g. the MS COCO dataset [1] has classes for a few animals
(dog, cat, horse, cow, …), but what if the subject
of your documentary video is dinosaurs?
[1] https://arxiv.org/pdf/1405.0312.pdf
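
The simple strategy of giving newly appearing objects a new ID can be sketched as follows. This is a minimal illustration under assumptions, not the in-house tracker: detections are matched greedily to the previous frame by bounding-box overlap (IoU), the class `SimpleTracker` and the threshold `min_iou=0.3` are hypothetical, and no appearance features are used.

```python
from itertools import count

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

class SimpleTracker:
    """Greedy frame-to-frame matching; unmatched detections get a new ID."""

    def __init__(self, min_iou=0.3):
        self.min_iou = min_iou  # assumed threshold
        self.tracks = {}        # track ID -> box in the previous frame
        self._ids = count(1)

    def update(self, detections):
        """Assign a track ID to every detection of the current frame."""
        assigned = {}
        free = dict(self.tracks)  # tracks not yet matched in this frame
        for box in detections:
            best_id = max(free, key=lambda tid: iou(free[tid], box), default=None)
            if best_id is not None and iou(free[best_id], box) >= self.min_iou:
                del free[best_id]            # continue the existing track
            else:
                best_id = next(self._ids)    # newly appearing object
            assigned[best_id] = box
        self.tracks = assigned  # tracks without a match are dropped
        return assigned

tracker = SimpleTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(1, 0, 11, 10)])                    # second object occluded
frame3 = tracker.update([(1, 0, 11, 10), (50, 50, 60, 60)])  # it reappears
print(sorted(frame3))
```

The object that reappears in the third frame receives a fresh ID instead of its old one, which is exactly the re-identification gap described above.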
Semi-automatic video annotation
• Automate the annotation process of archive videos
• Who is appearing in the video (with whom), in which video sections
• Other potentially useful metadata: facial emotion, which action the person is performing,
what the person is saying, which logos appear, what the “video highlights” are, …
• Semi-automatic video annotation workflow
• Deep learning algorithms (face recognition, object detection & tracking, …)
do the first pass and generate the “raw metadata”
• Raw metadata is inspected and corrected (false detections, multiple IDs for one
person, …) by a human operator using a convenient tool
• Hopefully the whole process is more efficient than the “human-only” workflow ☺
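
The two passes of this workflow can be sketched on toy data. The representation below is an assumption: raw metadata is modelled as a list of `(frame, track_id)` pairs, the operator’s correction is a mapping from duplicate IDs to the right person, and the function names are hypothetical.

```python
def merge_ids(detections, id_map):
    """Apply operator corrections: map duplicate track IDs to one person."""
    return [(frame, id_map.get(pid, pid)) for frame, pid in detections]

def to_segments(detections, max_gap=1):
    """Group (frame, person) detections into per-person (start, end) segments."""
    frames_by_person = {}
    for frame, pid in detections:
        frames_by_person.setdefault(pid, set()).add(frame)
    segments = {}
    for pid, frames in frames_by_person.items():
        ordered = sorted(frames)
        start = prev = ordered[0]
        spans = []
        for f in ordered[1:]:
            if f - prev > max_gap:  # gap too large: close the current segment
                spans.append((start, prev))
                start = f
            prev = f
        spans.append((start, prev))
        segments[pid] = spans
    return segments

# Raw metadata from the automatic pass: one person was split into two IDs.
raw = [(0, "id_1"), (1, "id_1"), (2, "id_7"), (3, "id_7"), (10, "id_2"), (11, "id_2")]

# Operator correction: id_7 is actually the same person as id_1.
corrected = merge_ids(raw, {"id_7": "id_1"})
segments = to_segments(corrected)
print(segments)
```

The resulting segments answer the question of who appears in which video sections; the operator only has to supply the small correction mapping rather than annotate every frame.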
Non-interactive version of 360° video
• Generate a non-interactive version of a 360° video
• For archiving purposes, a preview version of the video
(in addition to the original 360° video) may be sufficient
• For consumption of 360° video on old TV sets, or as a
“lean-back mode” for users who do not want to interact
• Rough algorithm workflow
• Works iteratively, shot by shot
• Extract all scene objects (currently focusing on persons)
• Determine the most “interesting” person for the current
shot (based on size, movement, what we have seen in
the last shot, etc.) and track that person
Non-interactive version of a 360° music video
(each row is one generated shot)
Content provided by RBB for Hyper360 project.
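
The selection of the most “interesting” person can be pictured as a weighted score over the listed cues. The sketch below is purely illustrative: the weights, the normalisation of the cues to the range 0..1 and the choice to favour the person shown in the previous shot are assumptions, not the actual scoring used in the algorithm.

```python
from dataclasses import dataclass

@dataclass
class Person:
    track_id: int
    area_fraction: float  # bounding-box area relative to the frame, 0..1
    motion: float         # normalised amount of movement, 0..1

def interest_score(person, previous_id=None,
                   w_size=0.5, w_motion=0.3, w_continuity=0.2):
    """Weighted sum of size, movement and continuity with the last shot."""
    # Assumption: the person followed in the last shot gets a bonus.
    continuity = 1.0 if person.track_id == previous_id else 0.0
    return (w_size * person.area_fraction
            + w_motion * person.motion
            + w_continuity * continuity)

def most_interesting(persons, previous_id=None):
    return max(persons, key=lambda p: interest_score(p, previous_id))

persons = [Person(1, 0.2, 0.1), Person(2, 0.1, 0.5)]
first_shot = most_interesting(persons)                 # no history yet
next_shot = most_interesting(persons, previous_id=1)   # person 1 was shown before
print(first_shot.track_id, next_shot.track_id)
```

Flipping the sign of the continuity weight would instead encourage variety between shots; which behaviour is preferable depends on the editing style being targeted.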
Non-interactive version of 360° video – outlook
• << Demo video >>
• Currently working on addressing some limitations of the original algorithm
• More diverse shot types: close-up, wide-angle shot, panning shot, …
(currently, all shots are tracking shots with horizontal FOV of 75°)
• Employ best-practice rules for framing and “continuity editing”
• Avoid jump-cuts
• 180° rule
• …
• The goal is a “virtual director” that tries to mimic a certain human director’s style
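
To make the horizontal FOV of 75° concrete, the sketch below relates a virtual camera to the 360° frame. It assumes an equirectangular source frame (the slides do not state the projection) and uses hypothetical function names; the column range is only a rough indication of the covered region, since a proper perspective view needs a per-pixel reprojection.

```python
import math

def focal_length_px(out_width, hfov_deg=75.0):
    """Focal length (in pixels) of a pinhole view with the given horizontal FOV."""
    return (out_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)

def yaw_to_column(yaw_deg, pano_width):
    """Map a yaw angle in degrees to a pixel column of an equirectangular frame."""
    return ((yaw_deg + 180.0) % 360.0) / 360.0 * pano_width

def column_range(center_yaw_deg, pano_width, hfov_deg=75.0):
    """Left and right panorama columns covered by the view at the equator."""
    left = yaw_to_column(center_yaw_deg - hfov_deg / 2.0, pano_width)
    right = yaw_to_column(center_yaw_deg + hfov_deg / 2.0, pano_width)
    return left, right  # left > right means the view wraps around the seam

pano_width = 3600  # assumed panorama width in pixels
front = column_range(0.0, pano_width)     # camera looks at the panorama centre
back = column_range(170.0, pano_width)    # camera looks close to the seam
print(front, back, focal_length_px(1280))
```

A tracking shot then amounts to updating the centre yaw from the tracked person’s position in every frame, while a wide-angle or close-up shot would change `hfov_deg`.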
Acknowledgments
• Thanks to the “Hyper360” project partners RBB, Mediaset, Fraunhofer Fokus, Drukka for providing the
360° video sequences for research and development purposes within the project.
• The research leading to these results has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement No. 761934 (Hyper360) and grant
agreement No. 761802 (MARCONI).
• http://www.hyper360.eu/
• https://www.projectmarconi.eu/
Thank you for your attention!
Contact:
hannes.fassold@joanneum.at
JOANNEUM RESEARCH
Forschungsgesellschaft mbH
DIGITAL– Institut für Informations-
und Kommunikationstechnologien
Steyrergasse 17, 8010 Graz
Tel. +43 316 876-5000
digital@joanneum.at
www.joanneum.at/digital
