Unsupervised Learning of Object Landmarks
through Conditional Image Generation
Tomas Jakab¹*, Ankush Gupta¹*, Hakan Bilen², Andrea Vedaldi¹
¹ Visual Geometry Group, University of Oxford
² School of Informatics, University of Edinburgh
Advances in Neural Information Processing Systems (NeurIPS) 2018
Bingwen hu
2019-01-20
Goal
Learn semantically meaningful landmarks without any manual annotations.
The method learns automatically from images or videos and works across different datasets of faces, humans, and 3D objects.
Why learn landmarks?
• Low-dimensional object representation
• Interpretable
Why unsupervised?
• Reduce dependency on expensive manual annotations
• Leverage the vast amount of video available online
Architecture
[Figure: architecture diagram. Labels: source image, target image, appearance encoding, unsupervised keypoint extraction, heatmap for each keypoint, image reconstruction.]
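Method
(1) Heatmaps bottleneck
The landmark detector outputs one score map per keypoint; each map is normalised into a probability distribution, and the landmark location u_k^* is taken as its expected coordinate.
Then, each heatmap is replaced with a Gaussian-like function centred at u_k^* with a small fixed standard deviation σ.
• It provides a differentiable and distributed representation of the landmark locations.
• It restricts the information taken from the target image to spatial locations only.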
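The bottleneck described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the helper names (`softmax2d`, `expected_location`, `render_gaussian`, `heatmap_bottleneck`) are hypothetical, and pixel coordinates are assumed to be normalised to [-1, 1] so that a small σ such as 0.1 is meaningful.

```python
import math

def softmax2d(scores):
    """Spatial softmax over a 2D grid given as a list of rows."""
    m = max(max(row) for row in scores)  # subtract the max for numerical stability
    exps = [[math.exp(s - m) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def grid_coords(n):
    """n evenly spaced coordinates in [-1, 1] (assumed normalisation)."""
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

def expected_location(scores):
    """Soft-argmax: probability-weighted mean (row, col) coordinate."""
    probs = softmax2d(scores)
    h, w = len(probs), len(probs[0])
    ys, xs = grid_coords(h), grid_coords(w)
    uy = sum(ys[i] * probs[i][j] for i in range(h) for j in range(w))
    ux = sum(xs[j] * probs[i][j] for i in range(h) for j in range(w))
    return uy, ux

def render_gaussian(center, size=16, sigma=0.1):
    """Gaussian-like map centred at `center` with fixed standard deviation."""
    uy, ux = center
    ys, xs = grid_coords(size), grid_coords(size)
    return [[math.exp(-((ys[i] - uy) ** 2 + (xs[j] - ux) ** 2) / (2 * sigma ** 2))
             for j in range(size)]
            for i in range(size)]

def heatmap_bottleneck(score_maps, sigma=0.1):
    """Replace each of the K square score maps by a Gaussian at its expected location."""
    return [render_gaussian(expected_location(s), len(s), sigma) for s in score_maps]
```

Because the rendered map depends on the score map only through the two coordinates returned by `expected_location`, everything except the landmark position is discarded, which is the sense in which the heatmaps act as a bottleneck.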
(2) Generator network using a perceptual loss
In the loss, Γ(x) is an off-the-shelf pre-trained neural network, for example VGG-19, and Γ_l denotes the output of its l-th sub-network.
• The perceptual loss compares activations extracted from multiple layers of a deep network for both the reference and the generated images, instead of only the raw pixel values.
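As a rough illustration of how such a loss is assembled, the sketch below sums a weighted squared distance between feature activations over several layers. It is a toy under stated assumptions: the function `perceptual_loss`, the per-layer weights, and the two stand-in "layers" (`identity` and `local_diff`) are hypothetical and operate on flat lists of numbers; in practice each layer would be a sub-network of a pre-trained model such as VGG-19.

```python
def perceptual_loss(target, generated, layers, weights=None):
    """Sum over layers of weighted squared L2 distance between activations.

    `layers` is a list of callables, each mapping an input to a flat list of
    feature values (stand-ins for the sub-networks of a pre-trained network).
    """
    if weights is None:
        weights = [1.0] * len(layers)
    loss = 0.0
    for alpha, layer in zip(weights, layers):
        f_t, f_g = layer(target), layer(generated)
        loss += alpha * sum((a - b) ** 2 for a, b in zip(f_t, f_g))
    return loss

# Toy stand-ins for feature extractors (NOT VGG-19; for illustration only).
def identity(x):
    """Raw values, playing the role of a pixel-level comparison."""
    return list(x)

def local_diff(x):
    """Differences between neighbours, a crude 'edge' feature."""
    return [b - a for a, b in zip(x, x[1:])]

target = [0.0, 1.0, 2.0, 3.0]
generated = [0.0, 1.0, 2.0, 5.0]
loss = perceptual_loss(target, generated, [identity, local_diff], weights=[1.0, 0.5])
```

The point of the design is that the comparison happens in feature space at several depths, so the generator is penalised for structural differences and not only for per-pixel errors.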
Model details
• Landmark detection network: ingests the image x' to produce K landmark heatmaps y'.
It is composed of sequential blocks, each consisting of two convolutional layers.
The spatial size of the final output, which holds the heatmaps, is set to 16×16.
These K feature channels are then used to render 16×16×K 2D Gaussian maps y' (with σ = 0.1).
• Image generation network: takes the image x and the landmarks y' = Φ(x') as input and reconstructs x'.
First, the image x is encoded as a feature tensor z.
Next, the features z and the landmarks y' are stacked together and fed to a regressor that reconstructs the target frame x'.
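The stacking step can be pictured as a channel-wise concatenation. The sketch below is illustrative only: it assumes the appearance features z share the 16×16 spatial size of the heatmaps (the slides do not state the size of z), and the channel counts `C` and `K` as well as the helper `stack_channels` are hypothetical.

```python
def stack_channels(features, heatmaps):
    """Concatenate two H x W x C grids (nested lists) along the channel axis."""
    if len(features) != len(heatmaps) or len(features[0]) != len(heatmaps[0]):
        raise ValueError("spatial sizes must match")
    return [[f_px + h_px for f_px, h_px in zip(f_row, h_row)]
            for f_row, h_row in zip(features, heatmaps)]

H = W = 16
C, K = 4, 3  # illustrative channel counts, not values from the paper

# z: appearance features of the source image x (assumed 16 x 16 x C).
z = [[[0.0] * C for _ in range(W)] for _ in range(H)]
# y: Gaussian landmark maps of the target image x' (16 x 16 x K).
y = [[[1.0] * K for _ in range(W)] for _ in range(H)]

stacked = stack_channels(z, y)  # 16 x 16 x (C + K), the input to the regressor
```

Each spatial position of the regressor's input then carries both what the object looks like (from x) and where the landmarks are (from x').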
Experiments
Experiments: Learning facial landmarks
Experiments: Learning human body landmarks
Experiments: Learning 3D object landmarks
Experiments: Disentangling appearance and geometry