Unsupervised Learning of Object Landmarks through Conditional Image Generation

Tomas Jakab¹*, Ankush Gupta¹*, Hakan Bilen², Andrea Vedaldi¹
¹ Visual Geometry Group, University of Oxford
² School of Informatics, University of Edinburgh
Advances in Neural Information Processing Systems (NeurIPS) 2018
Bingwen Hu
2019-01-20

Goal
Learn semantically meaningful landmarks without any manual annotations.
The method learns automatically from images or videos and works across different datasets of faces, humans, and 3D objects.
Why learn landmarks?
• Low-dimensional object representation
• Interpretable
Why unsupervised?
• Reduces dependency on expensive manual annotations
• Leverages the vast amount of video available online

Architecture
[Figure: architecture diagram. Labels: source image (appearance encoding); target image (unsupervised keypoint extraction, one heatmap per keypoint); image reconstruction.]
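The data flow in the diagram can be sketched as a composition of three functions. This is only an illustration of which image feeds which branch: the function names and the toy stand-ins below (`mean_colour`, `brightest_pixel`, `paint_dot`) are hypothetical and replace the learned networks with trivial operations.

```python
import numpy as np

def reconstruct(source, target, encode_appearance, extract_keypoints, generate):
    """Rebuild the target from source appearance plus target geometry."""
    appearance = encode_appearance(source)   # computed from the source image only
    keypoints = extract_keypoints(target)    # computed from the target image only
    return generate(appearance, keypoints)

# Toy stand-ins (assumptions, not the learned networks).
def mean_colour(img):
    # "Appearance": the average colour of the image.
    return img.reshape(-1, img.shape[-1]).mean(axis=0)

def brightest_pixel(img):
    # "Keypoint": the (row, col) of the brightest pixel.
    idx = np.argmax(img.sum(axis=-1))
    return np.unravel_index(idx, img.shape[:2])

def paint_dot(colour, location, size=8):
    # "Generator": paint the appearance colour at the keypoint location.
    out = np.zeros((size, size, colour.shape[0]))
    out[location] = colour
    return out

source = np.zeros((8, 8, 3))
source[..., 0] = 1.0                 # a uniformly red source image
target = np.zeros((8, 8, 3))
target[2, 5] = 1.0                   # a single bright pixel in the target
result = reconstruct(source, target, mean_colour, brightest_pixel, paint_dot)
```

The point of the sketch is that the target image influences the output only through its keypoints, while colour and texture come from the source.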

Method

(1) Heatmaps bottleneck
Each heatmap is then replaced with a Gaussian-like function centred at u_k* (the location of the k-th landmark) with a small fixed standard deviation.

• It provides a differentiable and distributed representation of the landmark locations.
• It restricts the information taken from the target image to spatial locations only.
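A minimal NumPy sketch of such a bottleneck follows. The slide text as extracted does not show the step that produces u_k*, so the sketch assumes it is the expected position under a spatial softmax of each score map (a soft-argmax). It also assumes coordinates normalised to [-1, 1]. The function names are hypothetical.

```python
import numpy as np

def landmark_coords(scores):
    """Soft-argmax over K score maps.

    scores: array of shape (H, W, K) with raw detector outputs.
    Returns an array of shape (K, 2) holding (row, col) in [-1, 1].
    """
    h, w, k = scores.shape
    flat = scores.reshape(h * w, k)
    flat = flat - flat.max(axis=0, keepdims=True)    # numerical stability
    prob = np.exp(flat)
    prob /= prob.sum(axis=0, keepdims=True)          # softmax over positions
    prob = prob.reshape(h, w, k)
    rows = np.linspace(-1.0, 1.0, h)
    cols = np.linspace(-1.0, 1.0, w)
    # Expected row and column under each spatial distribution.
    mean_row = (prob.sum(axis=1) * rows[:, None]).sum(axis=0)
    mean_col = (prob.sum(axis=0) * cols[:, None]).sum(axis=0)
    return np.stack([mean_row, mean_col], axis=1)

def render_gaussians(coords, size=16, sigma=0.1):
    """Render one Gaussian-like map per landmark, shape (size, size, K)."""
    rows = np.linspace(-1.0, 1.0, size)[:, None, None]
    cols = np.linspace(-1.0, 1.0, size)[None, :, None]
    d2 = (rows - coords[:, 0]) ** 2 + (cols - coords[:, 1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Example: two score maps, each with one strong peak.
scores = np.zeros((16, 16, 2))
scores[3, 12, 0] = 50.0
scores[10, 5, 1] = 50.0
coords = landmark_coords(scores)
maps = render_gaussians(coords)
```

Whatever else the score maps contained, the rendered maps depend only on the K coordinate pairs, which is how the bottleneck discards appearance information.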

(2) Generator network using a perceptual loss
In the loss, Γ(x) is an off-the-shelf pre-trained neural network, for example VGG-19, and Γ_l denotes the output of its l-th sub-network.
• The perceptual loss compares activations extracted from multiple layers of a deep network for both the reference and the generated images, instead of only the raw pixel values.
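The loss formula itself did not survive in the extracted slide text. The sketch below assumes the common form, a weighted sum over layers l of the squared difference between Γ_l of the reference and Γ_l of the generated image (averaged per layer here). To stay self-contained it uses hypothetical average-pooling functions in place of VGG-19 activations. The names `perceptual_loss`, `avg_pool`, and `feature_fns` are illustrative.

```python
import numpy as np

def perceptual_loss(reference, generated, feature_fns, weights=None):
    """Weighted sum over layers of the mean squared feature difference."""
    if weights is None:
        weights = [1.0] * len(feature_fns)
    total = 0.0
    for alpha, gamma_l in zip(weights, feature_fns):
        diff = gamma_l(reference) - gamma_l(generated)
        total += alpha * np.mean(diff ** 2)
    return total

def avg_pool(img, factor):
    """Stand-in feature extractor: non-overlapping average pooling."""
    h, w = img.shape[:2]
    img = img[:h - h % factor, :w - w % factor]
    pooled = img.reshape(h // factor, factor, w // factor, factor, -1)
    return pooled.mean(axis=(1, 3))

# Hypothetical "layers": raw pixels plus two coarser views of the image.
feature_fns = [
    lambda x: x,
    lambda x: avg_pool(x, 2),
    lambda x: avg_pool(x, 4),
]

rng = np.random.default_rng(0)
reference = rng.random((16, 16, 3))
loss_same = perceptual_loss(reference, reference, feature_fns)
loss_shifted = perceptual_loss(reference, reference + 1.0, feature_fns)
```

With a real pre-trained network, each entry of `feature_fns` would return the activations of one chosen layer, so the loss would penalise differences in learned features rather than in pooled pixels.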

Model details
• Landmark detection network: takes the image x' and produces K landmark heatmaps y'.
It is composed of sequential blocks, each consisting of two convolutional layers.
The spatial size of the final output, which holds the heatmaps, is set to 16×16.
These K feature channels are then used to render 16×16×K 2D Gaussian maps y' (with σ = 0.1).
• Image generation network: takes the image x and the landmarks y' = Φ(x') and reconstructs x'.
First, the image x is encoded as a feature tensor z.
Next, the features z and the landmarks y' are stacked together and fed to a regressor that reconstructs the target frame x'.
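The stacking step can be illustrated at the level of array shapes. The sketch assumes z has the same 16×16 spatial size as the landmark maps and that "stacked" means concatenation along the channel axis. The channel count C = 32 and K = 10 are made-up values, and the regressor itself is omitted.

```python
import numpy as np

def stack_condition(z, y):
    """Concatenate appearance features z (H, W, C) and landmark maps y (H, W, K)."""
    if z.shape[:2] != y.shape[:2]:
        raise ValueError("z and y must share the same spatial size")
    return np.concatenate([z, y], axis=-1)

rng = np.random.default_rng(0)
z = rng.random((16, 16, 32))         # hypothetical encoding of the source image x
y = rng.random((16, 16, 10))         # hypothetical Gaussian maps for K = 10 landmarks
stacked = stack_condition(z, y)      # input to the regressor, shape (16, 16, 42)
```

Because the two tensors are aligned spatially, the regressor sees at every location both the source appearance features and which landmarks, if any, fall there.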

Experiments: Learning facial landmarks

Experiments: Learning human body landmarks

Experiments: Learning 3D object landmarks

Experiments: Disentangling appearance and geometry
