Human parsing

Seminar 27-01-2018 Human Parsing By Yawei Luo

Problem description
• Human parsing aims to segment a human image into multiple semantic parts.
• It is a pixel-wise parsing problem.
• It is a supervised machine learning problem.
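To make "pixel-wise" concrete, here is a minimal sketch in plain Python. It assumes the model has already produced one score per class for every pixel; the names `Scores` and `label_map`, the three example classes, and the score values are all made up for illustration and do not come from the slides.

```python
from typing import List

# scores[c][y][x] is the score of class c at pixel (y, x).
Scores = List[List[List[float]]]


def label_map(scores: Scores) -> List[List[int]]:
    """Assign every pixel the index of its highest-scoring class."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical scores for 3 classes (say background, hair, face) on a 2x2 image.
scores = [
    [[0.90, 0.10], [0.20, 0.30]],  # class 0
    [[0.05, 0.80], [0.10, 0.30]],  # class 1
    [[0.05, 0.10], [0.70, 0.40]],  # class 2
]
parsed = label_map(scores)  # one class index per pixel
```

In the supervised setting, a per-pixel label map of this kind is what gets compared against the annotated ground truth.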

Challenges
• Occlusion (especially by other people)
• Multi-scale
• Cross-domain
• Label conflict
• Blur
• Cavity
• …
The main conflict is the desire for both a larger field of view and more accurate localization (deeper or denser?). Some of these challenges need a larger field of view; others need denser and more accurate localization.

Related works
• Atrous convolution, e.g. DeepLab
• Skip networks, e.g. U-Net (top figure) and FCN (bottom figure)
• Edge + pixel voting, e.g. Co-CNN
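As a rough illustration of the atrous (dilated) convolution idea listed under related works, the sketch below applies a 3-tap kernel to a 1-D signal with the taps spaced `rate` samples apart. It is a simplified stand-in, not DeepLab's implementation; the function names and the toy signal are assumptions made for this example.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, rate: int) -> int:
    """Number of input samples spanned by one output of a dilated kernel."""
    return rate * (kernel_size - 1) + 1


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """'Valid' 1-D cross-correlation whose kernel taps are `rate` samples apart."""
    span = receptive_field(len(kernel), rate)
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = [float(v) for v in range(10)]  # toy input: 0, 1, ..., 9
kernel = [1.0, 1.0, 1.0]

dense = atrous_conv1d(signal, kernel, rate=1)    # taps at offsets 0, 1, 2
dilated = atrous_conv1d(signal, kernel, rate=2)  # taps at offsets 0, 2, 4
```

With the same three weights, rate 2 covers five input samples instead of three, which is how atrous convolution enlarges the field of view without adding parameters or downsampling.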

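Baseline
[Figure: baseline architecture (DeepLabV2 with ASPP). Tensor shapes shown: 3×256×256 (image); 64×128×128, 256×64×64, 512×32×32, 1024×16×16, 2048×16×16 and 8192×16×16 (feature maps); 20×256×256 (label maps, marked "fake" and "real"). Legend: ResNet-101 block; ResNet-101 block with atrous conv; tensor transfer; upsampling.]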
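The spatial sizes in the baseline (256, 128, 64, 32, 16, 16) are consistent with the standard convolution output-size formula. The sketch below reproduces that chain. The kernel, stride, padding, and dilation values are assumptions chosen to resemble a typical ResNet-style backbone; the slides do not state them.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0, dilation: int = 1) -> int:
    """Output length of a convolution or pooling layer along one spatial axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


# Assumed layer settings (not taken from the slides).
layers = [
    dict(kernel=7, stride=2, padding=3),              # stem convolution
    dict(kernel=3, stride=2, padding=1),              # pooling
    dict(kernel=3, stride=2, padding=1),              # strided block
    dict(kernel=3, stride=2, padding=1),              # strided block
    dict(kernel=3, stride=1, padding=2, dilation=2),  # atrous block: no downsampling
]

trace = [256]
for cfg in layers:
    trace.append(conv_out(trace[-1], **cfg))
```

Under these assumptions the last step keeps the 16×16 resolution, because the atrous block uses stride 1 and a padding equal to its dilation rate.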

Two GANs
• Patch GAN focuses on low-level, local features, which guarantees sharp and clear label maps.
• Pose GAN focuses on high-level, global features, which helps generate label maps that are consistent with human pose priors.

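[Figure: baseline with Patch GAN. Components: ASPP; Patch D (patch discriminator). Losses: shallow NLL loss, deep NLL loss and Patch GAN loss, combined into the total loss. Operations: resize, concat, copy. Tensor shapes shown: 3×256×256 (image); 64×128×128, 256×64×64, 512×32×32, 1024×16×16, 2048×16×16 and 8192×16×16 (feature maps); 20×16×16 and 20×256×256 (label maps, marked "fake" and "real"). Legend: ResNet-101 block; ResNet-101 block with atrous conv; tensor transfer; upsampling.]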

[Figure: experimental result with Patch GAN (LIP).]

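[Figure: baseline with Patch GAN and Pose GAN. Components: ASPP; Patch D (patch discriminator); Pose D (pose discriminator); OpenPose. Losses: shallow NLL loss, deep NLL loss, Patch GAN loss and Pose GAN loss, combined into the total loss. Operations: resize, concat, copy. Tensor shapes shown: 3×256×256 (image); 19×16×16 (shown next to OpenPose); 64×128×128, 256×64×64, 512×32×32, 1024×16×16, 2048×16×16 and 8192×16×16 (feature maps); 20×16×16 and 20×256×256 (label maps, marked "fake" and "real"). Legend: ResNet-101 block; ResNet-101 block with atrous conv; tensor transfer; upsampling.]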
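The architecture diagram names four loss terms (shallow NLL, deep NLL, Patch GAN, Pose GAN) that feed a total loss. The slides do not say how they are combined, so the sketch below assumes a weighted sum. The weights `w_patch` and `w_pose`, the function names, and the example numbers are hypothetical.

```python
import math
from typing import List


def nll_loss(probs: List[List[float]], targets: List[int]) -> float:
    """Mean negative log-likelihood; probs[i][c] is the predicted
    probability of class c at pixel i, targets[i] is the true class."""
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


def total_loss(shallow_nll: float, deep_nll: float,
               patch_gan: float, pose_gan: float,
               w_patch: float = 1.0, w_pose: float = 1.0) -> float:
    """Assumed combination: both NLL terms plus weighted adversarial terms."""
    return shallow_nll + deep_nll + w_patch * patch_gan + w_pose * pose_gan


# Two pixels, two classes (made-up probabilities).
probs = [[0.5, 0.5], [0.25, 0.75]]
targets = [0, 1]
pixel_nll = nll_loss(probs, targets)

# Made-up loss values, only to show how the terms are combined.
loss = total_loss(1.0, 2.0, patch_gan=0.5, pose_gan=0.25, w_patch=0.1, w_pose=0.2)
```

Small adversarial weights keep the NLL terms dominant; whether the original work used such a weighting is not stated in the slides.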

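Difference between the two discriminators
• Patch GAN: the discriminator target is a map, with every entry equal to 1 for real and every entry equal to 0 for fake.
• Pose GAN: the discriminator target is a single value, 1 for real and 0 for fake.
[Figure legend: RGB image; pose; label map; feature map.]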
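The difference in targets can be sketched as follows: the patch discriminator emits one prediction per patch and is scored against an all-ones or all-zeros map, while the pose discriminator emits a single prediction scored against a scalar. Binary cross-entropy is assumed here as the adversarial loss; the slides do not name the loss, and the function names are illustrative.

```python
import math
from typing import List


def bce(pred: float, target: float) -> float:
    """Binary cross-entropy for one prediction in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(pred + eps) + (1.0 - target) * math.log(1.0 - pred + eps))


def patch_d_loss(pred_map: List[List[float]], is_real: bool) -> float:
    """Patch discriminator: average BCE against a map of all ones (real) or all zeros (fake)."""
    target = 1.0 if is_real else 0.0
    cells = [p for row in pred_map for p in row]
    return sum(bce(p, target) for p in cells) / len(cells)


def pose_d_loss(pred: float, is_real: bool) -> float:
    """Pose discriminator: BCE against a single scalar target, 1 (real) or 0 (fake)."""
    return bce(pred, 1.0 if is_real else 0.0)


confident_real = [[0.9, 0.8], [0.7, 0.6]]  # made-up per-patch predictions
loss_if_real = patch_d_loss(confident_real, is_real=True)
loss_if_fake = patch_d_loss(confident_real, is_real=False)
```

Because each entry of the patch map is judged separately, the patch discriminator gives local feedback, whereas the single pose score judges the input as a whole.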

[Figure: experimental result with two GANs (LIP): total loss.]

[Figure: experimental result with two GANs (LIP): D_loss and G_loss.]

Contributions
• We propose an effective PP-GAN for human parsing, which employs two conditional GANs as supplementary supervision on the shallow, fine layers and the deep, coarse layers of the network, respectively. Our model explicitly divides human parsing into "what" and "where" subtasks in a unified framework and boosts parsing performance at both the image level and the semantic level.
• To the best of our knowledge, this is the first attempt to integrate human pose information into a conditional GAN framework for the human parsing task, which significantly reduces the structural error of the parsing results.
• In the proposed framework, the discrimination process is naturally divided into two easier tasks, and two different discriminators are employed. The experiments demonstrate that multiple discriminators, each focusing only on its own area, outperform a single discriminator, which is prone to saturation when facing a complex task.
• The proposed PP-GAN significantly surpasses previous methods on both the challenging LIP and XXX benchmark datasets.
