[DL Reading Group] The Conditional Analogy GAN: Swapping Fashion Articles on People Images


Deep Learning JP

2017/9/25 Deep Learning JP: http://deeplearning.jp/seminar-2/


DEEP LEARNING JP
[DL Papers]
“The Conditional Analogy GAN: Swapping Fashion Articles
on People Images”
Ryosuke Goto, VASILY, Inc.
http://deeplearning.jp/

2
• The Conditional Analogy GAN: Swapping Fashion Articles on People Images
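• Authors
  • Nikolay Jetchev, Urs Bergmann
  • Zalando Research
• Reason for selection
  • We want to turn a clothes-swapping mechanism into a practical service
  • The problem setting is simple and clean
Bibliographic information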

3
• A German fashion e-commerce site
• Operates across European countries
• Its sales are 7 times those of zozotown
• Tech blog
• https://jobs.zalando.com/tech/blog/
Zalando?

4
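• Proposes CAGAN, which learns the relationship among the inputs x_i, y_i, and y_j and generates x_i^j, the image corresponding to x_i and y_j
• It can take a model wearing one garment and dress that model in a different garment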
Abstract

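5
• Types of data used
  • Flat-lay product images (Article)
  • Images of a model wearing the product (Human)
• Pairs that are easy to obtain from e-commerce sites
  • Available for virtually every product
  • Easy to collect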
Introduction
Article Human

6
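• Challenges in the fashion business
  • Human images make it easy to picture how an item looks when worn, so they matter for purchasing
  • However, producing Human images is expensive and time-consuming
  • It would be valuable to generate them from flat-lay (Article) images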
Introduction

7
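• Application to virtual try-on
  • Requires building 3D models of the clothes
  • That costs more than taking photos
  • We would like to generate from Article images instead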
Introduction
https://www.slideshare.net/metatechnology/magic-mirror-for-fashion-stores

8
• CAGAN
• x_i and y_i
Proposed Model
Article Human

9
• Generator
  • Input
    • Triplet (x_i, y_i, y_j)
  • Output
    • Swapped image
    • Filter (mask)
• Discriminator
  • Input
    • A pair of Human and Article
Networks
Real/Fake
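
The two generator outputs listed above (a swapped image and a filter) suggest a per-pixel blend between the original Human image and the generated image. The following is a minimal sketch under that assumption; the blending formula, the function name `composite`, and the toy 2×2 grayscale "images" are illustrative and not taken from the slides.

```python
def composite(original, generated, mask):
    """Blend two images per pixel: mask * generated + (1 - mask) * original.

    All three arguments are 2-D lists of floats with the same shape.
    Mask values are assumed to lie in [0, 1].
    """
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Toy 2x2 grayscale example (hypothetical values).
original = [[0.2, 0.2], [0.2, 0.2]]   # Human image x_i
generated = [[0.8, 0.8], [0.8, 0.8]]  # raw generator image
mask = [[1.0, 0.0], [0.5, 0.0]]       # filter: 1 = take generated pixel

swapped = composite(original, generated, mask)
```

Where the mask is 0 the original pixel passes through unchanged, so a mask restricted to the garment region leaves the face and background untouched.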

10
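• Trains on a weighted combination of three loss functions
  • cGAN (the standard GAN formulation)
  • id loss (a constraint on how the filter is learned)
  • cycle loss (the difference after converting the generated output back to the original)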
Loss Function

11
• Each combination is classified as Real or Fake
• Outputs generated by swapping are treated as Fake
cGAN
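
The Real/Fake labelling above can be sketched as a binary cross-entropy loss on the discriminator's output probabilities. This is an illustration only: it assumes the Real pair is the original Human image with the Article actually worn, (x_i, y_i), and the Fake pair is the swapped output with the target Article, (x_i^j, y_j). The function names and the choice of cross-entropy are assumptions, not details from the slides.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps)
             + (1.0 - label) * math.log(1.0 - prob + eps))


def discriminator_loss(p_real_pair, p_swapped_pair):
    """Loss for the discriminator's "Real" probabilities on two pairs.

    p_real_pair:    D's output for (x_i, y_i), labelled Real (1).
    p_swapped_pair: D's output for (x_i^j, y_j), labelled Fake (0).
    """
    return bce(p_real_pair, 1.0) + bce(p_swapped_pair, 0.0)


good = discriminator_loss(0.9, 0.1)  # discriminator is mostly right
bad = discriminator_loss(0.1, 0.9)   # discriminator is mostly wrong
```

A discriminator that scores the real pair high and the swapped pair low gets the smaller loss, which is the behaviour the generator is trained to defeat.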

12
• Generator output
  • The image before it is overlaid with the filter
• We want the filter region to be as small as possible
id loss
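
One common way to keep a filter region small is to penalize the mean absolute value of the mask. The sketch below assumes that form; the slides only state the goal, so the exact formula and the name `id_loss` as a function are assumptions.

```python
def id_loss(mask):
    """Mean absolute mask value; smaller means fewer pixels are altered.

    mask is a 2-D list of floats, assumed to lie in [0, 1].
    """
    values = [abs(m) for row in mask for m in row]
    return sum(values) / len(values)


small_mask = [[1.0, 0.0], [0.5, 0.0]]  # touches only part of the image
full_mask = [[1.0, 1.0], [1.0, 1.0]]   # overwrites every pixel

small_penalty = id_loss(small_mask)
full_penalty = id_loss(full_mask)
```

Under this penalty a mask that overwrites the whole image costs more than one confined to the garment, which pushes the generator to change only what it has to.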

13
• Dress the swapped model in the original clothes again
• Needed so that the generated results stay consistent
Cycle Loss
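
The round trip described above can be written as: swap y_i for y_j, swap back, and compare with the original image. The sketch below assumes an L1 distance and uses two hypothetical stand-in generators on flat lists of pixel values; neither stand-in is the network from the paper.

```python
def l1(a, b):
    """Mean absolute difference between two equal-length lists."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_loss(generator, x_i, y_i, y_j):
    """Swap y_i -> y_j, then y_j -> y_i, and measure the drift from x_i."""
    swapped = generator(x_i, y_i, y_j)       # x_i^j: x_i wearing y_j
    restored = generator(swapped, y_j, y_i)  # back to wearing y_i
    return l1(x_i, restored)


def toy_generator(human, worn, target):
    """Hypothetical invertible swap: remove the worn article, add the target."""
    return [h - w + t for h, w, t in zip(human, worn, target)]


def lossy_generator(human, worn, target):
    """Hypothetical swap that ignores the worn article and loses detail."""
    return [0.5 * h + 0.5 * t for h, t in zip(human, target)]


x_i = [0.2, 0.4, 0.6]
y_i = [0.1, 0.1, 0.1]
y_j = [0.9, 0.5, 0.3]

consistent = cycle_loss(toy_generator, x_i, y_i, y_j)
inconsistent = cycle_loss(lossy_generator, x_i, y_i, y_j)
```

The invertible stand-in returns to the original image, so its cycle loss is essentially zero, while the lossy one cannot recover x_i and is penalized.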

14
• Adam (lr = 0.0002)
• Minibatch size: 16
• Ratio of the three losses: 1.0 : 0.1 : 1.0
• Input: 128×98 pixel images
Experiments
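
The settings above can be collected into a small configuration, with the three loss terms combined by the stated ratio. This sketch assumes the ratio is given in the order cGAN : id : cycle, matching the order in which the losses are listed on the Loss Function slide; the constant names and the example loss values are hypothetical.

```python
LEARNING_RATE = 0.0002  # Adam learning rate from the slide
BATCH_SIZE = 16         # minibatch size from the slide

# Assumed order: cGAN : id : cycle = 1.0 : 0.1 : 1.0
LOSS_WEIGHTS = {"cgan": 1.0, "id": 0.1, "cycle": 1.0}


def total_loss(losses, weights=LOSS_WEIGHTS):
    """Weighted sum of the individual loss terms, keyed by name."""
    return sum(weights[name] * losses[name] for name in weights)


# Hypothetical per-batch loss values, for illustration only.
example = {"cgan": 0.7, "id": 0.375, "cycle": 0.2}
combined = total_loss(example)
```

With these weights the id term contributes only a tenth of its raw value, so the mask penalty nudges the generator without dominating the adversarial and cycle terms.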

15
• The area around the neck is also swapped properly
• The model learns a filter that responds only to the region being swapped
Results

16
• Swapping various Articles onto one specific Human
• Struggles to transfer textures such as horizontal stripes
Results

17
• Swapping one specific Article onto various Humans
• Faces get distorted…
Results

18
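• Achieved high-accuracy clothes swapping using easily obtainable Human and Article data
• Color swaps work well, but texture swaps such as horizontal stripes do not
• Future work
  • This work used a dataset containing only images with simple backgrounds
  • Snapshots taken outdoors and similar images would be a harder task
Summary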

