Fast Image Tagging
M. Chen(Amazon.com), A. Zheng(MSR, Redmond), and K. Weinberger(Washington Univ.)
ICML2013
ICML2013 reading group, 2013-07-09
Preferred Infrastructure, Inc.
Takashi Abe <tabe@preferred.jp>
About me
- Takashi Abe
- Twitter: @tabe2314
- Okatani Lab, Tohoku University (computer vision) → intern at PFI → new PFI employee
2
Paper presented
M. Chen, A. Zheng and K. Weinberger. Fast Image Tagging. ICML, 2013.
Note: all figures and tables in these slides are taken from this paper.
3
Image Tagging (1)
4
- Estimate multiple relevant tags for a given image
- Training:
  - Input: {(image, tags), …}  Output: a mapping image → tag set
- Testing:
  - Input: an image  Output: the estimated tag set
[Figure: training images annotated "bear, polar, snow, tundra" and "buildings, clothes, shops, street"; a test image whose tags must be predicted]
Image Tagging (2): What makes it hard?
- The effective features differ by object type → want to combine many kinds of features
- Large appearance variation → want to use large datasets
- Incomplete annotation data
  - The available data has low recall, whatever its precision
  - (i.e., some of the true tags are simply missing from the annotations)
  - Example: Flickr tags
- Skewed tag frequency distribution
5
[Figure: examples of different feature types, e.g. color and edges]
FastTag
6
Basic idea
- Learn a (linear) mapping from images to enriched tags, while simultaneously enriching the annotated tag sets
- B: tag set → enriched tag set
- W: image features → enriched tag set
- Training:
7
Figure 1. Schematic illustration of FastTag. During training, two classifiers B and W are forced to predict similar results. At testing time, a simple linear mapping x → Wx predicts the tags.

Excerpt from the paper: [The objective] is jointly convex and has closed form solutions in each iteration of the optimization.

Co-regularized learning. As we are only provided with an incomplete set of tags, we create an additional auxiliary problem and obtain two sub-tasks: 1) training an image classifier x_i → W x_i that predicts the complete tag set from image features, and 2) training a mapping y_i → B y_i to enrich the existing sparse tag vector y_i by estimating which tags are likely to co-occur with those already in y_i. We train both classifiers simultaneously and force their output to agree by minimizing

\frac{1}{n} \sum_{i=1}^{n} \| B y_i - W x_i \|^2 .  (1)
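As an illustration (not the authors' code), the objective in Eq. (1) can be written down directly; the toy shapes and random data below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, n = 5, 8, 100                            # tags, feature dims, training images
Y = (rng.random((T, n)) < 0.3).astype(float)   # columns y_i: sparse 0/1 user tags
X = rng.standard_normal((d, n))                # columns x_i: image features

def coreg_loss(B, W, Y, X):
    """Eq. (1): (1/n) * sum_i ||B y_i - W x_i||^2."""
    R = B @ Y - W @ X
    return (R ** 2).sum() / Y.shape[1]

# The trivial solution B = 0 = W drives the loss to zero, which is exactly
# why the formulation needs the regularization introduced next.
loss_at_zero = coreg_loss(np.zeros((T, T)), np.zeros((T, d)), Y, X)
```

Note that the unregularized objective is minimized by the useless solution B = 0 = W, motivating the blank-out regularizer discussed on the following slides.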
Marginalized blank-out regularization (1)
- Naively minimizing (1) yields the trivial solution B = 0 = W, so regularization is needed
- We want B to map the annotation y_i to the true tag set z_i
- Since z_i is unavailable, consider a B that reconstructs y_i from a corrupted copy ỹ_i, obtained by dropping each entry of y_i with probability p
- Taking the expectation over repeatedly generated ỹ_i gives the expected reconstruction error, Eq. (2) in the paper
8
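A quick numerical check of the blank-out corruption model (a sketch; the values of p and y are arbitrary): averaging many corrupted copies of y approaches the closed-form mean E[ỹ] = (1 − p)·y, which is what allows the expectation to be marginalized analytically instead of sampled.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.4
y = np.array([1.0, 0.0, 1.0, 1.0, 0.0])   # a user tag vector

def corrupt(y, p, rng):
    """Blank-out noise: zero each entry independently with probability p."""
    return y * (rng.random(y.shape) >= p)

# Monte Carlo mean over many corrupted copies.
mc_mean = np.mean([corrupt(y, p, rng) for _ in range(20000)], axis=0)
expected = (1 - p) * y                     # closed form, no sampling needed
```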
Excerpt from the paper (continued): Here, By_i is the enriched tag set for the i-th training image, and each row of W contains the weights of a linear classifier that tries to predict the corresponding (enriched) tag based on image features.

The loss function as currently written has a trivial solution at B = 0 = W, suggesting that the current formulation is underconstrained. We next describe additional regularizations on B that guide the solution towards something more useful.

Marginalized blank-out regularization. We take inspiration from the idea of marginalized stacked denoising autoencoders (Chen et al., 2012) in formulating the tag enrichment mapping B : {0,1}^T → R^T. Our intention is to enrich the incomplete tag vectors. We can approximate the unknown corrupting distribution with piecewise uniform corruption in the learning step (see section 3.2). If prior knowledge on the original corruption mechanism is available, it can also easily be incorporated into our model.

More formally, for each y, a corrupted version ỹ is created by randomly removing (i.e., setting to zero) each entry in y with some probability p ≥ 0; therefore, for each user tag vector y and dimension t, p(ỹ_t = 0) = p and p(ỹ_t = y_t) = 1 − p. We train B to optimize

B = \operatorname{argmin}_B \; \frac{1}{n} \sum_{i=1}^{n} \| y_i - B \tilde{y}_i \|^2 .

Here, each row of B is an ordinary least squares regressor that predicts the presence of a tag given all existing tags in ỹ. To reduce variance in B, we take repeated samples of ỹ. In the limit (with infinitely many corrupted versions of y), the expected reconstruction error under the corrupting distribution can be expressed as

r(B) = \frac{1}{n} \sum_{i=1}^{n} E\left[ \| y_i - B \tilde{y}_i \|^2 \right]_{p(\tilde{y}_i \mid y_i)} .  (2)
Marginalized blank-out regularization (2)
- Rewriting (2) yields the closed form in Eq. (3)
- So there is no need to actually generate the corrupted copies ỹ (only p must be chosen)
- This expected reconstruction error is added to the loss function as a regularizer
9
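Under the blank-out noise, E[ỹ] = (1 − p)y and E[ỹỹᵀ] has off-diagonal entries (1 − p)² y_s y_t but diagonal entries (1 − p) y_t², so P, Q, and the minimizer of Eq. (3) (B = PQ⁻¹, from setting the gradient of the trace expression to zero) are all computable without sampling. A numpy sketch under those assumptions (the small ridge term is an addition of mine, for numerical safety):

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, p = 6, 200, 0.5
Y = (rng.random((T, n)) < 0.3).astype(float)   # columns = partial tag vectors y_i

S = Y @ Y.T                                    # sum_i y_i y_i^T
P = (1 - p) * S                                # sum_i y_i E[y~_i]^T
Q = (1 - p) ** 2 * S                           # off-diagonal part of sum_i E[y~_i y~_i^T]
np.fill_diagonal(Q, (1 - p) * np.diag(S))      # diagonal: E[y~_t^2] = (1 - p) * y_t^2

def r(B):
    """Eq. (3): (1/n) trace(B Q B^T - 2 P B^T + Y Y^T)."""
    return np.trace(B @ Q @ B.T - 2 * P @ B.T + S) / n

# Closed-form minimizer of r(B); no corrupted copies y~ are ever generated.
B = P @ np.linalg.inv(Q + 1e-6 * np.eye(T))
```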
Excerpt from the paper (continued): Let us denote as Y ≡ [y_1, · · · , y_n] the matrix containing the partial labels for each image in each column. Define P ≡ \sum_{i=1}^{n} y_i E[\tilde{y}_i]^\top and Q ≡ \sum_{i=1}^{n} E[\tilde{y}_i \tilde{y}_i^\top]; then we can rewrite the loss in (2) as

r(B) = \frac{1}{n} \operatorname{trace}\left( B Q B^\top - 2 P B^\top + Y Y^\top \right)  (3)

We use Eq. (3) to regularize B. For the uniform "blank-out" noise introduced above, we have the expected value of the corruptions E[\tilde{y}]_{p(\tilde{y} \mid y)} = (1 − p) y, …
Optimization
- With B fixed, the W minimizing (5) is given in closed form
- Likewise, with W fixed, B is given in closed form
- Alternating the two updates converges to the global optimum (the objective is jointly convex)
10
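The alternating scheme can be sketched as follows. This is not the paper's exact update (its Eq. (5), with the blank-out regularizer, is not reproduced on the slide); simple ridge penalties with an arbitrary λ stand in, purely to illustrate that each step is an exact closed-form minimization of a jointly convex objective, so the loss never increases:

```python
import numpy as np

rng = np.random.default_rng(3)
T, d, n, lam = 5, 8, 120, 1e-2
Y = (rng.random((T, n)) < 0.3).astype(float)   # tag vectors (columns)
X = rng.standard_normal((d, n))                # image features (columns)
I_T, I_d = np.eye(T), np.eye(d)

def loss(B, W):
    # (1/n)||BY - WX||^2 + lam*||B - I||^2 + lam*||W||^2  (stand-in regularizers)
    return ((B @ Y - W @ X) ** 2).sum() / n \
        + lam * ((B - I_T) ** 2).sum() + lam * (W ** 2).sum()

B, W = np.eye(T), np.zeros((T, d))
history = [loss(B, W)]
for _ in range(10):
    # W-step: closed-form ridge regression of the enriched tags B @ Y onto X.
    W = (B @ Y) @ X.T @ np.linalg.inv(X @ X.T + lam * n * I_d)
    # B-step: closed-form update, with B pulled toward the identity.
    B = ((W @ X) @ Y.T + lam * n * I_T) @ np.linalg.inv(Y @ Y.T + lam * n * I_T)
    history.append(loss(B, W))
```

Because each step exactly minimizes the shared convex objective in one block of variables, the recorded loss is monotonically non-increasing.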
Extensions
- Tag bootstrapping
  - B is learned from tag co-occurrence, so similar tags that rarely co-occur are not filled in (e.g., lake and pond)
- Stacking
  - Retrain with B y_i as the new annotations, and repeat
  - Intuitively, this propagates co-occurrence relations(?)
  - The number of stacked layers is chosen experimentally
11
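The stacking idea can be sketched roughly (my construction, not the paper's exact procedure): each layer retrains the enricher on the previous layer's enriched annotations, so co-occurrence information propagates across layers; the hypothetical `train_enricher` helper reuses the closed-form blank-out solution B = PQ⁻¹ derivable from Eq. (3).

```python
import numpy as np

rng = np.random.default_rng(4)
T, n, p = 6, 150, 0.5
Y = (rng.random((T, n)) < 0.3).astype(float)   # columns = annotation vectors y_i

def train_enricher(Y, p, ridge=1e-6):
    """Closed-form blank-out enricher B = P Q^{-1} (minimizer of Eq. (3))."""
    S = Y @ Y.T
    P = (1 - p) * S
    Q = (1 - p) ** 2 * S
    np.fill_diagonal(Q, (1 - p) * np.diag(S))
    return P @ np.linalg.inv(Q + ridge * np.eye(Y.shape[0]))

# Stacking: each layer's output B @ Y becomes the next layer's annotations.
# The number of layers (here 2) is a free parameter, chosen experimentally.
Z = Y.copy()
for _ in range(2):
    B = train_enricher(Z, p)
    Z = np.clip(B @ Z, 0.0, 1.0)
```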
Image features
- A combination of several features (the same set as prior work)
  - GIST
  - 6 kinds of color histograms
  - Bag-of-words over 8 kinds of local features
- The features are mapped in advance into a space whose inner products approximate the χ² kernel
  - Vedaldi, A. and Zisserman, A. Efficient additive kernels via explicit feature maps. PAMI, 34(3):480–492, 2012.
12
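For intuition, the Vedaldi–Zisserman explicit map for the χ² kernel can be sketched in a few lines: sample the kernel's spectrum κ(λ) = sech(πλ) at frequencies jL, and build cosine/sine features so that an ordinary dot product approximates the additive χ² kernel k(x, y) = Σ 2 x_i y_i / (x_i + y_i). The sampling parameters L and N below are my choices, not the paper's:

```python
import numpy as np

def chi2_feature_map(x, L=0.5, N=5):
    """Approximate explicit feature map for the additive chi^2 kernel
    (homogeneous kernel map of Vedaldi & Zisserman, sketched)."""
    x = np.maximum(np.asarray(x, dtype=float), 1e-12)   # the map needs x > 0
    feats = [np.sqrt(L * x)]                            # j = 0 term, sech(0) = 1
    for j in range(1, N + 1):
        kappa = 1.0 / np.cosh(np.pi * j * L)            # spectrum sech(pi * j * L)
        scale = np.sqrt(2 * L * kappa * x)
        feats.append(scale * np.cos(j * L * np.log(x)))
        feats.append(scale * np.sin(j * L * np.log(x)))
    return np.concatenate(feats)

def chi2_kernel(x, y):
    """Exact additive chi^2 kernel: sum_i 2 x_i y_i / (x_i + y_i)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float((2 * x * y / (x + y)).sum())

x = np.array([0.2, 0.5, 0.3])
y = np.array([0.1, 0.6, 0.3])
approx = float(chi2_feature_map(x) @ chi2_feature_map(y))
exact = chi2_kernel(x, y)
```

After this mapping, the linear classifier W effectively operates with χ²-kernel similarities at linear-model cost; scikit-learn's `AdditiveChi2Sampler` implements a closely related approximation.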
Experimental results
13
Accuracy evaluation
- leastSquares: FastTag without tag enrichment; the baseline
- TagProp: the previous state of the art; O(n²) training, O(n) testing
- FastTag's accuracy is nearly the same as TagProp's
14
[Figure 2. Predicted keywords using FastTag for sample images in the ESP game dataset (using all 268 keywords); the examples shown here are low-F1 cases, with predictions such as "brown, ear, painting, woman, yellow" and "board, lake, man, wave, white".]

Table 1. Comparison of FastTag and TagProp in terms of P, R, F1 score and N+ on the Corel5K dataset. Previously reported results using other image annotation techniques are also included for reference.

Name                              | P  | R  | F1 | N+
leastSquares                      | 29 | 32 | 30 | 125
CRM (Lavrenko et al., 2003)       | 16 | 19 | 17 | 107
InfNet (Metzler & Manmatha, 2004) | 17 | 24 | 20 | 112
NPDE (Yavlinsky et al., 2005)     | 18 | 21 | 19 | 114
SML (Carneiro et al., 2007)       | 23 | 29 | 26 | 137
MBRM (Feng et al., 2004)          | 24 | 25 | 24 | 122
TGLM (Liu et al., 2009)           | 25 | 29 | 27 | 131
JEC (Makadia et al., 2008)        | 27 | 32 | 29 | 139
TagProp (Guillaumin et al., 2009) | 33 | 42 | 37 | 160
FastTag                           | 32 | 43 | 37 | 166

Table 2. Comparison of FastTag and TagProp in terms of P, R, F1 score and N+ on the ESP game and IAPRTC-12 datasets.

             | ESP game           | IAPR
Name         | P  | R  | F1 | N+  | P  | R  | F1 | N+
leastSquares | 35 | 19 | 25 | 215 | 40 | 19 | 26 | 198
MBRM         | 18 | 19 | 18 | 209 | 24 | 23 | 23 | 223
JEC          | 24 | 19 | 21 | 222 | 29 | 19 | 23 | 211
TagProp      | 39 | 27 | 32 | 238 | 45 | 34 | 39 | 260
FastTag      | 46 | 22 | 30 | 247 | 47 | 26 | 34 | 280

Excerpt from the paper: … [We] report the number of keywords with non-zero recall value (N+). In all metrics a higher value indicates better performance.

Baselines. We compare against leastSquares, a ridge regression model which uses the partial subset of tags y_1, …, y_n as labels to learn W, i.e., FastTag without tag enrichment. We also compare against the TagProp algorithm (Guillaumin et al., 2009), a local kNN method combining different distance metrics through metric learning. It is the current best performer on these benchmark sets. Most existing work do not provide publicly available implementations. …

4.2. Comparison with related work

Table 1 shows a detailed comparison of FastTag to the leastSquares baseline and eight published results on the Corel5K dataset. We can make three observations: 1. The performance of FastTag aligns with that of TagProp (so far the best algorithm in terms of accuracy on this dataset), and significantly outperforms the other methods; 2. The leastSquares baseline, which corresponds to FastTag without the tag enricher, performs surprisingly well compared to existing approaches, which suggests the advantage of a simple model that can extend to a large number of visual descriptors, as opposed to a complex model that can afford fewer descriptors. One may instead more cheaply glean the benefits of a complex model via non-linear transformation of the features. 3. The duo classifier formulation of FastTag, which adds the tag enricher, …
15
Maximum number of tags
16
[Figure 2 (full version). Predicted keywords using FastTag for sample images in the ESP game dataset (using all 268 keywords). Rows show images with high F-1 score, low F-1 score, and randomly chosen images; each image receives five predicted tags, e.g. "bug, green, insect, tree, wood", "baby, doll, dress, green, hair", "blue, earth, globe, map, world".]
Weitere ähnliche Inhalte

Was ist angesagt?

Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...
Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...
Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...홍배 김
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionKazuki Fujikawa
 
Gaussian processing
Gaussian processingGaussian processing
Gaussian processing홍배 김
 
Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)VARUN KUMAR
 
SPIRE2015-Improved Practical Compact Dynamic Tries
SPIRE2015-Improved Practical Compact Dynamic TriesSPIRE2015-Improved Practical Compact Dynamic Tries
SPIRE2015-Improved Practical Compact Dynamic TriesAndreas Poyias
 
Ijmsr 2016-05
Ijmsr 2016-05Ijmsr 2016-05
Ijmsr 2016-05ijmsr
 
Conditional neural processes
Conditional neural processesConditional neural processes
Conditional neural processesKazuki Fujikawa
 
H2O World - Consensus Optimization and Machine Learning - Stephen Boyd
H2O World - Consensus Optimization and Machine Learning - Stephen BoydH2O World - Consensus Optimization and Machine Learning - Stephen Boyd
H2O World - Consensus Optimization and Machine Learning - Stephen BoydSri Ambati
 
Boosted Tree-based Multinomial Logit Model for Aggregated Market Data
Boosted Tree-based Multinomial Logit Model for Aggregated Market DataBoosted Tree-based Multinomial Logit Model for Aggregated Market Data
Boosted Tree-based Multinomial Logit Model for Aggregated Market DataJay (Jianqiang) Wang
 
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative ModelsKenta Oono
 
YaPingPresentation
YaPingPresentationYaPingPresentation
YaPingPresentationYa-Ping Wang
 
All Pair Shortest Path Algorithm – Parallel Implementation and Analysis
All Pair Shortest Path Algorithm – Parallel Implementation and AnalysisAll Pair Shortest Path Algorithm – Parallel Implementation and Analysis
All Pair Shortest Path Algorithm – Parallel Implementation and AnalysisInderjeet Singh
 
Task based Programming with OmpSs and its Application
Task based Programming with OmpSs and its ApplicationTask based Programming with OmpSs and its Application
Task based Programming with OmpSs and its ApplicationFacultad de Informática UCM
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsYoonho Lee
 
Graph Convolutional Neural Networks
Graph Convolutional Neural Networks Graph Convolutional Neural Networks
Graph Convolutional Neural Networks 신동 강
 
Safety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfSafety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfPolytechnique Montréal
 

Was ist angesagt? (19)

Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...
Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...
Automatic Gain Tuning based on Gaussian Process Global Optimization (= Bayesi...
 
NIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph ConvolutionNIPS2017 Few-shot Learning and Graph Convolution
NIPS2017 Few-shot Learning and Graph Convolution
 
Gaussian processing
Gaussian processingGaussian processing
Gaussian processing
 
Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)Lecture 11 (Digital Image Processing)
Lecture 11 (Digital Image Processing)
 
SPIRE2015-Improved Practical Compact Dynamic Tries
SPIRE2015-Improved Practical Compact Dynamic TriesSPIRE2015-Improved Practical Compact Dynamic Tries
SPIRE2015-Improved Practical Compact Dynamic Tries
 
Ijmsr 2016-05
Ijmsr 2016-05Ijmsr 2016-05
Ijmsr 2016-05
 
Conditional neural processes
Conditional neural processesConditional neural processes
Conditional neural processes
 
H2O World - Consensus Optimization and Machine Learning - Stephen Boyd
H2O World - Consensus Optimization and Machine Learning - Stephen BoydH2O World - Consensus Optimization and Machine Learning - Stephen Boyd
H2O World - Consensus Optimization and Machine Learning - Stephen Boyd
 
Boosted Tree-based Multinomial Logit Model for Aggregated Market Data
Boosted Tree-based Multinomial Logit Model for Aggregated Market DataBoosted Tree-based Multinomial Logit Model for Aggregated Market Data
Boosted Tree-based Multinomial Logit Model for Aggregated Market Data
 
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative Models
 
YaPingPresentation
YaPingPresentationYaPingPresentation
YaPingPresentation
 
Lecture4 xing
Lecture4 xingLecture4 xing
Lecture4 xing
 
upgrade2013
upgrade2013upgrade2013
upgrade2013
 
All Pair Shortest Path Algorithm – Parallel Implementation and Analysis
All Pair Shortest Path Algorithm – Parallel Implementation and AnalysisAll Pair Shortest Path Algorithm – Parallel Implementation and Analysis
All Pair Shortest Path Algorithm – Parallel Implementation and Analysis
 
Task based Programming with OmpSs and its Application
Task based Programming with OmpSs and its ApplicationTask based Programming with OmpSs and its Application
Task based Programming with OmpSs and its Application
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation Graphs
 
Pythonic Math
Pythonic MathPythonic Math
Pythonic Math
 
Graph Convolutional Neural Networks
Graph Convolutional Neural Networks Graph Convolutional Neural Networks
Graph Convolutional Neural Networks
 
Safety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfSafety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdf
 

Andere mochten auch

【2016.10】cvpaper.challenge2016
【2016.10】cvpaper.challenge2016【2016.10】cvpaper.challenge2016
【2016.10】cvpaper.challenge2016cvpaper. challenge
 
多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26
多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26
多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26Takashi Abe
 
Introduction to YOLO detection model
Introduction to YOLO detection modelIntroduction to YOLO detection model
Introduction to YOLO detection modelWEBFARMER. ltd.
 
これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...
これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...
これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...cvpaper. challenge
 
論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNN論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNNTakashi Abe
 
Introduction to A3C model
Introduction to A3C modelIntroduction to A3C model
Introduction to A3C modelWEBFARMER. ltd.
 
Introduction to Deep Compression
Introduction to Deep CompressionIntroduction to Deep Compression
Introduction to Deep CompressionWEBFARMER. ltd.
 
ディープラーニング・ハンズオン勉強会161229
ディープラーニング・ハンズオン勉強会161229ディープラーニング・ハンズオン勉強会161229
ディープラーニング・ハンズオン勉強会161229WEBFARMER. ltd.
 
ICML2013読み会 ELLA: An Efficient Lifelong Learning Algorithm
ICML2013読み会 ELLA: An Efficient Lifelong Learning AlgorithmICML2013読み会 ELLA: An Efficient Lifelong Learning Algorithm
ICML2013読み会 ELLA: An Efficient Lifelong Learning AlgorithmYuya Unno
 
ICML2013読み会: Distributed training of Large-scale Logistic models
ICML2013読み会: Distributed training of Large-scale Logistic modelsICML2013読み会: Distributed training of Large-scale Logistic models
ICML2013読み会: Distributed training of Large-scale Logistic modelssleepy_yoshi
 
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM PredictionICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM PredictionSeiya Tokui
 
ICML2013読み会 開会宣言
ICML2013読み会 開会宣言ICML2013読み会 開会宣言
ICML2013読み会 開会宣言Shohei Hido
 
Introduction to GAN model
Introduction to GAN modelIntroduction to GAN model
Introduction to GAN modelWEBFARMER. ltd.
 
Vanishing Component Analysis
Vanishing Component AnalysisVanishing Component Analysis
Vanishing Component AnalysisKoji Matsuda
 
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...Koji Matsuda
 
いまさら聞けない “モデル” の話 @DSIRNLP#5
いまさら聞けない “モデル” の話 @DSIRNLP#5いまさら聞けない “モデル” の話 @DSIRNLP#5
いまさら聞けない “モデル” の話 @DSIRNLP#5Koji Matsuda
 
基調講演:「多様化する情報を支える技術」/西川徹
基調講演:「多様化する情報を支える技術」/西川徹基調講演:「多様化する情報を支える技術」/西川徹
基調講演:「多様化する情報を支える技術」/西川徹Preferred Networks
 
Vanishing Component Analysisの試作と簡単な実験
Vanishing Component Analysisの試作と簡単な実験Vanishing Component Analysisの試作と簡単な実験
Vanishing Component Analysisの試作と簡単な実験Hiroshi Tsukahara
 
Lindsey Stirling
Lindsey StirlingLindsey Stirling
Lindsey StirlingYulia_CB
 

Andere mochten auch (20)

【2016.10】cvpaper.challenge2016
【2016.10】cvpaper.challenge2016【2016.10】cvpaper.challenge2016
【2016.10】cvpaper.challenge2016
 
多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26
多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26
多層NNの教師なし学習 コンピュータビジョン勉強会@関東 2014/5/26
 
Introduction to YOLO detection model
Introduction to YOLO detection modelIntroduction to YOLO detection model
Introduction to YOLO detection model
 
ECCV 2016 まとめ
ECCV 2016 まとめECCV 2016 まとめ
ECCV 2016 まとめ
 
これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...
これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...
これからのコンピュータビジョン技術 - cvpaper.challenge in PRMU Grand Challenge 2016 (PRMU研究会 2...
 
論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNN論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNN
 
Introduction to A3C model
Introduction to A3C modelIntroduction to A3C model
Introduction to A3C model
 
Introduction to Deep Compression
Introduction to Deep CompressionIntroduction to Deep Compression
Introduction to Deep Compression
 
ディープラーニング・ハンズオン勉強会161229
ディープラーニング・ハンズオン勉強会161229ディープラーニング・ハンズオン勉強会161229
ディープラーニング・ハンズオン勉強会161229
 
ICML2013読み会 ELLA: An Efficient Lifelong Learning Algorithm
ICML2013読み会 ELLA: An Efficient Lifelong Learning AlgorithmICML2013読み会 ELLA: An Efficient Lifelong Learning Algorithm
ICML2013読み会 ELLA: An Efficient Lifelong Learning Algorithm
 
ICML2013読み会: Distributed training of Large-scale Logistic models
ICML2013読み会: Distributed training of Large-scale Logistic modelsICML2013読み会: Distributed training of Large-scale Logistic models
ICML2013読み会: Distributed training of Large-scale Logistic models
 
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM PredictionICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
ICML2013読み会 Local Deep Kernel Learning for Efficient Non-linear SVM Prediction
 
ICML2013読み会 開会宣言
ICML2013読み会 開会宣言ICML2013読み会 開会宣言
ICML2013読み会 開会宣言
 
Introduction to GAN model
Introduction to GAN modelIntroduction to GAN model
Introduction to GAN model
 
Vanishing Component Analysis
Vanishing Component AnalysisVanishing Component Analysis
Vanishing Component Analysis
 
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...
 
いまさら聞けない “モデル” の話 @DSIRNLP#5
いまさら聞けない “モデル” の話 @DSIRNLP#5いまさら聞けない “モデル” の話 @DSIRNLP#5
いまさら聞けない “モデル” の話 @DSIRNLP#5
 
基調講演:「多様化する情報を支える技術」/西川徹
基調講演:「多様化する情報を支える技術」/西川徹基調講演:「多様化する情報を支える技術」/西川徹
基調講演:「多様化する情報を支える技術」/西川徹
 
Vanishing Component Analysisの試作と簡単な実験
Vanishing Component Analysisの試作と簡単な実験Vanishing Component Analysisの試作と簡単な実験
Vanishing Component Analysisの試作と簡単な実験
 
Lindsey Stirling
Lindsey StirlingLindsey Stirling
Lindsey Stirling
 

Ähnlich wie 論文紹介 Fast imagetagging

A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsDevansh16
 
論文紹介:Learning With Neighbor Consistency for Noisy Labels
論文紹介:Learning With Neighbor Consistency for Noisy Labels論文紹介:Learning With Neighbor Consistency for Noisy Labels
論文紹介:Learning With Neighbor Consistency for Noisy LabelsToru Tamaki
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Ono Shigeru
 
20140530.journal club
20140530.journal club20140530.journal club
20140530.journal clubHayaru SHOUNO
 
Digital electronics k map comparators and their function
Digital electronics k map comparators and their functionDigital electronics k map comparators and their function
Digital electronics k map comparators and their functionkumarankit06875
 
SinGAN for Image Denoising
SinGAN for Image DenoisingSinGAN for Image Denoising
SinGAN for Image DenoisingKhalilBergaoui
 
GPU_Based_Image_Compression_and_Interpolation_with_Anisotropic_Diffusion
GPU_Based_Image_Compression_and_Interpolation_with_Anisotropic_DiffusionGPU_Based_Image_Compression_and_Interpolation_with_Anisotropic_Diffusion
GPU_Based_Image_Compression_and_Interpolation_with_Anisotropic_DiffusionVartika Sharma
 
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...IJERA Editor
 
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...Determination of Optimal Product Mix for Profit Maximization using Linear Pro...
Determination of Optimal Product Mix for Profit Maximization using Linear Pro...IJERA Editor
 
A Novel PSNR-B Approach for Evaluating the Quality of De-blocked Images
A Novel PSNR-B Approach for Evaluating the Quality of De-blocked Images A Novel PSNR-B Approach for Evaluating the Quality of De-blocked Images
A Novel PSNR-B Approach for Evaluating the Quality of De-blocked Images IOSR Journals
 
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Chiheb Ben Hammouda
 
Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4Fabian Pedregosa
 
A Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment ProblemA Parallel Branch And Bound Algorithm For The Quadratic Assignment Problem
A Parallel Branch And Bound Algorithm For The Quadratic Assignment ProblemMary Calkins
 
An Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingAn Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingCSCJournals
 
Lego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsLego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsMathieu Dutour Sikiric
 

Ähnlich wie 論文紹介 Fast imagetagging (20)

A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
 
論文紹介:Learning With Neighbor Consistency for Noisy Labels
Paper introduction: Fast Image Tagging

  • 1. Fast Image Tagging
    M. Chen (Amazon.com), A. Zheng (MSR, Redmond), and K. Weinberger (Washington Univ.), ICML 2013
    ICML2013 reading group, July 9, 2013
    Preferred Infrastructure, Inc.
    Takashi Abe <tabe@preferred.jp>
  • 2. About me
     Takashi Abe
     Twitter: @tabe2314
     Okatani Lab, Tohoku University (computer vision) → PFI intern → new PFI employee
  • 3. The paper
    M. Chen, A. Zheng and K. Weinberger. Fast Image Tagging. ICML, 2013.
    ※ The figures and tables in these slides are taken from this paper.
  • 4. Image Tagging (1)
     Estimate the set of relevant tags (usually several) for a given image
     Training: input: {(image, tags), …}; output: a mapping image → tag set
     Testing: input: an image; output: the estimated tag set
    (Example images with tags "bear, polar, snow, tundra" and "buildings, clothes, shops, street")
  • 5. Image Tagging (2): what makes it hard?
     The effective features differ from object to object → want to combine many kinds of features (e.g. color vs. edges)
     Large variation in appearance → want to train on large datasets
     Incomplete annotation data
     Precision may be acceptable, but only low-recall data is available
     (some of the true tags are simply missing from the annotations)
     Example: Flickr tags
     Highly skewed tag frequency distribution
  • 7. Basic idea
     Complete the annotated tag sets while learning a (linear) mapping from images to the completed tags
     B: tag set → completed tag set
     W: image features → completed tag set
     Learning: train both classifiers simultaneously and force their outputs to agree by minimizing

        (1/n) Σ_{i=1}^{n} ‖B y_i − W x_i‖²    (1)

    (Figure 1. Schematic illustration of FastTag: during training the two classifiers B and W are forced to predict similar results; at test time a simple linear mapping x → W x predicts the tags.)
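The co-regularized objective (1) can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the ridge constant gamma in solve_W is a hypothetical addition to make the subproblem well-posed, and the toy data shapes are my own.

```python
import numpy as np

def coregularization_loss(B, W, X, Y):
    """Eq. (1): mean squared disagreement between the tag enricher's
    output B y_i and the image classifier's output W x_i.
    Columns of X (features) and Y (tags) are the training examples."""
    D = B @ Y - W @ X
    return float(np.mean(np.sum(D ** 2, axis=0)))

def solve_W(B, X, Y, gamma=1e-3):
    """With B held fixed, Eq. (1) plus a small ridge penalty on W is
    ordinary least squares: W = B Y X^T (X X^T + gamma I)^(-1).
    gamma is a hypothetical regularization constant, not from the slides."""
    d = X.shape[0]
    return B @ Y @ X.T @ np.linalg.inv(X @ X.T + gamma * np.eye(d))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))                # 4-dim features, 6 images
Y = (rng.random((3, 6)) < 0.4).astype(float)   # 3 tags, incomplete labels
B = np.eye(3)                                  # start from the identity enricher
W = solve_W(B, X, Y)
```

Each subproblem is convex, so in an alternating scheme like this every update has a closed-form solution, which is where the "fast" in FastTag comes from.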
  • 8. Marginalized blank-out regularization (1)
     Minimizing (1) alone has the trivial solution B = 0 = W, so B needs additional constraints
     B should map the annotation y_i to the true tag set z_i
     z_i is unavailable, so instead build corrupted copies ỹ_i by removing (setting to zero) each entry of y_i independently with probability p, i.e. p(ỹ_t = 0) = p and p(ỹ_t = y_t) = 1 − p, and train B to reconstruct y_i from ỹ_i:

        B = argmin_B (1/n) Σ_{i=1}^{n} ‖y_i − B ỹ_i‖²

     Each row of B is then an ordinary least squares regressor that predicts the presence of one tag given the existing tags in ỹ
     Considering repeated sampling of ỹ (in the limit of infinitely many corrupted versions of y), the expected reconstruction error under the corrupting distribution is

        r(B) = (1/n) Σ_{i=1}^{n} E[‖y_i − B ỹ_i‖²]_{p(ỹ_i | y_i)}    (2)
  • 9. Marginalized blank-out regularization (2)
     Rewriting (2): with Y ≡ [y_1, …, y_n] (partial labels in the columns), P ≡ Σ_{i=1}^{n} y_i E[ỹ_i]ᵀ and Q ≡ Σ_{i=1}^{n} E[ỹ_i ỹ_iᵀ],

        r(B) = (1/n) trace(B Q Bᵀ − 2 P Bᵀ + Y Yᵀ)    (3)

     For the uniform blank-out noise above, E[ỹ]_{p(ỹ|y)} = (1 − p) y, so the corrupted copies ỹ never need to be generated explicitly; only p has to be chosen
     This expected reconstruction error is added to the loss function as a regularizer on B
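Because only the moments P and Q enter Eq. (3), the regularizer can be evaluated, and minimized, in closed form. A minimal numpy sketch (the binary toy data and variable names are my own):

```python
import numpy as np

def blankout_moments(Y, p):
    """Expectations under blank-out corruption, where each tag entry is
    zeroed independently with probability p: E[y~] = (1 - p) y, so the
    off-diagonal entries of E[y~ y~^T] scale by (1 - p)^2 and the
    diagonal entries by (1 - p).  Columns of Y are the tag vectors."""
    S = Y @ Y.T                               # sum_i y_i y_i^T
    P = (1 - p) * S                           # sum_i y_i E[y~_i]^T
    Q = (1 - p) ** 2 * S
    np.fill_diagonal(Q, (1 - p) * np.diag(S))
    return P, Q

def r(B, Y, p):
    """Expected reconstruction error of Eq. (3); no corrupted copies of
    the tag vectors are ever sampled."""
    P, Q = blankout_moments(Y, p)
    n = Y.shape[1]
    return float(np.trace(B @ Q @ B.T - 2 * P @ B.T + Y @ Y.T)) / n

Y = np.array([[1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)     # 3 tags, 4 images
p = 0.3
P, Q = blankout_moments(Y, p)
B = P @ np.linalg.inv(Q)                      # closed-form minimizer of r
```

Setting the gradient of (3) to zero gives B = P Q⁻¹; r is convex in B because Q is an expectation of outer products and hence positive semidefinite.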
  • 11. Extensions
     Tag bootstrapping
     B is learned from tag co-occurrence, so tags that are similar but rarely co-occur are not completed (e.g. lake and pond)
     Stacking
     Retrain with B y_i as the new annotations, and repeat
     Intuitively, this propagates co-occurrence relations
     The number of stacks is determined experimentally
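The stacking loop above can be sketched as a scaffold. Here train_fasttag is a placeholder for one FastTag training round, assumed to return the pair (B, W); this is my own illustration of the loop structure, not the authors' implementation.

```python
import numpy as np

def fasttag_stacked(X, Y, n_stacks, train_fasttag):
    """Stacking as described on the slide: after each training round the
    enriched tags B @ Y replace the annotations and training repeats.
    train_fasttag stands in for one FastTag training round and is assumed
    to return the pair (B, W); the slide fixes n_stacks experimentally."""
    for _ in range(n_stacks):
        B, W = train_fasttag(X, Y)
        Y = B @ Y                  # propagate tag co-occurrence
    return B, W, Y
```

Each round lets tags reach neighbors-of-neighbors in the co-occurrence graph, which is why a small experimentally chosen stack count suffices.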
  • 12. Image features
     A combination of several features (the same set as in prior work)
     GIST
     6 kinds of color histograms
     Bag-of-words over 8 kinds of local features
     The features are mapped beforehand into a space where the inner product approximates the χ² kernel
     Vedaldi, A. and Zisserman, A. Efficient additive kernels via explicit feature maps. PAMI, 34(3):480–492, 2012.
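The explicit-feature-map step can be sketched as follows. This is a minimal numpy version of the homogeneous kernel map idea from Vedaldi & Zisserman, not the paper's exact preprocessing: the sampling interval L, the number of frequencies n_freq, and the spectrum sech(πω) for the additive χ² kernel k(x, y) = Σ_d 2 x_d y_d / (x_d + y_d) are the standard choices for that construction, picked here for illustration.

```python
import numpy as np

def chi2_feature_map(x, n_freq=4, L=0.5):
    """Finite-dimensional feature map whose inner product approximates the
    additive chi-squared kernel k(x, y) = sum_d 2 x_d y_d / (x_d + y_d),
    in the spirit of Vedaldi & Zisserman's homogeneous kernel maps.
    Each nonnegative input dimension becomes 2 * n_freq - 1 features."""
    x = np.asarray(x, dtype=float)
    logx = np.log(np.where(x > 0, x, 1.0))    # zero entries map to zero features
    feats = [np.sqrt(L * x)]                  # omega = 0 term, sech(0) = 1
    for j in range(1, n_freq):
        omega = j * L
        kappa = 1.0 / np.cosh(np.pi * omega)  # spectrum of the chi^2 kernel
        amp = np.sqrt(2.0 * L * kappa * x)
        feats.append(amp * np.cos(omega * logx))
        feats.append(amp * np.sin(omega * logx))
    return np.concatenate(feats)

# the mapped inner product approximates the chi^2 kernel value
x = np.array([0.3, 0.1, 0.6])
y = np.array([0.7, 0.2, 0.1])
approx = float(chi2_feature_map(x) @ chi2_feature_map(y))
exact = float(np.sum(2 * x * y / (x + y)))
```

After this mapping, the linear classifier W effectively behaves like a χ²-kernel classifier while training stays linear in the data, which matches the paper's emphasis on scalability.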
  • 14. Accuracy
     leastSquares: FastTag without tag completion; the baseline
     TagProp: the previous state of the art; training O(n²), testing O(n)
     FastTag's accuracy is almost the same as TagProp's

    Table 1. Comparison of FastTag and TagProp in terms of P, R, F1 score and N+ on the Corel5K dataset (higher is better in all metrics; N+ is the number of keywords with nonzero recall; previously reported results are included for reference):

    Name                                 P   R   F1  N+
    leastSquares                         29  32  30  125
    CRM (Lavrenko et al., 2003)          16  19  17  107
    InfNet (Metzler & Manmatha, 2004)    17  24  20  112
    NPDE (Yavlinsky et al., 2005)        18  21  19  114
    SML (Carneiro et al., 2007)          23  29  26  137
    MBRM (Feng et al., 2004)             24  25  24  122
    TGLM (Liu et al., 2009)              25  29  27  131
    JEC (Makadia et al., 2008)           27  32  29  139
    TagProp (Guillaumin et al., 2009)    33  42  37  160
    FastTag                              32  43  37  166

    Table 2. Comparison of FastTag and TagProp in terms of P, R, F1 score and N+ on the ESP game and IAPRTC-12 datasets:

                  ESP game            IAPR
    Name          P   R   F1  N+      P   R   F1  N+
    leastSquares  35  19  25  215     40  19  26  198
    MBRM          18  19  18  209     24  23  23  223
    JEC           24  19  21  222     29  19  23  211
    TagProp       39  27  32  238     45  34  39  260
    FastTag       46  22  30  247     47  26  34  280

    From the paper: leastSquares is a ridge regression model that learns W from the partial tags y_1, …, y_n, i.e. FastTag without tag enrichment; TagProp (Guillaumin et al., 2009) is a local kNN method combining different distance metrics through metric learning, and was until then the best performer on these benchmarks. Observations: 1. FastTag's performance aligns with that of TagProp (so far the most accurate algorithm on these datasets) and significantly outperforms the other methods. 2. The leastSquares baseline performs surprisingly well compared to existing approaches, which suggests the advantage of a simple model that can extend to a large number of visual descriptors, as opposed to a complex model that can afford fewer.
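The metrics in the tables can be computed as follows; this is a minimal sketch of the per-keyword precision/recall protocol common in this literature (function and variable names are my own, and scoring undefined ratios as zero is an assumed convention).

```python
import numpy as np

def tagging_metrics(Y_true, Y_pred):
    """Per-keyword precision and recall as reported in the image
    annotation literature: metrics are computed for each keyword
    (column) and averaged; N+ counts keywords with nonzero recall.
    Keywords that are never predicted (or never present) score zero."""
    tp = (Y_true & Y_pred).sum(axis=0).astype(float)
    pred = Y_pred.sum(axis=0)
    true = Y_true.sum(axis=0)
    prec = np.divide(tp, pred, out=np.zeros_like(tp), where=pred > 0)
    rec = np.divide(tp, true, out=np.zeros_like(tp), where=true > 0)
    P, R = float(prec.mean()), float(rec.mean())
    F1 = 2 * P * R / (P + R) if P + R > 0 else 0.0
    n_plus = int((rec > 0).sum())
    return P, R, F1, n_plus

# two images, three keywords (rows = images, columns = keywords)
Y_true = np.array([[1, 0, 1],
                   [0, 1, 1]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 1]])
P, R, F1, n_plus = tagging_metrics(Y_true, Y_pred)
```

Averaging over keywords rather than images is what makes N+ informative: a method that only predicts frequent tags gets a low N+ even if image-level accuracy looks good.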
  • 16. Results on the ESP game dataset
    (Figure 2 from the paper: predicted keywords using FastTag for sample images in the ESP game dataset, using all 268 keywords; examples are grouped into random, high-F1-score, and low-F1-score images, each with five predicted keywords such as "bug, green, insect, tree, wood" or "blue, cloud, ocean, sky, water".)