Presenter: Mathias Lux, Alpen-Adria-Universität Klagenfurt, Austria
Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_8.pdf
Video: https://youtu.be/-jv9NO5pBhk
Authors: Stefan Petscharnig, Klaus Schöffmann, Mathias Lux
Abstract: In this working note, we describe our approach to gastrointestinal disease and anatomical landmark classification for the Medico task at MediaEval 2017. We propose an inception-like CNN architecture and a fixed-crop data augmentation scheme for training and testing. The architecture is based on GoogLeNet and designed to keep the number of trainable parameters and its computational overhead small. Preliminary experiments show that the architecture is able to learn the classification problem from scratch using a tiny fraction of the provided training data only.
2. Overview
● Deep Learning based approach
● Inception-like structure
● Extending the training set
● Results
3. Deep Learning - The Why
Compare NVIDIA's recent blog post about MICCAI (Quebec, CA)
● Global health care spending is around 6.5 trillion USD
● Of 800 submissions to MICCAI 2017
○ 60% of them focus on machine learning
○ 80% of the above are about deep learning
src. https://blogs.nvidia.com/blog/2017/09/11/medical-imaging-at-miccai/
4. Deep Learning - The How
● Training of a new network based on the design of GoogLeNet
○ Using an inception-like CNN architecture
○ Small number of parameters and small computational overhead
● Seeing how far we can go with the few training samples
● Experiments with two models and different training set sizes
5. Inception-like Approach
● Inception module allows for different layers in parallel
○ 1x1 convolution branch is left out compared to GoogLeNet; it had no effect in our experiments
● Training should automatically favor the branch that best fits the training data
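To make the effect of dropping the 1x1 branch concrete, the following is a minimal sketch that counts the trainable parameters of one inception module with and without that branch. The slides do not give the authors' channel widths, so the widths of GoogLeNet's inception (3a) module are used as a stand-in, and the remaining three branches are assumed to follow GoogLeNet's layout. The helper names `conv_params` and `inception_params` are hypothetical.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a kernel x kernel convolution layer."""
    return in_ch * out_ch * kernel * kernel + out_ch


def inception_params(in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
    """Per-branch parameter counts of one inception module.

    Passing c1=0 leaves out the plain 1x1 convolution branch.
    """
    branches = {
        # 1x1 reduction followed by a 3x3 convolution
        "3x3": conv_params(in_ch, c3_reduce, 1) + conv_params(c3_reduce, c3, 3),
        # 1x1 reduction followed by a 5x5 convolution
        "5x5": conv_params(in_ch, c5_reduce, 1) + conv_params(c5_reduce, c5, 5),
        # max pooling (no parameters) followed by a 1x1 projection
        "pool": conv_params(in_ch, pool_proj, 1),
    }
    if c1 > 0:
        branches["1x1"] = conv_params(in_ch, c1, 1)
    return branches


# Assumption: GoogLeNet inception (3a) widths, not the authors' configuration.
full = inception_params(192, 64, 96, 128, 16, 32, 32)
reduced = inception_params(192, 0, 96, 128, 16, 32, 32)

print("with 1x1 branch:   ", sum(full.values()))
print("without 1x1 branch:", sum(reduced.values()))
```

With these stand-in widths the 1x1 branch accounts for only a small share of the module's parameters, since most of them sit in the 3x3 branch.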
6. Augmenting the Training Set
● Seven different cropping schemes
● Random mirroring
● Extraction at 3 different scales
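The cropping and mirroring steps can be sketched as follows. The slides only state that seven cropping schemes are used, so the seven crop positions in `CROP_ANCHORS` are an assumption chosen for illustration, as are all function names. Images are plain nested lists here to keep the sketch dependency-free, and rescaling to the three scales is left out.

```python
import random

# Assumption: seven hypothetical crop positions, given as (vertical, horizontal)
# fractions of the free margin. The slides do not name the actual schemes.
CROP_ANCHORS = {
    "top_left": (0.0, 0.0),
    "top_right": (0.0, 1.0),
    "bottom_left": (1.0, 0.0),
    "bottom_right": (1.0, 1.0),
    "center": (0.5, 0.5),
    "center_left": (0.5, 0.0),
    "center_right": (0.5, 1.0),
}


def crop_box(height, width, size, anchor):
    """Return (top, left, bottom, right) of a size x size window."""
    frac_y, frac_x = anchor
    top = round((height - size) * frac_y)
    left = round((width - size) * frac_x)
    return top, left, top + size, left + size


def crop(image, box):
    """Cut the window described by box out of a row-major nested list."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def mirror(image):
    """Flip an image horizontally."""
    return [row[::-1] for row in image]


def augment(image, size, rng):
    """Return one patch per crop position, each mirrored with probability 0.5."""
    height, width = len(image), len(image[0])
    patches = []
    for name, anchor in CROP_ANCHORS.items():
        patch = crop(image, crop_box(height, width, size, anchor))
        if rng.random() < 0.5:
            patch = mirror(patch)
        patches.append((name, patch))
    return patches


# Toy 6x8 "image" whose pixel values encode their own position.
toy_image = [[r * 10 + c for c in range(8)] for r in range(6)]
toy_patches = augment(toy_image, 4, random.Random(0))
print(len(toy_patches), "patches of size",
      len(toy_patches[0][1]), "x", len(toy_patches[0][1][0]))
```

Because the crop positions are fixed, the same seven windows can be reused at test time, which matches the fixed-crop scheme for training and testing mentioned in the abstract.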
7. Results: Confusion
● Similar confusion in all models
○ dyed-resection-margins with dyed-lifted-polyps
○ polyps with ulcerative-colitis
○ hypothesis: crops are the reason, as polyps and resection margins are not always visible in center crops
● Minor weaknesses at distinguishing normal-z-line from esophagitis
● Experiments with binary classification CNNs and global features did not yield better results
9. Results: Runtime
● Measurement of forward passes over 1000 iterations (GTX Titan X)
○ seven forward passes needed for one prediction
● Model A takes 2.25 ms per forward pass
● Models B1024 and B2048 take 2.91 ms and 3.42 ms
● Rather fast compared to
○ CaffeNet (an AlexNet variant) - 3.27 ms
○ GoogLeNet - 14.16 ms
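Since one prediction needs seven forward passes, the per-pass timings translate into per-prediction latency as in the short sketch below. It uses only the figures from this slide; the function names are hypothetical, and it assumes the seven passes run one after another without batching.

```python
PASSES_PER_PREDICTION = 7  # one forward pass per crop, as stated on the slide

# Forward-pass times in milliseconds, taken from the slide.
forward_ms = {
    "Model A": 2.25,
    "Model B1024": 2.91,
    "Model B2048": 3.42,
}


def prediction_ms(pass_ms):
    """Latency of one prediction, assuming the passes run sequentially."""
    return PASSES_PER_PREDICTION * pass_ms


def predictions_per_second(pass_ms):
    """Throughput implied by the sequential per-prediction latency."""
    return 1000.0 / prediction_ms(pass_ms)


for name, pass_ms in forward_ms.items():
    print(f"{name}: {prediction_ms(pass_ms):.2f} ms per prediction, "
          f"{predictions_per_second(pass_ms):.1f} predictions/s")
```

Under this assumption Model A needs 15.75 ms per prediction. The slide does not say how many passes the CaffeNet and GoogLeNet baselines use per prediction, so their timings are compared per forward pass only.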
10. Results: Numbers
● Model is learned from scratch
● Only a fraction of the training data already yields results
11. Conclusions
Preliminary experiments show that the architecture is able to learn the classification problem from scratch using a tiny fraction of the provided training data only.
Adding the global features did not result in increased classification performance in our experiments.