SlideShare ist ein Scribd-Unternehmen logo
1 von 37
Downloaden Sie, um offline zu lesen
Performance Evaluation of GANs
in a semi-supervised OCR Use Case
Florian Wilhelm London, 2018-10-11
Special Interests
• Mathematical Modelling
• Recommendation Systems
• Data Science in Production
• Python Data Stack
• Maintainer of PyScaffold
Dr. Florian Wilhelm
Principal Data Scientist @ inovex
@FlorianWilhelm
FlorianWilhelm
florianwilhelm.info
2
Florian Tanten
Master Thesis @ inovex
October 2017 - May 2018
IT-project house for digital transformation:
‣ Agile Development & Management
‣ Web · UI/UX · Replatforming · Microservices
‣ Mobile · Apps · Smart Devices · Robotics
‣ Big Data & Business Intelligence Platforms
‣ Data Science · Data Products · Search · Deep Learning
‣ Data Center Automation · DevOps · Cloud · Hosting
‣ Trainings & Coachings
Using technology to inspire our
clients. And ourselves.
inovex offices in
Karlsruhe · Cologne · Munich ·
Pforzheim · Hamburg · Stuttgart.
www.inovex.de
4
Agenda
1. Use Case
2. Text Spotting
3. Data and Pipeline
4. Generative Adversarial Networks
5. Semi-supervised Learning
6. Results
5https://www.autocheck.com/vehiclehistory/autocheck/en/vinbasics
Vehicle Identification Number (VIN)
Unique identifier
like a fingerprint of a
vehicle
serial number
country security code
model year
assembly
plant
details
flexible fuel vehicles
manufacturer
6
Use Case
VIN:
WF0DXXGAKDEJ37385
VIN-Decoder Manufacturer: BMW
Model: X3
Year: 2013-03-21
Engine power: 143 PS
Equipment:
- Xenon Lights
...
Information about the car:
Spotting the vehicle identification number (VIN) in images
of vehicle registration documents
7
OCR -Libraries
PyOCR
Commercial software Open source tools
8
„VSSZZZGJZHR03G533“
???
+
OCR with Tesseract
9
Agenda
1. Use Case
2. Text Spotting
3. Data and Pipeline
4. Generative Adversarial Networks
5. Semi-supervised Learning
6. Results
Character detection & extraction Character recognition
11Girshick et al. (2014), „Region-Based Convolutional Networks for Accurate Object Detection and Segmentation“
Methodology in Text Spotting
Sliding Window
Computer Vision
Tools
Others
- Connected components
- Stroke width transform
- Edge detection
- SVM
- Learning with HOG
- CNN
- Region proposal
- Hypotheses CNN pooling
Character or word
CNN
CNN + RNN
SVM
Nearest Neighbor
High-performer
current studies
CNN = Convolutional Neural Network
SVM = Support Vector Machine
HOG = Histogram of oriented Gradients
RNN = Recurrent Neural Networks
RL = Reinforcement Learning
379Character
Recognition
...
Spotting = Detection + Recognition
12https://en.wikipedia.org/wiki/Convolutional_neural_network; http://intellabs.github.io/ParallelJavaScript/
Convolutional Neural Network
Max pooling with a 2x2 filter and stride = 2Convolution with 3x3 kernel and stride = 1
14
Agenda
1. Use Case
2. Data and Pipeline
3. Semi-supervised Learning
4. Generative Adversarial Networks
5. Semi-supervised Learning
6. Results
15
Objectives
- ~170 images of vehicle registration documents
b) Semi-supervised method
a) Supervised method
2. Comparison of classifiers
1. Implementation of a prototype „XLG0H200NA0A10348“
Dataset:
Text
Spotting
16
End-to-End Text Spotting Pipeline
Sliding window
Character Detector (2 classes)
Chararacter Recognizer (36 classes)
Only one window per character
All windows
Non Maximum Suppression
All windows with characters
Region of Interest Extractor
Image depicting only VIN
X L G 0 H 2 0 N A 10 04 43 80
17
Small Dataset
What to do about that?
1. Data Generation
2. Data Augmentation
18
Data Augmentation
Data augmentation:
Datasets:
Original image labeled manually as „0“
2 classes 36 classes
Chararacter Recognizer (36 classes)
Label: „0“
Character Detector (2 classes)
Label: „character“
Label: „no character“
19
170 images of
vehicle registration documents
Training set
85 images 85 images
Training sets of classifiers Testing sets of classifiers Testing sets of pipeline
85 images
RecognizerDetector
~ 42000 images
2 classes
~ 8000 images
36 classes
~ 42000 images
2 classes
~ 8000 images
36 classes
RecognizerDetector
Data Augmentation Data Augmentation
Testing set
Datasets
20
Classifiers
1. Supervised Convolutional Neural Network
2. Semi-supervised Generative Adversarial Network Generator Discriminator
Input Feature extraction Classification
21
Agenda
1. Use Case
2. Text Spotting
3. Data and Pipeline
4. Generative Adversarial Networks
5. Semi-supervised Learning
6. Results
22
Yann LeCun
Director of Facebook AI Research, Prof at NYU
“... (GANs) and the variations
that are now being proposed
is the most interesting idea
in the last 10 years in ML, in
my opinion.“
Ian J. Goodfellow
@ Google Brain
23
Generative Adversarial Network
Generator (G) Discriminator (D)
Goal: Generate images, which seem to
be realistic
Goal: Differentiate between fake and real
images
24
Generative Adversarial Network
Generator (G)
Discriminator (D) Is D
correct?
„D classified the generated image as 10% real“
„yes“
A
B
.
.
.
8
9
F
Real imagesReal labeled images
25Goodfellow et al. (2014), Generative Adversarial Networks
Mathematical formulation
Discriminator output
for real images
Discriminator output
for fake images
Discriminator calculates likelihood [0,1] for an image being real
Maximizing discriminator loss
Minimizing generator loss
Objective function
Training (alternating)
26
Example of generated images
Training images: Generated images during learning process:
27
Agenda
1. Use Case
2. Text Spotting
3. Data and Pipeline
4. Generative Adversarial Networks
5. Semi-supervised Learning
6. Results
28
Semi-supervised Learning
Supervised
Learning
Unsupervised
Learning
Semi-supervised
Learning
• Makes use of
unlabeled data
• Combines supervised
and unsupervised
learning
29
Semi-supervised GAN for Character Detection
Real labeled images
Real unlabeled
images
Generator
Discriminator
30
Agenda
1. Use Case
2. Text Spotting
3. Data and Pipeline
4. Generative Adversarial Networks
5. Semi-supervised Learning
6. Results
31
Character Detector (2 classes)
60,00%
70,00%
80,00%
90,00%
100,00%
20 50 100 200 400 700 1000 5000 15000 30000 42000
DCNN DCNN pretrained
„Character“ „No character“
Manually generated images with CAPTCHA methods
Pretraining of DCNN
Size of labeled training set
Accuracy
Bildschirmfoto 2018-04-24 um 17.48.20Bildschirmfoto 2018-04-24 um 17.48.20
32
Character Detector (2 classes)
60,00%
70,00%
80,00%
90,00%
100,00%
20 50 100 200 400 700 1000 5000 15000 30000 42000
DCNN DCNN pretrained Supervised GAN
Generator
Discriminator
Real labeled
images
C
C
F
C C
F
Supervised GAN
Size of labeled training set
Accuracy
Bildschirmfoto 2018-04-24 um 17.48.20
Bildschirmfoto 2018-04-24 um
17.48.20
33
Character Detector (2 classes)
60,00%
70,00%
80,00%
90,00%
100,00%
20 50 100 200 400 700 1000 5000 15000 30000 42000
DCNN DCNN pretrained Supervised GAN Semi-supervised GAN
Discriminator
C
C
F
Generator
F
Real labeled
images
CC
Real unlabeled
images
Semi-supervised GAN
Size of labeled training set
Accuracy
Bildschirmfoto 2018-04-24 um 17.48.20
34
Character Recognizer (36 classes)
0,00%
10,00%
20,00%
30,00%
40,00%
50,00%
60,00%
70,00%
80,00%
90,00%
100,00%
36 72 108 200 300 400 600 800 1000 5000 8000
60,00%
70,00%
80,00%
90,00%
100,00%
20
50
100
200
400
700
1000
5000
15000
30000
42000
DCNN DCNN pretrained SupervisedGAN
Character DetectorCharacter Recognizer
Size of labeled training set
Accuracy
Size of labeled training set
Accuracy
Bildschirmfoto 2018-04-24 um 17.48.20
..
35
End-to-End Text Spotting Pipeline
Sliding window
Character Detector (2 classes)
Chararacter Recognizer (36 classes)
Non Maximum Suppression
Region of Interest Extractor
Accuracy = 99.94%
85 images
1.
2.
85.
.
36
Google Cloud Vision API
Sliding window
Character Detector (2 classes)
Chararacter Recognizer (36 classes)
Non Maximum Suppression
Region of Interest Extractor
85 images
∅ Levenshtein distance = 4.49
85 images of VINs
..
.
Our ApproachGoogle Cloud Vision API vs.
∅ Levenshtein distance = 0.011
Levenshtein distance:
Classification Label
AYZ33 XYZ321 = 3
37
Key Learnings
• Custom solutions can tremendously outperform
off-the-shelve software in a specific use-case
• Semi-supervised GANs can be successfully
applied in use-cases with little data
• With simple data augmentation techniques
having only little data might be enough
38
Bibliography
- Krizhevsky et al. (2012) „ImageNet Classication with Deep Convolutional Neural Networks“
- Girshick et al. (2014), „Region-Based Convolutional Networks for Accurate Object Detection and Segmentation“
- Girshick et al. (2015), „Fast R-CNN“
- Girshick et al. (2015), „Faster R-CNN“
- He et al. (2017), „Mask-R-CNN“
- Goodfellow et al. (2014) „Generative Adversarial Networks"
Thank you!
Florian Wilhelm
Principal Data Scientist
inovex GmbH
Schanzenstraße 6-20
Kupferhütte 1.13
51063 Köln
florian.wilhelm@inovex.de

Weitere ähnliche Inhalte

Was ist angesagt?

A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)Thomas da Silva Paula
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Universitat Politècnica de Catalunya
 
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Bartlomiej Twardowski
 
Bayesian Networks with R and Hadoop
Bayesian Networks with R and HadoopBayesian Networks with R and Hadoop
Bayesian Networks with R and HadoopOfer Mendelevitch
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningMarc Bolaños Solà
 
Deep Advances in Generative Modeling
Deep Advances in Generative ModelingDeep Advances in Generative Modeling
Deep Advances in Generative Modelingindico data
 
BRV CTO Summit Deep Learning Talk
BRV CTO Summit Deep Learning TalkBRV CTO Summit Deep Learning Talk
BRV CTO Summit Deep Learning TalkDoug Chang
 
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imagingScikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imagingGael Varoquaux
 

Was ist angesagt? (9)

A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
 
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
 
Bayesian Networks with R and Hadoop
Bayesian Networks with R and HadoopBayesian Networks with R and Hadoop
Bayesian Networks with R and Hadoop
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal Learning
 
SEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial NetworkSEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial Network
 
Deep Advances in Generative Modeling
Deep Advances in Generative ModelingDeep Advances in Generative Modeling
Deep Advances in Generative Modeling
 
BRV CTO Summit Deep Learning Talk
BRV CTO Summit Deep Learning TalkBRV CTO Summit Deep Learning Talk
BRV CTO Summit Deep Learning Talk
 
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imagingScikit-learn and nilearn: Democratisation of machine learning for brain imaging
Scikit-learn and nilearn: Democratisation of machine learning for brain imaging
 

Ähnlich wie Performance evaluation of GANs in a semisupervised OCR use case

Machine learning quality for production
Machine learning quality for productionMachine learning quality for production
Machine learning quality for productionyusuke shibui
 
Machine Vision On Embedded Platform
Machine Vision On Embedded Platform Machine Vision On Embedded Platform
Machine Vision On Embedded Platform Omkar Rane
 
Machine vision Application
Machine vision ApplicationMachine vision Application
Machine vision ApplicationAbhishek Sainkar
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵CHENHuiMei
 
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Provectus
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptxArthur240715
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Databricks
 
[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...
[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...
[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...Naoki (Neo) SATO
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 
Computer vision for transportation
Computer vision for transportationComputer vision for transportation
Computer vision for transportationWanjin Yu
 
Machine Learning AND Deep Learning for OpenPOWER
Machine Learning AND Deep Learning for OpenPOWERMachine Learning AND Deep Learning for OpenPOWER
Machine Learning AND Deep Learning for OpenPOWERGanesan Narayanasamy
 
Intelligent Thumbnail Selection
Intelligent Thumbnail SelectionIntelligent Thumbnail Selection
Intelligent Thumbnail SelectionKamil Sindi
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningHoa Le
 
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...polochau
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksDataBench
 
20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - AmsterdamAllen Day, PhD
 

Ähnlich wie Performance evaluation of GANs in a semisupervised OCR use case (20)

AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
Machine learning quality for production
Machine learning quality for productionMachine learning quality for production
Machine learning quality for production
 
Machine Vision On Embedded Platform
Machine Vision On Embedded Platform Machine Vision On Embedded Platform
Machine Vision On Embedded Platform
 
Machine vision Application
Machine vision ApplicationMachine vision Application
Machine vision Application
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
RaymondResume2015v5
RaymondResume2015v5RaymondResume2015v5
RaymondResume2015v5
 
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
 
Monitoring AI with AI
Monitoring AI with AIMonitoring AI with AI
Monitoring AI with AI
 
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx
 
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL ...
 
[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...
[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...
[Connect(); // Japan 2016] Microsoft の AI 開発最新アップデート ~ Cognitive Services からA...
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Computer vision for transportation
Computer vision for transportationComputer vision for transportation
Computer vision for transportation
 
Machine Learning AND Deep Learning for OpenPOWER
Machine Learning AND Deep Learning for OpenPOWERMachine Learning AND Deep Learning for OpenPOWER
Machine Learning AND Deep Learning for OpenPOWER
 
Intelligent Thumbnail Selection
Intelligent Thumbnail SelectionIntelligent Thumbnail Selection
Intelligent Thumbnail Selection
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
 
20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam20170402 Crop Innovation and Business - Amsterdam
20170402 Crop Innovation and Business - Amsterdam
 

Mehr von inovex GmbH

lldb – Debugger auf Abwegen
lldb – Debugger auf Abwegenlldb – Debugger auf Abwegen
lldb – Debugger auf Abwegeninovex GmbH
 
Are you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AIAre you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AIinovex GmbH
 
Why natural language is next step in the AI evolution
Why natural language is next step in the AI evolutionWhy natural language is next step in the AI evolution
Why natural language is next step in the AI evolutioninovex GmbH
 
Network Policies
Network PoliciesNetwork Policies
Network Policiesinovex GmbH
 
Interpretable Machine Learning
Interpretable Machine LearningInterpretable Machine Learning
Interpretable Machine Learninginovex GmbH
 
Jenkins X – CI/CD in wolkigen Umgebungen
Jenkins X – CI/CD in wolkigen UmgebungenJenkins X – CI/CD in wolkigen Umgebungen
Jenkins X – CI/CD in wolkigen Umgebungeninovex GmbH
 
AI auf Edge-Geraeten
AI auf Edge-GeraetenAI auf Edge-Geraeten
AI auf Edge-Geraeteninovex GmbH
 
Prometheus on Kubernetes
Prometheus on KubernetesPrometheus on Kubernetes
Prometheus on Kubernetesinovex GmbH
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systemsinovex GmbH
 
Representation Learning von Zeitreihen
Representation Learning von ZeitreihenRepresentation Learning von Zeitreihen
Representation Learning von Zeitreiheninovex GmbH
 
Talk to me – Chatbots und digitale Assistenten
Talk to me – Chatbots und digitale AssistentenTalk to me – Chatbots und digitale Assistenten
Talk to me – Chatbots und digitale Assistenteninovex GmbH
 
Künstlich intelligent?
Künstlich intelligent?Künstlich intelligent?
Künstlich intelligent?inovex GmbH
 
Das Android Open Source Project
Das Android Open Source ProjectDas Android Open Source Project
Das Android Open Source Projectinovex GmbH
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretabilityinovex GmbH
 
People & Products – Lessons learned from the daily IT madness
People & Products – Lessons learned from the daily IT madnessPeople & Products – Lessons learned from the daily IT madness
People & Products – Lessons learned from the daily IT madnessinovex GmbH
 
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with PulumiInfrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with Pulumiinovex GmbH
 
Remote First – Der Arbeitsplatz in der Cloud
Remote First – Der Arbeitsplatz in der CloudRemote First – Der Arbeitsplatz in der Cloud
Remote First – Der Arbeitsplatz in der Cloudinovex GmbH
 

Mehr von inovex GmbH (20)

lldb – Debugger auf Abwegen
lldb – Debugger auf Abwegenlldb – Debugger auf Abwegen
lldb – Debugger auf Abwegen
 
Are you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AIAre you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AI
 
Why natural language is next step in the AI evolution
Why natural language is next step in the AI evolutionWhy natural language is next step in the AI evolution
Why natural language is next step in the AI evolution
 
WWDC 2019 Recap
WWDC 2019 RecapWWDC 2019 Recap
WWDC 2019 Recap
 
Network Policies
Network PoliciesNetwork Policies
Network Policies
 
Interpretable Machine Learning
Interpretable Machine LearningInterpretable Machine Learning
Interpretable Machine Learning
 
Jenkins X – CI/CD in wolkigen Umgebungen
Jenkins X – CI/CD in wolkigen UmgebungenJenkins X – CI/CD in wolkigen Umgebungen
Jenkins X – CI/CD in wolkigen Umgebungen
 
AI auf Edge-Geraeten
AI auf Edge-GeraetenAI auf Edge-Geraeten
AI auf Edge-Geraeten
 
Prometheus on Kubernetes
Prometheus on KubernetesPrometheus on Kubernetes
Prometheus on Kubernetes
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Azure IoT Edge
Azure IoT EdgeAzure IoT Edge
Azure IoT Edge
 
Representation Learning von Zeitreihen
Representation Learning von ZeitreihenRepresentation Learning von Zeitreihen
Representation Learning von Zeitreihen
 
Talk to me – Chatbots und digitale Assistenten
Talk to me – Chatbots und digitale AssistentenTalk to me – Chatbots und digitale Assistenten
Talk to me – Chatbots und digitale Assistenten
 
Künstlich intelligent?
Künstlich intelligent?Künstlich intelligent?
Künstlich intelligent?
 
Dev + Ops = Go
Dev + Ops = GoDev + Ops = Go
Dev + Ops = Go
 
Das Android Open Source Project
Das Android Open Source ProjectDas Android Open Source Project
Das Android Open Source Project
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretability
 
People & Products – Lessons learned from the daily IT madness
People & Products – Lessons learned from the daily IT madnessPeople & Products – Lessons learned from the daily IT madness
People & Products – Lessons learned from the daily IT madness
 
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with PulumiInfrastructure as (real) Code – Manage your K8s resources with Pulumi
Infrastructure as (real) Code – Manage your K8s resources with Pulumi
 
Remote First – Der Arbeitsplatz in der Cloud
Remote First – Der Arbeitsplatz in der CloudRemote First – Der Arbeitsplatz in der Cloud
Remote First – Der Arbeitsplatz in der Cloud
 

Kürzlich hochgeladen

%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...masabamasaba
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastPapp Krisztián
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...masabamasaba
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfkalichargn70th171
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareJim McKeeth
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburgmasabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 

Kürzlich hochgeladen (20)

%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 

Performance evaluation of GANs in a semisupervised OCR use case

  • 1. Performance Evaluation of GANs in a semi-supervised OCR Use Case Florian Wilhelm London, 2018-10-11
  • 2. Special Interests • Mathematical Modelling • Recommendation Systems • Data Science in Production • Python Data Stack • Maintainer of PyScaffold Dr. Florian Wilhelm Principal Data Scientist @ inovex @FlorianWilhelm FlorianWilhelm florianwilhelm.info 2 Florian Tanten Master Thesis @ inovex October 2017 - May 2018
  • 3. IT-project house for digital transformation: ‣ Agile Development & Management ‣ Web · UI/UX · Replatforming · Microservices ‣ Mobile · Apps · Smart Devices · Robotics ‣ Big Data & Business Intelligence Platforms ‣ Data Science · Data Products · Search · Deep Learning ‣ Data Center Automation · DevOps · Cloud · Hosting ‣ Trainings & Coachings Using technology to inspire our clients. And ourselves. inovex offices in Karlsruhe · Cologne · Munich · Pforzheim · Hamburg · Stuttgart. www.inovex.de
  • 4. 4 Agenda 1. Use Case 2. Text Spotting 3. Data and Pipeline 4. Generative Adversarial Networks 5. Semi-supervised Learning 6. Results
  • 5. 5https://www.autocheck.com/vehiclehistory/autocheck/en/vinbasics Vehicle Identification Number (VIN) Unique identifier like a fingerprint of a vehicle serial number country security code model year assembly plant details flexible fuel vehicles manufacturer
  • 6. 6 Use Case VIN: WF0DXXGAKDEJ37385 VIN-Decoder Manufacturer: BMW Model: X3 Year: 2013-03-21 Engine power: 143 PS Equipment: - Xenon Lights ... Information about the car: Spotting the vehicle identification number (VIN) in images of vehicle registration documents
  • 9. 9 Agenda 1. Use Case 2. Text Spotting 3. Data and Pipeline 4. Generative Adversarial Networks 5. Semi-supervised Learning 6. Results
  • 10. Character detection & extraction Character recognition 11Girshick et al. (2014), „Region-Based Convolutional Networks for Accurate Object Detection and Segmentation“ Methodology in Text Spotting Sliding Window Computer Vision Tools Others - Connected components - Stroke width transform - Edge detection - SVM - Learning with HOG - CNN - Region proposal - Hypotheses CNN pooling Character or word CNN CNN + RNN SVM Nearest Neighbor High-performer current studies CNN = Convolutional Neural Network SVM = Support Vector Machine HOG = Histogram of oriented Gradients RNN = Recurrent Neural Networks RL = Reinforcement Learning 379Character Recognition ... Spotting = Detection + Recognition
  • 11. 12https://en.wikipedia.org/wiki/Convolutional_neural_network; http://intellabs.github.io/ParallelJavaScript/ Convolutional Neural Network Max pooling with a 2x2 filter and stride = 2Convolution with 3x3 kernel and stride = 1
  • 12. 14 Agenda 1. Use Case 2. Data and Pipeline 3. Semi-supervised Learning 4. Generative Adversarial Networks 5. Semi-supervised Learning 6. Results
  • 13. 15 Objectives - ~170 images of vehicle registration documents b) Semi-supervised method a) Supervised method 2. Comparison of classifiers 1. Implementation of a prototype „XLG0H200NA0A10348“ Dataset: Text Spotting
  • 14. 16 End-to-End Text Spotting Pipeline Sliding window Character Detector (2 classes) Chararacter Recognizer (36 classes) Only one window per character All windows Non Maximum Suppression All windows with characters Region of Interest Extractor Image depicting only VIN X L G 0 H 2 0 N A 10 04 43 80
  • 15. 17 Small Dataset What to do about that? 1. Data Generation 2. Data Augmentation
  • 16. 18 Data Augmentation Data augmentation: Datasets: Original image labeled manually as „0“ 2 classes 36 classes Chararacter Recognizer (36 classes) Label: „0“ Character Detector (2 classes) Label: „character“ Label: „no character“
  • 17. 19 170 images of vehicle registration documents Training set 85 images 85 images Training sets of classifiers Testing sets of classifiers Testing sets of pipeline 85 images RecognizerDetector ~ 42000 images 2 classes ~ 8000 images 36 classes ~ 42000 images 2 classes ~ 8000 images 36 classes RecognizerDetector Data Augmentation Data Augmentation Testing set Datasets
  • 18. 20 Classifiers 1. Supervised Convolutional Neural Network 2. Semi-supervised Generative Adversarial Network Generator Discriminator Input Feature extraction Classification
  • 19. 21 Agenda 1. Use Case 2. Text Spotting 3. Data and Pipeline 4. Generative Adversarial Networks 5. Semi-supervised Learning 6. Results
  • 20. 22 Yann LeCun Director of Facebook AI Research, Prof at NYU “... (GANs) and the variations that are now being proposed is the most interesting idea in the last 10 years in ML, in my opinion.“ Ian J. Goodfellow @ Google Brain
  • 21. 23 Generative Adversarial Network Generator (G) Discriminator (D) Goal: Generate images, which seem to be realistic Goal: Differentiate between fake and real images
  • 22. 24 Generative Adversarial Network Generator (G) Discriminator (D) Is D correct? „D classified the generated image as 10% real“ „yes“ A B . . . 8 9 F Real imagesReal labeled images
  • 23. 25Goodfellow et al. (2014), Generative Adversarial Networks Mathematical formulation Discriminator output for real images Discriminator output for fake images Discriminator calculates likelihood [0,1] for an image being real Maximizing discriminator loss Minimizing generator loss Objective function Training (alternating)
  • 24. 26 Example of generated images Training images: Generated images during learning process:
  • 25. 27 Agenda 1. Use Case 2. Text Spotting 3. Data and Pipeline 4. Generative Adversarial Networks 5. Semi-supervised Learning 6. Results
  • 26. 28 Semi-supervised Learning Supervised Learning Unsupervised Learning Semi-supervised Learning • Makes use of unlabeled data • Combines supervised and unsupervised learning
  • 27. 29 Semi-supervised GAN for Character Detection Real labeled images Real unlabeled images Generator Discriminator
  • 28. 30 Agenda 1. Use Case 2. Text Spotting 3. Data and Pipeline 4. Generative Adversarial Networks 5. Semi-supervised Learning 6. Results
  • 29. 31 Character Detector (2 classes) 60,00% 70,00% 80,00% 90,00% 100,00% 20 50 100 200 400 700 1000 5000 15000 30000 42000 DCNN DCNN pretrained „Character“ „No character“ Manually generated images with CAPTCHA methods Pretraining of DCNN Size of labeled training set Accuracy Bildschirmfoto 2018-04-24 um 17.48.20Bildschirmfoto 2018-04-24 um 17.48.20
  • 30. 32 Character Detector (2 classes) 60,00% 70,00% 80,00% 90,00% 100,00% 20 50 100 200 400 700 1000 5000 15000 30000 42000 DCNN DCNN pretrained Supervised GAN Generator Discriminator Real labeled images C C F C C F Supervised GAN Size of labeled training set Accuracy Bildschirmfoto 2018-04-24 um 17.48.20 Bildschirmfoto 2018-04-24 um 17.48.20
  • 31. 33 Character Detector (2 classes) 60,00% 70,00% 80,00% 90,00% 100,00% 20 50 100 200 400 700 1000 5000 15000 30000 42000 DCNN DCNN pretrained Supervised GAN Semi-supervised GAN Discriminator C C F Generator F Real labeled images CC Real unlabeled images Semi-supervised GAN Size of labeled training set Accuracy Bildschirmfoto 2018-04-24 um 17.48.20
  • 32. 34 Character Recognizer (36 classes) 0,00% 10,00% 20,00% 30,00% 40,00% 50,00% 60,00% 70,00% 80,00% 90,00% 100,00% 36 72 108 200 300 400 600 800 1000 5000 8000 60,00% 70,00% 80,00% 90,00% 100,00% 20 50 100 200 400 700 1000 5000 15000 30000 42000 DCNN DCNN pretrained SupervisedGAN Character DetectorCharacter Recognizer Size of labeled training set Accuracy Size of labeled training set Accuracy Bildschirmfoto 2018-04-24 um 17.48.20
  • 33. .. 35 End-to-End Text Spotting Pipeline Sliding window Character Detector (2 classes) Chararacter Recognizer (36 classes) Non Maximum Suppression Region of Interest Extractor Accuracy = 99.94% 85 images 1. 2. 85. .
  • 34. 36 Google Cloud Vision API Sliding window Character Detector (2 classes) Chararacter Recognizer (36 classes) Non Maximum Suppression Region of Interest Extractor 85 images ∅ Levenshtein distance = 4.49 85 images of VINs .. . Our ApproachGoogle Cloud Vision API vs. ∅ Levenshtein distance = 0.011 Levenshtein distance: Classification Label AYZ33 XYZ321 = 3
  • 35. 37 Key Learnings • Custom solutions can tremendously outperform off-the-shelve software in a specific use-case • Semi-supervised GANs can be successfully applied in use-cases with little data • With simple data augmentation techniques having only little data might be enough
  • 36. 38 Bibliography - Krizhevsky et al. (2012) „ImageNet Classication with Deep Convolutional Neural Networks“ - Girshick et al. (2014), „Region-Based Convolutional Networks for Accurate Object Detection and Segmentation“ - Girshick et al. (2015), „Fast R-CNN“ - Girshick et al. (2015), „Faster R-CNN“ - He et al. (2017), „Mask-R-CNN“ - Goodfellow et al. (2014) „Generative Adversarial Networks"
  • 37. Thank you! Florian Wilhelm Principal Data Scientist inovex GmbH Schanzenstraße 6-20 Kupferhütte 1.13 51063 Köln florian.wilhelm@inovex.de