Practical Artificial Intelligence: Deep Learning Beyond Cats and Cars

Copyright © 2017 LUXOFT
Alexey Rybakov, LUXOFT
May 2017

What This Talk is About
Key Decisions in a Deep Learning Computer Vision Project
1. Data
• Data acquisition, and how much is enough
2. Compute Pipeline
• Data processing pipeline
• Compute infrastructure
• AI platform and model
3. Training and Tuning
• Data preparation
• Network training
• Fine tuning

Why We Are Giving This Talk
Our AI Practice:
- Computer vision
- ...and non-vision AI
Pharma
Agriculture
Retail
Industrial
Automotive

0. Introduction: Paradigm Shift in Software R&D

Paradigm Shift in Software R&D
[Diagram: classic software R&D workflow vs. AI-based R&D workflow]
Classic workflow stages: Knowledge (Algorithm development); System design; Coding / testing; Integration, roll-out and maintenance
Classic workflow roles: SME; System Architects; Developers; IT Support
AI workflow stages: Access to Data; Data selection; Learning Pipeline Design; Data preparation, Training, Tuning; Integration, roll-out and maintenance
AI workflow roles: SME; AI Architects; Data specialists; Data and AI Support
Callouts: "Focus of this talk"; "New Professions!"

1. Data is the King

Paradigm Shift: Another Way to Look At It
Was | Becoming
Subject Matter Experts | Data
Algorithm Development | No Algorithm Development
Domain (Vertical) Knowledge | AI (System) Knowledge
Decision Quality is Art | Decision Quality is Predictable
The single most important investment you can make in an AI project is Data Strategy

Data Sourcing: It Starts with Data!
Six data guidelines that we consider important to achieve the best results (accuracy, false positives, false negatives).
Acquire Data
1. Data coverage - needs to be representative - cover all cases (more than you think)
2. Data balance (normalization) - about an equal amount of data for each class/case/scenario
3. Amount of data
Prepare Data
4. Data formatting - make it work with existing/selected DNN architectures: e.g. ROI selection, breaking a big picture into smaller ones, video into frames, etc.
5. Data synthesis and augmentation - “cat and mirror reflection of a cat are two different cats”. It is often easier to transform existing data than to obtain new data
6. Data annotation

Example: Pharma Data
• Crystal clear media before dissolution
• Just after the sample is dropped, and remains a single solid piece
• When the sample started to swell, still a single piece
• When the sample continues to swell, and produces many small particles: low-contrast media

Pharma Improper Data Sampling - Imbalanced Set
Accuracy paradox
Labeled video sample classes: clear media; sample dropped; sample dissolving 1; sample dissolving 2; precipitate presented; cloudy media
Problem: dataset imbalance
• Tricky to identify
• What to do: change the performance metric for the trained network
What to do:
• Use penalized models: adjust the cost function for imbalance
• Decompose large classes into smaller ones
• Resample: over- and under-sample to balance
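The accuracy paradox and two of the remedies above (a different performance metric, and a penalized cost function) can be illustrated with a small sketch. The class names, the 95/5 split and the inverse-frequency weighting rule are assumptions chosen for illustration, not the metric or weights used in the project.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def inverse_frequency_weights(y_true):
    """Per-class loss weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    n, k = len(y_true), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical imbalanced set: 95 "clear" frames and 5 "cloudy" frames.
y_true = ["clear"] * 95 + ["cloudy"] * 5
# A useless model that always predicts the majority class.
y_pred = ["clear"] * 100

print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5
print(inverse_frequency_weights(y_true))  # rare class gets the larger weight
```

Plain accuracy rewards the majority-class guesser, while the per-class metric exposes it. The weights could be passed to a weighted loss so that errors on rare classes cost more.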

Pharma Improper Data Sampling (contd.)
• Different optimization approaches to find separate thresholds for each class (1, 2, 3)
• 3 options of uniform class distribution (balanced datasets)
• Proper data shuffling is important
• Common approach used by NVIDIA DIGITS: conventional 2-stage shuffling
• Our custom data shuffling gave us up to a 1.3% increase in accuracy in comparison with the conventional scheme
[Chart: "Top 1% accuracy" for Balanced dataset 1, Balanced dataset 2, Balanced dataset 3]
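One simple way to obtain a uniform class distribution is random oversampling followed by a global shuffle. The sketch below is an assumption for illustration only: it is neither the custom shuffling scheme mentioned above nor the DIGITS 2-stage scheme, and the file names and class counts are hypothetical.

```python
import random
from collections import Counter, defaultdict

def oversample_to_balance(samples, labels, seed=0):
    """Repeat random minority-class samples until every class matches
    the largest class, then shuffle the whole set once."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((s, label) for s in items + extra)
    rng.shuffle(balanced)  # one global shuffle so batches mix classes
    return balanced

# Hypothetical frame list with an 8 / 3 / 1 class split.
samples = [f"frame_{i:03d}.png" for i in range(12)]
labels = ["clear"] * 8 + ["dropped"] * 3 + ["cloudy"] * 1

balanced = oversample_to_balance(samples, labels)
print(Counter(label for _, label in balanced))
```

Oversampling repeats images, so in practice it is usually paired with augmentation to avoid memorizing the duplicated frames.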

2. Data Processing Pipeline

Machine Learning vs. Deep Learning
[Diagram: Classic ML vs. Deep Learning. Source: Ian Goodfellow et al., ISBN 978-0262035613]

Machine Learning vs. Deep Learning
100% Deep Learning: possible but not practical
• In this diagram, it seems that deep learning works from raw data. In reality, this is the ideal case
  • Needs infinite data + infinite compute
• Practical implementations are still in between Classic ML and DL, with a lot of upfront non-neural effort in data selection and preparation.
That is why we need to build a data processing pipeline
[Diagram source: Ian Goodfellow et al., ISBN 978-0262035613]

Visual Data Processing Pipeline
Feeding raw data straight to a deep network is still a dream: asymptotically possible, but practically inefficient.
• Therefore we pipeline processing blocks, like this: [pipeline diagram]

Example: Pharma Processing Pipeline (One of the AI Parts)
• Purpose:
  • Region of interest (ROI) detection
  • Camera calibration
  • Detect shakes and shifts
• Solution:
  • Based on YOLO (You Only Look Once, http://pjreddie.com/darknet/yolo/)
  • Processing in real time

Example: Pharma Processing Pipeline (One of the Non-AI Parts)
• Wobbling detection
  • Represent video in temporal space
  • Search for sinusoid amplitude
• RPM detection
  • Treat paddle as a signal
  • Fourier transformation for frequency detection
• Implemented for 20 FPS
• Calibration required
[Figures: spatial (left) and temporal (right) representation of video data; ROI (left) and signal (right) of paddle appearance]
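The RPM detection idea (treat the paddle as a signal, then find its frequency with a Fourier transform) can be sketched as follows. This is a minimal illustration under stated assumptions: the signal is a hypothetical per-frame scalar, such as mean brightness inside the paddle ROI, the naive DFT stands in for whatever transform the real pipeline uses, and `events_per_revolution` is an assumed calibration parameter.

```python
import cmath
import math

def dominant_frequency_hz(signal, fps):
    """Strongest non-DC frequency of a real signal, found with a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC component
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):         # frequencies up to Nyquist
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fps / n

def rpm_from_signal(signal, fps, events_per_revolution=1):
    """Convert the dominant frequency into revolutions per minute."""
    return dominant_frequency_hz(signal, fps) * 60.0 / events_per_revolution

FPS = 20        # frame rate from the slide
TRUE_RPM = 90   # hypothetical paddle speed, i.e. 1.5 Hz
# Synthetic 10-second signal: one sinusoidal "paddle appearance" per revolution.
signal = [math.sin(2 * math.pi * (TRUE_RPM / 60.0) * t / FPS)
          for t in range(200)]

print(rpm_from_signal(signal, FPS))  # 90.0
```

At 20 FPS the Nyquist limit is 10 events per second, and the frequency resolution is the inverse of the observation window (0.1 Hz, or 6 RPM, for 10 seconds), which is one reason calibration and window length matter.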

Platform Selection Guidelines
Platform selection (Caffe / Torch / TensorFlow / etc.) is like selecting a Java app server or a web server: there are many similarities and it is hard to choose. These are the criteria we use:
• Availability of models for your task
• Deployment compatibility: both enterprise and embedded
  • Works with your cloud? Like Amazon AWS or MS Azure?
  • Works on your device? Like ARM or NVIDIA GPUs?
• Distributed processing choices: Cloud / Edge
• Embedded optimization opportunities
• Maintenance considerations
• Support for training models, tools and scenarios <= important to think ahead

Model Selection Guidelines
We often hear: “use the latest, greatest model”.
However, we’d rather use a “simple model for simple data”. It is the same as the “good-enough” concept.
Important selection criteria we encounter:
• Production accuracy
• “Cascadability”
• Production run-time speed
• Training and fine-tuning scenarios: amount and kind of required training data, effort to train
Pharma example: the winner is AlexNet (yes!): quick learner, fast to run, good accuracy on our data
[Chart: Accuracy vs. Training Epoch]

Retail Example: “Cascadability” of a Search Network
• Total number of different classes: ~100,000
• We use a cascade of networks
  • Level 1: Type-search network
  • Level 2: Different classification networks for each type
• Conventional accuracy metrics for type-search networks:
  • mAP, Precision, Recall
  • These metrics conventionally use a fixed IoU threshold of 0.5
  • IoU = Intersection over Union
• Classification accuracy strongly depends on IoU
  • IoU = 0.5 -> classification probability ≈ 0.68
  • IoU > 0.7 -> classification probability > 0.90
[Figures: what IoU is, illustrated at IoU=1 and IoU=0.5; chart of Classification Probability vs. IoU]
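IoU itself is simple to compute for axis-aligned boxes. The sketch below assumes boxes in `(x1, y1, x2, y2)` pixel format and uses hypothetical detections. The `hit_rate` helper is an illustrative name, not a standard metric implementation. It shows how the same level-1 detections look at the conventional 0.5 threshold versus the stricter 0.7 that the level-2 classifier prefers.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def hit_rate(pairs, threshold):
    """Share of (predicted, ground-truth) box pairs with IoU >= threshold."""
    return sum(iou(p, g) >= threshold for p, g in pairs) / len(pairs)

# Hypothetical level-1 detections, each matched to the same ground-truth box.
truth = (0, 0, 100, 100)
pairs = [((0, 0, 100, 100), truth),    # IoU = 1.0
         ((10, 0, 110, 100), truth),   # IoU ~ 0.82
         ((25, 0, 125, 100), truth),   # IoU = 0.6
         ((50, 0, 150, 100), truth)]   # IoU ~ 0.33

print(hit_rate(pairs, 0.5))  # 0.75
print(hit_rate(pairs, 0.7))  # 0.5
```

A search network that looks fine at IoU 0.5 can still hand poorly cropped regions to the classifier, which is why evaluating it at a stricter threshold better reflects cascade performance.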

3. Model Training and Tuning (Data Again!)

AI System Development Is More Iterative
[Diagram: Traditional SW Engineering vs. Deep Learning SW Engineering]

Example: Pharma Iterative Training Workflow
• Database: ~5,000 images for each class
• Uniform distribution of classes required
[Chart axis: Training Epoch]

Example: Pharma Iterative Training Workflow (contd.)
Legend: DNN; Human
Classes: Crystal clear media; Sample dropped; Sample swelling; Sample dissolving; Precipitate presented; Media foggy
• Iterative DNN training
• Use the trained DNN to fix data labelling, following the "Pseudo-Label" technique
• Visualize your data
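The label-fixing step can be sketched as a filter that surfaces samples where a confident DNN prediction disagrees with the current label, so a human can confirm or reject the change. The function name, the 0.95 confidence threshold and the toy probabilities are assumptions for illustration. The actual workflow and thresholds are not given in the slides.

```python
def suggest_label_fixes(labels, probabilities, class_names, min_confidence=0.95):
    """Return (index, current_label, suggested_label, confidence) for every
    sample where a confident prediction disagrees with the current label."""
    suggestions = []
    for i, (label, probs) in enumerate(zip(labels, probabilities)):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= min_confidence and class_names[best] != label:
            suggestions.append((i, label, class_names[best], probs[best]))
    return suggestions

# Hypothetical three-class subset and softmax outputs of a trained DNN.
class_names = ["clear", "swelling", "foggy"]
labels = ["clear", "clear", "foggy", "swelling"]
probabilities = [
    [0.98, 0.01, 0.01],  # agrees with the label
    [0.02, 0.97, 0.01],  # confident disagreement: review
    [0.40, 0.05, 0.55],  # agrees, low confidence
    [0.01, 0.03, 0.96],  # confident disagreement: review
]

for fix in suggest_label_fixes(labels, probabilities, class_names):
    print(fix)
```

After the reviewed labels are corrected, the network is retrained and the check is repeated, which matches the iterative DNN/human loop in the legend.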

Model Training: Time to Use Your Data!
Recap of the six data guidelines that we consider important to achieve the best results (accuracy, false positives, false negatives):
Acquire Data: 1. Data coverage; 2. Data balance (normalization); 3. Amount of data
Prepare Data: 4. Data formatting; 5. Data synthesis and augmentation; 6. Data annotation

Data Preparation and Annotation Scenarios
Data | Annotation (Labeling) | Use Cases
Real / raw | Manual | Sometimes a good choice for total greenfield. Needs human resources. Tools dramatically increase the efficiency.
Real / raw | Computer, then human | Rudimentary AI provides a draft annotation. Then humans confirm/correct. Example: house number captcha.
Real / raw | Human, then computer | Humans annotate the 1st frame. Then existing CV methods provide object tracking on subsequent frames. Very efficient for dynamic scenes.
Augmented | Automated | Use a high-quality labeled dataset and augment it to simulate real-life conditions. Example: the “German Traffic Sign” dataset could have been almost entirely synthesized through augmentation.
Synthetic | Automated | Use 3D rendering software and scripts to generate scenes. Our construction example.

Retail Example. Data Augmentation and Synthesis
[Image panels: Normal Light Conditions; WB Deviations; ISO Simulation; Year Synthesis]
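A few of the augmentations named in this talk (mirroring, white-balance deviation, ISO noise) can be sketched on a tiny RGB image stored as nested lists. The gain values, noise level and helper names are assumptions for illustration. A real pipeline would use an image library and parameters calibrated to the target cameras.

```python
import random

def _clamp(value):
    """Round to the nearest integer and keep it in the 8-bit range."""
    return max(0, min(255, int(round(value))))

def mirror(image):
    """Horizontal flip: reverse each row of pixels."""
    return [row[::-1] for row in image]

def white_balance(image, r_gain, b_gain):
    """Simulate a white-balance deviation by scaling red and blue channels."""
    return [[(_clamp(r * r_gain), g, _clamp(b * b_gain)) for r, g, b in row]
            for row in image]

def iso_noise(image, sigma, seed=0):
    """Simulate high-ISO sensor noise with additive Gaussian noise."""
    rng = random.Random(seed)
    return [[tuple(_clamp(c + rng.gauss(0, sigma)) for c in pixel)
             for pixel in row]
            for row in image]

# Hypothetical 2x2 RGB image; each pixel is an (r, g, b) tuple.
image = [[(200, 100, 50), (10, 20, 30)],
         [(0, 0, 0), (255, 255, 255)]]

augmented = [mirror(image),
             white_balance(image, r_gain=1.2, b_gain=0.8),
             iso_noise(image, sigma=8)]
print(len(augmented), "augmented variants from one labeled image")
```

Each variant keeps the original label, so one annotated image yields several training samples.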

Railway Safety Example: Synthetic Data for Training

Lessons Learned and Resources

Lessons Learned
[Diagram labels: model effort; data effort; data effort]

Lessons Learned (contd.)
• The most important decision is data strategy.
• Data acquisition: need lots of data, full coverage, well balanced.
• Model decisions: the best overall may not be the best for you.
• Pipeline decisions: cascade and combine classic and AI algorithms.
• Training: a lot can be achieved by data preparation and synthesis. Annotation tools save millions.
• Be prepared to stop. AI development is a very iterative process.

Resources
• Our website: www.luxoft.com
• Our talk at last year's Summit (2016) about computer vision pipeline optimization:
  • Available in full on the Embedded Vision Alliance website:
https://www.embedded-vision.com/platinum-members/luxoft/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit

Check out the demos at the LUXOFT booth!
Extremely optimized artificial intelligence and computer vision pipelines running on low-power embedded platforms and GPU architectures: stereo vision, video and image processing, DNNs; as well as our Hybrid AI Platform that distributes and manages both deep learning and classic computation across cloud and edge devices.
*Photo of our booth last year

Alexey Rybakov
ARybakov@luxoft.com
LUXOFT
4400 Bohannon Dr Ste 235
Menlo Park, CA 94025
Thank you!
:)
