SlideShare ist ein Scribd-Unternehmen logo
1 von 61
Downloaden Sie, um offline zu lesen
Learning from Computer Simulations to
Tackle Real-World Problems
Hoon Kim
Joint work with Kangwook Lee, Gyeongjo Hwang, and Changho Suh
KAIST
Hoon Kim
Korea Advanced Institute of Science and Technology (KAIST)
• M.S. in School of Electrical Engineering (2017 Mar. – Current)
• Advisor: Prof. Changho Suh
• B.S. in School of Electrical Engineering (2012 Feb. – 2017 Feb.)
• Double major Computer Science
• Interested in utilizing simulators to tackle real world problems!
About myself
1.2 million images / 1000 Object Classes
à Such datasets are not always viable in the real world
• Deep learning has achieved exceptional performances in many tasks
• Training sets need to be large, diverse, and accurately annotated
Motivation
Application 1: ”Predicting Eye Gaze Vector”
à Difficult to annotate the eye gaze vector
E.g., eyegaze dataset
(ˆ✓, ˆ)
Motivation
Application 2: ”Predicting Accidents”
à Difficult to acquire the data itself in the real world
E.g., accident dataset
Accident / Non-accident
Motivation
Recently, many researchers proposed utilizing simulator to tackle data scarcity
à Generate arbitrary type & amount of data
à No labeling cost
However, synthetic data may not be realistic enough,
resulting in poor performance in the real world
Motivation
UnityEyes – Eye simulator AirSim – Drone simulator Robot Arm Simulator
CARLA - Driving simulator
Overview of this talk
1. Simulated+Unsupervised Learning with Adaptive Data Generation and Bidirectional Mapping
- International Conference on Learning Representation (ICLR) 2018
2. Crash to not Crash: Learn to Identify Dangerous Vehicles Using a Simulator
- Association for the Advancement in Artificial Intelligence (AAAI) 2019
Overview of this talk
1. Simulated+Unsupervised Learning with Adaptive Data Generation and Bidirectional Mapping
- International Conference on Learning Representation (ICLR) 2018
2. Crash to not Crash: Learn to Identify Dangerous Vehicles Using a Simulator
- Association for the Advancement in Artificial Intelligence (AAAI) 2019
Source domain (simulator)
Xs
Ys
Target domain (real world)
Xt
(-5, 5) ,
Given labeled source domain data (Xs, Ys)
unlabeled target domain data (Xt)
(25, 15)
, ,
Problem Formulation
à Train a predictor that works well in the target domain
[Shrivastava et al., CVPR 2017] proposed “Simulated+Unsupervised learning”
(3) Predict using the original images(1) Forward/translate images
F (ˆ✓, ˆ)GX!Y F(✓, ) = ( 3.0/ 8.2) (✓, ) = ( 14.2/17.3)
(✓, ) = ( 2.1/3.2) (✓, ) = ( 1/ 10)
(✓, ) = ( 3.0/ 8.2) (✓, ) = ( 14.2/17.3)
(✓, ) = ( 2.1/3.2) (✓, ) = ( 1/ 10)
They showed,
1. Refining synthetic images with GAN makes them look real while preserving the labels
2. Training models on these refined images leads to significant improvements in accuracy
(2) Train predictor on translated images
Domain Adaptation from Simulation to Real World
Comparison of a gaze estimator trained on Synthetic vs Refined Synthetic
Training Data % of images within
7 degree error
Synthetic Data 62.3
Synthetic Data 4x 64.9
Domain Adaptation from Simulation to Real World
Comparison of a gaze estimator trained on Synthetic vs Refined Synthetic
Training Data % of images within
7 degree error
Synthetic Data 62.3
Synthetic Data 4x 64.9
Refined Synthetic Data 69.4
Refined Synthetic Data 4x 87.2
Can we do better?
Domain Adaptation from Simulation to Real World
2 limitations of existing approach
1. [S] How can we specify synthetic label distribution?
à Can we utilize the flexibility of the simulator to generate better synthetic data?
2. [U] Not taking advantage of the asymmetry between the real and synthetic dataset
• The amount of available synthetic data is much larger than that of real data
• Synthetic data is usually much cleaner (less noise) compared to real data
à Can we devise a better unsupervised domain adaptation algorithm?
Domain Adaptation from Simulation to Real World
1. Adaptive Data Generation
• Match distributions between synthetic /real labels
2. Learn Bi-Directional Mapping
• Learn mapping between synthetic /real data with modified CycleGAN
3. Backward Translation
• Make prediction with real synthetic translated image
Our proposed algorithm - Overview
Simulator
Synthetic data w/
Labels
Generate Train
Predictor
Real data w/o
Labels
Real data w/
pseudo-labels
Real-to-synthetic
mapping Training
Synthetic-like
test data
Test data Prediction Results
Testing
Update simulator parameters
Adaptive Data Generation
Learn Bi-Directional
Mapping
Backward Translation
Predictor
P
(30, -10) (20, 5) (-11, 13) (20, 5)
Our proposed algorithm - Overview
Simulator
Synthetic data w/
Labels
Generate Train
Predictor
Real data w/o
Labels
Real data w/
pseudo-labels
Real-to-synthetic
mapping Training
Synthetic-like
test data
Test data Prediction Results
Testing
Update simulator parameters
Adaptive Data Generation
Learn Bi-Directional
Mapping
Backward Translation
Predictor
P
(30, -10) (20, 5) (-11, 13) (20, 5)
Our proposed algorithm - Overview
à Label distributions can be different in synthetic & real datasets
à Fortunately, the synthetic distribution can be easily modified
How Simulator works
Specify Label Distribution
Draw random
label instance Generate image
Q. How can we systematically adapt the label distribution?
Step 1 – Adaptive Data Generation
Overview of our adaptive data generation process
(✓, ) = (12.5, 7.8)
(✓, ) = ( 37.9, 7.1)
(✓, ) = (7.4, 27.2)
(✓, ) = (6.93, 12.8)
F
(3) Train a predictor F on (X, Z)
(ˆ✓, ˆ) = ( 0.5, 1.2)
(ˆ✓, ˆ) = ( 10.6, 1.0)(ˆ✓, ˆ) = ( 11.0, 12.8)
(ˆ✓, ˆ) = ( 16.2, 6.4)
(4) Pseudo-label real data: (Y, F(Y ))
(✓, ) = (12.5, 7.8) (✓, ) = ( 37.9, 7.1)
(✓, ) = (7.4, 27.2) (✓, ) = (6.93, 12.8)
(2) Generate synthetic data (X, Z)
(5) Estimate distribution of F(Y )
ˆ✓
ˆ PF (Y ) = P(ˆ✓, ˆ)
(1) Set distribution of Z
PZ = P(✓, )
✓
(1) Set distribution of Z
PZ = P(✓, )
✓
Step 1 – Adaptive Data Generation
à Experimental results of the adaptive data generation algorithm
Iteration 0 Iteration 1 Iteration 2
8.54 3.00 1.04
0.31 0.02 0.00
Step 1 – Adaptive Data Generation
Simulator
Synthetic data w/
Labels
Generate Train
Predictor
Real data w/o
Labels
Real data w/
pseudo-labels
Real-to-synthetic
mapping Training
Synthetic-like
test data
Test data Prediction Results
Testing
Update simulator parameters
Adaptive Data Generation
Learn Bi-Directional
Mapping
Backward Translation
Predictor
P
(30, -10) (20, 5) (-11, 13) (20, 5)
Step 2 – Learn Bi-Directional Mapping
• CycleGAN
[Zhu et al., ICCV 2017]
Step 2 – Learn Bi-Directional Mapping
Step 2 – Learn Bi-Directional Mapping
Step 2 – Learn Bi-Directional Mapping
Step 3 – Backward Translation
Simulator
Synthetic data w/
Labels
Generate Train
Predictor
Real data w/o
Labels
Real data w/
pseudo-labels
Real-to-synthetic
mapping Training
Synthetic-like
test data
Test data Prediction Results
Testing
Update simulator parameters
Adaptive Data Generation
Learn Bi-Directional
Mapping
Backward Translation
Predictor
P
(30, -10) (20, 5) (-11, 13) (20, 5)
Step 3 – Backward Translation
Backward/Translation: Our proposed training/test methodology
Forward/Translation: training/test methodology proposed by [Shrivastava et al., 2017]
Error with Forward/Translation: 8.4 7.71
Error with Backward/Translation: 8.4 7.60
Results
Comparison of the state of the arts on the MPIIGaze dataset
(Error: mean angle error in degrees)
Method
Trained
on
Tested
with Error
KNN w/ UT Multiview
(Zhang et al., 2015) R R 16.2
CNN w/ UT Multiview
(Zhang et al., 2015) R R 13.9
CNN w/ UnityEyes
(Shrivastava et al., 2017) S R 11.2
KNN w/ UnityEyes
(Wood et al., 2016) S R 9.9
CNN w/ UnityEyes
(Shrivastava et al., 2017) RS R 7.8
Ours w/ feature loss S RR 7.6
Conclusion
• In this work, we proposed a new S+U learning algorithm
• Fully exploits the flexibility of simulators
• Improved bidirectional mapping
• Utilizes the data asymmetry between real/synthetic dastaset
• Achieved the state-of-the-art result on MPIIGaze dataset
Overview of this talk
1. Simulated+Unsupervised Learning with Adaptive Data Generation and Bidirectional Mapping
- International Conference on Learning Representation (ICLR) 2018
2. Crash to not Crash: Learn to Identify Dangerous Vehicles Using a Simulator
- Association for the Advancement in Artificial Intelligence (AAAI) 2019
Collision Prediction System (CPS)
• Goal of CPS: To predict vehicle collisions ahead of time
• Existing CPSs are
- based on rule-based algorithms
- require expensive devices (LIDAR, communication chips, …)
• Our goal is to build a CPS that is
- based on data-driven algorithms
- require commodity devices (camera sensor)
Deep Learning ModelSingle Camera Collision Probability
Problem Formulation
Deep Learning ModelSingle Camera Collision Probability
1. Object Tracking 2. Object Classification
detects/tracks vehicles
in the input image(s)
classifies whether or not the
given vehicle is dangerous
Problem Formulation
Deep Learning ModelSingle Camera Collision Probability
1. Object Tracking 2. Object Classification
detects/tracks vehicles
in the input image(s)
classifies whether or not the
given vehicle is dangerous
à Our main interest!
CPS in Action
Single Camera
CPS
- Safe vehicle - Dangerous vehicle
Challenges in building a data-driven CPS: Data Scarcity
• Existing driving dataset (KITTI, Cityscapes) has no accident data
• Collecting accident data in the real world is very expensive
• We can take advantage of driving simulator
CARLA - Driving simulatorGrand Theft Auto V
How do we collect accident dataset from a simulator?
Q1. How do we reproduce accident scenes in a simulator?
How do we collect accident dataset from a simulator?
Q1. How do we reproduce accident scenes in a simulator?
à we modify driver behavior via simulator hacking
• ScriptHookV - A third party script that allows for an access to game engine
• On top of ScriptHookV, we implemented the following helper functions
• Start_Cruising(s) – makes the player’s car cruise along the lane at a constant speed s
• Get_Nearby_Vehicles() – returns the list of vehicles in the current scene of the game
• Set_Vehicle_Out_Of_Control(v) – makes the vehicle v drive out of control
• Is_My_Car_Colliding() – checks whether my car is colliding or not
• Our accident simulator generates accident data in a random manner by setting nearby
vehicles out of control while cruising and saving past 20 frames when collision occurs
Data Collection
Original GTA V
Data Collection
Driver behavior modification
via simulator hacking
Hacked GTA V
Normal Driving Scene
Accident Driving Scene
Data Collection - GTACrash
• 3661 from original GTA V (Non-accident scenes)
• 7720 from hacked GTA V (Accident scenes)
• Each scene is composed of following frames:
···0.1s 0.1s
20 frames
Each vehicle’s states
(position, velocity, etc.)
are also collected
Dataset is available @ https://sites.google.com/view/crash-to-not-crash
How can we collect accident dataset from a simulator?
Q2. How can we make them more realistic?
How can we collect accident dataset from a simulator?
Q2. How can we make them more realistic?
à Feature (Image): Style transfer via GAN
à Label: label adaptation via motion model
Synthetic train data Real test data
Domain Adaptation – Feature Adaptation
Domain difference between
train and test images
results in poor test performance
à We apply backward translation approach (Lee, Kim, and Suh 2018)
Synthetic train data Real test data Backward-translated test data
Map images to
synthetic domain
at test time
Domain Adaptation – Feature Adaptation
How can we collect accident dataset from a simulator?
Q2. How can we make them more realistic?
à Feature (Image): Style transfer via GAN
à Label: label adaptation via motion model
Synthetic labels are biased
à Synthetic labels are biased!
Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1
Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1
Desired Label = 0 Desired Label = 0 Desired Label = 1 Desired Label = 1
Desired Label = 1 Desired Label = 1 Desired Label = 1 Desired Label = 1
No sign of dangerNo sign of danger dangerous dangerous
• Vehicle path prediction is a well studied field
• Constant Turn Rate and Acceleration (CTRA) model – Schubert et al. 2008
• state-of-the-art path prediction performance in the real world
• Requires vehicle states such as position, velocity, acceleration, and turn rate
• By accessing internal states we can access all the required information
à Our proposed label adaptation
1. Compute the predicted path of each vehicle as per the CTRA model
2. Assign a positive label if and only if the predicted path intersects with that of the ego-vehicle
Vehicles’ internal states
• Position
• Velocity
• Acceleration
• Turnrate
CTRA based
Label Adaptation
Refined Label
Label Adaptation
Label Adaptation
t+0.5s t+0.5s
t+2.0s
t+1.5s
t+1.0s
t+0.5s
t+0.5s
t+1.0s
t+1.5s
t+2.0s
t
t
t
t
Ego Vehicle
Nearby VehicleRefined Label = 0 Refined Label = 1
1. Retrieve
internal states
2. Predict future path
via motion model
3. Refine Labels
à We adapt labels based on a well-studied vehicle motion model
Desired Label = 0 Desired Label = 1
Experiments
Q. Which approach will perform better on real data?
Trained with
Small real dataset
Trained with
Large synthetic data
Pros Dataset is unbiased Large amount
Cons Small amount Dataset is biased
Experiments - YouTubeCrash
• 100 Non-accident Scenes / 122 Accident Scenes
• Manual annotation of bounding box and vehicle IDs
• A half of YouTubeCrash is used as training dataset for baseline
• Another half is used for test dataset for both baseline and our approach
Accidents Non-Accidents
Experiments – Dataset Comparison
YouTubeCrash GTACrash
Accident scenes 61 ~7000
Non-accident scenes 50 ~3500
Positive samples ~ 1000 ~130000
Negative samples ~ 5500 ~600000
~ 100x
···
Safe
or
Dangerous
CNN
Experiments – Classification Algorithms
Alg1. 5-layer NN with bounding box sizes
Alg2. CNN with bounding boxes
Alg3. CNN with RGBs & bounding boxes
···
Safe
or
Dangerous
CNN
···
Safe
or
Dangerous
0.32
0.02
···
Fully Connected
Network
Experiments – Test AUC on YoutubeCrash
Alg/Training data
Trained with
real data
Trained with
synthetic data
Size of
training data
61 accidents
50 non-accidents
7000 accidents
3500 non-accidents
Alg1 0.8766
Alg2 0.8708
Alg3 0.8730
Experiments – Test AUC on YoutubeCrash
Alg/Training data
Trained with
real data
Trained with
synthetic data
Size of
training data
61 accidents
50 non-accidents
7000 accidents
3500 non-accidents
Alg1 0.8766 0.8812
Alg2 0.8708 0.8992
Alg3 0.8730 0.9086
Performance improvement with all tested algorithms
Experiments – Test AUC on YoutubeCrash
Alg/Training data
Trained with
real data
Trained with
synthetic data
Size of
training data
61 accidents
50 non-accidents
7000 accidents
3500 non-accidents
Alg1 0.8766 0.8812
Alg2 0.8708 0.8992
Alg3 0.8730 0.9086
A complex model can be trained only with large data
Increasing
model
complexity
Experiments – Test AUC on YoutubeCrash
Alg/Training data
Trained with
real data
Trained with
synthetic data
Size of
training data
61 accidents
50 non-accidents
7000 accidents
3500 non-accidents
Alg1 0.8766 0.8812
Alg2 0.8708 0.8992
Alg3 0.8730
0.9086
à 0.9164
Domain adaptation further improves performance
Experiments – Missed Detection vs Time-To-Collision
Easy Difficult
As TTC increases, it is
more difficult to identify
dangerous vehicles
Experiments – Missed Detection vs Time-To-Collision
Alg 1 – Trained with real data
Experiments – Missed Detection vs Time-To-Collision
18.5%
improvement
Alg 1 – Trained with real data
Alg 3 – Trained with synthetic data
Experiment Results – Missed Detection vs Time-To-Collision
Alg 1 – Trained with real data
Alg 3 – Trained with synthetic data
When TTC > 0.8, a complex model
is necessary for identifying distant
dangerous vehicles
Velocity
Wheel Angle
Vehicle Angle
Experiment Results – Learning Key Visual Features
Checkout the website for the demonstration video
Checkout the website for the demonstration video
Conclusion
• In this work, we propose an efficient deep learning-based CPS
- Customized synthetic data generator
- Novel label adaptation method
- Our method outperforms those trained with real data
• Our datasets and demo videos are available at
https://sites.google.com/view/crash-to-not-crash

Weitere ähnliche Inhalte

Was ist angesagt?

An adaptive-model-for-blind-image-restoration-using-bayesian-approach
An adaptive-model-for-blind-image-restoration-using-bayesian-approachAn adaptive-model-for-blind-image-restoration-using-bayesian-approach
An adaptive-model-for-blind-image-restoration-using-bayesian-approach
Cemal Ardil
 
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
paperpublications3
 

Was ist angesagt? (20)

Survey on optical flow estimation with DL
Survey on optical flow estimation with DLSurvey on optical flow estimation with DL
Survey on optical flow estimation with DL
 
Google Research Siggraph Whitepaper | Total Relighting: Learning to Relight P...
Google Research Siggraph Whitepaper | Total Relighting: Learning to Relight P...Google Research Siggraph Whitepaper | Total Relighting: Learning to Relight P...
Google Research Siggraph Whitepaper | Total Relighting: Learning to Relight P...
 
Digital Image Processing: Digital Image Fundamentals
Digital Image Processing: Digital Image FundamentalsDigital Image Processing: Digital Image Fundamentals
Digital Image Processing: Digital Image Fundamentals
 
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
 
Convolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoConvolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in Theano
 
Lec17 sparse signal processing & applications
Lec17 sparse signal processing & applicationsLec17 sparse signal processing & applications
Lec17 sparse signal processing & applications
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
 
G1802053147
G1802053147G1802053147
G1802053147
 
FORGERY (COPY-MOVE) DETECTION IN DIGITAL IMAGES USING BLOCK METHOD
FORGERY (COPY-MOVE) DETECTION IN DIGITAL IMAGES USING BLOCK METHODFORGERY (COPY-MOVE) DETECTION IN DIGITAL IMAGES USING BLOCK METHOD
FORGERY (COPY-MOVE) DETECTION IN DIGITAL IMAGES USING BLOCK METHOD
 
An adaptive-model-for-blind-image-restoration-using-bayesian-approach
An adaptive-model-for-blind-image-restoration-using-bayesian-approachAn adaptive-model-for-blind-image-restoration-using-bayesian-approach
An adaptive-model-for-blind-image-restoration-using-bayesian-approach
 
I0343065072
I0343065072I0343065072
I0343065072
 
H1802054851
H1802054851H1802054851
H1802054851
 
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
 
Making Robots Learn
Making Robots LearnMaking Robots Learn
Making Robots Learn
 
APPLYING R-SPATIOGRAM IN OBJECT TRACKING FOR OCCLUSION HANDLING
APPLYING R-SPATIOGRAM IN OBJECT TRACKING FOR OCCLUSION HANDLINGAPPLYING R-SPATIOGRAM IN OBJECT TRACKING FOR OCCLUSION HANDLING
APPLYING R-SPATIOGRAM IN OBJECT TRACKING FOR OCCLUSION HANDLING
 
research_paper
research_paperresearch_paper
research_paper
 
Performance analysis of transformation and bogdonov chaotic substitution base...
Performance analysis of transformation and bogdonov chaotic substitution base...Performance analysis of transformation and bogdonov chaotic substitution base...
Performance analysis of transformation and bogdonov chaotic substitution base...
 
Fast Feature Pyramids for Object Detection
Fast Feature Pyramids for Object DetectionFast Feature Pyramids for Object Detection
Fast Feature Pyramids for Object Detection
 
A proposed accelerated image copy-move forgery detection-vcip2014
A proposed accelerated image copy-move forgery detection-vcip2014A proposed accelerated image copy-move forgery detection-vcip2014
A proposed accelerated image copy-move forgery detection-vcip2014
 
IMAGE QUALITY OPTIMIZATION USING RSATV
IMAGE QUALITY OPTIMIZATION USING RSATVIMAGE QUALITY OPTIMIZATION USING RSATV
IMAGE QUALITY OPTIMIZATION USING RSATV
 

Ähnlich wie Learning from Computer Simulation to Tackle Real-World Problems

Ähnlich wie Learning from Computer Simulation to Tackle Real-World Problems (20)

Using synthetic data for computer vision model training
Using synthetic data for computer vision model trainingUsing synthetic data for computer vision model training
Using synthetic data for computer vision model training
 
Artem Baklanov - Votes Aggregation Techniques in Geo-Wiki Crowdsourcing Game:...
Artem Baklanov - Votes Aggregation Techniques in Geo-Wiki Crowdsourcing Game:...Artem Baklanov - Votes Aggregation Techniques in Geo-Wiki Crowdsourcing Game:...
Artem Baklanov - Votes Aggregation Techniques in Geo-Wiki Crowdsourcing Game:...
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
 
Atari Game State Representation using Convolutional Neural Networks
Atari Game State Representation using Convolutional Neural NetworksAtari Game State Representation using Convolutional Neural Networks
Atari Game State Representation using Convolutional Neural Networks
 
An Automatic Attendance System Using Image processing
An Automatic Attendance System Using Image processingAn Automatic Attendance System Using Image processing
An Automatic Attendance System Using Image processing
 
AI Powered Drones
AI Powered DronesAI Powered Drones
AI Powered Drones
 
Sample CS Senior Capstone Projects
Sample CS Senior Capstone ProjectsSample CS Senior Capstone Projects
Sample CS Senior Capstone Projects
 
IRJET - Multi-Label Road Scene Prediction for Autonomous Vehicles using Deep ...
IRJET - Multi-Label Road Scene Prediction for Autonomous Vehicles using Deep ...IRJET - Multi-Label Road Scene Prediction for Autonomous Vehicles using Deep ...
IRJET - Multi-Label Road Scene Prediction for Autonomous Vehicles using Deep ...
 
Iaetsd multi-view and multi band face recognition
Iaetsd multi-view and multi band face recognitionIaetsd multi-view and multi band face recognition
Iaetsd multi-view and multi band face recognition
 
One Algorithm to Rule Them All: How to Automate Statistical Computation
One Algorithm to Rule Them All: How to Automate Statistical ComputationOne Algorithm to Rule Them All: How to Automate Statistical Computation
One Algorithm to Rule Them All: How to Automate Statistical Computation
 
Applying Machine Learning for Mobile Games by Neil Patrick Del Gallego
Applying Machine Learning for Mobile Games by Neil Patrick Del GallegoApplying Machine Learning for Mobile Games by Neil Patrick Del Gallego
Applying Machine Learning for Mobile Games by Neil Patrick Del Gallego
 
Crude-Oil Scheduling Technology: moving from simulation to optimization
Crude-Oil Scheduling Technology: moving from simulation to optimizationCrude-Oil Scheduling Technology: moving from simulation to optimization
Crude-Oil Scheduling Technology: moving from simulation to optimization
 
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
Eugene Khvedchenya. State of the art Image Augmentations with Albumentations.
 
Real time video analytics with InfoSphere Streams, OpenCV and R
Real time video analytics with InfoSphere Streams, OpenCV and RReal time video analytics with InfoSphere Streams, OpenCV and R
Real time video analytics with InfoSphere Streams, OpenCV and R
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
 
IRJET- Smart Classroom Attendance System: Survey
IRJET- Smart Classroom Attendance System: SurveyIRJET- Smart Classroom Attendance System: Survey
IRJET- Smart Classroom Attendance System: Survey
 
Integrated Model Discovery and Self-Adaptation of Robots
Integrated Model Discovery and Self-Adaptation of RobotsIntegrated Model Discovery and Self-Adaptation of Robots
Integrated Model Discovery and Self-Adaptation of Robots
 
IRJET- Facial Expression Recognition using GPA Analysis
IRJET-  	  Facial Expression Recognition using GPA AnalysisIRJET-  	  Facial Expression Recognition using GPA Analysis
IRJET- Facial Expression Recognition using GPA Analysis
 

Mehr von NAVER Engineering

Mehr von NAVER Engineering (20)

React vac pattern
React vac patternReact vac pattern
React vac pattern
 
디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX
 
진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)
 
서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트
 
BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호
 
이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라
 
날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기
 
쏘카프레임 구축 배경과 과정
 쏘카프레임 구축 배경과 과정 쏘카프레임 구축 배경과 과정
쏘카프레임 구축 배경과 과정
 
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
 
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
 
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
 
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
 
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
 
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
 
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
 
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
 
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
 
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
 
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
 
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
 

Kürzlich hochgeladen

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Kürzlich hochgeladen (20)

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Learning from Computer Simulation to Tackle Real-World Problems

  • 1. Learning from Computer Simulations to Tackle Real-World Problems Hoon Kim Joint work with Kangwook Lee, Gyeongjo Hwang, and Changho Suh KAIST
  • 2. Hoon Kim Korea Advanced Institute of Science and Technology (KAIST) • M.S. in School of Electrical Engineering (2017 Mar. – Current) • Advisor: Prof. Changho Suh • B.S. in School of Electrical Engineering (2012 Feb. – 2017 Feb.) • Double major Computer Science • Interested in utilizing simulators to tackle real world problems! About myself
  • 3. 1.2 million images / 1000 Object Classes à Such datasets are not always viable in the real world • Deep learning has achieved exceptional performances in many tasks • Training sets need to be large, diverse, and accurately annotated Motivation
  • 4. Application 1: ”Predicting Eye Gaze Vector” à Difficult to annotate the eye gaze vector E.g., eyegaze dataset (ˆ✓, ˆ) Motivation
  • 5. Application 2: ”Predicting Accidents” à Difficult to acquire the data itself in the real world E.g., accident dataset Accident / Non-accident Motivation
  • 6. Recently, many researchers proposed utilizing simulator to tackle data scarcity à Generate arbitrary type & amount of data à No labeling cost However, synthetic data may not be realistic enough, resulting in poor performance in the real world Motivation UnityEyes – Eye simulator AirSim – Drone simulator Robot Arm Simulator CARLA - Driving simulator
  • 7. Overview of this talk 1. Simulated+Unsupervised Learning with Adaptive Data Generation and Bidirectional Mapping - International Conference on Learning Representation (ICLR) 2018 2. Crash to not Crash: Learn to Identify Dangerous Vehicles Using a Simulator - Association for the Advancement in Artificial Intelligence (AAAI) 2019
  • 8. Overview of this talk 1. Simulated+Unsupervised Learning with Adaptive Data Generation and Bidirectional Mapping - International Conference on Learning Representation (ICLR) 2018 2. Crash to not Crash: Learn to Identify Dangerous Vehicles Using a Simulator - Association for the Advancement in Artificial Intelligence (AAAI) 2019
  • 9. Source domain (simulator) Xs Ys Target domain (real world) Xt (-5, 5) , Given labeled source domain data (Xs, Ys) unlabeled target domain data (Xt) (25, 15) , , Problem Formulation à Train a predictor that works well in the target domain
  • 10. [Shrivastava et al., CVPR 2017] proposed “Simulated+Unsupervised learning” (3) Predict using the original images(1) Forward/translate images F (ˆ✓, ˆ)GX!Y F(✓, ) = ( 3.0/ 8.2) (✓, ) = ( 14.2/17.3) (✓, ) = ( 2.1/3.2) (✓, ) = ( 1/ 10) (✓, ) = ( 3.0/ 8.2) (✓, ) = ( 14.2/17.3) (✓, ) = ( 2.1/3.2) (✓, ) = ( 1/ 10) They showed, 1. Refining synthetic images with GAN makes them look real while preserving the labels 2. Training models on these refined images leads to significant improvements in accuracy (2) Train predictor on translated images Domain Adaptation from Simulation to Real World
  • 11. Comparison of a gaze estimator trained on Synthetic vs Refined Synthetic Training Data % of images within 7 degree error Synthetic Data 62.3 Synthetic Data 4x 64.9 Domain Adaptation from Simulation to Real World
  • 12. Comparison of a gaze estimator trained on Synthetic vs Refined Synthetic Training Data % of images within 7 degree error Synthetic Data 62.3 Synthetic Data 4x 64.9 Refined Synthetic Data 69.4 Refined Synthetic Data 4x 87.2 Can we do better? Domain Adaptation from Simulation to Real World
  • 13. 2 limitations of existing approach 1. [S] How can we specify synthetic label distribution? à Can we utilize the flexibility of the simulator to generate better synthetic data? 2. [U] Not taking advantage of the asymmetry between the real and synthetic dataset • The amount of available synthetic data is much larger than that of real data • Synthetic data is usually much cleaner (less noise) compared to real data à Can we devise a better unsupervised domain adaptation algorithm? Domain Adaptation from Simulation to Real World
  • 14. 1. Adaptive Data Generation • Match distributions between synthetic /real labels 2. Learn Bi-Directional Mapping • Learn mapping between synthetic /real data with modified CycleGAN 3. Backward Translation • Make prediction with real synthetic translated image Our proposed algorithm - Overview
  • 15. Simulator Synthetic data w/ Labels Generate Train Predictor Real data w/o Labels Real data w/ pseudo-labels Real-to-synthetic mapping Training Synthetic-like test data Test data Prediction Results Testing Update simulator parameters Adaptive Data Generation Learn Bi-Directional Mapping Backward Translation Predictor P (30, -10) (20, 5) (-11, 13) (20, 5) Our proposed algorithm - Overview
  • 16. Simulator Synthetic data w/ Labels Generate Train Predictor Real data w/o Labels Real data w/ pseudo-labels Real-to-synthetic mapping Training Synthetic-like test data Test data Prediction Results Testing Update simulator parameters Adaptive Data Generation Learn Bi-Directional Mapping Backward Translation Predictor P (30, -10) (20, 5) (-11, 13) (20, 5) Our proposed algorithm - Overview
  • 17. à Label distributions can be different in synthetic & real datasets à Fortunately, the synthetic distribution can be easily modified How Simulator works Specify Label Distribution Draw random label instance Generate image Q. How can we systematically adapt the label distribution? Step 1 – Adaptive Data Generation
  • 18. Overview of our adaptive data generation process (✓, ) = (12.5, 7.8) (✓, ) = ( 37.9, 7.1) (✓, ) = (7.4, 27.2) (✓, ) = (6.93, 12.8) F (3) Train a predictor F on (X, Z) (ˆ✓, ˆ) = ( 0.5, 1.2) (ˆ✓, ˆ) = ( 10.6, 1.0)(ˆ✓, ˆ) = ( 11.0, 12.8) (ˆ✓, ˆ) = ( 16.2, 6.4) (4) Pseudo-label real data: (Y, F(Y )) (✓, ) = (12.5, 7.8) (✓, ) = ( 37.9, 7.1) (✓, ) = (7.4, 27.2) (✓, ) = (6.93, 12.8) (2) Generate synthetic data (X, Z) (5) Estimate distribution of F(Y ) ˆ✓ ˆ PF (Y ) = P(ˆ✓, ˆ) (1) Set distribution of Z PZ = P(✓, ) ✓ (1) Set distribution of Z PZ = P(✓, ) ✓ Step 1 – Adaptive Data Generation
  • 19. à Experimental results of the adaptive data generation algorithm Iteration 0 Iteration 1 Iteration 2 8.54 3.00 1.04 0.31 0.02 0.00 Step 1 – Adaptive Data Generation
  • 20. Simulator Synthetic data w/ Labels Generate Train Predictor Real data w/o Labels Real data w/ pseudo-labels Real-to-synthetic mapping Training Synthetic-like test data Test data Prediction Results Testing Update simulator parameters Adaptive Data Generation Learn Bi-Directional Mapping Backward Translation Predictor P (30, -10) (20, 5) (-11, 13) (20, 5) Step 2 – Learn Bi-Directional Mapping
  • 21. • CycleGAN [Zhu et al., ICCV 2017] Step 2 – Learn Bi-Directional Mapping
  • 22. Step 2 – Learn Bi-Directional Mapping
  • 23. Step 2 – Learn Bi-Directional Mapping
  • 24. Step 3 – Backward Translation Simulator Synthetic data w/ Labels Generate Train Predictor Real data w/o Labels Real data w/ pseudo-labels Real-to-synthetic mapping Training Synthetic-like test data Test data Prediction Results Testing Update simulator parameters Adaptive Data Generation Learn Bi-Directional Mapping Backward Translation Predictor P (30, -10) (20, 5) (-11, 13) (20, 5)
  • 25. Step 3 – Backward Translation Backward/Translation: Our proposed training/test methodology Forward/Translation: training/test methodology proposed by [Shrivastava et al., 2017] Error with Forward/Translation: 8.4 7.71 Error with Backward/Translation: 8.4 7.60
  • 26. Results Comparison of the state of the arts on the MPIIGaze dataset (Error: mean angle error in degrees) Method Trained on Tested with Error KNN w/ UT Multiview (Zhang et al., 2015) R R 16.2 CNN w/ UT Multiview (Zhang et al., 2015) R R 13.9 CNN w/ UnityEyes (Shrivastava et al., 2017) S R 11.2 KNN w/ UnityEyes (Wood et al., 2016) S R 9.9 CNN w/ UnityEyes (Shrivastava et al., 2017) RS R 7.8 Ours w/ feature loss S RR 7.6
  • 27. Conclusion • In this work, we proposed a new S+U learning algorithm • Fully exploits the flexibility of simulators • Improved bidirectional mapping • Utilizes the data asymmetry between real/synthetic dastaset • Achieved the state-of-the-art result on MPIIGaze dataset
  • 28. Overview of this talk 1. Simulated+Unsupervised Learning with Adaptive Data Generation and Bidirectional Mapping - International Conference on Learning Representation (ICLR) 2018 2. Crash to not Crash: Learn to Identify Dangerous Vehicles Using a Simulator - Association for the Advancement in Artificial Intelligence (AAAI) 2019
  • 29. Collision Prediction System (CPS) • Goal of CPS: To predict vehicle collisions ahead of time • Existing CPSs are - based on rule-based algorithms - require expensive devices (LIDAR, communication chips, …) • Our goal is to build a CPS that is - based on data-driven algorithms - require commodity devices (camera sensor) Deep Learning ModelSingle Camera Collision Probability
  • 30. Problem Formulation Deep Learning ModelSingle Camera Collision Probability 1. Object Tracking 2. Object Classification detects/tracks vehicles in the input image(s) classifies whether or not the given vehicle is dangerous
  • 31. Problem Formulation Deep Learning ModelSingle Camera Collision Probability 1. Object Tracking 2. Object Classification detects/tracks vehicles in the input image(s) classifies whether or not the given vehicle is dangerous à Our main interest!
  • 32. CPS in Action Single Camera CPS - Safe vehicle - Dangerous vehicle
  • 33. Challenges in building a data-driven CPS: Data Scarcity • Existing driving dataset (KITTI, Cityscapes) has no accident data • Collecting accident data in the real world is very expensive • We can take advantage of driving simulator CARLA - Driving simulatorGrand Theft Auto V
  • 34. How do we collect accident dataset from a simulator? Q1. How do we reproduce accident scenes in a simulator?
  • 35. How do we collect accident dataset from a simulator? Q1. How do we reproduce accident scenes in a simulator? à we modify driver behavior via simulator hacking
  • 36. • ScriptHookV - A third party script that allows for an access to game engine • On top of ScriptHookV, we implemented the following helper functions • Start_Cruising(s) – makes the player’s car cruise along the lane at a constant speed s • Get_Nearby_Vehicles() – returns the list of vehicles in the current scene of the game • Set_Vehicle_Out_Of_Control(v) – makes the vehicle v drive out of control • Is_My_Car_Colliding() – checks whether my car is colliding or not • Our accident simulator generates accident data in a random manner by setting nearby vehicles out of control while cruising and saving past 20 frames when collision occurs Data Collection
  • 37. Original GTA V Data Collection Driver behavior modification via simulator hacking Hacked GTA V Normal Driving Scene Accident Driving Scene
  • 38. Data Collection - GTACrash • 3661 from original GTA V (Non-accident scenes) • 7720 from hacked GTA V (Accident scenes) • Each scene is composed of following frames: ···0.1s 0.1s 20 frames Each vehicle’s states (position, velocity, etc.) are also collected Dataset is available @ https://sites.google.com/view/crash-to-not-crash
  • 39. How can we collect accident dataset from a simulator? Q2. How can we make them more realistic?
  • 40. How can we collect accident dataset from a simulator? Q2. How can we make them more realistic? à Feature (Image): Style transfer via GAN à Label: label adaptation via motion model
  • 41. Synthetic train data Real test data Domain Adaptation – Feature Adaptation Domain difference between train and test images results in poor test performance
  • 42. à We apply backward translation approach (Lee, Kim, and Suh 2018) Synthetic train data Real test data Backward-translated test data Map images to synthetic domain at test time Domain Adaptation – Feature Adaptation
  • 43. How can we collect accident dataset from a simulator? Q2. How can we make them more realistic? à Feature (Image): Style transfer via GAN à Label: label adaptation via motion model
  • 44. Synthetic labels are biased à Synthetic labels are biased! Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Synthetic Label = 1 Desired Label = 0 Desired Label = 0 Desired Label = 1 Desired Label = 1 Desired Label = 1 Desired Label = 1 Desired Label = 1 Desired Label = 1 No sign of dangerNo sign of danger dangerous dangerous
  • 45. • Vehicle path prediction is a well studied field • Constant Turn Rate and Acceleration (CTRA) model – Schubert et al. 2008 • state-of-the-art path prediction performance in the real world • Requires vehicle states such as position, velocity, acceleration, and turn rate • By accessing internal states we can access all the required information à Our proposed label adaptation 1. Compute the predicted path of each vehicle as per the CTRA model 2. Assign a positive label if and only if the predicted path intersects with that of the ego-vehicle Vehicles’ internal states • Position • Velocity • Acceleration • Turnrate CTRA based Label Adaptation Refined Label Label Adaptation
  • 46. Label Adaptation t+0.5s t+0.5s t+2.0s t+1.5s t+1.0s t+0.5s t+0.5s t+1.0s t+1.5s t+2.0s t t t t Ego Vehicle Nearby VehicleRefined Label = 0 Refined Label = 1 1. Retrieve internal states 2. Predict future path via motion model 3. Refine Labels à We adapt labels based on a well-studied vehicle motion model Desired Label = 0 Desired Label = 1
  • 47. Experiments Q. Which approach will perform better on real data? Trained with Small real dataset Trained with Large synthetic data Pros Dataset is unbiased Large amount Cons Small amount Dataset is biased
  • 48. Experiments - YouTubeCrash • 100 Non-accident Scenes / 122 Accident Scenes • Manual annotation of bounding box and vehicle IDs • A half of YouTubeCrash is used as training dataset for baseline • Another half is used for test dataset for both baseline and our approach Accidents Non-Accidents
  • 49. Experiments – Dataset Comparison YouTubeCrash GTACrash Accident scenes 61 ~7000 Non-accident scenes 50 ~3500 Positive samples ~ 1000 ~130000 Negative samples ~ 5500 ~600000 ~ 100x
  • 50. ··· Safe or Dangerous CNN Experiments – Classification Algorithms Alg1. 5-layer NN with bounding box sizes Alg2. CNN with bounding boxes Alg3. CNN with RGBs & bounding boxes ··· Safe or Dangerous CNN ··· Safe or Dangerous 0.32 0.02 ··· Fully Connected Network
  • 51. Experiments – Test AUC on YoutubeCrash Alg/Training data Trained with real data Trained with synthetic data Size of training data 61 accidents 50 non-accidents 7000 accidents 3500 non-accidents Alg1 0.8766 Alg2 0.8708 Alg3 0.8730
  • 52. Experiments – Test AUC on YoutubeCrash Alg/Training data Trained with real data Trained with synthetic data Size of training data 61 accidents 50 non-accidents 7000 accidents 3500 non-accidents Alg1 0.8766 0.8812 Alg2 0.8708 0.8992 Alg3 0.8730 0.9086 Performance improvement with all tested algorithms
  • 53. Experiments – Test AUC on YoutubeCrash Alg/Training data Trained with real data Trained with synthetic data Size of training data 61 accidents 50 non-accidents 7000 accidents 3500 non-accidents Alg1 0.8766 0.8812 Alg2 0.8708 0.8992 Alg3 0.8730 0.9086 A complex model can be trained only with large data Increasing model complexity
  • 54. Experiments – Test AUC on YoutubeCrash Alg/Training data Trained with real data Trained with synthetic data Size of training data 61 accidents 50 non-accidents 7000 accidents 3500 non-accidents Alg1 0.8766 0.8812 Alg2 0.8708 0.8992 Alg3 0.8730 0.9086 à 0.9164 Domain adaptation further improves performance
  • 55. Experiments – Missed Detection vs Time-To-Collision Easy Difficult As TTC increases, it is more difficult to identify dangerous vehicles
  • 56. Experiments – Missed Detection vs Time-To-Collision Alg 1 – Trained with real data
  • 57. Experiments – Missed Detection vs Time-To-Collision 18.5% improvement Alg 1 – Trained with real data Alg 3 – Trained with synthetic data
  • 58. Experiment Results – Missed Detection vs Time-To-Collision Alg 1 – Trained with real data Alg 3 – Trained with synthetic data When TTC > 0.8, a complex model is necessary for identifying distant dangerous vehicles
  • 59. Velocity Wheel Angle Vehicle Angle Experiment Results – Learning Key Visual Features Checkout the website for the demonstration video
  • 60. Checkout the website for the demonstration video
  • 61. Conclusion • In this work, we propose an efficient deep learning-based CPS - Customized synthetic data generator - Novel label adaptation method - Our method outperforms those trained with real data • Our datasets and demo videos are available at https://sites.google.com/view/crash-to-not-crash