© 2019 Codeplay Software Ltd
Can We Have Both Safety and Performance in AI for Autonomous Vehicles?
Andrew Richards
Codeplay
May 2019
Outline
About Codeplay
What is functional safety?
What does an automotive AI system look like in terms of architecture?
=> The wide variety of compute-intensive algorithms
Why do we need high performance for safety?
And why accelerators are the only way to get to high performance
Requirements for safe engineering
Challenges in bringing existing CPU safe engineering practices to accelerators
Functional safety
Safety doesn’t mean the system doesn’t fail
➢ Safety means the system fails safely
How do you know if the system fails?
➢ You have to detect the failure with a high level of accuracy
➢ Both incorrect results and late results are a failure
What do you do if the system fails?
➢ You have to come up with a safe state to return to
Functional Safety
“Absence of unreasonable risk due to hazards caused by malfunctioning behavior of electrical/electronic systems”
The standard requires the development of the product to be “State of the Art”
Functional safety lifecycle: a top-down approach from vehicle to IPs and SW components
Safety compliance from project initiation to project decommissioning
Safety failure types
Systematic failures:
• Result from a failure in design or manufacturing
• Often a result of failure to follow best practices
• Rate of systematic failures can be reduced through continual and rigorous process improvement
Random failures:
• Result from random defects inherent to process or usage condition
• Rate of random failures cannot generally be reduced; focus must be on the detection and handling of random failures in the application
SOTIF: Safety Of The Intended Function
Systems or subsystems can cause hazards through erroneous decisions about the environment, not necessarily through malfunction of electrical/electronic components (which is what ISO 26262 addresses)
SOTIF answers the question “How do you intend to behave?” by using the PAS guidance on design, verification and validation
SOTIF intends to address sensor limitations (e.g. bad reflection, snow), decision algorithms (environment, location, highway construction etc.) and misuse by drivers
[Figure: SOTIF scenario quadrants numbered 1 to 4, on the axes Known/Unknown and Safe/Unsafe]
Reduction of scenarios in areas 2 and 3 is the key, by developing them into known scenarios
SAE levels for autonomous vehicles (figure labels: SOTIF ISO 21448, SOTIF PAS 21448)
• Level 5: Fully autonomous
• Level 4: Deep self control
• Level 3: Limited overall control
• Level 2: Execute automated manoeuvres
• Level 1: Adaptive assist
• Level 0: Warnings
Safety of Autonomous Driving needs High Performance
- High Performance makes Safety Hard
From sensing to control
[Figure: perception inputs feed sensor fusion, then path planning, then car control. Inputs: deep learning (front camera); machine vision and SLAM (surround cameras); LIDAR; RADAR]
Redundancy is achieved by having multiple, independent sensors and perception algorithms combined via sensor fusion
Performance cannot be achieved with CPUs
[Figure: pipeline. Camera → Frame capture → Semantic segmentation → 3D mapping → Sensor fusion → Object trajectory tracking / prediction → Path planning → Car control]
• Frame capture: 625 million pixels per second
• Semantic segmentation: 1.5-7.5 TOPS for each deep learning algorithm
• 3D mapping: 250 million cells updated per frame / sensor
• Sensor fusion: combine all the data together and check
Far beyond the processing power of a multi-core CPU
This level of processing can only be achieved with a different AI accelerator designed for each class of algorithm and sensor
Passive (fanless) cooling requires no more than 8 W-15 W per processor
Adding a fan is a safety challenge, as well as adding a lot of cost
Types of AI accelerator
Deep learning inference accelerator
• Fixed-point precision (8-bit or 16-bit)
• Can execute fast convolutions and some basic CNN layers
• Very high performance, but low programmability
Programmable accelerator (vision tasks e.g. SLAM)
• Mix of programmable and fixed-function
• Mix of fixed-point and floating-point
• Highly data parallel with on-chip memory
• Throughput optimized
Sensor fusion accelerator
• Very programmable
• Floating-point
• On-chip memory and caches
• Latency optimized
• Complex algorithms
Fixed-function accelerator
• Simpler LIDAR and Radar processing
• Some machine vision tasks, e.g. scaling
Requirements for safety
Requirements for safe engineering
• Redundancy (multiple systems)
• Fault detection (both timing and accuracy)
• Fault handling
• Fault injection (to test fault detection & handling)
• Coverage checking (to ensure test coverage)
• Coding guidelines (e.g. MISRA)
• Little or no dynamic memory management
How do we bring these capabilities to accelerators?
Redundancy: Systematic vs Random Faults
[Figure: a Sensor feeds Processor #1 and Processor #2, whose outputs are combined by Fusion]
This architecture allows Processor #1 to fail and Processor #2 to take over
But: what if the reason Processor #1 fails is a fault that also applies to Processor #2?
➢ e.g. software failure in software that both Processor #1 and Processor #2 run
A random fault may be solvable with two identical redundant systems
But a systematic fault can only be solved with two fundamentally different redundant systems
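A minimal sketch of the identical-versus-diverse redundancy point, assuming two hypothetical channel implementations (`distance_channel_a`, `distance_channel_b`) and a hypothetical `fuse` comparator; none of these names come from the talk. Two copies of the same buggy code agree with each other, so the comparison passes, whereas an independently written second channel exposes the disagreement.

```python
def distance_channel_a(time_of_flight_s):
    """Channel A: contains a deliberate systematic bug (forgets the round trip)."""
    speed_of_light = 299_792_458.0  # m/s
    return speed_of_light * time_of_flight_s  # bug: should be halved


def distance_channel_b(time_of_flight_s):
    """Channel B: independently written, correct implementation."""
    speed_of_light = 299_792_458.0  # m/s
    return speed_of_light * time_of_flight_s / 2.0


def fuse(first, second, tolerance_m=0.5):
    """Return (ok, value); ok is False when the two channels disagree."""
    if abs(first - second) > tolerance_m:
        return False, None
    return True, (first + second) / 2.0


tof = 200e-9  # hypothetical 200 ns echo (a target roughly 30 m away)

# Identical redundancy: both processors run the same buggy code and agree.
identical_ok, identical_value = fuse(distance_channel_a(tof), distance_channel_a(tof))

# Diverse redundancy: the independent channel disagrees, so the fault is detected.
diverse_ok, diverse_value = fuse(distance_channel_a(tof), distance_channel_b(tof))
```

In this sketch the identical pair passes the check while reporting a distance that is twice the true value; only the diverse pair flags the fault.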
Redundancy
Redundancy is much easier to achieve with sensors and perception than with sensor fusion, planning and control
Redundancy from two identical systems does not solve systematic faults, only transient faults
Standard programming models make redundancy much easier to achieve: it is much easier to mix and match components from different suppliers to avoid systematic faults
Standard programming models also make it much easier to integrate tools from multiple vendors, e.g. static checkers or memory checkers
The OpenCL SC (“Safety Critical”), Vulkan SC and SYCL SC working groups are working towards defining safer versions of these standards
Fault detection
Timing faults can be detected with a watchdog timer
All operations must have a maximum timeout
The quantity of processing required for various perception algorithms can vary by scene: e.g. the more potential pedestrians are discovered, the more regions of an image need pedestrian classification
For accuracy faults, one solution is to periodically pass known input data into each algorithm and check the result against known correct output data
The algorithms used must be deterministic (always give the same outputs for the same inputs), which is not true of all parallel algorithms
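The two checks described on this slide can be sketched as follows. This is an illustration under stated assumptions, not a real watchdog: a hardware watchdog fires even if the algorithm never returns, whereas this sketch only classifies a result as late after it arrives. The function names and the `count_bright_pixels` kernel are hypothetical. The last lines show why determinism matters: floating-point addition is not associative, so a parallel reduction that changes the summation order can change the result.

```python
import time


def checked_run(algorithm, data, deadline_s, clock=time.monotonic):
    """Run `algorithm`; a late result is treated as a fault, like a wrong one."""
    start = clock()
    result = algorithm(data)
    elapsed = clock() - start
    if elapsed > deadline_s:
        return "timing_fault", None
    return "ok", result


def known_answer_test(algorithm, golden_input, golden_output, deadline_s,
                      clock=time.monotonic):
    """Feed known input data and compare against known correct output data."""
    status, result = checked_run(algorithm, golden_input, deadline_s, clock)
    if status != "ok":
        return status
    return "ok" if result == golden_output else "accuracy_fault"


def count_bright_pixels(pixels):
    """Hypothetical stand-in for a perception kernel."""
    return sum(1 for p in pixels if p > 128)


def stuck_kernel(pixels):
    """The same kernel with a simulated fault: it always reports zero."""
    return 0


GOLDEN_INPUT = [0, 200, 129, 128, 255]
GOLDEN_OUTPUT = 3

healthy = known_answer_test(count_bright_pixels, GOLDEN_INPUT, GOLDEN_OUTPUT, 1.0)
faulty = known_answer_test(stuck_kernel, GOLDEN_INPUT, GOLDEN_OUTPUT, 1.0)

# Summation order changes the floating-point result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
```

The `clock` parameter is there so that a test can simulate a late result without actually waiting.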
Fault handling
Handling faults in highly parallel software is a surprisingly tough challenge
Faults detected asynchronously need to be stored somewhere and then processed. They can’t be handled immediately without consuming resources asynchronously. This is a safety challenge
Massively parallel software could create large numbers of faults at once: how should they be handled?
Most parallel programming models handle faults very badly. It’s a much harder challenge than people expect
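One possible policy for storing asynchronous faults, sketched under assumptions: a fixed-capacity log whose slots are reserved up front, with an overflow counter so that a burst of faults is reported rather than silently lost. The class name `BoundedFaultLog` and the fault code are hypothetical, and Python threads only stand in for accelerator work-items here.

```python
import threading


class BoundedFaultLog:
    """Fixed-capacity fault store: slots are reserved once, up front."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._count = 0
        self._dropped = 0
        self._lock = threading.Lock()

    def record(self, source_id, code):
        """Called from asynchronous contexts; never grows the store."""
        with self._lock:
            if self._count == len(self._slots):
                self._dropped += 1  # overflow is counted, not ignored
                return False
            self._slots[self._count] = (source_id, code)
            self._count += 1
            return True

    def drain(self):
        """Called by the handler thread: return stored faults and overflow count."""
        with self._lock:
            faults = self._slots[:self._count]
            dropped = self._dropped
            for i in range(self._count):
                self._slots[i] = None
            self._count = 0
            self._dropped = 0
            return faults, dropped


log = BoundedFaultLog(capacity=4)

# Simulate 100 parallel work-items that all fault at once.
workers = [threading.Thread(target=log.record, args=(i, "OUT_OF_RANGE"))
           for i in range(100)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

faults, dropped = log.drain()
```

The handler sees at most four stored faults plus a count of how many more occurred, which bounds both memory use and processing time.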
Multi-threaded error handling
• Errors triggered on an accelerator are asynchronous
• Error handling can’t be executed on the accelerator
• When does the main CPU thread process error(s)?
[Figure: timeline. The main CPU thread offloads a kernel to the accelerator, where grouped accelerator ‘threads’ run it and one of them raises an error; an accelerator handler CPU thread waits for the accelerator to complete]
• Accelerator threads are grouped
• On the accelerator thread that raises the error: where does this thread store the error?
• On the handler CPU thread: this thread waits for the accelerator to complete. Is that fast enough to process the error?
Pre-emption and independent forward progress
• Most accelerators are groups of SIMD/SIMT units (“Single Instruction Multiple Data/Thread”): this gives high performance per Watt
• This means each thread executes the same instruction in “lock-step”
• Some threads may be inside the false branch of a conditional: they are “predicated” so that instructions have no effect until the conditional ends
• This means that if one “thread” in a group goes into an infinite loop, the others will also pause indefinitely
• The accelerator does not complete until all groups complete
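A toy simulation of the lock-step behaviour described above, assuming a deliberately simplified model: each lane has a loop trip count, finished lanes are predicated off but still wait, and `None` marks a lane that never terminates. The function name and the `max_steps` budget are hypothetical and only exist so the simulation itself terminates.

```python
def run_group_lockstep(trip_counts, max_steps):
    """Simulate one SIMT group executing a loop in lock-step.

    trip_counts: loop iterations each lane needs; None means an infinite loop.
    Returns the number of steps the whole group took, or None if the group
    did not complete within max_steps.
    """
    remaining = list(trip_counts)
    steps = 0
    while any(r is None or r > 0 for r in remaining):
        if steps == max_steps:
            return None  # the group is still running: no lane can retire
        # Active lanes do one iteration; finished lanes are predicated off.
        remaining = [r if r is None else max(r - 1, 0) for r in remaining]
        steps += 1
    return steps


# The group takes as long as its slowest lane.
healthy_group = run_group_lockstep([1, 2, 3, 4], max_steps=100)

# One lane in an infinite loop stalls every other lane in the group.
stuck_group = run_group_lockstep([1, None, 3, 4], max_steps=100)
```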
Putting an accelerator in a safe state
With a CPU thread, you stop the thread by no longer giving it CPU cycles
Stopping a CPU thread is instant. Stopping an accelerator thread is not
Stopping one group of accelerator threads doesn’t necessarily stop other groups of threads
You can’t safely free accelerator-accessed memory until all accelerator threads have safely stopped. You can’t easily predict how long this will take
Simple solution: shut down the whole chip
[Figure: accelerator ‘threads’ being killed while they still access an accelerator memory buffer]
Fault injection
Fault handling can only be tested properly if faults can be injected into the system during testing
Fault injection must happen asynchronously to be sure of finding bugs
Fault injection needs to work across multiple AI accelerators
Fault injection must be included in continuous test processes
Faults to consider:
• Transient hardware faults
• Overheating causing throttling
• Threading errors
• Algorithms taking an unusually long time to complete due to complex input data
We need fault injection tools for accelerators (e.g. NVIDIA SASSIFI)
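A software-level sketch of a fault injection campaign, to illustrate the idea only. It does not use SASSIFI or any accelerator tool; it flips single bits in a small buffer and counts how often a simple additive checksum (chosen here as a stand-in detector) notices. All names are hypothetical. The random generator is seeded so that any trial can be replayed in a continuous test run.

```python
import random


def flip_bit(values, index, bit):
    """Inject a transient fault: flip one bit of one 8-bit value (on a copy)."""
    corrupted = list(values)
    corrupted[index] ^= (1 << bit)
    return corrupted


def checksum(values):
    """Simple additive checksum used here as the fault detector."""
    return sum(values) & 0xFFFF


def run_campaign(values, trials, seed):
    """Inject `trials` random single-bit faults; count how many are detected."""
    rng = random.Random(seed)
    reference = checksum(values)
    detected = 0
    for _ in range(trials):
        index = rng.randrange(len(values))
        bit = rng.randrange(8)
        if checksum(flip_bit(values, index, bit)) != reference:
            detected += 1
    return detected


frame = [10, 20, 30, 40, 50, 60, 70, 80]
detected = run_campaign(frame, trials=200, seed=1)
```

A single bit flip always changes the sum, so this detector catches every injected fault in this campaign; multi-bit faults that cancel out would need a stronger check.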
Coverage checking
A standard ISO 26262 process is to require line coverage and condition coverage for test suites
This checks that each line (or condition combination) is exercised by the test suite
Commonly supported on CPUs, but what about AI accelerators?
The compilers for AI accelerators typically perform transformations, such as the data-parallel vectorization used with GPUs, that significantly change the control flow of the program relative to the source code
How do we define coverage checking for AI accelerators?
Coverage checking in a heterogeneous environment
Coverage checking is a way of applying a metric to a test suite: does the test suite test every line in a program?
Stricter coverage checking ensures every condition in a conditional is also tested
In a heterogeneous environment, a single source line may be compiled for different accelerator cores
Each accelerator core may execute the source line in a slightly different way
• How do we define coverage in an accelerator model?
• How do we test coverage in an accelerator model?
• If a SIMT compiler has transformed code, what does coverage mean?
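To make the difference between line coverage and the stricter condition coverage concrete, here is a small hand-instrumented sketch on the CPU side. The `should_brake` function and the instrumentation are hypothetical; real projects would use a qualified coverage tool rather than manual bookkeeping.

```python
from itertools import product


def make_instrumented_check():
    """Return a decision function plus the set of condition combinations seen."""
    seen = set()

    def should_brake(obstacle_detected, within_range):
        seen.add((obstacle_detected, within_range))  # manual instrumentation
        if obstacle_detected and within_range:
            return True
        return False

    return should_brake, seen


def condition_coverage(seen):
    """Fraction of the four True/False combinations exercised so far."""
    all_combinations = set(product([False, True], repeat=2))
    return len(seen & all_combinations) / len(all_combinations)


should_brake, seen = make_instrumented_check()

# These two tests execute every line of should_brake (full line coverage)...
assert should_brake(True, True) is True
assert should_brake(False, False) is False
partial = condition_coverage(seen)

# ...but only the mixed cases complete the condition combinations.
assert should_brake(True, False) is False
assert should_brake(False, True) is False
full = condition_coverage(seen)
```

Once a SIMT compiler has turned the branch into predicated lock-step code, it is no longer obvious which of these source-level combinations a given hardware execution corresponds to, which is the open question on this slide.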
Coding guidelines: MISRA C++
Standardized coding guidelines for writing safe software. Can be checked with static source code checker tools
• Originated by the automotive industry, for the automotive industry
• But applicable to any industry that requires high-integrity software
• Originally, MISRA suggested (in its vision) its use in safety-related software
• But it now suggests (in its vision) its applicability to any application with high integrity or high reliability requirements
The MISRA C++ group is updating the MISRA C++ standard to support accelerator programming, in collaboration with AUTOSAR. It is being written as an update to MISRA C++ 2008. This is where the AI and SYCL accelerator support will go for autonomous driving coding guidelines
Dynamic memory management
Accelerator programming models rely extensively on dynamic memory management
This is a real challenge for AI accelerators:
• How to define a standard way of statically allocating memory for AI acceleration
• How to free memory safely in a fault situation
• How to isolate different safety domains in a program, without corruption between memory allocated in different safety domains
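One common way to avoid dynamic allocation at run time is a fixed pool reserved at start-up. The sketch below shows the policy only: Python itself allocates freely, so this is not an allocation-free implementation, and `StaticBufferPool` and the domain names are hypothetical. Giving each safety domain its own pool means one domain cannot exhaust another's memory.

```python
class StaticBufferPool:
    """All buffer memory is reserved once; acquire/release hand out fixed slots."""

    def __init__(self, buffer_count, buffer_size):
        self._storage = bytearray(buffer_count * buffer_size)  # single reservation
        self._size = buffer_size
        self._free = list(range(buffer_count))  # indices of free buffers

    def acquire(self):
        """Return (index, view), or None when the pool is exhausted."""
        if not self._free:
            return None
        index = self._free.pop()
        start = index * self._size
        return index, memoryview(self._storage)[start:start + self._size]

    def release(self, index):
        """Zero the buffer and hand it back to the pool."""
        if index in self._free:
            raise ValueError("buffer released twice")
        start = index * self._size
        for offset in range(start, start + self._size):
            self._storage[offset] = 0
        self._free.append(index)


# One pool per safety domain.
perception_pool = StaticBufferPool(buffer_count=2, buffer_size=16)
planning_pool = StaticBufferPool(buffer_count=1, buffer_size=16)

first = perception_pool.acquire()
second = perception_pool.acquire()
third = perception_pool.acquire()   # exhausted: reported, nothing is allocated
planning = planning_pool.acquire()  # unaffected by the perception domain
```

On a real accelerator the hard part flagged on the slide remains: a buffer can only be released safely once every accelerator thread that might touch it has stopped.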
Accelerator memory management
• Accelerators have a much more direct view of memory than a CPU
• The simplest approach is pinned memory: at a known physical address
• Accelerators have much simpler memory protection than a CPU
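[Figure: the CPU reaches physical memory (e.g. DDR) and storage (e.g. hard disk) through the operating system and its virtual memory management system. The accelerator sits beside the CPU with a more direct path to physical memory; there may be a memory management unit on the accelerator side, but it is usually much simpler than for a CPU]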
Virtualization
Virtualization is well-defined for CPUs and can contribute to safety isolation
But for accelerators, virtualization is not clearly defined
Accelerators can’t switch instantly between threads and can’t shut down a thread instantly. Memory protection isn’t the same as on a CPU
[Figure: the CPU stack (hypervisor, operating system, virtual memory management system, physical memory (e.g. DDR), storage (e.g. hard disk)) is marked “Virtualization goes here”; the accelerator is marked “How does virtualization go here?”]
Pragmatic solutions
Package non-safety-qualified systems via decomposition
• ISO 26262 defines “Quality Managed” (“QM”)
• These systems can adopt the latest technologies without being developed to full safety standards
• We can wrap QM systems inside ASIL systems
• We need to monitor the running of the system and be able to shut down a faulty system
• Requires the ability to detect failures
[Figure: safety monitoring for AI. Input → QM AI system → Output, supervised by an ASIL B monitoring system]
Build full safety-qualified systems
• Build from the ground up: safe RTOS that supports accelerators
• Safe programming models
• Safety analysis tools
• Independent testing and validation
[Figure: safe heterogeneous programming tools on top of a safe RTOS, running on CPU and AI accelerator]
Multiple, independent, redundant systems
• If we independently develop systems to perform specific tasks, we can achieve fully safe redundancy
[Figure: System #1 (Dev Team #1) and System #2 (Dev Team #2) feed a “combine & check results” stage]
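The decomposition idea (a QM component wrapped by a simpler monitor) can be sketched as below. Everything here is an assumption for illustration: the planner, the plausibility limits and the safe state are hypothetical, and a real ASIL B monitor would be developed and qualified under its own process.

```python
SAFE_STATE = {"steering_deg": 0.0, "brake": 1.0}  # hypothetical safe state


def qm_ai_planner(sensor_reading):
    """Stand-in for an unqualified (QM) AI component.

    A negative reading is used here to simulate a malfunction.
    """
    if sensor_reading < 0:
        return {"steering_deg": 540.0, "brake": -0.2}  # implausible output
    return {"steering_deg": 2.5, "brake": 0.1}


def monitor(output, max_steering_deg=35.0):
    """Simple, independently checkable plausibility rules."""
    return (abs(output["steering_deg"]) <= max_steering_deg
            and 0.0 <= output["brake"] <= 1.0)


def safe_pipeline(sensor_reading):
    """Run the QM component, passing on only outputs the monitor accepts."""
    try:
        output = qm_ai_planner(sensor_reading)
    except Exception:
        return SAFE_STATE, "fault: exception"
    if not monitor(output):
        return SAFE_STATE, "fault: implausible output"
    return output, "ok"


normal_output, normal_status = safe_pipeline(1.0)
fault_output, fault_status = safe_pipeline(-1.0)
```

The monitor is intentionally far simpler than the component it supervises, which is what makes it practical to develop to a higher integrity level.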
Summary
1. We need to use a range of AI accelerators to achieve AI in automotive.
• We can’t just assume CPU safety processes can easily transfer to accelerators
• We need all the tools we have for safety on CPUs brought to accelerators
2. There are a lot of unexpected challenges
3. Standards are critical for building out these tools and ecosystem
• There are industry-wide standards being developed, but we need to get more
people involved to deliver safe solutions
About Codeplay
Accelerator silicon enablement
• OpenCL and Vulkan implementations with ComputeAorta product for customers’ processors
• Custom LLVM compiler back-ends and runtime drivers
• Accelerator processor optimizations
Open accelerator ecosystem
• Open standards and open-source ecosystem for AI acceleration
• SYCL ecosystem: the open alternative ecosystem to CUDA
• TensorFlow, Eigen
• SYCL-BLAS, SYCL-DNN, SYCL-M
• Open-source accelerator libraries: clSPV, SPIR-V tools
Automotive AI tools
• Support for Renesas R-Car and Imagination Technologies PowerVR
• Optimized SYCL-BLAS and SYCL-DNN libraries for automotive AI processors
• Profiler to analyse performance
• Working towards ISO 26262 ASIL B standards-based acceleration
70+ expert AI and graphics acceleration engineers in Edinburgh, Scotland, UK
Ready to provide all the tech & services to deliver ground-breaking AI technologies
Resources
SYCL standard & ecosystem
http://sycl.tech/
MISRA
MISRA C and C++ standards body
https://www.misra.org.uk/
Codeplay automotive tools
https://developer.codeplay.com/home/
Codeplay booth
See our tools on Renesas and Imagination Technologies ADAS accelerator processors
Khronos Workshop at EVS
Will cover OpenVX, Vulkan, OpenCL, NNEF and SYCL in much more detail
Thursday May 23rd, 9am-5pm
https://www.khronos.org/events/2019-embedded-vision-summit
Backup
Tesla FSD chip
Block    Area (mm²)    GOPS
GPU      40.9          600
CPU      22.1          211
NNA      15.4          72,000
SRAM     67.6
Cache    18.6
Total    260           72,811
• NNA: fast, low-precision convolutions
• SRAM: needed to keep processors supplied with data
• CPU: highly general-purpose at lower performance
• GPU: most of the programmable performance
https://www.youtube.com/watch?v=Ucp0TTmvqOE
From sensing to control
[Figure: pipeline. Camera → Frame capture → Semantic segmentation → 3D mapping → Sensor fusion → Trajectory tracking → Path planning → Car control]
• These systems typically operate at 15-25 frames per second (depending
on maximum speed and safety requirements)
• Roughly 8 input frames are required to make a processing decision
• Includes tracking movement over several frames
• Includes pipelining for higher throughput
• At 70 mph (112 km/h), braking distance is 75 m and “thinking distance” (for a human) is 21 m, or about 0.7 seconds at that speed
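A quick worked calculation of what the frame rates above imply, assuming the decision latency is simply the number of frames needed divided by the frame rate (the slides do not state this formula; it is a simplification that ignores processing time within a frame). Function names are hypothetical.

```python
def decision_latency_s(frames_needed, frames_per_second):
    """Time to collect the frames needed for one decision."""
    return frames_needed / frames_per_second


def distance_travelled_m(speed_kmh, seconds):
    """Distance covered at constant speed during `seconds`."""
    return speed_kmh / 3.6 * seconds


SPEED_KMH = 112.0   # 70 mph, from the slide
FRAMES_NEEDED = 8   # "roughly 8 input frames"

latency_at_25fps = decision_latency_s(FRAMES_NEEDED, 25)
latency_at_15fps = decision_latency_s(FRAMES_NEEDED, 15)
distance_at_25fps = distance_travelled_m(SPEED_KMH, latency_at_25fps)
distance_at_15fps = distance_travelled_m(SPEED_KMH, latency_at_15fps)
```

Under these assumptions the vehicle travels roughly 10 m (at 25 frames per second) to roughly 17 m (at 15 frames per second) before a decision is available.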
Frame capture
If a camera can capture a complete view of a 2 m pedestrian at 2 m distance, then a pedestrian at 100 m distance will cover no more than 1/50th of the height of the image, or 1/2,500th of the area of the image.
[Figure: a 2 m pedestrian filling the frame at 2 m distance, and the same pedestrian at 100 m]
If an algorithm needs a pedestrian to be 100 pixels tall to recognize them, the camera must be 25 megapixels (5,000 × 5,000) to recognize a pedestrian at 100 m, which is required to drive at 70 mph
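The arithmetic behind the 25 megapixel figure can be reproduced as follows. This sketch reads “100 pixels” as 100 pixels of height and assumes a square sensor, which is the reading under which the slide's numbers work out; both are assumptions, as is the 25 frames per second used for the data rate.

```python
def fraction_of_image_height(full_frame_distance_m, distance_m):
    """A target that fills the frame at one distance shrinks linearly with range."""
    return full_frame_distance_m / distance_m


height_fraction = fraction_of_image_height(2.0, 100.0)  # 1/50 of the height
area_fraction = height_fraction ** 2                     # 1/2,500 of the area

pedestrian_height_px = 100                               # assumed reading
image_height_px = pedestrian_height_px / height_fraction
megapixels = image_height_px ** 2 / 1e6                  # square sensor assumed
pixels_per_second = image_height_px ** 2 * 25            # at 25 frames per second
```

With these assumptions the result is about 25 megapixels per frame, and about 625 million pixels per second at 25 frames per second.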
Semantic segmentation
60-300 GFLOPs per frame
At 25 fps = 1.5 TFLOPS to 7.5 TFLOPS, but inference can often be done in fixed point, which is TOPS, not TFLOPS
Source: Recurrent Segmentation for Variable Computational Budgets, L McIntosh, N Maheswaranathan, D Sussillo, J Shlens (Stanford University & Google Brain), arXiv:1711.10151v2 [cs.CV], 15 Mar 2018
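The throughput range on this slide is the per-frame cost multiplied by the frame rate. A short check, with a hypothetical helper name:

```python
def sustained_tera_ops(giga_ops_per_frame, frames_per_second):
    """Sustained throughput in tera-operations per second."""
    return giga_ops_per_frame * frames_per_second / 1000.0


low = sustained_tera_ops(60, 25)    # lower bound of the per-frame cost
high = sustained_tera_ops(300, 25)  # upper bound of the per-frame cost
```

This gives 1.5 and 7.5 tera-operations per second; whether those are counted as TFLOPS or TOPS depends on whether inference runs in floating point or fixed point.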
3D Mapping
Each sensor (cameras, LIDAR, Radar) and each perception algorithm (deep learning, SLAM, point cloud, etc.) needs to generate a 3D map of the environment it detects and a list of objects (pedestrians, cars etc.) to track
A 100 m × 100 m × 10 m occupancy grid of 10 cm × 10 cm × 10 cm cells contains 100,000,000 cells updated every frame
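The cell count can be checked directly. Note that 100,000,000 cells corresponds to 10 cm cells in a 100 m × 100 m × 10 m volume; cells of 100 cm (1 m) would give only 100,000. The helper name is hypothetical.

```python
def occupancy_cells(extent_m, cell_size_m):
    """Number of cubic cells of side cell_size_m in a box of size extent_m."""
    cells = 1
    for side_m in extent_m:
        cells *= round(side_m / cell_size_m)
    return cells


cells_10cm = occupancy_cells((100, 100, 10), 0.10)
cells_1m = occupancy_cells((100, 100, 10), 1.0)
```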
Sensor Fusion
Sensor fusion combines data from all sensors and perception algorithms. It detects inconsistencies between different sensors to detect errors
This is where the redundancy in the sensors is used to achieve safety. But how do you achieve redundancy in the sensor fusion?
Needs to process all data from all perception algorithms combined
To achieve performance, create a pipeline
[Figure: successive frames overlapped in the pipeline. Each frame passes through Camera → Frame capture → Semantic segmentation → 3D mapping → Sensor fusion → Object trajectory tracking / prediction → Path planning → Car control]
• To achieve maximum throughput, this will be pipelined
• It can also take at least 3 frames to track movement

Edge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

"Can We Have Both Safety and Performance in AI for Autonomous Vehicles?," a Presentation from Codeplay Software

  • 1. © 2019 Codeplay Software Ltd Can We Have Both Safety and Performance in AI for Autonomous Vehicles? Andrew Richards Codeplay May 2019
  • 2. Outline: About Codeplay. What is functional safety? What does an automotive AI system look like in terms of architecture (the wide variety of compute-intensive algorithms)? Why do we need high performance for safety, and why are accelerators the only way to get to high performance? Requirements for safe engineering. Challenges in bringing existing CPU safe engineering practices to accelerators.
  • 3. Functional safety. Safety doesn’t mean the system doesn’t fail; safety means the system fails safely. How do you know if the system fails? You have to detect the failure with a high level of accuracy, and both incorrect results and late results are failures. What do you do if the system fails? You have to come up with a safe state to return to.
  • 4. Functional Safety: “Absence of unreasonable risk due to hazards caused by malfunctioning behavior of electrical/electronic systems”. The standard requires the development of the product to be “State of the Art”. The functional safety lifecycle is a top-down approach from the vehicle to IPs and SW components. Safety compliance runs from project initiation to project decommissioning.
  • 5. Safety failure types. Systematic failures: result from a failure in design or manufacturing; often a result of failure to follow best practices; the rate of systematic failures can be reduced through continual and rigorous process improvement. Random failures: result from random defects inherent to process or usage condition; the rate of random failures cannot generally be reduced, so the focus must be on the detection and handling of random failures in the application.
  • 6. SOTIF: Safety Of The Intended Function (SOTIF PAS 21448, SOTIF ISO 21448). Systems or subsystems can cause hazards based on erroneous decisions about the environment, not necessarily caused by malfunction of electrical/electronic components (which is addressed by ISO 26262). SOTIF answers the question “How do you intend to behave?” by utilizing the PAS guidance on design, verification and validation. SOTIF intends to address sensor limitations (e.g. bad reflection, snow), decision algorithms (environment, location, highway construction etc.) and misuse by drivers. [Figure: scenario areas 1 to 4 arranged on Known/Unknown and Safe/Unsafe axes.] Reduction of scenarios in areas 2 and 3 is the key, by developing them into known scenarios. SAE levels for autonomous vehicles: Level 0, warnings; Level 1, adaptive assist; Level 2, execute automated manoeuvres; Level 3, limited overall control; Level 4, deep self control; Level 5, fully autonomous.
  • 7. Safety of Autonomous Driving needs High Performance - High Performance makes Safety Hard
  • 8. From sensing to control. [Figure: deep learning on the front camera, machine vision and SLAM on the surround cameras, LIDAR and RADAR feed sensor fusion, followed by path planning and car control.] Redundancy is achieved by having multiple independent sensors and perception algorithms combined via sensor fusion.
  • 9. Performance cannot be achieved with CPUs. Pipeline stages: camera, frame capture, semantic segmentation, 3D mapping, sensor fusion, object trajectory tracking / prediction, path planning, car control. Workload: 625 million pixels per second; 1.5-7.5 TOPS for each deep learning algorithm; 250 million cells updated per frame per sensor; then combine all the data together and check. This is far beyond the processing power of a multi-core CPU. This level of processing can only be achieved with a different AI accelerator designed for each class of algorithm and sensor. Passive (fanless) cooling requires no more than 8 W-15 W per processor. Adding a fan is a safety challenge, as well as adding a lot of cost.
  • 10. Types of AI accelerator. Deep learning inference accelerator: fixed-point precision (8-bit or 16-bit); can execute fast convolutions and some basic CNN layers; very high performance, but low programmability. Programmable accelerator (vision tasks, e.g. SLAM): mix of programmable and fixed-function; mix of fixed-point and floating-point; highly data parallel with on-chip memory; throughput optimized. Sensor fusion accelerator: very programmable; floating-point; on-chip memory and caches; latency optimized; complex algorithms. Fixed-function accelerator: simpler LIDAR and Radar processing; some machine vision tasks, e.g. scaling.
  • 11. Requirements for safety
  • 12. Requirements for safe engineering: redundancy (multiple systems); fault detection (both timing and accuracy); fault handling; fault injection (to test fault detection and handling); coverage checking (to ensure test coverage); coding guidelines (e.g. MISRA); little or no dynamic memory management. How do we bring these capabilities to accelerators?
  • 13. Redundancy: Systematic vs Random Faults. [Figure: one sensor feeds Processor #1 and Processor #2, whose outputs are combined by fusion.] This architecture allows Processor #1 to fail and Processor #2 to take over. But what if the reason Processor #1 fails is a fault that also applies to Processor #2, e.g. a failure in software that both Processor #1 and Processor #2 run? A random fault may be solvable with two identical redundant systems, but a systematic fault can only be solved with two fundamentally different redundant systems.
  • 14. Redundancy. Redundancy is much easier to achieve with sensors and perception than with sensor fusion, planning and control. Redundancy from two identical systems does not solve systematic faults, only transient faults. Standard programming models make redundancy much easier to achieve, because components from different suppliers can be mixed and matched to avoid systematic faults. Standard programming models also make it much easier to integrate tools from multiple vendors, e.g. static checkers or memory checkers. The OpenCL SC (“Safety Critical”), Vulkan SC and SYCL SC working groups are working towards defining safer versions of these standards.
  • 15. Fault detection. Timing faults can be detected with a watchdog timer, so all operations must have a maximum timeout. The quantity of processing required for various perception algorithms can vary with the scene: e.g. the more potential pedestrians discovered, the more regions of an image pedestrian classification has to run on. For accuracy faults, one solution is to periodically pass known input data into each algorithm and check the result against known correct output data. The algorithms used must therefore be deterministic (always give the same outputs for the same inputs), which is not true of all parallel algorithms.
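
The known-answer check and timeout from slide 15 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an automotive-grade implementation: `checked_run`, `GOLDEN_INPUT`, `GOLDEN_OUTPUT` and the toy `perception_step` are hypothetical names, and a real watchdog timer would be independent hardware rather than a software clock read after the call returns.

```python
import time

def perception_step(pixels):
    # Hypothetical stand-in for a deterministic perception algorithm.
    return sum(p * p for p in pixels)

# Known input and its known correct output (1 + 4 + 9 + 16).
GOLDEN_INPUT = [1, 2, 3, 4]
GOLDEN_OUTPUT = 30

def checked_run(algorithm, golden_in, golden_out, deadline_s):
    """Run one known-answer test and classify the outcome."""
    start = time.monotonic()
    result = algorithm(golden_in)
    elapsed = time.monotonic() - start
    if elapsed > deadline_s:
        return "timing_fault"   # a late result is a failure
    if result != golden_out:
        return "value_fault"    # an incorrect result is a failure
    return "ok"

def faulty_step(pixels):
    # Simulated accuracy fault: result is off by one.
    return perception_step(pixels) + 1

def slow_step(pixels):
    # Simulated timing fault: correct result, delivered late.
    time.sleep(0.05)
    return perception_step(pixels)

status_ok = checked_run(perception_step, GOLDEN_INPUT, GOLDEN_OUTPUT, 0.5)
status_value = checked_run(faulty_step, GOLDEN_INPUT, GOLDEN_OUTPUT, 0.5)
status_timing = checked_run(slow_step, GOLDEN_INPUT, GOLDEN_OUTPUT, 0.01)
```

Note that the exact comparison against the golden output only works because the toy algorithm is deterministic, which is the property the slide asks for.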
  • 16. Fault handling. Handling faults in highly parallel software is a surprisingly tough challenge. Faults detected asynchronously need to be stored somewhere and then processed. They can’t be handled immediately without consuming resources asynchronously, and this is a safety challenge. For massively parallel software, large numbers of faults could be created at once: how should they be handled? Most parallel programming models handle faults very badly. It’s a much harder challenge than people expect.
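
One way to picture "store the fault somewhere, then process it later" is a fault buffer whose storage is reserved once, so that a burst of faults cannot consume unbounded resources. The sketch below is only an illustration: `FaultLog` is a hypothetical name, the code is single-threaded, and a real accelerator runtime would need atomic or lock-protected updates.

```python
class FaultLog:
    """Fixed-capacity fault buffer: slots are reserved once, up front."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._count = 0
        self.dropped = 0  # faults that did not fit are counted, not lost silently

    def record(self, source_id, code):
        # Called when a fault is detected; never grows the buffer.
        if self._count < len(self._slots):
            self._slots[self._count] = (source_id, code)
            self._count += 1
        else:
            self.dropped += 1

    def drain(self):
        # Called later by the handling thread; returns and clears stored faults.
        faults = self._slots[:self._count]
        for i in range(self._count):
            self._slots[i] = None
        self._count = 0
        return faults

log = FaultLog(capacity=4)
for thread_id in range(6):          # six faults raised "at once"
    log.record(thread_id, "out_of_range")
pending = log.drain()
```

With a capacity of four and six simultaneous faults, four are stored and two are counted as dropped, which makes the overflow itself observable.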
  • 17. Multi-threaded error handling. Errors triggered on an accelerator are asynchronous. Error handling can’t be executed on the accelerator. When does the main CPU thread process the error(s)? [Figure: timeline in which the main CPU thread offloads a kernel to groups of accelerator ‘threads’, one of which raises an error, alongside an accelerator handler CPU thread.] Accelerator threads are grouped: where does the failing thread store the error? The handler thread waits for the accelerator to complete: is that fast enough to process the error?
  • 18. Pre-emption and independent forward progress. Most accelerators are groups of SIMD/SIMT (“Single Instruction Multiple Data/Thread”) units: this gives high performance per watt. This means each thread executes the same instruction in “lock-step”. Some threads may be inside the false branch of a conditional: they “predicate” to not apply the effects of instructions until the condition ends. This means that if one “thread” in a group goes into an infinite loop, the others will also pause indefinitely. The accelerator does not complete until all groups complete. [Figure: a grid of accelerator ‘threads’.]
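
The lock-step behaviour can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not a model of any real GPU: `run_group_lockstep` is a hypothetical function in which every step is issued to the whole group and finished lanes are merely masked off.

```python
def run_group_lockstep(iterations_per_lane, max_steps):
    """Simulate one SIMT group and return the steps it needed,
    or None if it did not complete within max_steps."""
    remaining = list(iterations_per_lane)
    steps = 0
    while any(r > 0 for r in remaining):
        if steps == max_steps:
            return None  # the group never completed within the budget
        # One instruction issue: only still-active lanes apply its effect,
        # finished lanes are predicated off but stay occupied.
        remaining = [r - 1 if r > 0 else 0 for r in remaining]
        steps += 1
    return steps

# All lanes do the same amount of work.
balanced = run_group_lockstep([3, 3, 3, 3], max_steps=1000)
# One lane does far more work: the whole group waits for it.
skewed = run_group_lockstep([3, 3, 3, 500], max_steps=1000)
# One lane never terminates: the group never completes.
stuck = run_group_lockstep([3, 3, 3, float("inf")], max_steps=1000)
```

The group finishes only when its slowest lane does, so a single runaway lane holds up every other lane in the group.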
  • 19. Putting an accelerator in a safe state. With a CPU thread, you stop the thread by no longer giving it CPU cycles. Stopping a CPU thread is instant; stopping an accelerator thread is not. Stopping one group of accelerator threads doesn’t necessarily stop other groups of threads. You can’t safely free accelerator-accessed memory until all accelerator threads have safely stopped, and you can’t easily predict how long this will take. Simple solution: shut down the whole chip. [Figure: killing accelerator ‘threads’ that share an accelerator memory buffer.]
  • 20. Fault injection. You can only test fault handling properly if you can inject faults into a system during testing. Fault injection must happen asynchronously to be sure of finding bugs. Fault injection needs to work across multiple AI accelerators. Fault injection must be included in continuous test processes. Faults to consider: transient hardware faults; overheating causing throttling; threading errors; algorithms taking an unusually long time to complete due to complex input data. We need fault injection tools for accelerators (e.g. NVIDIA SASSIFI).
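
As a small illustration of injecting a transient hardware fault and checking that it is detected, the sketch below flips one bit in a buffer and verifies it against a CRC-32 taken beforehand. It is a simplified, synchronous example: `inject_bit_flip` and `is_intact` are hypothetical helpers, and it does not cover the asynchronous, multi-accelerator injection the slide calls for.

```python
import zlib

def inject_bit_flip(buffer, byte_index, bit_index):
    """Return a copy of buffer with a single bit inverted."""
    corrupted = bytearray(buffer)
    corrupted[byte_index] ^= 1 << bit_index
    return corrupted

def is_intact(buffer, expected_crc):
    """Detection mechanism under test: compare against a stored CRC-32."""
    return zlib.crc32(bytes(buffer)) == expected_crc

# Stand-in for data held in accelerator memory, e.g. network weights.
weights = bytearray(range(64))
reference_crc = zlib.crc32(bytes(weights))

# Simulate a transient fault in byte 10, bit 3.
corrupted = inject_bit_flip(weights, byte_index=10, bit_index=3)

clean_passes = is_intact(weights, reference_crc)
fault_detected = not is_intact(corrupted, reference_crc)
```

Running such a test in a continuous test process checks the detection path itself, not just the fault-free behaviour.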
  • 21. Coverage checking. A standard ISO 26262 process is to require line coverage and condition coverage for test suites, i.e. to check that each line (or condition combination) is exercised by the test suite. This is commonly supported on CPUs, but what about AI accelerators? The compilers for AI accelerators typically perform transformations, such as the data-parallel vectorization used with GPUs, that significantly change the control flow of the program relative to the source code. How do we define coverage checking for AI accelerators?
  • 22. Coverage checking in a heterogeneous environment. Coverage checking is a way of applying a metric to a test suite: does the test suite test every line in a program? Stricter coverage checking ensures every condition in a conditional is also tested. In a heterogeneous environment, a single source line may be compiled for different accelerator cores, and each accelerator core may execute the source line in a slightly different way. How do we define coverage in an accelerator model? How do we test coverage in an accelerator model? If a SIMT compiler has transformed code, what does coverage mean?
  • 23. Coding guidelines: MISRA C++. Standardized coding guidelines for writing safe software, which can be checked with source code static checker tools. Originated by the automotive industry, for the automotive industry, but applicable to any industry that requires high-integrity software. Originally, MISRA suggested (in its vision) its use in safety-related software; it now suggests its applicability to any application with high integrity or high reliability requirements. The MISRA C++ group is updating the MISRA C++ standard to support accelerator programming, in collaboration with AUTOSAR. It is being written as an update to MISRA C++ 2008. This is where the AI and SYCL accelerator support will go for autonomous driving coding guidelines.
  • 24. Dynamic memory management. Accelerator programming models rely extensively on dynamic memory management. This is a real challenge for AI accelerators: how do we define a standard way of statically allocating memory for AI acceleration? How do we free memory safely in a fault situation? How do we isolate different safety domains in a program, without corruption between memory allocated in different safety domains?
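
To make "statically allocating memory" concrete, the sketch below reserves one buffer at start-up and hands out fixed regions of it, refusing any request beyond the budget. It is an illustration only: `StaticPool` is a hypothetical class, not part of any accelerator API, and Python itself still allocates small objects behind the scenes.

```python
class StaticPool:
    """Hands out fixed regions of one buffer that is reserved at start-up."""

    def __init__(self, total_bytes):
        self._storage = bytearray(total_bytes)
        self._view = memoryview(self._storage)
        self._offset = 0

    def reserve(self, nbytes):
        # Regions are carved out once and never freed or resized.
        if self._offset + nbytes > len(self._storage):
            raise MemoryError("static budget exceeded")
        region = self._view[self._offset:self._offset + nbytes]
        self._offset += nbytes
        return region

    def remaining(self):
        return len(self._storage) - self._offset

pool = StaticPool(1024)
input_buf = pool.reserve(512)    # e.g. one input frame
output_buf = pool.reserve(256)   # e.g. one result buffer
input_buf[0] = 7                 # regions are writable views, not copies

try:
    pool.reserve(512)            # exceeds the remaining 256 bytes
    over_budget_rejected = False
except MemoryError:
    over_budget_rejected = True
```

Because the total budget is fixed, running out of memory becomes an error that shows up at a known reservation point instead of at an arbitrary moment during operation.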
  • 25. Accelerator memory management. Accelerators have a much more direct view of memory than a CPU. The simplest approach is pinned memory at a known physical address. Accelerators have much simpler memory protection than a CPU. [Figure: the CPU reaches physical memory (e.g. DDR) and storage (e.g. hard disk) through the operating system and its virtual memory management system, while the accelerator accesses physical memory more directly. There may be a memory management unit in front of the accelerator, but it is usually much simpler than for a CPU.]
  • 26. Virtualization. Virtualization is well-defined for CPUs and can contribute to safety isolation, but for accelerators virtualization is not clearly defined. You can’t switch instantly between accelerator threads, you can’t shut down an accelerator thread instantly, and memory protection isn’t the same as on a CPU. [Figure: a hypervisor provides virtualization between the CPU and the operating system, virtual memory management system, physical memory (e.g. DDR) and storage (e.g. hard disk); how virtualization fits on the accelerator side is an open question.]
  • 27. Pragmatic solutions. Package non-safety-qualified systems via decomposition: ISO 26262 defines “Quality Managed” (“QM”); these systems can adopt the latest technologies without being developed to full safety standards; we can wrap QM systems inside ASIL systems; we need to monitor the running of the system and be able to shut down a faulty system; this requires the ability to detect failures. [Figure: a QM AI system between input and output, watched by an ASIL B monitoring system.] Build full safety-qualified systems: build from the ground up with a safe RTOS that supports accelerators; safe programming models; safety analysis tools; independent testing and validation. [Figure: safety monitoring for AI and safe heterogeneous programming tools on top of a safe RTOS running on a CPU and an AI accelerator.] Multiple, independent, redundant systems: if we independently develop systems to perform specific tasks, we can achieve fully safe redundancy. [Figure: System #1 from Dev Team #1 and System #2 from Dev Team #2, with results combined and checked.]
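
The decomposition idea, a QM component wrapped by a simpler monitor that can override it, can be sketched as below. Everything here is hypothetical and chosen only for illustration: the `qm_planner` stand-in, the plausibility limits, and the `SAFE_STATE` command are assumptions, not values from the slides or from ISO 26262.

```python
# Hypothetical fallback command used when the QM output is rejected.
SAFE_STATE = {"steering_deg": 0.0, "brake": 1.0}

def qm_planner(obstacle_distance_m):
    # Stand-in for an unqualified ("QM") AI component.
    if obstacle_distance_m < 0:
        # Simulated malfunction on nonsensical input.
        return {"steering_deg": 900.0, "brake": -1.0}
    return {"steering_deg": 2.0, "brake": 0.1}

def monitored(planner, obstacle_distance_m, max_steering_deg=35.0):
    """Simple monitor: accept the planner output only if it is plausible."""
    try:
        command = planner(obstacle_distance_m)
    except Exception:
        return SAFE_STATE, False
    plausible = (
        abs(command["steering_deg"]) <= max_steering_deg
        and 0.0 <= command["brake"] <= 1.0
    )
    if plausible:
        return command, True
    return SAFE_STATE, False

def crashing_planner(obstacle_distance_m):
    raise RuntimeError("simulated crash")

normal_cmd, normal_ok = monitored(qm_planner, 40.0)
bad_cmd, bad_ok = monitored(qm_planner, -1.0)
crash_cmd, crash_ok = monitored(crashing_planner, 40.0)
```

The monitor is deliberately much simpler than the component it wraps, which is what makes it feasible to develop the monitor to a higher integrity level than the AI system itself.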
  • 28. Summary. 1. We need to use a range of AI accelerators to achieve AI in automotive: we can’t just assume CPU safety processes can easily transfer to accelerators, and we need all the tools we have for safety on CPUs brought to accelerators. 2. There are a lot of unexpected challenges. 3. Standards are critical for building out these tools and the ecosystem: there are industry-wide standards being developed, but we need to get more people involved to deliver safe solutions.
  • 29. About Codeplay. Accelerator silicon enablement: OpenCL and Vulkan implementations with the ComputeAorta product for customers’ processors; custom LLVM compiler back-ends and runtime drivers; accelerator processor optimizations. Open accelerator ecosystem: open standards and open-source ecosystem for AI acceleration; SYCL ecosystem, the open alternative ecosystem to CUDA; TensorFlow, Eigen; SYCL-BLAS, SYCL-DNN, SYCL-M; open-source accelerator libraries: clSPV, SPIR-V tools. Automotive AI tools: support for Renesas R-Car and Imagination Technologies PowerVR; optimized SYCL-BLAS and SYCL-DNN libraries for automotive AI processors; profiler to analyse performance; working towards ISO 26262 ASIL B standards-based acceleration. 70+ expert AI and graphics acceleration engineers in Edinburgh, Scotland, UK. Ready to provide all the tech and services to deliver ground-breaking AI technologies.
  • 30. Resources. SYCL standard and ecosystem: http://sycl.tech/. MISRA, the MISRA C and C++ standards body: https://www.misra.org.uk/. Codeplay automotive tools: https://developer.codeplay.com/home/. Codeplay booth: see our tools on Renesas and Imagination Technologies ADAS accelerator processors. Khronos Workshop at EVS: will cover OpenVX, Vulkan, OpenCL, NNEF and SYCL in much more detail, Thursday May 23rd, 9am-5pm, https://www.khronos.org/events/2019-embedded-vision-summit
  • 31. Backup
  • 32. Tesla FSD chip. Area and throughput by block: GPU, 40.9 mm², 600 GOPS; CPU, 22.1 mm², 211 GOPS; NNA, 15.4 mm², 72,000 GOPS; SRAM, 67.6 mm²; Cache, 18.6 mm²; Total, 260 mm², 72,811 GOPS. NNA: fast, low-precision convolutions. SRAM: needed to keep processors supplied with data. CPU: highly general-purpose at lower performance. GPU: most of the programmable performance. Source: https://www.youtube.com/watch?v=Ucp0TTmvqOE
  • 33. From sensing to control. Pipeline stages: camera, frame capture, semantic segmentation, 3D mapping, sensor fusion, trajectory tracking, path planning, car control. These systems typically operate at 15-25 frames per second (depending on maximum speed and safety requirements). Roughly 8 input frames are required to make a processing decision; this includes tracking movement over several frames and pipelining for higher throughput. At 70 mph (112 km/h), braking distance is 75 m and “thinking distance” (for a human) is 21 m, which at that speed corresponds to about 0.7 seconds.
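
The frame-rate figures on slide 33 translate into distance travelled before a decision is available. The sketch below works this out under a deliberately simple assumption, namely that decision latency is just the number of frames needed divided by the frame rate, ignoring any extra processing time; the function names are illustrative.

```python
# Exact conversion: 1 mile = 1609.344 m, 1 hour = 3600 s.
MPH_TO_MPS = 1609.344 / 3600.0

def decision_latency_s(frames_needed, frames_per_second):
    """Time to collect the frames required for one decision."""
    return frames_needed / frames_per_second

def distance_travelled_m(speed_mph, seconds):
    """Distance covered at constant speed during the given time."""
    return speed_mph * MPH_TO_MPS * seconds

speed_mps = 70 * MPH_TO_MPS                      # about 31.3 m/s

latency_at_25fps = decision_latency_s(8, 25)     # 0.32 s
latency_at_15fps = decision_latency_s(8, 15)     # about 0.53 s

distance_at_25fps = distance_travelled_m(70, latency_at_25fps)  # about 10 m
distance_at_15fps = distance_travelled_m(70, latency_at_15fps)  # about 16.7 m
```

Under these assumptions the vehicle covers roughly 10 m to 17 m at 70 mph while the 8 frames are being gathered, which is the same order of magnitude as the 21 m human thinking distance quoted on the slide.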
  • 34. Frame capture. If a camera can capture a complete view of a 2 m pedestrian at 2 m distance, then a pedestrian at 100 m distance will cover no more than 1/50th of the height of the image, or 1/2,500th of the area of the image. If an algorithm can recognize a pedestrian with 100 pixels, the camera must be 25 megapixels to recognize a pedestrian at 100 m, which is required to drive at 70 mph. [Figure: a 2 m pedestrian viewed at 2 m and at 100 m.]
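
The 25 megapixel figure can be reproduced with a few lines of arithmetic. The sketch assumes that "100 pixels" means 100 pixels along each side of the pedestrian's bounding box and that the image is square; both are interpretations made here rather than statements from the slide. The 25 frames per second rate is the upper end of the range on slide 33.

```python
def required_megapixels(target_px_per_side, near_m, far_m):
    """Sensor size needed so that an object filling the frame at near_m
    still spans target_px_per_side pixels at far_m (square image assumed)."""
    linear_scale = far_m / near_m            # 100 m / 2 m = 50
    side_px = target_px_per_side * linear_scale
    return side_px * side_px / 1e6

linear_scale = 100 / 2                       # object is 1/50th of the height
area_fraction = 1 / linear_scale ** 2        # and 1/2,500th of the area

megapixels = required_megapixels(100, near_m=2, far_m=100)
pixels_per_second = megapixels * 1e6 * 25    # at 25 frames per second
```

With these assumptions the result is 25 megapixels, and at 25 frames per second that gives the 625 million pixels per second quoted on slide 9.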
  • 35. Semantic segmentation
      60-300 GFLOPs per frame. At 25 fps this is 1.5 TFLOPS to 7.5 TFLOPS, but inference can often be done in fixed point, in which case the figure is TOPS, not TFLOPS.
      Reference: Recurrent Segmentation for Variable Computational Budgets, L McIntosh, N Maheswaranathan, D Sussillo, J Shlens (Stanford University & Google Brain), arXiv:1711.10151v2 [cs.CV], 15 Mar 2018
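The conversion from per-frame cost to sustained throughput is a single multiplication. A minimal sketch, with a hypothetical helper name, assuming every frame is processed and ignoring any batching or utilisation losses:

```python
def sustained_tera_ops(giga_ops_per_frame, fps):
    """Sustained throughput in tera-operations per second for a given per-frame cost."""
    return giga_ops_per_frame * fps / 1000.0


if __name__ == "__main__":
    for cost in (60, 300):
        print(f"{cost} G-ops per frame at 25 fps = {sustained_tera_ops(cost, 25)} T-ops per second")
```

The same arithmetic applies whether the operations are floating point (TFLOPS) or fixed point (TOPS).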
  • 36. 3D Mapping
      Each sensor (cameras, LIDAR, radar) and each perception algorithm (deep learning, SLAM, point cloud, etc.) needs to generate a 3D map of the environment it detects and a list of objects (pedestrians, cars, etc.) to track.
      A 100 m × 100 m × 10 m occupancy grid of 10 cm × 10 cm × 10 cm cells contains 100,000,000 cells, updated every frame.
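The cell count depends strongly on the cell edge length, so it is worth checking. The sketch below (helper name hypothetical, dimensions in whole centimetres to avoid rounding) shows that a count of 100,000,000 corresponds to a 10 cm cell edge, while 1 m cells would give only 100,000.

```python
def cell_count(extent_m, cell_edge_cm):
    """Number of cubic cells in a grid with the given extent (metres) and cell edge (centimetres)."""
    count = 1
    for side_m in extent_m:
        count *= (side_m * 100) // cell_edge_cm  # cells along this axis
    return count


if __name__ == "__main__":
    grid = (100, 100, 10)  # metres
    for edge_cm in (10, 100):
        n = cell_count(grid, edge_cm)
        print(f"{edge_cm} cm cells: {n:,} cells, {n * 25:,} cell updates per second at 25 fps")
```

At 25 fps the 10 cm grid implies 2.5 billion cell updates per second for a single map, before any fusion across sensors.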
  • 37. Sensor Fusion
      Sensor fusion combines data from all sensors and perception algorithms. It looks for inconsistencies between different sensors in order to detect errors. This is where the redundancy in the sensors is used to achieve safety. But how do you achieve redundancy in the sensor fusion itself?
      It needs to process all data from all perception algorithms combined.
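As one illustration of using sensor redundancy to detect errors, the sketch below compares range estimates for the same object from several sensors and flags any that disagree with the median. This is a simplified stand-in chosen for clarity, not the method described in the talk; the function name, sensor names, values and tolerance are all hypothetical.

```python
from statistics import median


def fuse_range(estimates_m, tolerance_m=1.0):
    """Return (consensus range, sensors that disagree with it).

    estimates_m maps a sensor name to its range estimate in metres.
    The consensus is the median, so a single faulty sensor cannot move it far.
    """
    consensus = median(estimates_m.values())
    suspects = sorted(
        name for name, value in estimates_m.items()
        if abs(value - consensus) > tolerance_m
    )
    return consensus, suspects


if __name__ == "__main__":
    consensus, suspects = fuse_range({"camera": 42.3, "lidar": 42.0, "radar": 55.0})
    print(f"consensus range: {consensus} m, inconsistent sensors: {suspects}")
```

A check like this only covers the sensors. The fusion code itself remains a single point of failure, which is the question the slide raises.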
  • 38. To achieve performance, create a pipeline
      Pipeline stages: Camera -> Frame capture -> Semantic segmentation -> 3D mapping -> Sensor fusion -> Object trajectory tracking/prediction -> Path planning -> Car control
      • To achieve maximum throughput, this will be pipelined
      • It can also take at least 3 frames to track movement
      [Diagram: three staggered copies of the pipeline, one per in-flight frame]
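The effect of pipelining can be sketched with a simple schedule. The model below is an idealisation and an assumption, not something the slides specify: every stage takes exactly one frame period, and a new frame enters on every tick. The stage names are taken from the slide; the function name is hypothetical.

```python
STAGES = [
    "Camera", "Frame capture", "Semantic segmentation", "3D mapping",
    "Sensor fusion", "Object trajectory tracking/prediction",
    "Path planning", "Car control",
]


def schedule(num_frames):
    """For each tick, map every busy stage to the index of the frame it is processing."""
    ticks = num_frames + len(STAGES) - 1
    table = []
    for t in range(ticks):
        # Frame f reaches stage s at tick f + s, so stage s holds frame t - s.
        row = {stage: t - s for s, stage in enumerate(STAGES) if 0 <= t - s < num_frames}
        table.append(row)
    return table


if __name__ == "__main__":
    fps = 25
    for t, row in enumerate(schedule(3)):
        print(t, row)
    print(f"latency: {len(STAGES) / fps:.2f} s, throughput: {fps} frames per second")
```

In this model throughput stays at one frame per period while latency grows with the number of stages: 8 stages at 25 fps give 0.32 s from capture to control.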