The document proposes a methodology for selectively protecting convolutional neural networks (CNNs) deployed on GPUs. The methodology involves three stages: (1) detecting faults at runtime in matrix-matrix multiplication layers, (2) cataloging diagnostic techniques, and (3) selectively applying protections based on diagnostic coverage and performance impact. An evaluation on a Tiny YOLO-v3 object detection network found higher misclassification rates in the initial layers and showed that achieving low-to-high diagnostic coverage ranges costs between 2.61x and 3.8x the network execution time, an impact that drops below 5% when diagnostics are executed once per process safety time. The methodology aims to enable the safe deployment of CNNs in safety-critical applications.
Main Concepts

Artificial Intelligence has made enormous progress, reaching near-human accuracy in several safety-related tasks.

Functional safety standards: IEC 61508, IEC 61513, EN 5012X, and ISO 26262 (an example from the automotive domain).
Main Concepts: Baseline

• Detection of faults at runtime in the matrix-matrix multiplication
• Catalog of diagnostic techniques ("On the Safe Deployment of Matrix Multiplication in Massively Parallel Safety-Related Systems")
• CUTLASS: high-performance matrix-matrix multiplication library
• Object detector application based on CNNs (Tiny YOLO-v3)
Matrix-Matrix Multiplication (MMM)

MMM is the backbone of convolutional neural networks in terms of execution time:
• Sequential implementation: 98.5%
• Vectorized implementation: 87%
• CUDA-based implementation: 67%
$$
\begin{pmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\\ A_{31} & A_{32} \end{pmatrix}
\times
\begin{pmatrix} B_{11} & B_{12} & B_{13}\\ B_{21} & B_{22} & B_{23} \end{pmatrix}
=
\begin{pmatrix} C_{11} & C_{12} & C_{13}\\ C_{21} & C_{22} & C_{23}\\ C_{31} & C_{32} & C_{33} \end{pmatrix}
$$
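As a point of reference for the operation above (here M = 3, K = 2, N = 3), a minimal, unoptimized CUDA kernel for C = A × B could look as follows. This is only an illustrative sketch: the presented methodology builds on the highly optimized CUTLASS library, not on this naive kernel.

```cuda
// Naive matrix-matrix multiplication: C (MxN) = A (MxK) x B (KxN),
// row-major storage, one thread per output element. Illustrative only;
// the methodology instruments CUTLASS, not this kernel.
__global__ void mmm_naive(const float* A, const float* B, float* C,
                          int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k)
            acc += A[row * K + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}
```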
PROPOSED SOLUTION

Stage 1
[Figure: overview of the three-stage methodology; Stage 1 is illustrated on images from the Berkeley DeepDrive dataset (1).]

(1) Berkeley DeepDrive dataset (https://www.bdd100k.com/)
Stage 2
[Figure: Stage 2 of the proposed solution.]
Stage 2: DC computation per fault source

A. Faults injected at the arithmetic level or at the register level:

$$DC = \frac{\sum_{i=1}^{4} \left( N_{blocks\_B_i} \times N_{det\_B_i} \right)}{N_{fi}}$$

B. Faults injected at the global memory level:

$$Det_A = \left( B1_{detA}^{\dagger} + B3_{detA}^{\dagger} \right) \times N\_BRT_1 + B2_{detA}^{*} + B4_{detA}^{*}$$

$$Det_B = \left( B1_{detB}^{\otimes} + B2_{detB}^{\otimes} \right) \times N\_BCT_1 + B3_{detB}^{\triangle} + B4_{detB}^{\triangle}$$

$$DC = \frac{Det_A + Det_B}{(M + N) \times K \times data\_size}$$
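As a minimal sketch of case A, under assumed meanings of the symbols (N_blocks_Bi = number of blocks of type Bi, N_det_Bi = faults detected for that block type, N_fi = total number of injected faults); the names are illustrative, not from the paper's code:

```cuda
// Diagnostic coverage for faults injected at the arithmetic or register
// level (case A). Host-side C++ inside a CUDA project. Symbol meanings
// are assumptions for illustration:
//   n_blocks[i] : number of blocks of type B(i+1)
//   n_det[i]    : detected faults for block type B(i+1)
//   n_fi        : total number of injected faults
double coverage_case_a(const int n_blocks[4], const int n_det[4], int n_fi) {
    long long detected = 0;
    for (int i = 0; i < 4; ++i)
        detected += (long long)n_blocks[i] * n_det[i];
    return (double)detected / n_fi;
}
```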
Stage 3
[Figure: Stage 3 of the proposed solution.]
EVALUATION

Performance impact (without compiler optimization):
• L1 (minimum): from 1.01x to 1.37x
• L3 (maximum): from 1.002x to 1.18x

Performance impact (maximum compiler optimization):
• L1 (minimum): from 1.02x to 82.5x
• L7 (maximum): from 1.04x to 171.5x
DC of each layer of Tiny YOLO-v3
[Figure: diagnostic coverage per layer of Tiny YOLO-v3.]
Stage 3: Selective protection

Remarks: while such performance impact is high, it can be reduced if diagnostics are executed only once periodically. For example, for the highest diagnostic coverage, PI = 3.8x the CNN execution time, and with a process safety time of 100x a single classification task, the resulting PI is lower than 5%.
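Spelled out under the example's assumptions (diagnostic PI of 3.8x one execution, process safety time spanning 100 executions), the effective impact is:

$$PI_{eff} = \frac{3.8}{100} = 3.8\% < 5\%$$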
CONCLUSIONS

We propose a methodology to selectively protect CNNs deployed on GPUs, decomposed into three stages, and demonstrate its applicability on a tiny version of an object detector, Tiny YOLO-v3. Additionally, we remark:
• For this CNN, we observe a higher tendency to misclassify (from 83.4% to 99.6%) in the initial layers (L1-L8). However, the final layers also present lower but still high misclassification rates (from 55.2% to 74.34%).
• For the given example, the lowest performance impact needed to achieve the high, medium, and low DC ranges is 3.8x, 3.33x, and 2.61x, respectively.
THANK YOU

IKERLAN
P.º José María Arizmendiarrieta, 2 - 20500 Arrasate-Mondragón
T. +34 943712400 F. +34 943796944

NAME: JAVIER FERNÁNDEZ MUÑOZ
EMAIL: JAVIER.FERNANDEZ@IKERLAN.ES
Acknowledgements:
• Ikerlan authors have received funding from the Elkartek grant project KK-2021/00123 of the Basque Government.
• BSC authors have been partially supported by the Spanish Ministry of Science and Innovation under grant PID2019-107255GB-C21/AEI/10.13039/501100011033.
Backup notes

Classification is correct if:
1. The central point of the box is less than 50 pixels away.
2. The width and height of the boxes vary by less than 25 pixels.
3. The accuracy differs by less than 15%.
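As a sketch, the three criteria can be checked as below; the struct layout, threshold names, and the reading of "accuracy" as the detection confidence are illustrative assumptions, not the paper's code.

```cuda
// Host-side helper (plain C++ in a CUDA project). A detection matches
// the golden one if all three criteria from the backup note hold.
#include <cmath>

struct Box { float cx, cy, w, h, conf; };  // center, size, confidence

bool classification_correct(const Box& golden, const Box& observed) {
    float dist = std::hypot(observed.cx - golden.cx, observed.cy - golden.cy);
    bool center_ok = dist < 50.0f;                                // criterion 1
    bool size_ok   = std::fabs(observed.w - golden.w) < 25.0f &&
                     std::fabs(observed.h - golden.h) < 25.0f;    // criterion 2
    bool conf_ok   = std::fabs(observed.conf - golden.conf) < 0.15f;  // criterion 3
    return center_ok && size_ok && conf_ok;
}
```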
Checksum algorithms are employed as diagnostic techniques to compute an Execution Signature (ES) over all the values of the input and output matrices.
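As an illustration of the idea (not the catalog's actual implementation), a simple additive checksum over a matrix can be computed with an atomic reduction; the Fletcher and CRC variants in the catalog follow the same pattern with different per-element update rules.

```cuda
// Illustrative additive checksum of a matrix, producing one word of the
// Execution Signature (ES). Each thread folds one element into a global
// accumulator.
__global__ void es_checksum(const float* M, int n_elems, unsigned int* es) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_elems) {
        // Reinterpret the float bit pattern so the checksum reflects the
        // exact stored value, including NaNs and denormals.
        unsigned int bits = __float_as_uint(M[i]);
        atomicAdd(es, bits);  // wrap-around addition acts as the checksum
    }
}
```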
For scale: the network weights expose 9.06e9 injectable bits, a single execution takes 45 ms, and an exhaustive fault injection campaign would take roughly 13,163 years.
This stage evaluates the execution time penalty incurred by each diagnostic technique included in the safe catalog. To this end, we apply the diagnostics to all CNN layers and measure the execution time of each one. This process is repeated for each type of protection technique provided in the diagnostics catalog.
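Measuring the per-layer penalty can be as simple as timing the instrumented MMM with CUDA events; a minimal sketch, where the kernel name and arguments are placeholders:

```cuda
#include <cuda_runtime.h>

// Placeholder for the MMM instrumented with one diagnostic technique
// from the safe catalog.
__global__ void mmm_with_diag(const float*, const float*, float*,
                              int, int, int);

// Time one protected layer with CUDA events; returns milliseconds.
float time_layer_ms(dim3 grid, dim3 block,
                    const float* A, const float* B, float* C,
                    int M, int N, int K) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    mmm_with_diag<<<grid, block>>>(A, B, C, M, N, K);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```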
This stage computes an array of golden ESs by including the safe library of diagnostic techniques in the MMM execution of each layer (without fault injection).
However, an exhaustive fault injection campaign may be unaffordable for large matrices due to the number of iterations required to cover all input combinations.
We denote as B1 those blocks whose dimensions match the size of the blocks launched to the GPU; as B2, those with an equal number of columns but a different number of rows; as B3, those where the number of rows matches but the columns differ; and as B4, those where both the number of rows and the number of columns differ.
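A sketch of that classification, assuming `rows`/`cols` are the dimensions of the block under analysis and `launch_rows`/`launch_cols` those of the blocks launched to the GPU:

```cuda
// Classify a block into B1..B4 by comparing its dimensions with the
// block size launched to the GPU (see the definitions above).
enum BlockType { B1 = 1, B2, B3, B4 };

BlockType classify_block(int rows, int cols, int launch_rows, int launch_cols) {
    bool rows_match = (rows == launch_rows);
    bool cols_match = (cols == launch_cols);
    if (rows_match && cols_match) return B1;  // both dimensions match
    if (cols_match)               return B2;  // same columns, different rows
    if (rows_match)               return B3;  // same rows, different columns
    return B4;                                // both dimensions differ
}
```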
In this case, the errors injected in the input matrices A and B affect several blocks. Therefore, a proper DC computation requires verifying whether previous blocks have already counted the detected errors. To do this, we propose distinguishing between errors detected from the fault injection in the A matrix (DetA) and in the B matrix (DetB).
FP: the average number of new objects that appear (False Positives). FN: undetected objects (False Negatives).
Note that L11 errors do not produce as many FNs and FPs as the rest of the final layers, since the concatenation with L5 and the absence of errors on the other branch (L9 and L10) mitigate their appearance.
The relative performance impact is quite insensitive to layer dimensions.
This increase is associated with the high optimization of the MMM on GPUs. Including new data (the array of ESs) in the computation exacerbates one of the main problems of GPU platforms: the bottleneck created by data access. This bottleneck is the main reason for the high performance impact of the CRC implementation, since this diagnostic is based on memory accesses. The Fletcher diagnostic has a performance similar to CRC; however, a key reason for its timing penalty lies in the use of the modulo operator, which is highly inefficient in GPU implementations.
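To make the modulo issue concrete, here is a generic Fletcher-32-style device function (a sketch, not the catalog's implementation); the `% 65535` reductions in the inner loop are the operations the text identifies as expensive on GPUs.

```cuda
// Fletcher-32-style checksum over 16-bit words of a matrix's bit
// patterns. The two modulo reductions per element are what makes this
// diagnostic slow on GPUs, as discussed above.
__device__ unsigned int fletcher32(const unsigned short* data, int len) {
    unsigned int sum1 = 0, sum2 = 0;
    for (int i = 0; i < len; ++i) {
        sum1 = (sum1 + data[i]) % 65535;  // expensive on GPU
        sum2 = (sum2 + sum1)    % 65535;  // expensive on GPU
    }
    return (sum2 << 16) | sum1;
}
```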
After defining sampling parameters such as the confidence level, the error margin, and the total number of possible errors in the weights, we compute a statistically representative random sample size.
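One standard way to obtain such a sample size (an assumption here; the paper's exact formula is not shown) is the finite-population formula commonly used for statistical fault injection:

```cuda
#include <cmath>

// Statistically representative sample size for fault injection over a
// finite population of N fault sites (finite-population formula; an
// assumption for illustration, not necessarily the paper's expression).
//   N : total number of possible faults (e.g., 9.06e9 bits)
//   t : quantile for the confidence level (1.96 for 95%)
//   e : error margin (e.g., 0.01 for 1%)
//   p : estimated proportion of faults causing failure (0.5 = worst case)
double sample_size(double N, double t, double e, double p) {
    return N / (1.0 + e * e * (N - 1.0) / (t * t * p * (1.0 - p)));
}
// Example: sample_size(9.06e9, 1.96, 0.01, 0.5) ~= 9604 injections,
// versus 9.06e9 for an exhaustive campaign.
```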
Diagnostic Test Interval (DTI): defined at design time, it is the interval between online tests to detect faults in a safety-related system that has a specified diagnostic coverage.