RISC-V in Space


1. RISC-V IN SPACE | 14.12.2022 | pablo.ghiglino@klepsydra.com | www.klepsydra.com | Klepsydra Technologies
2. Part 1: Pre-existing work
3. COMPARE AND SWAP
   • Compare-and-swap (CAS) is an instruction used in multithreading to achieve synchronisation. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation.
   • Compare-and-swap has been an integral part of the IBM 370 architecture since 1970.
   • Maurice Herlihy (1991) proved that CAS can implement more wait-free concurrent algorithms than atomic read, write, and fetch-and-add.
4. TWO MAIN DATA PROCESSING APPROACHES. [diagram: event loop (Producers 1-3 feeding one Consumer) vs. sensor multiplexer (Producer 1 feeding Consumers 1 and 2)]
5. THE PRODUCT. Lightweight, modular and compatible with the most used operating systems. Worldwide application.
   • Klepsydra SDK (Software Development Kit): boosts data processing at the edge for general applications and processor-intensive algorithms.
   • Klepsydra AI (Artificial Intelligence): high-performance deep neural network (DNN) engine to deploy any AI or machine learning module at the edge.
   • Klepsydra ROS2 executor plugin: executor for ROS2 able to process up to 10 times more data with up to 50% reduction in CPU consumption.
   • Klepsydra GPU Streaming (Graphic Processing Unit): high parallelisation of the GPU to increase the processing data rate and GPU utilisation.
6. LOCK-FREE AS ALTERNATIVE TO PARALLELISATION. [diagram: parallelisation vs. pipeline]
7. 2-DIM THREADING MODEL. First dimension: pipelining. [diagram: the deep neural network layer structure is split so that consecutive groups of layers run on separate threads, e.g. Thread 1 on Core 1 and Thread 2 on Core 2, from input data to output data]
8. 2-DIM THREADING MODEL. Second dimension: matrix multiplication parallelisation. [diagram: a single layer's matrix multiplication is split across several threads, e.g. Threads 1-3 on Cores 1-3]
9. 2-DIM THREADING MODEL: threading model configuration. [diagram: three configurations over Cores 1-4]
   • Pipelining-heavy: low CPU, high throughput per CPU, high latency.
   • Mixed: mid CPU, mid throughput per CPU, mid latency.
   • Parallelisation-heavy: high CPU, mid throughput per CPU, low latency.
10. ONNX API

    class KPSR_API OnnxDNNImporter {
    public:
        /**
         * @brief import an onnx file and use a default eventloop factory for all processor cores
         * @param onnxFileName
         * @param testDNN
         * @return a shared pointer to a DeepNeuralNetworkFactory object
         *
         * When log level is debug, dumps the YAML configuration of the default factory.
         * It makes use of all processor cores.
         */
        static std::shared_ptr<kpsr::ai::DeepNeuralNetworkFactory> createDNNFactory(
            const std::string & onnxFileName, bool testDNN = false);

        /**
         * @brief importForTest an onnx file and use a default synchronous factory
         * @param onnxFileName
         * @param envFileName Klepsydra AI configuration environment file.
         * @return a shared pointer to a DeepNeuralNetworkFactory object
         *
         * This method is intended to be used for testing purposes only.
         */
        static std::shared_ptr<kpsr::ai::DeepNeuralNetworkFactory> createDNNFactory(
            const std::string & onnxFileName, const std::string & envFileName);
    };
11. Core API

    class DeepNeuralNetwork {
    public:
        /**
         * @brief setCallback
         * @param callback Callback function for the prediction result.
         */
        virtual void setCallback(
            std::function<void(const unsigned long &, const kpsr::ai::F32AlignedVector &)> callback) = 0;

        /**
         * @brief predict. Load input matrix as input to the network.
         * @param inputVector An F32AlignedVector of floats containing the network input.
         * @return Unique id corresponding to the input vector
         */
        virtual unsigned long predict(const kpsr::ai::F32AlignedVector & inputVector) = 0;

        /**
         * @brief predict. Copy-less version of predict.
         * @param inputVector An F32AlignedVector of floats containing the network input.
         * @return Unique id corresponding to the input vector
         */
        virtual unsigned long predict(const std::shared_ptr<kpsr::ai::F32AlignedVector> & inputVector) = 0;
    };
12. KLEPSYDRA SDO PROCESS. Klepsydra Streaming Distribution Optimiser (SDO):
   • Runs on a separate computer
   • Executes several dry runs on the OBC
   • Collects statistics
   • Runs a genetic algorithm to find the optimal solution for latency, power or throughput
   • The main variable to optimise is the distribution of layers across the two dimensions of the threading model
13. KLEPSYDRA STREAMING DISTRIBUTION OPTIMISER (SDO)
14. Part 2: The KATESU Project
15. QORIQ® LAYERSCAPE LS1046A MULTICORE PROCESSOR. [diagram: Klepsydra AI running in a container on the QorIQ® Layerscape LS1046A]
16. STATUS. Successful installation of the following setup:
   • LS1046 running Yocto Jethro
   • Docker installed on the LS1046
   • Container with: Ubuntu 20.04; Klepsydra AI software fully supported (quantised and non-quantised)
17. XILINX ZEDBOARD. [diagram: Klepsydra AI container running on PetaLinux on the ZedBoard]
18. STATUS. Successful installation of the following setup:
   • ZedBoard running PetaLinux 2019.2
   • Docker installed on the ZedBoard
   • Container with: Ubuntu 20.04; Klepsydra AI software with quantised support only
19. PERFORMANCE RESULTS: CME ON LS1046. [bar charts: CPU/Hz, latency (ms) and throughput (Hz), TFLite + NEON vs. Klepsydra]
20. PERFORMANCE RESULTS: CME-Q ON LS1046. [bar charts: CPU/Hz, latency (ms) and throughput (Hz), TFLite + NEON vs. Klepsydra]
21. PERFORMANCE RESULTS: CME-Q ON ZEDBOARD. [bar charts: CPU/Hz, latency (ms) and throughput (Hz), TFLite + NEON vs. Klepsydra]
22. PERFORMANCE RESULTS: BSC ON LS1046. [bar charts: CPU/Hz, latency (ms) and throughput (Hz), TFLite + NEON vs. Klepsydra]
23. Part 3: The PATTERN Project
24. THE PATTERN PROJECT. PATTERN: Klepsydra AI ported to the GR740 aNd RISC-V
   • Target processors: GR740, GR765 (Leon5 & Noel-V)
   • Target OS: RTEMS5
   • Development on a commercial FPGA board
   • Validation on space-qualified hardware
25. THE MULTI-THREADING API. [diagram: the Klepsydra SDK multi-threading framework, originally built directly on POSIX/PTHREAD, gains a threading abstraction layer so that Klepsydra AI can run on either a POSIX operating system or the RTEMS5 multi-threading framework]
26. THE PARALLELISATION FRAMEWORK. [diagram: the Klepsydra AI back-ends (full back-end: Float32, Int8; quantized back-end: Int8) sit on the parallelisation framework, which through the threading abstraction layer targets both POSIX operating systems (PTHREAD) and the RTEMS5 multi-threading framework]
27. THE MATHEMATICAL BACKEND. [diagram: the Klepsydra AI back-ends (full back-end: Float32, Int8; quantized back-end: Int8) currently target ARM and x86, with RISC-V being added]
   RISC-V extensions:
   • The current version of Klepsydra AI supports RV32GV and RV64GC
   • Preparation for NOEL-V in three modes: 'Vanilla'; P-extension and V-extension; and more…
28. THE PLAN
   • Phase 1: Klepsydra AI for RTEMS5
   • Phase 2: Klepsydra AI for GR765/Leon5
   • Phase 3: Klepsydra AI for GR765/Noel-V
   • Phase 4: Validation of Klepsydra AI on GR740 and GR765
29. THE SCHEDULE. [Gantt chart: work packages WP0-WP4.2 with start month, end month and duration over months 0-11, plus milestones KOM, MTR1, MTR2 and FR]
30. CONCLUSIONS
   • Enables real AI for future missions on the GR765/NOEL-V
   • Very easy to use, via a simple API and a web-based optimisation tool
   • Highly optimised for the GR765/NOEL-V processors
   • Lightweight software (the current version is 4 MB)
   • Deterministic, with full control of the dedicated resources
31. NEXT STEPS
   • In-orbit demonstration:
      • OPS-SAT OBC: using the onboard Altera FPGA and a NOEL-V softcore
      • Other?
   • Health monitoring (core operation failures, etc.)
32. CONTACT INFORMATION
   Dr Pablo Ghiglino
   pablo.ghiglino@klepsydra.com
   +41786931544
   www.klepsydra.com
   linkedin.com/company/klepsydra-technologies
