HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors


Session ID: HKG18-301
Session Name: HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors
Speaker: Glenn Steiner
Track: LITE


★ Session Summary ★
Key Takeaways:

With the drive to increase integration, reduce system costs, accelerate performance, and enhance reliability, software developers are discovering that the processor they would like to target is simply not fast enough. This session will help you, the system architect or software developer, understand how to architect and develop software on an FPGA-integrated processor and accelerate software code via FPGA accelerators.

Abstract:
As a software developer, you may have realized that, in order to meet system-level performance requirements, your next software project will target a processor inside an FPGA. How will this impact your development process, and what benefits might you gain from this tight integration of processor and FPGA? Starting from the basics of what FPGAs are (in terms of software programming), this session will provide an easy-to-understand primer on what modern FPGAs with embedded processors can do. We will wrap up with examples of how high-level synthesis tools can move software to programmable logic hardware, enabling dramatic software acceleration.


---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-301/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-301.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-301.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong

---------------------------------------------------
Keyword: LITE
http://www.linaro.org
http://connect.linaro.org
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961

HKG18-301 - Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors

  1. Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors
     Glenn Steiner, Sr. Manager, Xilinx, Inc.
     February 2018 / March 2018
  2. Abstract
     © Copyright 2018 Xilinx
     16:00 - 16:55, 21 March 2018
     Key Takeaways and Abstract: identical to the Session Summary text above.
  3. Agenda
     - What is a Field Programmable Gate Array (FPGA)?
     - Why Use a Heterogeneous All Programmable MPSoC?
     - How Do You Program Heterogeneous All Programmable Devices?
     - Introducing the Xilinx Ultra96 Development Platform
     - Software Acceleration Design Example: German Traffic Sign Recognition
     - Summary
  4. Complexity Continues to Increase - Designers Need Better Ways to Architect, Design & Implement
     - Packet Processing & Transport: > 400G OTN, Video Over IP, Software Defined Networks, Network Function Virtualization
     - Wireless: LTE Advanced, Cloud-RAN, Early 5G, Heterogeneous Wireless Networks
     - Video and Vision: 8K/4K Resolution, Immersive Display, Augmented Reality, Video Analytics
     - Cloud and Data Center: Acceleration, Big Data, Software Defined Data Center, Public and Private Cloud
     - Industrial IoT: Machine to Machine, Sensory Fusion, Industry 4.0, Cyber-Physical, Embedded Vision
     Resulting needs:
     1. Performance & Power Scalability
     2. System Integration & Intelligence
     3. Security, Safety & Reliability
  5. What is a Field Programmable Gate Array (FPGA)?
  6. What is an FPGA and How Does It Work?
     "A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by the customer or designer after manufacturing, hence 'field-programmable'." (Wikipedia)
     In their simplest form, FPGAs contain:
     - Configurable Logic Blocks: AND, OR, Invert & many other logic functions
     - Configurable interconnect: enabling Logic Blocks to be connected together
     - I/O Interfaces
     With these elements, an arbitrary logic design may be created.
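For readers coming from software, the "configurable logic block plus configurable interconnect" idea on slide 6 can be sketched in a few lines of Python. The slide does not say how logic blocks are built; modeling one as a small lookup table (a stored truth table) is an assumption made for illustration, and the names `make_lut`, `lut_eval`, and `half_adder` are hypothetical.

```python
from itertools import product

def make_lut(func, n_inputs):
    """Build the truth table (the block's 'configuration') for an n-input boolean function."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=n_inputs)]

def lut_eval(table, *bits):
    """Evaluate a configured block: the input bits form an index into the stored table."""
    index = 0
    for b in bits:
        index = (index << 1) | (b & 1)
    return table[index]

# "Configure" two logic blocks with different functions.
and_lut = make_lut(lambda a, b: a and b, 2)
xor_lut = make_lut(lambda a, b: a != b, 2)

def half_adder(a, b):
    """'Interconnect': wire both blocks to the same inputs to form a half adder."""
    return lut_eval(xor_lut, a, b), lut_eval(and_lut, a, b)  # (sum, carry)
```

Changing the function passed to `make_lut` changes what the block computes without changing `lut_eval`, which is roughly the sense in which the same hardware can be reconfigured after manufacturing.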
  7. FPGAs - Circa 1990
     - Glue Logic
     - Simple State Machines
     - Prototyping
     Logic that could be connected quickly - Like LEGO blocks
  8. What Today's FPGAs Can Do
     - Prototyping or Glue Logic
     - System Integration (SoC)
     - Connectivity: Bridging, Switching
     - Digital Signal Processing (DSP)
     - Contain Complete Embedded Processing Systems
     - Accelerate Software
     Chips that can do any Intelligent Function - Like LEGO Mindstorms Blocks
  9. Why Use a Heterogeneous All Programmable MPSoC?
  10. All Programmable Heterogeneous Processing Diversity - The Right Processing Element for the Right Job
      - Application Processors: high performance general computing
      - Graphics Processing Unit: graphical user interface
      - Real-time Processors: low latency deterministic operation; lock step for reliable operation
      - Platform Management Unit Processor: triple redundant for system management & error handling
      - Configuration & Security Unit Processor: triple redundant for reliable configuration & security operation
      - Programmable Logic: software acceleration, additional or custom peripherals
  11. The Heterogeneous All Programmable Advantage
      Highest Level of System Integration and Intelligence; The Right Engines for the Right Task
      - Technical: Performance, Power, Flexibility
      - Business: Time to Market, Lower Cost, Flexibility
  12. Why Heterogeneous Processors?
      [Diagram labels: un-deterministic vs. determinism; General Purpose Processor, Linux, Compute-Intensive Non-Critical Tasks, Network Interface; Real-Time Processing, RTOS, Critical Tasks, Motor Control (Real-Time Response); On-Chip Memory; Tightly-Coupled Memory; Interrupt]
  13. Today's "FPGAs": Heterogeneous All Programmable Devices
      - Application Processors: 64-bit ARMv8, up to 1.5GHz A53
      - Real-Time Processors: 32-bit dual-core, up to 600MHz R5
      - Graphics Processor: ARM Mali-400MP2, 2D/3D visualization
      - Video Codec: 8K4K (15fps), 4K2K (60fps)
      - FPGA Acceleration: 16nm UltraScale+ fabric, customizable engines
      - Memory Subsystem: DDR3/4, LPDDR3/4
      - High Speed Peripherals: PCIe Gen2, USB 3.0, DisplayPort
      - Platform & Power Management: granular power control, functional safety
      - Configuration & Security Unit: anti-tamper & trust, industry standards
      Device families:
      - CG Devices: Dual-Core A53s
      - EG Devices: Quad-Core A53s, GPU
      - EV Devices: Quad-Core A53s, GPU and Video Codec
      [Block diagram labels: Processing System (ARM® Cortex™-A53, Dual-Core ARM Cortex™-R5, ARM Mali™-400MP2, Memory Subsystem, Platform Management Unit, Config and Security, System Functions; DisplayPort, USB 3.0, SATA, PCIe® Gen2, GigE, CAN, SPI, SD/eMMC, NAND) and Programmable Logic (PCIe® Gen4, UltraRAM, 16G & 33G Transceivers, 100G Ethernet*, 150G Interlaken*, Video Codec). * ZU11EG, ZU15EG, ZU19EG only]
  14. Flexibility - All Programmable Devices Accelerate Development & Enable Change Management
      - Rapidly Implement & Test designs
      - Accommodate Late Breaking Design Changes
      - In System Field Upgradability
      - Build Next Generation with New Capabilities: use same Board & Chipset w/ reprogrammed devices
      - Dynamic Partial Reconfiguration: re-program FPGA regions during operation
      - HW-SW Co-debug: implement Logic Analyzers inside FPGAs for better visibility into hardware-software problems
  15. How Do You Program Heterogeneous All Programmable Devices?
  16. Software Development Flow
      - Create Software Platform and Project
      - Download and Test Software Design
      - Debug Software Application
      - Profile Software Application for Performance
      [Flow diagram labels: Create Software Platform; Create Software Projects; Library Generation; Compilation; Debug; Download to FPGA; Embedded Targeted Reference Design]
  17. Software Development Kit
      - Eclipse based IDE
      - Windows and Linux hosted
      - GUI and command line interface
      - Everything you need for:
        - Board bring-up
        - Firmware development
        - Bare-metal application development
        - Linux application development
        - Performance optimization
        - System deployment
  18. Run Time Software
      - Begin Application Development with Validated OSs
      - Determine the Right Processor for your Application
      - Cross-Processor OS Support: APU, RPU, MicroBlaze™ Processor
      - High-Level OSs, Stand-alone Drivers & Libraries, RTOSs, Hypervisors
      - Open Source / No Fee, Commercial, Safety Certifiable
      - OS Ecosystem: Linux: Mentor, Wind River; Xen Hypervisor; Android; Bare metal; eSol eT-kernel; FreeRTOS; GreenHills - INTEGRITY; LynxOS7, LynxSecure; Mentor Hypervisor, Nucleus; Micrium - uC/OS-II & III; QNX; Sciopta; Sysgo - PikeOS; Wind River VxWorks7/Rocket
      [Diagram labels: Hypervisor Support; Apps, Kernel, OS, Hypervisors]
  19. Introducing the Xilinx Ultra96 Development Platform
  20. Ultra96 Features
      - Linaro 96Boards Consumer Edition: 85mm x 54mm form factor
      - Zynq UltraScale+ MPSoC ZU3EG
      - 2GB LPDDR4 Memory
      - MicroSD Socket
      - WiFi
      - Ports:
        - Mini DisplayPort
        - 2x USB 3.0 Type A downstream ports (Host)
        - 1x USB 3.0 Micro-B upstream port (Device)
        - 40-pin Low-speed expansion header
        - 60-pin High speed expansion header
  21. Mezzanine Cards
      - Compatible with 96Boards Mezzanine Boards
      - Grove 96Boards Sensor Mezzanine (board or bundle):
        - 9 Grove connectors for 96Boards IO: 5x GPIO and 4x I2C (mixed 3.3V and 5V; all 5V tolerant)
        - 9 Grove connectors for ATMEGA328 IO: 5x GPIO, 3x ADC, and 1x I2C (all 5V)
        - 2 6-pin SPI headers
        - MicroUSB interface to 96Boards console serial port
        - 1 Power LED & 3 user LEDs
        - Power & Reset buttons
      - Grove Starter Kit (includes Mezzanine Board):
        - LCD, Relay, Buzzer, LEDs, Sound Sensor, Touch Sensor, Light Sensor, …
  22. Ultra96 SD Card Image
      - Linux Kernel v4.9:
        - Supports Ultra96 peripherals
        - MALI 400 GPU support
        - Enlightenment Desktop
      - Easy-to-use web interface:
        - Access via WiFi or monitor, mouse, keyboard
      - Run Example Projects for Grove devices
      - Edit, compile and run demo applications
      - Add custom content
      - Access Tutorials
  23. Accelerating Software via Programmable Logic
  24. Now I Can Turn Software into Hardware
      - Profile Your Code
      - Identify Time Consuming Functions
      - Automatically Create & Attach Hardware Accelerators
      - Model & Optimize for Performance
      - Download & Run
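The first two steps on slide 24, profiling the code and identifying the time-consuming functions, can be illustrated on a host machine. The sketch below uses Python's standard `cProfile` and `pstats` modules as a stand-in; the talk does not show the actual profiling tool or its output, and the workload functions `slow_filter`, `light_setup`, and `application` are hypothetical.

```python
import cProfile
import io
import pstats

def slow_filter(data):
    """Deliberately heavy inner loop: the kind of function one might offload."""
    out = []
    for x in data:
        acc = 0
        for k in range(200):
            acc += x * k
        out.append(acc)
    return out

def light_setup(n):
    """Cheap bookkeeping that is not worth accelerating."""
    return list(range(n))

def application():
    return slow_filter(light_setup(2000))

def hottest_function(entry_point):
    """Profile entry_point and return the name of the function with the largest own time."""
    profiler = cProfile.Profile()
    profiler.enable()
    entry_point()
    profiler.disable()
    stats = pstats.Stats(profiler, stream=io.StringIO())
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    (_, _, name), _ = max(stats.stats.items(), key=lambda item: item[1][2])
    return name
```

Calling `hottest_function(application)` should point at `slow_filter`, since nearly all of the run time is spent in its nested loop. In the flow the slide describes, a function flagged this way would be the candidate handed to the tools that create and attach a hardware accelerator.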
  25. Acceleration Examples
      Application                        Speed-up
      MRI Back Projection Algorithm      8x
      16k FFT                            10x
      Optical Flow                       25x
      Stereo Local Block Matching        25x
      2D Video Optical Filter            30x
      Binary Neural Network              8,600x
  26. Software Acceleration Design Example: German Traffic Sign Recognition
  27. Neural Network Example
      - German Road Sign Database:
        - 50,000+ 32x32 bit images for training
        - 44 classes (43 road signs, 1 background)
        - Training via Amazon Web Services: AWS p2.xlarge Instance, 8 hours, $7.78 (6,5e)
      - Binary Neural Network Characteristics:
        - 6 convolutional layers
        - 2 max pool layers
        - 3 fully connected layers
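Slide 27 names a binary neural network but does not describe its arithmetic. As background only, the sketch below shows the formulation commonly used for such networks, where weights and activations are restricted to +1/-1 so that a dot product reduces to an XNOR followed by a bit count. That this is how the presented design works is an assumption, and `pack_bits`, `binary_dot`, and `reference_dot` are hypothetical names.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int, one bit each (+1 -> 1, -1 -> 0)."""
    word = 0
    for v in values:
        word = (word << 1) | (1 if v > 0 else 0)
    return word

def binary_dot(a_bits, w_bits, n):
    """Dot product of two +/-1 vectors of length n using XNOR and a population count."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask   # bit is 1 where the two signs match
    matches = bin(agree).count("1")     # population count
    return 2 * matches - n              # matches minus mismatches

def reference_dot(a, w):
    """Plain multiply-accumulate version, used only to check the bitwise one."""
    return sum(x * y for x, y in zip(a, w))
```

XNOR and bit counting are simple logic operations, which is one common explanation for why networks of this kind map well onto programmable logic.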
  28. Neural Network Performance Results
      Up to 8,600 times faster when accelerated with programmable logic
      Performance Metric     Software Only              Programmable Logic Accelerated
      Tiles per second       2.2                        19,000
      Scene rate (fps)       0.011 (92 sec per frame)   94
      Overall Acceleration   -                          8,600X
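The 8,600X figure on slide 28 can be cross-checked against the two rows it summarizes. The short calculation below uses only the numbers quoted on the slide; the helper name `speedup` is hypothetical.

```python
def speedup(accelerated, baseline):
    """Ratio of accelerated throughput to software-only throughput."""
    return accelerated / baseline

# Tiles per second: 19,000 accelerated vs. 2.2 in software.
tile_speedup = speedup(19_000, 2.2)

# Scene rate: 94 fps accelerated vs. one frame every 92 seconds in software.
frame_speedup = speedup(94, 1 / 92)

print(f"tiles:  {tile_speedup:,.0f}x")
print(f"frames: {frame_speedup:,.0f}x")
```

Both ratios land slightly above 8,600, so the headline number is consistent with the table once rounded.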
  29. [Slide contains no text]
  30. Summary
  31. Go Faster with Development Platforms
      - Platform layers: Your "Stuff", Reference Designs, OS / Hypervisor, Software Development, Development Kits, Silicon
      - Leverage Proven Content
      - Rapid Prototyping
      - Reduced Time to Market
      - Reduced System Level Costs
  32. The Future is Ultra96 Contest
      - Submit your most creative, most out-of-the-box AI or ML application at the Xilinx or Avnet table during Demo Friday (12:00 - 16:00).
      - The best 25 get a FREE Ultra96 board plus software to help realize their vision.
      - Submit a working design within 60 days and get a T-shirt and SWAG. It's that simple.
      - Winner announced through Xilinx social media channels. If it's you, you're invited to present your design to your peers in industry at Xilinx Developer Forum 2018.
  33. Summary: No Need to Fear Your Processor Being Inside an FPGA
      - Industry Standard Development Environments:
        - Develop, Compile, Link & Download
        - Easy On-Chip Debugging including HW-SW Co-Debug
      - Extensive IP Libraries, Drivers & Popular OSs Available
      - Automated Tools for Creating SW Accelerators in HW
      - Pre-Built Reference Designs Enable Quick-Starts
      [Zynq UltraScale+ MPSoC block diagram, same labels as slide 13]
  34. Questions?
      Dramatically Accelerate 96Board Software via an FPGA with Integrated Processors
      Glenn Steiner, Sr. Manager, Xilinx, Inc., February 2018
  35. Zynq UltraScale+ MPSoC
      [Block diagram labels, grouped by block]
      - Application Processing Unit: ARM® Cortex™-A53 cores (numbered 1-4), each with NEON™, Floating Point Unit, 32 KB I-Cache w/Parity, 32 KB D-Cache w/ECC, Memory Management Unit, Embedded Trace Macrocell
      - Real-Time Processing Unit: ARM Cortex™-R5 cores (numbered 1-2), each with Vector Floating Point Unit, 128 KB TCM w/ECC, 32 KB I-Cache w/ECC, 32 KB D-Cache w/ECC, Memory Protection Unit
      - Graphics Processing Unit: ARM Mali™-400 MP2 with Geometry Processor, Pixel Processors (numbered 1-2), Memory Management Unit, 64 KB L2 Cache
      - External Memory Controller: DDRC; DDR4/3/3L, LPDDR3/4; ECC Support
      - High-Speed Connectivity: DisplayPort, USB 3.0, SATA 3.1, PCIe 1.0 / 2.0
      - LP Connectivity: 4 x GEM, 2 x CAN 2.0, 2 x UART, 2 x SPI, 2 x I2C, 2 x Quad SPI, NAND, 2 x SD/eMMC, 2 x USB 2.0
      - Full Power Domain System Functions: FPD DMA; PLLs (DPLL, VPLL, APLL); Debug; Timers (SWDT)
      - Low Power Domain System Functions: LPD DMA; Global Registers; PLLs (RPLL, IOPLL); 256 KB OCM with ECC; Timers (2xWDT, 4xTTC); System Monitor
      - PMU and CSU: Processor System, ROM, 128 KB RAM, SHA3, AES-GCM, RSA
      - Programmable Logic Power Domain: Storage & Signal Processing, GP I/O, High-Speed Connectivity, Video Codec (H.265, H.264)
      - Battery Domain: BBRAM, RTC, OSC; VCC_PSAUX, VCC_PSBATT
  36. 96Boards Sensors Kit
      - 96Boards Sensors Mezzanine Card:
        - Integrated ATMEGA328p microcontroller (Arduino compatible)
        - 18 Grove connectors
        - Arduino UNO compatible shield connectors
        - UART interface between baseboard and microcontroller
        - USB-UART connector for baseboard console and control
      - Grove Sensors:
        - LCD RGB Backlight
        - Smart Relay
        - Buzzer
        - Sound Sensor
        - Touch Sensor
        - Rotary Angle Sensor
        - Temperature & Humidity Sensor
        - LED
        - Light Sensor
        - Button
        - LEDs Blue, Green, Red
        - Mini Servo
        - Grove Cables
