Implementing Concurrency Abstractions for Programming
Multi-Core Embedded Systems in Scheme

Ruben Vandamme

Promotor: Prof. Dr. Wolfgang De Meuter
Advisors: Dr. Coen De Roover, Christophe Scholliers
2
Overview

    • Embedded systems
    • Event-driven XMOS chip
    • Interpreter requirements
    • Bit Scheme
    • Modifying Bit Scheme to support XMOS
    • Demonstration
    • Contributions & Conclusion
3
Embedded software

    • Increasingly important
        ● Digital watches, microwaves, cars, etc.
        ● 98% of processors in use are embedded
    • Different from PC and server software
        ● Interacts with the outside world
        ● Reading sensors, buttons, communicating, etc.
    • Polling: frequently check a condition
    • Interrupts: asynchronous signals
        ● Less overhead and lower power use than polling
4
Interrupts

    • Dedicated hardware for frequent tasks
        ● PWM, UART, I²C, …
        ● Timing-sensitive tasks
        ● Interrupts are often used as the interface to this hardware
    • Source of various bugs
        ● Stack overflow, interrupt overload, …
        ● John Regehr (2005), "Safe and structured use of interrupts in real-time and embedded software"
5
Event-driven chip

    • XMOS XS1-G4
        ● No interrupts or polling
    • Multi-core, multi-threaded
        ● Threads supported in hardware
        ● Guaranteed execution time
        ● Message passing
    • Transputer
    • Programmed in XC
        ● Based on CSP (Communicating Sequential Processes)
6
Interpreter requirements

    • Use less than 64 KB of memory (per core)
        ● Most interpreters need an order of magnitude more memory
        ● MiniScheme, Pico, etc.
    • Contain a real-time garbage collector
    • Must be extended to
        ● exploit the concurrency provided by the hardware
        ● export hardware functionality: input and output, timing, ...
7
Bit Scheme

    • Fits in 64 KB memory
        Byte code + interpreter + runtime memory
    • Byte code based
        Compiler can remove unneeded functions
    • No runtime error handling
        Keeps interpreter small
    • Real-time garbage collector
    • Served as a basis for our XMOS Scheme
8
Exploiting XMOS concurrency

    • We run four interpreters in parallel
        ● One on each core
        ● Compiler and interpreter modified accordingly
    • Exploit all memory and IO possibilities

    (par
      (core CORE_0
        ...)
      (core CORE_1
        ...)
      (core CORE_2
        ...)
      (core CORE_3
        ...))
9
Communication primitives added

    • Use message passing over channels
        ● cout, cin
        ● Primitives use hardware
    • Channels do not support composite types
        ● Serialization needed

    (par
      (core CORE_0
        (cout CORE_1 99))
      (core CORE_1
        (display (cin CORE_0))))

    [Figure: the four cores, labeled Core 0 to Core 3]
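The slide notes that composite values need serialization before they can cross a channel. A minimal host-side sketch in Python of one way to do that, assuming the channel carries only integers: a nested list is flattened into a sequence of integer words and rebuilt on the receiving side. The tag scheme (0 for an integer, 1 for a list) and the function names are assumptions made for illustration, not the encoding used in the thesis.

```python
def serialize(value):
    """Flatten an int or a nested list of ints into a flat list of ints.

    Assumed encoding for this sketch:
      int n  -> [0, n]
      list   -> [1, length, <encoded elements...>]
    """
    if isinstance(value, int):
        return [0, value]
    words = [1, len(value)]
    for item in value:
        words.extend(serialize(item))
    return words


def deserialize(words, pos=0):
    """Decode one value starting at words[pos]; return (value, next_pos)."""
    tag = words[pos]
    if tag == 0:
        return words[pos + 1], pos + 2
    length = words[pos + 1]
    pos += 2
    items = []
    for _ in range(length):
        item, pos = deserialize(words, pos)
        items.append(item)
    return items, pos


if __name__ == "__main__":
    words = serialize([1, [2, 3]])
    print(words)                  # each word would be sent with one cout
    print(deserialize(words)[0])  # the receiver rebuilds the structure
```

On the chip, the sender would emit each word with a separate `cout` and the receiver would collect them with `cin`; the length prefix lets the receiver know how many words to expect.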
10
IO primitives added

    • Initialize and configure IO
        pon, poff, pconf_in, pconf_out
    • Perform IO
        pout, pin
    • Wait for an event on IO pins
        peq, pne
    • Use hardware functionality

    (define PORT_CLOCKLED 525056)
    (pon PORT_CLOCKLED)
    (pconf_out PORT_CLOCKLED 0)
    (pout PORT_CLOCKLED 15)
11
Time primitives added

    • We added a notion of time
        ● Execute actions at a certain point in time
        ● (timer)
            Returns the current time in clock ticks
        ● (after time)
            Blocks the thread until the current time is after time
        ● Both primitives call hardware functionality

    (define now (timer))
    (define clock 100000000)
    (after (+ now (* 5 clock)))
    (display "5 seconds later")
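The deadline arithmetic in the example can be checked on a host machine. The Python sketch below uses the slide's value of 100000000 ticks per second; the 32-bit counter width and the wrap-around comparison are assumptions of this sketch, as are the function names, and the slides do not say how `after` handles a wrapping counter.

```python
TICKS_PER_SECOND = 100_000_000  # value of `clock` on the slide
TIMER_BITS = 32                 # assumption: the tick counter wraps at 2**32
MODULUS = 1 << TIMER_BITS


def deadline(now, seconds):
    """Tick value reached `seconds` (a whole number) after `now`, modulo the counter width."""
    return (now + seconds * TICKS_PER_SECOND) % MODULUS


def is_after(current, target):
    """True when `current` is strictly past `target`, tolerating wrap-around."""
    diff = (current - target) % MODULUS
    return 0 < diff < (MODULUS // 2)


if __name__ == "__main__":
    print(deadline(0, 5))              # ticks to wait for five seconds
    print(deadline(4_000_000_000, 5))  # deadline that wraps past 2**32
```

Five seconds is 500,000,000 ticks, which fits well inside half of a 32-bit range, so the signed-difference style comparison stays unambiguous for delays of this size.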
12
Handling multiple events at once

    • Certain primitives are blocking
        pne, peq, cin
    • Threads need to be able to handle more than one event at a time

    (select
      ((select_pne buttons 15)
        (lambda (buttonsvalue)
          (display buttonsvalue)))
      ((select_cin CORE_1)
        display)
      (else (display "default")))
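The dispatch behaviour of the `select` form can be modelled on a host machine. This Python sketch is a simplified, single-pass model: each guard is a hypothetical `poll` function returning `(ready, value)`, the first ready guard has its handler applied to the value, and the default runs when nothing is ready. Checking guards in textual order is an assumption of the sketch; the slides do not specify how simultaneous events are prioritised.

```python
def select(guards, default=None):
    """Run the handler of the first ready guard, else the default.

    guards  -- list of (poll, handler) pairs; poll() returns (ready, value)
    default -- zero-argument callable used when no guard is ready
    """
    for poll, handler in guards:
        ready, value = poll()
        if ready:
            return handler(value)
    if default is not None:
        return default()
    return None


if __name__ == "__main__":
    # Hypothetical stand-ins for the button port and the channel from CORE_1.
    buttons_unchanged = lambda: (False, None)
    message_waiting = lambda: (True, 99)

    result = select(
        [(buttons_unchanged, lambda v: ("buttons", v)),
         (message_waiting, lambda v: ("channel", v))],
        default=lambda: "default",
    )
    print(result)
```

Without an `else` clause, the form on the slide would presumably block until one of the events occurs; the model above returns `None` instead, which keeps it runnable without hardware.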
13
Compilation

    Step 1: the Scheme compiler compiles Scheme → bytecode

    (par
      (core CORE_0 ...)    → BC0
      (core CORE_1 ...)    → BC1
      (core CORE_2 ...)    → BC2
      (core CORE_3 ...)    → BC3
    )
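Step 1 produces one bytecode image per core from a single `par` form. As a rough illustration of the splitting part only, the Python sketch below takes a program represented as nested lists and returns the body of each `core` clause keyed by core name. The list representation and the `split_par` name are assumptions of this sketch; it does not generate bytecode and is not the thesis compiler.

```python
def split_par(program):
    """Map each core name to the list of expressions in its (core ...) clause.

    program -- nested lists mirroring the S-expression, e.g.
               ["par", ["core", "CORE_0", expr, ...], ...]
    """
    if not program or program[0] != "par":
        raise ValueError("expected a (par ...) form")
    per_core = {}
    for form in program[1:]:
        if len(form) < 2 or form[0] != "core":
            raise ValueError("expected a (core NAME body...) form")
        name = form[1]
        if name in per_core:
            raise ValueError(f"core {name} listed twice")
        per_core[name] = form[2:]
    return per_core


if __name__ == "__main__":
    program = ["par",
               ["core", "CORE_0", ["cout", "CORE_1", 99]],
               ["core", "CORE_1", ["display", ["cin", "CORE_0"]]]]
    for core, body in split_par(program).items():
        print(core, body)  # each body would be compiled to its own bytecode
```

Each per-core body can then be compiled independently, which matches the one-interpreter-per-core layout described in the slides.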
14
Compilation

    Step 1: the Scheme compiler compiles Scheme → bytecode

    (par
      (core CORE_0 ...)    → BC0
      (core CORE_1 ...)    → BC1
      (core CORE_2 ...)    → BC2
      (core CORE_3 ...)    → BC3
    )

    Step 2: the XMOS toolchain compiles bytecode + interpreter → XMOS executable

    [Figure: four cores, each running an interpreter with its own bytecode, BC0 to BC3]
15
Demonstration

    • Case study
    • LED Pulse Width Modulation in Scheme
    • Communication over XBee
        ● Via UART implemented in Scheme

    [Figure: four blocks labeled App/Buttons, UART RX, UART TX, PWM]
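The timing behind software pulse width modulation comes down to splitting each period into an on phase and an off phase. The Python sketch below computes both in clock ticks. The 100 MHz tick rate is taken from the `clock` value used in the time-primitives example; the 1 ms period and the `pwm_ticks` name are assumptions for illustration and do not come from the demonstration itself.

```python
TICKS_PER_SECOND = 100_000_000  # same tick rate as `clock` in the timer example


def pwm_ticks(period_ticks, duty_percent):
    """Return (on_ticks, off_ticks) for one PWM period."""
    if not 0 <= duty_percent <= 100:
        raise ValueError("duty cycle must be between 0 and 100 percent")
    on_ticks = period_ticks * duty_percent // 100
    return on_ticks, period_ticks - on_ticks


if __name__ == "__main__":
    period = TICKS_PER_SECOND // 1000  # assumed 1 ms period
    print(pwm_ticks(period, 25))       # LED on for a quarter of each period
```

In the deck's Scheme dialect, a loop would presumably drive the LED port high, wait for the on phase with `after`, drive it low, and wait for the off phase.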
16
Contributions & Conclusion

    • Ported a Scheme interpreter to the new XMOS chip
    • Exploited the concurrency of the XMOS chip
    • Added new primitives
        ● IO, message passing, time, …
    • Made it possible to program the hardware from Scheme
    • Modified the compiler accordingly
17




     Questions
