Field-Programmable Gate Arrays
       as tracking devices

          Roberto Rodríguez Osorio
            Javier Díaz Bruguera

        Group of Computer Architecture
   Dept. of Electronics and Computer Science
     University of Santiago de Compostela
Outline

Application-specific computing machines
ASIC vs FPGA
FPGA technology basics
Hard cores in FPGAs
Performance
Design effort
Choices
Applications




Application-specific computing machines

[Figure: block diagrams of a microprocessor (code memory, data memory, PC, IR, register file, functional units, control logic; control section and datapath section) and of an application-specific integrated circuit (control logic and MAC unit; control section and datapath section)]

                    Microprocessor       ASIC
Performance:        10 cycles @ 3 GHz    1 cycle @ 1 GHz
Dissipated power:   ~35 W                ~mW

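
The performance figures on this slide can be turned into throughput numbers. The sketch below is only a rough illustration: it assumes the cycle counts refer to one operation of the same kind on both machines and that operations do not overlap. The function names are hypothetical. The ASIC power is given only as "~mW" on the slide, so no energy figure is computed for it.

```python
def ops_per_second(clock_hz: float, cycles_per_op: float) -> float:
    """Throughput when each operation takes a fixed number of clock cycles."""
    return clock_hz / cycles_per_op


def energy_per_op_joules(power_w: float, ops_per_s: float) -> float:
    """Average energy spent on one operation at a given power draw."""
    return power_w / ops_per_s


# Figures taken from the slide.
cpu_throughput = ops_per_second(3e9, 10)   # 10 cycles @ 3 GHz
asic_throughput = ops_per_second(1e9, 1)   # 1 cycle @ 1 GHz
speedup = asic_throughput / cpu_throughput

# Microprocessor only: about 117 nJ per operation at ~35 W.
cpu_energy = energy_per_op_joules(35.0, cpu_throughput)

print(f"CPU:  {cpu_throughput:.2e} ops/s, {cpu_energy * 1e9:.0f} nJ/op")
print(f"ASIC: {asic_throughput:.2e} ops/s, speedup {speedup:.2f}x")
```

Under these assumptions the ASIC completes roughly 3.3 times as many operations per second despite its lower clock frequency.
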
ASIC vs FPGA

[Figure: ASIC non-recurring engineering (NRE) cost, on a scale from $1M to $4M, versus process technology from 0.35 to 0.05 micrometers]

ASIC vs FPGA

[Figure: computational efficiency (Mops/W, logarithmic scale from 10^0 to 10^6) versus process technology (2 to 0.07 µm, 1986 to 2006). Curves: maximum efficiency (ASIC), FPGA, ASSP, MPPA, GPGPU, VLIW, ASIP, ManyCore, ...]
Source: Theo A.C.M. Claasen, ISSCC 99

FPGA technology basics – Computing

[Figure: full adder (FA) block with inputs a, b, c_in and outputs s, c_out, and its gate-level implementation]

Full adder truth table:

 carry input   a   b   s   carry output
      0        0   0   0        0
      0        0   1   1        0
      0        1   0   1        0
      0        1   1   0        1
      1        0   0   1        0
      1        0   1   0        1
      1        1   0   0        1
      1        1   1   1        1

FPGA technology basics – Do not compute

[Figure: logic blocks. The inputs a, b and cin address one 8x1-bit SRAM memory that outputs s and a second 8x1-bit SRAM memory that outputs cout]

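
The "do not compute" idea can be sketched in software: each output of the full adder is read from an 8x1-bit memory addressed by (cin, a, b) instead of being evaluated by gates. This is a minimal illustration, not a model of any particular device. The helper names `build_lut` and `read_lut` and the address bit order are assumptions made for this example.

```python
from itertools import product


def build_lut(func):
    """Fill an 8x1-bit memory with one stored bit per (cin, a, b) combination."""
    return [func(cin, a, b) for cin, a, b in product((0, 1), repeat=3)]


def read_lut(lut, cin, a, b):
    """Look the result up; the three inputs form the memory address."""
    address = (cin << 2) | (a << 1) | b
    return lut[address]


# Memory contents are computed once, at "configuration" time.
SUM_LUT = build_lut(lambda cin, a, b: a ^ b ^ cin)
CARRY_LUT = build_lut(lambda cin, a, b: (a & b) | (a & cin) | (b & cin))

# At run time nothing is computed: both outputs are plain memory reads.
for cin, a, b in product((0, 1), repeat=3):
    s = read_lut(SUM_LUT, cin, a, b)
    cout = read_lut(CARRY_LUT, cin, a, b)
    print(cin, a, b, s, cout)
```

The two lists hold the s and carry output columns of the full adder truth table. Loading different contents into the same memories would implement any other function of three 1-bit inputs.
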
FPGA technology basics – Interconnect

[Figure, shown over three slides: two-dimensional array of logic blocks and the interconnect between them]

FPGA technology basics – Interconnect + memory

FPGA fabric consists of a huge number of simple memory
elements connected by means of a reconfigurable network
Design software must break every computing task into
1-bit operations with no more than 4, 5 or 6 input variables
Operations are spatially distributed according to proximity
criteria
Routing may be troublesome
   Long paths are slow
   Routing through logic blocks increases area
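
As an illustration of breaking a task into small 1-bit operations, the sketch below decomposes a multi-bit addition into a chain of full adders, each a function of only three 1-bit inputs and therefore small enough for one lookup table per output. It is a simplified model written for this example (the function names are hypothetical), and real design tools choose the decomposition themselves.

```python
def full_adder(a, b, cin):
    """Two 1-bit functions of three inputs: sum and carry out."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_add(x, y, width):
    """Add two unsigned integers bit by bit, passing the carry along."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry  # sum modulo 2**width, and the final carry out


print(ripple_add(5, 3, 4))   # 5 + 3 fits in 4 bits
print(ripple_add(15, 1, 4))  # 15 + 1 overflows 4 bits
```

The carry passes through every stage in turn, which is one example of a long path that limits speed.
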




Hard cores in FPGAs

Memory blocks
Multipliers
DSP blocks
Microprocessors
Floating point units?




Memory blocks

Hundreds or thousands of small memory blocks
     Dual-port blocks
     18 Kbit each for Xilinx
     Flexible configurations
       Many short words or a few long words
Independent access
     Huge aggregate bandwidth
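
The trade-off between word length and word count, and the resulting aggregate bandwidth, can be estimated with simple arithmetic. The sketch below is idealized: it divides the 18 Kbit of one block evenly by the word width. The block count, port count, word width and clock frequency in the example are hypothetical values chosen for illustration, not figures from a datasheet.

```python
BLOCK_BITS = 18 * 1024  # 18 Kbit per block, as stated on the slide


def depth_for_width(width_bits: int, block_bits: int = BLOCK_BITS) -> int:
    """Number of words one block can hold for a given word width (idealized)."""
    return block_bits // width_bits


def aggregate_bandwidth_bits_per_s(blocks: int, ports: int,
                                   width_bits: int, clock_hz: int) -> int:
    """Peak bandwidth if every port of every block transfers one word per cycle."""
    return blocks * ports * width_bits * clock_hz


# Same block, different shapes: many short words or a few long ones.
for width in (9, 18, 36):
    print(f"{width:2d}-bit words: {depth_for_width(width)} words per block")

# Hypothetical device: 100 dual-port blocks, 36-bit words, 200 MHz clock.
peak = aggregate_bandwidth_bits_per_s(blocks=100, ports=2,
                                      width_bits=36, clock_hz=200_000_000)
print(f"Peak aggregate bandwidth: {peak / 1e12:.2f} Tbit/s")
```

Because every block is accessed independently, the peak figure scales linearly with the number of blocks.
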




Multipliers and DSP blocks

As FPGAs grew larger, designers started to
  implement DSP algorithms on them
     However: multipliers built from logic blocks take too much area
     Therefore: hardwired multipliers were introduced
DSP algorithms are often based on
     multiply & add
     multiply & accumulate
DSP blocks in modern FPGAs implement hardwired:
     multiply, multiply & add, multiply & accumulate
     optional addition before multiplying
     three-input add
     1 large, 2 medium or 4 small operations on the same hardware
     shifting, comparisons, bit-wise operations, …
Up to 2000 DSP blocks in current FPGAs for massive
  parallelism
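
The operations listed above can be sketched in software. The example below shows multiply & accumulate used for one output sample of a FIR filter, and the optional addition before multiplying used to halve the number of multiplications when the coefficients are symmetric. It is only an illustration under these assumptions: the function names are hypothetical and the code models the arithmetic, not the timing or word lengths of a real DSP block.

```python
def mac(acc, a, b):
    """Multiply & accumulate: the basic step of most DSP kernels."""
    return acc + a * b


def pre_add_multiply(a, d, b):
    """Optional addition before multiplying: (a + d) * b."""
    return (a + d) * b


def fir_output(coeffs, samples):
    """One FIR output sample as a chain of multiply & accumulate steps."""
    acc = 0
    for c, x in zip(coeffs, samples):
        acc = mac(acc, c, x)
    return acc


def symmetric_fir_output(half_coeffs, samples):
    """Same result for symmetric coefficients, using half the multiplications."""
    n = len(samples)
    acc = 0
    for i, c in enumerate(half_coeffs):
        acc += pre_add_multiply(samples[i], samples[n - 1 - i], c)
    return acc


samples = [4, 5, 6, 7]
print(fir_output([1, 2, 2, 1], samples))       # four multiplications
print(symmetric_fir_output([1, 2], samples))   # two multiplications
```

In hardware, each loop iteration could be mapped to its own DSP block so that all products are computed in parallel.
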
Microprocessors

Xilinx:
   IBM PowerPC processors
    Virtex-II Pro
    Virtex-4 FX
    Virtex-5 FX
   MicroBlaze soft processors

Altera:
   ARM RISC processors
   Nios soft processor




Floating point units

Not implemented so far
• Suggested as a way to accelerate scientific computing
• For engineering, fixed-point arithmetic is usually enough


Will it happen?
☺ It happened with multipliers, transceivers, DSP blocks, …
  GPUs already have a strong position in this field




Performance

Compared to an ASIC
    10 times slower, larger and more power-hungry


Compared to a microprocessor
    Fast, depending on:
     Potential parallelism
     Required bandwidth
    Small and simple, can even work standalone
    Reduced power consumption (< 1 W); may run on batteries




Design effort

Several scenarios:

Pure VHDL or Verilog coding
     Higher flexibility, efficiency and performance
     Long design time
     Costly debugging
Macros combined with VHDL or Verilog
     Libraries of IP blocks ease the design process
     The required functionality is not guaranteed to be available
High-level languages (DSP logic (Matlab), Impulse-C,
  Handel-C, …)
     Efficient and simple implementation for simple algorithms
     Lack of expressiveness for complex algorithms



Choices

Xilinx
         Virtex
         Spartan
Altera
         Stratix
         Cyclone
Others
         Actel
         Lattice Semiconductor
         …




Choices - Xilinx

                           Spartan 3            Spartan 6        Virtex 6

Logic cells                1728 - 74880         3840 - 147443    74496 - 566784
Block RAM (Kbits)          12 - 1872            216 - 4824       5616 - 32832
Multipliers / DSP blocks   4 - 104 / 84 - 126   8 - 180          288 - 2016
Evaluation board cost      < $200               $300 - $1000     $2000 - $2500




In the context of these applications

Device choice
• Logic bound
    •   Standard logic
    •   Multipliers
• I/O bound
Parallel acquisition
• Switching memory blocks between acquisition and computation
High computing speed
• Via pipelining
Results storage
• Internal or external memory
Power consumption
Configuration
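
Switching memory blocks between acquisition and computation is often called double (ping-pong) buffering. The sketch below models it in software under stated assumptions: the class and function names are hypothetical, every frame has exactly `block_size` samples, and `compute` stands for whatever processing is applied to a completed block. In hardware the two steps inside the loop would run at the same time on separate memory blocks; here they run one after the other.

```python
class PingPongBuffers:
    """Two memory blocks whose roles (acquire / compute) are swapped each frame."""

    def __init__(self, block_size):
        self.blocks = [[0] * block_size, [0] * block_size]
        self.acquire_index = 0

    def acquisition_block(self):
        return self.blocks[self.acquire_index]

    def computation_block(self):
        return self.blocks[1 - self.acquire_index]

    def swap(self):
        self.acquire_index = 1 - self.acquire_index


def run(frames, block_size, compute):
    """Process frames so that computing on one block overlaps filling the other."""
    buffers = PingPongBuffers(block_size)
    results = []
    have_data = False
    for frame in frames:
        if have_data:
            # Compute on the block that was filled during the previous frame.
            results.append(compute(buffers.computation_block()))
        # Acquire the new frame into the other block.
        buffers.acquisition_block()[:] = frame
        buffers.swap()
        have_data = True
    if have_data:
        # Drain the last acquired block.
        results.append(compute(buffers.computation_block()))
    return results


print(run([[1, 2], [3, 4], [5, 6]], block_size=2, compute=sum))
```

Each result becomes available one frame after its data was acquired, which is the latency paid for never stalling the acquisition.
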
