11. PowerPlay Power Analysis Tools. [Figure: estimation accuracy runs from lower to higher as the PowerPlay analysis inputs progress from design concept to design implementation. Inputs: user input, Quartus II design profile, place & route results, simulation results. Tools: Early Power Estimator spreadsheets, Quartus II Power Analyzer.]
16. Core Dynamic Power Breakdown. Average power dissipation in various FPGA architecture elements: Routing 38%, ALM Combinational 19%, ALM Registers 18%, RAM Blocks 14%, Clock Networks 9%, DSP Blocks 2%*. *DSP block power: 5% of dynamic power for designs that use DSP blocks.
21. Impact On Memory Blocks (Cont). 4K x 4 memory. Default implementation (normal compilation setting): four 4K x 1 M4K RAMs, each driven by Addr[0:11] and supplying one bit of Data[0:3]. Power-efficient implementation (extra effort setting): four 1K x 4 M4K RAMs, each addressed by Addr[0:9], with an address decoder on Addr[10:11] selecting the RAM that drives Data[0:3].
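The two mappings on this slide can be sketched in Python to show where the saving comes from. This is a minimal model, assuming that Addr[10:11] are the two upper address bits, that the decoder enables exactly one 1K x 4 block per access, and that a block that is not enabled is not clocked. The function names are illustrative and are not Quartus II APIs.

```python
# Illustrative model of two ways to build a 4K x 4 memory from M4K blocks.
# Function names and the "active blocks" metric are assumptions of this sketch.

ADDR_BITS = 12   # 4K words -> Addr[0:11]
BLOCKS = 4


def default_active_blocks(addr: int) -> list:
    """Default mapping: four 4K x 1 blocks, one data bit each.
    Every block sees the full address, so all four are accessed."""
    assert 0 <= addr < (1 << ADDR_BITS)
    return list(range(BLOCKS))


def power_efficient_active_blocks(addr: int) -> list:
    """Extra-effort mapping: four 1K x 4 blocks.
    The upper two address bits are decoded to enable exactly one block."""
    assert 0 <= addr < (1 << ADDR_BITS)
    block_select = addr >> 10      # Addr[10:11], value 0..3
    return [block_select]


def word_offset(addr: int) -> int:
    """Address inside the selected 1K x 4 block (Addr[0:9])."""
    return addr & 0x3FF


for addr in (0, 1023, 1024, 4095):
    print(addr,
          len(default_active_blocks(addr)), "blocks active (default),",
          len(power_efficient_active_blocks(addr)), "block active (extra effort)")
```

In this model every access touches four RAM blocks in the default mapping and one in the extra-effort mapping, at the cost of the decoder logic.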
33. Technology for Low Power Cyclone III LS / Cyclone IV Stratix IV / Stratix V Hardcopy™
35. Key Technologies to Reduce Power. FPGA power reduction (yellow highlight: 28nm techniques), grouped on the slide under Lower Static Power and Lower Dynamic Power: process innovations (65nm -> 40nm -> 28nm…); Programmable Power Technology; lower core voltage (1.1V -> 1.0V -> 0.85V); extensive hardening of IP, Embedded HardCopy Blocks; hard power-down of more functional blocks; more granular clock gating; selective use of high-speed transistors; partial reconfiguration; dynamic on-chip termination; Quartus II software PowerPlay power optimization.
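The effect of the core-voltage steps listed on this slide can be estimated with the standard first-order CMOS relation that dynamic power scales with the square of the supply voltage. The sketch below assumes switched capacitance and toggle rate stay constant across the steps, which real process nodes do not guarantee, so the numbers indicate the voltage contribution only.

```python
def dynamic_power_ratio(v_new: float, v_old: float) -> float:
    """Dynamic power after a supply change, relative to before.
    Assumes P_dyn is proportional to C * V^2 * f with C and f held constant."""
    return (v_new / v_old) ** 2


# Core-voltage steps taken from the slide.
steps = [(1.1, 1.0), (1.0, 0.85), (1.1, 0.85)]
for v_old, v_new in steps:
    ratio = dynamic_power_ratio(v_new, v_old)
    print(f"{v_old} V -> {v_new} V: dynamic power x {ratio:.3f}")
```

Under these assumptions, the move from 1.1 V to 0.85 V alone would cut dynamic power by roughly 40%.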
37. Programmable Speed vs. Leakage. [Figure: transistor cross-section labeled source, drain, gate, substrate and channel. Substrate bias of 0 V: high speed (HS). Substrate bias below 0 V: low power (LP). The threshold voltage V_T, which trades speed against power, is automatically controlled by software.] Note: A simple “model” showing Programmable Power Technology. Actual implementation varies and is patented.
38. Programmable Power Technology: performance where you need it, lowest power everywhere else, automated by Quartus II software. [Figure: logic array showing high-speed logic on the timing-critical path, low-power logic elsewhere, and unused logic set to low power.]
Goal of this slide: lay the foundations of power consumption. Engineers will understand all three types of power listed, but this introductory slide ensures everyone is on the same page.
In-rush (AKA power-up Icc, surge Icc, startup Icc; Icc means current) is the current drawn by the FPGA immediately after power is applied to the pins. Typically, FPGAs draw a lot of current during this stage, but technological innovation implemented in Stratix II has allowed us to mitigate these effects, as shown in the diagram. EP2S60ES surges approximately 2 A. Virtex-4 will surge as indicated by its datasheet; see the FAE power presentation. Stratix devices were also specified to surge.
Static (AKA standby, quiescent, leakage) is the current drawn by the non-operating portions of the device. In this diagram, the static section shows the power consumption of the device in a non-operational condition (i.e., no clock signal is applied). 90 nm Stratix II devices consume more static power than 130 nm Stratix devices.
Dynamic + static (AKA total power) is the power consumed by the FPGA during normal operation of the device. The clock is oscillating and the customer’s design is operating. Actual dynamic power consumption (and therefore total power) will vary greatly from design to design based on frequency, resource utilization, operating temperature, and other factors.
Non-terminated I/O standards consume very little static power.
Choosing appropriate I/O standards can significantly reduce design power. To reduce power, use a low-voltage I/O standard (most important) and the lowest drive strength that will meet your speed requirements. For lower-frequency applications, or applications where I/Os are idle most of the time, I/O standards that are not resistively terminated, such as LVTTL, LVCMOS and PCI, have the lowest total power, since they have very low static power. However, as I/O toggle rates increase, these I/O standards eventually dissipate more power than resistively terminated standards such as SSTL, HSTL and LVDS, because the unterminated I/O standards generally have higher dynamic power. Use the PowerPlay Power Analyzer to analyze different I/O configurations and choose the lowest-power option for your system.
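The crossover described in this note can be illustrated with a simple linear model: total I/O power is a static term plus a dynamic term proportional to toggle rate. Everything numeric below is a placeholder chosen to make the arithmetic easy to follow, not a datasheet or PowerPlay value, and the linear model itself is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoStandard:
    name: str
    static_mw: float            # power with the pin idle (placeholder)
    dynamic_mw_per_mhz: float   # added power per MHz of toggling (placeholder)

    def power_mw(self, toggle_mhz: float) -> float:
        """Total power at a given toggle rate under the linear model."""
        return self.static_mw + self.dynamic_mw_per_mhz * toggle_mhz


def crossover_mhz(unterminated: IoStandard, terminated: IoStandard) -> float:
    """Toggle rate above which the terminated standard uses less power."""
    return (terminated.static_mw - unterminated.static_mw) / (
        unterminated.dynamic_mw_per_mhz - terminated.dynamic_mw_per_mhz
    )


# Hypothetical figures: low static / high dynamic vs. high static / low dynamic.
unterminated = IoStandard("unterminated (e.g. LVCMOS)", 0.5, 0.25)
terminated = IoStandard("terminated (e.g. SSTL)", 12.5, 0.0625)

print("crossover at", crossover_mhz(unterminated, terminated), "MHz")
for rate in (10.0, 200.0):
    print(rate, "MHz:", unterminated.power_mw(rate), "mW vs",
          terminated.power_mw(rate), "mW")
```

With these placeholder figures the unterminated standard wins at low toggle rates and loses at high ones. For a real design the comparison should come from the PowerPlay Power Analyzer, as the note says.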
2. Block Model, Routing Model, Operating Conditions, Vectorless Activity Estimation 3. Thermal management: Heat sink and Fan (Airflow)
Glitch filtering
Power Dissipation of Stratix II Devices in 112 Different Designs. Varied logic resource utilization across available resources. All clocks set to 200 MHz, assuming 12.5% toggle rate everywhere. Power breakdown is design dependent; characteristics vary depending on design function.
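The benchmark conditions in this note translate into a per-signal switching rate as sketched below. The sketch assumes the common convention that a percentage toggle rate is the fraction of clock cycles in which a signal changes state, and that each transition on a net dissipates ½CV². The capacitance and voltage are placeholders, not Stratix II characterization data.

```python
def transitions_per_second(clock_hz: float, toggle_fraction: float) -> float:
    """Average signal transitions per second for a given clock and toggle rate."""
    return clock_hz * toggle_fraction


def net_dynamic_power_w(cap_f: float, vdd: float, transitions_hz: float) -> float:
    """First-order dynamic power of one net: 1/2 * C * V^2 per transition."""
    return 0.5 * cap_f * vdd ** 2 * transitions_hz


# Conditions from the note: 200 MHz clocks, 12.5% toggle rate.
rate = transitions_per_second(200e6, 0.125)

# Placeholder net: 1 pF switched capacitance at a 1.2 V supply.
power = net_dynamic_power_w(1e-12, 1.2, rate)

print(f"{rate / 1e6:.1f} M transitions/s, {power * 1e6:.1f} uW per net")
```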
When instantiating a RAM using the MegaWizard Plug-In Manager …
Not really needed for Quartus synthesis, since you can tell it to directly optimize for power. Area synthesis
altclkctrl
You can access this MegaFunction with the MegaWizard Plug-In Manager in the Quartus GUI.
In addition to the selectable core voltage feature, Stratix III also offers Programmable Power Technology, which enables Stratix III core logic to be programmed at the tile level for high-speed or low-power mode. This is done automatically by Quartus II software. A tile is defined as a LAB and MLAB pair together with the adjacent routing associated with that LAB and MLAB. A DSP block, a memory block, and an I/O interface are each also defined as a tile. Tiles with DSP blocks, memory blocks, and I/O elements are always set to high-speed mode when they are used in the design, and are configured as low-power by default when they are not used, which reduces static and dynamic power.
Existing FPGA fabrics are designed to deliver the highest performance everywhere, which results in high leakage current everywhere. (Click) However, the majority of designs have only a few timing-critical paths. These paths require the highest performance. (Click) The rest of the logic, though, does not require the highest performance, and with Stratix III all non-performance-critical logic can be set to low-power mode. (Click) All unused logic is also set to low-power mode. This reduces static power by 70% where low-power logic can be used. Productivity is a significant part of this story, as this added complexity would be a huge burden for the customer to manage unless we fully automate this process in the Quartus II design methodology.
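The automated assignment described in this note can be pictured with a small sketch. It assumes each used tile reports a timing slack, that tiles with slack below a chosen margin must stay in high-speed mode, and that low-power mode cuts a tile's static power by the 70% quoted in the note. The tile data, the margin, and the per-tile leakage figure are hypothetical, and this is not a description of how Quartus II actually implements the feature.

```python
from dataclasses import dataclass

LOW_POWER_STATIC_SAVING = 0.70   # figure quoted in the note
HS_STATIC_MW = 1.0               # hypothetical per-tile leakage in high-speed mode


@dataclass
class Tile:
    name: str
    used: bool
    slack_ns: float = 0.0        # ignored for unused tiles


def choose_mode(tile: Tile, margin_ns: float = 0.5) -> str:
    """Unused tiles and tiles with enough slack go to low-power mode."""
    if not tile.used:
        return "low-power"
    return "high-speed" if tile.slack_ns < margin_ns else "low-power"


def static_power_mw(tiles: list) -> float:
    """Total static power after mode assignment."""
    total = 0.0
    for tile in tiles:
        if choose_mode(tile) == "high-speed":
            total += HS_STATIC_MW
        else:
            total += HS_STATIC_MW * (1.0 - LOW_POWER_STATIC_SAVING)
    return total


tiles = [
    Tile("critical_path", used=True, slack_ns=0.1),
    Tile("relaxed_logic", used=True, slack_ns=3.0),
    Tile("spare", used=False),
]

for tile in tiles:
    print(tile.name, "->", choose_mode(tile))
print("static power:", round(static_power_mw(tiles), 2), "mW,",
      "all high-speed would be", HS_STATIC_MW * len(tiles), "mW")
```

Only the tile on the timing-critical path pays the high-speed leakage cost; the other two tiles drop to the low-power figure.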
Designs that are very high in LE usage and frequency see the highest power reduction relative to the FPGA.