Power Saving Design Techniques with Low Cost FPGAs
Static and Dynamic Power vs. Node. Source: International Technology Roadmap for Semiconductors (ITRS) 2001, 2002; "Moore's Law Meets Static Power," Computer, December 2003, IEEE Computer Society.
Welcome to this module on the ECP2M family of FPGAs from Lattice. This module provides an overview of the sources of FPGA power dissipation, design practices that can help reduce consumption and thus junction temperature, how to estimate and analyze power, and some tips for managing the variety of power supplies required for an FPGA implementation.
Here are some key problems you might face with any FPGA power implementation. What will be the system level power supply requirements? What will be the current draw? What voltage levels will be required, and what power up/down issues will there be? What will be the thermal conditions of the device, and will it work reliably given the environment and the design I expect to run? Will I need to design in cooling mechanics for the board to counteract a hot part? And then, given the variety of voltage sources required, how can I manage sequencing? Management of FPGA power has become an important consideration for many designers, because increased dissipation can lead to larger power supplies and cooling systems. Using good design techniques can help reduce that power demand. Reducing power consumption increases the reliability of integrated circuits and can help lower the costs of production with leaner power supplies and fewer cooling requirements. Traditionally, FPGA designers have been concerned with timing and area efficiency; however, as FPGAs have moved more and more into the role of replacing ASSPs and ASICs, designers have been pressured to develop lower-power designs, produce better power estimates earlier in the design flow, and manage the sequencing of the variety of core and I/O voltages that often accompany an FPGA implementation.
Understanding FPGA thermodynamics will help you identify the high-impact, low-effort methods to reduce power. Total power is a function of several contributing power sources, along with the characteristics of the process node and device packaging.
Power in an electronic device is defined as the rate at which work is done by an electric current. Devices tend to convert that work into heat, which unfortunately is not very useful for most applications unless your design is a heater or a light bulb. Power is expressed in joules per second, or watts, given by the equation P = V × I. CMOS FPGAs dissipate power from two primary sources, static and dynamic, and the total power dissipation is the sum of the static and dynamic power. The static, or DC, power depends on process, voltage, and temperature, or PVT variation. The dynamic, or AC, power is a strong function of the frequency and activity of the resources, and a weaker function of PVT. Dynamic power is expressed in the second equation: P_dynamic = 1/2 × beta × C × VDD² × f, that is, one-half beta times capacitance times VDD squared times frequency. The AC portion of the power consumption is associated with the used resources of the device. This dynamic dissipation is directly proportional to the frequency and activity at which each resource is running and to the number of resource units used. From the equation, it becomes clear how power consumption can be influenced: by lowering the supply voltage, which is the largest factor, and by reducing switched capacitance, the switching activity of nodes, and the frequency of signal transitions.
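To make the two equations concrete, here is a minimal Python sketch. It assumes beta is the switching activity factor (the narration does not define it), and the capacitance, voltage, and frequency values are illustrative placeholders, not Lattice device data.

```python
def dynamic_power_w(beta, capacitance_f, vdd_v, freq_hz):
    """Dynamic power in watts: 1/2 * beta * C * VDD^2 * f."""
    return 0.5 * beta * capacitance_f * vdd_v ** 2 * freq_hz


def total_power_w(static_w, dynamic_w):
    """Total dissipation is the sum of static and dynamic power."""
    return static_w + dynamic_w


# Illustrative placeholder values (assumptions, not datasheet numbers).
BETA = 0.2      # assumed switching activity factor
C_F = 1e-9      # assumed total switched capacitance, farads
F_HZ = 100e6    # assumed clock frequency, hertz

base = dynamic_power_w(BETA, C_F, 1.2, F_HZ)        # nominal 1.2 V supply
lower_v = dynamic_power_w(BETA, C_F, 1.08, F_HZ)    # supply lowered by 10%
half_f = dynamic_power_w(BETA, C_F, 1.2, F_HZ / 2)  # clock halved

print(f"10% lower VDD keeps {lower_v / base:.2f} of the dynamic power")
print(f"half the clock keeps {half_f / base:.2f} of the dynamic power")
```

Because the voltage term is squared, a 10% supply reduction leaves about 81% of the dynamic power, while halving the frequency leaves exactly half.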
This graph illustrates the relative static versus dynamic power consumption of the Lattice ECP15, a 130 nanometer FPGA, and the Lattice ECP2/M, a 90 nanometer FPGA, with a design that models 90% logic utilization, 100% utilization of embedded ASIC blocks like PLLs, memory, and DSP features, and around 80% utilization of I/Os using a mixture of LVCMOS 1.2 and LVDS 2.3 DDR type signal standards. DC power can be further subdivided into the power consumption of the used and unused resources of the FPGA.
In the older process nodes, CMOS FPGAs had very low static DC power dissipation. Most energy was consumed by switching activity and by charge/discharge of load capacitances, largely a function of the user design. But this convention changes at around the 90 nanometer process node and smaller. Transistor physics changes at smaller geometries such that static leakage is now more significant. The graph here shows how static power is growing exponentially due to increasing transistor leakage. The crossover point, where static power overtakes dynamic, is around the 65 nanometer node. Lattice and other semiconductor vendors address these issues largely in their respective fabrication processes and in the transistor mix used in each device. This trend, however, makes adoption of a power verification methodology all the more important with 90 nanometer devices.
To illustrate the effects of node switching activity, this graph plots power consumption versus frequency for the sample hardware model described earlier, on the ECP2/M and on the 130 nanometer ECP device. The power benefit of the 90 nanometer FPGA is clear in this example.
This figure also compares power consumption of the ECP2 versus the ECP, but looking at I/Os here, based on the average output load capacitance in picofarads. This plot demonstrates how power consumption of I/Os remains relatively constant between the 130 and the 90 nanometer device families.
The relative consumption by resource type also changes between process nodes. As an example, the figures illustrate the contribution by resource (routing, logic such as gates and registers, embedded block RAM, and so on) in the 130 nanometer versus the 90 nanometer part. The charts demonstrate that while overall power consumption drops with the 90 nanometer device, the relative amount represented by I/Os increases. So at first glance, as FPGA process geometry shrinks, designers should benefit from the reduced power consumption of smaller transistors and IC dies. However, this benefit can often be offset by larger designs and higher speeds.
The primary sources of power are static, which is a function of PVT, and dynamic, which is a function of activity. Both, and especially dynamic I/O activity, should be accounted for when designing and verifying a design.
Heat is a key byproduct of the work performed by a device and must be addressed to ensure an FPGA operates within the junction temperature specification. Semiconductor devices will operate normally as long as the temperature does not exceed the upper limits specified for the ambient temperature and for the temperature of the junctions inside the semiconductor. If an upper limit is exceeded, the semiconductor stops operating normally and can be damaged. Thermal management is indispensable when using the FPGA for high-power applications or under high operating temperatures. The concept of thermal resistance is used when considering heat dissipation. A power design methodology is based on the thermal device specs published in the respective datasheet. So for example, in the ECP2/M device datasheet, you'll find T_JCOM and T_JIND in the absolute maximum ratings section. While total power, ambient temperature, thermal resistance, and airflow all contribute to device thermal dynamics, the junction temperature T_J is key to reliable device operation. You should also be aware of the min/max numbers for supply voltages, since they may help you reduce static power.
So to avoid reliability issues, semiconductor vendors specify a maximum allowable junction temperature in the datasheet, as we've seen. You should always complete a thermal analysis of your design to ensure the device and the package don't exceed the junction temperature requirements. The thermal data shown is relative, and the actual values depend on a variety of factors like die size, paddle size, airflow, power supplied, the PCB design itself, and, of course, the user application. Data from the actual application will supersede the package thermal data provided by the device vendor. The most common parameters are θ_JA, the thermal resistance junction-to-ambient, and θ_JC, the thermal resistance junction-to-case; another common one is θ_JB, the thermal resistance junction-to-board. The maximum junction temperature of the device is calculated by expressions of this form: T_J = T_A + P × θ_JA, that is, T_J is T_A plus the product of power times θ_JA. We use the total power consumption of the device. θ_JA is commonly used with natural or forced convection air-cooled systems, and θ_JC is useful when the package has a high-conductivity case mounted directly to the PCB or a heatsink.
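The junction temperature expression can be sketched in a few lines of Python. The ambient temperature, power, and θ_JA values below are made-up placeholders; real values come from your own power estimate and the package thermal data in the datasheet.

```python
def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """T_J = T_A + P * theta_JA, with temperatures in degrees C."""
    return ambient_c + power_w * theta_ja_c_per_w


def within_limit(tj_c, tj_max_c):
    """True when the estimated junction temperature does not exceed the limit."""
    return tj_c <= tj_max_c


# Placeholder inputs (assumptions for illustration only).
T_AMBIENT_C = 50.0   # ambient temperature inside the enclosure
POWER_W = 1.5        # total device power, static plus dynamic
THETA_JA = 20.0      # junction-to-ambient thermal resistance, C per watt

tj = junction_temp_c(T_AMBIENT_C, POWER_W, THETA_JA)
print(f"Estimated T_J: {tj:.1f} C")
```

With these placeholder numbers the estimate is 50 + 1.5 × 20 = 80 degrees C, which you would then compare against the junction temperature limit for your device grade.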
This chart illustrates the thermal resistance characteristics, θ_JA and θ_JC, across the package range of the ECP2/M family. It demonstrates the benefits of certain package types and airflow. When designing a system, designers must make sure that the devices will operate at the specified temperatures within the system environment. This is particularly important to consider before a system is designed. The ability to estimate the device's operating temperature prior to board design also allows the designer to better plan for budgeting and airflow. A commercial device is likely to show speed degradation at a junction temperature above 85 degrees C, and an industrial device at over 100 degrees C. The device temperature must be kept below these limits to achieve the guaranteed speed operation.
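Turning the same relation around gives a rough power budget before the board exists. This is a sketch only: the 85 and 100 degree C limits come from the narration, while the ambient temperature and the θ_JA values for still and moving air are hypothetical stand-ins for the numbers in the package chart.

```python
COMMERCIAL_TJ_MAX_C = 85.0    # commercial limit quoted in the narration
INDUSTRIAL_TJ_MAX_C = 100.0   # industrial limit quoted in the narration


def max_power_w(tj_max_c, ambient_c, theta_ja_c_per_w):
    """Largest total power that keeps T_J at or below tj_max_c."""
    if theta_ja_c_per_w <= 0:
        raise ValueError("theta_JA must be positive")
    return (tj_max_c - ambient_c) / theta_ja_c_per_w


# Hypothetical package data: airflow lowers the effective theta_JA.
AMBIENT_C = 55.0
THETA_JA_STILL_AIR = 20.0   # C per watt, assumed
THETA_JA_WITH_FAN = 15.0    # C per watt, assumed

still_air_budget = max_power_w(COMMERCIAL_TJ_MAX_C, AMBIENT_C, THETA_JA_STILL_AIR)
fan_budget = max_power_w(COMMERCIAL_TJ_MAX_C, AMBIENT_C, THETA_JA_WITH_FAN)
industrial_budget = max_power_w(INDUSTRIAL_TJ_MAX_C, AMBIENT_C, THETA_JA_STILL_AIR)

print(still_air_budget, fan_budget, industrial_budget)
```

If the estimated design power exceeds the budget, the options are the ones the narration lists: a package with lower thermal resistance, more airflow, or a lower-power design.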
Static or DC power reduction tips include using a sleep mode if it's available. For example, during a period of system inactivity, LatticeXP devices can be placed in a sleep mode. During this mode, standby current is reduced by over one thousand times, and the power supplies don't have to be switched off. Another technique is to reduce your operating voltage: look at the specs for VCC and VCCJ and operate at the lower end of the device specification. Also, minimizing the operating temperature can be accomplished by using packages with a lower thermal impedance.
Here are some more I/O-specific techniques for reducing consumption. Try reducing the switched capacitances and frequencies of I/Os, and decouple I/Os when you're in the sleep mode. If this is impossible, power down the core and leave VCCO applied. Try reducing the I/O voltage swing, and keep the I/O drive as low as possible. Use lower voltage standards for your I/Os. Look at slew rate controls to reduce output switching current. Some FPGAs provide control over LVCMOS or LVTTL buffers, which can be configured for either low noise or high speed performance. Another possibility here is not to power I/O banks that are unused.
Also look for opportunities to lower your clock speed on your non-high-performance clock domains. This reduces power consumption, since dynamic power is directly proportional to the frequency of operation. Designers must determine whether portions of the design can be clocked at a lower rate. Also, disabling timing-driven mapping and enabling register retiming, two optimization options you can find in the back-end place and route tools, will benefit power. The next set of reduction tips are, again, more design related and depend on how you write your HDL. Use signal encoding optimization for counters or state machines. Designers should try to target the embedded ASIC blocks of the device rather than general fabric logic. EBRs, DSPs, and similar blocks in modern FPGAs will have lower consumption than generic LUT or register logic.
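As one illustration of signal encoding, the Python sketch below counts how many register bits flip over one full cycle of a 4-bit counter in plain binary versus Gray code. Gray code is my choice of example; the narration does not name a specific encoding. Fewer bit flips per count means less switching activity in the counter registers.

```python
def to_gray(n):
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def bit_toggles(sequence):
    """Total bit flips between consecutive values, including the wrap-around."""
    total = 0
    for i, current in enumerate(sequence):
        following = sequence[(i + 1) % len(sequence)]
        total += bin(current ^ following).count("1")
    return total


WIDTH = 4
binary_seq = list(range(2 ** WIDTH))
gray_seq = [to_gray(n) for n in binary_seq]

binary_toggles = bit_toggles(binary_seq)
gray_toggles = bit_toggles(gray_seq)

print(f"binary counter: {binary_toggles} bit flips per cycle")
print(f"gray counter:   {gray_toggles} bit flips per cycle")
```

For the 4-bit case this gives 30 flips for binary and 16 for Gray code. The model only counts register output toggles; the extra logic needed to generate or decode the encoding is not included.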
This area, of course, largely concerns the design itself: your RTL or source code, the actual contents of your design. One technique is to enable synthesis area optimization on all or a portion of your design. By reducing the span of the design across the device, a more closely placed design will use fewer routing resources, for less power consumption. Clock gating optimization and quadrant clocking are tied together. And then there are approaches that use clock enables or gated clocks, rather than operating on every cycle.
Here's an example of some of these techniques, called push-button optimization: the area optimization technique and the register retiming approach. In this case, the impact on performance and power is measured with a sample design implemented in an ECP2 device. Using area-optimized synthesis, register retiming, and the ispLEVER design mapper, power drops by around 20%; performance, however, declines by about 10% in this case.
Like simulation, FPGA thermal analysis is a verification flow that runs in parallel with the traditional FPGA implementation tools, in an ispLEVER design flow for example. FPGA designers can estimate consumption at any stage: pre-synthesis, post-map, pre-route, post-route, and post-simulation. Use the ispLEVER Power Calculator to estimate used resources, activity factors of logic blocks, and toggle rates of I/Os before synthesis and place and route, or use it at a later stage as more implementation details become available.
Here's a screenshot of the Power Calculator UI. The calculator's inputs are power parameters such as device characteristics, voltage, temperature, device variations, airflow, heat sink, resource utilization, activity, and frequency. It uses all these factors to calculate the device consumption, and then it reports both the DC and AC portions of consumption. Once a design is imported or all the required information is provided, the software produces a power estimate and predicts the junction temperature. Any time the junction temperature is outside the limits specified in the data sheet, the viability of operating the device without some cooling technique must be reevaluated.
Here's an example equation from Power Calculator, used to estimate consumption of the device look-up tables. In this expression, total AC power for the LUTs is the power constant for the LUT blocks in milliwatts per MHz, times the maximum frequency of the LUT clock measured in MHz, times the activity factor of the LUTs, times the number of LUTs used in the design. Activity factor is the percentage of switching activity. Power Calculator uses activity factors and toggle rates as a means to model dynamic power consumption.
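The LUT expression is a straight product of four terms, which the following Python sketch spells out. The power constant used here is a made-up placeholder; the real constant is device specific and lives inside the vendor tool, so treat the result as arithmetic practice rather than an estimate for any actual part.

```python
def lut_ac_power_mw(k_lut_mw_per_mhz, f_max_mhz, activity_factor, num_luts):
    """Total LUT AC power in mW: K_LUT * f_max * AF * number of LUTs.

    activity_factor is a fraction, so 20% is passed as 0.20.
    """
    return k_lut_mw_per_mhz * f_max_mhz * activity_factor * num_luts


# Hypothetical inputs for illustration only.
K_LUT = 0.0001     # mW per MHz per LUT, placeholder constant
F_MAX_MHZ = 100.0  # LUT clock frequency
AF = 0.20          # 20% activity factor
NUM_LUTS = 10_000  # LUTs used in the design

lut_power = lut_ac_power_mw(K_LUT, F_MAX_MHZ, AF, NUM_LUTS)
print(f"LUT AC power: {lut_power:.1f} mW")
```

With these placeholders the product works out to about 20 mW, and it scales linearly with each term: doubling the clock or the LUT count doubles the result.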
The activity factor, or AF percentage, is defined as the percentage of frequency or time that a signal is active or toggling the output. Most resources associated with a clock domain are running or toggling at some percentage of the frequency at which the clock is running. Activity factor can be calculated for each routing resource, output, or PFU. However, this can involve long calculations, so our general rule of thumb is that for a design occupying roughly 30-70% of the device, an activity factor between 15% and 25% is a reasonable average value. The accurate value of the activity factor depends on clock frequency, the stimulus to your design, and the final output. The key input term used for I/O consumption estimates is the I/O toggle rate. The activity of I/Os is determined by the signals provided by the user in the case of inputs, or is an output of the design in the case of output signals. The rates at which I/Os toggle define their activity. The toggle rate, or TR, in MHz is defined in the expression shown.
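The toggle rate expression itself is on the slide and does not survive in the narration. The Python sketch below therefore assumes a commonly used definition, toggle rate = 1/2 × clock frequency × activity factor, and should be checked against the Power Calculator documentation before you rely on it. The function names are hypothetical.

```python
def toggle_rate_mhz(f_clock_mhz, activity_factor):
    """Assumed definition: TR (MHz) = 1/2 * f_clock * AF, with AF as a fraction."""
    return 0.5 * f_clock_mhz * activity_factor


def rule_of_thumb_af(utilization):
    """Return the narration's 15-25% AF range for designs filling 30-70% of the device.

    Returns None outside that utilization range, where the rule of thumb
    is not stated.
    """
    if 0.30 <= utilization <= 0.70:
        return (0.15, 0.25)
    return None


tr = toggle_rate_mhz(100.0, 0.20)   # 100 MHz clock, 20% activity
af_range = rule_of_thumb_af(0.50)   # design occupying half the device

print(f"toggle rate: {tr:.1f} MHz, AF range: {af_range}")
```

Under this assumed definition, a 100 MHz clock domain with a 20% activity factor corresponds to a 10 MHz toggle rate.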
Given an application where you must account for power consumption, a power closure methodology should be adopted. As the first step, the designer should look for opportunities to create power-friendly RTL. High-impact, low-effort practices include targeting embedded blocks, coding smaller state machines, and organizing blocks in a manner such that area optimization won't overly impact performance. If the device is a higher-density 90 nanometer device, I/O programming and switching should be given extra scrutiny to save power. Next, power-friendly synthesis and place-and-route optimizations like register retiming and area optimization should be applied. And finally, a robust test bench that reflects actual operating conditions will help build accurate activity factors and toggle rates for post-simulation analysis in the power estimation software.
The main point is this graph of the power rails. Power supply sequencing is an important consideration when you're managing the power budget of an FPGA. For example, there are three main power supplies required to power up the ECP2M device for proper operation: VCC, VCCAUX, and VCCIO8 for bank 8. Power management circuits have become an important companion on the circuit board to deal with power up and down sequencing. The sequencing circuit below illustrates how the Lattice ispPAC-POWR1014 device serves as a programmable controller for the various voltage rails attached to the ECP2M FPGA. In this sample sequencing circuit, the blue boxes are DC-to-DC supplies, with the input enables coming in from the controller. They source a variety of voltage rails to the FPGA on the right. There are supervisory and control signals coming in on the left and right, into the power manager.
The ispPAC-POWR1014/A is a general-purpose power-supply monitor and sequence controller, incorporating both in-system programmable logic and in-system programmable analog functions implemented in non-volatile E2CMOS® technology. The ispPAC-POWR1014/A integrates many power management functions typically requiring multiple ICs.
In step one, the power manager waits for an internal all-good signal to indicate that its own power is available. In step two, it turns on the 1.2, 2.5, and 3.3 volt supplies for the FPGA. In steps three and four, once VCC has crossed the VCC min value, it turns on VCCAUX. Next, it waits for all of these supplies to stabilize, and then a power-good output signal is asserted to indicate that the FPGA is ready to be programmed. In step seven, the sequence controller waits for the FPGA DONE signal to be issued from the ECP2M. If this wait, measured using the internal timer, exceeds 520 milliseconds, a subroutine is run to initiate a shutdown. In step eight, the controller waits for any supply to fail or for a shutdown signal to go active. Steps 9 through 14 are the shutdown sequence. The controller first disables the VCCAUX MOSFET and waits for VCCAUX to fall to the 100 millivolt threshold. Then it powers off all of the other supplies and waits for a recycle signal to start the FPGA sequencing again.
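The sequence just described can be modeled as a small state machine. The Python sketch below is only a behavioral illustration of those steps, not the actual ispPAC-POWR1014 program: the state names, the input signal names, and the 1.14 volt VCC min default are all assumptions, while the 520 millisecond timeout and 100 millivolt threshold come from the narration.

```python
from dataclasses import dataclass
from enum import Enum, auto

DONE_TIMEOUT_MS = 520          # DONE timeout from the narration
VCCAUX_OFF_THRESHOLD_V = 0.1   # 100 mV shutdown threshold from the narration


class State(Enum):
    WAIT_INTERNAL_OK = auto()  # step 1: wait for the controller's own power
    WAIT_VCC_MIN = auto()      # steps 2-4: supplies on, wait for VCC above min
    WAIT_STABLE = auto()       # VCCAUX on, wait for all rails to stabilize
    WAIT_DONE = auto()         # power-good asserted, wait for FPGA DONE
    RUNNING = auto()           # step 8: monitor for faults or shutdown request
    SHUTDOWN_VCCAUX = auto()   # steps 9-14: VCCAUX disabled, wait for it to fall
    WAIT_RECYCLE = auto()      # all supplies off, wait for recycle


@dataclass
class Inputs:
    """Hypothetical monitored signals; names are illustrative only."""
    internal_ok: bool = False
    vcc_v: float = 0.0
    vcc_min_v: float = 1.14    # assumed placeholder threshold
    supplies_stable: bool = False
    fpga_done: bool = False
    ms_waiting_for_done: int = 0
    supply_fault: bool = False
    shutdown_request: bool = False
    vccaux_v: float = 3.3
    recycle: bool = False


def step(state, i):
    """Return the next state for the current state and inputs."""
    if state is State.WAIT_INTERNAL_OK:
        return State.WAIT_VCC_MIN if i.internal_ok else state
    if state is State.WAIT_VCC_MIN:
        return State.WAIT_STABLE if i.vcc_v >= i.vcc_min_v else state
    if state is State.WAIT_STABLE:
        return State.WAIT_DONE if i.supplies_stable else state
    if state is State.WAIT_DONE:
        if i.fpga_done:
            return State.RUNNING
        if i.ms_waiting_for_done > DONE_TIMEOUT_MS:
            return State.SHUTDOWN_VCCAUX
        return state
    if state is State.RUNNING:
        if i.supply_fault or i.shutdown_request:
            return State.SHUTDOWN_VCCAUX
        return state
    if state is State.SHUTDOWN_VCCAUX:
        return State.WAIT_RECYCLE if i.vccaux_v <= VCCAUX_OFF_THRESHOLD_V else state
    if state is State.WAIT_RECYCLE:
        return State.WAIT_INTERNAL_OK if i.recycle else state
    raise ValueError(f"unknown state: {state}")
```

For example, calling step(State.WAIT_DONE, Inputs(ms_waiting_for_done=521)) moves to the shutdown path, matching the timeout behavior described in step seven. A pure transition function like this keeps the sequencing rules easy to test separately from any hardware access.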
One of the most critical factors in design is reducing system power consumption, which is especially important for handheld devices and other modern electronic products. Low-power design techniques depend on the device type targeted and the characteristics of the design. An understanding of the sources of FPGA power consumption, static and dynamic, core and I/O, will influence your power reduction strategy. Given the variety of voltage rails and sequencing requirements of modern FPGAs, sequencing circuits and programmable controllers should be considered as part of any implementation.
Thank you for taking the time to view this presentation on Power Saving Design Techniques with Low Cost FPGAs. If you would like to learn more or go on to purchase some of these devices, you can either click on the link embedded in this presentation, or simply call our sales hotline. For more technical information you can either visit the Lattice site (link shown) or, if you would prefer to speak to someone live, please call our hotline number, or even use our 'live chat' online facility.