Processor Core Power Specs: A Cautionary Tale
By Steve Leibson, Tensilica, Inc.
Anyone familiar with board-level design has developed an intuitive feel for packaged-processor power specifications: the processor draws a certain amount of power, give or take a percentage based on process variation and speed binning. For a variety of reasons, this intuition utterly fails with respect to vendor specifications for processor core IP.
While a packaged processor’s measured power specs must necessarily account for all circuitry in the package, processor core power specifications are based on simulations—vendors are free to delete or ignore any number of power-dissipating functions when reporting power numbers. Many factors will greatly affect the power specifications listed on the processor core’s spec sheet, including the target fabrication technology (both the lithography size and the process), the cell library used to generate the core, and the execution activity imposed on the processor during power simulation. Consequently, caution and judicious reading of the vendor data sheets are called for when comparing the power numbers for competing processor IP. Rarely will you find apples to compare to apples from the data sheets alone.
Figure 1 shows a small extract from a processor core’s data sheet. (The actual numbers have been obfuscated because they aren’t germane to this discussion.) It’s the footnote that’s important. The footnote informs the reader that the two power-dissipation numbers listed for the processor core are for TSMC’s 0.18μm G and 0.13μm LVLK IC-fabrication processes. The footnote also states that synthesis has been optimized for power and that the power specifications do not include the clock tree. Note that all of this information is critically important when comparing processor cores. Also note that the information in this footnote is incomplete.

Figure 1: Power specifications from the data sheet of a processor core.
(specs intentionally blurred)
Consider the facts listed in Figure 1’s footnote:
- Process technology: The two process technologies listed in Figure 1 are the 0.18μm G and 0.13μm LVLK IC-fabrication processes from TSMC (Taiwan Semiconductor Manufacturing Company Limited). TSMC’s G processes are general-purpose, lowest-cost processes. Transistors fabricated in TSMC’s high-performance LVLK processes have a lower Vt (threshold voltage), which increases static-leakage power but can reduce operating power by allowing the chip to run at a lower operating voltage for a given clock frequency. Even at the same lithographic geometries, different IC-fabrication processes (for example, general-purpose versus high-speed or one manufacturer versus another) can result in as much as a 50% difference in power dissipation. If you want to compare apples to apples, you will need processor core specifications based on a common fabrication process. You must know both the lithographic geometry used (expressed in microns or nanometers) and the process type (general-purpose, low-voltage, high-performance, etc.). If the numbers you have are not from precisely the same process, then they are not comparable. There are rule-of-thumb conversion factors available to translate from one process or process geometry to another, but these conversions insert yet another layer of uncertainty in the power numbers.
- Synthesis optimization setting: Figure 1’s footnote states that the power specifications were obtained by setting the logic-synthesis tool’s switches to produce a power-optimized design. That’s not the only setting available. Synthesis runs can also be optimized for area or for speed. Each optimization setting produces a processor core that consumes different amounts of power per clock cycle because each setting produces different amounts of logic. In fact, processor-core data sheets that list maximum clock rate, silicon area consumption, and power dissipation specifications may show numbers based on different synthesis optimizations for each specification. If the data sheet doesn’t explicitly state the synthesis optimization used to obtain a particular specification, you’ll need to ask the vendor.
- Circuit omissions: Curiously, Figure 1’s footnote states that the power specification does not include the clock tree. This is a surprising and very important circuit omission because a processor’s clock tree includes the gates that operate at the core’s highest frequency. A processor’s clock tree dissipates much of the operating power in a properly designed processor. Omitting the clock tree’s power dissipation from the core’s power specification can reduce the stated dissipation by 30-50%. However, it’s not possible to fabricate the processor core without a clock tree. Obviously, it would be grossly unfair to directly compare the power specifications of two processors and omit the clock tree from one of them.
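To put a specification that omits the clock tree on roughly the same footing as one that includes it, the 30-50% figure can be inverted. The following is only a rough sketch, not a vendor method: the function names and the 1.0 mW input are hypothetical, and it assumes the 30-50% reduction is expressed as a share of the total power (clock tree included).

```python
def add_back_clock_tree(stated_mw, omitted_fraction):
    """Estimate total power when the stated figure leaves out the clock tree.

    Assumption (not from any data sheet): omitted_fraction is the share of
    the *total* power that the clock tree accounts for, e.g. 0.30 to 0.50.
    """
    if not 0.0 <= omitted_fraction < 1.0:
        raise ValueError("omitted_fraction must be in [0, 1)")
    return stated_mw / (1.0 - omitted_fraction)


def clock_tree_bounds(stated_mw, low=0.30, high=0.50):
    """Return (optimistic, pessimistic) total-power estimates in mW."""
    return (add_back_clock_tree(stated_mw, low),
            add_back_clock_tree(stated_mw, high))


if __name__ == "__main__":
    # Hypothetical data-sheet value of 1.0 mW that excludes the clock tree.
    print(clock_tree_bounds(1.0))
```

Under these assumptions, a stated 1.0 mW corresponds to somewhere between about 1.43 mW and 2.0 mW once the clock tree is counted, which is why the omission matters so much in a side-by-side comparison.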
Notably, there are four key factors missing from the footnote in Figure 1: operating voltage, physical cell library used during synthesis, whether the power numbers are pre- or post-P&R (place and route) numbers, and the program code used to exercise the processor core during power simulation. All of these additional factors greatly affect the processor core’s power-dissipation specification. At 130nm, processor cores can operate on supply voltages ranging from approximately 1.2V down to 0.6V, a 2:1 difference in operating voltage that considerably broadens the core’s operating-power range. Clearly, it’s important to know the supply voltage used to obtain the power specification listed on the processor core’s data sheet.
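To get a feel for how much a 2:1 voltage difference matters, the sketch below applies the usual first-order CMOS approximation that dynamic power scales with the square of the supply voltage (P is proportional to C x V^2 x f). That relation is a textbook simplification and is not stated on any data sheet discussed here; the function name is hypothetical, and leakage and library effects are ignored.

```python
def scale_dynamic_power(p_dyn, v_from, v_to):
    """Rescale a dynamic-power figure (mW or mW/MHz) to another supply voltage.

    Assumes the first-order model P_dyn ~ C * V^2 * f, with switched
    capacitance and clock frequency held constant. Leakage is not modeled.
    """
    if v_from <= 0 or v_to <= 0:
        raise ValueError("voltages must be positive")
    return p_dyn * (v_to / v_from) ** 2


if __name__ == "__main__":
    # Moving from 1.2V to 0.6V under this model
    print(scale_dynamic_power(1.0, 1.2, 0.6))
```

Under this model, halving the supply voltage cuts dynamic power to about one quarter, so two specifications quoted at different voltages can differ by a factor of four before any other variable is considered.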
Combined with the core’s operating voltage, the physical cell library used during synthesis can also make a big difference. Table 1 shows the power-dissipation specifications for five of Tensilica’s Diamond Standard series processor cores. The specifications are all for TSMC’s 0.13μm G process but use different physical cell libraries and operating voltages. (Note: All Diamond Standard series power numbers include the power dissipated by the processor’s clock tree.) The power specification for each processor core consists of a dynamic power component and a static (leakage) component. The dynamic-power component increases linearly with operating frequency and the static-power component does not.
| Processor Core | Low-Power: Dynamic Power (mW/MHz) | Low-Power: Leakage (Static) (mW) | High-Performance: Dynamic Power (mW/MHz) | High-Performance: Leakage (Static) (mW) |
|---|---|---|---|---|
| Diamond 108Mini | 0.04 | 0.21 | 0.12 | 0.54 |
| Diamond 212GP | 0.06 | 0.32 | 0.27 | 0.87 |
| Diamond 232L | 0.08 | 0.37 | 0.38 | 1.08 |
| Diamond 570T | 0.09 | 0.51 | 0.41 | 1.41 |
| Diamond 330HiFi | 0.09 | 0.63 | 0.35 | 1.76 |
Table 1: Power specifications for Diamond Standard processor cores synthesized with low-power and high-performance cell libraries and manufactured with TSMC's 0.13μm G process.
The low-power specifications were derived from simulations of processor cores running at 0.6V and built with ARM’s Artisan Metro low-power cell libraries. The low-power cores are constrained to clock speeds well below 100MHz due to the process technology, physical cell libraries, and synthesis options selected. The high-performance specifications were derived from simulations of processor cores running at 1.2V and built with ARM’s Artisan SageX cell libraries. These processor cores dissipate more power than do the low-power versions of the same cores, but the high-performance cores all run at clock rates in excess of 200MHz.
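Because the dynamic component scales with clock frequency and the leakage component does not, a total-power estimate is a one-line calculation. The sketch below uses the Table 1 values; the clock frequencies passed in (50MHz and 200MHz) are illustrative picks consistent with "well below 100MHz" and "in excess of 200MHz", not figures from the data sheets, and the names are hypothetical.

```python
# (dynamic mW/MHz, leakage mW) pairs copied from Table 1
TABLE_1 = {
    "Diamond 108Mini": {"low_power": (0.04, 0.21), "high_performance": (0.12, 0.54)},
    "Diamond 212GP":   {"low_power": (0.06, 0.32), "high_performance": (0.27, 0.87)},
    "Diamond 232L":    {"low_power": (0.08, 0.37), "high_performance": (0.38, 1.08)},
    "Diamond 570T":    {"low_power": (0.09, 0.51), "high_performance": (0.41, 1.41)},
    "Diamond 330HiFi": {"low_power": (0.09, 0.63), "high_performance": (0.35, 1.76)},
}


def total_power_mw(dynamic_mw_per_mhz, leakage_mw, freq_mhz):
    """Total power = dynamic power (linear in frequency) + static leakage."""
    return dynamic_mw_per_mhz * freq_mhz + leakage_mw


if __name__ == "__main__":
    for core, variants in TABLE_1.items():
        # Illustrative clock rates only (assumed, not from the data sheets)
        low = total_power_mw(*variants["low_power"], freq_mhz=50)
        high = total_power_mw(*variants["high_performance"], freq_mhz=200)
        print(f"{core}: {low:.2f} mW at 50MHz, {high:.2f} mW at 200MHz")
```

For the Diamond 108Mini, for example, this gives about 2.21 mW for the low-power build at 50MHz and about 24.54 mW for the high-performance build at 200MHz.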
Table 2 compares the dynamic-power specifications for the same five Diamond Standard series processor cores synthesized for maximum operating frequency using ARM’s Artisan SageX libraries and manufactured with TSMC’s 0.13μm G and 90nm G manufacturing processes. Note that the dynamic power per MHz drops considerably, by roughly 46% to 64% depending on the core, with the finer geometry of TSMC’s 90nm G process. The cores manufactured with TSMC’s 90nm G process can also achieve clock rates approximately 50% higher than the cores manufactured with TSMC’s 0.13μm G process.
| Processor Core | 130nm G: Dynamic Power (mW/MHz) | 90nm G: Dynamic Power (mW/MHz) |
|---|---|---|
| Diamond 108Mini | 0.12 | 0.062 |
| Diamond 212GP | 0.27 | 0.096 |
| Diamond 232L | 0.38 | 0.206 |
| Diamond 570T | 0.41 | 0.155 |
| Diamond 330HiFi | 0.35 | 0.147 |
Table 2: Power specifications for Diamond Standard processor cores manufactured with TSMC's 0.13μm G and 90nm G processes using ARM's Artisan SageX cell library.
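The Table 2 numbers can be turned into per-core reductions, and into a rough answer to the question "what happens if the 90nm core is also clocked faster?" The sketch below is illustrative only: the names are hypothetical, and the default `clock_gain=1.5` simply applies the approximate 50% clock-rate headroom mentioned for the 90nm G process uniformly to every core, which is an assumption.

```python
# (130nm G, 90nm G) dynamic power in mW/MHz, copied from Table 2
TABLE_2 = {
    "Diamond 108Mini": (0.12, 0.062),
    "Diamond 212GP":   (0.27, 0.096),
    "Diamond 232L":    (0.38, 0.206),
    "Diamond 570T":    (0.41, 0.155),
    "Diamond 330HiFi": (0.35, 0.147),
}


def reduction_percent(p130, p90):
    """Percentage drop in dynamic power per MHz going from 130nm to 90nm."""
    return 100.0 * (1.0 - p90 / p130)


def power_ratio_at_faster_clock(p130, p90, clock_gain=1.5):
    """Ratio of 90nm to 130nm dynamic power when the 90nm core runs
    clock_gain times faster (assumed uniform across cores)."""
    return (p90 * clock_gain) / p130


if __name__ == "__main__":
    for core, (p130, p90) in TABLE_2.items():
        print(f"{core}: {reduction_percent(p130, p90):.1f}% lower per MHz, "
              f"ratio at 1.5x clock = {power_ratio_at_faster_clock(p130, p90):.2f}")
```

For the Diamond 108Mini, the per-MHz figure falls by about 48%, and even when the 90nm core is clocked 50% faster its dynamic power is still only about 0.78 times that of the 130nm core under these assumptions.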
An additional factor that helps determine a processor core’s power numbers is whether the power simulation is performed before or after placement and routing. Typically, P&R increases a core’s area by about 10-15%. This increased area translates into additional capacitance, which in turn drives the dynamic-power dissipation numbers up accordingly. All Diamond Standard series power numbers in Tables 1 and 2 are for placed and routed cores.
Even with the factors described to this point, the background conditions for obtaining these power numbers are still not fully specified. One more detail remains: What is the processor doing while the power simulations run? If the program being run during the power simulation is a loop of NOPs, you would expect to get lower power numbers than if the processor were exercising its function units. Thus even the benchmark program being run during power simulation can influence the core’s power-dissipation specifications.
There are no standardized power-benchmarking programs for processor cores. EEMBC, the industry’s benchmark consortium (www.eembc.org), has developed a power benchmark for packaged processors called EnergyBench, but it requires a physical embodiment of the processor and cannot yet be applied to processor-core simulations. The closest the core industry has to a power-benchmarking program standard is Dhrystone version 2.1, an old benchmarking favorite for people who compare processors. All of the power numbers for the Diamond Standard series processor cores shown in Tables 1 and 2 were obtained while running Dhrystone on the processor’s instruction-set simulator.
As this discussion has shown, it’s difficult to constrain power numbers for processor cores to make fair comparisons. However, it is possible. You must ask the processor vendor for the answers to the questions raised above. Only then can you truly compare similar fruits.
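One way to keep track of those questions is to record the conditions behind each power number and compare them field by field. The sketch below is a hypothetical checklist, not a standard format: the class, field, and function names are invented for illustration. The first record holds only what Figure 1's footnote discloses for its 0.13μm number; the second holds the conditions stated in this article for the high-performance Diamond Standard numbers.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class PowerSpecConditions:
    """Conditions behind a power number; None means the data sheet is silent."""
    geometry: Optional[str] = None
    process_type: Optional[str] = None
    synthesis_goal: Optional[str] = None
    includes_clock_tree: Optional[bool] = None
    supply_voltage_v: Optional[float] = None
    cell_library: Optional[str] = None
    post_place_and_route: Optional[bool] = None
    workload: Optional[str] = None


def open_questions(a, b):
    """List every condition that is unknown or differs between two specs."""
    issues = []
    for f in fields(PowerSpecConditions):
        va, vb = getattr(a, f.name), getattr(b, f.name)
        if va is None or vb is None:
            issues.append(f"{f.name}: unknown, ask the vendor")
        elif va != vb:
            issues.append(f"{f.name}: {va!r} vs {vb!r}")
    return issues


# Only what Figure 1's footnote states (0.13um number)
figure_1_core = PowerSpecConditions(
    geometry="0.13um", process_type="LVLK",
    synthesis_goal="power", includes_clock_tree=False,
)

# Conditions given in the text for the high-performance Diamond numbers
diamond_high_performance = PowerSpecConditions(
    geometry="0.13um", process_type="G",
    synthesis_goal="speed", includes_clock_tree=True,
    supply_voltage_v=1.2, cell_library="Artisan SageX",
    post_place_and_route=True, workload="Dhrystone 2.1",
)

if __name__ == "__main__":
    for issue in open_questions(figure_1_core, diamond_high_performance):
        print(issue)
```

In this example only the lithographic geometry matches; every other field either differs or is undisclosed, so the two power numbers cannot be compared until the vendor fills in the gaps.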