Xtensa Processor Floating Point Unit

Floating Point Hardware

Tensilica offers two ways to accelerate floating-point operations:

  • Single-precision floating-point unit (FPU)
  • Double-precision floating-point acceleration (DP-FPA)

Single-Precision Floating-Point Unit (FPU)

The Xtensa processor’s FPU adds the logic and architectural components needed for 32-bit IEEE 754 single-precision floating-point operations. These operations are common in DSP algorithms that require more than 16 bits of precision, such as high-quality audio compression and decompression, printing, and graphics. DSP algorithms that operate on less precise data are also easier to code in floating point because of its wide dynamic range, and hardware floating point boosts the performance of many programs written in high-level programming languages such as C.
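To make the dynamic-range point concrete, the sketch below compares a 16-bit fixed-point format with single-precision floating point. It assumes the common Q15 convention (values in [-1, 1) with a step of 2^-15); the helper names to_q15 and from_q15 are hypothetical and not part of any Tensilica library.

```c
#include <stdint.h>

/* Convert a float to Q15 (16-bit fixed point, 15 fractional bits).
 * Values outside [-1, 1) saturate; values smaller in magnitude than
 * half a step (about 1.5e-5) round to zero. */
int16_t to_q15(float x)
{
    float scaled = x * 32768.0f;

    if (scaled >= 32767.0f)
        return INT16_MAX;            /* saturate on positive overflow */
    if (scaled <= -32768.0f)
        return INT16_MIN;            /* saturate on negative overflow */

    /* Round to nearest, ties away from zero. */
    return (int16_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}

/* Convert a Q15 value back to float. */
float from_q15(int16_t q)
{
    return (float)q / 32768.0f;
}
```

With these helpers, 0.5f survives the round trip, 1.0e-6f collapses to 0, and 2.0f saturates at the largest Q15 value. In fixed-point code the programmer has to scale intermediate results by hand to avoid both effects; in single precision the same values are represented directly.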

Remarkably Small

The Xtensa processor's FPU is remarkably small for a full-featured implementation that offers single-cycle throughput on most operations: the total gate count is only about 25K gates, including the floating-point registers.

The floating-point unit is a configuration option for Xtensa processors. With it, the processor has separate integer and floating-point execution units, which provides high sustained throughput for floating-point-intensive code. Because Xtensa processors are configurable and extensible, the designer can add further optimizations. If the floating-point option is not selected, the compiler emulates floating-point operations in software.

Major Features

  • 16 dedicated floating-point registers
  • Full set of loads and stores, with offset and indexed addressing and address-update modes
  • 34 additional instructions implemented directly in hardware
  • Fully pipelined arithmetic operations in hardware:
    • add, sub, mul, madd, msub with 4-cycle latency
    • loads and converts with 2-cycle latency
    • moves, compares with 1-cycle latency
  • Full compiler support for C/C++ floating point
  • Peak performance 2.0 Mflops/MHz
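One plausible reading of the 2.0 Mflops/MHz figure is one multiply-add issued per cycle, counted as two floating-point operations. Because the arithmetic pipeline has a 4-cycle latency, code approaches that rate only when consecutive operations do not depend on each other. The sketch below shows a dot product written with four independent accumulators for that reason. The function name dot4 is hypothetical, and whether a given compiler maps each line to a single multiply-add instruction is an assumption.

```c
#include <stddef.h>

/* Dot product with four independent accumulators.
 * Each accumulator forms its own dependency chain, so a new
 * multiply-add can start every cycle even though each result
 * takes several cycles to become available. */
float dot4(const float *a, const float *b, size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    size_t i = 0;

    /* Main loop: four independent multiply-adds per iteration. */
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }

    /* Tail: handle the remaining 0 to 3 elements. */
    for (; i < n; ++i)
        acc0 += a[i] * b[i];

    return (acc0 + acc1) + (acc2 + acc3);
}
```

A single-accumulator loop computes the same sum but serializes on the accumulator, so each multiply-add has to wait for the previous one to finish. Note that splitting the sum changes the order of the additions, so results can differ in the last bits from a strictly sequential loop.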

Double-Precision Floating-Point Acceleration

The configuration option for double-precision floating-point acceleration adds several instructions that significantly accelerate double-precision operations. Customers who need low-energy, moderate-performance double-precision floating-point operations should consider using this package. Added to a standard Xtensa processor, the package costs an estimated 4K gates when synthesized for low area and less than 7K gates when synthesized for high speed.

In addition to speeding up double-precision functionality, the instruction extensions that accelerate the floating-point divide operation can also be used to speed up integer divide and modulus operations on configurations without the divide option.
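For context, a processor without a hardware divider has to compute integer quotients and remainders in a software routine. The sketch below is a generic shift-and-subtract (restoring) division, shown only to illustrate the kind of bit-by-bit loop that divide-assist instructions can shorten. It is not Tensilica's library routine, and the name udiv32 is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Unsigned 32-bit division by shift-and-subtract.
 * Returns num / den and, if rem is not NULL, stores num % den in *rem.
 * den must be non-zero. */
uint32_t udiv32(uint32_t num, uint32_t den, uint32_t *rem)
{
    uint32_t q = 0;
    uint32_t r = 0;

    for (int bit = 31; bit >= 0; --bit) {
        /* Remember the bit that the shift below pushes out of r. */
        uint32_t carry = r >> 31;

        /* Bring down the next numerator bit into the remainder. */
        r = (r << 1) | ((num >> bit) & 1u);

        /* If the (33-bit) remainder is at least den, subtract it
         * and set the corresponding quotient bit. */
        if (carry || r >= den) {
            r -= den;
            q |= (uint32_t)1 << bit;
        }
    }

    if (rem != NULL)
        *rem = r;
    return q;
}
```

The loop runs one iteration per quotient bit, which is why software division is slow compared with most other integer operations and why instructions that shorten it are useful.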

 
