The Fastest Processor Core Ever
Tensilica’s Xtensa LX2 processor takes application performance to new heights. It is the only processor core for system-on-chip (SOC) designs that provides the I/O bandwidth, compute parallelism, and low-power optimization equivalent to hand-optimized, RTL-designed non-programmable hardware blocks. With Tensilica’s unique XPRES Compiler and automated processor generator, every Tensilica customer can quickly generate a tailored version of the Xtensa LX2 optimized for their particular application. Ideal for handling traditional SOC embedded processor control tasks as well as compute-intensive datapath hardware tasks, the Xtensa LX2 processor is the basic building block for complex SOC design.
The Xtensa LX2 32-bit architecture features a compact instruction set optimized for embedded designs. The base architecture has a 32-bit ALU, up to 64 general-purpose physical registers, six special-purpose registers, and 80 base instructions with improved 16- and 24-bit (rather than 32-bit) RISC instruction encoding.
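To get a feel for what 16- and 24-bit encodings can mean for code size, the sketch below compares a mixed 16/24-bit program against a fixed 32-bit encoding. The instruction count, the 40% share of 16-bit encodings, and the premise that both encodings need the same number of instructions are illustrative assumptions, not Tensilica figures.

```python
# Illustrative estimate only: the instruction count and the share of
# 16-bit encodings are assumptions, not Tensilica figures.

def mixed_encoding_bytes(num_instructions: int, narrow_percent: int) -> int:
    """Code size when narrow_percent of the instructions use 16-bit
    encodings and the remainder use 24-bit encodings."""
    if not 0 <= narrow_percent <= 100:
        raise ValueError("narrow_percent must be between 0 and 100")
    narrow = num_instructions * narrow_percent // 100
    standard = num_instructions - narrow
    return narrow * 2 + standard * 3  # 2 bytes and 3 bytes per instruction


def fixed32_bytes(num_instructions: int) -> int:
    """Code size when every instruction uses a 32-bit encoding."""
    return num_instructions * 4


if __name__ == "__main__":
    n = 10_000  # hypothetical program size in instructions
    mixed = mixed_encoding_bytes(n, 40)  # assumed 40% narrow encodings
    fixed = fixed32_bytes(n)
    saving = 100 * (fixed - mixed) / fixed
    print(f"mixed 16/24-bit: {mixed} bytes")
    print(f"fixed 32-bit:    {fixed} bytes")
    print(f"saving:          {saving:.0f}%")
```

Real savings depend on the compiler and the instruction mix, so treat the result as a rough upper-level estimate rather than a measured value.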
Benchmarks Prove It - Xtensa LX2 is the Fastest
Xtensa LX2 tops the BDTI Benchmarks™ and EEMBC benchmarks. Find out more.
Create an Optimized Processor in Hours
Profile the application software, configure the Xtensa LX2 processor, and add new instructions to optimize performance - all in a matter of hours. Start with the XPRES Compiler to automatically generate the optimizations. Use Xtensa Xplorer, a comprehensive environment that will help you develop and analyze different configurations and extensions. Then the Xtensa Processor Generator will create a tailored processor, quickly and reliably, in about an hour.
See how the XPRES Compiler really works in our Demos on Demand presentation.
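When profiling shows that a small part of the code dominates run time, Amdahl's law gives a quick estimate of how much a new instruction could help overall. The sketch below is a generic back-of-the-envelope calculation, not a description of how the XPRES Compiler evaluates candidates; the 80% hot-spot share and the 10x local speedup are hypothetical inputs.

```python
def overall_speedup(hot_fraction: float, hot_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when a fraction of the
    run time (hot_fraction) is accelerated by a factor of hot_speedup."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    if hot_speedup <= 0:
        raise ValueError("hot_speedup must be positive")
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)


if __name__ == "__main__":
    # Hypothetical profile: one loop takes 80% of the cycles and a
    # designer-defined instruction makes that loop 10x faster.
    print(f"overall speedup: {overall_speedup(0.8, 10.0):.2f}x")
```

The estimate also shows why profiling comes first: accelerating code that accounts for only a small share of the cycles barely changes the total.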
Highlights of Xtensa LX2
- Lower power - Tensilica automated the insertion of fine-grain clock gating for every functional element of the Xtensa LX2 processor, including designer-defined functions. Plus, the minimum configuration dissipates a miserly 0.05 mW/MHz in a representative 130 nm process technology, less than half the power consumption of other popular embedded processors.
- I/O throughput at RTL speeds - Designers can choose one or two 128-bit-wide load/store units. If that isn’t enough, you can add direct ports and queues, which allow the Xtensa LX2 processor to communicate as fast and flexibly as RTL blocks.
- Outstanding compute performance - The Xtensa LX2 processor’s innovative FLIX (Flexible Length Instruction Xtensions) architecture allows designers to pack multiple operations more efficiently into wider words.
- Better interfaces to on-chip memories - Designers can select two additional clock cycles for memory access if required by the application. The local memories are configurable up to 4MB with the option for parity or ECC. Designs can be optimized with independent interface widths for all local memories and the system bus.
- The XPRES Compiler - Tensilica’s XPRES Compiler is a powerful synthesis tool that creates tailored processor descriptions for the Xtensa LX2 processor from native C/C++ code.
Unrivaled Performance
| Configuration | Area (optimized for area) | Clock Rate (130nm LV process, worst-case conditions, optimized for speed) | Power Dissipation (speed-optimized netlist, typical operating conditions) |
| RTOS-ready configuration, 5-stage pipeline | 28,000 | 350 MHz | 76 µW/MHz |
| Performance optimized 7-stage, no PIF | 28,000 | 400 MHz | 47 µW/MHz |
| Minimum configuration | 20,000 | 350 MHz | 38 µW/MHz |
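The per-MHz power figures become easier to compare once multiplied by a clock rate. The sketch below does that for the three 130 nm configurations, using the clock rates and µW/MHz values from the table. Note the assumption involved: the clock rates are quoted for worst-case conditions and the power figures for typical conditions, so the products are rough estimates rather than published numbers.

```python
# (clock rate in MHz, power in microwatts per MHz), taken from the table.
CONFIGS_130NM = {
    "RTOS-ready, 5-stage pipeline": (350, 76),
    "Performance optimized 7-stage, no PIF": (400, 47),
    "Minimum configuration": (350, 38),
}


def power_mw(clock_mhz: float, uw_per_mhz: float) -> float:
    """Power in milliwatts at the given clock rate."""
    return clock_mhz * uw_per_mhz / 1000.0  # convert microwatts to milliwatts


if __name__ == "__main__":
    for name, (clock_mhz, uw_per_mhz) in CONFIGS_130NM.items():
        print(f"{name}: {power_mw(clock_mhz, uw_per_mhz):.1f} mW at {clock_mhz} MHz")
```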
Xtensa LX2 90nm Specifications
| Max Frequency (worst case conditions), MHz | 590 | 655 | 440 | 725 | 500 |
| Core Area, mm² | 0.206 | 0.224 | 2.256 | 0.177 | 0.118 |
| Gate Count | 37K | 45K | 276K | 32K | 23K |
| Dynamic Power with medium activity (typical operating conditions), mW/MHz | 0.059 | 0.074 | 0.171 | 0.074 | 0.048 |
| Leakage Power, mW | 1.90 | 2.43 | 18.11 | 2.43 | 1.01 |
- Base configuration = Xtensa LX2 with min sized Cache; includes PIF and Loop options (RTOS ready configuration)
- Minimum configuration = 2K local I-RAM and D-RAM only, no PIF, no Loop
- All 90 nm performance figures based on TSMC 90nm "GT" technology, Artisan standard cell library, Virage memory. Data taken from actual post-layout databases.
- TSMC 90GT high-performance process specs: 1.08V WC, 1.2V Typical; Temp = 125°C WC, 25°C typical
- Area calculations based on 75% utilization for base and minimum configurations, 50% for the Vectra LX-equipped configuration
- Gate count = total post-synthesis cell area divided by actual area of 2X drive NAND gate.
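For the 90 nm figures, total power at a given clock can be approximated as dynamic power per MHz times frequency, plus leakage. The sketch below applies that to each column of the specification table. The column headings are not given in the text, so the columns are simply numbered in table order, and combining a worst-case maximum frequency with typical-condition power figures is an assumption made only for illustration.

```python
# One tuple per table column, in table order:
# (max frequency in MHz, dynamic power in mW/MHz, leakage power in mW).
COLUMNS_90NM = [
    (590, 0.059, 1.90),
    (655, 0.074, 2.43),
    (440, 0.171, 18.11),
    (725, 0.074, 2.43),
    (500, 0.048, 1.01),
]


def total_power_mw(freq_mhz: float, dynamic_mw_per_mhz: float,
                   leakage_mw: float) -> float:
    """Dynamic power scaled by clock frequency, plus static leakage."""
    return freq_mhz * dynamic_mw_per_mhz + leakage_mw


if __name__ == "__main__":
    for index, (freq, dynamic, leakage) in enumerate(COLUMNS_90NM, start=1):
        total = total_power_mw(freq, dynamic, leakage)
        share = 100.0 * leakage / total
        print(f"column {index}: {total:.2f} mW at {freq} MHz "
              f"(leakage {share:.1f}% of total)")
```

Running the same function at a lower clock than the maximum shows how the leakage share grows as frequency drops.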
See a quick comparison to the Xtensa 7 processor.
| Configurability | Configure your processor to fit your application. Get the options you want and not the ones you don't want | Choose from a menu of common, pre-optimized data path elements like multipliers and shifters |
| Extensibility | Add application-specific instructions to accelerate the hot spots in your application | Add multi-cycle execution units, registers, register files, and SIMD units to create the same data path as you would in RTL |
| Designer-defined I/O interfaces | Use TIE Ports (GPIOs) and Queues (FIFO interfaces) to avoid the bottlenecks of the system bus | Interface to other RTL blocks and processors using direct wires and FIFOs, as you would if you were using RTL |
| Lower power | Use application-specific extensions to create a higher-performance processor without increasing frequency and power | Fine-grained clock gating is generated automatically by the Xtensa Processor Generator. Power savings are higher than with EDA-generated clock gating of manually produced RTL because clock nets are automatically gated off cycle-by-cycle under program flow execution. No risk of introducing bugs while adding clock gating |
| Lower verification effort | Automatic pre-verified RTL generation, including control logic, bypass logic, and data path elements | You only have to verify the functional specification of custom instructions and execution units. Significantly lower verification effort than RTL |
| Flexibility | Extending the processor gives headroom to map more tasks as requirements and standards change, unlike fixed processors that rely on increasing frequency (MHz) to increase capability | Programmability of the processor means that multiple applications can be mapped to the same SOC, software can be updated as algorithms change, and bugs can be fixed post-silicon |
| Faster time to market | Spend less time optimizing software or, on the backend, trying to increase frequency. Instead, just accelerate the application using designer-defined instructions | Lower verification effort and easy scalability by adding more task-optimized processors |
| Smaller core area and memory area | Base processor configuration is less than 20K gates. Also, 24-bit ISA with 16-bit narrow encodings means higher code density than conventional RISC and DSP cores and, therefore, smaller memory area | Create optimized task engines with little or no area overhead for the processor |
Xtensa LX2 Solution Diagram
Tensilica offers a complete solution for the Xtensa LX2 processor including automatically generated RTL and EDA scripts, system modeling and design support, Xtensa tools v7, and the software to optimize the processor for your application.