
Get RTL Performance

Getting RTL Performance from a Processor

Tensilica’s Xtensa LX2 processor takes application performance to new heights. In benchmark after benchmark, the Xtensa LX2 processor proves it can reach performance levels that are orders of magnitude above all other processor cores, rivaling RTL performance. How is this possible? Because you can configure and extend the processor to your exact application requirements, the Xtensa LX2 processor can reach RTL speeds in two major ways:

  • The Xtensa LX2 processor’s innovative FLIX (Flexible Length Instruction Xtensions) architecture allows designers to pack multiple operations more efficiently into wider words.
  • Designers can add RTL-like functions right into the execution units of the processor using Tensilica’s automated processes.

FLIX Packs It In

The FLIX architecture makes it possible to build highly parallel processors with the performance characteristic of specialized very long or ultra long instruction word (VLIW or ULIW) processors, without the code size penalty typically found in such solutions.

FLIX is a configuration option that allows designer-defined instructions to consist of multiple, independent operations bundled into a 32-bit or 64-bit instruction word. These wide 32- or 64-bit FLIX instruction formats are intermixed seamlessly and modelessly with the base Xtensa ISA’s existing 16- and 24-bit instructions, so there is no mode switch penalty for using a FLIX instruction.
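The bundling idea described above can be sketched in a few lines of Python. This is a toy model only: the slot count, slot width, format tag, and the function names `pack_bundle`, `unpack_bundle`, and `code_size_bytes` are assumptions made for illustration and do not reflect the actual Xtensa or FLIX encoding. It shows several independent operations sharing one 64-bit word, and why mixing narrow and wide instructions in one stream takes less space than padding every instruction to the widest format.

```python
# Toy model: field widths and slot count are hypothetical,
# not the real Xtensa/FLIX instruction encoding.
FORMAT_BITS = 4      # assumed format tag in the low bits
SLOT_BITS = 20       # assumed width of each operation slot
SLOTS_PER_WORD = 3   # 4 + 3 * 20 = 64 bits in total


def pack_bundle(format_tag, ops):
    """Pack independent operation encodings into one 64-bit word."""
    if len(ops) != SLOTS_PER_WORD:
        raise ValueError("expected exactly %d operations" % SLOTS_PER_WORD)
    if not 0 <= format_tag < (1 << FORMAT_BITS):
        raise ValueError("format tag out of range")
    word = format_tag
    for i, op in enumerate(ops):
        if not 0 <= op < (1 << SLOT_BITS):
            raise ValueError("operation encoding out of range")
        word |= op << (FORMAT_BITS + i * SLOT_BITS)
    return word


def unpack_bundle(word):
    """Recover the format tag and the operation encodings from a word."""
    tag = word & ((1 << FORMAT_BITS) - 1)
    slot_mask = (1 << SLOT_BITS) - 1
    ops = [(word >> (FORMAT_BITS + i * SLOT_BITS)) & slot_mask
           for i in range(SLOTS_PER_WORD)]
    return tag, ops


def code_size_bytes(instruction_bits):
    """Total size of an instruction stream given each instruction's width."""
    return sum(instruction_bits) // 8


# A stream that mixes 16-, 24- and 64-bit instructions without a mode switch.
mixed_stream = [24, 24, 64, 16, 24, 64]
mixed_size = code_size_bytes(mixed_stream)
# The same six instructions if every one were padded to 64 bits.
fixed_size = code_size_bytes([64] * len(mixed_stream))

bundle = pack_bundle(0xE, [0x12345, 0x0ABCD, 0xFFFFF])
```

In this made-up example the mixed stream occupies 27 bytes against 48 bytes for the all-64-bit version, which is the kind of code size saving the intermixed formats are meant to deliver.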

Designers can define their own FLIX implementations or use the XPRES Compiler to determine the best FLIX combinations automatically.

Add RTL-like Functions Into the Processor’s Execution Units, Automatically

Tensilica lets designers add specialized functions directly into the processor’s execution units without requiring them to understand the processor architecture. Designers feed their C/C++ algorithms into Tensilica’s XPRES Compiler, and the compiler determines the best configuration options and extensions for the design. Alternatively, designers can decide for themselves which accelerators to add to the processor.

Tensilica’s XPRES Compiler can analyze a C/C++ algorithm quickly (usually in under one hour) and recommend several ways to extend the Xtensa processor to reach the performance that algorithm needs, without any RTL coding. The XPRES Compiler uses a number of techniques (explained here) that allow it to achieve the 8X improvement shown in the EEMBC benchmark below.

Benchmarks Prove It – Xtensa is Fast

The Xtensa LX configurable processor core received the highest certified out-of-the-box score ever recorded for any 32-bit or 64-bit processor core tested against the Consumer benchmark suite of the Embedded Microprocessor Benchmark Consortium (EEMBC). The Xtensa LX processor’s score of 0.51997 per MHz, which corresponds to 171.6 Consumermarks in a 330-MHz simulation, was nearly nine times faster than the next best 32-bit core and over five times as fast as the fastest 64-bit RISC CPU tested by EEMBC.
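The relationship between the per-MHz score and the quoted Consumermark figure is a simple multiplication, which can be checked as follows. The variable names are illustrative; the numbers are the ones quoted above.

```python
# Figures quoted in the text above.
score_per_mhz = 0.51997   # EEMBC Consumer score per MHz
clock_mhz = 330           # simulated clock frequency in MHz

# Absolute score at the simulated clock frequency.
consumermarks = score_per_mhz * clock_mhz

print(round(consumermarks, 1))  # prints 171.6
```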


Xtensa LX outperforms every other licensable CPU core ever tested by EEMBC on the Consumer “Out of the Box” benchmark
Source: www.eembc.org

The “out of the box” scores are a good test of compiler performance. The more C-friendly the processor, the better the score, as the processor vendor is not allowed to modify the original EEMBC source code. The exceptional results for the Xtensa LX processor demonstrate Tensilica’s advanced XPRES Compiler technology.

On a separate benchmark, the Xtensa LX configurable processor core achieved the highest score recorded to date (as of May 2004) for a licensable processor core on the BDTI Benchmarks™ by Berkeley Design Technology, Inc. (BDTI). The Xtensa LX BDTIsimMark2000™ score of 6150 at 370 MHz is 70% higher than the score for the next-fastest licensable core benchmarked by BDTI, the CEVA-X1620.


Xtensa LX configuration as tested by BDTI: 248,600 “gates” (equivalent NAND2X cell area) post-synthesis; 4.4 mm² actual layout area; 3D-extracted final layout timing under worst-case conditions: 369 MHz

For more detail on this benchmark, see our explanation of how we created a configuration of Xtensa LX optimized for this DSP application.

QUOTABLE

“We selected Tensilica’s Xtensa processor for its ability to help us achieve our goal of developing innovative multi-gigabit, lower-power mmWave communications products. By optimizing the Xtensa processor into a tailored processor core, this enables our products to attain the performance these wireless applications demand.”

Kumar Mahesh, Manager of MAC and Software Design for SiBEAM, Inc.