Digital Signal Processors (DSPs)

Xtensa - the Fastest DSP Core Ever

Tensilica’s customers today are using the Xtensa processor core for a variety of DSP tasks including audio processing, image processing, video processing, and communications channel processing. Additionally, Tensilica offers specialized pre-configured, optimized DSPs for audio and video processing.

Why are Xtensa processors so popular for DSP applications? Because the datapath can be exactly molded to the data characteristics of the application. By extending the processor using Tensilica Instruction Extensions (TIE), designers can optimize the design to fit the data.

See “Building a Multi-Issue Vector DSP with Configurable-Processor Technology” from GSPx 2004.

See BDTI's independent analysis of the Xtensa LX processor with Vectra LX.

See Microprocessor Report's article “Applications Define DSP Speed.”

Benchmark Leading Performance

The Xtensa LX2 processor excels at traditional CPU and DSP tasks in embedded SOCs, as demonstrated by industry-leading results on the BDTI Benchmarks™ by Berkeley Design Technology, Inc. (BDTI). The Xtensa LX configurable processor core achieved the highest score recorded to date for a licensable processor core: its BDTIsimMark2000™ score of 3820 at 230 MHz is 62% faster than the score for the next-fastest licensable core benchmarked by BDTI, the CEVA-X1620. See www.bdti.com for all BDTI benchmark scores.

The Xtensa processor can be used in such a wide variety of applications because Tensilica offers four separate means of accelerating DSP computations. Certain functions marked with * are only available on Xtensa LX2.

DSP Performance Level

Function or Application Specific Designer-Defined Instructions

  Moderate Performance: Simple TIE
    • Single-Cycle Instructions
    • Single Operation per Instruction
    • Compiler Support Through Automatic Intrinsics

  Very High Performance: Advanced TIE
    • Multiple-Cycle Instructions
    • Multiple Operations per Instruction
    • SIMD Operations
    • Overlapping of Computation and Load/Store Operations
    • Compiler Support Through Automatic Intrinsics
    • Flexible Length Instruction Xtensions*

General-Purpose Click-Button Configuration Options

  Moderate Performance: MAC16 Function Unit Option
    • Single MAC DSP configuration option
    • Fully supported by the compiler

  Very High Performance: Vectra LX DSP Engine*
    • Dual or Quad MAC DSP
    • SIMD Instruction Set
    • Vectorizing Compiler Support
    • 64-bit instruction words

*Available only on Xtensa LX2.

MAC16 Configuration Option

For moderate-intensity signal processing applications, a 16-bit multiply-accumulate engine can be added to the base Xtensa LX2 processor core with just a click of a configuration button in the Xtensa processor generator. The MAC16 option adds a full suite of multiply-accumulate instructions, including auto-incrementing loads and combined multiply-accumulate-load instructions for high-performance computation. These DSP instructions are also 100% compiler supported.
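As a rough illustration of the kind of inner loop that multiply-accumulate instructions with auto-incrementing loads are meant to speed up, here is a minimal sketch of a 16-bit dot product in portable C. The function name `dot16` and the 64-bit accumulator are assumptions made for this example; nothing here is an Xtensa intrinsic, and whether a given compiler maps the loop onto MAC16 instructions depends on the processor configuration and toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product of two 16-bit vectors.
 * Each iteration does two loads, one multiply and one accumulate,
 * which is the pattern a combined multiply-accumulate-load
 * instruction is designed to cover. */
int64_t dot16(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0; /* wide accumulator chosen here to avoid overflow */
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

Calling `dot16` on the vectors {1, 2, 3} and {4, 5, 6} returns 32.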

Tensilica Instruction Extensions (TIE)

For applications with one or more signal processing tasks that require some acceleration beyond the base RISC processor features of Xtensa LX2, the designer can quickly add instructions and hardware execution units tailored to a specific algorithm.

For example, the “butterfly” operation used in Convolutional Coding / Viterbi Decoding applications is a series of combined Add-Compare-Select (ACS) operations. If the data in question consists of 8-bit values packed in the standard 32-bit registers of Xtensa LX2, a designer can easily add an ACS instruction to the Xtensa LX2 processor, with a small incremental block of execution unit hardware, to greatly speed up Viterbi decoding for communications applications.
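To make the ACS idea concrete, the following is a software model in plain C of what a four-lane add-compare-select over 8-bit values packed in 32-bit words could compute. It is a sketch under stated assumptions, not TIE code and not Tensilica's instruction definition: the names `acs_result`, `sat_add_u8` and `acs4x8`, the use of saturating unsigned addition, and the choice of "smaller metric wins" are all assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical result type: four surviving 8-bit path metrics plus
 * one decision bit per lane. */
typedef struct {
    uint32_t metrics;   /* packed survivors, lane 0 in bits 7..0 */
    uint8_t  decisions; /* bit i set when the second candidate won lane i */
} acs_result;

/* Saturating unsigned 8-bit add (an assumption; real decoders may
 * instead normalize or use modular metrics). */
static uint8_t sat_add_u8(uint8_t x, uint8_t y)
{
    unsigned s = (unsigned)x + y;
    return (uint8_t)(s > 255u ? 255u : s);
}

/* Add: path metric + branch metric for two candidates per lane.
 * Compare: pick the smaller candidate.
 * Select: keep it and record which candidate it was. */
acs_result acs4x8(uint32_t pm0, uint32_t bm0, uint32_t pm1, uint32_t bm1)
{
    acs_result r = {0, 0};
    for (int lane = 0; lane < 4; lane++) {
        int shift = 8 * lane;
        uint8_t c0 = sat_add_u8((uint8_t)(pm0 >> shift), (uint8_t)(bm0 >> shift));
        uint8_t c1 = sat_add_u8((uint8_t)(pm1 >> shift), (uint8_t)(bm1 >> shift));
        uint8_t best = c0;
        if (c1 < c0) {
            best = c1;
            r.decisions |= (uint8_t)(1u << lane);
        }
        r.metrics |= (uint32_t)best << shift;
    }
    return r;
}
```

In software this takes a loop of shifts, adds, compares and merges per 32-bit word. A designer-defined instruction would aim to perform the same four-lane operation in dedicated hardware, which is where the speedup for Viterbi decoding would come from.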

Advanced Tensilica Instruction Extensions (TIE)

For applications with well-defined, very high performance signal processing demands, the TIE language provides a fast means of developing extremely powerful DSP extensions. Add custom registers and register files for unique data types. Create complex multiple-operation instructions and have them automatically pipelined into multi-cycle instructions by adding a single-line directive to the TIE description. Create SIMD (single instruction, multiple data) instructions to tackle algorithms with native data parallelism. Use software-pipelining techniques to create combined compute-and-load and compute-and-store instructions for high data-rate applications, enabling continuous computation without the performance overhead of separate processor load and store cycles.
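The software-pipelining idea behind combined compute-and-load instructions can be sketched in plain C. In the loop below, the load for the next element is paired with the computation on the current one, so each iteration contains one load and one multiply-accumulate that a single combined instruction could in principle issue together. The function name `sum_squares_pipelined` is hypothetical, and this is ordinary C rather than TIE or Xtensa-specific code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sum of squares with the load for element i
 * overlapped with the computation on element i-1. */
int64_t sum_squares_pipelined(const int16_t *x, size_t n)
{
    if (n == 0) {
        return 0;
    }
    int64_t acc = 0;
    int16_t cur = x[0];                /* prologue: first load only */
    for (size_t i = 1; i < n; i++) {
        int16_t next = x[i];           /* load for the next step */
        acc += (int32_t)cur * cur;     /* compute on the current value */
        cur = next;
    }
    acc += (int32_t)cur * cur;         /* epilogue: last compute only */
    return acc;
}
```

For the input {1, 2, 3} the function returns 14, the same as a straightforward loop; only the ordering of loads and computes changes.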

FLIX (Flexible Length Instruction Xtensions)

Tensilica improved compute performance in the Xtensa LX2 processor through its innovative FLIX (Flexible Length Instruction Xtensions) architecture. The FLIX architecture is a highly efficient implementation of the Xtensa instruction set architecture (ISA) that gives designers more options for cost/performance tradeoffs.

The FLIX technology provides the flexibility to freely and modelessly intermix instructions of various lengths (16-, 24-, or 32-/64-bit). By packing multiple operations into a wide 32- or 64-bit instruction word, FLIX technology allows designers to accelerate a broader class of “hot spots” in embedded applications. FLIX eliminates the performance and code-size drawbacks that can occur when using a one-size-fits-all instruction length. 

Compared to rigid, high-performance processor designs that either encode only one RISC operation per instruction or use ultra-wide 64b/128b/256b VLIW (very long instruction word) formats, FLIX delivers high-performance concurrent execution exactly and only when needed, yet preserves the industry-leading code density advantages of the Xtensa processor’s native 16b/24b base architecture instruction formats.

Vectra LX DSP Engine

See our Embedded Processor Forum 2004 presentation on Vectra LX titled, “A Second-Generation High-performance DSP Engine.”

The Vectra LX DSP engine can be added to the base Xtensa LX2 processor core with just a click of a configuration button in the Xtensa LX2 processor generator. The Vectra LX engine takes advantage of the FLIX architecture and uses 64-bit instruction words containing three issue slots for ALU, multiply-accumulate, and load/store operations. Design teams interested in modifying the Vectra LX DSP engine for specific configurations should contact Tensilica. The Vectra LX engine is fully supported by the entire Tensilica software environment, including advanced auto-vectorization capabilities in the Xtensa C/C++ Compiler (XCC). XCC enables Vectra LX engine users to reap the benefits of vector processing on a SIMD engine without manual assembly-level coding.

  Performance class    Configuration                                              Cycles
  Simple RISC Engine   Minimal configuration Xtensa LX2 using software multiply   155,389
  Scalar Performance   Base Xtensa LX2 processor with MUL32 option                 23,633
  FLIX Performance     Xtensa LX2 with Vectra LX option                               994

Vectra LX DSP engine really accelerates FFT performance
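The cycle counts above can be turned into speedup factors by simple division. The helper below is a small sketch for that arithmetic; the function name `speedup` is an assumption for this example, and the inputs are just the three figures listed above.

```c
/* Hypothetical helper: how many times faster the accelerated
 * configuration is, given two cycle counts for the same workload. */
double speedup(unsigned long baseline_cycles, unsigned long accelerated_cycles)
{
    return (double)baseline_cycles / (double)accelerated_cycles;
}
```

Using the listed figures, `speedup(155389, 23633)` is about 6.6 for the MUL32 configuration, `speedup(155389, 994)` is about 156 for the Vectra LX configuration, and `speedup(23633, 994)` is about 24 for Vectra LX relative to the MUL32 configuration.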

* The BDTIsimMark2000™ provides a summary measure of DSP speed. For more information and scores see www.BDTI.com. Scores © 2004 BDTI.

QUOTABLE

“Tensilica’s introduction of the Xtensa LX and its revolutionary tool, the XPRES design compiler, made it the clear winner. Even without XPRES, Xtensa LX would be the leading contender for this award, but the combination is unbeatable.”

Tom R. Halfhill,
Senior Analyst, Microprocessor Report
