Create TIE

Full Extensibility with TIE

TIE gives designers wide flexibility: you can add multi-cycle, pipelined execution units, register files, state registers, and SIMD arithmetic and logic units; create wide (up to 128-bit) load-store instructions; and add designer-defined I/O Ports (GPIO), Queues (FIFO interfaces), and Lookup Ports.

[Figure: Full Extensibility with TIE, From a RISC to a VLIW, Vector Machine]

You can write your own TIE instructions to customize your Xtensa processor, or use one of Tensilica's automated tools to help you create TIE. One of those tools is the XPRES Compiler, which takes standard C/C++ code as input and automatically identifies TIE optimizations that will make that code run much faster. Tensilica also offers a Flexible Length Instruction eXtension (FLIX) generator for VLIW acceleration and a Manual Fusion Editor that helps designers create chains, or fusions, of fundamental computation operations to improve performance.

Using TIE instructions with a Tensilica processor core never compromises the underlying base Xtensa instruction set, which ensures the availability of a robust ecosystem of third-party application software and development tools. All configurable, extensible Xtensa processors remain compatible with major operating systems, debug probes, and ICE solutions. They also always come with an automatically generated, complete software development toolchain: an advanced integrated development environment based on the Eclipse framework, a world-class compiler, a cycle-accurate SystemC-compatible instruction set simulator, and the full industry-standard GNU toolchain.

Creating FLIX (VLIW) Acceleration

An Xtensa LX2 processor can become a multi-issue VLIW processor. Coupled with the Xtensa C/C++ compiler’s ability to aggressively extract instruction-level parallelism from C/C++ code and to bundle and schedule multiple operations in a VLIW instruction, this can lead to an order-of-magnitude improvement in performance.

The beauty of Tensilica's implementation is that VLIW is used only where it is needed, eliminating the code bloat found in standard VLIW processors. Designers can work out the best VLIW instructions manually or use Tensilica's automated FLIX generator, which profiles a designer's target C code and suggests VLIW instruction specifications that can significantly accelerate the most critical code. By allowing two or three instructions to execute simultaneously, FLIX lets an Xtensa LX2 processor act as a 2- or 3-issue VLIW CPU, accelerating general-purpose code by 40-60 percent.

After the processor core has been created with these new VLIW instructions, software developers programming the Xtensa LX2 core need only use the standard Xtensa C/C++ Compiler (XCC), which automatically extracts the instruction-level parallelism from C/C++ code and bundles operations into VLIW instructions whenever possible. The programmer therefore does not have to modify the application code to benefit from the VLIW instruction extensions.
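As a rough illustration of the kind of unmodified application code this refers to, the loop below is ordinary C with no intrinsics. Each iteration contains operations that do not depend on one another (a copy and a multiply-accumulate), which is the sort of instruction-level parallelism a VLIW-aware compiler could, in principle, schedule side by side. The function name `dot_and_copy` and the use of `restrict` are assumptions made for this sketch; the source does not show how XCC would actually bundle such a loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: copies a[] to dst[] while accumulating the
   dot product of a[] and b[]. The copy and the multiply-accumulate
   are independent of each other within an iteration. */
int32_t dot_and_copy(const int16_t *restrict a,
                     const int16_t *restrict b,
                     int16_t *restrict dst,
                     size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        dst[i] = a[i];                          /* load + store */
        acc += (int32_t)a[i] * (int32_t)b[i];   /* load + multiply-accumulate */
    }
    return acc;
}
```

The point of the sketch is only that nothing in the source names a VLIW instruction; any bundling is left to the compiler.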

Creating Multi-cycle Pipelined Execution Units with Fusions

It is possible to create multi-cycle execution units using TIE that are pipelined up to 31 stages. The designer only has to specify the functionality of the units in the high-level TIE language and Tensilica tools automatically generate the decode, pipeline, control, and bypass logic and update the software tool chain (including compiler, debugger, ISS) to recognize the new instructions and registers associated with the execution unit.

The TIE specification can be done in one of two ways: manually or using Tensilica's Manual Fusion Editor, a graphical tool that helps the designer quickly identify the fusions in an application, as shown below.

As an example, the figure below shows the TIE and corresponding TIE execution unit that performs a 16x16 multiply and saturates the result down to 16 bits:

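// Signed 16x16 multiply with the result saturated to 16 bits
operation MUL_SAT_16 {out AR z, in AR a, in AR b} {}
{
    wire [31:0] m = TIEmul(a[15:0], b[15:0], 1);

    // If m[31:23] are all copies of the sign bit, m[23:8] fits in 16 bits;
    // otherwise saturate to 16'h8000 (negative) or 16'h7fff (positive).
    assign z = {16'b0,
                m[31] ? ((m[31:23] == 9'h1ff) ? m[23:8] : 16'h8000)
                      : ((m[31:23] == 9'b0)   ? m[23:8] : 16'h7fff)};
}
schedule ms {MUL_SAT_16} {def z 2;}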

[Figure: Creating Pipelined Instructions]
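When checking an instruction like this, a bit-accurate software model can be handy. The C sketch below mirrors the intended multiply-saturate behaviour under one assumption: the result is bits 23:8 of the signed product whenever bits 31:23 are all copies of the sign bit, and is otherwise clamped to 0x8000 or 0x7FFF. The names `sext16` and `mul_sat_16_model` are hypothetical and not part of any Tensilica tool.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of x to a 32-bit signed value. */
static int32_t sext16(uint32_t x)
{
    int32_t v = (int32_t)(x & 0xFFFFu);
    return (v >= 0x8000) ? v - 0x10000 : v;
}

/* Hypothetical reference model of the multiply-saturate operation.
   Only the low 16 bits of a and b are used; the upper 16 bits of the
   result are zero, matching the {16'b0, ...} concatenation. */
uint32_t mul_sat_16_model(uint32_t a, uint32_t b)
{
    int32_t  prod = sext16(a) * sext16(b);     /* fits in 32 bits */
    uint32_t m    = (uint32_t)prod;
    uint32_t top  = (m >> 23) & 0x1FFu;        /* m[31:23] */
    uint32_t mid  = (m >> 8) & 0xFFFFu;        /* m[23:8]  */

    if (m & 0x80000000u)                       /* negative product */
        return (top == 0x1FFu) ? mid : 0x8000u;
    return (top == 0u) ? mid : 0x7FFFu;        /* non-negative product */
}
```

For example, `mul_sat_16_model(0x0100, 0x0100)` yields `0x0100`, while `mul_sat_16_model(0x7FFF, 0x7FFF)` saturates to `0x7FFF`.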

Creating SIMD TIE Execution Units

Creating SIMD execution units with TIE is just as simple. Here is the same MUL-SAT from the previous example, redesigned as a SIMD unit that performs two multiply-saturates at once, one on each 16-bit half of the 32-bit operands.

// Two-way SIMD version: one multiply-saturate per 16-bit half
operation MUL_SAT_16 {out AR z, in AR a, in AR b} {}
{
    wire [31:0] m1 = TIEmul(a[31:16], b[31:16], 1);
    wire [31:0] m0 = TIEmul(a[15:0],  b[15:0],  1);

    // Each half saturates independently, as in the scalar version
    assign z = {m1[31] ? ((m1[31:23] == 9'h1ff) ? m1[23:8] : 16'h8000)
                       : ((m1[31:23] == 9'b0)   ? m1[23:8] : 16'h7fff),
                m0[31] ? ((m0[31:23] == 9'h1ff) ? m0[23:8] : 16'h8000)
                       : ((m0[31:23] == 9'b0)   ? m0[23:8] : 16'h7fff)};
}
schedule ms {MUL_SAT_16} {def z 2;}

[Figure: SIMD: Exploiting Data Parallelism]
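The two-way behaviour can be modelled in C as well. This sketch uses an arithmetic formulation rather than bit slicing: taking bits 23:8 of an in-range product is the same as dividing by 256 with rounding toward negative infinity, and the saturation is a clamp to the signed 16-bit range. The helper names (`sext16`, `floor_div_256`, `mul_sat_lane`, `mul_sat_2x16_model`) are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of x. */
static int32_t sext16(uint32_t x)
{
    int32_t v = (int32_t)(x & 0xFFFFu);
    return (v >= 0x8000) ? v - 0x10000 : v;
}

/* Divide by 256, rounding toward negative infinity. */
static int32_t floor_div_256(int32_t v)
{
    int32_t q = v / 256;           /* C division truncates toward zero */
    if (v % 256 < 0)
        q -= 1;
    return q;
}

/* One 16-bit lane: signed multiply, scale by 1/256, clamp to int16. */
static uint16_t mul_sat_lane(uint16_t a, uint16_t b)
{
    int32_t q = floor_div_256(sext16(a) * sext16(b));
    if (q > 32767)  q = 32767;
    if (q < -32768) q = -32768;
    return (uint16_t)q;            /* keeps the low 16 bits */
}

/* Hypothetical model of the two-way SIMD multiply-saturate. */
uint32_t mul_sat_2x16_model(uint32_t a, uint32_t b)
{
    uint32_t hi = mul_sat_lane((uint16_t)(a >> 16), (uint16_t)(b >> 16));
    uint32_t lo = mul_sat_lane((uint16_t)a, (uint16_t)b);
    return (hi << 16) | lo;
}
```

Because the lanes never interact, the upper half can saturate while the lower half does not, and vice versa.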

Creating Vector Register Files to Couple with SIMD Units

Designers can specify a new register file with a single-line TIE statement such as the following:

regfile VecReg 64 16 vr

This TIE statement instantiates a register file called “VecReg” that consists of 16 registers, each 64 bits wide. The assembler refers to the registers in this register file as “vr0, vr1, vr2, …, vr15”.
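For software modelling, a register file of this shape maps naturally onto an array of 16 64-bit words. The C sketch below is a hypothetical model only: the type `VecRegModel` and the helpers `vr_get_lane` and `vr_set_lane` are invented for illustration, and treating each register as four 16-bit lanes is an assumption about how the registers will be used, not something the `regfile` statement itself specifies.

```c
#include <stdint.h>

enum { VECREG_COUNT = 16, LANES_PER_REG = 4 };

/* Hypothetical software model: 16 registers, each 64 bits wide. */
typedef struct {
    uint64_t vr[VECREG_COUNT];
} VecRegModel;

/* Read one 16-bit lane (lane 0 is bits 15:0). Out-of-range indices
   wrap around in this sketch instead of being reported as errors. */
uint16_t vr_get_lane(const VecRegModel *rf, unsigned reg, unsigned lane)
{
    unsigned shift = 16u * (lane % LANES_PER_REG);
    return (uint16_t)(rf->vr[reg % VECREG_COUNT] >> shift);
}

/* Write one 16-bit lane, leaving the other three lanes unchanged. */
void vr_set_lane(VecRegModel *rf, unsigned reg, unsigned lane, uint16_t v)
{
    unsigned  shift = 16u * (lane % LANES_PER_REG);
    uint64_t  mask  = (uint64_t)0xFFFFu << shift;
    uint64_t *r     = &rf->vr[reg % VECREG_COUNT];
    *r = (*r & ~mask) | ((uint64_t)v << shift);
}
```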

Designer-defined vector register files are particularly useful when coupled with SIMD execution units, as the example below shows. It defines a 4-way SIMD multiply-saturate execution unit (and corresponding instruction) that uses the 64-bit vector register file for its source and destination operands.

regfile VR 64 16 vr

// Four-way SIMD multiply-saturate on 64-bit vector registers
operation MUL_SAT_4x16 {out VR z, in VR a, in VR b} {}
{
    wire [31:0] m3 = TIEmul(a[63:48], b[63:48], 1);
    wire [31:0] m2 = TIEmul(a[47:32], b[47:32], 1);
    wire [31:0] m1 = TIEmul(a[31:16], b[31:16], 1);
    wire [31:0] m0 = TIEmul(a[15:0],  b[15:0],  1);

    // Each 16-bit lane saturates independently
    assign z = {m3[31] ? ((m3[31:23] == 9'h1ff) ? m3[23:8] : 16'h8000)
                       : ((m3[31:23] == 9'b0)   ? m3[23:8] : 16'h7fff),
                m2[31] ? ((m2[31:23] == 9'h1ff) ? m2[23:8] : 16'h8000)
                       : ((m2[31:23] == 9'b0)   ? m2[23:8] : 16'h7fff),
                m1[31] ? ((m1[31:23] == 9'h1ff) ? m1[23:8] : 16'h8000)
                       : ((m1[31:23] == 9'b0)   ? m1[23:8] : 16'h7fff),
                m0[31] ? ((m0[31:23] == 9'h1ff) ? m0[23:8] : 16'h8000)
                       : ((m0[31:23] == 9'b0)   ? m0[23:8] : 16'h7fff)};
}
schedule ms {MUL_SAT_4x16} {def z 2;}

[Figure: Using a Vector Register File with SIMD Instructions]
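A C reference model for the 4-way version can simply loop over the lanes of a 64-bit word. As before, this is a sketch under the assumption that each lane computes the signed product, scales it by 1/256 with rounding toward negative infinity, and clamps to the signed 16-bit range. The names `mul_sat_lane` and `mul_sat_4x16_model` are hypothetical.

```c
#include <stdint.h>

/* One 16-bit lane: signed multiply, scale by 1/256, clamp to int16. */
static uint16_t mul_sat_lane(uint16_t a, uint16_t b)
{
    int32_t sa = (a >= 0x8000u) ? (int32_t)a - 0x10000 : (int32_t)a;
    int32_t sb = (b >= 0x8000u) ? (int32_t)b - 0x10000 : (int32_t)b;
    int32_t prod = sa * sb;

    int32_t q = prod / 256;        /* truncates toward zero ...     */
    if (prod % 256 < 0)
        q -= 1;                    /* ... so adjust to round down   */

    if (q > 32767)  q = 32767;
    if (q < -32768) q = -32768;
    return (uint16_t)q;
}

/* Hypothetical model of the 4-way SIMD multiply-saturate:
   lane 0 is bits 15:0, lane 3 is bits 63:48. */
uint64_t mul_sat_4x16_model(uint64_t a, uint64_t b)
{
    uint64_t z = 0;
    for (unsigned lane = 0; lane < 4; ++lane) {
        unsigned shift = 16u * lane;
        uint16_t r = mul_sat_lane((uint16_t)(a >> shift),
                                  (uint16_t)(b >> shift));
        z |= (uint64_t)r << shift;
    }
    return z;
}
```

A model like this can serve as the expected-value generator when comparing against instruction set simulator output, one 64-bit operand pair at a time.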
