Microprocessor Report's review of Xtensa LX3
Processor Ports and Queues: Easily Overcome I/O Bandwidth Obstacles in Your Next ASIC or SOC Design
How to Increase ASIC and SOC Computational Performance with Long-Word Processors
Minimize Energy Consumption While Maximizing ASIC and SOC Performance
Most embedded processors offer fixed hardware functionality with options for memory size, cache size, and bus interface. Performance is proportional to the clock speed. Beyond that, further gains require either optimizing the application code or moving to the next processor on the roadmap. Tensilica offers something different: the opportunity to optimize the processor itself using Tensilica's TIE (Tensilica Instruction Extension) language.
Tensilica's TIE language is used to describe new instructions, registers, and execution units that are then automatically added to the Xtensa LX processor. TIE is a Verilog-like language used to describe desired instruction mnemonics, operands, encoding, and execution semantics. The TIE files are inputs to the Xtensa Processor Generator, which automatically builds the processor and the complete software tool chain, incorporating all configuration options and new TIE instructions. The base instruction set remains intact for maximum compatibility with third-party development tools and operating systems.
The TIE language unlocks the true power of Xtensa LX. It allows designers to achieve orders-of-magnitude performance increases for their applications and to differentiate their products.
Just as the designer can choose from a set of predefined functional options to improve processor performance, the designer can now create instructions that can speed up standard or proprietary algorithms. Using the tools provided, application hot spots can be identified and additional logic created to process these hot spots more efficiently, without the need to increase the clock frequency or re-write the software.
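As an illustration of the kind of hot spot this paragraph has in mind, consider a sum-of-absolute-differences loop. The sketch below is plain C, not TIE: the helper `abs_diff_acc` is a hypothetical name that models in software what a single designer-defined instruction might compute, and the choice of this particular kernel is an assumption made for the example rather than something taken from Tensilica's material.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of a hypothetical fused operation: acc + |a - b|.
 * On a base RISC instruction set this typically costs a subtract,
 * a conditional negate, and an add. A designer-defined instruction
 * could, in principle, compute the same result in one operation. */
static inline uint32_t abs_diff_acc(uint32_t acc, uint8_t a, uint8_t b)
{
    uint32_t diff = (a > b) ? (uint32_t)(a - b) : (uint32_t)(b - a);
    return acc + diff;
}

/* The application hot spot: sum of absolute differences over n bytes. */
uint32_t sad(const uint8_t *x, const uint8_t *y, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc = abs_diff_acc(acc, x[i], y[i]);
    }
    return acc;
}
```

Keeping the candidate operation in its own small function makes it easy to compare a software reference against whatever the generated instruction eventually produces.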

Tensilica offers a proven method of adding designer-defined function units and interfaces to the Xtensa LX DPU
When processors have fixed hardware functionality and your competitors are using the same or similar processors, then differentiation is often limited to the algorithm implementation itself. Fixed processors are good at general-purpose computing, but not so good at any specific algorithm. Tensilica gives you the opportunity to differentiate at the hardware level and implement algorithms more efficiently by designing logic that will accelerate your particular algorithm. This means that your design will be almost impossible to copy, as only your hardware will reach the performance required on the same software implementation.
TIE gives designers wide flexibility: they can add multi-cycle, pipelined execution units; register files; state registers; SIMD arithmetic and logic units; wide (up to 512-bit) load-store instructions; and designer-defined I/O Ports (direct wires) and Queues (FIFO interfaces).
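To make the SIMD item concrete, the sketch below models in portable C what a four-lane, 8-bit SIMD add does to data packed in a 32-bit word. The function name `add4x8` and the lane layout are assumptions chosen for illustration; this is a software reference model, not TIE source and not a Tensilica API.

```c
#include <stdint.h>

/* Software model of a hypothetical 4 x 8-bit SIMD add.
 * Each byte of a and b is treated as an independent unsigned lane,
 * and each lane wraps modulo 256 without carrying into its neighbor. */
uint32_t add4x8(uint32_t a, uint32_t b)
{
    /* Add the low 7 bits of every lane; the sum of two 7-bit values
     * fits in 8 bits, so no carry crosses a lane boundary. */
    uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);

    /* Fold in the top bit of each lane with XOR, which discards
     * the carry out of the lane. */
    uint32_t msb = (a ^ b) & 0x80808080u;

    return low7 ^ msb;
}
```

In hardware the lanes would simply be separate adders; the masking here exists only because a scalar 32-bit add would otherwise let carries ripple between bytes.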

Full Extensibility with TIE: From a RISC to a VLIW or Vector Machine
Adding TIE instructions never compromises the underlying base Xtensa instruction set, thereby ensuring availability of a robust ecosystem of third-party application software and development tools. All configurable, extensible Xtensa processors are compatible with major operating systems, debug probes, and ICE solutions. They always come with an automatically generated, complete software development toolchain, including an advanced integrated development environment based on the Eclipse framework, a world-class compiler, a cycle-accurate SystemC-compatible instruction set simulator, and the full industry-standard GNU toolchain.
An Xtensa LX processor can become a multi-issue VLIW processor. This, coupled with the Xtensa C/C++ compiler's ability to aggressively extract instruction-level parallelism from C/C++ code and to bundle and schedule multiple operations in a VLIW instruction, can lead to an order of magnitude improvement in performance.

The beauty of Tensilica's implementation is that VLIW is only used where it's needed, eliminating the code bloat found in standard VLIW processors. Designers can manually figure out the best VLIW instructions or use Tensilica's automated FLIX generator, which profiles a designer's target C code and suggests VLIW instruction specifications that can significantly accelerate the most critical code. By allowing two or three instructions to execute simultaneously, FLIX lets an Xtensa LX processor act as a 2-, 3-, or 4-issue VLIW CPU, accelerating general-purpose code by 40-60 percent.
After the processor core has been created using these new VLIW instructions, software developers programming the Xtensa LX core need only use the standard Xtensa C/C++ Compiler (XCC), which automatically extracts the instruction-level parallelism from C/C++ code and bundles operations into VLIW instructions whenever possible. The programmer doesn't have to modify the application code to take advantage of the VLIW instruction extensions to speed up the code.
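The point about unmodified application code can be pictured with an ordinary C loop. The function below is a hypothetical example (the name `scale_offset` and the kernel are assumptions, not taken from Tensilica's documentation). Each iteration contains a load, a multiply, an add, a store, and an index update, none of which depends on another iteration's result, which is the kind of instruction-level parallelism a VLIW-aware compiler can look for. Whether and how a particular compiler bundles these operations depends on the instruction formats configured for the core.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain, portable C with no intrinsics or pragmas.
 * Iterations are independent of one another, so the operations in
 * one iteration can overlap with those of the next. */
void scale_offset(int32_t *dst, const int32_t *src, size_t n,
                  int32_t gain, int32_t offset)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] * gain + offset;
    }
}
```

The same source compiles for any C target; only the generated schedule would differ between a single-issue core and a multi-issue one.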
It is possible to create multi-cycle execution units in TIE that are pipelined up to 31 stages deep. The designer only has to specify the functionality of the units in the high-level TIE language, and Tensilica tools automatically generate the decode, pipeline, control, and bypass logic and update the software tool chain (including compiler, debugger, and ISS) to recognize the new instructions and registers associated with the execution unit.
A TIE specification can be written in one of two ways: manually, or using Tensilica's Manual Fusion Editor, a graphical tool that helps the designer quickly identify the fusions in an application, as shown below.
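Why deep pipelining matters can be shown with simple cycle arithmetic. The sketch below uses the textbook idealized model, which is an assumption here and not a Tensilica performance figure: one independent operation issues per cycle, there are no stalls, and the stage count is at least one. Under that model a unit with `stages` pipeline stages finishes `ops` operations in `stages + ops - 1` cycles, whereas a non-pipelined unit of the same latency needs `stages * ops` cycles. Both function names are hypothetical.

```c
#include <stdint.h>

/* Idealized cycle count for a fully pipelined unit: the first result
 * appears after `stages` cycles, then one more result per cycle.
 * Assumes stages >= 1, independent operations, and no stalls. */
uint64_t pipelined_cycles(uint32_t stages, uint64_t ops)
{
    return (ops == 0) ? 0 : (uint64_t)stages + ops - 1;
}

/* Same unit without pipelining: each operation occupies the unit
 * for its full latency before the next one can start. */
uint64_t unpipelined_cycles(uint32_t stages, uint64_t ops)
{
    return (uint64_t)stages * ops;
}
```

For a long stream of operations the pipelined cost approaches one cycle per operation regardless of depth, which is why a deep multi-cycle unit can still sustain high throughput.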

Tensilica's Manual Fusion Editor
There's a lot more to learn about TIE, and we provide training and an extensive list of application notes. Visit our to find out more.