Unlimited I/O Bandwidth
Three major innovations improve I/O throughput in
Xtensa LX2 processors: an optional second load/store unit,
TIE Ports and Queues, and TIE memory Lookup interfaces.
All of these innovations can quickly and easily be implemented using the Tensilica Instruction Extension (TIE) language.
Second Load/Store Unit Option
Designers can choose one or two 128-bit wide load/store
units. Most standard embedded processors have only
a single narrow (32- or 64-bit) load/store unit.
However, many applications benefit from two load/store
units for data-intensive inner loops, a standard
feature of many high-end DSP processors. The Xtensa
LX2 processor’s optional second load/store
unit provides greater sustained general-purpose
I/O bandwidth and XY-style memory access for
DSP applications. Additionally, at 128 bits, each unit is
much wider and can move much more data per access than
standard load/store units.
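
To see why a second load/store unit helps data-intensive inner loops, consider a rough cycle count for a dot product in which every multiply-accumulate needs one operand from an X memory and one from a Y memory. The sketch below is an idealized model, not a description of the Xtensa LX2 pipeline: it assumes each unit issues one load per cycle, that each load delivers a single operand (ignoring the extra data a 128-bit access can carry), and that the arithmetic overlaps with the loads. The function name and the tap count are illustrative.

```python
def mac_loop_cycles(num_taps: int, load_store_units: int) -> int:
    """Idealized cycles spent fetching operands for a num_taps dot product.

    Assumption: one load per unit per cycle, one operand per load,
    and the multiply-accumulate overlaps with the loads.
    """
    loads_needed = 2 * num_taps  # one X operand and one Y operand per tap
    # Ceiling division: the loads are spread across the available units.
    return -(-loads_needed // load_store_units)


single_unit = mac_loop_cycles(64, 1)  # both operands share one unit
dual_unit = mac_loop_cycles(64, 2)    # X and Y operands fetched in parallel
print(single_unit, dual_unit)
```

Under these assumptions the loop is bound by operand fetches, so adding the second unit halves the cycle count.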

The second load/store
option is particularly valuable for DSP designs.
Ports and Queues
TIE Ports (GPIOs) and Queues (FIFO interfaces) are interfaces that a designer creates directly between two Xtensa processor datapaths or between one Xtensa processor and a block of RTL. These interfaces allow I/O performance of up to hundreds of bits per cycle, comparable to the speed designers achieve between blocks in an ASIC design. For example, TIE Ports can drive an external logic block with instructions or with state and control information, using direct wires that control that logic.
But unlike RTL-based design, configured and extended Xtensa LX2 processors are pre-verified by the Xtensa Processor Generator, and do not require hard-wired implementation of complex state machines. Instead of state machines, the complex datapaths added to Xtensa LX2 cores are sequenced/controlled by the instruction stream of the Xtensa LX2 processor. That means the "control logic" is fully software programmable and debuggable - reducing verification time and risk for the entire SOC.
Ports act like GPIO (general-purpose I/O) and are wires that directly connect two Xtensa
LX2 processors or an Xtensa LX2 processor to external
RTL. Ports are created using simple one-line declarations in a TIE file.
Port connections can be arbitrarily wide,
allowing wide data types to be transferred easily
without the need for multiple load/store operations.
As many as one million signals (1024 ports, each
1024 bits wide) can be used. This number is far beyond
the performance demands of real systems today: it
provides roughly 350 terabits/sec. of direct data flow
per processor in a 130 nm CMOS process. It clearly
demonstrates that old notions of the I/O bottlenecks
inherent in a processor-based solution are now obsolete.
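
The headline port figures can be checked with simple arithmetic. The sketch below multiplies the number of port signals by a clock rate. The text does not state a clock frequency, so the 350 MHz value is an assumption, chosen only because it is roughly consistent with the quoted terabit figure.

```python
PORTS = 1024                    # maximum number of ports, from the text
BITS_PER_PORT = 1024            # maximum port width, from the text
ASSUMED_CLOCK_HZ = 350_000_000  # assumption: not stated in the text

# Total number of individual wires across all ports.
signals = PORTS * BITS_PER_PORT

# If every signal carries one new bit per clock cycle:
bits_per_second = signals * ASSUMED_CLOCK_HZ
terabits_per_second = bits_per_second / 1e12

print(signals)
print(round(terabits_per_second, 1))
```

With that assumed clock, 1,048,576 signals give about 367 terabits/sec, which is in line with the "one million signals" and 350 terabits/sec figures.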

Designers can add
ports, queues, and lookup interfaces to get virtually limitless I/O.
While ports are ideal for quickly conveying control
and status information, queues provide a high-speed
mechanism for transferring streaming data. From the
programmer’s viewpoint, input and output queues
behave like traditional processor registers,
with the notable exception that data is always
available without the need to load or store it
before and after computation.

Ports and Queues speed data through the processor, bypassing the system bus.
Queues can sustain data rates as high as one transfer
every clock cycle or over 350 Gbits/sec for each
queue added to an Xtensa LX2 processor. Custom instructions
can perform multiple queue operations per cycle,
perhaps combining inputs from two input queues
with local data and sending the computed values
to two output queues. The high bandwidth and low
control overhead of queues allow the Xtensa LX2
processor to be used in applications with extreme
data rates. TIE input Queues present a familiar pop/empty/data interface to the external logic, while TIE output Queues present a similar push/full/data interface.
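
The queue behavior described above can be illustrated with a small behavioral model. This is a plain Python sketch, not TIE syntax and not Tensilica's implementation: the `Fifo` class, the `combine_step` function, and the `local_gain` value are hypothetical names introduced for illustration. It models one custom operation that reads two input queues, combines the values with local data, and writes two output queues in a single step, and that stalls when an input is empty or an output is full.

```python
from collections import deque


class Fifo:
    """Minimal FIFO exposing empty/full status alongside push and pop."""

    def __init__(self, depth: int):
        self.depth = depth
        self.items = deque()

    @property
    def empty(self) -> bool:
        return len(self.items) == 0

    @property
    def full(self) -> bool:
        return len(self.items) >= self.depth

    def push(self, value):
        if self.full:
            raise RuntimeError("push to a full FIFO")
        self.items.append(value)

    def pop(self):
        if self.empty:
            raise RuntimeError("pop from an empty FIFO")
        return self.items.popleft()


def combine_step(in_a, in_b, out_sum, out_diff, local_gain):
    """Model one hypothetical custom instruction.

    Returns True if it executed, False if it stalled because an
    input queue was empty or an output queue was full.
    """
    if in_a.empty or in_b.empty or out_sum.full or out_diff.full:
        return False
    a, b = in_a.pop(), in_b.pop()
    out_sum.push(local_gain * (a + b))
    out_diff.push(local_gain * (a - b))
    return True


in_a, in_b = Fifo(4), Fifo(4)
out_sum, out_diff = Fifo(4), Fifo(4)
for a, b in [(1, 10), (2, 20), (3, 30)]:
    in_a.push(a)
    in_b.push(b)

executed = 0
while combine_step(in_a, in_b, out_sum, out_diff, local_gain=2):
    executed += 1
print(executed, list(out_sum.items), list(out_diff.items))
```

Each successful step consumes one entry from both inputs and produces one entry on both outputs, with no load or store in between.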
Automated - Easy to Add to Your SOC Design
Ports and Queues specified by the designer are
automatically added to the Xtensa LX2 processor
and are fully modeled by Tensilica’s
Xtensa Processor Generator. The full behavior of
the Port or Queue, just like any other modification
made to the Xtensa LX2 processor, is automatically
reflected in the custom software development tools,
instruction set simulator, bus functional model
and EDA scripts - in about an hour. And because
it’s automated using Tensilica’s patented
technology, it’s pre-verified and correct
by construction - no need to re-verify the processor.
Simple one-line declarations in a TIE file define
new I/O Ports for configurations of Xtensa LX2 processors.
Only a handful
of commands need to be specified to create a high-bandwidth
set of I/O Queues and execution units that operate
on those queues.
Memory Lookup Interfaces
The TIE Lookup port feature allows the creation of new memory interfaces beyond those already available as local instruction and data memories. Memories connected to these designer-defined TIE Lookup ports can be read and written directly from the processor data path without using load and store instructions. The same interfaces can connect an external device to an Xtensa LX2 processor so that the device is also accessed directly from the data path. They are useful for connecting RAMs for table lookups or for connecting long-latency hardware computation units.
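
As an illustration of how a lookup differs from a load, the sketch below models an external table that receives an index from the data path and returns the data a fixed number of cycles later. It is a behavioral Python model only, not TIE syntax. The `LookupPort` class, its `tick` method, the table contents, and the two-cycle latency are all hypothetical choices made for this example.

```python
from collections import deque


class LookupPort:
    """Behavioral stand-in for an external memory on a lookup interface."""

    def __init__(self, table, latency: int):
        self.table = list(table)
        self.latency = latency     # cycles from request to response (assumed)
        self.in_flight = deque()   # entries of (ready_cycle, data)
        self.cycle = 0

    def tick(self, index=None):
        """Advance one clock cycle and optionally issue a new lookup.

        Returns the data that becomes ready in this cycle, or None.
        """
        self.cycle += 1
        if index is not None:
            ready = self.cycle + self.latency
            self.in_flight.append((ready, self.table[index]))
        if self.in_flight and self.in_flight[0][0] == self.cycle:
            return self.in_flight.popleft()[1]
        return None


# Hypothetical table of squares; issue three back-to-back lookups,
# then keep clocking until the last response arrives.
port = LookupPort(table=[i * i for i in range(8)], latency=2)
results = [port.tick(3), port.tick(5), port.tick(7), port.tick(), port.tick()]
print(results)
```

Because requests are pipelined in this model, a new index can be issued every cycle even though each response takes several cycles, which is the property that makes such an interface suitable for long-latency units.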

TIE Lookup interfaces save valuable power by minimizing memory accesses.
Video system designers can use a TIE Lookup port to connect a local buffer, which stores video frame data and is filled and refilled by external hardware, directly to the processor data path without using power-hungry DMA (Direct Memory Access).
Network designers can use TIE Lookup ports to connect large lookup tables that then can be quickly accessed by the processor.