Digital signal processing (DSP) tasks comprise the bulk of the data-intensive computational requirements for most SOCs in consumer, telecommunications, and wireless systems applications. Tensilica’s customers already use the Xtensa processor core for a variety of DSP tasks, including audio processing, image processing, video processing, and communications channel processing. The Xtensa processor can serve such a wide variety of applications because Tensilica offers several ways to accelerate DSP computations.
For moderate-intensity signal processing applications, a 16-bit multiply-accumulate engine can be added to the base Xtensa processor core with a single click of a configuration button in the Xtensa processor generator. The MAC16 option adds a full suite of multiply-accumulate instructions, including auto-incrementing loads and combined multiply-accumulate-load instructions for high-performance computation.
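To make the multiply-accumulate pattern concrete, here is a minimal C reference model of the kind of inner loop a MAC unit speeds up. It is only a sketch: the function name dot_product_q15 is hypothetical, and the 64-bit accumulator is an assumption chosen for simplicity rather than a model of the actual MAC16 accumulator width.

```c
#include <stddef.h>
#include <stdint.h>

/* Software reference model of a 16-bit multiply-accumulate loop.
 * Assumption: a 64-bit accumulator is used for simplicity; the width of
 * the hardware accumulator is not modeled here. */
int64_t dot_product_q15(const int16_t *x, const int16_t *h, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* One multiply-accumulate per element: 16 x 16 -> 32-bit product,
         * added into the wide accumulator. */
        acc += (int32_t)x[i] * (int32_t)h[i];
    }
    return acc;
}
```

Each loop iteration here performs two loads, one multiply, and one add. Auto-incrementing loads and combined multiply-accumulate-load instructions, as described above, are the kind of feature that can fold that work into fewer instructions.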
Similarly, you can quickly add a 16- or 32-bit multiplier, a low-area integer divider, and/or a single- or double-precision floating-point unit to your processor by using standard configuration options. These DSP instructions are fully supported by the compiler.
For applications with one or more signal processing tasks that require acceleration beyond the base Xtensa RISC processor features, the designer can quickly add instructions and hardware execution units tailored to a specific algorithm.
For example, the “butterfly” operation used in convolutional coding and Viterbi decoding applications is a series of combined Add-Compare-Select (ACS) operations. If the data consists of 8-bit values packed into the standard 32-bit registers of the Xtensa processor, a designer can easily add an ACS instruction to the processor, along with a small incremental block of execution-unit hardware, to greatly speed up Viterbi decoding for communications applications.
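The ACS operation on packed 8-bit values can be sketched in plain C as a behavioral model of what such a custom instruction might compute. The function name acs4_u8, the unsigned saturating add, and the choice of the smaller candidate as the survivor are all assumptions made for illustration; a real design would fix these details to match its decoder.

```c
#include <stdint.h>

/* Behavioral model of a 4-lane Add-Compare-Select on 8-bit values packed
 * into 32-bit words. Hypothetical helper, not an actual Xtensa instruction.
 * Assumptions: metrics are unsigned, adds saturate at 255, and the
 * smaller candidate survives. Bit i of *decisions is set when the second
 * candidate (pm1 + bm1) is selected in lane i. */
uint32_t acs4_u8(uint32_t pm0, uint32_t bm0,
                 uint32_t pm1, uint32_t bm1,
                 uint8_t *decisions)
{
    uint32_t result = 0;
    uint8_t dec = 0;

    for (int lane = 0; lane < 4; lane++) {
        int shift = lane * 8;

        /* Add: path metric plus branch metric for both candidates. */
        uint32_t c0 = ((pm0 >> shift) & 0xFFu) + ((bm0 >> shift) & 0xFFu);
        uint32_t c1 = ((pm1 >> shift) & 0xFFu) + ((bm1 >> shift) & 0xFFu);
        if (c0 > 255u) c0 = 255u;
        if (c1 > 255u) c1 = 255u;

        /* Compare and select: keep the smaller candidate, record the choice. */
        if (c1 < c0) {
            result |= c1 << shift;
            dec |= (uint8_t)(1u << lane);
        } else {
            result |= c0 << shift;
        }
    }

    *decisions = dec;
    return result;
}
```

In software this takes a loop with shifts, masks, adds, and branches for each lane. The attraction of a dedicated ACS instruction is that all four lanes can be handled by one operation on the packed registers.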
For applications with well-defined, very high-performance signal processing demands, the TIE language provides a fast means of developing extremely powerful DSP extensions. Add custom registers and register files for unique data types. Create complex multiple-operation instructions and automatically pipeline them into multi-cycle instructions with a single-line command directive in the TIE description. Create SIMD (single instruction, multiple data) instructions to tackle algorithms with native data parallelism. Use software-pipelining techniques to create combined compute-and-load and compute-and-store instructions for high data-rate applications, enabling continuous computation without the performance overhead of separate processor load and store cycles.
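The software-pipelining idea behind combined compute-and-load instructions can be illustrated in C. The sketch below is only a scheduling illustration with a hypothetical function name, sum_squares_pipelined; it is not TIE code and does not model any particular instruction. Each loop iteration pairs the load of the next sample with the computation on the current one, which is the pairing a fused compute-and-load instruction would perform in a single operation.

```c
#include <stddef.h>
#include <stdint.h>

/* Software-pipelined sum of squares (illustrative scheduling only).
 * The load for iteration i+1 is issued alongside the compute for
 * iteration i, so a combined compute-and-load operation could cover both. */
int64_t sum_squares_pipelined(const int16_t *x, size_t n)
{
    if (n == 0) {
        return 0;
    }

    int64_t acc = 0;
    int16_t cur = x[0];                 /* prologue: first load */

    for (size_t i = 1; i < n; i++) {
        int16_t next = x[i];            /* load for the next iteration */
        acc += (int32_t)cur * cur;      /* compute on the current sample */
        cur = next;
    }

    acc += (int32_t)cur * cur;          /* epilogue: last compute */
    return acc;
}
```

The prologue and epilogue exist only because the load runs one step ahead of the computation. In the steady-state loop body, a load and a compute appear together in every iteration.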