The What, Why and How of Configurable Processors
Viterbi decoding also comes from GSM cellular telephony. GSM employs Viterbi decoding to recover information symbols from a noisy transmission channel. The decoding scheme is built on “butterfly” operations, each consisting of 8 logical operations (4 additions, 2 comparisons, and 2 selections), and it uses 8 butterfly operations to decode each symbol in the received digital information stream.
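To make the operation count concrete, here is a minimal Python sketch of one butterfly as an add-compare-select step. It is an illustration under stated assumptions, not Tensilica's or GSM's implementation: the function and field names are hypothetical, the smaller path metric is assumed to survive, and the two branch metrics are assumed to be equal and opposite, which is a common simplification.

```python
from typing import NamedTuple


class ButterflyResult(NamedTuple):
    metric_a: int    # surviving path metric for the first successor state
    metric_b: int    # surviving path metric for the second successor state
    decision_a: int  # 0 if the survivor came from predecessor 0, else 1
    decision_b: int  # same, for the second successor state


def viterbi_butterfly(pm0: int, pm1: int, bm: int) -> ButterflyResult:
    """One add-compare-select butterfly (hypothetical helper).

    pm0, pm1: path metrics of the two predecessor states.
    bm: branch metric; the opposite branch is assumed to cost -bm.
    Assumption: the smaller metric wins, and ties keep predecessor 0.
    """
    # 4 additions: extend both predecessors toward both successors.
    a_from_0 = pm0 + bm
    a_from_1 = pm1 - bm
    b_from_0 = pm0 - bm
    b_from_1 = pm1 + bm

    # 2 comparisons: decide which predecessor survives for each successor.
    decision_a = int(a_from_1 < a_from_0)
    decision_b = int(b_from_1 < b_from_0)

    # 2 selections: keep the winning metric for each successor.
    metric_a = a_from_1 if decision_a else a_from_0
    metric_b = b_from_1 if decision_b else b_from_0

    return ButterflyResult(metric_a, metric_b, decision_a, decision_b)
```

For example, `viterbi_butterfly(10, 12, 3)` compares 13 against 9 for the first successor and 7 against 15 for the second, so it keeps metrics 9 and 7 with decisions 1 and 0. On a general-purpose processor each of these steps costs at least one instruction, plus loads, stores, and loop overhead.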
Typically, RISC processors need 50 to 80 instruction cycles to execute one Viterbi butterfly, although the stock Xtensa processor needs only 42 instructions. A high-end VLIW DSP does much better than most processors at Viterbi decoding because it requires only 1.75 cycles per Viterbi butterfly. However, high-end VLIW DSPs are very expensive and consume a lot of power.
Tensilica’s TIE (Tensilica Instruction Extension) language allows a designer to add a Viterbi butterfly instruction to the ISA, along with the corresponding pipeline hardware shown in the figure below (approximately 11,000 gates), and to use the processor’s configurable 128-bit I/O bus to load data for 8 symbols at a time.

These additions result in an average Viterbi butterfly execution time of 0.16 cycles per butterfly for the augmented Xtensa processor (roughly a 250x speed improvement over the stock processor), much better than even the high-end VLIW DSP.
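The ratios behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are hypothetical, and it assumes the stock Xtensa's 42 instructions take one cycle each, which the text does not state explicitly.

```python
# Cycles per Viterbi butterfly, taken from the figures quoted in the text.
STOCK_XTENSA_CYCLES = 42.0      # assumption: 42 instructions = 42 cycles
VLIW_DSP_CYCLES = 1.75
AUGMENTED_XTENSA_CYCLES = 0.16

BUTTERFLIES_PER_SYMBOL = 8      # from the description of the GSM decoder


def speedup(baseline_cycles: float, improved_cycles: float) -> float:
    """Return how many times faster the improved design is."""
    return baseline_cycles / improved_cycles


def cycles_per_symbol(cycles_per_butterfly: float) -> float:
    """Return the butterfly cycles spent decoding one received symbol."""
    return cycles_per_butterfly * BUTTERFLIES_PER_SYMBOL


if __name__ == "__main__":
    # Augmented Xtensa versus the stock Xtensa (about 260x, in line with
    # the rounded 250x figure).
    print(speedup(STOCK_XTENSA_CYCLES, AUGMENTED_XTENSA_CYCLES))
    # Augmented Xtensa versus the high-end VLIW DSP (about 11x).
    print(speedup(VLIW_DSP_CYCLES, AUGMENTED_XTENSA_CYCLES))
    # Butterfly cycles per symbol before and after the extension.
    print(cycles_per_symbol(STOCK_XTENSA_CYCLES))
    print(cycles_per_symbol(AUGMENTED_XTENSA_CYCLES))
```

Under that assumption, decoding one symbol drops from 336 butterfly cycles on the stock processor to a little over one cycle on the augmented one.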