The ConnX BBE64 Baseband Engine family is based on an ultra-high performance DSP designed for communication baseband processors in LTE Advanced and other next-generation 4G cellular radios and multi-standard broadcast receivers. The computational demands of such applications call for architectures with a high degree of parallelism and efficient I/O. The ConnX BBE64 family meets these needs by combining a 32-way SIMD, 4-issue VLIW processing pipeline with a rich and extensible set of interfaces. With the option to add 64 extra multipliers, the ConnX BBE64-128 offers up to 128 MACs/cycle.
The ConnX BBE64 family is built around a set of versatile pipelined execution units, including flexible-precision real and complex multiply-add units, adders, and bit manipulation, shift and normalization, select, shuffle and interleave units. Results of all these operations can be kept at an extended precision of 40 bits per component, or truncated, rounded, saturated and shifted to meet the needs of different algorithms and implementations.
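To make the extended-precision flow concrete, here is a minimal sketch in portable C of keeping a 40-bit accumulator and then rounding and saturating it down to 16 bits. It is an illustration only: the function names `clamp40` and `round_sat16` are hypothetical, the 40-bit register is modeled inside an `int64_t`, and the round-half-up mode is an assumption, not a statement of what the BBE64 hardware does.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a wide value into the signed 40-bit range, modeling an
   accumulator that saturates instead of wrapping (assumed behavior). */
static int64_t clamp40(int64_t v)
{
    const int64_t max40 = ((int64_t)1 << 39) - 1;
    const int64_t min40 = -((int64_t)1 << 39);
    if (v > max40) return max40;
    if (v < min40) return min40;
    return v;
}

/* Shift the accumulator right by `shift` bits with round-half-up,
   then saturate the result to the int16_t range.
   Assumes >> on a negative value is an arithmetic shift, which is
   implementation-defined in C but true of mainstream compilers. */
static int16_t round_sat16(int64_t acc, unsigned shift)
{
    assert(shift >= 1 && shift < 40);
    int64_t rounded = (acc + ((int64_t)1 << (shift - 1))) >> shift;
    if (rounded > INT16_MAX) return INT16_MAX;
    if (rounded < INT16_MIN) return INT16_MIN;
    return (int16_t)rounded;
}
```

For a Q15 product sum one would typically call `round_sat16(clamp40(acc), 15)`; the guard bits let intermediate sums exceed 32 bits before the final narrowing step.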
The optional accelerated FIR unit offers even higher performance for a wide range of filtering tasks, including complex data (real coefficient at 64 taps/cycle and complex coefficient at 32 taps/cycle) and real data (symmetric real coefficient at 256 taps/cycle and asymmetric real coefficient at 128 taps/cycle).
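A symmetric filter needs only half the multiplies of an asymmetric one, because taps with equal coefficients can share a multiplier after their samples are added together. The sketch below shows this standard folding technique in portable C. It is not a description of the FIR unit's internals; `fir_direct` and `fir_symmetric` are hypothetical names, and the code computes a single output sample over a window of `taps` inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Direct form: one multiply per tap. */
static int64_t fir_direct(const int16_t *x, const int16_t *h, size_t taps)
{
    int64_t acc = 0;
    for (size_t k = 0; k < taps; ++k)
        acc += (int64_t)h[k] * x[k];
    return acc;
}

/* Folded form for symmetric coefficients (h[k] == h[taps-1-k]) and an
   even tap count: pre-add the two samples that share a coefficient,
   then multiply once. Only h[0 .. taps/2 - 1] is read. */
static int64_t fir_symmetric(const int16_t *x, const int16_t *h, size_t taps)
{
    assert(taps % 2 == 0);
    int64_t acc = 0;
    for (size_t k = 0; k < taps / 2; ++k) {
        int32_t pre = (int32_t)x[k] + x[taps - 1 - k]; /* 17-bit pre-add */
        acc += (int64_t)h[k] * pre;
    }
    return acc;
}
```

Both functions return the same value whenever the coefficients really are symmetric; the folded loop simply runs half as many multiply-accumulates.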
The ConnX BBE64 architecture is code compatible with ConnX BBE16 and further expands on the Boolean predication architecture of ConnX BBE16. This enables the compiler to achieve high throughput with vectorization even on complex functions with conditional operations embedded in the inner loops.
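As an illustration of the kind of loop predication helps with, the sketch below writes the same clipping operation twice in portable C: once with a branch in the inner loop and once in if-converted form, where a Boolean predicate drives a select. A vectorizing compiler can map the second form onto per-lane predicates. The function names are hypothetical, and nothing here uses BBE64-specific types or intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy form: control flow depends on each element. */
static void clip_branch(int16_t *y, const int16_t *x, size_t n, int16_t limit)
{
    assert(y != NULL && x != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (x[i] > limit)
            y[i] = limit;
        else
            y[i] = x[i];
    }
}

/* If-converted form: compute a predicate, then select.
   Every iteration does the same work, so lanes can run in lockstep. */
static void clip_select(int16_t *y, const int16_t *x, size_t n, int16_t limit)
{
    assert(y != NULL && x != NULL);
    for (size_t i = 0; i < n; ++i) {
        int pred = x[i] > limit;      /* Boolean predicate per element */
        y[i] = pred ? limit : x[i];   /* select between two values     */
    }
}
```

The two functions produce identical output; the difference is only in how easily the loop body maps to SIMD lanes.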
The ConnX BBE64 family supports programming in C with a vectorizing compiler. Automatic vectorization of scalar C and full support for vector data types allow algorithms to be developed without programming at the assembly level. Native C operator overloading is supported for natural programming with standard C operators on real and complex vector data types.
The BBE64 family includes an extensive feature set specifically optimized for LTE Advanced wireless. To meet the needs of handsets and infrastructure, Tensilica created two processors, both of which can be further tailored to meet application requirements.
The ConnX BBE64 processors are options for the popular Xtensa dataplane processor (DPU). The power of the ConnX BBE64 architecture comes from a comprehensive DSP and baseband instruction set with over 100 instructions.
A wide variety of load/store operations supports six addressing modes for 16-bit and 32-bit scalar and vector data types. Unaligned loads and stores with masking deliver full bandwidth for unaligned data. Vector data management is supported with data packing and shifting.
Multiply operations include complex and scalar 18b x 18b multiply, multiply-round and multiply-add functions. Complex-number functions include support for conjugate arithmetic and magnitude operations, as well as full-precision arithmetic and saturated or rounded outputs. The ConnX BBE64 can perform up to 128 multiplies per cycle. It provides extended precision with guard bits on all register data, full support for double-precision data, and 40-bit accumulation on MAC operations without performance penalty. A wide variety of arithmetic, logical and shift operations is supported at up to 96 operations per cycle. Matrix multiplication is fully supported for both packed and component data representations, with acceleration for OFDM matrix operations.
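The conjugate multiply-accumulate mentioned above is the core of correlation and channel-estimation loops. The following portable C sketch spells out the arithmetic for a * conj(b) with a wide accumulator. The types `cint16` and `cacc` and the function names are hypothetical, 16-bit inputs stand in for the 18-bit multiplier operands, and an `int64_t` stands in for the guarded accumulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int16_t re, im; } cint16; /* complex sample (assumed layout) */
typedef struct { int64_t re, im; } cacc;   /* wide complex accumulator        */

/* acc += a * conj(b)
   (ar + j*ai)(br - j*bi) = (ar*br + ai*bi) + j*(ai*br - ar*bi) */
static void cmac_conj(cacc *acc, cint16 a, cint16 b)
{
    acc->re += (int64_t)a.re * b.re + (int64_t)a.im * b.im;
    acc->im += (int64_t)a.im * b.re - (int64_t)a.re * b.im;
}

/* Correlate two complex vectors: sum over i of a[i] * conj(b[i]). */
static cacc correlate(const cint16 *a, const cint16 *b, size_t n)
{
    assert(a != NULL && b != NULL);
    cacc acc = {0, 0};
    for (size_t i = 0; i < n; ++i)
        cmac_conj(&acc, a[i], b[i]);
    return acc;
}
```

Correlating a vector with itself yields the sum of squared magnitudes in the real part and zero in the imaginary part, which is one way the magnitude operation relates to conjugate arithmetic.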
The ConnX BBE64 directly supports 16-way radix-4 and radix-8 FFT butterfly steps, 16-32 complex tap FIR or 64-128 real tap FIR operations (256 real symmetric FIR taps) per cycle. The instruction set efficiently implements odd-radix DFTs commonly found in LTE and LTE-Advanced.
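For readers unfamiliar with the term, a radix-4 butterfly is a 4-point DFT that needs only additions, subtractions and swaps, since multiplying by -j or +j just exchanges real and imaginary parts with a sign change. The sketch below is a scalar reference in portable C for one forward butterfly without twiddle factors. The type `cplx32` and the function name are hypothetical; the hardware step described above would process many such butterflies in parallel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t re, im; } cplx32; /* hypothetical complex type */

/* In-place forward 4-point DFT (W4 = -j), no twiddles, no scaling:
   y0 = x0 +   x1 + x2 +   x3
   y1 = x0 - j*x1 - x2 + j*x3
   y2 = x0 -   x1 + x2 -   x3
   y3 = x0 + j*x1 - x2 - j*x3 */
static void radix4_butterfly(cplx32 v[4])
{
    assert(v != NULL);
    cplx32 t0 = { v[0].re + v[2].re, v[0].im + v[2].im }; /* x0 + x2 */
    cplx32 t1 = { v[0].re - v[2].re, v[0].im - v[2].im }; /* x0 - x2 */
    cplx32 t2 = { v[1].re + v[3].re, v[1].im + v[3].im }; /* x1 + x3 */
    cplx32 t3 = { v[1].re - v[3].re, v[1].im - v[3].im }; /* x1 - x3 */

    v[0].re = t0.re + t2.re;  v[0].im = t0.im + t2.im;
    v[2].re = t0.re - t2.re;  v[2].im = t0.im - t2.im;
    /* -j*(a + j*b) = b - j*a, so y1 = t1 - j*t3 and y3 = t1 + j*t3 */
    v[1].re = t1.re + t3.im;  v[1].im = t1.im - t3.re;
    v[3].re = t1.re - t3.im;  v[3].im = t1.im + t3.re;
}
```

Feeding a unit impulse in the second position returns the four powers of -j, which is a quick sanity check on the sign convention.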
For further application acceleration, optional instruction packages are available for 32-way SIMD integer and fractional divide, 16-way SIMD reciprocal square root, and de-spreading functions (64 complex MACs/cycle).
BBE64 supports custom ports (general purpose wire interfaces) and queue (FIFO) interfaces for efficient connection to coprocessors. These custom interfaces can be defined to match the interfaces of existing RTL hardware blocks. Buffered communication between two ConnX BBE64s, or between a ConnX BBE64 and an RTL block, can be implemented automatically using queue interfaces, which are fully supported in the programming and modeling tools.
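The behavior of a queue interface can be pictured as a small bounded FIFO between a producer and a consumer: a push fails (the producer would stall) when the queue is full, and a pop fails (the consumer would stall) when it is empty. The following is a purely behavioral model in portable C under those assumptions. The names `queue_model`, `queue_push`, `queue_pop` and the depth of 4 are hypothetical and do not correspond to any Tensilica tool or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 4u /* assumed depth, chosen only for illustration */

typedef struct {
    uint32_t data[QUEUE_DEPTH];
    unsigned head;   /* index of the next entry to pop  */
    unsigned tail;   /* index of the next free slot     */
    unsigned count;  /* number of entries currently held */
} queue_model;

/* Returns false when the queue is full (producer would stall). */
static bool queue_push(queue_model *q, uint32_t value)
{
    assert(q != NULL);
    if (q->count == QUEUE_DEPTH)
        return false;
    q->data[q->tail] = value;
    q->tail = (q->tail + 1u) % QUEUE_DEPTH;
    q->count++;
    return true;
}

/* Returns false when the queue is empty (consumer would stall). */
static bool queue_pop(queue_model *q, uint32_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return false;
    *out = q->data[q->head];
    q->head = (q->head + 1u) % QUEUE_DEPTH;
    q->count--;
    return true;
}
```

A zero-initialized `queue_model` is an empty queue. Data comes out in the order it went in, which is the property that lets two blocks running at different rates exchange data without shared-memory handshaking.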
Multiple parallel local memories can be connected directly to a ConnX BBE64 DSP using the Lookup interface, allowing more than 32 independent memory references per cycle. The ConnX BBE64 can also be extended by defining new instructions, registers, and execution units to augment the existing instruction set.
A complete set of tools is available to support the ConnX BBE64. A comprehensive instruction set simulator (ISS) allows developers to quickly simulate and evaluate performance. The fast, functional TurboXim simulator option runs more than 40 times faster than the ISS for efficient software development and functional verification. SystemC (XTSC) and C-based (XTMP) system modeling can aid in full-chip simulations.
The toolset includes a high-performance C/C++ compiler with automatic vectorization to support the VLIW pipeline in the BBE64 core. This comprehensive tool set also includes the linker, assembler, debugger, profiler, an energy estimation tool and graphical visualization tools. All major back-end EDA flows are supported.