The ConnX BBE32UE Baseband Engine is a high-performance DSP designed for next-generation communication baseband processors in LTE-Advanced and multi-standard User Equipment PHY (Layer 1) systems. As PHY system developers move from LTE to LTE-Advanced, they face the challenge of delivering up to a 5x performance increase within a very low power budget. Add new algorithms, field testing, and the demand for fast time to market, and developers increasingly look to DSP cores for the flexibility and fast development they need at very low power consumption.
The ConnX BBE32UE is built around a core vector pipeline of 32 MACs. The 16b x 16b multipliers, with signed and unsigned support, and the associated adder and multiplexer trees enable operations such as matrix computation, parallel complex multiply operations, and signal filter structures. The results of these operations can be kept at full precision or truncated, rounded, or saturated and shifted to meet the needs of different algorithms and implementations. High precision is a key factor in the ConnX BBE32UE; as a result, more signed multiplication results can be accumulated without loss of precision, with fewer register spills and lower power.
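The accumulate-then-narrow flow described above can be sketched in portable scalar C. This is only an illustrative model, not the core's instruction set: the function names `mac16_full` and `round_shift_sat16` are hypothetical, and a 64-bit C integer stands in for the hardware's wide accumulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulate n signed 16b x 16b products at full precision.
   Assumption: int64_t stands in for the hardware's wide accumulator. */
static int64_t mac16_full(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];   /* each product fits in 32 bits */
    return acc;
}

/* Narrow a wide result to 16 bits: round to nearest (half up),
   shift right, then saturate. Assumes the compiler implements >> on
   negative values as an arithmetic shift, as mainstream compilers do. */
static int16_t round_shift_sat16(int64_t acc, unsigned shift)
{
    assert(shift < 32);
    if (shift > 0)
        acc = (acc + ((int64_t)1 << (shift - 1))) >> shift;
    if (acc > INT16_MAX) return INT16_MAX;
    if (acc < INT16_MIN) return INT16_MIN;
    return (int16_t)acc;
}
```

For Q15 data, a caller would typically pass `shift = 15`; two full-scale products already exceed the 16-bit range after the shift, which is where the saturation step matters.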
The ConnX BBE32UE supports programming in C with a vectorizing compiler. Automatic vectorization of scalar C and full support for vector data types allow algorithms to be developed without programming at the assembly level. Native C operator overloading is supported for natural programming with standard C operators on real and complex vector data types.
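As a rough illustration of the kind of scalar C that vectorizing compilers generally handle well, the sketch below multiplies two arrays of complex Q15 samples element by element. The `cplx16` type and the `cmul_q15` and `sat16` names are hypothetical and are not part of any vendor library; whether a particular compiler vectorizes this loop depends on the tool and its options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical complex Q15 sample type, for illustration only. */
typedef struct { int16_t re, im; } cplx16;

/* Saturate a wide value to the int16 range. */
static int16_t sat16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* y[i] = x[i] * h[i] in Q15. The loop has unit stride, no loop-carried
   dependence, and restrict-qualified pointers, which are the usual
   preconditions for automatic vectorization. */
static void cmul_q15(const cplx16 *restrict x, const cplx16 *restrict h,
                     cplx16 *restrict y, size_t n)
{
    assert(x && h && y);
    for (size_t i = 0; i < n; i++) {
        /* 64-bit intermediates avoid overflow when all inputs are -32768 */
        int64_t re = (int64_t)x[i].re * h[i].re - (int64_t)x[i].im * h[i].im;
        int64_t im = (int64_t)x[i].re * h[i].im + (int64_t)x[i].im * h[i].re;
        /* truncating Q15 scale; assumes arithmetic >> on negative values */
        y[i].re = sat16(re >> 15);
        y[i].im = sat16(im >> 15);
    }
}
```

The `restrict` qualifiers tell the compiler that the input and output arrays do not overlap, which removes one of the most common obstacles to vectorizing a loop like this.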
The instruction set and architecture have been optimized for user equipment applications in LTE-Advanced and multi-standard communications, and tuned to meet the performance and computation requirements of this market. The result is a smaller, much more energy-efficient DSP core that is also better suited to User Equipment modem system integration. For example, large 1K and 2K FFT algorithms are generally run in offload accelerators, and the ConnX BBE32UE is optimized assuming this type of offloading.
A wide variety of load/store operations support seven different addressing modes for 16b/32b scalar and vector data types. Optional unaligned loads and stores with masking deliver full-bandwidth access to unaligned data. Vector data management is supported with data packing and shifting.
Multiply operations include complex and scalar 17b x 17b multiply, multiply-round, multiply-add and multiply-subtract functions. Complex-number functions include support for conjugate arithmetic and magnitude operations as well as full-precision arithmetic and saturated/rounded outputs. The ConnX BBE32UE includes extended precision with guard bits on all register data, full support for double-precision data, and 40-bit accumulation on all MAC operations without performance penalty. A wide variety of arithmetic, logical and shift operations are supported for up to 16 data words per cycle. There is full support for matrix multiplication, with acceleration for OFDM matrix operations.
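To make the conjugate multiply and 40-bit accumulation concrete, here is a minimal scalar C model of a complex correlation, sum of x[i] * conj(y[i]). Everything in it is an assumption made for illustration: the `cplx16` and `cacc40` types, the function names, and in particular the choice to saturate at the 40-bit limits, which the text above does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limits of a signed 40-bit accumulator. */
#define ACC40_MAX (((int64_t)1 << 39) - 1)
#define ACC40_MIN (-((int64_t)1 << 39))

typedef struct { int16_t re, im; } cplx16;   /* hypothetical input sample  */
typedef struct { int64_t re, im; } cacc40;   /* 40-bit values held in 64b  */

/* Add a term to a 40-bit accumulator.
   Assumption: overflow saturates rather than wraps. */
static int64_t acc40_add(int64_t acc, int64_t term)
{
    int64_t s = acc + term;
    if (s > ACC40_MAX) return ACC40_MAX;
    if (s < ACC40_MIN) return ACC40_MIN;
    return s;
}

/* Correlation: sum over i of x[i] * conj(y[i]).
   (a + jb)(c - jd) = (ac + bd) + j(bc - ad) */
static cacc40 cdot_conj(const cplx16 *x, const cplx16 *y, size_t n)
{
    assert(x && y);
    cacc40 acc = {0, 0};
    for (size_t i = 0; i < n; i++) {
        int64_t re = (int64_t)x[i].re * y[i].re + (int64_t)x[i].im * y[i].im;
        int64_t im = (int64_t)x[i].im * y[i].re - (int64_t)x[i].re * y[i].im;
        acc.re = acc40_add(acc.re, re);
        acc.im = acc40_add(acc.im, im);
    }
    return acc;
}
```

Correlating a vector with itself gives its total energy in the real part and zero in the imaginary part. With 8 bits of headroom above a 32-bit term, a 40-bit accumulator can absorb on the order of 256 worst-case terms before reaching its limits.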
For further application acceleration, optional instruction packages are available offering 16-way SIMD integer and fractional divide, as well as an 8-way SIMD reciprocal square root. There is also a de-spread acceleration package using 16 complex MACs/cycle that includes Hadamard transforms. For 3GPP applications, option packages for soft bit demapping and LFSR generation are available.
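For readers unfamiliar with the role of Hadamard transforms in de-spreading, the sketch below shows a plain in-place fast Walsh-Hadamard transform in scalar C. It is a textbook formulation rather than the option package's implementation, and the names `fwht` and `peak_index` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-place fast Walsh-Hadamard transform (natural/Hadamard ordering).
   n must be a power of two. Output is unnormalized. */
static void fwht(int32_t *d, size_t n)
{
    assert(d && n != 0 && (n & (n - 1)) == 0);
    for (size_t h = 1; h < n; h <<= 1) {
        for (size_t i = 0; i < n; i += h << 1) {
            for (size_t j = i; j < i + h; j++) {
                int32_t a = d[j];
                int32_t b = d[j + h];
                d[j]     = a + b;   /* butterfly: sum        */
                d[j + h] = a - b;   /* butterfly: difference */
            }
        }
    }
}

/* Index of the largest-magnitude output, i.e. the best-matching code. */
static size_t peak_index(const int32_t *d, size_t n)
{
    size_t best = 0;
    int64_t best_mag = -1;
    for (size_t i = 0; i < n; i++) {
        int64_t mag = d[i] < 0 ? -(int64_t)d[i] : (int64_t)d[i];
        if (mag > best_mag) { best_mag = mag; best = i; }
    }
    return best;
}
```

Feeding the transform a received chip sequence correlates it against all n Walsh codes at once in n log2(n) additions; the output index with the largest magnitude identifies the code that was sent.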
The ConnX BBE32UE supports custom Port (general-purpose wire) and Queue (FIFO) interfaces for efficient connection to offload accelerators. These custom interfaces can be defined to match the interfaces of existing offload accelerators. Buffered communication between two ConnX BBE32UE cores, or between a ConnX BBE32UE and an offload accelerator, can be implemented automatically using Queue interfaces and is fully supported in the programming and modeling tools. These interfaces are dedicated to the offload accelerator and provide single-cycle access. This is especially important for user equipment applications, where many functions may be moved to an offload accelerator. The ConnX BBE32UE can therefore access these offload accelerators with single-cycle, cycle-deterministic operations, greatly reducing power consumption.
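The buffered, FIFO-style communication that a Queue interface provides can be pictured with a small software model. The sketch below is a generic ring buffer in C, not the processor's Queue interface or its tool-generated API; the `fifo_t` type, the function names, and the depth of 4 are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 4u   /* hypothetical depth; must be a power of two */

typedef struct {
    uint32_t data[FIFO_DEPTH];
    unsigned head;   /* running count of words popped */
    unsigned tail;   /* running count of words pushed */
} fifo_t;

static bool fifo_full(const fifo_t *q)  { return q->tail - q->head == FIFO_DEPTH; }
static bool fifo_empty(const fifo_t *q) { return q->tail == q->head; }

/* Producer side: returns false (back-pressure) when the queue is full. */
static bool fifo_push(fifo_t *q, uint32_t w)
{
    assert(q);
    if (fifo_full(q)) return false;
    q->data[q->tail++ % FIFO_DEPTH] = w;
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool fifo_pop(fifo_t *q, uint32_t *w)
{
    assert(q && w);
    if (fifo_empty(q)) return false;
    *w = q->data[q->head++ % FIFO_DEPTH];
    return true;
}
```

In this model the producer (for example, the DSP) and the consumer (for example, an accelerator) only interact through push and pop, so each side can run at its own pace until the buffer fills or drains.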
Local memories can be connected directly to a ConnX BBE32UE DSP using the Lookup interface, bypassing the processor memory bus. This allows efficient implementation of functions that require storage of multiple intermediate datasets. The ConnX BBE32UE can also be modified and extended by defining new instructions, registers, and execution units to augment the existing instruction set.
A complete set of tools is available to support the ConnX BBE32UE. A comprehensive instruction set simulator (ISS) allows developers to quickly simulate and evaluate performance. The fast, functional TurboXim simulator option runs 40 to 80 times faster than the ISS for efficient software development and functional verification. SystemC (XTSC) and C-based (XTMP) system modeling can aid in full-chip simulations. Pin-level XTSC offers co-simulation of SystemC and RTL-level offload accelerator blocks for fast, cycle-accurate simulations.
The toolset includes a high-performance C/C++ compiler with automatic vectorization to support the VLIW pipeline in the ConnX BBE32UE. This comprehensive toolset also includes a linker, assembler, debugger, profiler, energy estimation tool, and graphical visualization tools. All major back-end EDA flows are supported.