Avoid Bus Bottlenecks

Avoid Bus Bottlenecks and Get More Performance from Your ASIC and SOC Designs by Using High-Speed, On-Chip I/O

The two big bottlenecks in the design of high-speed blocks for ASICs and SOCs are I/O performance and computational performance. A processor core's main bus represents a significant I/O bottleneck because all data passing into and out of the processor must travel over it. Two factors therefore constrain I/O traffic into and out of the processor. First, the main bus can perform only one transfer at a time, so other pending transfers must wait for the current transfer to clear. Second, because a processor's main bus is designed to accommodate many system configurations, it tends to require multiple cycles to complete each bus transaction. As a result of these limitations, processor cores have lacked the I/O bandwidth required by many tasks performed in SOCs.

Tensilica developed innovative techniques to relieve congestion on the processor's main bus. Tensilica's Xtensa customizable processor cores incorporate several features that improve I/O bandwidth and allow the processor to bypass the bus and move data significantly faster than conventional 32-bit RISC processor cores. In fact, these I/O features allow Xtensa processor cores to achieve data-transfer rates that can match those of hand-designed RTL blocks. That's why we call our processors DPUs (dataplane processing units): they're designed with features like these unique I/Os that make them ideal for the computationally complex tasks in the SOC dataplane.

The key features that allow Xtensa DPUs to achieve these high data-transfer rates are the XLMI local-memory bus, GPIO ports, and FIFO queue interfaces. The XLMI bus is a simple, fast, single-cycle bus that performs transfers much faster than the Xtensa processor's main bus (the PIF) because it's not designed to support multiple bus masters. The ports and queues allow designers to add many new input and output connections directly into and out of the processor's execution unit.

Ports and queues are available to Xtensa designers via simple check-box options during the configuration process. In addition, with Tensilica's Xtensa LX processor, designers can add specialized custom port and queue interfaces using TIE, the Tensilica Instruction Extension language. TIE lets designers extend a processor's abilities by describing the new functionality in a Verilog-like syntax, without the need to describe the structure of the hardware that implements it. The structure is generated automatically by the processor generator.

Using these features to maximize I/O bandwidth, ASIC and SOC designers can implement high-speed, processor-based I/O with transfer rates that are similar to those achieved by hand-coded RTL function blocks. However, high-speed I/O implemented as a processor feature requires much less effort from the ASIC or SOC development team because the Xtensa DPU is generated automatically by Tensilica's Xtensa Processor Generator and it is guaranteed correct by construction.

The Tyranny of One Bus

Figure 1 shows the microprocessor core configuration typically found in SOC designs. The sole data highway into and out of the processor is its main bus. Because processors often interact with other types of bus masters including other processors and DMA controllers, their main buses have sophisticated transaction protocols and arbitration mechanisms for sharing the bus among masters. These mechanisms result in bus transactions that occur over several clock cycles.
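To make the cost of a shared, multi-cycle bus concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustrative model, not a description of the PIF or of any Xtensa interface: the function names, the 4-cycle transaction, the 300 MHz clock, and the two-master scenario are all assumed values chosen for the arithmetic.

```python
# Illustrative model only: every number below is a hypothetical assumption,
# not a measured figure for any real processor or bus.

def shared_bus_cycles(transfers_per_master: int, masters: int,
                      cycles_per_transaction: int) -> int:
    """Total cycles to drain all transfers when one bus serializes them.

    The bus carries one transaction at a time, so transactions from every
    master queue up behind each other, and each one costs several cycles.
    """
    return transfers_per_master * masters * cycles_per_transaction


def direct_port_cycles(transfers: int) -> int:
    """Cycles for a dedicated interface assumed to move one word per cycle,
    with no arbitration and no other masters competing for it."""
    return transfers


def bandwidth_mbytes_per_s(bytes_per_transfer: int, clock_mhz: float,
                           cycles_per_transfer: int) -> float:
    """Peak bandwidth in MB/s for a given cost per transfer."""
    return bytes_per_transfer * clock_mhz / cycles_per_transfer


if __name__ == "__main__":
    # Assumed scenario: 2 masters, 1000 transfers each, 4 cycles per transaction.
    print("shared bus cycles: ", shared_bus_cycles(1000, 2, 4))
    print("direct port cycles:", direct_port_cycles(1000))
    # Assumed 32-bit (4-byte) transfers at a 300 MHz clock.
    print("shared bus MB/s:   ", bandwidth_mbytes_per_s(4, 300, 4))
    print("direct port MB/s:  ", bandwidth_mbytes_per_s(4, 300, 1))
```

Under these assumptions the processor waits 8,000 cycles for the shared bus to clear all traffic, against 1,000 cycles on a dedicated single-cycle connection, and peak bandwidth drops by the same factor as the cycles per transaction. Real buses add burst modes and pipelining that narrow the gap, so treat the ratio as a way to reason about the bottleneck rather than as a prediction.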

 

 
