Processor Ports and Queues: Easily overcome I/O-bandwidth obstacles in your next ASIC or SOC design

Computational horsepower is always a concern for ASIC and SOC designers; I/O bandwidth is another. Configurable processor cores allow designers to add internal registers and function units that boost computational throughput, often by a factor of 10 to 100. However, increased computational performance places even more demand on the processor's I/O speed, both to bring operands into the core and to ship results out. Conventional bus-centric design for on-chip I/O simply cannot handle the resulting traffic. Direct port I/O and FIFO queue interfaces ease the traffic load on overused buses, which greatly simplifies the design of complex chips.

Most processors, configurable or not, rely on one or a few buses to move data into and out of the processor core. These buses are increasingly inadequate for high-throughput applications such as video compression/decompression or high-speed networking. A new, configurable feature called "ports and queues," based on tried-and-true system technology, provides the ASIC and SOC design team with essentially unlimited I/O bandwidth: as much as any system can possibly use.
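To make the FIFO queue idea concrete, here is a minimal software sketch of a fixed-depth queue sitting between a producing block and a consuming block. It is only a toy model: the class name `HardwareFifo`, its `depth` parameter, and the `push`/`pop` methods are hypothetical and are not taken from any vendor's interface. The point it illustrates is that the two blocks exchange data directly, without a shared bus, and that each side simply stalls when the queue is full or empty.

```python
from collections import deque


class HardwareFifo:
    """Toy model of a fixed-depth FIFO between a producer and a consumer.

    All names here are illustrative assumptions, not a real core's interface.
    """

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self._slots = deque()

    @property
    def full(self) -> bool:
        # A full queue tells the producer to stall.
        return len(self._slots) >= self.depth

    @property
    def empty(self) -> bool:
        # An empty queue tells the consumer to stall.
        return not self._slots

    def push(self, word: int) -> bool:
        """Producer side: store a word, or return False if the queue is full."""
        if self.full:
            return False
        self._slots.append(word)
        return True

    def pop(self):
        """Consumer side: return the oldest word, or None if the queue is empty."""
        if self.empty:
            return None
        return self._slots.popleft()
```

In this sketch a producer that sees `push` return `False` would retry on a later cycle, which is the software analogue of a stalled hardware write.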

Figure 1 portrays a simple processor-based system. This single-bus approach to processor-based design dates back to the introduction of the first microprocessor in 1971. It hasn't changed much in more than 30 years, despite radical improvements in processor performance. Put simply, the processor loads incoming data from input devices over the bus and sends results to output devices over the same bus.

For discrete integrated circuits and printed circuit boards, the adoption of bus-based I/O saved package pins and circuit-board traces. However, for ASIC and SOC designs, the bus-centric design approach leaves much of IC technology's fundamental performance untapped. The reason is simple. In Figure 1, the two memory devices and the two peripherals share the processor's bus. Each of these four devices receives only some fraction of the processor's bus bandwidth, and the system's overall throughput suffers. As the number of blocks connected to the bus increases, the I/O congestion increases as well.
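The congestion argument can be illustrated with a rough calculation. The sketch below assumes a bus with a fixed peak bandwidth and an arbiter that scales every device's traffic back proportionally once total demand exceeds that peak. Real arbiters use other policies, and the function name and the numbers are hypothetical, chosen only to show the trend.

```python
def per_device_share(bus_words_per_cycle: float, demands: list[float]) -> list[float]:
    """Bandwidth each device gets on a shared bus.

    Assumption: when total demand exceeds the bus capacity, every device is
    scaled back by the same factor (proportional sharing).
    """
    total = sum(demands)
    if total <= bus_words_per_cycle:
        # The bus is not saturated, so every device gets what it asks for.
        return list(demands)
    scale = bus_words_per_cycle / total
    return [demand * scale for demand in demands]


# Four devices, as in Figure 1, each wanting half a word per cycle
# from a bus that can carry one word per cycle.
shares = per_device_share(1.0, [0.5, 0.5, 0.5, 0.5])
```

Under these assumptions each of the four devices receives only a quarter of a word per cycle, half of what it needs, and adding more devices shrinks every share further.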

To counter this problem, most contemporary embedded processor cores have separate local-memory buses to remove instruction and local data traffic from the bus. The resulting systems look something like the one shown in Figure 2.
