Communications

Choosing the Right Communications Structure

Once the rough number and types of processors are known and tasks are tentatively assigned to the processors, basic communication structure design starts. The goal is to discover the least expensive communications structure that satisfies the bandwidth and latency requirements of the tasks, including changes in the task load that may occur as the SOC’s use evolves over time or across a variety of target systems.

When low cost and good flexibility are most important, a shared-bus architecture, in which all resources are connected to one bus, may be most appropriate. Buses have two significant advantages: they tend to have low hardware complexity, and bus-design issues are familiar to most designers. The glaring liability of the shared bus is long and unpredictable latency, particularly when a number of bus masters contend for access to different shared resources.
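To make the latency point concrete, here is a minimal sketch of how far the wait for a shared bus can spread. It assumes round-robin arbitration and fixed-length transfers, neither of which is specified in the text; the function name and the cycle counts are purely illustrative.

```python
def round_robin_wait_bounds(num_masters, transfer_cycles):
    """Return (best, worst) wait in cycles before a master is granted the bus.

    Assumption: round-robin arbitration, and every transfer holds the bus
    for exactly `transfer_cycles` cycles. Real arbiters and burst lengths vary.
    """
    if num_masters < 1 or transfer_cycles < 1:
        raise ValueError("need at least one master and one cycle per transfer")
    best = 0  # bus is idle, so the request is granted immediately
    worst = (num_masters - 1) * transfer_cycles  # every other master goes first
    return best, worst


for masters in (2, 4, 8):
    best, worst = round_robin_wait_bounds(masters, transfer_cycles=16)
    print(f"{masters} masters: wait between {best} and {worst} cycles")
```

Under these assumptions the worst-case wait grows linearly with the number of contending masters, while the best case stays at zero, which is one way to picture why bus latency is hard to predict.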

When the biggest challenge is total communications throughput with flexibility, the preferred structure is a general-purpose parallel communications network. Crossbar connections and two-level bus hierarchies are common examples, as are mesh networks. A simple mesh topology with nine processors is shown below.

Figure: General-Purpose Parallel Communications Style: On-Chip Mesh Network
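A nine-processor mesh can be modeled in a few lines. The sketch below assumes the processors form a 3 x 3 grid numbered 0 to 8 row by row, and that messages take a dimension-ordered (row first, then column) path; the numbering, the routing rule, and the function names are assumptions made for illustration.

```python
def mesh_neighbors(node, rows=3, cols=3):
    """Return the nodes directly linked to `node` in a rows x cols mesh."""
    r, c = divmod(node, cols)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    # Keep only the positions that fall inside the grid.
    return [rr * cols + cc for rr, cc in candidates
            if 0 <= rr < rows and 0 <= cc < cols]


def mesh_hops(src, dst, cols=3):
    """Links crossed between two nodes when routing along rows and columns."""
    sr, sc = divmod(src, cols)
    dr, dc = divmod(dst, cols)
    return abs(sr - dr) + abs(sc - dc)


print(mesh_neighbors(4))  # the center node has four neighbors
print(mesh_hops(0, 8))    # opposite corners of the grid
```

Each node connects only to its nearest neighbors, so many transfers can proceed in parallel, at the cost of multiple hops between distant processors.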

When the communication pattern is well known at design time and likely to be stable, the architect can optimize the communications around that particular pattern of data flow. The drawing below shows the direct connections made when the communication between the processors is well understood and will not change.

Figure: Optimized Direct Parallel Communications

Communications = Software Mode + Hardware Interconnect

Intertask communications are built on two foundations: the software communications mode and the corresponding hardware mechanism. The three basic styles of software communications between tasks are message passing, shared memory, and device drivers.

Message passing makes all communication between tasks overt. All data is private to a task except when operands are sent by one task and received by another task. The send/receive model implies a queue; messages cannot be sent if the output queue is full and cannot be received if the input queue is empty. Hardware queues give the lowest latency and processor overhead, especially for small, fixed-length messages such as simple operands. Message passing is generally easier to code than shared-memory communications techniques when the tasks are largely independent, but is often harder to code efficiently when the tasks are very tightly coupled.
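The send/receive behavior described above can be sketched as a bounded queue. This is a software stand-in for a hardware queue, not a model of any particular device; the class name, method names, and queue depth are hypothetical.

```python
from collections import deque


class MessageQueue:
    """Bounded FIFO standing in for a hardware queue between two tasks."""

    def __init__(self, depth):
        self._depth = depth
        self._items = deque()

    def try_send(self, message):
        """Enqueue and return True, or return False if the queue is full."""
        if len(self._items) >= self._depth:
            return False
        self._items.append(message)
        return True

    def try_receive(self):
        """Return (True, message), or (False, None) if the queue is empty."""
        if not self._items:
            return False, None
        return True, self._items.popleft()


queue = MessageQueue(depth=2)
print(queue.try_send(10), queue.try_send(20), queue.try_send(30))
print(queue.try_receive())
```

The sender never touches the receiver's data directly: the only shared state is the queue itself, and both the full and empty conditions are visible to the caller.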

With shared-memory communications, only one task reads from or writes to the data buffer in memory at a time. Successful use of shared memory requires explicit access synchronization. A destination task must know when the source task has written valid data; otherwise, it may read stale data. Embedded software languages, such as C, typically include features that ease shared-memory programming.
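One possible way to picture the required synchronization is a one-slot buffer guarded by a lock and a "valid" flag. This is a minimal sketch using Python's `threading.Lock`; the class and method names are hypothetical, and a real SOC would typically rely on hardware or operating-system primitives instead.

```python
import threading


class SharedBuffer:
    """One-slot buffer shared by a source task and a destination task."""

    def __init__(self):
        self._lock = threading.Lock()  # only one task touches the buffer at a time
        self._valid = False            # True once the source has written fresh data
        self._data = None

    def write(self, data):
        """Source side: store data and mark it valid."""
        with self._lock:
            self._data = data
            self._valid = True

    def try_read(self):
        """Destination side: return fresh data once, or None if nothing new."""
        with self._lock:
            if not self._valid:
                return None
            self._valid = False  # consume the data so it is not read twice
            return self._data


buffer = SharedBuffer()
print(buffer.try_read())   # nothing written yet
buffer.write([1, 2, 3])
print(buffer.try_read())   # fresh data
print(buffer.try_read())   # already consumed
```

In this sketch the source may overwrite data the destination has not read yet; preventing that would need a second condition, such as refusing to write while the flag is still set.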

The hardware-device-plus-software-device-driver model is most commonly used with complex I/O interfaces, such as networks or storage devices. The device-driver mode combines elements of message passing and shared-memory access. The principles of the device driver can be applied to almost any pair of communicating tasks, especially where the interface between tasks looks like a series of requests and responses.
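The request/response pattern can be sketched as follows. Request and response descriptors travel through queues (the message-passing element), while the payload stays in a buffer shared with the device (the shared-memory element). `EchoDevice`, `Driver`, and all method names are hypothetical, and the sketch allows only one outstanding request at a time.

```python
from collections import deque


class EchoDevice:
    """Hypothetical device: uppercases the bytes it finds in the shared buffer."""

    def process(self, buffer, offset, length):
        chunk = bytes(buffer[offset:offset + length])
        buffer[offset:offset + length] = chunk.upper()
        return "ok"


class Driver:
    """Hypothetical driver pairing message queues with a shared payload buffer."""

    def __init__(self, device, buffer_size=64):
        self._device = device
        self.buffer = bytearray(buffer_size)  # shared-memory element
        self._requests = deque()              # message-passing element
        self._responses = deque()

    def submit(self, payload):
        """Copy the payload into the shared buffer and queue a request."""
        if self._requests:
            raise RuntimeError("this sketch allows one outstanding request")
        if len(payload) > len(self.buffer):
            raise ValueError("payload larger than shared buffer")
        self.buffer[:len(payload)] = payload
        self._requests.append((0, len(payload)))  # (offset, length) descriptor

    def service(self):
        """Let the device handle queued requests and post a response for each."""
        while self._requests:
            offset, length = self._requests.popleft()
            status = self._device.process(self.buffer, offset, length)
            self._responses.append((status, offset, length))

    def poll(self):
        """Return the next response, or None if none is pending."""
        return self._responses.popleft() if self._responses else None


driver = Driver(EchoDevice())
driver.submit(b"read block 7")
driver.service()
status, offset, length = driver.poll()
print(status, bytes(driver.buffer[offset:offset + length]))
```

Only small descriptors pass through the queues, so bulk data is never copied between tasks more than once.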

For more information, get a copy of the book “Engineering the Complex SOC: Fast, Flexible Design with Configurable Processors,” by Chris Rowen, published by Prentice Hall.
