Use Configurable, Extensible Processors as Building Blocks
General-purpose microprocessor cores can’t
deliver the application throughput, cost, and power
efficiency needed for most computationally demanding
embedded system-on-chip (SOC) tasks. These processors
aren’t designed to efficiently manipulate
audio, video or network packets or do other highly
specialized tasks.
Until now, these demanding tasks had to be hard-coded
in RTL to get the required speed. However,
a design with millions of gates in RTL takes too long
to create, is too hard to verify, and can’t be changed
once the chip is fabricated.
Now there’s a real alternative to RTL design.
You can use configurable, extensible Xtensa processors
instead of RTL to finish your design much faster
and add flexibility to adapt to changing standards
or product requirements. Tensilica’s Xtensa
processors provide both the I/O throughput and
the computational performance previously only available
with RTL design.
Tensilica's XPRES Compiler automates the design
of optimized configurable processors from ANSI
C/C++ code. The generated processor rivals
the performance and efficiency of hand-coded RTL
blocks, with many concurrent operations, efficient
data types, and multiple wide, deep, optimized pipelines.
Advantages of Using Xtensa Processors as RTL Alternatives
- Lower verification effort and time: Designing hardwired RTL blocks has become more about verification than about design. Design teams typically spend twice as many resources and person-months on verification as on design. Design changes made late in the project cycle are often limited by the verification effort. Furthermore, whereas 90% of the area of a hardwired RTL block lies in the data path and only 10% in the control logic, most of the bugs (perhaps 90%) are found in the control logic. Extending the Xtensa processor using TIE lets designers create the same data path inside the processor as they would in a hardwired RTL block. Yet the designer need not generate and verify a control FSM; instead, the control logic is expressed as software, that is, as instructions that execute on the processor. It is easier to verify TIE extensions made to the Xtensa processor than it is to verify the corresponding RTL data path, since only the input-output relationship, or functional behavior, of the operations specified in TIE has to be verified. The TIE Compiler and Xtensa Processor Generator take care of converting the TIE specification into data path elements in the processor pipeline and implementing the control, decode, and bypass logic in the processor control units. Thus, the designer no longer has to verify RTL, just the TIE behavioral specification.
- Lower power because of finer clock gating: Since each Xtensa processor is automatically generated by the Xtensa Processor Generator, the Generator statically analyses pipeline activity and applies aggressive clock gating to each functional unit and flip-flop in the processor RTL. An equivalent amount of clock gating in hardwired RTL blocks would be impractical because of the time and resources it would take and, more importantly, the burden it would place on the verification team of verifying that the clock gating is done correctly.
- Reuse of the same hardware for multiple tasks: Complex SOCs consist of millions of gates of logic and are designed to perform multiple tasks. Often these tasks do not need to be performed at the same time, which gives them an opportunity to share the same hardware units. Processors are particularly well suited to this sort of sharing. Designers can specify in TIE a data path that consists of a set of execution units usable by all of the tasks, and then use the programmability of the processor to determine which tasks are executed. For example, a designer can build a video engine that can be used to implement a range of video codecs such as H.264, MPEG-4, and VC-1.
- Flexibility to upgrade algorithms post-silicon: Using a task-optimized Xtensa processor to implement an algorithm lets the designer make modifications, enhancements, and tweaks to the algorithm after the SOC has taped out. For example, half-toning algorithms in printers are a subject of continuous research and are therefore good candidates for implementation on an Xtensa processor.
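The split between a fixed data path and software control described in the list above can be sketched in plain C. This is only an illustration under stated assumptions, not Tensilica code: `sad4`, `block_sad`, and `best_match` are hypothetical names, and `sad4` is a portable C model standing in for a custom data path operation (here, a sum of absolute differences over four packed 8-bit pixels) of the kind a designer might describe in TIE. No TIE syntax or Xtensa-specific API is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a custom data path operation: the sum of
 * absolute differences over four 8-bit pixels packed into 32-bit words.
 * In hardware this would be a data path element; it is modeled in
 * portable C here so that the sketch runs anywhere. */
static uint32_t sad4(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    for (int lane = 0; lane < 4; lane++) {
        int32_t pa = (int32_t)((a >> (8 * lane)) & 0xFF);
        int32_t pb = (int32_t)((b >> (8 * lane)) & 0xFF);
        sum += (uint32_t)(pa > pb ? pa - pb : pb - pa);
    }
    return sum;
}

/* Task 1: cost of one block. The sequencing that a control FSM would
 * handle in a hardwired block is an ordinary software loop. */
static uint32_t block_sad(const uint32_t *cur, const uint32_t *ref,
                          size_t words)
{
    uint32_t total = 0;
    for (size_t i = 0; i < words; i++)
        total += sad4(cur[i], ref[i]);
    return total;
}

/* Task 2: choose the closest of several candidate blocks, stored back to
 * back in cands. It reuses the same data path operation; only the
 * software control differs, so it could be changed after tape-out. */
static size_t best_match(const uint32_t *cur, const uint32_t *cands,
                         size_t n_cands, size_t words)
{
    assert(n_cands > 0);
    size_t best = 0;
    uint32_t best_cost = block_sad(cur, cands, words);
    for (size_t c = 1; c < n_cands; c++) {
        uint32_t cost = block_sad(cur, cands + c * words, words);
        if (cost < best_cost) {
            best_cost = cost;
            best = c;
        }
    }
    return best;
}
```

In this sketch, verifying `sad4` means checking only its input-output behavior, while `block_sad` and `best_match` are ordinary software that two different tasks could share and that could be revised without touching the data path model.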
Tensilica’s unique Xtensa processors break
through your strongest preconceptions about
conventional processor cores.
Three examples, ranging from simple to complex,
illustrate how datapath extensions allow extensible
processors to replace RTL hardware in a variety
of situations.
Let’s first take a look at how Xtensa processors
deliver RTL-equivalent performance.