Q1: How must I modify my C code to use
XPRES?
A1: The good news is
that you don’t have to modify your C or C++
code at all to use the XPRES Compiler. The XPRES
Compiler will figure out the best ways to optimize
an Xtensa processor to run your unmodified C
code. Of course, you might want to rewrite sections
of your C code to expose more parallelism for better
optimizations, but many engineering teams find
that no modifications are needed.
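As an illustration of what "exposing more parallelism" can look like in
plain C, the sketch below shows the same hypothetical kernel written two
ways. The function names and the use of the C99 restrict qualifier are
assumptions chosen for this example; they are not XPRES requirements, and
whether a particular compiler vectorizes either loop depends on the tool.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: out[i] = a[i] + k * b[i].
 * With plain pointers, a compiler must assume that out may overlap a or b,
 * which can stop it from processing several elements at once. */
void scaled_add_plain(int32_t *out, const int32_t *a, const int32_t *b,
                      int32_t k, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + k * b[i];
    }
}

/* Same computation. The restrict qualifiers are the programmer's promise
 * that the three arrays do not overlap, so iterations are independent and
 * a vectorizing compiler is free to execute them in parallel. */
void scaled_add_restrict(int32_t *restrict out, const int32_t *restrict a,
                         const int32_t *restrict b, int32_t k, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + k * b[i];
    }
}
```

Both functions return identical results for non-overlapping arrays; the
second only gives the compiler more freedom. Calling it with overlapping
arrays would be undefined behavior.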
Q2: How long does it take to use the XPRES Compiler?
A2: The XPRES Compiler
is very fast. Even very large algorithms, such
as full video codecs, can be processed in less than an
hour. Smaller algorithms take just a few
minutes.
Q3: How is XPRES different from Behavioral Synthesis
or ESL?
A3: A variety of approaches – with
buzzwords such as “behavioral synthesis,” “C-language
hardware synthesis,” and “ESL” – have
all fallen short of the mark because they all try
to solve what is essentially an intractable problem:
transforming a description written in a sequentially
executable language into a parallel collection
of interoperating, non-programmable hardware blocks.
Tensilica’s XPRES Compiler tackles this
design problem using a far simpler, more direct
approach. Instead of attempting to create application-specific
hardware from scratch, the XPRES Compiler starts
with a fully functional microprocessor core and
then adds hardware to it in the form of additional
execution units and corresponding machine instructions
to speed processor execution for the target application.
Thus the XPRES Compiler starts with a working hardware
design (the Xtensa microprocessor core) and
makes it run faster for the targeted application
code. Because this is a considerably less complex
problem, the XPRES Compiler does its job very quickly.
It completes its search of the available design
space in less than an hour.
Q4: Does the XPRES Compiler work with Xtensa
7?
A4: Yes.
Q5: What if my algorithm changes? How can I still
take advantage of the accelerations designed into
my Xtensa processor?
A5: Just run your new
C code through the Xtensa C/C++ compiler (XCC)
generated with your custom Xtensa LX configuration.
That compiler recognizes and remembers the accelerations
built into your processor, so your new C code will
then be able to take advantage of those accelerations.
That is the big benefit of using a programmable
solution.
Q6: Can I still write my own TIE instructions?
A6: Yes. You can input
your own TIE instructions to further modify the
processor or even modify the TIE instructions produced
by the XPRES Compiler.
Q7: What about Ports and Queues - can the XPRES
Compiler use these?
A7: If you create custom ports and queues with an Xtensa LX2 processor, the
XPRES Compiler can recognize these and take advantage
of the high-speed I/O. However, the XPRES
Compiler will not create custom ports and queues.
Q8: How does the XPRES Compiler work?
A8: A simple 5-step
process explains how the XPRES Compiler is used.
Step 1. Compile the
original C/C++ application code. No recoding is
required. Tensilica’s C/C++ compiler generates
information about the application, performing such
functions as ranking code regions by frequency,
determining which loops can be vectorized, generating
dataflow graphs for important regions, and performing
operation counts for each type of opcode for every
region.
Step 2. Run the XPRES
Compiler to determine the best processor configuration
and extensions for that code. The XPRES Compiler
evaluates all generated configurations across all
regions and determines the best set of merged configurations
given a particular gate-count budget. The XPRES
Compiler is able to conduct abstract evaluations
and search through millions of configuration possibilities,
usually in less than an hour.
Step 3. (Optional) Manually
tune the automatically generated custom configuration.
Power users may want to refine or optimize the
code and add instructions for algorithms
or functions that other programs might need.
Step 4. Generate the
optimized processor. Use Tensilica’s proven
processor-generator technology and, in less than
one hour, generate the complete processor RTL with
EDA support, a complete software-development tool
chain, a simulation and modeling environment, and
RTOS support.
Step 5. Compile and
run the original, unmodified C/C++ application
on the newly optimized processor configuration.
There is no need to modify the C code to make it Tensilica-specific,
no need for time-consuming assembly-level optimizations, and no
need to design custom hardware accelerators using
traditional RTL design methods.
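The budget-constrained choice described in Step 2 can be pictured as a
small knapsack problem: pick the set of candidate extensions that saves
the most cycles without exceeding a gate-count budget. The sketch below is
only a toy model of that idea. It is not the XPRES Compiler's actual
search, which this FAQ does not detail; the Candidate struct, its fields,
the best_cycles_saved function, and the MAX_BUDGET limit are all
hypothetical.

```c
#include <stddef.h>

/* Hypothetical candidate extension with an assumed cost and benefit. */
typedef struct {
    unsigned gates;        /* assumed area cost, in thousands of gates */
    unsigned cycles_saved; /* assumed benefit, in cycles saved */
} Candidate;

#define MAX_BUDGET 256 /* illustrative cap on the budget, in thousands of gates */

/* 0/1 knapsack by dynamic programming: returns the largest total of
 * cycles_saved over any subset of candidates whose total gates do not
 * exceed the budget. Budgets above MAX_BUDGET are clamped to MAX_BUDGET. */
unsigned best_cycles_saved(const Candidate *c, size_t n, unsigned budget)
{
    unsigned best[MAX_BUDGET + 1] = {0};

    if (budget > MAX_BUDGET) {
        budget = MAX_BUDGET;
    }
    for (size_t i = 0; i < n; i++) {
        /* Walk the budget downward so each candidate is counted at most once. */
        for (unsigned g = budget; g >= c[i].gates; g--) {
            unsigned with = best[g - c[i].gates] + c[i].cycles_saved;
            if (with > best[g]) {
                best[g] = with;
            }
            if (g == 0) {
                break; /* avoid unsigned wraparound for zero-cost candidates */
            }
        }
    }
    return best[budget];
}
```

For example, with three made-up candidates costing 10, 20, and 30 thousand
gates and saving 100, 180, and 250 cycles, a budget of 50 thousand gates
selects the second and third for 430 cycles saved. A real tool would also
have to account for candidates that share hardware, which this toy model
ignores.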