See Microprocessor Forum Presentation: Diamond 388VDO Dual Core Video Decoder/Encoder (PDF)
Xtensa Configurable Processors for Video and Diamond Standard 388VDO Engine for H.264, VC-1/WMV9, MPEG-4 and MPEG-2 Video
Tensilica's Xtensa configurable processors are ideal for processing demanding video streams. Tensilica has three solutions for video:
- Several of Tensilica’s customers have built their own video engines and other audio-video solutions based on the Xtensa technology. See "Creating multi-standard, multi-resolution video engines using configurable processors."
- Tensilica offers the Diamond Standard 388VDO Video Engine, optimized for multi-standard and multi-resolution video, with software for H.264 Main profile decode, MPEG-4 Advanced Simple profile decode, VC-1/WMV9 decode, MPEG-2 Main profile decode, and MPEG-4 Advanced Simple profile encode.
- Tensilica's partners offer a spectrum of programmable video encode and decode sub-systems that cover the range of video standards (H.264, VC-1, MPEG-4, MPEG-2) and video resolutions (QCIF, CIF, VGA, D1, HD). All codecs are written entirely in C.
Diamond Standard 388VDO Video Engine
Targeted at mobile handsets and personal media players (PMPs), Tensilica's Diamond Standard 388VDO Video Engine is fully programmable to support all popular VGA and standard definition (SD, also known as D1) video codecs with resolutions up to 720x480 (NTSC) and 720x576 (PAL) including H.264 Main Profile, VC-1 Main Profile, MPEG-4 Advanced Simple Profile (ASP), and MPEG-2 Main Profile, each of which is available from Tensilica. Lower resolutions such as QCIF, QVGA, CIF and VGA are also supported.
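To put the two D1 frame sizes above in perspective, the following is a minimal sketch of the macroblock throughput they imply. The 16x16 macroblock size and the nominal frame rates (30 fps for NTSC, 25 fps for PAL) are assumptions for illustration and are not stated in the text above; the function name is hypothetical.

```python
# Sketch: macroblock throughput implied by the D1 frame sizes quoted above.
# Assumptions (not from the source): 16x16 macroblocks, nominal 30 fps for
# NTSC and 25 fps for PAL.

MB_SIZE = 16  # macroblock edge in pixels (assumed)


def macroblocks_per_second(width, height, fps):
    """Return the number of macroblocks processed per second."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return mbs_per_frame * fps


ntsc = macroblocks_per_second(720, 480, 30)  # 45 x 30 macroblocks per frame
pal = macroblocks_per_second(720, 576, 25)   # 45 x 36 macroblocks per frame

print(f"NTSC D1: {ntsc} macroblocks/s")
print(f"PAL D1:  {pal} macroblocks/s")
```

Under these assumptions both formats come out at the same macroblock rate, which is why a single D1 performance figure can reasonably cover both.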
Supported Video Decoders and Encoders
| Decoders | Encoders |
|---|---|
| H.264 Baseline Profile | H.264 Baseline Profile |
| H.264 Main Profile | JPEG |
| JPEG | MPEG-4 Simple Profile |
| MPEG-2 Main Profile | MPEG-4 Advanced Simple Profile |
| MPEG-4 Simple Profile | |
| MPEG-4 Advanced Simple Profile | |
| VC-1/WMV9 Simple Profile | |
| VC-1/WMV9 Main Profile | |
The Diamond Standard 388VDO Engine hosts all the key video processing functions in software on the cores, including the network abstraction layer, picture layer, slice layer, bit-stream parsing, and entropy decoding and encoding. This includes the computationally demanding CABAC (Context Adaptive Binary Arithmetic Coding) decoding in the H.264 Main profile decoder. Most other solutions either omit CABAC, implement it in a separate, complex, non-programmable hardware block, or run it as a general-purpose CPU workload of more than 700 MHz, which significantly increases power consumption. By implementing CABAC in instruction set extensions, Tensilica created a low-MHz, power-efficient version of CABAC in less than half the area of a typical CABAC hardware block.
The Diamond 388VDO offers both H.264 Baseline and Main profile solutions. Main profile offers superior data compression and video quality and is the preferred coding scheme at resolutions of D1 and higher for advanced handset and PMP applications. Most other video solutions for SOC design implement only Baseline profile video.
For an in-depth discussion of the architecture, see the paper "Anatomy of a hardware video codec" from Video/Imaging Designline.
Full Software Suite Including Decoders and Encoders
Tensilica has developed encoders and decoders for the Diamond 388VDO Engine, so this is a complete solution with the hardware and software available directly from Tensilica. SOC designers do not need to rely on third-party application providers. Tensilica also provides a complete matching software development toolchain, including an advanced integrated development environment based on the Eclipse framework, a world-class compiler, a cycle-accurate SystemC-compatible instruction set simulator, and the full industry-standard GNU toolchain.
With the optimized video instructions, developers can port their own codecs to the Diamond 388VDO Engine entirely in C, saving time compared to assembly language programming. In addition, Tensilica’s wide partner network provides operating systems, debug probes, ICE solutions, and other support needed to help get Tensilica’s processors designed in quickly.
The Flexibility of Processor-Based Video Decoding
The Diamond 388VDO Engine compares quite favorably to the traditional approach of using pure hardware-based video accelerators in tandem with conventional CPUs. First, the Diamond 388VDO offloads the full video decode task, including all bit-stream parsing, from the system host CPU. Conventional hardware accelerators offload only the pixel processing functions, such as motion estimation, and leave a large compute burden (often more than 100 MHz of continuous host CPU overhead) on the system controller.
Second, conventional solutions consisting of a CPU plus a hardware accelerator burn a huge amount of wasted power in the system bus when shuffling data to and from the CPU and accelerator – power that is often conveniently not counted by other IP vendors that boast that their HW accelerator block itself burns only a small amount of power.
Third, when the Diamond 388VDO Engine is not being used to perform video tasks, it is a ready resource of over 500 Dhrystone MIPS of general-purpose CPU power available to perform other system tasks – whereas a dedicated video HW block can never be reused.
Fourth, the Diamond 388VDO Engine is programmable and, therefore, can host future video standards that emerge in the coming years.
Finally, the Diamond 388VDO Engine delivers all these benefits in a compact footprint, consuming as little as 8 mm² (including processor logic and attached local memories) in 130 nm silicon processes.
Low Area, Low Power Solutions for SOC Design
The Diamond Standard 388VDO is optimized for mobile applications: it requires less area and consumes less power than competing solutions. Active power is further reduced through fine-grained clock gating, a feature of the Xtensa processor architecture, and through power management instructions that let software throttle power under varying video workloads. The DMA engine and its interface to the Stream and Pixel Processors add further power efficiency by minimizing external memory bandwidth requirements.
As an example of area efficiency, the full-featured Diamond 388VDO delivers H.264 Main profile decode and MPEG-4 ASP encode at D1 resolution yet occupies only 12 mm², including memories, and runs at 200 MHz in TSMC 0.13G process technology. While decoding the “Foreman” H.264 Main Profile video stream, the Diamond 388VDO consumes 24.5 mW plus 9.7 mW for the memory, for a total of only 34.2 mW, based on TSMC 0.90G pre-layout with wireload estimates.
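The power figures quoted for the “Foreman” decode can be checked with a short calculation. This is only an illustrative sketch: the two input values come from the paragraph above, while the memory-share percentage is a derived number that the source does not state.

```python
# Sketch: power budget for the "Foreman" H.264 Main Profile decode example.
# The two inputs are the figures quoted in the text; the memory share is
# derived here and is not a number given by the source.

CORE_MW = 24.5    # processor power, in mW
MEMORY_MW = 9.7   # memory power, in mW

total_mw = CORE_MW + MEMORY_MW
memory_share = MEMORY_MW / total_mw

print(f"total power:  {total_mw:.1f} mW")
print(f"memory share: {memory_share:.1%}")
```

The total matches the 34.2 mW quoted above, and memory accounts for a little over a quarter of it.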
Performance Summary
| Standard | Pixel Rate | Bitrate | Max Clock Rate Req* | DRAM Bandwidth | Power |
|---|---|---|---|---|---|
| H.264 Main Profile Decode | D1 | 5 Mbps | 162 MHz | 86.2 MB/s | 59 mW |
| MPEG-4 Advanced Simple Profile Decode | D1 | 6 Mbps | 167 MHz | 59.8 MB/s | 35 mW |
| VC-1/WMV9 Main Profile Decode | D1 | 6 Mbps | 172 MHz | 88.9 MB/s | 50 mW |
| MPEG-2 Main Profile Decode | D1 | 8 Mbps | 151 MHz | 46.1 MB/s | 38 mW |
| MPEG-4 Advanced Simple Profile Encode | D1 | 4 Mbps | 188 MHz | 148 MB/s | |
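One way to read the Performance Summary figures is as a share of the 200 MHz clock quoted earlier for the TSMC 0.13G implementation. The sketch below does that arithmetic. It assumes, without confirmation from the source, that the clock requirements in the table and the 200 MHz figure refer to the same configuration; the dictionary and function names are hypothetical.

```python
# Sketch: clock headroom per codec, using the Performance Summary figures.
# Assumption (not confirmed by the source): the required clock rates and the
# 200 MHz core clock describe the same implementation.

CORE_CLOCK_MHZ = 200  # clock quoted for the TSMC 0.13G implementation

# Tuple layout: (required clock in MHz, DRAM bandwidth in MB/s), all at D1.
PERFORMANCE = {
    "H.264 Main Profile Decode": (162, 86.2),
    "MPEG-4 ASP Decode": (167, 59.8),
    "VC-1/WMV9 Main Profile Decode": (172, 88.9),
    "MPEG-2 Main Profile Decode": (151, 46.1),
    "MPEG-4 ASP Encode": (188, 148.0),
}


def headroom_mhz(required_mhz, core_mhz=CORE_CLOCK_MHZ):
    """Return the clock left over once the codec's requirement is met."""
    return core_mhz - required_mhz


for name, (required, bandwidth) in PERFORMANCE.items():
    utilization = 100.0 * required / CORE_CLOCK_MHZ
    print(
        f"{name}: {utilization:.1f}% of {CORE_CLOCK_MHZ} MHz, "
        f"{headroom_mhz(required)} MHz spare, {bandwidth} MB/s DRAM"
    )
```

Read this way, the encoder is the tightest fit and MPEG-2 decode leaves the most spare clock for other work.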