Built for Next-Generation Imaging/Video Requirements
A Complete Platform for Image and Video Processing.
The IVP is a licensable, synthesizable subsystem with rich software tools and libraries. The instruction set, memory system and data types have all been optimized for high-throughput 8-, 16- and 32-bit pixel processing.
IVP is much more than just a processor. It's a complete platform for image and video processing.
The IVP Platform
Details on the Platform
micro-DMA Transfer Engine
The μDMA engine is a chaining DMA with interleaved 3D transfers, closely coupled to the IVP core to reduce DMA programming and completion overhead. It operates autonomously and in parallel with the core, with an independent 512-bit/cycle memory port into local data memory and a 128-bit/cycle port onto the AXI bus. By moving data in the background, it hides memory latency for loads and stores. It offers up to 10 GBytes/second of throughput to keep pace with rising resolution and frame rate requirements.
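As a rough software model of what a 3D transfer does, the sketch below copies a set of 2D tiles using separate row and tile pitches for source and destination. The `transfer3d_t` descriptor and `transfer3d_run` function are hypothetical names chosen for illustration; they are not the μDMA programming interface, and the real descriptor layout is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor for illustration only; not the real uDMA layout. */
typedef struct {
    const uint8_t *src;
    uint8_t *dst;
    size_t row_bytes;       /* 1st dimension: bytes copied per row */
    size_t rows;            /* 2nd dimension: rows per 2D tile */
    size_t tiles;           /* 3rd dimension: number of 2D tiles */
    size_t src_row_pitch;   /* bytes between row starts in the source */
    size_t dst_row_pitch;   /* bytes between row starts in the destination */
    size_t src_tile_pitch;  /* bytes between tile starts in the source */
    size_t dst_tile_pitch;  /* bytes between tile starts in the destination */
} transfer3d_t;

/* Software model: walk tiles, then rows, copying one row at a time. */
static void transfer3d_run(const transfer3d_t *d)
{
    assert(d != NULL && d->src != NULL && d->dst != NULL);
    for (size_t t = 0; t < d->tiles; ++t) {
        for (size_t r = 0; r < d->rows; ++r) {
            memcpy(d->dst + t * d->dst_tile_pitch + r * d->dst_row_pitch,
                   d->src + t * d->src_tile_pitch + r * d->src_row_pitch,
                   d->row_bytes);
        }
    }
}
```

In this model a single descriptor gathers several tiles from a strided frame into packed local memory, which is the kind of work a core would otherwise spend cycles on.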
Direct RTL Interfaces
Optional port connections to legacy RTL blocks let designers stream data and control between the IVP core and the RTL blocks without having to go through memory.
The memory/network interface combines the IVP core and micro-DMA transactions with cluster traffic.
Highly Energy Efficient
The IVP is highly energy efficient compared to CPUs or GPUs for 16-bit pixel operations (e.g., absolute-difference, multiply-add, shift-saturate). For example, with the IVP implemented through an automated synthesis and place-and-route flow in a 28nm HPM process with regular Vt, a 32-bit integral image computation on 16-bit pixel data at 1080p30 consumes 10.8 mW. The integral image function is commonly used in applications such as face and object detection and gesture recognition.
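For readers unfamiliar with the integral image, here is a minimal scalar sketch in plain C, assuming row-major 16-bit input and 32-bit output. The function names are illustrative and this is not the IVP library implementation. Note that 32-bit sums can wrap on a full 1080p frame of large 16-bit values; because unsigned arithmetic wraps modulo 2^32, a box sum is still correct as long as the box sum itself fits in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ii[y*width + x] = sum of src over rows 0..y and columns 0..x (inclusive). */
static void integral_image_u16(const uint16_t *src, uint32_t *ii,
                               size_t width, size_t height)
{
    assert(src != NULL && ii != NULL && width > 0 && height > 0);
    for (size_t y = 0; y < height; ++y) {
        uint32_t row_sum = 0;  /* running sum of the current row */
        for (size_t x = 0; x < width; ++x) {
            row_sum += src[y * width + x];
            uint32_t above = (y > 0) ? ii[(y - 1) * width + x] : 0u;
            ii[y * width + x] = row_sum + above;
        }
    }
}

/* Sum of the rectangle [x0..x1] x [y0..y1] (inclusive) from four lookups. */
static uint32_t box_sum(const uint32_t *ii, size_t width,
                        size_t x0, size_t y0, size_t x1, size_t y1)
{
    assert(ii != NULL && x0 <= x1 && y0 <= y1 && x1 < width);
    uint32_t a = (x0 > 0 && y0 > 0) ? ii[(y0 - 1) * width + (x0 - 1)] : 0u;
    uint32_t b = (y0 > 0) ? ii[(y0 - 1) * width + x1] : 0u;
    uint32_t c = (x0 > 0) ? ii[y1 * width + (x0 - 1)] : 0u;
    uint32_t d = ii[y1 * width + x1];
    return d - b - c + a;
}
```

Once the table is built, any rectangular sum costs four reads regardless of its size, which is why detectors that evaluate many box features rely on it.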
IVP’s high performance is demonstrated by complex kernels such as motion search and normalized cross-correlation, commonly used in high-precision block and feature matching and optical flow. For a smart motion search on 16-bit data over a 1920x1080 frame with 256x16 pixel search range and 9x3 pixel block size, IVP can achieve a rate of 142 sums of absolute differences per cycle. A normalized cross-correlation function on 16-bit pixel data with 32-bit accuracy achieves 1 million 8x8 blocks per second.
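The basic cost function behind such a motion search, the sum of absolute differences (SAD), can be sketched in scalar C as follows. This is only the per-block metric on 16-bit data; it is not the "smart" search strategy mentioned above, whose details are not given here. The function name and the pitch-in-elements convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SAD between a block in the current frame and a candidate block in the
 * reference frame. Pitches are in elements (pixels), not bytes. */
static uint32_t sad_block_u16(const uint16_t *cur, size_t cur_pitch,
                              const uint16_t *ref, size_t ref_pitch,
                              size_t block_w, size_t block_h)
{
    assert(cur != NULL && ref != NULL);
    assert(block_w <= cur_pitch && block_w <= ref_pitch);
    uint32_t sad = 0;
    for (size_t y = 0; y < block_h; ++y) {
        for (size_t x = 0; x < block_w; ++x) {
            uint16_t a = cur[y * cur_pitch + x];
            uint16_t b = ref[y * ref_pitch + x];
            sad += (a > b) ? (uint32_t)(a - b) : (uint32_t)(b - a);
        }
    }
    return sad;
}
```

For the 9x3 block size quoted above, the worst-case result is 27 x 65535, which fits comfortably in the 32-bit accumulator. A search would call this once per candidate position and keep the position with the smallest value.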
A 32-Element Engine, 4-Way VLIW, 16-bit Fixed Point Imaging/Video DSP
The IVP architecture can scale by both the number of element engines and the number of processors.
The IVP Core Architecture
With Sample Memory Sizes Selected
Details on the Core Architecture
The IVP core is based on our proven Xtensa architecture. It uses 4-way instruction issue with up to three pixel arithmetic operations per cycle (MUL, MAC, select, shift, ALU) and is capable of two 512-bit (32x16-bit) pixel data memory references per cycle. It features a 32-way vector SIMD with 16-bit register elements and 8- and 16-bit memory elements. Its instruction set is further extensible by the designer.
32-way Vector SIMD Datapath
Each SIMD slice contains a rich set of computational resources:
- 3 independent 16-bit ALUs, 16x16 multiplier, 16-bit variable shifter
- 3 register files per slice (pixel register file, predicate register file and shift select register file)
- Interface to memory system that loads 8- or 16-bit data
- Cross-element select and reduction network that supports arbitrary element swaps and reduction operations each cycle (e.g., reduction min-max, reduction add)
- Memory rotator that operates at full rate on data from or to unaligned structures
- Predication fully supported by the compiler for high utilization
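To make the predicate register file and the reduction network more concrete, here is a scalar C model of a 32-lane predicated operation and two reductions. All type and function names (`vec16_t`, `pred_t`, `vec_add_sat_pred`, and so on) are hypothetical and exist only for this sketch; they are not IVP intrinsics, and the choice of a saturating add is an assumption made to keep the example well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 32  /* one lane per element engine in this model */

typedef struct { int16_t e[LANES]; } vec16_t;  /* hypothetical vector type */
typedef struct { uint8_t on[LANES]; } pred_t;  /* hypothetical: 1 = lane active */

/* Clamp a 32-bit intermediate to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Predicated saturating add: active lanes get a+b, inactive lanes keep a. */
static void vec_add_sat_pred(const vec16_t *a, const vec16_t *b,
                             const pred_t *m, vec16_t *out)
{
    assert(a != NULL && b != NULL && m != NULL && out != NULL);
    for (int i = 0; i < LANES; ++i)
        out->e[i] = m->on[i] ? sat16((int32_t)a->e[i] + b->e[i]) : a->e[i];
}

/* Reduction add: collapse all lanes into one 32-bit sum. */
static int32_t vec_reduce_add(const vec16_t *a)
{
    assert(a != NULL);
    int32_t sum = 0;
    for (int i = 0; i < LANES; ++i) sum += a->e[i];
    return sum;
}

/* Reduction max: largest lane value. */
static int16_t vec_reduce_max(const vec16_t *a)
{
    assert(a != NULL);
    int16_t best = a->e[0];
    for (int i = 1; i < LANES; ++i)
        if (a->e[i] > best) best = a->e[i];
    return best;
}
```

Predication lets a compiler turn a per-pixel `if` into a full-width vector operation with some lanes masked off, rather than falling back to scalar code.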
The 4-way VLIW issue of vector operations gives an almost arbitrary mix of loads, stores, multiplies, and three ALU operations all taking place simultaneously across all 32 element engines.
The IVP features many imaging-specific operations to accelerate 8-, 16- and 32-bit pixel data types and video operation patterns.
A Highly Customizable Processor
Our Proven, Comprehensive HW and SW Design Environment
For Processor Designers
Cadence delivers patented, proven tools that automate the process of further customizing and delivering the IVP along with matching software tools. These tools have been proven in hundreds of designs. You get RTL, EDA scripts, and reference test bench and test cases. You also get an instruction set simulator, fast functional simulator, SystemC modeling tools, and pin-level cosimulation.
For Software Developers
Cadence provides a comprehensive Software Developer's Toolkit with code generation and analysis tools that speed the development process. Our Eclipse-based Xtensa Xplorer Integrated Development Environment (IDE) serves as the cockpit for the entire development experience. Our C/C++ Compiler is very highly rated, and has auto-vectorization to make compiling your code onto the IVP much easier.
Port your software quickly in C - no assembly programming is required or recommended. Even our partners port and optimize their software in C. We also provide an image processing library to speed up your software design.
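As an illustration of the kind of plain C that auto-vectorizing compilers generally handle well, the sketch below implements a multiply-add followed by a shift and saturate on 16-bit pixels. It uses only standard C99; the function name is hypothetical, and whether a particular compiler vectorizes this loop is an assumption that should be checked with that compiler's reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* out[i] = saturate_u16((a[i] * b[i] + c[i]) >> shift)
 * 'restrict' tells the compiler the arrays do not overlap, which removes
 * a common obstacle to vectorization. */
static void mul_add_shift_sat_u16(const uint16_t *restrict a,
                                  const uint16_t *restrict b,
                                  const uint16_t *restrict c,
                                  uint16_t *restrict out,
                                  size_t n, unsigned shift)
{
    assert(a != NULL && b != NULL && c != NULL && out != NULL);
    assert(shift < 32);
    for (size_t i = 0; i < n; ++i) {
        /* 65535*65535 + 65535 still fits in 32 bits, so no overflow here. */
        uint32_t acc = (uint32_t)a[i] * b[i] + c[i];
        acc >>= shift;
        out[i] = (acc > UINT16_MAX) ? (uint16_t)UINT16_MAX : (uint16_t)acc;
    }
}
```

The loop has a simple trip count, no data-dependent branches other than the clamp, and no aliasing, which are the properties a vectorizer typically looks for.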
Application Demo Platform
Running imaging/video applications in real time requires the complete pipeline from sensor to processor to video output. This FPGA-based demo platform allows for integration of imaging applications in a real-time environment.
Documentation & Literature
IVP Imaging/Video DSP Product Brief
The IVP imaging/video DSP includes a unique instruction set tuned for multi-frame image capture and video pre- and post-processing algorithms, as well as video stabilization, HDR for image and video, object and face recognition and tracking, low-light enhancement, digital zoom and gesture recognition.