Add SIMD SAD Hardware for a 46x Speed-Up
One of the most difficult parts of encoding MPEG-4
video data is motion estimation, which requires
the ability to search adjacent video frames for
similar pixel blocks.
The search algorithm employs a SAD (sum of absolute
differences) operation that subtracts corresponding
pixel values, takes the absolute value of each
difference, and then accumulates the result across
the entire video frame.
For a QCIF (quarter common image format) video
frame at 15 frames/second, the SAD operation for
motion estimation requires just over 641 million
operations/second.
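The three component operations of SAD (subtract, absolute value, accumulate) can be sketched in plain C. This is only an illustration, not code from any encoder: the function name sad_block_scalar, its parameters, and the assumption of 8-bit pixel samples stored row by row are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar SAD between two pixel blocks of 8-bit samples (assumed format).
 * cur and ref point at the top-left pixel of each block; stride is the
 * number of bytes between the starts of consecutive rows in the frame. */
uint32_t sad_block_scalar(const uint8_t *cur, const uint8_t *ref,
                          size_t stride, int width, int height)
{
    assert(cur != NULL && ref != NULL && width > 0 && height > 0);

    uint32_t sum = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            /* Operation 1: subtract corresponding pixels. */
            int diff = (int)cur[(size_t)y * stride + (size_t)x]
                     - (int)ref[(size_t)y * stride + (size_t)x];
            /* Operation 2: take the absolute value of the difference. */
            int mag = diff < 0 ? -diff : diff;
            /* Operation 3: accumulate into the running sum. */
            sum += (uint32_t)mag;
        }
    }
    return sum;
}
```

On a processor without a dedicated SAD instruction, each pixel comparison costs these three separate operations, which is why the per-second operation count grows so quickly during the block search.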
As shown in the figure below, Tensilica’s TIE
language makes it possible to add SIMD (single
instruction, multiple data) SAD hardware that
executes a 16-pixel-wide SAD instruction every
cycle. With the bus at its 128-bit maximum width,
it’s also possible to load 16 pixels’ worth of data
in one instruction.

Adding a SIMD SAD computational engine reduces the computational load by 46x
Combining all three SAD component operations into
one instruction, and extending that instruction with
SIMD so that it computes the values for 16 pixels
in one clock cycle, reduces the requirement from
641 million operations/second to 14 million
instructions/second, a reduction of 46x.
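The effect of the fused, 16-pixel-wide instruction can be modeled in software. The C sketch below is not TIE code and does not reproduce Tensilica's hardware; sad16_model and sad_macroblock16 are hypothetical names, and 8-bit pixels (16 pixels per 128-bit load) and a 16x16 block size are assumptions made for illustration. Each call to sad16_model stands in for what the custom instruction would do in a single cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SAD_LANES = 16 };  /* assumed: 16 pixels x 8 bits = one 128-bit load */

/* Software model of one hypothetical 16-pixel-wide SAD instruction:
 * subtract, absolute value, and accumulate for 16 pixel pairs at once.
 * In hardware the 16 lanes would operate in parallel; here they are a loop. */
uint32_t sad16_model(const uint8_t *a, const uint8_t *b)
{
    uint32_t sum = 0;
    for (int lane = 0; lane < SAD_LANES; ++lane) {
        int diff = (int)a[lane] - (int)b[lane];
        sum += (uint32_t)(diff < 0 ? -diff : diff);
    }
    return sum;
}

/* SAD of a 16x16 block: one modeled instruction per row, so 16 modeled
 * instructions replace 256 separate subtract/absolute/accumulate triples. */
uint32_t sad_macroblock16(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    assert(cur != NULL && ref != NULL && stride >= (size_t)SAD_LANES);

    uint32_t sum = 0;
    for (int row = 0; row < 16; ++row) {
        sum += sad16_model(cur + (size_t)row * stride,
                           ref + (size_t)row * stride);
    }
    return sum;
}
```

The inner loop of the scalar approach disappears into a single operation per 16 pixels, which is the source of the reduction in instruction count described above.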