IMG_corr_5x5_i16s_c16s


Functions

void IMG_corr_5x5_i16s_c16s (const short *restrict imgin_ptr, int *restrict imgout_ptr, short width, short pitch, const short *restrict mask_ptr, short shift, int round)


Function Documentation

void IMG_corr_5x5_i16s_c16s ( const short *restrict  imgin_ptr,
int *restrict  imgout_ptr,
short  width,
short  pitch,
const short *restrict  mask_ptr,
short  shift,
int  round 
)

Description:
The correlation kernel accepts five rows of 'pitch' input pixels and produces one row of 'width' output pixels using the 5x5 filter mask supplied by the caller.
The correlation sum is calculated as a point-by-point multiplication of the 5x5 mask with the input image. The 25 resulting products are summed to produce a 40-bit intermediate sum. A rounding constant is added to the sum, and the result is right-shifted to produce a 32-bit output value that is stored in the output array. Overflow and saturation of the accumulated sum are not explicitly prevented; instead, the filter gain is assumed to be small enough to avoid them (see Assumptions).
The mask advances one column at a time across the input rows until all 'width' output pixels have been produced. The mask and the input image pixels are both provided as 16-bit signed values, while the output pixels are 32-bit signed. The mask chosen for correlation is typically part of the input image or another image.
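The computation described above can be sketched in portable C. This is a minimal illustration, not the IMGLIB source: the name corr_5x5_ref is hypothetical, the mask is assumed to be stored row by row (five coefficients per row), the five input rows are assumed to start 'pitch' pixels apart, and a long long accumulator stands in for the 40-bit intermediate sum.

```c
/* Hypothetical reference sketch; not the IMGLIB source. */
void corr_5x5_ref(const short *imgin_ptr, int *imgout_ptr,
                  short width, short pitch,
                  const short *mask_ptr, short shift, int round)
{
    int i, r, c;

    for (i = 0; i < width; i++) {
        /* long long is at least 64 bits, so it covers the 40-bit sum.
           Starting from 'round' is the same as adding it afterwards. */
        long long sum = round;

        for (r = 0; r < 5; r++) {
            for (c = 0; c < 5; c++) {
                /* Assumed layout: mask stored row by row, 5 values per row;
                   input row r starts at imgin_ptr + r * pitch. */
                sum += (long long)imgin_ptr[r * pitch + i + c]
                     * mask_ptr[r * 5 + c];
            }
        }

        /* Right-shifting a negative value is implementation-defined in C;
           an arithmetic shift is assumed here. */
        imgout_ptr[i] = (int)(sum >> shift);
    }
}
```

In this sketch, output pixel i reads input columns i through i + 4 of each of the five rows, so the caller's buffer needs at least width + 4 readable pixels per row.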
Parameters:
imgin_ptr Pointer to an input image of 16-bit pixels
imgout_ptr Pointer to an output image of 32-bit pixels
width Number of output pixels
pitch Number of columns in the image
mask_ptr Pointer to a 16-bit filter mask
shift User-specified right shift applied to the sum
round User-specified rounding value added to the sum before the shift
Algorithm:
The natural C implementation has no restrictions. The optimized intrinsic C code has restrictions as noted in Assumptions below.
Assumptions:
  • The input array and output array must not overlap
  • The output array must be 64-bit aligned
  • The input and mask arrays must be 16-bit aligned
  • The image pitch must be greater than or equal to the width
  • The width parameter must be a non-zero multiple of 2
  • Internal accuracy of the computations is 40 bits. To ensure correctness on 16-bit input data, the maximum permissible filter gain is 24 bits, i.e. the sum of the absolute values of the filter coefficients must not exceed 2^24 - 1.
  • The shift value should be chosen so that the shifted sum fits in 32 bits. Overflow is not handled.
  • The valid filter mask coefficient range is -32767 to 32767.
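Several of the assumptions above can be checked by the caller before invoking the optimized routine. The helper below is a sketch only: corr_5x5_args_ok is a hypothetical name and is not part of IMGLIB, it assumes a 25-element mask, and it does not test for input/output overlap or for a suitable shift value.

```c
#include <stdint.h>

/* Hypothetical helper (not part of IMGLIB): returns 1 when the arguments
   satisfy the checkable assumptions listed above, 0 otherwise. */
static int corr_5x5_args_ok(const short *imgin_ptr, const int *imgout_ptr,
                            short width, short pitch, const short *mask_ptr)
{
    long gain = 0;
    int i;

    /* Width must be a non-zero multiple of 2. */
    if (width <= 0 || (width % 2) != 0) return 0;
    /* Pitch must be greater than or equal to the width. */
    if (pitch < width) return 0;
    /* Output must be 64-bit aligned; input and mask 16-bit aligned. */
    if (((uintptr_t)imgout_ptr % 8u) != 0) return 0;
    if (((uintptr_t)imgin_ptr % 2u) != 0) return 0;
    if (((uintptr_t)mask_ptr % 2u) != 0) return 0;

    for (i = 0; i < 25; i++) {
        int c = mask_ptr[i];
        /* Valid coefficient range is -32767 to 32767. */
        if (c < -32767 || c > 32767) return 0;
        gain += (c < 0) ? -c : c;
    }

    /* Sum of absolute coefficient values must not exceed 2^24 - 1. */
    return gain <= ((1L << 24) - 1);
}
```

For a 5x5 mask whose coefficients are all within the valid range, the gain is at most 25 x 32767 = 819175, which is below 2^24 - 1, so the final comparison is kept mainly to mirror the stated assumption.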
Implementation Notes:
  • This code is fully interruptible
  • This code is compatible with C66x processors
Benchmarks:
See IMGLIB_Test_Report.html for cycle and memory information.


Copyright 2012, Texas Instruments Incorporated