IMG_conv_3x3_i16s_c16s


Detailed Description


Functions

void IMG_conv_3x3_i16s_c16s (const short *restrict imgin_ptr, short *restrict imgout_ptr, short width, short pitch, const short *restrict mask_ptr, short shift)


Function Documentation

void IMG_conv_3x3_i16s_c16s ( const short *restrict  imgin_ptr,
short *restrict  imgout_ptr,
short  width,
short  pitch,
const short *restrict  mask_ptr,
short  shift 
)

Description:
The convolution kernel accepts three rows of 'pitch' input pixels and produces one row of 'width' output pixels using the 3x3 filter mask supplied by the caller.
The input mask is rotated 180 degrees before the convolution sum is calculated. The sum is a point-by-point multiplication of the rotated mask with the input image: the 9 products are added together to produce a 32-bit intermediate sum. Overflow during accumulation is not prevented; limiting the filter gain as described under Assumptions avoids it.
The user-defined shift value shifts the convolution sum right, down to a 16-bit range, before it is stored in the output array. The stored result is saturated to the signed 16-bit range. The mask moves one column at a time, advancing over the image until the entire 'width' is covered.
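The steps described above can be sketched in portable C. This is a minimal illustration, not the library's source: the name conv_3x3_ref_sketch and the helper sat16 are hypothetical, the mask is assumed to be stored row-major as 9 coefficients, and each input row is assumed to have at least width + 2 readable pixels so that the 3x3 window never runs past the row.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Sketch of one output row of a 3x3 convolution (assumed row-major mask).
 * in    : first of three consecutive input rows, each 'pitch' pixels apart
 * out   : one row of 'width' output pixels
 * Assumes at least width + 2 readable pixels per input row. */
void conv_3x3_ref_sketch(const int16_t *in, int16_t *out,
                         int width, int pitch,
                         const int16_t *mask, int shift)
{
    for (int i = 0; i < width; i++) {
        int32_t sum = 0;                      /* 32-bit intermediate sum */
        for (int r = 0; r < 3; r++) {
            for (int c = 0; c < 3; c++) {
                /* 180-degree rotation: element (r, c) of the rotated mask
                 * is element (2 - r, 2 - c) of the stored mask, which is
                 * index 8 - (3 * r + c) in row-major order. */
                sum += (int32_t)in[r * pitch + i + c] * mask[8 - (3 * r + c)];
            }
        }
        /* Right shift of a negative value is implementation-defined in C;
         * an arithmetic shift is assumed here. */
        out[i] = sat16(sum >> shift);
    }
}
```

With a mask whose only non-zero coefficient is the top-left one, the rotation makes each output pixel pick up the bottom-right pixel of its 3x3 window, which is a quick way to check the orientation convention.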
Parameters:
imgin_ptr Pointer to an input image of 16-bit pixels
imgout_ptr Pointer to an output image of 16-bit pixels
width Number of output pixels
pitch Number of columns in the image
mask_ptr Pointer to a 16-bit filter mask
shift User-specified right shift applied to the convolution sum
Algorithm:
The natural C implementation has no restrictions. The optimized intrinsic C code has restrictions as noted in Assumptions below.
Assumptions:
  • The input and output arrays must not overlap
  • The output image array must be 32-bit aligned
  • The mask and input image array must be 16-bit aligned
  • The width parameter must be a non-zero multiple of 2
  • The image pitch must be greater than or equal to the width and a multiple of 2
  • Internal accuracy of the computations is 32 bits. To ensure correctness on 16-bit input data, the maximum permissible filter gain is 16 bits, i.e., the cumulative sum of the absolute values of the filter coefficients should not exceed 2^16 - 1.
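The assumptions above can be checked by the caller before invoking the optimized kernel. The following is a minimal sketch of such checks. All function names (mask_abs_gain, mask_gain_ok, dims_ok, is_32bit_aligned) are hypothetical and are not part of the library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: sum of absolute values of the 9 mask coefficients.
 * The worst case is 9 * 32768, which fits easily in 32 bits. */
int32_t mask_abs_gain(const int16_t *mask)
{
    int32_t gain = 0;
    for (int k = 0; k < 9; k++) {
        int32_t c = mask[k];
        gain += (c < 0) ? -c : c;
    }
    return gain;
}

/* Filter gain must not exceed 2^16 - 1. */
bool mask_gain_ok(const int16_t *mask)
{
    return mask_abs_gain(mask) <= 65535;
}

/* width: non-zero multiple of 2; pitch: multiple of 2 and >= width. */
bool dims_ok(int width, int pitch)
{
    return width > 0 && (width % 2) == 0 &&
           pitch >= width && (pitch % 2) == 0;
}

/* The output array must be 32-bit (4-byte) aligned. */
bool is_32bit_aligned(const void *p)
{
    return ((uintptr_t)p % 4u) == 0;
}
```

For example, the smoothing mask {1, 2, 1, 2, 4, 2, 1, 2, 1} has an absolute gain of 16, well inside the limit, and a shift of 4 would scale its output back to the input range.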
Implementation Notes:
  • This code is fully interruptible
  • This code is compatible with C66x processors
Benchmarks:
See IMGLIB_Test_Report.html for cycle and memory information.


Copyright 2012, Texas Instruments Incorporated