IMG_conv_7x7_i16s_c16s



Functions

void IMG_conv_7x7_i16s_c16s (const short *restrict imgin_ptr, short *restrict imgout_ptr, short width, short pitch, const short *restrict mask_ptr, short shift)


Function Documentation

void IMG_conv_7x7_i16s_c16s (
    const short *restrict imgin_ptr,
    short *restrict imgout_ptr,
    short width,
    short pitch,
    const short *restrict mask_ptr,
    short shift
)

Description:
The convolution kernel accepts seven rows of 'pitch' input pixels and produces one row of 'width' output pixels using the 7x7 filter mask supplied in 'mask_ptr'.
The input mask is rotated 180 degrees before the convolution sum is calculated. The sum is a point-by-point multiplication of the rotated mask with the input image: the 49 products are added together to produce a 32-bit intermediate sum. Overflow during accumulation is not prevented; the limit on filter gain listed under Assumptions is intended to keep the sum within range.
The user-defined shift value shifts the convolution sum right, down to a 16-bit range, before it is stored in the output array. The stored result is saturated to the signed 16-bit range. The mask is moved one column at a time, advancing over the image until the entire 'width' is covered.
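The steps in the description can be sketched in plain C as follows. This is an illustrative sketch, not the IMGLIB source: the name conv_7x7_ref_sketch is hypothetical, and the indexing (output pixel i reads input columns i through i + 6 of each of the seven rows) is an assumption about how the mask is anchored. The caller is assumed to size the input so that those reads stay inside the buffer and to keep the filter gain low enough that the 32-bit sum does not overflow.

```c
#include <stdint.h>

/* Hypothetical reference sketch of the documented algorithm; not TI code. */
void conv_7x7_ref_sketch(const short *restrict imgin_ptr,
                         short *restrict imgout_ptr,
                         short width, short pitch,
                         const short *restrict mask_ptr, short shift)
{
    int i, j, k;

    for (i = 0; i < width; i++) {
        int32_t sum = 0;                 /* 32-bit intermediate sum */

        for (j = 0; j < 7; j++) {        /* seven input rows */
            for (k = 0; k < 7; k++) {    /* seven columns under the mask */
                /* Rotating the 7x7 mask by 180 degrees maps element
                 * (j, k) to (6 - j, 6 - k), i.e. flat index 48 - (7j + k). */
                sum += (int32_t)imgin_ptr[j * pitch + i + k]
                     * (int32_t)mask_ptr[48 - (j * 7 + k)];
            }
        }

        /* Shift down towards 16-bit range. Right-shifting a negative
         * value is implementation-defined in C; an arithmetic shift is
         * assumed here. */
        sum >>= shift;

        /* Saturate to the signed 16-bit range before storing. */
        if (sum > 32767)  sum = 32767;
        if (sum < -32768) sum = -32768;

        imgout_ptr[i] = (short)sum;
    }
}
```

A sketch like this is mainly useful as a bit-exactness reference when testing an optimized build on a host machine; it makes no attempt to match the optimized code's performance.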
Parameters:
imgin_ptr Pointer to an input image of 16-bit pixels
imgout_ptr Pointer to an output image of 16-bit pixels
width Number of output pixels
pitch Number of columns in the image
mask_ptr Pointer to a 16-bit filter mask
shift User-specified right shift applied to the sum
Algorithm:
The natural C implementation has no restrictions. The optimized intrinsic C code has restrictions as noted in Assumptions below.
Assumptions:
  • The input and output arrays must not overlap
  • The input and output arrays must be 64-bit aligned
  • The mask array must be 16-bit aligned
  • The image pitch must be greater than or equal to the width and a multiple of 4
  • The width parameter must be a non-zero multiple of 8
  • Internal accuracy of the computations is 32 bits. To ensure correctness on 16-bit input data, the maximum permissible filter gain is 24 bits, i.e., the cumulative sum of the absolute values of the filter coefficients should not exceed 2^24 - 1.
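The assumptions above can be checked on the calling side before invoking the optimized kernel. The helper below is a hypothetical sketch (conv_7x7_args_ok is not an IMGLIB function) that tests the alignment, width, pitch, and gain conditions as listed. It does not test for overlap between the input and output arrays, since that cannot be checked portably without knowing the buffer sizes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical precondition check; not part of IMGLIB.
 * Returns 1 if the listed assumptions hold, 0 otherwise. */
int conv_7x7_args_ok(const short *imgin_ptr, const short *imgout_ptr,
                     short width, short pitch, const short *mask_ptr)
{
    long gain = 0;
    int n;

    /* Input and output arrays: 64-bit (8-byte) aligned. */
    if (((uintptr_t)imgin_ptr & 7u) != 0) return 0;
    if (((uintptr_t)imgout_ptr & 7u) != 0) return 0;

    /* Mask array: 16-bit (2-byte) aligned. */
    if (((uintptr_t)mask_ptr & 1u) != 0) return 0;

    /* Width: non-zero multiple of 8. */
    if (width <= 0 || (width % 8) != 0) return 0;

    /* Pitch: at least the width, and a multiple of 4. */
    if (pitch < width || (pitch % 4) != 0) return 0;

    /* Filter gain: sum of absolute coefficient values must not exceed
     * 2^24 - 1. With 49 coefficients of 16 bits each, this sum cannot
     * exceed 49 * 32768, so the check records the stated bound rather
     * than restricting practical masks. */
    for (n = 0; n < 49; n++) {
        gain += labs((long)mask_ptr[n]);
    }
    if (gain > (1L << 24) - 1) return 0;

    return 1;
}
```

A check like this would typically sit in a debug build or test harness rather than in the per-row processing loop.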
Implementation Notes:
  • This code is fully interruptible
  • This code is compatible with C66x processors
Benchmarks:
See IMGLIB_Test_Report.html for cycle and memory information.


Copyright 2012, Texas Instruments Incorporated