IMG_histogram_8


Detailed Description


Functions

void IMG_histogram_8 (const unsigned char *restrict image, int n, short accumulate, short *restrict t_hist, short *restrict hist)


Function Documentation

void IMG_histogram_8 ( const unsigned char *restrict  image,
int  n,
short  accumulate,
short *restrict  t_hist,
short *restrict  hist 
)

Description:
This routine computes a histogram of an array of n 8-bit inputs. It returns a 256-bin histogram at 16-bit precision. Depending on the 'accumulate' control, the new counts are either added to or subtracted from an existing histogram. The implementation requires temporary storage for four 256-bin histograms, which are summed to produce the final result.
Parameters:
image Input image pointer containing "n" unsigned 8-bit pixels
n Size of image in pixels
accumulate Control to add to or subtract from the running histogram. Only the values 1 (ADD) and -1 (SUBTRACT) are defined.
t_hist Scratch buffer for temporary histogram storage (four 256-bin histograms of 16-bit entries: 1024 halfwords, i.e. 2048 bytes)
hist Running histogram bins (256)
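As a point of comparison, the behaviour described above can be modelled in plain C without the scratch buffer. This is a minimal sketch and not the optimized library code: the name histogram_8_model is hypothetical, and the model ignores t_hist and all alignment and size restrictions.

```c
#include <assert.h>

/* Hypothetical reference model (not the TI implementation):
 * adds or subtracts one count per pixel directly in the running histogram. */
static void histogram_8_model(const unsigned char *image, int n,
                              short accumulate, short *hist)
{
    /* The control is only defined for 1 (ADD) and -1 (SUBTRACT). */
    assert(accumulate == 1 || accumulate == -1);

    for (int i = 0; i < n; i++) {
        hist[image[i]] = (short)(hist[image[i]] + accumulate);
    }
}
```

Calling the model with accumulate = 1 and then again with accumulate = -1 on the same image leaves hist unchanged, which is a convenient sanity check when comparing against the optimized routine.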
Algorithm:
This code operates on four interleaved histograms. The loop is divided into two halves. The even half operates on even words of pixels and the odd half operates on odd words. Both halves use the same four histograms. This introduces a memory dependency which would ordinarily degrade performance. To break the memory dependencies, the two halves forward their results to each other. Exact memory access ordering obviates the need to predicate stores.
The algorithm is ordered as follows:
  1. Load from histogram for even half
  2. Store odd_bin to histogram for odd half (previous iteration)
  3. IF data_even == previous data_odd THEN increment even_bin by 2 ELSE increment even_bin by 1, forward to odd
  4. Load from histogram for odd half (current iteration)
  5. Store even_bin to histogram for even half
  6. IF data_odd == previous data_even THEN increment odd_bin by 2 ELSE increment odd_bin by 1, forward to even
  7. Repeat from 1.
With this particular ordering, forwarding is necessary between the even and odd halves when pixels in adjacent halves fall in the same bin. The store is never predicated and occurs speculatively: when such a collision happens, the stored value is overwritten by the next store, which contains the extra forwarded count.
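The ordering of steps 1 to 7 can be illustrated with a simplified, single-lane model in plain C. This is only a sketch under stated assumptions: it uses one histogram instead of four, treats even-indexed bytes as the "even half" and odd-indexed bytes as the "odd half", and the name histogram_forwarding_model is hypothetical. The first iteration has no pending odd result, so the model guards that one store; the real routine presumably handles this in its loop prolog.

```c
#include <assert.h>

/* Hypothetical single-lane model of the documented load/store ordering.
 * Loads are issued before the other half's store, so a loaded value may be
 * stale by one count; the "+2" forwarding compensates for that. */
static void histogram_forwarding_model(const unsigned char *data, int n,
                                       short *hist)
{
    assert(n > 0 && n % 2 == 0);

    int prev_odd = -1;   /* no odd-half result pending before the first pass */
    short odd_bin = 0;

    for (int i = 0; i < n; i += 2) {
        int data_even = data[i];
        int data_odd  = data[i + 1];

        /* 1. Load for the even half (may miss the pending odd store). */
        short even_bin = hist[data_even];

        /* 2. Store odd_bin from the previous iteration. */
        if (prev_odd >= 0) {
            hist[prev_odd] = odd_bin;
        }

        /* 3. Forwarding: add 2 if the pending odd pixel hit the same bin. */
        even_bin = (short)(even_bin + ((data_even == prev_odd) ? 2 : 1));

        /* 4. Load for the odd half (may miss the even store in step 5). */
        odd_bin = hist[data_odd];

        /* 5. Store even_bin; overwrites the step-2 store on a collision. */
        hist[data_even] = even_bin;

        /* 6. Forwarding: add 2 if the even pixel hit the same bin. */
        odd_bin = (short)(odd_bin + ((data_odd == data_even) ? 2 : 1));

        prev_odd = data_odd;
    }

    /* Epilogue: flush the last pending odd-half result. */
    hist[prev_odd] = odd_bin;
}
```

Note that the step-2 store is correct on its own when the bins differ, and is simply overwritten in step 5 when they coincide, which is why no comparison is needed to decide whether to store.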
The four scratch histograms are interleaved, with consecutive bins of the same histogram spaced four halfwords apart and each histogram starting in a different memory bank. This allows the four histogram accesses to proceed in any order without bank conflicts. The diagram below illustrates the layout (addresses are halfword offsets):
         0       1       2       3       4       5       6   ...        
     | hst 0 | hst 1 | hst 2 | hst 3 | hst 0 | hst 1 | ...   ...        
     | bin 0 | bin 0 | bin 0 | bin 0 | bin 1 | bin 1 | ...   ...        
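The interleaved layout in the diagram corresponds to the index bin * 4 + histogram. The sketch below shows that indexing and the final summation in plain C. It is an illustration, not the library code: the names t_hist_index and histogram_8_interleaved are hypothetical, the assignment of pixel i to histogram i mod 4 is an assumption, and the caller is assumed to have zeroed t_hist beforehand.

```c
#include <assert.h>

enum { NUM_BINS = 256, NUM_HISTS = 4 };

/* Halfword offset of a bin inside the interleaved scratch buffer. */
static int t_hist_index(int hist_id, int bin)
{
    return bin * NUM_HISTS + hist_id;
}

/* Hypothetical model: count into four interleaved histograms, then sum
 * them into the running histogram with the requested sign. */
static void histogram_8_interleaved(const unsigned char *image, int n,
                                    short accumulate,
                                    short *t_hist, short *hist)
{
    assert(n % NUM_HISTS == 0);
    assert(accumulate == 1 || accumulate == -1);

    /* Assumption: pixel i is counted in scratch histogram (i mod 4). */
    for (int i = 0; i < n; i++) {
        t_hist[t_hist_index(i % NUM_HISTS, image[i])]++;
    }

    /* Sum the four partial counts for each bin into the running histogram. */
    for (int bin = 0; bin < NUM_BINS; bin++) {
        int sum = 0;
        for (int h = 0; h < NUM_HISTS; h++) {
            sum += t_hist[t_hist_index(h, bin)];
        }
        hist[bin] = (short)(hist[bin] + accumulate * sum);
    }
}
```

With this indexing, t_hist needs NUM_BINS * NUM_HISTS = 1024 halfword entries.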
   
Assumptions:
  • The temporary array, t_hist, must be initialized to zero before the first call to the routine (for "accumulate").
  • The input array of image data is aligned on a 4-byte boundary.
  • The number of pixels is a non-zero multiple of 8.
  • The maximum count in any bin is 32767.
  • The maximum n is 262143.
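A caller may want to check the argument-related assumptions before invoking the routine. The helper below is a hypothetical sketch (histogram_8_args_ok is not part of the library) covering alignment, the size rules, and the defined 'accumulate' values. It does not check the contents of t_hist or the per-bin count limit. Note that the multiple-of-8 rule and the stated maximum together imply a largest usable n of 262136.

```c
#include <stdint.h>

/* Hypothetical pre-call check of the documented argument assumptions.
 * Returns 1 if the arguments satisfy them, 0 otherwise. */
static int histogram_8_args_ok(const unsigned char *image, int n,
                               short accumulate)
{
    /* Input image must be aligned on a 4-byte boundary. */
    if (((uintptr_t)image & 3u) != 0) {
        return 0;
    }
    /* Pixel count must be a non-zero multiple of 8. */
    if (n <= 0 || (n % 8) != 0) {
        return 0;
    }
    /* Documented maximum n. */
    if (n > 262143) {
        return 0;
    }
    /* Control is only defined for 1 (ADD) and -1 (SUBTRACT). */
    if (accumulate != 1 && accumulate != -1) {
        return 0;
    }
    return 1;
}
```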
Implementation Notes:
  • This code is fully interruptible
  • This code is compatible with C66x processors
  • No bank conflicts should occur in this code
Benchmarks:
See IMGLIB_Test_Report.html for cycle and memory information.


Copyright 2012, Texas Instruments Incorporated