I'm a computer science student and pretty new to GPU programming. I have an ATI / AMD 4870 GPU (I think it has no image support) and I want to compute a large number of min/max mipmaps on it.
Let's say we have a greyscale image with 4x4 pixels. We take each 2x2 area and save its maximum value into one pixel of another image. The same is done with the minimum value. So the result is two 2x2 images which contain the max / min data. This is very useful for ray tracing.
Here is a small example (blue pixels are the beginning of a 2x2 area, each square is a pixel):
The images are power of 2 images (256*256, 512*512, 1024*1024 etc.).
I'm looking for a good way to implement it. My problem is binding a kernel call to the right areas of memory and skipping the odd lines. Imagine this is our image; I want only 0, 2, 8 and 10 to be the start points of a kernel call:
| 0 | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 |
| 12 | 13 | 14 | 15 |
Any good ideas on how I can do this? In CUDA, if statements are pretty expensive. And the index of the output pixel is awkward to compute, since 2x2 pixels become one. My idea:
// dimension is the width (and also the height, since the image is square with a power-of-two side)
__kernel void min_max_mip(__global const uchar* input, uint dimension, __global uchar* output_max, __global uchar* output_min)
{
    uint position_x = whatfunction(); // placeholder: I don't know which function gives me this index
    if ((position_x % 2) == 0)
    {
        if (input[position_x] > input[position_x + 1] && input[position_x] > input[position_x + dimension + 1] && input[position_x] > input[position_x + dimension])
        {
            output_max[position_x / 2] = input[position_x];
        }
        // and so on
    }
}
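To make the index arithmetic concrete, here is a minimal sketch in plain C rather than OpenCL. It rests on one assumption: the work is laid out with one work-item per output pixel instead of one per input pixel, so the even/odd test is not needed at all. The two loops stand in for a 2D range of work-items; in a kernel, `ox` and `oy` would presumably come from `get_global_id(0)` and `get_global_id(1)`. The names `min_max_level`, `half`, `ox`, `oy` and `start` are hypothetical and only used for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: reduce a dimension x dimension greyscale image to
   (dimension/2) x (dimension/2) max and min images.
   Each (ox, oy) iteration stands in for one work-item. */
static void min_max_level(const uint8_t *input, uint32_t dimension,
                          uint8_t *output_max, uint8_t *output_min)
{
    uint32_t half = dimension / 2; /* side length of the output images */
    for (uint32_t oy = 0; oy < half; ++oy) {
        for (uint32_t ox = 0; ox < half; ++ox) {
            /* Top-left input pixel of the 2x2 block for output pixel (ox, oy). */
            uint32_t start = (2 * oy) * dimension + 2 * ox;
            uint8_t a = input[start];                 /* top-left     */
            uint8_t b = input[start + 1];             /* top-right    */
            uint8_t c = input[start + dimension];     /* bottom-left  */
            uint8_t d = input[start + dimension + 1]; /* bottom-right */

            /* Pairwise max/min, then combine the two pairs. */
            uint8_t hi_ab = a > b ? a : b, lo_ab = a < b ? a : b;
            uint8_t hi_cd = c > d ? c : d, lo_cd = c < d ? c : d;

            output_max[oy * half + ox] = hi_ab > hi_cd ? hi_ab : hi_cd;
            output_min[oy * half + ox] = lo_ab < lo_cd ? lo_ab : lo_cd;
        }
    }
}
```

For the 4x4 image above, `start` takes exactly the values 0, 2, 8 and 10, and the output index is simply `oy * half + ox`. Whether this layout is actually faster on the 4870 is something I have not measured.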