I am using int4 both to read from the input image and to write to the output image.
I have a 256*256 image.
Initially I set the global work dimensions to 256,256, but that was a bad idea: I got 16 images of size 64*64. Since the type is int4, each work item reads 4 pixels, and 64 work items in the X and Y directions can already read the entire image.
Then I changed the global work dimensions to 64*256, since 64*4 is 256 (in the X direction the int4 reads 4 pixels per work item). That was also a bad idea, as it read the image as 64*256.
I also thought about making the global work item dimension look like this: 000100040008000120001600020... I thought a work item would be executed only when its index is a multiple of 4. That is definitely wrong.
So is reading inputImage as int4* and writing back as int* not possible?
I want to read the image as int4 and write the output as int4. Is that possible?
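To make the indexing I have in mind for the 64*256 case concrete, here is a minimal plain C sketch (host-side, not OpenCL). The helper names `int4_index`, `first_pixel_column` and `count_miscovered_pixels` are made up for illustration, and it assumes one int4 per work item with 4 pixels packed along X. It only checks the arithmetic: 64 work items per row times 4 pixels should cover every pixel of the 256*256 image exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes, taken from the 256*256 image and int4 packing described above. */
enum { WIDTH = 256, HEIGHT = 256, PIXELS_PER_ITEM = 4 };

/* Index into the int4 buffer for work item (x, y): one int4 per work item. */
static size_t int4_index(int x, int y, int width)
{
    int width_by_4 = width / PIXELS_PER_ITEM;   /* int4 elements per row */
    assert(x >= 0 && x < width_by_4 && y >= 0);
    return (size_t)y * (size_t)width_by_4 + (size_t)x;
}

/* First pixel column covered by work item x; it covers first .. first + 3. */
static int first_pixel_column(int x)
{
    return x * PIXELS_PER_ITEM;
}

/* Simulates a 64 x 256 global range and counts pixels not touched exactly once. */
static int count_miscovered_pixels(void)
{
    static unsigned char hits[HEIGHT][WIDTH];
    int bad = 0;

    for (int y = 0; y < HEIGHT; ++y)
        for (int col = 0; col < WIDTH; ++col)
            hits[y][col] = 0;

    for (int y = 0; y < HEIGHT; ++y)
        for (int x = 0; x < WIDTH / PIXELS_PER_ITEM; ++x)
            for (int lane = 0; lane < PIXELS_PER_ITEM; ++lane)
                hits[y][first_pixel_column(x) + lane]++;

    for (int y = 0; y < HEIGHT; ++y)
        for (int col = 0; col < WIDTH; ++col)
            if (hits[y][col] != 1)
                ++bad;

    return bad;
}
```

If this mapping is right, `count_miscovered_pixels()` returns 0 and the last work item (63, 255) maps to int4 element 16383, so the 64*256 range itself should be enough to touch the whole image.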
/************ This code is for global work dimension 64*256 *******************/
__kernel void convolution(
    __global int4 *inputImage,   // cl_image_format channel data type CL_SIGNED_INT32; channel order CL_R
    __global int4 *outputImage,
    int inputWidth,              // 256
    int inputHeight,             // 256
    __constant int *filter,      // sobel filter
    int filterWidth)             // filterWidth = 3
{
    int x = get_global_id(0);    // one int4 (4 pixels) per work item in X
    int y = get_global_id(1);
    int4 sum = 0;
    int kx, ky;
    int widthby4 = inputWidth / 4;   // number of int4 elements per row

    // Note: x + kx steps in int4 units (4 pixels at a time),
    // and there is no bounds check at the image borders.
    for (ky = -filterWidth/2; ky <= filterWidth/2; ++ky) {
        for (kx = -filterWidth/2; kx <= filterWidth/2; ++kx) {
            sum += inputImage[(y + ky) * widthby4 + (x + kx)]
                 * filter[(ky + filterWidth/2) * filterWidth + (kx + filterWidth/2)];
        }
    }
    sum /= 9;
    outputImage[y * widthby4 + x] = sum;
}