Can I store a 1D array in image memory? If so, can you give an example that shows how it's done?
You can use clCreateImage2D and set the height to 1.
But how do I read it from the kernel? The function read_imagef only takes 2D (int2) coordinates for a 2D image.
Can I use 0 for y?
Yes. If you set the height to 1, the only valid row is y = 0, so inside the kernel you can read element x with the coordinate (int2)(x, 0).
Originally posted by: himanshu.gautam you can use clCreateImage2D and set height as 1.
Doing this alone will only let you store 8192 elements, even on high-end GPUs, because that is typically the maximum 2D image width (CL_DEVICE_IMAGE2D_MAX_WIDTH) they report. If you need more elements, use modular arithmetic to map the 1D coordinate onto 2D. clUtil does exactly this:
#define image1d_t image2d_t

float4 read_1Dimagef(read_only image1d_t image, sampler_t sampler, int coord)
{
    int2 imageDim = get_image_dim(image);
    int2 sampleCoord;

    sampleCoord.x = coord % imageDim.x;  // column within the row
    sampleCoord.y = coord / imageDim.x;  // row

    return read_imagef(image, sampler, sampleCoord);
}
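For the host side, here is a minimal sketch of how the 2D image could be sized so the mapping above covers n texels. The image_shape struct and shape_for_texels function are hypothetical names made up for this example, not part of clUtil, and max_width stands in for whatever the device reports for CL_DEVICE_IMAGE2D_MAX_WIDTH.

```c
#include <stddef.h>

/* Hypothetical helper type: the shape of a 2D image holding n texels. */
typedef struct {
    size_t width;
    size_t height;
} image_shape;

/* max_width is an assumption: the value the host would query with
   clGetDeviceInfo(..., CL_DEVICE_IMAGE2D_MAX_WIDTH, ...). */
static image_shape shape_for_texels(size_t n, size_t max_width)
{
    image_shape s;

    /* Use a single row if everything fits, otherwise full-width rows. */
    s.width = (n < max_width) ? n : max_width;
    if (s.width == 0) {
        s.width = 1;                          /* avoid a zero-sized image */
    }

    /* Round up so the last, partially filled row is still allocated. */
    s.height = (n + s.width - 1) / s.width;
    if (s.height == 0) {
        s.height = 1;
    }
    return s;
}
```

With max_width = 8192, 8193 texels need a shape of 8192 x 2, and the second row is almost entirely padding that the kernel should never index.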
Thanks for explaining this workaround. That should be interesting to many here.
There are a few more restrictions with this approach. Namely, you can't guarantee that clamping behaves as it would for a true 1D image, though I don't know how often people using 1D images will need this. Also, using typedef instead of #define gives you type safety, but it broke NVIDIA's compiler (at least when I tested it).
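To illustrate the clamping caveat: an out-of-range 1D coordinate can still land on a valid 2D texel (for example in the padding of the last row), so the sampler never sees it as out of range. One possible workaround, sketched here in plain C, is to clamp the linear index yourself before mapping it to 2D. clamp_index_1d is a hypothetical name, and n (the number of valid texels, assumed to be at least 1) would have to be passed to the kernel separately.

```c
/* Hypothetical helper: clamp a linear texel index into [0, n - 1],
   emulating what clamp-to-edge addressing would do on a true 1D image.
   Assumes n >= 1. */
static int clamp_index_1d(int coord, int n)
{
    if (coord < 0) {
        return 0;
    }
    if (coord >= n) {
        return n - 1;
    }
    return coord;
}
```

For instance, with n = 10000 texels in an image 8192 wide, index 10000 would otherwise map to (1808, 1), which lies inside the 2D image, so the sampler would return whatever padding is stored there.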
There's room for optimization: if you always make the width 8192 and allocate the height accordingly, then you can replace imageDim.x with a constant power of 2, which allows the % and / to be replaced by & and >>.
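A small sketch of that equivalence in plain C, assuming a fixed width of 8192 (2^13) and non-negative indices. The names ROW_SHIFT, ROW_WIDTH, ROW_MASK, coord2, map_divmod and map_bitwise are made up for this illustration.

```c
/* Assumed fixed row width of 8192 texels, i.e. 2^13. */
enum {
    ROW_SHIFT = 13,
    ROW_WIDTH = 1 << ROW_SHIFT,
    ROW_MASK  = ROW_WIDTH - 1
};

typedef struct {
    int x;
    int y;
} coord2;

/* General mapping using modulo and division. */
static coord2 map_divmod(int i)
{
    coord2 c;
    c.x = i % ROW_WIDTH;
    c.y = i / ROW_WIDTH;
    return c;
}

/* Same mapping with bitwise operations; valid for i >= 0. */
static coord2 map_bitwise(int i)
{
    coord2 c;
    c.x = i & ROW_MASK;    /* low 13 bits: column */
    c.y = i >> ROW_SHIFT;  /* remaining bits: row */
    return c;
}
```

Both functions give the same result for every non-negative index, so the bitwise version can stand in for the modulo version whenever the width is known to be a power of 2.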
Hello;
It is possible to map image2d to 1D vectors, as in the attached code, which assumes 8192 elements per row.
As mentioned by rick.weber, powers of 2 make the code much faster because you can use bitwise operations (>> and &).
Please allow me to suggest this topic: http://www.cmsoft.com.br/index.php?option=com_content&view=category&layout=blog&id=115&Itemid=172
float ReadVecFromImg(int ind, __read_only image2d_t img)
{
    const sampler_t smp = CLK_NORMALIZED_COORDS_FALSE | // Natural coordinates
                          CLK_ADDRESS_CLAMP |           // Clamp to zeros
                          CLK_FILTER_NEAREST;           // Don't interpolate

    if (ind < 0) return 0;

    // Divide desired position by 4 because there are 4 components per pixel
    int imgPos = ind >> 2;

    // Assumes 8192 pixels per row, filled row by row:
    // the low 13 bits select the column (x), the remaining bits the row (y)
    int2 coords;
    coords.x = imgPos & 0x1fff;
    coords.y = imgPos >> 13;

    // Reads the float4
    float4 temp = read_imagef(img, smp, coords);

    // Computes the remainder of ind / 4 to check if the function
    // should return the x, y, z or w component
    imgPos = ind & 0x0003;
    if (imgPos < 2)
    {
        if (imgPos == 0) return temp.x;
        else return temp.y;
    }
    else
    {
        if (imgPos == 2) return temp.z;
        else return temp.w;
    }
}
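For checking the index arithmetic on the host, here is a hedged sketch in plain C of how a float index breaks down into a pixel position and a component. It assumes an RGBA float image that is 8192 pixels wide and filled row by row. float_slot, locate_float and texels_for_floats are hypothetical names, not part of the code linked above.

```c
/* Hypothetical helper type: where one float lives inside the image. */
typedef struct {
    int x;          /* pixel column */
    int y;          /* pixel row */
    int component;  /* 0 = x, 1 = y, 2 = z, 3 = w */
} float_slot;

/* Assumes 4 floats per pixel and 8192 pixels per row; ind must be >= 0. */
static float_slot locate_float(int ind)
{
    float_slot s;
    int texel = ind >> 2;    /* four floats per RGBA pixel */

    s.component = ind & 3;   /* remainder of ind / 4 */
    s.x = texel & 0x1fff;    /* column within a row of 8192 pixels */
    s.y = texel >> 13;       /* row */
    return s;
}

/* Number of RGBA pixels needed to hold n floats, rounded up. */
static int texels_for_floats(int n)
{
    return (n + 3) / 4;
}
```

Under these assumptions, float 32771 sits in the w component of pixel (0, 1), and a vector of 5 floats needs 2 pixels, with the last 3 components left as padding.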