
yariv · ‎02-15-2010

Hello,

I cannot compile or run my OpenCL kernel that uses image buffers on the ATI 5870. Are there any known bugs?

I am using OpenCL 2.01 on an AMD desktop running 64-bit Windows 7.

Thanks, --Yariv

nou · ‎02-15-2010

Images are not supported currently.

karls · ‎02-17-2010

When will image support be implemented in OpenCL for GPUs?

When can we expect the release of this important feature?

genaganna · ‎02-17-2010

Originally posted by: karls When will image support be implemented in OpenCL for GPUs?

When can we expect the release of this important feature?

We can't give an exact date, but it should be in the next few months.

jcpalmer · ‎02-18-2010

I am attempting to emulate images because I need to use atomics to completely redesign my heuristic, which should improve performance by at least an order of magnitude. My 8800GTX can no longer be used. My options are:

- Use my MacBookPro. Easy since Java is my host lang, but not a 30" display.

- Buy a new Nvidia. It's a weird time to buy Nvidia. Want to wait.

- Make use of my 4890.

While checking out the 4890 option, I am sharing how I wanted to emulate images. The code works when the device actually has images, but fails on Nvidia when you force it to behave as if it does not.

The error is:

GeForce 8800 GTX: :25: error: cannot codegen this l-value expression yet

int4 charImgVec = READ_IMAGE_I_2D(charImage , (int2) (300, 0) , charImgSz );

BTW the 25 is the line #

Can someone here see if this compiles for ATI, before I bother to reconfigure my system? Thanks!

#ifdef TRUE_IMAGES
const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_NONE | CLK_FILTER_NEAREST;
#define READ_IMAGE_I_2D(image, coord, sz) read_imagei(image, sampler, coord)
#define READ_IMAGE_F_2D(image, coord, sz) read_imagef(image, sampler, coord)
#define READ_IMAGE_I_3D(image, coord, sz) read_imagei(image, sampler, coord)
#define READ_IMAGE_F_3D(image, coord, sz) read_imagef(image, sampler, coord)
#else
#define READ_IMAGE_I_2D(image, coord, sz) convert_int4 (vload4((size_t) ((coord.s0 + (coord.s1 * sz.s0)) * 4), image) )
#define READ_IMAGE_F_2D(image, coord, sz) convert_float4(vload4((size_t) ((coord.s0 + (coord.s1 * sz.s0)) * 4), image) )
#define READ_IMAGE_I_3D(image, coord, sz) convert_int4 (vload4((size_t) ((coord.s0 + (coord.s1 * sz.s0) + (coord.s2 * sz.s1 * sz.s0)) * 4), image) )
#define READ_IMAGE_F_3D(image, coord, sz) convert_float4(vload4((size_t) ((coord.s0 + (coord.s1 * sz.s0) + (coord.s2 * sz.s1 * sz.s0)) * 4), image) )
#endif
kernel void main(
#ifdef TRUE_IMAGES
    __read_only image2d_t charImage
  , __read_only image2d_t intImage
  , __read_only image3d_t float3dImage
#else
    global const char  * charImage
  , global const int   * intImage
  , global const float * float3dImage
#endif
  , global float *output) {
    int2   charImgSz     = (int2) (600, 1);
    int4   charImgVec    = READ_IMAGE_I_2D(charImage , (int2) (300, 0) , charImgSz );

    int2   intImgSz      = (int2) (8192, 2);
    int4   intImgVec     = READ_IMAGE_I_2D(intImage , (int2) (230, 1) , intImgSz );

    int4   float3dImgSz  = (int4) (2048, 376, 5, 0);
    float4 float3dImgVec = READ_IMAGE_F_3D(float3dImage, (int4) (17, 23, 3, 0), float3dImgSz);
}
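For readers following the emulation branch, here is a minimal host-side C sketch of the index arithmetic those macros rely on. The helper names are hypothetical and not part of the posted kernel. It assumes a row-major buffer of tightly packed four-channel texels. One point worth checking against that assumption: the OpenCL specification defines vload4(offset, p) as reading the four elements at p + offset * 4, so under this layout the offset would be the texel index itself. Whether the extra "* 4" in the macros is intended depends on the buffer layout, which the thread does not specify.

```c
#include <stddef.h>

/* Hypothetical helper: linear texel index of (x, y) in a row-major
   2D image that is `width` texels wide. */
size_t texel_index_2d(size_t x, size_t y, size_t width)
{
    return x + y * width;
}

/* Hypothetical helper: the same idea in 3D, where each z step skips
   one full slice of width * height texels. */
size_t texel_index_3d(size_t x, size_t y, size_t z,
                      size_t width, size_t height)
{
    return x + y * width + z * width * height;
}

/* Models the addressing of vload4(offset, p) for a float buffer:
   copies the four elements starting at p + offset * 4 into out. */
void emulated_vload4(size_t offset, const float *p, float out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = p[offset * 4 + i];
}
```

With the sizes used in the kernel, texel_index_2d(230, 1, 8192) gives 8422, and passing that value as the offset to emulated_vload4 reads elements 33688 through 33691 of the buffer.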

jcpalmer · ‎02-18-2010

Well, I tried the above code on OSX. You have never seen so many errors! Either these preprocessors are not up to the job, or I am doing something wrong.

Anyway, I went back to CUDA, got less aggressive, and it compiled, as shown below. I have not actually run a kernel yet, because I will not actually implement it this way.

I assemble my kernels on the fly, which allows me to pass fewer parameters, store the source in the Java class that makes it easiest to maintain, and obfuscate. I'll write Java functions that generate the source inline. I'll probably generate it using the # conditionals, so I do not have to know at assembly time whether a device supports images.
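The idea of keeping the # conditionals in the generated source and deciding late can be sketched as below. This is an illustration in C, not the poster's Java code, and assemble_kernel_source is a hypothetical name. In a real OpenCL host, the device_has_images flag would typically come from clGetDeviceInfo with CL_DEVICE_IMAGE_SUPPORT, and passing "-D TRUE_IMAGES" in the clBuildProgram options is an alternative to prefixing the source.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the kernel source into `out`, prefixed
   with "#define TRUE_IMAGES" when the device reports image support.
   The kernel body keeps its #ifdef TRUE_IMAGES / #else branches, so
   the same text serves both kinds of device.
   Returns 0 on success, -1 if `out` is too small. */
int assemble_kernel_source(int device_has_images, const char *kernel_body,
                           char *out, size_t out_size)
{
    const char *prefix = device_has_images ? "#define TRUE_IMAGES\n" : "";
    int written = snprintf(out, out_size, "%s%s", prefix, kernel_body);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Prefixing the define keeps the decision in one place on the host, so the generated kernel text never has to change per device.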

It would still help if someone could compile it on ATI. I am snowed in, so I need to shovel, & am probably done working today. Thanks!

const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_NONE | CLK_FILTER_NEAREST;

kernel void main(
#ifdef TRUE_IMAGES
    __read_only image2d_t charImage
  , __read_only image2d_t intImage
  , __read_only image3d_t float3dImage
#else
    global const char  * charImage
  , global const int   * intImage
  , global const float * float3dImage
#endif
  , global float *output)
{
    int2   charImgSz     = (int2) (600, 1);
    int2   charImgCoord  = (int2) (300, 0);
#ifdef TRUE_IMAGES
    int4   charImgVec = read_imagei(charImage, sampler, charImgCoord);
#else
    int4   charImgVec = convert_int4 (vload4((size_t) ((charImgCoord.s0 + (charImgCoord.s1 * charImgSz.s0)) * 4), charImage) );
#endif

    int2   intImgSz      = (int2) (8192, 2);
    int2   intImgCoord   = (int2) (230, 1);
#ifdef TRUE_IMAGES
    int4   intImgVec = read_imagei(intImage, sampler, intImgCoord);
#else
    int4   intImgVec = convert_int4 (vload4((size_t) ((intImgCoord.s0 + (intImgCoord.s1 * intImgSz.s0)) * 4), intImage) );
#endif

    int4   float3dImgSz  = (int4) (2048, 376, 5, 0);
    int4   float3dCoord  = (int4) ( 17, 23, 3, 0);
#ifdef TRUE_IMAGES
    int4   float3dImgVec = read_imagei(float3dImage, sampler, intImgCoord);
#else
    int4   float3dImgVec = vload4((size_t) ((float3dCoord.s0 + (float3dCoord.s1 * float3dImgSz.s0) + (float3dCoord.s2 * float3dImgSz.s1 * float3dImgSz.s0)) * 4), float3dImage);
#endif
}

MicahVillmow · ‎02-18-2010

jcpalmer,
These are the messages I get when I attempt to compile your kernel.

390.cl(3): warning: global variable declaration is corrected by the compiler to have addrSpace constant
const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_NONE | CLK_FILTER_NEAREST;
^

390.cl(36): error: bad coord type to opencl image op: expected int2/uint2 for image2d_t, int4/uint4 for image3d_t
int4 float3dImgVec = read_imagei(float3dImage, sampler, intImgCoord);
^

390.cl(17): warning: variable "charImgSz" was declared but never referenced
int2 charImgSz = (int2) (600, 1);
^

390.cl(25): warning: variable "intImgSz" was declared but never referenced
int2 intImgSz = (int2) (8192, 2);
^

390.cl(33): warning: variable "float3dImgSz" was declared but never referenced
int4 float3dImgSz = (int4) (2048, 376, 5, 0);
^

390.cl(34): warning: variable "float3dCoord" was declared but never referenced
int4 float3dCoord = (int4) ( 17, 23, 3, 0);
^

1 error detected in the compilation of "390.cl".

jcpalmer · ‎02-18-2010

Thanks! I am numb, but I am back.

I am looking at the results. My first thought was that, in all the switching of code back and forth between emulation and true images, I had left some invalid code on the true-image side and never tested that path again afterward. The preprocessor then made sure the compiler never even saw that code.

Sure enough, if I add this as the first line to turn images on:

#define TRUE_IMAGES

I get the same error. The second to last line should say float3dCoord, not intImgCoord.

The problem is all those warnings about stuff never referenced. That stuff IS referenced on the non-image side. It makes me wonder whether ATI's preprocessor is working right.

That is enough info for me to make the switch, which is a pain. I need to get a better picture of where ATI is anyway. Thanks again.

MicahVillmow · ‎02-18-2010

jcpalmer,
The preprocessor removes code before analysis of the code starts, so by that point nothing remains that references those variables, hence the warnings. This is the correct behavior of a compiler.
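The same effect can be reproduced in plain C. The sketch below is a hypothetical stand-in for the kernel, not the posted code: fake_read_image plays the role of read_imagei, and the 600-texel width is borrowed from the thread's charImgSz. With TRUE_IMAGES defined, the #else branch is discarded before the compiler proper runs, so width is declared but never referenced, and a compiler with warnings enabled (for example -Wall in GCC or Clang) typically reports it as unused.

```c
#define TRUE_IMAGES   /* remove this line to compile the #else branch */

/* Stand-in for read_imagei(): a real image object carries its own
   dimensions, so the caller passes only coordinates. */
int fake_read_image(const int *image, int x, int y)
{
    const int image_width = 600;
    return image[x + y * image_width];
}

int read_texel(const int *buffer)
{
    int width = 600;   /* referenced only on the #else side */
    int x = 300;
    int y = 0;
#ifdef TRUE_IMAGES
    return fake_read_image(buffer, x, y);
#else
    return buffer[x + y * width];
#endif
}
```

Removing the #define line flips the situation: the #else branch is compiled, width is referenced, and the warning goes away.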

jcpalmer · ‎02-18-2010

Micah,

But look at one of the 3 tests (one without that error). The error msg is:

390.cl(17): warning: variable "charImgSz" was declared but never referenced
int2 charImgSz = (int2) (600, 1);
^

The partial code is:

    int2   charImgSz     = (int2) (600, 1);
    int2   charImgCoord  = (int2) (300, 0);
#ifdef TRUE_IMAGES
    int4   charImgVec = read_imagei(charImage, sampler, charImgCoord);
#else
    int4   charImgVec = convert_int4 (vload4((size_t) ((charImgCoord.s0 + (charImgCoord.s1 * charImgSz.s0)) * 4), charImage) );
#endif

Now if there is no "#define TRUE_IMAGES" line, which there is not, then the "#ifdef TRUE_IMAGES" should fail and the "#else" version should be submitted to the compiler. charImgSz is in the second version.

MicahVillmow · ‎02-18-2010

jcpalmer,
Sorry, I forgot to mention that TRUE_IMAGES was defined when I found those errors as I wanted to test your image code.

bpurnomo · ‎02-18-2010

jcpalmer,

An easy way to test the kernel is to run it through Stream KernelAnalyzer (SKA). You can just copy and paste your kernel into the tool. As long as you have ATI Stream v2.01 installed (CPU mode is OK), it will work. It doesn't even require an ATI graphics card in your machine.