

Help! I can't get the right result.

Hi guys, I ran into a problem while developing a kernel program that reads from an image. I have confirmed where the problem is, but I don't know how to fix it. Here is the relevant part of the code:



          float scalar = read_imagef(data, dataSampler, pos).w;

          float4 result = read_imagef(transFunc, transFuncSampler, (int2)(convert_ushort_sat_rte(scalar * 65535.0f), 0)); // after debugging, I'm sure the problem is here

On the host side, the relevant C++ code is:

          cl_image_format data;

          data.image_channel_order = CL_A;

          data.image_channel_data_type = CL_UNORM_INT16;


         cl_image_format transferFunc;

         transferFunc.image_channel_order = CL_RGBA;

         transferFunc.image_channel_data_type = CL_UNORM_INT8;

Can anyone help me fix the problem? Thank you!

8 Replies

Can you explain what the expected output of your code is, and what you are actually getting?

Please post a copy of your code (as a zip file) so that we can reproduce it here.

Please include the following details as well.

1. Platform: win32 / win64 / lin32 / lin64, or something else? Win7, Vista, or Win8; similarly for Linux, your distribution

2. Version of driver

3. CPU or GPU Target?

4. CPU/GPU details of your hardware


I'm doing volume rendering just like the sample in the SDK, so the result is a 3D picture. Sorry, I cannot post my whole source code due to a secrecy agreement with my company.

My development environment is a Win7 32-bit OS with a GeForce GTX 460, and the GPU driver is ForceWare 306.97.


My understanding of images is very peripheral.

Nonetheless, consider the following snippet.

>>>> float4 result = read_imagef(transFunc, transFuncSampler, (int2)(convert_ushort_sat_rte(scalar * 65535.0f), 0));

Since you are scaling the normalized scalar float value by 65535, I hope the image "transFunc" was allocated with that much memory. Can you confirm?

Also, can you shed some light on "transFunc": how did you allocate it, and what image descriptor was used?


I'm pretty sure that I have allocated enough memory for transFunc, and the scalar value is normalized from 0 to 4095. I allocated transFunc as follows:


     cl_image_format transferFunc;

    transferFunc.image_channel_order = CL_RGBA;

    transferFunc.image_channel_data_type = CL_UNORM_INT8;

     clTransferFuncArray = clCreateImage2D(context, CL_MEM_READ_ONLY, &transferFunc,
                                           4096, 1, 0, NULL, &m_ciErrNum);


    const size_t origin[3] = {0, 0, 0};
    const size_t region[3] = {4096, 1, 1};

    m_ciErrNum = clEnqueueWriteImage(commandQueue, clTransferFuncArray, CL_FALSE,
            origin, region, 4096 * sizeof(unsigned char) * 4, 0, pcolor, 0, 0, 0);


Can you tell whether the data in clTransferFuncArray is unnormalized?

Your clTransferFuncArray seems to have a size of 4096 elements (of CL_RGBA order and CL_UNORM_INT8 data type). So the maximum position you can read from transFunc is actually 4095 (beyond that, the read is handled according to the sampler's clamp condition, IMHO).

The variable scalar is not normalized in the general sense (it is between 0 and 4095, as per you), which IMHO will mostly result in reading clamped values out of the transFunc image object. I suggest you keep scalar in the range 0 to 1 and then multiply it by 4095 to read the transFunc image properly.

Can you confirm you aim to read the transFunc image object in a coalesced manner by the different work-items of a workgroup? You can print the image indexes and check which part of the image object each work-item reads.


Sorry, I didn't post the whole code. Actually, the scalar value is between 0 and 4095; as shown above, it is normalized in 16-bit format, so I multiply by 65535.0f to read the transFunc image in the kernel.

Put another way: if I normalize the scalar value between 0 and 255 and set the image format to 8 bits, then I multiply by 255.0f to read the transFunc image in the kernel, and I get the right result.


Oh yes, I missed that. Thanks for correcting.

BTW, you have not told us: what exactly is the problem?

Is it a correctness issue or a runtime issue? Can you give more detail on the problem you are facing?

One more thing to do is to just dump all the computed index values, convert_ushort_sat_rte(scalar * 65535.0f), and see whether they are in the expected range.

Also, note that clCreateImage2D is deprecated.

Does your code run fine on a CPU target, or on any other OpenCL platform?


clCreateImage2D is deprecated as of OpenCL 1.2.

clCreateImage itself can help you do this. You may want to give it a try.

Please note that if you are looking at cross-platform development, NVIDIA has yet to release OpenCL 1.2 support.
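For reference, the same 4096x1 RGBA image can be created with the OpenCL 1.2 clCreateImage entry point via a cl_image_desc; a sketch, assuming the same format variable as the host code above (error handling omitted):

```c
cl_image_desc desc;
memset(&desc, 0, sizeof(desc));             /* zero unused fields */
desc.image_type   = CL_MEM_OBJECT_IMAGE2D;
desc.image_width  = 4096;
desc.image_height = 1;

cl_mem clTransferFuncArray = clCreateImage(context, CL_MEM_READ_ONLY,
                                           &transferFunc, &desc,
                                           NULL, &m_ciErrNum);
```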

Also, since your scalar value is normalized over 4096 levels, can't you multiply it by 4095.0f instead of 65535.0f?

Can you tell us if that helps?