Hello,
let me explain. A test case with two kernels:
col := a floating-point buffer, alloc size = 4 * sizeof(cl_float) * cnt;
buf := uchar buffer, alloc size = 3 * sizeof(cl_uchar) * cnt;
__kernel void initBuf ( __global float3 col )
{
    const size_t index = get_global_id (0);
    col[index] = 0.1 * 100.0;
}
__kernel void convertToUChar ( __constant float3 col, __global uchar buf )
{
    const size_t index = get_global_id (0);
    float3 c = col[index];
    buf[index * 3 + 0] = (uchar) (c.x);
    buf[index * 3 + 1] = (uchar) (c.y);
    buf[index * 3 + 2] = (uchar) (c.z);
}
Problem: After running "initBuf" and then "convertToUChar", the values of buf are zero. If I replace the line "col[index] = 0.1 * 100.0;" with "col[index] = 1.0 * 10.0;", the values are correct (10).
I'm running the OpenCL example on an HP nw8240 with Ubuntu 11.04 and AMD APP SDK 2.4, using the CPU.
Where is my mistake?
Of course: as you convert a floating-point number to an integer, it gets rounded down (truncated).
Originally posted by: nou Of course: as you convert a floating-point number to an integer, it gets rounded down (truncated).
Um, 10.0f --> 0????
float c = 0.1f * 100.0f; // c=10.0f
uchar d = (uchar) (c); // d = 0 ???
Why is d zero?!
I do not understand the rounding.
convert_uchar_rte(1.99f * 100.0f);
1.99 * 100.0 = 199.0, but the result from OpenCL is 100.
convert_uchar_rte(199.0f) --> 199
It does look right (assuming that you've just missed out the *s in the parameter declarations)... rte should round to nearest even, so even if it rounded the factors individually it should get 2 * 100 = 200, not 100.
Does it make a difference if you use the convert functions instead of the old-fashioned casts in your original example?
Does col contain the expected floating-point values in between calls to that and the convert kernel? You're relying on 0.1*100.0 being implicitly converted from double to float3. What happens if you map it to a float3 directly:
float a = 0.1f * 100.0f;
col[index] = (float3)(a, a, a);
Lee
Thanks LeeHowes.
It's interesting. The problem is not the rounding mode; it's the float variable.
float3 or float does not matter.
I write this in OpenCL:
__kernel void initBuf ( __global float a )
{
    a = 100.23f;
}
If I read the value back in C++, the result is 100.0f.
Originally posted by: omion I think Rick Weber is right that you need a pointer.
Ok. I use the CPU and
...
__global float3 *colors
...
colors[i].x = 2.9f;
colors[i].y = 3.49f;
colors[i].z = 10.123f * 1000.0f;
#pragma OPENCL EXTENSION cl_amd_printf : enable
printf("%e, ", colors[i].x);
printf("%e, ", colors[i].y);
printf("%e\n", colors[i].z);
Console-Output: 2,000000e+00, 3,000000e+00, 1,000000e+04
Edit: same result with a local variable:
float3 c = (float3) (2.9f, 3.49f, 10.123f * 1000.0f);
#pragma OPENCL EXTENSION cl_amd_printf : enable
printf("%e, ", c.x);
printf("%e, ", c.y);
printf("%e\n", c.z);
Console-Output: 2,000000e+00, 3,000000e+00, 1,000000e+04
Does printf detect if you're in Europe and put commas in place of decimals? That's amazing.
From what you've posted, the results don't make a whole lot of sense. Can you try using float4 instead of float3 and discard the s3 entry?
Originally posted by: rick.weber ...and discard the s3 entry?
What does that mean?
float and float4 --> same results
Instead of using float3, use a float4 and throw away the 'w' entry. Both types take the same amount of storage space, so I honestly don't know what the point of float3 even is.
Originally posted by: rick.weber Both types take the same amount of storage space, so I honestly don't know what the point of float3 even is.
Ok. I think the reason for my problem is the Intel CPU. Intel CPUs are not officially supported 😞
I doubt that's it since AMD and Intel processors both have the x86_64 ISA and I've been using Intel processors with APP for some time now with no problems. Maybe there's some weird SSE instructions being emitted, but I seriously doubt it. Are you sure it's not something else screwing things up?
I just ran this kernel (verbatim) with a 1x1 grid on a Nehalem and got:
2.900000e+00 3.490000e+00 1.012300e+04
#pragma OPENCL EXTENSION cl_amd_printf : enable
__kernel void test()
{
    float3 c = (float3)(2.9f, 3.49f, 10.123 * 1000.0f);
    printf("%e %e %e\n", c.x, c.y, c.z);
}
I changed 10.123 to 10.123f
Console-Output: 2,000000e+00, 3,000000e+00, 1,000000e+04
I did the same and I still get the correct result.
This thread appears to be stuck here. I request both of you to post your system details:
CPU, GPU, SDK, driver, OS.
Computer: HP nw8240 (Laptop)
CPU: Intel Pentium M (2.13 GHz, with SSE2)
GPU: ATI FireGL V5000 Mobility (not used)
SDK: AMD APP SDK 2.4 (only the CPU is detected)
OS: Linux, Ubuntu 11.04 (current updates) 32bit
graphics driver: default OpenCL driver of Ubuntu (no ATI or other special open-source driver)
Siassel,
If only the CPU is detected then it could be a driver installation issue. I am also able to get 2.900000e+00 3.490000e+00 1.012300e+04 from the kernel you posted, on Cayman and an AMD CPU.
Other problems might be:
Your CPU seems quite old. Are you sure it meets the system requirements mentioned for the AMD APP SDK? Can you try a newer CPU?
And Ubuntu 11.04 is not a supported operating system.
Originally posted by: himanshu.gautam Your CPU seems quite old. Are you sure it meets the system requirements mentioned for the AMD APP SDK? Can you try a newer CPU? And Ubuntu 11.04 is not a supported operating system.
I have three computers (the CPUs are 5, 2 and 0.2 years old, all Intel) and one laptop available, and Ubuntu 11.04 is installed on all of them. On the three computers the AMD APP SDK works fine; only on the laptop does it not work.
Why? The CPU is old, that is true, but that should not matter: the CPU supports SSE2.
So you mean that printf behaves sanely on the other Intel systems you have?
Well, I don't think even Intel supports its OpenCL SDK on that system. The AMD APP SDK should work if the CPU is SSE2-capable, but AMD does not officially support Intel devices.
Maybe someone here can give some insight on this.
I don't think it's the OS, since I ran it on Ubuntu 11.04 and got the correct results.
I take it this is a test case you've created, as the __global float3 types should be __global float3*. As presented, I don't think this will compile. Are you doing this on a CPU or GPU? If it's on the GPU, you have to turn on byte addressing to store sizes shorter than an int to global memory. If you're on the CPU, you can use printf() in your kernel to print the values of col.xyz so you can see that your values are what they should be.
The value 0.1 can't be exactly represented in floating point, but it shouldn't matter, as compilers usually fold literal operations at compile time (e.g. it will compute 0.1 * 100.0, whose product rounds back to exactly 10.0, and put 10.0f on the right-hand side of col[index]).