Hi,
When I try to build my kernel code using clBuildProgram, I get a build error (with many warnings) and a "Link Failed" message. The same program runs fine with the NVIDIA OpenCL SDK. Why does this happen and how can I fix it?
Thanks,
The kernel is a Coulombic Potential calculation. Nothing very fancy.... And the clBuildProgram returns a failure, with "Link Failed". Here's the code if you would like to test it.
Thanks,
Deepthi
__kernel void cenergy(int numatoms, float gridspacing, __global float * energygrid, __constant float4 atominfo[])
{
    unsigned int xindex = (((get_group_id(0)* get_local_size(0))*8)+get_local_id(0));
    unsigned int yindex = ((get_group_id(1) * get_local_size(1))+get_local_id(1));
    unsigned int outaddr = ((((get_num_groups(0)* get_local_size(0))*8)*yindex)+xindex);
    float coory = (gridspacing*yindex);
    float coorx = (gridspacing*xindex);
    float energyvalx1 = 0.0;
    float energyvalx2 = 0.0;
    float energyvalx3 = 0.0;
    float energyvalx4 = 0.0;
    float energyvalx5 = 0.0;
    float energyvalx6 = 0.0;
    float energyvalx7 = 0.0;
    float energyvalx8 = 0.0;
    float gridspacing_u = (gridspacing*16);
    int atomid;
    for (atomid=0; atomid<numatoms; atomid ++ )
    {
        float dy = (coory-atominfo[atomid].y);
        float dyz2 = ((dy*dy)+atominfo[atomid].z);
        float dx1 = (coorx-atominfo[atomid].x);
        float dx2 = (dx1+gridspacing_u);
        float dx3 = (dx2+gridspacing_u);
        float dx4 = (dx3+gridspacing_u);
        float dx5 = (dx4+gridspacing_u);
        float dx6 = (dx5+gridspacing_u);
        float dx7 = (dx6+gridspacing_u);
        float dx8 = (dx7+gridspacing_u);
        energyvalx1+=(atominfo[atomid].w*(1.0/sqrt(((dx1*dx1)+dyz2))));
        energyvalx2+=(atominfo[atomid].w*(1.0/sqrt(((dx2*dx2)+dyz2))));
        energyvalx3+=(atominfo[atomid].w*(1.0/sqrt(((dx3*dx3)+dyz2))));
        energyvalx4+=(atominfo[atomid].w*(1.0/sqrt(((dx4*dx4)+dyz2))));
        energyvalx5+=(atominfo[atomid].w*(1.0/sqrt(((dx5*dx5)+dyz2))));
        energyvalx6+=(atominfo[atomid].w*(1.0/sqrt(((dx6*dx6)+dyz2))));
        energyvalx7+=(atominfo[atomid].w*(1.0/sqrt(((dx7*dx7)+dyz2))));
        energyvalx8+=(atominfo[atomid].w*(1.0/sqrt(((dx8*dx8)+dyz2))));
    }
    energygrid[outaddr]+=energyvalx1;
    energygrid[(outaddr+(1*16))]+=energyvalx2;
    energygrid[(outaddr+(2*16))]+=energyvalx3;
    energygrid[(outaddr+(3*16))]+=energyvalx4;
    energygrid[(outaddr+(4*16))]+=energyvalx5;
    energygrid[(outaddr+(5*16))]+=energyvalx6;
    energygrid[(outaddr+(6*16))]+=energyvalx7;
    energygrid[(outaddr+(7*16))]+=energyvalx8;
}
Your kernel doesn't show any build failure on my system. Which SDK version are you using?
Hi,
I'm using SDK v2.0, Catalyst Driver 9.12 Hotfix, Windows 7, Intel Core i5 and Radeon 5870. Can you share the build code you used to compile my kernel with?
Thanks,
Deepthi
You can use any existing sample for this purpose.
Hi,
Thanks all. Fixed this error. The problem was when I was reading the kernel file, I used malloc and had an extra uninitialised character at the end of file that was causing trouble. Changing to calloc/getting rid of this extra character in malloc solved the issue.
Thanks,
Deepthi
Deepthi,
Please post the part of the code which was causing trouble.
Hi,
For kernel compilation, I was reading the header file and the kernel file and then combining the two strings. In the final string, I had (accidentally) allocated one extra character (using malloc) and never wrote to it. That final character happened to be zero when I ran on the Linux-based machine with the NVIDIA device, but when I ran on Windows with the AMD device, the uninitialised char ended up in the source string and gave me trouble.
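For anyone hitting the same thing, here is a minimal sketch of reading and joining the two source files so that every byte of the final string is written explicitly. The helper names `read_text_file` and `join_sources` are made up for illustration and are not from Deepthi's code, which was not posted.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read a whole file into a freshly allocated,
 * NUL-terminated buffer. Returns NULL on failure. On success, *len_out
 * (if non-NULL) receives the byte count, not counting the terminator. */
char *read_text_file(const char *path, size_t *len_out)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return NULL;

    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return NULL; }
    rewind(fp);

    char *buf = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (!buf) { fclose(fp); return NULL; }

    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);
    buf[got] = '\0';   /* terminate right after the bytes actually read */

    if (len_out) *len_out = got;
    return buf;
}

/* Hypothetical helper: join header and kernel text into one
 * NUL-terminated string. No byte of the result is left uninitialised. */
char *join_sources(const char *header, const char *kernel)
{
    size_t hlen = strlen(header);
    size_t klen = strlen(kernel);
    char *out = malloc(hlen + klen + 1);
    if (!out) return NULL;

    memcpy(out, header, hlen);
    memcpy(out + hlen, kernel, klen);
    out[hlen + klen] = '\0';
    return out;
}
```

The joined string can then be passed to clCreateProgramWithSource. If the lengths argument is NULL, the runtime treats each string as NUL-terminated, so a stray byte before the terminator becomes part of the program text. Passing the exact length from strlen avoids relying on the terminator at all.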
Hi Deepthi, as a suggestion, try commenting out single lines or small blocks of code and check whether the link error goes away. That way you can pinpoint the instruction that causes the problem.
Deepthi,
Could you please post your runtime (host) code as well? Make sure your kernel file is in the same directory as the .exe. Please also include your OS, CPU, GPU, SDK version and driver version.
For what it is worth, this error can also pop up (V2.01 SDK) when you've left a printf in a kernel that you're trying to compile for a GPU.
(I often insert printfs in kernels when I'm debugging -- this works as long as you are running on the CPU rather than the GPU and as long as you do all the normal things that you have to do when using printf to debug parallel programs.)
It would be really really nice if the compiler defined a few symbols that we could use in conditional compilation statements within kernels. A symbol that identified the target would be useful. A symbol that identified the version of the compiler and runtime system would be useful too.
deepthi,
I see that you've fixed your problem, but I'd also warn you that the 2.0 and 2.01 SDK have a heisenbug around handling of floating constants.
In particular, I've observed incorrect results, hangs, and other unpleasantness in some codes where I've had this kind of statement:
float foo = 0.0;
When the line was changed to
float foo = 0.0f;
the problem went away. Unfortunately, I can't build a reproducer for this, as the bug is very sensitive to its context. In the meantime, I've been making sure that I never put the compiler in the position of doing implicit (double)->(float) conversions.
matt.
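To illustrate the literal-type point above, here is a small sketch in plain host-side C (not OpenCL C, and not a reproducer for the SDK bug). It only shows that an unsuffixed literal drags a float expression up to double, while the `f` suffix keeps it in float. The function names are made up for illustration.

```c
#include <stddef.h>

/* 1.0 is a double literal, so w * 1.0 is evaluated as a double
 * and must be converted back when assigned to a float. */
size_t size_with_double_literal(float w)
{
    return sizeof(w * 1.0);
}

/* 1.0f is a float literal, so w * 1.0f stays a float
 * and no implicit conversion is needed. */
size_t size_with_float_literal(float w)
{
    return sizeof(w * 1.0f);
}
```

In the kernel posted earlier, the same reasoning would apply to the `0.0` initialisers and the `1.0/sqrt(...)` terms, which could be written as `0.0f` and `1.0f/sqrt(...)`. The OpenCL specification also lists a `-cl-single-precision-constant` build option for treating double constants as single precision. Whether it sidesteps this particular SDK bug is not something this thread confirms.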
Consider some debug code that I don't want to compile if I'm on a GPU, but do want to include if I'm on a CPU:
#ifdef OCL_CPU_DEV
if(get_global_id(0) == 42) printf("this is a debug statement v = %f\n", v);
#endif
Or what if I want to conditionalize a chunk of code based on the model?
#ifdef OCL_DEV_RV770
do_op_with_rv770_workaround(a, b,....);
#else
do_op_without_workaround(a, b, ...);
#endif
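Until the compiler defines such symbols itself, one workaround is to define them from the host through the options string of clBuildProgram, which accepts `-D name` preprocessor definitions. Below is a minimal sketch of building that string. The function `make_build_options` is hypothetical, and the `is_cpu_device` flag is assumed to come from the host code, for example by checking CL_DEVICE_TYPE via clGetDeviceInfo.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a clBuildProgram options string into out.
 * Defines OCL_CPU_DEV (the symbol used in the example above) only when
 * the target is a CPU device, then appends any extra options.
 * Returns 0 on success, -1 if the buffer is too small. */
int make_build_options(int is_cpu_device, const char *extra,
                       char *out, size_t out_size)
{
    const char *define = is_cpu_device ? "-D OCL_CPU_DEV" : "";
    const char *rest   = extra ? extra : "";
    /* Separate the two parts with a space only when both are present. */
    const char *sep    = (*define && *rest) ? " " : "";

    int n = snprintf(out, out_size, "%s%s%s", define, sep, rest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string would be passed as the options argument of clBuildProgram, once per device type, so the printf block is compiled in for the CPU build and left out for the GPU build. A model-specific symbol such as OCL_DEV_RV770 could be set the same way after inspecting the device name on the host.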