Hello,
I'm trying to port CUDA code to OpenCL. I'm using "AMD-APP-SDK-v2.8-lnx32" to compile my host code. As far as I understand, I have made the changes needed to run it on OpenCL. The host code compiles fine, but when I run the resulting binary, it tries to compile the kernel code and fails with:
ERROR: Failed to build executable
"/tmp/OCLoPv5RX.cl", line 1: error: unrecognized token
�$.
^
"/tmp/OCLoPv5RX.cl", line 1: error: expected a declaration
�$.
^
2 errors detected in the compilation of "/tmp/OCLoPv5RX.cl".
Frontend phase failed compilation.
How do I debug this type of error? Any pointers? I'm trying to learn OpenCL now, please help.
The changes I made from CUDA to OpenCL:
CUDA
__global__ void draw_gpu(unsigned char *buf, scene_gpu *myScene)
OpenCL
__kernel void draw_gpu(__global unsigned char *buf,__global scene_gpu *myScene)
CUDA
int j= blockIdx.x;
int i= threadIdx.x;
OpenCL
int j= get_group_id(0);
int i= get_local_id(0);
CUDA
__device__ bool hitSphere_d(const ray &r, const sphere &s, float &t)
OpenCL
bool hitSphere_d(const ray &r, const sphere &s, float &t)
CUDA
__device__ float min_d (float a, float b)
OpenCL
float min_d (float a, float b)
Please note that the CUDA versions are __device__-qualified functions, i.e. they are called on the device (GPU) from the kernel function. I read that in OpenCL no extra qualifier is necessary for such functions. Is this fine?
I have attached the CUDA kernel and the OpenCL kernel code. Please have a look and suggest any changes.
The attached kernel code uses a lot of structures, which must be defined on the kernel side. Please find attached the build log I got from Kernel Analyzer.
As for the errors you reported, I have seen such errors when there is a hidden special symbol in the kernel file. Creating a fresh kernel file will probably help.
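One way to check for the hidden symbols mentioned above is to scan the kernel source for bytes outside plain ASCII. The sketch below is only an illustration: findSuspiciousBytes is a hypothetical helper name, not part of the AMD APP SDK or of OpenCL, and it assumes the kernel file is meant to be pure ASCII (a non-ASCII character inside a comment would be flagged as well).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the offsets of all bytes that are neither
// printable ASCII nor tab, newline, or carriage return.
std::vector<std::size_t> findSuspiciousBytes(const std::string &source)
{
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < source.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(source[i]);
        const bool printable = (c >= 0x20 && c <= 0x7E);
        const bool whitespace = (c == '\t' || c == '\n' || c == '\r');
        if (!printable && !whitespace)
            offsets.push_back(i);
    }
    return offsets;
}
```

Run it on the exact string that the host code passes to the OpenCL compiler. A UTF-8 byte order mark at the start of the file, for example, is reported as offsets 0, 1 and 2.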
Hi Himanshu,
I tried fixing these issues, but I still see a build failure.
I tried installing AMD Kernel Analyzer on Windows, but I couldn't install it.
Here is the build log
err code: -44
ERROR: Failed to build executable
len: 0
--- Build log ---
Following code was complied on CPU which has 0 compute units !
I don't know why I'm getting 0 compute units; before these changes I used to get 4!
I'm also trying to capture the error log from the function:
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
if (err != CL_SUCCESS)
{
    size_t len = 0;
    char buffer[32048];
    printf("err code: %d\n", err);
    printf("ERROR: Failed to build executable \n ");
    clGetProgramBuildInfo(program, deviceid, CL_PROGRAM_BUILD_LOG, sizeof(buffer), buffer, &len);
    buffer[sizeof(buffer) - 1] = '\0'; // last valid index; buffer[32048] would be out of bounds
    printf("len: %zu\n", len);         // len is a size_t, so %zu rather than %d
    cout << "--- Build log ---\n " << buffer << endl;
    //return FAILURE;
}
I don't see any error messages, yet my build still fails. Could you please check?
Also, is there a better way to debug OpenCL kernel functions on Linux? This is very messy.
I'm attaching the updated files.
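The numeric error code in the log above is easier to interpret once it is translated to its symbolic name. The sketch below is a hypothetical lookup helper (clErrorName is an illustrative name, not an OpenCL function); the numeric values are the ones defined in the Khronos CL/cl.h header, and only a subset is listed.

```cpp
#include <string>

// Hypothetical helper: maps a subset of OpenCL error codes to their names.
// The numeric values are those defined in the Khronos CL/cl.h header.
std::string clErrorName(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -33: return "CL_INVALID_DEVICE";
    case -42: return "CL_INVALID_BINARY";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    default:  return "unknown OpenCL error " + std::to_string(err);
    }
}
```

Code -44 is CL_INVALID_PROGRAM, which clBuildProgram returns when the program argument is not a valid program object. A kernel that merely fails to compile normally yields -11 (CL_BUILD_PROGRAM_FAILURE), so -44 together with an empty build log suggests looking at the host code that creates the program rather than at the kernel source.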
There must be something odd with your host code. Maybe you can try to use an SDK example to load your kernel file? When I put your .cl file as a kernel for my programs, I see all the compile errors (the first is
line 11: error:
identifier "point" is undefined
point pos;
^
)
Another idea would be to print the program source (with printf or cout) to check whether your .cl file was correctly loaded into memory.
And finally there's a mismatch in the devices of your clBuildProgram and clGetProgramBuildInfo calls. I'm not sure if that matters, but I use
clBuildProgram(program, 1, &devices[devnumber], ...
...
clGetProgramBuildInfo (program, devices[devnumber], ...
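The suggestion to print the program source can be sketched as a small loader that fails loudly when the file cannot be read. This is only an illustration, not the poster's host code: loadKernelSource is a hypothetical helper name, and the file name used with it is whatever .cl file the host program is supposed to load.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a whole kernel file into a string and throws
// if the file cannot be opened or is empty.
std::string loadKernelSource(const std::string &path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open kernel file: " + path);

    // Read every byte from the stream into the string.
    std::string source((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
    if (source.empty())
        throw std::runtime_error("kernel file is empty: " + path);
    return source;
}
```

Printing the returned string and its size() before handing source.c_str() to clCreateProgramWithSource shows right away whether the compiler is receiving the intended kernel text or uninitialized memory.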
Bingo! It is working now. I was able to compile successfully, and my output matches the CUDA version.
Thanks, Himanshu and Bdot, for your suggestions.