Facing problems in porting CUDA code to OpenCL


    I'm trying to port CUDA code to OpenCL. I'm using "AMD-APP-SDK-v2.8-lnx32" to compile my host code. I have made the changes that, as far as I understand, are needed to run it on OpenCL. The host code compiles fine, but when I run the resulting binary, it tries to compile the kernel code and fails with:

ERROR: Failed to build executable
"/tmp/", line 1: error: unrecognized token

"/tmp/", line 1: error: expected a declaration

2 errors detected in the compilation of "/tmp/".

Frontend phase failed compilation.

How do I debug this type of error? Any pointers? I'm trying to learn OpenCL now, please help.

The changes I made from CUDA to OpenCL (in each pair, the CUDA version comes first and the OpenCL version second):


__global__ void draw_gpu(unsigned char *buf, scene_gpu *myScene)


__kernel void draw_gpu(__global unsigned char *buf,__global scene_gpu *myScene)


    int j= blockIdx.x;

    int i= threadIdx.x;


    int j= get_group_id(0);

    int i= get_local_id(0);
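
One thing worth double-checking alongside this index mapping is the launch configuration on the host side. CUDA describes a launch as a number of blocks and a number of threads per block, whereas clEnqueueNDRangeKernel takes the total number of work-items (global size) plus the work-group size (local size). Below is a minimal sketch of that arithmetic in plain C++ with no OpenCL dependency. The struct and function names are made up for illustration and are not part of any API.

```cpp
#include <cstddef>

// Hypothetical helper type: a 1-D OpenCL launch configuration.
struct NDRange1D {
    std::size_t global_size;  // total work-items (global_work_size)
    std::size_t local_size;   // work-items per work-group (local_work_size)
};

// A CUDA launch <<<grid_dim, block_dim>>> runs grid_dim * block_dim threads,
// so the OpenCL global size is the product, not the block count.
NDRange1D from_cuda_launch(std::size_t grid_dim, std::size_t block_dim) {
    return NDRange1D{grid_dim * block_dim, block_dim};
}

// With a zero global offset, a flat global id splits into the values that
// get_group_id(0) and get_local_id(0) would report for that work-item.
std::size_t group_id_of(std::size_t global_id, const NDRange1D& range) {
    return global_id / range.local_size;
}

std::size_t local_id_of(std::size_t global_id, const NDRange1D& range) {
    return global_id % range.local_size;
}
```

With the global size set to the product, j = get_group_id(0) and i = get_local_id(0) take the same values that blockIdx.x and threadIdx.x had in the CUDA version.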


__device__ bool hitSphere_d(const ray &r, const sphere &s, float &t)


bool hitSphere_d(const ray &r, const sphere &s, float &t)


__device__ float min_d (float a, float b)


float min_d (float a, float b)

Please note that in CUDA these were __device__-qualified functions, i.e. functions called on the device (GPU) from the kernel function. From what I have read, OpenCL does not need an extra qualifier for such functions. Is this fine?

I have attached the CUDA kernel and OpenCL kernel code; please have a look and suggest any changes.
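
On the qualifier question: helper functions in OpenCL C indeed take no extra qualifier. A separate point to check is that OpenCL C 1.x is based on C99 and has no C++ references, so parameters such as `const ray &r` and `float &t` usually have to become pointers. The sketch below shows the shape of that change in C-style code that also compiles as C++. The `vec3`, `ray` and `sphere` layouts and the intersection body are assumptions for illustration only, since the attached kernel is not reproduced in this thread.

```cpp
#include <cmath>

// Hypothetical layouts; the real structs are in the attached kernel file.
struct vec3   { float x, y, z; };
struct ray    { vec3 origin; vec3 dir; };
struct sphere { vec3 center; float radius; };

static float dot3(const vec3* a, const vec3* b) {
    return a->x * b->x + a->y * b->y + a->z * b->z;
}

// Pointer-based signature replacing (const ray &r, const sphere &s, float &t).
// Assumes r->dir is non-zero. On a hit, writes the nearest positive distance
// along the ray to *t and returns true; otherwise leaves *t untouched.
bool hitSphere_d(const ray* r, const sphere* s, float* t) {
    vec3 oc = { r->origin.x - s->center.x,
                r->origin.y - s->center.y,
                r->origin.z - s->center.z };
    float a = dot3(&r->dir, &r->dir);
    float b = 2.0f * dot3(&oc, &r->dir);
    float c = dot3(&oc, &oc) - s->radius * s->radius;
    float disc = b * b - 4.0f * a * c;
    if (disc < 0.0f) return false;       // ray misses the sphere
    float root = std::sqrt(disc);        // in OpenCL C this would be sqrt()
    float t0 = (-b - root) / (2.0f * a);
    float t1 = (-b + root) / (2.0f * a);
    float hit = (t0 > 0.0f) ? t0 : t1;   // prefer the nearer intersection
    if (hit <= 0.0f) return false;       // sphere is behind the ray origin
    *t = hit;
    return true;
}
```

Call sites then pass addresses, for example `hitSphere_d(&r, &s, &t)`. In a real kernel, a pointer into a `__global` buffer also needs the matching address space qualifier on the parameter.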

4 Replies

The attached kernel code uses a lot of structures, which must be defined on the kernel side. Please find attached the build log I got from Kernel Analyzer.

As for the errors you reported, I have seen such errors when there is a hidden special character in the kernel file. Creating a fresh kernel file will probably help.
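
One way to check for hidden characters without retyping the file is to scan the kernel source for bytes outside printable ASCII. An "unrecognized token" on line 1 would be consistent with something like a UTF-8 byte order mark at the very start of the file, though that is only a guess here. A minimal sketch, with a hypothetical function name:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the byte offsets of characters that are neither printable ASCII
// nor tab / line feed / carriage return. A UTF-8 byte order mark, for
// example, shows up as offsets 0, 1 and 2.
std::vector<std::size_t> find_suspicious_bytes(const std::string& source) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < source.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(source[i]);
        const bool whitespace = (c == '\t' || c == '\n' || c == '\r');
        const bool printable = (c >= 0x20 && c <= 0x7E);
        if (!whitespace && !printable) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

Non-ASCII text inside comments is flagged as well, so a non-empty result is a hint to inspect those offsets rather than proof of a problem.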

Hi Himanshu,

    I tried fixing these issues, but I still see a build failure.

I tried installing AMD Kernel Analyzer on Windows, but I couldn't install it.

Here is the build log:

err code: -44

ERROR: Failed to build executable

len: 0

--- Build log ---

Following code was complied on CPU which has 0 compute units !

I don't know why I'm getting 0 compute units; before these changes, I used to get 4!

I'm also trying to capture the error log from the function:

err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
if (err != CL_SUCCESS)
{
    size_t len = 0;
    char buffer[32048] = "";
    printf("err code: %d\n", err);
    printf("ERROR: Failed to build executable\n");
    clGetProgramBuildInfo(program, deviceid, CL_PROGRAM_BUILD_LOG, sizeof(buffer), buffer, &len);
    // Terminate at the last valid index; buffer[32048] would be out of bounds.
    buffer[sizeof(buffer) - 1] = '\0';
    printf("len: %zu\n", len);  // size_t needs %zu, not %d
    cout << "--- Build log ---\n " << buffer << endl;
    //return FAILURE;
}


I don't see any error messages, yet my build still fails. Could you please check?

Also, is there a better way on Linux to debug OpenCL kernel functions? This is very messy.

Attaching the updated files herewith


There must be something odd with your host code. Maybe you can try using an SDK example to load your kernel file? When I use your .cl file as the kernel for my programs, I see all the compile errors. The first is:

line 11: error:

          identifier "point" is undefined

        point pos;



Another idea would be to printf (or cout) the program source, in order to check whether your program correctly loaded the .cl file into memory.
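
A minimal sketch of such a check, assuming the host code reads the kernel from a file into a string before creating the program. The function name is hypothetical; it fails loudly when the file cannot be opened or is empty, instead of silently handing an empty source to the compiler.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Reads the whole kernel file into memory and refuses to return an empty
// source, so a wrong path or failed read is caught before the build step.
std::string load_kernel_source(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open kernel file: " + path);
    }
    std::ostringstream contents;
    contents << in.rdbuf();
    const std::string source = contents.str();
    if (source.empty()) {
        throw std::runtime_error("kernel file is empty: " + path);
    }
    return source;
}
```

Printing `source.size()` and the string itself just before the call to clCreateProgramWithSource shows exactly what the OpenCL compiler receives.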

And finally, there's a mismatch between the devices in your clBuildProgram and clGetProgramBuildInfo calls. I'm not sure if that matters, but I pass the same device to both:

clBuildProgram(program, 1, &devices[devnumber], ...


clGetProgramBuildInfo (program, devices[devnumber], ...


Bingo! It is working now. I was able to compile successfully, and my output matches the CUDA version.

Thanks, Himanshu and Bdot, for your suggestions.