
rotor
Journeyman III

How to avoid executable program re-compile OCL kernel?

compiling OpenCL Kernel

Hi all,

 

I have a question about compiling OpenCL kernels. I have a program (let's call it PA). If I run PA in debug mode multiple times without changing anything in the kernel, the program does not need to recompile the OCL kernel. However, when I run PA in release mode, it recompiles the OCL kernel every time I launch the PA executable, even though I did not change anything in the kernel.

 

Does anyone here have the same problem? And does anyone know how to fix it, to avoid the very long OCL compile time every time we launch the main executable?

 

Thank you,

Roto

0 Likes
8 Replies
rick_weber
Adept II

Call clGetProgramInfo with CL_PROGRAM_BINARIES to get the program binaries for each device. You can then save these to disk in some canonical way and load them with clCreateProgramWithBinary.
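
The save-and-reload flow described here can be sketched in C roughly as follows. This is only a minimal sketch of the disk-cache logic, not a complete OpenCL host program: `get_or_build`, `build_fn`, `read_file` and `fake_build` are hypothetical names, and `fake_build` stands in for the real slow path (clBuildProgram from source, then clGetProgramInfo with CL_PROGRAM_BINARY_SIZES and CL_PROGRAM_BINARIES). On a cache hit, the returned bytes would be handed to clCreateProgramWithBinary, followed by clBuildProgram.

```c
#include <stdio.h>
#include <stdlib.h>

/* Slow path: produce a malloc'd program binary and store its size.
 * In a real program this would compile from source and then query
 * CL_PROGRAM_BINARY_SIZES / CL_PROGRAM_BINARIES via clGetProgramInfo. */
typedef unsigned char *(*build_fn)(void *user, size_t *size_out);

/* Read a whole file into a malloc'd buffer (caller frees).
 * Returns NULL if the file is missing, empty, or unreadable. */
static unsigned char *read_file(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len <= 0) { fclose(f); return NULL; }
    rewind(f);
    unsigned char *buf = malloc((size_t)len);
    if (!buf) { fclose(f); return NULL; }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }
    *size_out = got;
    return buf;
}

/* Return the cached binary if present; otherwise build it and try to
 * store it for the next launch. *was_hit is set to 1 on a cache hit
 * and 0 on a miss. A failed cache write is not an error: the freshly
 * built binary is still returned and the partial file is removed. */
unsigned char *get_or_build(const char *cache_path, build_fn build,
                            void *user, size_t *size_out, int *was_hit)
{
    unsigned char *bin = read_file(cache_path, size_out);
    if (bin) { *was_hit = 1; return bin; }

    *was_hit = 0;
    bin = build(user, size_out);
    if (!bin) return NULL;

    FILE *f = fopen(cache_path, "wb");
    if (f) {
        size_t written = fwrite(bin, 1, *size_out, f);
        int close_failed = (fclose(f) != 0);
        if (written != *size_out || close_failed) remove(cache_path);
    }
    return bin;
}

/* Demo stand-in for the compiler: counts how often it is called
 * (through `user`) and returns four fixed bytes. */
unsigned char *fake_build(void *user, size_t *size_out)
{
    int *calls = user;
    (*calls)++;
    unsigned char *bin = malloc(4);
    if (!bin) return NULL;
    bin[0] = 'P'; bin[1] = 'T'; bin[2] = 'X'; bin[3] = '0';
    *size_out = 4;
    return bin;
}
```

With this shape, only the first launch pays for compilation; later launches just read the file. Since there is one binary per device, a real version would keep one cache file per device.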

0 Likes

Hi Rick.Weber,

Thanks for the answer. I know that we can do this with the AMD SDK. But I also run my app on an Nvidia card using the Nvidia 4.0 SDK, and as far as I know it does not support binary kernels. Is there another way to work around this problem, other than using binary kernels?

It's quite weird that it does not recompile the code when I run in debug mode, but does in release mode. Does anyone know the reason?

Thanks,

Roto

 

0 Likes

From the CUDA 4.0 programming guide:

Kernels written in OpenCL C are compiled into PTX, which is CUDA’s instruction set architecture and is described in a separate document.
Currently, the PTX intermediate representation can be obtained by calling clGetProgramInfo() with CL_PROGRAM_BINARIES. It can be passed to clCreateProgramWithBinary() to create a program object only if it is produced and consumed by the same driver. This will likely not be supported in future versions.

It sounds like it should work for the time being, but things may change in the future. What this means is ambiguous: they might drop PTX, so the binaries would no longer work with future versions, or the binary functionality may cease to exist altogether.
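
Because a binary is only guaranteed to load on the driver that produced it, a cache on disk needs some way to notice a driver change. One possible approach, sketched below under my own assumptions, is to derive the cache file name from a hash of the kernel source, the build options, the device name and the driver version (the last two can be queried with clGetDeviceInfo using CL_DEVICE_NAME and CL_DRIVER_VERSION). The function names `cache_key` and `cache_file_name`, the choice of 64-bit FNV-1a, and the `.bin` naming scheme are all hypothetical, not something the programming guide prescribes.

```c
#include <stdint.h>
#include <stdio.h>

/* Fold one NUL-terminated string, including its terminator, into a
 * running 64-bit FNV-1a hash. Hashing the terminator marks where
 * each field ends. */
static uint64_t fnv1a_str(uint64_t h, const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    do {
        h ^= *p;
        h *= 1099511628211ULL;      /* FNV-1a 64-bit prime */
    } while (*p++ != '\0');
    return h;
}

/* Combine everything that affects the compiled binary into one key.
 * A new driver version, a different device, changed build options
 * or an edited kernel each give a different key, so a stale binary
 * is simply never looked up. */
uint64_t cache_key(const char *source, const char *build_options,
                   const char *device_name, const char *driver_version)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    h = fnv1a_str(h, source);
    h = fnv1a_str(h, build_options);
    h = fnv1a_str(h, device_name);
    h = fnv1a_str(h, driver_version);
    return h;
}

/* Format the key as "<16 hex digits>.bin" (20 characters plus NUL).
 * Returns 0 on success, -1 if the buffer is too small. */
int cache_file_name(uint64_t key, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%016llx.bin",
                     (unsigned long long)key);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

A hash like this is not collision-proof, so a careful implementation might also store the driver version inside the cache file and compare it on load.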

0 Likes
notzed
Challenger

As it happens I just implemented a compiler cache today.

This has the details: http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=115

I can't confirm it works on Nvidia, but I guess it would. However, whilst doing some spring cleaning the other day, I noticed a pile of stuff the Nvidia SDK was leaving in ~/.nvidia (or something like that; I can't remember now, I cleared it out) which sure looked like a compiler cache to me. So maybe it's already doing it.

It's nice and fast on AMD though.

 

0 Likes

Thanks notzed and Rick,

I will try. Nvidia sometimes says something should be supported when it actually is not :P.

I use both an AMD 5870 and an Nvidia GTX 480, which is why I need to run across platforms. The AMD 5870 is nice ;-).

Roto

0 Likes

Just a quick report: using the method from notzed's link (http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=115), after a few tweaks, I got it running on the Nvidia card.

Thanks a lot, notzed.

Roto

0 Likes

Originally posted by: rotor
Just a quick report: using the method from notzed's link (http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=115), after a few tweaks, I got it running on the Nvidia card.

Thanks a lot, notzed.

Roto


Rotor,

You said that in debug it does not compile every time, and in release it compiles every time.

If you use clBuildProgram, it will always build, irrespective of configuration (debug or release).

I am not sure how you reached this conclusion.

 

0 Likes

Hi genaganna,

I didn't make up a conclusion :P. I know it should behave exactly as you said, but what happened with my program is different. That's why I said it's weird. Basically, when I run the program in debug mode, clBuildProgram does not recompile the code, but if I run in release mode it does. I don't know why.

Anyway, I have now fixed it with the binary kernel method.

Thanks,

Roto

0 Likes