

bubu
Adept II

Precompiled kernel doubts

Hi,

 

I'm considering precompiling my kernels, but I have several doubts. If you could answer them, I'd be very thankful:

 

1. Can I run a kernel precompiled for a Radeon 5750 on a 5870 or a 6970? Do the newer cards follow a backwards-compatibility principle, at just the cost of some speed?

 

2. Can I use the SKA tool to precompile the kernels, or only CL's functions? I like SKA because I can see the generated CAL code, so I have better control over the number of registers used, ALU throughput, etc.

 

3. How stable over time will precompilation be? Do you plan to change anything in the drivers that could invalidate my precompiled kernels in the medium term?

 

4. How stable and reliable is the current pre-compilation implementation? I ask this because I've heard almost all the IHVs are having problems with this in their drivers.

 

thx.

0 Likes
9 Replies
nou
Exemplar

1. This should work. As I understand it, the current binary format contains GPU-specific ISA code, IL code, and the original OpenCL source. When you load a binary kernel built for another card, it is recompiled from the IL code, so a GPU binary kernel should work on all GPUs (but not across CPU and GPU). The ISA code itself is not compatible between different chips, so in your example the ISA cannot be reused directly, since those cards are Cypress and Juniper.

2. I recommend compiling normally from source the first time, and keeping that as a fallback. Use binary kernels just as a cache.

More info: http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=115
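
A minimal sketch of the "binary as cache, source as fallback" idea, in plain C++17 with no OpenCL calls. Everything named here (`cache_file_name`, `load_cached_binary`, `store_cached_binary`, and the naming scheme that keys the file on device name and driver version) is hypothetical and only for illustration. The assumption is that the caller gets the device name and driver version from the runtime, tries the cached binary first, and rebuilds from source whenever loading returns nothing or the runtime rejects the binary.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical naming scheme: one cache file per kernel, device name and
// driver version, so a driver update or a different chip never picks up a
// binary that was built for something else.
std::string cache_file_name(const std::string& device_name,
                            const std::string& driver_version,
                            const std::string& kernel_id) {
    std::string name = kernel_id + "_" + device_name + "_" + driver_version + ".bin";
    for (char& c : name) {
        // Replace characters that are awkward in file names.
        if (c == ' ' || c == '/' || c == '\\' || c == '(' || c == ')') c = '-';
    }
    return name;
}

// Returns the cached bytes, or std::nullopt if the file is missing or empty.
// On std::nullopt the caller is expected to compile from source instead.
std::optional<std::vector<unsigned char>> load_cached_binary(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::nullopt;
    std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    if (bytes.empty()) return std::nullopt;
    return bytes;
}

// Writes the binary obtained after a successful build from source.
bool store_cached_binary(const std::string& path,
                         const std::vector<unsigned char>& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

The bytes returned by `load_cached_binary` would be what gets handed to the runtime's load-from-binary call; if that call or the following build fails, the application compiles from source and overwrites the cache entry.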

0 Likes

Well, one of the reasons I want to use binary kernels is to hide my original CL source code from other people, so the idea was to provide only precompiled kernels for the hardware we support/certify.

 

The problem is that I cannot test with ALL of ATI's GPUs, so I was hoping to precompile the kernels for, let's say, a 5770 and a 5870, and if the user plugs in, for instance, a 6850, it should work OK in compatibility mode (perhaps with just a small speed penalty).
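
A rough sketch of how shipping binaries for only a few certified chips might be handled in application code. The table contents, the file names, and the function `pick_binary` are all hypothetical, and it is an assumption that the runtime reports the chip code name (for example "Juniper" or "Cypress") as the device name. The point is only that a device without a matching entry has to be detected and handled some other way, because it cannot simply reuse another chip's binary entry.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical table of binaries shipped with the application, keyed by the
// chip name the runtime is assumed to report for the device.
const std::map<std::string, std::string>& shipped_binaries() {
    static const std::map<std::string, std::string> table = {
        {"Juniper", "kernels_juniper.bin"},  // e.g. Radeon HD 5750 / 5770
        {"Cypress", "kernels_cypress.bin"},  // e.g. Radeon HD 5870
    };
    return table;
}

// Returns the shipped binary for this chip, or std::nullopt when none matches.
std::optional<std::string> pick_binary(const std::string& device_name) {
    const auto& table = shipped_binaries();
    const auto it = table.find(device_name);
    if (it == table.end()) return std::nullopt;
    return it->second;
}
```

With this table, a "Barts" device (the 6850 in the example) gets no entry, so the application would have to rely on whatever recompilation the runtime offers for a foreign binary, or report the card as unsupported.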

 

 

0 Likes

bubu,
If you want to compile for all ATI cards without owning them, please try the cl_amd_offline_devices platform extension.

The way to do so is to pass the extension's property in the context properties when creating the context.
For example, using the C++ API:
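// *i is the selected cl::Platform.
cl_context_properties cps[] = {
    CL_CONTEXT_PLATFORM, (cl_context_properties)(*i)(),
    // Also expose devices that are not physically installed.
    CL_CONTEXT_OFFLINE_DEVICES_AMD, (cl_context_properties)1,
    0  // the property list is zero-terminated
};
context_ = cl::Context(getContextType(march_), cps, NULL, NULL, &status);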

You can then query each device for whether it is available or not, to determine whether it is an online or an offline device.

Also, with SDK 2.3, there has been a lot of work on the binary objects and being able to recompile. I haven't personally tested this, but it should work for what you want.

0 Likes

Originally posted by: MicahVillmow bubu, If you want to compile for all ATI cards without owning them, please try the cl_amd_offline_devices platform extension.


Oooooh, that's nice ! I'll try it, thx!

Btw, have you considered adding a tool (or modifying the SKA tool) to precompile the kernels and save a binary file for all of ATI's devices? Intel created a similar tool for their OpenCL SDK:

 

Intel OpenCL SDK precompilation tool

 

That way I would just copy-paste my kernel source, click a button, and save the result to a file. Then I would encrypt it and load it with clCreateProgramWithBinary() if the OpenCL vendor ID is AMD.

 

0 Likes

At the end of the page I linked there is an example that compiles a kernel offline. Modifying that source code to do what you need is easy.

0 Likes

Originally posted by: MicahVillmow bubu, If you want to compile for all ATI cards without owning them, please try the cl_amd_offline_devices platform extension.


 

Micah Villmow:

I have seen this extension since I installed 2.3, but I could not find the documentation for it. Could you please document it and publish it in the OpenCL registry? At least now I know what it does (very useful).

Can you also document cl_amd_popcnt?

0 Likes

Read the OpenCL Programming Guide from AMD.

0 Likes

Yep, CL_CONTEXT_OFFLINE_DEVICES_AMD is definitely useful, but I think a tool like Intel's is much easier and more effective to use (it requires just a couple of clicks).

Perhaps you could modify your SKA tool a bit so it can save the binary kernels. Keep it as it is now, plus add a list box with the devices you want to precompile the kernels for and a "save binary precompiled kernels" button like the one in Intel's tool.

0 Likes

Btw, if you modify the SKA tool to do that, it would be a good idea to rewrite it using Qt or something similar, so it could be ported very easily to other OSs like Mac OS X or Linux.

0 Likes