
KumarSaurabh
Journeyman III

Offline Compilation of OpenCL kernel.

Precompiling OpenCL kernels.

Hi,

I am trying to generate a binary (.clbin file) for offline compilation of my OpenCL kernels. Is there any example that illustrates the process of offline compilation for AMD OpenCL devices?

Any suggestion in this direction shall be highly appreciated.

Thanks and Regards.

Kumar Saurabh.

0 Likes
12 Replies
himanshu_gautam
Grandmaster

All SDK samples support the --dump option, which dumps binaries for all AMD devices. These can later be loaded with the --load option.

0 Likes

Thanks a lot Himanshu for your reply to our query.

We wish to do offline compilation of our own code rather than the SDK samples. Can you please help?

Any help shall be appreciated.

0 Likes

I still think you really should add a simple button to the SKA tool to precompile a kernel and output it as a binary file, as Intel's IOC tool does.

I hope we can get that feature in SDK 2.5 😛

0 Likes

kumar saurabh,

It is possible to do offline compilation, and you should be able to find the relevant code in the SDKUtils library we ship with the SDK.

Bubu,

AFAIK, you are able to see the device-specific ISA code by compiling the OpenCL code. Can you please explain the feature request?
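
For self-written code, the dump and load steps mostly come down to file I/O around two standard OpenCL calls: clGetProgramInfo (queried with CL_PROGRAM_BINARY_SIZES and CL_PROGRAM_BINARIES after a successful clBuildProgram) to get the bytes out, and clCreateProgramWithBinary to feed them back in. Below is a minimal sketch of just the file half in plain C. The helper names save_binary and load_binary are made up for illustration and are not taken from SDKUtils.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `size` bytes (for example one entry returned by
 * clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...)) to `path`.
 * Returns 0 on success, -1 on failure. */
int save_binary(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = fclose(f);
    return (written == size && close_failed == 0) ? 0 : -1;
}

/* Hypothetical helper: read a whole file into a malloc'd buffer.
 * On success returns the buffer (caller frees) and stores its length in
 * *size_out; the pair can then be handed to clCreateProgramWithBinary.
 * Returns NULL on failure. */
unsigned char *load_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);

    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }

    *size_out = (size_t)len;
    return buf;
}
```

The files must be opened in binary mode ("wb"/"rb"); on Windows a text-mode stream would translate line endings and corrupt the program binary.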

0 Likes

Originally posted by: himanshu.gautam

kumar saurabh,

It is possible to do offline compilation, and you should be able to find the relevant code in the SDKUtils library we ship with the SDK.

Bubu,

AFAIK, you are able to see the device-specific ISA code by compiling the OpenCL code. Can you please explain the feature request?


Modify your SKA tool by adding a simple "Precompile to file" button that works like this:

1. You paste your kernel source into the left side.

2. You click the button.

3. A dialog appears where you choose the platform and devices you want to precompile the kernels for (Radeon 5750, Radeon 6870, Phenom II, etc.). It should also offer an option to completely strip the debugging symbols and anything else that could compromise your IP.

4. The kernels are precompiled and output to a single .bin file.

Also, expose that functionality on the command line, so we can call the precompilation tool from a bash shell, the MS-DOS command line, batch files, etc.

You could then load that .bin file from memory or disk using clCreateProgramWithBinary(). The implementation must internally filter out all the binary code that does not belong to the current architecture.

This is probably the most important feature you should add, because I personally know of a lot of commercial projects that have discarded OpenCL simply because they don't want to distribute their kernels' sources in any form.
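
The filtering step described above can also be done on the application side. A rough sketch, assuming the single .bin file has already been parsed into an array of per-device entries: the kernel_binary struct, the find_binary_for_device function and the idea of keying on the device name string (as reported by clGetDeviceInfo with CL_DEVICE_NAME) are all illustrative assumptions, not an existing AMD format or API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of one precompiled binary inside a bundle. */
typedef struct {
    const char *device_name;   /* name the binary was compiled for */
    const unsigned char *data; /* program binary bytes */
    size_t size;               /* number of bytes in data */
} kernel_binary;

/* Return the bundle entry matching device_name, or NULL if the bundle
 * holds no binary for that device. */
const kernel_binary *find_binary_for_device(const kernel_binary *bundle,
                                            size_t count,
                                            const char *device_name)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(bundle[i].device_name, device_name) == 0)
            return &bundle[i];
    }
    return NULL;
}
```

The matching entry's data and size are what would be passed to clCreateProgramWithBinary; a NULL result is the case where the application has no precompiled binary for the installed device.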

0 Likes

There are free tools that will do an online compilation and save the result for you. There's http://sourceforge.net/projects/clcc/, and I've written http://gitorious.org/onlineclc

The catch of course is that you can only get binaries compiled for a device that is physically in your machine e.g. I don't have an AMD GPU so I can't compile for one.

0 Likes

You should look into the cl_amd_offline_devices extension.

0 Likes
bmerry
Journeyman III

Sounds interesting. Is there a spec available for it? It's not listed in the OpenCL registry, it's not described in the AMD APP programming guide and Google didn't pop up with one (mostly it hits posts of lists of reported extensions).

0 Likes

It is really simple. You just need to add the CL_CONTEXT_OFFLINE_DEVICES_AMD property to the platform properties during context creation. After that you get virtual devices, for which you can create a program and retrieve the binary.

More: http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=115

0 Likes
bmerry
Journeyman III

I have to say I found this to be pretty poor developer support:

- The extension doesn't appear in the Khronos registry, it doesn't get a mention in the developer guide, and it's only documented in an impossible-to-find knowledge-base article written for an obsolete version of the SDK (try searching for the extension name in a search engine and see if you can find it), which doesn't even bother to format the sample code readably. And that article is just a recipe for using it for one purpose rather than a specification of what it does.

- The token doesn't appear in the official cl_ext.h, and the version of cl_ext.h that AMD ships doesn't compile.

- There's no indication of how to actually obtain the device IDs for the offline devices, since clGetDeviceIDs doesn't enumerate them. I assume you have to create a context with all devices, then enumerate them via clGetContextInfo, querying CL_CONTEXT_DEVICES.
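
For anyone else trying this, here is a small sketch of the property list such a context would need. It is plain C with no OpenCL header, on the assumption that your cl_ext.h lacks the token. The numeric values (0x1084 for CL_CONTEXT_PLATFORM, 0x403F for CL_CONTEXT_OFFLINE_DEVICES_AMD) are what I believe the Khronos and AMD headers define, so please check them against your own SDK. The function name build_offline_context_props is made up for this post.

```c
#include <stdint.h>

/* Fallback definitions for when the installed headers lack these tokens.
 * ASSUMPTION: values as I understand them from cl.h and AMD's cl_ext.h. */
#ifndef CL_CONTEXT_PLATFORM
#define CL_CONTEXT_PLATFORM 0x1084
#endif
#ifndef CL_CONTEXT_OFFLINE_DEVICES_AMD
#define CL_CONTEXT_OFFLINE_DEVICES_AMD 0x403F
#endif

/* Fill a zero-terminated context property list that requests offline
 * devices. cl_context_properties is a typedef of intptr_t in cl.h, so
 * intptr_t is used here to keep the sketch header-free. `platform` is
 * the cl_platform_id cast to intptr_t. */
void build_offline_context_props(intptr_t platform, intptr_t props[5])
{
    props[0] = CL_CONTEXT_PLATFORM;
    props[1] = platform;
    props[2] = CL_CONTEXT_OFFLINE_DEVICES_AMD;
    props[3] = 1;   /* non-zero: also expose devices not physically present */
    props[4] = 0;   /* list terminator */
}
```

The filled array would then go to clCreateContextFromType with CL_DEVICE_TYPE_ALL, and the device list would be read back with clGetContextInfo and CL_CONTEXT_DEVICES, as guessed above. From there it is the usual clBuildProgram followed by clGetProgramInfo with CL_PROGRAM_BINARIES, once per device.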

0 Likes

That extension is OK, but modifying the SKA tool wouldn't hurt either.

Intel did that in their IOC tool and it works like a charm.

0 Likes

Originally posted by: himanshu.gautam

All SDK samples support the --dump option, which dumps binaries for all AMD devices. These can later be loaded with the --load option.

I've tried to dump the IL and ISA using the option (--dump 3) for the MatrixTranspose and MemoryOptimizations samples of SDK v2.5. The application stops responding, and breaking in the debugger leads to the portion of code below.

I'm using VS 2010 on windows 7, ATI M 5650 GPU and Catalyst 11.7.

Is there something wrong with the sample?

/* create a cl program executable for all the devices specified */
status = clBuildProgram(program, 0, NULL, flagsStr.c_str(), NULL, NULL);
sampleCommon->checkVal(status, CL_SUCCESS, "clBuildProgram failed.");

0 Likes