I was trying to work with binary kernels using KB115 and found some differences, so I would like to share my results so that others can work with binary kernels too.
Here is how I generated a working binary kernel:
1) call clCreateProgramWithSource
2) call clBuildProgram (if you don't do this, clGetProgramInfo will return 0 for CL_PROGRAM_BINARY_SIZES)
3) call clGetProgramInfo with CL_PROGRAM_BINARY_SIZES to get the size of the binary.
4) Do a malloc() with the size reported by clGetProgramInfo. Record the size, because the load process needs it.
5) call clGetProgramInfo with CL_PROGRAM_BINARIES and the pointer allocated with malloc().
6) Save the contents of the allocated buffer to a file.
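As a rough sketch of steps 3 to 6 in C, assuming a single device: the helper name save_kernel_binary, the file name kernel.bin and the "size first, then bytes" file layout are illustrative choices of mine, not something KB115 prescribes. The OpenCL calls are shown only as a comment, since they need an already built program to run against.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes the binary size first, then the binary bytes,
 * so the loader knows how much memory to allocate.
 * Returns 0 on success, -1 on failure. */
int save_kernel_binary(const char *path, const unsigned char *binary, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    int ok = fwrite(&size, sizeof size, 1, f) == 1 &&
             fwrite(binary, 1, size, f) == size;

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Where the buffer comes from (single-device case, error checks omitted):
 *
 *   size_t size = 0;
 *   clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES, sizeof size, &size, NULL);
 *   unsigned char *binary = malloc(size);
 *   // CL_PROGRAM_BINARIES expects an array of pointers, one per device
 *   clGetProgramInfo(program, CL_PROGRAM_BINARIES, sizeof binary, &binary, NULL);
 *   save_kernel_binary("kernel.bin", binary, size);
 *   free(binary);
 */
```

Note that CL_PROGRAM_BINARIES takes the address of the pointer, not the pointer itself. With more than one device, both queries return arrays with one entry per device.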
To load the binary kernel I did the following:
1) Do a malloc() with the size of the binary.
2) Load the file into the buffer allocated with malloc().
3) call clCreateProgramWithBinary
4) call clBuildProgram (if you don't do it, clCreateKernel will fail)
5) call clCreateKernel
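The load steps could look roughly like this in C. This is a sketch under a few assumptions of mine: the file stores the binary size first and the bytes after it, there is a single device, and load_kernel_binary, kernel.bin and my_kernel are hypothetical names. The OpenCL calls appear as a comment because they need a valid context and device.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a file laid out as [size_t size][size bytes].
 * Returns a malloc()ed buffer (the caller frees it) and stores its length
 * in *size_out, or returns NULL on failure. */
unsigned char *load_kernel_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    size_t size = 0;
    unsigned char *binary = NULL;
    if (fread(&size, sizeof size, 1, f) == 1 && size > 0) {
        binary = malloc(size);
        if (binary != NULL && fread(binary, 1, size, f) != size) {
            free(binary);       /* short read: treat the file as invalid */
            binary = NULL;
        }
    }
    fclose(f);

    if (binary != NULL)
        *size_out = size;
    return binary;
}

/* Handing the buffer to OpenCL (single-device case, error checks omitted):
 *
 *   size_t size;
 *   unsigned char *binary = load_kernel_binary("kernel.bin", &size);
 *   cl_int status, err;
 *   cl_program program = clCreateProgramWithBinary(
 *       context, 1, &device, &size,
 *       (const unsigned char **)&binary, &status, &err);
 *   clBuildProgram(program, 1, &device, NULL, NULL, NULL);
 *   cl_kernel kernel = clCreateKernel(program, "my_kernel", &err);
 *   free(binary);
 */
```

It is worth checking both err and the per-device status value after clCreateProgramWithBinary, since a binary built for a different device or driver may be rejected there.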
My question is: is it possible to have a binary kernel in ISA format for a specific GPU? I could save some time if I didn't need to call clBuildProgram while the application is running.