Does anyone know how to compile the OpenCL kernel offline?
You have to write your own "compiler":
1. Load the program source with clCreateProgramWithSource().
2. Build it with clBuildProgram().
3. Call clGetProgramInfo(..., CL_PROGRAM_BINARY_SIZES, ...) to get the size of the program binary.
4. Call clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...) to retrieve the binary itself.
But according to the release notes, this is not possible yet:
Currently, it is not possible to create a cl_program from a binary image
(clCreateProgramWithBinary), nor is it possible to extract a binary image from a cl_program
created with clCreateProgramWithSource
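For a release that does support binary extraction, the file-handling half of such a home-made "compiler" might look roughly like the sketch below. It only covers reading the kernel source and saving the extracted bytes; the OpenCL calls from the numbered steps would sit between the two helpers and are shown only as comments. The names load_kernel_source and save_program_binary are made up for this illustration and are not part of any SDK.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read the whole kernel source file into a string that
// could then be handed to clCreateProgramWithSource() (step 1).
std::string load_kernel_source(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Between the two helpers the OpenCL steps would run (not shown here):
//   build the program, query CL_PROGRAM_BINARY_SIZES, allocate a buffer of
//   that size, then query CL_PROGRAM_BINARIES to fill it.

// Hypothetical helper: write the bytes obtained in step 4 to disk.
void save_program_binary(const std::string& path,
                         const std::vector<unsigned char>& binary) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot create " + path);
    out.write(reinterpret_cast<const char*>(binary.data()),
              static_cast<std::streamsize>(binary.size()));
    if (!out) throw std::runtime_error("write failed for " + path);
}
```

Both files are opened in binary mode so that the bytes round-trip unchanged on every platform.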
So it is impossible to debug the kernel at present? Should I write the code with Notepad?
It is not possible to debug the kernel as of now. Yes, you will have to write the code in a file.
The binary program will not be compatible between CPU and GPU? Probably not.
No, the binary program will not be compatible as the instruction set is different for CPU and GPU.
Actually, there is an experimental feature that allows you to debug kernels on Linux using GDB. This currently does not work on Windows, but if you have access to a Linux platform you can try the following.
As with traditional debugging, the OpenCL program to be debugged must be compiled for debug, which can be done by passing the "-g" option to the compiler via the options string to clBuildProgram. For example, using the C++ API you might write something like:
err = program.build(devices,"-g");
To avoid source changes an alternative is to set the environment variable:
CPU_COMPILER_OPTIONS="-g"
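If you do pass the flag from the host code, a small helper can keep it switchable instead of hard-coding "-g" in every build call. This is only a sketch: make_build_options is a made-up name, and it simply concatenates strings without checking whether "-g" is already present.

```cpp
#include <string>

// Hypothetical helper: append "-g" to an existing OpenCL build-options
// string when a debug build is requested; otherwise return it unchanged.
std::string make_build_options(const std::string& base_options, bool debug) {
    if (!debug) return base_options;           // release build: no change
    if (base_options.empty()) return "-g";     // nothing to separate from
    return base_options + " -g";               // options are space-separated
}
```

The result would then be passed where the literal was used before, for example program.build(devices, make_build_options("", true).c_str()).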
Attached is a simple hello world kernel, and the following GDB session shows how to debug it. It is important to set CPU_MAX_COMPUTE_UNITS=1; otherwise you will see strange behavior as different threads switch in for different sets of workgroups running in parallel. Also note that we first set a breakpoint in the host code, somewhere after the OpenCL C program has been compiled, and run to that point before setting the breakpoint for the kernel. This is required because no symbols are defined for the OpenCL C program until it has been compiled and loaded by the runtime, which happens dynamically.
Before presenting the GDB output, it should be noted that the kernel's name is mangled to something like __OpenCL_name_kernel, and this mangled name must be used to set the breakpoint (we are working on making the mapping implicit).
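To make the naming pattern concrete, here is a tiny sketch that builds the breakpoint symbol from a kernel name. It assumes the "__OpenCL_<name>_kernel" form described above holds exactly; the scheme belongs to the runtime and may well differ between releases. The function name mangled_kernel_symbol is made up for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build the symbol name to give to GDB's "b" command
// for a kernel, assuming the "__OpenCL_<name>_kernel" pattern.
std::string mangled_kernel_symbol(const std::string& kernel_name) {
    assert(!kernel_name.empty() && "kernel name must not be empty");
    return "__OpenCL_" + kernel_name + "_kernel";
}
```

For the kernel named hello this yields __OpenCL_hello_kernel, which is the symbol used in the session below.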
~/dev/cltutorials/lesson1$ CPU_MAX_COMPUTE_UNITS=1 gdb ./a.out
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
(gdb) b 77
Breakpoint 1 at 0x4023bf: file lesson1.cpp, line 77.
(gdb) run
Starting program: /home/bgaster/dev/cltutorials/lesson1/a.out
[Thread debugging using libthread_db enabled]
[New Thread 0x7f282a722700 (LWP 5030)]
[New Thread 0x7f28280b6950 (LWP 5045)]
[New Thread 0x7f28278b5950 (LWP 5046)]
[Switching to Thread 0x7f282a722700 (LWP 5030)]
Breakpoint 1, main () at lesson1.cpp:83
83 &event);
(gdb) b __OpenCL_hello_kernel
Breakpoint 2 at 0x7f28280b73d0: file OCLmfzLYs.cl, line 18.
(gdb) c
Continuing.
[Switching to Thread 0x7f28278b5950 (LWP 5046)]
Breakpoint 2, __OpenCL_hello_kernel (out=0x7f28278b4ec0) at OCLmfzLYs.cl:18
18 __kernel void hello(__global char * out)
(gdb) list
13 }
14
15 return length;
16 }
17
18 __kernel void hello(__global char * out)
19 {
20 size_t tid = get_global_id(0);
21
22 if (tid < (strlen(helloWorld)+1)) {
(gdb)
23 out[tid] = helloWorld[tid];
24 }
25 }
26
(gdb) n
__OpenCL_hello_kernel (out=0x1ddf7b0) at OCLmfzLYs.cl:20
20 size_t tid = get_global_id(0);
(gdb)
22 if (tid < (strlen(helloWorld)+1)) {
(gdb) p tid
$1 = 0
(gdb) c
Continuing.
Breakpoint 2, __OpenCL_hello_kernel (out=0x7f28278b4ec0) at OCLmfzLYs.cl:18
18 __kernel void hello(__global char * out)
(gdb) n
__OpenCL_hello_kernel (out=0x1ddf7b0) at OCLmfzLYs.cl:20
20 size_t tid = get_global_id(0);
(gdb) n
22 if (tid < (strlen(helloWorld)+1)) {
(gdb) p tid
$2 = 1
(gdb)
__constant char helloWorld[] = "Hello World\n";

uint strlen(__constant char str[])
{
    uint length = 0;

    while (*str != '\0') {
        length++;
        str++;
    }

    return length;
}

__kernel void hello(__global char * out)
{
    size_t tid = get_global_id(0);

    if (tid < (strlen(helloWorld)+1)) {
        out[tid] = helloWorld[tid];
    }
}