I have a kernel that causes OpenCL to core dump inside libamdocl64.so. It was crashing my program, so I narrowed the problem down to the compile step, and then to a single line. I've reproduced the issue in a simple kernel:
typedef struct test_struct test_t;
typedef __global test_t * test_p;
struct test_struct {
    test_p next;
};
typedef struct btest_struct btest_t;
struct btest_struct {
    size_t sz;
    test_p dt[];
};
__kernel void compilerCoreDump(__global const int* in, int n) {
    int i = get_global_id(0);
    if (i >= n)
        return;
    int val = in[i];
    btest_t btt;
    test_p tp = btt.dt[val]; // <-- Compile core dump
}
I'm aware that the code above is nonsense - the actual usage is more complex but the (compile) result is the same. I assume that the compiler should be reporting an error instead of core dumping, so I have two questions:
My versions:
Thanks,
Chris...
Core Dump stack trace (bottom bit):
#28 0x00007ffff1d47fc4 in aclCompile () from /usr/lib/fglrx/libamdocl64.so
#29 0x00007ffff159ec8e in ?? () from /usr/lib/fglrx/libamdocl64.so
#30 0x00007ffff1559365 in ?? () from /usr/lib/fglrx/libamdocl64.so
#31 0x00007ffff156dd1c in ?? () from /usr/lib/fglrx/libamdocl64.so
#32 0x00007ffff154e12d in clBuildProgram () from /usr/lib/fglrx/libamdocl64.so
#33 0x00007ffff7b9e762 in beProgramBuilderOpenCL::BuildOpenCLProgramWrapper (this=<optimized out>, status=@0x7fffffffd0f8: 6780480, program=<optimized out>,
num_devices=<optimized out>, device_list=<optimized out>, options=<optimized out>, pfn_notify=0x0, user_data=0x0)
at /data/jenkins/workspace/CodeXL-Full-Linux-Release-1.4/release/CodeXL/1.4/CommonProjects/AMDTBackEnd/src/beProgramBuilderOpenCL.cpp:1674
#34 0x00007ffff7ba41b4 in beProgramBuilderOpenCL::CompileOpenCLInternal (this=0x6417a0, programSource=..., oclOptions=..., requestedDeviceId=0x677640,
program=@0x7fffffffda18: 0xdd3990, definesAndOptions=..., iCompilationNo=0, errString=...)
at /data/jenkins/workspace/CodeXL-Full-Linux-Release-1.4/release/CodeXL/1.4/CommonProjects/AMDTBackEnd/src/beProgramBuilderOpenCL.cpp:1377
#35 0x00007ffff7ba6604 in beProgramBuilderOpenCL::Compile (this=0x6417a0, programSource=..., oclOptions=..., pSourcePath=<optimized out>)
at /data/jenkins/workspace/CodeXL-Full-Linux-Release-1.4/release/CodeXL/1.4/CommonProjects/AMDTBackEnd/src/beProgramBuilderOpenCL.cpp:1297
#36 0x000000000041a632 in kcCLICommanderCL::Compile (this=<optimized out>, config=...)
at Components/KernelAnalyzer/AMDTKernelAnalyzerCLI/src/kcCLICommanderCL.cpp:111
#37 0x000000000041a80a in kcCLICommanderCL::RunCompileCommands (this=0x641710, config=..., callback=<optimized out>)
at Components/KernelAnalyzer/AMDTKernelAnalyzerCLI/src/kcCLICommanderCL.cpp:361
#38 0x000000000040c2bc in main (argc=<optimized out>, argv=0x7fffffffded8) at Components/KernelAnalyzer/AMDTKernelAnalyzerCLI/src/kcMain.cpp:78
Additional notes:
Chris...
Hi,
struct btest_struct {
    size_t sz;
    test_p dt[];
};
As per the OpenCL 1.2 spec, variable-length arrays and structures with flexible (or unsized) array members are not supported in kernel code (see section 6.9, Restrictions). So you need to supply the size of the array at compile time.
Now, regarding the compiler crash: I reproduced the same problem with the above kernel code on driver 14.30 (but on Windows 7), using both CodeXL and VSC++. I may need to submit an internal bug report for this issue (thanks for reporting). However, it compiled fine once a size was given for the array "test_p dt[]".
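As a rough illustration of what "supplying the size at compile time" could look like, here is a minimal sketch in plain C so that it can be compiled and checked on the host. The capacity macro `BTEST_CAPACITY` and the helper `btest_get` are hypothetical names introduced only for this example. In actual kernel code the pointers would also carry the `__global` qualifier, as in the `test_p` typedef above.

```c
#include <stddef.h>

/* Hypothetical capacity chosen for this sketch; not taken from the thread. */
#define BTEST_CAPACITY 4

typedef struct test_struct test_t;
struct test_struct {
    test_t *next;
};

/* Same shape as btest_struct, but dt now has a compile-time size. */
typedef struct btest_struct {
    size_t sz;                    /* slots in use, expected <= BTEST_CAPACITY */
    test_t *dt[BTEST_CAPACITY];
} btest_t;

/* Bounds-checked lookup: returns NULL for an out-of-range index. */
static test_t *btest_get(const btest_t *b, size_t idx)
{
    if (idx >= b->sz || idx >= BTEST_CAPACITY)
        return NULL;
    return b->dt[idx];
}
```

The trade-off is that every instance reserves room for `BTEST_CAPACITY` entries, so the capacity has to be an upper bound on what the kernel will ever need.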
To test points 2 and 3 under your additional notes, I added the following lines to the end of the kernel code and compiled it. It worked fine with both CodeXL and VSC++.
void ** vpp = btt.dt;
test_p * tpp = btt.dt;
So my suggestion is to check with the latest driver and let us know your findings.
Regards,
Hi Dipak,
Thanks for your response. I will note one slight error in it. While the spec does not allow an array with no specified size, it doesn't disallow a size of 0, which also fails. I suppose that could be a deliberate choice: the AMD compiler could choose to fail on a size of 0 (since that usually means 'variable'). I do find that this part of the spec makes for slightly messier code when the actual size of the structure is variable.
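For what it's worth, one way to handle a variable-sized record without an unsized array member is to flatten it into a plain buffer and index past a small header by hand. The sketch below is plain C for illustration only. The layout and the names `REC_HEADER_WORDS`, `rec_count`, `rec_entry` and `rec_words` are assumptions made up for this example, and the entries are stored as indices rather than pointers to keep the layout simple.

```c
#include <stddef.h>

/*
 * Hypothetical flat layout for a variable-length record, stored in a
 * plain array of unsigned ints:
 *   buf[0]        number of entries (plays the role of `sz`)
 *   buf[1 .. sz]  entries, here indices into a separate node array
 */
enum { REC_HEADER_WORDS = 1 };

/* Number of entries stored in the record. */
static unsigned int rec_count(const unsigned int *buf)
{
    return buf[0];
}

/* Fetch entry idx; returns fallback when idx is out of range. */
static unsigned int rec_entry(const unsigned int *buf, unsigned int idx,
                              unsigned int fallback)
{
    if (idx >= rec_count(buf))
        return fallback;
    return buf[REC_HEADER_WORDS + idx];
}

/* Total words occupied by the record, i.e. the offset of the next record. */
static size_t rec_words(const unsigned int *buf)
{
    return (size_t)REC_HEADER_WORDS + rec_count(buf);
}
```

Records can then be packed back to back in one buffer, with `rec_words` giving the step from one record to the next.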
Cheers,
Chris...