When testing the latest AMD OpenCL CPU implementation against my Ruby bindings, I found that I could not get the binaries from a compiled OpenCL program.
This happens with both the driver from the APP SDK v3.0 and the one from Radeon Crimson 15.12-15.302. I am using 64-bit Ubuntu 16.04 (4.4.0-21 kernel) with gcc 5.3.1 on an i7-2760QM CPU with 8 GB of RAM.
I made a test case in C (attached) that reproduces the behavior. The heap is corrupted during the call to clGetProgramInfo:
gcc test_amd.c -lOpenCL -Wall -g
valgrind ./a.out -v
==2642== Memcheck, a memory error detector
==2642== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==2642== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==2642== Command: ./a.out -v
==2642==
26816
==2642== Invalid write of size 8
==2642==    at 0x4C3453F: memset (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x60EDAFD: clGetProgramInfo (in /opt/amd/opencl/lib64/libamdocl64.so)
==2642==    by 0x400B3E: main (test_amd.c:56)
==2642==  Address 0xb368df8 is 0 bytes after a block of size 8 alloc'd
==2642==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2642==    by 0x400A87: main (test_amd.c:48)
==2642==
==2642==
==2642== HEAP SUMMARY:
==2642==     in use at exit: 418,964 bytes in 746 blocks
==2642==   total heap usage: 33,643 allocs, 32,897 frees, 9,010,065 bytes allocated
==2642==
==2642== LEAK SUMMARY:
==2642==    definitely lost: 5,544 bytes in 18 blocks
==2642==    indirectly lost: 320,109 bytes in 84 blocks
==2642==      possibly lost: 0 bytes in 0 blocks
==2642==    still reachable: 93,311 bytes in 644 blocks
==2642==         suppressed: 0 bytes in 0 blocks
==2642== Rerun with --leak-check=full to see details of leaked memory
==2642==
==2642== For counts of detected and suppressed errors, rerun with: -v
==2642== ERROR SUMMARY: 17 errors from 1 contexts (suppressed: 0 from 0)
Without valgrind, the program segfaults when releasing resources, or sometimes earlier.
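For anyone who wants to confirm this kind of overrun without valgrind, one option is to over-allocate the output buffer and put a known byte pattern after the part the callee is supposed to fill. The following is only a minimal sketch of that idea, not code from the attached test case: `guarded_alloc`, `guard_intact` and `fake_query` are hypothetical names, and `fake_query` is merely a stand-in that imitates a library call clearing more bytes than requested.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers; not part of test_amd.c or the OpenCL API. */
#define GUARD_BYTES 64
#define GUARD_FILL  0xA5

/* Allocate `size` usable bytes followed by GUARD_BYTES of a known pattern. */
static void *guarded_alloc(size_t size)
{
    unsigned char *p = malloc(size + GUARD_BYTES);
    if (p == NULL)
        return NULL;
    memset(p, 0, size);                         /* usable area */
    memset(p + size, GUARD_FILL, GUARD_BYTES);  /* guard area  */
    return p;
}

/* Return 1 if the guard area after the first `size` bytes is untouched,
 * 0 if something wrote past the end of the usable area. */
static int guard_intact(const void *buf, size_t size)
{
    const unsigned char *g = (const unsigned char *)buf + size;
    for (size_t i = 0; i < GUARD_BYTES; i++)
        if (g[i] != GUARD_FILL)
            return 0;
    return 1;
}

/* Illustrative stand-in for a library call that fills an output buffer:
 * it zeroes `written` bytes no matter how large the caller's buffer is. */
static void fake_query(void *out, size_t written)
{
    memset(out, 0, written);
}
```

In the real test case, the array passed to clGetProgramInfo would be obtained from `guarded_alloc` and `guard_intact` checked right after the call. Because the stray write then lands in the guard area rather than in the allocator's bookkeeping, the program should fail at a predictable point instead of crashing later during resource release.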
To double-check, I ran the same test case against the Intel and NVIDIA implementations, and it ran fine both with and without valgrind.
Aside from that, everything else worked fine on the AMD implementation (program creation, build, kernel creation and test, introspection...).
Let me know if I overlooked something.
test_amd.c.zip 978 bytes