alexaverbuch
Journeyman III

Repeat runs cause memory dump

Need to make multiple runs of Kernel program to gauge performance

Hi again,

I've now finished the app for my OpenCL school project and I'd like to run it multiple times to get a more accurate view of its performance.

Just to get started I tried running the entire contents of the Host main function twice, but this causes a memory dump.

I'm pretty sure I've freed all local variables, so now I'm wondering if it's caused by the runtime or Stream framework.

Is there an obvious reason why I shouldn't be able to do this?

int main(int argc, char * argv[])
{
    IplImage* cvRaw = cvLoadImage("raw.bmp", 1);
    width = cvRaw->width;
    height = cvRaw->height;
    clInitializeHost(cvRaw); // Initialize Host application
    clInitialize();          // Initialize OpenCL resources
    clRunKernels();          // Run the CL program
    clCleanup();             // Releases OpenCL resources
    clCleanupHost();         // Release host resources
    clInitializeHost(cvRaw); // Initialize Host application
    clInitialize();          // Initialize OpenCL resources
    clRunKernels();          // Run the CL program
    clCleanup();             // Releases OpenCL resources
    clCleanupHost();         // Release host resources
    return 0;
}
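For timing repeated runs, one option is to wrap only the part being measured in a loop and collect one sample per iteration. The sketch below is a minimal, hypothetical illustration: `timeRepeatedRuns`, `meanMs`, and `bestMs` are names made up for this example and are not part of the Stream SDK or OpenCL, and it uses `std::chrono` (C++11) rather than the SDK's own timer.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of the Stream SDK: calls `body` `repeats`
// times and returns the wall-clock duration of each call in milliseconds.
std::vector<double> timeRepeatedRuns(int repeats, const std::function<void()>& body) {
    assert(repeats > 0);
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        body();
        const auto stop = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return ms;
}

// Mean of the samples; the caller must pass at least one sample.
double meanMs(const std::vector<double>& ms) {
    assert(!ms.empty());
    return std::accumulate(ms.begin(), ms.end(), 0.0) / static_cast<double>(ms.size());
}

// Fastest sample, often the one least disturbed by other system activity.
double bestMs(const std::vector<double>& ms) {
    assert(!ms.empty());
    return *std::min_element(ms.begin(), ms.end());
}
```

Assuming the functions from the post above, something like `timeRepeatedRuns(10, [] { clRunKernels(); })` placed between one `clInitialize()` and one `clCleanup()` would time only the kernel runs. Whether setup and teardown belong inside the loop depends on what is being measured.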

11 Replies
alexaverbuch
Journeyman III

The memory dump is below:

Thanks in advance,

Alex

*** glibc detected *** ./EdgeDetect: munmap_chunk(): invalid pointer: 0x08352464 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6[0xb72a6604]
/usr/lib/libstdc++.so.6(_ZdlPv+0x21)[0xb7488231]
./EdgeDetect[0x804e085]
/lib/tls/i686/cmov/libc.so.6(exit+0x89)[0xb7265bb9]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xed)[0xb724d77d]
./EdgeDetect[0x804a7e1]
======= Memory map: ========
08048000-08055000 r-xp 00000000 08:05 193024 /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect
08055000-08056000 r--p 0000c000 08:05 193024 /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect
08056000-08057000 rw-p 0000d000 08:05 193024 /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect
082b4000-085db000 rw-p 082b4000 00:00 0 [heap]
99e4c000-99e6d000 rw-p 99e4c000 00:00 0
9a244000-9a245000 ---p 9a244000 00:00 0
9a245000-9a255000 rwxp 9a245000 00:00 0
9a255000-9a256000 ---p 9a255000 00:00 0
9a256000-9a266000 rwxp 9a256000 00:00 0
9a266000-9a267000 ---p 9a266000 00:00 0
9a267000-9a2a7000 rwxp 9a267000 00:00 0
9a2a7000-9a6e1000 r--p 00000000 08:05 213940 /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/builtins_x86-32.bc
9a6e1000-9a6e2000 ---p 9a6e1000 00:00 0
9a6e2000-9a6f2000 rwxp 9a6e2000 00:00 0
9a6f2000-9a6f3000 ---p 9a6f2000 00:00 0
9a6f3000-9a703000 rwxp 9a6f3000 00:00 0
9a703000-9a704000 ---p 9a703000 00:00 0
9a704000-9a744000 rwxp 9a704000 00:00 0
9a744000-a2bee000 rw-p 9a744000 00:00 0
b3542000-b6706000 rw-p b3542000 00:00 0
b6706000-b670a000 r-xp 00000000 08:05 9372 /usr/lib/libXdmcp.so.6.0.0
b670a000-b670b000 rw-p 00003000 08:05 9372 /usr/lib/libXdmcp.so.6.0.0
b670b000-b670d000 r-xp 00000000 08:05 9361 /usr/lib/libXau.so.6.0.0
b670d000-b670e000 r--p 00001000 08:05 9361 /usr/lib/libXau.so.6.0.0
b670e000-b670f000 rw-p 00002000 08:05 9361 /usr/lib/libXau.so.6.0.0
b670f000-b6733000 r-xp 00000000 08:05 9618 /usr/lib/libexpat.so.1.5.2
b6733000-b6735000 r--p 00023000 08:05 9618 /usr/lib/libexpat.so.1.5.2
b6735000-b6736000 rw-p 00025000 08:05 9618 /usr/lib/libexpat.so.1.5.2
b6736000-b6737000 rw-p b6736000 00:00 0
b6737000-b674f000 r-xp 00000000 08:05 131780 /usr/lib/libxcb.so.1.1.0
b674f000-b6750000 r--p 00017000 08:05 131780 /usr/lib/libxcb.so.1.1.0
b6750000-b6751000 rw-p 00018000 08:05 131780 /usr/lib/libxcb.so.1.1.0
b6751000-b6757000 r-xp 00000000 08:05 76784 /usr/lib/libxcb-render.so.0.0.0
b6757000-b6758000 r--p 00005000 08:05 76784 /usr/lib/libxcb-render.so.0.0.0
b6758000-b6759000 rw-p 00006000 08:05 76784 /usr/lib/libxcb-render.so.0.0.0
b6759000-b675c000 r-xp 00000000 08:05 10325 /usr/lib/libxcb-render-util.so.0.0.0
b675c000-b675d000 r--p 00002000 08:05 10325 /usr/lib/libxcb-render-util.so.0.0.0
b675d000-b675e000 rw-p 00003000 08:05 10325 /usr/lib/libxcb-render-util.so.0.0.0
b675e000-b6771000 r-xp 00000000 08:05 9558 /usr/lib/libdirect-1.0.so.0.1.0
b6771000-b6772000 r--p 00012000 08:05 9558 /usr/lib/libdirect-1.0.so.0.1.0
b6772000-b6773000 rw-p 00013000 08:05 9558 /usr/lib/libdirect-1.0.so.0.1.0
b6773000-b677a000 r-xp 00000000 08:05 9640 /usr/lib/libfusion-1.0.so.0.1.0
b677a000-b677b000 r--p 00006000 08:05 9640 /usr/lib/libfusion-1.0.so.0.1.0
b677b000-b677c000 rw-p 00007000 08:05 9640 /usr/lib/libfusion-1.0.so.0.1.0
b677c000-b677d000 rw-p b677c000 00:00 0
b677d000-b67e1000 r-xp 00000000 08:05 9560 /usr/lib/libdirectfb-1.0.so.0.1.0
b67e1000-b67e2000 r--p 00063000 08:05 9560 /usr/lib/libdirectfb-1.0.so.0.1.0
b67e2000-b67e3000 rw-p 00064000 08:05 9560 /usr/lib/libdirectfb-1.0.so.0.1.0
b67e3000-b6823000 r-xp 00000000 08:05 10110 /usr/lib/libpixman-1.so.0.13.2
b6823000-b6825000 r--p 0003f000 08:05 10110 /usr/lib/libpixman-1.so.0.13.2
b6825000-b6826000 rw-p 00041000 08:05 10110 /usr/lib/libpixman-1.so.0.13.2
b6826000-b683e000 r-xp 00000000 08:05 2663 /lib/libselinux.so.1
b683e000-b683f000 r--p 00017000 08:05 2663 /lib/libselinux.so.1
b683f000-b6840000 rw-p 00018000 08:05 2663 /lib/libselinux.so.1
b6840000-b6844000 r-xp 00000000 08:05 9376 /usr/lib/libXfixes.so.3.1.0
b6844000-b6845000 rw-p 00003000 08:05 9376 /usr/lib/libXfixes.so.3.1.0
b6845000-b6846000 rw-p b6845000 00:00 0
b6846000-b6848000 r-xp 00000000 08:05 9370 /usr/lib/libXdamage.so.1.1.0
b6848000-b6849000 rw-p 00001000 08:05 9370 /usr/lib/libXdamage.so.1.1.0
b6849000-b684b000 r-xp 00000000 08:05 9366 /usr/lib/libXcomposite.so.1.0.0
b684b000-b684c000 r--p 00001000 08:05 9366 /usr/lib/libXcomposite.so.1.0.0
b684c000-b684d000 rw-p 00002000 08:05 9366 /usr/lib/libXcomposite.so.1.0.0
b684d000-b6937000 r-xp 00000000 08:05 9355 /usr/lib/libX11.so.6.2.0
b6937000-b6938000 ---p 000ea000 08:05 9355 /usr/lib/libX11.so.6.2.0
b6938000-b6939000 r--p 000ea000 08:05 9355 /usr/lib/libX11.so.6.2.0
b6939000-b693b000 rw-p 000eb000 08:05 9355 /usr/lib/libX11.so.6.2.0
b693b000-b693c000 rw-p b693b000 00:00 0
b693c000-b6944000 r-xp 00000000 08:05 9368 /usr/lib/libXcursor.so.1.0.2
b6944000-b6945000 rw-p 00007000 08:05 9368 /usr/lib/libXcursor.so.1.0.2
b6945000-b694b000 r-xp 00000000 08:05 9394 /usr/lib/libXrandr.so.2.2.0
b694b000-b694c000 r--p 00006000 08:05 9394 /usr/lib/libXrandr.so.2.2.0
b694c000-b694d000 rw-p 00007000 08:05 9394 /usr/lib/libXrandr.so.2.2.0
b694d000-b6955000 r-xp 00000000 08:05 29517 /usr/lib/libXi.so.6.0.0
b6955000-b6956000 r--p 00007000 08:05 29517 /usr/lib/libXi.so.6.0.0
b6956000-b6957000 rw-p 00008000 08:05 29517 /usr/lib/libXi.so.6.0.0
b6957000-b6958000 rw-p b6957000 00:00 0
b6958000-b695a000 r-xp 00000000 08:05 9384 /usr/lib/libXinerama.so.1.0.0
b695a000-b695b000 rw-p 00001000 08:05 9384 /usr/lib/libXinerama.so.1.0.0
b695b000-b6963000 r-xp 00000000 08:05 9396 /usr/lib/libXrender.so.1.3.0
b6963000-b6964000 r--p 00007000 08:05 9396 /usr/lib/libXrender.so.1.3.0
b6964000-b6965000 rw-p 00008000 08:05 9396 /usr/lib/libXrender.so.1.3.0
b6965000-b6973000 r-xp 00000000 08:05 9374 /usr/lib/libXext.so.6.4.0
b6973000-b6974000 r--p 0000d000 08:05 9374 /usr/lib/libXext.so.6.4.0
b6974000-b6975000 rw-p 0000e000 08:05 9374 /usr/lib/libXext.so.6.4.0
b6975000-b69a5000 r-xp 00000000 08:05 2649 /lib/libpcre.so.3.12.1
b69a5000-b69a6000 r--p 0002f000 08:05 2649 /lib/libpcre.so.3.12.1
b69a6000-b69a7000 rw-p 00030000 08:05 2649 /lib/libpcre.so.3.12.1
b69a7000-b69a8000 rw-p b69a7000 00:00 0
b69a8000-b69ed000 r-xp 00000000 08:05 9950 /usr/lib/libjasper.so.1.0.0
b69ed000-b69ee000 r--p 00044000 08:05 9950 /usr/lib/libjasper.so.1.0.0
b69ee000-b69f1000 rw-p 00045000 08:05 9950 /usr/lib/libjasper.so.1.0.0
b69f1000-b69f7000 rw-p b69f1000 00:00 0
b69f7000-b6a49000 r-xp 00000000 08:05 131824 /usr/lib/libtiff.so.4.2.1
b6a49000-b6a4a000 ---p 00052000 08:05 131824 /usr/lib/libtiff.so.4.2.1
b6a4a000-b6a4c000 r--p 00052000 08:05 131824 /usr/lib/libtiff.so.4.2.1
b6a4c000-b6a4d000 rw-p 00054000 08:05 131824 /usr/lib/libtiff.so.4.2.1
b6a4d000-b6a61000 r-xp 00000000 08:05 2695 /lib/libz.so.1.2.3.3
b6a61000-b6a62000 r--p 00013000 08:05 2695 /lib/libz.so.1.2.3.3
b6a62000-b6a63000 rw-p 00014000 08:05 2695 /lib/libz.so.1.2.3.3
b6a63000-b6a82000 r-xp 00000000 08:05 9952 /usr/lib/libjpeg.so.62.0.0
b6a82000-b6a83000 rw-p 0001e000 08:05 9952 /usr/lib/libjpeg.so.62.0.0
b6a83000-b6aa7000 r-xp 00000000 08:05 10116 /usr/lib/libpng12.so.0.27.0
b6aa7000-b6aa8000 r--p 00023000 08:05 10116 /usr/lib/libpng12.so.0.27.0
b6aa8000-b6aa9000 rw-p 00024000 08:05 10116 /usr/lib/libpng12.so.0.27.0
b6aa9000-b6b5f000 r-xp 00000000 08:05 9717 /usr/lib/libglib-2.0.so.0.2000.1
b6b5f000-b6b60000 r--p 000b5000 08:05 9717 /usr/lib/libglib-2.0.so.0.2000.1
b6b60000-b6b61000 rw-p 000b6000 08:05 9717 /usr/lib/libglib-2.0.so.0.2000.1
b6b61000-b6b62000 rw-p b6b61000 00:00 0
b6b62000-b6b65000 r-xp 00000000 08:05 9729 /usr/lib/libgmodule-2.0.so.0.2000.1
b6b65000-b6b66000 r--p 00002000 08:05 9729 /usr/lib/libgmodule-2.0.so.0.2000.1
b6b66000-b6b67000 rw-p 00003000 08:05 9729 /usr/lib/libgmodule-2.0.so.0.2000.1
b6b67000-b6ba3000 r-xp 00000000 08:05 9771 /usr/lib/libgobject-2.0.so.0.2000.1
b6ba3000-b6ba4000 r--p 0003b000 08:05 9771 /usr/lib/libg
Aborted


This seems to be a heap corruption error. Try using a tool like Valgrind to do a memory check.


Hi omkaranathan,

Thanks again for your help. I used Valgrind and it seemed to point to the place in my code where I call int runTimerKey = streamsdk::SDKCommon::createTimer().

I was calling this line twice and for some reason it was doing bad things.

I have now moved that line to the host main(), made runTimerKey global, and call createTimer() only once at the beginning, then reuse the same timer. The problem seems to be solved.
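For anyone reading later, here is a minimal sketch of the create-once, reuse-per-run pattern described above. `TimerRegistry` and `runWithSharedTimer` are hypothetical stand-ins written for this example. They are not the Stream SDK's `SDKCommon` class and do not reproduce its behavior or the crash. They only show the shape of the fix: one key is created before the loop and reset on each run.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a key-based timer registry. This is NOT the
// Stream SDK's SDKCommon class, only an illustration of the usage pattern.
class TimerRegistry {
    using Clock = std::chrono::steady_clock;
    struct Timer {
        Clock::time_point start;
        double totalMs = 0.0;
    };

public:
    // Registers a new timer and returns its key.
    int createTimer() {
        timers_.emplace_back();
        return static_cast<int>(timers_.size()) - 1;
    }
    // Clears the accumulated time so the same key can be reused.
    void resetTimer(int key) { at(key).totalMs = 0.0; }
    void startTimer(int key) { at(key).start = Clock::now(); }
    void stopTimer(int key) {
        Timer& t = at(key);
        t.totalMs += std::chrono::duration<double, std::milli>(Clock::now() - t.start).count();
    }
    double readTimerMs(int key) { return at(key).totalMs; }
    std::size_t timerCount() const { return timers_.size(); }

private:
    Timer& at(int key) {
        assert(key >= 0 && static_cast<std::size_t>(key) < timers_.size());
        return timers_[static_cast<std::size_t>(key)];
    }
    std::vector<Timer> timers_;
};

// Runs `work` `repeats` times against a single timer key created up front.
template <typename Work>
std::vector<double> runWithSharedTimer(TimerRegistry& registry, int repeats, Work work) {
    const int key = registry.createTimer();  // created once, before the loop
    std::vector<double> ms;
    for (int i = 0; i < repeats; ++i) {
        registry.resetTimer(key);            // reuse the key instead of re-creating it
        registry.startTimer(key);
        work();
        registry.stopTimer(key);
        ms.push_back(registry.readTimerMs(key));
    }
    return ms;
}
```

The point of the design is that the number of timers stays at one no matter how many runs are made, which matches the workaround of calling createTimer() once and reusing runTimerKey.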

Please see the attached valgrind.log contents, as I think the error is inside the SDK itself and may be out of my control.

Cheers,

Alex

==25257== Memcheck, a memory error detector. ==25257== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al. ==25257== Using LibVEX rev 1884, a library for dynamic binary translation. ==25257== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP. ==25257== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework. ==25257== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al. ==25257== ==25257== My PID = 25257, parent PID = 18897. Prog and args are: ==25257== ./EdgeDetect ==25257== --25257-- --25257-- Command line --25257-- ./EdgeDetect --25257-- Startup, with flags: --25257-- -v --25257-- --tool=memcheck --25257-- --leak-check=full --25257-- --num-callers=40 --25257-- --log-file=valgrind.log --25257-- Contents of /proc/version: --25257-- Linux version 2.6.28-15-generic (buildd@palmer) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) ) #49-Ubuntu SMP Tue Aug 18 18:40:08 UTC 2009 --25257-- Arch and hwcaps: X86, x86-sse1-sse2 --25257-- Page sizes: currently 4096, max supported 4096 --25257-- Valgrind library directory: /usr/lib/valgrind --25257-- Reading syms from /lib/ld-2.9.so (0x4000000) --25257-- Reading debug info from /lib/ld-2.9.so .. --25257-- .. CRC mismatch (computed 0755dd8f wanted fd1af95b) --25257-- object doesn't have a symbol table --25257-- Reading syms from /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect (0x8048000) --25257-- Reading syms from /usr/lib/valgrind/x86-linux/memcheck (0x38000000) --25257-- object doesn't have a dynamic symbol table --25257-- Reading suppressions file: /usr/lib/valgrind/default.supp --25257-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_core.so (0x4020000) --25257-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so (0x4023000) --25257-- Reading syms from /lib/tls/i686/cmov/libpthread-2.9.so (0x4041000) --25257-- Reading debug info from /lib/tls/i686/cmov/libpthread-2.9.so .. --25257-- .. 
CRC mismatch (computed 8742ae9f wanted b6e5211d) --25257-- Reading syms from /lib/tls/i686/cmov/libdl-2.9.so (0x405a000) --25257-- Reading debug info from /lib/tls/i686/cmov/libdl-2.9.so .. --25257-- .. CRC mismatch (computed f4700f67 wanted 9261fc98) --25257-- object doesn't have a symbol table --25257-- Reading syms from /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so (0x405f000) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libcxcore.so.1.0.0 (0x495d000) --25257-- Reading debug info from /usr/lib/libcxcore.so.1.0.0 .. --25257-- .. CRC mismatch (computed d5fcd8ce wanted c15b1187) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libcv.so.1.0.0 (0x4a74000) --25257-- Reading debug info from /usr/lib/libcv.so.1.0.0 .. --25257-- .. CRC mismatch (computed 871c2b90 wanted 4a46cf76) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libhighgui.so.1.0.0 (0x4b43000) --25257-- Reading debug info from /usr/lib/libhighgui.so.1.0.0 .. --25257-- .. CRC mismatch (computed 856ccd8c wanted 3535f73b) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libcvaux.so.1.0.0 (0x4b6c000) --25257-- Reading debug info from /usr/lib/libcvaux.so.1.0.0 .. --25257-- .. CRC mismatch (computed 67c3343b wanted 38510577) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libml.so.1.0.0 (0x4c07000) --25257-- Reading debug info from /usr/lib/libml.so.1.0.0 .. --25257-- .. CRC mismatch (computed eb162527 wanted 81be93f4) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libstdc++.so.6.0.10 (0x4c45000) --25257-- Reading debug info from /usr/lib/libstdc++.so.6.0.10 .. --25257-- .. 
CRC mismatch (computed 87794c5d wanted bcd37461) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/tls/i686/cmov/libm-2.9.so (0x4d34000) --25257-- Reading debug info from /lib/tls/i686/cmov/libm-2.9.so .. --25257-- .. CRC mismatch (computed 12e307f3 wanted 8d9692d5) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/libgcc_s.so.1 (0x4d5a000) --25257-- Reading debug info from /lib/libgcc_s.so.1 .. --25257-- .. CRC mismatch (computed 224ab3f8 wanted 89276151) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/tls/i686/cmov/libc-2.9.so (0x4d69000) --25257-- Reading debug info from /lib/tls/i686/cmov/libc-2.9.so .. --25257-- .. CRC mismatch (computed 6de3199f wanted 8d898f0d) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/tls/i686/cmov/librt-2.9.so (0x4ecc000) --25257-- Reading debug info from /lib/tls/i686/cmov/librt-2.9.so .. --25257-- .. CRC mismatch (computed 43333164 wanted bdf5bbc7) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgomp.so.1.0.0 (0x4ed6000) --25257-- Reading debug info from /usr/lib/libgomp.so.1.0.0 .. --25257-- .. CRC mismatch (computed 2a97c8f2 wanted fc3b509e) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgthread-2.0.so.0.2000.1 (0x4edf000) --25257-- Reading debug info from /usr/lib/libgthread-2.0.so.0.2000.1 .. --25257-- .. CRC mismatch (computed b7331d00 wanted 73da8835) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgtk-x11-2.0.so.0.1600.1 (0x4ee5000) --25257-- Reading debug info from /usr/lib/libgtk-x11-2.0.so.0.1600.1 .. --25257-- .. CRC mismatch (computed 0db20067 wanted e22c823e) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgdk-x11-2.0.so.0.1600.1 (0x5296000) --25257-- Reading debug info from /usr/lib/libgdk-x11-2.0.so.0.1600.1 .. --25257-- .. 
CRC mismatch (computed 16db213f wanted 2b521efd) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libatk-1.0.so.0.2609.1 (0x5323000) --25257-- Reading debug info from /usr/lib/libatk-1.0.so.0.2609.1 .. --25257-- .. CRC mismatch (computed 0dbeadec wanted a0f02cda) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libpangoft2-1.0.so.0.2400.1 (0x533f000) --25257-- Reading debug info from /usr/lib/libpangoft2-1.0.so.0.2400.1 .. --25257-- .. CRC mismatch (computed c42e4c50 wanted a8587683) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libpangocairo-1.0.so.0.2400.1 (0x5368000) --25257-- Reading debug info from /usr/lib/libpangocairo-1.0.so.0.2400.1 .. --25257-- .. CRC mismatch (computed 91a0a0b7 wanted 7d6c82a7) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgio-2.0.so.0.2000.1 (0x5374000) --25257-- Reading debug info from /usr/lib/libgio-2.0.so.0.2000.1 .. --25257-- .. CRC mismatch (computed 5c96aa59 wanted 4c03bcde) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libcairo.so.2.10800.6 (0x53e2000) --25257-- Reading debug info from /usr/lib/libcairo.so.2.10800.6 .. --25257-- .. CRC mismatch (computed 1fdfb294 wanted 7ec915a5) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libpango-1.0.so.0.2400.1 (0x545c000) --25257-- Reading debug info from /usr/lib/libpango-1.0.so.0.2400.1 .. --25257-- .. CRC mismatch (computed 26a4e52e wanted 07686a62) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libfreetype.so.6.3.20 (0x54a0000) --25257-- Reading debug info from /usr/lib/libfreetype.so.6.3.20 .. --25257-- .. CRC mismatch (computed 3459e5c9 wanted b53a2170) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libfontconfig.so.1.3.0 (0x5517000) --25257-- Reading debug info from /usr/lib/libfontconfig.so.1.3.0 .. 
--25257-- .. CRC mismatch (computed eb5b491f wanted c441d2cf) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgdk_pixbuf-2.0.so.0.1600.1 (0x5544000) --25257-- Reading debug info from /usr/lib/libgdk_pixbuf-2.0.so.0.1600.1 .. --25257-- .. CRC mismatch (computed 23f59eab wanted dba61049) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgobject-2.0.so.0.2000.1 (0x555e000) --25257-- Reading debug info from /usr/lib/libgobject-2.0.so.0.2000.1 .. --25257-- .. CRC mismatch (computed 0db70d52 wanted 51030dca) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libgmodule-2.0.so.0.2000.1 (0x559c000) --25257-- Reading debug info from /usr/lib/libgmodule-2.0.so.0.2000.1 .. --25257-- .. CRC mismatch (computed 3e5dc739 wanted f4fb266c) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libglib-2.0.so.0.2000.1 (0x55a2000) --25257-- Reading debug info from /usr/lib/libglib-2.0.so.0.2000.1 .. --25257-- .. CRC mismatch (computed 72130d12 wanted db61242a) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libpng12.so.0.27.0 (0x565a000) --25257-- Reading debug info from /usr/lib/libpng12.so.0.27.0 .. --25257-- .. CRC mismatch (computed 26f47527 wanted 59ea1569) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libjpeg.so.62.0.0 (0x5680000) --25257-- Reading debug info from /usr/lib/libjpeg.so.62.0.0 .. --25257-- .. CRC mismatch (computed 85f681c5 wanted d70a584d) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/libz.so.1.2.3.3 (0x56a0000) --25257-- Reading debug info from /lib/libz.so.1.2.3.3 .. --25257-- .. CRC mismatch (computed 6f997343 wanted 90f70a2f) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libtiff.so.4.2.1 (0x56b6000) --25257-- Reading debug info from /usr/lib/libtiff.so.4.2.1 .. --25257-- .. 
CRC mismatch (computed 58b50ba7 wanted a4e40215) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libjasper.so.1.0.0 (0x570c000) --25257-- Reading debug info from /usr/lib/libjasper.so.1.0.0 .. --25257-- .. CRC mismatch (computed 470dc489 wanted e00840bc) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/libpcre.so.3.12.1 (0x575c000) --25257-- Reading debug info from /lib/libpcre.so.3.12.1 .. --25257-- .. CRC mismatch (computed c1b54a06 wanted 740d7eeb) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXext.so.6.4.0 (0x578e000) --25257-- Reading debug info from /usr/lib/libXext.so.6.4.0 .. --25257-- .. CRC mismatch (computed aebfe0cd wanted ab108a38) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXrender.so.1.3.0 (0x579e000) --25257-- Reading debug info from /usr/lib/libXrender.so.1.3.0 .. --25257-- .. CRC mismatch (computed 9442c42e wanted 09588c2e) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXinerama.so.1.0.0 (0x57a8000) --25257-- Reading debug info from /usr/lib/libXinerama.so.1.0.0 .. --25257-- .. CRC mismatch (computed 9bb9fcff wanted 88bb0683) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXi.so.6.0.0 (0x57ac000) --25257-- Reading debug info from /usr/lib/libXi.so.6.0.0 .. --25257-- .. CRC mismatch (computed f49e34fe wanted 552bdc1b) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXrandr.so.2.2.0 (0x57b6000) --25257-- Reading debug info from /usr/lib/libXrandr.so.2.2.0 .. --25257-- .. CRC mismatch (computed 6dff934e wanted 6b001629) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXcursor.so.1.0.2 (0x57be000) --25257-- Reading debug info from /usr/lib/libXcursor.so.1.0.2 .. --25257-- .. 
CRC mismatch (computed 353e160f wanted b2634adb) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libX11.so.6.2.0 (0x57c7000) --25257-- Reading debug info from /usr/lib/libX11.so.6.2.0 .. --25257-- .. CRC mismatch (computed 1f902b93 wanted 316dacdc) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXcomposite.so.1.0.0 (0x58b6000) --25257-- Reading debug info from /usr/lib/libXcomposite.so.1.0.0 .. --25257-- .. CRC mismatch (computed 8671d553 wanted 245ec3bf) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXdamage.so.1.1.0 (0x58ba000) --25257-- Reading debug info from /usr/lib/libXdamage.so.1.1.0 .. --25257-- .. CRC mismatch (computed 9407c5e3 wanted 33f85503) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXfixes.so.3.1.0 (0x58be000) --25257-- Reading debug info from /usr/lib/libXfixes.so.3.1.0 .. --25257-- .. CRC mismatch (computed 11b5b3aa wanted aaf9c283) --25257-- object doesn't have a symbol table --25257-- Reading syms from /lib/libselinux.so.1 (0x58c3000) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libpixman-1.so.0.13.2 (0x58dd000) --25257-- Reading debug info from /usr/lib/libpixman-1.so.0.13.2 .. --25257-- .. CRC mismatch (computed 7b5e423c wanted ed86b9c7) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libdirectfb-1.0.so.0.1.0 (0x5920000) --25257-- Reading debug info from /usr/lib/libdirectfb-1.0.so.0.1.0 .. --25257-- .. CRC mismatch (computed 64b07ce9 wanted 66c09962) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libfusion-1.0.so.0.1.0 (0x5987000) --25257-- Reading debug info from /usr/lib/libfusion-1.0.so.0.1.0 .. --25257-- .. 
CRC mismatch (computed 845a7b0a wanted dc06c697) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libdirect-1.0.so.0.1.0 (0x5990000) --25257-- Reading debug info from /usr/lib/libdirect-1.0.so.0.1.0 .. --25257-- .. CRC mismatch (computed f84820f5 wanted 38916263) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libxcb-render-util.so.0.0.0 (0x59a5000) --25257-- Reading debug info from /usr/lib/libxcb-render-util.so.0.0.0 .. --25257-- .. CRC mismatch (computed 9100b0c4 wanted fd42e994) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libxcb-render.so.0.0.0 (0x59aa000) --25257-- Reading debug info from /usr/lib/libxcb-render.so.0.0.0 .. --25257-- .. CRC mismatch (computed f6a47d9f wanted f33380a7) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libxcb.so.1.1.0 (0x59b2000) --25257-- Reading debug info from /usr/lib/libxcb.so.1.1.0 .. --25257-- .. CRC mismatch (computed cce0963e wanted 2297268f) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libexpat.so.1.5.2 (0x59cd000) --25257-- Reading debug info from /usr/lib/libexpat.so.1.5.2 .. --25257-- .. CRC mismatch (computed 30743ed7 wanted a52f20e6) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXau.so.6.0.0 (0x59f4000) --25257-- Reading debug info from /usr/lib/libXau.so.6.0.0 .. --25257-- .. CRC mismatch (computed 753fc8bd wanted d5b4e482) --25257-- object doesn't have a symbol table --25257-- Reading syms from /usr/lib/libXdmcp.so.6.0.0 (0x59f8000) --25257-- Reading debug info from /usr/lib/libXdmcp.so.6.0.0 .. --25257-- .. 
CRC mismatch (computed 7892784a wanted 1c340519) --25257-- object doesn't have a symbol table --25257-- REDIR: 0x4ddff00 (index) redirected to 0x4027420 (index) --25257-- REDIR: 0x4de1e40 (memchr) redirected to 0x4027b00 (memchr) --25257-- REDIR: 0x4de0a60 (rindex) redirected to 0x4027330 (rindex) --25257-- REDIR: 0x4ddc930 (malloc) redirected to 0x4026f20 (malloc) --25257-- REDIR: 0x4dda520 (free) redirected to 0x4025d40 (free) --25257-- REDIR: 0x4de5390 (strchrnul) redirected to 0x40286c0 (strchrnul) --25257-- REDIR: 0x4de05e0 (strlen) redirected to 0x40276e0 (strlen) --25257-- REDIR: 0x4de2850 (memcpy) redirected to 0x4027b50 (memcpy) --25257-- REDIR: 0x4de23a0 (mempcpy) redirected to 0x4028720 (mempcpy) --25257-- REDIR: 0x4ddcde0 (realloc) redirected to 0x4027030 (realloc) --25257-- REDIR: 0x4de2340 (memset) redirected to 0x40285f0 (memset) --25257-- REDIR: 0x4ddc600 (calloc) redirected to 0x4024fd0 (calloc) --25257-- REDIR: 0x4cfff20 (operator new(unsigned int)) redirected to 0x4026930 (operator new(unsigned int)) --25257-- REDIR: 0x4d00070 (operator new[](unsigned int)) redirected to 0x4026250 (operator new[](unsigned int)) --25257-- REDIR: 0x4de00e0 (strcpy) redirected to 0x4027740 (strcpy) --25257-- REDIR: 0x4de22d0 (memmove) redirected to 0x4028650 (memmove) --25257-- REDIR: 0x4cfe210 (operator delete(void*)) redirected to 0x40258e0 (operator delete(void*)) --25257-- REDIR: 0x4de0910 (strncpy) redirected to 0x4027810 (strncpy) --25257-- REDIR: 0x4cfe270 (operator delete[](void*)) redirected to 0x40253a0 (operator delete[](void*)) ==25257== Warning: set address range perms: large range [0xba17028, 0x1c36a088) (undefined) --25257-- REDIR: 0x4de0070 (strcmp) redirected to 0x40279e0 (strcmp) --25257-- REDIR: 0x4de0690 (strnlen) redirected to 0x40276a0 (strnlen) --25257-- REDIR: 0x4de52c0 (rawmemchr) redirected to 0x4028700 (rawmemchr) --25257-- REDIR: 0x4e64140 (__strcpy_chk) redirected to 0x4028ca0 (__strcpy_chk) --25257-- REDIR: 0x4ddcd40 (posix_memalign) 
redirected to 0x4024f70 (posix_memalign) --25257-- REDIR: 0x4d000b0 (operator new[](unsigned int, std::nothrow_t const&)) redirected to 0x4026010 (operator new[](unsigned int, std::nothrow_t const&)) --25257-- REDIR: 0x4de0800 (strncmp) redirected to 0x4027950 (strncmp) --25257-- memcheck GC: 1024 nodes, 1024 survivors (100.0%) --25257-- memcheck GC: increase table size to 2048 --25257-- memcheck GC: 2048 nodes, 2048 survivors (100.0%) --25257-- memcheck GC: increase table size to 4096 --25257-- memcheck GC: 4096 nodes, 3169 survivors ( 77.3%) --25257-- memcheck GC: increase table size to 8192 --25257-- memcheck GC: 8192 nodes, 7200 survivors ( 87.8%) --25257-- memcheck GC: increase table size to 16384 --25257-- REDIR: 0x4cfffd0 (operator new(unsigned int, std::nothrow_t const&)) redirected to 0x40266f0 (operator new(unsigned int, std::nothrow_t const&)) --25257-- REDIR: 0x4cfe240 (operator delete(void*, std::nothrow_t const&)) redirected to 0x4025720 (operator delete(void*, std::nothrow_t const&)) --25257-- memcheck GC: 16384 nodes, 16379 survivors ( 99.9%) --25257-- memcheck GC: increase table size to 32768 --25257-- Reading syms from /tmp/OCLqHuWqN.so (0x76c7000) ==25257== Warning: client switching stacks? SP change: 0x403c1c0 --> 0x5a29eec ==25257== to suppress, use: --max-stackframe=27188524 or greater ==25257== Warning: client switching stacks? SP change: 0x5e521c0 --> 0x5a35eec ==25257== to suppress, use: --max-stackframe=4309716 or greater ==25257== Warning: client switching stacks? SP change: 0x403c1c0 --> 0x5a29eec ==25257== to suppress, use: --max-stackframe=27188524 or greater ==25257== further instances of this message will not be shown. 
==25257== Thread 3:
==25257== Invalid write of size 4
==25257==    at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==  Address 0x5a29eec is 7,916 bytes inside a block of size 8,192 alloc'd
==25257==    at 0x4024EFA: memalign (vg_replace_malloc.c:460)
==25257==    by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569)
==25257==    by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
==25257==    by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so)
==25257==
==25257== Invalid read of size 4
==25257==    at 0x76C8A9C: __OpenCL_edgeDetectKernel_stub (in /tmp/OCLqHuWqN.so)
==25257==  Address 0x5a29eec is 7,916 bytes inside a block of size 8,192 alloc'd
==25257==    at 0x4024EFA: memalign (vg_replace_malloc.c:460)
==25257==    by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569)
==25257==    by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
==25257==    by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so)
==25257==
==25257== Thread 4:
==25257== Invalid write of size 4
==25257==    at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x5E521BF: ???
==25257==    by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
==25257==    by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so)
==25257==  Address 0x5a35eec is 7,916 bytes inside a block of size 8,192 alloc'd
==25257==    at 0x4024EFA: memalign (vg_replace_malloc.c:460)
==25257==    by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569)
==25257==    by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
==25257==    by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so)
==25257==
==25257== Invalid read of size 4
==25257==    at 0x76C8A9C: __OpenCL_edgeDetectKernel_stub (in /tmp/OCLqHuWqN.so)
==25257==    by 0x5E521BF: ???
==25257==    by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so)
==25257==    by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so)
==25257==  Address 0x5a35eec is 7,916 bytes inside a block of size 8,192 alloc'd
==25257==    at 0x4024EFA: memalign (vg_replace_malloc.c:460)
==25257==    by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569)
==25257==    by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so)
==25257==    by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) --25257-- Discarding syms at 0x76c76f0-0x76c98be in /tmp/OCLqHuWqN.so due to munmap() ==25257== Warning: set address range perms: large range [0xba17018, 0x1c36a098) (noaccess) ==25257== Warning: set address range perms: large range [0xba17028, 0x1c36a088) (undefined) --25257-- memcheck GC: 32768 nodes, 26278 survivors ( 80.1%) --25257-- memcheck GC: increase table size to 65536 --25257-- Reading syms from /tmp/OCL74kffw.so (0x772a000) ==25257== ==25257== Thread 1: ==25257== Invalid free() / delete / delete[] ==25257== at 0x402599A: operator delete(void*) (vg_replace_malloc.c:342) ==25257== by 0x804E355: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7C0: clRunKernels() (EdgeDetect.cpp:722) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== Address 0x5aa9834 is 4 bytes inside a block of size 60 alloc'd ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D816: main (EdgeDetect.cpp:1216) ==25257== ==25257== Invalid free() / delete / delete[] ==25257== at 0x402599A: operator delete(void*) (vg_replace_malloc.c:342) ==25257== by 0x804E355: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== Address 0x5a476ec is 4 bytes inside a block of size 88 alloc'd ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7C0: clRunKernels() (EdgeDetect.cpp:722) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== ==25257== Thread 7: ==25257== Invalid write of size 4 ==25257== at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x77291BF: ??? 
==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6d53eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== Thread 6: ==25257== Invalid write of size 4 ==25257== at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x77181BF: ??? ==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6acbeec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: 
cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== Thread 7: ==25257== Invalid read of size 4 ==25257== at 0x772BA9C: __OpenCL_edgeDetectKernel_stub (in /tmp/OCL74kffw.so) ==25257== by 0x77291BF: ??? 
==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6d53eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== Thread 6: ==25257== Invalid read of size 4 ==25257== at 0x772BA9C: __OpenCL_edgeDetectKernel_stub (in /tmp/OCL74kffw.so) ==25257== by 0x77181BF: ??? ==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6acbeec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Warning: set address range perms: large range [0xba17018, 0x1c36a098) (noaccess) ==25257== ==25257== Thread 1: ==25257== Invalid free() / delete / delete[] ==25257== at 0x402599A: operator delete(void*) (vg_replace_malloc.c:342) ==25257== by 0x804E094: streamsdk::SDKCommon::~SDKCommon() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x4D97BB8: exit (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== by 0x4D7F77C: (below main) (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x5a81b3c is 4 bytes inside a block of size 116 alloc'd ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== ==25257== ERROR SUMMARY: 59 errors from 11 contexts (suppressed: 281 from 5) ==25257== ==25257== 1 errors in context 1 of 
11: ==25257== Invalid free() / delete / delete[] ==25257== at 0x402599A: operator delete(void*) (vg_replace_malloc.c:342) ==25257== by 0x804E094: streamsdk::SDKCommon::~SDKCommon() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x4D97BB8: exit (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== by 0x4D7F77C: (below main) (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x5a81b3c is 4 bytes inside a block of size 116 alloc'd ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== ==25257== 1 errors in context 2 of 11: ==25257== Invalid free() / delete / delete[] ==25257== at 0x402599A: operator delete(void*) (vg_replace_malloc.c:342) ==25257== by 0x804E355: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== Address 0x5a476ec is 4 bytes inside a block of size 88 alloc'd ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7C0: clRunKernels() (EdgeDetect.cpp:722) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== ==25257== 1 errors in context 3 of 11: ==25257== Invalid free() / delete / delete[] ==25257== at 0x402599A: operator delete(void*) (vg_replace_malloc.c:342) ==25257== by 0x804E355: streamsdk::SDKCommon::createTimer() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7C0: clRunKernels() (EdgeDetect.cpp:722) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== Address 0x5aa9834 is 4 bytes inside a block of size 60 alloc'd ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D816: main (EdgeDetect.cpp:1216) ==25257== ==25257== 7 errors in context 4 of 11: ==25257== Thread 7: ==25257== Invalid read of size 4 ==25257== at 0x772BA9C: __OpenCL_edgeDetectKernel_stub (in /tmp/OCL74kffw.so) ==25257== by 0x77291BF: ??? ==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6d53eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 5 of 11: ==25257== Invalid write of size 4 ==25257== at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x77291BF: ??? 
==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6d53eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 6 of 11: ==25257== Thread 6: ==25257== Invalid read of size 4 ==25257== at 0x772BA9C: __OpenCL_edgeDetectKernel_stub (in /tmp/OCL74kffw.so) ==25257== by 0x77181BF: ??? ==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6acbeec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: 
cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 7 of 11: ==25257== Invalid write of size 4 ==25257== at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x77181BF: ??? 
==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x6acbeec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 8 of 11: ==25257== Thread 3: ==25257== Invalid read of size 4 ==25257== at 0x76C8A9C: ??? ==25257== Address 0x5a29eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 9 of 11: ==25257== Invalid write of size 4 ==25257== at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== Address 0x5a29eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 
0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 10 of 11: ==25257== Thread 4: ==25257== Invalid read of size 4 ==25257== at 0x76C8A9C: ??? ==25257== by 0x5E521BF: ??? ==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x5a35eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: 
amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== 7 errors in context 11 of 11: ==25257== Invalid write of size 4 ==25257== at 0x418BF24: cpu::WorkGroupOperation::execute() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x5E521BF: ??? ==25257== by 0x418B9A9: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== Address 0x5a35eec is 7,916 bytes inside a block of size 8,192 alloc'd ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: 
amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B601: cpu::WorkerThread::WorkerThread(device::VirtualDevice&, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4185AE8: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) --25257-- --25257-- supp: 4 dl-hack3-cond-1 --25257-- supp: 10 glibc-2.9-on-SUSE-10.3-(x86) --25257-- supp: 87 dl-hack3-cond-4 --25257-- supp: 59 dl-hack5-32bit-addr-4 --25257-- supp: 121 Debian libc6 (2.9.x) stripped dynamic linker ==25257== ==25257== IN SUMMARY: 59 errors from 11 contexts (suppressed: 281 from 5) ==25257== ==25257== malloc/free: in use at exit: 192,103,499 bytes in 9,102 blocks. ==25257== malloc/free: 240,877 allocs, 231,778 frees, 932,314,944 bytes allocated. ==25257== ==25257== searching for pointers to 9,102 not-freed blocks. ==25257== checked 193,195,532 bytes. 
==25257== ==25257== Thread 1: ==25257== ==25257== 11 bytes in 1 blocks are definitely lost in loss record 4 of 88 ==25257== at 0x4026FDE: malloc (vg_replace_malloc.c:207) ==25257== by 0x418E8BE: amd::CodeCache::registerCode(amd::Assembler const&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418E9B6: amd::CodeCache::init() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419629C: amd::Runtime::init(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41705B7: amd::HostThread::HostThread() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4176D4E: clCreateContextFromType (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804CCE3: clInitialize() (EdgeDetect.cpp:394) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 16 bytes in 2 blocks are definitely lost in loss record 10 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x4198E91: (within /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419C14E: amd::llvmLinkOptCG(std::string&, std::string&, std::string&, bool, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4187DD8: cpu::Program::compile(std::string const&, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183BF8: device::Program::build(std::string const*, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418FBBE: amd::Program::build(std::vector<amd::Device*, std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) 
==25257== by 0x417A51A: clBuildProgram (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D377: clInitialize() (EdgeDetect.cpp:618) ==25257== by 0x804D830: main (EdgeDetect.cpp:1223) ==25257== ==25257== ==25257== 16 bytes in 2 blocks are definitely lost in loss record 11 of 88 ==25257== at 0x4026FDE: malloc (vg_replace_malloc.c:207) ==25257== by 0x418C94C: amd::HeapObject::operator new(unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193C2B: amd::CommandQueue::~CommandQueue() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4196116: amd::ReferenceCountedObject::release() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419323F: amd::NDRangeKernelCommand::~NDRangeKernelCommand() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4196116: amd::ReferenceCountedObject::release() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C2C8: cpu::WorkGroupOperation::~WorkGroupOperation() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418B9B1: cpu::WorkerThread::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C5AC: cpu::WorkerThread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== ==25257== 16 bytes in 2 
blocks are definitely lost in loss record 13 of 88 ==25257== at 0x40260CE: operator new[](unsigned int, std::nothrow_t const&) (vg_replace_malloc.c:288) ==25257== by 0x4185A9B: cpu::VirtualCPU::VirtualCPU(cpu::Device&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418A1C6: cpu::Device::createVirtualDevice() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4193721: amd::CommandQueue::loop() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419478C: amd::CommandQueue::Thread::run(void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419854F: amd::Thread::main() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418D34C: amd::Thread::entry(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x40474FE: start_thread (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x4E4D49D: clone (in /lib/tls/i686/cmov/libc-2.9.so) ==25257== ==25257== ==25257== 115 (20 direct, 95 indirect) bytes in 1 blocks are definitely lost in loss record 21 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x433BD9C: llvm::MemoryBuffer::getFile(char const*, std::string*, long long) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x433C2CD: llvm::MemoryBuffer::getFileOrSTDIN(char const*, std::string*, long long) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419B624: amd::llvmLinkOptCG(std::string&, std::string&, std::string&, bool, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4187DD8: cpu::Program::compile(std::string const&, char const*) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183BF8: device::Program::build(std::string const*, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418FBBE: amd::Program::build(std::vector<amd::Device*, std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x417A51A: clBuildProgram (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D377: clInitialize() (EdgeDetect.cpp:618) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 50 bytes in 2 blocks are definitely lost in loss record 26 of 88 ==25257== at 0x4026FDE: malloc (vg_replace_malloc.c:207) ==25257== by 0x418854B: cpu::Program::compile(std::string const&, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183BF8: device::Program::build(std::string const*, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418FBBE: amd::Program::build(std::vector<amd::Device*, std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x417A51A: clBuildProgram (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D377: clInitialize() (EdgeDetect.cpp:618) ==25257== by 0x804D830: main (EdgeDetect.cpp:1223) ==25257== ==25257== ==25257== 68 bytes in 1 blocks are possibly lost in loss record 27 of 88 ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x417BBC3: clCreateBuffer (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D03B: clInitialize() (EdgeDetect.cpp:519) ==25257== by 0x804D830: main (EdgeDetect.cpp:1223) ==25257== ==25257== ==25257== 104 (56 direct, 48 indirect) bytes in 1 blocks are definitely lost in loss record 30 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x4503063: llvm::ArrayType::get(llvm::Type const*, unsigned long long) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E05B3: llvm::BitcodeReader::ParseTypeTable() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E172C: llvm::BitcodeReader::ParseModule(std::string const&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2742: llvm::BitcodeReader::ParseBitcode() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2AD0: llvm::getBitcodeModuleProvider(llvm::MemoryBuffer*, llvm::LLVMContext&, std::string*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2C0B: llvm::ParseBitcodeFile(llvm::MemoryBuffer*, llvm::LLVMContext&, std::string*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x5DF635B: ??? 
==25257== ==25257== ==25257== 84 bytes in 1 blocks are possibly lost in loss record 31 of 88 ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x46C694E: void* llvm::object_creator<llvm::PseudoSourceValue [4]>() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== ==25257== ==25257== 116 bytes in 1 blocks are possibly lost in loss record 33 of 88 ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7CF: clRunKernels() (EdgeDetect.cpp:723) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== ==25257== ==25257== 148 bytes in 2 blocks are definitely lost in loss record 36 of 88 ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804E1A9: streamsdk::SDKCommon::createTimer() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/samples/opencl/bin/debug/x86/EdgeDetect) ==25257== by 0x804B7C0: clRunKernels() (EdgeDetect.cpp:722) ==25257== by 0x804D835: main (EdgeDetect.cpp:1224) ==25257== ==25257== ==25257== 148 bytes in 1 blocks are definitely lost in loss record 37 of 88 ==25257== at 0x4026FDE: malloc (vg_replace_malloc.c:207) ==25257== by 0x4974B8B: (within /usr/lib/libcxcore.so.1.0.0) ==25257== by 0x49749E4: cvAlloc (in /usr/lib/libcxcore.so.1.0.0) ==25257== by 0x49877AB: cvCreateImageHeader (in /usr/lib/libcxcore.so.1.0.0) ==25257== by 0x4987878: cvCreateImage (in /usr/lib/libcxcore.so.1.0.0) ==25257== by 0x4B5AC3F: (within /usr/lib/libhighgui.so.1.0.0) ==25257== by 0x4B5ADAF: cvLoadImage (in /usr/lib/libhighgui.so.1.0.0) ==25257== by 0x804D7E8: main (EdgeDetect.cpp:1196) ==25257== ==25257== ==25257== 209 bytes in 7 blocks are possibly lost in loss record 41 of 88 ==25257== at 0x40269EE: operator new(unsigned int) 
(vg_replace_malloc.c:224) ==25257== by 0x4CDAAD3: std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.10) ==25257== by 0x4CDB734: (within /usr/lib/libstdc++.so.6.0.10) ==25257== by 0x4CDB8A5: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.10) ==25257== by 0x4192B75: amd::CommandQueue::CommandQueue(amd::Context&, amd::Device&, unsigned long long) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41776AB: clCreateCommandQueue (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804CE4C: clInitialize() (EdgeDetect.cpp:453) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 436 bytes in 2 blocks are definitely lost in loss record 47 of 88 ==25257== at 0x4024EFA: memalign (vg_replace_malloc.c:460) ==25257== by 0x4024FAE: posix_memalign (vg_replace_malloc.c:569) ==25257== by 0x418D4D1: amd::Os::alignedMalloc(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418C9B3: amd::AlignedMemory::allocate(unsigned int, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4186201: cpu::Device::init() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183B86: amd::Device::init() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41962CD: amd::Runtime::init(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41705B7: amd::HostThread::HostThread() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4176D4E: clCreateContextFromType (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804CCE3: clInitialize() (EdgeDetect.cpp:394) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 896 (416 direct, 480 indirect) bytes in 8 blocks are definitely lost in loss record 50 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x4502A44: llvm::PointerType::get(llvm::Type const*, unsigned int) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E0823: llvm::BitcodeReader::ParseTypeTable() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E172C: llvm::BitcodeReader::ParseModule(std::string const&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2742: llvm::BitcodeReader::ParseBitcode() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2AD0: llvm::getBitcodeModuleProvider(llvm::MemoryBuffer*, llvm::LLVMContext&, std::string*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41996C9: (within /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419B624: amd::llvmLinkOptCG(std::string&, std::string&, std::string&, bool, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4187DD8: cpu::Program::compile(std::string const&, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183BF8: device::Program::build(std::string const*, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418FBBE: amd::Program::build(std::vector<amd::Device*, std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x417A51A: clBuildProgram (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D377: clInitialize() (EdgeDetect.cpp:618) ==25257== by 0x804D830: main (EdgeDetect.cpp:1223) ==25257== ==25257== ==25257== 1,008 bytes in 6 blocks are possibly lost in loss record 56 of 88 ==25257== at 0x4025092: calloc (vg_replace_malloc.c:397) ==25257== by 0x401134B: _dl_allocate_tls (in /lib/ld-2.9.so) ==25257== by 0x4046672: pthread_create@@GLIBC_2.1 (in /lib/tls/i686/cmov/libpthread-2.9.so) ==25257== by 0x418D2E6: amd::Os::createOsThread(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4198127: amd::Thread::Thread(std::string const&, unsigned int, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4192B90: amd::CommandQueue::CommandQueue(amd::Context&, amd::Device&, unsigned long long) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41776AB: clCreateCommandQueue (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804CE4C: clInitialize() (EdgeDetect.cpp:453) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 1,024 bytes in 1 blocks are definitely lost in loss record 58 of 88 ==25257== at 0x4026FDE: malloc (vg_replace_malloc.c:207) ==25257== by 0x418CAF8: amd::Assembler::Assembler() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418E944: amd::CodeCache::init() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419629C: amd::Runtime::init(amd::Thread*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41705B7: amd::HostThread::HostThread() (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4176D4E: clCreateContextFromType (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804CCE3: clInitialize() (EdgeDetect.cpp:394) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 1,380 bytes in 58 blocks are definitely lost in loss record 61 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x4CDAAD3: std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.10) ==25257== by 0x4CDB734: (within /usr/lib/libstdc++.so.6.0.10) ==25257== by 0x4CDB951: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned int, std::allocator<char> const&) (in /usr/lib/libstdc++.so.6.0.10) ==25257== by 0x47E9F20: llvm::MCSectionELF::Create(llvm::StringRef const&, unsigned int, unsigned int, llvm::SectionKind, bool, llvm::MCContext&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x45AA2C8: llvm::TargetLoweringObjectFileELF::getELFSection(llvm::StringRef, unsigned int, unsigned int, llvm::SectionKind, bool) const (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x45AB4A3: llvm::TargetLoweringObjectFileELF::Initialize(llvm::MCContext&, llvm::TargetMachine const&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x47A87A7: llvm::AsmPrinter::doInitialization(llvm::Module&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x450E0D5: llvm::FPPassManager::doInitialization(llvm::Module&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x450E15C: llvm::FunctionPassManagerImpl::doInitialization(llvm::Module&) (in 
/home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x450E1B7: llvm::FunctionPassManager::doInitialization() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419BFA1: amd::llvmLinkOptCG(std::string&, std::string&, std::string&, bool, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4187DD8: cpu::Program::compile(std::string const&, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183BF8: device::Program::build(std::string const*, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418FBBE: amd::Program::build(std::vector<amd::Device*, std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x417A51A: clBuildProgram (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D377: clInitialize() (EdgeDetect.cpp:618) ==25257== by 0x804D830: main (EdgeDetect.cpp:1223) ==25257== ==25257== ==25257== 12,444 (7,396 direct, 5,048 indirect) bytes in 101 blocks are definitely lost in loss record 70 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x4503BF5: llvm::FunctionType::get(llvm::Type const*, std::vector<llvm::Type const*, std::allocator<llvm::Type const*> > const&, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E07B8: llvm::BitcodeReader::ParseTypeTable() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E172C: llvm::BitcodeReader::ParseModule(std::string const&) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2742: 
llvm::BitcodeReader::ParseBitcode() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x44E2AD0: llvm::getBitcodeModuleProvider(llvm::MemoryBuffer*, llvm::LLVMContext&, std::string*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x41996C9: (within /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x419B624: amd::llvmLinkOptCG(std::string&, std::string&, std::string&, bool, bool) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4187DD8: cpu::Program::compile(std::string const&, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4183BF8: device::Program::build(std::string const*, char const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x418FBBE: amd::Program::build(std::vector<amd::Device*, std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x417A51A: clBuildProgram (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x804D377: clInitialize() (EdgeDetect.cpp:618) ==25257== by 0x804D830: main (EdgeDetect.cpp:1223) ==25257== ==25257== ==25257== 7,988 bytes in 2 blocks are definitely lost in loss record 71 of 88 ==25257== at 0x402630E: operator new[](unsigned int) (vg_replace_malloc.c:268) ==25257== by 0x804CA54: convertToString(char const*) (EdgeDetect.cpp:168) ==25257== by 0x804D2A5: clInitialize() (EdgeDetect.cpp:599) ==25257== by 0x804D811: main (EdgeDetect.cpp:1215) ==25257== ==25257== ==25257== 22,844 (2,244 direct, 20,600 indirect) bytes in 561 blocks are definitely lost in loss record 73 of 88 ==25257== at 0x40269EE: operator new(unsigned int) (vg_replace_malloc.c:224) ==25257== by 0x4501EBB: 
llvm::DerivedType::dropAllTypeUses() (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x45020BF: llvm::DerivedType::unlockedRefineAbstractTypeTo(llvm::Type const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x450B45C: llvm::TypeMap<llvm::PointerValType, llvm::PointerType>::RefineAbstractType(llvm::PointerType*, llvm::DerivedType const*, llvm::Type const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x4503F08: llvm::PointerType::refineAbstractType(llvm::DerivedType const*, llvm::Type const*) (in /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== by 0x494FFF3: (within /home/alex/bin/packages/ati-stream-sdk-v2.0-beta3-lnx32/lib/x86/libOpenCL.so) ==25257== ==25257== ==25257== 69,553,176 bytes in 1 blocks are possibly lost in loss record 88 of 88 ==25257== at 0x4026FDE: malloc (vg_replace_malloc.c:207) ==25257== by 0x804BF25: clInitializeHost(_IplImage*) (EdgeDetect.cpp:359) ==25257== by 0x804D80C: main (EdgeDetect.cpp:1214) ==25257== ==25257== LEAK SUMMARY: ==25257== definitely lost: 21,365 bytes in 747 blocks. ==25257== indirectly lost: 26,271 bytes in 1,258 blocks. ==25257== possibly lost: 69,554,661 bytes in 17 blocks. ==25257== still reachable: 122,501,202 bytes in 7,080 blocks. ==25257== suppressed: 0 bytes in 0 blocks. ==25257== Reachable blocks (those to which a pointer was found) are not shown. 
==25257== To see them, rerun with: --leak-check=full --show-reachable=yes --25257-- memcheck: sanity checks: 14710 cheap, 151 expensive --25257-- memcheck: auxmaps: 0 auxmap entries (0k, 0M) in use --25257-- memcheck: auxmaps_L1: 0 searches, 0 cmps, ratio 0:10 --25257-- memcheck: auxmaps_L2: 0 searches, 0 nodes --25257-- memcheck: SMs: n_issued = 14020 (224320k, 219M) --25257-- memcheck: SMs: n_deissued = 10758 (172128k, 168M) --25257-- memcheck: SMs: max_noaccess = 65535 (1048560k, 1023M) --25257-- memcheck: SMs: max_undefined = 4244 (67904k, 66M) --25257-- memcheck: SMs: max_defined = 438 (7008k, 6M) --25257-- memcheck: SMs: max_non_DSM = 8563 (137008k, 133M) --25257-- memcheck: max sec V bit nodes: 40212 (2042k, 1M) --25257-- memcheck: set_sec_vbits8 calls: 256395 (new: 48626, updates: 207769) --25257-- memcheck: max shadow mem size: 139354k, 136M --25257-- translate: fast SP updates identified: 71,176 ( 88.9%) --25257-- translate: generic_known SP updates identified: 7,994 ( 9.9%) --25257-- translate: generic_unknown SP updates identified: 848 ( 1.0%) --25257-- tt/tc: 1,057,492 tt lookups requiring 15,020,266 probes --25257-- tt/tc: 1,057,492 fast-cache updates, 4 flushes --25257-- transtab: new 83,773 (2,009,613 -> 24,780,529; ratio 123:10) [0 scs] --25257-- transtab: dumped 0 (0 -> ??) --25257-- transtab: discarded 39 (3,382 -> ??) --25257-- scheduler: 1,470,779,170 jumps (bb entries). --25257-- scheduler: 14,710/105,799,416 major/minor sched events. --25257-- sanity: 14711 cheap, 151 expensive checks. --25257-- exectx: 98,317 lists, 62,733 contexts (avg 0 per list) --25257-- exectx: 481,835 searches, 470,521 full compares (976 per 1000) --25257-- exectx: 55,842 cmp2, 1,001 cmp4, 0 cmpAll --25257-- errormgr: 122 supplist searches, 16,466 comparisons during search --25257-- errormgr: 340 errlist searches, 1,693 comparisons during search
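One record in the trace above points at the application rather than at libOpenCL.so: the `operator new[]` call inside `convertToString` (EdgeDetect.cpp:168) is reported as definitely lost. A minimal sketch of a file reader that never holds a raw buffer follows. The name `readFileToString` is hypothetical and not part of the posted project, and returning an empty string on failure is an assumed convention.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): reads a whole file into
// a std::string. No new[]/delete[] pair is involved, so there is no raw
// buffer that could be leaked.
std::string readFileToString(const char* filename) {
    std::ifstream file(filename, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        // Assumed convention: an empty string signals "could not open".
        return std::string();
    }
    std::ostringstream contents;
    contents << file.rdbuf();  // stream the entire file into the buffer
    return contents.str();
}
```

A caller would check that the result is not empty before handing `c_str()` and `size()` to the OpenCL program-creation call.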


Could you post the complete source code? 


Originally posted by: omkaranathan
Could you post the complete source code?

Sure, but it's a bit of a mess!

It has also changed since I posted that error. To reproduce the error, go to where I start the timer, copy that line, and paste it a few times in succession.
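For the repeated-run measurement this thread is about, initialization and cleanup can stay outside the timed loop so that only the work itself is repeated. The following is a minimal sketch, assuming C++11; `timeRepeated` is a hypothetical helper, and in the project the callable would wrap the kernel run, which is not reproduced here.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls `work` `repetitions` times and returns the
// wall-clock duration of each call in seconds.
template <typename Work>
std::vector<double> timeRepeated(int repetitions, Work work) {
    std::vector<double> seconds;
    if (repetitions > 0) {
        seconds.reserve(static_cast<std::size_t>(repetitions));
    }
    for (int run = 0; run < repetitions; ++run) {
        const auto start = std::chrono::steady_clock::now();
        work();  // the code being measured
        const auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

Keeping one duration per run, rather than a single total, makes it easy to see whether the first run is slower than the rest.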


HOST:

#include "EdgeDetect.hpp" #include <malloc.h> ///////////////////////////////////////////////////////////////// // Util Methods ///////////////////////////////////////////////////////////////// void cvDisplay(IplImage* image, char windowName[], int x, int y) { CvSize imageSize = cvGetSize(image); cvNamedWindow(windowName); cvResizeWindow(windowName, imageSize.width, imageSize.height); cvMoveWindow(windowName, x, y); cvShowImage(windowName,image); cvWaitKey(0); cvDestroyAllWindows(); cvReleaseImage(&image); } //cl_uint *cvImageToClArray(IplImage* raw) my_uint4 *cvImageToClArray(IplImage* raw) { int width = raw->width; int height = raw->height; my_uint4 *imageArray = (my_uint4*)malloc(width * height * sizeof(my_uint4)); for (int y=0; y<height; y++) { //std::cout << "\n"; for (int x=0; x<width; x++) { CvScalar colourValue = cvGet2D(raw,y,x); int index = (y*width) + x; imageArray[index].u32[0] = (cl_uint)(colourValue.val[0]); //B imageArray[index].u32[1] = (cl_uint)(colourValue.val[1]); //G imageArray[index].u32[2] = (cl_uint)(colourValue.val[2]); //R imageArray[index].u32[3] = (cl_uint)(0); //A //std::cout << "[" << index << "-" << imageArray[index].u32[0] << "]"; } } return imageArray; } //converts raw image into intensity values IplImage* clArrayToCvImage(cl_uint* output, int resultWidth, int resultHeight) { CvSize size; size.width = resultWidth; size.height = resultHeight; IplImage* resultImg = cvCreateImage(size,IPL_DEPTH_8U,1); //std::cout << "\noutput.width=" << resultWidth << ", output.height=" << resultHeight << "\n"; //generate intensity image for (int y=0; y<resultHeight; y++) { //std::cout << "\n"; for (int x=0; x<resultWidth; x++) { CvScalar colourSelect; int index = (y*resultWidth) + x; //std::cout << "[" << index << "-" << output[index] << "]"; colourSelect.val[0] = output[index]; cvSet2D(resultImg,y,x,colourSelect); } } return resultImg; } // Converts the contents of a file into a string std::string convertToString(const char *filename) { size_t size; 
char* str; std::string s; std::fstream f(filename, (std::fstream::in | std::fstream::binary)); if(f.is_open()) { size_t fileSize; f.seekg(0, std::fstream::end); size = fileSize = f.tellg(); f.seekg(0, std::fstream::beg); str = new char[size+1]; if(!str) { f.close(); return NULL; } f.read(str, fileSize); f.close(); str[size] = '\0'; s = str; return s; } return NULL; } ///////////////////////////////////////////////////////////////// // Serial (OpenCV) Methods ///////////////////////////////////////////////////////////////// int cvDoFindEdges(IplImage* cvImg) { // IplImage* intensityImage = cvCreateImage(cvGetSize(raw),IPL_DEPTH_8U,1); // IplImage* sobelImg = cvCreateImage(cvGetSize(intensityImage),IPL_DEPTH_8U, 1); //gray scale representation of raw image IplImage* cvImgIntensity = cvGenerateIntensityImage(cvImg); //resultant image after Sobel operator applied to raw image IplImage* cvImgSobel = cvGenerateSobelImage(cvImgIntensity); // cvDisplay(cvImg,"raw",0,0); // cvDisplay(cvImgIntensity,"intensity",0,0); // cvDisplay(cvImgSobel,"imgSobel",0,0); // cvWaitKey(0); // cvDestroyAllWindows(); // cvReleaseImage(&cvImgIntensity); // cvReleaseImage(&cvImgSobel); } //converts raw image into intensity values IplImage* cvGenerateIntensityImage(IplImage* raw) { IplImage* intensityImage = cvCreateImage(cvGetSize(raw),IPL_DEPTH_8U,1); //generate intensity image for (int y=0; y<raw->height; y++) for (int x=0; x<raw->width; x++) { CvScalar colourValue = cvGet2D(raw,y,x); CvScalar colourSelect; colourSelect.val[0] = (colourValue.val[0]+colourValue.val[1]+colourValue.val[2])/3; cvSet2D(intensityImage,y,x,colourSelect); } return intensityImage; } //applies Sobel Operator IplImage* cvGenerateSobelImage(IplImage* intensityImage) { IplImage* sobelImg = cvCreateImage(cvGetSize(intensityImage),IPL_DEPTH_8U, 1); //matrix representation of Sobel image. 
to ensure negative values are not stored as 0 CvMat* sobelMat = cvCreateMat(intensityImage->height,intensityImage->width,CV_64FC1); //generate sobel image for (int y=0; y<intensityImage->height; y++) for (int x=0; x<intensityImage->width; x++) { cl_uint Gx; cl_uint Gy; cl_uint G; if ((y==0) || (y==intensityImage->height-1) || (x==0) || (x==intensityImage->width-1)) { G = cvGet2D(intensityImage,y,x).val[0]; } else { Gx = cvGet2D(intensityImage,y-1,x-1).val[0] * cvSobelOpX[0][0] + cvGet2D(intensityImage,y-1,x).val[0] * cvSobelOpX[0][1] + cvGet2D(intensityImage,y-1,x+1).val[0] * cvSobelOpX[0][2] + cvGet2D(intensityImage,y,x-1).val[0] * cvSobelOpX[1][0] + cvGet2D(intensityImage,y,x).val[0] * cvSobelOpX[1][1] + cvGet2D(intensityImage,y,x+1).val[0] * cvSobelOpX[1][2] + cvGet2D(intensityImage,y+1,x-1).val[0] * cvSobelOpX[2][0] + cvGet2D(intensityImage,y+1,x).val[0] * cvSobelOpX[2][1] + cvGet2D(intensityImage,y+1,x+1).val[0] * cvSobelOpX[2][2]; Gx = abs(Gx); Gy = cvGet2D(intensityImage,y-1,x-1).val[0] * cvSobelOpY[0][0] + cvGet2D(intensityImage,y-1,x).val[0] * cvSobelOpY[0][1] + cvGet2D(intensityImage,y-1,x+1).val[0] * cvSobelOpY[0][2] + cvGet2D(intensityImage,y,x-1).val[0] * cvSobelOpY[1][0] + cvGet2D(intensityImage,y,x).val[0] * cvSobelOpY[1][1] + cvGet2D(intensityImage,y,x+1).val[0] * cvSobelOpY[1][2] + cvGet2D(intensityImage,y+1,x-1).val[0] * cvSobelOpY[2][0] + cvGet2D(intensityImage,y+1,x).val[0] * cvSobelOpY[2][1] + cvGet2D(intensityImage,y+1,x+1).val[0] * cvSobelOpY[2][2]; Gy = abs(Gy); G = Gx + Gy; } CvScalar colourSelect; colourSelect.val[0] = G; cvmSet(sobelMat,y,x,G); cvSet2D(sobelImg,y,x,colourSelect); } return sobelImg; } ///////////////////////////////////////////////////////////////// // Parallel (OpenCL) Methods ///////////////////////////////////////////////////////////////// //todo: complete this //cl_mem clDoCreateImage( // char* filename, // cl_context context) //{ // //todo: temp test code // IplImage* tempRaw = cvLoadImage(filename, 1); // size_t 
width = (size_t)(tempRaw->width); // size_t height = (size_t)(tempRaw->height); // //todo: maybe let OpenCL deal with this (it will be width*bytes-per-pixel // size_t rowpitch = 0; //width*4; // // void* image = fopen(filename,"rb"); // if (image != NULL) { // std::cout<<"image loaded successfully: " << filename << "\n"; // } else { // std::cout<<"image could not be loaded: " << filename << "\n"; // } // // // // set the image format properties and option flags // cl_image_format format; // format.image_channel_order = CL_RGBA; // format.image_channel_data_type = CL_UNORM_INT8; // //format.image_channel_data_type = CL_UNSIGNED_INT8; // // cl_mem_flags flags; //// flags = CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR; // flags = CL_MEM_READ_ONLY; // // cl_int error = CL_SUCCESS; // // std::cout<<"BEFORE\n"; // // cl_mem myClImage = clCreateImage2D( // context, // a valid OpenCL context // flags, // option flags [1] // &format, // image format properties [2] // width, // width of the image in pixels // height, // height of the image in pixels // rowpitch, // scan-line pitch in bytes [3] // image, // pointer to the image data // &error // on return, the result code // ); // // std::cout<<"AFTER\n"; // // if(image == 0 || error != CL_SUCCESS) // { // std::cout<<"Error: Could not create 2D image (clCreateImage2D)\n"; // } // return myClImage; //} // Host Initialization: Allocate & init memory on the host. Print input array. void clInitializeHost(IplImage* cvRawImg) { input = NULL; intermediate = NULL; output = NULL; input = cvImageToClArray(cvRawImg); // input = clTestArray(10); if(input==NULL) { std::cout<<"Error: Failed to allocate host memory. (input)\n"; return; } intermediate = (cl_uint*)malloc(width * height * sizeof(cl_uint)); if(intermediate==NULL) { std::cout<<"Error: Failed to allocate host memory. (intermediate)\n"; return; } output = (cl_uint*)malloc(width * height * sizeof(cl_uint)); if(output==NULL) { std::cout<<"Error: Failed to allocate host memory. 
(output)\n"; return; } } // OpenCL related initialization // -> Create Context, Device list, Command Queue // -> Create OpenCL memory buffer objects // -> Load CL file, compile, link CL source // -> Build program and kernel objects void clInitialize(void) { cl_int status = 0; size_t deviceListSize; ///////////////////////////////////////////////////////////////// // Create an OpenCL context ///////////////////////////////////////////////////////////////// //todo: experiment with CL_DEVICE_TYPE_ ALL, DEFAULT, GPU, ACCELERATOR, CPU context = clCreateContextFromType(0, CL_DEVICE_TYPE_CPU, NULL, NULL, &status); if(status != CL_SUCCESS) { std::cout<<"Error: Creating Context. (clCreateContextFromType)\n"; return; } /* First, get the size of device list data */ status = clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &deviceListSize); if(status != CL_SUCCESS) { std::cout<< "Error: Getting Context Info (device list size, clGetContextInfo)\n"; return; } ///////////////////////////////////////////////////////////////// // Detect OpenCL devices ///////////////////////////////////////////////////////////////// devices = (cl_device_id *)malloc(deviceListSize); if(devices == 0) { std::cout<<"Error: No devices found.\n"; return; } /* Now, get the device list data */ status = clGetContextInfo( context, CL_CONTEXT_DEVICES, deviceListSize, devices, NULL); if(status != CL_SUCCESS) { std::cout<<"Error: Getting Context Info (device list, clGetContextInfo)\n"; return; } ///////////////////////////////////////////////////////////////// // Create an OpenCL command queue ///////////////////////////////////////////////////////////////// /* The block is to move the declaration of prop closer to its use */ //todo: set this to 0 later maybe cl_command_queue_properties prop = 0; if (PROFILE) { prop |= CL_QUEUE_PROFILING_ENABLE; } commandQueue = clCreateCommandQueue( context, devices[0], prop, &status); if(status != CL_SUCCESS) { std::cout<<"Creating Command Queue. 
(clCreateCommandQueue)\n"; return; } ///////////////////////////////////////////////////////////////// // Create OpenCL memory buffers ///////////////////////////////////////////////////////////////// inputBuffer = clCreateBuffer( context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint4) * width * height, input, &status); if(status == CL_INVALID_CONTEXT) { //context not valid std::cout<<"Error: clCreateBuffer - invalid context - (inputBuffer)\n"; return; } if(status == CL_INVALID_VALUE) { //flags value not valid std::cout<<"Error: clCreateBuffer - invalid flags value - (inputBuffer)\n"; return; } if(status == CL_INVALID_BUFFER_SIZE) { //size==0 or size>CL_DEVICE_MAX_MEM_ALLOC_SIZE std::cout<<"Error: clCreateBuffer - invalid buffer size - (inputBuffer)\n"; return; } if(status == CL_INVALID_HOST_PTR) { //(host_ptr == NULL) && (CL_MEM_USE_HOST_PTR || CL_MEM_COPY_HOST_PTR in flags) //|| //(host_ptr != NULL) && (CL_MEM_COPY_HOST_PTR || CL_MEM_USE_HOST_PTR _not_ in flags) bool isNull = (input==NULL); std::cout<<"Error: clCreateBuffer - invalid host pointer - (inputBuffer) - NULL==" << isNull << "\n"; return; } if(status == CL_MEM_OBJECT_ALLOCATION_FAILURE) { //there is a failure to allocate memory for buffer object std::cout<<"Error: clCreateBuffer - mem object alloc failure - (inputBuffer)\n"; return; } if(status == CL_OUT_OF_HOST_MEMORY) { //there is a failure to allocate resources required by the OpenCL implementation on the host std::cout<<"Error: clCreateBuffer - out of host mem - (inputBuffer)\n"; return; } if(status != CL_SUCCESS) { std::cout<<"Error: clCreateBuffer (inputBuffer)\n"; return; } intermediateBuffer = clCreateBuffer( context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, sizeof(cl_uint) * width * height, intermediate, &status); if(status != CL_SUCCESS) { std::cout<<"Error: clCreateBuffer (intermediateBuffer)\n"; return; } outputBuffer = clCreateBuffer( context, CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint) * width * height, output, 
&status); if(status == CL_INVALID_CONTEXT) { //context not valid std::cout<<"Error: clCreateBuffer - invalid context\n"; } if(status == CL_INVALID_VALUE) { //flags value not valid std::cout<<"Error: clCreateBuffer - invalid flags value\n"; } if(status == CL_INVALID_BUFFER_SIZE) { //size==0 or size>CL_DEVICE_MAX_MEM_ALLOC_SIZE std::cout<<"Error: clCreateBuffer - invalid buffer size\n"; } if(status == CL_INVALID_HOST_PTR) { //(host_ptr == NULL) && (CL_MEM_USE_HOST_PTR || CL_MEM_COPY_HOST_PTR in flags) //|| //(host_ptr != NULL) && (CL_MEM_COPY_HOST_PTR || CL_MEM_USE_HOST_PTR _not_ in flags) bool isNull = (input==NULL); std::cout<<"Error: clCreateBuffer - invalid host pointer - NULL==" << isNull << "\n"; } if(status == CL_MEM_OBJECT_ALLOCATION_FAILURE) { //there is a failure to allocate memory for buffer object std::cout<<"Error: clCreateBuffer - mem object alloc failure\n"; } if(status == CL_OUT_OF_HOST_MEMORY) { //there is a failure to allocate resources required by the OpenCL implementation on the host std::cout<<"Error: clCreateBuffer - out of host mem\n"; } if(status != CL_SUCCESS) { std::cout<<"Error: clCreateBuffer (outputBuffer)\n"; return; } sobelOpXBuffer = clCreateBuffer( context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint ) * MASK_HEIGHT * MASK_WIDTH, clSobelOpX, &status); if(status != CL_SUCCESS) { std::cout<<"Error: clCreateBuffer (sobelOpXBuffer)\n"; return; } sobelOpYBuffer = clCreateBuffer( context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, sizeof(cl_uint ) * MASK_HEIGHT * MASK_WIDTH, clSobelOpY, &status); if(status != CL_SUCCESS) { std::cout<<"Error: clCreateBuffer (sobelOpYBuffer)\n"; return; } ///////////////////////////////////////////////////////////////// // Load CL file, build CL program object, create CL kernel object ///////////////////////////////////////////////////////////////// const char * filename = "EdgeDetect_Kernels.cl"; std::string sourceStr = convertToString(filename); const char * source = sourceStr.c_str(); size_t 
sourceSize[] = { strlen(source) }; //std::cout << source << "\n"; program = clCreateProgramWithSource( context, 1, &source, sourceSize, &status); if(status != CL_SUCCESS) { std::cout<<"Error: Loading Binary into cl_program (clCreateProgramWithSource)\n"; return; } /* create a cl program executable for all the devices specified */ status = clBuildProgram(program, 1, devices, NULL, NULL, NULL); //error checking code if(!sampleCommon.checkVal(status,CL_SUCCESS,"clBuildProgram failed.")) { //print kernel compilation error char programLog[4096]; cl_int tempStatus = clGetProgramBuildInfo(program, devices[0], CL_PROGRAM_BUILD_LOG, 4096, programLog, 0); std::cout<<"\n---Build Log---\n"<<programLog<<"\n---Build Log---\n"; return; } if(status == CL_INVALID_PROGRAM) { //if program is not a valid program object. std::cout<<"Error: Invalid program object. (clBuildProgram)\n"; return; } if(status == CL_INVALID_VALUE) { // (device_list == NULL) && (num_devices > 0) // || // (device_list != NULL) && (num_devices ==0) // || // (pfn_notify == NULL) && (user_data != NULL) std::cout<<"Error: Invalid value - device_list==NULL:" << (devices==NULL) << " - (clBuildProgram)\n"; return; } if(status == CL_INVALID_DEVICE) { // OpenCL devices listed in device_list are not in the list of // devices associated with program. std::cout<<"Error: Invalid device. (clBuildProgram)\n"; return; } if(status == CL_INVALID_BINARY) { // if program is created with clCreateWithProgramBinary and // devices listed in device_list do not have a valid program binary loaded. std::cout<<"Error: Invalid binary. (clBuildProgram)\n"; return; } if(status == CL_INVALID_BUILD_OPTIONS) { // if the build options specified by options are invalid std::cout<<"Error: Invalid build options. 
(clBuildProgram)\n"; return; } if(status == CL_INVALID_OPERATION) { // if the build of a program executable for any of the devices // listed in device_list by a previous call to clBuildProgram for program has not // completed // || // if there are kernel objects attached to program. std::cout<<"Error: Invalid operation. (clBuildProgram)\n"; return; } if(status == CL_COMPILER_NOT_AVAILABLE) { // CL_COMPILER_NOT_AVAILABLE if program is created with // clCreateProgramWithSource and a compiler is not available i.e. // CL_DEVICE_COMPILER_AVAILABLE specified in table 4.3 is set to CL_FALSE. std::cout<<"Error: Compiler not available. (clBuildProgram)\n"; return; } if(status == CL_BUILD_PROGRAM_FAILURE) { // if there is a failure to build the program executable. // This error will be returned if clBuildProgram does not return until the build has // completed. std::cout<<"Error: Build program failure. (clBuildProgram)\n"; return; } if(status == CL_OUT_OF_HOST_MEMORY) { // if there is a failure to allocate resources required by the // OpenCL implementation on the host. std::cout<<"Error: Out of host memory. (clBuildProgram)\n"; return; } if(status != CL_SUCCESS) { std::cout<<"Error: Building Program (clBuildProgram)\n"; return; } /* get a kernel object handle for a kernel with the given name */ kernel = clCreateKernel(program, "edgeDetectKernel", &status); if(status != CL_SUCCESS) { std::cout<<"Error: Creating Kernel from program. 
(clCreateKernel)\n"; return; } } // Run OpenCL program // -> Bind host variables to kernel arguments // -> Run the CL kernel double clRunKernels(cl_uint alloc_type, cl_uint kernelCount) { double runTime; cl_int status; cl_event events[2]; size_t globalThreads[1]; size_t localThreads[1]; globalThreads[0] = kernelCount; localThreads[0] = 1; ////////////////////////////////////////// // Set appropriate arguments to the kernel ////////////////////////////////////////// /* the input array to the kernel */ status = clSetKernelArg( kernel, 0, sizeof(cl_mem), (void *)&inputBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. (input)\n"; return -1; } /* the intermediate array to the kernel */ status = clSetKernelArg( kernel, 1, sizeof(cl_mem), (void *)&intermediateBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. (intermediate)\n"; return -1; } /* the output array to the kernel */ status = clSetKernelArg( kernel, 2, sizeof(cl_mem), (void *)&outputBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. (output)\n"; return -1; } status = clSetKernelArg( kernel, 3, sizeof(cl_mem), (void *)&sobelOpXBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. (sobelx)\n"; return -1; } status = clSetKernelArg( kernel, 4, sizeof(cl_mem), (void *)&sobelOpYBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. (sobely)\n"; return -1; } cl_uint2 inputOutputDim = {width, height}; status = clSetKernelArg( kernel, 5, sizeof(cl_uint2), (void *)&inputOutputDim ); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. (inputOutputDimensions)\n"; return -1; } status = clSetKernelArg( kernel, 6, sizeof(cl_uint), (void *)&alloc_type ); if(status != CL_SUCCESS) { std::cout<<"Error: Setting kernel argument. 
(alloc_type)\n"; return -1; } sampleCommon.resetTimer(runTimerKey); sampleCommon.startTimer(runTimerKey); ////////////////////////////////////////// // Enqueue a kernel run call. ////////////////////////////////////////// status = clEnqueueNDRangeKernel( commandQueue, kernel, 1, NULL, globalThreads, localThreads, 0, NULL, &events[0]); if(status != CL_SUCCESS) { std::cout<<"Error: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)\n"; return -1; } ////////////////////////////////////////// // wait for the kernel call to finish execution ////////////////////////////////////////// status = clWaitForEvents(1, &events[0]); if(status != CL_SUCCESS) { std::cout<<"Error: Waiting for kernel run to finish. (clWaitForEvents 0)\n"; return -1; } sampleCommon.stopTimer(runTimerKey); runTime = (double)(sampleCommon.readTimer(runTimerKey)); if (PROFILE) { long long kernelsStartTime; long long kernelsEndTime; status = clGetEventProfilingInfo( events[0], CL_PROFILING_COMMAND_START, sizeof(long long), &kernelsStartTime, NULL); if(status != CL_SUCCESS) { std::cout<<"Error: clGetEventProfilingInfo failed (start)\n"; return -1; } status = clGetEventProfilingInfo( events[0], CL_PROFILING_COMMAND_END, sizeof(long long), &kernelsEndTime, NULL); if(status != CL_SUCCESS) { std::cout<<"Error: clGetEventProfilingInfo failed (end)\n"; return -1; } /* Compute total time (also convert from nanoseconds to seconds) */ double totalTime = (double)(kernelsEndTime - kernelsStartTime)/1e9; printf("\nTIME: %f\n", totalTime); //std::cout<<"TIME: " << totalTime << "\n"; } clReleaseEvent(events[0]); ////////////////////////////////////////// // Enqueue readBuffer ////////////////////////////////////////// status = clEnqueueReadBuffer( commandQueue, outputBuffer, CL_TRUE, 0, width * height * sizeof(cl_uint), output, 0, NULL, &events[1]); if(status != CL_SUCCESS) { std::cout <<"Error: clEnqueueReadBuffer failed. 
(clEnqueueReadBuffer)\n"; } ////////////////////////////////////////// // Wait for the read buffer to finish execution ////////////////////////////////////////// status = clWaitForEvents(1, &events[1]); if(status != CL_SUCCESS) { std::cout<<"Error: Waiting for read buffer call to finish. (clWaitForEvents)\n"; return -1; } clReleaseEvent(events[1]); return runTime; } // Release OpenCL resources (Context, Memory etc.) void clCleanup(void) { cl_int status; status = clReleaseKernel(kernel); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseKernel \n"; return; } status = clReleaseProgram(program); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseProgram\n"; return; } status = clReleaseMemObject(inputBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseMemObject (inputBuffer)\n"; return; } status = clReleaseMemObject(intermediateBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseMemObject (intermediateBuffer)\n"; return; } status = clReleaseMemObject(outputBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseMemObject (outputBuffer)\n"; return; } status = clReleaseMemObject(sobelOpXBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseMemObject (sobelOpXBuffer)\n"; return; } status = clReleaseMemObject(sobelOpYBuffer); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseMemObject (sobelOpYBuffer)\n"; return; } status = clReleaseCommandQueue(commandQueue); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseCommandQueue\n"; return; } status = clReleaseContext(context); if(status != CL_SUCCESS) { std::cout<<"Error: In clReleaseContext\n"; return; } } // Releases program's resources void clCleanupHost(void) { if(input != NULL) { free(input); input = NULL; } if(intermediate != NULL) { free(input); input = NULL; } if(output != NULL) { free(output); output = NULL; } if(clSobelOpX != NULL) { free(output); output = NULL; } if(clSobelOpY != NULL) { free(output); output = NULL; } if(devices != 
NULL) { free(devices); devices = NULL; } } /*Display OpenCL system info */ void clPrintInfo() { int MAX_DEVICES = 10; size_t p_size; size_t arr_tsize[3]; size_t ret_size; char param[100]; cl_uint entries; cl_ulong long_entries; cl_bool bool_entries; cl_device_id devices[MAX_DEVICES]; size_t num_devices; cl_device_local_mem_type mem_type; cl_device_type dev_type; cl_device_fp_config fp_conf; cl_device_exec_capabilities exec_cap; clGetDeviceIDs( NULL, CL_DEVICE_TYPE_DEFAULT, MAX_DEVICES, devices, &num_devices); printf("Found Devices:\t\t%d\n", num_devices); for (int i = 0; i < num_devices; i++) { printf("\nDevice: %d\n\n", i); clGetDeviceInfo(devices, CL_DEVICE_TYPE, sizeof(dev_type), &dev_type, &ret_size); printf("\tDevice Type:\t\t"); if (dev_type & CL_DEVICE_TYPE_GPU) printf("CL_DEVICE_TYPE_GPU "); if (dev_type & CL_DEVICE_TYPE_CPU) printf("CL_DEVICE_TYPE_CPU "); if (dev_type & CL_DEVICE_TYPE_ACCELERATOR) printf("CL_DEVICE_TYPE_ACCELERATOR "); if (dev_type & CL_DEVICE_TYPE_DEFAULT) printf("CL_DEVICE_TYPE_DEFAULT "); printf("\n"); clGetDeviceInfo(devices, CL_DEVICE_NAME, sizeof(param), param, &ret_size); printf("\tName: \t\t\t%s\n", param); clGetDeviceInfo(devices, CL_DEVICE_VENDOR, sizeof(param), param, &ret_size); printf("\tVendor: \t\t%s\n", param); clGetDeviceInfo(devices, CL_DEVICE_VENDOR_ID, sizeof(cl_uint), &entries, &ret_size); printf("\tVendor ID:\t\t%d\n", entries); clGetDeviceInfo(devices, CL_DEVICE_VERSION, sizeof(param), param, &ret_size); printf("\tVersion:\t\t%s\n", param); clGetDeviceInfo(devices, CL_DEVICE_PROFILE, sizeof(param), param, &ret_size); printf("\tProfile:\t\t%s\n", param); clGetDeviceInfo(devices, CL_DRIVER_VERSION, sizeof(param), param, &ret_size); printf("\tDriver: \t\t%s\n", param); clGetDeviceInfo(devices, CL_DEVICE_EXTENSIONS, sizeof(param), param, &ret_size); printf("\tExtensions:\t\t%s\n", param); clGetDeviceInfo(devices, CL_DEVICE_MAX_WORK_ITEM_SIZES, 3 * sizeof(size_t), arr_tsize, &ret_size); printf("\tMax Work-Item 
Sizes:\t(%d,%d,%d)\n", arr_tsize[0], arr_tsize[1], arr_tsize[2]); clGetDeviceInfo(devices, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(size_t), &p_size, &ret_size); printf("\tMax Work Group Size:\t%d\n", p_size); clGetDeviceInfo(devices, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(cl_uint), &entries, &ret_size); printf("\tMax Compute Units:\t%d\n", entries); clGetDeviceInfo(devices, CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(cl_uint), &entries, &ret_size); printf("\tMax Frequency (Mhz):\t%d\n", entries); clGetDeviceInfo(devices, CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE, sizeof(cl_uint), &entries, &ret_size); printf("\tCache Line (bytes):\t%d\n", entries); clGetDeviceInfo(devices, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(cl_ulong), &long_entries, &ret_size); printf("\tGlobal Memory (MB):\t%llu\n", long_entries / 1024 / 1024); clGetDeviceInfo(devices, CL_DEVICE_LOCAL_MEM_SIZE, sizeof(cl_ulong), &long_entries, &ret_size); printf("\tLocal Memory (MB):\t%llu\n", long_entries / 1024 / 1024); clGetDeviceInfo(devices, CL_DEVICE_LOCAL_MEM_TYPE, sizeof(cl_device_local_mem_type), &mem_type, &ret_size); if (mem_type & CL_LOCAL) printf("\tLocal Memory Type:\tCL_LOCAL\n"); else if (mem_type & CL_GLOBAL) printf("\tLocal Memory Type:\tCL_GLOBAL\n"); else printf("\tLocal Memory Type:\tUNKNOWN\n"); clGetDeviceInfo(devices, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(cl_ulong), &long_entries, &ret_size); printf("\tMax Mem Alloc (MB):\t%llu\n", long_entries / 1024 / 1024); clGetDeviceInfo(devices, CL_DEVICE_MAX_PARAMETER_SIZE, sizeof(size_t), &p_size, &ret_size); printf("\tMax Param Size (MB):\t%d\n", p_size); clGetDeviceInfo(devices, CL_DEVICE_MEM_BASE_ADDR_ALIGN, sizeof(cl_uint), &entries, &ret_size); printf("\tBase Mem Align (bits):\t%d\n", entries); clGetDeviceInfo(devices, CL_DEVICE_ADDRESS_BITS, sizeof(cl_uint), &entries, &ret_size); printf("\tAddress Space (bits):\t%d\n", entries); clGetDeviceInfo(devices, CL_DEVICE_IMAGE_SUPPORT, sizeof(cl_bool), &bool_entries, &ret_size); printf("\tImage Support:\t\t%d\n", 
bool_entries); clGetDeviceInfo(devices, CL_DEVICE_TYPE, sizeof(fp_conf), &fp_conf, &ret_size); printf("\tFloat Functionality:\t"); if (fp_conf & CL_FP_DENORM) printf("DENORM support "); if (fp_conf & CL_FP_ROUND_TO_NEAREST) printf("Round to nearest support "); if (fp_conf & CL_FP_ROUND_TO_ZERO) printf("Round to zero support "); if (fp_conf & CL_FP_ROUND_TO_INF) printf("Round to +ve/-ve infinity support "); if (fp_conf & CL_FP_FMA) printf("IEEE754 fused-multiply-add support "); if (fp_conf & CL_FP_INF_NAN) printf("INF and NaN support "); printf("\n"); clGetDeviceInfo(devices, CL_DEVICE_ERROR_CORRECTION_SUPPORT, sizeof(cl_bool), &bool_entries, &ret_size); printf("\tECC Support:\t\t%d\n", bool_entries); clGetDeviceInfo(devices, CL_DEVICE_EXECUTION_CAPABILITIES, sizeof(cl_device_exec_capabilities), &exec_cap, &ret_size); printf("\tExec Functionality:\t"); if (exec_cap & CL_EXEC_KERNEL) printf("CL_EXEC_KERNEL "); if (exec_cap & CL_EXEC_NATIVE_KERNEL) printf("CL_EXEC_NATIVE_KERNEL "); printf("\n"); clGetDeviceInfo(devices, CL_DEVICE_ENDIAN_LITTLE, sizeof(cl_bool), &bool_entries, &ret_size); printf("\tLittle Endian Device:\t%d\n", bool_entries); clGetDeviceInfo(devices, CL_DEVICE_PROFILING_TIMER_RESOLUTION, sizeof(size_t), &p_size, &ret_size); printf("\tProfiling Res (ns):\t%d\n", p_size); clGetDeviceInfo(devices, CL_DEVICE_AVAILABLE, sizeof(cl_bool), &bool_entries, &ret_size); printf("\tDevice Available:\t%d\n", bool_entries); } } void testEdgeOutput(cl_uint *out) { for (int y=1; y<height-1; y++) for (int x=1; x<width-1; x++) { int index = (y * width) + x; if (out[index] != 1) printf("index[%d]=%d\n",index,out[index]); } } char* allocTypeToStr(int alloc_type) { char* result = "UNKNOWN"; switch (alloc_type) { case ALLOC_TILE: result = "ALLOC_TILE"; break; case ALLOC_HORZ: result = "ALLOC_HORZ"; break; case ALLOC_VERT: result = "ALLOC_VERT"; break; } return result; } int main(int argc, char * argv[]) { ////////////////////////////// // Init ////////////////////////////// 
runTimerKey = sampleCommon.createTimer(); IplImage* cvRaw = cvLoadImage("raw.bmp", 1); width = cvRaw->width; height = cvRaw->height; // clPrintInfo(); ////////////////////////////// // Serial (OpenCV) ////////////////////////////// // cvDoFindEdges(cvRaw); ////////////////////////////// // Parallel (OpenCL) ////////////////////////////// int repetitions = 6; int maxKernels = 128; for (int kernelCount = 1; kernelCount <= maxKernels; kernelCount=kernelCount*2) { printf("\n-----------\n"); printf("Threads:\t%d\n",kernelCount); for (int alloc_type = ALLOC_TILE; alloc_type <= ALLOC_VERT; alloc_type++) { printf("Allocation:\t%s\n",allocTypeToStr(alloc_type)); clInitializeHost(cvRaw); // Initialize Host application clInitialize(); // Initialize OpenCL resources for (int run = 0; run < repetitions; run++) { double runTime = clRunKernels(alloc_type,kernelCount); // Run the CL program printf("Run[%d]: %f\n",run,runTime); } clCleanup(); // Releases OpenCL resources clCleanupHost(); // Release host resources } } // IplImage* clSobel = clArrayToCvImage(output,width,height); // cvSaveImage("resultCL.bmp",(CvArr*)clSobel); return 0; }
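In `clCleanupHost` as posted, the `intermediate` branch calls `free(input)` and the `clSobelOpX`/`clSobelOpY` branches call `free(output)`, so `intermediate` is never released. Whether the Sobel mask arrays were heap-allocated at all is not visible in this post, and passing a pointer that did not come from `malloc` to `free` would be invalid. Below is a minimal sketch of cleanup in which each pointer releases only itself. `HostBuffers`, `releaseBuffer` and `cleanupHost` are hypothetical names standing in for the project's globals and functions, and C++11 is assumed.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the project's global host pointers.
struct HostBuffers {
    unsigned int* input;
    unsigned int* intermediate;
    unsigned int* output;
};

// Frees the memory that `ptr` itself points to, then nulls it so that a
// second cleanup pass is harmless (free on a null pointer does nothing).
template <typename T>
void releaseBuffer(T*& ptr) {
    std::free(ptr);
    ptr = nullptr;
}

// Releases only buffers that were obtained from malloc.
void cleanupHost(HostBuffers& buffers) {
    releaseBuffer(buffers.input);
    releaseBuffer(buffers.intermediate);
    releaseBuffer(buffers.output);
}
```

Because the buffers in this thread are created with `CL_MEM_USE_HOST_PTR`, host memory like this should only be freed after the corresponding `cl_mem` objects have been released, which matches the order already used in `main`.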


KERNEL:

 

struct MyWork {
    uint kernelCount;
    uint tid;
    uint width;
    uint height;
    uint localWidth;
    uint localHeight;
    uint x_start;
    uint x_end;
    uint y_start;
    uint y_end;
    uint edge_x_start;
    uint edge_x_end;
    uint edge_y_start;
    uint edge_y_end;
};

void initTiledWork(struct MyWork* myWork)
{
    uint baseX = 0;
    uint baseY = 0;
    uint realWidth = myWork->width;
    uint realHeight = myWork->height;

    bool powerOf4 = ( (uint)log2(myWork->kernelCount) % 2) == 0;

    if (!powerOf4) {
        if (myWork->width > myWork->height) {
            myWork->width = ceil( (float)myWork->width / 2 );
            if ( myWork->tid >= (myWork->kernelCount/2) ) {
                baseX = myWork->width;
                myWork->tid = myWork->tid - (myWork->kernelCount/2);
            }
        } else {
            myWork->height = ceil( (float)myWork->height / 2 );
            if ( myWork->tid >= (myWork->kernelCount/2) ) {
                baseY = myWork->height;
                myWork->tid = myWork->tid - (myWork->kernelCount/2);
            }
        }
        myWork->kernelCount = myWork->kernelCount / 2;
    }

    myWork->localWidth = ceil( (float)myWork->width / (float)(sqrt(myWork->kernelCount)) );
    myWork->localHeight = ceil( (float)myWork->height / (float)(sqrt(myWork->kernelCount)) );

    myWork->x_start = baseX + ( myWork->tid % (uint)(sqrt(myWork->kernelCount)) ) * myWork->localWidth;
    myWork->x_end = baseX + min( myWork->x_start + myWork->localWidth , myWork->width);
    myWork->y_start = baseY + ( myWork->tid / (uint)(sqrt(myWork->kernelCount)) ) * myWork->localHeight;
    myWork->y_end = baseY + min( myWork->y_start + myWork->localHeight , myWork->height);

    myWork->edge_x_start = (myWork->x_start > 0) ? myWork->x_start : myWork->x_start + 1;
    myWork->edge_x_end = (myWork->x_end < realWidth) ? myWork->x_end : myWork->x_end - 1;
    myWork->edge_y_start = (myWork->y_start > 0) ? myWork->y_start : myWork->y_start + 1;
    myWork->edge_y_end = (myWork->y_end < realHeight) ? myWork->y_end : myWork->y_end - 1;
}

void initHorizWork(struct MyWork* myWork)
{
    myWork->localWidth = myWork->width;
    myWork->localHeight = ceil( (float)myWork->height / (float)(myWork->kernelCount) );

    myWork->x_start = 0;
    myWork->x_end = myWork->width;
    myWork->y_start = myWork->tid * myWork->localHeight;
    myWork->y_end = min( myWork->y_start + myWork->localHeight , myWork->height);

    myWork->edge_x_start = 1;
    myWork->edge_x_end = myWork->x_end - 1;
    myWork->edge_y_start = (myWork->y_start > 0) ? myWork->y_start : myWork->y_start + 1;
    myWork->edge_y_end = (myWork->y_end < myWork->height) ? myWork->y_end : myWork->y_end - 1;
}

void initVertWork(struct MyWork* myWork)
{
    myWork->localWidth = ceil( (float)myWork->width / (float)(myWork->kernelCount) );
    myWork->localHeight = myWork->height;

    myWork->x_start = myWork->tid * myWork->localWidth;
    myWork->x_end = min( myWork->x_start + myWork->localWidth , myWork->width);
    myWork->y_start = 0;
    myWork->y_end = myWork->height;

    myWork->edge_x_start = (myWork->x_start > 0) ? myWork->x_start : myWork->x_start + 1;
    myWork->edge_x_end = (myWork->x_end < myWork->width) ? myWork->x_end : myWork->x_end - 1;
    myWork->edge_y_start = 1;
    myWork->edge_y_end = myWork->y_end - 1;
}

__kernel void edgeDetectKernel(
    __global uint4 * input,
    __global uint * intermediate,
    __global uint * output,
    __global uint * clSobelOpX,
    __global uint * clSobelOpY,
    const uint2 inputOutputDim,
    const uint alloc_type )
{
    struct MyWork myWork;
    myWork.tid = get_global_id(0);
    myWork.kernelCount = get_global_size(0);
    myWork.width = inputOutputDim.x;
    myWork.height = inputOutputDim.y;

    switch (alloc_type) {
    case 0: //TILE
        initTiledWork( &myWork );
        break;
    case 1: //HORIZONTAL
        initHorizWork( &myWork );
        break;
    case 2: //VERTICAL
        initVertWork( &myWork );
        break;
    }

    int y;
    int x;
    int index;

    for (y = myWork.y_start; y < myWork.y_end; y++)
        for (x = myWork.x_start; x < myWork.x_end; x++) {
            index = (y * inputOutputDim.x) + x;
            intermediate[index] = (input[index].x + input[index].y + input[index].z)/3;
            // output[index] = intermediate[index];
            // output[index] = ((float)get_global_id(0)/(float)kernelCount)*(float)(255.0);
        }

    // barrier(CLK_GLOBAL_MEM_FENCE);
    mem_fence(CLK_GLOBAL_MEM_FENCE);

    int Gx;
    int Gy;

    for (y = myWork.edge_y_start; y < myWork.edge_y_end; y++) {
        for (x = myWork.edge_x_start; x < myWork.edge_x_end; x++) {
            index = (y * inputOutputDim.x) + x;

            Gx = (intermediate[index - inputOutputDim.x - 1] * clSobelOpX[0]) +
                 (intermediate[index - inputOutputDim.x] * clSobelOpX[1]) +
                 (intermediate[index - inputOutputDim.x + 1] * clSobelOpX[2]) +
                 (intermediate[index - 1] * clSobelOpX[3]) +
                 (intermediate[index] * clSobelOpX[4]) +
                 (intermediate[index + 1] * clSobelOpX[5]) +
                 (intermediate[index + inputOutputDim.x - 1] * clSobelOpX[6]) +
                 (intermediate[index + inputOutputDim.x] * clSobelOpX[7]) +
                 (intermediate[index + inputOutputDim.x + 1] * clSobelOpX[8]);
            Gx = abs(Gx);

            Gy = (intermediate[index - inputOutputDim.x - 1] * clSobelOpY[0]) +
                 (intermediate[index - inputOutputDim.x] * clSobelOpY[1]) +
                 (intermediate[index - inputOutputDim.x + 1] * clSobelOpY[2]) +
                 (intermediate[index - 1] * clSobelOpY[3]) +
                 (intermediate[index] * clSobelOpY[4]) +
                 (intermediate[index + 1] * clSobelOpY[5]) +
                 (intermediate[index + inputOutputDim.x - 1] * clSobelOpY[6]) +
                 (intermediate[index + inputOutputDim.x] * clSobelOpY[7]) +
                 (intermediate[index + inputOutputDim.x + 1] * clSobelOpY[8]);
            Gy = abs(Gy);

            output[index] = Gx+Gy;
        }
    }
}


I guess you missed the header file. Also the .cpp file seems to be incomplete. These would help me try reproducing the issue. 

Also, you said that creating multiple timers causes the crash. Does this happen only in your application, or also with the samples that come with the SDK?


Sorry, I copied and pasted the whole file. I'm not sure whether "attach code" has a size limit or whether I just made a mistake. If it doesn't all show up this time, "attach code" is at fault...

EDIT: The second half of the .cpp file is here:

/* Display OpenCL system info */
void clPrintInfo()
{
    const int MAX_DEVICES = 10;
    size_t p_size;
    size_t arr_tsize[3];
    size_t ret_size;
    char param[100];
    cl_uint entries;
    cl_ulong long_entries;
    cl_bool bool_entries;
    cl_device_id devices[MAX_DEVICES];
    cl_uint num_devices;  // clGetDeviceIDs expects a cl_uint* here
    cl_device_local_mem_type mem_type;
    cl_device_type dev_type;
    cl_device_fp_config fp_conf;
    cl_device_exec_capabilities exec_cap;

    clGetDeviceIDs( NULL, CL_DEVICE_TYPE_DEFAULT, MAX_DEVICES, devices, &num_devices);
    printf("Found Devices:\t\t%u\n", num_devices);

    for (cl_uint i = 0; i < num_devices; i++) {
        printf("\nDevice: %u\n\n", i);

        clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(dev_type), &dev_type, &ret_size);
        printf("\tDevice Type:\t\t");
        if (dev_type & CL_DEVICE_TYPE_GPU) printf("CL_DEVICE_TYPE_GPU ");
        if (dev_type & CL_DEVICE_TYPE_CPU) printf("CL_DEVICE_TYPE_CPU ");
        if (dev_type & CL_DEVICE_TYPE_ACCELERATOR) printf("CL_DEVICE_TYPE_ACCELERATOR ");
        if (dev_type & CL_DEVICE_TYPE_DEFAULT) printf("CL_DEVICE_TYPE_DEFAULT ");
        printf("\n");

        clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(param), param, &ret_size);
        printf("\tName: \t\t\t%s\n", param);

        clGetDeviceInfo(devices[i], CL_DEVICE_VENDOR, sizeof(param), param, &ret_size);
        printf("\tVendor: \t\t%s\n", param);

        clGetDeviceInfo(devices[i], CL_DEVICE_VENDOR_ID, sizeof(cl_uint), &entries, &ret_size);
        printf("\tVendor ID:\t\t%d\n", entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_VERSION, sizeof(param), param, &ret_size);
        printf("\tVersion:\t\t%s\n", param);

        clGetDeviceInfo(devices[i], CL_DEVICE_PROFILE, sizeof(param), param, &ret_size);
        printf("\tProfile:\t\t%s\n", param);

        clGetDeviceInfo(devices[i], CL_DRIVER_VERSION, sizeof(param), param, &ret_size);
        printf("\tDriver: \t\t%s\n", param);

        clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(param), param, &ret_size);
        printf("\tExtensions:\t\t%s\n", param);

        clGetDeviceInfo(devices[i], CL_DEVICE_MAX_WORK_ITEM_SIZES, 3 * sizeof(size_t), arr_tsize, &ret_size);
        printf("\tMax Work-Item Sizes:\t(%zu,%zu,%zu)\n", arr_tsize[0], arr_tsize[1], arr_tsize[2]);

        clGetDeviceInfo(devices[i], CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(size_t), &p_size, &ret_size);
        printf("\tMax Work Group Size:\t%zu\n", p_size);

        clGetDeviceInfo(devices[i], CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(cl_uint), &entries, &ret_size);
        printf("\tMax Compute Units:\t%d\n", entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(cl_uint), &entries, &ret_size);
        printf("\tMax Frequency (Mhz):\t%d\n", entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE, sizeof(cl_uint), &entries, &ret_size);
        printf("\tCache Line (bytes):\t%d\n", entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(cl_ulong), &long_entries, &ret_size);
        printf("\tGlobal Memory (MB):\t%llu\n", long_entries / 1024 / 1024);

        clGetDeviceInfo(devices[i], CL_DEVICE_LOCAL_MEM_SIZE, sizeof(cl_ulong), &long_entries, &ret_size);
        printf("\tLocal Memory (MB):\t%llu\n", long_entries / 1024 / 1024);

        clGetDeviceInfo(devices[i], CL_DEVICE_LOCAL_MEM_TYPE, sizeof(cl_device_local_mem_type), &mem_type, &ret_size);
        if (mem_type & CL_LOCAL)
            printf("\tLocal Memory Type:\tCL_LOCAL\n");
        else if (mem_type & CL_GLOBAL)
            printf("\tLocal Memory Type:\tCL_GLOBAL\n");
        else
            printf("\tLocal Memory Type:\tUNKNOWN\n");

        clGetDeviceInfo(devices[i], CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(cl_ulong), &long_entries, &ret_size);
        printf("\tMax Mem Alloc (MB):\t%llu\n", long_entries / 1024 / 1024);

        clGetDeviceInfo(devices[i], CL_DEVICE_MAX_PARAMETER_SIZE, sizeof(size_t), &p_size, &ret_size);
        printf("\tMax Param Size (MB):\t%zu\n", p_size);

        clGetDeviceInfo(devices[i], CL_DEVICE_MEM_BASE_ADDR_ALIGN, sizeof(cl_uint), &entries, &ret_size);
        printf("\tBase Mem Align (bits):\t%d\n", entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_ADDRESS_BITS, sizeof(cl_uint), &entries, &ret_size);
        printf("\tAddress Space (bits):\t%d\n", entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_IMAGE_SUPPORT, sizeof(cl_bool), &bool_entries, &ret_size);
        printf("\tImage Support:\t\t%d\n", bool_entries);

        // Query the floating-point configuration (not the device type).
        clGetDeviceInfo(devices[i], CL_DEVICE_SINGLE_FP_CONFIG, sizeof(fp_conf), &fp_conf, &ret_size);
        printf("\tFloat Functionality:\t");
        if (fp_conf & CL_FP_DENORM) printf("DENORM support ");
        if (fp_conf & CL_FP_ROUND_TO_NEAREST) printf("Round to nearest support ");
        if (fp_conf & CL_FP_ROUND_TO_ZERO) printf("Round to zero support ");
        if (fp_conf & CL_FP_ROUND_TO_INF) printf("Round to +ve/-ve infinity support ");
        if (fp_conf & CL_FP_FMA) printf("IEEE754 fused-multiply-add support ");
        if (fp_conf & CL_FP_INF_NAN) printf("INF and NaN support ");
        printf("\n");

        clGetDeviceInfo(devices[i], CL_DEVICE_ERROR_CORRECTION_SUPPORT, sizeof(cl_bool), &bool_entries, &ret_size);
        printf("\tECC Support:\t\t%d\n", bool_entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_EXECUTION_CAPABILITIES, sizeof(cl_device_exec_capabilities), &exec_cap, &ret_size);
        printf("\tExec Functionality:\t");
        if (exec_cap & CL_EXEC_KERNEL) printf("CL_EXEC_KERNEL ");
        if (exec_cap & CL_EXEC_NATIVE_KERNEL) printf("CL_EXEC_NATIVE_KERNEL ");
        printf("\n");

        clGetDeviceInfo(devices[i], CL_DEVICE_ENDIAN_LITTLE, sizeof(cl_bool), &bool_entries, &ret_size);
        printf("\tLittle Endian Device:\t%d\n", bool_entries);

        clGetDeviceInfo(devices[i], CL_DEVICE_PROFILING_TIMER_RESOLUTION, sizeof(size_t), &p_size, &ret_size);
        printf("\tProfiling Res (ns):\t%zu\n", p_size);

        clGetDeviceInfo(devices[i], CL_DEVICE_AVAILABLE, sizeof(cl_bool), &bool_entries, &ret_size);
        printf("\tDevice Available:\t%d\n", bool_entries);
    }
}

const char* allocTypeToStr(int alloc_type)
{
    const char* result = "UNKNOWN";
    switch (alloc_type) {
    case ALLOC_TILE: result = "ALLOC_TILE"; break;
    case ALLOC_HORZ: result = "ALLOC_HORZ"; break;
    case ALLOC_VERT: result = "ALLOC_VERT"; break;
    }
    return result;
}

int main(int argc, char * argv[])
{
    //////////////////////////////
    // Init
    //////////////////////////////
    runTimerKey = sampleCommon.createTimer();

    IplImage* cvRaw = cvLoadImage("raw.bmp", 1);
    width = cvRaw->width;
    height = cvRaw->height;

    // clPrintInfo();

    //////////////////////////////
    // Serial (OpenCV)
    //////////////////////////////
    // cvDoFindEdges(cvRaw);

    //////////////////////////////
    // Parallel (OpenCL)
    //////////////////////////////
    int repetitions = 6;
    int maxKernels = 128;

    for (int kernelCount = 1; kernelCount <= maxKernels; kernelCount = kernelCount * 2) {
        printf("\n-----------\n");
        printf("Threads:\t%d\n", kernelCount);

        for (int alloc_type = ALLOC_TILE; alloc_type <= ALLOC_VERT; alloc_type++) {
            printf("Allocation:\t%s\n", allocTypeToStr(alloc_type));

            clInitializeHost(cvRaw);  // Initialize Host application
            clInitialize();           // Initialize OpenCL resources

            for (int run = 0; run < repetitions; run++) {
                double runTime = clRunKernels(alloc_type, kernelCount);  // Run the CL program
                printf("Run[%d]: %f\n", run, runTime);
            }

            clCleanup();      // Releases OpenCL resources
            clCleanupHost();  // Release host resources
        }
    }

    // IplImage* clSobel = clArrayToCvImage(output,width,height);
    // cvSaveImage("resultCL.bmp",(CvArr*)clSobel);

    return 0;
}
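While the timer question is open, one way to take the SDK timer out of the picture is to time the repetitions with the standard library instead. The following is a minimal sketch, not the code from this project: timeRuns, summarize and RunSummary are hypothetical names, and the work passed in is assumed to block until the kernel has really finished (for example by calling clFinish on the command queue), otherwise only the enqueue cost is measured.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical result type: best and mean wall-clock time in seconds.
struct RunSummary {
    double best;
    double mean;
};

// Runs `work` `repetitions` times and returns each run's wall-clock seconds.
// Assumption: `work` blocks until the kernel has completed.
std::vector<double> timeRuns(const std::function<void()>& work, int repetitions) {
    assert(repetitions > 0);
    std::vector<double> seconds;
    seconds.reserve(static_cast<std::size_t>(repetitions));
    for (int run = 0; run < repetitions; ++run) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}

// Summarizes the timings, ignoring the first `warmup` runs, which often
// include one-off costs such as first-time buffer transfers.
RunSummary summarize(const std::vector<double>& seconds, std::size_t warmup) {
    assert(warmup < seconds.size());
    auto first = seconds.begin() + static_cast<std::ptrdiff_t>(warmup);
    double best = *std::min_element(first, seconds.end());
    double total = std::accumulate(first, seconds.end(), 0.0);
    return RunSummary{best, total / static_cast<double>(seconds.size() - warmup)};
}
```

In the loop above, the inner body could then become something like timeRuns([&] { clRunKernels(alloc_type, kernelCount); }, repetitions), followed by summarize(times, 1) to drop the first run as a warm-up.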


Yeah, sorry, it doesn't fit.

I'm not sure about all the samples. Most of them work, but I don't know how many of them start the timer multiple times.

Here's (at least some of) the header file:

 

#ifndef TEMPLATE_H_
#define TEMPLATE_H_

#include <CL/cl.h>
#include <string.h>
#include <cstdlib>
#include <iostream>
#include <string>
#include <fstream>

#include <SDKUtil/SDKCommon.hpp>
#include <SDKUtil/SDKApplication.hpp>
#include <SDKUtil/SDKCommandArgs.hpp>
#include <SDKUtil/SDKFile.hpp>

#include <stdio.h>
#include <stdlib.h>
#include </usr/include/opencv/cv.h>
#include </usr/include/opencv/cxcore.h>
#include </usr/include/opencv/highgui.h>

typedef union my_uint4 {
    cl_uint u32[4];
} my_uint4;

typedef union my_uint2 {
    cl_uint u32[2];
} my_uint2;

/*** GLOBALS ***/

const bool PROFILE = false;

const cl_uint ALLOC_TILE = 0;
const cl_uint ALLOC_HORZ = 1;
const cl_uint ALLOC_VERT = 2;

const cl_uint MASK_WIDTH = 3;   /**< mask dimensions */
const cl_uint MASK_HEIGHT = 3;  /**< mask dimensions */

streamsdk::SDKCommon sampleCommon;
int runTimerKey;

cl_double totalKernelTime;      /**< Time for kernel execution */
cl_double totalProgramTime;     /**< Time for program execution */
cl_double referenceKernelTime;  /**< Time for reference implementation */

// problem size for 1D algorithm and width of problem size for 2D algorithm
cl_uint width;
cl_uint height;

my_uint4 *input;       // Input data is stored here.
cl_uint *intermediate; // Intermediate (intensity) data is stored here.
cl_uint *output;       // Output data is stored here.

//// Sobel Operators are stored here.
cl_uint clSobelOpX[9] = { -1, 0, 1, -2, 0, 2, -1, 0, 1};
cl_uint clSobelOpY[9] = { 1, 2, 1, 0, 0, 0, -1,-2,-1};

int cvSobelOpX[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
int cvSobelOpY[3][3] = {{ 1, 2, 1}, { 0, 0, 0}, {-1,-2,-1}};

// The memory buffer that is used as input/output for OpenCL kernel
cl_mem inputBuffer;
cl_mem intermediateBuffer;
cl_mem outputBuffer;
cl_mem sobelOpXBuffer;
cl_mem sobelOpYBuffer;

cl_context context;
cl_device_id *devices;
cl_command_queue commandQueue;

cl_program program;

/* This program uses only one kernel and this serves as a handle to it */
cl_kernel kernel;

/*** FUNCTION DECLARATIONS ***/

// Utility funs
void cvDisplay(IplImage* image, char windowName[], int x=0, int y=0);
my_uint4 *cvImageToClArray(IplImage* raw);
IplImage* clArrayToCvImage(cl_uint* output, int width, int height);

// OpenCV related funs
int cvDoFindEdges(void);
IplImage* cvGenerateIntensityImage(IplImage* raw);
IplImage* cvGenerateSobelImage(IplImage* intensity);

// OpenCL related funs
void clInitialize(void);
std::string convertToString(const char * filename);

/*
 * This is called once the OpenCL context, memory etc. are set up,
 * the program is loaded into memory and the kernel handles are ready.
 *
 * It sets the values for kernels' arguments and enqueues calls to the kernels
 * on to the command queue and waits till the calls have finished execution.
 *
 * It also gets kernel start and end time if profiling is enabled.
 */
void clRunKernels(void);

/* Releases OpenCL resources (Context, Memory etc.) */
void clCleanup(void);

/* Releases program's resources */
void clCleanupHost(void);

#endif /* #ifndef TEMPLATE_H_ */


Originally posted by: alexaverbuch

 

It has also changed since I posted that error. But to reproduce the error, just go to where I start the timer, copy that line, and paste it a few times in succession.

 

I am not getting any error with your code, even with multiple startTimer calls.
