
OpenCL

nibal
Challenger

Ocl 1.2 memory corruption?

Hi,

I am attaching cltest.tgz, which contains the build directory (cltest) with the sample problem, my clinfo output, and val.out, the valgrind output showing that this corruption is deep within ocl1.2. To build the test case, just run the included script:

makecl

on any Ubuntu box.

In my test case the corruption affected the out buffer (the output of the FFT splitter), so I have bracketed its addresses with printfs. In my main program the corruption is elsewhere. If it is elsewhere for you as well, just comment out the "NIKOS!!!" printfs.
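
For anyone who wants to try the same bracketing idea without sprinkling address printfs everywhere, here is a minimal sketch in plain C. It is a variation on the approach, not the code in cltest.tgz: it pads a buffer with guard bytes and checks them at named checkpoints. All names (guarded_alloc, guards_intact, guarded_free, GUARD_BYTES, GUARD_FILL) are hypothetical, and it assumes malloc returns memory aligned to at least 16 bytes so the returned pointer stays suitably aligned.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 16   /* hypothetical guard size on each side of the buffer */
#define GUARD_FILL  0xA5 /* hypothetical fill pattern for the guards */

/* Allocate n usable bytes with a guard region before and after them. */
static void *guarded_alloc(size_t n)
{
    assert(n > 0);
    unsigned char *raw = malloc(n + 2 * GUARD_BYTES);
    if (raw == NULL)
        return NULL;
    memset(raw, GUARD_FILL, GUARD_BYTES);                   /* front guard */
    memset(raw + GUARD_BYTES + n, GUARD_FILL, GUARD_BYTES); /* back guard */
    return raw + GUARD_BYTES;
}

/* Return 1 if both guards still hold the fill pattern, 0 otherwise.
 * On damage, print the checkpoint tag so the offending step can be located. */
static int guards_intact(const void *p, size_t n, const char *tag)
{
    const unsigned char *front = (const unsigned char *)p - GUARD_BYTES;
    const unsigned char *back = (const unsigned char *)p + n;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (front[i] != GUARD_FILL || back[i] != GUARD_FILL) {
            fprintf(stderr, "%s: guard bytes damaged\n", tag);
            return 0;
        }
    }
    return 1;
}

/* Free a buffer obtained from guarded_alloc. */
static void guarded_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - GUARD_BYTES);
}
```

Calling guards_intact(out, size, "NIKOS2") before a suspect call and guards_intact(out, size, "NIKOS3") after it narrows the write down to whatever runs between the two checkpoints. It only catches writes that land within GUARD_BYTES of the buffer, so valgrind is still needed for anything further away.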

Reproducible: Always

Observed symptoms: When I try to free the allocated address on exit, the program dumps core:

https://www.dropbox.com/s/7hzs968h8ra98pa/cltest.tgz?dl=0

0 Likes
5 Replies
nibal
Challenger

Please disregard.

I continued the bracketing approach after I posted this and found the memory corruption to be in my code, not in ocl's.

I don't quite understand why valgrind would report this stack trace, but I trust my printfs more :)

Sorry about the confusion:(

Nikos

0 Likes

Actually, I continued the investigation with the bracketed printfs. It seems that the 8-byte invalid write is in waitForEventAndRelease, bracketed by NIKOS2 & NIKOS3. The strange thing is that the call is in a loop, but the invalid write happens only once, in the second pass.

It's always reproducible. You can download the test case, valgrind output and clinfo from:

https://www.dropbox.com/s/88ug7lvqazy440a/cltest.tgz?dl=0

0 Likes

Any further calls to waitForEventAndRelease generate invalid reads and dump core.

Seems we cannot use events in AMD's ocl1.2:(

0 Likes

As far as I can see, waitForEventAndRelease() contains these two OpenCL calls: clWaitForEvents and clReleaseEvent.

Could you please try a simple test case like the code snippet below to see whether the issue is reproducible?

cl_event event;
for (int i = 0; i < N; i++) {
    clEnqueue<command>(..., &event, ..); // generate an event using any command
    clWaitForEvents(1, &event);          // block until that command completes
    clReleaseEvent(event);               // drop our reference to the event
}

0 Likes

Hi,

Thanks for looking into it,

New sources uploaded to https://www.dropbox.com/s/a3bxb86a49z3a21/cltest.tgz?dl=0

You just expand the archive:

-> cd cltest

-> makecl

-> cltest

cltest, the new executable, is much simpler. It contains just two CL calls, clEnqueueMapBuffer and clEnqueueUnmapMemObject, along with their clWaitForEvents and clReleaseEvent calls.

val.out is the valgrind output, and clinfo.txt is my clinfo output.

setupCL and shutdownCL are still called to set up the two buffers and clean them up. No CL kernel is used.

The original invalid 8-byte write is still there and seems to be bracketed by the first clWaitForEvents. It happens only once.

This is always reproducible. I was not able to observe the invalid reads on subsequent events (omap), even when I looped it 1000x, and I was not able to reproduce the strange effect of the write happening on the 2nd pass.

Commenting out the loop did not generate the invalid write (setupCL and shutdownCL are clean). However, commenting out just the clWaitForEvents and clReleaseEvent calls generated more invalid writes. It could be that the problem is in the map call (clEnqueueMapBuffer), happens only once, and clWaitForEvents delays things just enough for the write to show up there.

No crashes or cores this time, but should I be concerned about the quality of the output?

0 Likes