Sorry for the late reply.
I am still getting compilation errors.
My work environment:
Visual Studio 2010.
AMD APP SDK 2.9 with Catalyst 13.11 beta.
In VS2013 I also saw warnings about calloc/malloc, but they went away after I included malloc.h. Which errors do you see?
The attached video shows my new experiment: instead of the last propagation, I simply copy data from layer 1 to a local buffer and from the buffer to layer 2. In straight order it shows from 0 to 6 errors on Devastator; in reverse order it adds 63 more errors. It seems the kernel reads global data too early, while that data is still waiting to be computed and has not been filled yet. The same reason would explain why it sometimes gets bad data from the 1st propagation and sometimes does not: sometimes the computation has time to finish, sometimes not. So the question is: how do I properly synchronize the computation so that data is guaranteed to be shared correctly between work-items? Global barriers do not help in this case.
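To make the question concrete, here is a minimal sketch of the copy step in OpenCL C. The kernel name and arguments (copy_via_local, layer1, layer2, tile) are hypothetical and not taken from the attached sources. It assumes a 1D range whose global size is a multiple of the work-group size, and a __local buffer holding one float per work-item of the group.

```c
// Hypothetical kernel, for illustration only (not from the attached sources).
// Assumes: 1D NDRange, global size is a multiple of the work-group size,
// and `tile` has get_local_size(0) floats.
__kernel void copy_via_local(__global const float *layer1,
                             __global float *layer2,
                             __local float *tile)
{
    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);
    const size_t lsz = get_local_size(0);

    // Each work-item stages one element of layer 1 in local memory.
    tile[lid] = layer1[gid];

    // Every work-item of THIS work-group waits here until all local
    // writes above are visible to the rest of the group.
    barrier(CLK_LOCAL_MEM_FENCE);

    // Reading an element written by another work-item of the same group
    // is safe only after the barrier. The reversed index is used here just
    // to force a dependency on a neighbour's write.
    layer2[gid] = tile[lsz - 1 - lid];
}
```

As far as I understand, barrier() only synchronizes work-items of the same work-group, even with CLK_GLOBAL_MEM_FENCE. Data written by a different work-group is only guaranteed to be complete once the kernel that wrote it has finished, for example when each propagation runs as a separate kernel launch on an in-order command queue.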
All sources are attached.