On my HD 4850, I am using OpenCL to write a video post-processing kernel that allocates some local memory, and I found an issue: with 800x600 globalThreads, only the first few lines are processed. When I remove the local memory allocation, it works correctly.
I can reproduce this issue with the "MatrixTranspose" sample in the SDK. Change the command line to: -t -x 800 -y 800 -e, to process an 800x800 matrix. Checking the result, only the first 200 lines are transposed.
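In case it helps anyone reproduce this, here is a rough host-side check for counting how many leading rows of the output were actually transposed. It is only a sketch: the function name count_leading_transposed_rows is my own and is not part of the SDK sample, and it assumes both matrices are row-major float arrays copied back to the host.

```c
#include <stddef.h>

/* Hypothetical helper (not from the SDK sample).
 * `in`  is a height x width matrix, row-major.
 * `out` is the candidate transpose, width x height, row-major.
 * Returns the number of leading rows of `out` that match the transpose
 * of `in`. A return value equal to `width` means every row is correct. */
size_t count_leading_transposed_rows(const float *in, const float *out,
                                     size_t width, size_t height)
{
    for (size_t row = 0; row < width; ++row) {
        for (size_t col = 0; col < height; ++col) {
            /* out[row][col] should equal in[col][row] */
            if (out[row * height + col] != in[col * width + row]) {
                return row; /* first row containing a mismatch */
            }
        }
    }
    return width;
}
```

Calling it with width = 800 and height = 800 on the buffers from the run above should show where the output stops being correct, if the failure really is a clean cutoff at one row.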
I know that on RV770 local memory is emulated using global memory. Does this cause the issue?