Maybe I'm just being an idiot, but I can't get any program with async_work_group_copy to compile.
Can anyone say why this kernel wouldn't compile?
__kernel void test( __global uchar *src, __local uchar *dst ) { event_t e = async_work_group_copy(dst, src, 4, 0); }
Which is better for copying from global to local memory: async_work_group_copy + wait_group_events, or a normal copy + barrier?
Well, since I can't get async_work_group_copy to work at all, doing a normal copy is definitely faster.
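For anyone comparing the two approaches asked about above, here is a minimal sketch of both. The kernel names (copy_async, copy_manual) and the parameters tile and n are made up for illustration and are not from this thread. Which one is faster is not settled here and probably depends on the device and driver, so it is worth timing both.

```c
// Illustrative sketch only; kernel and parameter names are hypothetical.

// Variant A: let the work-group do an async copy, then wait on the event.
__kernel void copy_async(__global const uchar *src, __local uchar *tile, uint n)
{
    // All work-items in the group must make this call with the same arguments.
    event_t e = async_work_group_copy(tile, src, (size_t)n, 0);
    wait_group_events(1, &e);
    // tile[0 .. n-1] can now be read by every work-item in the group.
}

// Variant B: each work-item copies a strided share, then a local barrier.
__kernel void copy_manual(__global const uchar *src, __local uchar *tile, uint n)
{
    for (size_t i = get_local_id(0); i < n; i += get_local_size(0))
        tile[i] = src[i];
    barrier(CLK_LOCAL_MEM_FENCE);
    // tile[0 .. n-1] can now be read by every work-item in the group.
}
```

Both variants assume a one-dimensional work-group and a local buffer with room for at least n elements.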
Please call it as follows; it works fine.
async_work_group_copy(dst, src, (size_t)4, 0);
It looks like the compiler forces the user to use an explicit cast.
Thanks for reporting this issue.
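For reference, the kernel from the first post with that cast applied might look like the sketch below. The wait_group_events call is an addition that was not in the original kernel; it is included on the assumption that the copied data should be complete before the kernel goes on to use dst.

```c
// Sketch: the original test kernel with the element count cast to size_t.
__kernel void test(__global uchar *src, __local uchar *dst)
{
    // (size_t)4 is the explicit cast suggested above; the final 0 means
    // there is no earlier event to associate with this copy.
    event_t e = async_work_group_copy(dst, src, (size_t)4, 0);

    // Added for illustration: wait until the copy has finished.
    // Every work-item in the group must reach this call.
    wait_group_events(1, &e);
}
```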
It works! Thanks for the response.