The small kernel below, which uses an atomic_min operation, fails on an HD7970: buffer C is not changed at all (you can test with C = (0,1,2,3,4,5,6,7,8, ... 99) and global size 99). But if I uncomment the printf("a"); line, it works and C is changed correctly!? Using printf of course slows down the computation. I am currently using Ubuntu 12.04, AMD driver 12.9 beta and AMD APP 2.7.
Note that this works on AMD 5870 and 5850, and NVIDIA cards.
Does anyone know a solution or workaround for this problem? Thanks.
__kernel void test(
    volatile __global int * C
) {
    int id = get_global_id(0);
    atomic_min(C+id*2+1, C[id*2]);
    //printf("a");
}
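To make "C is changed correctly" concrete, here is a minimal CPU sketch in plain C of what the kernel above is expected to leave in the buffer. The function name reference_pair_min and the parameter num_items (standing in for the global size) are my own, not part of any API. A sequential loop is enough here because each work-item only reads C[2*id] and writes C[2*id+1], so no two work-items touch the same element.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the kernel above: work-item `id` replaces c[2*id+1]
 * with min(c[2*id+1], c[2*id]). `num_items` plays the role of the global
 * size (hypothetical name); the buffer must hold at least 2*num_items ints. */
void reference_pair_min(int *c, size_t num_items)
{
    assert(c != NULL);
    for (size_t id = 0; id < num_items; ++id) {
        int candidate = c[2 * id];        /* value passed to atomic_min */
        if (candidate < c[2 * id + 1]) {  /* keep the smaller of the two */
            c[2 * id + 1] = candidate;
        }
    }
}
```

For an input of (0,1,2,3,4,5,6,7) and num_items = 4 this gives (0,0,2,2,4,4,6,6), so comparing the buffer read back from the device against this reference shows whether the atomic_min actually ran.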
I also discovered that some compiler optimizations don't work on the HD7970 (Tahiti) device and cause the compiler to segfault. Using the option "-cl-opt-disable" avoids this problem.
I found a solution to this problem. Storing the result of the atomic_min operation in a variable declared volatile helps:
__kernel void test(
    volatile __global int * C
) {
    int id = get_global_id(0);
    volatile int i = atomic_min(C+id*2+1, C[id*2]);
}
What would happen if you remove "volatile"?
It doesn't affect anything.