
Bug in AMD OpenCL implementation?

Question asked by ddemidov on Apr 16, 2013
Latest reply on Apr 19, 2013 by himanshu.gautam

Hello,

I am not sure if this is the right place for OpenCL bug reports, so please forgive me if I am wrong. Here is a link to a simple program that adds two vectors multiple times: https://gist.github.com/ddemidov/5398174. The source is also attached for convenience.

The program, when compiled with

    g++ -std=c++0x -o vector_sum vector_sum.cpp -lOpenCL

outputs 4096 == 4096 on the NVIDIA and Intel OpenCL implementations. However, when it is executed on AMD GPUs (the ones I tested are an HD 7970 'Tahiti' and an HD 7770 'Capeverde'), it may output 4096 == 4081, 4096 == 4082, or some other incorrect value.

Adding a call to cl::CommandQueue::finish() after each kernel launch (as opposed to a single call after the whole loop) solves the issue, but should be unnecessary according to the standard.

Replacing the definition of global_size at line 99 with

    size_t global_size = alignup(N, workgroup_size);

also helps, but is equally unnecessary.
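
For readers who do not want to open the gist: the definition of alignup is not reproduced in this post, so the following is only a minimal sketch of what such a helper usually does, assuming it rounds its first argument up to the nearest multiple of the second. The padding function is a hypothetical addition for illustration and is not part of the original program.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the gist's alignup(): rounds n up to the
// nearest multiple of m. The real definition in the gist may differ.
inline std::size_t alignup(std::size_t n, std::size_t m) {
    assert(m > 0);               // a zero work-group size is meaningless
    return (n + m - 1) / m * m;  // integer division truncates toward zero
}

// Illustrative helper (not in the original program): the number of extra
// work-items that would be launched past the end of the data.
inline std::size_t padding(std::size_t n, std::size_t m) {
    return alignup(n, m) - n;
}
```

Under this assumption, alignup(4097, 256) gives 4352, while a size that is already a multiple of the work-group size is returned unchanged. Whenever the padded size exceeds N, the kernel needs a bounds check (something like `if (i < n)`) so that the extra work-items do not touch memory outside the buffers.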

The operating system is Gentoo Linux with kernel version 3.7.1, and the ati-drivers package is version 13.1. However, I have observed this behavior on several machines, across several consecutive versions of ati-drivers and several Linux kernels.

Is this a bug in AMD OpenCL, or am I doing something wrong?
