Is there a way to do atomic operations on texture values? I need to do atomic updates on pixel values. Each pixel could be updated by different threads from the same or different workgroups. Atomic operations are only allowed on global or local memory variables. There might be a way to create a critical section for each pixel, but I don't think it would be efficient.
I need to read values from an image, then do some math on them, and write them back. I need something like atomic_add on a pixel value instead of on global or local variables.
How do you write back the pixel currently? I am assuming you are using "write_image" and if I understand correctly, you are looking for some variant of write_image() which can add atomically as well.
I don't think this can be done using OpenCL as of now.
I can pass the feature request on to someone who can talk to Khronos, but I need a use case for it.
Can you tell me what kind of application you are working on, and how frequently this feature is needed in your problem domain?
As a workaround, you can atomically write to global memory and then copy that global memory into an image object later for usage as an image.
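A rough sketch of the two passes, written as plain host-side C with C11 atomics so it can be compiled and tried without a device. The helper names (accumulate_pixel, copy_to_pixels) are made up for illustration. In a real kernel the first pass would be an atomic_add on a __global int buffer, and the second pass would typically be a clEnqueueCopyBufferToImage call on the host rather than a loop.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: what one work-item does in the accumulate pass.
 * The OpenCL counterpart would be atomic_add(&accum[idx], delta)
 * on a __global int buffer. */
static void accumulate_pixel(atomic_int *accum, size_t width,
                             size_t x, size_t y, int delta)
{
    /* Row-major index of pixel (x, y). */
    atomic_fetch_add(&accum[y * width + x], delta);
}

/* Hypothetical second pass: copy the accumulated values into the
 * storage that is later used as the image. */
static void copy_to_pixels(atomic_int *accum, int *pixels, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        pixels[i] = atomic_load(&accum[i]);
}
```

This only covers integer accumulation. Any other math on the pixel value would need a different atomic or a compare-and-exchange loop on the same buffer.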
I am using an image because it's faster and it also handles out-of-bounds pixels correctly. The workaround would certainly work, but it is slower and I would have to handle the boundary cases myself.
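For reference, the boundary handling I would have to add looks roughly like this. It is a plain C sketch that mimics what a sampler with CLK_ADDRESS_CLAMP_TO_EDGE does for an image read. The helper names (clamp_coord, read_clamped) and the row-major int buffer layout are my own assumptions, not anything from the OpenCL API.

```c
#include <stddef.h>

/* Clamp a possibly out-of-range coordinate into [0, size - 1]. */
static size_t clamp_coord(long v, size_t size)
{
    if (v < 0)
        return 0;
    if ((size_t)v >= size)
        return size - 1;
    return (size_t)v;
}

/* Hypothetical helper: read pixel (x, y) from a row-major buffer,
 * treating out-of-bounds coordinates as the nearest edge pixel. */
static int read_clamped(const int *buf, size_t width, size_t height,
                        long x, long y)
{
    return buf[clamp_coord(y, height) * width + clamp_coord(x, width)];
}
```

The two extra comparisons per coordinate are the overhead I get for free with an image.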
Hmm, that may be useful, but we need to study why images are actually faster than global buffers. IIRC, it is because of the 2-D texture cache, which better exploits the spatial memory access patterns of image processing algorithms. I will ask some experts whether atomics and cl_images can go hand in hand.
If it is possible, I would like it to be in the OpenCL spec (so it should come via Khronos) instead of an AMD-specific extension. I will update you if I hear anything about this.
The problem is that images also come with filtering, and filtering is a scenario where multiple reads are performed. So while it is not impossible to do atomics on images (and I couldn't comment on the roadmaps of either Khronos or AMD in that regard), it is certainly not a simple operation, and it might never make sense for a lot of algorithms if it limited the types of images that would work. For example, if enabling atomics disabled filtering, would images then have enough benefits over simply using a buffer for the same purpose? When you consider that atomics may have to read and write more distant cache levels (to achieve memory consistency), you might also lose the additional caching benefits of images.
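To make the "multiple reads" point concrete, here is a simplified bilinear lookup in plain C. It is only a sketch: sample_bilinear is a made-up name, the image is assumed to be a row-major float buffer, and the coordinates are assumed to be already in texel space with 0 <= u <= width - 1 and 0 <= v <= height - 1. It is not the exact coordinate formula from the OpenCL spec for CLK_FILTER_LINEAR, but it does show that one filtered read touches a 2x2 block of texels.

```c
/* Simplified bilinear filter over a row-major float buffer.
 * Assumes 0 <= u <= width - 1 and 0 <= v <= height - 1. */
static float sample_bilinear(const float *buf, int width, int height,
                             float u, float v)
{
    int x0 = (int)u;
    int y0 = (int)v;
    /* Neighbouring texel, clamped at the right and bottom edges. */
    int x1 = (x0 + 1 < width) ? x0 + 1 : x0;
    int y1 = (y0 + 1 < height) ? y0 + 1 : y0;
    float fx = u - (float)x0;
    float fy = v - (float)y0;

    /* Four separate texel reads for a single filtered sample. */
    float t00 = buf[y0 * width + x0];
    float t10 = buf[y0 * width + x1];
    float t01 = buf[y1 * width + x0];
    float t11 = buf[y1 * width + x1];

    float top = t00 + fx * (t10 - t00);
    float bottom = t01 + fx * (t11 - t01);
    return top + fy * (bottom - top);
}
```

An atomic update to any one of those four texels would change the filtered result, which is part of why combining the two is not straightforward.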
On modern architectures, for pure read/write scenarios, I don't think images offer much of a performance benefit beyond bounds checking, and bounds checking is usually only a significant overhead in relatively short kernels.