Hi everyone,
I recently upgraded from Fedora 38 to Fedora 39, which changed rocm-opencl-5.5.1 to rocm-opencl-5.7.1.
I have a very basic OpenCL kernel that compares 64-bit integers against a large (14GB) array in GPU global memory. Each work-item iterates through 3,500 64-bit integer values and compares each one against global_memory_array[get_global_id(0)]. If there is a match, a value is returned. As I said, it is a very basic kernel built for massive parallelization.
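The access pattern described above can be sketched as follows. This is an illustration only, not the actual kernel: the names `KERNEL_SRC`, `compare`, `haystack`, `needles`, `result`, and `compare_reference` are hypothetical, and the sketch assumes the global work size equals the array length and that a match is reported by writing the matching value to an output buffer. The OpenCL C source is carried as a string and is not compiled here; the Python function is a plain CPU emulation of the same logic, useful for checking results on a small input.

```python
# Hypothetical OpenCL C source approximating the kernel described in the post.
# Assumption: one work-item per array element, global size == array length.
KERNEL_SRC = r"""
__kernel void compare(__global const ulong *haystack,  /* the large array   */
                      __global const ulong *needles,   /* ~3,500 values     */
                      const uint n_needles,
                      __global ulong *result)          /* assumed output    */
{
    const size_t gid = get_global_id(0);
    const ulong value = haystack[gid];   /* one global-memory read per item */
    for (uint i = 0; i < n_needles; ++i) {
        if (needles[i] == value) {
            result[0] = value;           /* report the match */
        }
    }
}
"""


def compare_reference(haystack, needles):
    """CPU emulation of the sketch above.

    Returns a list of (gid, value) pairs, one for every array element
    that equals at least one of the needle values.
    """
    matches = []
    for gid, value in enumerate(haystack):  # each iteration stands in for a work-item
        for needle in needles:              # the inner loop over the 3,500 values
            if needle == value:
                matches.append((gid, value))
                break                       # one report per element is enough here
    return matches
```

Running the reference on a handful of values, including some above 2^32, gives a known-good answer to compare against the GPU output on both driver versions.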
Under rocm-opencl-5.5.1 I was getting 1.2 trillion comparisons per second on a 6900XT.
With rocm-opencl-5.7.1 this dropped to 320 million comparisons per second, roughly 3,750 times slower.
Downgrading the Linux kernel made no difference, so I have reinstalled Fedora 38 with rocm-opencl-5.5.1 and everything is working properly again.
Is anyone aware of recent changes in the OpenCL drivers that would explain the reduction in performance of basic 64_bit_integer == 64_bit_integer comparisons in global memory?
Thanks,
Anthony