It seems like I can use the GDS on Linux with ROCm without any problems. On Windows, my code that does just 'm0=0x1000; ds_write_b32 ...' works, yet 'm0=0x1000; ds_write_b32 ... gds' fails.
I've read numerous posts on this forum and elsewhere, discussing various ways of using the GDS.
One of the posts, by dipak, mentions that OpenCL does not use the GDS, so it is not initialized. It is not part of the OpenCL spec, so why would it be initialized? Another post says, I think, that I need to send a PM4 packet to initialize it. There are more posts about the GDS, but nothing I've read has worked for me so far.
What should I do if I want to use the GDS reliably on Windows, or on Linux without ROCm?
Any help would be greatly appreciated.