
OpenCL

andyste1
Adept II

OpenCL/DirectGMA Problems on Win11?

Our Windows desktop software runs a 3rd-party PCI acquisition card that acquires a stream of data in the form of "records". Each record gets transferred to the GPU (Radeon Pro WX7100) via DirectGMA, where the record data is processed and a small number of results transferred back to the host software. The PCs are Dell Precision (Xeon) 5860 with Win10, and all of this has been working fine on many customer sites for a number of years.

Dell recently started shipping the PCs with Win11, which we got around to testing last week. However, we've found that our software isn't working on these machines. We've tested on two different PCs using different acquisition cards and GPUs, with the same problems every time. Everything works fine again when we re-image to Win10. We usually install the AMD "Pro Edition" driver v22.Q4 but also tried 24.Q4, to no avail.

 

I was reluctant to list the problems in any great detail here, as I suspect they are all merely different symptoms of whatever is going on at a lower level, and don't want to muddy the waters. Here goes anyway:

When the software starts the acquisition process, it seems to fail within half a second, or run for a couple of minutes before failing. Weirdly, the very short runs report impossibly high data throughputs (~140Gb/s) while the very long ones report very low throughputs, possibly suggesting a problem with DirectGMA or the "clEnqueueWaitSignalAMD" mechanism? The software isn't crashing as a result of (say) an OCL command that fails to execute. Instead, it crashes because it eventually receives a "corrupt" buffer from the GPU (full of 0's rather than "valid" results). Also strange is that event profiling shows an average execution time of 0 for every cl... command (although it sometimes reports an unfeasibly high avg execution time of around 1.8^10s for clEnqueueWaitSignalAMD!).
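One way to narrow down symptoms like these is to sanity-check the raw profiling timestamps per event instead of looking only at averages. The following is a minimal host-side sketch, not taken from the software described above: it assumes the start and end values are the nanosecond counters returned by clGetEventProfilingInfo (CL_PROFILING_COMMAND_START / CL_PROFILING_COMMAND_END), and the names `EventTiming`, `classify_timing` and `throughput_gbit_per_s`, as well as the 10-second plausibility limit, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EventTiming:
    name: str      # label for the command, e.g. "clEnqueueWaitSignalAMD"
    start_ns: int  # assumed: CL_PROFILING_COMMAND_START, in nanoseconds
    end_ns: int    # assumed: CL_PROFILING_COMMAND_END, in nanoseconds


def classify_timing(t, max_plausible_ns=10_000_000_000):
    """Return a short label describing whether one event's timing looks sane.

    max_plausible_ns is an arbitrary illustrative limit (10 s), not a value
    from any driver or specification.
    """
    if t.end_ns < t.start_ns:
        # In C host code with unsigned 64-bit counters, end - start would
        # wrap to a huge value here, which is one possible source of
        # absurdly large reported durations.
        return "end-before-start"
    duration = t.end_ns - t.start_ns
    if duration == 0:
        return "zero-duration"
    if duration > max_plausible_ns:
        return "implausibly-long"
    return "ok"


def throughput_gbit_per_s(num_bytes, elapsed_ns):
    """Bits per nanosecond equals Gbit/s; returns None if elapsed is not positive."""
    if elapsed_ns <= 0:
        return None
    return num_bytes * 8 / elapsed_ns
```

Logging the label for every event, rather than an average, would show whether the zero durations and the huge values come from the same commands, and whether the impossible throughput figures simply follow from near-zero elapsed times.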

When running an acquisition we were also occasionally seeing black AMD error message boxes pop up. It might have been the "driver timeout" error, but I can't be certain (unfortunately I didn't take a screenshot, and since re-imaging to Win10 it'll be a while before I can test anything again). 

 

To expand a little on what our software is doing, it's essentially a "read loop" like this:

 

While not received all records

    clEnqueueWaitSignalAMD (wait for next record to arrive from acq. card)

    Execute several kernels used to process the received data

    clEnqueueReadBuffer (to transfer the results to the host software)

    Host software processes these results

end while
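For testing, the loop above can be instrumented so that it stops at the first record whose results come back as all zeros and reports that record's index. This is only a sketch of the loop's shape under stated assumptions: the three callables (`wait_for_record`, `process_record`, `read_results`) are hypothetical stand-ins for the clEnqueueWaitSignalAMD step, the kernel launches and the clEnqueueReadBuffer step respectively, and no real OpenCL calls are made here.

```python
def run_read_loop(num_records, wait_for_record, process_record, read_results):
    """Run the read loop; return the index of the first all-zero result buffer.

    Returns None if every record produced at least one non-zero byte.
    All three callables are hypothetical placeholders for the real steps.
    """
    for i in range(num_records):
        wait_for_record(i)         # stand-in for the wait-signal step
        process_record(i)          # stand-in for the processing kernels
        results = read_results(i)  # stand-in for the blocking read-back
        # any() over bytes is False when every byte is 0 (or the buffer is
        # empty), so an empty buffer is also reported as suspect.
        if not any(results):
            return i
    return None


if __name__ == "__main__":
    # Simulated run: record 3 comes back zero-filled.
    def fake_read(i):
        return bytes(16) if i == 3 else bytes([1] * 16)

    bad = run_read_loop(10, lambda i: None, lambda i: None, fake_read)
    print("first all-zero record:", bad)
```

Knowing which record is the first to fail, and how long the preceding wait took, might help distinguish a signal that fires too early from a transfer that never arrives.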

 

I haven't found anything specific online other than complaints about AMD drivers being replaced by Windows Updates (which isn't happening here), so I'm after any suggestions or pointers on where to start looking...

 

Thanks in advance.

 

 

5 Replies

The AMD forum moderator for Pro GPU cards, @fsadough, can probably assist you with your issues with that program and Pro GPU cards.

elstaci
fsadough
Moderator

This is an issue to be investigated by our Software team. I would need detailed information so I can file an internal ticket for our SW team.

 

Can you please provide me with an AMDZ Report using Radeon Pro WX7100 on Win 11?

AMDZ Report
- Please extract the amdz-v353.zip available from https://we.tl/t-HpaFhHqeQX

- Run amdz.exe file as an Administrator
- Select Save All and TXT as the output format
- Click on the blue button to save the report
- The .txt file will be saved in the same folder where you extracted the zipped file

Thanks. Will do, but it might be a while before I can get back to you as we'll need to source another PC (the two we were using last week have now shipped to customers).

Please download the tool asap though. 


As requested I have uploaded the AMDZ text file here 

Let me know if there are any issues accessing the link.