Hi,
as updates to older threads don't seem to receive a lot of attention, I'm creating this new one. I'm asking you to head over to the original thread and help me find a workaround or solution to this "behavior", which I'd call an AMD OCL compiler bug (until proven otherwise ;-).
Thanks
Sorry for the delay. The team is looking into the issue. I'll keep you posted.
Thanks for trying the use-case. What your test shows is that vectorizing the data is a valid workaround:
VectorSize | 4 |
In order to reproduce the failure, please copy the mfakto.ini file from use-case.zip into your local directory (...\use-case-src\x64\Debug). This will provide the required settings (e.g. VectorSize 1).
Yes, the program fails when the vector size is set to 1. However, the code is quite big, and I am unable to say whether it fails because of a bug in the code itself or a bug in the driver/API implementation.
If you think that the bug is in the driver or lower-level API implementation, it would be a great help if you could capture it in simpler code that illustrates just this bug.
Well, the fact that adding another printf statement makes the code work kind of proves that it is a lower-level bug. These dependencies on other parts of the code have made it impossible for me to come up with a simpler use-case. But I will try again.
Reviving the thread. Can you try to create a small reproducible use case?