Looks like AMD does not take us seriously.
When the smallest bug occurs in one of the games, they work very hard to release patches and hotfixes, because gamers ARE valued customers.
And me? I've bought more AMD graphics cards than most other people on the planet, and I know people who bought DOZENS of these cards for bitcoin mining. We ARE a huge community with a lot of buying power, so why are we not treated as valued customers?!
Yes, bitcoin mining is dying.
But AMD did not take us seriously even when the bitcoin mining scene was healthy, so I don't see a relation.
I wonder how it is possible to organize and make so much noise (with advertising) for a contest like the one on topcoder.com. I guess the participants there just do not know how unstable AMD's OpenCL runtime and the Catalyst driver are.
This is a real shame. Not even one comment from AMD here if I'm not mistaken...
Perhaps AMD is giving up against CUDA, which is really coming along. And now there is OpenACC, too. I guess AMD + Linux isn't meant to be. If this bug isn't fixed within 2 weeks, AMD is dead to me for good.
Catalyst 11.11 for Linux partially fixes the problem. CPU load is way lower now (~30% as compared to ~100% previously). GPU_USE_SYNC_OBJECTS is still a no-op, so the improvement shows up regardless of how the environment variable is set.
I have almost regained the pre-2.5 speeds, which is good. CPU utilization is still much higher than before (~30% vs. 5-6%), but it's much better now. This Catalyst version is the best thing to come out of AMD in the last 6 months.
Another thing to mention (sorry for hijacking the thread): the IL compiler seems to have been improved and no longer generates excessive MOV instructions when working with long types. One of my kernels does SHA-512, where all operations are 64-bit rather than 32-bit. Going from 11.10 to 11.11 increased its speed by more than 100%, which is very nice 🙂
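To illustrate why 64-bit code is so sensitive to the compiler here, below is a minimal sketch in plain C (not OpenCL, and not actual IL output) of how a 64-bit rotate-right, the kind of operation SHA-512 uses heavily, can be expressed with 32-bit halves only. The function names `rotr64_native` and `rotr64_split` are my own, and the assumption that the hardware works on 32-bit registers is just that, an assumption. The point is only that every 64-bit operation turns into several 32-bit ones, so extra register moves add up quickly.

```c
#include <stdint.h>

/* Reference: rotate a 64-bit value right by n bits (n taken modulo 64). */
static uint64_t rotr64_native(uint64_t x, unsigned n)
{
    n &= 63u;
    return n ? (x >> n) | (x << (64u - n)) : x;
}

/* Same rotate, using only 32-bit shifts and ORs on the two halves.
 * Hypothetical model of what a compiler targeting 32-bit registers
 * has to emit; not taken from any real compiler output. */
static uint64_t rotr64_split(uint64_t x, unsigned n)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    n &= 63u;
    if (n >= 32u) {
        /* Rotating by 32 simply swaps the halves. */
        uint32_t t = lo;
        lo = hi;
        hi = t;
        n -= 32u;
    }
    if (n != 0u) {
        /* Each half takes its low bits from itself and its
         * high bits from the other half. */
        uint32_t new_lo = (lo >> n) | (hi << (32u - n));
        uint32_t new_hi = (hi >> n) | (lo << (32u - n));
        lo = new_lo;
        hi = new_hi;
    }
    return ((uint64_t)hi << 32) | lo;
}
```

Both functions should return the same value for any input, e.g. `rotr64_split(x, 28) == rotr64_native(x, 28)`.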
I still have that problem with the BFE_UINT optimization ((a&b)>>c). For some reason, there are now excessive MOVs here that have no obvious explanation. With 2.4, this ended up as AND+SHR; now we get MOV+BFE, sometimes even two MOVs and a BFE. It's just slower, and it hurts the performance of my DES kernels 😞
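For anyone not familiar with the pattern, here is a minimal sketch in plain C of the two forms being compared. `and_shr` is the (a&b)>>c expression as written in the kernel, and `bfe_u32` is my own software model of an unsigned bit-field extract (it is an assumption, not AMD's exact instruction definition). The two agree whenever the mask is a contiguous run of `width` bits starting at bit `offset`, which is what allows the compiler to substitute one for the other, for example when pulling a 6-bit field out of a word.

```c
#include <stdint.h>

/* Two-operation form: mask first, then shift right (AND + SHR). */
static uint32_t and_shr(uint32_t a, uint32_t mask, uint32_t shift)
{
    return (a & mask) >> shift;
}

/* Single-operation form: extract `width` bits starting at bit `offset`.
 * Hypothetical model of an unsigned bit-field extract; assumes
 * offset < 32 and offset + width <= 32. */
static uint32_t bfe_u32(uint32_t a, uint32_t offset, uint32_t width)
{
    /* Avoid the undefined shift by 32 when the full word is requested. */
    uint32_t field_mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (a >> offset) & field_mask;
}
```

With a mask of `0x3Fu << 6` and a shift of 6, `and_shr(a, 0x3Fu << 6, 6)` and `bfe_u32(a, 6, 6)` return the same 6-bit field, so the BFE form is only a win if it really is a single instruction with no extra MOVs around it.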