17 Replies Latest reply on Nov 16, 2011 9:52 PM by kphillisjr

    Multi-GPU broken with SDK 2.5

    gat3way

      Previously we had the GPU_USE_SYNC_OBJECTS environment variable, and it apparently no longer works. The spinlocks are back in the runtime, along with the 100% CPU usage problem, and performance drops. Thank you, but I am sticking with 2.4 until that's solved.
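      For readers who have not used the variable: a minimal sketch of setting it from inside the host program rather than from the shell. The helper name enable_sync_objects is hypothetical, the value "1" is the one reported later in this thread, and it is an assumption that the runtime reads the variable when it initializes, so the call would have to come before the first OpenCL call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* exposes setenv() under strict ISO C modes */
#endif
#include <stdlib.h>

/* Hypothetical helper: sets GPU_USE_SYNC_OBJECTS=1 for the current process.
 * Assumption: the OpenCL runtime reads the variable at initialization, so
 * this must run before the first OpenCL call.
 * Returns 0 on success, -1 on failure (same convention as setenv). */
int enable_sync_objects(void)
{
    return setenv("GPU_USE_SYNC_OBJECTS", "1", 1 /* overwrite existing value */);
}
```

      Exporting the variable in the shell before launching the program has the same effect; the in-process form only helps when you cannot control the launch environment.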

      bitselect() is still not mapped to BFI_INT. Why?
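      As a reference for what is being asked, here is a plain C model of what OpenCL's bitselect(a, b, c) computes for 32-bit values: each result bit comes from a where the matching bit of c is 0, and from b where it is 1. The function name bitselect_u32 is hypothetical; this only illustrates the semantics and says nothing about the instructions the compiler actually emits.

```c
#include <stdint.h>

/* Host-side model of OpenCL bitselect(a, b, c) for 32-bit unsigned values.
 * Result bit i = bit i of a if bit i of c is 0, otherwise bit i of b. */
uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}
```

      Written out like this it costs three bitwise operations, which is why a single-instruction mapping matters in hash-style kernels.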

      The BFE_UINT optimization (which is mentioned in the docs) is slower when it operates on values from __local memory: additional MOV instructions are generated, and some of my kernels are now slower, because MOV+BFE is slower than LSHR+AND.

      Offline compilation is now broken too.
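      For context, the shift-and-mask pattern in question looks roughly like the sketch below: the LSHR+AND form extracts `width` bits starting at bit `offset`, which is the job a bit-field-extract instruction does in one step. The name bfe_u32 is hypothetical, and the sketch assumes offset is less than 32.

```c
#include <stdint.h>

/* Extracts `width` bits of x starting at bit `offset` (offset < 32 assumed).
 * This is the shift (LSHR) followed by mask (AND) form of a bit-field extract. */
uint32_t bfe_u32(uint32_t x, unsigned offset, unsigned width)
{
    /* A shift by 32 or more is undefined in C, so handle full width separately. */
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> offset) & mask;
}
```

      For example, bfe_u32(x, 8, 8) pulls out the second-lowest byte of x.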

      I am rather disappointed :(

        • Multi-GPU broken with SDK 2.5
          MicahVillmow
          gat3way,
          Can you give an example of how offline compilation is broken?
            • Multi-GPU broken with SDK 2.5
              nou

              Someone on the forum reported that even the offline compilation example from the AMD knowledge base is broken.

                • Multi-GPU broken with SDK 2.5
                  gat3way

                  Correct, clBuildProgram() either crashes or returns an error with an empty log.

                  strace shows that it is trying to open /usr/lib/libatiocl32.so (even though I am using the 64-bit runtime)... and also atiocl32.so? Hmmm

                  OK, this was related to previous ICD profiles, I believe. I still can't get compilation for offline devices working, though :(

                    • Multi-GPU broken with SDK 2.5
                      gat3way

                      I tried building for particular devices only; it either crashes or returns errors as well. This is rather annoying. The bad thing is that I can't even stay with 2.4 for kernel compilation and then use the 2.5 runtime, because the runtime has this GPU_USE_SYNC_OBJECTS-not-working problem, which basically kills any performance benefit gained from reduced kernel launch latency/host-device transfers and makes overall performance worse. SDK 2.5 has become a big problem, at least for me. Obviously I don't own every kind of AMD hardware, so I can't build binaries on real devices. And I also have no way to get multi-GPU running seamlessly like it did in 2.4 and 2.3.

                      I don't know whether that's a Linux-only problem; on Windows those may work, or they may not.

                      I know you are all focused on those fancy new APUs, but please do not break anything that used to work :(

                      Well, sorry for ranting, but that basically killed my enthusiasm for the new SDK. I expected things to improve; instead, things that used to work are now broken :(

                        • Multi-GPU broken with SDK 2.5
                          genaganna

                           

                          Originally posted by: gat3way I tried building for particular devices only; it either crashes or returns errors as well. This is rather annoying. The bad thing is that I can't even stay with 2.4 for kernel compilation and then use the 2.5 runtime, because the runtime has this GPU_USE_SYNC_OBJECTS-not-working problem, which basically kills any performance benefit gained from reduced kernel launch latency/host-device transfers and makes overall performance worse. SDK 2.5 has become a big problem, at least for me. Obviously I don't own every kind of AMD hardware, so I can't build binaries on real devices. And I also have no way to get multi-GPU running seamlessly like it did in 2.4 and 2.3.

                          I don't know whether that's a Linux-only problem; on Windows those may work, or they may not.

                          I know you are all focused on those fancy new APUs, but please do not break anything that used to work :(

                          Well, sorry for ranting, but that basically killed my enthusiasm for the new SDK. I expected things to improve; instead, things that used to work are now broken :(

                           

                          gat3way,

                          You are facing two problems:

                              1. GPU_USE_SYNC_OBJECTS not working

                              2. Offline compilation issues

                                  Could you please run the following and let me know what happens?

                                      ./Reduction --dump binaryName

                          Could you also please give us the following information?

                               OS, driver version, CPU, and GPU

                            • Multi-GPU broken with SDK 2.5
                              gat3way

                              Tried it and got a segfault too.

                               

                              OS: Debian Testing

                              Driver version: Catalyst 11.7

                              CPU: AMD Phenom x4

                              GPU: AMD Radeon HD 6870

                                • Multi-GPU broken with SDK 2.5
                                  gat3way

                                  For anyone interested: I made offline compilation work finally!!!

                                   

                                  It looks like the compiler crashes for these three targets:

                                  * Lions

                                  * Bears

                                  * Tigers

                                   

                                  I don't even know what those are (future 7xxx GPUs?).

                                  Anyway, the trick is to create a context using all offline devices, then call clBuildProgram() for each one of them, excluding those three.
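                                  The exclusion step of that workaround can be sketched in plain C as below. This is only the filtering logic, under stated assumptions: the helper names should_skip_target and filter_targets are hypothetical, the device names are assumed to come from clGetDeviceInfo() with CL_DEVICE_NAME and to match the three names above exactly, and real code would compact an array of cl_device_id the same way before calling clBuildProgram() on each remaining device.

```c
#include <stddef.h>
#include <string.h>

/* Targets reported in this thread to crash the SDK 2.5 offline compiler.
 * Assumption: the names match the CL_DEVICE_NAME strings exactly. */
static const char *const kCrashingTargets[] = { "Lions", "Bears", "Tigers" };

/* Returns 1 if a device with this name should be skipped, 0 otherwise. */
int should_skip_target(const char *device_name)
{
    size_t n = sizeof kCrashingTargets / sizeof kCrashingTargets[0];
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(device_name, kCrashingTargets[i]) == 0)
            return 1;
    }
    return 0;
}

/* Compacts `names` in place so that only buildable targets remain,
 * preserving their order. Returns the number of names kept. */
size_t filter_targets(const char **names, size_t count)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; ++i) {
        if (!should_skip_target(names[i]))
            names[kept++] = names[i];
    }
    return kept;
}
```

                                  Building one device at a time, rather than passing the whole list to a single clBuildProgram() call, also makes it easier to see which target fails if another one starts crashing.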

                                   

                                  Now, the GPU_USE_SYNC_OBJECTS problem is the other thing that we need to discover a workaround for :)

                                    • Multi-GPU broken with SDK 2.5
                                      genaganna

                                       

                                      Originally posted by: gat3way Now, the GPU_USE_SYNC_OBJECTS problem is the other thing that we need to discover a workaround for :)

                                      The GPU_USE_SYNC_OBJECTS issue will be fixed in upcoming drivers. Please check the driver release notes to see whether it has been fixed.

                                        • Multi-GPU broken with SDK 2.5
                                          Meteorhead

                                          I have a multi-GPU issue too: it crashes the computer altogether. I tested LuxMark as a multi-GPU benchmark tool. It works all right without setting COMPUTE=:0, but with it set, the first time it froze the machine instantly, and the second time I saw a few corrupted images rendered by the kernels before the OS crashed.

                                          (The glossy ball image is either fuzzy at the start or completely black. However, I saw vivid, randomly colored pixels on the rendered image, and only there, so most likely it was not frame buffer corruption but the kernel output itself.)

                                          OS: Ubuntu 10.04.3 64-bit LTS, Catalyst 11.8, SDK 2.5

                                          P.S.: LuxMark in CPU mode performed flawlessly.

                                          P.P.S.: I have not tried using GPU_USE_SYNC_OBJECTS.

                                            • Multi-GPU broken with SDK 2.5
                                              Meteorhead

                                              Could someone reassure me that this issue is not only on my side? I would very much like to know whether I should wait for a driver or revert to SDK 2.4 until the next SDK comes out.

                                                • Multi-GPU broken with SDK 2.5
                                                  quadboon

                                                  The best combination for me is Cat 11.4 + SDK 2.4. It works on Windows and Linux. It does not generate false positives, as Cat 11.5 does. It works well with Cayman devices, which Cat 11.6 does not. And it does not generate 100% CPU load on Linux (though it does on Windows), as Cat 11.7 and Cat 11.8 do. Multi-GPU works if you set GPU_USE_SYNC_OBJECTS to 1.

                                                    • Multi-GPU broken with SDK 2.5
                                                      Meteorhead

                                                      Thanks for the info, I'll try reverting then. The reason I want to get it working so badly is the cached reads enabled by default, which brought about a 50-75% increase in multiple OpenCL applications. It's a huge boost in performance; shame it came accompanied by this multi-GPU mess-up.

                                                      All I wanted was for someone official to state: yes, we are aware, it is screwed up, expect a fix in the next driver OR SDK. I'm curious about the 'OR' part: which one should we wait for?

                                                        • Multi-GPU broken with SDK 2.5
                                                          quadboon

                                                          I've just tested the new Catalyst 11.9. This annoying 100% CPU bug still exists on both Linux and Windows. Rumours saying it's fixed on Windows are false.

                                                          To AMD: How is this possible? Is there anything we (the users) can do to fix this problem? Maybe some donations?

                                                           

                                                            • Multi-GPU broken with SDK 2.5
                                                              hashman

                                                              A bump.

                                                              Any response from AMD?

                                                              Can you at least confirm the issue exists and you are working on a solution?

                                                              • Multi-GPU broken with SDK 2.5
                                                                genaganna

                                                                 

                                                                Originally posted by: quadboon I've just tested the new Catalyst 11.9. This annoying 100% CPU bug still exists on both Linux and Windows. Rumours saying it's fixed on Windows are false.

                                                                To AMD: How is this possible? Is there anything we (the users) can do to fix this problem? Maybe some donations?

                                                                We found a few more issues on Windows. We are working on this.

                                                                  • Multi-GPU broken with SDK 2.5
                                                                    kphillisjr

                                                                    I can confirm this bug on Windows too, using Catalyst 11.10 with a Radeon HD6720G2 (a hybrid CrossFire solution). The software is one of the projects supported by AMD, namely Bullet Physics (version 2.79-rev2440). The only fix I could find was to copy to the CPU and back to the graphics card in OpenGL.