38 Replies Latest reply on Jan 17, 2012 9:05 AM by nou

    SDK 2.6 issues/questions

      placeholder for all


      First of all, let me start by saying that the new runtime and SDK look neat. Some release notes on the webpage covering the new functionality would be nice, but it's still only the 13th, so no harm done. I have checked a few things, and here are my findings:

      One of my programs, which should only have required sync, previously needed atomics inserted needlessly; it now works without the data becoming corrupt. I will need to verify this on a machine other than my laptop, but preliminary tests seem promising.

      The new samples are very neat, big thanks for them (they'll come in handy very soon for my project). I would like to ask for one additional sample that is not that simple, but that many people would use: as a follow-up to rendering to an image, it would be nice to see an example of using built-in functions to encode the image straight to H264 and save it as an AVI on the HDD.

      The SimpleGL example was broken (INVALID_KHR_SHAREGROUP) in SDK 2.5. It did not bother me that much, as it was not crucial to me so far. Now I'm starting to build CL-GL sharing into a project of mine, and I have noticed that performance has dropped significantly since SDK 2.4. Previously, on my Mobility 5870 under Ubuntu, 800-1200 FPS was normal; now it is 400-500 FPS. What hurts performance so much? If some new functionality in CL-GL sharing comes at this cost, then I guess it is inevitable, but I think this is worth looking into.

      I have not been able to check the reported booting issue on a multi-GPU system, but I'll try to make it to the research center on Thursday and check that as well. I hope the fix for the system instability while running multi-GPU applications turns out to be a lasting one.



        • SDK 2.6 issues/questions

          Since you mentioned CL_GL interoperability issues: which method do you use to install the drivers, and which version of Ubuntu do you use?

          I ask because I had/have another issue with CL_GL interoperability in both 2.5 and 2.6. It returns an INVALID_GL_OBJECT error after creating a CL object from a GL object.

          I must set LD_LIBRARY_PATH=/usr/lib/fglrx as a workaround. Without it, the driver just opens /usr/lib/libGL.so.1 and tries to call OpenGL from the Mesa library.

            • SDK 2.6 issues/questions

              Where can I download SDK 2.6? The download page still has a link to 2.5 even though the md5sum has been updated. Also, I used the http://developer.amd.com/Downloads/AMD-APP-SDK-v2.6-lnx64.tgz address directly but got a version marked RC3 instead of the released version.


              • SDK 2.6 issues/questions

                Right now I have just installed the driver over the old one, and it works fine. I do not set LD_LIBRARY_PATH, because as far as I know it has no effect under Ubuntu, which does not use this variable, but I may be wrong on this point.

                I also saw that the APP SDK page got reverted to 2.5, which is strange. RC3 is actually the version you are looking for. For SDK 2.5, RC2 was the final version that got released.

                I hope the SDK wasn't removed from the download page because something was found in it. Some info would be appreciated.

                  • SDK 2.6 issues/questions

                    OK, I have checked. Catalyst 11.12 still fails to boot a multi-GPU system. Once the system is installed, I run the driver installer and issue "aticonfig -f --initial --adapter=ALL"; if I then press ctrl+backspace to log out, the X server cannot start and the system hangs. Rebooting the system gets me as far as "Starting udev..." at boot time, and the system hangs with a black screen upon trying to start X.

                    This all happens on SLC5.7, and before you tell me that SLC is not supported, the same thing happens with Ubuntu 10.04.3 LTS 64-bit, with the sole exception that the X server manages to start roughly 20% of the time. In the other cases the GUI hangs, or if I'm fast enough I can press ctrl+alt+f1 to get to a text console (tty1) and do some console recovery of xorg.conf. This I cannot do with SLC.

                    Now, since I have exams coming up (not to mention the holidays, but that is quite universal for everyone), I really haven't got the time to **ck around installing different Linux distros. SLC works fine with Catalyst on my notebook, but it just wrecks the test machine.

                    Machine is: CPU Core-i7 920, Motherboard ASUS P6T6 Revolution, 12GB RAM, 3X HD5970

                    This is starting to become unbelievable. I haven't had the chance to run simulations properly on the computer for the past 2 months, since drivers are crap. I want to finish writing a paper, but I can't do any test runs. I'll just switch over to the (multi-GPU) Fermi test machine, as that at least works fine with SLC installed.

                    SLC is free to download, so feel free to reproduce the issue, but my guess is that most distros will behave like this. For once it would be nice to see a Catalyst hotfix (11.12b, for example) that doesn't fix game issues, but ******* bugs like this.

                      • SDK 2.6 issues/questions

                        The best solution would be to make the drivers open source; then one could check the VCS for fixes.

                        And don't talk about spies waiting to look at the driver source code.

                        Hardware spies use hardware tools to spy.

                          • SDK 2.6 issues/questions

                            Meteorhead: did you try running it without xorg.conf? I read somewhere that you shouldn't use xorg.conf at all, as it is deprecated and aticonfig produces an incorrect config (though that may only be the case for multihead setups).

                              • SDK 2.6 issues/questions

                                No, I have not tried that yet, as it was never mentioned anywhere that it is deprecated. I browse through the forum roughly every 30 minutes, read through most of the interesting stuff, and help people where I can, but I have not come across this information.

                                Driver release notes mentioning such "subtle" changes would be nice.

                                On Monday I'll get back to the institute and try recovering xorg.conf with an Ubuntu live pendrive.

                                  • SDK 2.6 issues/questions

                                    Someone official or unofficial please help me: how on Earth can you get multi-GPU running? All the features of SDK 2.6 seem kick@ss, but before I compliment on the great work, someone tell me what the trick is.

                                    "aticonfig --initial -f --adapter=ALL" simply breaks the machine, having it unconfigured (default Xorg.conf) makes "aticonfig --list-adapters" say that it fails to connect to local display, and having it default aticonfig setup (with just one adapter configured) leaves me with only one GPU recognised by opencl.

                                    So someone tell me how to get this multi-GPU support for the 5970 working under Linux, because I have not found the way.

                                    As a side note, is there any way to work around having to keep one user logged into the machine in order to use OpenCL? It is especially annoying that ONLY the user who is logged in can run OpenCL programs. This way I have to create a guest user for everyone to use for running simulations, not to mention setting up auto-login if I would like this to persist after a reboot. (This is far from being professional.)

                                      • SDK 2.6 issues/questions

                                        rm /etc/X11/xorg.conf and try again.

                                        It is definitely possible, as there are a few multi-GPU systems listed at http://www.luxrender.net/wiki/LuxMark_Results that work. Also, maybe you just need to migrate to Ubuntu, as it is an officially supported distro, which can be important.

                                        There was a PDF about running CAL programs via ssh, and the core part of it was to allow remote access to the X server and export DISPLAY=:0.


                                          • SDK 2.6 issues/questions

                                            Does the Multi-GPU work with Xinerama enabled now? I had multiple GPUs working in Linux before, but only if Xinerama wasn't enabled.

                                              • SDK 2.6 issues/questions

                                                Don't enable Xinerama. It is the deprecated old way to create one multi-display desktop. Use xrandr, which is the supported way now.

                                                • SDK 2.6 issues/questions

                                                  I haven't received any replies to my questions on multi-GPU in this thread:

                                                  I wanted to follow up in this thread.

                                                  Using the following system configuration:
                                                  CentOS 6.0 64-bit
                                                  Catalyst 11.12
                                                  AMD APP SDK 2.6
                                                  Two Radeon HD 6970s
                                                  Environment variables: COMPUTE=:0 and DISPLAY=:0

                                                  I am able to run codes with single context, multiple GPUs (2x6970), but the runtime appears to serialize the kernel execution with Catalyst 11.12 + SDK 2.6.  Codes take twice as long to execute as they should, however, they do produce the expected result.  This did not occur with the 11.4 driver + 2.4 runtime in most cases.

                                                  AMD had multi-GPU kernel execution serialization in Catalyst up until 10.4 (yes, 2010) when they fixed it.  The drivers then worked properly, with respect to multi-GPU kernel execution, for the next year until Catalyst 11.4.  Now we're getting pretty close to a year without a proper driver.  The GCN "Tahiti" architecture is very much compute-oriented and there's little incentive for anyone to put more than one of these in a system if they're too much of a challenge to develop.  It's not like developers can fall back to an older version of Catalyst with the new uarch.

                                                  To utilize multiple GPUs with Catalyst 11.12 + SDK 2.6 + Linux, one must create X OpenCL contexts, each addressing one unique GPU device, where X is the number of GPUs in your system.  This is counter to the premise of the OpenCL specification but the separate contexts/devices may then execute kernel code in parallel.

                                                  I haven't had serious trouble booting with multiple GPUs in RedHat-based Linux for some time.  I recommend booting to the command line (runlevel 3) to install drivers.  After installing drivers, run "aticonfig --adapter=all --initial -f".  Reboot just to be safe.  I've never had to play with Xinerama or xrandr.

                                                  If anyone does find the magic formula for proper asynchronous kernel execution in Catalyst >=11.12 + SDK 2.6 + Linux + single-context + multi-GPU, please let everyone know.

                                                    • SDK 2.6 issues/questions

                                                      Enabling the cl_khr_gl_sharing flag still comes up with an error:

                                                      error: can't enable all OpenCL extensions or unrecognized OpenCL extension
                                                        #pragma OPENCL EXTENSION cl_khr_gl_sharing : enable

                                                      ...OpenGL sharing still works though (as always) and gives the expected results, but I can't understand why enabling this extension still brings up an error (on my system at least), as I reported a while back.

                                                      Win 7 64bit professional, ATI 5870, SDK 2.6, Catalyst 11.12

                                                        • SDK 2.6 issues/questions

                                                          Enabling an extension in a kernel with #pragma is only for extensions that affect the OpenCL C language.
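
Since no pragma is involved, a runtime extension such as cl_khr_gl_sharing is normally detected on the host side instead. Below is a minimal sketch in plain C, assuming the extension list is already available as a space-separated string (in a real program it would typically come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS). The helper name has_extension is hypothetical and not part of any SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API call): returns true if `name`
 * appears as a whole, space-delimited token in `ext_list`. */
bool has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is bounded by spaces or string ends,
         * so "cl_khr_gl_sharing_extra" does not count as a hit. */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
        p += 1; /* partial match: keep searching further along */
    }
    return false;
}
```

Matching whole tokens rather than using a bare substring search avoids a false positive when one extension name is a prefix of another.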

                                                          • SDK 2.6 issues/questions

                                                            Is there a place to submit bugs other than the forum?

                                                            I believe that vload3 is not implemented properly. According to the documentation, it has the prototype gentype3 vload3(size_t offset, const gentype *p) and should read from p starting at offset * 3.

                                                            But, it seems like it's reading from offset*4. Is there something that I am missing?
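
The two addressing schemes being compared can be emulated in plain C. This is only a sketch of the documented behaviour next to the behaviour reported in this post, not AMD's implementation. The names float3_emul, vload3_emul and vload3_stride4 are hypothetical stand-ins for the OpenCL C built-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenCL C's float3 (values only, padding not modelled). */
typedef struct { float x, y, z; } float3_emul;

/* Documented vload3 addressing: component i comes from p[offset * 3 + i],
 * i.e. the buffer is treated as tightly packed triples. */
float3_emul vload3_emul(size_t offset, const float *p)
{
    assert(p != NULL);
    const float *src = p + offset * 3;
    float3_emul v = { src[0], src[1], src[2] };
    return v;
}

/* For comparison: the stride-4 addressing reported as observed. */
float3_emul vload3_stride4(size_t offset, const float *p)
{
    assert(p != NULL);
    const float *src = p + offset * 4;
    float3_emul v = { src[0], src[1], src[2] };
    return v;
}
```

With a buffer holding 0, 1, 2, ..., 7 and offset 1, the packed version yields (3, 4, 5) while the stride-4 version yields (4, 5, 6), so a small test buffer like this is enough to tell which scheme a given runtime uses.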

                                                              • SDK 2.6 issues/questions

                                                                As far as I know, 3-component vectors are the same size as 4-component vectors (the w component is hidden), and as such their offset is calculated the same way (sizeof should also report the same size).

                                                                I guess they were introduced as they can permit the hidden component to be safely ignored during any arithmetic operations common in 3D calculations.
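
The size point can be illustrated in plain C11. This is a sketch that assumes a 4-byte float and models a 3-component vector as a 16-byte-aligned triple, which is how the OpenCL specification sizes 3-component vector types. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical models of float3 and float4: both aligned to 16 bytes. */
typedef struct { alignas(16) float s[3]; } float3_emul_aligned;
typedef struct { alignas(16) float s[4]; } float4_emul_aligned;

/* Stride between consecutive elements of an array of the float3 model. */
size_t float3_array_stride(void)
{
    return sizeof(float3_emul_aligned);
}

/* Stride between consecutive triples in a tightly packed float buffer,
 * which is the layout vload3/vstore3 are documented to address. */
size_t packed_triple_stride(void)
{
    return 3 * sizeof(float);
}
```

The two strides differ (16 bytes against 12), which is why indexing an array of 3-component vectors and calling vload3 on a scalar pointer are not expected to address memory the same way.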

                                                                  • SDK 2.6 issues/questions

                                                                    Regarding the cl_khr_gl_sharing extension, I realise that AMD's samples don't enable this extension, just like mine. That's because it doesn't work. Just because it doesn't need to be enabled doesn't mean that it should throw an error when you try to enable it!

                                                                    What if I want to write something that's portable? How do I know that, if I don't enable it, some other SDK won't report an error? AMD lists gl sharing as one of the supported extensions, so it should at least accept it even if it does nothing about it.

                                                                    Secondly, has anyone tried the new KernelAnalyzer? It reports the same statistics no matter what Function you select.

                                                                    • SDK 2.6 issues/questions


                                                                      Originally posted by: antzrhere As far as I know, 3-component vectors are the same size as 4-component vectors (the w component is hidden), and as such their offset is calculated the same way (sizeof should also report the same size).


                                                                      I guess they were introduced as they can permit the hidden component to be safely ignored during any arithmetic operations common in 3D calculations.


                                                                      Sure. It's implemented as a 4-component vector for alignment. But however it's implemented under the hood, its usage should be consistent. From the reference:


                                                                      Return sizeof (gentypen) bytes of data read from address (p + (offset * n)).

                                                                      In vload3, and similarly in vstore3, it should at least read from (p + (offset * 3)) and not (p + (offset * 4)).


                                                                      Another question:

                                                                      Why isn't there a "shuffle" for doubles?

                                                                      When I try to shuffle doubles around, I get:


                                                                        no instance of overloaded function "shuffle"

                                                                      Will it be implemented in the future?

                                                            • SDK 2.6 issues/questions

                                                              Ubuntu has one big problem: you basically need to be online to use this distro.


                                                              The majority of GPGPU codes cannot be developed while being online.

                                                              I'm not developing military software here (though some of it falls under weapons of mass destruction, namely decryption/factorisation, and it'll be open source one day, I guess), yet I'm getting hacked so badly here that I just cannot develop software online. Note that the majority here is game development, but that is an even more attractive target for every consultant who has a flat-rate connection to the net.


                                                              All development machines are airgapped, and even that doesn't remove the attackers. They are there *massively*. In a manner you'd get nightmares about, I bet.

                                                              Sometimes it goes like this. Suddenly a dude shows up on your MSN and starts talking to you. You say 'hello', he says hello. You ask him what his profession is: math teacher at a high school. You ask in which city. Answer: Tehran. At which point you decide not to talk about politics, as he's a math guy and not a politician.


                                                              Funny conversations - yet all unpaid.


                                                              There was a wave of attacks, seemingly from South America (Brazil), recently here at my Linux firewall. They obviously tried to get through it (an old machine with a stripped Linux kernel; it took me months to set up some years ago, and I'm sure I made some mistakes there, as I don't do that daily).


                                                              In fact I can't run Windows online; within a few days all the hacks in it force me to reinstall the machines. Windows actively hacks itself.


                                                              So the Windows machines I run are also airgapped from the internet.


                                                              This security problem eats a lot of my time, and it's all unpaid.


                                                              Now, the majority of the software I write somehow makes it to the net. The majority of those who write GPGPU code, however, do so in all secrecy for their company. So far most of them have focused on NVIDIA. After all, for big companies that already have their own energy production for their huge factories, the amount of power GPUs eat is not exactly a problem when doing all the calculations, and the HD7970 seems ahead of the competition.


                                                              Which distros will keep being supported?


                                                              Ubuntu isn't a great distro if you cannot be online. Distros you can download in their entirety are, for example, openSUSE or Debian (quite a lot of DVDs). Debian is especially attractive for offline development. I'll probably switch to it soon, if InfiniBand also works fine with it.


                                                              Right now I'm using openSUSE + a realtime kernel, which was advised by AMD some while ago when I first installed it.


                                                              Will these distros that allow airgapped machines keep being supported somehow?


                                                              Kind Regards, Vincent

                                              • SDK 2.6 issues/questions
                                                It is because the extension is not a compiler extension but a runtime extension.
                                                  • SDK 2.6 issues/questions

                                                    Micah: I understand it affects the runtime environment and not the language/compiler itself, but just to clarify: I don't have a problem with the compiler not requiring it, only with the fact that it generates an error when encountering it. The question is: is there any possible instance where I would potentially have to enable it? I mean, what about a future SDK/GPU combination from another vendor?

                                                    Would a *future imaginary* SDK have the right to not function correctly unless it had been enabled with a #pragma directive (and yet still comply with the spec)?

                                                    The reason I ask is that the OpenCL man pages talk about enabling it as you would any other extension (using #pragma), implying it would not be technically incorrect for an OpenCL device *in theory* to disable/not enable gl sharing at runtime without this extension switched on. Otherwise, what would be the purpose of documenting how to enable it? If so, would it be wise to simply let the AMD OpenCL compiler accept the extension and ignore it, for compatibility's sake?

                                                    Like I said, it's clearly not a problem, just a matter of completeness....

                                                      • SDK 2.6 issues/questions

                                                        OK. thanks. That explains the difference. Nice to clear it up

                                                          • SDK 2.6 issues/questions

                                                            Thanks for your help Micah - this isn't the first time some documentation other than the official spec sheet has led my mind astray! Regards

                                                              • SDK 2.6 issues/questions

                                                                It would still be nice to know how to get multi-GPU working with the new runtime... It is funny that as long as we only had unofficial support for multiple HD5970s it worked, and now I cannot even boot the machine in a manner that lets OpenCL recognize all the cards.

                                                                  • SDK 2.6 issues/questions

                                                                    OK, could someone answer the rather sarcastic question of why a supported multi-GPU environment doesn't work? There are two possible answers:

                                                                    1) "Meteorhead, we stand puzzled at your statement, our multi-5970 environments boot just fine under both Ubuntu and Red Hat. Simply doing {...} gets everything working."

                                                                    2) "Yes, we are aware of the issue and 'are working on it'/'have absolutely no clue whatsoever why it arises'."

                                                                    Also, just out of curiosity, could someone give me an official or unofficial opinion on why the drivers aren't made similar to NVIDIA's, so that any user can load the kernel module in any shell? It would make life so much easier and make the runtime so much less unprofessional. Changing this:

                                                                    • would solve the entire GPU_MAX_HEAP_SIZE issue and allow us to use the entire VRAM if no GUI is present.
                                                                    • would solve all the damned xorg.conf issues, which render the machine unable to boot (and recover) if it is misconfigured, as input handling is very much dependent on the GUI if installed.
                                                                    • would allow multiple users to run applications, since not only does the X server need to be run by a user for the cards to be detected, but THAT VERY user needs to run the applications. As a result, only a dummy user with a password known to all is able to run applications. This prevents any corporate/serious usage where user administration and GPU-node administration are not in the same hands (as in my situation, where Kerberos authentication is not possible because of this, as I cannot introduce a dummy user campus-wide).
                                                                    • would allow the removal of auto-login to GUI for the dummy user, which is a definite security risk.

                                                                    I have a feeling that the person who made the presently used driver connection to the X server is no longer part of the dev team and there is no one else able to change this. People have been asking for this to be changed for such a long time that I cannot imagine no coding time could have been dedicated to the problem.

                                                                    I know everyone considers their own problem to be the most serious, however I do feel that these are DIRE issues (both the driver design and the booting issues). Although I am pretty pissed, I hope I managed to keep the style at a tolerable level.

                                                                    • SDK 2.6 issues/questions

                                                                      Hi, I read something about 2.6 enabling multi-GPU for the 5970 as well, but I'm not sure what was meant by that. Does this mean that OpenCL now works for both GPUs on the 5970 card?


                                                                      Kind Regards,

                                                                      Vincent Diepeveen

                                                            • SDK 2.6 issues/questions
                                                              The reason it is not needed is that there is nothing in the language that is enabled by the extension.

                                                              I would look more at the cl_khr_gl_sharing spec and not the man page for reference here:
                                                              There is no mention of a pragma existing, but if you look at a language extension, here:

                                                              There is a pragma specified. So it looks like the man page is wrong in this case.