17 Replies Latest reply on Apr 27, 2014 12:52 AM by rebirther

    Help with clBuildProgram crash

    rogue

      I'm running into a problem with the attached program. It builds and runs on Mac, Linux, and most Windows boxes as a 32-bit or 64-bit app. However, I have one Windows user for whom this code crashes in clBuildProgram (aticaldd64.dll), and I have no idea why. I have another program (not attached) that is about 90% the same as the attached one. That program is a bit more complicated, but it runs without a problem. Since the attached program builds and runs correctly almost everywhere else, and since a similar program runs on the same box without a problem, I have to assume the problem is in the AMD APP SDK. This is with AMD APP SDK 2.7 on Windows 7 64-bit with VS 2012. Here is the platform and device info.

       

      Platform 0 is a Advanced Micro Devices, Inc. AMD Accelerated Parallel Processing, version OpenCL 1.1 AMD-APP (844.5)

        Device 0 is a Advanced Micro Devices, Inc. Tahiti

       

      If this is not the place to submit this, then please point me to the right location.

        • Re: Help with clBuildProgram crash
          yurtesen

          In the past, when clBuildProgram crashed for me, it also crashed the APP Kernel Analyzer program. Did you try putting your kernel in there to see what it does? Are you getting an error from the build log? Do all your users have the same GPU?

            • Re: Help with clBuildProgram crash
              rogue

              No, I'm not familiar with the app kernel analyzer and Windows isn't the platform I do a majority of my development on.  When you say "build log", are you referring to a file created by clBuildProgram or something else?  If clBuildProgram returned an error, then I would capture that error and print it out, but the problem is that it crashes the application instead of returning an error.  The code is designed and written to be GPU agnostic.  It will run on both AMD and NVIDIA GPUs.

                • Re: Help with clBuildProgram crash
                  yurtesen

                  Yes, I meant the clBuildProgram log. I realized your code was checking it, but you are right... if it is crashing...

                   

                  You can simply copy and paste the kernel code into AppKernelAnalyzer on Windows. It is a very simple program: just copy/paste and click a button. I used the following one on Windows:

                  http://developer.amd.com/tools/heterogeneous-computing/app-kernel-analyzer/

                  However, if you prefer, a Linux version of Kernel Analyzer (v2, actually) is also available. Download CodeXL and it comes bundled with it!

                  http://developer.amd.com/tools/heterogeneous-computing/codexl/

                  I see on Linux -> CodeXL/0.94.774.0/AMDAPPKernelAnalyzer2-V2.0.1089.0

                   

                  Since the kernel analyzer has to compile the kernel in some form to analyze it, it might also crash. If it does, then you have fairly solid proof that something is wrong with the build step itself.

                   

                  The reason I asked about the same or different GPU is that you said the code compiles for some GPUs; I wonder if differences between AMD GPUs cause it to crash sometimes. For example, I once had code which compiled on GPU but caused a crash on Bulldozer. The compiler also generates different ISA code for different AMD GPUs, so the problem might be restricted to certain models.

                    • Re: Help with clBuildProgram crash
                      rogue

                      It does not crash the kernel analyzer, nor does it give an error.

                       

                      Unless I pass an inherently bad argument into clBuildProgram, it shouldn't crash.  The fact that the same code/kernel builds and runs everywhere else points to a problem with clBuildProgram.  Even if my code cannot run on that device for some reason, clBuildProgram should return, and it should return with a status other than CL_SUCCESS.

                        • Re: Help with clBuildProgram crash
                          yurtesen

                          Why don't you check the arguments to clBuildProgram to see if they are correct (perhaps print them out)? I am surprised that KernelAnalyzer did not crash (you ran it on the problem machine with a target of the same GPU, right?).

                           

                          Did you check whether other programs work on this specific user's machine? I am sorry, but I don't have access to Windows development tools at this point; otherwise I would gladly try to compile and run your code. Hopefully somebody else can help in that area.

                            • Re: Help with clBuildProgram crash
                              rogue

                              I did.  They are.  I did not run kernel analyzer on the machine that has the problem.  It isn't my machine, so I need to work with the user to get TeamViewer access to do that.

                               

                              I wrote another program that uses the exact same function calls to clBuildProgram.  The only difference is the kernel.  That program works on the box that this one fails on.

                               

                              As I said, I believe this is a Windows only issue.  It does not happen on other OSes or on other Windows boxes.  I'm hoping that someone from AMD could get involved to help solve this problem.  If this isn't the place to get help from AMD, then I would appreciate if someone could point me in the right direction.

                                • Re: Help with clBuildProgram crash
                                  yurtesen

                                  You can try the support section of the AMD web site. But I am not sure how they can help you if the problem occurs only on your customer's machine. I mean, if they can't replicate it, they probably can't help.

                                   

                                  KernelAnalyzer can behave differently on different machines; at least the older version would not compile CPU code with AVX instructions when AVX was not available on the host processor. It might do something different on your machine compared to what it would do on your customer's machine. By the way, did you use v1 or v2?

                                   

                                  Which Catalyst version are you using? Also, which kernel file in your attachment is the one that fails?

                                   

                                  As far as I understand, you mean that the program fails on one machine with Win7 64-bit, APP SDK 2.7, and a Tahiti card, but not on others which have the same combo? (By the way, I believe the SDK is not necessary for running compiled OpenCL programs.)

                                    • Re: Help with clBuildProgram crash
                                      rogue

                                      v1 or v2 of what?

                                       

                                      I believe that my user has the latest drivers.  As for the attachment, the kernel is in a .h file.  There are two separate kernels supported by the program: wieferich_kernel.h and wallsunsun_kernel.h.  These are generated from the .cl files using the Perl script cltoh.pl.  The Wieferich.cpp and WallSunSun.cpp classes prepend a #define to the character buffer (from the .h files I just named) passed into clCreateProgramWithSource, which returns a status of CL_SUCCESS.  That #define is needed in order for the kernels to compile.
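The prepend step described above can be sketched roughly as follows. This is not the actual code from Wieferich.cpp or WallSunSun.cpp; the helper name `prependDefine` and the macro name and value used in the note below are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the attached program): returns the kernel
// source with a "#define NAME VALUE" line placed in front of it. The result
// is the text that would then be handed to clCreateProgramWithSource.
std::string prependDefine(const std::string& kernelSource,
                          const std::string& name,
                          const std::string& value)
{
    assert(!name.empty());  // a #define without a macro name cannot compile
    return "#define " + name + " " + value + "\n" + kernelSource;
}
```

For instance, `prependDefine(source, "MY_LIMIT", "100")` yields a buffer whose first line is `#define MY_LIMIT 100`. Printing that exact string just before the build call is one way to confirm what the compiler actually receives.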

                                       

                                      I was using the SDK to build and link the program so that I could use the VS debugger to determine where it was crashing.

                                       

                                      As for AMD, if there is a way to provide them a call stack or dump when the application crashes, they should be able to track down the problem.  Of course, if they provided the source to their SDK or PDB files for the DLLs, I could probably find it myself.  I personally think that there is an uninitialized variable in the function that is crashing, or that the function is not checking for an invalid status from a separate function call.

                                        • Re: Help with clBuildProgram crash
                                          yurtesen

                                          I was asking about the kernel analyzer. I put your wallsunsun.cl file into the kernel analyzer (I had to define SPECIALTHRESHOLD; I used an arbitrary value of 100):

                                           

                                          [798965.545718] AMDAPPKernelAna[24042]: segfault at 0 ip 00007f6d06231125 sp 00007f6d02162650 error 4 in libaticaldd.so[7f6d05b27000+b47000]

                                          [1]+  Segmentation fault      (core dumped) AMDAPPKernelAnalyzer2  (wd: ~/tmp/test)

                                          (wd now: ~/tmp/wwwwcl_2.2.2)

                                           

                                          If I compile it for 7xxx cards, then KernelAnalyzer v2 crashes... I think you should see the same. I used the Kernel Analyzer v2 from the CodeXL installation on Linux. Perhaps you can try to narrow down the problem by commenting out sections of the kernel, if possible (it might be easier if you can crash it on your own machine using Kernel Analyzer v2).

                                           

                                          Now, about reporting... You could perhaps use the last step mentioned on this page. I would still try to narrow down the problem a little more before reporting:

                                          http://developer.amd.com/support/

                                           

                                          Thanks for your patience

                                        • Re: Help with clBuildProgram crash
                                          himanshu.gautam

                                          Hi,

                                           

                                          APP SDK 2.8 was released on 4th December 2012

                                          Can you please check with the latest SDK? Hope this solves your issue.

                                          Kindly confirm.

                                          In the meantime, I will see if I can run your project on my win64 setup here.

                            • Re: Help with clBuildProgram crash
                              himanshu.gautam

                              Did you get a chance to check the 2.8 SDK with the latest driver?

                               

                              I tried running on the win64 machine that I have (with 2.8 and Cayman). There are multiple algorithm choices and other command line options that govern the run. Can you post the exact command line that causes the crash?

                              I tried a few but could not reproduce a crash. But this was on Cayman (6950).

                               

                              But the WallSunSun run was reporting correctness issues (and not clBuildProgram issues):

                              "

                              > wwwwcl64 -P 10 -T WallSunSun

                              wwwwcl v2.2.2, a GPU program to search for Wieferich and WallSunSun primes

                              Sieve started: (cmdline) 0 <= p < 10

                              3 is a special instance (+0 +1 p)

                              Fatal Error:  Not prime: p = 5   c10 = 0   c11 = 1.  The code must have a bug.

                              "

                               

                              I will try out the AMD APP Kernel Analyzer test for the WallSunSun kernel (for the 7xxx series) tomorrow. Thanks,

                               

                              Note: The Wieferich variant runs fine.

                              > wwwwcl64 -P 10 -T Wieferich
                              wwwwcl v2.2.2, a GPU program to search for Wieferich and WallSunSun primes
                              Sieve started: (cmdline) 0 <= p < 10
                              3 is a special instance (-1 -2 p)
                              5 is a special instance (-1 -4 p)

                              Sieve complete: 3 <= p < 10  3 primes tested
                              Clock time: 0.69 seconds at at 4 p/sec.
                              Processor time: 0.63 sec. (0.63 init + 0.00 sieve).
                              Seconds spent in CPU and GPU: 0.36 (cpu), 0.01 (gpu)
                              Percent of time spent in CPU vs. GPU: 98.13 (cpu), 1.87 (gpu)
                              CPU/GPU utilization: 0.91 (cores), 0.01 (devices)

                              • Re: Help with clBuildProgram crash
                                rebirther

                                The app is still crashing with an LLVM error on an HD7950 when compiled with AMD SDK 2.9 and the latest 13.12 driver. It runs fine on NVIDIA cards. We have many different ATI cards waiting to run the app. The driver really needs a fix. The only way to get rid of this error is to change, in "kernel.cpp",

                                 

                                status = clBuildProgram(im_Program, 1, ip_Device->GetDeviceIdPtr(),
                                                        buildOptions.c_str(), NULL, NULL);

                                 

                                to

                                 

                                status = clBuildProgram(im_Program, 1, ip_Device->GetDeviceIdPtr(),
                                                        "-cl-opt-disable", NULL, NULL);

                                 

                                But this decreases the GPU speed a lot.
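Note that the change shown above replaces the whole options string, so any other build options are dropped, and optimization is disabled on every device. A minimal sketch of a narrower workaround, assuming the failure is limited to certain device names: keep the original options and append `-cl-opt-disable` only for devices on a list of affected ones. The helper name `buildOptionsFor` and the contents of the list are hypothetical; the device name would typically be queried with `clGetDeviceInfo` and `CL_DEVICE_NAME`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of kernel.cpp): keeps the caller's build
// options and appends "-cl-opt-disable" only when deviceName is listed in
// `affected`. Which devices belong on that list is an assumption to verify.
std::string buildOptionsFor(const std::string& baseOptions,
                            const std::string& deviceName,
                            const std::vector<std::string>& affected)
{
    for (const std::string& name : affected) {
        if (deviceName == name) {
            if (baseOptions.empty()) {
                return "-cl-opt-disable";
            }
            // Options are space-separated in the clBuildProgram string.
            return baseOptions + " -cl-opt-disable";
        }
    }
    return baseOptions;  // unaffected devices keep full optimization
}
```

The returned string would then be passed as `options.c_str()` in place of `buildOptions.c_str()`, so that NVIDIA cards and unaffected AMD cards do not pay the speed penalty.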

                                 

                                wwwwcl v2.2.2, a GPU program to search for Wieferich and WallSunSun primes
                                LLVM ERROR: Cannot select: 0x4b8a190: i32 = setcc 0x477cb50, 0x477ba40, 0x47853d0 [ORD=114] [ID=67]
                                  0x477cb50: i64 = add 0x477ba40, 0x477f780 [ORD=111] [ID=65]
                                    0x477ba40: i64 = add 0x477fd80, 0x477d960 [ORD=97] [ID=64]
                                      0x477fd80: i64 = add 0x477c040, 0x477f570 [ORD=96] [ID=59]
                                        0x477c040: i64 = srl 0x477e560, 0x477f880 [ORD=91] [ID=55]
                                          0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
                                            0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
                                              0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                                                0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                                                  0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]

                                                  0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                                0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                                  0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]

                                              0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                            0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
                                              0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                                                0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                                  0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                                0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                              0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                                0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                                                  0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
                                          0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                        0x477f570: i64 = mul 0x477ed70, 0x477fb80 [ORD=94] [ID=43]
                                          0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                                            0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                              0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                            0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                          0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
                                            0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
                                              0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
                                      0x477d960: i64 = srl 0x477ec70, 0x477f880 [ORD=95] [ID=62]
                                        0x477ec70: i64 = add 0x477e460, 0x477fa80 [ORD=93] [ID=60]
                                          0x477e460: i64 = and 0x477e560, 0x477ef70 [ORD=90] [ID=56]
                                            0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
                                              0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
                                                0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                                                  0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]


                                                  0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]

                                                0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                              0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
                                                0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                                                  0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]

                                                  0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                                0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                                  0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]

                                            0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                          0x477fa80: i64 = mul 0x477d450, 0x477fb80 [ORD=92] [ID=44]
                                            0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                                              0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                                0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                              0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                            0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
                                              0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
                                                0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
                                        0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                    0x477f780: i64 = mul 0x477d350, 0x477ee70 [ORD=98] [ID=38]
                                      0x477d350: i64,ch = CopyFromReg 0x4bb6730, 0x477ce50 [ORD=98] [ID=25]
                                        0x477ce50: i64 = Register %vreg51 [ORD=98] [ID=7]
                                      0x477ee70: i64,ch = CopyFromReg 0x4bb6730, 0x477ff80 [ORD=98] [ID=26]
                                        0x477ff80: i64 = Register %vreg7 [ORD=98] [ID=8]
                                  0x477ba40: i64 = add 0x477fd80, 0x477d960 [ORD=97] [ID=64]
                                    0x477fd80: i64 = add 0x477c040, 0x477f570 [ORD=96] [ID=59]
                                      0x477c040: i64 = srl 0x477e560, 0x477f880 [ORD=91] [ID=55]
                                        0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
                                          0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
                                            0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                                              0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                                                0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                                  0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                                0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                              0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                                0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                                                  0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
                                            0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                          0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
                                            0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                                              0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                                0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                              0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                            0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                              0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                                                0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
                                        0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                      0x477f570: i64 = mul 0x477ed70, 0x477fb80 [ORD=94] [ID=43]
                                        0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                                          0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                            0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                          0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                        0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
                                          0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
                                            0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
                                    0x477d960: i64 = srl 0x477ec70, 0x477f880 [ORD=95] [ID=62]
                                      0x477ec70: i64 = add 0x477e460, 0x477fa80 [ORD=93] [ID=60]
                                        0x477e460: i64 = and 0x477e560, 0x477ef70 [ORD=90] [ID=56]
                                          0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
                                            0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
                                              0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                                                0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                                                  0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]

                                                  0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                                0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                                  0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]

                                              0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                            0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
                                              0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                                                0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                                  0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                                0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                                              0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                                                0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                                                  0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
                                          0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                        0x477fa80: i64 = mul 0x477d450, 0x477fb80 [ORD=92] [ID=44]
                                          0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                                            0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                                              0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                                            0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                                          0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
                                            0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
                                              0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
                                      0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
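Reading the dump above, the node that cannot be selected is an i32 `setcc` whose operands are 64-bit sums built from `and` with 4294967295, `srl` by 32, `mul`, and `add`. That shape matches the usual way of computing the high 64 bits of a 64x64-bit product from 32-bit halves, followed by a comparison that looks like a carry check. The sketch below is a C++ reconstruction of that arithmetic for reference only: it is inferred from the dump and not taken from the kernel source, the names `mulHigh64` and `addWithCarry` are hypothetical, and the comparison condition (not visible in the dump) is assumed to be an unsigned less-than.

```cpp
#include <cstdint>

// High 64 bits of a * b, following the and/srl/mul/add chain in the dump.
constexpr std::uint64_t mulHigh64(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t mask = 0xFFFFFFFFu;          // Constant<4294967295>
    const std::uint64_t aLo = a & mask;
    const std::uint64_t aHi = a >> 32;               // srl by Constant<32>
    const std::uint64_t bLo = b & mask;
    const std::uint64_t bHi = b >> 32;

    // Partial sums; neither can overflow 64 bits.
    const std::uint64_t t = ((aLo * bLo) >> 32) + aHi * bLo;
    const std::uint64_t u = (t & mask) + aLo * bHi;
    return (t >> 32) + aHi * bHi + (u >> 32);
}

struct AddResult {
    std::uint64_t sum;
    bool carry;
};

// Adds `extra` to `high`; carry is set when the 64-bit sum wraps around.
// The less-than test is an assumption about the setcc condition.
constexpr AddResult addWithCarry(std::uint64_t high, std::uint64_t extra)
{
    const std::uint64_t sum = high + extra;
    return AddResult{sum, sum < high};
}
```

If the kernel is being narrowed down by commenting out sections, the code that computes a product like this and then compares the result would be a natural first candidate to isolate.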

                                • Re: Help with clBuildProgram crash
                                  rebirther

                                  Still no luck with the latest driver, 14.4. When will it be fixed?