rogue
Journeyman III

Help with clBuildProgram crash

I'm running into a problem with the attached program. It builds and runs on Mac, Linux, and most Windows boxes as a 32-bit or 64-bit app. For one Windows user, however, it crashes inside clBuildProgram (aticaldd64.dll), and I have no idea why. I have another program (not attached) that is about 90% the same as the attached one. That program is a bit more complicated, but it runs without a problem. Since the attached program builds and runs correctly almost everywhere else, and since a similar program runs on the same box without a problem, I have to assume it is a problem with the AMD APP SDK. This is with AMD APP SDK 2.7 on Windows 7 64-bit with VS 2012. Here is the platform and device info.

Platform 0 is a Advanced Micro Devices, Inc. AMD Accelerated Parallel Processing, version OpenCL 1.1 AMD-APP (844.5)

  Device 0 is a Advanced Micro Devices, Inc. Tahiti

If this is not the place to submit this, then please point me to the right location.

0 Likes
17 Replies
yurtesen
Miniboss

In the past, when clBuildProgram crashed for me, it also crashed the APP Kernel Analyzer. Did you try putting your kernel in there to see what it does? Are you getting an error from the build log? Do all your users have the same GPU?

0 Likes

No, I'm not familiar with the app kernel analyzer and Windows isn't the platform I do a majority of my development on.  When you say "build log", are you referring to a file created by clBuildProgram or something else?  If clBuildProgram returned an error, then I would capture that error and print it out, but the problem is that it crashes the application instead of returning an error.  The code is designed and written to be GPU agnostic.  It will run on both AMD and NVIDIA GPUs.

0 Likes

Yes, I meant the clBuildProgram log. I realized your code was checking it, but you are right... if it is crashing...

You can simply copy and paste the kernel code into AppKernelAnalyzer on Windows. It is a very simple program: just copy/paste and click a button. I used the following one on Windows:

http://developer.amd.com/tools/heterogeneous-computing/app-kernel-analyzer/

However, if you prefer, there is also a Linux version of Kernel Analyzer (v2, actually). Download CodeXL and it comes bundled with it:

http://developer.amd.com/tools/heterogeneous-computing/codexl/

I see on Linux -> CodeXL/0.94.774.0/AMDAPPKernelAnalyzer2-V2.0.1089.0

Since the kernel analyzer has to compile the kernel in order to analyze it, it might also crash. If it does, you would have solid evidence that something is wrong with clBuildProgram itself.

The reason I asked about the same or different GPU is that you said the code compiles for some GPUs. I wonder whether differences between AMD GPUs cause it to crash sometimes. For example, I once had code which compiled on the GPU but crashed on Bulldozer. The compiler also generates different ISA code for different AMD GPUs, so the problem might be restricted to certain models.

0 Likes

It does not crash the kernel analyzer, nor does it give an error.

Unless I pass an inherently bad argument into clBuildProgram, it shouldn't crash. The fact that the same code/kernel builds and runs everywhere else points to a problem with clBuildProgram. Even if my code cannot run on that device for some reason, clBuildProgram should return, and it should return with a status other than CL_SUCCESS.

0 Likes

Why don't you check the arguments to clBuildProgram to see if they are correct (perhaps print them out)? I am surprised that Kernel Analyzer did not crash. You ran it on the problem machine with the same GPU as the target, right?

Did you check whether other programs work on this specific user's machine? I am sorry, but I don't have access to Windows development tools at this point; otherwise I would gladly try to compile and run your code. Hopefully somebody else can help in that area.

0 Likes

I did.  They are.  I did not run kernel analyzer on the machine that has the problem.  It isn't my machine, so I need to work with the user to get TeamViewer access to do that.

I wrote another program that uses the exact same function calls to clBuildProgram.  The only difference is the kernel.  That program works on the box that this one fails on.

As I said, I believe this is a Windows only issue.  It does not happen on other OSes or on other Windows boxes.  I'm hoping that someone from AMD could get involved to help solve this problem.  If this isn't the place to get help from AMD, then I would appreciate if someone could point me in the right direction.

0 Likes

You can try the support section of the AMD website. But I am not sure how they can help if the problem occurs only on your customer's machine. If they can't replicate it, they probably can't help.

KernelAnalyzer can behave differently on different machines; at least the older version would not compile CPU code with AVX instructions when AVX was not available on the host processor. It might do something different on your machine than on your customer's machine. By the way, did you use v1 or v2?

Which Catalyst version are you using? Also, which kernel file in your attachment is the one that fails?

As far as I understand, the program fails on one machine with Win7 64-bit, APP SDK 2.7, and a Tahiti card, but not on others which have the same combination? (By the way, I believe the SDK is not necessary for running compiled OpenCL programs.)

0 Likes

v1 or v2 of what?

I believe that my user has the latest drivers. As for the attachment, the kernel is in a .h file. There are two separate kernels supported by the program: wieferich_kernel.h and wallsunsun_kernel.h. These are generated from the .cl files by the Perl script cltoh.pl. The Wieferich.cpp and WallSunSun.cpp classes prepend a #define to the character buffer (from the .h files I just named) that is passed into clCreateProgramWithSource, which returns a status of CL_SUCCESS. That #define is needed in order for the kernels to compile.
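For readers following along, the prepend step described above can be sketched in a few lines of C++. This is only an illustration under assumptions: `prependDefine` and `defineAsBuildOption` are hypothetical helpers, not functions from Wieferich.cpp or WallSunSun.cpp. The second helper shows an alternative that leaves the source buffer untouched and hands the macro to clBuildProgram through its options string instead, using the standard `-D name=definition` build option.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the kernel source with "#define NAME VALUE"
// on its own first line, ready to be passed to clCreateProgramWithSource.
std::string prependDefine(const std::string& kernelSource,
                          const std::string& name,
                          const std::string& value)
{
    assert(!name.empty());  // a macro needs a name
    return "#define " + name + " " + value + "\n" + kernelSource;
}

// Hypothetical alternative: keep the source as-is and define the macro via
// the options argument of clBuildProgram.
std::string defineAsBuildOption(const std::string& name,
                                const std::string& value)
{
    assert(!name.empty());
    return "-D " + name + "=" + value;
}
```

Either form should give the compiler the same macro. The build-option form has the side benefit that the kernel text stays identical to the .cl file, which makes it easier to paste into an offline tool when reproducing a failure.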

I was using the SDK to build and link the program so that I could use the VS debugger to determine where it was crashing.

As for AMD, if there is a way to provide them a call stack or dump when the application crashes, they should be able to track down the problem. Of course, if they provided the source to their SDK or pdb files for the dlls, I could probably find it myself. I personally think that there is an uninitialized variable in the function that is crashing, or that the function is not checking for an invalid status from a separate function call.

0 Likes

I was asking about the kernel analyzer. I put your wallsunsun.cl file into kernel analyzer (I had to define SPECIALTHRESHOLD; I used an arbitrary value of 100) and got:

[798965.545718] AMDAPPKernelAna[24042]: segfault at 0 ip 00007f6d06231125 sp 00007f6d02162650 error 4 in libaticaldd.so[7f6d05b27000+b47000]

[1]+  Segmentation fault      (core dumped) AMDAPPKernelAnalyzer2  (wd: ~/tmp/test)

(wd now: ~/tmp/wwwwcl_2.2.2)

If I compile it for 7xxx cards, then kernelanalyzer v2 crashes... I think you should see the same. I used the kernel analyzer v2 from CodeXL installation on Linux. Perhaps you can try to narrow down the problem by commenting out sections of it if possible. (perhaps it might be easier if you can crash it on your machine using kernel analyzer v2).

Now, about reporting... I would still try to narrow down the problem a little bit more before reporting. You could then perhaps use the last step mentioned on this page:

http://developer.amd.com/support/

Thanks for your patience

That is much more helpful.  That clearly puts the ball into AMD's court as the kernel analyzer shouldn't crash on one card, but work on another.  It should report an error.  I'll try to narrow it down then report the problem to the link you provided.

0 Likes

I tried wallsunsun kernel with SpecialThreshold as 100 on AMD APP Kernel Analyzer on Windows 7 and everything works fine.

Attached is a screenshot

codeXL - 1.0.2409.0
AMD APP Kernel Analyzer - 2.0.144

Windows 7 64 bit

Catalyst 12.10 driver

Can you please confirm if you are still seeing your problem? Thanks,

0 Likes

Hi,

APP SDK 2.8 was released on 4th December 2012

Can you please check with the latest SDK? Hope this solves your issue.

Kindly confirm.

Meantime, I will see if I can run your project on my win64 setup here.

0 Likes
himanshu_gautam
Grandmaster

Did you get a chance to check 2.8 SDK with the latest driver?

I tried running on the win64 that I have (with 2.8 and Cayman). There are multiple algorithm choices and other command line options that govern the run. Can you publish what exact command line causes the crash?

I tried a few but could not reproduce a crash. But this was on Cayman (6950).

However, the WallSunSun run was reporting correctness issues (not clBuildProgram issues):

"

> wwwwcl64 -P 10 -T WallSunSun

wwwwcl v2.2.2, a GPU program to search for Wieferich and WallSunSun primes

Sieve started: (cmdline) 0 <= p < 10

3 is a special instance (+0 +1 p)

Fatal Error:  Not prime: p = 5   c10 = 0   c11 = 1.  The code must have a bug.

"

I will try out the AMD APP Kernel Analyzer test for the WallSunSun kernel (for 7xxx series) tomorrow. Thanks,

Note: The Wieferich variant runs fine.

> wwwwcl64 -P 10 -T Wieferich
wwwwcl v2.2.2, a GPU program to search for Wieferich and WallSunSun primes
Sieve started: (cmdline) 0 <= p < 10
3 is a special instance (-1 -2 p)
5 is a special instance (-1 -4 p)

Sieve complete: 3 <= p < 10  3 primes tested
Clock time: 0.69 seconds at at 4 p/sec.
Processor time: 0.63 sec. (0.63 init + 0.00 sieve).
Seconds spent in CPU and GPU: 0.36 (cpu), 0.01 (gpu)
Percent of time spent in CPU vs. GPU: 98.13 (cpu), 1.87 (gpu)
CPU/GPU utilization: 0.91 (cores), 0.01 (devices)

0 Likes

AMD APP Kernel Analyzer works fine with Wallsunsun. I have posted screenshot and details in reply to yurtesen above (watch for blue thick font)

0 Likes

It's not working with AMD SDK 2.9 and Catalyst 13.12, nor with any 12.x or 13.x driver. Can you please check this against the answer below?

0 Likes
rebirther
Journeyman III

The app is still crashing with an LLVM error on an HD7950 when compiled with AMD SDK 2.9 and the latest 13.12 driver. It runs fine on NVIDIA cards. We have many different ATI cards waiting to run the app. The driver really needs a fix. The only way to get rid of this error is to change, in "kernel.cpp",

status = clBuildProgram(im_Program, 1, ip_Device->GetDeviceIdPtr(),
                        buildOptions.c_str(), NULL, NULL);

to

status = clBuildProgram(im_Program, 1, ip_Device->GetDeviceIdPtr(),
                        "-cl-opt-disable", NULL, NULL);

But this decreases the GPU speed a lot.
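One way to limit that slowdown, sketched here as an assumption rather than a tested fix, is to disable optimization only on devices known to hit the compiler failure. The helper `buildOptionsFor` and the idea of an "affected devices" list are hypothetical. The device name would come from clGetDeviceInfo with CL_DEVICE_NAME, and "Tahiti" is the only name reported as failing in this thread. Note that this variant appends "-cl-opt-disable" to the existing options instead of replacing them, which is a slightly different change from the one shown above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: append "-cl-opt-disable" only when the device name
// matches an entry in the caller-supplied list of affected devices, so that
// every other device keeps its optimized build.
std::string buildOptionsFor(const std::string& deviceName,
                            const std::string& baseOptions,
                            const std::vector<std::string>& affectedDevices)
{
    for (const std::string& affected : affectedDevices) {
        if (deviceName == affected) {
            if (baseOptions.empty()) {
                return "-cl-opt-disable";
            }
            return baseOptions + " -cl-opt-disable";
        }
    }
    return baseOptions;  // unaffected device: options unchanged
}
```

The returned string would then be passed as the options argument of clBuildProgram, for example `buildOptionsFor(name, buildOptions, {"Tahiti"}).c_str()`.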

wwwwcl v2.2.2, a GPU program to search for Wieferich and WallSunSun primes
LLVM ERROR: Cannot select: 0x4b8a190: i32 = setcc 0x477cb50, 0x477ba40, 0x47853d0 [ORD=114] [ID=67]
  0x477cb50: i64 = add 0x477ba40, 0x477f780 [ORD=111] [ID=65]
    0x477ba40: i64 = add 0x477fd80, 0x477d960 [ORD=97] [ID=64]
      0x477fd80: i64 = add 0x477c040, 0x477f570 [ORD=96] [ID=59]
        0x477c040: i64 = srl 0x477e560, 0x477f880 [ORD=91] [ID=55]
          0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
            0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
              0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                  0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]

                  0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                  0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]

              0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
            0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
              0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                  0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
              0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                  0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
          0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
        0x477f570: i64 = mul 0x477ed70, 0x477fb80 [ORD=94] [ID=43]
          0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
            0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
              0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
            0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
          0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
            0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
              0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
      0x477d960: i64 = srl 0x477ec70, 0x477f880 [ORD=95] [ID=62]
        0x477ec70: i64 = add 0x477e460, 0x477fa80 [ORD=93] [ID=60]
          0x477e460: i64 = and 0x477e560, 0x477ef70 [ORD=90] [ID=56]
            0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
              0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
                0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                  0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]


                  0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]

                0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
              0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
                0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                  0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]

                  0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
                0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                  0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]

            0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
          0x477fa80: i64 = mul 0x477d450, 0x477fb80 [ORD=92] [ID=44]
            0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
              0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
              0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
            0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
              0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
                0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
        0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
    0x477f780: i64 = mul 0x477d350, 0x477ee70 [ORD=98] [ID=38]
      0x477d350: i64,ch = CopyFromReg 0x4bb6730, 0x477ce50 [ORD=98] [ID=25]
        0x477ce50: i64 = Register %vreg51 [ORD=98] [ID=7]
      0x477ee70: i64,ch = CopyFromReg 0x4bb6730, 0x477ff80 [ORD=98] [ID=26]
        0x477ff80: i64 = Register %vreg7 [ORD=98] [ID=8]
  0x477ba40: i64 = add 0x477fd80, 0x477d960 [ORD=97] [ID=64]
    0x477fd80: i64 = add 0x477c040, 0x477f570 [ORD=96] [ID=59]
      0x477c040: i64 = srl 0x477e560, 0x477f880 [ORD=91] [ID=55]
        0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
          0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
            0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
              0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                  0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
              0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                  0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
            0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
          0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
            0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
              0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
              0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
            0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
              0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
        0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
      0x477f570: i64 = mul 0x477ed70, 0x477fb80 [ORD=94] [ID=43]
        0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
          0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
            0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
          0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
        0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
          0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
            0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
    0x477d960: i64 = srl 0x477ec70, 0x477f880 [ORD=95] [ID=62]
      0x477ec70: i64 = add 0x477e460, 0x477fa80 [ORD=93] [ID=60]
        0x477e460: i64 = and 0x477e560, 0x477ef70 [ORD=90] [ID=56]
          0x477e560: i64 = add 0x477fe80, 0x477d550 [ORD=89] [ID=53]
            0x477fe80: i64 = srl 0x477bd40, 0x477f880 [ORD=88] [ID=51]
              0x477bd40: i64 = mul 0x477d450, 0x477f070 [ORD=86] [ID=42]
                0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
                  0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]

                  0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
                0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                  0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]

              0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
            0x477d550: i64 = mul 0x477ed70, 0x477f070 [ORD=87] [ID=41]
              0x477ed70: i64 = srl 0x477ca50, 0x477f880 [ORD=85] [ID=32]
                0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
                  0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
                0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
              0x477f070: i64 = AssertZext 0x477c750, 0x477bf40 [ORD=86] [ID=34]
                0x477c750: i64,ch = CopyFromReg 0x4bb6730, 0x477e770 [ORD=86] [ID=23]
                  0x477e770: i64 = Register %vreg35 [ORD=86] [ID=4]
          0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
        0x477fa80: i64 = mul 0x477d450, 0x477fb80 [ORD=92] [ID=44]
          0x477d450: i64 = and 0x477ca50, 0x477ef70 [ORD=84] [ID=33]
            0x477ca50: i64,ch = CopyFromReg 0x4bb6730, 0x477d150 [ORD=84] [ID=22]
              0x477d150: i64 = Register %vreg52 [ORD=84] [ID=1]
            0x477ef70: i64 = Constant<4294967295> [ORD=84] [ID=2]
          0x477fb80: i64 = AssertZext 0x477d860, 0x477bf40 [ORD=92] [ID=35]
            0x477d860: i64,ch = CopyFromReg 0x4bb6730, 0x477bc40 [ORD=92] [ID=24]
              0x477bc40: i64 = Register %vreg36 [ORD=92] [ID=6]
      0x477f880: i32 = Constant<32> [ORD=85] [ID=3]
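
Reading the dump above from the leaves up, the operations look like a 64-bit multiply-high built from 32-bit halves (the `and` with 4294967295, the `srl` by 32, and four `mul` nodes), followed by a 64-bit `add` whose result is compared against one of its own operands by the `setcc` that the backend cannot select. That is the usual shape of a carry check. The C++ sketch below reproduces that pattern on the host, purely as an aid for narrowing the kernel down. It is an interpretation of the dump, not code taken from wallsunsun.cl or wieferich.cl, and the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// High 64 bits of the 128-bit product x * y, computed from 32-bit halves.
std::uint64_t mulHi64(std::uint64_t x, std::uint64_t y)
{
    const std::uint64_t mask = 0xFFFFFFFFULL;
    const std::uint64_t xLo = x & mask, xHi = x >> 32;
    const std::uint64_t yLo = y & mask, yHi = y >> 32;

    // Neither intermediate sum can overflow 64 bits.
    const std::uint64_t t = ((xLo * yLo) >> 32) + xHi * yLo;
    const std::uint64_t u = (t & mask) + xLo * yHi;
    return (t >> 32) + xHi * yHi + (u >> 32);
}

// Carry out of a 64-bit addition: the "sum < operand" comparison that the
// failing setcc appears to correspond to.
bool addCarries(std::uint64_t a, std::uint64_t b)
{
    return static_cast<std::uint64_t>(a + b) < a;
}
```

If the kernel contains a sequence of this shape, temporarily rewriting or commenting out the comparison is one candidate for the "narrow it down" step suggested earlier in the thread.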

0 Likes
rebirther
Journeyman III

Still no luck with the latest driver, 14.4. When will it be fixed?

0 Likes