rogue
Journeyman III

Help with clBuildProgram crash

I'm running into a problem with the attached program. It builds and runs on Mac, Linux, and most Windows boxes as a 32-bit or 64-bit app. For one Windows user, however, it crashes inside clBuildProgram (aticaldd64.dll), and I have no idea why. I have another program (not attached) that is about 90% the same as the attached one and a bit more complicated, and it runs without a problem. Since the attached program builds and runs correctly almost everywhere else, and since a similar program runs on the same box without a problem, I have to assume the problem is in the AMD APP SDK. This is with AMD APP SDK 2.7 on Windows 7 64-bit with VS 2012. Here is the platform and device info.

Platform 0 is a Advanced Micro Devices, Inc. AMD Accelerated Parallel Processing, version OpenCL 1.1 AMD-APP (844.5)

  Device 0 is a Advanced Micro Devices, Inc. Tahiti

If this is not the place to submit this, then please point me to the right location.

17 Replies
yurtesen
Miniboss

Re: Help with clBuildProgram crash

In the past, when clBuildProgram crashed for me, it also crashed the APP Kernel Analyzer program. Did you try putting your kernel in there to see what it does? Are you getting an error in the build log? Do all your users have the same GPU?

rogue
Journeyman III

Re: Help with clBuildProgram crash

No, I'm not familiar with the app kernel analyzer and Windows isn't the platform I do a majority of my development on.  When you say "build log", are you referring to a file created by clBuildProgram or something else?  If clBuildProgram returned an error, then I would capture that error and print it out, but the problem is that it crashes the application instead of returning an error.  The code is designed and written to be GPU agnostic.  It will run on both AMD and NVIDIA GPUs.

yurtesen
Miniboss

Re: Help with clBuildProgram crash

Yes, I meant the clBuildProgram log. I realized your code was checking it, but you are right... if it is crashing...

You can simply copy and paste the kernel code into AppKernelAnalyzer on Windows. It is a very simple program: just copy, paste, and click a button. I used the following one on Windows:

http://developer.amd.com/tools/heterogeneous-computing/app-kernel-analyzer/

If you prefer, there is also a Linux version of the kernel analyzer (v2, actually). Download CodeXL and it comes bundled with it:

http://developer.amd.com/tools/heterogeneous-computing/codexl/

I see on Linux -> CodeXL/0.94.774.0/AMDAPPKernelAnalyzer2-V2.0.1089.0

Since the kernel analyzer has to compile the kernel in order to analyze it, it might also crash. If it does, you would have fairly solid proof that something is wrong with the build step itself.

The reason I asked whether the GPUs are the same or different is that you said the code compiles for some GPUs, and I wonder whether differences between AMD GPUs cause it to crash only sometimes. For example, I once had code that compiled for the GPU but crashed on Bulldozer. The compiler also generates different ISA code for different AMD GPUs, so the problem might be restricted to certain models.

rogue
Journeyman III

Re: Help with clBuildProgram crash

It does not crash the kernel analyzer, nor does it give an error.

Unless I pass an inherently bad argument into clBuildProgram, it shouldn't crash.  The fact that the same code/kernel builds and runs everywhere else points to a problem with clBuildProgram.  Even if my code cannot run on that device for some reason, clBuildProgram should return, and it should return with a status other than CL_SUCCESS.
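
For reference, a minimal sketch of turning the status that clBuildProgram is expected to return into a readable name for printing. The helper name `buildStatusName` is hypothetical and not part of the attached program. The numeric values are copied from the Khronos `cl.h` header only so that the sketch compiles without the OpenCL SDK; real code would include `<CL/cl.h>` and use its constants directly.

```cpp
#include <string>

// Hypothetical helper: maps a few clBuildProgram return codes to their names.
// The numeric values mirror the macros in the Khronos cl.h header; they are
// repeated here only so the sketch compiles without the OpenCL SDK.
std::string buildStatusName(int status) {
    switch (status) {
        case 0:   return "CL_SUCCESS";
        case -3:  return "CL_COMPILER_NOT_AVAILABLE";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -30: return "CL_INVALID_VALUE";
        case -33: return "CL_INVALID_DEVICE";
        case -42: return "CL_INVALID_BINARY";
        case -43: return "CL_INVALID_BUILD_OPTIONS";
        case -44: return "CL_INVALID_PROGRAM";
        case -59: return "CL_INVALID_OPERATION";
        default:  return "unrecognized status " + std::to_string(status);
    }
}
```

None of this helps while the call crashes instead of returning, but it shows what a well-behaved failure would look like on the host side.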

yurtesen
Miniboss

Re: Help with clBuildProgram crash

Why don't you check whether the arguments to clBuildProgram are correct (perhaps print them out)? I am surprised that the kernel analyzer did not crash. You ran it on the problem machine with the same GPU as the target, right?

Did you check whether other programs work on this specific user's machine? I am sorry, but I don't have access to Windows development tools at this point; otherwise I would gladly try to compile and run your code. Hopefully somebody else can help in that area.

rogue
Journeyman III

Re: Help with clBuildProgram crash

I did.  They are.  I did not run kernel analyzer on the machine that has the problem.  It isn't my machine, so I need to work with the user to get TeamViewer access to do that.

I wrote another program that uses the exact same function calls to clBuildProgram.  The only difference is the kernel.  That program works on the box that this one fails on.

As I said, I believe this is a Windows only issue.  It does not happen on other OSes or on other Windows boxes.  I'm hoping that someone from AMD could get involved to help solve this problem.  If this isn't the place to get help from AMD, then I would appreciate if someone could point me in the right direction.

yurtesen
Miniboss

Re: Help with clBuildProgram crash

You can try the support section of the AMD web site. But I am not sure how they can help you if the problem occurs only on your customer's machine. If they can't replicate it, they probably can't help.

KernelAnalyzer can behave differently on different machines; at least the older version would not compile CPU code with AVX instructions when AVX was not available on the host processor. It might do something different on your machine than it would on your customer's machine. By the way, did you use v1 or v2?

Which Catalyst version are you using? Also, which kernel file in your attachment is the one that fails?

As far as I understand, you mean that the program fails on one machine with Win7 64-bit, APP SDK 2.7, and a Tahiti card, but not on others with the same combination? (By the way, I believe the SDK is not necessary for running compiled OpenCL programs.)

rogue
Journeyman III

Re: Help with clBuildProgram crash

v1 or v2 of what?

I believe that my user has the latest drivers.  As for the attachment, the kernel is in a .h file.  There are two separate kernels supported by the program.  One is called wieferich_kernel.h and the other is wallsunsun_kernel.h.  These are generated from the .cl files using the perl script cltoh.pl.  The Wieferich.cpp and WallSunSun.cpp classes prepend a #define to the character buffer (in the .h files I just named) passed into clCreateProgramWithSource, which returns a status of CL_SUCCESS.  That #define is needed in order for the kernels to compile.
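
A minimal sketch of the prepend step described above, assuming the kernel text is available as a string. The helper name `prependDefine` and its parameters are hypothetical; the actual code in Wieferich.cpp and WallSunSun.cpp may build the buffer differently.

```cpp
#include <string>

// Hypothetical helper: builds the final kernel source by putting a #define
// in front of the kernel text (for example the contents of a generated .h file).
std::string prependDefine(const std::string& macroName,
                          const std::string& macroValue,
                          const std::string& kernelSource) {
    // The trailing newline matters: a preprocessor directive ends at the
    // line break, otherwise the first kernel line becomes part of the macro.
    return "#define " + macroName + " " + macroValue + "\n" + kernelSource;
}
```

The resulting string's `c_str()` and `size()` are what would be handed to clCreateProgramWithSource. Printing the string just before that call is a cheap way to confirm that the source the compiler sees is the one intended.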

I was using the SDK to build and link the program so that I could use the VS debugger to determine where it was crashing.

As for AMD, if there is a way to provide them with a call stack or dump when the application crashes, they should be able to track down the problem.  Of course, if they provided the source to their SDK or pdb files for the DLLs, I could probably find it myself.  I personally think that there is an uninitialized variable in the function that is crashing, or that the function is not checking for an invalid status from a separate function call.

yurtesen
Miniboss

Re: Help with clBuildProgram crash

I was asking about the kernel analyzer. I put your wallsunsun.cl file into the kernel analyzer (I had to define SPECIALTHRESHOLD, so I used an arbitrary value of 100):

[798965.545718] AMDAPPKernelAna[24042]: segfault at 0 ip 00007f6d06231125 sp 00007f6d02162650 error 4 in libaticaldd.so[7f6d05b27000+b47000]

[1]+  Segmentation fault      (core dumped) AMDAPPKernelAnalyzer2  (wd: ~/tmp/test)

(wd now: ~/tmp/wwwwcl_2.2.2)

If I compile it for 7xxx cards, then kernel analyzer v2 crashes, so I think you should see the same. I used kernel analyzer v2 from the CodeXL installation on Linux. Perhaps you can narrow down the problem by commenting out sections of the kernel, if possible. (It might be easier if you can crash it on your own machine using kernel analyzer v2.)

Now, about reporting: you could perhaps use the last step mentioned on this page. I would still try to narrow down the problem a bit more before reporting.
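
One way to make the "comment out sections" approach repeatable is sketched below. It assumes each suspect region of the kernel has been wrapped by hand in `#ifndef SKIP_SECTION_<n>` / `#endif` guards; the macro names and the helper `skipSectionOptions` are hypothetical and do not come from the attached program.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: assumes each suspect region of the kernel is wrapped
// in "#ifndef SKIP_SECTION_<n> ... #endif" guards added by hand.
// Returns one build-options string per region, each disabling exactly one.
std::vector<std::string> skipSectionOptions(int sectionCount) {
    std::vector<std::string> options;
    for (int n = 0; n < sectionCount; ++n) {
        // "-D name" is the standard OpenCL build option for defining a macro.
        options.push_back("-D SKIP_SECTION_" + std::to_string(n));
    }
    return options;
}
```

Each string would be passed as the options argument of clBuildProgram in a separate run, since a crash ends the process. The variant that stops crashing points at the region worth looking at more closely.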

http://developer.amd.com/support/

Thanks for your patience