
robbert_harms
Journeyman III

Compiler crashes

Hello all,

I am currently working on an application in Python and OpenCL (using pyopencl) to accelerate the computation of many small optimization problems. I took a C implementation of the Levenberg-Marquardt algorithm and, with some code generation tools, I am able to produce kernels from it. I have attached one such generated kernel to this question.

This kernel works, but only some of the time; it often crashes during kernel compilation.

My setup is as follows:

Machine 1, Gentoo x64, Radeon HD 280X

ati-drivers-13.12: works most of the time, gives random failures during kernel compilation

ati-drivers-14.6_beta1: always crashes during compilation

Machine 2, Windows 7, Radeon HD 260x

latest drivers: works only if '-cl-opt-disable' is set, crashes otherwise.

I hope someone can suggest a solution to these compiler crashes, or perhaps AMD's compiler team can use the kernel as a test case for their drivers.

Regards,

Robbert

10 Replies
dipak
Big Boss

Hi Robbert,

Thanks for reporting the error. Please find my questions and suggestions below.


This kernel works, but only some of the time; it often crashes during kernel compilation.



latest drivers: works only if '-cl-opt-disable' is set, crashes otherwise.


1. Does the kernel always compile fine on both setups when '-cl-opt-disable' is set? If you haven't checked yet, please try it.

2. Does it report any compilation error before crashing? Did you try to catch the compilation error using clGetProgramBuildInfo? If not, please try it and share your observations.

3. Please share the driver version (for the second setup), the APP SDK version and the development environment (say, the Visual Studio version) you used. Please attach the output of clinfo for all cases. [Note: I haven't personally worked with Gentoo Linux and am not sure whether the AMD Catalyst driver officially supports it. Please check the release notes of the driver in question for the list of compatible operating systems.]

4. Since you are familiar with this code, it would be a great help to us if you could reproduce the same bug in a simplified version (with the code unrelated to the problem removed) and post it. That code can then be forwarded directly to the compiler team as a test case.

Regards,


Hello Dipak,

Thank you for your reply, I hope we are able to solve the compilation problems. To answer your questions:

1. At the moment, compilation gives the following results:

Machine 1 with driver 13.12: seems to compile at the moment, with and without -cl-opt-disable

Machine 1 with driver 14.6_beta1: never works, with or without -cl-opt-disable

Machine 2 with driver 14.10.1006-140417a-171099C: works if -cl-opt-disable is set.

What I find particularly strange here is that, on my Linux machine, any driver above 13.12 gives (massive) compilation problems, whereas 13.12 works fine most of the time. Something changed in those driver versions for the worse (it would seem).

2. Yes, it fails in various ways. On Windows it gives:

Process finished with exit code -1073741819 (0xC0000005)

On Linux it gives:

Program received signal SIGSEGV, Segmentation fault.
0x00007fffe3955b39 in ?? () from /usr/lib64/OpenCL/vendors/amd/libamdocl64.so

(backtrace included in a separate file).

The crash also takes down the Python runtime, so I cannot catch an error of any kind.

3. Windows version: Windows 7 Ultimate, SP1, x64
Driver version: 14.10.1006-140417a-171099C

I have attached the clinfo output.

4. I have tried to reduce the test case to the minimum, but it is unfortunately a large kernel; reducing it further would make it lose its meaning.
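Regarding point 2: because the segfault happens inside the driver library, no Python exception is ever raised in the calling process. One possible workaround is to run the build in a child process and inspect its exit status from the parent. The sketch below is only an illustration under assumptions: CHILD_SOURCE, run_isolated and classify are hypothetical names, the child script assumes pyopencl is installed and that ordinary build errors surface as pyopencl.Error, and it has not been tested against the drivers discussed here.

```python
import signal
import subprocess
import sys

# Hypothetical child script: builds one kernel file with pyopencl.
# argv[1] is the kernel path, the remaining arguments are build options.
CHILD_SOURCE = r'''
import sys
import pyopencl as cl

path, options = sys.argv[1], sys.argv[2:]
ctx = cl.create_some_context(interactive=False)
with open(path) as f:
    program = cl.Program(ctx, f.read())
try:
    program.build(options=options)
    print("BUILD OK")
except cl.Error as exc:  # ordinary compile errors (assumed to land here)
    print("BUILD FAILED")
    print(exc)
    sys.exit(1)
'''


def run_isolated(source, args=(), timeout=300):
    """Run Python source in a fresh interpreter so a crash cannot kill us."""
    proc = subprocess.run(
        [sys.executable, "-c", source, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def classify(returncode):
    """Turn a child exit status into a short human-readable verdict."""
    if returncode == 0:
        return "ok"
    # Windows access violation, reported either signed or unsigned.
    if returncode & 0xFFFFFFFF == 0xC0000005:
        return "crashed: access violation (0xC0000005)"
    # On POSIX, subprocess reports death by signal N as -N.
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"crashed: killed by {name}"
    return f"failed: exit code {returncode}"
```

A parent script could then call run_isolated(CHILD_SOURCE, ["kernel.cl", "-cl-opt-disable"]) (the file name is a placeholder) and pass the return code to classify. The Windows exit code -1073741819 shown above is 0xC0000005 read as a signed 32-bit value, which is why the check masks the code to 32 bits first.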

Regards,

Robbert


Hi Robbert,

Thanks for your reply and for posting the reduced test case.

From my first question, I wanted to figure out whether it was a compiler optimization problem or not. But your observations (particularly Machine 1 with driver 14.6_beta1) indicate something else.


What I find particularly strange here is that, on my Linux machine, any driver above 13.12 gives (massive) compilation problems, whereas 13.12 works fine most of the time. Something changed in those driver versions for the worse (it would seem).


So, can I assume this issue can be reproduced with any other driver above 13.12?

BTW, did you check the release notes of the driver in question to see whether your Linux distribution is included in the list of compatible operating systems? I just want to make sure that there is no conflict in the setup, i.e. that the driver installation completed successfully.

Regards,


Hello Dipak,

Yes, that is a fair assumption. Up to 13.12 my software worked; unfortunately none of the 14.x versions do (on Linux, that is; I haven't tested this on Windows).

You are right, my OS (Gentoo) is not officially supported by the driver, even though my setup meets the (general) Linux system requirements and I installed the driver using my package manager. I would understand if you did not dig further into the Linux issue unless I could demonstrate the instability on one of the supported operating systems.

For the Windows case, I forgot to answer one of your questions: I use the PyCharm 3.4 Community Edition editor, not Visual Studio.

Thank you for your effort so far. I hope I am able to help make the ATI drivers better.

Best regards,

Robbert


Hi Robbert,

We always welcome people like you and appreciate your support in improving AMD's OpenCL development framework (driver, runtime, SDK, etc.).

Thanks for confirming that Gentoo is not officially supported by the driver. As you can understand, it would be difficult for us to set up a machine with that OS. But I'll try to reproduce the issue on Windows and/or another flavor of Linux that is compatible with the driver, and let you know my findings.

Regards,


Hi Robbert,

Sorry for this delayed reply.

I was able to reproduce the issue on Windows 7 using the latest driver, 14.30. In my case, the compiler crashed even with the option "-cl-opt-disable". However, as you mentioned, disabling optimization worked for you under Windows with driver 14.10, so I'll try with that older driver. Meanwhile, could you please check with 14.30 and let me know what you observe?

Regards,


Hello Dipak,

No problem, I am already happy that you are willing to take the time to delve into this issue.

I installed a newer version, 14.20, on the Windows 7 machine (I used the auto-updater from the control center, which updated to that version). Indeed, it no longer works: it now crashes regardless of the -cl-opt-disable flag.

Regards,

Robbert


Hi Robbert,

Thanks for this confirmation.

I tested it again under Windows 7, this time with driver 14.10 (14.10-140417a-171084E-ATI). The compiler crashed with optimization flag "-O2" and higher, but worked fine with "-O0" [i.e. "-cl-opt-disable"] and "-O1". I've filed an internal bug report for this issue and will keep you updated.
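For anyone who wants to repeat this kind of check after a driver change, here is a rough sketch of a sweep over build options. It is an illustration only: sweep_build_options and try_build are hypothetical names, and simulated_build is a stand-in that merely mimics the outcome described above instead of calling a real compiler. Since a compiler segfault cannot be caught as an exception, a real try_build would have to run the build in a separate process and report success or failure from its exit status.

```python
def sweep_build_options(try_build, option_sets):
    """Call try_build once per option set and record whether it succeeded.

    try_build is a hypothetical callable taking a list of option strings and
    returning True on success; any exception it raises counts as a failure.
    """
    results = {}
    for options in option_sets:
        key = " ".join(options) or "(default)"
        try:
            results[key] = bool(try_build(list(options)))
        except Exception:
            results[key] = False
    return results


# Option sets taken from this thread; the empty list means driver defaults.
OPTION_SETS = [[], ["-cl-opt-disable"], ["-O1"], ["-O2"]]


def simulated_build(options):
    """Stand-in for a real build: succeeds only with the options that were
    reported as working with driver 14.10 in this thread."""
    return bool(options) and options[0] in ("-cl-opt-disable", "-O1")


if __name__ == "__main__":
    for key, ok in sweep_build_options(simulated_build, OPTION_SETS).items():
        print(f"{key:18s} {'ok' if ok else 'FAILED'}")
```

Keeping the sweep separate from the build function makes it easy to swap the stand-in for a real build and to rerun the same table for each driver version.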

Regards,

msoos
Adept I

Dear Robbert,

To be plain and honest, if you want your stuff to be fixed, you had better start living like an elf in AMD land: the last time I reported a miscompilation bug, it took a year for the folks at AMD to fix it. You are better off working around the bug. Just make sure the segfault doesn't turn into a miscompilation (that has happened to me). Double-test your results, and test them again after every driver upgrade (better yet, *never* upgrade drivers, ever). Good luck!
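As a rough idea of what such double-testing could look like, the sketch below compares results from the OpenCL kernel against a trusted reference (for example the original C implementation) element by element. find_mismatches is a hypothetical helper and the tolerances are placeholder values that would need tuning for the actual problem.

```python
import math


def find_mismatches(reference, candidate, rel_tol=1e-5, abs_tol=1e-8):
    """Return the indices at which candidate deviates from reference.

    NaN never compares as close, so a NaN on either side is reported too.
    """
    if len(reference) != len(candidate):
        raise ValueError("result arrays differ in length")
    return [
        i
        for i, (ref, got) in enumerate(zip(reference, candidate))
        if not math.isclose(ref, got, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


if __name__ == "__main__":
    cpu_result = [1.0, 2.0, 3.0]       # placeholder reference values
    gpu_result = [1.0, 2.000001, 3.5]  # placeholder kernel output
    bad = find_mismatches(cpu_result, gpu_result)
    print("mismatching indices:", bad)
```

Running a check like this on a fixed set of inputs after every driver change would at least flag a silent miscompilation early.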

Mate


Hello Mate,

Thank you for your insight. I had already guessed that it may take a long time for this bug to be fixed, but so far AMD has shown willingness to help. While I am primarily concerned that the application runs under Linux (which it does, as long as I do not upgrade my driver version), I hope that in the future I will also be able to run it under Windows.

If AMD is at all serious about OpenCL (which I hope they are), they will try to fix these issues.

Best regards,

Robbert
