I have a program that uses float8 extensively. It runs fine with the Intel and Nvidia SDKs, and with the AMD SDK on GPUs. With the AMD SDK I can also run it on processors without AVX without a problem. On Bulldozer, however, I get segmentation faults.
I have tracked the problem down to a line where the 5th element of a float8 variable is accessed. Perhaps the compiler is messing something up with alignment? (That's my best guess...)
So what do I do now? Report it to AMD? How?
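On the alignment guess: the 5th element of a float8 (.s4) sits at byte offset 16, and OpenCL expects float8 data to be aligned to 32 bytes. If the host hands its own buffers to the runtime, a misaligned host pointer is one thing worth ruling out. Below is a minimal C11 sketch of such a check. The names host_float8, is_aligned and alloc_float8 are hypothetical, made up for illustration and not part of any SDK, and nothing in this thread confirms that alignment is the actual cause of the crash.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host-side stand-in for an OpenCL float8:
   8 floats = 32 bytes, forced onto a 32-byte boundary. */
typedef struct {
    alignas(32) float s[8];
} host_float8;

/* Returns 1 if p is aligned to `alignment` bytes.
   `alignment` must be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocates n float8 elements on a 32-byte boundary (n > 0).
   aligned_alloc (C11) wants a size that is a multiple of the
   alignment; n * 32 always is. Release the result with free(). */
static host_float8 *alloc_float8(size_t n)
{
    return aligned_alloc(32, n * sizeof(host_float8));
}
```

Before handing a buffer to the OpenCL runtime, the host code could log is_aligned(ptr, 32) for it. If that prints 1 and the crash remains, host-side alignment is probably not the culprit.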
Thanks for reporting the issue. Can you provide a simple test case that reproduces the issue? That would help us reproduce the problem internally.
Thanks for the quick response. It might take a few days before I can come up with a small test case. How should I send it?
yurtesen,
Can you try passing -disable-avx when you build the program, to see whether it is a valid workaround until we get the issue resolved?
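For readers unsure where such a flag goes: build options are passed as a plain string in the options argument of clBuildProgram. Below is a minimal C sketch that composes that string. The helper make_build_options is a hypothetical name for illustration, and the flag spelling -fdisable-avx is the one used in the replies further down this thread, so treat it as an assumption for your SDK version.

```c
#include <stdio.h>

/* Hypothetical helper: writes the clBuildProgram options string into buf.
   `base` holds any other options and may be NULL or "".
   Returns 0 on success, -1 if buf is too small. */
static int make_build_options(char *buf, size_t n,
                              const char *base, int disable_avx)
{
    /* Flag spelling taken from this thread; an assumption, not verified. */
    const char *flag = disable_avx ? "-fdisable-avx" : "";
    if (base == NULL)
        base = "";
    /* Only add a separating space when both parts are non-empty. */
    const char *sep = (*base && *flag) ? " " : "";
    int len = snprintf(buf, n, "%s%s%s", base, sep, flag);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Usage with the OpenCL host API (needs CL/cl.h, not compiled here):
 *   char opts[256];
 *   if (make_build_options(opts, sizeof opts, "-cl-mad-enable", 1) == 0)
 *       err = clBuildProgram(program, 1, &device, opts, NULL, NULL);
 */
```

Keeping the flag behind a switch makes it easy to turn the workaround off again once a fixed SDK is available.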
Yes, it appears to work with -fdisable-avx, and the program works on Bulldozer with the Intel OpenCL SDK. I checked with the Intel offline compiler and I can see that it uses ymm registers and AVX instructions.
But I have now found out that the program crashes while building the OpenCL executable, during the call to clBuildProgram. Tomorrow I will try to copy/paste the code into AMD's KernelAnalyzer to see whether it works there (it is a shame that there is no offline compiler for AMD OpenCL on Linux).
Are you still interested in getting the problem kernel code?
As an update, I realized that (with SDK 2.7) this still does not work on AMD processors. However, I am able to compile the kernel on Intel processors using the AMD OpenCL SDK. Perhaps because there is no AVX? But I already tried -fdisable-avx on Bulldozer and it did not work... strange...
Update: I have re-tried -fdisable-avx with APP SDK 2.7 and now there are no more segmentation faults at the program build stage! Now what?
Hello Jeff and Micah,
I have now confirmed that the AMD APP SDK crashes when compiling the kernel code. I tried this both on Linux and on Windows using AMD APP KernelAnalyzer. It works if I select GPU ISA but not x86 Assembly (the program works fine on the GPU on Linux as well).
While it is not a huge secret, I would rather not post the kernel openly on the forum. Is it possible to send it by private message or some other method?
Thanks,
How do I send the source (without posting it on the public forum)?