
Raistmer
Adept II

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

Himanshu, thanks for looking into this issue deeply.

Since Claggy was the first to report the Cat 12.10 issue to me, I think he is right in his explanation of why your observations differ from what I said about Cat 12.10.

Please replace work_unit.sah from my archive with the file of the same name from the PG0009_v7.workunit.7z.zip archive that Claggy attached.

Another reference file, ref-setiathome_6.98_windows_intelx86.exe-PG0009_v7.wu.res (also provided in that archive), is needed to check the fresh result.sah. The comparison utility remains the same.

P.S. So, for now, we can summarize the issues as follows:

1) Incorrect computations with Cat 12.10 do not appear in all data sets. Moreover, for the PG0009 task the difference is in what we call "best signals", that is, signals below the threshold for being marked as reportable. That means the computations in kernels compiled under 12.10 differ only slightly from the correct ones, but enough for a precision issue to appear.

2) The Catalyst 13.1 compiler is broken for this kernels file. This is a separate issue, because the error appears even before the computations begin.

himanshu_gautam
Grandmaster

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

Hi Raistmer,

I checked with the new data as you suggested. I took the result.sah file and the reference file from the PG0009_v7.workunit.7z.zip attachment. I am not sure what the other 2 attachments are intended for.


The rescmpv5 utility gives weakly similar for the 12.10 driver. So some corruption is happening.

rescmpv5 gives strongly similar for the 12.8 driver. Expected.

But rescmpv5 now gives strongly similar for the 13.1 driver with the new data. SURPRISE again.

So, as I understand it, there are two issues here:

1. Data corruption when the driver is updated from 12.8 to 12.10. It is not reproduced with the 13.1 driver, so it is probably not an issue. Can you confirm?

2. Driver crash when the driver is updated from 12.10 to 13.1. This still holds for the old data.

I will try to do some debugging in CodeXL too, and let you know.

Hi Raistmer,

Would it be possible to provide a test case with the host code? I tried working with the kernel file, but there are so many kernels (which are enabled or disabled using #defines). Also, RESULT_SIZE seems to be a macro defined in the host code and used in the kernels. I could not compile the kernels in KernelAnalyzer because of this macro.

Message was edited by: Himanshu Gautam

Raistmer
Adept II

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

Hi, Himanshu.

Thanks for continuing to look into this issue.

Regarding the absence of a crash under Cat 13.1: no idea for now. Maybe Claggy or another alpha tester who follows this thread will have an idea. I can't maintain test configurations myself right now because I need a stable environment, so I am sticking with Cat 12.1 on my main PC and an "unknown" (but also old) version of Catalyst on a C-60 netbook. Information about app behavior on the latest drivers comes from alpha testers.

As for the host code: of course, no problem. It's a GPLed app with freely available sources.

So you can look directly into the repository (head, or the revision that I used for the test case binary). Suggestions and improvements are welcome!

Here is the repository:

https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt

For this particular app you need the files in the root plus these directories:

https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt/AKv8

https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt/bin

https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt/lib

https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt/src

P.S. The defines you are looking for are in the GPU_lock.cpp file:



strcpy(buildoptions,"-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations -fno-bin-llvmir -fno-bin-amdil");

if(swi.analysis_cfg.autocorr_fftlen) strcat(buildoptions," -DSETI7");//R: dynamically define if autocorr is needed
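
For anyone who wants to compile the kernel file outside the app, the two lines above boil down to a fixed option string plus an optional " -DSETI7". Below is a minimal sketch of the same assembly with a bounded snprintf instead of strcpy/strcat. The helper name make_build_options is hypothetical and does not exist in the repository; only the option text is taken from GPU_lock.cpp.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the SETI@home sources): writes the
 * OpenCL build options into buf. The option text mirrors the two lines
 * quoted from GPU_lock.cpp; " -DSETI7" is appended only when
 * autocorrelation is requested (autocorr_fftlen != 0).
 * Returns 0 on success, -1 if buf is too small. */
int make_build_options(char *buf, size_t size, int autocorr_fftlen)
{
    int n = snprintf(buf, size, "%s%s",
                     "-w -DRESULT_SIZE=32 -cl-unsafe-math-optimizations"
                     " -fno-bin-llvmir -fno-bin-amdil",
                     autocorr_fftlen ? " -DSETI7" : "");
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string is what would go into the options argument of clBuildProgram, or into the build options field of KernelAnalyzer, so that RESULT_SIZE is defined when the kernels are compiled standalone.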
himanshu_gautam
Grandmaster

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

Thanks Raistmer for the update.

Are you sure the -cl-unsafe-math-optimizations flag is not causing the data corruption issue?

I will try to look into the code base in a few days. Meanwhile, if you can arrange for more information from Claggy and team, it would be helpful.

Raistmer
Adept II

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

It was not the case with older drivers. But maybe the new one does indeed enable some more "unsafe" optimizations. Worth checking; I will, thanks.

Regarding the kernel file compilation issues: they were observed under Linux too. Not a crash, but an "internal error" instead:

Error : Building Program (source, clBuildProgram):main kernels: not OK code -11

Internal error: Compilation failed.

(this is on Catalyst 13.2 beta7)

It's the same app, just its Linux port. We will try to narrow down the location of the issue inside the CL file.

himanshu_gautam
Grandmaster

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

Error : Building Program (source, clBuildProgram):main kernels: not OK code -11

Internal error: Compilation failed.

-11 means the kernel compilation failed (CL_BUILD_PROGRAM_FAILURE). Check the build log returned by the clGetProgramBuildInfo API.
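
As a small aid for reading messages like "not OK code -11", here is a sketch that maps a few numeric OpenCL status codes to their symbolic names. The function name cl_error_name is made up for illustration; the numeric values are the ones defined in the Khronos cl.h header, and only a handful of codes relevant to program building are covered. The build log itself is fetched separately, via clGetProgramBuildInfo with CL_PROGRAM_BUILD_LOG.

```c
#include <stddef.h>

/* Illustrative helper (hypothetical name): translate a subset of OpenCL
 * status codes into their symbolic names as defined in CL/cl.h.
 * Codes not listed here fall through to "UNKNOWN". */
const char *cl_error_name(int code)
{
    switch (code) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    default:  return "UNKNOWN";
    }
}
```

Logging the symbolic name next to the number makes it easier to tell a compiler failure (-11) apart from, say, a rejected option string (-43).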

freighter
Journeyman III

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

himanshu.gautam wrote:

Error : Building Program (source, clBuildProgram):main kernels: not OK code -11

Internal error: Compilation failed.

-11 is the kernel compilation failed. Check out the build log from clGetProgramBuildInfo API.

That is the complete build log:

"Internal error:Compilation failed."

Attached is the output from AMD APP KernelAnalyzer2 for our MultiBeam_Kernels.cl

Raistmer
Adept II

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

himanshu.gautam wrote:

Thanks Raistmer for the update.

Are you sure -cl-unsafe-math-optimizations flag is not causing the data corruption issue?

I will try to look into the code base in some days. Meanwhile If you can arrange for more information, from claggy and team, it would be helpful.

Checked: the -cl-unsafe-math-optimizations flag has no influence on the invalid results.

himanshu_gautam
Grandmaster

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

I am able to compile both kernel files, Multibeam_kernels_r1726.cl and Multibeam_kernel_r1643.cl, with the above-mentioned build options on the 13.1 driver. 13.2 is in beta, so I recommend trying 13.1 only. Kernel Analyzer, with the attached info, built both kernels for all 18 OpenCL devices.

Raistmer
Adept II

Re: Problems with Cat 12.10 and up and HD7xxx (and not only) GPUs

Did you try under Windows or under Linux?

This subthread is about Linux, and your screenshot closely resembles the "About" dialog of the Windows version...
