
OpenCL bug with HD 7790 (Bonaire)

Question asked by vmiura on Apr 3, 2013

Hello,

 

I am hitting an odd bug running one of my OpenCL kernels on a new HD 7790.  This is a kernel that I've verified on an HD 7770, and also on some Fermi and Kepler cards.

 

After a lot of narrowing down, I strongly suspect it's some kind of compiler bug.  Unfortunately, it doesn't look like CodeXL will disassemble Bonaire ISA yet, so I can't confirm whether the generated code is doing something odd.  I also can't debug the kernel.

 

Are there any known issues with register clobbering or similar?  I have 'AMD APP SDK Runtime 10.0.1124.2'.

 

I'll try to make a standalone test, but this is the gist of the problem code:

 

__global struct MyStruct *m = (__global struct MyStruct *)(basePtr + offset);

 

if(m->magic != 123)
{
    ... dump debug diagnostics to global memory  // This never happens
    return;
}

if(...)
{
    // loads + arithmetic
    // no stores, and no touching 'm'
}
else
{
    // loads + arithmetic
    // no stores, and no touching 'm'
}

if(m->magic != 123)
{
    ... dump debug diagnostics to global memory  // This always happens
    return;
}

 

The result is that the dump is triggered by the 2nd check of m->magic, not the 1st.  Nothing should be modifying global memory here: there is just the one kernel running, with clFinish before and after, and the behaviour is 100% reproducible.

 

I dumped 'basePtr', 'offset' and 'm', and I can see that m is corrupt (m != basePtr + offset).
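
The pointer comparison described above could be sketched roughly as follows. This is only an illustration, written as plain host-side C so that it can be compiled and run without a GPU: the `__global` qualifier is stubbed out with a macro, and the names `DebugRecord` and `record_pointer_state`, as well as the single-field layout of `MyStruct`, are assumptions that do not come from the actual kernel. The helper records `basePtr`, `offset` and `m`, recomputes the expected address, and flags a mismatch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stub so this sketch builds as host C; in a kernel, __global is a real
 * address-space qualifier. */
#define __global

/* Assumed layout: the post only shows the 'magic' field. */
struct MyStruct {
    int magic;
};

/* Hypothetical record written to a diagnostics buffer. */
struct DebugRecord {
    uintptr_t base;      /* value of basePtr                   */
    uintptr_t offset;    /* byte offset added to basePtr       */
    uintptr_t actual;    /* value of m as currently held       */
    uintptr_t expected;  /* basePtr + offset, freshly computed */
    int       mismatch;  /* 1 if actual != expected, else 0    */
};

/* Stores the pointer state in 'out' and returns 1 when m no longer
 * equals basePtr + offset, 0 otherwise. */
static int record_pointer_state(__global const unsigned char *basePtr,
                                size_t offset,
                                __global const struct MyStruct *m,
                                __global struct DebugRecord *out)
{
    uintptr_t expected = (uintptr_t)(basePtr + offset);
    uintptr_t actual   = (uintptr_t)m;

    out->base     = (uintptr_t)basePtr;
    out->offset   = (uintptr_t)offset;
    out->actual   = actual;
    out->expected = expected;
    out->mismatch = (actual != expected);
    return out->mismatch;
}
```

Calling a helper like this once before and once after the branch would show at which point `m` stops matching `basePtr + offset`.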
