
InterlockedXxx sync question with AMD graphics card

Question asked by souledgeii on May 30, 2012
Latest reply on Jun 7, 2012 by kcarney
I have one piece of compute shader code, like this:
uint g_maxNumber = 1000;
groupshared uint g_globalIndex = 0;
struct TestInfo
{
    uint m_offset; // member types were missing in the post; uint is assumed
    uint m_value;
};
StructuredBuffer<TestInfo> testBuffer : register(t0);
[numthreads(128, 1, 1)]
void function_a()
{
    uint currentIdx;
    // Atomically fetch the next index from the group-shared counter.
    InterlockedAdd(g_globalIndex, 1, currentIdx);
    // GroupMemoryBarrierWithGroupSync();

    TestInfo curInfo = testBuffer[currentIdx];
    [loop]
    while (currentIdx < g_maxNumber)
    {
        .....
        InterlockedAdd(g_globalIndex, 1, currentIdx);
    }
}
This code doesn't get the correct "currentIdx" value, but it works fine when I add the group memory barrier (the commented-out GroupMemoryBarrierWithGroupSync() call) after the InterlockedAdd.
Adding a barrier forces all threads to sync, which slows down the whole process. Is this a bug with the AMD card, or am I missing something?
The same code works fine on an NVIDIA GTX 570; the AMD card I have is a 7970.
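
For comparison, here is a rough sketch of a variant that syncs the group only once, before the loop, instead of after every InterlockedAdd. It rests on an assumption that is not confirmed anywhere in this thread: that the counter has to be zeroed explicitly, because the contents of groupshared memory are not guaranteed at the start of a thread group and the "= 0" initializer may not take effect. Whether that explains the behaviour on the 7970 is untested. The names outBuffer and groupIndex, the uint member types, and the placeholder work inside the loop are all made up for illustration.

```hlsl
static const uint g_maxNumber = 1000;

struct TestInfo
{
    uint m_offset; // assumed type, not given in the post
    uint m_value;  // assumed type, not given in the post
};

StructuredBuffer<TestInfo> testBuffer : register(t0);
RWStructuredBuffer<uint> outBuffer : register(u0); // hypothetical output buffer

// No initializer here: the counter is zeroed explicitly in the kernel.
groupshared uint g_globalIndex;

[numthreads(128, 1, 1)]
void function_a(uint groupIndex : SV_GroupIndex)
{
    // Exactly one thread in the group zeroes the shared counter.
    if (groupIndex == 0)
    {
        g_globalIndex = 0;
    }
    // Single group-wide sync, in uniform control flow, so that every
    // thread sees the zeroed counter before its first atomic add.
    GroupMemoryBarrierWithGroupSync();

    uint currentIdx;
    InterlockedAdd(g_globalIndex, 1, currentIdx);

    [loop]
    while (currentIdx < g_maxNumber)
    {
        // Placeholder work standing in for the elided loop body.
        TestInfo curInfo = testBuffer[currentIdx];
        outBuffer[currentIdx] = curInfo.m_offset + curInfo.m_value;

        // Fetch the next index; no barrier is issued inside the loop.
        InterlockedAdd(g_globalIndex, 1, currentIdx);
    }
}
```

In this sketch the atomic add alone hands each thread a unique index, so the barrier cost is paid once per group rather than once per fetched index.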
