1 Reply Latest reply on Jun 7, 2012 11:26 AM by kcarney

    InterlockedXxx sync question with AMD graphics card

      I have a compute shader that looks like this:
      uint g_maxNumber = 1000;
      groupshared uint g_globalIndex = 0;
      struct TestInfo
      {
          // ...
      };
      StructuredBuffer<TestInfo> testBuffer : register(t0);
      [numthreads(128, 1, 1)]
      void function_a()
      {
          uint currentIdx;
          InterlockedAdd(g_globalIndex, 1, currentIdx);
          // GroupMemoryBarrierWithGroupSync();
          TestInfo curInfo = testBuffer[currentIdx];
          while (currentIdx < g_maxNumber)
          {
              InterlockedAdd(g_globalIndex, 1, currentIdx);
          }
      }
      This code doesn't get the correct "currentIdx" value, but it works fine when I add the group memory barrier (the commented-out line) after the InterlockedAdd.
      Adding the barrier forces all threads in the group to sync, which slows down the whole process. Is this a bug with the AMD card, or is there something I am missing?
      The same code works fine on an NVIDIA GTX 570 card; the AMD card I have is a 7970.
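
      For comparison, here is a minimal sketch of the same index-grabbing pattern with the shared counter zeroed explicitly by one thread, followed by a single barrier before any thread touches it. It assumes that the contents of groupshared memory are not defined when the thread group starts and that the "= 0" initializer on the declaration cannot be relied on; this is an assumption to test, not a confirmed explanation of the difference between the two cards. The "value" member, "outBuffer", the "groupIndex" parameter and the "static const" qualifier are hypothetical additions made only so the sketch is complete.

```hlsl
// Sketch only: names marked "hypothetical" are not from the original shader.
static const uint g_maxNumber = 1000;   // static const (assumption) so the value is compiled in
groupshared uint g_globalIndex;         // no initializer; zeroed explicitly below

struct TestInfo
{
    uint value;                          // hypothetical member
};

StructuredBuffer<TestInfo> testBuffer : register(t0);
RWStructuredBuffer<uint>   outBuffer  : register(u0);   // hypothetical output

[numthreads(128, 1, 1)]
void function_a(uint groupIndex : SV_GroupIndex)
{
    // One thread resets the shared counter.
    if (groupIndex == 0)
    {
        g_globalIndex = 0;
    }
    // A single barrier, executed once, so every thread sees the reset
    // before its first atomic add.
    GroupMemoryBarrierWithGroupSync();

    uint currentIdx;
    InterlockedAdd(g_globalIndex, 1, currentIdx);   // currentIdx receives the pre-add value

    // Each thread keeps claiming indices until the counter passes the limit.
    while (currentIdx < g_maxNumber)
    {
        TestInfo curInfo = testBuffer[currentIdx];
        outBuffer[currentIdx] = curInfo.value;
        InterlockedAdd(g_globalIndex, 1, currentIdx);
    }
}
```

      In this variant the barrier runs once per thread group, before the loop, rather than after every InterlockedAdd, so the cost of the sync should not grow with the number of items processed.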