1 Reply Latest reply on Jun 7, 2012 11:26 AM by kcarney

    InterlockedXXXX sync question with AMD graphics card

    souledgeii
      I have a compute shader that looks like this:
      uint g_maxNumber = 1000;
      groupshared uint g_globalIndex = 0;
      struct TestInfo
      {
           uint m_offset; // member types were omitted in the post; uint is assumed
           uint m_value;
      };
      StructuredBuffer<TestInfo> testBuffer : register(t0);
      [numthreads(128, 1, 1)]
      void function_a()
      {
          uint currentIdx;
          InterlockedAdd(g_globalIndex, 1, currentIdx);
          // GroupMemoryBarrierWithGroupSync();

          TestInfo curInfo = testBuffer[currentIdx];
          [loop]
          while (currentIdx < g_maxNumber)
          {
              .....
              InterlockedAdd(g_globalIndex, 1, currentIdx);
          }
      }
      This code doesn't return the correct "currentIdx" values, but it works fine when I add the group memory barrier (the commented-out line above) after the InterlockedAdd.
      Adding the barrier forces every thread in the group to sync, which slows down the whole process. Is this a bug with the AMD card, or am I missing something?
      The same code works fine on an NVIDIA GTX 570; the AMD card I have is a 7970.
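      For comparison, here is a minimal sketch of the variant with the barrier enabled, which is the one described above as returning correct indices on the 7970. Several details are assumptions and not part of the original shader: the uint member types, the static const on g_maxNumber, the SV_GroupIndex parameter, and the explicit zeroing of the counter by thread 0 (added so the sketch does not depend on how the compiler treats the "= 0" initializer on a groupshared variable). The loop body stays elided, as in the post.

```hlsl
// Sketch only: the barrier variant of the shader from the post.
// Assumed (not in the original): uint member types, static const,
// the SV_GroupIndex parameter, and the explicit counter reset.
static const uint g_maxNumber = 1000;
groupshared uint g_globalIndex;

struct TestInfo
{
    uint m_offset; // type assumed
    uint m_value;  // type assumed
};

StructuredBuffer<TestInfo> testBuffer : register(t0);

[numthreads(128, 1, 1)]
void function_a(uint groupIndex : SV_GroupIndex)
{
    // Assumption: one thread zeroes the shared counter, and the group
    // waits so that no thread increments it before the reset is visible.
    if (groupIndex == 0)
    {
        g_globalIndex = 0;
    }
    GroupMemoryBarrierWithGroupSync();

    // Each thread claims its first index; currentIdx receives the
    // counter value from before this thread's increment.
    uint currentIdx;
    InterlockedAdd(g_globalIndex, 1, currentIdx);

    // The barrier from the post, enabled. It sits outside the loop,
    // so every thread in the group reaches it.
    GroupMemoryBarrierWithGroupSync();

    TestInfo curInfo = testBuffer[currentIdx];

    [loop]
    while (currentIdx < g_maxNumber)
    {
        // ... loop body elided in the original post ...

        // Claim the next index.
        InterlockedAdd(g_globalIndex, 1, currentIdx);
    }
}
```

      Both barriers sit outside the loop, so each thread pays the sync cost once per dispatch, not once per claimed index.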