1 Reply Latest reply on Jun 30, 2011 3:01 PM by maximmoroz

    wavefront distribution



      I want to know what policy the compiler uses to distribute wavefronts across the SIMD engines (compute units, CUs).

      Assume I have 20 wavefronts (as reported by the profiler in Visual Studio). The HD 5870 has 20 SIMD engines.

      Which one is correct:

      Each SIMD engine gets 1 wavefront, or

      1 SIMD engine gets 4 wavefronts (so 5 SIMD engines are used and the remaining 15 SIMD engines sit idle)?
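
      To make the two possibilities concrete, here is a minimal sketch of how I picture them. The function names spread and packed are hypothetical, and neither policy is confirmed to be what the hardware or runtime actually does; the code only counts how many SIMD engines would stay idle in each case.

```python
def spread(num_wavefronts, num_simds):
    """Hypothesis A: hand out wavefronts round-robin, one SIMD engine at a time."""
    counts = [0] * num_simds
    for w in range(num_wavefronts):
        counts[w % num_simds] += 1
    return counts


def packed(num_wavefronts, num_simds, per_simd=4):
    """Hypothesis B: give one SIMD engine per_simd wavefronts before moving to the next."""
    counts = [0] * num_simds
    for w in range(num_wavefronts):
        counts[(w // per_simd) % num_simds] += 1
    return counts


if __name__ == "__main__":
    a = spread(20, 20)   # 20 wavefronts over 20 SIMD engines
    b = packed(20, 20)
    # Number of idle SIMD engines under each hypothesis
    print(a.count(0), b.count(0))  # prints "0 15"
```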


      The reason I asked the question above:

      I saw two cases in my experiments (local work size is set to NULL).

      Case 1:

      If the total number of work-items (global work size) is large, then from the number of wavefronts reported by the profiler (after I do some math) I can tell that each wavefront holds 64 work-items (full).

      Case 2:

      If the total number of work-items is not very large, the compiler seems to only half-fill each wavefront (1 wavefront is 32 work-items), so the number of wavefronts reported is larger than I expected. It seems the compiler prefers more wavefronts that are half-filled (32 work-items) over fewer wavefronts that are full (64 work-items). Is that correct?
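
      The math I am doing in both cases can be sketched roughly as follows. It assumes that a wavefront never spans two work-groups, so a work-group smaller than 64 work-items still takes a whole wavefront. The local sizes of 64 and 32 are only my guesses at what the runtime picks when local work size is NULL, and wavefront_count and avg_fill are hypothetical helper names.

```python
WAVEFRONT_SIZE = 64  # wavefront width on the HD 5870


def wavefront_count(global_size, local_size):
    """Wavefronts needed, assuming a wavefront never spans two work-groups."""
    num_groups = -(-global_size // local_size)          # ceiling division
    waves_per_group = -(-local_size // WAVEFRONT_SIZE)  # ceiling division
    return num_groups * waves_per_group


def avg_fill(global_size, local_size):
    """Average number of work-items per wavefront."""
    return global_size / wavefront_count(global_size, local_size)


if __name__ == "__main__":
    # Case 1: local size guessed as 64, so every wavefront is full
    print(wavefront_count(1280, 64), avg_fill(1280, 64))  # 20 64.0
    # Case 2: local size guessed as 32, so every wavefront is half-filled
    print(wavefront_count(640, 32), avg_fill(640, 32))    # 20 32.0
    # The same 640 work-items with a local size of 64 would need only 10 wavefronts
    print(wavefront_count(640, 64))                       # 10
```

      If this assumption holds, the half-filled wavefronts in Case 2 would come from the local work size that gets chosen, not from a separate decision about how full to make each wavefront.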

      I hope someone can help me with this question. I'm writing a school report, so I don't want to write wrong information in the report.