1 Reply Latest reply on Jan 22, 2014 4:36 AM by ravikeshri

    High "write unit stalled" counter values. Any tricks to lower them?

    Raistmer

      A few of my kernels show a quite high "write unit stalled" percentage (~80%).

      How could the writes be rearranged to lower this counter (the percentage is measured with the CodeXL profiler)?

       

      Currently each work-item writes 4 floats (one float4) at once, and adjacent work-items write to adjacent elements.

       

      The kernel's ISA looks like:

       

      143 MEM_RAT_CACHELESS_STORE_RAW: RAT(11)[R1], R0, ARRAY_SIZE(4)  MARK  VPM

      144 ALU: ADDR(1728) CNT(2)

          466  x: MULADD_e    R2.x,  R5.x,  R3.w, -R1.y     

               y: MULADD_e    R2.y,  R5.y,  R3.w,  R1.w     

      145 MEM_RAT_CACHELESS_STORE_RAW: RAT(11)[R3], R2, ARRAY_SIZE(4)  MARK  VPM

       

      And the OpenCL code looks like:

       

      __global float4* gpu_dechirped

      ...

       

      uint tid = get_global_id(0);

      uint dchunk=get_global_id(1);

      ....

      gpu_dechirped[32*(FFT_SIZE/2)*dchunk + (2*i+1)*(FFT_SIZE/2) + tid] = cur_dechirp;
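
      To make the write pattern easier to reason about, here is a minimal host-side C sketch of the index expression above. It is only an illustration: the helper names dechirp_index and dechirp_byte_offset are hypothetical, the FFT_SIZE value and the source of i are not shown in the post and are passed in as plain parameters, and a float4 is assumed to be 16 bytes.

```c
#include <stddef.h>

/* float4 element index written by one work-item, copied from the store
 * expression in the post. fft_size stands in for FFT_SIZE and i for the
 * kernel's i; neither value is given in the post, so both are parameters. */
static size_t dechirp_index(size_t fft_size, size_t dchunk, size_t i, size_t tid)
{
    return 32 * (fft_size / 2) * dchunk + (2 * i + 1) * (fft_size / 2) + tid;
}

/* Byte offset of that element, assuming sizeof(float4) == 16. */
static size_t dechirp_byte_offset(size_t fft_size, size_t dchunk, size_t i, size_t tid)
{
    return dechirp_index(fft_size, dchunk, i, tid) * 16;
}
```

      With an assumed fft_size of 1024, work-items tid and tid+1 land 16 bytes apart, so neighbouring work-items do write neighbouring float4 elements. Consecutive values of i are fft_size elements apart, and consecutive values of dchunk are 32*(fft_size/2) elements apart.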