
High "write unit stalled" counter values. Any tricks to lower?

Question asked by Raistmer on Jan 13, 2014
Latest reply on Jan 22, 2014 by ravikeshri

A few of my kernels show a quite high "write unit stalled" percentage (~80%).

What could be done to rearrange the writes so that this counter goes down (the percentage is measured in the CodeXL profiler)?

 

Currently each work-item writes 4 floats at once, and adjacent threads write to adjacent elements.

 

The kernel's ISA looks like:

 

143 MEM_RAT_CACHELESS_STORE_RAW: RAT(11)[R1], R0, ARRAY_SIZE(4)  MARK  VPM

144 ALU: ADDR(1728) CNT(2)

    466  x: MULADD_e    R2.x,  R5.x,  R3.w, -R1.y     

         y: MULADD_e    R2.y,  R5.y,  R3.w,  R1.w     

145 MEM_RAT_CACHELESS_STORE_RAW: RAT(11)[R3], R2, ARRAY_SIZE(4)  MARK  VPM

 

And the CL looks like:

 

__global float4* gpu_dechirped

...

 

uint tid = get_global_id(0);

uint dchunk = get_global_id(1);

....

gpu_dechirped[32 * (FFT_SIZE / 2) * dchunk + (2 * i + 1) * (FFT_SIZE / 2) + tid] = cur_dechirp;
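
To make the write pattern easier to reason about, here is a small host-side C sketch that evaluates the same index expression outside the kernel. The value of FFT_SIZE is not given in the post, so 1024 is only a placeholder, and the helper names dechirp_index and dechirp_byte_offset are hypothetical and do not come from the original code. It is meant as an illustration of the addressing, not as the actual kernel.

```c
#include <stddef.h>

/* Placeholder value: the post does not state FFT_SIZE. */
#define FFT_SIZE 1024u

/* Size in bytes of one OpenCL float4 element. */
#define FLOAT4_BYTES 16u

/* Same index expression as the kernel's store, in units of float4.
 * size_t is used here only to keep the host-side arithmetic from
 * overflowing; the kernel itself computes the index with uint. */
static size_t dechirp_index(unsigned tid, unsigned dchunk, unsigned i)
{
    return (size_t)32 * (FFT_SIZE / 2) * dchunk
         + (size_t)(2 * i + 1) * (FFT_SIZE / 2)
         + tid;
}

/* Byte offset of that element from the start of gpu_dechirped. */
static size_t dechirp_byte_offset(unsigned tid, unsigned dchunk, unsigned i)
{
    return dechirp_index(tid, dchunk, i) * FLOAT4_BYTES;
}
```

For fixed dchunk and i, moving from tid to tid + 1 advances the index by one float4, i.e. 16 bytes, so neighbouring work-items in dimension 0 store to neighbouring 16-byte slots.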
