
Bad performance on moving data between private memory and local memory

Discussion created by rexiaoyu on Nov 3, 2009
Latest reply on Nov 3, 2009 by MicahVillmow

Is moving data from private memory to local memory inherently expensive? As soon as my kernel writes to local memory, the program runs much slower than before.

Code:

__private float4 block[4];
__local float4 local_block[16];

// very slow here. Why?
local_block[local_id] = block[0];
local_block[local_id + 1] = block[1];
local_block[local_id + 2] = block[2];
local_block[local_id + 3] = block[3];

barrier(CLK_LOCAL_MEM_FENCE);


