jai_cool
Journeyman III

L1 Cache friendly code

Hi there,

I would like to know how to write L1 cache friendly code. Assume an L1 cache line is X bytes long. If my wavefront needs X bytes from device memory, and other wavefronts in the same block (or wavefronts from other blocks running in parallel on the same execution unit) need the same X bytes, I expect to see a high L1 cache hit rate.

In the following code :

<code>
__kernel void testKernel(__global float *A, __global float *B)
{
    // Testing whether this hits the L1 cache: every work-item reads the same address.
    float val = B[0];

    // Only work-group 0 writes, one element per work-item.
    if (get_group_id(0) == 0) A[get_local_id(0)] = val;
}
</code>

Is this L1 cache friendly code? Should I expect a high L1 cache hit rate in the AMD OpenCL Profiler when I run this with hundreds of blocks, each with 256 or 512 threads?

0 Likes
8 Replies
jai_cool
Journeyman III

Can anyone help me with this?

Thanks!

0 Likes

What's your hardware?

0 Likes
amdkid
Adept I

Hello jai_cool

I think you need to use the built-in function prefetch, but I don't know how to use it effectively. I'm also interested in this issue.

0 Likes

My GPU is an AMD Radeon™ HD 6970. Please let me know how to get a high L1 cache hit rate. My understanding is that all workgroups (or, even better, all wavefronts in a workgroup) should read the same address or the same cache line, as in my code snippet.

0 Likes

As I already wrote, I don't know, but I'm also interested in this issue. That's why I'm in this thread.

0 Likes
notzed
Challenger

Chapter 4 of the APP programming guide contains about all the public information available on this.

Other than that, "L1 cache friendly code" is just code that doesn't touch a wide range of memory, since L1 is so small.
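
To put a rough number on "doesn't access much range of memory", here is a minimal host-side C sketch of a toy cache. Everything in it is an assumption made for illustration: a fully associative LRU policy, the names `toy_cache` and `sweep_hit_rate`, and the sizes used in the note below. It is not a model of the real L1 on any particular GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a fully associative LRU cache. Real GPU L1 caches are
   organised differently; all names and sizes here are hypothetical. */
#define MAX_LINES 256

typedef struct {
    size_t tags[MAX_LINES]; /* tags[0] is the most recently used line */
    size_t used;            /* number of valid entries in tags[] */
    size_t num_lines;       /* capacity in cache lines */
    size_t line_bytes;      /* bytes per cache line */
} toy_cache;

static void toy_cache_init(toy_cache *c, size_t num_lines, size_t line_bytes)
{
    assert(num_lines > 0 && num_lines <= MAX_LINES && line_bytes > 0);
    c->used = 0;
    c->num_lines = num_lines;
    c->line_bytes = line_bytes;
}

/* Returns 1 on a hit and 0 on a miss, then moves the line to the front. */
static int toy_cache_access(toy_cache *c, size_t byte_addr)
{
    size_t tag = byte_addr / c->line_bytes;
    size_t i, pos = 0;
    int hit = 0;

    for (i = 0; i < c->used; ++i) {
        if (c->tags[i] == tag) { pos = i; hit = 1; break; }
    }
    if (!hit) {
        if (c->used < c->num_lines) c->used++;
        pos = c->used - 1; /* a fresh slot, or the LRU victim when full */
    }
    /* Shift the more recent entries back by one and put this line first. */
    for (i = pos; i > 0; --i) c->tags[i] = c->tags[i - 1];
    c->tags[0] = tag;
    return hit;
}

/* Hit rate of `passes` in-order sweeps over `range_bytes` of memory,
   touching one address every `stride_bytes`. */
static double sweep_hit_rate(size_t range_bytes, size_t stride_bytes,
                             size_t passes, size_t num_lines,
                             size_t line_bytes)
{
    toy_cache c;
    size_t hits = 0, total = 0, p, addr;

    assert(stride_bytes > 0);
    toy_cache_init(&c, num_lines, line_bytes);
    for (p = 0; p < passes; ++p) {
        for (addr = 0; addr < range_bytes; addr += stride_bytes) {
            hits += (size_t)toy_cache_access(&c, addr);
            ++total;
        }
    }
    return total ? (double)hits / (double)total : 0.0;
}
```

With placeholder sizes of 128 lines of 64 bytes, `sweep_hit_rate(4096, 64, 4, 128, 64)` gives 0.75 (only the first of four sweeps misses), while `sweep_hit_rate(16384, 64, 4, 128, 64)` gives 0.0, because the range is twice the capacity and every line is evicted before it is reused.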

0 Likes

So,

Are we all convinced that the code snippet I posted should result in a very high L1 cache hit rate?

0 Likes
mikewolf_gkd
Journeyman III

I think your code is cache friendly. The L1 cache line is 64 bytes, so the reads/writes of 16 threads (16 floats of 4 bytes each) should be coalesced.
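
The arithmetic behind that can be sketched in plain C on the host. This is only an illustration: the helper name `lines_touched` is made up, the 64-byte line is the figure quoted above rather than something verified here, and elements are assumed to be aligned so that none straddles two lines.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the distinct cache lines touched when `n_items` consecutive
   work-items each access one element of `elem_bytes` bytes, starting at
   byte offset `base`, with `stride_elems` elements between neighbours.
   stride_elems == 0 models a broadcast read such as B[0].
   Assumes aligned elements (hypothetical helper, not a vendor API). */
static size_t lines_touched(size_t base, size_t n_items, size_t stride_elems,
                            size_t elem_bytes, size_t line_bytes)
{
    size_t count = 0, prev = 0, i;

    assert(n_items > 0 && elem_bytes > 0);
    assert(line_bytes % elem_bytes == 0 && base % elem_bytes == 0);

    /* Addresses never decrease, so a new line shows up as a change in
       the line index from one work-item to the next. */
    for (i = 0; i < n_items; ++i) {
        size_t line = (base + i * stride_elems * elem_bytes) / line_bytes;
        if (i == 0 || line != prev) {
            ++count;
            prev = line;
        }
    }
    return count;
}
```

For the kernel in the first post, the read of `B[0]` corresponds to `lines_touched(0, 64, 0, 4, 64)`, which is 1 line for a whole 64-wide wavefront. The write `A[get_local_id(0)]` corresponds to `lines_touched(0, 64, 1, 4, 64)`, which is 4 lines per wavefront, or 1 line per 16 work-items.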

0 Likes