chevydevil
Adept II

L1CacheHit on Radeon 5870 with global memory

Hello, I thought caching of global memory fetches was not possible at this time. How is it possible that gDEBugger and the AMD APP profiling tool report a cache hit rate of nearly fifty percent for a simple 7-point stencil kernel?

Caching has been supported in various forms since SDK 2.3. It started out as explicitly specified, then moved to auto-detection of cacheable accesses, and in SDK 2.6 moved to caching by default with auto-detection of uncached accesses.

Thanks for the answer, I didn't know that. The following kernel has an execution time of approx. 250,000 ns (0.25 ms) for a 128x128x128 problem size on my Radeon 5870:

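#define IX33(i,j,k) ((i) + ((X)*(j)) + ((X)*(Y)*(k)))

__kernel void jacobi(__global float * x,
                     __global float * x0,
                     __global float * aux,
                     const float a)
{
    int i = get_global_id(0);
    int j = get_global_id(1);
    int k = get_global_id(2);

    // IX33 picks up X and Y from this scope to linearise (i,j,k)
    int X = get_global_size(0);
    int Y = get_global_size(1);
    int Z = get_global_size(2);

    // Clamp each axis to its own extent (the posted code used X-1 for all
    // three, which is only valid on a cubic grid such as 128x128x128)
    int minCoord = 0;
    int maxX = X-1;
    int maxY = Y-1;
    int maxZ = Z-1;

    float out;

    // Six neighbour reads from x plus one read from x0.
    // NOTE: the posted code multiplied x0 by ac.x, but no ac is declared;
    // the scalar argument a is assumed here so that the kernel compiles.
    out = ( (x[IX33(max(i-1,minCoord),j,k)] + x[IX33(min(i+1,maxX),j,k)] +
             x[IX33(i,max(j-1,minCoord),k)] + x[IX33(i,min(j+1,maxY),k)] +
             x[IX33(i,j,max(k-1,minCoord))] + x[IX33(i,j,min(k+1,maxZ))]) +
            a * x0[IX33(i,j,k)]) * a;

    // One write per work-item
    aux[IX33(i,j,k)] = out;
}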

So when I calculate the memory throughput as 32 bytes (7 reads + 1 write of 4-byte floats per work-item) * 128 * 128 * 128 / 250,000 ns = 268 GB/s, there has to be something wrong with my calculation, right?
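For reference, the arithmetic in this post can be reproduced with a few lines of Python. This is only a sketch of the calculation as stated above. The constant names are made up for illustration, and the 153.6 GB/s peak figure is an assumption taken from the HD 5870's published memory specification, not something stated in this thread.

```python
# Apparent bandwidth implied by the kernel timing quoted in the post.
BYTES_PER_FLOAT = 4
READS_PER_ITEM = 7        # six neighbours from x plus one value from x0
WRITES_PER_ITEM = 1       # one value stored to aux
GRID = 128 * 128 * 128    # number of work-items
KERNEL_TIME_NS = 250_000  # timing reported in the post
PEAK_GB_PER_S = 153.6     # ASSUMPTION: spec-sheet peak for the HD 5870


def apparent_bandwidth_gb_per_s(work_items, bytes_per_item, time_ns):
    """Bytes per nanosecond, which is numerically the same as GB/s (1e9 B/s)."""
    return work_items * bytes_per_item / time_ns


bytes_per_item = (READS_PER_ITEM + WRITES_PER_ITEM) * BYTES_PER_FLOAT
apparent = apparent_bandwidth_gb_per_s(GRID, bytes_per_item, KERNEL_TIME_NS)

print(f"{bytes_per_item} bytes requested per work-item")
print(f"apparent bandwidth: {apparent:.1f} GB/s")
print(f"ratio to assumed peak: {apparent / PEAK_GB_PER_S:.2f}")
```

Under these assumptions the calculation itself checks out at roughly 268 GB/s, which is well above what the off-chip memory could deliver on its own, so part of the requested traffic must be served from somewhere closer than DRAM.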

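No, it means you got some cache hits. Your figure counts every fetch the kernel issues, and neighbouring work-items request many of the same elements of x, so not all of those bytes have to come from off-chip memory.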
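To put rough numbers on that, the sketch below compares the bytes the kernel requests with the bytes it would need from off-chip memory under an idealised assumption: every element of x and x0 is fetched from DRAM exactly once and every element of aux is written once. Real hardware will not reach this bound, and the profiler's hit-rate counter may be defined differently, so treat it as a plausibility check rather than a model of the 5870's cache. The helper name is hypothetical.

```python
# Requested traffic versus an idealised lower bound on off-chip traffic.
GRID = 128 * 128 * 128    # work-items, one per grid cell
KERNEL_TIME_NS = 250_000  # timing reported in the thread
BYTES_PER_FLOAT = 4


def gb_per_s(total_bytes, time_ns):
    """Bytes per nanosecond is numerically equal to GB/s."""
    return total_bytes / time_ns


# What the kernel asks for: 7 reads + 1 write per work-item.
requested_bytes = GRID * (7 + 1) * BYTES_PER_FLOAT

# ASSUMPTION (perfect reuse): x and x0 are each read from DRAM once per
# element and aux is written once per element, i.e. 3 floats per cell.
unique_bytes = GRID * 3 * BYTES_PER_FLOAT

# Under perfect reuse only 2 of the 7 reads per work-item miss the cache.
ideal_read_hit_rate = 1 - 2 / 7

print(f"requested: {gb_per_s(requested_bytes, KERNEL_TIME_NS):.1f} GB/s")
print(f"ideal off-chip: {gb_per_s(unique_bytes, KERNEL_TIME_NS):.1f} GB/s")
print(f"ideal read hit rate: {ideal_read_hit_rate:.0%}")
```

With perfect reuse the off-chip traffic would be about 100 GB/s, so a measured hit rate somewhere between zero and this ideal is enough to reconcile the 268 GB/s figure with the card's memory bandwidth.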
