lasagna

GPU Load question

Discussion created by lasagna on Apr 14, 2009
Latest reply on Apr 14, 2009 by MicahVillmow

Hello everyone,

I'm currently using Brook+ on my 3870x2 to write a texture synthesis program that takes three 64x64 images as input. I've implemented the first step, but I was a bit disappointed by the performance. When I checked GPU-Z, the GPU load meter was high (80+%) for only about a second and then stayed around 5% for the rest of the execution, which takes about 10-15 minutes!

 

This is my code to call my kernel (CompareCross):

for(int xy = 0; xy < 64; xy++)
{
  for(int xx = 0; xx < 64; xx++)
  {
    for(int yy = 0; yy < 64; yy++)
    {
      for(int yx = 0; yx < 64; yx++)
      {
        CompareCross(int2(xx, xy), stmExemplarX, int2(yx, yy), stmExemplarY, stmExemplarZ, stmOutput);
      }
    }
  }
}

 

I'm wondering why this is the case. Is it the nested loops? I figured that even if my kernel were badly written, I'd still see a lot of GPU activity. If anyone has advice or tips, they'd be greatly appreciated.

 

Kind regards,

Rob
