Nexis
Journeyman III

printf improving GPU performance???

While testing my application, I accidentally discovered that printing my kernel improved the performance of my HD 3870...

I was also able to reproduce this behavior in the simple_matmult example from the CAL SDK. If you run the sample with the default values, the output should be something like this:

Matrix Size     GPU Only        Total
(0256x0256)     65.5152         1.1904

If I put some printf calls at the start of the main() function, I get the following output:

Matrix Size     GPU Only        Total
(0256x0256)     151.7902        0.4275

Is anyone else able to reproduce this behavior?

To achieve this result you actually have to put in quite a few printf calls... I print about 2000 characters, 100 times:

for(int i=0; i<100; i++) printf("11111111111111111111.......11");

You don't actually have to count 2000 characters; just look at the column count in your editor to get 2000 "1"s...
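For anyone who wants to try this without pasting a 2000-character literal, here is a minimal sketch of the same idea. The helper name print_ones is hypothetical (it is not part of the CAL SDK), and it builds the string of "1"s with memset instead of typing them out. It has not been checked against the sample, so treat it as a convenience rather than a confirmed reproduction.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the CAL SDK): writes `reps` copies of a
 * string made of `len` '1' characters to `out`, with no newline, like the
 * printf loop in the post.
 * Returns the number of characters written, or -1 on failure. */
long print_ones(FILE *out, int reps, size_t len)
{
    char *buf = malloc(len + 1);
    if (buf == NULL)
        return -1;

    memset(buf, '1', len);   /* fill the buffer instead of typing 2000 ones */
    buf[len] = '\0';

    long written = 0;
    for (int i = 0; i < reps; i++) {
        int n = fprintf(out, "%s", buf);
        if (n < 0) {         /* output error: clean up and report failure */
            free(buf);
            return -1;
        }
        written += n;
    }

    free(buf);
    return written;
}
```

Calling print_ones(stdout, 100, 2000) as the first statement of main() should be equivalent to the loop above.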

4 Replies

I've verified this and will be filing a report on this. Thanks for bringing this to our attention.

Ok thanks, but can you tell whether this is a bug or whether it really improves performance?

I've noticed it doesn't change anything for big matrices (4096), the peak seems to be around 200 Gflops...

Nexis,

I am wondering how you can say peak performance is 200 Gflops. What was the input for it?

Well, just try the simple_matmult sample from the CAL SDK with a big matrix and you should get around 200 Gflops with an HD 3870.

To modify the matrix size you have to add an argument to the executable like this:

simple_matmult.exe -m 4096
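To relate a Gflops figure to a run time, here is a rough sketch. It assumes the usual operation count for a dense n x n matrix multiply, 2*n^3 floating-point operations (n multiplies and n adds per output element). How simple_matmult actually computes its numbers is not shown in this thread, so the function names and the formula are assumptions and not the sample's own code.

```c
#include <stdio.h>

/* Assumed operation count for an n x n dense matrix multiply:
 * n*n outputs, each needing n multiplies and n adds = 2*n^3 flops.
 * This is the conventional count, not taken from the CAL sample. */
double matmul_flops(unsigned n)
{
    return 2.0 * (double)n * (double)n * (double)n;
}

/* Gflops achieved if an n x n multiply takes `seconds` of wall time. */
double matmul_gflops(unsigned n, double seconds)
{
    return matmul_flops(n) / seconds / 1e9;
}

/* Inverse: the run time implied by a given Gflops rate. */
double matmul_seconds(unsigned n, double gflops)
{
    return matmul_flops(n) / (gflops * 1e9);
}
```

Under that assumption, matmul_seconds(4096, 200.0) comes out to roughly 0.69 seconds per multiply, while a 256x256 multiply is only about 33.5 million operations and would finish in well under a millisecond at the rates quoted earlier in the thread.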
