Loop indices should almost always be integers; it is the one place you cannot afford any rounding error. Floating-point arithmetic is not necessarily associative, as your code demonstrates.
For example, this code never terminates on a CPU:
float j;
float n = 1.0f;  /* declaration not shown in the original snippet; starts at 1 per the output below */
for (j = 0; j < 10000000.0 * 5.0; j += 1.0) {
    if (j == n) {
        printf("n = %f\n", n);
        n = n * 10;
    }
}
output:
n = 1.000000
n = 10.000000
n = 100.000000
n = 1000.000000
n = 10000.000000
n = 100000.000000
n = 1000000.000000
n = 10000000.000000
^C (I had to kill it as it fell into an infinite loop)
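A single-precision mantissa holds only 24 bits, so the loop breaks at 2^24 = 16777216: from there on, j + 1.0 is not representable and rounds back to j, so j never advances. Adding this check to the loop confirms it: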
    if (roundf(j+1.0) == roundf(j)) {
        printf("j = %f\n", j);
    }
output:
...
n = 10000000.000000
j = 16777216.000000
j = 16777216.000000
... forever
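For comparison, here is a minimal sketch of the same loop driven by an integer index, so the counter itself can never stall. The function name run_with_int_index and its limit parameter are made up for illustration; the float value is derived from the index only for the comparison.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical rewrite of the loop above with an integer index.
   Returns how many times the n == index branch was taken. */
int run_with_int_index(uint32_t limit)
{
    float n = 1.0f;
    int hits = 0;
    for (uint32_t i = 0; i < limit; ++i) {
        /* (float)i is exact up to 2^24 and rounded above that,
           but the index i itself always advances by exactly 1. */
        if ((float)i == n) {
            printf("n = %f\n", n);
            n *= 10.0f;
            ++hits;
        }
    }
    return hits;
}
```

Calling run_with_int_index(50000000u) covers the same range as the original loop and returns instead of hanging at 16777216.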
But apart from the numerical fault in your algorithm, big loops are a bad idea on GPUs anyway, for nou's reason and for others. For example, the ALUs don't clock very fast, so a very long loop takes a long time to run; try to spread the work sideways, across parallel work-items, instead. If your GUI runs on the same card, it will also freeze for as long as your kernel is running (this is why Microsoft Windows has a watchdog timer for it).
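As a rough illustration of spreading the work sideways, the sketch below splits one long loop into many short strided loops, one per work-item. It is plain C standing in for a kernel: the names work_item_sum and run_all_work_items are hypothetical, the summing of indices is just placeholder work, and in an OpenCL kernel gid and global_size would come from get_global_id(0) and get_global_size(0).

```c
#include <stdint.h>

/* Hypothetical per-work-item body: handles indices gid, gid + global_size,
   gid + 2*global_size, ... below total. Here the "work" is just summing
   the indices. The index is 64-bit so the stride cannot wrap around. */
uint64_t work_item_sum(uint32_t gid, uint32_t global_size, uint32_t total)
{
    uint64_t sum = 0;
    for (uint64_t i = gid; i < total; i += global_size)
        sum += i;
    return sum;
}

/* Host-side stand-in that runs every work-item in turn and combines the
   partial results; on a GPU the work-items would run in parallel. */
uint64_t run_all_work_items(uint32_t global_size, uint32_t total)
{
    uint64_t sum = 0;
    for (uint32_t gid = 0; gid < global_size; ++gid)
        sum += work_item_sum(gid, global_size, total);
    return sum;
}
```

With 64 work-items and 50000000 iterations in total, each work-item loops fewer than a million times rather than fifty million, and the indices stay exact because they are integers.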