teopemuk
Journeyman III

OpenCL simple program crash

I am trying to use OpenCL for numerical simulation (ODE + PDE), but even the simplest kernel crashes.

It works fine when the number of steps is low, but the following code crashes on my 5750. (Sometimes the GPU driver recovers; sometimes I get a bluescreen or a freeze.) I also tried a 6490M, and it crashes with even fewer steps.

__kernel void ocl_test(__global const float *a, __global const float *b, __global float *c) {
    int i = get_global_id(0);  // one work-item per output element
    c[i] = 0.0;
    for (float j = 0; j < 10000000.0 * 5.0; j += 1.0) {
        c[i] += a[i] * j;
    }
}

3 Replies
nou
Exemplar

A kernel must run for less than 5 seconds on Windows; otherwise the GPU gets reset by the watchdog.

Thanks. Is there any workaround for the watchdog? Does it watch only the primary GPU, and what about Linux? This 5-second restriction is quite limiting.

notzed
Challenger

One should almost always use integers for loop indices; it's one place where you cannot afford any rounding errors. Floating-point addition is not necessarily associative, as your code demonstrates.

E.g. this code never ends on a CPU:

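#include <stdio.h>

int main(void) {
    float n = 1;
    float j;
    /* float loop counter with the same bounds as the kernel */
    for (j = 0; j < 10000000.0 * 5.0; j += 1.0) {
        if (j == n) {
            printf("n = %f\n", n);
            n = n * 10;
        }
    }
    return 0;
}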

output:

n = 1.000000
n = 10.000000
n = 100.000000
n = 1000.000000
n = 10000.000000
n = 100000.000000
n = 1000000.000000
n = 10000000.000000

^C (I had to kill it as it fell into an infinite loop)

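Given that a single-precision mantissa has only 24 bits, the counter breaks at 2^24 = 16777216: from there on, j + 1 rounds back to j, so j never grows. This is confirmed by this addition to the loop:

/* needs #include <math.h> for roundf */
if (roundf(j+1.0) == roundf(j)) {
    printf("j = %f\n", j);
}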

output:

...
n = 10000000.000000
j = 16777216.000000
j = 16777216.000000
... forever
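The stall above can also be shown in a form that terminates, next to an integer-indexed version of the loop. This is only a minimal sketch in plain C: the function names `float_counter_stall_point` and `scaled_sum` are made up for illustration and do not come from the original post, and it assumes IEEE 754 single-precision `float`.

```c
#include <stdint.h>

/* Hypothetical helper: increments a float counter by 1.0f and returns the
   number of successful increments before j + 1.0f stops changing j.
   The search is capped at `limit` so the function always returns. */
uint32_t float_counter_stall_point(uint32_t limit)
{
    float j = 0.0f;
    /* volatile forces the sum to be stored as a float, so extra
       intermediate precision cannot hide the rounding */
    volatile float next;
    for (uint32_t steps = 0; steps < limit; ++steps) {
        next = j + 1.0f;
        if (next == j) {
            return steps;   /* the counter is stuck at this value */
        }
        j = next;
    }
    return limit;           /* no stall seen within the cap */
}

/* Hypothetical integer-indexed version of the accumulation loop:
   the index is exact, and only the product is converted to float. */
float scaled_sum(float a, uint32_t n)
{
    float c = 0.0f;
    for (uint32_t j = 0; j < n; ++j) {
        c += a * (float)j;
    }
    return c;
}
```

With a cap of 50000000 the first function stops at 16777216, which matches the value printed above. The integer-indexed loop always runs exactly `n` iterations, although the float accumulator `c` can still lose precision for large sums.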

But apart from the numerical fault in your algorithm, big loops are a bad idea on GPUs anyway, for nou's reason and for others. For example, the ALUs aren't clocked very fast, so very long loops take a long time to run; try to spread the work sideways across many work-items instead. If your GUI runs on the same card, it will freeze for as long as your kernel is running (this is why Microsoft Windows has a watchdog timer for it).
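One way to read "spread the work sideways" is to cut the long loop into many short index ranges and add up the partial results. The following is a rough CPU-side sketch in plain C, not OpenCL host code: `partial_sum` stands in for what a single work-item would compute, `chunked_sum` stands in for the launch plus the final reduction, and both names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for one work-item: sums a * j for j in [first, first + count). */
float partial_sum(float a, uint32_t first, uint32_t count)
{
    float c = 0.0f;
    for (uint32_t k = 0; k < count; ++k) {
        c += a * (float)(first + k);
    }
    return c;
}

/* Splits n iterations into num_items nearly equal ranges and adds the
   partial results. On a GPU the ranges would run concurrently; here they
   run one after another purely to show the index arithmetic. */
double chunked_sum(float a, uint32_t n, uint32_t num_items)
{
    if (num_items == 0) {
        return 0.0;                 /* nothing to launch */
    }
    double total = 0.0;
    uint32_t base = n / num_items;  /* minimum range length */
    uint32_t extra = n % num_items; /* first `extra` ranges get one more */
    uint32_t first = 0;
    for (uint32_t id = 0; id < num_items; ++id) {
        uint32_t count = base + (id < extra ? 1u : 0u);
        total += partial_sum(a, first, count);
        first += count;
    }
    return total;
}
```

Each range is short, so no single unit of work runs for long. The same split could also be issued as several separate kernel launches to keep each one under the watchdog limit, though that is an assumption about how one might apply it rather than something stated in the thread.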
