Nested for loops GPU crashing

Question asked by shunyo on Jan 9, 2014
Latest reply on Feb 5, 2014 by shunyo


I have a set of vectors and need to compute the scalar triple product for every combination of three of them. I wrote a very simple kernel that runs over a 3-dimensional global range:

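__kernel void compute_triple_prod(__global float4* pl, __global float* res)
{
  // One work-item per (i, j, k) combination of input vectors.
  int i = get_global_id(0);
  int j = get_global_id(1);
  int k = get_global_id(2);
  int gs = get_global_size(0);
  // Flattened output index; assumes the global size is the same in all three dimensions.
  int idx = i + j * gs + k * gs * gs;
  // dot_prod and cross_prod are helper functions not shown here.
  res[idx] = dot_prod(pl[i],cross_prod(pl[j],pl[k]));
}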

The code should produce N^3 outputs for N vectors. I am trying to run it on a set of 50 vectors, but the code crashes. I am using an ATI FirePro V4800. Also, what are some ways to optimize the code?
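
As a point of comparison, the expected output can be sketched on the CPU in plain C, using the same flattened index as the kernel. This is only a reference sketch, not part of the original code: `vec3`, `cross3`, `dot3`, and `triple_prod_reference` are hypothetical names, and it assumes that the kernel's `dot_prod` and `cross_prod` helpers compute the usual dot and cross products on the first three components.

```c
#include <stddef.h>

/* Plain 3-component vector (hypothetical type; the kernel uses float4). */
typedef struct { float x, y, z; } vec3;

/* Standard cross product a x b. */
static vec3 cross3(vec3 a, vec3 b) {
    vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/* Standard dot product a . b. */
static float dot3(vec3 a, vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Fills res[i + j*n + k*n*n] with pl[i] . (pl[j] x pl[k]).
   res must have room for n*n*n floats. */
void triple_prod_reference(const vec3 *pl, size_t n, float *res) {
    for (size_t k = 0; k < n; ++k)
        for (size_t j = 0; j < n; ++j)
            for (size_t i = 0; i < n; ++i)
                res[i + j * n + k * n * n] = dot3(pl[i], cross3(pl[j], pl[k]));
}
```

For N = 50 this writes 50^3 = 125,000 floats (500,000 bytes), so the `res` buffer passed to the kernel has to be at least that large.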