1 Reply Latest reply on May 30, 2011 4:11 PM by himanshu.gautam

    OpenCL compiler causes GPU crash at runtime

    Kamo-chan
      cl::Program program.build(devices) causes GPU crash

      Hi. 

      I have tried searching the forums for a previous post relating to my problem but have found none and so am starting a new topic.

      I'm trying to write an OpenCL kernel that would compute the L2 norm of some query patches in a 1024x1024 grid.

      I have written the kernel, but whenever I run the programme that uses it, my GPU crashes as soon as execution reaches the compilation stage (i.e. program.build(devices)), forcing me to reboot my system manually.

      This has only happened since I changed how the kernel source is read in: previously I used a STRINGIFY macro, but that did not work, so I switched to reading the source from a file with a C++ ifstream and storing the result in a std::string. I have not yet tried to compile this for my CPU, as this implementation is specifically meant for GPU execution.
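      For reference, the file-reading step described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from my programme: the helper name readKernelSource and the error handling are assumptions added for illustration. It only shows one common way to pull a whole file into a std::string and to fail early if the file is missing or empty, before the string is handed to the OpenCL compiler.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original programme): read the whole
// kernel source file into a std::string.
std::string readKernelSource(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        // Fail early instead of passing an empty string to the compiler.
        throw std::runtime_error("cannot open kernel file: " + path);
    }

    std::ostringstream buf;
    buf << in.rdbuf();  // copy the entire stream contents

    std::string src = buf.str();
    if (src.empty()) {
        throw std::runtime_error("kernel file is empty: " + path);
    }
    return src;
}
```

      Checking for an unreadable or empty file here at least rules out the case where the build is handed an empty or truncated source string.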

      Is there anything I am missing here that causes this crash to happen?

      The kernel is being compiled for a GPU (Radeon HD 5830) with the latest Stream SDK; the kernel is posted below for reference.

      EDIT: made some things slightly clearer

      #define DIMBLOCK_X 16
      #define DIMBLOCK_Y 16
      #define TEX_X 1024
      #define TEX_Y 1024
      #define QUERY_X 64
      #define QUERY_Y 64
      #define DIMTHREAD_X (TEX_X - QUERY_X) / DIMBLOCK_X
      #define DIMTHREAD_Y (TEX_Y - QUERY_Y) / DIMBLOCK_Y

      __kernel void L2Diff(__global float *q, __global float *t, __global float *r){
          int x, y, indq, indt, indr;
          unsigned int tx, ty;
          float d;

          //tx = blockIdx.x * blockDim.x + threadIdx.x;
          //ty = blockIdx.y * blockDim.y + threadIdx.y;
          tx = get_global_id(0);
          ty = get_global_id(1);

          indr = tx*(TEX_Y - QUERY_Y) + ty;
          r[indr] = 0.0f;

          for(x = 0; x < QUERY_X; x++){
              for(y = 0; y < QUERY_Y; y++){
                  indt = ((tx+x)*TEX_Y) + (ty + y);
                  indq = (x*QUERY_Y) + y;
                  d = q[indq] - t[indt];
                  r[indr] += d*d;
              }//end for-loop
          }//end for-loop
      }//end L2Diff
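
      To make the intended computation easier to check, here is a rough CPU reference of what the kernel does. It is only a sketch under assumptions: the function name l2DiffReference and the size parameters are made up for illustration, and the indexing simply mirrors the kernel (the same indq, indt and indr formulas). Note that, like the kernel, it accumulates the sum of squared differences and does not take a square root.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (names and parameters are illustrative only).
// For each offset (tx, ty), sums the squared differences between the query
// patch q (qx by qy) and the window of t (texX by texY) starting at (tx, ty).
std::vector<float> l2DiffReference(const std::vector<float>& q,
                                   const std::vector<float>& t,
                                   std::size_t texX, std::size_t texY,
                                   std::size_t qx, std::size_t qy) {
    const std::size_t outX = texX - qx;  // number of offsets along x
    const std::size_t outY = texY - qy;  // number of offsets along y
    std::vector<float> r(outX * outY, 0.0f);

    for (std::size_t tx = 0; tx < outX; ++tx) {
        for (std::size_t ty = 0; ty < outY; ++ty) {
            float sum = 0.0f;  // accumulate locally, store once at the end
            for (std::size_t x = 0; x < qx; ++x) {
                for (std::size_t y = 0; y < qy; ++y) {
                    const float d = q[x * qy + y] - t[(tx + x) * texY + (ty + y)];
                    sum += d * d;
                }
            }
            r[tx * outY + ty] = sum;  // same layout as indr in the kernel
        }
    }
    return r;
}
```

      With texX = texY = 1024 and qx = qy = 64 this should produce the same 960x960 result layout as the kernel, so it could be used to compare outputs once the build succeeds.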