

OpenCL compiler causes GPU crash at runtime

cl::Program causes GPU crash


I have tried searching the forums for a previous post relating to my problem but have found none and so am starting a new topic.

I'm trying to write an OpenCL kernel that computes the L2 norm of the difference between a 64x64 query patch and each patch position in a 1024x1024 grid.

Although I have written the kernel, whenever I run the programme that uses it, my GPU crashes as soon as execution reaches the kernel compilation stage, forcing me to manually reboot my system.

This has only happened since I changed the way the kernel source is read in. Previously it was embedded via a STRINGIFY macro, but that did not work, so I switched to reading it from a file with a C++ ifstream and storing the result in a std::string. I have not yet tried to compile this for my CPU, as this implementation is specifically meant for GPU execution.

Is there anything I am missing here that causes this crash to happen?

The kernel is being compiled for a GPU (Radeon HD 5830). I have the latest Stream SDK installed, and the kernel is posted below for reference.

EDIT: made some things slightly clearer

#define DIMBLOCK_X 16
#define DIMBLOCK_Y 16
#define TEX_X 1024
#define TEX_Y 1024
#define QUERY_X 64
#define QUERY_Y 64
#define DIMTHREAD_X (TEX_X - QUERY_X) / DIMBLOCK_X
#define DIMTHREAD_Y (TEX_Y - QUERY_Y) / DIMBLOCK_Y

__kernel void L2Diff(__global float *q, __global float *t, __global float *r){
    int x, y, indq, indt, indr;
    unsigned int tx, ty;
    float d;

    //tx = blockIdx.x * blockDim.x + threadIdx.x;
    //ty = blockIdx.y * blockDim.y + threadIdx.y;
    tx = get_global_id(0);
    ty = get_global_id(1);

    indr = tx*(TEX_Y - QUERY_Y) + ty;
    r[indr] = 0.0f;

    for(x = 0; x < QUERY_X; x++){
        for(y = 0; y < QUERY_Y; y++){
            indt = ((tx+x)*TEX_Y) + (ty + y);
            indq = (x*QUERY_Y) + y;
            d = q[indq] - t[indt];
            r[indr] += d*d;
        }//end for-loop
    }//end for-loop
}//end L2Diff

1 Reply

I don't see the indq and indt variables initialized anywhere. There might also be an issue with the NDRange.
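To make the NDRange concern concrete, here is a rough host-side sketch that checks whether a given global size keeps the posted kernel's indexing inside its buffers. The host code was not posted, so the buffer sizes are assumptions: t is taken to hold TEX_X*TEX_Y floats and r to hold (TEX_X - QUERY_X)*(TEX_Y - QUERY_Y) floats. The helper name ndrangeInBounds is hypothetical and exists only for illustration.

```cpp
#include <cstddef>

// Constants copied from the posted kernel.
constexpr std::size_t TEX_X = 1024, TEX_Y = 1024;
constexpr std::size_t QUERY_X = 64, QUERY_Y = 64;

// Assumed buffer sizes in floats (the host code was not posted).
constexpr std::size_t T_SIZE = TEX_X * TEX_Y;
constexpr std::size_t R_SIZE = (TEX_X - QUERY_X) * (TEX_Y - QUERY_Y);

// Hypothetical helper: returns true if the work-item with the largest ids in a
// globalX x globalY NDRange stays inside t and r, using the kernel's own
// index formulas for indt and indr.
bool ndrangeInBounds(std::size_t globalX, std::size_t globalY) {
    if (globalX == 0 || globalY == 0) return false;
    const std::size_t tx = globalX - 1;  // largest get_global_id(0)
    const std::size_t ty = globalY - 1;  // largest get_global_id(1)

    // Largest index read from t, reached at x = QUERY_X - 1, y = QUERY_Y - 1.
    const std::size_t maxIndT = (tx + QUERY_X - 1) * TEX_Y + (ty + QUERY_Y - 1);
    // Largest index written to r.
    const std::size_t maxIndR = tx * (TEX_Y - QUERY_Y) + ty;

    return maxIndT < T_SIZE && maxIndR < R_SIZE;
}
```

Under these assumptions, a 960x960 global size (a multiple of the 16x16 block) passes the check, while launching over the full 1024x1024 texture does not, so some work-items would read past the end of t and write past the end of r.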

Are you able to run some other kernel (say, a blank kernel) using the method you are trying? All the SDK samples read their kernels from separate files, so you might get some help from there.
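As a starting point for that test, here is a minimal sketch of the file-reading step on its own, with no OpenCL calls, so the string can be inspected before it ever reaches the compiler. The helper name readKernelSource and the constant kBlankKernel are hypothetical and not taken from the SDK samples.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads an entire kernel file into a std::string and
// throws if the file cannot be opened or turns out to be empty.
std::string readKernelSource(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open kernel file: " + path);
    }
    std::ostringstream buf;
    buf << in.rdbuf();  // copy the whole stream, newlines included
    std::string src = buf.str();
    if (src.empty()) {
        throw std::runtime_error("kernel file is empty: " + path);
    }
    return src;
}

// A blank kernel for a smoke test: if even this fails to build, the problem
// is more likely in the loading or build path than in the L2Diff kernel.
const char* const kBlankKernel =
    "__kernel void blank(__global float *r) { }\n";
```

Printing the returned string and its length before building is a cheap way to confirm that the whole file arrived intact. When the source is handed to cl::Program::Sources, the std::string has to stay alive until the build has finished, and the length passed alongside the pointer should match the string's actual length.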