Using stdcl to fork to gpu

Discussion created by krishnan on Jan 19, 2011
Latest reply on Jan 22, 2011 by krishnan

Hi Folks

I've recently started using OpenCL and Brown Deer's stdcl. I wrote some code based on the NBody tutorial on the Brown Deer website and on the AMD webinars, and things haven't been going according to plan. It seems to me that the processing isn't shifting to the GPU at all, and my feeling is that the problem lies in how I invoke clfork() (which is part of stdcl).

[Edit: Code formatted at the bottom]

Now, this code should at the very least run and display "This kernel is executing" as many times as there are threads. However, I don't see it displayed even once, which leads me to believe that the code isn't even going into the kernel.

Is my diagnosis right? If so, any thoughts as to how I can fix this?

Thanks in advance for any help.

----
//Some initial declarations
int i,j,ops;
int step;
int N = 256;
int nstep = 10;
int nthread = 64;
float dx = 1.0/N;
float dt = 0.25*dx*dx;

//allocate memory for 2 N*N arrays.
cl_float* A = (cl_float*)clmalloc(stdgpu,N*N*sizeof(cl_float),0);
cl_float* B = (cl_float*)clmalloc(stdgpu,N*N*sizeof(cl_float),0);

/* Initialize A to some values */
/* ... */
/* End initialization */

// Create a handle to the kernel, `fd_par.cl'
void* h = clopen(stdgpu,"fd_par.cl",CLLD_NOW);
cl_kernel krn = clsym(stdgpu,h,"fd_par.cl",CLLD_NOW);

// Set the range to the size of A
clndrange_t ndr = clndrange_init1d(0,N*N,nthread);

clmsync(stdgpu,0,A,CL_MEM_DEVICE|CL_EVENT_NOWAIT);

for(step=0; step<nstep; step++){
    clarg_set_global(krn,0,A);
    clarg_set_global(krn,1,B);
    clfork(stdgpu,0,krn,&ndr,CL_EVENT_NOWAIT);
    clmsync(stdgpu,0,B,CL_MEM_HOST|CL_EVENT_NOWAIT);
}
----

My kernel is:

----
__kernel void heat_kern( __global float* A, __global float* B, ){
    //test if kernel is executing
    printf("This kernel is executing\n");
    /* Some computation */
    /* ... */
    /* End computation */
}
----
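To rule out the launch configuration as the culprit, here is a small standalone sketch (plain C, no stdcl calls) of the arithmetic behind the clndrange_init1d(0,N*N,nthread) call in the host code. The helper names work_group_count and check_posted_range are made up for illustration and are not part of stdcl. The only assumption is the usual OpenCL 1.x rule that the global size must be an exact multiple of the work-group size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of stdcl): returns the number of
 * work-groups in a 1-D range, or 0 if the local size is zero or does
 * not divide the global size evenly. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    if (local_size == 0 || global_size % local_size != 0)
        return 0;
    return global_size / local_size;
}

/* Checks the values used in the host code: N = 256, nthread = 64. */
void check_posted_range(void)
{
    const size_t N = 256;
    const size_t nthread = 64;

    /* 256*256 = 65536 work-items in groups of 64 gives 1024 groups. */
    assert(work_group_count(N * N, nthread) == 1024);
}
```

With N = 256 and nthread = 64 the range divides evenly (65536 work-items in 1024 groups of 64), so as far as I can tell the range itself is not what stops the kernel from running.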
