problem with clEnqueueNDRangeKernel

Discussion created by Roman85 on Mar 12, 2011
Latest reply on Mar 24, 2011 by Roman85
problem with clEnqueueNDRangeKernel, memory leak

I use the AMD OpenCL driver version 11-2_xp32_dd_ccc_ocl and the Java binding for OpenCL, JOCL-0.1.4-beta1. I am developing a Java program that uses OpenCL to accelerate massive calculations in a virtual physical experiment, and I need to launch a kernel many times (a few launches for each experiment step). Thus the number of calls to clEnqueueNDRangeKernel can be arbitrarily large during the life of the program.

But I discovered that a memory leak occurs every time clEnqueueNDRangeKernel runs, about 300 bytes per call. This leads to a leak of about 10 MB per minute of computation. The calculations are carried out correctly, but the program crashes after several thousand iterations with an out-of-memory error. Using a memory profiler, I determined that the leak occurs on the OpenCL side; the Java heap usage stays constant. An error with code CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST also occurred after tens of thousands of iterations. How can I resolve these two problems? And where is the problem: in my code, the driver, or the Java binding for OpenCL?
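As a rough sanity check of these figures, here is a minimal sketch of the arithmetic. The class LeakEstimate and its method names are made up for illustration, and the inputs are only the approximate values quoted above. At about 300 bytes per call, 10 MB per minute corresponds to roughly 35,000 kernel launches per minute.

```java
// Rough sanity check of the leak figures quoted above.
// LeakEstimate and its methods are hypothetical names, for illustration only.
final class LeakEstimate {

    // Number of kernel launches per minute implied by a given leak rate.
    static long callsPerMinute(long bytesPerCall, long leakedBytesPerMinute) {
        return leakedBytesPerMinute / bytesPerCall;
    }

    // Total native memory leaked after a given number of launches.
    static long leakedBytes(long bytesPerCall, long calls) {
        return bytesPerCall * calls;
    }

    public static void main(String[] args) {
        long bytesPerCall = 300;             // approximate value from the profiler
        long perMinute = 10L * 1024 * 1024;  // about 10 MB per minute
        long calls = callsPerMinute(bytesPerCall, perMinute);
        System.out.println("launches per minute: " + calls);
        System.out.println("launches per second: " + calls / 60);
        System.out.println("bytes leaked after 1,000,000 launches: "
                + leakedBytes(bytesPerCall, 1_000_000));
    }
}
```

So the leak rate is consistent with a few hundred launches per second, each leaving a small fixed amount of native memory behind.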

I created a simple test to demonstrate this problem. The memory leak occurs, and the test crashes at 700000 iterations with a fatal Java internal error.

 

////JAVA code
public static void main(String args[]){
    FileUtils.setJavaLibraryPath("geo.controller.logic.ocl.dlls");

    // Obtain the platform IDs and initialize the context properties
    System.out.println("Obtaining platform...");
    cl_platform_id platforms[] = new cl_platform_id[1];
    clGetPlatformIDs(platforms.length, platforms, null);
    cl_context_properties contextProperties = new cl_context_properties();
    contextProperties.addProperty(CL_CONTEXT_PLATFORM, platforms[0]);

    // Create an OpenCL context on a GPU device
    cl_context context = clCreateContextFromType(
        contextProperties, CL_DEVICE_TYPE_GPU, null, null, null);

    // Enable exceptions and subsequently omit error checks in this sample
    CL.setExceptionsEnabled(true);

    cl_program fake_programm = clCreateProgramWithSource(context, 1, new String[]{KERNEL_SRC}, null, null);
    clBuildProgram(fake_programm, 0, null, null, null, null);
    cl_kernel fakeKernel = clCreateKernel(fake_programm, "fakeKernel", null);

    long numBytes[] = new long[1];

    // Enable exceptions and subsequently omit error checks in this sample
    CL.setExceptionsEnabled(true);

    // Get the list of GPU devices associated with the context
    clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, null, numBytes);

    // Obtain the cl_device_id for the first device
    int numDevices = (int) numBytes[0] / Sizeof.cl_device_id;
    cl_device_id devices[] = new cl_device_id[numDevices];
    clGetContextInfo(context, CL_CONTEXT_DEVICES, numBytes[0], Pointer.to(devices), null);

    cl_command_queue oclComands = clCreateCommandQueue(context, devices[0], 0, null);

    for(int i = 0; i < 1000000; i ++){
        if(i%1000 == 0)
            System.out.println("iterations = " + i);
        clEnqueueNDRangeKernel(oclComands, fakeKernel, 1, null, new long[]{0}, null, 0, null, null);
        clFinish(oclComands);
    }
}

/////Kernel code
__kernel void fakeKernel(){
}
