3 Replies Latest reply on Mar 24, 2011 3:40 PM by Roman85

    problem with clEnqueueNDRangeKernel

    Roman85
      problem with clEnqueueNDRangeKernel, memory leak

      I use the AMD OpenCL driver version 11-2_xp32_dd_ccc_ocl and the Java binding for OpenCL, JOCL-0.1.4-beta1. I am developing a Java program that uses OpenCL to accelerate massive calculations in a virtual physics experiment, and I need to launch kernels many times (a few launches for each experiment step).

      Thus the number of calls to clEnqueueNDRangeKernel can be arbitrarily large during the life of the program.

      But I discovered that a memory leak occurs every time clEnqueueNDRangeKernel runs, about 300 bytes per call. This adds up to about 10 MB per minute of computation. The calculations are carried out correctly, but the program crashes after several thousand iterations with an out-of-memory error. Using a memory profiler, I determined that the leak occurs on the OpenCL side; the Java heap usage stays constant. An error with code CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST also occurred after tens of thousands of iterations. How can I resolve these two problems? And where does the problem lie: in my code, in the driver, or in the Java binding for OpenCL?
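As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Java. The class and method names here are hypothetical and only illustrate the calculation; the 300 bytes per call and 10 MB per minute come from the post, and 10 MB is assumed to mean 10 * 1024 * 1024 bytes.

```java
// Hypothetical helper, not part of JOCL or the original test.
class LeakEstimate {
    /** Number of kernel launches per minute implied by the two reported rates. */
    static long impliedCallsPerMinute(long leakBytesPerMinute, long leakBytesPerCall) {
        return leakBytesPerMinute / leakBytesPerCall;
    }

    /** Total native bytes leaked after the given number of calls. */
    static long leakedBytes(long calls, long leakBytesPerCall) {
        return calls * leakBytesPerCall;
    }

    public static void main(String[] args) {
        long perCall = 300;                  // bytes per call, as reported
        long perMinute = 10L * 1024 * 1024;  // 10 MB per minute, as reported (assumed MiB)
        System.out.println("implied calls per minute = "
                + impliedCallsPerMinute(perMinute, perCall));
        System.out.println("bytes leaked after 700000 calls = "
                + leakedBytes(700_000, perCall));
    }
}
```

Under these assumptions the two reported rates correspond to roughly 35 thousand launches per minute, and 700000 launches would leak about 210 million bytes.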

      I created a simple test to demonstrate this problem. The memory leak occurs, and the test crashes at 700000 iterations with a fatal Java internal error.

       

      //// Java code
      public static void main(String args[]){
          FileUtils.setJavaLibraryPath("geo.controller.logic.ocl.dlls");

          // Obtain the platform IDs and initialize the context properties
          System.out.println("Obtaining platform...");
          cl_platform_id platforms[] = new cl_platform_id[1];
          clGetPlatformIDs(platforms.length, platforms, null);
          cl_context_properties contextProperties = new cl_context_properties();
          contextProperties.addProperty(CL_CONTEXT_PLATFORM, platforms[0]);

          // Create an OpenCL context on a GPU device
          cl_context context = clCreateContextFromType(
              contextProperties, CL_DEVICE_TYPE_GPU, null, null, null);

          // Enable exceptions and subsequently omit error checks in this sample
          CL.setExceptionsEnabled(true);
          cl_program fake_programm = clCreateProgramWithSource(context, 1, new String[]{KERNEL_SRC}, null, null);
          clBuildProgram(fake_programm, 0, null, null, null, null);
          cl_kernel fakeKernel = clCreateKernel(fake_programm, "fakeKernel", null);
          long numBytes[] = new long[1];

          // Enable exceptions and subsequently omit error checks in this sample
          CL.setExceptionsEnabled(true);

          // Get the list of GPU devices associated with the context
          clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, null, numBytes);

          // Obtain the cl_device_id for the first device
          int numDevices = (int) numBytes[0] / Sizeof.cl_device_id;
          cl_device_id devices[] = new cl_device_id[numDevices];
          clGetContextInfo(context, CL_CONTEXT_DEVICES, numBytes[0], Pointer.to(devices), null);
          cl_command_queue oclComands = clCreateCommandQueue(context, devices[0], 0, null);

          for(int i = 0; i < 1000000; i ++){
              if(i%1000 == 0)
                  System.out.println("iterations = " + i);
              clEnqueueNDRangeKernel(oclComands, fakeKernel, 1, null, new long[]{0}, null, 0, null, null);
              clFinish(oclComands);
          }
      }

      //// Kernel code
      __kernel void fakeKernel(){
      }

        • problem with clEnqueueNDRangeKernel
          himanshu.gautam

          Thanks for reporting the issue.

          Can you post the output of the memory leak tool?

          Please also tell us your system configuration: CPU, GPU, SDK, driver, OS.

            • problem with clEnqueueNDRangeKernel
              gfrostamd

              @Roman85 I took your code (the code below is my complete Java class) and executed it on a 64-bit Linux platform using JOCL1.5.

              I do see memory grow over time.  Then it settles down.  What profiler were you using?
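One way to see whether growth settles, and whether it is on the Java heap or elsewhere in the process, is to log both figures alongside the iteration counter. The following is only a sketch: the class and method names are hypothetical, the `com.sun.management.OperatingSystemMXBean` interface is assumed to be available (as on HotSpot-based JVMs), and committed virtual memory is used as a coarse stand-in for native memory use rather than an exact measure.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper for logging memory next to the iteration count.
class MemorySampler {
    /** Bytes currently in use on the Java heap. */
    static long heapUsed() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Committed virtual memory of the whole process, or -1 if the JVM does not expose it. */
    static long processCommitted() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os)
                    .getCommittedVirtualMemorySize();
        }
        return -1;
    }

    /** One log line combining the iteration count with both memory figures. */
    static String sample(int iteration) {
        return "iterations = " + iteration
                + ", heap = " + heapUsed()
                + ", process = " + processCommitted();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3000; i++) {
            // The clEnqueueNDRangeKernel / clFinish pair from the test would go here.
            if (i % 1000 == 0)
                System.out.println(sample(i));
        }
    }
}
```

If the heap figure stays flat while the process figure keeps climbing, the growth is outside the Java heap; if both level off after a warm-up period, the behaviour matches memory that grows and then settles.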

              You might also consider asking on the JOCL forums to see if anyone else has seen this.

              Also consider trying your code with a later version of JOCL.

              Here is the code that I tried.

               

               

              import org.jocl.*;

              class Bad{
                  static String KERNEL_SRC= "__kernel void fakeKernel(){}";

                  public static void main(String args[]){
                      // Obtain the platform IDs and initialize the context properties
                      System.out.println("Obtaining platform...");
                      cl_platform_id platforms[] = new cl_platform_id[1];
                      CL.clGetPlatformIDs(platforms.length, platforms, null);
                      cl_context_properties contextProperties = new cl_context_properties();
                      contextProperties.addProperty(CL.CL_CONTEXT_PLATFORM, platforms[0]);

                      // Create an OpenCL context on a GPU device
                      cl_context context = CL.clCreateContextFromType(
                          contextProperties, CL.CL_DEVICE_TYPE_GPU, null, null, null);

                      // Enable exceptions and subsequently omit error checks in this sample
                      CL.setExceptionsEnabled(true);
                      cl_program fake_programm = CL.clCreateProgramWithSource(context, 1, new String[]{KERNEL_SRC}, null, null);
                      CL.clBuildProgram(fake_programm, 0, null, null, null, null);
                      cl_kernel fakeKernel = CL.clCreateKernel(fake_programm, "fakeKernel", null);
                      long numBytes[] = new long[1];

                      // Enable exceptions and subsequently omit error checks in this sample
                      CL.setExceptionsEnabled(true);

                      // Get the list of GPU devices associated with the context
                      CL.clGetContextInfo(context, CL.CL_CONTEXT_DEVICES, 0, null, numBytes);

                      // Obtain the cl_device_id for the first device
                      int numDevices = (int) numBytes[0] / Sizeof.cl_device_id;
                      cl_device_id devices[] = new cl_device_id[numDevices];
                      CL.clGetContextInfo(context, CL.CL_CONTEXT_DEVICES, numBytes[0], Pointer.to(devices), null);

                      cl_command_queue oclComands = CL.clCreateCommandQueue(context, devices[0], 0, null);

                      for(int i = 0; i < 1000000; i ++){
                          if(i%1000 == 0)
                              System.out.println("iterations = " + i);
                          CL.clEnqueueNDRangeKernel(oclComands, fakeKernel, 1, null, new long[]{0}, null, 0, null, null);
                          CL.clFinish(oclComands);
                      }
                  }
              }