3 Replies Latest reply on Aug 16, 2012 4:03 AM by Wenju

    OpenCL program is getting killed?

    gopal_hc

      I am developing an OpenCL program that uses multiple GPUs.

      I have to launch a very large number of threads. To make the best use of the GPU, I launch only a limited number of threads per kernel launch, based on the resources the kernel uses (registers and local memory). So I launch my kernel n times in total (n = N/M), where N is the total number of threads I have to launch, M is the number of threads that can be launched at a time, and n is the number of kernel launches needed. But the program gets killed after a large number of iterations.
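A note on n = N/M: in integer arithmetic this rounds down, so when N is not a multiple of M the last partial batch would be skipped. The following is a minimal sketch of the batch arithmetic only, with no OpenCL calls. The names `batch_count`, `batch_offset`, and `batch_items` are hypothetical and not taken from the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Number of launches needed to cover total_items (N) when each launch
 * handles at most batch_size (M) items. Rounds up, so a final partial
 * batch is counted. Written without N + M - 1 to avoid overflow. */
static size_t batch_count(size_t total_items, size_t batch_size)
{
    assert(batch_size > 0);
    return total_items / batch_size + (total_items % batch_size != 0);
}

/* Index of the first item handled by launch number i. */
static size_t batch_offset(size_t i, size_t batch_size)
{
    return i * batch_size;
}

/* Number of items handled by launch number i. Every launch is full
 * except possibly the last one. */
static size_t batch_items(size_t i, size_t total_items, size_t batch_size)
{
    size_t offset = batch_offset(i, batch_size);
    assert(offset < total_items);
    size_t remaining = total_items - offset;
    return remaining < batch_size ? remaining : batch_size;
}
```

For example, with N = 1000 and M = 256 this gives four launches, the last one covering 232 items. In a host loop, the offset could be passed as the `global_work_offset` argument of clEnqueueNDRangeKernel (accepted from OpenCL 1.1 onward), assuming the kernel indexes its data with get_global_id().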

       

      Assume I have a loop:

      for (i = 0; i < n; i++) {
          /* ... write data from CPU to GPU device using clEnqueueWriteBuffer() ... */
          /* ... launch the kernel using clEnqueueNDRangeKernel() ... */
          /* ... wait for all commands in the command queue to finish using clFinish() ... */
          /* ... read data from GPU to CPU using clEnqueueReadBuffer() ... */
      }

       

      Why does it get killed when n is greater than 600?

       

      I am using an Nvidia GPU device: GeForce GTX 295
      Platform version: OpenCL 1.1
      Operating system: Ubuntu 11.04

       

      Thanks in advance.

        • Re: OpenCL program is getting killed?
          Wenju

          d_new_input_2d = (cl_mem *)malloc( sizeof(cl_mem) * num_devices);     // line 40

          d_new_input_2d[icount] = clCreateBuffer(context, CL_MEM_READ_WRITE,
                                      max_size * 5 * sizeof(unsigned int), NULL, &ret);      // line 55

          Each buffer holds max_size * 5 * sizeof(unsigned int) bytes.

          So how does max_size * 5 * sizeof(unsigned int) compare with sizeof(cl_mem): is it greater, equal, or smaller?

          To be honest, you need to be careful about memory usage, especially since each loop iteration in your code allocates memory.

          Look at line 135: kernel[icount] = clCreateKernel(program, "Kernel_name", &ret); Is it the same kernel for every device?

          I'm not sure what caused this. What is the error message?

            • Re: OpenCL program is getting killed?
              gopal_hc

              Hi Wenju,

               

              Look at line 135: kernel[icount] = clCreateKernel(program, "Kernel_name", &ret); Is it the same kernel for every device?

              Yes, the kernel is the same for both devices.

              My program runs and gives correct results for fewer than 600 iterations.

              But after approximately 600 iterations it prints the message "Killed". Why?

                • Re: OpenCL program is getting killed?
                  Wenju

                  I mean that you are wasting a lot of memory, so you should optimize your code. For example:

                  for (j = 0; j < iteration; j++) {
                      for (i = 0; i < device_number; i++) {
                          // all operations for one device: create one buffer, enqueue commands, etc.
                          // remember to release the memory resources
                      }
                  }

                  Just give it a try. Good luck.
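One way to check whether host memory really grows with each iteration is to log the process's peak resident set size inside the launch loop. On Linux, a bare "Killed" message usually means the process received SIGKILL, and a common cause is the kernel's out-of-memory killer, though nothing in this thread confirms that is what happened here. The sketch below assumes Linux, where getrusage() reports `ru_maxrss` in kilobytes. The helper names `peak_rss_kb` and `log_peak_rss` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Peak resident set size of the calling process so far.
 * On Linux the unit is kilobytes. Returns -1 if getrusage() fails. */
static long peak_rss_kb(void)
{
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    return usage.ru_maxrss;
}

/* Sample the peak RSS and print it to stderr once every `every`
 * iterations. Returns the sampled value so the caller can compare
 * it across iterations. */
static long log_peak_rss(int iteration, int every)
{
    assert(every > 0);
    long kb = peak_rss_kb();
    if (iteration % every == 0)
        fprintf(stderr, "iteration %d: peak RSS %ld kB\n", iteration, kb);
    return kb;
}
```

Calling `log_peak_rss(j, 50)` at the end of each outer iteration would show whether the value keeps climbing. If it does, something created inside the loop (a buffer, kernel, or host allocation) is probably never released. This only measures host memory, so exhaustion of device memory would not necessarily show up here.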