8 Replies Latest reply on Aug 16, 2012 5:17 PM by Skysnake

    OpenCL kernel crashes with -5 Error

    gopal_hc

      I am developing an OpenCL program that uses multiple GPUs.

       

      I have to launch a very large number of threads. I launch only a subset of them per kernel call, sized according to the kernel's resource usage (registers and local memory) so as to best utilize the GPU. So I launch my kernel n times in total (n = N/M), where N is the total number of threads I have to launch and M is the number of threads I can launch at a time.
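The batching described above can be sketched in plain C. This is only an illustration: `launch_count` and `launch_size` are hypothetical helper names, not part of OpenCL, and the sketch assumes N need not be an exact multiple of M, so it rounds the launch count up and lets the last launch be smaller.

```c
#include <assert.h>
#include <stddef.h>

/* Number of kernel launches needed to cover total_items work-items when
 * at most chunk_items can be launched at once (n = N/M, rounded up). */
size_t launch_count(size_t total_items, size_t chunk_items)
{
    assert(chunk_items > 0);
    return total_items / chunk_items + (total_items % chunk_items != 0);
}

/* Number of work-items in launch i (0-based). Every launch has
 * chunk_items work-items except possibly the last, which takes the rest. */
size_t launch_size(size_t total_items, size_t chunk_items, size_t i)
{
    assert(chunk_items > 0);
    size_t offset = i * chunk_items;   /* first work-item of this launch */
    assert(offset < total_items);
    size_t remaining = total_items - offset;
    return remaining < chunk_items ? remaining : chunk_items;
}
```

With N = 1000 and M = 256, for example, this gives 4 launches, the last one covering 232 work-items. In OpenCL 1.1 the start of each batch (i * M) could be passed as the `global_work_offset` argument of clEnqueueNDRangeKernel(); if a local work size is given, the global size of each launch must also be a multiple of it.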

       

      Launching the kernel for a small number of iterations succeeds, but with more iterations it fails: the kernel crashes with error -5.

       

      I am getting the "CL_OUT_OF_RESOURCES" error when calling the clEnqueueReadBuffer() function. I take this to mean that I am accessing GPU memory out of bounds, but the same code works fine for a small number of iterations.
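For reference, -5 is the numeric value of CL_OUT_OF_RESOURCES in the standard OpenCL header. A minimal sketch of a lookup for the first few runtime error codes follows. `cl_error_name` is a hypothetical helper, not an OpenCL function, and the sketch uses a plain `int` with literal values so that it compiles without the OpenCL headers. A real program would take a `cl_int` and use the macros from `CL/cl.h`.

```c
/* Map a few OpenCL status codes to their names (values as in CL/cl.h).
 * Only codes 0 to -6 are listed here; everything else is reported as
 * unlisted rather than guessed. */
const char *cl_error_name(int err)
{
    switch (err) {
    case  0: return "CL_SUCCESS";
    case -1: return "CL_DEVICE_NOT_FOUND";
    case -2: return "CL_DEVICE_NOT_AVAILABLE";
    case -3: return "CL_COMPILER_NOT_AVAILABLE";
    case -4: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5: return "CL_OUT_OF_RESOURCES";
    case -6: return "CL_OUT_OF_HOST_MEMORY";
    default: return "UNLISTED_ERROR";
    }
}
```

Checking and printing the return code of every call in the loop (the write, the kernel launch, the finish and the read) would show which call first reports the failure. An error that surfaces at clEnqueueReadBuffer() may have originated in an earlier, asynchronously executed command.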

       

      I am using Nvidia GPU device :: GeForce GTX 295

                      Platform Version    ::  OpenCL 1.1

                      Operating System  ::  Ubuntu 11.04

       

       

      My questions are:

           What could be the reasons for the error messages and crashes?

           What are the causes of the "CL_OUT_OF_RESOURCES" error?

       

      Waiting for your quick reply.

       

      Thanks in advance.

        • OpenCL kernel crashes with -5 Error
          nou

          Maybe add a clFinish() call after every 1000-10000 clEnqueueNDRangeKernel() calls?

            • Re: OpenCL kernel crashes with -5 Error
              gopal_hc

              Actually, I am creating and releasing the input buffer for each call of clEnqueueNDRangeKernel(), and I call clWaitForEvents() after each clEnqueueNDRangeKernel() call to wait for the kernel to finish executing.

               

              Is this the correct way?

              • Re: OpenCL kernel crashes with -5 Error
                gopal_hc

                Hi nou,

                Thanks once again for your quick reply.

                My program now runs without crashing with the -5 error, but it gets killed after 600 iterations.

                Assume I have a loop:

                for (i = 0; i < n; i++) {
                    /* write data from CPU to GPU device using clEnqueueWriteBuffer() */
                    /* launch the kernel using clEnqueueNDRangeKernel() */
                    /* wait for all commands in the command queue to finish using clFinish() */
                    /* read data back from GPU to CPU using clEnqueueReadBuffer() */
                }

                where n = N/M is the number of iterations needed to launch the kernel, N is the total number of threads I have to launch, and M is the number of threads I can launch at a time.

                 

                Why is it getting killed once n is greater than 600?

                  • Re: OpenCL kernel crashes with -5 Error
                    Skysnake

                    I have the same problem on nVidia GPUs with a PCI-E bandwidth test that I have written (link).

                     

                    It seems to be a problem in the nVidia driver. I tried to report this bug to nVidia, but since their forum and related services have been offline since their hack, I am not able to report it.

                     

                    The AMD driver is much better in this respect. It runs and runs and runs without any problems in my test suite.

                     

                    By the way, yes, on nVidia GPUs you can solve the problem for clEnqueueReadBuffer and clEnqueueWriteBuffer with clFinish, but you cannot solve it that way for clEnqueueMapBuffer. I can only say this for pinned memory.

                    Just as a hint: use pinned memory and clEnqueueMapBuffer. It is MUCH faster.