
OpenCL

j0hnny
Journeyman III

opencl clEnqueueNDRangeKernel caused memory leak

Hi all,

I am seeing a memory leak when I call clEnqueueNDRangeKernel in an infinite loop on an AMD card. The case below uses an empty kernel, but I don't think the empty kernel is the cause: I have also used matrix add/multiply kernels, and although their results were correct, the memory leak still occurred.

My code looks like this:

-----------------------------------------code  start------------------------------------------------------------------
    do {
        for (i = 0; i < ciDeviceCount; ++i) {
            ciErrNum = clEnqueueNDRangeKernel(commandQueue, matrixEqual, 2, 0, globalWorkSize, localWorkSize,
                   0, NULL, &GPUExecution);
            oclCheckError(ciErrNum, CL_SUCCESS);
            ciErrNum = clFinish(commandQueue);
            oclCheckError(ciErrNum, CL_SUCCESS);
        }
    } while(1);
-----------------------------------------code  end----------------------------------------------------------------------

My kernel is an empty kernel:

-----------------------------------------code  start------------------------------------------------------------------
    __kernel void
    matrixEqual(int m, int n)
    {
        m = n;
    }
-----------------------------------------code  end----------------------------------------------------------------------

After many iterations, my process's memory consumption rises to 7 GB and the process is finally killed by the Linux kernel.

/var/log/syslog shows that the kernel killed the process because the system ran out of memory:

     snd kernel: [ 8779.289654] Out of memory: Kill process 5239 (myprocess) score 917 or sacrifice child

OS version:

Linux snd 4.10.0-42-generic #46~16.04.1-Ubuntu SMP Mon Dec 4 15:57:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Part of clinfo:

Platform Version: OpenCL 2.0 AMD-APP (2482.3)

  Platform ID: 0x7f19c67e1098

  Name: Ellesmere

  Vendor: Advanced Micro Devices, Inc.

  Device OpenCL C version: OpenCL C 1.2

  Driver version: 2482.3

  Profile: FULL_PROFILE

  Version: OpenCL 1.2 AMD-APP (2482.3)

 

  Device Type: CL_DEVICE_TYPE_GPU

  Vendor ID: 1002h

  Board name: Radeon RX 570 Series

Hardware:

2 cores Intel(R) Celeron(R) CPU G3930 @ 2.90GHz

8 cards Radeon RX 570 Series

I have also attached the full code to this post.

dipak
Big Boss

Hi,

Thank you for reporting this.

The attached code seems incomplete. Please share a complete repro and mention the driver version (for example, AMDGPU-Pro X.Y). If the driver is not the latest one, please try the latest driver and share what you observe.

From the above code snippet, it looks like you haven't released the event object (GPUExecution) returned by each clEnqueueNDRangeKernel call. Unreleased event objects can also cause a memory leak.

Regards,


Hi dipak,

Thanks for your kind reply.

I tried your suggestion and removed GPUExecution from the clEnqueueNDRangeKernel call, but the memory leak still happens.

I want to try your second suggestion and check and update the AMDGPU-Pro version. Could you please help with the following:

1. How do I check my AMD GPU driver version? I did not install it myself, and dpkg -l amdgpu-pro shows that no amdgpu-pro package is installed, but I can see its folder at /opt/amdgpu-pro.

2. The newest amdgpu-pro I could find is 18.20, on the page "Radeon™ Software for Linux® 18.20 Release Notes". It says it supports only OpenCL 1.2, while my Radeon RX 570 Series card is listed as supporting OpenCL 2.0. Where can I find a driver that supports OpenCL 2.0? Do I need to install ROCm for OpenCL 2.0 development?

Below is the amdgpu-pro 18.20 support list:

pastedImage_7.png

Below is the entry for the RX 570 Series:

pastedImage_0.png

       Thanks a lot again.


Currently, AMDGPU-Pro supports OpenCL 1.2 only. That is why the RX 570 is listed as an OpenCL 1.2 device even though the hardware supports OpenCL 2.0.

For OpenCL 2.0 kernel programming, you can choose ROCm. It provides OpenCL 2.0 compatible kernel language support with an OpenCL 1.2 compatible runtime.

Also, from the clinfo version information above, it looks like the installed driver is an older one. Please install the latest amdgpu-pro 18.20 and check again. If the issue is still reproducible, please share a complete repro.

Regards,
