1 Reply Latest reply on Oct 23, 2018 12:22 AM by faylyn101

    [Solved] clinfo reports error -33 of "Global free memory (AMD)"  by amdgpu/pro

    faylyn101

      Issue Description:

      At first, the Einstein@Home GPU app failed immediately with OpenCL error -6 (CL_OUT_OF_HOST_MEMORY) while creating the command queue:

       

      16:46:13 (32530): [normal]: This Einstein@home App was built at: Jan 16 2017 08:09:16

      16:46:13 (32530): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati'.

      ...

      16:46:13 (32530): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86

      16:46:13 (32530): [debug]: glibc version/release: 2.27/stable

      16:46:13 (32530): [debug]: Set up communication with graphics process.

      boinc_get_opencl_ids returned [0x183adf0 , 0x7f2e8c080190]

      Using OpenCL platform provided by: Advanced Micro Devices, Inc.

      Using OpenCL device "Pitcairn" by: Advanced Micro Devices, Inc.

      Max allocation limit: 1345394688

      Global mem size: 1805643776

      Couldn't create OpenCL command queue (error: -6)!

      OpenCL shutdown complete!

      initialize_ocl returned error [2013]

      OCL context null

      OCL queue null

      Error generating generic FFT context object [5]

      16:46:15 (32530): [CRITICAL]: ERROR: MAIN() returned with error '5'
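The two sizes reported in the log above can be put side by side with a little shell arithmetic. This is only an illustrative sketch: `pct_of` and `to_mib` are hypothetical helper names, not part of any tool, and the calculation does not by itself explain the -6 error.

```shell
# Integer percentage of the first byte count relative to the second.
pct_of() {
    echo $(( $1 * 100 / $2 ))
}

# Whole MiB contained in a byte count (rounded down).
to_mib() {
    echo $(( $1 / 1048576 ))
}

pct_of 1345394688 1805643776   # "Max allocation limit" vs "Global mem size"
to_mib 1805643776              # "Global mem size" in MiB
```

With the values from the log, the first call prints 74 and the second prints 1721, so the largest single allocation reported was roughly three quarters of the global memory the app saw.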

       

      Then I ran clinfo to check for possible incompatibilities:

       

      ~$ sudo clinfo

      Number of platforms 1

      Platform Name AMD Accelerated Parallel Processing

      Platform Vendor Advanced Micro Devices, Inc.

      Platform Version OpenCL 2.1 AMD-APP (2671.3)

      Platform Profile FULL_PROFILE

      Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

      Platform Host timer resolution 1ns

      Platform Extensions function suffix AMD

      ...

      Support is emulated in software No

      Address bits 32, Little-Endian

      Global memory size 2111901696 (1.967GiB)

      Global free memory (AMD) <printDeviceInfo:75: get number of CL_DEVICE_GLOBAL_FREE_MEMORY_AMD : error -33>

      Global memory channels (AMD) 8

      Global memory banks per channel (AMD) 16

      Global memory bank width (AMD) 256 bytes

      ....
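For readers who do not have the OpenCL status codes memorized, the two numbers seen so far can be mapped to their names from the Khronos `cl.h` header. The sketch below covers only a handful of codes, and `cl_error_name` is a hypothetical helper name introduced here for illustration.

```shell
# Map a few OpenCL status codes to their cl.h names.
# Only the codes relevant to this thread plus some neighbours are listed;
# anything else falls through to the "unknown" branch.
cl_error_name() {
    case "$1" in
        0)   echo "CL_SUCCESS" ;;
        -1)  echo "CL_DEVICE_NOT_FOUND" ;;
        -5)  echo "CL_OUT_OF_RESOURCES" ;;
        -6)  echo "CL_OUT_OF_HOST_MEMORY" ;;
        -30) echo "CL_INVALID_VALUE" ;;
        -33) echo "CL_INVALID_DEVICE" ;;
        *)   echo "unknown OpenCL status: $1" ;;
    esac
}
```

So the -6 from the Einstein@Home log is CL_OUT_OF_HOST_MEMORY, and the -33 that clinfo prints for CL_DEVICE_GLOBAL_FREE_MEMORY_AMD is CL_INVALID_DEVICE.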

       

      This suggests that the current OpenCL stack on amdgpu cannot meet the needs of some older or newer GPU apps.

      How soon can we expect a corrected amdgpu/pro driver?

       

      Thanks for any insightful enlightenment.

       

      Hardware:

      $ lspci | grep VGA

      01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]

      02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]

       

      Software:

      Ubuntu 18.04.1,

      clinfo-2.2.18.03.26-1,

      amdgpu-core 18.30-641594,

      Einstein@home BOINC app: Gamma-ray pulsar binary search on GPUs v1.18

        • Re: [Solved] clinfo reports error -33 of "Global free memory (AMD)"  by amdgpu/pro
          faylyn101

          I searched Google and found old reports on github.com where people had hit the same issue with earlier amdgpu/pro OpenCL releases.

          It turns out that some environment variables must be set for it to work properly. There are two ways to set them:

          1). Add a new script, /etc/profile.d/amdgpu.sh:

          export GPU_FORCE_64BIT_PTR=1
          export GPU_USE_SYNC_OBJECTS=1
          export GPU_MAX_ALLOC_PERCENT=100
          export GPU_SINGLE_ALLOC_PERCENT=100
          export GPU_MAX_HEAP_SIZE=100

          2). Modify /lib/systemd/system/boinc-client.service (needed when BOINC runs as a systemd service, since services do not read /etc/profile.d):

          [Service]
          Environment=GPU_SINGLE_ALLOC_PERCENT=100
          Environment=GPU_MAX_HEAP_SIZE=100
          Environment=GPU_FORCE_64BIT_PTR=1
          Environment=GPU_USE_SYNC_OBJECTS=1
          Environment=GPU_MAX_ALLOC_PERCENT=100

          ProtectHome=true
          Type=simple
          Nice=10
          User=boinc
          WorkingDirectory=/var/lib/boinc
          ExecStart=/usr/bin/boinc --redirectio
          ExecStop=/usr/bin/boinccmd --quit
          ExecReload=/usr/bin/boinccmd --read_cc_config
          ExecStopPost=/bin/rm -f lockfile
          IOSchedulingClass=idle
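After either change, it may be worth confirming that the running BOINC process actually sees the variables. The sketch below reads a NUL-separated environment file such as /proc/<pid>/environ and lists any of the five variables that are missing. `missing_gpu_vars` is a hypothetical helper name, and the commented invocation assumes the client process is called `boinc` and that you have permission to read its environ file.

```shell
# List the expected GPU_* variables that are absent from a process environment.
# $1 is a NUL-separated environment file such as /proc/<pid>/environ.
missing_gpu_vars() {
    env_file=$1
    for name in GPU_FORCE_64BIT_PTR GPU_USE_SYNC_OBJECTS \
                GPU_MAX_ALLOC_PERCENT GPU_SINGLE_ALLOC_PERCENT GPU_MAX_HEAP_SIZE; do
        # Turn NUL separators into newlines, then look for "NAME=" at line start.
        tr '\0' '\n' < "$env_file" | grep -q "^${name}=" || echo "$name"
    done
}

# Example invocation (assumed process name; run as root to read the file):
#   systemctl daemon-reload && systemctl restart boinc-client
#   missing_gpu_vars "/proc/$(pgrep -o -x boinc)/environ"
```

Empty output means all five variables reached the process. Note that systemd only picks up an edited unit file after `systemctl daemon-reload`, and the service has to be restarted for the new environment to apply.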

          Hope this helps others.

           

          Output from /opt/amdgpu-pro/bin/clinfo:

            Cache type:                                Read/Write
            Cache line size:                           64
            Cache size:                                16384
            Global memory size:                        1232203776
            Constant buffer size:                      65536
            Max number of constant args:               8
            Local memory type:                         Scratchpad
            Local memory size:                         32768

           

           

          ...