mrbpix
Journeyman III

clinfo reports 14 compute units for R9 nano and RX 480

I installed the amdgpu-pro 16.30 drivers on a 64-bit Ubuntu 16.04 server machine. Strangely, the clinfo utility reports 14 max compute units for both the R9 nano and the RX 480:

$ clinfo
[snip]
  Max compute units:                             14

Shouldn't it be 64 for the R9 nano and 36 for the RX 480?
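
A minimal sketch of how this mismatch could be checked automatically, assuming the clinfo text has already been captured into a string. The expected counts (64 and 36) are the ones quoted in this post, not values queried from hardware. The names `EXPECTED_CUS`, `parse_max_compute_units` and `check` are hypothetical helpers introduced only for illustration.

```python
import re

# Expected compute-unit counts as quoted in this thread (assumption:
# these are the correct hardware values; they are not queried here).
EXPECTED_CUS = {"R9 Nano": 64, "RX 480": 36}


def parse_max_compute_units(clinfo_text):
    """Return every 'Max compute units' value found in clinfo output."""
    pattern = re.compile(r"^\s*Max compute units:\s*(\d+)\s*$", re.MULTILINE)
    return [int(m.group(1)) for m in pattern.finditer(clinfo_text)]


def check(card, clinfo_text):
    """Return (reported, expected, matches) for the first device listed."""
    reported = parse_max_compute_units(clinfo_text)[0]
    expected = EXPECTED_CUS[card]
    return reported, expected, reported == expected


# Sample line copied from the clinfo output above.
sample = "  Max compute units:                             14\n"
print(check("RX 480", sample))  # (14, 36, False)
```

In practice the text could come from running clinfo and capturing its standard output; the sketch only covers the parsing and comparison step.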

0 Likes
14 Replies
dipak
Big Boss

Hi,

Could you please share the clinfo output?

Regards,

0 Likes

Right, here is the output for an RX 480. I don't know if this matters, but the clinfo utility is from amdgpu-pro-clinfo (not the AMD APP SDK) and it links against libOpenCL.so from amdgpu-pro-libopencl1 (not the AMD APP SDK).

$ ldd /usr/bin/clinfo
        linux-vdso.so.1 =>  (0x00007ffc0897f000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fde13577000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fde1326e000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fde13069000)
        libOpenCL.so.1 => /usr/lib/x86_64-linux-gnu/amdgpu-pro/libOpenCL.so.1 (0x00007fde12e62000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fde12c4c000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fde12a2e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fde12665000)
        /lib64/ld-linux-x86-64.so.2 (0x0000561b8e6c6000)


$ dpkg -S /usr/lib/x86_64-linux-gnu/amdgpu-pro/libOpenCL.so.1
amdgpu-pro-libopencl1:amd64: /usr/lib/x86_64-linux-gnu/amdgpu-pro/libOpenCL.so.1


$ dpkg -l amdgpu-pro-libopencl1:amd64
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                          Version             Architecture        Description
+++-=============================-===================-===================-================================================================
ii  amdgpu-pro-libopencl1:amd64   16.30.3-315407      amd64               AMD OpenCL ICD Loader library


$ clinfo # the output below is for an RX 480
Number of platforms:                     1
  Platform Profile:                     FULL_PROFILE
  Platform Version:                     OpenCL 2.0 AMD-APP (2117.7)
  Platform Name:                     AMD Accelerated Parallel Processing
  Platform Vendor:                     Advanced Micro Devices, Inc.
  Platform Extensions:                     cl_khr_icd cl_amd_event_callback cl_amd_offline_devices


  Platform Name:                     AMD Accelerated Parallel Processing
Number of devices:                     1
  Device Type:                          CL_DEVICE_TYPE_GPU
  Vendor ID:                          1002h
  Board name:                         
  Device Topology:                     PCI[ B#1, D#0, F#0 ]
  Max compute units:                     14
  Max work items dimensions:                3
    Max work items[0]:                     256
    Max work items[1]:                     256
    Max work items[2]:                     256
  Max work group size:                     256
  Preferred vector width char:                4
  Preferred vector width short:                2
  Preferred vector width int:                1
  Preferred vector width long:                1
  Preferred vector width float:                1
  Preferred vector width double:           1
  Native vector width char:                4
  Native vector width short:                2
  Native vector width int:                1
  Native vector width long:                1
  Native vector width float:                1
  Native vector width double:                1
  Max clock frequency:                     555Mhz
  Address bits:                          64
  Max memory allocation:                4244635648
  Image support:                     Yes
  Max number of images read arguments:           128
  Max number of images write arguments:           8
  Max image 2D width:                     16384
  Max image 2D height:                     16384
  Max image 3D width:                     2048
  Max image 3D height:                     2048
  Max image 3D depth:                     2048
  Max samplers within kernel:                16
  Max size of kernel argument:                1024
  Alignment (bits) of base address:           2048
  Minimum alignment (bytes) for any datatype:      128
  Single precision floating point capability
    Denorms:                          No
    Quiet NaNs:                          Yes
    Round to nearest even:                Yes
    Round to zero:                     Yes
    Round to +ve and infinity:                Yes
    IEEE754-2008 fused multiply-add:           Yes
  Cache type:                          Read/Write
  Cache line size:                     64
  Cache size:                          16384
  Global memory size:                     8544440320
  Constant buffer size:                     65536
  Max number of constant args:                8
  Local memory type:                     Scratchpad
  Local memory size:                     32768
  Max pipe arguments:                     0
  Max pipe active reservations:                0
  Max pipe packet size:                     0
  Max global variable size:                0
  Max global variable preferred total size:      0
  Max read/write image args:                0
  Max on device events:                     0
  Queue on device max size:                0
  Max on device queues:                     0
  Queue on device preferred size:           0
  SVM capabilities:                    
    Coarse grain buffer:                No
    Fine grain buffer:                     No
    Fine grain system:                     No
    Atomics:                          No
  Preferred platform atomic alignment:           0
  Preferred global atomic alignment:           0
  Preferred local atomic alignment:           0
  Kernel Preferred work group size multiple:      64
  Error correction support:                0
  Unified memory for Host and Device:           0
  Profiling timer resolution:                1
  Device endianess:                     Little
  Available:                          Yes
  Compiler available:                     Yes
  Execution capabilities:                    
    Execute OpenCL kernels:                Yes
    Execute native function:                No
  Queue on Host properties:                    
    Out-of-Order:                     No
    Profiling :                          Yes
  Queue on Device properties:                    
    Out-of-Order:                     No
    Profiling :                          No
  Platform ID:                          0x7f02276c08f8
  Name:                               Ellesmere
  Vendor:                          Advanced Micro Devices, Inc.
  Device OpenCL C version:                OpenCL C 1.2
  Driver version:                     2117.7 (VM)
  Profile:                          FULL_PROFILE
  Version:                          OpenCL 1.2 AMD-APP (2117.7)
  Extensions:                          cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event
0 Likes

Thanks for sharing the information. I'll check with the concerned team and get back to you.

Regards,

0 Likes

Hi Marc,

It seems that a similar issue has already been reported to the driver team and they are working on it.

Regards,

0 Likes
ekondis
Adept II

I'm experiencing the same problem with the R9 Nano.

I have already raised this issue in the past.

0 Likes

Yup. This bug is very annoying. I am writing OpenCL code and because of this bug, I have no proper way of determining the optimal global work size based on the number of compute units...
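
One possible workaround while the driver misreports the count, sketched under stated assumptions: size the global work size from the compute-unit count, but allow a manual override. In a real host program the reported count would come from `clGetDeviceInfo` with `CL_DEVICE_MAX_COMPUTE_UNITS`; here it is passed in as a plain integer so the example stays self-contained. The function name, the `cu_override` parameter and the `groups_per_cu` tuning factor are hypothetical and not taken from this thread. The default work-group size of 256 matches the "Max work group size" in the clinfo output above.

```python
def global_work_size(compute_units, work_group_size=256, groups_per_cu=4,
                     cu_override=None):
    """Pick a 1-D global work size that is a multiple of the work-group size.

    compute_units:   value the driver reports for CL_DEVICE_MAX_COMPUTE_UNITS.
    cu_override:     hypothetical manual correction for a misreporting driver.
    groups_per_cu:   assumed tuning factor (work-groups scheduled per CU).
    """
    cus = cu_override if cu_override is not None else compute_units
    if cus <= 0 or work_group_size <= 0 or groups_per_cu <= 0:
        raise ValueError("all sizes must be positive")
    return cus * groups_per_cu * work_group_size


# With the misreported value (14), the device would be undersubscribed.
print(global_work_size(14))                  # 14336
# With the RX 480 count quoted in this thread supplied by hand.
print(global_work_size(14, cu_override=36))  # 36864
```

The override has to be maintained per device by hand, so this is only a stopgap until the driver reports the correct value.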

0 Likes

That's right. By the way, I'm wondering where this magic number (14) comes from.

0 Likes

Hi Elias,

Somehow it slipped through the cracks. My apologies. Thanks for reviving it.

Regards,

0 Likes

Thanks.

Is there a rough estimate of when a new release will be available? The current release is almost 3 months old.

0 Likes

Good news. AMDGPU-PRO 16.40 has just been released. It also supports RHEL 6.8 and 7.2.

AMDGPU-PRO Driver for Linux® – Release Notes

Regards,

0 Likes

That's good news, though the bug remains. The R9 Nano still reports 14 max compute units. In addition, the GPU of the AMD FX-7500 APU (Spectre) is reported to have 7 compute units, whereas 6 is the correct number.

0 Likes

Thanks for reporting.

Sorry, the issue has not been fixed yet. Please be patient.

Regards,

0 Likes

The bug persists with the just-released 16.50 driver.

Please do what's necessary to get this annoying bug fixed.

0 Likes

It was fixed in 16.60, finally! clinfo now properly reports 64 for the R9 nano and 36 for the RX 480.

0 Likes