7 Replies Latest reply on Aug 11, 2014 7:50 AM by signork

    APP SDK samples can't find GPU


      I have installed a Radeon R9 290 in a 64-bit Ubuntu 12.04 Linux machine and successfully installed the amd-catalyst-13.11-beta-v9.4-linux-x86.x86_64 driver and AMD-APP-SDK-v2.9-lnx64.


      My X is working well and I can query the GPU with aticonfig:


      $ aticonfig --odgc


      Default Adapter - AMD Radeon R9 290 Series

                                  Core (MHz)    Memory (MHz)

                 Current Clocks :    300           1250



              Performance Level :    0

              Current Bus Speed :    2500

               Current Bus Lane :    1

                       GPU load :    0%


      I can build the samples fine, but when I run them I get:


      Platform 0 : Advanced Micro Devices, Inc.

      GPU not found. Falling back to CPU device

      Platform found : Advanced Micro Devices, Inc.

      Selected Platform Vendor : Advanced Micro Devices, Inc.

      Device 0 : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz Device ID is 0x2422ce0


      It keeps falling back to CPU. I cannot seem to get the samples to find the GPU. Here's the list of linked libraries from one of the samples:


      $ ldd BinarySearch

        linux-vdso.so.1 =>  (0x00007fffa62eb000)

        libOpenCL.so.1 => /opt/AMDAPP/lib/x86_64/libOpenCL.so.1 (0x00007f09ed99c000)

        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f09ed67d000)

        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f09ed466000)

        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f09ed0a6000)

        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f09ece89000)

        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f09ecc84000)

        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f09ec988000)

        /lib64/ld-linux-x86-64.so.2 (0x00007f09edba4000)


      Any ideas how to get the GPU recognised?
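
      One way to read the `ldd` listing above is to pull out which file the loader actually resolved for `libOpenCL.so.1`. The following is only a minimal sketch: the helper name `resolved_libs` is hypothetical, and it assumes the `name => path (address)` layout shown in the listing.

```python
import re

# Matches ldd lines of the form "libfoo.so.1 => /path/libfoo.so.1 (0x...)".
# Lines without a resolved path (linux-vdso, the ld-linux loader) do not match.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")

def resolved_libs(ldd_output):
    """Map each library name in ldd output to the path the loader resolved."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs

# A few lines copied from the listing above, used as sample input.
sample = """\
    linux-vdso.so.1 =>  (0x00007fffa62eb000)
    libOpenCL.so.1 => /opt/AMDAPP/lib/x86_64/libOpenCL.so.1 (0x00007f09ed99c000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f09ed0a6000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f09edba4000)
"""
libs = resolved_libs(sample)
print(libs["libOpenCL.so.1"])  # prints /opt/AMDAPP/lib/x86_64/libOpenCL.so.1
```

      Here the sample resolves `libOpenCL.so.1` to the copy under `/opt/AMDAPP`, i.e. the one installed by the SDK. To run it on a real binary, the text could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`.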





        • Re: APP SDK samples can't find GPU

          How about trying to run the samples as root? The clinfo application (shipped with the driver) should also see the GPU, not just the samples.

            • Re: APP SDK samples can't find GPU

              Running as root has no effect. Same outcome.


              Here is some more info:


              $ aticonfig --odgc


              Default Adapter - AMD Radeon R9 290 Series

                                          Core (MHz)    Memory (MHz)

                         Current Clocks :    300           1250


                      Performance Level :    0

                      Current Bus Speed :    2500

                       Current Bus Lane :    1

                               GPU load :    0%


              $ uname -srvp

              Linux 3.5.0-44-generic #67~precise1-Ubuntu SMP Wed Nov 13 16:16:57 UTC 2013 x86_64


              $ sudo clinfo

              Number of platforms:                     1

                Platform Profile:                     FULL_PROFILE

                Platform Version:                     OpenCL 1.2 AMD-APP (1214.3)

                Platform Name:                     AMD Accelerated Parallel Processing

                Platform Vendor:                     Advanced Micro Devices, Inc.

                Platform Extensions:                     cl_khr_icd cl_amd_event_callback




                Platform Name:                     AMD Accelerated Parallel Processing

              Number of devices:                     1

                Device Type:                          CL_DEVICE_TYPE_CPU

                Device ID:                          4098

                Board name:                         

                Max compute units:                     8

                Max work items dimensions:                3

                  Max work items[0]:                     1024

                  Max work items[1]:                     1024

                  Max work items[2]:                     1024

                Max work group size:                     1024

                Preferred vector width char:                16

                Preferred vector width short:                8

                Preferred vector width int:                4

                Preferred vector width long:                2

                Preferred vector width float:                8

                Preferred vector width double:           4

                Native vector width char:                16

                Native vector width short:                8

                Native vector width int:                4

                Native vector width long:                2

                Native vector width float:                8

                Native vector width double:                4

                Max clock frequency:                     1600Mhz

                Address bits:                          64

                Max memory allocation:                4196192256

                Image support:                     Yes

                Max number of images read arguments:           128

                Max number of images write arguments:           8

                Max image 2D width:                     8192

                Max image 2D height:                     8192

                Max image 3D width:                     2048

                Max image 3D height:                     2048

                Max image 3D depth:                     2048

                Max samplers within kernel:                16

                Max size of kernel argument:                4096

                Alignment (bits) of base address:           1024

                Minimum alignment (bytes) for any datatype:      128

                Single precision floating point capability

                  Denorms:                          Yes

                  Quiet NaNs:                          Yes

                  Round to nearest even:                Yes

                  Round to zero:                     Yes

                  Round to +ve and infinity:                Yes

                  IEEE754-2008 fused multiply-add:           Yes

                Cache type:                          Read/Write

                Cache line size:                     64

                Cache size:                          32768

                Global memory size:                     16784769024

                Constant buffer size:                     65536

                Max number of constant args:                8

                Local memory type:                     Global

                Local memory size:                     32768

                Kernel Preferred work group size multiple:      1

                Error correction support:                0

                Unified memory for Host and Device:           1

                Profiling timer resolution:                1

                Device endianess:                     Little

                Available:                          Yes

                Compiler available:                     Yes

                Execution capabilities:                    

                  Execute OpenCL kernels:                Yes

                  Execute native function:                Yes

                Queue properties:                    

                  Out-of-Order:                     No

                  Profiling :                          Yes

                Platform ID:                          0x00007f7bf86d9fc0

                Name:                               Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

                Vendor:                          GenuineIntel

                Device OpenCL C version:                OpenCL C 1.2

                Driver version:                     1214.3 (sse2,avx)

                Profile:                          FULL_PROFILE

                Version:                          OpenCL 1.2 AMD-APP (1214.3)

                Extensions:                          cl_khr_fp64 cl_amd_fp64

              cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics

              cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics

              cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

              cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing

              cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3

              cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt
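
              The `clinfo` output above reports a single device, of type `CL_DEVICE_TYPE_CPU`. In a long dump like this it can be easier to extract just the `Device Type:` lines. A rough sketch of that, assuming the text layout shown above; the function names `device_types` and `has_gpu` are made up for illustration:

```python
import re

def device_types(clinfo_output):
    """Return the value of every 'Device Type:' line, in order of appearance."""
    return re.findall(r"^[ \t]*Device Type:[ \t]*(\S+)", clinfo_output,
                      flags=re.MULTILINE)

def has_gpu(clinfo_output):
    """True if any listed device reports CL_DEVICE_TYPE_GPU."""
    return "CL_DEVICE_TYPE_GPU" in device_types(clinfo_output)

# Three lines copied from the clinfo output above, used as sample input.
sample = """\
Number of devices:                     1
  Device Type:                          CL_DEVICE_TYPE_CPU
  Device ID:                          4098
"""
print(device_types(sample), has_gpu(sample))  # prints ['CL_DEVICE_TYPE_CPU'] False
```

              For a live check, the input could be taken from `subprocess.run(["clinfo"], capture_output=True, text=True).stdout` instead of the sample string.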


              $ cat /etc/X11/xorg.conf

              Section "ServerLayout"

                   Identifier     "Layout0"

                   Screen      0  "aticonfig-Screen[0]-0" 0 0

                   InputDevice    "Keyboard0" "CoreKeyboard"

                   InputDevice    "Mouse0" "CorePointer"



              Section "Files"



              Section "Module"



              Section "InputDevice"



              # generated from default

                   Identifier  "Mouse0"

                   Driver      "mouse"

                   Option         "Protocol" "auto"

                   Option         "Device" "/dev/psaux"

                   Option         "Emulate3Buttons" "no"

                   Option         "ZAxisMapping" "4 5"



              Section "InputDevice"



              # generated from default

                   Identifier  "Keyboard0"

                   Driver      "kbd"



              Section "Monitor"

                   Identifier   "Monitor0"

                   VendorName   "Unknown"

                   ModelName    "Unknown"

                   HorizSync    28.0 - 33.0

                   VertRefresh  43.0 - 72.0

                   Option         "DPMS"



              Section "Monitor"

                   Identifier   "aticonfig-Monitor[0]-0"

                   Option         "VendorName" "ATI Proprietary Driver"

                   Option         "ModelName" "Generic Autodetecting Monitor"

                   Option         "DPMS" "true"



              Section "Monitor"

                   Identifier   "0-DFP6"

                   Option         "VendorName" "ATI Proprietary Driver"

                   Option         "ModelName" "Generic Autodetecting Monitor"

                   Option         "DPMS" "true"

                   Option         "PreferredMode" "1920x1200"

                   Option         "TargetRefresh" "60"

                   Option         "Position" "2560 0"

                   Option         "Rotate" "left"

                   Option         "Disable" "false"



              Section "Monitor"

                   Identifier   "0-DFP7"

                   Option         "VendorName" "ATI Proprietary Driver"

                   Option         "ModelName" "Generic Autodetecting Monitor"

                   Option         "DPMS" "true"

                   Option         "PreferredMode" "2560x1440"

                   Option         "TargetRefresh" "60"

                   Option         "Position" "0 303"

                   Option         "Rotate" "normal"

                   Option         "Disable" "false"



              Section "Device"

                   Identifier  "aticonfig-Device[0]-0"

                   Driver      "fglrx"

                   Option         "Monitor-DFP6" "0-DFP6"

                   Option         "Monitor-DFP7" "0-DFP7"

                   BusID       "PCI:2:0:0"



              Section "Screen"

                   Identifier "aticonfig-Screen[0]-0"

                   Device     "aticonfig-Device[0]-0"

                   DefaultDepth     24

                   SubSection "Display"

                        Viewport   0 0

                        Depth     24




              $ xdpyinfo

              name of display:    :0.0

              version number:    11.0

              vendor string:    The X.Org Foundation

              vendor release number:    11300000

              X.Org version: 1.13.0

              maximum request size:  16777212 bytes

              motion buffer size:  256

              bitmap unit, bit order, padding:    32, LSBFirst, 32

              image byte order:    LSBFirst

              number of supported pixmap formats:    7

              supported pixmap formats:

                  depth 1, bits_per_pixel 1, scanline_pad 32

                  depth 4, bits_per_pixel 8, scanline_pad 32

                  depth 8, bits_per_pixel 8, scanline_pad 32

                  depth 15, bits_per_pixel 16, scanline_pad 32

                  depth 16, bits_per_pixel 16, scanline_pad 32

                  depth 24, bits_per_pixel 32, scanline_pad 32

                  depth 32, bits_per_pixel 32, scanline_pad 32

              keycode range:    minimum 8, maximum 255

              focus:  window 0x1000004, revert to PointerRoot

              number of extensions:    33

                  Generic Event Extension

              default screen number:    0

              number of screens:    1







              On Thu, 12 Dec 2013 17:45:23 -0800, Meteorhead <developer.forums@amd.com>

                • Re: APP SDK samples can't find GPU

                  Are you running these commands locally, when a desktop is running, or are you running them remotely? Do you use X tunneling when logging in?

                    • Re: APP SDK samples can't find GPU

                      Locally, yes. A desktop is running, and there is no tunnelling. But I found out what the culprit was: I delved deeper, and my conclusion is that OpenCL on the R290 is not supported at all by these latest AMD drivers on Linux.


                      I pulled out the R290 and replaced it with an R280, and everything immediately worked, with no rebuilding or relinking required. There was nothing wrong with my system all along.


                      It would be nice for this to be in the driver release notes, but they just say the R290 is 'supported'. I guess that means for graphics, not as a GPU for compute.


                      So... does anyone know when AMD is going to release drivers that support OpenCL on the R290 on Linux?

                        • Re: APP SDK samples can't find GPU

                          It works fine. I've been hashing Dogecoins on Linux with an R290 for the last few days :-)


                          The problem I've seen is that installing the AMD SDK incorrectly puts the SDK's version of OpenCL in your library path,

                          and that version, interestingly enough, is CPU-only. It overrides the one from the driver.


                          You need to do a few things to fix it:


                          - Remove the statement it adds to your /etc/profile:




                              export LD_LIBRARY_PATH


                          - Remove the file it adds to /etc/ld.so.conf.d/ (I don't remember the exact name but it's fairly obvious).


                          - run ldconfig as root


                          That should fix it.
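
                          Since the exact file name is not given above, one way to locate it is to search `/etc/ld.so.conf.d/` for entries that point at the SDK, and to check `LD_LIBRARY_PATH` the same way. This is a sketch only, not an official procedure: the search string `AMDAPP` is an assumption taken from the `/opt/AMDAPP/lib/x86_64` path in the `ldd` output earlier in the thread, and both function names are hypothetical.

```python
import os
from pathlib import Path

def conf_files_mentioning(conf_dir, needle):
    """Names of *.conf files in conf_dir with a non-comment line containing needle."""
    hits = []
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        for line in conf.read_text(errors="replace").splitlines():
            entry = line.split("#", 1)[0].strip()  # drop trailing comments
            if needle in entry:
                hits.append(conf.name)
                break
    return hits

def path_entries_mentioning(ld_library_path, needle):
    """Entries of a colon-separated search path that contain needle."""
    return [p for p in ld_library_path.split(":") if needle in p]

if __name__ == "__main__":
    # "AMDAPP" is assumed from the SDK install path seen in the ldd output.
    print(conf_files_mentioning("/etc/ld.so.conf.d", "AMDAPP"))
    print(path_entries_mentioning(os.environ.get("LD_LIBRARY_PATH", ""), "AMDAPP"))
```

                          The script only reports candidates; removing the file, editing /etc/profile, and running ldconfig as root remain manual steps, as described above.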