1 Reply Latest reply on Nov 27, 2017 9:25 AM by drnil

    Possible bug in AMD App SDK 3.0 install script?

    drnil

      I have recently become interested in using AMD GPUs for doing fast matrix multiplications in double precision. I have installed AMD App SDK 3.0 for 64-bit Linux, but I have some problems with missing library configuration files. Inspecting the install script (AMD-APP-SDK-v3.0.130.136-GA-linux64.sh), I suspect there are some bugs there. I use a clean install of 64-bit Ubuntu 16.04 LTS, with kernel 4.10.0-40-generic, and hardware Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz, 16 GB RAM. The graphics board is a Radeon HD 7870 (clinfo attached as clinfo-1.txt).

       

      Here is what I do:

      • Clean install of Ubuntu 16.04 LTS, 64-bit, incl. update/dist-upgrade/reboot
      • make sure build-essential is up to date
      • install and run clinfo from Ubuntu distribution (output is:  Number of platforms  0)
      • install amdgpu-pro-17.40-492261 (log attached as amdgpu-pro-installation.log) and reboot
      • re-run clinfo (output attached as clinfo-1.txt). The graphics board is properly identified.
      • install AMD App SDK 3.0 (AMD-APP-SDKInstaller-v3.0.130.136-GA-linux64.tar.bz2; log attached as InstallLog_11-26-2017T14-34-47.log)
      • run clinfo again (only minor differences from clinfo-1.txt in global memory size, max memory allocation, and profiling timer offset)
      • run /opt/AMDAPPSDK-3.0/samples/opencl/bin/x86_64/MatrixTranspose, which says:

      ---------------------------------------

      Platform 0 : Advanced Micro Devices, Inc.

       

      Input

      15.3732 201.81 51.9855 89.2322 92.572 34.4675 96.2478 66.3863 11.345 225.168 161.374 96.5491 81.8505 211.932 108.827 124.578 202.418 244.549 182.722 90.1216 215.537 92.7693 98.0533 20.0925 44.5603 225.408 55.9775 247.995 202.92 0.896856 45.4356 218.293 202.706 97.4211 51.5251 39.2784 131.889 147.773 105.665 143.234 116.941 11.0383 239.783 198.791 222.97 92.6095 67.3688 169.388 81.1586 250.091 3.50951 40.6953 86.8598 101.563 60.7878 131.42 70.9705 116.765 123.415 17.8903 117.662 168.851 236.183 64.3685

       

      Platform found : Advanced Micro Devices, Inc.

       

      Selected Platform Vendor : Advanced Micro Devices, Inc.

      Device 0 : Pitcairn Device ID is 0xb32f00

      Error: clCreateCommandQueue failed.

      Location : /var/lib/jenkins/workspace/APPSDK_BUILD/C/Release/L/LBM/P/x64/S/OpenCL/T/gcc-4.8.4/samples/opencl/cl/1.x/MatrixTranspose/MatrixTranspose.cpp:148

      -----------------------------------

      I found that the install script did not set the environment variable LD_LIBRARY_PATH (the installation guide mentions this possibility). Worse, and this is the main issue here, the docs suggest setting it manually based on the contents of a config file, /etc/ld.so.conf.d/amdapp_x86_64.conf, which does not exist on my system, and I cannot find anything like it. Moreover, the installation leaves a dangling symbolic link from /opt/AMDAPPSDK-3.0/lib/x86_64/libOpenCL.so to /usr/lib/libOpenCL.so.1.
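      For anyone who wants to reproduce the two checks above, here is a minimal sketch in Python. The paths are the ones from my system; the helper names (is_dangling_symlink, report) are made up for illustration and are not part of the SDK.

```python
import glob
import os


def is_dangling_symlink(path):
    """Return True if path is a symlink whose target does not exist."""
    # islink() inspects the link itself; exists() follows it to the target.
    return os.path.islink(path) and not os.path.exists(path)


def report(sdk_lib="/opt/AMDAPPSDK-3.0/lib/x86_64/libOpenCL.so",
           conf_glob="/etc/ld.so.conf.d/*.conf"):
    """Describe the SDK's libOpenCL.so link and list the ld.so config files."""
    lines = []
    if os.path.islink(sdk_lib):
        state = "dangling" if is_dangling_symlink(sdk_lib) else "ok"
        lines.append(f"{sdk_lib} -> {os.readlink(sdk_lib)} ({state})")
    else:
        lines.append(f"{sdk_lib}: not a symlink")
    # List whatever config files are present, to see whether an amdapp one exists.
    for conf in sorted(glob.glob(conf_glob)):
        lines.append(f"found config: {conf}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

      On my installation I would expect this to report the SDK link as dangling and to list no amdapp_x86_64.conf among the config files.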

      I have tried plausible settings for LD_LIBRARY_PATH based on the file /opt/AMDAPPSDK-3.0/install.sh, but to no avail. The best result I have achieved is running clinfo and the MatrixTranspose sample on the CPU instead of the GPU; usually, both end with a segmentation fault.
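      To see which copies of libOpenCL a given LD_LIBRARY_PATH setting exposes, and in which order, something like the following sketch may help. The function name find_opencl_libs is hypothetical, and it only lists files; it does not show which library the dynamic loader finally picks.

```python
import glob
import os


def find_opencl_libs(ld_library_path):
    """Map each directory of an LD_LIBRARY_PATH-style string to its libOpenCL files.

    Directories appear in search order. Empty entries are skipped here for
    simplicity (the loader would treat them as the current directory).
    """
    found = {}
    for directory in ld_library_path.split(":"):
        if not directory:
            continue
        matches = sorted(glob.glob(os.path.join(directory, "libOpenCL.so*")))
        if matches:
            found[directory] = [os.path.basename(m) for m in matches]
    return found


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for directory, libs in find_opencl_libs(current).items():
        print(directory, libs)
```

      If two directories on the path both contain a libOpenCL.so.1 (for example one from amdgpu-pro and one from the SDK), that could explain getting the CPU device or a crash depending on the order.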

       

      I would be most grateful for some hints on what LD_LIBRARY_PATH should be set to, or on whether this problem is an indirect effect of some other problem!

       

      Message was edited by: drnil (spelling corrections; corrected filename)