9 Replies Latest reply on Jun 14, 2013 5:52 AM by gbilotta

    Device information issues

    gbilotta

      While writing and testing a clinfo program, I came across the following issues with the device information exposed by the AMD platform:

      • the reported MAX_CLOCK_FREQUENCY on Intel CPU devices is not the _maximum_ clock frequency, but the current one; for example, if my CPU is running at 800MHz but can go to 2.5GHz, 800MHz is reported; the issue seems to be limited to Intel CPUs, since for the Phenom in one of my systems the maximum frequency is correctly reported;
      • although the list of available extensions does not expose the cl_khr_fp16 extension, querying for the preferred and native half vector types gives a value of 2 on most of the CPUs and GPUs I have tried; this violates the standard, since when the extension is not supported, 0 should be reported; if the platform does indeed support the half type, then the cl_khr_fp16 extension should be mentioned among the extensions supported by the device;
      • querying for the LINKER_AVAILABLE (OpenCL 1.2) property returns CL_FALSE. This violates the specification, since COMPILER_AVAILABLE is CL_TRUE, and in this case LINKER_AVAILABLE must also return CL_TRUE; additionally, FULL_PROFILE devices (and most if not all CPUs and GPUs supported by the AMD platforms have FULL_PROFILE) must also return CL_TRUE for LINKER_AVAILABLE;
      • on CPUs, the reported PRINTF_BUFFER_SIZE (OpenCL 1.2) is 65KB, while according to the standard the minimum value allowed for FULL_PROFILE devices is 1MB;
      • finally, where can I find some information about the GLOBAL_FREE_MEMORY_AMD property? From the cl.hpp I could derive that it's of type size_t[], and from my experimentation it seems to present two identical values which represent the amount of free global memory (in megabytes) for the GPU; would it be possible to have some documentation about this property, such as why are there two values? (I was also surprised to find it exports a size_t[] since I would have expected it to provide a cl_ulong[], for consistency with other memory-related properties)
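
The specification rules mentioned in the list above (half vector widths, linker availability, printf buffer size) can be expressed as a small consistency check. This is only a minimal sketch: the struct dev_report, the BAD_* flags and check_report are hypothetical names, not part of the OpenCL headers or of my clinfo, and the fields are assumed to have been filled in beforehand from the clGetDeviceInfo queries named in the comments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for values a clinfo-like tool would have
 * obtained via clGetDeviceInfo(); not part of any OpenCL header. */
struct dev_report {
    const char *extensions;     /* CL_DEVICE_EXTENSIONS string */
    int full_profile;           /* 1 if CL_DEVICE_PROFILE is "FULL_PROFILE" */
    int compiler_available;     /* CL_DEVICE_COMPILER_AVAILABLE */
    int linker_available;       /* CL_DEVICE_LINKER_AVAILABLE */
    size_t printf_buffer_size;  /* CL_DEVICE_PRINTF_BUFFER_SIZE, in bytes */
    unsigned pref_half_width;   /* CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF */
    unsigned native_half_width; /* CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF */
};

/* Hypothetical flags, one per rule checked below. */
enum {
    BAD_HALF_WIDTH    = 1 << 0,
    BAD_LINKER        = 1 << 1,
    BAD_PRINTF_BUFFER = 1 << 2
};

/* Returns a bitmask of the rules that the reported values violate. */
unsigned check_report(const struct dev_report *r)
{
    unsigned issues = 0;
    /* A plain substring search is enough for this sketch. */
    int has_fp16 = strstr(r->extensions, "cl_khr_fp16") != NULL;

    /* Without cl_khr_fp16, both half vector widths must be 0. */
    if (!has_fp16 && (r->pref_half_width != 0 || r->native_half_width != 0))
        issues |= BAD_HALF_WIDTH;

    /* A linker must be available if a compiler is available or the
     * device is FULL_PROFILE. */
    if ((r->compiler_available || r->full_profile) && !r->linker_available)
        issues |= BAD_LINKER;

    /* FULL_PROFILE devices must offer a printf buffer of at least 1MB. */
    if (r->full_profile && r->printf_buffer_size < ((size_t)1 << 20))
        issues |= BAD_PRINTF_BUFFER;

    return issues;
}
```

With the values I am seeing on CPU devices (half widths of 2 without cl_khr_fp16, no linker, a printf buffer well below 1MB), such a check would flag all three rules.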

       

      For reference, my clinfo code can be found on http://github.com/Oblomov/clinfo/

        • Re: Device information issues
          himanshu.gautam

          Thanks for reporting the issues. Will check the code and specification, and report.

          • Re: Device information issues
            himanshu.gautam

            Hi,

            I tried your code, but could not use it

            The Makefile needed the cl.h file, but I could not figure out how to add the path to it. I compiled directly using g++ clinfo.c -I "/opt/AMDAPP/include" -lOpenCL. The code gave a bunch of errors, pointing to lines containing multiply nested macros. So I just modified some similar code I had, attached here.

            Could not reproduce the CPU frequency bug: I am getting 3401 (for my Intel Ivy Bridge 3.4GHz CPU). The other issues were reproduced.

             

            To make it clear, cl_khr_fp16 is not supported on any AMD platform as of now, so PREFERRED_VECTOR_WIDTH_HALF should return 0. LINKER_AVAILABLE also returns false (an error), and the printf buffer size is 64KB (an error, as it should be at least 1MB). GLOBAL_FREE_MEMORY_AMD seems to be a planned future feature; I do not see it documented anywhere. Forwarding the bugs to the Engineering team.

              • Re: Device information issues
                gbilotta

                himanshu.gautam wrote:

                 

                Hi,

                I tried your code, but could not use it

                The Makefile needed the cl.h file, but I could not figure out how to add the path to it. I compiled directly using g++ clinfo.c -I "/opt/AMDAPP/include" -lOpenCL. The code gave a bunch of errors, pointing to lines containing multiply nested macros. So I just modified some similar code I had, attached here.

                This is probably due to missing macro definitions in cl_ext.h, although it makes me wonder why they are in the one I have, since I cannot find them in the official Khronos cl_ext.h either (probably a distribution thing?). I'll work around this issue in my next version of the code.

                Could not reproduce the CPU frequency bug: I am getting 3401 (for my Intel Ivy Bridge 3.4GHz CPU). The other issues were reproduced.

                Maybe it depends on the CPU (in my case, it's a rather old one, a Core2 Duo T9400 @ 2.53GHz), or on the active CPU governor (I'm using the ondemand governor, so the CPU stays at 800MHz until heavy loads are reached).

                To make it clear, cl_khr_fp16 is not supported on any AMD platform as of now, so PREFERRED_VECTOR_WIDTH_HALF should return 0. LINKER_AVAILABLE also returns false (an error), and the printf buffer size is 64KB (an error, as it should be at least 1MB). GLOBAL_FREE_MEMORY_AMD seems to be a planned future feature; I do not see it documented anywhere. Forwarding the bugs to the Engineering team.

                Thank you very much. I hope to see these future extensions explained in the next APP SDK.

                  • Re: Device information issues
                    himanshu.gautam

                    gbilotta wrote:

                     

                    himanshu.gautam wrote:

                     

                    Hi,

                    I tried your code, but could not use it

                    The Makefile needed the cl.h file, but I could not figure out how to add the path to it. I compiled directly using g++ clinfo.c -I "/opt/AMDAPP/include" -lOpenCL. The code gave a bunch of errors, pointing to lines containing multiply nested macros. So I just modified some similar code I had, attached here.

                    This is probably due to missing macro definitions in cl_ext.h, although it makes me wonder why they are in the one I have, since I cannot find them in the official Khronos cl_ext.h either (probably a distribution thing?). I'll work around this issue in my next version of the code.

                    What are your further plans for the clinfo project?

                      • Re: Device information issues
                        gbilotta

                        himanshu.gautam wrote:

                         

                        What are your further plans for the clinfo project?

                        My plan would be to (try and) keep it up to date as new OpenCL versions and/or extensions are published, trying to support both generic (KHR and EXT) extensions and vendor-specific ones (currently, AMD and NV) to gather as much information as possible on any device, making an effort to make it buildable out-of-the-box on any machine that has up-to-date official headers.

                         

                        The code is released in the public domain, since it's rather trivial and I feel it should be freely usable by anyone, without any kind of restrictions. Making it buildable with just the official Khronos OpenCL headers does require transcribing the definitions of some extensions which are (currently) not declared there, of course. I hope that this is not cause for concern.

                         

                        I understand that there is a naming conflict with the utility shipped with the AMD APP SDK. This is not exactly surprising since the utility I'm coding is intended essentially as a drop-in replacement for the one from the SDK. If AMD feels that this conflict may be a problem, I can rename the project to something satisfactorily different.

                         

                        On the other hand, if AMD wishes to use my clinfo in place of the one currently developed in-house, that would be fine for me too. If there is enough interest in it, I will make an effort to clean up the code and make it easier to maintain (it was coded more with the idea of getting something functional out fast than doing something particularly elegant).

                    • Re: Re: Device information issues
                      gbilotta

                      himanshu.gautam wrote:

                      Could not reproduce the CPU frequency bug: I am getting 3401 (for my Intel Ivy Bridge 3.4GHz CPU). The other issues were reproduced.

                      I'm seeing the CPU freq bug on an AMD CPU too. Again, this is an old CPU and the active governor is the ondemand one.

                       

                      /proc/cpuinfo starts with:

                      processor       : 0
                      vendor_id       : AuthenticAMD
                      cpu family      : 15
                      model           : 75
                      model name      : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
                      stepping        : 2
                      microcode       : 0x62
                      

                      According to cpufreq-info, I can set frequencies to 2GHz, 1.8GHz and 1GHz. I'm using the ondemand governor, so the CPU is usually kept at 1GHz, and the "max cpu frequency" reported by clinfo is 1GHz. Setting the cpufreq governor to either userspace or performance brings the actual (and reported) frequency to 2GHz. So it would seem that, at least on older CPUs (both Intel and AMD) on Linux, the reported max frequency is not the actual CPU max frequency but the one currently set by the governor, or something like that.
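
For anyone wanting to compare against what the kernel itself reports, the hardware maximum and the current frequency can be read from the Linux cpufreq sysfs files, independently of OpenCL. This is a minimal sketch, assuming a Linux kernel that exposes cpuinfo_max_freq and scaling_cur_freq for cpu0; the helper names read_khz and print_cpu0_freqs are hypothetical.

```c
#include <stdio.h>

/* Reads a single integer value (in kHz) from a cpufreq sysfs file.
 * Returns -1 if the file cannot be opened or parsed. */
long read_khz(const char *path)
{
    FILE *f = fopen(path, "r");
    long khz = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &khz) != 1)
        khz = -1;
    fclose(f);
    return khz;
}

/* Prints the hardware maximum and the current frequency of cpu0.
 * Returns 1 if both values could be read, 0 otherwise. */
int print_cpu0_freqs(void)
{
    long max = read_khz("/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq");
    long cur = read_khz("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq");

    if (max < 0 || cur < 0)
        return 0;
    printf("hardware max: %ld MHz, current: %ld MHz\n", max / 1000, cur / 1000);
    return 1;
}
```

If the OpenCL-reported value tracks the second number rather than the first while the governor changes the frequency, that would be consistent with what I am observing.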

                    • Re: Device information issues
                      himanshu.gautam

                      AMD supports the cl_amd_fp64 extension, which is a subset of cl_khr_fp64.

                      Check the URL below.

                      http://www.khronos.org/registry/cl/extensions/amd/cl_amd_fp64.txt

                        • Re: Device information issues
                          gbilotta

                          himanshu.gautam wrote:

                           

                          AMD supports the cl_amd_fp64 extension, which is a subset of cl_khr_fp64.

                          Check the URL below.

                          http://www.khronos.org/registry/cl/extensions/amd/cl_amd_fp64.txt

                          Thanks for the reference, but is this in reply to some specific issue?

                           

                          My version of clinfo currently shows all extensions in the appropriate fields, and then checks for specific extensions only to see if additional queries can be made to extract additional information. In the case of fp64 support, my code checks if fp64 is supported (first by cl_khr_fp64 and then, if not found, by cl_amd_fp64). If fp64 is supported by either of these extensions, it shows the double-precision FP configuration, mentioning the extension which it found for the support. In case both the KHR and AMD extension are available, only KHR is shown because it's the first one checked, and support for cl_amd_fp64 would not add support for more information (in other words, cl_amd_fp64 is considered a fallback in case cl_khr_fp64 is not supported). Is this wrong?
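
To make the fallback logic described above concrete, here is a minimal sketch of the idea. It is not the actual clinfo code: has_extension and fp64_extension are hypothetical names, and the input is assumed to be the space-separated string returned for CL_DEVICE_EXTENSIONS.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if name appears as a whole space-separated token in list,
 * so that a longer extension name sharing the same prefix does not
 * count as a match. */
int has_extension(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    if (len == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return 1;
        p += 1; /* keep searching after this partial match */
    }
    return 0;
}

/* Returns the name of the extension providing fp64 support, preferring
 * the KHR one and falling back to the AMD one; NULL if neither is listed. */
const char *fp64_extension(const char *list)
{
    if (has_extension(list, "cl_khr_fp64"))
        return "cl_khr_fp64";
    if (has_extension(list, "cl_amd_fp64"))
        return "cl_amd_fp64";
    return NULL;
}
```

When both extensions are listed, only cl_khr_fp64 is returned, which matches the behaviour I described: the AMD extension is reported only when the KHR one is missing.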