7 Replies Latest reply on Feb 18, 2013 2:26 PM by sdanchenko

    Errors in memtestCL when testing more than 3992 MB

    sdanchenko

      My setup:

      Linux Ubuntu 12.10

      Sapphire Radeon Vapor-X HD 7970 GHZ OC 6GB

      AMD 13.2 driver and APP SDK 2.8

       

      I would like to test the memory on the card. If I run "memtestCL 3992 1", I get no errors. When I run it with 3994 or more, I get tons of errors.

       

      Also, as a last resort, I inserted "cout << CL_DEVICE_MAX_MEM_ALLOC_SIZE << endl;" before the clCreateBuffer call, and it printed 4112, which I took to be the maximum allocatable memory (if I understand it correctly).

       

      So, my question: is there a bug in memtestCL, or do I actually have memory errors, in which case it would be better to replace the card?

       

      Also, are there any other ways to test GPU memory, especially on a 6 GB card?

       

      Regards,

      Serhiy.

        • Re: Errors in memtestCL when testing more than 3992 MB
          sdanchenko

          Update:

          I set the environment variable with "set GPU_FORCE_64BIT_PTR=1", following the post at http://devgurus.amd.com/thread/160325, and was able to test up to 4096 MB with no errors. Anything over 4096 MB still gives me tons of errors.

           

          So, I am kind of lost. Is it an AMD driver issue? Is it a hardware issue? Is it a memtestCL issue?

           

          Also, I am using 64-bit Linux.

           

           

          Regards,

          Serhiy

            • Re: Errors in memtestCL when testing more than 3992 MB
              german

              sdanchenko wrote:

               

              Update:

              I set the environment variable with "set GPU_FORCE_64BIT_PTR=1", following the post at http://devgurus.amd.com/thread/160325, and was able to test up to 4096 MB with no errors. Anything over 4096 MB still gives me tons of errors.

               

              So, I am kind of lost. Is it an AMD driver issue? Is it a hardware issue? Is it a memtestCL issue?

               

              "set GPU_FORCE_64BIT_PTR=1" is a key for a new feature that was not intended for end users. With the latest drivers, it may currently cause problems with the generated OpenCL kernels.

              You can only test the amount of memory that the OpenCL runtime reports with the default settings (see my message regarding clinfo).

              Use that value (from "Global memory size", converted to MB) in "memtestCL". You can also run a couple of instances of "memtestCL" at the same time to test more memory on your system.
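
As a rough illustration of running several instances, the total amount to test can be split into per-instance sizes. This is only a sketch: the function names and both size parameters are hypothetical, and the command form simply follows the "memtestCL 3992 1" invocation quoted in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Split total_mb into chunks of at most per_instance_mb, one chunk per
// memtestCL instance. Both values are chosen by the user, in MB.
std::vector<std::uint64_t> split_test_sizes(std::uint64_t total_mb,
                                            std::uint64_t per_instance_mb) {
    assert(per_instance_mb > 0 && "chunk size must be positive");
    std::vector<std::uint64_t> sizes;
    while (total_mb > 0) {
        // The last chunk may be smaller than the others.
        std::uint64_t chunk =
            total_mb < per_instance_mb ? total_mb : per_instance_mb;
        sizes.push_back(chunk);
        total_mb -= chunk;
    }
    return sizes;
}

// Build a command line in the "memtestCL <MB> <iterations>" form.
std::string memtest_command(std::uint64_t size_mb, unsigned iterations) {
    return "memtestCL " + std::to_string(size_mb) + " " +
           std::to_string(iterations);
}
```

Each returned size would be passed to its own memtestCL process, started at the same time as the others. Whether the runtime lets the instances cover different parts of the card's memory is not something this sketch can guarantee.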

            • Re: Errors in memtestCL when testing more than 3992 MB
              german

              sdanchenko wrote:

              Also, as a last resort, I inserted "cout << CL_DEVICE_MAX_MEM_ALLOC_SIZE << endl;" before the clCreateBuffer call, and it printed 4112, which I took to be the maximum allocatable memory (if I understand it correctly).

              That's not correct. That line prints the value of the CL_DEVICE_MAX_MEM_ALLOC_SIZE define itself, not the limit of your device.
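
To make the 4112 concrete: in the Khronos CL/cl.h header the token CL_DEVICE_MAX_MEM_ALLOC_SIZE is defined as 0x1010, which is 4112 in decimal. The sketch below mirrors that value under a hypothetical name so it compiles without the OpenCL SDK. The commented-out part shows how the token is normally passed to clGetDeviceInfo, assuming a cl_device_id has already been selected.

```cpp
#include <cstdint>

// Numeric value of the CL_DEVICE_MAX_MEM_ALLOC_SIZE token in CL/cl.h.
// Mirrored here under a hypothetical name so no SDK is required.
constexpr std::uint32_t kMaxMemAllocSizeToken = 0x1010;

// Streaming the macro to cout prints the token itself as a decimal number.
constexpr std::uint32_t printed_by_cout() { return kMaxMemAllocSizeToken; }

static_assert(printed_by_cout() == 4112, "0x1010 is 4112 in decimal");

// With the SDK available, the real per-allocation limit is queried by
// passing the token to clGetDeviceInfo (not compiled in this sketch):
//
//   cl_ulong max_alloc_bytes = 0;
//   clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
//                   sizeof(max_alloc_bytes), &max_alloc_bytes, nullptr);
```

So the printed 4112 is a query identifier, not a size in MB.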

              Run the "clinfo" program on your system (it is part of the installation). It prints information about the devices supported by the OpenCL runtime on your system (both CPU and GPU). Find the line with "Global memory size" for your GPU device.

              For example:

                          Global memory size:                            1073741824

              That means 1GB.
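
The conversion from the byte count printed by clinfo to the MB value for memtestCL can be sketched as follows. The helper name is hypothetical, and it assumes memtestCL counts 1 MB as 1048576 bytes.

```cpp
#include <cstdint>

// Convert a byte count, as printed by clinfo, to whole MB
// (assuming 1 MB = 1024 * 1024 bytes). Any remainder below 1 MB is
// dropped, so the result never exceeds the reported size.
constexpr std::uint64_t bytes_to_mb(std::uint64_t bytes) {
    return bytes / (1024ULL * 1024ULL);
}
```

For the example value above, bytes_to_mb(1073741824) gives 1024.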

                • Re: Errors in memtestCL when testing more than 3992 MB
                  sdanchenko

                  Hi German,

                   

                  Thank you for responding.

                   

                  This is my clinfo output:

                  $ clinfo | grep mem

                    Max memory allocation:             536870912

                    Global memory size:                 2147483648

                    Local memory type:                 Scratchpad

                    Local memory size:                 32768

                    Unified memory for Host and Device:         0

                    Max memory allocation:             4198122496

                    Global memory size:                 16792489984

                    Local memory type:                 Global

                    Local memory size:                 32768

                    Unified memory for Host and Device:         1

                   

                  This does not make sense, as it shows only 2147483648 / 1073741824 = 2 GB. Most Radeon HD 7970 cards come with 3 GB, and my card is supposed to have 6 GB.

                   

                  PS. Sorry for the late reply: I got sick and did not have access to a computer.