14 Replies Latest reply on Jul 18, 2015 3:50 AM by elavram

    W8100 is slower in x64 executable - uses more VGPRs and SGPRs


      Hello everybody,


      I am developing an OpenCL ODE solver library and have run into the following problem. I hope someone can provide some input.


      My system (Windows 8.1 x64) has a FirePro W8100 (first PCI slot) and a 7970 GHz Edition (second PCI slot). I have installed the latest drivers for both cards.


      W8100 : 14.502.1019-WHQL-FirePro-WindowsX32X64

      7970 : AMD-Catalyst-15.6-Beta-Software-Suite-Win8.1-64Bit-June22


      The OpenCL driver for both devices is 1642.5 (OpenCL 1.2).


      I am using Visual Studio 2013 Pro to compile my OpenCL programs. I compile the programs for x86 and x64 targets.
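When comparing the two builds, it can help to confirm which target each executable was actually compiled for. The following is a minimal sketch in plain C; the function name `host_pointer_bits` is hypothetical and is not part of OpenCL or of the library described here.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns the host pointer width in bits.
 * A typical x86 build yields 32 and a typical x64 build yields 64. */
int host_pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

Printing this value at startup, next to the selected device name, makes it harder to mix up the x86 and x64 profiling runs.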


      When I run the executables on the 7970, there is no difference between the x86 and x64 versions: both have the same execution time, and CodeXL shows that the kernel uses the same number of VGPRs and SGPRs in each.


      The problem appears when I run the executables on the W8100. The x64 version takes considerably more time to run, and CodeXL shows that the kernel uses more VGPRs and SGPRs than in the x86 version.
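One reason a higher VGPR count can matter is wavefront occupancy. The sketch below estimates how many wavefronts can be resident per SIMD for a given VGPR count. It assumes the usual GCN figures (256 VGPRs available per SIMD, at most 10 wavefronts per SIMD, VGPRs allocated in multiples of 4), which should be checked against AMD's documentation for the specific device. It ignores SGPR and LDS limits, the function name is hypothetical, and it is not a claim that occupancy is what causes the slowdown described here.

```c
/* Assumed GCN limits; verify against AMD's documentation for your device. */
enum {
    GCN_VGPRS_PER_SIMD     = 256, /* VGPRs available to each SIMD lane     */
    GCN_MAX_WAVES_PER_SIMD = 10,  /* hardware cap on resident wavefronts   */
    GCN_VGPR_GRANULE       = 4    /* VGPRs are allocated in blocks of 4    */
};

/* Hypothetical helper: wavefronts per SIMD as limited by VGPR usage only.
 * Returns 0 if the count exceeds what a single wavefront can be given. */
int gcn_waves_per_simd(int vgprs)
{
    if (vgprs <= 0)
        return GCN_MAX_WAVES_PER_SIMD;   /* no VGPR pressure at all */
    if (vgprs > GCN_VGPRS_PER_SIMD)
        return 0;                        /* not schedulable */

    /* Round up to the allocation granule, then see how many copies fit. */
    int allocated = (vgprs + GCN_VGPR_GRANULE - 1)
                    / GCN_VGPR_GRANULE * GCN_VGPR_GRANULE;
    int waves = GCN_VGPRS_PER_SIMD / allocated;

    return waves > GCN_MAX_WAVES_PER_SIMD ? GCN_MAX_WAVES_PER_SIMD : waves;
}
```

Feeding in the VGPR counts that CodeXL reports for the x86 and x64 kernels shows whether the extra registers push the kernel into a lower occupancy tier. Under these assumptions, for example, going from 64 to 65 VGPRs drops the estimate from 4 to 3 wavefronts per SIMD.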


      Does anyone know why this happens and how I can fix it?


      Many thanks!!!


      CodeXL screenshots:


      7970_x86 https://www.dropbox.com/s/wktzusnnltg4nee/7970_x86.png?dl=0

      7970_x64 https://www.dropbox.com/s/sr40dwx61s0v7c8/7970_x64.png?dl=0

      W8100_x86 https://www.dropbox.com/s/s312pflf9puoszc/W8100_x86.png?dl=0

      W8100_x64 https://www.dropbox.com/s/5ulngea7retxxgy/W8100_x64.png?dl=0