W8100 is slower in x64 executable - uses more VGPRs and SGPRs

Question asked by elavram on Jul 4, 2015
Hello everybody,


I am developing an OpenCL ODE solver library and have run into the following problem. I hope someone can provide some input.


My system (Windows 8.1 x64) has a FirePro W8100 in the first PCI slot and a 7970 GHz Edition in the second PCI slot. I have installed the latest driver for each card.


W8100 : 14.502.1019-WHQL-FirePro-WindowsX32X64

7970 : AMD-Catalyst-15.6-Beta-Software-Suite-Win8.1-64Bit-June22


The OpenCL driver version for both devices is 1642.5 (OpenCL 1.2).


I am using Visual Studio 2013 Pro to compile my OpenCL programs. I compile the programs for x86 and x64 targets.


When I run the executables on the 7970, there is no difference between the x86 and x64 versions: both have the same execution time, and CodeXL shows that the kernel uses the same number of VGPRs and SGPRs in both cases.


The problem appears when I run the executables on the W8100. The x64 version takes considerably more time to run, and its kernel uses more VGPRs and SGPRs than the x86 version.


Any idea why this happens and how I can fix it?


Many thanks!!!


