I am developing an OpenCL ODE solver library and have run into the following problem. I hope someone can provide some input.
My system (Windows 8.1 x64) has a FirePro W8100 in the first PCI slot and a Radeon HD 7970 GHz Edition in the second PCI slot. I have installed the latest driver for each card:
W8100 : 14.502.1019-WHQL-FirePro-WindowsX32X64
7970 : AMD-Catalyst-15.6-Beta-Software-Suite-Win8.1-64Bit-June22
The OpenCL driver for both devices is 1642.5 (OpenCL 1.2).
I am using Visual Studio 2013 Pro to build my OpenCL programs, and I compile them for both x86 and x64 targets.
When I run the executables on the 7970, there is no difference between the x86 and x64 versions: both have the same execution time, and CodeXL shows that the kernel uses the same number of VGPRs and SGPRs.
The problem appears when I run the executables on the W8100: the x64 version takes considerably more time to run, and the kernel uses more VGPRs and SGPRs than in the x86 version.
Any idea why this happens and how I can fix it?