I've now tried it on another machine (Vista 64 with a 6950), this time with the 32-bit cl.exe from VS2010 Express and the 64-bit cl.exe from Windows SDK 7.1. Unfortunately the problem with compiling compute shaders for recentish cards still seems to be there. The command line I used was
cl /EHsc main.cpp /I "\Program Files (x86)\ATI Stream\include" /link /LIBPATH:"\Program Files (x86)\ATI Stream\lib" aticalcl.lib
for the 32-bit version and a similar one for the 64-bit version.
The error one receives from the program in the troublesome cases is:
Err: 1, Error Creating program info
So it seems that the card in the machine doesn't affect the issue. Perhaps there is a bug somewhere.
Any advice much appreciated,
Just in case anyone else is interested, it seems that compute shaders for cypress/cayman will in fact compile if the number of threads per group is explicitly declared. So:
will in fact work. This suggests to me that there is an issue with the compiler in handling arbitrary-size thread groups in a compute shader. The strange thing is it manages okay with a pixel shader.
For a compute shader, the number of threads per group must be declared. If you don't know the size at compile time, use dcl_max_thread_per_group N to declare the largest group size that will be executed.
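As a rough illustration of the advice above, here is a minimal C++ sketch that builds the IL source string with the declaration in place before it is handed to the compiler. The helper name make_compute_shader is hypothetical and not part of CAL, and the il_cs_2_0 version line and the closing end are assumptions about the kernel layout; only dcl_max_thread_per_group comes from this thread.

```cpp
#include <string>

// Hypothetical helper, not part of CAL: wraps an IL kernel body in a
// compute shader header that declares an upper bound on threads per group.
std::string make_compute_shader(const std::string& body, unsigned max_threads) {
    std::string src = "il_cs_2_0\n";  // assumed IL compute shader version line
    // Declaration suggested in this thread for group sizes unknown at compile time.
    src += "dcl_max_thread_per_group " + std::to_string(max_threads) + "\n";
    src += body;                      // caller-supplied IL instructions, one per line
    src += "end\n";                   // assumed kernel terminator
    return src;
}
```

The resulting string would then be passed to the compile call in main.cpp in place of the original kernel text, with N set to the largest group size the program will actually launch.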
Thanks for your reply. The dcl_max_thread_per_group N did the trick. I didn't realise this had to be fixed on the newer cards. The SKA was confusing me in that it doesn't seem to need that line to successfully compile for the newer architectures.