I am not familiar with Mac, but the problem seems to be only with the OpenCL (.cl) files, since the C++ compiles and runs as expected. The errors are a bit confusing, though. What changes did you make here? Judging by the errors, it looks as if the contents of ranlux_unified_1.0.0.cl are being included at the end of the code instead of the beginning.
Just so we're on the same page: you should either leave all the code as it is, or try pasting the contents of ranlux_unified_1.0.0.cl at the top of the PRNGTest_kernels_1.9.cl file (that is, replace the first line, #include "ranlux_unified_1.0.0.cl", with the contents of the ranlux_unified_1.0.0.cl file).
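For anyone who would rather not edit the .cl file by hand, the same substitution can be done on the host before the program is built. The following is only a sketch: readFile and inlineInclude are hypothetical helper names, not part of the posted code, and it assumes the kernel source is loaded into a std::string before being handed to OpenCL. Another option, where the implementation supports it, is to pass the standard -I <dir> build option so the OpenCL compiler can resolve the #include itself.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a whole text file into a string.
std::string readFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    std::ostringstream ss;
    ss << in.rdbuf();
    return ss.str();
}

// Hypothetical helper: replace the first `#include "<includeName>"`
// directive in kernelSource with includeBody. If the directive is not
// found, the source is returned unchanged.
std::string inlineInclude(const std::string& kernelSource,
                          const std::string& includeName,
                          const std::string& includeBody) {
    assert(!includeName.empty());
    const std::string directive = "#include \"" + includeName + "\"";
    const std::size_t pos = kernelSource.find(directive);
    if (pos == std::string::npos) return kernelSource;
    std::string out = kernelSource;
    out.replace(pos, directive.size(), includeBody);
    return out;
}
```

A call such as inlineInclude(readFile("PRNGTest_kernels_1.9.cl"), "ranlux_unified_1.0.0.cl", readFile("ranlux_unified_1.0.0.cl")) would then produce a single source string with the RANLUX code at the top, which is the same result as the manual paste.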
According to what you posted, the program generated 20*10^9 numbers/second on Cypress. Can you tell me which GPU that is? I tested an Nvidia 9400M (16 cores, 450 MHz core clock) on a Mac with "./test.exe 0", and the result was only about 200 million numbers. That is far lower than your result.
As nou said, Cypress = HD 5870.
Note that running ./test.exe 0 will run on the CPU. Run ./test.exe 0 1 (where the last 1 is setting useGPU to true) to check GPU performance. I know that the Nvidia GTX480 is slightly faster (something like 20%) than Cypress on this code, which is not very surprising since it's somewhat linear/not vectorized. It should perform decently on Nvidia hardware.
Thank you for your reply.
I have always tested with ./test.exe 0 1 1 1 1; I only wrote ./test.exe 0 to keep it short. Your code is really well organized and easy to understand. A GPU with 1600 stream processors? Wow, that's amazing. I hope I can use this code in my research project. Thanks a lot.
I generated numbers with ./test.exe 0 1 1 1 1 and then printed out the PRNs. From 1 to 153599 they are random numbers between 0 and 1. However, from 153600 to 614399 they are all zeros. From 614400 to 767998 they are random numbers between 0 and 1 again, but from that point to 1228800 they are all zeros. This pattern repeats up to 24999999. I don't know why the zeros keep repeating. If you know the reason, please let me know. The code is unchanged except that the contents of ranlux_unified_1.0.0.cl were pasted at the top of PRNGTest_kernels_1.9.cl, and it was compiled on a Mac. Thanks for reading.
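The boundaries reported above line up with multiples of 153600 (614400 = 4 × 153600, 1228800 = 8 × 153600), which suggests that roughly one block in four is being written. To pin the pattern down exactly rather than by eye, the output buffer can be scanned for runs of zeros. The sketch below assumes the generated numbers have been copied back into a std::vector<float>; findZeroRuns is a hypothetical name and is not part of the test program.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Return the inclusive [first, last] index pairs of every run of exact
// zeros in v that is at least minRun elements long.
std::vector<std::pair<std::size_t, std::size_t>>
findZeroRuns(const std::vector<float>& v, std::size_t minRun) {
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    std::size_t i = 0;
    while (i < v.size()) {
        if (v[i] != 0.0f) { ++i; continue; }
        const std::size_t start = i;
        while (i < v.size() && v[i] == 0.0f) ++i;  // advance to end of run
        if (i - start >= minRun) runs.emplace_back(start, i - 1);
    }
    return runs;
}
```

Printing the returned pairs with a minRun of, say, 1000 would show whether the zero blocks start and end on exact multiples of the work size, which is useful information to include in a bug report.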
Thanks for your comment. I have just discovered some weirdness when I run the code on the GPU, even failing the correctness check on my GPU (Cypress) now. Very strange, since it seemed to work correctly on the GPU in the past with previous drivers/SDKs. Also it still seems to work correctly on the CPU.
I'll try to figure it out over Christmas. It's a very weird problem.
I have tested the random number generation on the CPU. However, the test failed.
What I ran was:
./test.exe 0 0 1 0 0
The result was:
RANLUX illegal initialization seed: 0. RANLUX initialized using default seed instead: 314159265
Building OpenCL program: Done
ERROR (-54): CommandQueue::enqueueNDRangeKernel(): Invalid work group size
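For reference, -54 is CL_INVALID_WORK_GROUP_SIZE in the OpenCL headers. In OpenCL 1.x it is returned by enqueueNDRangeKernel when the local work size does not evenly divide the global work size, or when the local work size exceeds the limit the device reports for that kernel (CL_KERNEL_WORK_GROUP_SIZE, which on some CPU implementations may be as small as 1). Whether either is the cause here is not established in this thread. The sketch below only illustrates the usual arithmetic for making the two sizes compatible; roundUpToMultiple and clampLocalSize are hypothetical helpers, and the limit is assumed to have been queried from the device beforehand.

```cpp
#include <cassert>
#include <cstddef>

// Round globalSize up to the nearest multiple of localSize so that the
// local size evenly divides the global size.
std::size_t roundUpToMultiple(std::size_t globalSize, std::size_t localSize) {
    assert(localSize > 0);
    const std::size_t remainder = globalSize % localSize;
    return remainder == 0 ? globalSize : globalSize + (localSize - remainder);
}

// Clamp the requested local size to the maximum work-group size that the
// device reports for the kernel (assumed to be queried elsewhere).
std::size_t clampLocalSize(std::size_t requested, std::size_t maxWorkGroupSize) {
    assert(maxWorkGroupSize > 0);
    return requested < maxWorkGroupSize ? requested : maxWorkGroupSize;
}
```

If the global size is padded this way, the kernel needs a bounds check so that the extra work items do not write past the end of the output buffer.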