Edit, March 2011: I've updated the code quite a bit. The generator now works correctly with SDK 2.3 and SDK 2.4 rc1, and it should be much simpler to use. The newest version can be found on Bitbucket; just grab the newest zip file. As before, there's also a program that verifies that the generator produces correct sequences, measures performance, and shows how to use the generator.
Hello. This is a piece of code I've developed for my own use, because I couldn't really find any good pseudorandom number generators for OpenCL that were simple to implement and understand. I'm sharing it so that others who just want a good PRNG can use it, without having to spend time porting a generator themselves. The generator is meant to be used with one copy per work-item. State data is kept in 7 float4 vectors.
The algorithm was developed by Martin Lüscher, while the Fortran 77 implementation this code is based on was developed by Fred James. The two relevant papers are:
Martin Lüscher, A portable high-quality random number generator for lattice field theory simulations, Computer Physics Communications 79 (1994) 100-110
F. James, RANLUX: A Fortran implementation of the high-quality pseudorandom number generator of Lüscher, Computer Physics Communications 79 (1994) 111-114
The code is licensed under the MIT license. If you do find it useful or interesting I'd love to hear about it :-). Also, if anyone thinks there is a problem with it please let me know, as I'm planning to use it myself for Monte Carlo simulations.
Usage of the generator is described in comments in the code, and I think the example code shows pretty clearly how to use the generator. The example code can also check the correctness of the implementation against values generated by the original Fortran 77 code.
To see a list of the arguments just run prngtest.exe. To check performance and correctness of a GPU at luxury level 4 you'd run "prngtest.exe 4 1".
RANLUX is an "old" (proposed in 1994) PRNG. It has been used extensively, and to the best of my knowledge no defects have been found at luxury level 2 and above (which lines up perfectly with the inventor's predictions).
The generator has a few notable features:
All calculations are in 32-bit floating point. The algorithm generates a number between 0 and 1 directly.
It is possible to select a "luxury level", where a higher level means better-quality numbers at lower speed. Luxury levels 0-4 are presented here, 4 being the best.
RANLUX is also one of very few PRNGs (the only one I know of, actually) that have an underlying theory explaining why it generates "random" numbers, and why they are good. Indeed, if the theory is correct (and I don't know of anyone who has disputed it), RANLUX at the highest luxury level produces completely decorrelated numbers down to the last bit, with no long-range correlations as long as we stay well below the period (10^171). Most other generators (Mersenne Twister, KISS etc.) can say very little about their quality. They must rely on passing statistical tests.
It is fast: at luxury level 0 it generates more than 20*10^9 numbers per second on a Cypress GPU, and at luxury level 4 more than 4*10^9 numbers per second.
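To make the "luxury level" idea a bit more concrete, here is a minimal host-side C sketch of the subtract-with-borrow recurrence that RANLUX is built on, x_n = x_(n-10) - x_(n-24) - carry (mod 1) on multiples of 2^-24, together with the skipping that the luxury level controls. This is not the OpenCL code from the zip file. The names ranlux_state, ranlux_init and ranlux_next are made up for illustration, the state is kept as plain floats rather than packed into float4 vectors, and the seeding uses a simple LCG instead of the seeding in James' Fortran code, so the output will not match the reference values. The skip counts assume the values p = 24, 48, 97, 223 and 389 for luxury levels 0-4 given in James' paper.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch only: all names here are hypothetical and are not
   taken from the OpenCL implementation described in this post. */

#define RANLUX_R 24                 /* long lag; the short lag is 10 */

typedef struct {
    float seeds[RANLUX_R];          /* last 24 values, multiples of 2^-24 in [0,1) */
    float carry;                    /* either 0 or 2^-24 */
    int   i24, j24;                 /* positions of x_(n-24) and x_(n-10) */
    int   in24;                     /* numbers delivered since the last skip */
    int   nskip;                    /* numbers thrown away after every 24 delivered */
} ranlux_state;

static const float TWOM24 = 1.0f / 16777216.0f;   /* 2^-24 */

/* One subtract-with-borrow step; every operation is exact in 32-bit floats. */
static float ranlux_step(ranlux_state *st)
{
    float uni = st->seeds[st->j24] - st->seeds[st->i24] - st->carry;
    if (uni < 0.0f) {
        uni += 1.0f;                /* wrap around, remember the borrow */
        st->carry = TWOM24;
    } else {
        st->carry = 0.0f;
    }
    st->seeds[st->i24] = uni;       /* overwrite the oldest value */
    st->i24 = (st->i24 == 0) ? RANLUX_R - 1 : st->i24 - 1;
    st->j24 = (st->j24 == 0) ? RANLUX_R - 1 : st->j24 - 1;
    return uni;
}

/* Assumption: LCG seeding chosen for brevity, NOT the seeding of the
   original Fortran code. */
void ranlux_init(ranlux_state *st, int lux, uint32_t seed)
{
    /* p - 24 for p = 24, 48, 97, 223, 389 (luxury levels 0..4) */
    static const int skip[5] = {0, 24, 73, 199, 365};
    assert(lux >= 0 && lux <= 4);

    uint32_t x = seed;
    for (int k = 0; k < RANLUX_R; ++k) {
        x = x * 1664525u + 1013904223u;           /* 32-bit LCG */
        st->seeds[k] = (float)(x >> 8) * TWOM24;  /* top 24 bits -> [0,1) */
    }
    st->carry = 0.0f;
    st->i24   = RANLUX_R - 1;       /* 0-based index 23 */
    st->j24   = 9;                  /* 0-based index 9, ten steps behind */
    st->in24  = 0;
    st->nskip = skip[lux];
}

/* Deliver one number; after every 24 delivered, discard nskip more. */
float ranlux_next(ranlux_state *st)
{
    float u = ranlux_step(st);
    if (++st->in24 == RANLUX_R) {
        st->in24 = 0;
        for (int k = 0; k < st->nskip; ++k)
            (void)ranlux_step(st);
    }
    return u;
}
```

A caller would fill one ranlux_state with ranlux_init(&st, 4, seed) and then call ranlux_next(&st) repeatedly. At level 0 nothing is skipped, while at level 4 this sketch discards 365 values for every 24 it delivers, which is where the speed difference between the levels comes from. The sketch also leaves out the extra handling of very small values that the Fortran code has, so it can return exactly 0.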