Journeyman III

Random numbers in OpenCL

Does anyone have a simple random number generator for an OpenCL kernel? The links on are both broken. Uploading random numbers from the host is not practical.

20 Replies

I hope people have suggestions. All I can do is a search, but it turned up some possibilities.

A good discussion of multiple algorithms/approaches on Stack Overflow

A PRNGCL implementation with links to source code

MWC64X - Uniform random number generator for OpenCL.

Hope this helps

Adept II


I implemented a GPL-licensed PRNG library for use in scientific simulations nearly two years ago. With the restrictions of OpenCL 1.1 it was one hack after another. It implements a classical Mersenne Twister PRNG, an MTGP PRNG, and a multiply-with-carry generator with a state of 5 uints. The Mersenne Twister PRNG is the fastest, but it does not support a work-group size of 256 work items. The untempered MTGP generator is fast, too. With some optimizations, a few cycles can be saved in the evaluation of the MT generator. The library can output normally and uniformly distributed random numbers. Hopefully that helps you. If I can install a working OpenCL 2.0 driver, I could clean up the whole library. Here is the link:

If you need a PRNG for graphics, then simply use a linear congruential generator or a small MWC once and do some rotates and XORs.
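To make the "LCG plus some rotates and XORs" suggestion concrete, here is a minimal sketch in plain C that should port to a kernel almost unchanged. The function names are made up for illustration, the LCG constants are the common Numerical Recipes pair, and the particular rotate/shift amounts are an arbitrary choice rather than anything recommended in this thread. It is meant for graphics-style noise, not for simulations.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by r bits (0 < r < 32).
   In an OpenCL kernel the built-in rotate() could be used instead. */
uint32_t rotl32(uint32_t x, unsigned r)
{
    return (x << r) | (x >> (32u - r));
}

/* One LCG step (modulus 2^32, Numerical Recipes constants), then a
   rotate-and-XOR scramble of the output. The state is kept per work
   item, e.g. initialised from the global id plus a host-supplied seed. */
uint32_t lcg_scrambled_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    uint32_t x = *state;
    x ^= rotl32(x, 13);   /* illustrative scramble, not a tuned one */
    x ^= x >> 17;
    return x;
}

/* Map a 32-bit value to a float in [0, 1) using its top 24 bits. */
float u32_to_unit_float(uint32_t x)
{
    return (float)(x >> 8) * (1.0f / 16777216.0f);
}
```

Each work item would hold its own `state` in private memory and call `lcg_scrambled_next(&state)` as often as needed. Neighbouring seeds give correlated LCG states, which is why the scramble step is there; for anything statistical, one of the libraries mentioned above is the safer choice.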

Edit: Someone built a simple comparison of some OpenCL PRNGs: mcopik/OpenCLPRNGs · GitHub

-- NaN

Adept II

There are a couple of counter-based RNGs from the Random123 suite (D. E. Shaw Research) implemented in the VexCL library (ddemidov/vexcl · GitHub). The advantage of counter-based generators is that they are stateless, so you don't need to access global memory, which makes them fast on GPUs. If you don't want to depend on a library, you can run a simple simulation, ask VexCL to dump the generated OpenCL kernel to stdout, and then use that kernel. See a simple example here:
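The linked VexCL example is not reproduced here. As a separate illustration of what "stateless" means for a counter-based generator, the sketch below computes each output purely from a (key, counter) pair. It is not Philox or Threefry from Random123: the mixer is the 32-bit finalizer from MurmurHash3, swapped in only because it is short, and its statistical quality is far weaker. The function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit avalanche mixer (the finalizer step of MurmurHash3).
   It is a bijection on 32-bit words. */
uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Counter-based draw: the result depends only on (key, counter).
   Nothing is stored between calls, so no generator state has to be
   loaded from or written back to global memory. */
uint32_t cbrng_u32(uint32_t key, uint32_t counter)
{
    return mix32(counter ^ mix32(key + 0x9e3779b9u));
}
```

In a kernel, one plausible mapping is `key = get_global_id(0)` and `counter = iteration index`, so that every work item can produce its n-th number directly without stepping through the earlier ones.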



Thank you for the info. I’m having a bit of trouble with the code. First, the file “pi.cpp” contains “#include <vexcl/vexcl.hpp>” but I cannot find a header of that name. I see “vexcl<> / CL<> / cl.hpp” (only 11.45 K lines long). In addition, the “Clang” compiler (on a Mac) keeps telling me “OpenCL does not support the 'static' storage class specifier”. Do you know of anyone who is implementing OpenCL on a Mac?


<vexcl/vexcl.hpp> is located here: vexcl/vexcl.hpp at master · ddemidov/vexcl · GitHub.

I don't have much experience with MacOS, but VexCL has been reported to work there. See vexcl/ at master · ddemidov/vexcl · GitHub

<CL/cl.hpp> in the VexCL tree is just a copy of the Khronos-supplied C++ bindings header. If Apple provides its own version of that, you could use it.

> In addition the “Clang” compiler (on a Mac) keeps telling me “OpenCL does not support the 'static' storage class specifier”.

Could you clarify that a bit? What exactly is the source of the problem here?



Thank you ddemidov,

I’m sure the problem is my own ignorance, but I seem to recall reading “cross-platform” in several places, so I naively assumed the code would work on the Mac. I can’t get this stuff running even from the command line. Here are two of the complaints I get from the Clang compiler when I use the command: clang -S -emit-llvm -o sum.ll -x cl sum.c

/Applications/ error:

      invalid conversion between vector type '__v4hi' and integer type '__m64'

      (aka 'long long') of different size

    return (__m64)__builtin_ia32_packuswb((__v4hi)__m1, (__v4hi)__m2);


/Applications/ error:

      OpenCL does not support the 'static' storage class specifier

static __inline__ __m64 __attribute__((__always_inline__, __nodebug__))

Am I supposed to build the kernel “” first or what? Thank you for your patience.

Edit: Upon further search I find that the "static" complaint is coming from a file that is NOT part of the MTGP software but rather a file that is part of GCC. Exactly why Clang is using it is way above my head.


I think you need to use libc++ with clang on MacOS. This works for me (although I am on linux):

clang++ -o pi pi.cpp -std=c++11 -stdlib=libc++ -lc++abi -I${VEXCL_ROOT} -lOpenCL -lboost_system
Adept II

i had good success with Mersenne Twister for Graphic Processors (MTGP)

and Tiny Mersenne Twister (TinyMT)

and TinyMT does not use that much memory.

if you really want a very simple random number generator (there are some issues w.r.t. correlation between some values, see How to get a "random" number in OpenCL - Stack Overflow)

Random Number Generators (using the 16807)

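float get_random_0_1_ul (unsigned int* seed) //uniform between 0-1
{
  // the product needs a 64-bit intermediate; in 32 bits (*seed) * 16807 wraps around before the modulo
  *seed = (unsigned int)(((ulong)(*seed) * 16807) % 2147483647);

  return (float)(*seed) * 4.6566129e-10f; //(4.6566129e-10 = 1/(2^31-1) = 1/2147483647)
}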
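One thing worth checking with this recurrence is the width of the multiplication: seed * 16807 exceeds 32 bits for any seed above roughly 255,000, so the product has to be formed in 64 bits before the modulo is taken. The host-side C sketch below shows the integer step both ways so the difference can be tested; the function names are hypothetical, and `uint64_t` here plays the role that `ulong` would play in kernel code.

```c
#include <stdint.h>

/* One Park-Miller "minimal standard" step:
   seed <- 16807 * seed mod (2^31 - 1), with seed in [1, 2^31 - 2].
   The product is formed in 64 bits so it cannot wrap. */
uint32_t minstd_next(uint32_t seed)
{
    return (uint32_t)(((uint64_t)seed * 16807u) % 2147483647u);
}

/* The same step with the product formed in 32 bits. The multiplication
   wraps modulo 2^32 first, so the sequence is no longer Park-Miller. */
uint32_t minstd_next_32bit(uint32_t seed)
{
    return (seed * 16807u) % 2147483647u;
}

/* Map a state in [1, 2^31 - 2] to a float by scaling with 1/(2^31 - 1). */
float minstd_to_float(uint32_t seed)
{
    return (float)seed * 4.6566129e-10f;
}
```

A standard self-check for the 64-bit version: starting from seed 1, the state after 10,000 steps is 1043618065. The 32-bit variant agrees for small seeds and then diverges once the product overflows.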




Thank you titanius,

That looks like a variation of the Park Miller generator. I’ll have to use that one and the integer version until something more exotic comes along that I can implement. For that reason I will leave the question marked as “unanswered” for now. Thanks again, Bob


you're welcome. yup, its park miller.

i am not sure if you want to write your own or not, but using TinyMT is trivial. once you get the download here

sample is here

what i do in my kernels is

include the file (i put the sources in tinymt-opencl relative to the .cl file)

#include "tinymt-opencl/tinymt32_jump.clh"

inside the kernels

tinymt32j_t tinymt;

tinymt32j_init_jump(&tinymt, (get_global_id(0) + 1 )); //init the rng to something different within each thread

to generate value between 0-1

float val = tinymt32j_single01(&tinymt);

for a uint32

uint val = tinymt32j_uint32(&tinymt);

the documentation of individual functions is a bit sparse but you might get some idea from tinymt32_jump.clh. you can generate values in several float ranges, and it can also return integers



Thanks titanius,

I tried to implement the tinymt and I followed the instructions on

I’m sort of new to OpenCL from the OS X command line. Everything went well except for one complaint: make: *** No rule to make target `../jump/jump32.c', needed by `jump32.o'. Stop.

I’m not sure what to do with that. Any suggestions? Bob


i never ran into that issue, oh maybe because i never tried running the samples

i think those samples may also need this (i am guessing it's for their non-opencl cpu comparison code): TinyMT Jump Function

(point 2 here TinyMT OpenCL Sample)

within your program, one thing i forgot to mention is when i used an #include "tinymt-opencl/tinymt32_jump.clh", i had to add the directory to the opencl include search path

as tinymt-opencl was a subdirectory in the same path as the kernel file, you pass the include path to buildprogram as

status = clBuildProgram(program, 1, deviceid, "-I .", NULL, NULL); // "-I ." adds the current directory of the .cl file
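A relative path such as "-I ." is typically resolved against the working directory of the host process, which is not always where the .cl file lives. A small sketch, under that assumption, of deriving the option string from the kernel file's own path instead; `make_include_option` is a hypothetical helper, not part of OpenCL or TinyMT, and it does not handle directories containing spaces.

```c
#include <stdio.h>
#include <string.h>

/* Writes "-I <dir>" into out, where <dir> is the directory part of
   kernel_path ("." if the path has no directory component).
   Returns 0 on success, -1 if out is too small. */
int make_include_option(const char *kernel_path, char *out, size_t out_size)
{
    const char *slash = strrchr(kernel_path, '/');
    int n;

    if (slash == NULL)
        n = snprintf(out, out_size, "-I .");          /* bare file name */
    else if (slash == kernel_path)
        n = snprintf(out, out_size, "-I /");          /* file in the root directory */
    else
        n = snprintf(out, out_size, "-I %.*s",
                     (int)(slash - kernel_path), kernel_path);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string would be passed as the options argument of clBuildProgram in place of the literal "-I .", so that #include "tinymt-opencl/tinymt32_jump.clh" is found wherever the program is started from.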



Dear titanius,

What do you mean by “buildprogram” ? Bob


Hello Bob

an opencl kernel can either be used in a precompiled form (useful in release) or be compiled every time the program starts (useful while the kernel is being updated and debugged), so there should be some place in your code where you compile it for a particular device. in c, that is done with clBuildProgram, and that is the point where the include path for TinyMT has to be specified. with the c++ bindings the corresponding call is cl::Program::build.

let me try to put some code using the rng and opencl in sometime.


I put a small test program that uses the rng here - Google Drive

there is a vc++ solution, and in order to compile on linux i had to do the following (put in the appropriate opencl library and header paths and it should compile)

g++ -I/opt/AMDAPP/include -lOpenCL -L/opt/AMDADD/lib/x86_64 main.cpp Ocl.cpp -std=c++0x

and it will generate an a.out.

the kernel is in a file called

take a look into the main.cpp for the details. there is a place (line 2 in main.cpp) where i choose the vendor to use.

let me know if you run into any issues



I got 32 errors. Most were

Ocl.cpp:88:5: error: case value evaluates to -49, which cannot be narrowed to type

'cl_uint' (aka 'unsigned int')

starting at “evaluates to -62” and up to “evaluates to -49".

Then the rest were

Ocl.cpp:88:5: error: use of undeclared identifier 'CL_INVALID_PROPERTY'



./ocl_macros.h:78:27: note: expanded from macro 'LOG_OCL_ERROR2'

#define LOG_OCL_ERROR2(x) LOG_OCL_ERROR(x, "")


./ocl_macros.h:72:26: note: expanded from macro 'LOG_OCL_ERROR'


I changed the compiler to the latest I have: g++-4.9.0 -I/opt/AMDAPP/include -lOpenCL -L/opt/AMDADD/lib/x86_64 main.cpp Ocl.cpp -std=c++0x and got around 24 errors, all like

main.cpp: In function ‘int main()’:

ocl_macros.h:72:26: error: ‘CL_INVALID_PROPERTY’ was not declared in this scope

case CL_INVALID_PROPERTY: cout <<"Error : CL_INVALID_PROPERTY"<< endl;break;\


ocl_macros.h:78:27: note: in expansion of macro ‘LOG_OCL_ERROR’

#define LOG_OCL_ERROR2(x) LOG_OCL_ERROR(x, "")


main.cpp:29:9: note: in expansion of macro ‘LOG_OCL_ERROR2’



Keep in mind that I am compiling on the OS X unix command line


maybe it's due to an old version of opencl.h (see ViennaCL / Mailing Lists)

anyway, just comment out (or delete) line 72 in ocl_macros.h, so you don't get the error with g++ 4.9
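An alternative to deleting the line is to guard it, so the same source builds against both older and newer headers. The sketch below is not the code from ocl_macros.h (that file is not shown in this thread); `ocl_error_name` is a hypothetical stand-in for it. It also switches on a signed int, since OpenCL error codes are negative, which is one plausible way around the Clang "cannot be narrowed to type 'cl_uint'" errors reported above.

```c
/* Stand-ins so this sketch compiles without the OpenCL headers.
   In real code these come from <CL/cl.h> (or <OpenCL/opencl.h> on OS X). */
#ifndef CL_SUCCESS
#define CL_SUCCESS 0
#endif
#ifndef CL_INVALID_VALUE
#define CL_INVALID_VALUE -30
#endif

/* Translate an OpenCL status code to its name.
   The parameter is signed because the error codes are negative. */
const char *ocl_error_name(int err)
{
    switch (err) {
    case CL_SUCCESS:          return "CL_SUCCESS";
    case CL_INVALID_VALUE:    return "CL_INVALID_VALUE";
#ifdef CL_INVALID_PROPERTY    /* only compiled in if the header defines it */
    case CL_INVALID_PROPERTY: return "CL_INVALID_PROPERTY";
#endif
    default:                  return "unknown OpenCL error";
    }
}
```

With a header that lacks CL_INVALID_PROPERTY, that case simply drops out and the code falls through to the default branch instead of failing to compile.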



Hi titanius and all,

I finally got tinymt working in Xcode after much hacking. Let me explain where the problem lay and how I fixed it. First Apple has this slick, but not very compatible method of compiling from Xcode (NOT from the command line like the rest of the community). You first compile the kernel by selecting it and press command-B to “build” it. Then you select the host and do the same and finally command-R to “run” the whole project.

At the first step the compiler complained about the use of “static” and “cl_uint” in the tinymt32.clh header and in all the headers used by tinymt32.clh. After I removed all the static’s and changed cl_uint to int, I was able to generate random numbers.

I don’t know how much damage I have done in those hacks so now I’m trying to generate many random numbers in each kernel instance and ship them back to the host where I can check their correlation, etc.
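For the host-side check, one simple first statistic is the lag-1 autocorrelation of the values read back from the device. The sketch below is a minimal version of that test; the function name is hypothetical, and a single statistic like this can only flag gross damage, it cannot certify a generator.

```c
#include <stddef.h>

/* Lag-1 sample autocorrelation of x[0..n-1].
   Returns 0 if n < 2 or if all values are identical. */
double lag1_autocorrelation(const float *x, size_t n)
{
    if (n < 2)
        return 0.0;

    double mean = 0.0;
    for (size_t i = 0; i < n; ++i)
        mean += x[i];
    mean /= (double)n;

    double num = 0.0, den = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = x[i] - mean;
        den += d * d;                      /* variance term */
        if (i + 1 < n)
            num += d * (x[i + 1] - mean);  /* covariance with the next value */
    }
    return den > 0.0 ? num / den : 0.0;
}
```

For independent uniform values the result should be near zero, with a spread of roughly 1/sqrt(n). Values close to +1 or -1 point to a stuck or alternating state. Running it both on the stream of a single work item and across work items at the same step would cover the two directions in which the edited headers could have introduced correlation.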

Now I have a different problem of reading and writing large chunks of global memory into the work units but I believe that is out of the scope of this discussion so I shall bid you all farewell (for now) and thanks for all your help and patience.

I salute AMD for hosting such a useful and easily accessible forum. rsacker


Everyone: This topic is a perfect example of why forums are cool. I wish they all had this good a discussion.   Thanks to all who are chipping in.


Another entry in the random number generator discussion:


AMD is collaborating with the open source community to create an OpenCL library to generate random numbers.