
joohongyee
Journeyman III

The RANLUX pseudorandom number generator implemented in OpenCL

Unfortunately, I was not able to figure out how to fix those two problems. Do you have any progress? 

himanshu_gautam
Grandmaster

joohongyee,

I am not familiar with the RANLUX algorithm, so someone else may be able to guide you better. In any case, the OpenCL spec describes the invalid work-group size error as follows:

CL_INVALID_WORK_GROUP_SIZE if local_work_size is specified and number of work-items specified by global_work_size is not evenly divisible by size of work-group given by local_work_size or does not match the work-group size specified for kernel using the __attribute__((reqd_work_group_size(X, Y, Z))) qualifier in program source.

 

It may be helpful.
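To illustrate the divisibility rule quoted above, here is a minimal host-side sketch in C. The helper name `round_up_global_size` is hypothetical and is not taken from the code posted in this thread; it only shows one common way to make the global size a multiple of the chosen work-group size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round global_size up to the next multiple of
   local_size, so that the work-group size divides the global size evenly. */
static size_t round_up_global_size(size_t global_size, size_t local_size)
{
    assert(local_size > 0); /* a work-group must contain at least one work-item */
    size_t remainder = global_size % local_size;
    return remainder == 0 ? global_size
                          : global_size + (local_size - remainder);
}
```

If the global size is padded this way, the kernel launches extra work-items, so it would need a bounds check against the real element count. Alternatively, passing NULL as local_work_size to clEnqueueNDRangeKernel lets the implementation choose a work-group size itself. Note that the quoted condition is only one cause of this error; a kernel compiled with reqd_work_group_size must also be launched with exactly that size.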

joohongyee
Journeyman III

Thank you so much for your comment. I will look for help elsewhere, and if I get it working I will post the solution here.

joohongyee
Journeyman III

Hi Himanshu,

I ran the code on an NVIDIA GPU machine and it works fine. However, the ATI machine still produces errors. As you indicated, it might be a driver/SDK issue. I have heard that NVIDIA GPUs are more complex than ATI ones, but on the Mac I get more errors with the ATI machine. I hope this problem will be solved in the future. I really appreciate your comments and help.

Joohong

 

himanshu_gautam
Grandmaster

Joohongyee,

Are you trying this on a MacBook? The OpenCL implementation for MacBooks is written and maintained by Apple.

MicahVillmow
Staff

joohongyee,
We do not support the Apple platform. Please contact Apple about any issues with their OpenCL implementation.

dravisher
Journeyman III

Originally posted by: joohongyee

Hi Himanshu,

I ran the code on an NVIDIA GPU machine and it works fine. However, the ATI machine still produces errors. As you indicated, it might be a driver/SDK issue. I have heard that NVIDIA GPUs are more complex than ATI ones, but on the Mac I get more errors with the ATI machine. I hope this problem will be solved in the future. I really appreciate your comments and help. Joohong

 

Thank you for your interest and comments joohongyee. I've found out that with the latest SDK (AMD APP SDK 2.3) I get wrong (but still seemingly pseudo-random) sequences on my HD 5870. My CPU still works correctly though. Using the older ATI Stream SDK 2.0.1 with Catalyst 10.1 my GPU is producing correct results, and since you got it to work on an Nvidia machine it sounds like there's a problem with the more recent AMD implementations only.

I'll have to do some more tests to figure out which SDK/driver version introduced the problem. It seems like a very difficult bug to find, though: the algorithm runs fine for a while, then suddenly diverges from the known good numbers, yet it continues generating seemingly pseudorandom numbers.

himanshu_gautam
Grandmaster

Hi dravisher,

Did you find out what is causing this issue? Is it a precision issue or an implementation bug?

Is the issue reproducible from the code you posted in the first post of this thread?

 

dravisher
Journeyman III

What I've found out is that with my HD 5870:

SDK 2.0.1 with Catalyst 10.1 works

SDK 2.1 with Catalyst 10.4 works

SDK 2.2 with Catalyst 10.8 works

SDK 2.3 with Catalyst 11.1 produces wrong numbers after a while.

I've also tried it on an NVIDIA T10 that I have access to at the university, and it too generates the correct numbers.

The problem does show up in the program I posted. For example, running it as "prngtest.exe 0 1 1" or "prngtest 4 1 1" will show that the last numbers checked are incorrect. Running on the CPU with "prngtest.exe 0 0 1" works as expected. It can also be seen by printing the PRNs array in the program or writing it to a file: the CPU (at least on my computer) produces correct results, while the GPU starts generating different values after a while. I've checked this against 10000 values from the Fortran implementation: my OpenCL implementation generates the sequence correctly on the CPU, and on the GPU except with the newest SDK.

My guess would be that there is a (small) calculation error at some point which then causes the generated sequence to diverge. Even an error in the least significant bit would do it, I think. As I understand it, this would be a bug in the SDK, since I'm only using addition and multiplication, which should be correctly rounded according to the specification.

Since the error seems to happen at the same place each time, I'll try to see if I can isolate the exact operation that's failing.
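One way to narrow this down is to compare the GPU output against the known good values bit for bit and report the first index where they differ. The following is only a sketch in C, assuming the generated values are 32-bit floats stored in plain arrays; the function name `first_mismatch` is hypothetical and is not part of the posted program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the index of the first element whose bit
   pattern differs between the two arrays, or n if all n elements match.
   Assumes float is a 32-bit type. Comparing bit patterns catches even a
   difference in the least significant bit. */
static size_t first_mismatch(const float *reference, const float *candidate,
                             size_t n)
{
    assert(reference != NULL && candidate != NULL);
    for (size_t i = 0; i < n; ++i) {
        uint32_t a, b;
        memcpy(&a, &reference[i], sizeof a); /* copy the raw bits of each value */
        memcpy(&b, &candidate[i], sizeof b);
        if (a != b)
            return i;
    }
    return n;
}
```

Knowing the index of the first wrong value tells you how many values were generated correctly before the divergence, which should help pin down the operation to inspect.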

MicahVillmow
Staff

dravisher,
If you can provide a small test case that hits the issue, we can get it fixed for SDK 2.4.