You can use atomic functions, but they are available only for 32-bit and 64-bit integers.
Before using them, check whether the device supports the relevant extensions. You can get this information from clGetDeviceInfo.
See section 9.5 of the OpenCL specification, revision 48, for more information.
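A rough sketch of that check, in plain C. It assumes you have already fetched the space-separated extension string with clGetDeviceInfo and CL_DEVICE_EXTENSIONS; the helper name has_extension is made up for this example, and the OpenCL call itself is left out so the snippet compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears as a whole, space-separated token in
 * `extensions`. On a real device the string would come from
 * clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...). */
bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is not part of a longer name. */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += 1;
    }
    return false;
}
```

You would then call it with names such as "cl_khr_global_int32_base_atomics" or "cl_khr_int64_base_atomics" before relying on the atomic functions in a kernel.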
Thanks for your quick reply,
Unfortunately, there are several output variables:
- 3 floats
- an integer
So there is no way to ensure that they are not overwritten?
No. I think atomic instructions are atomic only within one device, not across multiple devices.
Is there no way to prevent concurrent access to the same variables, even on a single device?
And is it only possible for "int", not for float?
The problem is that I must do several operations like this:
if ( Hit < t )
{
    Hit = t;
    Uc = u;
    Vc = v;
}
I see no way of doing this without risk !
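One idea that is sometimes used for this kind of "compare, then update several fields" step is to pack everything that decides the winner into a single 64-bit integer and update only that value atomically. The sketch below is CPU-side C11 (stdatomic), not OpenCL C, and rests on assumptions that are not from this thread: t is never negative or NaN, each candidate has an index (for example the face number), and u and v can be looked up or recomputed from that index afterwards. The names pack_hit and atomic_keep_max are invented for the example. In a kernel the equivalent would be a 64-bit compare-and-swap, which needs the 64-bit atomics extension mentioned at the top of the thread.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Pack t (assumed >= 0 and not NaN) and a candidate index into one key.
 * For non-negative IEEE-754 floats, the bit pattern sorts like the value,
 * so comparing keys compares t first and the index second. */
uint64_t pack_hit(float t, uint32_t index)
{
    uint32_t bits;
    memcpy(&bits, &t, sizeof bits);
    return ((uint64_t)bits << 32) | index;
}

float unpack_t(uint64_t key)
{
    uint32_t bits = (uint32_t)(key >> 32);
    float t;
    memcpy(&t, &bits, sizeof t);
    return t;
}

uint32_t unpack_index(uint64_t key)
{
    return (uint32_t)(key & 0xFFFFFFFFu);
}

/* Atomically keep the larger key, mirroring `if (Hit < t) Hit = t;`. */
void atomic_keep_max(_Atomic uint64_t *best, uint64_t candidate)
{
    uint64_t seen = atomic_load(best);
    while (seen < candidate &&
           !atomic_compare_exchange_weak(best, &seen, candidate)) {
        /* A failed exchange refreshes `seen`; the loop then re-tests it. */
    }
}
```

Once all candidates are in, unpack_index tells you which one won, and Uc and Vc can be taken from that candidate. Whether this is fast enough with many threads hitting the same variable is a separate question.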
Well, OpenCL is meant for massive parallelism with hundreds of threads, so any locking is highly undesirable. You will have to find another approach to your problem. For example, if you want to compute ray-triangle intersections, you can map your problem onto a 2D matrix: run N threads, where N is the number of rays, and have each one compute the hits against the triangles in the scene.
Thanks for your advice. I completely agree with it... but...
The problem is that we have been working on this software for 3 years. Doing this would be a solution, but we would have to translate the whole software to OpenCL, simply because we trace some rays, then do some "processing", and then send out more rays in different directions. So we would have to do some work in .NET, then switch to OpenCL, then switch back to .NET...
What I would like is a "search" method that uses the GPU/CPU power: when we have to search for an intersection among 1 million faces, I would like to get only 1 result.
One solution would be to create an "output" array in which each "thread" puts its result... but
1 - I have to reserve a big array (1 million x 4 floats)
2 - I have to search this array for the best result
I'm not sure it is very efficient...
But I don't see any other solution :-(
And must you search for the intersection of only one ray at a time? What about finding the intersections of hundreds of rays with millions of triangles? Then each thread can compute the intersections in a loop and output only the best solution for its ray.
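To make the "one thread per ray, loop over the triangles" idea concrete, here is a plain C sketch of what a single work-item would do. It is not taken from anyone's renderer: the Moller-Trumbore test is used as a stand-in because the thread does not say which intersection routine is in use, and the types and function names are invented. The comparison keeps the larger t, as in the snippet posted earlier; flip it if you want the closest hit.

```c
#include <stdbool.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 a, b, c; } Triangle;
typedef struct { float t, u, v; int face; } Hit;   /* face == -1: no hit */

static Vec3 sub(Vec3 p, Vec3 q) { return (Vec3){ p.x - q.x, p.y - q.y, p.z - q.z }; }
static float dot(Vec3 p, Vec3 q) { return p.x * q.x + p.y * q.y + p.z * q.z; }
static Vec3 cross(Vec3 p, Vec3 q)
{
    return (Vec3){ p.y * q.z - p.z * q.y, p.z * q.x - p.x * q.z, p.x * q.y - p.y * q.x };
}

/* Moller-Trumbore ray/triangle test. Fills *out (t, u, v) only on a hit. */
static bool intersect(Vec3 orig, Vec3 dir, Triangle tri, Hit *out)
{
    Vec3 e1 = sub(tri.b, tri.a);
    Vec3 e2 = sub(tri.c, tri.a);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (det > -1e-8f && det < 1e-8f)
        return false;                      /* ray is parallel to the triangle */
    float inv = 1.0f / det;
    Vec3 s = sub(orig, tri.a);
    float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f)
        return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f)
        return false;
    float t = dot(e2, q) * inv;
    if (t <= 0.0f)
        return false;                      /* triangle is behind the origin */
    out->t = t; out->u = u; out->v = v;
    return true;
}

/* What one work-item would do: scan every triangle for one ray and keep
 * a single result, so no other thread ever writes to `best`. */
Hit trace_ray(Vec3 orig, Vec3 dir, const Triangle *tris, int count)
{
    Hit best = { 0.0f, 0.0f, 0.0f, -1 };
    for (int i = 0; i < count; ++i) {
        Hit h;
        if (intersect(orig, dir, tris[i], &h) && (best.face < 0 || best.t < h.t)) {
            h.face = i;
            best = h;
        }
    }
    return best;
}
```

Because each ray has its own result slot, the Hit/Uc/Vc update needs no atomics at all; the cost is that a single thread walks the whole triangle list.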
Even then, there is a technique called a reduction kernel, which is good for finding the one best value in a large array of values.
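For what it is worth, the reduction idea looks roughly like this when written as ordinary C. This is only a sequential sketch of the access pattern, with invented names (Candidate, reduce_max); in a real kernel the inner loop would be spread over the work-items, with a barrier after each pass, and the array would usually be reduced per work-group first.

```c
#include <stddef.h>

typedef struct { float t, u, v; int face; } Candidate;

/* Tree-style reduction: each pass folds the upper half of the remaining
 * range onto the lower half, so after about log2(n) passes items[0]
 * holds the candidate with the largest t. Requires n >= 1. */
void reduce_max(Candidate *items, size_t n)
{
    for (size_t len = n; len > 1; ) {
        size_t half = (len + 1) / 2;       /* size of the surviving lower part */
        /* In a kernel, each i would be one work-item. */
        for (size_t i = 0; i < len / 2; ++i) {
            if (items[i].t < items[i + half].t)
                items[i] = items[i + half];
        }
        len = half;
    }
}
```

The array of 1 million x 4 floats is still needed, but finding the best entry takes a handful of parallel passes instead of one long sequential scan.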
I can do this for primary rays, but once a primary ray is traced I distribute a lot of other rays based on several different algorithms... it is impossible to rewrite everything!
Tracing the ray is the first operation we do in our renderer; if we do this in OpenCL, the remaining algorithms must be translated too :-(