Hi,
It seems I'm getting kernel argument corruption. I don't clearly understand why, and I can't reproduce it on NVIDIA cards.
Here is the code; it's a small skeleton of a ray casting algorithm using persistent threads. The problem is that if I uncomment the last lines, the kernel variable _launch_dim is modified (???). The code outputs the following lines:
launch dim x : 512 launch dim y : 512 launch dim Z : 1 tid : 44
bbox 0 warp : 0 thread warp id : 44
bbox 0 NO HIT
bbox 1 warp : 0 thread warp id : 44
bbox 1 HIT
which is good, but when I uncomment the last lines of the for loop, I get this:
launch dim x : 0 launch dim y : 512 launch dim Z : 1 tid : 108
bbox 0 warp : 1 thread warp id : 44
bbox 0 NO HIT
bbox 1 warp : 1 thread warp id : 44
bbox 1 NO HIT
launch dim x is modified (?). It doesn't happen every time; it depends on the execution and on the ray being computed.
If you have any idea how I can get through this, it would be much appreciated.
I'm currently using an HD 7950 with the 13.4 drivers on 64-bit Windows 7.
Roger
for(;;)
{
    //PRINTFMACRO(thread_warp_id,44,"launch dim x : %i launch dim y : %i launch dim Z : %i tid : %i\n",_launch_dim.x,_launch_dim.y,_launch_dim.z,get_global_id(0));
    if (thread_warp_id == 0) {
        pop_index[warp_id] = atomic_add(&_queue_heads[0], WARP_SIZE);
    }
    if (pop_index[warp_id] + thread_warp_id < _queue_MAX)
    {
        uint index = pop_index[warp_id] + thread_warp_id;
        PRINTFMACRO(index,135340,"launch dim x : %i launch dim y : %i launch dim Z : %i tid : %i\n",_launch_dim.x,_launch_dim.y,_launch_dim.z,get_global_id(0));
        launch_index.y = (index % (_launch_dim.y*_launch_dim.x)) / _launch_dim.x;
        launch_index.x = ((index % (_launch_dim.y*_launch_dim.x)) % _launch_dim.x);
        float2 d;
        d.x = ((float)launch_index.x / (float)_launch_dim.x) * 2.f - 1.f;
        d.y = ((float)launch_index.y / (float)_launch_dim.y) * 2.f - 1.f;
        float3 U2 = (float3)(d.x*U.x, d.x*U.y, d.x*U.z);
        float3 V2 = (float3)(d.y*V.x, d.y*V.y, d.y*V.z);
        _ray.origin = cam_pos;
        _ray.direction = normalize((U2 + V2 + W));
        _ray.index = (launch_index.y * _launch_dim.x) + launch_index.x;
        char _ray_datas_count = 0; // rtSpawnCoun
        for(int _bbox = 0; _bbox < 2; _bbox++)
        {
            PRINTFMACRO(_ray.index,135340,"box %i warp : %i thread warp id : %i\n",_bbox,warp_id,thread_warp_id);
            char hit = intersect_ray_bbox(_ray, _aabbs[_bbox], &_entry_distance, &_exit_distance);
            if (hit)
            {
                PRINTFMACRO(_ray.index,135340,"bbox %i HIT\n",_bbox);
                (_intersection) = ((__global float3 *)_geometries_raw_data)[_bbox];
            } else {
                PRINTFMACRO(_ray.index,135340,"bbox %i NO HIT\n",_bbox);
            }
        }
        //corrupt kernel args
        // Ray r2;
        // r2.origin = _intersection;
        // r2.direction = normalize((pos_lumiere1-_intersection));
        // _ray_datas[((_ray_datas_count)*WARP_SIZE) + thread_warp_id] = r2;
        // (_ray_datas_count) = ((_ray_datas_count) + 1);
        // r2.origin = _intersection;
        // r2.direction = normalize((pos_lumiere2-_intersection));
        // _ray_datas[((_ray_datas_count)*WARP_SIZE) + thread_warp_id] = r2;
        // (_ray_datas_count) = ((_ray_datas_count) + 1);
    }
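The work-distribution step at the top of the loop (one lane per warp reserves WARP_SIZE consecutive queue slots with atomic_add, then every lane adds its own offset) can be mimicked on the host. The sketch below is plain C11, not the kernel itself: the value WARP_SIZE = 64, the names pop_batch and lane_index, and the use of <stdatomic.h> in place of OpenCL's atomic_add are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed value; the original post does not state WARP_SIZE. */
#define WARP_SIZE 64u

/* Reserve WARP_SIZE consecutive queue slots for one warp.
 * Returns the first reserved slot (the previous value of the head),
 * which is what atomic_add returns in the kernel. */
static unsigned pop_batch(atomic_uint *queue_head)
{
    return atomic_fetch_add(queue_head, WARP_SIZE);
}

/* Compute the queue slot for one lane of the warp.
 * Returns false when the slot lies past the end of the queue,
 * mirroring the `pop_index + thread_warp_id < _queue_MAX` test. */
static bool lane_index(unsigned base, unsigned lane,
                       unsigned queue_max, unsigned *index)
{
    assert(lane < WARP_SIZE);
    if (base + lane >= queue_max)
        return false;
    *index = base + lane;
    return true;
}
```

Each call to pop_batch hands out a disjoint range, so two warps never process the same slot, and lanes whose slot falls beyond queue_max simply do no work for that batch.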
From the values given to launch_dim, it looks to be the same as the global size. It would be better to query it inside the kernel using get_global_size(0).
Anyway, at first glance I would not expect a variable to change its value just because some unrelated code is enabled. Could you give a small repro case, including host-side code, that I can compile and run at my end? That would help in quickly reproducing the bug and getting it fixed if it exists.
Thank you for answering; here is a small repro case. I'm getting wrong printed values:
launch dim x : 0 launch dim y : 512 launch dim Z : 1 tid : 44
launch index x : 135340 launch index y : -1
It's possible I made a stupid mistake, but I can't see it.
Roger
PS: don't mind the name of the attached file, I did this fast...
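For comparison, the values this index arithmetic should produce can be worked out on the host. This is a minimal sketch in plain C, assuming the linear index is 135340 (the value filtered on in the PRINTFMACRO calls of the first kernel) and a 512 x 512 launch. The struct and function names are illustrative and are not taken from the attached file.

```c
#include <assert.h>

/* Hypothetical helper types standing in for the kernel's launch_index and d. */
typedef struct { unsigned x, y; } LaunchIndex;
typedef struct { float x, y; } ScreenCoord;

/* Split a linear ray index into a 2D launch index, using the same
 * expressions as the kernel. A zero dimension is rejected up front,
 * because the modulo and division below are not defined for it. */
static LaunchIndex decompose_index(unsigned index, unsigned dim_x, unsigned dim_y)
{
    assert(dim_x != 0u && dim_y != 0u);
    unsigned in_frame = index % (dim_y * dim_x);
    LaunchIndex li;
    li.y = in_frame / dim_x;
    li.x = in_frame % dim_x;
    return li;
}

/* Map the launch index to the [-1, 1) range used to build the ray direction. */
static ScreenCoord to_screen(LaunchIndex li, unsigned dim_x, unsigned dim_y)
{
    ScreenCoord d;
    d.x = ((float)li.x / (float)dim_x) * 2.f - 1.f;
    d.y = ((float)li.y / (float)dim_y) * 2.f - 1.f;
    return d;
}
```

Under those assumptions, index 135340 gives launch index (172, 264) and d = (-0.328125, 0.03125). A zero in _launch_dim.x makes both the modulo and the float division ill-defined, which would be consistent with the nonsensical launch index values printed above.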
It looks like the argument is getting corrupted after the first iteration of the for loop.
I inserted a simple printf for the variable _launch_dim, and here are my results:
C:\Users\cas\Desktop\HelloWorld>HelloWorld.exe
_launch dim: 512 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
_launch dim: 0 512 1
Forwarding it to the proper team. I checked the code thoroughly and did not find any place where this variable is modified, so this change in value is not expected. Please point out if I misunderstood something.
Thank you for taking a closer look into this. I'm still getting that issue right now.
The kernel variable launch dim is getting modified without any reason.
I am raising a bug with the AMD engineering team. I will update the post once it is fixed.
The issue is no longer reproducible on the latest driver. Can you verify and confirm?