6 Replies Latest reply on Aug 4, 2014 2:40 AM by pinform

    Kernel argument corrupted? (maybe)

    roger512

      Hi,

       

      It seems I'm getting a kernel argument corruption. I don't clearly understand why, and I can't reproduce it with NVIDIA cards.

       

      Here is the code. It's a small skeleton of a ray casting algorithm using persistent threads. The problem is that if I uncomment the last lines, the kernel variable _launch_dim is modified (???). With those lines commented out, the code outputs the following:

       

      launch dim x : 512  launch dim y : 512 launch dim Z : 1 tid : 44

      bbox 0 warp : 0 thread warp id : 44

      bbox 0 NO HIT

      bbox 1 warp : 0 thread warp id : 44

      bbox 1 HIT

       

      which is good, but when I uncomment the last lines of the for loop, I get this:

       

      launch dim x : 0  launch dim y : 512 launch dim Z : 1 tid : 108

      bbox 0 warp : 1 thread warp id : 44

      bbox 0 NO HIT

      bbox 1 warp : 1 thread warp id : 44

      bbox 1 NO HIT

       

      launch dim x is modified (?). It doesn't happen every time; it depends on the execution and on the ray being computed.

       

      If you have any idea how I can get through this, it would be much appreciated.

       

      I'm currently using an HD 7950 with the 13.4 drivers on 64-bit Windows 7.

       

      Roger

       

       

       

       

      for(;;)

          {   

              //PRINTFMACRO(thread_warp_id,44,"launch dim x : %i  launch dim y : %i launch dim Z : %i tid : %i\n",_launch_dim.x,_launch_dim.y,_launch_dim.z,get_global_id(0));

              if (thread_warp_id == 0) {           

                  pop_index[warp_id] = atomic_add(&_queue_heads[0],WARP_SIZE) ;

              }

             

              if (pop_index[warp_id] + thread_warp_id  < _queue_MAX)

              {

                  uint index = pop_index[warp_id] + thread_warp_id;

                 

                 

                  PRINTFMACRO(index,135340,"launch dim x : %i  launch dim y : %i launch dim Z : %i tid : %i\n",_launch_dim.x,_launch_dim.y,_launch_dim.z,get_global_id(0));

                  launch_index.y = (index % (_launch_dim.y*_launch_dim.x)) / _launch_dim.x;

                  launch_index.x = ((index % (_launch_dim.y*_launch_dim.x)) % _launch_dim.x);

             

                  float2 d;

                  d.x = ((float)launch_index.x / (float)_launch_dim.x) * 2.f - 1.f ;

                  d.y = ((float)launch_index.y / (float)_launch_dim.y) * 2.f - 1.f ;

         

                  float3 U2 = (float3)(d.x*U.x, d.x*U.y, d.x*U.z) ;

                  float3 V2 = (float3)(d.y*V.x, d.y*V.y, d.y*V.z) ;

       

                  _ray.origin = cam_pos ;

                  _ray.direction = normalize((U2 + V2 + W)) ;

                  _ray.index = (launch_index.y * _launch_dim.x) + launch_index.x ;

                 

                  char _ray_datas_count = 0 ; // rtSpawnCoun

                 

                  for(int _bbox = 0; _bbox < 2; _bbox++)

                  {

                             

                      PRINTFMACRO(_ray.index,135340,"bbox %i warp : %i thread warp id : %i\n",_bbox,warp_id,thread_warp_id);

                        

                      char hit = intersect_ray_bbox(_ray, _aabbs[_bbox],&_entry_distance, &_exit_distance);

                    

                      if (hit)

                    

                      {

                          PRINTFMACRO(_ray.index,135340,"bbox %i HIT\n",_bbox);

                        

                          (_intersection) =((__global  float3 *)_geometries_raw_data)[_bbox];

       

                      }else{

                          PRINTFMACRO(_ray.index,135340,"bbox %i NO HIT\n",_bbox);

                      }

                  }

                 

                  // Uncommenting the lines below corrupts the kernel args

                  // Ray r2;

                  // r2.origin = _intersection ;

                  // r2.direction = normalize((pos_lumiere1-_intersection)) ;

                  // _ray_datas[((_ray_datas_count)*WARP_SIZE) + thread_warp_id] = r2;

                  // (_ray_datas_count) = ((_ray_datas_count) + 1) ;

                  // r2.origin = _intersection ;

                  // r2.direction = normalize((pos_lumiere2-_intersection)) ;

                  // _ray_datas[((_ray_datas_count)*WARP_SIZE) + thread_warp_id] = r2;

                  // (_ray_datas_count) = ((_ray_datas_count) + 1) ;

                                     

              }
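
      To check the index arithmetic independently of the GPU, here is a minimal host-side C sketch of the mapping the loop uses to turn a queue index into launch_index and _ray.index. The names Dim2, index_to_pixel and pixel_to_ray_index are hypothetical and do not come from the kernel; only the arithmetic is copied from it. The zero-dimension check is an addition for illustration, since the kernel itself divides by _launch_dim.x without checking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's 2D launch dimensions / launch index. */
typedef struct { uint32_t x, y; } Dim2;

/* Mirrors the kernel's computation of launch_index from the queue index.
   Returns 0 on success, -1 if a dimension is zero (the corrupted case
   reported above, where launch dim x printed as 0). */
int index_to_pixel(uint32_t index, Dim2 dim, Dim2 *out)
{
    if (dim.x == 0 || dim.y == 0)
        return -1;                        /* would divide by zero below */
    uint32_t in_frame = index % (dim.y * dim.x);
    out->y = in_frame / dim.x;            /* row */
    out->x = in_frame % dim.x;            /* column */
    return 0;
}

/* Mirrors the kernel's computation of _ray.index from launch_index. */
uint32_t pixel_to_ray_index(Dim2 pixel, Dim2 dim)
{
    assert(pixel.x < dim.x && pixel.y < dim.y);
    return pixel.y * dim.x + pixel.x;
}
```

      With a 512 x 512 launch, index 135340 (the value traced by PRINTFMACRO) maps to x = 172, y = 264 and back to ray index 135340. With dim.x = 0 the sketch refuses to compute anything, whereas the kernel would go on to divide by zero, so any output printed after the corruption is not meaningful.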