
OpenCL 2.0 compiler bug? (device side enqueue)

Question asked by elad on Apr 17, 2020
Latest reply on Apr 29, 2020 by dipak

A similar issue is reported here.

 

I compile a kernel (kernel1) that performs a device-side enqueue of another kernel (kernel2).

When kernel2 is empty, or contains little code, there is no problem. 

 

But when certain operations occur in kernel2, clBuildProgram fails with an error:

error: <unknown>:0:0: in function __kernel2_block_invoke void (i8*): local memory global used by non-kernel function

In my case, the trigger is calling atomic_add on a local memory variable, but in the reported issue it is calling work_group_reduce_add.

 


Is this behavior expected, or is it a bug? Is there a workaround?

 

Here is the full kernel code:

 

kernel void kernel2(global float* src)
{
   local int just_a_local;

   if (get_local_id(0) == 0) // a single work-item initializes the local variable
   {
      just_a_local = 0;
   }
   barrier(CLK_LOCAL_MEM_FENCE);

   atomic_add(&just_a_local, 5); // commenting this line resolves the issue!
}

 

kernel void kernel1(global float* src)
{

   local int just_a_local;

   if (get_local_id(0) == 0) // a single work-item initializes the local variable
   {
      just_a_local = 0;
   }

   barrier(CLK_LOCAL_MEM_FENCE);

   atomic_add(&just_a_local, 5);

   if (get_global_id(0) == 0)
   {
      enqueue_kernel(
         get_default_queue(),
         CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
         ndrange_1D(1),
         ^{
            kernel2(src);
         }
      );
   }

}

 

Hardware: AMD Radeon Pro W5700, Windows 10 latest driver.
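
A possible workaround, which I have not verified on this driver: the error message complains about a local-memory variable being used by a non-kernel function (the block's invoke function). As far as I understand the OpenCL C specification, calling a kernel that declares local variables from another kernel is implementation-defined anyway. The sketch below therefore avoids the kernel-scope local declaration in the enqueued code and lets the runtime allocate the local memory through a block argument instead. The helper name kernel2_body is hypothetical and not part of the original code; the program is assumed to be built with -cl-std=CL2.0.

```c
// Hypothetical helper holding the former body of kernel2.
// The local storage is passed in as a pointer instead of being
// declared as a local variable inside the function.
void kernel2_body(global float* src, local int* scratch)
{
   if (get_local_id(0) == 0)
   {
      *scratch = 0; // one work-item initializes the shared counter
   }
   barrier(CLK_LOCAL_MEM_FENCE);

   atomic_add(scratch, 5);
}

kernel void kernel1(global float* src)
{
   if (get_global_id(0) == 0)
   {
      // The block takes a local void* parameter; the trailing uint
      // argument gives the number of bytes of local memory to allocate
      // for it in the child kernel.
      enqueue_kernel(
         get_default_queue(),
         CLK_ENQUEUE_FLAGS_WAIT_KERNEL,
         ndrange_1D(1),
         ^(local void* scratch){
            kernel2_body(src, (local int*)scratch);
         },
         (uint)sizeof(int)
      );
   }
}
```

If kernel2 must also remain launchable from the host, it could keep its own kernel entry point that takes a local int* argument and forwards it to the same helper. Whether this actually sidesteps the compiler error on the W5700 is an assumption on my part.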
