
rotor
Journeyman III

passing local variable from one kernel to another kernel

passing local memory pointer

Hi all,

I am considering passing a local variable (i.e. a local memory pointer and/or address) from one kernel to another. Can anyone here offer a suggestion? The scenario is described below:

- I do task f1 in kernel k1 and store the results in a local memory variable (let's say __local output_k1[1024]). Now I launch a second kernel k2, and k2 should take output_k1 from k1 as its input. Traditionally we would write output_k1 back to the host, and the host would then pass those values from host memory to kernel k2. However, that is really inefficient, so I want to keep output_k1 in the device's local memory and just pass a pointer to that local memory location to kernel k2.

Thank you,

Roto

 

5 Replies

rotor,
Local memory is only alive during a kernel execution. In order to pass data between kernel executions, you need to write it back out to main memory.
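
A minimal sketch of that hand-off, assuming each kernel does some per-element work: k1 writes its results to a __global buffer and k2 reads the same buffer. The host does not have to read the buffer back between the two launches; on an in-order command queue it is enough to enqueue k1 and then k2 with the same buffer object as an argument. The helper names k1_item and k2_item, the arithmetic inside them, and the buffer names are placeholders, not code from this thread. The #ifndef __OPENCL_VERSION__ shim is only there so the helpers can also be built as plain C for a quick host-side check.

```c
/* Host-only shim: plain C knows neither the OpenCL qualifiers nor size_t. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __global
#endif

/* Placeholder for task f1: one result per work-item. */
void k1_item(size_t gid, __global const float *input, __global float *output_k1)
{
    output_k1[gid] = 2.0f * input[gid];
}

/* Placeholder for the second stage: consumes what k1 produced. */
void k2_item(size_t gid, __global const float *output_k1, __global float *result)
{
    result[gid] = output_k1[gid] + 1.0f;
}

#ifdef __OPENCL_VERSION__
/* First launch: results go to global memory, which outlives the kernel. */
__kernel void k1(__global const float *input, __global float *output_k1)
{
    k1_item(get_global_id(0), input, output_k1);
}

/* Second launch: free to use a different work-group size than k1. */
__kernel void k2(__global const float *output_k1, __global float *result)
{
    k2_item(get_global_id(0), output_k1, result);
}
#endif
```

Because k2 is a separate launch here, its work-group size does not have to match k1's.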

Also, why not have k1 call k2 directly?

Thanks Micah. Having k1 call k2 directly is a very useful idea, but only if the work-group size (and dimensions) of k1 and k2 are the same. In my case I want k2's work-group size to differ from k1's (it sounds irregular, but it follows from the structure of k1's output).


rotor,
Another thing you can try is to call k2 from k1 only if the thread falls within the second, smaller range.

Say you have 128 threads that you want to run both k1 and k2, but 512 threads that you want to run k1 only.

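kernel void k1()
{
    ...
    /* only the first 128 work-items continue into k2 */
    if (get_global_id(0) < 128) {
        k2();   /* k2 runs as an ordinary function call inside k1 */
    }
}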

The only problems here are that there is no global synchronization, and there might be a slight performance degradation for work-groups where some threads call k2 and others do not.

Hi Micah,

That would be a nice solution in this situation. However, there is one thing to consider: some __local variables in k1 are no longer needed from the point where I call k2 onward, but in k2 I may want to use a large chunk of __local memory. So is there any mechanism to release some variables in k1 (an atomic operation, maybe) and dynamically allocate new local memory for k2 at run time?

e.g.:

k2()
{
    allocate(var);
    do the task;
}

---

k1()
{
    ...
    release(var);
    k2();
}

---


Why can't you just reuse them? Are you doing inline __local variable declarations in the kernel?

In my opinion that should not have been allowed into the standard because it represents highly dubious scoping. I always write kernels by passing in a local memory allocation through the API because it has clearer scoping of its lifetime. If you do that and just use const offsets into the array in the kernel then you can use the same const addresses in both subkernels for different data and manage the way the local data is used yourself instead of wondering what the compiler wants to do with it.

Maybe:

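/* float is an assumed element type; the original sketch left the type out */
void b( __local float *bl );
void c( __local float *cl );

__kernel void a( __local float *al )
{
  b( al );   /* first subkernel uses the scratch area */
  c( al );   /* second subkernel reuses the same addresses */
}

void b( __local float *bl )
{...}

void c( __local float *cl )
{...}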
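
To make the reuse concrete, here is a rough sketch under stated assumptions: the two phases each combine a value with its neighbour's, the offsets B_DATA_OFF and C_DATA_OFF, the helper bodies, and the extra in/out buffers are all invented for illustration. On the host, the __local argument would be sized with clSetKernelArg(kernel, 2, local_size * sizeof(float), NULL), where the NULL value tells the runtime to allocate local memory of that size. The #ifndef __OPENCL_VERSION__ shim only exists so the helpers can be built as plain C for a host-side check.

```c
/* Host-only shim: plain C knows neither the OpenCL qualifiers nor size_t. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __global
#define __local
#endif

#define B_DATA_OFF 0   /* where phase b keeps its data in the scratch array */
#define C_DATA_OFF 0   /* phase c reuses the same addresses for different data */

/* Phase b (placeholder work): add the neighbour's value. */
float b_item(size_t lid, size_t n, __local float *bl)
{
    __local float *data = bl + B_DATA_OFF;
    return data[lid] + data[(lid + 1) % n];
}

/* Phase c (placeholder work): average with the neighbour's value. */
float c_item(size_t lid, size_t n, __local float *cl)
{
    __local float *data = cl + C_DATA_OFF;
    return 0.5f * (data[lid] + data[(lid + 1) % n]);
}

#ifdef __OPENCL_VERSION__
__kernel void a(__global const float *in, __global float *out, __local float *al)
{
    size_t gid = get_global_id(0);
    size_t lid = get_local_id(0);
    size_t n   = get_local_size(0);

    al[B_DATA_OFF + lid] = in[gid];
    barrier(CLK_LOCAL_MEM_FENCE);   /* phase b input is complete */
    float partial = b_item(lid, n, al);
    barrier(CLK_LOCAL_MEM_FENCE);   /* all reads done, region may be overwritten */

    al[C_DATA_OFF + lid] = partial;
    barrier(CLK_LOCAL_MEM_FENCE);   /* phase c input is complete */
    out[gid] = c_item(lid, n, al);
}
#endif
```

Nothing is released or allocated at run time here. The single allocation is sized for the larger of the two phases, and the barriers mark the points where ownership of the region passes from one phase to the next.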

 
