Originally posted by: Raistmer I need to initialize a float or float4 flag before a kernel call. Which is better: a memset kernel with a single work-item, or writing a float or float4 zero value from the host to GPU memory? Which one has the lower call overhead?
Just add one more parameter to the actual kernel, set that kernel argument to the required value, and call the kernel.
Originally posted by: Raistmer But then how can I ensure that work-items launched later don't overwrite a flag that has already been set? For example: __kernel void abc(__global int* flag, int zero){ flag[0]=zero; if ( ....... ){ flag[0]=1; } } If the condition is not true in the last wavefront, I will always receive a zero flag, won't I?
Make it constant so that no one is able to overwrite it.
Originally posted by: Raistmer LoL, if I make it constant, then how can the flag be set when the required condition is true? Looks like I am missing your idea completely. What I need: if some event happens during the kernel run (in my case, a signal is found), the flag is set to 1, no matter which work-item found it and no matter whether many work-items found it or just one. But if none of them found it, the flag should remain zero. Then, after the kernel launch, CPU code can download only the flag from GPU memory and decide whether it needs to download the whole data array too. But that flag has to be initialized. If I initialize it inside the kernel, no synchronization between workgroups is possible AFAIK. That is, some work-item in an earlier workgroup can find a signal, and then a later workgroup will re-initialize the flag, which kills the whole idea....
You can solve this by using atomics. If you find a signal, increment the value of the flag by 1.
Originally posted by: Raistmer Yes, but how will I set the initial value? Incrementing from a random initial number will not help...
Create the buffer from a host pointer, where the host memory is already initialized with the required value, then set the kernel argument and run the kernel.
Originally posted by: Raistmer And a third way: 1) call a memset kernel, 2) call the kernel. Which way will be faster?
The second is better than the third, because the third is the second plus an extra kernel invocation and kernel execution.
The second is the best of the three.