I am optimizing my kernel and I wonder if I should store the value of get_global_size(i) in a local variable or if it is as efficient just to call the function every time.
IMHO the compiler optimizes this, so get_global_id is not recomputed on every call.
They are cheap, and they are calculated only once at the beginning of a kernel (and only if you actually call them in your kernel).
For example, on the 7970 it costs only a single MAD to acquire a zero-based linear thread id. All other ids are calculated from this linear id using a hidden constant buffer filled with your NDRange's parameters.
If you use only get_global_id(0), it will take about 8 instructions.
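To make the "single MAD" point concrete, here is a rough sketch in plain C of the per-dimension arithmetic the OpenCL spec defines for a global id (with uniform work-group sizes): group id times local size, plus local id, plus the global offset. The struct and function names are made up for illustration, and the struct only stands in for the hidden constant buffer mentioned above. It is not how any particular driver actually lays things out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the NDRange parameters of one dimension,
   i.e. the kind of data a driver might keep in a constant buffer.
   The layout and names are assumptions for illustration only. */
typedef struct {
    size_t local_size;    /* work-items per work-group in this dimension */
    size_t global_offset; /* offset given at enqueue time, usually 0 */
} ndrange_dim;

/* Zero-based global id: a single multiply-add. */
size_t zero_based_global_id(const ndrange_dim *p, size_t group_id, size_t local_id)
{
    assert(local_id < p->local_size); /* local ids run from 0 to local_size - 1 */
    return group_id * p->local_size + local_id;
}

/* Value that get_global_id(d) is specified to return: the zero-based id
   plus the global offset of that dimension. */
size_t global_id(const ndrange_dim *p, size_t group_id, size_t local_id)
{
    return zero_based_global_id(p, group_id, local_id) + p->global_offset;
}
```

Since the result depends only on values that are fixed for the lifetime of a work-item, a compiler can compute it once and reuse it, which is why caching it yourself in a private variable usually makes little difference.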