Does OpenCL take advantage of the following techniques when using small local arrays?
- On VLIW -> indexed_temp_arrays (x0[n]), a.k.a. R55[A0.x] indirect register addressing in the ISA
- On GCN -> v_movrel_b32 instruction
Or, if OpenCL always uses LDS memory for local arrays, is there an extension that enables these faster techniques?
Thanks in advance.