Hello again. I'm using Radeon HD5700 and HD5800 series cards and have a few questions:
1) How different is rsqrt from native_sqrt? What about precision?
2) Will the max work group size be increased in the future?
3) Is there a way to force the use of dot products and the _PREV instructions?
4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write private, read public accesses on the float4 data type. Why exactly are the bank conflicts so high? Are there any workarounds?
5) There have been some threads on this before... What about more than 4 GPUs in a system?