In a compute shader (CS):
1) Is a 16x4 memory access pattern as good as an 8x8 one?
2) Would it be better to use a separate sampler (on the same resource) in each wavefront if the wavefronts fetch data from different memory areas?