1. Are arrays declared with the shared qualifier stored in LDS (Local Data Share)?
e.g.: shared uvec4 sArray;
2. If so, why are they slower than large SSBOs (or at best the same speed)?
In my case, each shader invocation works on its own range of elements (96 elements: 96 reads and 96 writes).
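For reference, the access pattern described above might look roughly like the following minimal sketch. Only the uvec4 element type and the 96 elements per invocation come from the question; the work group size of 16, the names ELEMS_PER_INVOCATION, GROUP_SIZE, Result and result, and the output SSBO are assumptions made purely for illustration.

```glsl
#version 430
// Hypothetical reconstruction of the setup in the question, not the actual shader.
layout(local_size_x = 16) in;            // assumed work group size

const uint ELEMS_PER_INVOCATION = 96u;   // from the question
const uint GROUP_SIZE = 16u;             // assumed; must match local_size_x

// 16 invocations * 96 elements * 16 bytes = 24576 bytes of shared storage.
shared uvec4 sArray[GROUP_SIZE * ELEMS_PER_INVOCATION];

// Assumed output buffer, present only so the work is not optimized away.
layout(std430, binding = 0) buffer Result {
    uvec4 result[];
};

void main() {
    // Each invocation owns a private, contiguous range of 96 elements.
    uint base = gl_LocalInvocationIndex * ELEMS_PER_INVOCATION;

    // 96 writes into the invocation's own range.
    for (uint i = 0u; i < ELEMS_PER_INVOCATION; ++i) {
        sArray[base + i] = uvec4(i);
    }

    // 96 reads from the same range. No barrier() is needed here because
    // no invocation reads data written by another invocation.
    uvec4 sum = uvec4(0u);
    for (uint i = 0u; i < ELEMS_PER_INVOCATION; ++i) {
        sum += sArray[base + i];
    }

    result[gl_GlobalInvocationID.x] = sum;
}
```

In a layout like this, the shared storage required grows with the work group size, and every invocation steps through its range with the same stride relative to its neighbours. Both points may be relevant when comparing against the SSBO version.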
GPU: Radeon HD 7850.