Read Access to LDS

Discussion created by FamilyGuy on Aug 12, 2010
Latest reply on Aug 12, 2010 by Jawed

Question: the doc (OpenCL Programming Guide, rev 1.03) says:

"Each stream processor can generate up to two 4-byte LDS requests per cycle."

How do I actually achieve this rate for reads in IL? There are LDS_LOAD (one DWORD) and LDS_LOAD_VEC (four DWORDs), but both appear to be inefficient on 5870 chips.