See the section in the programming guide about memory tiling.
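In short: for maximum performance, image access should be 2D-coherent, and because the cache is so small the accesses have to be quite closely coherent, i.e. reads issued close together in time should also land close together in both x and y.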
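To make the 2D-coherence point concrete, here is a rough sketch that counts how many cache lines a set of reads touches under a plain row-major layout versus a tiled one. The 8x8 tile, 4 bytes per texel, 64-byte cache line and the function names are all assumptions chosen for illustration; the actual tiling scheme is hardware-specific and is not what any particular device necessarily does.

```python
# All constants below are illustrative assumptions, not real hardware values.
TILE_W = 8            # assumed tile width in texels
TILE_H = 8            # assumed tile height in texels
BYTES_PER_TEXEL = 4   # e.g. four 8-bit channels
CACHE_LINE = 64       # assumed cache line size in bytes


def linear_offset(x, y, width):
    """Byte offset of texel (x, y) in a plain row-major array."""
    return (y * width + x) * BYTES_PER_TEXEL


def tiled_offset(x, y, width):
    """Byte offset of texel (x, y) when texels are stored tile by tile.

    Assumes width is a multiple of TILE_W.
    """
    tiles_per_row = width // TILE_W
    tile_index = (y // TILE_H) * tiles_per_row + (x // TILE_W)
    within_tile = (y % TILE_H) * TILE_W + (x % TILE_W)
    return (tile_index * TILE_W * TILE_H + within_tile) * BYTES_PER_TEXEL


def lines_touched(offset_fn, coords, width):
    """Number of distinct cache lines hit by reading every texel in coords."""
    return len({offset_fn(x, y, width) // CACHE_LINE for x, y in coords})


WIDTH = 1024
block_8x8 = [(x, y) for y in range(8) for x in range(8)]  # 2D-coherent reads
row_of_64 = [(x, 0) for x in range(64)]                   # 1D run along a row

if __name__ == "__main__":
    print("8x8 block, linear:", lines_touched(linear_offset, block_8x8, WIDTH))
    print("8x8 block, tiled: ", lines_touched(tiled_offset, block_8x8, WIDTH))
    print("row of 64, linear:", lines_touched(linear_offset, row_of_64, WIDTH))
    print("row of 64, tiled: ", lines_touched(tiled_offset, row_of_64, WIDTH))
```

Under these assumptions the compact 8x8 block touches half as many cache lines in the tiled layout as in the row-major one, while a long run along a single row shows the opposite. With a small cache, that difference decides whether the neighbouring reads hit or miss.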
For a random access pattern, a simple array might be better, unless the 8-bit 'float' access that images give you is useful.
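If you do go with a plain array, the 8-bit 'float' access can be reproduced by hand. The sketch below assumes that phrase refers to normalized 8-bit channels (stored as 0..255, read back as floats in 0.0..1.0); the function names are made up for illustration.

```python
def unorm8_to_float(v):
    """Convert a stored 8-bit channel value (0..255) to a float in [0.0, 1.0]."""
    if not 0 <= v <= 255:
        raise ValueError("expected an 8-bit value in 0..255")
    return v / 255.0


def float_to_unorm8(f):
    """Clamp a float to [0.0, 1.0] and round it to the nearest 8-bit value."""
    f = min(max(f, 0.0), 1.0)
    return int(f * 255.0 + 0.5)


if __name__ == "__main__":
    packed = [0, 64, 128, 255]                      # bytes as stored in the array
    unpacked = [unorm8_to_float(v) for v in packed]
    print(unpacked)
    print([float_to_unorm8(f) for f in unpacked])   # round-trips to the input
```

The storage stays at one byte per channel, so the array keeps the memory footprint of the image format; the cost is the extra conversion arithmetic in your own code.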