Efficiently copying from local buffer to image

Discussion created by binarysplit on Dec 4, 2011
Latest reply on Dec 7, 2011 by notzed
Image broken into tiles for local processing. Best way to copy back to the image?

I'm making a rasterizer where each work group processes a 32*32 tile. To maximize IO speed, each work group keeps its tile in a buffer in local memory, then copies it back to the image in main memory when it's done. Unfortunately I can't use the async copy functions because I need the global buffer to be of type image2d_t so that I can hand it over to OpenGL once the processing is complete.

What's the fastest way to BLT my tile into the output image? Should I have SPU1 loop through the pixels and copy them in, or should I have all 16 SPUs do an interleaved copy, or a non-interleaved copy? Or is there some way to use the async copy functions with images?
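
To make the two multi-worker options concrete, here is a rough sketch of the index patterns in plain C rather than OpenCL C, so it can be run on the host. Everything in it is an assumption for illustration: the `Image` struct, the function names, the packed 32-bit pixel format, and running the 16 workers one after another in a loop. In a real kernel the worker index would come from `get_local_id`, the store would be an image write such as `write_imagef`, and a `barrier(CLK_LOCAL_MEM_FENCE)` would be needed before the copy. It says nothing about which pattern is faster on a given device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE_W 32
#define TILE_H 32
#define TILE_PIXELS (TILE_W * TILE_H)
#define NUM_WORKERS 16 /* the "16 SPUs" from the question */

/* Hypothetical stand-in for the output image: row-major, one packed
   32-bit value per pixel. A real kernel would write to an image object. */
typedef struct {
    unsigned *pixels;
    int width;
    int height;
} Image;

/* Map a pixel index inside the tile to its index in the full image. */
static size_t dst_index(const Image *img, int tile_x, int tile_y, int i)
{
    int x = tile_x * TILE_W + i % TILE_W;
    int y = tile_y * TILE_H + i / TILE_W;
    assert(x < img->width && y < img->height);
    return (size_t)y * (size_t)img->width + (size_t)x;
}

/* Interleaved: worker w copies tile pixels w, w+16, w+32, ...
   so at each step the 16 workers touch 16 adjacent pixels. */
void copy_interleaved(const unsigned *tile, Image *img,
                      int tile_x, int tile_y, int worker)
{
    for (int i = worker; i < TILE_PIXELS; i += NUM_WORKERS)
        img->pixels[dst_index(img, tile_x, tile_y, i)] = tile[i];
}

/* Non-interleaved: worker w copies one contiguous chunk of
   TILE_PIXELS / NUM_WORKERS = 64 pixels, i.e. two full tile rows. */
void copy_blocked(const unsigned *tile, Image *img,
                  int tile_x, int tile_y, int worker)
{
    int chunk = TILE_PIXELS / NUM_WORKERS;
    for (int i = worker * chunk; i < (worker + 1) * chunk; ++i)
        img->pixels[dst_index(img, tile_x, tile_y, i)] = tile[i];
}

/* Run every worker in turn. On a device they would run concurrently. */
void blit_tile(const unsigned *tile, Image *img,
               int tile_x, int tile_y, int interleaved)
{
    for (int w = 0; w < NUM_WORKERS; ++w) {
        if (interleaved)
            copy_interleaved(tile, img, tile_x, tile_y, w);
        else
            copy_blocked(tile, img, tile_x, tile_y, w);
    }
}
```

Both variants write the same 1024 pixels; they differ only in which worker touches which pixel, which is the part that would matter for memory access behaviour on real hardware.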