galmok

Performance of EnqueueRead/Write/CopyBufferRect

Discussion created by galmok on May 15, 2011
Latest reply on Aug 12, 2011 by awatry

Is there any information on when we can expect the performance of the following functions to be increased (significantly)?

clEnqueueReadBufferRect

clEnqueueWriteBufferRect

(clEnqueueCopyBufferRect) <- I didn't really test this one, so it may be OK.

A short test comparing a clEnqueueRead/WriteBuffer sequence with clEnqueueRead/WriteBufferRect, transferring the same amount of memory, shows a significant difference in transfer speed.

A 4096*4096*8 = 128MB array uploaded and downloaded with the non-rect functions takes about 7.5ms. This includes copying host memory to a linear host buffer, uploading the linear buffer to the device, and running a kernel to copy the linear device memory to the (strided) destination array.
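For reference, the first step of the non-rect path (copying a strided region of host memory into a linear host buffer) could look roughly like the sketch below. This is not the code from my test; the function name pack_rect and its parameters are made up for illustration, and it only handles the 2D case. The packed buffer is what would then go to a plain clEnqueueWriteBuffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pack a sub-rectangle of a pitched (strided) host
 * array into a tightly packed linear buffer, one row at a time.
 *
 * src            start of the pitched source array
 * src_pitch      bytes between the starts of consecutive source rows
 * origin_x_bytes byte offset of the rectangle within a source row
 * origin_y       index of the first source row of the rectangle
 * row_bytes      width of the rectangle in bytes
 * rows           height of the rectangle in rows
 * dst            linear destination, at least rows * row_bytes bytes
 */
void pack_rect(const unsigned char *src, size_t src_pitch,
               size_t origin_x_bytes, size_t origin_y,
               size_t row_bytes, size_t rows,
               unsigned char *dst)
{
    /* The rectangle must fit inside one source row. */
    assert(origin_x_bytes + row_bytes <= src_pitch);

    for (size_t y = 0; y < rows; ++y) {
        /* Start of the y-th row of the rectangle in the source array. */
        const unsigned char *row =
            src + (origin_y + y) * src_pitch + origin_x_bytes;

        /* Rows are contiguous in the linear buffer. */
        memcpy(dst + y * row_bytes, row, row_bytes);
    }
}
```

The download direction would be the mirror image: read into the linear buffer with clEnqueueReadBuffer, then copy each row back to its strided position.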

The same array uploaded and downloaded with the rect functions takes about 0.48 seconds. This is about 6.5 times slower and, as it stands, takes longer than the kernel operating on the data.
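To put the rect timing in terms of bandwidth, here is a small sketch of the arithmetic. It assumes the 128MB array is moved once in each direction (one upload, one download) and that the 0.48 seconds covers both; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the test array: 4096 x 4096 elements of 8 bytes each. */
size_t test_array_bytes(void)
{
    return (size_t)4096 * 4096 * 8;
}

/* Effective transfer rate in MiB/s for `bytes` moved in `seconds`. */
double mib_per_second(size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return ((double)bytes / (1024.0 * 1024.0)) / seconds;
}
```

Under that assumption, mib_per_second(2 * test_array_bytes(), 0.48) works out to roughly 533 MiB/s for the rect path.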

 
