Device: HD5850
cl::Image2D bufferIn(context, CL_MEM_READ_WRITE, SingleIntensity(), bufferDimension, bufferDimension);
...
Profile p;
p.start("upload");
commandQueue.enqueueWriteImage(
    imageBuffer, CL_TRUE, o, s,   // o = 0,0,0  s = width,height,1
    bufferDimension * sizeof(float), 0, pBuffer);
p.stop();
where SingleIntensity() is cl::ImageFormat(CL_INTENSITY, CL_FLOAT)
and pBuffer points to a float[bufferDimension * bufferDimension].
bufferDimension is 5000, and for a 3168x4752 image the upload takes 0.233590s.
3168*4752*4 / 0.233590s = 0.257790762 GB/s
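The arithmetic above can be wrapped in a small helper for checking other image sizes and timings. This is only a sketch: transferRateGBps is a hypothetical name that does not appear in the original code, and it assumes decimal gigabytes (1 GB = 1e9 bytes), which is what the figure above uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the original post): effective transfer
// rate in GB/s, assuming decimal gigabytes (1 GB = 1e9 bytes).
double transferRateGBps(std::size_t width, std::size_t height,
                        std::size_t bytesPerPixel, double seconds)
{
    assert(seconds > 0.0);  // a zero or negative duration is a measurement error
    // Convert to double first so large images cannot overflow integer math.
    const double bytes = static_cast<double>(width) * height * bytesPerPixel;
    return bytes / seconds / 1e9;
}
```

With the numbers from this post, transferRateGBps(3168, 4752, sizeof(float), 0.233590) gives roughly 0.258 GB/s, matching the calculation above.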
Much too slow?! Which steps would you suggest to find the speed problem? (Ubuntu 10.04, newest SDK/driver)
*edit*
The speed from the SDK sample seems OK:
ati-stream-sdk/samples/opencl/bin/x86_64$ GPU_MAX_HEAP_SIZE=90 ./PCIeBandwidth --device gpu --timing --verify --iterations 5 --length $((3168*4752*4))
Host to device : 2.0552 GB/s
Device to host : 1.03913 GB/s
Passed!