Application does not scale when using cl::Buffer-Object

Discussion created by centershocksb12 on Mar 17, 2011
Latest reply on Mar 21, 2011 by himanshu.gautam


I have the following problem:

My application does not scale across multiple GPUs. It is always a bit slower on more GPUs than on fewer.

I have figured out that a cl::Buffer object is causing this. I use the buffer as follows:

First I allocate a plain array of 20 ints with malloc() (the elements are filled in later):

int* pOverlap_region = (int*) malloc(20 * sizeof(int));

After it is filled, I create the buffer object:

cl::Buffer overlap_region(
            this->context.getOpenCLContext(), CL_MEM_COPY_HOST_PTR, 20 * sizeof(int), pOverlap_region, &err);

this->context.getOpenCLContext() returns the context.
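
A minimal sketch of the host-side part, assuming std::vector is acceptable in place of malloc(): the byte size handed to cl::Buffer is derived from the element count rather than written as a literal. The names makeOverlapRegion, byteSize and kOverlapCount are made up for illustration, and the cl::Buffer call appears only as a comment because it needs the OpenCL headers and a valid context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: the array described above holds 20 elements.
constexpr std::size_t kOverlapCount = 20;

// Allocate the host-side array. std::vector stands in for malloc() here,
// so the elements start at zero and the memory is released automatically.
std::vector<int> makeOverlapRegion(std::size_t count) {
    assert(count > 0);  // an empty buffer would be rejected by OpenCL anyway
    return std::vector<int>(count);
}

// Size in bytes to pass to the cl::Buffer constructor, computed from the
// element count instead of a hard-coded value.
std::size_t byteSize(const std::vector<int>& host) {
    return host.size() * sizeof(int);
}

// Intended use (not compiled here, requires the OpenCL C++ wrapper header):
//   std::vector<int> host = makeOverlapRegion(kOverlapCount);
//   /* ... fill host ... */
//   cl::Buffer overlap_region(ctx, CL_MEM_COPY_HOST_PTR,
//                             byteSize(host), host.data(), &err);
```

With CL_MEM_COPY_HOST_PTR the contents are copied when the buffer is created, so the host array has to be filled before that call.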

Then it is set as an argument for the kernel:

err |= kernel.setArg(3, (cl::Buffer) overlap_region);

If this buffer is created but *not* set as a kernel argument, the application scales across multiple GPUs.

Does anybody know why the behaviour is like this?

Thanks for your replies