
CL_MEM_USE_PERSISTENT_MEM_AMD flag being ignored by clCreateBuffer (but not by clCreateImage)

Question asked by efraim on Nov 8, 2015



I am trying to allocate host-visible device memory as a buffer.

I use the following code snippet:

auto buff = clCreateBuffer(ctx, CL_MEM_USE_PERSISTENT_MEM_AMD, 1024*1024, NULL, NULL);
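// map_flags takes CL_MAP_* values; CL_MEM_READ_WRITE has the same value as CL_MAP_READ, so it would map for reading only
auto ptr = clEnqueueMapBuffer(q, buff, CL_FALSE, CL_MAP_READ | CL_MAP_WRITE, 0, 1024*1024, 0, NULL, &ev, NULL);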
clFlush( q );
spinForEventsComplete( 1, &ev );
*(int*)ptr = 0xDD;


When I inspect the resulting pointer with WinDbg, it is mapped to host memory rather than to GPU memory.

On the other hand, if I do:

cl_image_desc pixelDesc;
memset(&pixelDesc, '\0', sizeof(cl_image_desc));
pixelDesc.image_type = CL_MEM_OBJECT_IMAGE2D;
pixelDesc.image_width = 1024;
pixelDesc.image_height = 1024;
cl_image_format pixelFormat = { CL_RGBA, CL_UNSIGNED_INT8 };
auto img = clCreateImage(ctx, CL_MEM_READ_ONLY | CL_MEM_USE_PERSISTENT_MEM_AMD, &pixelFormat, &pixelDesc, NULL, NULL);
size_t  imageOrigin[3] = {0, 0, 0};
size_t  imageRegion[3] = { 1024, 1024, 1 };
size_t rowPitch;
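// map_flags takes CL_MAP_* values; CL_MEM_READ_WRITE has the same value as CL_MAP_READ, so it would map for reading only
auto ptr = clEnqueueMapImage(q, img, CL_FALSE, CL_MAP_READ | CL_MAP_WRITE, imageOrigin, imageRegion, &rowPitch, NULL, 0, NULL, &ev, NULL);
clFlush( q );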
spinForEventsComplete( 1, &ev );
*(int*)ptr = 0xDD;


the resulting pointer is being mapped to the GPU memory.

According to the documentation (the AMD Accelerated Parallel Processing OpenCL Programming Guide) it should be possible to use host-visible device memory on both buffers and images.


I have also verified that the TransferOverlapped sample shows no performance difference whether or not CL_MEM_USE_PERSISTENT_MEM_AMD is supplied. This strengthens my hypothesis that the OpenCL runtime ignores the flag for buffers.
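A rough cross-check that needs no debugger is to time CPU reads through the mapped pointer. This assumes that host-visible device memory is uncached and write-combined, as I understand the programming guide to describe it, so CPU reads through such a mapping should be far slower than reads from ordinary host memory. The helper name cpuReadBandwidthGBps is hypothetical (it is not part of OpenCL or the AMD SDK), and this is only a sketch of the measurement, not a definitive test:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an OpenCL or AMD SDK function.
// Times CPU reads through `ptr` and returns the observed bandwidth in GB/s.
// Assumption: `ptr` is readable and at least 4-byte aligned, which holds for
// pointers returned by clEnqueueMapBuffer / clEnqueueMapImage.
double cpuReadBandwidthGBps(const void* ptr, std::size_t bytes, int passes = 4)
{
    assert(reinterpret_cast<std::uintptr_t>(ptr) % alignof(std::uint32_t) == 0);

    const volatile std::uint32_t* p =
        static_cast<const volatile std::uint32_t*>(ptr);
    const std::size_t words = bytes / sizeof(std::uint32_t);
    std::uint32_t sink = 0;

    const auto start = std::chrono::steady_clock::now();
    for (int pass = 0; pass < passes; ++pass) {
        for (std::size_t i = 0; i < words; ++i) {
            sink += p[i];  // volatile read, so the loop is not optimized away
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;

    const double seconds = std::chrono::duration<double>(stop - start).count();
    if (seconds <= 0.0) {
        return 0.0;  // timer resolution too coarse for this amount of data
    }
    const double totalBytes =
        static_cast<double>(words) * sizeof(std::uint32_t) * passes;
    return totalBytes / seconds / 1e9;
}
```

Calling it once on a plain host allocation of the same size (for example a std::vector) and once on the pointer returned by the map call gives two numbers to compare. The mapping must permit reads for the second call to be valid. If both results are of the same order, the mapping most likely lives in host memory; a much lower figure for the mapped pointer would suggest uncached device memory.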


I am using a Pitcairn GPU on Windows 7 64-bit with the latest AMD driver.