I'm trying to optimize a piece of code for an APU. It looks something like this:
float* temp = new float[x]; // some size x determined at runtime
// create/copy values into temp
// the byte count is computed as sizeof(*temp) * x, since the array's size can't be recovered from the pointer itself
cl_buffer = clCreateBuffer(context, CL_MEM_USE_HOST_PTR, sizeof(*temp) * x, &temp, &error);
clEnqueueMapBuffer(cmd_queue, cl_buffer, CL_TRUE, CL_MAP_READ, 0, sizeof(*temp) * x, 0, NULL, &cmd_event, &error);
However, this gives me incorrect results compared to the version that just uses on-device buffers. I believe the issue is that &temp does not point to the full array (the same host-pointer approach does work when temp is a static array). Can anyone correct my understanding of how to use CL_MEM_USE_HOST_PTR with a dynamically allocated host array, if this is even possible? Thanks.
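To illustrate what I suspect is happening, here is a minimal sketch in plain C++ with no OpenCL involved. The helper names (is_array_start and the two check functions) are hypothetical and exist only for this illustration. It compares the address produced by & against the address of the first element, once for a heap allocation and once for a static array:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true when `candidate` is the address of element 0.
bool is_array_start(const void* candidate, const float* first_element) {
    return candidate == static_cast<const void*>(first_element);
}

// Heap case: `temp` is a pointer variable, so `&temp` is a float** that
// addresses the variable itself, not the floats it points to.
bool heap_address_of_pointer_is_array_start(std::size_t x) {
    assert(x > 0);
    float* temp = new float[x];
    bool same = is_array_start(&temp, temp);
    delete[] temp;
    return same;
}

// Static case: `arr` is an array object, so `&arr` (a float(*)[4])
// holds the same address as `&arr[0]`.
bool static_address_of_array_is_array_start() {
    float arr[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    return is_array_start(&arr, arr);
}
```

The heap check returns false and the static check returns true, which would match what I'm seeing: with a static array, &temp happens to be the address of the data, while with new[] it is only the address of the pointer variable. If that reading is right, the host_ptr argument to clCreateBuffer would presumably need to be temp rather than &temp, but I'd appreciate confirmation.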