I have a question that I could not resolve from the spec or from other sources.
When I create buffers, I have three main options:
CL_MEM_COPY_HOST_PTR: memory is allocated on the device and implicitly initialized with the data in host memory pointed to by the host_ptr argument.
CL_MEM_USE_HOST_PTR: this does not allocate memory on the device; it only creates a pointer into host memory, and every access by the device (be it a read or a write) travels over PCIe (except on APUs). Extremely slow.
CL_MEM_ALLOC_HOST_PTR: this flag allocates memory on the device but leaves it uninitialized, so a NULL pointer can be passed as host_ptr. It involves no implicit data copy, but leaves it to the programmer to initialize the memory on the device before use.
Now, when I map buffer objects into host memory:
mapping: upon mapping a buffer into host memory, its contents are copied back from the device. If CL_MAP_READ is specified, then nothing special is done; after the host thread finishes with the data, the buffer is unmapped and ready for use on the device. If CL_MAP_WRITE is specified, the buffer can be read by the host thread as before, but modifications to the mapped memory become visible on the device once it is unmapped.
What happens if the device uses memory objects that are mapped into host memory at that moment? Will it crash the program, result in undefined behaviour, or just be sluggish?
Please tell me if I am making wrong assumptions at any given point.