I'm trying to implement a multi-GPU application.
I need to share data between two GPUs.
Because the kernels access the buffers randomly, I can't just split the calculation in half.
What would be the best way to synchronize the modified data after kernel execution? I can't simply copy the results together, because both GPUs make modifications at random locations (although they never write to the same area).
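To make the merge problem concrete, here is a minimal sketch of the only host-side fallback I can think of: keep a copy of the buffer from before the kernels ran, read both results back, and take each element from whichever GPU changed it. The function name `merge_disjoint_writes` and the `int` element type are my own assumptions for illustration, not part of any API. A write that happens to store the old value looks like no write at all, which does not affect the merged result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of OpenCL.
   base      : buffer contents before the kernels ran
   from_gpu1 : buffer read back from gpu1 after its kernel
   from_gpu2 : buffer read back from gpu2 after its kernel
   out       : merged result
   Returns the number of elements that both devices changed to different
   values; this should be 0 if they never write to the same area. */
size_t merge_disjoint_writes(const int *base, const int *from_gpu1,
                             const int *from_gpu2, int *out, size_t n)
{
    assert(base && from_gpu1 && from_gpu2 && out);
    size_t conflicts = 0;
    for (size_t i = 0; i < n; ++i) {
        int changed1 = from_gpu1[i] != base[i];
        int changed2 = from_gpu2[i] != base[i];
        if (changed1 && changed2 && from_gpu1[i] != from_gpu2[i])
            ++conflicts; /* both wrote here; gpu1's value is kept below */
        /* If gpu1 did not change the element, gpu2's copy holds either
           its own write or the unchanged base value. */
        out[i] = changed1 ? from_gpu1[i] : from_gpu2[i];
    }
    return conflicts;
}
```

This costs an extra copy of the buffer plus a full pass on the host after every kernel launch, which is exactly the overhead I was hoping to avoid.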
At first I thought SVM was the answer, but it seems to be shared only between one device and the host at a time. If I create an SVM buffer and modify it on the kernel side, the changes
are not combined on the host side. There is no option to simply map the buffer: I can only map it through queue1 or queue2, so I get either gpu1's or gpu2's view of the SVM buffer. I have to
map the buffer on queue1 (gpu1) and on queue2 (gpu2) every time I want to send data to the GPUs. Am I doing something wrong, or is it not actually as shared as I thought?