I'm working on direct communication between an FPGA PCIe board (Altera) and a GPU (NVIDIA for the moment). We easily managed to make the FPGA write to memory exposed by the GPU, where a kernel polls to detect new data, but we couldn't make the GPU write to the FPGA's memory. Specifically, we couldn't find any function that lets us give the GPU an FPGA physical address it could use (CUDA wants an address in system memory and can't handle memory on the PCIe bus). So today, the FPGA has to read from GPU memory when the CPU tells it to, which is quite inefficient and causes other problems (optimal synchronization is difficult).
DirectGMA seems to be exactly what we need: the DirectGMA page says it allows a GPU to write directly to the memory of a device that supports DirectGMA. Besides that, we are very interested in using OpenCL instead of CUDA.
My questions are: what does an FPGA board need in order to be considered a device supporting DirectGMA? And is it possible to use the same polling mechanism on the GPU side to detect fresh data? We would prefer not to rely on interrupts, as our application is very latency-sensitive.
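To make the polling mechanism concrete, here is a minimal host-side sketch of the handshake we mean. It is not our actual GPU or FPGA code: a CPU thread stands in for the FPGA, the polling loop stands in for the GPU kernel, and the names `Mailbox`, `producer_write`, `consumer_poll` and `run_polling_demo` are hypothetical. The real version would poll a buffer in GPU memory that the FPGA writes over PCIe.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical mailbox layout: one payload word plus a sequence counter.
// In the real system this would live in GPU memory exposed on the PCIe bus.
struct Mailbox {
    std::uint32_t payload = 0;
    std::atomic<std::uint32_t> seq{0};
};

// Stand-in for the FPGA: write the payload first, then publish it by
// updating the sequence counter (release ordering keeps that order visible).
void producer_write(Mailbox& box, std::uint32_t value, std::uint32_t new_seq) {
    box.payload = value;
    box.seq.store(new_seq, std::memory_order_release);
}

// Stand-in for the polling GPU kernel: spin until the sequence counter
// differs from the last value seen, then read the payload.
std::uint32_t consumer_poll(Mailbox& box, std::uint32_t last_seq) {
    while (box.seq.load(std::memory_order_acquire) == last_seq) {
        // Busy-wait: no interrupt involved, latency is bounded by the poll loop.
    }
    return box.payload;
}

// Runs one producer/consumer round trip and returns what the poller received.
std::uint32_t run_polling_demo(std::uint32_t value) {
    Mailbox box;
    std::thread fpga([&box, value] { producer_write(box, value, 1); });
    std::uint32_t got = consumer_poll(box, 0);
    fpga.join();
    return got;
}
```

Calling `run_polling_demo(42)` returns the value once the poller sees the counter change. The point of the pattern is that the payload is written before the flag, so the reader never sees a fresh flag with stale data. What we would like to know is whether the same scheme still works when the writer is a DirectGMA device.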