As discussed in other threads, it is possible to enable the second GPU on an HD 5970 board, but when we do so we see less of a gain than expected: instead of a speedup near the theoretical 2x in kernel execution, we get around 1.4x. So my question is: is there something we can do to mitigate this? For example, is it better to do a memory transfer plus kernel invocation for each GPU in turn, or to issue the memory transfers for all GPUs, wait for them to finish, and then start launching the kernels? Does it make a difference? Thanks a lot for any insight on this.
It's best to do them all simultaneously, using one host thread for each device.
Note, however, that AMD does not support the second device in the 5970.
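
To illustrate the one-thread-per-device idea, here is a minimal sketch in Python. It makes no real OpenCL calls: `run_on_device` and `run_on_all_devices` are hypothetical names, and the squaring loop is only a stand-in for the per-device sequence of memory transfer, kernel launch, and read-back. The point is the structure: the data is split into one chunk per device, and each chunk is handled by its own host thread, so neither device waits for the other to be fed.

```python
import threading


def run_on_device(device_id, chunk, results):
    # Placeholder (assumption): in a real program this thread would own one
    # device's command queue, write the input buffer, enqueue the kernel and
    # read the result back. Here the "kernel" just squares each element.
    results[device_id] = [x * x for x in chunk]


def run_on_all_devices(data, num_devices=2):
    # Split the input into contiguous chunks, one per device (ceiling division).
    size = (len(data) + num_devices - 1) // num_devices
    chunks = [data[i * size:(i + 1) * size] for i in range(num_devices)]

    # Each thread writes only to its own slot, so no lock is needed.
    results = [None] * num_devices
    threads = [
        threading.Thread(target=run_on_device, args=(i, chunks[i], results))
        for i in range(num_devices)
    ]

    # Start every thread before waiting on any of them, so the devices
    # are driven at the same time rather than one after another.
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Stitch the per-device results back together in the original order.
    return [y for part in results for y in part]


if __name__ == "__main__":
    print(run_on_all_devices([1, 2, 3, 4, 5]))
```

In a real program, the body of `run_on_device` would be replaced by the actual transfer and kernel calls for that device. How close this gets to 2x will still depend on the hardware and driver.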