Sometimes when I transfer a single float4 from the CPU to the GPU, it can take 15 ms. This happens about 1% of the time.
Although most of the transfers cost much less, we still can't afford that cost when it does happen.
What's the reason? Is there any way to avoid it?
I'd really appreciate your help!
Using asynchronous data transfers can help avoid this cost, since the host thread no longer blocks while the copy is in flight.
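A minimal sketch of what an asynchronous transfer could look like, assuming the CUDA runtime API (the question doesn't say which API is in use). The helper name `run_demo` and the values being copied are made up for illustration. It stages the float4 in pinned host memory, queues the copy on a stream with `cudaMemcpyAsync`, and only waits at the point where the data is actually needed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Return 1 from the enclosing function on any CUDA error.
// (A sketch: resources are not released on the error path.)
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s\n", cudaGetErrorString(err_));  \
            return 1;                                                \
        }                                                            \
    } while (0)

// Hypothetical helper: copies one float4 to the GPU asynchronously.
int run_demo() {
    float4 *h_val = nullptr;  // pinned (page-locked) host staging buffer
    float4 *d_val = nullptr;  // destination on the device
    cudaStream_t stream;

    // Pinned host memory is what lets the copy proceed without blocking the host.
    CHECK(cudaMallocHost((void **)&h_val, sizeof(float4)));
    CHECK(cudaMalloc((void **)&d_val, sizeof(float4)));
    CHECK(cudaStreamCreate(&stream));

    *h_val = make_float4(1.0f, 2.0f, 3.0f, 4.0f);  // example values

    // Queue the copy; with pinned memory this call returns without waiting for it.
    CHECK(cudaMemcpyAsync(d_val, h_val, sizeof(float4),
                          cudaMemcpyHostToDevice, stream));

    // ... other CPU work can run here while the copy is in flight ...

    // Wait only when the data is actually needed on the device.
    CHECK(cudaStreamSynchronize(stream));

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d_val));
    CHECK(cudaFreeHost(h_val));
    return 0;
}

int main() {
    return run_demo();
}
```

Note that this only hides the transfer time behind other host work; if the host has nothing else to do before the synchronize call, an occasional slow copy will still show up as a wait there.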