I see that a subkernel can both read from and write to a stream in the binomial option code.
What are the restrictions on this?
Could anyone explain?
For example:
gpu_backwardTraverseFirst(A2.x, puByr4.x, pdByr4.x, A1, A1);
gpu_backwardTraverseFirst(tempPrice4.x, puByr4.x, pdByr4.x, A2, A2);
gpu_backwardTraverse8(puByr4.x, pdByr4.x, A1, A2, A1);
The comment says that A1 and A2 are used as temporary variables in the subkernels, which at the same time eliminates the original CPU loop inside the kernel.
I am confused about how the subkernel behaves here. As far as I know, a subkernel just takes some parameters and returns an output, but here it writes into the kernel. Since there are as many threads as elements in the output stream, won't one thread overwrite another thread's stream?
Any help would be appreciated.