    OpenCL Pipes


      I'm trying to learn more about pipes, and in particular about AMD's implementation of pipes on Windows.


      I have two tightly coupled pieces of code that I would like to run as two concurrent
      kernels connected by pipes, but I am concerned that performance may be poor.

      So, questions:


      1) How do I ensure that the two kernels will run concurrently?

      2) What kind of performance should I expect in the following situation? There are two
      kernels, k1 and k2, and two pipes, p1 and p2. In k1's inner loop, k1 writes a byte to p1;
      k2 reads it and sends back a response byte on p2, which k1 waits for.
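
      To make the second question concrete, here is roughly what I have in mind. This is an
      untested sketch in OpenCL C 2.0, not working code: the kernel signatures, the `out`
      buffer, the iteration count `n`, and the `request + 1` response are placeholders I made
      up for illustration, and I am assuming each kernel is launched with a single work-item.

```c
// OpenCL C 2.0 device code (build with -cl-std=CL2.0). Untested sketch.
// Assumption: each kernel is enqueued with exactly one work-item.

// k1: sends one byte per iteration on p1, then waits for the reply on p2.
__kernel void k1(__write_only pipe uchar p1,
                 __read_only pipe uchar p2,
                 __global uchar *out,   // hypothetical result buffer, n bytes
                 int n)                 // hypothetical iteration count
{
    for (int i = 0; i < n; ++i) {
        uchar request = (uchar)i;
        uchar response;

        // write_pipe returns 0 on success and a negative value if the pipe is full.
        while (write_pipe(p1, &request) != 0) { }

        // read_pipe returns 0 on success and a negative value if the pipe is empty.
        // This spin only terminates if k2 is really running at the same time.
        while (read_pipe(p2, &response) != 0) { }

        out[i] = response;
    }
}

// k2: reads each request byte from p1 and sends a response byte back on p2.
__kernel void k2(__read_only pipe uchar p1,
                 __write_only pipe uchar p2,
                 int n)
{
    for (int i = 0; i < n; ++i) {
        uchar request;
        while (read_pipe(p1, &request) != 0) { }

        uchar response = (uchar)(request + 1);  // placeholder computation
        while (write_pipe(p2, &response) != 0) { }
    }
}
```

      As far as I understand, the spin loops above would hang if the runtime ran k1 to
      completion before starting k2, which is why the first question matters so much to me.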