The following (simplified) code causes brcc to hang:
kernel void computeEnergy(uint4 old_spin, uint4 new_spin<>, out uint4 energy<>, int ROWS, int COLS)
{
    int4 index = instance();
}
kernel void updateSpin(uint4 spin_in<>, uint4 seeds_in<>, out uint4 spin_out<>, out uint4 seeds_out<>, int num_steps, uint num_spin_states, int ROWS, int COLS)
{
    computeEnergy(spin_in, proposed_spin, proposed_energy, ROWS, COLS);
}
Is the hang caused by the subkernel containing a stream (gather) operation? The manual is a little unclear on this: it says that kernels cannot call stream operators. Perhaps this was supposed to say that subkernels cannot call stream operators?
If the gather stream in the subkernel is the source of my problem, how does one deal with gathers that belong in subkernels? Just inline the code in the calling kernel? Can you perform a gather on stream data that is local to the kernel (i.e., not an input)? And since you can't write to an input stream, how do you do a gather operation inside a loop, where each gather is performed on the result of the previous iteration?
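To make the last question concrete, here is a plain C sketch of the pattern I am after. It is not Brook+ code, and the names gather_step and iterate, as well as the neighbour-sum update, are placeholders I made up for illustration. Each pass gathers from the result of the previous pass, which on the CPU I would do by swapping two buffers between passes. I don't know whether that is the intended way to express it with streams.

```c
#include <stddef.h>

/* One pass: out[i] gathers from its left and right neighbours in `in`
   (periodic boundaries). The sum is a placeholder for the real update. */
static void gather_step(const unsigned *in, unsigned *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        size_t left  = (i + n - 1) % n;
        size_t right = (i + 1) % n;
        out[i] = in[left] + in[right];
    }
}

/* Runs num_steps passes, swapping the roles of a and b after each pass so
   that every pass reads the previous pass's result. Returns whichever
   buffer holds the final result. */
static unsigned *iterate(unsigned *a, unsigned *b, size_t n, int num_steps)
{
    unsigned *src = a;
    unsigned *dst = b;
    for (int s = 0; s < num_steps; ++s) {
        gather_step(src, dst, n);
        unsigned *tmp = src;  /* the output becomes the next pass's input */
        src = dst;
        dst = tmp;
    }
    return src;
}
```

The input buffer is never written during a pass, which matches the restriction that a kernel cannot write to its input stream. The open question is whether the loop has to live on the host side, with one kernel call per pass.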