Hi Peter,
A gather array as input with a normal output stream works just fine for this: I have higher-order finite-difference code working in exactly the same way.
For example:
kernel void compute_diff( double in_stream[], out double out_stream<> ) {
// get the index of the out stream being treated
int4 outpos = instance();
int xpos = outpos.x;
// compute the finite difference
out_stream = in_stream[xpos+1] - in_stream[xpos] ;
}
That's all there is to it. If you declare the receiving stream to be one element shorter than the original input, you will have no problem with the last value. If both streams have the same length, you need to deal with the last position in out_stream, where in_stream[xpos+1] would read past the end of the input: either call the kernel on out_stream.domain() with the appropriate limit, or use the kernel offset feature in Brook+ 1.4.
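If you want to check the kernel's results, a plain C reference on the CPU is handy. This is only a sketch: the function name forward_diff and its signature are my own for illustration, not part of Brook+. It computes the same difference and makes the "one shorter" output length explicit.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the forward difference computed by compute_diff.
 * forward_diff is a hypothetical helper name, not a Brook+ function.
 * in has n elements; out must have room for n - 1 elements.
 * Returns the number of differences written. */
size_t forward_diff(const double *in, size_t n, double *out)
{
    assert(in != NULL && out != NULL);
    if (n < 2) {
        return 0;  /* no pair of neighbours, nothing to compute */
    }
    for (size_t i = 0; i + 1 < n; ++i) {
        out[i] = in[i + 1] - in[i];  /* same expression as the kernel body */
    }
    return n - 1;
}
```

For an input of n values the loop stops at index n - 2, so in[i + 1] never goes past the end of the array; that is the same effect as giving the kernel an output stream one element shorter than the input.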
Regards,
Olivier