I'm new to OpenCL. I'm trying to solve a task-parallel problem in which each value in a grid depends on the already-solved grid values directly above and directly to the left (i.e., in a row-major layout, [i,j] depends on [i,j-1] and [i-1,j]).

I would like to use clEnqueueTask and the event mechanism to enqueue the whole grid, with each cell waiting on the events of the cells it depends on. This seems to be the kind of problem OpenCL was made for.
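To make the idea concrete, here is a minimal sketch of the wait-list bookkeeping I have in mind, in plain C with no OpenCL calls so it compiles on its own. The names cell_wait_list and count_dependency_edges are hypothetical helpers made up for illustration, and the clEnqueueTask call in the comment (with its queue, kernel, wait, and events variables) only shows where I assume the enqueue would go; it is not tested code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes into deps[] the flat row-major indices of the
   cells that cell (i, j) must wait on, and returns how many there are
   (0 for the top-left corner, 1 on the first row or column, 2 otherwise). */
static size_t cell_wait_list(size_t i, size_t j, size_t cols, size_t deps[2])
{
    size_t n = 0;
    if (i > 0)
        deps[n++] = (i - 1) * cols + j;   /* cell directly above */
    if (j > 0)
        deps[n++] = i * cols + (j - 1);   /* cell directly to the left */
    return n;
}

/* Hypothetical helper: walks the grid in row-major order, the order in which
   the cells would be enqueued, and returns the total number of dependency
   edges. The assert checks that every dependency has a smaller flat index,
   so its event would already exist when the current cell is enqueued. */
static size_t count_dependency_edges(size_t rows, size_t cols)
{
    size_t edges = 0;
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            size_t deps[2];
            size_t n = cell_wait_list(i, j, cols, deps);
            for (size_t k = 0; k < n; ++k)
                assert(deps[k] < i * cols + j);
            /* Assumed OpenCL step (not compiled here): copy events[deps[k]]
               into a small wait[] array and call
               clEnqueueTask(queue, kernel, n, n ? wait : NULL,
                             &events[i * cols + j]); */
            edges += n;
        }
    }
    return edges;
}
```

For a grid with R rows and C columns this gives R*(C-1) + (R-1)*C dependency edges in total, so every cell except [0,0] would carry a wait list of one or two events.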

I cannot find any good examples of this to work from.  I am spinning my wheels trying things and failing miserably.  Any pointers would be greatly appreciated.

One specific question: do I need a separate cl_kernel instance for each cell, or can I create one and reuse it for all cells?