The clAmdFft reference manual states:
For a more complex example, an input buffer contained a raster grid of 1024 x 1024 monochrome pixel values, and you want to compute a 2D FFT for each 64 x 64 subtile of the grid. Specifying strides allows you to treat each horizontal band of 1024 x 64 pixels as an array of 16 64 x 64 matrixes, and process an entire band with a single call to clAmdFftEnqueueTransform. (Specifying strides is not quite flexible enough to transform the entire grid of this example with a single kernel execution.) It is possible to create a Plan to compute arrays of 64 x 64 2D FFTs, then specify three strides: [1, 1024, 64]. The first stride, 1, indicates that the rows of each matrix are stored consecutively; the second stride, 1024, gives the distance between rows, and the third stride, 64, defines the distance from matrix to matrix. Then call clAmdFftEnqueueTransform 16 times: once for each horizontal band of pixels.
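To check my reading of those strides, here is a minimal sketch of the index arithmetic they imply within one band. It assumes a row-major grid and counts in elements rather than bytes; `band_element_index` and the constant names are my own illustration, not part of the clAmdFft API.

```c
#include <assert.h>
#include <stddef.h>

/* Geometry from the manual's example, in elements (not bytes). */
enum { GRID_W = 1024, TILE = 64, TILES_PER_BAND = GRID_W / TILE };

/* The three strides quoted from the manual: [1, 1024, 64]. */
static const size_t STRIDE_COL  = 1;    /* next element within a row      */
static const size_t STRIDE_ROW  = 1024; /* next row within a 64 x 64 tile */
static const size_t STRIDE_TILE = 64;   /* next tile within the band      */

/* Hypothetical helper: index of element (row, col) of tile `tile`,
 * measured from the first element of the band that contains it. */
static size_t band_element_index(size_t tile, size_t row, size_t col)
{
    assert(tile < TILES_PER_BAND && row < TILE && col < TILE);
    return tile * STRIDE_TILE + row * STRIDE_ROW + col * STRIDE_COL;
}
```

Every index is relative to the first element of a band, which is why each of the 16 calls needs its own starting point.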
However, calling clAmdFftEnqueueTransform once per band requires pointing each call at the beginning of that band. As far as I can tell, clAmdFftEnqueueTransform only takes input and output buffer parameters; there is no parameter for an offset into a buffer. So how would one point to each separate band for processing, short of creating large numbers of sub-buffers with clCreateSubBuffer?
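For concreteness, these are the band start offsets I would want to pass if such a parameter existed; they are also the values a `cl_buffer_region.origin` would need, in bytes, on the sub-buffer route. This is only a sketch: it assumes a row-major grid, the element size is whatever the buffer actually holds (one 4-byte float per pixel in the usage below is my assumption), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Geometry from the manual's example, in elements. */
enum { GRID_W = 1024, GRID_H = 1024, TILE = 64, NUM_BANDS = GRID_H / TILE };

/* Hypothetical helper: index of the first element of horizontal band `band`.
 * Each band spans TILE full rows of GRID_W elements. */
static size_t band_origin_elements(size_t band)
{
    assert(band < NUM_BANDS);
    return band * TILE * GRID_W;
}

/* Hypothetical helper: the same offset in bytes, for a given element size. */
static size_t band_origin_bytes(size_t band, size_t elem_size)
{
    return band_origin_elements(band) * elem_size;
}
```

With 4-byte elements, band 1 would start at `band_origin_bytes(1, 4)`, i.e. 262144 bytes into the buffer, and each band would cover the same number of bytes.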