I was wondering if there are any plans to allow for (general) non-rectangular domains in CAL.
For example, in a program that operates on the lower or upper half of a matrix it'd be great to have triangular domains. I've seen a factor-of-two speedup in my Cholesky factorization code by going from one big square domain (with the lower threads just exiting immediately) to multiple rectangular domains of fixed, smallish height and decreasing width covering the triangle (leaving fewer threads that have to exit straight away), but this can't be optimal. Generating one's own maps from rectangular to non-rectangular domains is also tricky, particularly when multiple programs have to run on different parts of a resource. Does anybody know any other tricks?
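To make the strip idea concrete, here is a rough sketch in plain Python (not CAL code). It assumes the active region is the upper triangle, i.e. points with column >= row, and the names `triangle_strips` and `wasted_fraction` are made up for illustration. Each tuple is the rectangle that would be launched as one domain.

```python
def triangle_strips(n, strip_height):
    """Cover the upper triangle (col >= row) of an n x n matrix with rectangles.

    Returns a list of (x, y, width, height) tuples, one per strip.
    Hypothetical helper for illustration; not part of CAL.
    """
    if n < 0 or strip_height <= 0:
        raise ValueError("n must be >= 0 and strip_height must be > 0")
    strips = []
    for y in range(0, n, strip_height):
        height = min(strip_height, n - y)  # the last strip may be shorter
        # Rows y .. y+height-1 need columns y .. n-1, so the strip starts at
        # x = y and gets narrower as y increases.
        strips.append((y, y, n - y, height))
    return strips


def wasted_fraction(n, strip_height):
    """Fraction of launched threads that fall below the diagonal and exit."""
    launched = sum(w * h for _, _, w, h in triangle_strips(n, strip_height))
    useful = n * (n + 1) // 2  # points with col >= row, diagonal included
    return (launched - useful) / launched if launched else 0.0


if __name__ == "__main__":
    for h in (8, 4, 2, 1):
        print(h, triangle_strips(8, h), wasted_fraction(8, h))
```

With `strip_height` equal to `n` this degenerates to the single square domain, where a little under half the threads exit immediately. Smaller strips waste fewer threads but need more launches, so the best height presumably depends on the per-launch overhead.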
I recently came across the old ATI CTM Guide, which mentions a facility for conditional program execution that would be ideal for this specific case (i.e. the kernel would run only for points in the domain whose x coordinate is greater than the y coordinate, say). Is this facility still available on modern cards, and might such functionality be made available in CAL?