I wanted to ask how many users have had real success using Brook+ (NOT CAL). By "real success" I mean a complex, real-world application (not simple, not embarrassingly parallel) written in Brook+ that delivers a performance improvement.
I think it would be very interesting to see, and I welcome any and all posts. If you are willing, please post the application and the speedup you achieved. That would be great!
Personally, I have had little to no success implementing an LBM code in Brook+. I have switched to CUDA/Cell for the time being while I wait for Brook+ to become more mature and better documented.