It's only a short note in the glossary of both the IL reference guide (July 2010, v.2.0d) and the Stream Computing programming guide (rev2.01).
So, correspondingly, I'm using CAL + IL.
In the example from the post above, if I allocate the resource with the GLOBAL_BUFFER flag and run only Kernel_B, I get a 0% hit rate. If I omit that flag, I get a 50% hit rate (and, of course, a shorter runtime).
Thx.