There are two *very* annoying bugs in the Brook+ API:
When a stream is used for the first time in the lifetime of the process, it creates a new instance of brook::Runtime, which in turn initializes CALRuntime or CPURuntime. This runtime instance is then used for the rest of the process's lifetime.
Things start to break when streams are used from different threads. Say the main thread uses a stream right after the process is created, and later on a worker thread uses a stream as well. The second call (with a different thread ID) causes brook::Runtime to create a new runtime environment. It calls CALRuntime::initialize(), which in turn calls calInit(). This second calInit() call fails, and so does the runtime creation.
In effect, for the whole lifetime of the process we can only use Brook from the very thread that first used any stream. This could easily be fixed by having the Brook API keep an internal table of threads and their specific contexts. I want neither to re-implement my code on top of CAL itself nor to locally patch the Brook API.
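To make the suggestion concrete, here is a minimal sketch of such a table. It is not Brook+ code: ContextTable, ThreadContext, contextFor() and the counters are names I made up for illustration, and the "global init" is just a counter standing in for the single calInit() call. It also assumes the underlying runtime is fine with one global initialization followed by one context per thread, which I have not verified.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for whatever per-thread state the runtime needs.
struct ThreadContext {
    std::thread::id owner;
};

// Hypothetical registry (not part of Brook+): global initialization runs
// once per process, while each thread gets its own context on first use.
class ContextTable {
public:
    // Context of the calling thread.
    ThreadContext& current() { return contextFor(std::this_thread::get_id()); }

    // Context for an explicit thread id, created on first request.
    ThreadContext& contextFor(std::thread::id id) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!initialised_) {
            ++globalInits_;       // stands in for the one and only calInit()
            initialised_ = true;
        }
        assert(globalInits_ == 1);  // never initialize the backend twice
        std::unique_ptr<ThreadContext>& slot = contexts_[id];
        if (!slot) slot.reset(new ThreadContext{id});
        return *slot;
    }

    int globalInits() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return globalInits_;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return contexts_.size();
    }

private:
    mutable std::mutex mutex_;
    bool initialised_ = false;
    int globalInits_ = 0;
    std::map<std::thread::id, std::unique_ptr<ThreadContext>> contexts_;
};
```

The point of the design is only that a second thread triggers a new table entry instead of a second global initialization. Cleaning up the entry when a thread exits is left out here.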
There are numerous references to std::cout in the API. Please remove *all* of them, or redirect them to stderr instead of stdout. We need complete control over what is printed to stdout, and a library which inserts its error messages into a stream of otherwise binary data is a pain :-\
One example is in Runtime.cpp, line 150:
    if(_runtime == NULL)
        std::cout << "Failed to initialize CAL runtime, falling back to CPU\n";