Cuorematto,
1) 'w.r.t.' is shorthand for 'with respect to'. I've asked our documentation guy to spell it out instead of using the abbreviation.
2) calShutdown needs to be called when you are done using CAL and before calInit is called again. This usually happens at the end of a program, but it doesn't have to. calInit and calShutdown are not thread safe, so they should only be called from the main thread of a program. They also work as a pair: for every calInit there needs to be a matching calShutdown, which returns the CAL subsystem to its original state and allows calInit to be called again. Nesting of calInit and calShutdown is not allowed.
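To make the pairing rule concrete, here is a small C sketch. It does not use the real CAL headers: model_init and model_shutdown are hypothetical stand-ins that only model the "one init, one matching shutdown, no nesting" rule described above, and how the real calInit/calShutdown report a violation is not something this sketch claims to reproduce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins that only model the pairing rule.
   They are NOT the real calInit/calShutdown from CAL. */
static bool subsystem_up = false;

/* Succeeds only if the subsystem is currently down (no nesting). */
static bool model_init(void) {
    if (subsystem_up) {
        return false;
    }
    subsystem_up = true;
    return true;
}

/* Succeeds only if there is an outstanding init to match. */
static bool model_shutdown(void) {
    if (!subsystem_up) {
        return false;
    }
    subsystem_up = false;
    return true;
}

/* Walks through allowed and disallowed call sequences.
   Returns true if the model behaved as the rule says it should. */
static bool demo_pairing(void) {
    bool first_init   = model_init();      /* allowed: first init */
    bool nested_init  = model_init();      /* rejected: nested init */
    bool first_down   = model_shutdown();  /* allowed: matches first init */
    bool extra_down   = model_shutdown();  /* rejected: nothing to match */
    bool second_init  = model_init();      /* allowed again after shutdown */
    bool second_down  = model_shutdown();  /* allowed: matches second init */

    return first_init && !nested_init && first_down &&
           !extra_down && second_init && second_down;
}
```

Calling demo_pairing() from the main thread and asserting on its result shows the whole cycle: init, shutdown, then init again is fine, while init inside init is not.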
3) The domain is the 2D range of data you want to run over. You can think of it as a matrix of data to run your kernel on. So, with a domain of 256x256, your kernel runs 65,536 times (256 x 256), once per element.
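If it helps, the arithmetic can be sketched in plain C. This is only a CPU-side model with hypothetical helper names (domain_invocations, visit_domain); it makes no CAL calls, and the loop is just a way to count positions, not a statement about how the hardware schedules the work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of kernel invocations for a 2D domain. */
size_t domain_invocations(size_t width, size_t height) {
    return width * height;
}

/* Hypothetical CPU-side model: visit every (x, y) position in the
   domain exactly once and count the visits. */
size_t visit_domain(size_t width, size_t height) {
    size_t count = 0;
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            ++count;  /* one kernel invocation per domain element */
        }
    }
    /* The loop count must agree with the simple product. */
    assert(count == domain_invocations(width, height));
    return count;
}
```

For a 256x256 domain, both helpers return 65536.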
4) There are currently no options that let the user choose compiler optimizations. This leaves the compiler free to target your kernel at the specific architecture it will run on, which might have very different constraints than the architecture the kernel was originally developed on.
Hope this answers your questions. Let me know if you have any others.