Is there anyone? Of course, it could be for a small domain, so that it would be more precise.
Maybe some information would help, like the number of input/output parameters, how much branching you use, how many threads are likely to be deployed, and how long the kernel function is.
Ultimately, your experience of when to use a parallel function and when not to would be a big help.