Is there any relative performance drop when using a domain of execution?
Comparing the following two situations, which one performs better?
1. Reuse the old (existing) array for output, then call the kernel with a domain of execution.
2. Declare a new output array with the same size as the domain of execution in situation 1, then call the kernel without a domain of execution.