Can barriers improve performance?

Barriers are clearly necessary for synchronization in almost every kernel. But can they also affect the wavefront scheduler, for example by making it execute the wavefronts of a workgroup closer together at a barrier? I am just speculating.


My other question is about the mem_fence function. It looks like a synchronization primitive, but it is non-blocking. Is there any situation where it would be preferred over a barrier?