Question regarding memory access

In many cases, global memory bandwidth is the performance bottleneck.


If two or more threads in the same work group read the same data from global memory at the same time, or almost the same time, will the GPU read the data only once and broadcast it to all the threads that need it?


What if the threads in DIFFERENT work groups read the same data from global memory?
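
To make the two cases concrete, here is a minimal sketch in plain C that emulates only the indexing on the CPU. It says nothing about what the hardware actually does with these reads; it just shows which addresses the threads would touch. The sizes and the function names `same_within_group` and `same_across_groups` are hypothetical, chosen for illustration; the comments note the OpenCL built-ins they stand in for.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
enum { GROUP_SIZE = 4, NUM_GROUPS = 3, GLOBAL_SIZE = GROUP_SIZE * NUM_GROUPS };

/* Case 1: every thread in a work group reads the same element,
 * in[group_id], so threads in one group share a single address. */
void same_within_group(const float *in, float *out)
{
    for (size_t gid = 0; gid < GLOBAL_SIZE; ++gid) {  /* gid ~ get_global_id(0) */
        size_t group_id = gid / GROUP_SIZE;           /* ~ get_group_id(0) */
        out[gid] = in[group_id];
    }
}

/* Case 2: threads in ALL work groups read the same element, in[0]. */
void same_across_groups(const float *in, float *out)
{
    for (size_t gid = 0; gid < GLOBAL_SIZE; ++gid)
        out[gid] = in[0];
}
```

Here `in` needs at least `NUM_GROUPS` elements and `out` needs `GLOBAL_SIZE` elements. In a real kernel each loop iteration would be one thread, and the question is whether the repeated reads of `in[group_id]` or `in[0]` cost one global memory transaction or many.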


