cadorino
Journeyman III

Kernel occupancy and workgroup size

Hi everybody.

I'm developing a benchmark to estimate kernel completion time on integrated and discrete GPUs as a function of the number of operations executed per byte transferred.

The kernel is very simple and "useless". Simply put, each thread reads the same constant argument and adds this value to an accumulator variable a certain number of times.
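
To make the description concrete, here is a minimal plain-C sketch of what each thread does. It is only a model, not the real kernel source: the float type and the names work_item_accumulate, value and iterations are assumptions made for illustration.

```c
/* Plain-C model of the work done by one thread (work-item).
   Not the actual kernel: the float type and all names are assumptions. */
float work_item_accumulate(float value, unsigned int iterations)
{
    float acc = 0.0f;                 /* per-thread accumulator */
    for (unsigned int i = 0; i < iterations; ++i) {
        acc += value;                 /* same constant argument added each time */
    }
    return acc;
}
```

Raising iterations increases the number of operations per byte read, since the input stays a single constant argument.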

What surprised me a little is how the kernel occupancy changes as I vary the global size and the work group size.

In particular, I set the global size to 256K and the work group size to 64. On the 7970 the occupancy is 100%. On the A8-3850 (Llano) the occupancy is 25%. If I double the work group size (to 128), the occupancy of the integrated GPU becomes 50%.
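
For reference, the launch geometry behind these numbers can be worked out as in the sketch below. It only counts work groups and wavefronts; it does not compute occupancy. The wavefront width of 64 used in the usage note is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work groups in a 1-D launch.
   Assumes the global size is a multiple of the work group size. */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size != 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Wavefronts needed to cover one work group (rounded up).
   wavefront_size is passed in; 64 is assumed here, not measured. */
size_t wavefronts_per_group(size_t local_size, size_t wavefront_size)
{
    assert(wavefront_size != 0);
    return (local_size + wavefront_size - 1) / wavefront_size;
}
```

With a global size of 256K (262144), a work group size of 64 gives 4096 work groups of one wavefront each, and a work group size of 128 gives 2048 work groups of two wavefronts each, under the 64-wide assumption.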

Can you help me understand why this happens?

Thank you!

10 Replies