Why should the number of work items in a work group be a multiple of 64?
The A8-3850 has a total of 400 SIMD engines and 5 CUs, so I guessed the best number of work items per work group would be 80*N (400/5). But the profiler suggests N*64. Can anyone help me see where my understanding is wrong?
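Each CU has 16 SIMD engines. Each work group is assigned to one CU, and a CU operates on wavefronts. One wavefront executes over four clock ticks, during which it processes 4*16=64 work items. A work group whose size is not a multiple of 64 therefore leaves part of its last wavefront idle.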
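To put rough numbers on this, here is a small Python sketch of the arithmetic. It assumes the wavefront size of 64 given in the reply (16 lanes over 4 ticks); the function names are made up for illustration and are not part of any OpenCL API. On real hardware the preferred multiple can be queried with clGetKernelWorkGroupInfo and CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE rather than hard-coded.

```python
# Assumption: one wavefront covers 64 work items (16 lanes x 4 ticks),
# as described in the reply above.
WAVEFRONT_SIZE = 64


def wavefronts_needed(work_group_size: int) -> int:
    """Wavefronts required to cover a work group (ceiling division)."""
    return -(-work_group_size // WAVEFRONT_SIZE)


def lane_utilization(work_group_size: int) -> float:
    """Fraction of wavefront slots occupied by real work items."""
    slots = wavefronts_needed(work_group_size) * WAVEFRONT_SIZE
    return work_group_size / slots


def round_up_to_wavefront(work_group_size: int) -> int:
    """Smallest multiple of the wavefront size that fits the work group."""
    return wavefronts_needed(work_group_size) * WAVEFRONT_SIZE


if __name__ == "__main__":
    # Compare the 80*N guess from the question with the 64*N suggestion.
    for size in (64, 80, 128, 160):
        print(size, wavefronts_needed(size), f"{lane_utilization(size):.1%}")
```

Under these assumptions a work group of 80 needs two wavefronts and fills only 80 of the 128 slots, while 64 or 128 fills every slot.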