I am currently working on fancyIX/sgminer-phi2-branch (GitHub), a PHI2 implementation of sgminer for AMD.
I have some questions about OpenCL and optimization on GCN cards.
Please add me to the white list.
The first question I'd like help with:
The GPU scheduler can assign multiple work-groups to a CU if enough GPU resources (VGPRs, LDS, etc.) are available. More precisely, the ACEs (Asynchronous Compute Engines) are responsible for all compute shader scheduling and resource allocation. The ACEs manage compute tasks and GPU resources, and create and dispatch work-groups to individual CUs for execution accordingly. This placement is decided dynamically at run time.
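To make the resource argument concrete, here is a rough sketch of how one might estimate how many work-groups can be resident on a single CU. The per-CU limits below (4 SIMDs, 10 wavefronts per SIMD, 256 VGPRs per SIMD lane, 64 KB LDS, wavefront size 64) are my assumptions for a typical GCN part and are not taken from this thread; the function name `workgroups_per_cu` is made up for illustration. It ignores SGPRs, register allocation granularity, and any hardware cap on work-groups per CU, so treat the result as an upper-bound estimate, not what the ACEs will actually do.

```c
/* Assumed GCN per-CU limits (illustrative; verify against your hardware). */
enum {
    SIMDS_PER_CU        = 4,
    MAX_WAVES_PER_SIMD  = 10,
    VGPRS_PER_SIMD_LANE = 256,
    LDS_BYTES_PER_CU    = 64 * 1024,
    WAVEFRONT_SIZE      = 64
};

static int min_int(int a, int b) { return a < b ? a : b; }

/*
 * Hypothetical helper: estimate how many work-groups fit on one CU,
 * given the work-group size, VGPRs used per work-item, and LDS bytes
 * used per work-group.
 */
int workgroups_per_cu(int wg_size, int vgprs_per_item, int lds_bytes_per_wg)
{
    /* A work-group is split into wavefronts of 64 work-items. */
    int waves_per_wg = (wg_size + WAVEFRONT_SIZE - 1) / WAVEFRONT_SIZE;

    /* VGPR usage limits how many wavefronts each SIMD can hold. */
    int waves_per_simd = MAX_WAVES_PER_SIMD;
    if (vgprs_per_item > 0)
        waves_per_simd = min_int(waves_per_simd,
                                 VGPRS_PER_SIMD_LANE / vgprs_per_item);

    /* Limit imposed by wavefront slots across all SIMDs of the CU. */
    int by_waves = (waves_per_simd * SIMDS_PER_CU) / waves_per_wg;

    /* Limit imposed by LDS; no LDS use means no extra constraint. */
    int by_lds = (lds_bytes_per_wg > 0)
                     ? LDS_BYTES_PER_CU / lds_bytes_per_wg
                     : by_waves;

    return min_int(by_waves, by_lds);
}
```

For example, under these assumptions a kernel with 256 work-items per group, 64 VGPRs per work-item, and 8 KB of LDS per group would be VGPR-bound: `workgroups_per_cu(256, 64, 8 * 1024)` gives 4, even though the LDS alone would allow 8.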
So, I think the GPU scheduler will automatically run two or more work-groups on the same CU if :
From an OpenCL programming perspective, device fission could be used in this case; however, device fission is currently not supported for AMD GPUs. Other than that, I'm not aware of any OpenCL feature that can be used to control the association between work-groups and CUs.