About the work-group scheduling mechanism

Discussion created by PavelKudrin on Feb 17, 2010
Latest reply on Feb 19, 2010 by omkaranathan

I have a Radeon HD 4870x2 card (RV770 GPU), and I found experimentally that the number of work-groups processed simultaneously is 32.

I don't understand where this number comes from.

As far as I know, CL_DEVICE_MAX_COMPUTE_UNITS = 10 for RV770. Why 32, then?

Additional info:

a) works fine:

globalWorkSize = 8192

localWorkSize = 256

b) doesn't work:


globalWorkSize = 8448

localWorkSize = 256

c) doesn't work:


globalWorkSize = 16384

localWorkSize = 256


The local work size is obtained from clGetKernelWorkGroupInfo(..., CL_KERNEL_WORK_GROUP_SIZE, ...).
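
To make the numbers easier to compare, here is a minimal sketch in C of the work-group count implied by each configuration, assuming a one-dimensional NDRange whose global size is a multiple of the local size. The name work_group_count is a hypothetical helper for illustration, not part of the OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL call): number of work-groups in a
   1-D NDRange. Assumes global_size is an exact multiple of local_size. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    assert(global_size % local_size == 0);
    return global_size / local_size;
}
```

With localWorkSize = 256, case a) gives 8192 / 256 = 32 work-groups, case b) gives 8448 / 256 = 33, and case c) gives 16384 / 256 = 64, so the failures start as soon as the count goes above 32.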

So I have the following questions:

1) How many work-groups can be processed simultaneously?

2) If that number exists and is finite, does the number of simultaneously processed work-groups depend on the GPU type?

3) Also, if that number exists and is finite, how can it be retrieved programmatically by querying the device (as the number of SIMD engines can be with clGetDeviceInfo( ..., CL_DEVICE_MAX_COMPUTE_UNITS, ... ))?