OpenCL

pbcpbt
Adept I

How do I get the number of work groups?

What is the best way to find out the number of work groups used by a kernel?

17 Replies
himanshu_gautam
Grandmaster

Re: How do I get the number of work groups?

I think the clGetKernelWorkGroupInfo() API should help.

http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clGetKernelWorkGroupInfo.html

Note that this API returns the work-group size that is safe to use for that kernel on that device.

When enqueuing a kernel, you can specify any work-group size you want, up to that limit.

So the work-group size a kernel is actually running with is not a property of the cl_kernel object.

The application has to keep track of it itself.

HTH

pbcpbt
Adept I

Re: How do I get the number of work groups?

clGetKernelWorkGroupInfo() returns the following:

CL_KERNEL_WORK_GROUP_SIZE

CL_KERNEL_COMPILE_WORK_GROUP_SIZE

CL_KERNEL_LOCAL_MEM_SIZE

How am I supposed to get the number of work groups out of these?

himanshu_gautam
Grandmaster

Re: How do I get the number of work groups?

IMHO, you are interested in knowing the number of workgroups from inside the kernel. The API suggested above is useful when you are launching workgroups from the host side.

To get the number of workgroups from kernel space, use size_t get_num_groups(uint dimindx).

To get the ID of the workgroup a work-item belongs to, use size_t get_group_id(uint dimindx).
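
For readers following along, the relationship between these two built-ins and the launch sizes can be sketched in plain host-side C. This is only an emulation, assuming a 1-D range, a zero global offset, and a global size that is a multiple of the local size. The helper names emulated_num_groups and emulated_group_id are hypothetical and not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the value get_num_groups(0) would report for a 1-D
   launch, assuming global_size is a multiple of local_size. */
size_t emulated_num_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Hypothetical helper: the value get_group_id(0) would report for the
   work-item whose get_global_id(0) is global_id, assuming a zero offset. */
size_t emulated_group_id(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    return global_id / local_size;
}
```

Under these assumptions, a launch of 1024 work-items with a local size of 256 has 4 workgroups, and the work-item with global ID 256 is the first one in workgroup 1.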


pbcpbt
Adept I

Re: How do I get the number of work groups?

I'd prefer to get the number of groups from the host side.

How do I use the suggested API?

himanshu_gautam
Grandmaster

Re: How do I get the number of work groups?

pbcpbt wrote:

I'd prefer to get the number of groups from the host side.

How do I use the suggested API?

What do you mean by that? How could you get a group number from host code? Work-groups only execute on the device.

You can set the number of work-groups to whatever value you please in host code, or you can query the runtime to give you a favorable work-group size.

pbcpbt
Adept I

Re: How do I get the number of work groups?

1. I want to know HOW MANY WORK GROUPS are being executed.

2. I want to know how I can set the number of work groups myself.

himanshu_gautam
Grandmaster

Re: How do I get the number of work groups?

pbcpbt wrote:

1. I want to know HOW MANY WORK GROUPS are being executed.

Check the global_work_size and local_work_size arguments of the clEnqueueNDRangeKernel function. In each dimension:

Number of Workgroups = globalSize / localSize

2. I want to know how I can set the number of work groups myself.

You set the number of workgroups through those same two values. Keep in mind that localSize must divide globalSize evenly.
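
As a rough host-side sketch of the arithmetic above: the total number of workgroups is the product, over all dimensions, of globalSize / localSize, and a global size that is not a multiple of the local size can be padded up to the next multiple. The function names round_up_global and total_work_groups are hypothetical helpers written for illustration, not OpenCL API calls.

```c
#include <assert.h>
#include <stddef.h>

/* Round global_size up to the next multiple of local_size so that the
   local size divides it evenly. */
size_t round_up_global(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    size_t remainder = global_size % local_size;
    return remainder == 0 ? global_size
                          : global_size + (local_size - remainder);
}

/* Total workgroups for a work_dim-dimensional launch: the product over
   all dimensions of global[d] / local[d]. */
size_t total_work_groups(size_t work_dim,
                         const size_t *global, const size_t *local)
{
    size_t total = 1;
    for (size_t d = 0; d < work_dim; ++d) {
        assert(local[d] > 0 && global[d] % local[d] == 0);
        total *= global[d] / local[d];
    }
    return total;
}
```

For example, a global size of 1000 with a local size of 256 would be padded to 1024, giving 4 workgroups. If the global size is padded this way, the kernel needs a bounds check so that the extra work-items do nothing.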

pbcpbt
Adept I

Re: How do I get the number of work groups?

As far as I understand, the number of work groups cannot exceed the maximum number of work groups for a given device.

If I use the "Number of Workgroups = globalSize / localSize" equation as you suggest, I get unrealistic results.

For example, I have a globalSize of around 1,000,000,000 (2^30 to be exact) and the maximum localSize is 256. Your equation gives Number of Workgroups = 4,194,304. One of my devices has a maximum of 248 workgroups, the other 256.

LeeHowes
Staff

Re: How do I get the number of work groups?

Where are you getting these maximum workgroup numbers?

My guess about your numbers is that a Tahiti card with 32 cores can have 8 workgroups per core due to barrier resources, so it would be able to support 256 concurrent workgroups if they had more than one wavefront each. It could support thousands of concurrent single-wavefront workgroups, though, because in that case a workgroup consumes no significantly limited resources beyond what the wavefront itself uses.

If the runtime lets you launch a billion work items, then you will get enough workgroups to satisfy that. They won't all run concurrently. Instead, their entry onto the device is serialized: more are launched as others complete and free resources, but they will all run eventually.
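
To put numbers on this, here is a small sketch using the figures from this thread: 4,194,304 workgroups in total and room for 32 x 8 = 256 at a time. It assumes an idealized model in which the device runs workgroups in back-to-back full rounds. Real hardware refills slots as individual workgroups finish, so this is only an illustration of the serialization, and min_rounds is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Idealized model: if at most `concurrent` workgroups are resident on the
   device at once, how many full rounds does a launch of `total_groups`
   workgroups need? Not a model of the real scheduler. */
size_t min_rounds(size_t total_groups, size_t concurrent)
{
    assert(concurrent > 0);
    return (total_groups + concurrent - 1) / concurrent; /* ceiling division */
}
```

With these figures, min_rounds(4194304, 256) comes to 16384, so the 248 or 256 limit describes how many workgroups are resident at once, not how many a launch may contain.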
