jajce85
Journeyman III

Work Groups & Work Units corresponding to GPU threads

Hi, I was wondering if someone could explain how work groups and work units get executed on ATI hardware in terms of kicking off threads (the smallest instance of execution, i.e. a single kernel instance)?

Say I have n work groups, each with m work units, operating on an array of length K (just a simple 1-to-1 copy in the kernel, with no criss-crossing between global index and array index). How many threads will get executed on, say, a Radeon 4870 (which has 10 compute units), and how will the work groups and work units get split up between the compute units and their sub-units in the GPU?

Basically, it would be good if someone wrote a chronology/timeline of execution explaining how threads are connected to the work units and work groups.

For example:

Execute instance 0: 4 compute units work on work groups 0-3 -> thread 0 in ComputeUnit0 works on WorkUnit0 in WorkGroup0, t1 in CU0 works on WU1 in WG0, and so on...
Execute instance 1: 4 compute units work on work groups 4-7 -> thread 0 in ComputeUnit0 works on WorkUnit0 in WorkGroup4 ...
...

I hope I am not asking for too much :)

1 Reply
genaganna
Journeyman III


Please read the following document for more details on this. It is a good document for beginners:

http://developer.amd.com/gpu_assets/Stream_Computing_User_Guide.pdf
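
As a rough illustration of the mapping asked about above, here is a minimal Python sketch (plain host-side arithmetic, not GPU code). It assumes a one-dimensional launch with zero global offset, a wavefront width of 64 as AMD documents for this hardware generation, and the 10 compute units quoted in the question. The round-robin hand-out of work groups to compute units is purely an assumption for illustration; the real dispatcher's order is not something the programming model specifies. All function names are hypothetical.

```python
import math

WAVEFRONT_SIZE = 64      # assumed wavefront width for the Radeon HD 4870 generation
NUM_COMPUTE_UNITS = 10   # figure quoted in the question


def global_id(group_id, local_id, local_size):
    """Global index of a work unit in a 1-D launch with zero global offset."""
    return group_id * local_size + local_id


def groups_needed(array_length, local_size):
    """Number of work groups (n) needed to cover an array of length K."""
    return math.ceil(array_length / local_size)


def wavefronts_per_group(local_size, wavefront_size=WAVEFRONT_SIZE):
    """A work group is executed as one or more wavefronts on a single compute unit."""
    return math.ceil(local_size / wavefront_size)


def sketch_schedule(num_groups, num_cus=NUM_COMPUTE_UNITS):
    """ASSUMPTION: round-robin assignment of work groups to compute units.

    Illustrative only; the hardware is free to dispatch groups in any order.
    """
    return {group: group % num_cus for group in range(num_groups)}


if __name__ == "__main__":
    K, m = 4096, 256
    n = groups_needed(K, m)
    print("work groups:", n, "wavefronts per group:", wavefronts_per_group(m))
    for group, cu in sketch_schedule(n).items():
        first = global_id(group, 0, m)
        last = global_id(group, m - 1, m)
        print(f"WG{group} -> CU{cu}, array elements {first}..{last}")
```

The one thing this sketch can state firmly is the index arithmetic: every work unit in a group stays on the same compute unit, and for a 1-to-1 copy each one touches exactly the array element given by its global index.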
