Each compute unit on the 4870 has 16 thread processors, and each thread processor has 5 streaming cores. With 10 compute units on the chip, that gives 16 * 5 * 10 = 800.
Some hardware slides can be found:
Also, there are some well-written architecture reviews on some of the tech sites that go into more detail.
Ahh, OK. So in terms of kicking off work groups and subsequently work units: the way work groups and work units are divided into threads and across streaming cores is up to the driver, right?
In the call to clEnqueueNDRangeKernel, you specify the dimensions of the work-group and the total number of work-items. Each work-group is assigned to a compute unit and completes all of its execution on that compute unit.