Archives Discussions

ManeOne · ‎09-28-2010

The AMD OpenCL programming guide (page 69) states that work-groups retire in order on the HD 5000 series GPUs. Is this behavior defined in the OpenCL specification, or does it only apply to the HD 5000 series?

One or more work-groups execute on each compute unit. On the ATI Radeon™
HD 5000-series GPUs, work-groups are dispatched in a linear order, with x
changing most rapidly. For a single dimension, this is:
DispatchOrder = get_group_id(0)
For two dimensions, this is:
DispatchOrder = get_group_id(0) + get_group_id(1) * get_num_groups(0)
This is row-major-ordering of the blocks in the index space. Once all compute
units are in use, additional work-groups are assigned to compute units as
needed. Work-groups retire in order, so active work-groups are contiguous.
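
The two formulas quoted above can be checked on the host with a small C sketch. The function names and parameters here are hypothetical stand-ins for get_group_id(0), get_group_id(1) and get_num_groups(0); this only models the arithmetic and says nothing about what a given device actually does.

```c
#include <stddef.h>

/* Hypothetical host-side model of the dispatch order quoted from the guide.
   group_id_x, group_id_y stand in for get_group_id(0), get_group_id(1);
   num_groups_x stands in for get_num_groups(0). */

/* One dimension: the dispatch order is just the group id. */
size_t dispatch_order_1d(size_t group_id_x)
{
    return group_id_x;
}

/* Two dimensions: row-major, with x changing most rapidly. */
size_t dispatch_order_2d(size_t group_id_x, size_t group_id_y,
                         size_t num_groups_x)
{
    return group_id_x + group_id_y * num_groups_x;
}
```

For a grid that is 4 groups wide, group (2, 3) would get dispatch order 2 + 3 * 4 = 14.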

I have a kernel that takes two arrays of strings. It takes one string from array A and searches for a match in array B. The end result is a global array of ints used as booleans (not optimal, but easy, and my graphics card is only OpenCL 1.0 compliant). A floating-point score is then calculated from that array. This part of the code is highly sequential because it involves running sums, but I would like to calculate the score in the kernel. My idea is that if work-groups retire in order, I could use a local barrier and then have the last work-item perform the sequential score calculation.

himanshu_gautam · ‎09-28-2010

The ordering of work-groups is implementation-defined; it is not defined by the OpenCL specification.

In your case, it would be safer to write a separate kernel.

ManeOne · ‎09-28-2010

That's what I kinda expected.... Thanks for the reply

edward_yang · ‎10-02-2010

I don't think your case requires the work-groups to retire in order, though. Unless I am misunderstanding your problem, you should be able to use mem_fence and an atomic add to synchronize multiple work-groups at the end of the kernel execution (and find out the precise order in which each work-group retires).
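
The atomic-counter idea above can be modeled on the host with C11 atomics. This is only a rough sketch, not kernel code: the retire_counter type and its functions are hypothetical names, and each call to retire_ticket stands in for one work-group finishing. In an actual kernel the increment would presumably be atom_inc from the cl_khr_global_int32_base_atomics extension on OpenCL 1.0 (atomic_inc from OpenCL 1.1 on), preceded by mem_fence(CLK_GLOBAL_MEM_FENCE). Whether the other groups' results are really visible to the last group at that point is something to verify on the target device.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical host-side model of a "who retires last" counter.
   In a kernel, `retired` would live in global memory, zeroed by the host. */
typedef struct {
    atomic_int retired;  /* number of groups that have finished so far */
    int num_groups;      /* total number of groups in the launch */
} retire_counter;

void retire_counter_init(retire_counter *c, int num_groups)
{
    atomic_init(&c->retired, 0);
    c->num_groups = num_groups;
}

/* Called once per group after its results are written.
   Returns a 0-based ticket: the order in which this group retired.
   atomic_fetch_add is sequentially consistent by default, so it also
   orders the caller's earlier writes before the increment. */
int retire_ticket(retire_counter *c)
{
    return atomic_fetch_add(&c->retired, 1);
}

/* The group holding the final ticket knows every other group has finished,
   so it is the one that could run the sequential score calculation. */
bool is_last_to_retire(const retire_counter *c, int ticket)
{
    return ticket == c->num_groups - 1;
}
```

With this scheme the retirement order does not need to match the dispatch order: whichever group draws the last ticket does the sequential work.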

himanshu_gautam · ‎10-03-2010

As of now, global synchronization is simply not supported.

There might be some way to force it, though, as described at this link:

http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=140248&highlight_key=y

Work-group execution order