boxerab
Challenger

Seeking sum algorithm for GPU

I have a workgroup of size N and a local array "buffer" of size N.

For each work item in the group, with local id k, I want to calculate the sum S of all array items with index less than k, i.e.

for (int i = 0; i < k; ++i)
    S += buffer[i];

Currently, I calculate this naively, as above. Is there a more efficient way of doing this, where intermediate sums are stored back into buffer, for example?

1 Reply
realhet
Miniboss

If you first sum every pair, radix style, you can reuse those partial sums later. That way each thread needs only about half as many adds.

In practice, though, the inter-thread communication this requires would be slow through local memory.
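For illustration, one common way to take the pairwise-reuse idea further is a work-efficient exclusive prefix sum (often called a Blelloch scan), which stores the intermediate sums back into the buffer. The following is only a host-side C sketch under stated assumptions, not a kernel and not necessarily what was meant above: the function name `exclusive_scan` is hypothetical, N is assumed to be a power of two, the inner `k` loop stands in for the work items of the group, and each outer-loop phase would need a `barrier(CLK_LOCAL_MEM_FENCE)` in a real OpenCL kernel.

```c
#include <assert.h>

/* Hypothetical helper: in-place exclusive scan of buffer[0..n-1].
 * Afterwards buffer[k] holds the sum of the original items with index < k.
 * Assumption: n is a power of two. */
static void exclusive_scan(int *buffer, int n)
{
    assert(n > 0 && (n & (n - 1)) == 0);

    /* Up-sweep: sum pairs, then pairs of pairs, building a tree of
     * partial sums in place. Each stride is one barrier-separated phase. */
    for (int stride = 1; stride < n; stride *= 2) {
        for (int k = 0; k < n; ++k) {          /* k plays the local id */
            int idx = (k + 1) * stride * 2 - 1;
            if (idx < n)
                buffer[idx] += buffer[idx - stride];
        }
    }

    /* The last slot now holds the total; clear it to seed the down-sweep. */
    buffer[n - 1] = 0;

    /* Down-sweep: push prefix sums back down the tree, reusing the
     * partial sums left behind by the up-sweep. */
    for (int stride = n / 2; stride >= 1; stride /= 2) {
        for (int k = 0; k < n; ++k) {
            int idx = (k + 1) * stride * 2 - 1;
            if (idx < n) {
                int left = buffer[idx - stride];
                buffer[idx - stride] = buffer[idx];
                buffer[idx] += left;
            }
        }
    }
}
```

Within a phase the active work items touch disjoint pairs of slots, so the sequential loop gives the same result a parallel group would. The whole scan performs 2(N - 1) adds across about 2*log2(N) phases, compared with N(N - 1)/2 adds in total for the naive loop. Whether that wins in practice depends on how expensive the barriers and local-memory traffic are on the target device.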
