
joggel
Journeyman III

Re: How to do parallel reduction correctly?

Yeah, this is right!

joggel
Journeyman III

Re: How to do parallel reduction correctly?

Yeah, you are right. That means I need another method to perform the reduction at the global level. That is why the kernel is not working. Thanks a lot.

Is there any other way to perform a global reduction with more than one workgroup? It looks like using only one workgroup is the key to my problem.

So the Tesla machine returning the right values was just a coincidence. I will rewrite the code.

Thanks a lot, again. Now I know what I shouldn't do in the future.

joggel
Journeyman III

Re: How to do parallel reduction correctly?

Hmm, what do you mean by "spawning only one workgroup"? Do you mean I spawn a single workgroup that has the size of the whole array (so there is only one)? If I do this, my device runs out of memory.

It looks like I need a completely different approach, or I let the CPU do the final reduction.

himanshu_gautam
Grandmaster

Re: How to do parallel reduction correctly?

No... What I meant was this:

"

I wrote some host code that allocates memory, builds the program, sets the arguments and calls your kernel.

In this host code, if I spawn only 1 workgroup, your code works fine.

The code snippet that you complained about works fine on the 7970 card here.

However, if I spawn multiple workgroups (i.e. increase the image size), the code does not return the correct output.

This is because of a bug in the code: it assumes global synchronization across workgroups, which OpenCL does not provide within a single kernel launch.

"

So, as you rightly said, you can let the CPU do the final reduction.

(or) Launch another kernel with only 1 workgroup to do just the final reduction.
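
To make the two-stage idea concrete, here is a minimal sketch in plain C that emulates it on the CPU. It is not the original kernel and not OpenCL host code; `GROUP_SIZE`, `reduce_one_group` and `reduce_two_stage` are hypothetical names chosen for illustration. Each emulated workgroup reduces only its own slice and writes one partial sum, and a separate final pass (the CPU, or a second launch with 1 workgroup) adds up the partial sums.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical workgroup size for this sketch; must be a power of two. */
#define GROUP_SIZE 8

/* Stage 1: emulates what ONE workgroup does. It copies its slice of the
 * input into "local memory" (scratch), pads the tail with zeros, and then
 * performs a tree reduction inside the group only. */
static long reduce_one_group(const int *in, size_t n, size_t group_id)
{
    long scratch[GROUP_SIZE];
    size_t base = group_id * GROUP_SIZE;

    for (size_t lid = 0; lid < GROUP_SIZE; ++lid) {
        size_t gid = base + lid;
        scratch[lid] = (gid < n) ? in[gid] : 0;
    }

    /* In a real kernel, barrier(CLK_LOCAL_MEM_FENCE) separates these steps.
     * That barrier synchronizes the work-items of this group only, which is
     * why the group must not read results produced by other groups. */
    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        for (size_t lid = 0; lid < stride; ++lid)
            scratch[lid] += scratch[lid + stride];
    }
    return scratch[0];
}

/* Stage 2: the final reduction over the per-group partial sums. This loop
 * stands in for either the CPU pass or a second 1-workgroup launch. */
static long reduce_two_stage(const int *in, size_t n,
                             long *partial, size_t num_groups)
{
    /* The groups together must cover the whole input. */
    assert(num_groups * GROUP_SIZE >= n);

    for (size_t g = 0; g < num_groups; ++g)
        partial[g] = reduce_one_group(in, n, g);

    long total = 0;
    for (size_t g = 0; g < num_groups; ++g)
        total += partial[g];
    return total;
}
```

The point of the split is that the first stage finishes completely before the second one starts. On a device, the end of the first kernel launch plays the role of the global synchronization that the buggy code assumed it had.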

joggel
Journeyman III

Re: How to do parallel reduction correctly?

Thanks again. I think all the problems have been cleared up.

himanshu_gautam
Grandmaster

Re: How to do parallel reduction correctly?

Glad to know! Good luck!
