

eduardoschardong
Journeyman III

Few questions about OpenCL implementation

Hello again, using Radeon HD5700 and HD5800 series:

1) How different is rsqrt from native_rsqrt? Precision?

2) Will the max work group size be increased in the future?

3) Is there a way to force the use of dot products and the _PREV instructions?

4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write-private, read-public accesses on the float4 datatype. Why exactly are the bank conflicts so high? Are there workarounds?

5) This has come up in earlier threads: what about more than 4 GPUs in a system?

 

0 Likes
8 Replies
nou
Exemplar

1. Yes. native_* functions can use special hardware instructions. For example, the normal cos() compiles to around 60 instructions in the ISA code, while native_cos() takes only one.

2. Maybe. There is a problem with insufficient resources (mainly registers) if you increase the work-group size.

5. It should work. There is some issue with the 5970.

0 Likes
Fr4nz
Journeyman III

Originally posted by: eduardoschardong Hello again, using Radeon HD5700 and HD5800 series:

4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write-private, read-public accesses on the float4 datatype. Why exactly are the bank conflicts so high? Are there workarounds?



Could you show us your kernel code (at least the part regarding these LDS conflicts)?

0 Likes

Originally posted by: Fr4nz
Originally posted by: eduardoschardong Hello again, using Radeon HD5700 and HD5800 series:

 

4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write-private, read-public accesses on the float4 datatype. Why exactly are the bank conflicts so high? Are there workarounds?



 

Could you show us your kernel code (at least the part regarding these LDS conflicts)?

 

There is a problem with the LDS bank conflict's performance counter: the value it reports is several times higher than the actual value.  The next release of the profiler will fix this problem.

0 Likes

Originally posted by: bpurnomo There is a problem with the LDS bank conflict's performance counter: the value it reports is several times higher than the actual value.  The next release of the profiler will fix this problem.


bpurnomo,

do you plan to release a Linux OpenCL profiler in the future?

0 Likes

Yes. Supporting Linux is in our plans, but it is not on our current short-term roadmap.

 

0 Likes


Originally posted by: nou 1. Yes. native_* functions can use special hardware instructions. For example, the normal cos() compiles to around 60 instructions in the ISA code, while native_cos() takes only one.


I noticed this. In my case the normal rsqrt() is just too slow. I saw the extra instructions, with many CMOVs, so I think it's mostly about denormals and special cases, but I couldn't find any documentation of exactly how native_rsqrt behaves.

Originally posted by: bpurnomo

There is a problem with the LDS bank conflict's performance counter: the value it reports is several times higher than the actual value.  The next release of the profiler will fix this problem.



Thank you, what about the others?

 

0 Likes

I had a similar question about sin/cos vs. native_sin/native_cos.

The blog entry here http://sn.im/z44cy (or look at the June 21 entry at www.bigNcomputing.org) has some code that you may find useful as a starting point for your own tests of rsqrt and native_rsqrt.

 

 

0 Likes