8 Replies Latest reply on Jul 7, 2010 4:12 PM by kb1vc

    Few questions about OpenCL implementation

    eduardoschardong

      Hello again, using Radeon HD5700 and HD5800 series:

      1) How different is rsqrt vs. native_sqrt? In precision?

      2) Will the max work group size be increased in the future?

      3) Is there a way to force the use of dot products and the _PREV instructions?

      4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write-private, read-public accesses on the float4 datatype. Why exactly are the bank conflicts so high? Any workarounds?

      5) There have been some threads on this before... What about more than 4 GPUs in a system?

       

        • Few questions about OpenCL implementation
          nou

          1. Yes. native_* functions can use special HW instructions. For example, a normal cos() compiles to around 60 instructions in the ISA code, while native_cos() takes only one.

          2. Maybe. There is a problem with insufficient resources (mainly registers) if you increase the work-group size.

          5. It should work. There are some issues with the 5970.

          • Few questions about OpenCL implementation
            Fr4nz

             

            Originally posted by: eduardoschardong Hello again, using Radeon HD5700 and HD5800 series:

            4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write-private, read-public accesses on the float4 datatype. Why exactly are the bank conflicts so high? Any workarounds?



            Could you show us your kernel code (at least the part regarding these LDS conflicts)?

              • Few questions about OpenCL implementation
                bpurnomo

                 

                Originally posted by: Fr4nz
                Originally posted by: eduardoschardong Hello again, using Radeon HD5700 and HD5800 series:

                 

                4) I'm getting too many LDS bank conflicts (100 according to the profiler) on a kernel. It mostly does write-private, read-public accesses on the float4 datatype. Why exactly are the bank conflicts so high? Any workarounds?



                 

                Could you show us your kernel code (at least the part regarding these LDS conflicts)?

                 

                There is a problem with the LDS bank conflict performance counter: the value it reports is several times higher than the actual value. The next release of the profiler will fix this problem.