9 Replies Latest reply on Feb 11, 2010 5:01 PM by empty_knapsack

    OpenCL profiler - Questions and minor bug report

    eduardoschardong

      Bug report: The .csv isn't presented correctly for some locales.

      Latin locales, for example, use "," instead of "." as the decimal separator and "." only as a digit-grouping separator. When the profiler displays the output, the values are transformed according to the culture set on the machine, which causes them to be displayed incorrectly: percentage values are displayed as, for example, 8107, and KernelTime is displayed as 2299992 when it should be 22,99992 (or 22.99992 if culture-invariant), and so on.
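
To illustrate the kind of mismatch described above, here is a minimal sketch. It assumes the .csv stores values with "." as the decimal separator and that the display code parses them with the machine's culture; the helper `parse_localized` is hypothetical and is not part of the profiler.

```python
def parse_localized(text, decimal_sep, group_sep):
    """Parse a number string using explicit decimal and grouping separators."""
    # Drop grouping separators first, then normalize the decimal separator.
    cleaned = text.replace(group_sep, "")
    cleaned = cleaned.replace(decimal_sep, ".")
    return float(cleaned)


raw = "22.99992"  # assumed culture-invariant text for KernelTime

# Parsed with a comma-decimal culture: "." is treated as a grouping separator.
wrong = parse_localized(raw, decimal_sep=",", group_sep=".")

# Parsed culture-invariantly: "." is the decimal separator.
right = parse_localized(raw, decimal_sep=".", group_sep=",")

print(wrong)
print(right)
```

Under these assumptions the first parse silently drops the decimal point, which would match the symptom reported here. Reading the file with a fixed, culture-invariant format and localizing only at display time would avoid it.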

       

      Questions:

      1) Is FetchUnitStalled the percent of KernelTime spent on fetch units while they are doing nothing?

      2) Is FetchUnitBusy - FetchUnitStalled the percent of KernelTime spent on fetch units while they are actually doing something useful?

      3) What is the GPU doing when the fetch units are stalled? I mean, what causes the stalls? If I have 94,07 for FetchUnitStalled and 90,74 for FetchUnitStalled, what is the likely problem with my kernel?

       

      Note: WriteUnitStalled is always 0. This looks like a bug, since kernels with too many writes are slow even when all the numbers are low.

       

        • OpenCL profiler - Questions and minor bug report
          ryta1203

          Why doesn't the Profiler say how many GPRs are being used? This could be relevant.

          That would be great, thanks... unless AMD is planning on releasing a new SKA with OpenCL support (but it doesn't look like they are).

          • OpenCL profiler - Questions and minor bug report
            bpurnomo

             

            Originally posted by: eduardoschardong Bug report: The .csv isn't presented correctly for some locales.

             

            Latin locales, for example, use "," instead of "." as the decimal separator and "." only as a digit-grouping separator. When the profiler displays the output, the values are transformed according to the culture set on the machine, which causes them to be displayed incorrectly: percentage values are displayed as, for example, 8107, and KernelTime is displayed as 2299992 when it should be 22,99992 (or 22.99992 if culture-invariant), and so on.

             

            Thank you for the bug report.  We will investigate this localization issue.

             

             

            1) Is FetchUnitStalled the percent of KernelTime spent on fetch units while they are doing nothing?


            It is the percentage of GPU time during which the fetch units are waiting for results (not doing anything).

             

             

            2) Is FetchUnitBusy - FetchUnitStalled the percent of KernelTime spent on fetch units while they are actually doing something useful?


            Yes
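
As a small worked example of the subtraction confirmed above, the sketch below derives the "useful" fetch percentage from the two counters. The function name `useful_fetch_percent` is hypothetical, and the sample inputs reuse the two figures from question 3 on the assumption that the first one was meant to be FetchUnitBusy.

```python
def useful_fetch_percent(fetch_unit_busy, fetch_unit_stalled):
    """Percent of time the fetch units are busy but not stalled."""
    return fetch_unit_busy - fetch_unit_stalled


# Assumed inputs: 94.07 taken as FetchUnitBusy, 90.74 as FetchUnitStalled.
useful = useful_fetch_percent(94.07, 90.74)
print(round(useful, 2))
```

With those assumed inputs, only a few percent of the time would be spent on useful fetch work; nearly all of the busy time would be stall time.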

             

             

            3) What is the GPU doing when the fetch units are stalled? I mean, what causes the stalls? If I have 94,07 for FetchUnitStalled and 90,74 for FetchUnitStalled, what is the likely problem with my kernel?


            When fetch units are stalled, it is possible the other units are doing something useful.  It is also possible that other units are stalled waiting for the results from the fetch units.

            Your problem is likely that you have too many fetches and not enough wavefronts in flight to hide the fetch latency.

             

             

            Note: WriteUnitStalled is always 0. This looks like a bug, since kernels with too many writes are slow even when all the numbers are low.


            Thank you for reporting this bug.