
    Sharp increase in CPU usage by AMD driver when number of sync points is decreased - how to avoid?

    Raistmer

      Recently I changed the algorithm in the app to keep more data directly on the GPU, which caused a considerable decrease in the number of buffer mappings (and each buffer mapping was also a sync point).
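
      For illustration, a rough sketch of the kind of restructuring I mean (kernel and buffer names here are made up, not the actual app code): instead of mapping the intermediate buffer back to the host after every kernel, later kernels read it directly on the device, so the data stays resident on the GPU and the map/unmap sync points disappear.

      #include <CL/cl.h>

      // Hypothetical two-stage chain; the intermediate buffer is never mapped to host memory.
      void run_chain(cl_command_queue q, cl_kernel stage1, cl_kernel stage2,
                     cl_mem intermediate, size_t n)
      {
          // Before: stage1 -> clEnqueueMapBuffer(intermediate, CL_TRUE, ...) -> host work -> stage2.
          // After: both stages run on the device, with no map/unmap (and no sync point) in between.
          clSetKernelArg(stage1, 0, sizeof(cl_mem), &intermediate);
          clEnqueueNDRangeKernel(q, stage1, 1, NULL, &n, NULL, 0, NULL, NULL);

          clSetKernelArg(stage2, 0, sizeof(cl_mem), &intermediate);
          clEnqueueNDRangeKernel(q, stage2, 1, NULL, &n, NULL, 0, NULL, NULL);
      }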

      I expected an improvement in run time due to increased GPU load, and a decrease in CPU time as well, since the CPU has less work to do.

       

      But almost all I got was a sharp increase in CPU time. Before the change, CPU time constituted only a small part of elapsed time; now CPU time is almost equal to elapsed time. That is, almost 100% CPU usage during the whole app run.

       

      I tried to avoid this CPU usage by putting the working thread to sleep before sync points - no success. While the logs show that the app polls the event and sleeps until it gets the corresponding status, CPU time did not decrease. (A sketch of this polling approach follows below.)
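
      For reference, a minimal sketch of that sleep-before-sync attempt (Windows host code; function and variable names are illustrative, not the actual app code): a marker event is enqueued after the submitted work and the host thread polls its status with Sleep(1) between checks instead of blocking in clWaitForEvents().

      #include <CL/cl.h>
      #include <windows.h>

      void wait_with_sleep(cl_command_queue queue)
      {
          cl_event marker = NULL;
          clEnqueueMarker(queue, &marker);   // OpenCL 1.1-style marker after the enqueued work
          clFlush(queue);                    // make sure the work is actually submitted to the device

          cl_int status = CL_QUEUED;
          do {
              Sleep(1);                      // yield the host thread for ~1 ms between polls
              clGetEventInfo(marker, CL_EVENT_COMMAND_EXECUTION_STATUS,
                             sizeof(status), &status, NULL);
          } while (status != CL_COMPLETE && status >= 0); // negative status means an error

          clReleaseEvent(marker);
      }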

       

      So I used Process Explorer to find out what was consuming the CPU. It was an AMD driver thread with the following stack:

      KERNELBASE.dll!WaitForSingleObjectEx+0x98
      kernel32.dll!WaitForSingleObjectEx+0x43
      kernel32.dll!WaitForSingleObject+0x12
      amdocl.dll!oclGetAsic+0x511eb
      amdocl.dll!oclGetAsic+0x4883d
      amdocl.dll!oclGetAsic+0x678fc
      amdocl.dll!oclGetAsic+0x675d9
      amdocl.dll!oclGetAsic+0x612a9
      amdocl.dll!oclGetAsic+0x48560
      amdocl.dll!oclGetAsic+0x40dc2
      amdocl.dll!oclGetAsic+0x3fd49
      amdocl.dll!oclGetAsic+0x3a677
      amdocl.dll!oclGetAsic+0x313e
      amdocl.dll!oclGetAsic+0x3326
      amdocl.dll!clSetKernelExecInfo+0x57c8f
      amdocl.dll!clSetKernelExecInfo+0x4bd67
      amdocl.dll!clSetKernelExecInfo+0x3087b
      amdocl.dll!clSetKernelExecInfo+0x5628
      amdocl.dll!clSetKernelExecInfo+0x56c6
      amdocl.dll!clSetKernelExecInfo+0x143d
      amdocl.dll!clSetKernelExecInfo+0x1656c
      kernel32.dll!BaseThreadInitThunk+0x12
      ntdll.dll!RtlInitializeExceptionChain+0x63
      ntdll.dll!RtlInitializeExceptionChain+0x36

       

      The picture clearly shows that this thread is the main CPU consumer in the app's process:

      AMD_driver_CPU_leak.png

       

      So the question is: how to avoid this behavior? It seems that without a large number of sync points between GPU and host code, the AMD driver goes mad and starts to use a whole CPU core for its own needs.

       

      EDIT: Here are CodeXL Timeline illustrations of how the timeline looked before the change:

      non-SoG.png

      after the change:

      SoG_1500icfft.png

       

      and with a marker event and a Sleep(1) loop until the event is reached (the polling approach sketched above):

      SoG_event_marker.png

       

      As one can see (profiling was done on a C-60 APU, but a discrete HD 6950 under different drivers shows the same CPU usage pattern), there is a quite large time interval of ~80 s where the GPU works on its own without syncing with the host. And that is exactly where the AMD driver starts to consume a whole CPU core.

       

      EDIT2: It is very similar to the issue described here: Re: Cat13.4: How to avoid the high CPU load for GPU kernels? by .Bdot

      Any new cures since 2013?