
Sharp increase in CPU usage by AMD driver when the number of sync points is decreased - how to avoid it?

Question asked by Raistmer on Jun 23, 2015
Latest reply on Aug 24, 2015 by dipak

Recently I changed the algorithm in my app to keep more data directly on the GPU, which considerably decreased the number of buffer mappings (each buffer mapping was also a sync point).

I expected an improvement in run time due to increased GPU load, and also a decrease in CPU time because the CPU has less work to do.

 

But almost all I got was a sharp increase in CPU time. Before the change, CPU time was only a small part of elapsed time; now CPU time is almost equal to elapsed time. That is, almost 100% CPU usage during the whole app run.

 

I tried to avoid this CPU usage by putting the working thread to sleep before sync points, with no success. Although the logs show that the app reads the event and sleeps until it gets the corresponding status, CPU time did not decrease.
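For readers unfamiliar with the sleep-before-sync approach, it can be sketched roughly as below. This is only an illustration under stated assumptions, not the app's actual code: `wait_with_sleep` and its parameters are hypothetical names, the status check is passed in as a callback so the sketch builds without an OpenCL SDK, and `std::this_thread::sleep_for` stands in for the Win32 `Sleep(1)` call. In a real host program the callback would typically query the event with `clGetEventInfo` and `CL_EVENT_COMMAND_EXECUTION_STATUS`, and compare the result against `CL_COMPLETE`.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not from the original app): polls a completion check
// and sleeps between polls, so the waiting host thread yields its CPU core
// instead of blocking inside a driver call.
// Returns the number of status checks that were performed.
template <typename StatusFn>
int wait_with_sleep(StatusFn is_complete,
                    std::chrono::milliseconds interval =
                        std::chrono::milliseconds(1)) {
    int checks = 0;
    for (;;) {
        ++checks;
        if (is_complete()) {
            return checks;  // the event reached the desired status
        }
        // Assumption: sleep_for plays the role of Sleep(1) in the post.
        std::this_thread::sleep_for(interval);
    }
}
```

A caller would pass a lambda wrapping the event query, for example `wait_with_sleep([&] { return event_is_complete(ev); })`, where `event_is_complete` is again a hypothetical wrapper around the status query. Note that this only idles the application's own thread; it cannot stop a separate driver thread from consuming CPU, which matches what is reported here.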

 

So I used Process Explorer to find out what was consuming the CPU. It was an AMD driver thread with the following stack:

KERNELBASE.dll!WaitForSingleObjectEx+0x98
kernel32.dll!WaitForSingleObjectEx+0x43
kernel32.dll!WaitForSingleObject+0x12
amdocl.dll!oclGetAsic+0x511eb
amdocl.dll!oclGetAsic+0x4883d
amdocl.dll!oclGetAsic+0x678fc
amdocl.dll!oclGetAsic+0x675d9
amdocl.dll!oclGetAsic+0x612a9
amdocl.dll!oclGetAsic+0x48560
amdocl.dll!oclGetAsic+0x40dc2
amdocl.dll!oclGetAsic+0x3fd49
amdocl.dll!oclGetAsic+0x3a677
amdocl.dll!oclGetAsic+0x313e
amdocl.dll!oclGetAsic+0x3326
amdocl.dll!clSetKernelExecInfo+0x57c8f
amdocl.dll!clSetKernelExecInfo+0x4bd67
amdocl.dll!clSetKernelExecInfo+0x3087b
amdocl.dll!clSetKernelExecInfo+0x5628
amdocl.dll!clSetKernelExecInfo+0x56c6
amdocl.dll!clSetKernelExecInfo+0x143d
amdocl.dll!clSetKernelExecInfo+0x1656c
kernel32.dll!BaseThreadInitThunk+0x12
ntdll.dll!RtlInitializeExceptionChain+0x63
ntdll.dll!RtlInitializeExceptionChain+0x36

 

The picture clearly shows that this thread is the main CPU consumer in the app's process:

AMD_driver_CPU_leak.png

 

So the question is: how can I avoid this behavior? It seems that without a large number of sync points between the GPU and host code, the AMD driver goes mad and starts to use a whole CPU core for its own needs.

 

EDIT: Here are CodeXL timeline pictures showing how the timeline looked before the change:

non-SoG.png

after the change:

SoG_1500icfft.png

 

and with a marker event and a Sleep(1) loop until the event is reached:

SoG_event_marker.png

 

As one can see (profiling was done on a C-60 APU, but a discrete HD6950 under different drivers shows the same CPU usage pattern), there is quite a long time interval of ~80s where the GPU works on its own without syncing with the host. And that is where the AMD driver starts to consume a whole CPU core.

 

EDIT2: This is very similar to the issue described here: "Re: Cat13.4: How to avoid the high CPU load for GPU kernels?" by .Bdot

Any new cures since 2013?
