I frequently see a BSOD when creating compute buffers after reboot on Windows 7 Embedded. I don't see any BSOD on a regular Windows 7 Pro though.
I can reproduce the same BSOD when running c:\Windows\System32\clinfo.exe on Windows 7 Embedded.
My setup is as follows:
GPU: AMD Radeon HD 7850 PCIe 2GB AES
CPU: Intel® Core™ i7-3615QE
RAM: 4GB
Catalyst Version: 13.9
Driver version: 1268.1
OS: Windows 7 Embedded Standard 64 bit (Build 7601) Service Pack 1
The GPU has no display attached, since it's only used for computations.
I don't see the BSOD with the Catalyst Version 12.4, but every driver since has the problem. Unfortunately I can't use the 12.4 driver since it will not spin up the GPU when no display is attached. The GPU clock and the GPU memory clock only run at 10%.
To reproduce the BSOD I use a small script that runs on Windows start-up. It executes clinfo.exe and then restarts the computer. Normally I see a BSOD within the hour, but sometimes it takes longer (I've seen approximately 8 hours).
Best,
Jakob
Could you please try this with the latest Catalyst driver and let me know whether the problem still exists?
Also, if possible, could you please share your script here?
Hi,
I have tried with the latest stable driver and the latest beta driver. Both of them give a BSOD.
The script I've been running is fairly simple, just a .bat script I put into C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup for automatic execution on boot:
@echo off
ping 192.9.2.2 -n 1 -w 2000 > nul
C:\Windows\System32\clinfo.exe
shutdown -r -t 0
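To tell how many reboot cycles pass before a BSOD, the loop could also record each boot. The following is a minimal Python sketch, not part of the script above; the file name boot_log.txt and the function name record_boot are hypothetical. It appends one timestamp per boot and returns the running count, and would be called before clinfo.exe runs.

```python
import datetime
import os
import tempfile


def record_boot(log_path):
    """Append one timestamp line to log_path and return the total line count."""
    stamp = datetime.datetime.now().isoformat()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(stamp + "\n")
        log.flush()
        os.fsync(log.fileno())  # push the line to disk before a possible crash
    with open(log_path, "r", encoding="utf-8") as log:
        return sum(1 for _ in log)


if __name__ == "__main__":
    # Hypothetical location; any writable path that survives a reboot would do.
    path = os.path.join(tempfile.gettempdir(), "boot_log.txt")
    print("boot cycle", record_boot(path))
```

After a crash, the last line of the log gives the time of the final boot and the line count gives the number of cycles it took.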
Regards,
Jakob
Thanks for the details. I will check and get back to you.
Hi,
It seems that not having a pagefile on the system changes some timing on boot, which makes the BSOD occur more frequently.
Regards,
Jakob
Hi
I executed the same script, but it just keeps restarting for me. Is that the expected behavior, and do I need to let it continue like this for an hour?
Hi,
As I wrote above, it looks like having a pagefile makes the BSOD a lot less frequent. It took me 16 hours to reproduce the error with a pagefile, but only ~1 hour without one.
Regards,
Jakob
Hi,
I can send you a memory dump from the BSOD if you want.
Regards,
Jakob
Yes, please send it across.
Hi,
I have had a DDK Escalation Engineer at Microsoft look at the crash dump. Here is what he gave me:
A DDK Escalation Engineer analyzed the crash dump and provided this pretty technical information from his debugging session. You could use it in your discussion with the ATI support (see SUMMARY section at the end of this email).
The dump shows that clinfo.exe was indeed involved in the issue, which was caused by a timeout:
DXGANALYZE_ANALYSIS_TAG_TDR_BUGCHECK_REASON: 2
DXGANALYZE_ANALYSIS_TAG_TDR_FLAGS: 0
DXGANALYZE_ANALYSIS_TAG_SESSION_GUID: {E7D35FC3-56AC-11E3-BA24-806E6F6E6963}
DXGANALYZE_ANALYSIS_TAG_TDR_REASON: 2
DXGANALYZE_ANALYSIS_TAG_TDR_PROCESS_0: clinfo.exe
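When comparing several dumps, the tags above can be pulled out of the debugger text mechanically. This is only a sketch: the function name parse_tags is made up for illustration, and it assumes each tag sits on its own line in the "NAME: value" shape shown above. It does not interpret what the numeric reason codes mean.

```python
def parse_tags(text):
    """Return {tag: value} for lines shaped like 'DXGANALYZE_...: value'."""
    tags = {}
    for line in text.splitlines():
        # Split on the first colon only; the value is kept as a plain string.
        name, sep, value = line.partition(":")
        if sep and name.strip().startswith("DXGANALYZE_"):
            tags[name.strip()] = value.strip()
    return tags


# The tag lines quoted from the dump in this thread.
SAMPLE = """\
DXGANALYZE_ANALYSIS_TAG_TDR_BUGCHECK_REASON: 2
DXGANALYZE_ANALYSIS_TAG_TDR_FLAGS: 0
DXGANALYZE_ANALYSIS_TAG_SESSION_GUID: {E7D35FC3-56AC-11E3-BA24-806E6F6E6963}
DXGANALYZE_ANALYSIS_TAG_TDR_REASON: 2
DXGANALYZE_ANALYSIS_TAG_TDR_PROCESS_0: clinfo.exe
"""

if __name__ == "__main__":
    for name, value in parse_tags(SAMPLE).items():
        print(name, "=", value)
```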
The clinfo.exe process tries to finish. While doing so, it communicates with the graphics driver, very likely to destroy the device object, and very likely it runs into a timeout:
PROCESS fffffa8006467b30
SessionId: 1 Cid: 09c0 Peb: 7fffffd4000 ParentCid: 09a0
DirBase: 7d240000 ObjectTable: fffff8a001f60660 HandleCount: 2633.
Image: clinfo.exe
VadRoot fffffa8006466230 Vads 149 Clone 0 Private 5176. Modified 82. Locked 387.
DeviceMap fffff8a0017a9bb0
Token fffff8a001f87060
ElapsedTime 00:00:19.090
UserTime 00:00:00.000
KernelTime 00:00:00.031
QuotaPoolUsage[PagedPool] 0
QuotaPoolUsage[NonPagedPool] 0
Working Set Sizes (now,min,max) (13283, 50, 345) (53132KB, 200KB, 1380KB)
PeakWorkingSetSize 24672
VirtualSize 265 Mb
PeakVirtualSize 381 Mb
PageFaultCount 47733
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 17598
THREAD fffffa8006459b60 Cid 09c0.09c4 Teb: 000007fffffde000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
fffff880075aa610 SynchronizationEvent
Not impersonating
DeviceMap fffff8a0017a9bb0
Owning Process fffffa8006467b30 Image: clinfo.exe
Attached Process N/A Image: N/A
Wait Start TickCount 2077 Ticks: 559 (0:00:00:08.720)
Context Switch Count 27122 LargeStack
UserTime 00:00:01.029
KernelTime 00:00:00.670
Win32 Start Address 0x000000013f8448c8
Stack Init fffff880075aac70 Current fffff880075a9fe0
Base fffff880075ab000 Limit fffff880075a2000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr Call Site
fffff880`075aa020 fffff800`02ee1992 nt!KiSwapContext+0x7a
fffff880`075aa160 fffff800`02ee0eaa nt!KiCommitThreadWait+0x1d2
fffff880`075aa1f0 fffff880`0fdaa050 nt!KeWaitForMultipleObjects+0x272
fffff880`075aa4b0 fffff880`0fdd8cfd dxgmms1!VidSchWaitForEvents+0x9c
fffff880`075aa510 fffff880`0fdd602a dxgmms1!VidSchWaitForCompletionEvent+0x139
fffff880`075aa550 fffff880`0fda9b7a dxgmms1!VidSchiWaitFlushCompletion+0x36
fffff880`075aa580 fffff880`0fcf16e7 dxgmms1!VidSchFlushDevice+0x1a2
fffff880`075aa6d0 fffff880`0fcd6815 dxgkrnl!DXGDEVICE::~DXGDEVICE+0xff
fffff880`075aa740 fffff880`0fd14e4a dxgkrnl!DXGADAPTER::DestroyDevice+0x1c9
fffff880`075aa770 fffff880`0fd147e0 dxgkrnl!DXGPROCESS::Destroy+0xba
fffff880`075aa820 fffff960`00194d74 dxgkrnl!DxgkProcessCallout+0x268
fffff880`075aa8b0 fffff960`00194473 win32k!GdiProcessCallout+0x244
fffff880`075aa930 fffff800`031b2001 win32k!W32pProcessCallout+0x6b
fffff880`075aa960 fffff800`031956dc nt!PspExitThread+0x4d1
fffff880`075aaa60 fffff800`02edb8d3 nt!NtTerminateProcess+0x138
fffff880`075aaae0 00000000`77c715da nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`075aaae0)
00000000`001efd58 00000000`00000000 0x77c715da
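The wait duration in the thread dump above can be cross-checked with a little arithmetic. The sketch below is illustrative only: parse_wait is a hypothetical helper, and it assumes the parenthesised value is formatted as days:hours:minutes:seconds.milliseconds. It reads the "Ticks: 559 (0:00:00:08.720)" fragment and derives the length of one tick from those two numbers.

```python
import re

# The wait line quoted from the thread dump in this post.
WAIT_LINE = "Wait Start TickCount 2077 Ticks: 559 (0:00:00:08.720)"


def parse_wait(line):
    """Return (ticks, seconds) from a 'Ticks: N (d:hh:mm:ss.mmm)' fragment."""
    m = re.search(r"Ticks:\s*(\d+)\s*\((\d+):(\d+):(\d+):(\d+)\.(\d+)\)", line)
    if m is None:
        raise ValueError("no tick information found")
    ticks = int(m.group(1))
    days, hours, minutes, secs = (int(m.group(i)) for i in range(2, 6))
    # Normalise the fractional part to exactly three digits (milliseconds).
    millis = int(m.group(6).ljust(3, "0")[:3])
    seconds = ((days * 24 + hours) * 60 + minutes) * 60 + secs + millis / 1000.0
    return ticks, seconds


if __name__ == "__main__":
    ticks, seconds = parse_wait(WAIT_LINE)
    print("waited %.3f s over %d ticks" % (seconds, ticks))
    print("about %.2f ms per tick" % (seconds * 1000.0 / ticks))
```

With the values from this dump, the thread had been blocked for 8.72 seconds of the process's 19.09 seconds of elapsed time.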
However, there are no packets left in the internal work list for clinfo.exe, which would mean that all work is already done:
Video Scheduler Status
CurrentStatus: SchedulerRunning
DxgAdapter: 0xfffffa8005c1f000
VidSchGlobal : 0xfffffa80059ce010
NumberOfTotalNodes: 6
WorkerThread: 0xfffffa800555b8c0
Logical Adapter 0 Node 0 Status
VidSchNode 0xfffffa8005c27000
Submission fence information
Last generated by VidSch : 0x000000000000018f
Last submitted to driver : 0x000000000000018f
Last processed by GPU : 0x000000000000018f
Last completed by GPU : 0x000000000000018f
Last preempted by GPU : 0x00000000000000ff
Last faulted by GPU : 0x0000000000000000
Last processed by Scheduler: 0x000000000000018f
Preemption fence information
Last generated by VidSch : 0x000000000000000e
Last submitted to driver : 0x000000000000000e
Last completed by GPU : 0x000000000000000e
Last completed by Scheduler: 0x000000000000000e
Hardware Queue Content
Empty
Priority Table
All ReadyContextListTable are empty
WaitingContextList
Empty
IdleContextList
VidSchContext 0xfffffa8005677d50 (csrss.exe, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
VidSchContext 0xfffffa8005c3bd50 (System Process, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
VidSchContext 0xfffffa800658d300 (clinfo.exe, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
VidSchContext 0xfffffa8005c3fd50 (System Process, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
VidSchContext 0xfffffa8004e32620 (dwm.exe, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
VidSchContext 0xfffffa8005d539d0 (dwm.exe, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
VidSchContext 0xfffffa800567ac00 (csrss.exe, Normal InProcPriority)
Deferred Wait Packet List:
-Empty
Queue Packet List:
-Empty
WaitingPowerContextList
Empty
Logical Adapter 0 Node 1 Status
VidSchNode 0xfffffa8005c3a000
No packets submitted to this node
Hardware Queue Content
Empty
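The "all work is already done" reading of the Node 0 fences can be checked mechanically. This is a rough sketch, not a debugger feature: parse_fences and all_work_completed are hypothetical names, and treating "last submitted equals last completed" as "nothing outstanding" is an assumption that follows the interpretation given in this email.

```python
def parse_fences(text):
    """Map each 'Last ... : 0x...' label to its integer fence value."""
    fences = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.startswith("0x"):
            fences[label.strip()] = int(value, 16)
    return fences


def all_work_completed(fences):
    """True when the last submitted fence equals the last completed fence."""
    return fences["Last submitted to driver"] == fences["Last completed by GPU"]


# Submission fence lines for Node 0, as quoted in this post.
SUBMISSION = """\
Last generated by VidSch : 0x000000000000018f
Last submitted to driver : 0x000000000000018f
Last processed by GPU : 0x000000000000018f
Last completed by GPU : 0x000000000000018f
Last preempted by GPU : 0x00000000000000ff
Last faulted by GPU : 0x0000000000000000
Last processed by Scheduler: 0x000000000000018f
"""

if __name__ == "__main__":
    print("all submitted work completed:", all_work_completed(parse_fences(SUBMISSION)))
```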
SUMMARY
The BSOD is caused by the interaction between the graphics card and the driver, both coming from ATI. There is actually nothing MS could do to change or improve the behavior. You’ll have to contact ATI support (or whoever else owns or provides the driver and its source code) to analyze it further. If ATI believes the OS is somehow causing the BSOD (which my colleague doesn’t believe), then ATI itself would have to open a support incident with MS to work together on the issue.
Yes, I know, this statement isn’t as helpful as you might have expected, but as already mentioned at the beginning, for MS it’s practically impossible to provide a more detailed analysis of third-party software or drivers without the related source code and knowledge of the driver’s design. At the moment we can just say “it’s a pure ATI hardware/driver problem”.
Hi,
Here you can download the same memory dump that was analyzed at Microsoft: Message | Secure File Transfer
Regards,
Jakob
Hi,
Any news, what is the status?
Regards,
Jakob
Hi Bruhaspati,
Are you still looking into this?
Regards,
Jakob
Any news?
Regards,
Jakob