
jakob
Journeyman III

BSOD on Windows 7 Embedded

I frequently see a BSOD when creating compute buffers after reboot on Windows 7 Embedded. I don't see any BSOD on a regular Windows 7 Pro though.

I can reproduce the same BSOD when running c:\Windows\System32\clinfo.exe on Windows 7 Embedded.

My setup is as follows:

     GPU: AMD Radeon HD 7850 PCIe 2GB AES

     CPU: Intel® Core™ i7-3615QE

     RAM: 4GB

     Catalyst Version: 13.9

     Driver version: 1268.1

     OS: Windows 7 Embedded Standard 64 bit (Build 7601) Service Pack 1

The GPU has no display attached, since it is only used for computations.

I don't see the BSOD with the Catalyst Version 12.4, but every driver since has the problem. Unfortunately I can't use the 12.4 driver since it will not spin up the GPU when no display is attached. The GPU clock and the GPU memory clock only run at 10%.

To reproduce the BSOD I use a small script that runs on Windows start-up. It executes clinfo.exe and then restarts the computer. Normally I see a BSOD within the hour, but sometimes it takes longer (I've seen approximately 8 hours).

Best,

Jakob

0 Likes
13 Replies
himanshu_gautam
Grandmaster


Could you please try this with the latest Catalyst driver and let me know whether the problem still exists?

Also, if possible, could you please share your script here?

0 Likes

Hi,

I have tried with the latest stable driver and the latest beta driver. Both of them give a BSOD.

The script I've been running is fairly simple, just a .bat script I put into C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup for automatic execution on boot:

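@echo off

rem One ping with a 2000 ms timeout; this appears to serve as a short delay after logon.
ping 192.9.2.2 -n 1 -w 2000 > nul

rem Enumerate the OpenCL platforms and devices; this is the step that triggers the BSOD.
C:\Windows\System32\clinfo.exe

rem Reboot immediately so the cycle repeats.
shutdown -r -t 0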
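Since the crash can take anywhere from one to many hours to appear, it may help to record each cycle on disk so the number of reboots before the BSOD is known afterwards. The following is only a rough Python sketch of that idea, not something used in this thread: the function names, the event format, and the log location are all hypothetical, and it assumes Python is available on the embedded image. A "start" line is flushed to disk before clinfo.exe runs and a "done" line after it returns, so a "start" without a matching "done" marks a cycle that never came back.

```python
import datetime
import os
import subprocess

# Path taken from the batch script above.
CLINFO = r"C:\Windows\System32\clinfo.exe"


def log_event(log_path, event, now=None):
    """Append one timestamped line and force it to disk before continuing."""
    now = now or datetime.datetime.now()
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(f"{now.isoformat(timespec='seconds')} {event}\n")
        fh.flush()
        os.fsync(fh.fileno())  # make sure the line survives a crash


def summarize(log_path):
    """Return (started, finished) cycle counts from the log."""
    started = finished = 0
    with open(log_path, encoding="utf-8") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 2:
                continue
            if fields[1] == "start":
                started += 1
            elif fields[1] == "done":
                finished += 1
    return started, finished


def run_cycle(log_path, clinfo=CLINFO):
    """Log 'start', run clinfo once with its output discarded, then log 'done'."""
    log_event(log_path, "start")
    result = subprocess.run(
        [clinfo], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    log_event(log_path, f"done rc={result.returncode}")
```

Calling run_cycle with a log file of your choice in place of the direct clinfo.exe line, while leaving the reboot to the batch file, would produce a log in which summarize reports how many cycles started and how many finished.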

Regards,

Jakob

0 Likes

Thanks for the details. I will check and get back to you.

0 Likes

Hi,

It seems that not having a pagefile on the system changes some timing on boot, which makes the BSOD occur more frequently.

Regards,

Jakob

0 Likes

Hi,

I executed the same script, but it just keeps restarting for me. Is that the expected behaviour, and do I need to let it continue like this for an hour?

0 Likes

Hi,

As I wrote above, it looks like having a pagefile makes the BSOD a lot less frequent. It took me 16 hours to reproduce the error with a pagefile, but only ~1 hour without one.

Regards,

Jakob

0 Likes

Hi,

I can send you a memory dump from the BSOD if you want.

Regards,

Jakob

0 Likes


Yes, please send it across.

0 Likes

Hi,

I have had a DDK Escalation Engineer at Microsoft look at the crash dump. Here is what he gave me:

A DDK Escalation Engineer analyzed the crash dump and provided this pretty technical information from his debugging session. You could use it in your discussion with ATI support (see the SUMMARY section at the end of this email).

The dump shows that clinfo.exe was indeed involved in the issue, which was caused by a timeout:

DXGANALYZE_ANALYSIS_TAG_TDR_BUGCHECK_REASON: 2

DXGANALYZE_ANALYSIS_TAG_TDR_FLAGS: 0

DXGANALYZE_ANALYSIS_TAG_SESSION_GUID: {E7D35FC3-56AC-11E3-BA24-806E6F6E6963}

DXGANALYZE_ANALYSIS_TAG_TDR_REASON: 2

DXGANALYZE_ANALYSIS_TAG_TDR_PROCESS_0: clinfo.exe

The clinfo.exe process is trying to exit. While doing so it communicates with the graphics driver, very likely to destroy the device object, and very likely it runs into a timeout there:

16.0: kd> !PROCESS fffffa8006467b30

PROCESS fffffa8006467b30

    SessionId: 1 Cid: 09c0    Peb: 7fffffd4000  ParentCid: 09a0

    DirBase: 7d240000  ObjectTable: fffff8a001f60660  HandleCount: 2633.

   Image: clinfo.exe

    VadRoot fffffa8006466230 Vads 149 Clone 0 Private 5176. Modified 82. Locked 387.

    DeviceMap fffff8a0017a9bb0

Token fffff8a001f87060

ElapsedTime 00:00:19.090

    UserTime 00:00:00.000

KernelTime 00:00:00.031

QuotaPoolUsage[PagedPool]         0

QuotaPoolUsage[NonPagedPool]      0

    Working Set Sizes (now,min,max)  (13283, 50, 345) (53132KB, 200KB, 1380KB)

PeakWorkingSetSize 24672

VirtualSize 265 Mb

PeakVirtualSize 381 Mb

PageFaultCount 47733

MemoryPriority BACKGROUND

BasePriority 8

    CommitCharge 17598

THREAD fffffa8006459b60  Cid 09c0.09c4  Teb: 000007fffffde000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable

fffff880075aa610  SynchronizationEvent

Not impersonating

DeviceMap fffff8a0017a9bb0

Owning Process fffffa8006467b30 Image:         clinfo.exe

Attached Process N/A Image:         N/A

Wait Start TickCount 2077           Ticks: 559 (0:00:00:08.720)

Context Switch Count 27122 LargeStack

UserTime 00:00:01.029

KernelTime 00:00:00.670

Win32 Start Address 0x000000013f8448c8

Stack Init fffff880075aac70 Current fffff880075a9fe0

Base fffff880075ab000 Limit fffff880075a2000 Call 0

Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5

Child-SP RetAddr           Call Site

fffff880`075aa020 fffff800`02ee1992 nt!KiSwapContext+0x7a

fffff880`075aa160 fffff800`02ee0eaa nt!KiCommitThreadWait+0x1d2

fffff880`075aa1f0 fffff880`0fdaa050 nt!KeWaitForMultipleObjects+0x272

fffff880`075aa4b0 fffff880`0fdd8cfd dxgmms1!VidSchWaitForEvents+0x9c

fffff880`075aa510 fffff880`0fdd602a dxgmms1!VidSchWaitForCompletionEvent+0x139

fffff880`075aa550 fffff880`0fda9b7a dxgmms1!VidSchiWaitFlushCompletion+0x36

fffff880`075aa580 fffff880`0fcf16e7 dxgmms1!VidSchFlushDevice+0x1a2

fffff880`075aa6d0 fffff880`0fcd6815 dxgkrnl!DXGDEVICE::~DXGDEVICE+0xff

fffff880`075aa740 fffff880`0fd14e4a dxgkrnl!DXGADAPTER::DestroyDevice+0x1c9

fffff880`075aa770 fffff880`0fd147e0 dxgkrnl!DXGPROCESS::Destroy+0xba

fffff880`075aa820 fffff960`00194d74 dxgkrnl!DxgkProcessCallout+0x268

fffff880`075aa8b0 fffff960`00194473 win32k!GdiProcessCallout+0x244

fffff880`075aa930 fffff800`031b2001 win32k!W32pProcessCallout+0x6b

fffff880`075aa960 fffff800`031956dc nt!PspExitThread+0x4d1

fffff880`075aaa60 fffff800`02edb8d3 nt!NtTerminateProcess+0x138

fffff880`075aaae0 00000000`77c715da nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`075aaae0)

00000000`001efd58 00000000`00000000 0x77c715da
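As a side calculation on the thread dump above: the debugger reports the wait as 559 ticks, or 8.720 s, and the process ElapsedTime as 19.090 s. The short Python sketch below only restates that arithmetic (the function names are illustrative and not part of the debugger output), giving the length of one tick and the share of the process lifetime spent in this single wait.

```python
def tick_length_ms(ticks, elapsed_s):
    """Length of one scheduler tick in milliseconds."""
    return elapsed_s * 1000.0 / ticks


def wait_share(wait_s, lifetime_s):
    """Fraction of the process lifetime spent in the wait."""
    return wait_s / lifetime_s


# Values copied from the !PROCESS output above.
print(round(tick_length_ms(559, 8.720), 2))  # 15.6
print(round(wait_share(8.720, 19.090), 2))   # 0.46
```

In other words, the exiting thread had been blocked in VidSchWaitForEvents for a little under half of the time clinfo.exe had existed when the dump was taken.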

But there are no packets left in the internal work list for clinfo.exe, which would mean that all work is already done:

16.0: kd> !schstatus 0xfffffa80059ce010 -all

Video Scheduler Status

   CurrentStatus: SchedulerRunning

   DxgAdapter: 0xfffffa8005c1f000

   VidSchGlobal : 0xfffffa80059ce010

   NumberOfTotalNodes: 6

   WorkerThread: 0xfffffa800555b8c0

   Logical Adapter 0 Node 0 Status

VidSchNode 0xfffffa8005c27000

Submission fence information

Last generated by VidSch   : 0x000000000000018f

Last submitted to driver   : 0x000000000000018f

Last processed by GPU      : 0x000000000000018f

Last completed by GPU      : 0x000000000000018f

Last preempted by GPU      : 0x00000000000000ff

Last faulted by GPU        : 0x0000000000000000

Last processed by Scheduler: 0x000000000000018f

Preemption fence information

Last generated by VidSch   : 0x000000000000000e

Last submitted to driver   : 0x000000000000000e

Last completed by GPU      : 0x000000000000000e

Last completed by Scheduler: 0x000000000000000e

      Hardware Queue Content

Empty

      Priority Table

All ReadyContextListTable are empty

WaitingContextList

Empty

IdleContextList

VidSchContext 0xfffffa8005677d50 (csrss.exe, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

VidSchContext 0xfffffa8005c3bd50 (System Process, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

VidSchContext 0xfffffa800658d300 (clinfo.exe, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

VidSchContext 0xfffffa8005c3fd50 (System Process, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

VidSchContext 0xfffffa8004e32620 (dwm.exe, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

VidSchContext 0xfffffa8005d539d0 (dwm.exe, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

VidSchContext 0xfffffa800567ac00 (csrss.exe, Normal InProcPriority)

Deferred Wait Packet List:

-Empty

Queue Packet List:

-Empty

WaitingPowerContextList

Empty

   Logical Adapter 0 Node 1 Status

VidSchNode 0xfffffa8005c3a000

      No packets submitted to this node

      Hardware Queue Content

Empty
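The "all work is already done" reading of the scheduler dump above can be checked mechanically by comparing the fence values. The following is a minimal Python sketch under the assumption that the !schstatus text keeps the layout shown here; the helper names and the SAMPLE constant are made up for this illustration. It treats work as outstanding only if the last fence submitted to the driver is ahead of the last fence the GPU completed.

```python
import re

# Fence lines copied from "Logical Adapter 0 Node 0 Status" above.
SAMPLE = """
Submission fence information
Last generated by VidSch   : 0x000000000000018f
Last submitted to driver   : 0x000000000000018f
Last processed by GPU      : 0x000000000000018f
Last completed by GPU      : 0x000000000000018f
Preemption fence information
Last generated by VidSch   : 0x000000000000000e
Last submitted to driver   : 0x000000000000000e
Last completed by GPU      : 0x000000000000000e
"""

FENCE_LINE = re.compile(r"^\s*(Last [\w ]+?)\s*:\s*(0x[0-9a-fA-F]+)\s*$")


def parse_fences(text):
    """Return {section title: {label: integer fence value}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith("fence information"):
            current = stripped
            sections[current] = {}
            continue
        match = FENCE_LINE.match(line)
        if match and current is not None:
            sections[current][match.group(1)] = int(match.group(2), 16)
    return sections


def work_outstanding(fences):
    """True if the driver was handed a fence the GPU has not completed."""
    return fences["Last submitted to driver"] > fences["Last completed by GPU"]


for title, fences in parse_fences(SAMPLE).items():
    print(title, "-> outstanding:", work_outstanding(fences))
```

For the values in this dump both sections report no outstanding work, which matches the engineer's observation that the queues are empty even though the exiting thread is still waiting for a flush to complete.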

SUMMARY

The BSOD is caused by the interaction between the graphics card and the driver, both of which come from ATI. There is actually nothing MS can do to change or improve this behavior. You will have to contact ATI support (or whoever else owns and provides the driver and its source code) to analyze it further. If ATI believes the OS is somehow causing the BSOD (which my colleague does not believe), then ATI itself would have to open a support incident with MS to work on the issue together.

Yes, I know this statement doesn’t contain as much help as you would have expected, but as already mentioned at the beginning, it is practically impossible for MS to provide a more detailed analysis of third-party software or drivers without owning the related source code and knowing the design of the driver. At the moment we can only say “it’s a pure ATI hardware/driver problem”.

0 Likes

Hi,

You can download the same memory dump that was analyzed at Microsoft here: Message | Secure File Transfer

Regards,

Jakob

0 Likes

Hi,

Any news? What is the status?

Regards,

Jakob

0 Likes

Hi Bruhaspati,

Are you still looking into this?

Regards,

Jakob

0 Likes

Any news?

Regards,

Jakob

0 Likes