
Intermittent hangs - 4mo. into diagnosis, evidence points to CPU/mobo

Question asked by esmea on Jul 25, 2015
Latest reply on Jul 26, 2015 by esmea

TL;DR: I've done everything short of buying a new CPU/motherboard and still haven't pinned down the problem, but I've isolated it to the CPU or the motherboard... unsure where to take it from here.

 

System Specs:

Motherboard: ASUS Crosshair V Formula-Z

Motherboard Revision : 1.xx

Motherboard BIOS Revision : 2101

 

CPU: AMD FX-9370 (220W)

Configuration: Default

Frequency: 4.4GHz

Voltage: 1.45V (Tested Stable at 1.31V)

BClock: 200MHz

Mult: 22.0

Cooler: CoolerMaster Hyper 212Evo (formerly Corsair H80i)

RAM: 32GB (4x8GB) AMD Radeon Gamer Series 2133MHz

Configuration: AMP/XMP Profile 1

Frequency: 2133MHz

Timing: 10-11-11-30

GPU: ASUS R9-290X-DC2OC-4GB

GPU Firmware version: 1.2

GPU Driver version: Catalyst 15.7

Storage:

SSD: Samsung 840 PRO 256GB (Firmware: DXM06B0Q)

HDD1: Hitachi 2TB

HDD2: WD Black 4TB

PSU: SeaSonic 1050W (formerly Antec TP750)

OS: Windows 8.1 64bit

Display configuration: 1920x1200 (Main), 1280x1024 (Secondary)

 

Long version:

The problem - intermittent hangs in the system - started in late March and was accompanied by a string of unwelcome events, including an unexpected power outage and high CPU temperatures caused by an underperforming Corsair H80i (this was quickly resolved).

 

Testing for software fault: As usual, I took the problem as a sign of silent corruption from soft memory errors and went through my usual routine of reformatting my OS drive (SSD). The problem persisted, however, even after a complete reinstall of all up-to-date drivers (I keep all drivers up to date anyway, but fresh installations help). At that point all I could do was wait for newer driver versions. Note: Catalyst 14.12 and 15.4 beta were both stable before the problem appeared; neither resolved the issue.

 

Testing for driver/firmware updates as solution: Unfortunately, during this initial round of diagnostics, my system hard-froze while I was performing a firmware update on my GPU. I RMAed the GPU and installed my older HD5870 to use in the meantime. At first the problem appeared to go away, but it reappeared as I increased the load on the system.

BIOS was already up to date.

Updated SSD firmware - no improvement.

 

Testing clock configuration: At this time, I caught up on overclocking theory as it relates to AM3+ CPUs. I spent several days adjusting the CPU clock settings and stress testing with a combination of Prime95, ROG RealBench, 3DMark, Unigine Heaven, and AMD Overdrive. During this time, I discovered that the CPU remained stable down to a minimum of 1.31V at stock clock settings, and did not exceed 63°C under full load. I adjusted the voltage to 1.36V for headroom and continued.

RAM settings adjusted to AMP #1 after CPU configured.

At some point after this setup the problem was still occurring, so I reset the BIOS to defaults and set the RAM to AMP. I noted that the CPU voltage defaulted to 1.5V.
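For reference, the arithmetic behind these settings can be sketched as follows. This is only an illustration using the figures quoted in this post (200MHz base clock, 22.0 multiplier, 1.31V minimum stable, 1.36V chosen, 1.5V BIOS default); the function names are made up for the example, and voltages are handled in millivolts to keep the numbers exact.

```python
def core_clock_mhz(base_clock_mhz: float, multiplier: float) -> float:
    """Core clock is the base clock multiplied by the CPU multiplier."""
    return base_clock_mhz * multiplier


def headroom_mv(set_mv: int, min_stable_mv: int) -> int:
    """Margin between the configured voltage and the lowest tested-stable voltage."""
    return set_mv - min_stable_mv


# Figures taken from the post; the names above are illustrative only.
stock_clock = core_clock_mhz(200, 22.0)     # stock FX-9370 setting
chosen_margin = headroom_mv(1360, 1310)     # 1.36V setting vs. 1.31V minimum
default_margin = headroom_mv(1500, 1310)    # 1.5V BIOS default vs. 1.31V minimum

print(f"Core clock: {stock_clock:.0f} MHz")
print(f"Headroom at 1.36V: {chosen_margin} mV")
print(f"Headroom at BIOS default: {default_margin} mV")
```

The point of the comparison is only that the BIOS default sits well above the lowest voltage that tested stable; it says nothing by itself about which component is at fault.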

 

RMA: The GPU returned from RMA; the new GPU exacerbated the problem. I contacted ASUS, who assumed a faulty replacement unit and replaced that one as well. The situation remained the same, so I proceeded to hardware diagnostics.

 

Cooling: (Note: ASUS AISuiteII's ProbeII was reporting alarming voltage fluctuations which coincided with system hangs, so the PSU was considered as a cause. These alerts continued even when neither Corsair Link nor any other hardware monitoring program was installed.) Temperatures were not notably high, given the amount of attention on the system internals at the time. The Corsair H80i was unable to provide sufficient cooling, which was compounded by dust buildup over long-term use (3+ months). I resolved to replace the case, CPU cooler, and PSU with superior models: Antec case replaced with Thermaltake Core v71, Corsair H80i replaced with CM Hyper212Evo, Antec TP750 replaced with SeaSonic 1050W. ProbeII alerts subsided. System hangs continued.
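As an aside on judging whether a logged voltage fluctuation is actually "alarming": a minimal sketch, assuming the readings are for the main ATX rails and using the ±5% tolerance the ATX specification allows on +12V, +5V, and +3.3V. The post does not say which rails ProbeII flagged, so the sample readings below are hypothetical, and software sensor readings are themselves only approximate.

```python
# Nominal voltages for the main positive ATX rails.
ATX_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05  # +/-5% allowed on these rails


def out_of_spec(rail: str, measured: float) -> bool:
    """Return True if a reading deviates from nominal by more than the tolerance."""
    nominal = ATX_RAILS[rail]
    return abs(measured - nominal) > nominal * TOLERANCE


# Hypothetical readings, not taken from the author's logs.
samples = [("+12V", 11.9), ("+12V", 11.2), ("+5V", 5.1), ("+3.3V", 3.0)]
for rail, volts in samples:
    status = "OUT OF SPEC" if out_of_spec(rail, volts) else "ok"
    print(f"{rail}: {volts:.2f} V -> {status}")
```

A multimeter reading on the connector is a better cross-check than a software monitor if the same rail keeps showing up as out of range.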

 

Situational understanding: Time was taken to allow for driver updates to come out. During this time, the problem was narrowed down to coincide with video playback - browser (HTML5, Flash, Silverlight, etc.), VLC, media player, in-game (any game that had video playback), Xbox Video, Windows Movie Maker, Raptr, Kodi (XBMC), VideoPad Editor, and so on. Hardware acceleration did not matter. Sound output device did not matter. Catalyst Control Center settings did not matter.

 

Catalyst 15.7 released - problem worsened, now occurring every 30sec to 5min during video playback and accompanied by "Display Driver has crashed and recovered successfully."

Tried installing Win7 - no improvement. Win10 Technical Preview - no improvement.

 

Hardware isolation: Installed OS on a different HDD - no improvement. Individually tested each DIMM with Memtest86+ - all passed 24hr testing with zero errors. Testing with a different GPU was already performed during the RMA. Swapped the CPU with the only spare I had on hand: Phenom II 945 (95W) - problem resolved.
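The process of elimination described in this log can be sketched as a small table of swap tests. This is a simplified model, not a diagnostic tool: the component names and the `SwapTest` type are invented for the example, the RAM is counted as cleared on the strength of the Memtest86+ passes rather than an actual swap, and the motherboard stays on the list simply because it was never swapped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwapTest:
    component: str   # part that was replaced, reinstalled, or tested in isolation
    resolved: bool   # did the hangs stop afterwards?


def remaining_suspects(all_components, tests):
    """A part is cleared if changing it made no difference; everything else stays suspect."""
    cleared = {t.component for t in tests if not t.resolved}
    return [c for c in all_components if c not in cleared]


COMPONENTS = ["OS/drivers", "GPU", "PSU", "cooler", "storage", "RAM", "CPU", "motherboard"]

# Outcomes as recorded in the post.
TESTS = [
    SwapTest("OS/drivers", False),  # reformat, Win7, Win10 Technical Preview
    SwapTest("GPU", False),         # HD5870 and two RMA replacements
    SwapTest("PSU", False),         # Antec TP750 -> SeaSonic 1050W
    SwapTest("cooler", False),      # H80i -> Hyper 212 Evo
    SwapTest("storage", False),     # OS installed on a different HDD
    SwapTest("RAM", False),         # assumption: Memtest86+ pass treated as cleared
    SwapTest("CPU", True),          # FX-9370 -> Phenom II 945
]

print(remaining_suspects(COMPONENTS, TESTS))
```

The model leaves the CPU and the motherboard as the two candidates. It does not capture that the CPU swap also changed other conditions, such as the lower power draw and the RAM running at 1333MHz.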

 

Obviously I do not intend to run my system on such an outdated processor; I had to reduce my RAM frequency to 1333MHz, and this would be ignoring the problem, not solving it.

I have come to the following conclusion: either the CPU is going bad, or the motherboard is failing to handle a high-powered CPU (i.e., the motherboard is bad). Either way, both passed stress tests and benchmarks with flying colors as recently as 3 days ago - a characteristic which has made this problem especially difficult to diagnose.

I do not have any spare high-power CPUs on hand, nor do I have any spare motherboards capable of testing the FX-9370. I'm also resolved to wait until Zen comes out before buying any new CPUs, not that I can even afford to right now - an emergency expense (one of my cats died, but not before requiring lots of medical attention) ate all of the funds I had set aside for upcoming upgrades.

Before I pin down a definitive diagnosis, I need to be able to isolate the problem to one of these two devices.

 

Finally, the question: Given the explanation of events, how should I go about this?

RMA the motherboard?

Eat the loss of the $220 CPU and put up with the PhenomII 945 until Zen?

Something I've overlooked?

 

This is all from the log I kept, which may be incomplete - it did take 4 months to get to this point. I tried to include all relevant information, to avoid unnecessary questions. 
