Processors

drdocumentum
Adept I

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

I ended up replacing the ASRock TRX40 Creator with a Gigabyte TRX40 Aorus Master at my own expense (no warranty), and my 3960X has been rock solid for about a month now.

This Gigabyte board is a lot better than the Asrock. Better components, better layout, better connectors. Not a single memory error.

hardcoregames_
Big Boss

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

drdocumentum wrote:

I ended up replacing the ASRock TRX40 Creator with a Gigabyte TRX40 Aorus Master at my own expense (no warranty), and my 3960X has been rock solid for about a month now.

This Gigabyte board is a lot better than the Asrock. Better components, better layout, better connectors. Not a single memory error.

I have gone through the same problem with lower end boards. Thanks for posting your results.

mantisman13
Adept II

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

For the most part I've been extremely pleased with the Taichi. The BIOS could be a bit easier to understand, but for the most part it's clear. I think it's targeted at people who have a lot more overclocking experience than I have. As far as components go, it's a step up from the Creator and should be able to take the heat. I'm seeing 90c avg now on my CCDs but I was seeing under 80c before, which makes me think I may also have an issue with my NZXT Kraken X62 that I didn't mention above. It doesn't seem to be adjusting fan or pump rate, and the CAM software loses connection shortly after boot, leaving it running at whatever the rate was at that point. I'm still keeping in range, but not boosting over 4 GHz constantly like I was before. On my last C1,C2,A1,A2,B1,B2 test that gave me a BSOD, the temp never got over 50c, so that's not what is causing the memory issues. Since the board is kicking out that 0d error without even getting to the POST screen, I suspect the issue is with the board/CPU, so I still have to try out my mount pressure idea. I'll post back with those findings.

mantisman13
Adept II

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

Worked for a moment... I pulled the Kraken X62 and loosened the CPU mount. Cleaned, re-torqued, put new thermal paste on the AIO plate only, remounted the AIO and populated the A slots. The system came up and booted into Windows, showing the full 256 GB. I ran the CPU-Z stress test for several minutes, did a few Cinebench runs, then started folding and let it run for an hour. All seemed right.

All except my NZXT CAM software, which has been giving me an issue where it loses the connection it uses to monitor the liquid temp, fans and pump rpm, and stops dynamically adjusting based on the CPU temperature. This is evidently why my avg CPU temps push the upper range of 95c and spike higher. I've been testing this out and find that if I remove the USB connection from the pump head, the fans run on high and the pump must still be running, as my CPU avg is back to 83c after 12 hours of folding.

But back to running with full RAM... While all seemed to work fine, I did notice some oddities. In CPU-Z's SPD tab, memory slots normally report as slots 1-8. What I saw here was that slots 1-2 (which are normally empty when I do not have RAM in the A slots) were showing the normal settings for the RAM, yet slots 3-8 were blank. I closed CPU-Z and opened it again. This time I had info in slots 1-4 and then the numbering skipped to 11-14. DIMM reporting in HWiNFO64 showed the correct board locations for all RAM. So I figured it had something to do with Windows caching info from the BIOS and reasoned a reboot would clear that up.

On reboot, I decided to go into the BIOS and look at the RAM report. That looked right, and while I was in there I enabled PS/2 wake-up and also changed the hardware monitor's CPU2 fan from fan mode to pump mode. I saved the settings and tried to reboot. This returned me straight back to the 0d BIOS error, where it would not even get to the POST screen. I power cycled and it got past POST and tried to go into Windows but immediately hit a BSOD. Powered down, retried, same result. Both were IRQL_NOT_LESS_OR_EQUAL errors. Powered down again. Removed the A DIMMs. Rebooted, got back into the BIOS and reverted my changes, rebooted, hit the Windows repair, restarted and all was good again. I sort of doubt my BIOS changes were the issue, but I haven't confirmed that yet.

 

I swapped the two 32G DIMMs into another system in place of its two 8G sticks, and they have been running fine for 14 hours of folding, but at different timings. The MSI gaming pro board doesn't seem to recognize the XMP profile, so I had to set the closest 3200 settings I could get. They are running at Dual, DRAM Freq 1599.1 MHz, FSB:DRAM 1:16, CL 18 (not 16), RCD 20, RP 20, tRAS 38, tRC 75, CR 1T. So the main difference is that the CAS latency is a bit slower. I had tried to adjust that to 16 in the BIOS, but I guess it didn't take.
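For anyone wondering how much CL 18 vs CL 16 actually costs, here is a rough sketch in Python. It assumes the usual convention that CAS latency is counted in DRAM clock cycles and that the DDR data rate is twice the DRAM clock CPU-Z reports; the function name is just made up for illustration.

```python
def cas_latency_ns(cas_cycles: int, dram_clock_mhz: float) -> float:
    """Convert a CAS latency given in clock cycles to nanoseconds."""
    # One clock cycle lasts 1000 / MHz nanoseconds.
    return cas_cycles * 1000.0 / dram_clock_mhz


dram_clock_mhz = 1599.1            # "DRAM Freq" as reported by CPU-Z above
data_rate = 2 * dram_clock_mhz     # DDR: two transfers per clock cycle

print(f"effective data rate: {data_rate:.1f} MT/s")
print(f"CL16: {cas_latency_ns(16, dram_clock_mhz):.2f} ns")
print(f"CL18: {cas_latency_ns(18, dram_clock_mhz):.2f} ns")
```

At this clock that works out to roughly 10 ns for CL 16 versus roughly 11.3 ns for CL 18, so the looser setting costs a little over a nanosecond per CAS access.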

But this all gets me thinking back to when I first set up the Threadripper. I don't think I had XMP profiles to select from, and originally I was using the BIOS default, which may have been 2132. The DIMMs report a max bandwidth of DDR-2132 (1066 MHz), but are binned and rated for 3200 MHz by Corsair. During my original benchmarking I went in and started picking higher memory settings and letting the BIOS auto-fill the timings. I crashed at higher levels like 3600, and I forget where I got things stable. It may have been 3200, but then again it may have been a bit lower. I do recall that where I landed greatly improved all my benchmarks, hitting well over 17k on Cinebench R20. I'm only able to get to the mid 16k range with the three-channel memory. Interestingly, while I did have the 256 GB running in quad channel briefly, I wasn't breaking into 16k, just upper 15k, so it was slower, and my CPU cores that normally boost above 4 GHz were stuck at 3.7.

So, could I still be just dealing with a memory timing/voltage issue here?

misterj
Exemplar

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

mantisman13, I think I have the same problem as riveryeti, the OP, and maybe you.  Based on his advice, I removed the RAM stick from the offending channel and have had NO problems since.  It sounds like you have identified the offending channel, so I advise removing the sticks from this channel and seeing whether it will run.

What version of CPU-Z are you using?  I have seen the strange results you describe on older versions that do not support our MB/CPU.  The latest right now is 1.92.0.  iCue and the other software you are using are not to be trusted.  I suggest uninstalling all those applications and running CPU-Z, AIDA64 (paid, but trial available) and Ryzen Master (RM).  I have no other applications to trust.  We still do not know what riveryeti did, but I suspect he RMAed his processor.  I have not, simply because I am tired of changing things in my system.  Please try a three channel setup in your system, and let us hear how it goes.  I would suggest a Clear CMOS, RAID setup, then see how it goes.  BTW I get a Cinebench R20 score over 17,000 (all cores) with my three channel system and no other tuning or OCing.  I have set NUMA mode active (see here), which seems to help my memory performance.  Enjoy, John.

PS: This is a user forum and seldom does an AMD employee read/post here - support.

drdocumentum
Adept I

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

I have seen these memory issues reported on another forum with the TRX40 boards from ASRock. I believe they might have a design problem or a nasty BIOS bug.

I remember from my tests (which I ran every day for about a month before giving up and purchasing a Gigabyte TRX40 Aorus Master) that the board worked at random: sometimes rebooting into the BIOS and reloading defaults kept it working for the day. But the next day the "BIOS charge capacitor" was depleted and the issue reappeared.

I also tried a lot of DIMM combinations. They seemed to work for a period and then started failing again. When the quarantine was lifted in my sector I went to the dealer, and he said that I must conform to the QVL. Well, I explained to him that I have sets from Crucial, HyperX and Corsair (I bought memory twice, also suspecting a memory defect) and all had the same problem. So he kept the mobo and CPU for three days of testing and confirmed with his own lab memory that it was the motherboard after all. He tried my CPU with a new Gigabyte TRX40 Aorus Pro Wifi that he had in stock and the issues went away. He finally ended up exchanging the ASRock for the Pro Wifi. Since I had already purchased a new, more advanced board online, the exchanged one ended up sitting in storage at home, and I plan to sell it online.

The lesson I learned from this issue is that no matter what combination or test you perform, the board fails randomly, at different times and sometimes with different BSOD error messages.

I also believe that ASRock knows about this problem, because my board got a little physical damage from all the assembly/disassembly tests that I did, and they agreed to RMA it anyway through the local dealer after he told them what the issue was (he also sent them pictures of the damage). My local dealer was very open and understood that the damage was a consequence of the board failing at random, and I thank him for that.

So, if you have the chance, just return your ASRock and buy a board from another brand (Asus, MSI or Gigabyte). You will save a lot of time and headaches.

drdocumentum
Adept I

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

One more thing I forgot to mention: memory voltage. My memory requires 1.35 V to reach the rated 3200 MHz. After setting the correct value and rebooting on the ASRock, the system monitor section of the BIOS reported the voltage as 1.376 V, and the value was not stable either; sometimes it changed to 1.365 V. I tried lowering the setting so the reading would match 1.35 V and it was impossible. I showed that to the dealer as an argument against the board's quality, and he agreed that it was not normal after comparing it with the Gigabyte, where on a 1.35 V setting the monitor reported 1.356 V. I also found pictures online of the same issue on the ASRock. Because of that, I suspect that the memory voltage regulation subsystem on the ASRock might be flawed by design.
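To put those readings in perspective, here is a quick Python sketch that turns them into a percentage over the 1.35 V setting. It only uses the numbers quoted above; the helper name and the labels are made up for illustration.

```python
def deviation_percent(measured_v: float, target_v: float) -> float:
    """Relative deviation of a measured voltage from its target, in percent."""
    return (measured_v - target_v) / target_v * 100.0


target = 1.35  # volts, the DRAM voltage setting from the post
readings = {
    "ASRock (high reading)": 1.376,
    "ASRock (low reading)": 1.365,
    "Gigabyte": 1.356,
}

for label, volts in readings.items():
    pct = deviation_percent(volts, target)
    print(f"{label}: {volts:.3f} V ({pct:+.2f}% vs {target:.2f} V)")
```

That comes out to roughly 1.9% and 1.1% over target for the two ASRock readings, against roughly 0.4% for the Gigabyte.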

misterj
Exemplar

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

drdocumentum, I do not know what MBs or processors you are/have been running, but in my case this is a processor problem.  I have seen it on MSI (2990WX) and Gigabyte (3970X) but not on my ASRock (1950X).  I strongly believe, with the OP, this is a memory controller problem.  I suspect if you depopulate the offending channel, your system will run OK, just with lower memory bandwidth.  Enjoy, John.
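As a back-of-the-envelope figure for what dropping a channel costs, here is a small Python sketch of theoretical peak bandwidth. It assumes a 64-bit (8-byte) data bus per DDR4 channel and DDR4-3200 memory; real-world throughput will be lower, and the function name is just illustrative.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data bus per DDR4 channel


def peak_bandwidth_gbs(channels: int, data_rate_mts: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    return channels * BYTES_PER_TRANSFER * data_rate_mts / 1000.0


for channels in (4, 3):
    gbs = peak_bandwidth_gbs(channels, 3200)
    print(f"{channels} channels @ DDR4-3200: {gbs:.1f} GB/s")
```

On paper that is 102.4 GB/s with four channels and 76.8 GB/s with three, so a quarter of the peak bandwidth goes away with the depopulated channel.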

drdocumentum
Adept I

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

So you haven't read my posts. I have a 3960X. The CPU is working fine on the Gigabyte TRX40 Pro Wifi and Aorus Master (I have both boards) but fails on the ASRock TRX40. The dealer confirmed it was a board problem, as he tested both in his lab.
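The reasoning behind that cross-test can be written down as a tiny Python sketch. It assumes the same CPU is tried on two boards with known-good memory each time; the function and its return labels are hypothetical, just a way to spell out the logic.

```python
def likely_culprit(fails_on_original_board: bool, fails_on_second_board: bool) -> str:
    """Interpret a cross-test of one CPU on two boards with known-good memory."""
    if fails_on_original_board and fails_on_second_board:
        return "CPU/IMC"              # the fault follows the CPU
    if fails_on_original_board:
        return "original motherboard"  # the fault stays with the first board
    if fails_on_second_board:
        return "second motherboard"
    return "no fault reproduced"


# My case: the 3960X fails on the ASRock but runs fine on the Gigabyte.
print(likely_culprit(fails_on_original_board=True, fails_on_second_board=False))
```

If the errors had followed the CPU onto the second board, the same test would have pointed at the memory controller instead.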

misterj
Exemplar

Re: Bad memory channel - how to test if mobo or CPU IMC (TR 3960x)?

Thanks, drdocumentum.  I hope your system continues to run well.  My current MB is a Gigabyte TRX40 DESIGNARE and it runs fine in a triple channel setup with NUMA enabled.  Enjoy, John.
