Hi misterj (also John). I'm guessing you're using email. There are a bunch of posts you may have missed. You are right, I'm completely stable using 3-way memory, so long as I stay away from the A channel (see above: "Did yet another test where I tried D1,D2,C1,C2,A1,A2. This booted up, but I got a BSOD after just a few mins with no real load."). This does point to a board issue, but I was running fine for 6 months, and since I'm having some issues with my AIO software, I know things got a bit heated towards the max. That may have damaged something, or in the software updates I did I may have gotten the BIOS into an unstable setup that I just need to find my way back from.
Good idea, I'll get back into the BIOS tonight and see what the voltage is set to. HWiNFO64 is reporting my DRAM voltage as 1.366 V, just as you noted.
Thanks. The primary reason I bought a Threadripper was the 8-slot memory support. I use my setup as a virtual machine server for my work's development/consulting projects, and I need to run about 8 big VMs, so I need lots of memory and all the channels working. I don't really care much about having a lot of cores, as my workload is memory-limited rather than CPU-limited, so I purchased the 3960X instead of the 3970X or 3990X. Regarding the Designare, I read that you need to put the Thunderbolt card in slot 4 or the system won't work properly and gives memory errors. I also read a report from a user that XMP profiles were not working when the card was installed. So maybe that is your problem.
mantisman13, what??? "I'm guess your using email." I post just like others do, just have not posted here in a long while, but when I saw your posts thought I would comment.
HWiNFO and many others give poor results. The ONLY application to use is Ryzen Master (RM). Please post a screenshot - simply drag-n-drop the image into your post. What score do you get in CB R20 for all cores? I think a memory voltage 100 mV high is a clear indication of MB quality, but I seriously doubt that it will make the memory less stable. In fact, maybe more stable. I have the same memory in all my systems - G.SKILL Flare X F4-3200C14Q-32GFX. These are Samsung B-die and all run on 1.35 volts.
Have either of you looked into Event Viewer for errors? If not, please do. If you find any please expand Details in the lower panel and paste the data here. Thanks and enjoy, John.
Interesting idea. The only PCIe slot I'm using is the top one, for the GPU. I used the 2 onboard M.2 sockets for my system mirror. But perhaps it's more an issue when using RAID vs SATA and memory. At any rate, the "only" software that was updated before everything went SNAFU was the NZXT CAM AIO software, plus installing nodejs, vuejs and bootstrap vuejs; Docker for Windows had an update, and of course Windows itself had a big update... After I started having issues I updated the AMD chipset drivers, which only updated the I2C Controller. The iCUE firmware updates to the memory came after the issues began but before the system was so unstable it wouldn't run at all. That also didn't happen until I started using the XMP profile. I think the XMP profile showed up after the memory firmware update but before I updated the BIOS from 1.1 to 1.6, and that actually makes sense. Now, you mentioned setting your voltage to 1.35, as your memory is specified for. Mine is the same at 3200, and it looks like the board is trying to overvolt it a tad, to 1.365. I've read that adding an extra .1 V can help with memory timing. But 1.35 is the XMP profile voltage that the manufacturer says they should be stable at. The JEDEC timings all use 1.2 V, so the .15 up to 1.35 is already an overclocked voltage. So I think that is my next move.
Like you, I want all the RAM and more. Not a gamer, so less concerned with memory latency. I do a lot with database development and web services and am getting into building for Docker deployments, so I need both memory and cores. May talk myself into springing for a 3990X and plan to build one more box, but it's been a while since a good payday, so I need to try to keep thoughts like that under control.
Not much useful in the event logs. Best I get is from looking at minidumps, but that doesn't give much help either. Sorry about the mail comment, just that you were missing things me and another had already posted. I have tried to run Ryzen Master, but it needs to have VBS enabled and I haven't found an obvious setting in my BIOS for that yet. It may be something that gets set if you agree to the overclocking statement in the BIOS, which would then void my CPU warranty, and I'm not willing to do that, especially now. I agree that RM would yield measurements that are more pure, as they don't have to go through the system layer, but I'm not sure that will really matter much for the issue I'm having. No reason to distrust the voltage reading I'm getting out of CPU-Z or HWiNFO64.
mantisman13, SVM (Secure Virtual Machine) must be Disabled to run RM. I only enable it when I run Hyper-V (W10 virtual machine):
Screenshot from my system. SVM should not get set/reset for any reason except User action. I do not need to agree to any OC stuff to set SVM on/off. RM is also the best way to mess with most BIOS items. Please notice that I have 'Power Supply Idle Current' set to 'Typical Current Idle'. This seemed to help when this first popped up. If you do not mind, please compress your Minidump folder and attach it here. Enjoy, John.
Just to clarify regarding excess voltage. In my case the ASRock was at 1.376 V, which is about 1.9% over the configured 1.35 V. I believe that is too much for an acceptable error margin. The memory was actually getting very hot, so I believe it is an important defect for that motherboard, as it will compromise memory life in the medium or long term.
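For anyone who wants to check the over-voltage numbers, here is a minimal sketch of the arithmetic, using only the readings quoted in this thread. The function name `overvolt_percent` and the dictionary are made up for illustration; nothing here reads the hardware.

```python
def overvolt_percent(measured_v: float, configured_v: float) -> float:
    """Return how far a measured voltage sits above the configured one, in percent."""
    return (measured_v - configured_v) / configured_v * 100.0


# (measured, configured) pairs taken from the posts in this thread.
readings = {
    "ASRock reading vs XMP 1.35 V": (1.376, 1.35),
    "HWiNFO64 reading vs XMP 1.35 V": (1.366, 1.35),
    "XMP 1.35 V vs JEDEC 1.2 V": (1.35, 1.2),
}

for label, (measured, configured) in readings.items():
    pct = overvolt_percent(measured, configured)
    print(f"{label}: {pct:.2f}% over")
```

The two board readings come out at a few percent or less above the XMP voltage, while the XMP voltage itself is 12.5% above the JEDEC 1.2 V baseline.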
A comment regarding NUMA. On a first or second gen TR CPU it does help, as the OS can see which cores are near to RAM and which ones are far. But on 3rd gen, NUMA is not the right choice. Current Threadrippers have a separate memory controller die that feeds all the CCXs via Infinity Fabric, and there are no NUMA nodes anymore. The important parameter for memory performance on a 3rd gen TR is keeping the Infinity Fabric clock in sync with the memory clock. For 3200 MHz modules, that is 1600 MHz for both the IF and memory clocks. In my tests with the ASRock I could see that the BIOS on that board sometimes configured the IF as low as 800 MHz while the memory was at 1600 MHz. That setting can be observed in CPU-Z as NB Frequency: (this is my current setup; it shows a difference of a few MHz, but that is because CPU-Z reads NB and DRAM one after the other, and the value is not a perfect 1600 MHz because the MB clock generator isn't always fixed)
I use my graphics card in slot 3 because I can't fit it in the first slot, as I use a Cooler Master Wraith Ripper, which is a very big air cooler. The Gigabyte BIOS has a setting to select the primary graphics card slot, so it is not really important.
BTW: I am using an air cooler because my local dealer told me he has received many reports of pumps exploding inside the case due to Threadripper heat stress (I had a water cooler AIO installed first). Since people use the TR machines for work, they keep them on 24x7, and water pumps are not very reliable for that kind of continuous workload.
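As a rough illustration of the sync check described above: the memory clock is half the DDR transfer rate, and the NB Frequency shown by CPU-Z should match it to within a few MHz. This is only a sketch; the function names and the 10 MHz tolerance are my own assumptions, and the 1596.8 sample value is invented to stand in for a slightly-off CPU-Z reading. You type the values in by hand, nothing is read from the system.

```python
def memory_clock_mhz(ddr_rate: float) -> float:
    """DDR memory transfers twice per clock, so DDR4-3200 runs a 1600 MHz clock."""
    return ddr_rate / 2.0


def is_in_sync(nb_mhz: float, dram_mhz: float, tolerance_mhz: float = 10.0) -> bool:
    """True if the NB (Infinity Fabric) clock matches the DRAM clock.

    tolerance_mhz is an assumed allowance for the few-MHz jitter that
    CPU-Z shows between the two readings.
    """
    return abs(nb_mhz - dram_mhz) <= tolerance_mhz


target = memory_clock_mhz(3200)          # clock expected for 3200 modules
print(target)
print(is_in_sync(1596.8, target))        # hypothetical reading, a few MHz off
print(is_in_sync(800.0, target))         # the 800 MHz case seen on the ASRock
```

With these inputs the first check passes and the 800 MHz case fails, which is the out-of-sync situation to look for in the NB Frequency field.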