throwaway1234567
Journeyman III

How to troubleshoot stability issues with a 6900 XT

I'm having problems with my 6900 XT: I get constant crashes in every game. The only fix so far has been to set max frequency and voltage to 80%, but this isn't ideal.

 

I removed the old Nvidia drivers with DDU and installed new Radeon drivers. Every benchmark other than FurMark crashes, and most games crash on menu load, except CSGO.

What I've tried

- Double-checked that the 8-pin power cables aren't split off a single cable

- Reinstalling Windows, same problem

- Reinstalling drivers

I assumed this was a power draw problem somehow (even though I have a 1000 W PSU), so I ran FurMark at 1080p and my power draw hit 300 W with no crashes.

Not really sure what else to try. I sent a support request to AMD but probably won't hear back for a few weeks.

87 Replies
Spladian
Adept I

Getting the same on my card.  I've been trying to get high frame rates on games at 3840x1600 resolution. This is the card to do it, but the current drivers, or Radeon Software appear to be unstable.

I'll install the driver only and report back on whether it is the driver or the Radeon Software.

I think there is an issue with the initial driver for the 6900 XT. Games are crashing with only the driver installed and no Radeon 'control panel.'

If you use the Microsoft Basic Display Adapter and then check HWiNFO (all the way at the bottom), it showed zero errors for me. As soon as I load the INF driver only (no Radeon Software/WattMan etc.), I start getting WHEA errors.
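To cross-check the HWiNFO counter against the Windows event log, something like the following might help. It is only a sketch: it assumes WHEA events are written to the System log by the `Microsoft-Windows-WHEA-Logger` provider, and `build_whea_query` is a made-up helper name, not part of any tool mentioned in this thread.

```python
import platform
import subprocess

# Assumption: WHEA events are logged in the System log by this provider.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events=20):
    """Return a wevtutil command listing the newest WHEA-Logger events."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # plain-text output
    ]


if __name__ == "__main__":
    cmd = build_whea_query()
    print(" ".join(cmd))
    if platform.system() == "Windows":
        # wevtutil only exists on Windows, so skip the call elsewhere.
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout)
```

If the event log stays empty while HWiNFO keeps counting, that would at least suggest the counter and the log are not measuring the same thing.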

 

[Attachment: PCIE errors.png]

 

 

0 Likes

WHEA errors could also be drivers, CPU, other hardware devices on your mainboard, and/or Windows itself. I suddenly experienced BSODs with the WHEA message; it started literally out of the blue one day and would crash every 1-2 hours. Drove me nuts tracking down the issue. Turned out to be bad NIC and BT drivers installed by Windows shortly after a Tuesday update! This was back in the summer on Windows 10 version 2004 (the May 2020 build).

FYI I have a 6900 XT running just fine on an X570 system w/ 3900 CPU and Windows 10 20H2. Works great; fast and stable. And it was not a fresh install but one that's been hammered with changes since the summer.

0 Likes

Should add I'm also running the 2020.12.2 drivers.

0 Likes

I have yet to try a fresh install.  Although I've DDU'd drivers/ AMD clean-up utility/ and Microsoft uninstall tool many times now.

The card has worked for hours and hours, only in some spots in a few games does it consistently crash.  Being new to AMD, I'm not accustomed to this behavior. I do know another person reached out to me on discord and we have been bouncing ideas off of each other trying to solve the issue, though. 

Weird issue though: if I hard boot, the WHEA errors go away. If I restart, they come back, with or without Radeon Software.

0 Likes

You might try disabling hibernation/fast startup. Cached data can get corrupted and be reintroduced every time you start up. The WHEA errors are becoming pretty prevalent and will likely take a BIOS fix to remedy, but IMHO fast startup is an unneeded feature and usually only makes things worse, not better. AMD for some reason seems to be affected by it worse than the competition. Regardless, this is a normal first step when helping AMD users get their GPU more stable: it stops the cached data from re-introducing bits of older drivers.

https://www.windowscentral.com/how-disable-windows-10-fast-startup
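As an alternative to clicking through the Control Panel steps in the linked article, hibernation can be switched off from an elevated prompt with `powercfg /hibernate off`, which also disables Fast Startup. A minimal sketch of wrapping that call follows; the function names are hypothetical, and it defaults to a dry run so nothing changes unless you ask for it.

```python
import platform
import subprocess


def hibernation_off_command():
    """Command that turns off hibernation, which also disables Fast Startup."""
    return ["powercfg", "/hibernate", "off"]


def disable_hibernation(dry_run=True):
    """Print the command; run it only on Windows and only when dry_run is False."""
    cmd = hibernation_off_command()
    print(" ".join(cmd))
    if not dry_run and platform.system() == "Windows":
        # Needs an elevated (administrator) prompt to succeed.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `disable_hibernation(dry_run=False)` from an administrator session would apply the change; `powercfg /hibernate on` reverses it.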

 

0 Likes

I'm also coming from Nvidia. I didn't have any issues with crashing in quite some time with their cards/drivers. I recently picked up a 6900XT, and also used DDU and installed the latest set of drivers. 

I run at 4k 120Hz (LG CX), and CoD Warzone keeps crashing on me. It's pretty intermittent, and usually seems to come about after about an hour or two of playing. It hard crashes the PC (completely locks up).

I just updated my BIOS (X570) as MSI just released the new AMD chipset update. 

I'll try disabling fast startup in Windows and see if that makes a difference. If not, I think I'll try the "Undervolt GPU" option in Adrenalin. My PSU is an EVGA SuperNOVA P2 850W (Platinum). I'm not running a crazy OC on my 3700X, but I do have AMD Precision Boost Overdrive (PBO) enabled in the BIOS.

I know it's not uncommon for new GPUs to experience stability issues, but after defending/ignoring the comments about AMD drivers still being "trash", I'm starting to feel like a fool. Lol

0 Likes

AMD RDNA and now RDNA2 are known for having random high power spikes as well, often going beyond what even a quality PSU that should be enough according to recommendations can handle. You can run a monitor with logging and see if you are getting a voltage drop at the time the problems happen.

You can test the GPU and PSU also with OCCT from ocbase.com and see if you can force the issue. 
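Once a sensor log has been saved as CSV, a short script can flag the samples where the 12 V rail sagged. This is a sketch under assumptions: the column names (`Time`, `+12V [V]`) and the sample rows are made up for illustration and will differ in a real export, and the 11.4 V threshold comes from the ATX ±5% tolerance on the 12 V rail.

```python
import csv
import io

# ATX allows +/-5% on the 12 V rail, so roughly 11.4 V is the lower bound.
MIN_12V = 12.0 * 0.95


def find_12v_dips(csv_text, column="+12V [V]", threshold=MIN_12V):
    """Return (timestamp, volts) for every sample below the threshold."""
    dips = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            volts = float(row[column])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows without a usable reading
        if volts < threshold:
            dips.append((row.get("Time", ""), volts))
    return dips


# Made-up sample data, not a real log.
sample = """Time,+12V [V],GPU Power [W]
20:01:00,12.10,180
20:01:02,11.95,295
20:01:04,11.32,310
"""
print(find_12v_dips(sample))
```

Keep in mind that a log polled every second or two will likely miss very short transients, so a clean log does not rule out the PSU.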

0 Likes

WHEA errors only show up when I restart Windows; after a cold boot they are not there. The polling rate of HWiNFO is 2000 ms, which is probably why the errors were appearing at 1 every 2 seconds.
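One way to test the polling theory is to look at how much the cumulative error counter grows between polls. The sketch below uses made-up counter readings and hypothetical function names: if every poll adds exactly one error, the count is probably tied to the polling itself rather than to the workload.

```python
def errors_per_poll(counter_samples):
    """Differences between consecutive cumulative error counts."""
    return [b - a for a, b in zip(counter_samples, counter_samples[1:])]


def looks_poll_bound(counter_samples):
    """True if every poll added exactly one error."""
    deltas = errors_per_poll(counter_samples)
    return bool(deltas) and all(d == 1 for d in deltas)


# Made-up readings taken once per 2000 ms poll.
desktop = [10, 11, 12, 13, 14]
print(errors_per_poll(desktop), looks_poll_bound(desktop))
```

Changing the polling interval and checking whether the error rate follows it would be a simple follow-up experiment.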

I played some games today and things were stable: 3 hours with no crashes.  I'll chalk it up to maybe a registry issue or a driver issue, since that Windows install had been on my machine for 2 years.

[Attachment: ACV.png]

 

Thanks all for trying to assist. Starting from scratch is always the best approach to troubleshooting. Windows installs are no different, but a reinstall is mostly avoided because of the hassle of rebuilding everything.

0 Likes

Shame on me for not being intuitive on this, but I'm trying to understand what your "solution" was for your issue. I'm having similar crashes with a 5700 XT, so I'm just going across threads looking for any info. Was the full windows wipe what helped you? And if so, did you keep ANY files or did you full wipe all files?

Sorry for not understanding what you did here to stabilize your system.

Thanks!

0 Likes

 

Hopefully I can add to this conversation as I was having system crashes (Total system lockup with video frozen and hard reset to resolve)  after moving from a nvidia 2080 to an AMD 6900xt. 

Crashing as well in Call of Duty Modern Warfare and later in a web browser...  hmm 

Setup: CPU i9-9900K / 64 GB, ASRock Z390 Taichi Ultimate MB, a few NVMe drives.

 


What I found so far:

I was crashing in games like Modern Warfare from sitting idle in the lobby after awhile. Thinking this was due to overclocking, I went back to the default AMD settings which resulted in additional system crashes.  hmm okay.. now why am I crashing even more now???

Here is what I think happened:

Modern Warfare most likely has a memory leak and/or is requesting too much vram.  (More on that in a sec).  Thinking the crashes were a result of overclocking I moved to the default AMD settings which unfortunately sets the fan curve very low. This allows for the junction and core temps to rise. While okay for the video card.. this is throwing some serious heat off the metal backplate on the GPU.  okay and..

With my particular motherboard... the GPU sits very close to the system memory and M.2 slots for storage.  I started seeing some heat warnings in HWiNFO64 on my NVMe.. wtf.    So I ended up moving the GPU down to the secondary PCIe x16 slot to give it some distance from the memory/etc.  Then I went back into the GPU settings, cranked the fans up to 100% and turned off zero-fan mode.

Back to Modern Warfare..   I put a cap on video RAM usage.  Went into  documents/call of duty modern warfare/players/adv_options.ini    and changed video memory scale = 0.5  for now.

So now the game is pulling 9 GB of VRAM. So we'll watch that and restart the game if she goes above 10 GB.
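For anyone who would rather script the config change than edit the file by hand, here is a rough sketch. It assumes the setting appears in the file as a `VideoMemoryScale = <number>` line, possibly followed by a comment; that key spelling and the sample text are assumptions, so check your own adv_options.ini first. `set_ini_value` is a made-up helper.

```python
import re


def set_ini_value(text, key, value):
    """Replace the value of `key = ...` on its line, keeping any trailing comment."""
    pattern = re.compile(
        rf"^(\s*{re.escape(key)}\s*=\s*)([^\s/]+)(.*)$", re.MULTILINE
    )
    new_text, count = pattern.subn(rf"\g<1>{value}\g<3>", text)
    if count == 0:
        raise KeyError(f"{key} not found")
    return new_text


# Made-up sample content; a real file will contain more settings.
sample = "VideoMemoryScale = 0.85 // 0.0 to 2.0\nOtherSetting = 1\n"
print(set_ini_value(sample, "VideoMemoryScale", "0.5"))
```

Back up the original file before writing the result back, since the game may rewrite or validate it on launch.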

--

NOTE : I'm still trying to setup baseline testing but I was able to at least get a few hours of game play without issues. No more system crashes while at the desktop doing nothing. So we're making progress I guess. 

I'm trying to narrow down what appears to be the Radeon application crashes in the event logs.  What I'm noticing now is windows error reporting when I come back to the PC and turn on my monitors.  I do not run any type of power management and my PC runs all day (Never turn this thing off unless patching).  So I'm wondering if something may be happening with display detection but I'm not there yet in my testing.

Triple Monitor setup (2k monitors LG-850b) :  2 Monitors run Display Port and 3rd monitor is running Type-C to DP cable.  

NOTE2: Usually on hard system crashes which require a power off.. I'm stuck having to pull that type-c cable out and plug it back in to get video back. 

 

I'd like to hear if anyone else has luck moving PCIe x16 slots and running the fans at 100%, etc.

-Motavar

 

Edit note: running 20.12.2 software.

 

0 Likes

update:

Decided to try an older game like BattleFront II.

Crash about every hour.  Hmm..

Did a fresh install of the drivers after running DDU in safe mode.

Going to give this another run. 

sfc /scannow shows no issues. 

Swapped out ram so I was running 32 out of 64.  Flipped sticks.. same issues with crashing. 

Running no OC's on video.   We'll see what happens after this. 

 

 

0 Likes

I think some of these issues could be PSU related. My buddy has a 6900 XT and he was running it off one 8-pin cable that splits, on a multi-rail PSU, so he wasn't getting the full power the card needed. This may help: https://youtu.be/PWtKSHT2od8

If you have a multi-rail PSU, you may try switching it to single-rail.

0 Likes

I have a 1200w single rail PSU - I believe the issue is power spikes causing an undervoltage condition.

 

While the cards can bench 300w all day long, I don't think it is possible to play a game on these cards (which would be representative of a dynamic workload) at a 300w configuration due to the power spiking up to 800+ watts.

 

0 Likes

If you have a 1200W supply that should be plenty! Unless it's defective I don't think it would cause any stability issues at all. These cards only draw up to 300W'ish stock but, even then, power under load is actually quite low in the 120-150W range as people have reported. These AMD cards are pretty frugal with power.

Always hard to help solve other people's computer problems as there are just so many variables. But as a reference my 6900XT is running perfectly here with no issues (aside from the DisplayPort one). There is a texture/lighting issue in one game but it's either driver related or the game code, and not h/w or firmware.

0 Likes

Unless it's defective! - Bingo

I had a 1300 W PSU lying around. Put that in and haven't had a crash playing a game that crashed within a minute on my 1200 W PSU.

The EVGA 1200W Platinum has a 10 year warranty, so shouldn't have any issue RMA'ing it.

This appears to be the solution for my issue. ... Will report back if the crashing comes back.

0 Likes

Swapped out my EVGA 850 W PSU for a really old OCZ 1000 W and I'm stable again.

Put my 5 GHz OC back on, XMP memory back, 6900 XT OC clock at 2450/2550, 1150 mV, memory maxed, and 300 W.

Ran a solid 4 hours gaming tonight but still more tests are needed.  I guess I'll have to move to a 1200 W+ PSU to run this stable.

 

 

 

0 Likes

Update for (my) issue so far:

12 hours stable so far..

BIOS upgraded and reset to defaults.  Removed my 2-year-old 5 GHz/4.8 GHz overclock on my i9-9900K and removed the XMP profile from memory.

NOTE:  While you can benchmark/test/burn-in all day long, some hardware can just behave differently.  So while my OC was okay with Nvidia, maybe it just doesn't work with this 6900 XT.  Which is okay; we'll work through that and confirm whether it was the CPU overclock, the XMP profile, or something else affecting stability.   There are a lot of online posts that mention Radeon issues related to XMP profiles.   Further testing will have to be done, as I want to check RAM voltage, change out the power supply, etc.

 

0 Likes

I'm pretty sure it's the drivers at this point - I was getting 1 WHEA error every 2 seconds in HWINFO 64 (scroll all the way to the bottom to see it.) As soon as I uninstalled the Driver to go back to the Basic Microsoft Display Adapter - the WHEA errors Stopped. 

0 Likes

Well, it's not the drivers (20.12.2) alone as I'm having zero problems. So it's something with the configuration of your machine(s).

There's another BIOS setting unique to the 6000 series which is the "Above 4G Decoding" related to SAM. Make sure that's enabled even if you do not have a 5000 series CPU.

I suspect the WHEA is from another driver conflict somewhere that has now surfaced with the new GPU. I also assume you've updated the chipset drivers, correct? Go to AMD's site to download them and don't use the default Win10 drivers.

Another thing to note: make sure to use the default AMD power profiles. Folks have had troubles when going into the settings and making custom changes.

0 Likes

I just wiped my whole machine. Fresh start with Windows. The issue still persists.  Power plan is defaulted to AMD Ryzen Balanced. Nothing tweaked yet like unpark cpu cores, or optimize Windows for gaming.

Steps I took so far to rebuild machine: 

-Clicked on check for updates until it stopped downloading anything

-Installed chipset driver

-Installed Adrenalin driver

-Created System Restore point

-Downloaded Chrome with DuckDuckGo extension

-Downloaded Ubisoft Connect, downloaded Assassin's Creed Valhalla

-Loaded a save game where I knew I could reproduce the issue. Issue occurred within 2 minutes.

More than happy to entertain suggestions at this point: Discord id: Spladian#9998.

Windows: Windows Home Fresh install as of 12/30/2020.

CPU: 3800x

Mobo: Gigabyte Aorus Extreme 570 with latest bios update

GFX : 6900xt Power Color Red Devil 

RAM: DDR4-4000 CL18 - Trident Z Neo - replaced it with

RAM: DDR4-3800 CL14 - Trident Z Neo (approved RAM for motherboard) - This RAM is 2 days old

PSU - EVGA SuperNova 1200W Platinum

0 Likes

So, I ended up disabling Fast Boot, and also manually bumping up the GPU fan curve in Adrenalin.

I got the feeling that my GPU was a little "too" quiet. After doing this, I was able to play all night without a crash. *Knocks on wood*

Try cranking the fan curve up and seeing if that helps.

0 Likes

I'm starting to wonder if the WHEA errors showing in HWINFO 64 are actually telling of anything.

 

If I cold boot the system, they disappear. If I restart the system, they appear at 1 every two seconds on the desktop. Once in game, they level out to about 1 a minute over the course of an hour of play. I've not touched anything in the Radeon software except for enabling Adaptive Sync.

The default profile runs at:

2500 MHz,

1175 mV,

250 W,

1300 fan speed

69 °C temp, 80 °C junction temp.

Everything goes fine until I start manually playing around with these values and then errors become much more frequent. 

The hard pill to swallow here is - Once I click any of the buttons on the tuning page, the issues begin. If I go back to default values, the issues then don't go away, but persist. This was before reformatting Windows, so I'll play a day to see if things are mostly stable now, and report back if any errors after manually setting the fan curve.

Appreciate all the suggestions.

0 Likes

Definitely not the drivers. I'm running 20.12.2 as well. 0 issues since I put the card in. Not even a fresh install of windows. I have yet to have 1 crash.

[Attachment: hwinfo.jpg]

0 Likes

Hi,  

 

Thanks for posting this. I enabled Above 4G Decoding on my 9900K Z370 motherboard and now my RX 6900 XT runs perfectly. I couldn’t even get past the title screens of Control and Forza Horizon 4 until I did this. I was almost going to send it back!

Update:  

Since I put that really old OCZ 1000 W PSU in, I have been stable.

I went out and purchased a be quiet! Dark Power 1500 W PSU to replace the older OCZ with something newer (and with room for larger cards after the 6900 XT).

 

So far the 1500w has been stable as well.  No bsod, no lockups, been working great. 

If you have an unstable system with a 6900xt I highly suggest you start looking at a PSU above 1000w. 

 

0 Likes

I had issues like this with the reference 6900 XT GPU. I returned it and purchased the Red Devil 6900 XT because I read it had better stability due to more VRMs, which is absolutely true.

Since moving to the Red Devil I've had no issues whatsoever (even in OC mode) in the exact same PC.

On the reference 6900, reducing the voltage by 5% should help. When I did it, I somehow ended up with higher benchmark scores in 3DMark than at 100% voltage.

My power supply is the Corsair RX 1000x. I'm not going to suggest you change GPU or PSU if you have this problem, as I think this can be resolved with the AMD software or a driver update.

Just to add, my reference card was hitting a GPU clock of 2480, which might have been why it kept freezing. When I reduced the voltage it was still over 2300 but completely stable.

Also, if you're interested, the Red Devil hits over 2500 on the GPU clock in the default OC mode and the memory is over 2000; this is completely stable (again, this far exceeds the advertised speed, and coincidentally my PSU keeps up). So, having fully compared a reference 6900 with a 6900 with more VRMs (in the same PC), it seems to me the reference GPU is the issue and not your RAM/PSU/CPU or motherboard.

0 Likes

I was having the same problem on my 6800 XT.

Unplugged my second monitor and the problem was gone.

Not saying this is your problem, but it was mine.

Main monitor is 144 Hz.

Second was 60 Hz.

Did you find a fix at all? I'm thinking about speaking to Scan, but they will most likely refer me to AMD/Gigabyte.

0 Likes

I have tried all the solutions above and none of them has solved the hard crash issue. My GPU crashes approximately every 90 minutes. I only play Call of Duty Warzone.

Proposed solutions tried:

  • Removing the drivers with the AMD Cleanup tool and installing them again
  • Disabling fast startup in Windows
  • Using separate cables from PSU to GPU (I also changed my PSU from 750 W to 1000 W for this issue)
  • Updating Windows to 20H2 19042.804
  • Updating AMD drivers to 21.2.2

 

My hardware is:

GPU: Gigabyte Radeon RX 6900 XT
CPU: AMD Ryzen 7 3700X (watercooled, No OC)
RAM: 4x8GB G.Skill TridentZ Neo 3600MHz CL16
PSU: ROG Strix 1000G 
Mother Board: Gigabyte Aorus Master x570 Rev 1.1

I also have a Samsung G9 monitor, which flickers and shows horizontal lines when I use it at 240 Hz (but this is a different problem).

If anyone has found a solution which is not listed above, I would appreciate hearing about it.

Regards.

0 Likes

I have an update on my 6900 XT.

It is degrading badly, and it gets worse every single hour.


First three days:
Heavy benching and gaming for almost 20 hours at 2.7 GHz. No problem.

Days 4 and 5:
2700 is not stable anymore, 2650 is not stable anymore.

Day 7:
Card barely runs at stock speeds and starts artifacting in games.

Day 10 (today):

Card needs to be downclocked almost 500 MHz to be stable enough to display my desktop.

Changing memory clock speeds results in a full system lockup, and the card can't play a game at clock speeds north of 1700 MHz.


I'm just waiting for the 3070 that I ordered for 170€ above MSRP, and I'll sell or return every single thing I have with the name AMD on it.

I have easily spent 2 grand on AMD hardware over the last two years and I have never had a SINGLE DAY without massive problems,
from corrupted operating systems to crashes, reboots, black screens... a 5900X that can't even run AIDA64 at stock speeds without BSODs.

I am finally done with AMD,
and I'll never look back... even if Zen 4 wipes the floor with Intel and costs less than half the price.

0 Likes

@K0NG95 Well, that's too bad about the 6900 XT. But you have to agree that when you overclock, you have to accept the outcome. AMD only rates the clocks to 2015 MHz for 'gaming applications', which would imply something I guess.

0 Likes

What outcome?
I can't increase VDDCI or VDDCR. The current limits and temperature limits are hard-locked.

There is nothing I could have done to damage the card, and especially not in a few days (not even in a 24/7 LN2 session at something like 400 mV above stock).


The card runs stock at around 2350 MHz in games and 2200 MHz under heavy synthetic load.
It's just a broken card, of a kind that seems to exist more often than it should.

0 Likes

Your card is faulty. That happens; my 6800 XT was too. I got myself a 6900 XT instead and it's working great. The rest isn't related. Being a first buyer of early cards seems to be risky, as it always has been.

By the way, about the CPU: a lot of people coming from Intel seem to think it must run like their Intel hardware, so they start touching BIOS settings. That is why there are so many topics about voltages and temps, and black screens caused by RAM timings from personal tweaking, where people are sure they are right and don't even try to rethink. Then come RMAs of fully working hardware, 90% of them due to lack of information or misinformation. I have even read a lot of wrong or inaccurate information on reviewer websites, not to mention the comments on some retailer websites. I saw one guy complaining about coil whine and black screens on a 6900 XT, with 20 thumbs up from people who don't have the hardware to test for themselves. Then in the description of his config you realise he has a 750 W PSU, which doesn't even meet the minimum requirement, and the guy was proud to say he had RMA'd two 6900 XTs with the same problem. That's just crazy.

Most of the time, problems come from people thinking things have to work the way they expect. And when you see forums flooded by people saying "me too" without trying to understand how their new hardware works, it spreads virally, which is stupid.

Not sure I'm being clear enough. I'm not being aggressive, just saying we are in a specific situation now.

0 Likes

I understand people have problems with their graphics card, but maybe the problem is something else: your CPU or the software/drivers. So if your PC crashes a lot, try this (it won't hurt your PC).
I say this because I have had all kinds of issues myself since I bought a 5900X CPU and a 6900 XT GPU, with the Kernel-Power 41 error, which I took to be a power issue. However, my PSUs are 750 W and 1000 W (and the 1000 W is brand new).
Somehow, if I use factory settings in the BIOS and leave CPU Vcore on Auto, the computer is unstable, as if the software that controls it doesn't work well enough. But if I tune the voltage manually, the computer becomes stable.

1). Restart the computer and go into the BIOS.
2). Press F2 to start BIOS advanced mode (or select advanced mode some other way).
3). Find where it says CPU Vcore AUTO and change Auto to 1.375 V.
4). Save and exit the BIOS.

You can also edit the Windows power plan: set the PCIe graphics card so it never powers down or sleeps.
Turn off hard drive sleep.
Turn off hibernation and sleep mode for the computer (set to 0).
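The sleep, hibernate and disk timeouts above can also be set from the command line with `powercfg /change`, where 0 means "never". A small sketch that only builds and prints the commands follows; `never_sleep_commands` is a hypothetical helper, and the AC-power variants are an assumption that suits a desktop PC. The PCI Express Link State Power Management option is not covered here and is easiest to change in the advanced power plan settings.

```python
def never_sleep_commands():
    """powercfg calls that set sleep, hibernate and disk timeouts to 0 (never) on AC power."""
    settings = ["standby-timeout-ac", "hibernate-timeout-ac", "disk-timeout-ac"]
    return [["powercfg", "/change", name, "0"] for name in settings]


for cmd in never_sleep_commands():
    # Printed only; paste into a Windows command prompt to apply.
    print(" ".join(cmd))
```

These only change the active power plan, so they are easy to undo by setting the timeouts back to their previous values.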

Personally, I have had crashes since I bought the new hardware in December, and I think my CPU is slightly defective, which results in system crashes, but the steps above have made it usable.
I suspect that my CPU is defective or the drivers have bugs, since my computer also crashes with an Nvidia RTX 2080 card and the 5900X CPU.
New CPUs are not expected in stores until April or later, so I can't RMA the CPU until then.

0 Likes