I have had a lot of problems with the RX Vega 64 Liquid Edition.
This is my third card (I RMA'd the other two before this one).
The problem is that the Vega card randomly causes a black screen / grey screen. It doesn't matter what I do or what I play; it happens at random and there is no pattern in the behavior.
I tried a lot of different drivers and did a clean install. I updated the graphics card VBIOS (now 016.001.001.000.008774) as well as the mainboard BIOS (now 1801), but nothing has helped so far.
What I would like to know: is this a compatibility issue between Vega and the X99 chipset, or only with this mainboard?
Or is this a serious but rare driver issue that AMD isn't aware of?
The black screen sometimes appears within 2 hours, but it can also take up to around 4 to 6 hours.
The other two Vega LC cards I had showed the same problem, so it is almost impossible that the card is defective.
What I tried:
I disabled hardware acceleration in Firefox Quantum.
I used DDU 17.8.1 for each driver switch.
None of the above fixed it.
Then I tried something else, as those black/grey screens got me thinking that it could be related to the graphics card memory,
especially because my desktop has the same grey background color as the crash screen.
After watching the HBM2 constantly clock up and down between 167 MHz, 500 MHz, 800 MHz and 945 MHz,
I locked the HBM2 memory at 945 MHz with WattMan, so it can no longer down-clock and stays static at 945 MHz.
The HBM2 memory doesn't drop its voltage; it always stays static at 1.356 V, so I don't have to change that, and WattMan doesn't offer a way to do so anyway.
The only side effect is that the GPU idle voltage is now 0.950 V instead of 0.800 V, but I don't mind that at the moment.
This seems to have fixed the issue that I had been working on for 8 weeks. I'm still testing it, but I already have more success with this tweak than at default settings, and the system has been stable for 2 days without a single black screen.
Maybe this can fix the black-screen problem for other people too.
I hope that the AMD driver/VBIOS team can take a look into this issue.
RX Vega 64 Liquid edition
It has now been stable for more than 4 days. I can pretty much confirm that at the maximum memory clock the system is completely stable.
I am now testing the memory clock locked at 500 MHz; that way the GPU core voltage can drop to its 0.800 V idle level instead of the 0.950 V it sits at when the memory is locked at 945 MHz.
This has also been stable for more than a day already, so I think there really is something going on with the frequency switching of the HBM2.
I think I have done enough testing for AMD to take a look at this culprit.
I tried disabling State 0 (HBM2 @ 167 MHz) and allowing only 500, 800 and 945 MHz, but got a grey-screen hard lock within 5 minutes. So it is really only stable when I set the frequency manually. It doesn't matter which state I pick, all of them are stable, but if I let it select the frequency dynamically it crashes within a matter of hours (if I'm lucky).
Is this a hardware or a driver issue?
Is there really no one around here who has the same kind of problem?
I have exactly the same issue and it has been driving me mad.
Using Vega 64 LE Air Cooled.
There really is no pattern, just a lovely blank screen (95% of the time grey). I have had other colours, but mainly grey. It can happen 5 minutes into a game or 5 hours in; there is nothing in the event log, and I have not been able to get any sort of crash dump file created. I too have tried clean installs of various drivers, amongst other things, to no avail.
FYI, my system is an older Sandy Bridge setup, so the only similarities are the Windows 10 version and 3840x2160 @ 60 Hz connected through DP.
I will try locking the HBM2 Memory as you suggest and will report back.
Just check WattMan after a restart or shutdown. It seems that WattMan sometimes loses its settings, and otherwise you might think that locking the HBM2 didn't fix it.
While I haven't been able to play/test as much as I normally would, I have yet to have a grey-screen lockup. I managed to play two sessions of 1-1/2 and 4 hrs without incident, so it is looking good so far. I'm hoping I can get on a bit more during the week.
I hope this is the reason; I'm testing it now.
I had terrible months with this error. My screen turned blue/grey/black when I played a game in a 1080p window and watched YouTube or browsed at a UHD @ 60 Hz (DP) resolution. It would crash anywhere from 10 minutes to 3 hours in.
I even had horrible testing sessions where I swapped several parts and tested 3 power supplies, up to a 1000 W Platinum be quiet!, and it didn't change anything.
I so hope this "temporary locking" might fix it, and if it does, then please,
AMD, try to fix it in the drivers or BIOS.
God bless you guys.
I am now 2 weeks stable with YouTube and 2 games running in windowed mode.
Normally I would have had at least 1 crash a day.
My system is completely stable with this temporary fix.
I hope this works for you guys too.
I had the same problem. Everyone is saying it's a faulty PSU, even though it runs fine at max load (CPU and GPU). It always happens during Path of Exile, and today it happened while watching Twitch.
Do I just go into global WattMan and set the memory speed min/max to 945 MHz? Also, what does that mean in terms of power consumption?
Well, I currently work with 2 modes that I control manually.
1. While on the desktop watching YouTube or browsing the web, I lock the HBM2 min/max on State 1 (500 MHz).
2. While gaming, I lock it min/max on State 3 (945 MHz).
This way you can still save energy. It is more work, but it's either that or an unstable system.
Maybe you can set the WattMan global setting as in suggestion 1 and make a profile for each game that you play, with the HBM2 locked min/max on State 3 in each game profile. I didn't try this method, as I don't really mind setting it manually each time, but maybe this works too.
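
For anyone who wants the two-mode idea written down, here is a rough Python sketch of the decision logic only. It does not talk to WattMan or the driver at all; the names `MemoryLock`, `pick_lock` and the example executable names are made up for illustration. The state numbers and clocks are the ones reported in this thread (State 2 = 800 MHz is inferred from the clock list, the thread only names States 0, 1 and 3 explicitly).

```python
from dataclasses import dataclass

# HBM2 states and clocks (MHz) as reported in this thread.
# State 2 = 800 MHz is an inference from the 167/500/800/945 list.
HBM2_STATES_MHZ = {0: 167, 1: 500, 2: 800, 3: 945}


@dataclass(frozen=True)
class MemoryLock:
    """Hypothetical helper: min and max pinned to the same state."""
    state: int

    @property
    def mhz(self) -> int:
        return HBM2_STATES_MHZ[self.state]


GLOBAL_LOCK = MemoryLock(state=1)  # desktop / YouTube / browsing
GAME_LOCK = MemoryLock(state=3)    # per-game profile


def pick_lock(running_apps, game_profiles):
    """Return the game lock if any running app has a game profile,
    otherwise fall back to the global (desktop) lock."""
    if any(app in game_profiles for app in running_apps):
        return GAME_LOCK
    return GLOBAL_LOCK


# Example with made-up executable names.
lock = pick_lock(["firefox.exe", "game.exe"], {"game.exe"})
print(f"lock HBM2 min/max at State {lock.state} ({lock.mhz} MHz)")
```

The point is that the memory never gets a range to move within: whichever mode is active, min and max are the same state, which is the only setup reported as stable above.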