Rig Showcase Discussions

whiskey-foxtrot
Forerunner

Re: Oscar Mike

colesdav: As with CPU + memory OC'ing, I also tend to work on the CPU first to find the breaking point and step back down. Once that's stable, I focus on the memory. Same here with GPUs.

So far, increasing the power limit has been one of the first steps I take before any overclocking or tuning, which currently can only be done via custom profiles. In a previous run, I copied the settings used in "Turbo mode", applied a custom fan curve, and increased the power limit. Going back to the results I posted above, I simply increased the core clock until I discovered a breaking point. I actually have more results now which include balancing undervolting against gradual core clock increases while keeping the measured amperage on the PCI-e cables MUCH lower than stock settings (Balanced and Turbo modes), saving about 8 - 10 amps in the process! I'm working on a few different projects, but I will have several articles covering each component of performance tuning.
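If you want to script the bookkeeping for this kind of sweep, here's a minimal sketch of the "raise until it breaks, then step back" loop I described. The numbers are illustrative placeholders, and run_stress_test is a hypothetical hook: in practice you apply the clock by hand in WattMan and record pass/fail yourself.

```python
# Minimal sketch of the "raise until it breaks, then step back" approach.
# All values are illustrative placeholders, not settings from my card.

def run_stress_test(core_mhz, core_mv):
    """Hypothetical hook: apply the settings (e.g. in WattMan), run your
    benchmark, and return True if it completed without crashing/artifacting."""
    raise NotImplementedError("apply settings manually and record pass/fail")

def find_core_limit(start_mhz=1500, step_mhz=25, core_mv=1050, max_mhz=1750):
    """Walk the core clock up until the first failure, then back off."""
    last_stable = None
    clock = start_mhz
    while clock <= max_mhz:
        if run_stress_test(clock, core_mv):
            last_stable = clock      # this clock survived; try a higher one
            clock += step_mhz
        else:
            break                    # breaking point found; stop stepping up
    return last_stable               # highest clock that passed the test
```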

I honestly haven't watched Gamers Nexus since the Ryzen Threadripper cooling solution video; I'll probably take a look at it soon.

As for MCLK and GCLK being proportionally linked, that seems to be correct so far; I'll post some links on this later in the day. It's almost 4 AM here and I've been running almost every single scenario on this card for the last 4 hours or so. I will post various numbers to address the other points you mentioned, which are exactly the steps I ended up with for the 8 - 10 amp lower draw, lower temps, and increased performance (FPS in synthetics and some games).

As for the BIOS tweaks, I hope they become a user-customizable option; I do not like being locked in, especially when I have a dual-BIOS option. Making any changes right now renders the BIOS useless.

colesdav
MVP

Re: Oscar Mike

I think you need to take a break ... I got pretty tired yesterday; I started making mistakes and decided to get away from my PCs.


For the Vega GPU, I think maximizing the HBM2 frequency should definitely be the first thing to try.
It gives the biggest performance benefit for a minimal power increase.

Looking at your scaling numbers, just imagine what new 1100 MHz HBM2 might do for the Vega 64 if it comes out sometime soon ...
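To put rough numbers on that: bandwidth scales linearly with memory clock. A quick back-of-the-envelope estimate, assuming Vega 64's commonly quoted 2048-bit interface and double data rate:

```python
# Rough HBM2 bandwidth estimate, assuming a 2048-bit bus and DDR.
BUS_WIDTH_BITS = 2048

def hbm2_bandwidth_gbs(mem_clock_mhz):
    # MHz -> Hz, x2 for double data rate, x bus width in bytes, -> GB/s
    return mem_clock_mhz * 1e6 * 2 * (BUS_WIDTH_BITS / 8) / 1e9

print(hbm2_bandwidth_gbs(945))   # ~483.8 GB/s at the stock 945 MHz
print(hbm2_bandwidth_gbs(1100))  # ~563.2 GB/s at 1100 MHz (~16% more)
```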

I really appreciate all of the testing you have been doing.
Everyone else interested in buying an RX Vega should be thanking you for your valuable information as well.

For Overclocking / Undervolting information:
The Buildzoid videos on Vega 56 overclocking are the most important ones to watch.
I have provided a brief summary of his videos and the results in one of my posts somewhere ... maybe I will post it here again in a simpler form.

RE: As for the BIOS tweaks, I hope that it becomes a user customizable option; I do not like to be locked in, especially when I have a dual-bios option. Making any changes right now renders the BIOS useless.

I was thinking that AMD may need to send out a VBIOS update, one that only works on already-released Vega cards, if it turns out the cards could in fact be running at a lower voltage.

I do not think a user-customizable VBIOS is going to happen. The BIOS is locked on the RX Vega cards as far as I know, and it is likely to stay that way.
I think the reason might be as follows:
AMD are making a multipurpose GPU with Vega.

The same chip is used for Gaming, Frontier Edition, Workstation, and Instinct.
From what I have seen so far, all of the RX Vega and Frontier Edition cards have PCBs with identical VRMs. This may also be the case with the Workstation and Instinct models.

I do not know this for certain, but I think all of the Vega PCBs might be exactly the same across all of the Vega products to keep costs down.
So think about that for a moment.
If the BIOS were not locked, you could in theory ...

(1). Transform your Vega 56 to pretty much the same performance as a Vega 64.
(2). Turn your Vega 64 into a two-slot-wide WX Vega 64 8GB ...
(3). Turn your Vega into a Vega Frontier Edition 8GB.
(4). With a little hardware modding on top (an SSD / PCIe NVMe drive) ... possibly turn it into a Vega 64 8GB Radeon Pro SSG?

If any of the above were true (and not many home users would likely attempt it), it could cause havoc for AMD's pricing and marketing structure if someone with a factory somewhere decided to try it for real.

Maybe some scheme could be worked out to allow BIOS modding just for overclockers, one which would not allow transforming cards into a different "model", but I have not thought that through.

Cheers.

whiskey-foxtrot
Forerunner

Re: Oscar Mike

I think I'm going to hold off for the time being until I have a better idea of driver stability and how it will impact these cards (or until my warranty/exchange period ends).

"Does it look like the Memory Frequency Scaling Numbers from BuildZoid/Gamers Nexus (Based on a Vega 56 Overclock) that I posted are actually applicable to the Vega 64 card? "

Yes, up to a point, but there seems to be very little headroom for worthwhile changes, and the results are very inconsistent within the same app being tested. What IS consistent, however, are some of the "bugs": if you make ANY manual change to the GPU frequency (entering the number instead of using the percentage slider), the memory frequency drops to 800 MHz or 500 MHz.

So I can set 1750 MHz for both available slots and it will complete Time Spy

IF the memory frequency does not exceed 1105 MHz (at 1050 mV), and

IF the GPU voltage is 1150 mV or greater.

But since I manually entered 1750, the memory frequency drops to 800 MHz.

(A GPU clock of 1650 MHz gives a memory frequency of 500 MHz.) This is consistent; I think it's a bug in the WattMan software. Either way, it's also consistently pulling anywhere from 20 to 26 A through the PCI-e cables (measured with my Fluke 355 meter).

What's also consistent: increasing the HBM2 memory frequency gives higher scores in DX12 synthetic benchmarks; however, the frequency has to be entered manually at 1100. If you use WattMan's "Max" setting of 1100 MHz, it's an instant crash. Again, I think this is a bug in the software.

Luxmark and Keyshot also benefit from the higher memory frequencies as expected; Blender isn't affected much.

Anyway, going back to some of the tests with Time Spy:

Run | GPU Core | GPU (mV) | Mem (MHz) | Mem (mV) | GT1 (FPS) | GT2 (FPS) | Score
----|----------|----------|-----------|----------|-----------|-----------|------
1   | 1500 MHz | 1050     | 1100      | 1050     | 42        | 36        | 6403
2   | 1600 MHz | 1050     | 1100      | 1050     | 43        | 38        | 6621
3   | 1650 MHz | 1050     | 945       | 1050     | 51        | 40        | 7413
4   | 1650 MHz | 1050     | 1000      | 1050     | 52        | 41        | 7510
5   | 1650 MHz | 1050     | 1100      | 1100     | 50        | 40        | 7234
6   | 1650 MHz | 1050     | 1100      | 1100     | 50        | 40        | 7327
7   | 1675 MHz | 1050     | 1100      | 1100     | 45        | 40        | 6948
8   | +0 %     | 1050     | 1100      | 1100     | 54        | 42        | 7705
9   | +3 %     | 1050     | 1100      | 1100     | 54        | 43        | 7843
10  | +5 %     | 1050     | 1100      | 1100     | FAIL      | FAIL      | FAIL
11  | +5 %     | 1100     | 1100      | 1100     | FAIL      | FAIL      | FAIL
12  | +5 %     | 1150     | 1100      | 1100     | 56        | 43        | 7999
13  | +5 %     | 1145     | 1100      | 1100     | 56        | 43        | 8003
14  | +5 %     | 1135     | 1100      | 1100     | 44        | 20        | 7986

(Runs 1 - 7: core clock entered manually in MHz; runs 8 - 14: core clock set with the percentage slider. GT1/GT2 = Time Spy Graphics Tests 1 and 2.)

I used wide bracketing, so there's room for fine-tuning, specifically for Time Spy, but the results are too close to even bother. These scores can be viewed side-by-side here as well: Result
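To make the "too close to bother" point concrete, here's a quick look at the memory scaling hidden in those runs, using the scores from the table above for the runs sharing the 1650 MHz core entry:

```python
# Time Spy scores from the table above for runs at a 1650 MHz core clock,
# keyed by memory clock in MHz (runs 3, 4 and 6; run 5 scored 7234 at the
# same settings as run 6, which shows the run-to-run spread).
runs = {945: 7413, 1000: 7510, 1100: 7327}

base_freq, base_score = 945, runs[945]
for freq, score in sorted(runs.items()):
    d_freq = (freq / base_freq - 1) * 100
    d_score = (score / base_score - 1) * 100
    print(f"{freq} MHz mem: {d_freq:+5.1f}% clock -> {d_score:+5.1f}% score")

# Output: +5.8% memory clock gains ~+1.3% score, while +16.4% memory
# clock actually *loses* ~1.2% -- the inconsistency described above.
```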

qwixt
Forerunner

Re: Oscar Mike

Can it play Minecraft at 60 fps?

Sorry, I had to. Nice looking system. Is this a dual-pump system? Trying to see what the 3 fans on the bottom are cooling.

whiskey-foxtrot
Forerunner

Re: Oscar Mike

You laugh, but Minecraft can be a PITA.

I built this in early April in anticipation of RX Vega, so there are 2 pumps in the system for separate CPU and GPU loops. Right now only one is hooked up, so it's pushing from the lower rad to the GPUs, then to the top rad, and down to the CPU.

whiskey-foxtrot
Forerunner

Re: Oscar Mike

So around 474 W at peak with basic undervolting while maintaining performance. I know I can safely shave off another 3 amps, which would be around 36 Watts.

My EVGA 1080 SC3 at the same spot draws 410 W (with an OC), but it also scores lower (not by much, but lower).

So technically, the Vega 64 is only about 10 - 20 Watts more.
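For anyone checking the arithmetic: the amp figures in this thread are measured on the 12 V PCI-e cables, so the watt conversions are simple multiplication. A sketch assuming a nominal 12 V rail (the real rail drifts a little):

```python
PCIE_RAIL_V = 12.0  # nominal rail voltage; the real value drifts slightly

def cable_watts(amps):
    return amps * PCIE_RAIL_V

print(cable_watts(3))                   # ~36 W  -> the "another 3 amps" above
print(cable_watts(8), cable_watts(10))  # ~96-120 W -> the 8 - 10 A saved earlier
print(cable_watts(26))                  # ~312 W through the cables at 26 A stock

# Wall-power comparison from this post (before the extra 3 A savings):
vega_w, gtx_w = 474, 410
print(vega_w - gtx_w)              # 64 W difference at the wall
print((vega_w / gtx_w - 1) * 100)  # ~15.6% -- the "15% more power" figure below
```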

(Photo attachment: 20170828_191937.jpg)

This reading was from the second-to-last test in the table above, the one with the "8003" score.

colesdav
MVP

Re: Oscar Mike

Hi,

Thanks for trying that very interesting and valuable experiment.

I might be losing track of what you have done exactly, and I do not want to make assumptions.

Could you please respond to my statements / questions below and fill in the (?) so I can make sure I understand what you have done?

I have numbered the questions / points to save any confusion.

1. The power consumption was originally ?

2. HBM2 CLK  = ?

3. GCLK = ?

4. You undervolted from ? to ?

5. The undervolt allowed you to keep the GCLK and HBM2 CLK the same at  ? ?

6. The power consumption dropped from ? to 474 Watts.

7. You saved a total of ? Watts. 

8. You think you might be able to save another 36 Watts, but have not proven it yet. Is this because a GCLK stability issue appears?

9. You just beat the GTX 1080 graphics score with a power draw of 474 W, whereas the GTX 1080 in your example is pulling 410 W. Correct?

10. Based on the numbers you achieved: the RX Vega is now pulling ~15% more power than the GTX 1080 to achieve the same performance in this test.

11. What is your opinion on the undervolt you have applied? Why do you think this is not the default out-of-the-box setting for your card?
12. Do you know anyone else with a Vega who could try the same experiment with their RX Vega 64 to see if they could get similar results?

13. What settings did you run your GTX 1080 with? Do you think that card could be undervolted to the same extent? If not, why not?

Thanks again.

Sorry I have not been providing any OpenCL / rendering data on the R9 Nano (Fury X) for the past few days; I have done some more work on it, but I am very busy with other work.

Cheers.

whiskey-foxtrot
Forerunner

Re: Oscar Mike

I posted it, but it's still being moderated for some reason. I'll respond when that comes up.

I also don't want to get ahead of myself - I have yet to put all my testing into one article (or at least get it broken up into several). I'll post all of that here as well. I also have a little over 6 hours of raw footage showing every step I've taken to test/adjust, etc.

As for the 1080, it could also be undervolted, but in my limited experience it also renders lower performance numbers. I say limited because all my Nvidia cards still have their plastic on them; I get them, play with them for a few hours, and then they sit there for possible future projects or giveaways.

1. The power consumption was originally: IDLE = 86 Watts; 26 - 28 A at the PCI-e cables in "Balanced Mode".

Once the table from my previous comment clears moderation, you can match 19 A at the first entry (the 1500 MHz run) against a high of 24 A on the second-to-last entry. Entries in between average in the low 20s (21 - 22 A).

2. HBM2 CLK = varies; see the table in the post above. I mostly kept the memory clock at 1100 MHz.

3. GCLK = varies, see table above

4. You undervolted from 1200 mV to 1050 mV.

5. The undervolt allowed you to keep the GCLK and HBM2 CLK the same at: clock readings are very inconsistent but stay within about 50 points. Stepping the voltage down (e.g. from 1150 to 1100 mV), GCLK decreased, but the actual graphics test results remained the same.

6. The power consumption dropped from 510 to 474 Watts.

7. You saved a total of ~30 - 50 Watts.

8. You think you might be able to save another 36 Watts, but have not proven it yet. Is this because a GCLK stability issue appears? Actually, using the "Turbo" settings as the max for comparison (~510 - ~515 Watts), I've increased performance and reduced overall wattage at the wall. You'll see 7 scores with manual input for the GCLK frequency and 7 scores using the percentage slider. There seems to be an inconsistency in how the actual frequency is reported to WattMan or other software: 5% on the slider roughly equates to 1675 MHz across all tests (see the back-of-the-envelope check after this list).

9. You just beat the GTX 1080 graphics score with a power draw of 474 W, whereas the GTX 1080 in your example is pulling 410 W? The GTX score was beaten at a slightly lower wattage at the wall; I've since stopped including the GTX in any testing/comparison. Again, this is in Time Spy; results will inevitably differ based on optimizations for either platform.

10. Based on the numbers you achieved: the RX Vega is now pulling ~15% more power than the GTX 1080 to achieve the same performance in this test.

11. What is your opinion on the undervolt you have applied? Why do you think this is not the default out-of-the-box setting for your card? Measurements (A/W power consumption) are higher on comparable default settings. You can't really tell what the stock settings are unless you first select a preset (Power Saver, Balanced, Turbo) and then switch to a custom setting; the custom setting seems to show the parameters of the last preset used. Manually using those settings but adjusting the voltage gives different performance numbers (usually slightly higher) and a lower wattage.
12. Do you know anyone else with a Vega who could try the same experiment with their RX Vega 64 to see if they could get similar results? I have access to 5 systems with RX Vega cards. Everything I've done here can be duplicated within a small margin of error.
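Back-of-the-envelope check on the slider behavior from point 8: if +5% on the slider consistently reads back as ~1675 MHz, and assuming the slider is a plain percentage over the top P-state clock, the implied base clock works out as follows:

```python
# If +5% on WattMan's slider reads back as ~1675 MHz, and the slider is
# a plain percentage over the top P-state clock (an assumption), then:
observed_mhz, slider_pct = 1675, 5
implied_base = observed_mhz / (1 + slider_pct / 100)
print(round(implied_base))  # ~1595 MHz implied base clock

def slider_to_mhz(pct, base_mhz=implied_base):
    return base_mhz * (1 + pct / 100)

print(round(slider_to_mhz(0)))  # ~1595 MHz (the 0% runs in the table)
print(round(slider_to_mhz(3)))  # ~1643 MHz (the 3% run)
```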

Other notes:

1. Out of 8 cards tested (all Radeon RX Vega 64), 0 can reach stability with the memory clock set above 1105 MHz, at any possible setting.

2. The WattMan software is currently not the most stable, which may result in incomplete or inaccurate data.

3. The Radeon RX Vega 64 in its current state does not allow much overclocking headroom, whether via a straight OC or via undervolting. Unless something changes with the BIOS options in the near future, I do not see this changing. Running the RX Vega 64 in "Power Saver" or "Balanced" mode will give the average user the best performance out of the box: decent performance, and power consumption that, while not ideal, is still better. You can undervolt (GCLK) up to about 1 or 2 A (or ~12 - 24 W) before you start noticing a significant drop in GPU frequency.

colesdav
MVP

Re: Oscar Mike

RE: I think I'm going to hold off for the time being until I have a better idea of driver stability and how it will impact these cards (or until my warranty/exchange period ends).

Sure, I do not blame you. The last thing I would want is for you to kill/degrade any of your RX Vega cards trying out these undervolting / overclocking experiments. The GPU core power when running at 1675 MHz might have been pretty high, for example.

Some RX Vega / Vega FE reviewers hit the same issue with setting the frequency in the Radeon Crimson GUI. I think that bug needs to be fixed before many people will look at overclocking and undervolting again in any more detail.

Anyhow thanks again for the data.

I will get on with completing the OpenCL testing some time this week.
I received a response from Compubench this morning about the Compubench testing and results.
I will be opening 3 separate discussions about that, though.
There are indeed some bugs / issues that need to be fixed when running Compubench 2.0 on AMD cards.

Cheers.

colesdav
MVP

Re: Oscar Mike

I forgot to mention: this might be important info for you if you are running the Time Spy DX12 benchmark at a high overclock and start to get very high scores.
If you watch the Buildzoid video here: Ramblings about VEGA 2: BIOS modding, power play tables, 2GHz core clock and performance scaling - Y...
He mentions a glitch on Vega when overclocking while running 3DMark Time Spy DX12; the discussion runs from 26:18/42:22 to 28:20/42:22.

He says he thinks the new "primitive discard" (I think he actually might mean the Draw Stream Binning Rasterizer ...) is misbehaving at higher overclocks.

He stated that "Vega starts to discard stuff that it really shouldn't be discarding".
He saw the following problems when running 3DMark Time Spy DX12:
(1). Display cabinets in the Time Spy DX12 benchmark started to have items in them go missing / turn up completely empty.
(2). Parts of the crystalline structure in the first graphics test were missing.
(3). Some of the floor decoration was missing / some parts of the floor weren't rendering.
(4). In summary, parts of the benchmark start to disappear as you overclock the card higher, leading to inflated 3DMark Time Spy DX12 scores.
(5). He thinks this will lead to interesting problems for HWBOT score submissions.
(6). He does not believe that 3DMark Time Spy DX12 will be able to check for this issue, as it looks like artifacting.

I do not know if you were watching your benchmark runs at the time, whether you might have seen this behavior and not mentioned it, or whether it only happens on Vega 56 cards, or maybe he had a bad sample.

Cheers.
