[SOLVED] Vega 64 can't OC HBM2 memory above 950 MHz when 1150 MHz used to be stable.

xXxREBELOxXx

Hey all, as the title states, I used to be able to run 1150 MHz as a rock-stable 24/7 memory clock for years, and since overclocking the core is frustratingly useless, that was the one thing my Vega 64 ROG Strix OC had going for it. I reinstalled Windows while diagnosing an issue last year, and around that time I ran the card at stock for a while since I barely had time for games. Over the past few months I started clocking things up and playing games again, and now it will always crash at some point, without warning or artifacts, if I touch the memory clock at all. How long it takes is completely random and isn't game dependent. If it helps, I believe the issue only appeared after moving to drivers newer than mid-2021, though I don't recall whether it started with a specific driver version. I have tried at least 5-6 different ones, each progressively newer, with no change.
I set my memory clock to 950 MHz (a 5 MHz increase) and after a while (weeks, maybe?) I noticed it still hadn't reset, so it clearly hadn't crashed. I then set it to 960 or 980 MHz, and today when trying rFactor it crashed within 5 minutes (the driver, not the game, as usual). I think it's software related. I haven't tried DDU because it's a bother and I have a pretty new Windows install. I also didn't install the AMD chipset drivers as I had on my previous OS install, but I'm unsure how that could be related to this. I have also redone the paste with Thermal Grizzly Kryo and used Thermal Grizzly 3 mm pads on the VRMs, as the original pads didn't fully cover them. There were no issues after that, though, and the core, memory, and VRMs all ran much cooler afterwards.

I always need to restart my PC even though the AMD driver does recover. It'll still be bugged out, sometimes to the point that it'll continually black-screen and then recover every 30 seconds until a restart.

It's super annoying, and I hate running at 945 MHz when my card is capable of 1150 MHz stable and 1180 MHz bench stable. Thanks all, and hopefully someone knows what's going on or can help me figure this out.
 
Solution
Well, my Vega 64 was the same model as yours. I found I could leave the stock 1630 MHz the card came with, undervolt it, and get better performance. I ended up at 1680 MHz on the core at 1.025 V. I scored more at 1680 MHz than at 1740 MHz in Fire Strike, Time Spy, and even Heaven; even 1630 MHz beat 1740 MHz. I started to lose performance at anything above 1680 MHz, and my wall on that card was 1.1 V. At anything higher I seemed to lose performance even when leaving the clocks at stock, so there is some sort of power limit somewhere.

I was able to get to 1100 MHz on the memory but couldn't really go higher without some strange artifacts here and there or some games just crashing.

The Vega 64 is a weird card. It likes being undervolted more than being overclocked. 1.2 V is entirely too high for that card at stock in my opinion; everyone could get sub-1 V pretty easily on that thing.

I'd definitely research undervolting on the Vega 56 and 64 a little. It's interesting.
 
Nice! Yeah, I also found 1.1 V to be the max voltage before the card loses performance when clocking the core, and I believe my core clock was best in the 1660-1680 MHz range for performance as well. I haven't cared too much about core clocks since, because I find auto clocks and voltage with good temps use similar voltages and net me around 1600 MHz under load. The reason performance degrades over 1.1 V is that the max power draw of the card is 330 W (with a 50% PL): 1.1 V x 300 A (the BIOS max) is 330 W. Asus is the only company, IIRC, that BIOS-locks the VRMs to 330 W and 300 A, so even Power Play tables would not exceed this, which sucks because I would have liked to go higher since my cooling allowed for it. These cards are super finicky though, I agree. I don't like boost clocks and miss the days of static max clocks lol

At this time the only thing I'm concerned about is the memory clock, which seems to be broken for me. I mean, has anyone heard of a Vega 64 that couldn't go over 950 MHz? Let alone one that was once 24/7 stable at 1150 MHz? It's so weird, and I know it's software related. Honestly, it might be the drivers after mid-2021, when they also changed the registry Power Play tables so you couldn't mess with the BIOS. I might try an older driver, but I'd like to see first whether anyone knows about this or has had this issue and fixed it.
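
The power-limit reasoning in this post can be checked with a little arithmetic. The sketch below is a simplified model, assuming only the two figures quoted in the post (a 330 W power cap and a 300 A current cap) and treating power as voltage times current. The function names are made up for illustration, and a real card has other limits that this ignores.

```python
# Figures quoted in the post; everything else here is an illustrative assumption.
POWER_CAP_W = 330.0    # max board power with the +50% power limit
CURRENT_CAP_A = 300.0  # BIOS current limit


def max_current_a(voltage_v: float) -> float:
    """Highest current allowed at a core voltage when both caps apply."""
    return min(CURRENT_CAP_A, POWER_CAP_W / voltage_v)


def max_power_w(voltage_v: float) -> float:
    """Power drawn at that highest allowed current (P = V * I)."""
    return voltage_v * max_current_a(voltage_v)


# Print the current and power headroom across a few core voltages.
for v in (1.0, 1.05, 1.1, 1.15, 1.2):
    print(f"{v:.3f} V: up to {max_current_a(v):.1f} A, {max_power_w(v):.1f} W")
```

Under these assumptions the two caps meet at 1.1 V. Above that, the 330 W cap is what binds, so at 1.2 V only about 275 A is available, which would fit the performance loss seen above 1.1 V.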
 
Yeah, I've seen some people not able to get more than 1050 MHz, but every single card seems to go over 1000 MHz at least. Maybe it's a driver issue. You might have to run DDU, or it could even be an outright bad driver; that's my only guess at this point.
 
I was thinking driver updates could be the issue. Rolling back the driver would likely cost as much performance as rolling back the overclocks will.

There is only one way to find out! Start with the most recent driver that worked well, if you remember which one that was. If not, start at 20.10.1 and go from there.

I don't see the memory degrading at 60 °C either, but it can happen. Maybe add more voltage to the RAM?
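
If the last good driver isn't remembered, one way to narrow it down with fewer reinstalls is a binary search over the releases instead of stepping through them one at a time. This is a minimal sketch and not something from the thread: it assumes that once a release breaks the memory overclock, every later release does too. `first_bad_driver` and `is_stable` are hypothetical names, and the version labels in the demo are placeholders, not real release numbers. In practice `is_stable` would be a manual step (install the driver, apply the overclock, play for a while).

```python
from typing import Callable, Optional, Sequence


def first_bad_driver(
    versions: Sequence[str],
    is_stable: Callable[[str], bool],
) -> Optional[str]:
    """Return the oldest version that fails the stability check.

    `versions` must be ordered oldest to newest. Returns None when every
    version passes. Assumes all versions after the first bad one are bad too.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_stable(versions[mid]):
            lo = mid + 1   # first bad version must be newer than mid
        else:
            hi = mid       # mid is bad; first bad is mid or older
    return versions[lo] if lo < len(versions) else None


if __name__ == "__main__":
    # Placeholder labels, oldest first; pretend the last three are unstable.
    releases = ["rel-1", "rel-2", "rel-3", "rel-4", "rel-5", "rel-6", "rel-7"]
    unstable = {"rel-5", "rel-6", "rel-7"}
    print(first_bad_driver(releases, lambda v: v not in unstable))
```

With seven candidate releases this needs at most three install-and-test rounds, where a one-by-one walk could need up to seven.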
 
Okay, so it's been a bit. It took me a few months after the post to figure out what's going on: it is indeed the newer drivers and not my card, so @spentshells, you were right. What AMD specifically did was change the ULPS behavior for Vega in an update, which made overclocks unstable while making no noticeable difference in power consumption. I looked in the registry to turn it off, but since AMD patched BIOS modding to make it harder, ULPS must have been affected too, because I was not able to fix it through the registry. I did, however, find a solution after continued testing: I set the minimum memory clock to 800 MHz. This never allows the card to enter ULPS, and it's beautiful. I'm able to hit my previous clocks, though I'm running a mild 1640 MHz core and 1120 MHz memory so I can run lower voltages. So I got my 10% performance increase back lol.

Thanks everyone for helping out, and hopefully this post helps some other classy individuals with a Vega 56/64!