Question R9 295x2 Issue - One Card Being Utilized

Mar 31, 2019
Hey all, I've got a Sapphire r9 295x2 card that I need some advice on how to fix. The card has undergone fairly moderate use and is a few years old, as a preface. Right now, I have a couple issues with it:

1) Progressively worse overheating that causes an immediate PC shutdown, as if the machine were unplugged. The card shows few other symptoms: no artifacting, and only a couple of BSODs in the past (not recently, and not in tandem with the overheating) from driver failures that didn't correlate with new driver installs. I can try to grab more info on this if relevant (I'd need instructions on how). Today was the worst: idle temps were 65C on the first GPU and 45C on the second - more on that below. When loading a simple 1080p stream, temps spiked to 80C+ and then the PC shut off; the second GPU remained inactive, see 2). I'm getting more compressed air to clean out the dust, as some is present. Given these temps, I'm wondering if I should reapply thermal paste, since it looks like a fair amount is left but it isn't covering the entire core on either GPU.

2) Only one of the two GPUs appears to be working, at idle and under any amount of load, and it has been this way for a good while now. (I don't use the card heavily anymore, so this hasn't been a major issue; both functioned fully in the past. I have mined some with this card, and the moment one of them stopped behaving properly was during that period. I recall temperatures around 85-90C while mining, and no forced shutdowns ever occurred then.) Both GPUs are recognized in Device Manager as functioning properly. Crossfire is on; I haven't tried turning frame pacing off yet. Wattman (Adrenalin 19.3.3 - previous versions made no difference; maybe a version before 17.x would work? Not sure where to get one) shows both GPUs running with the same settings: 1250 MHz memory and 1018 MHz core. The first GPU runs at exactly those clocks and functions properly in every scenario. The second GPU sits at 300 MHz memory and 150 MHz core (I might have those two values reversed) and idles around 45C, as mentioned earlier. I have tried adjusting Wattman settings to give preference to the second GPU, with no change. Tomorrow, after a preliminary cleaning of the card without new thermal paste, I'll try reseating it with one of the two 8-pin PCIe cables in a different port on my PSU, as I read that may be a possible fix for this issue.

I want to see if a basic cleaning of the card without new paste, a switch of PCIe ports on the PSU, and possibly an older version of Wattman/the drivers will fix these issues. I'm worried the symptoms seem a bit extreme and that the card is dying, as I definitely don't have the funds for another GPU. Please let me know your thoughts.
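In case it helps with gathering "more info": one low-effort way to document the asymmetry between the two GPUs is to log sensors to a CSV (for example with GPU-Z's log-to-file option) and then summarize each GPU's readings. A rough sketch; the column names here are hypothetical and would need to match whatever header your logging tool actually writes:

```python
import csv
import io
from statistics import mean

def summarize_log(csv_text, columns):
    """Average the requested sensor columns from a CSV sensor log.

    columns: list of column-header names to summarize (the names used
    below are hypothetical; match them to your tool's actual header).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {col: round(mean(float(r[col]) for r in rows), 1) for col in columns}

# Hypothetical two-GPU log: GPU1 hot and clocked up, GPU2 cool and stuck low,
# mirroring the readings described above.
sample = """\
GPU1 Temp [C],GPU1 Core [MHz],GPU2 Temp [C],GPU2 Core [MHz]
64,1018,45,150
66,1018,45,150
81,1018,46,150
"""
print(summarize_log(sample, ["GPU1 Temp [C]", "GPU2 Temp [C]", "GPU2 Core [MHz]"]))
```

A summary like this, taken once at idle and once during the 1080p-stream test, would make the spike on the first GPU and the flatline on the second easy to show in a post.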

Specs:
Windows 10 64bit
AMD Ryzen 7 1700, 3.5 GHz @ 1.27 V
G.Skill Ripjaws 2x8 GB DDR4
Seasonic Focus+ Gold 850W PSU
MSi B350 Gaming Pro MB (BIOS is current)
PNY 120GB SSD
 

cdrkf

Honorable
On your first point about overheating: it's worth changing the thermal paste. If the card is a few years old, the paste may well have dried out a bit, so even if there appears to be plenty, it might not be transferring heat efficiently to the heatsink, hence the overheating.

The other thing you can try to keep heat under control is undervolting the core: using the overclocking software, keep speeds at stock and dial the GPU voltage back as low as it will go while the GPU stays stable in games. Lower voltage really helps bring heat down.

With respect to only one GPU being used: in most cases that is normal on a dual-GPU card. On the desktop, or in any title without Crossfire support, only one of the two GPUs gets utilised. You have to be running a 3D application/game with explicit Crossfire support for the second GPU to kick in. As a test, try firing up something you know supports Crossfire (e.g. 3DMark) and you should see the second GPU kick in; it will also be very apparent from your score whether it's working or not (it could be a case of Wattman not reporting correctly). The sad thing is that many of the latest games no longer have Crossfire support, as dual GPUs have pretty much been dropped since DX12 started gaining traction (in DX12, game developers can support multiple GPUs directly themselves; Crossfire is only for DX11 titles).
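If you log clocks during such a test, a quick sanity check for "did the second GPU kick in" could look something like this. The margin and the clock values are purely illustrative (taken from the numbers reported earlier in the thread), not AMD specifications:

```python
def second_gpu_kicked_in(idle_mhz, load_mhz, margin=1.5):
    """Heuristic: treat the second GPU as active under load if its core
    clock during the benchmark climbs well above its idle clock.
    margin is an arbitrary illustrative threshold, not a vendor value."""
    return load_mhz > idle_mhz * margin

# GPU2 reportedly idles around 150 MHz core.
print(second_gpu_kicked_in(150, 1018))  # healthy: ramps toward the 1018 MHz target -> True
print(second_gpu_kicked_in(150, 150))   # stuck: never leaves idle clocks -> False
```

If the second GPU's clock never ramps during a Crossfire-aware benchmark, that points at the card or drivers rather than at Wattman's reporting.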
 
Mar 31, 2019
Alright, I'll report back Thursday after changing the paste and undervolting the core if needed.

The most notable thing I have to point out about the dual-GPU usage is that when running mining programs or Folding@home, the second GPU underperforms severely. I can dedicate the second GPU to mining/folding while using the first for all other desktop tasks, and even then the second GPU does very little (no different from its performance when using both at once). If I fold on just the first one, the second idles while the first folds at the same level it always has while also handling all other desktop tasks. Beforehand, both GPUs performed equally when stressed. It's like the second one got capped at 10-20% of its previous performance, so even when I do try using it explicitly, it doesn't do much.

I feel like this is a driver issue, since it started happening suddenly one day and hasn't recovered with any newer drivers since. I remember it occurring when I switched to a newer version of NiceHash back in mid 2018; before that, on older versions of NiceHash (and the same version of Folding@home), both GPUs did equal work. What I'm not sure about is whether I updated the AMD Radeon drivers at the same time, but if one program updated and the other didn't while the GPU underperforms in both, that seems to point toward the drivers or the GPU itself.
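If I still had per-GPU rate history from mining, pinning down exactly when the cap started would be easy to automate. A small sketch with entirely made-up numbers (the dates, rates, and 50% threshold are hypothetical, just mirroring the mid-2018 drop described above):

```python
def find_regression(samples, threshold=0.5):
    """Return the first label where gpu2's rate falls below
    threshold * gpu1's rate, i.e. where the cap seems to begin.

    samples: list of (label, gpu1_rate, gpu2_rate) tuples; all values
    here are hypothetical, in arbitrary hash-rate units.
    """
    for label, g1, g2 in samples:
        if g2 < g1 * threshold:
            return label
    return None

# Made-up mining-rate history illustrating a sudden one-sided drop.
history = [
    ("2018-04", 28.0, 27.5),
    ("2018-05", 28.1, 27.9),
    ("2018-06", 27.9, 4.2),   # second GPU suddenly at ~15% of the first
    ("2018-07", 28.0, 4.1),
]
print(find_regression(history))  # -> "2018-06"
```

Matching a drop like that against driver-install and NiceHash-update dates would show which change lines up with the regression.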

I'll try running 3DMark once I can safely test temperatures again, to see if the same thing happens. I'm fairly certain it will, since I've already tried explicitly using the second GPU with these programs (it's always the same one that underperforms; they don't switch roles).
 
Mar 31, 2019
Ok, after much frustration with small screws and putting everything back together, my overheating problem is now worse. I think I know what's wrong, though: the closed-loop cooling system isn't cooling the first GPU. The cooler's fan output is cold, but that GPU continues to overheat (while the second one keeps idling at 40C). When I first boot the PC, the cooling system is immediately loud, with the sound of liquid circulating/rushing and a fast fan speed. How do I get the cooling loop working on the first GPU again?

Edit: It seems like I'm out of luck. All of the water blocks made for the 295x2 are discontinued, and I couldn't find any on sale (besides a sketchy one from China). I'm extremely disappointed; the only way I could keep using this card is either by fixing the pump or MacGyvering some other setup, and I wouldn't even know where to begin.
 
Mar 31, 2019
B series mobos don't support Crossfire. This is an odd situation with the 2 GPUs on a single card (I thought it presented itself as a single GPU?), but that might be your problem.
I don't recall seeing whether MSi claimed Crossfire support, but you're right, I don't see it in the BIOS. The card doesn't present itself as a single GPU now; I'm not sure if it used to? Maybe the drivers changed how it appears to the system, because both GPUs definitely worked fully until mid 2018 or so.

Regardless, I have no use for this GPU if I can't fix the malfunctioning Asetek pump.
 

cdrkf

Honorable
I don't think the issue is Crossfire, because from the motherboard side you aren't running Crossfire (you're only using one PCIe slot, as it's a single card).

With regard to the cooling problem, I believe the pump system used on those cards was produced by Coolermaster (AMD have a partnership with them), so it might be worth contacting them to see if you can get a replacement pump. The other port of call would be to try AMD support. The R9 295X2 is a beastly card, so it would be a shame not to be able to use it because of the cooler!
 
Mar 31, 2019
It shouldn't be Crossfire causing it. The best way I can describe it is as if one driver update changed how the card is viewed/utilized entirely. I don't know whether it was actually a driver update, something else, or a real hardware malfunction from the mining use. I do know that after that change, once I stopped mining, whenever I folded and hit the overheating issue (the very start of it, occurring only at full load), the system would shut off and Wattman would restore to default settings.

I realized after reinserting the card into the mobo that it was somewhat loose before, enough that it could wobble a bit in the PCIe slot. I was hoping that seating it firmly would improve things, but it did not. I have some very faint hope that I just did a poor job reseating the heatsink on the GPU, or that, because I wiped the old thermal paste from a larger radius on the heatsink than the GPU's surface area, I should have applied a layer back onto the pump as well. More research suggests this is the heatsink (which comes from the aforementioned Asetek, not Coolermaster) not making contact with the GPU; someone else had the same issue in 2017 and couldn't find a replacement cooling system for the card through AMD or Asetek, which is my situation. I honestly think I'm screwed here.

Is it possible that a very slightly off reseat would cause complete contact failure for the pump? That seems unlikely to me unless a good amount of force is really needed between the heatsink and the GPU. I loosened each screw by half a turn, since I read that too much pressure would cause issues as well.
 
