AMD 290x Crossfire crashes entire PC within 5 seconds

gothen1234

Hello!

I have an R9 290x crossfire setup and, as per usual, it isn't working. Only this time it is different. If you fancy a bit of a challenge, I'll run through some details to save some time:

Power?
1300w EVGA G2 PSU, single 100Amp 12v rail so no issues here.

Cards?
The first 290x is a Gigabyte, currently fitted with a G10 and Kraken X40
The second 290x is a reference card, currently fitted with a G10 and a Corsair H90

VRM Cooling?
Both have more than adequate VRM cooling with Gelid enhancement kits all round (they don't go beyond 80°C in regular games)

Drivers?
I've tried Crimson 16.7.3, 16.8.2 and 16.8.3.


I'll note now that I've tested both cards: I put them both in at the same time, disabled xfire and fired up Heaven on monitor 1, then switched main displays and fired up Heaven again on monitor 2.
I checked with GPU-Z and Crimson and both cards were at 99-100% utilisation. So they were tested separately but at the same time, if that makes sense.


OK, so I ensure crossfire is enabled in Crimson and fire up Furmark: 240fps, so clearly crossfire is working, flawlessly in fact! After 10 minutes of Furmark I close it and load up Heaven.

I will describe what happens next. I'm thinking motherboard, but others might disagree.

Heaven loads on monitor 1 and within 5 seconds the picture completely freezes. The other monitor, however, works fine and I can see GPU-Z updating temps (I've tried with and without any monitoring software open, btw). I can alt-tab and move the cursor on monitor 2, and I can even use Chrome if it happens to be open on monitor 2. However, if I try to close the benchmark/game from the taskbar, then after 10 seconds both screens lock up and I can't do anything. Sound continues to play but the image is completely locked. The same happens if UAC appears or if I use CTRL + ALT + DELETE.


I couldn't do too much to figure out what was wrong with crossfire last night though. I think I might have flicked the switch on the Gigabyte 290x, because I couldn't even get it to boot into Windows for 15 seconds before it black-screened. For some reason, there is a switch on the Gigabyte 290x that breaks the card 0___0.


I believe the motherboard could be to blame;

I have been perfectly able to run crossfire before now with 2x 7950s (which use bridges)

Whenever I make a BIOS change, the system shuts off after saving changes, then I hear its usual boot routine but the monitors don't receive a signal. It's like the graphics card hasn't been detected properly or something.
This happens with any graphics card and results in me standing up, flicking the power switch on the PSU, waiting until the caps have completely drained, flicking the switch again, going to the breaker box to flick the breaker, then returning and pushing the power button. Only then will it turn on again as normal.
This process, which I went through 15-20 times last night, has driven me a little bit insane, so I do apologise if this post seems too to the point.

I only recently bought the reference 290x. I did test it with Furmark on its own for 15 minutes or so.

The seller wants to know if I am ok with it and that it works etc, should I cut my losses and just send it back?
Or is it likely that something else is to blame? Such as the motherboard?

thanks to anyone who has spent the time to read this

thanks more to those who have any ideas :)


 
Update both cards' BIOSes (the button on the GB is likely for a dual-BIOS config, and you should only be able to update one of the BIOSes; look it up) and make sure you are running both monitors from the main GPU, not one each and not both on the "secondary" card. Also, what's your MB? A BIOS update there would not hurt either.
 


I will update both cards' BIOSes. I am running both monitors from the same GPU in xfire, because it disables the outputs of one card so there's no alternative; I only ran a monitor from each to stress test both separately.

I have an Asrock Z97 Extreme 6, I'm unsure if it is the latest BIOS version but I'll try to flash the newest one and see if that helps.
 
Hybrid CF is not a good idea at all. The weakness of one card can be multiplied by the weakness of the other. Remove your OC, if any. If one of the cards cannot run at the speed of the other, you will have problems.

I had a 290x CF setup and it was working better than my current GTX 1080. The Nvidia video driver crashes all the time in Windows 10 at 4K and is unable to differentiate my display... a damn nightmare of unstable drivers.

You will need to check your mobo with each card individually in the two slots you are using. Card 1 slot 1, card 1 slot 2, card 2 slot 1 and card 2 slot 2.

It seems actually like a power supply issue IMO. Check your cables and switch them to see if it is unstable.

So, as a summary, you need to fault-find whether the new card is bad, your mobo is damaged, or your power supply is damaged. I don't think it is related to drivers or your OS.

Edit: use a single monitor for all this. Windows 10 and 7 are extremely unstable when you boot up with many different resolutions and switch card outputs. All my GTX 1080 problems are due to switching the feed from DisplayPort to HDMI. Use a single monitor, a single resolution and a single card to troubleshoot.
 
Make sure the GPUs are not overclocked. AMD had an issue where, if they were overclocked while using 2 or more monitors, the first monitor would freeze and/or fuzz out.

As the picture shows, the main monitor will go batty and the second will be fine. So be sure you are running stock clock speeds.
 
Posting Spec:

i5-4690k @ 4.4GHz stable (currently running stock) under Be Quiet Dark Rock Pro 2
Asrock Z97 Extreme 6
4x 4GB Corsair Vengeance Pro 1866MHz (9-10-9-27) (16GB total)
PCI-E x1 TP-Link Wireless card
4x Be Quiet Pure Wings 2 140mm's
4x Akasa FN063 Viper 140mm's
Fractal Design Define R5
5x assorted size and brand 2.5" Hard drives
1x Crucial C300 SSD
EVGA SuperNova G2 1300w (Rated 9.5 on Johnny Guru)




I have the Gigabyte card clocked down to the same speeds as the reference card using just Crimson Overdrive.

Gigabyte:
1000MHz core (1040MHz normally), Crimson says "-3.8% gpu clock"
1250MHz mem
0% power limit

Reference:
1000MHz core (1000MHz normally)
1250MHz mem (1250MHz normally)
0% power limit
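
As a sanity check on the "-3.8%" figure, the offset is just the change relative to the stock clock. A small Python sketch using the clocks quoted above (the function name is made up for illustration):

```python
def clock_offset_percent(stock_mhz, target_mhz):
    """Percentage change from the stock clock to the target clock."""
    return (target_mhz - stock_mhz) / stock_mhz * 100

# Gigabyte card: 1040MHz stock, downclocked to 1000MHz to match the reference card
offset = clock_offset_percent(1040, 1000)
print(f"{offset:.1f}%")  # -3.8%
```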

The power supply unit is brand new, I only received it yesterday. It is a 1300 watt single 12V rail 100 amp PSU. I'm still using my old PSU cables though, because I'm lazy and they're also EVGA cables.

Going to swap out all power cables to the mobo and GPUs (checking each one first with a voltmeter)

Reflash some VBIOSes, perhaps update them? Unsure where I'd be looking for these though.

Neither card is overclocked anymore, and both cards are sufficiently cooled (<70°C)

Going to test each card with Furmark, Heaven and GTA V (hates unstable cards) for, say, 20 minutes each with just my Acer Predator monitor connected via DisplayPort

Going to do those same tests again but in the other PCI-E Slot

Going to check the supplied voltages from the PSU

I'll get back with my findings
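
For the voltage check in the plan above, a rough Python sketch of comparing multimeter readings against the ±5% tolerance the ATX spec allows on the +12V, +5V and +3.3V rails. The readings below are hypothetical placeholders, not measurements from this system:

```python
ATX_TOLERANCE = 0.05  # ±5% on the +12V, +5V and +3.3V rails

def rail_in_spec(nominal_v, measured_v, tolerance=ATX_TOLERANCE):
    """Return True if the measured voltage is within tolerance of nominal."""
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    return low <= measured_v <= high

# Hypothetical readings (nominal -> measured), for illustration only
readings = {12.0: 12.08, 5.0: 5.03, 3.3: 3.31}

for nominal, measured in readings.items():
    status = "OK" if rail_in_spec(nominal, measured) else "OUT OF SPEC"
    print(f"+{nominal}V rail: {measured}V {status}")
```

A reading taken at idle only shows the rail is in range at that moment; it says little about how the rail behaves while both GPUs are loaded.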
 


I'm running Windows 8.1. I just flashed the mobo BIOS to the latest version but it hasn't fixed it.

Where would I find the latest VBIOS for each card?
 


I'm not sure why flashing the GPU BIOS was brought up, because the only time this is necessary is when there is a major issue with a card and the only fix is in the BIOS. The reason you cannot find them is that you do not need to flash the video card's BIOS.
 
Ok finally! I am making progress.

I replaced the power cable to the motherboard with the one from the new EVGA PSU, and when it turned on I was able to run Heaven with crossfire solidly for a good 10-15 minutes.

I only had 1 monitor plugged in to get this working. I plugged the other monitor in and started the test again; again it ran fine for a good 10-15 minutes before I turned it off.

Then the ultimate test! Restart the PC with both monitors connected and start heaven. Crashed again.

Currently only running 1 monitor again.

 
+gothen1234 Are you using displayport cables for these two monitors? If so, switch them out with two other displayport cables but use a different brand. There is a rare issue where displayport cables have a defect where one specific pin is supplying a voltage when it shouldn't be according to spec. Swap to a different brand of cable and tell me if the issue still occurs.

Also, what are your CPU temps during idle and load? I'm simply curious, as I rarely hear about people crossfiring two 290xs these days.
 


I am using a DisplayPort cable for my Acer Predator and an HDMI to DVI-D cable for my Dell. I can rule out the DisplayPort cable now by just using the Dell monitor, but I can't see it being an issue as it's worked fine with the single 290x I've had in here for the past 3 months.

I had CPU temp problems with my previous 7950 xfire setup, so this time around with the 290xs I figured I would need a new strategy. Currently the back and top of the case are intakes and the front of the case is exhaust.

CPU with my OC usually idles around 32-35°C when in C-states; with a fixed OC it hangs around the 40-45°C range. Load temps never normally exceed 75°C under heavy load with Prime and no more than 68°C in game.
 
Update: I tested all the cables for the new EVGA supply (motherboard and 4x VGA power leads) and all checked out, despite some being wired differently on each end.

I replaced the mobo and VGA power leads with the new EVGA leads and made some progress, in that Heaven ran for almost 20 seconds this time. I had put my CPU OC back on again to check if it would help.

So I then tested it with stock BIOS settings. With the new leads Heaven ran for just over 2 minutes but then crashed again: whole picture frozen and no way of recovering without flicking the PSU switch and rebooting.

Going to try now using just my dell monitor via HDMI and see if this yields any results.

thanks to everyone for the suggestions, still unsure on a solution as of yet but I'll keep updating this thread
 
Update, running with HDMI makes no difference.

Just to rule out software I disabled and closed all programs I normally run:
-Speedfan
-flux
-google drive
-gyazo
-hamachi
-origin
-skype
-fraps
-GPU-Z
-Core Temp

Managed to run GTA V with some hiccups: textures disappearing, map disappearing, etc.
VRAM usage was just over 4GB (according to GTA V; I couldn't measure real-world usage without it crashing again)
Was able to play it for a good half hour, minimize it several times, change settings and close it
As soon as I set up GPU-Z, Core Temp and Speedfan to run Heaven once more, it crapped out again within seconds.

Running Heaven again after a reboot with absolutely nothing else running resulted in another crash within a minute.

I'm slowly reaching the point where a windows reinstallation is my only available option, but I have only recently done this and I don't want to go through the hassle again.
 
Quote: "So I just tested it with stock BIOS settings, with the new leads Heaven ran for just over 2minutes but then crashed again, whole picture frozen and no way of recovering from that without flicking the PSU switch and rebooting."

This part says that default settings are not stable. Have you stress tested your CPU and memory with just 1 GPU in, to ensure that your system is stable to start with?
 


Sorry, this follows on from an earlier message. I got it working for 10-15 minutes with stock settings and it didn't crash, so I plugged in my 2nd monitor to see if that would make it crash. I ran it again for another 10-15 minutes and it didn't crash, so I restarted the system, changing the BIOS settings back to my OC with both monitors connected, and it went back to crashing. When I went back to 1 monitor, again on stock BIOS settings, it continued to crash. This problem hasn't shown any noticeable pattern.

Currently I'm doing a full uninstall and reinstall of the latest drivers.

I have a 40mV bump on the System Agent voltage. I'm not sure if this is helping yet, but I read somewhere that this resolves the 'black screen' issue I was having a while back, so I figured it can only be a positive influence.

As for whether stock clocks are stable, I couldn't tell you, other than that when I've put it back to stock I've never had any BSOD or any weird glitchy stuff.

I've put about 20 hours of work into my 4.4GHz OC and I know that is definitely stable; that being 20 hours of tinkering, not 20 hours of stress tests.
 


This is a fair point, I'll put the old 750w in and see if the issue is still present. It should be ok to run this setup at stock, checked the power in and it doesn't exceed 640watts and that's with 2 instances of furmark and prime concurrently.
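
A rough headroom estimate for the 640 watt figure, as a Python sketch. It assumes the 640W was measured at the wall and that the PSU is about 90% efficient at that load; both are assumptions, not measured values:

```python
def dc_load_watts(wall_watts, efficiency):
    """Estimate the DC load delivered to components from the wall draw."""
    return wall_watts * efficiency

def load_fraction(wall_watts, efficiency, psu_rating_watts):
    """Fraction of the PSU's rated output being used."""
    return dc_load_watts(wall_watts, efficiency) / psu_rating_watts

EFFICIENCY = 0.90  # assumed, not measured
WALL_DRAW = 640    # watts, peak figure quoted above

for rating in (750, 1300):
    frac = load_fraction(WALL_DRAW, EFFICIENCY, rating)
    print(f"{rating}W PSU: about {frac:.0%} of rated output")
```

Under those assumptions the 750W unit still has some margin, though short transient spikes from two 290xs can exceed an averaged reading.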
 


Ok, so I swapped out the PSU and I'm getting the same problem.
 
Update, Ok so...
Tried running 1 monitor using displayport, same problem
Tried running 1 monitor using hdmi, same problem
Tried both with OC settings, same problem
Tried both again with stock BIOS settings, same problem (though didn't crash until the 20second mark)
Updated BIOS on my Asrock Z97 Extreme 6, same problem
Downclocked the Gigabyte GPU to match the reference, same problem
All of that was done with the old cables with the new PSU

Using just 1 monitor over HDMI and using the new cables with the new PSU, same problem (though didn't crash until I rebooted)
Completely uninstalled the AMD drivers, then reinstalled, same problem
Changed the new PSU for the old one, same problem
Bumped up the System Agent voltage by 40mv, same problem
Closed all my software, everything: appears to run fine, though I haven't tested it enough to know for sure. And if it is the case that I can't use any of my software alongside xfire, then screw this cr*p.
Downclocked both cards considerably, tried running heaven with all my software, same problem
Pulled all but 1 stick of RAM out, same problem

I've noticed 1 thing in particular though: if I close EVERYTHING, I can manage to get Heaven running long enough to complete a benchmark. I then open GPU-Z, run the benchmark again and BOOM! It crashes immediately.
But, on the other hand, if I boot up my PC, close all monitoring software such as:
-core temp
-Speedfan
-GPU-Z
and leave everything else running, it still crashes immediately.
It feels like if I work back from everything being closed, checking what works with crossfire and what doesn't, I'll come to the conclusion that it makes no difference. But I guess, short of returning the GPU, I don't have too many more options.

Going to swap the GPU's over
Going to leave it running heaven overnight with none of my software running to see how long it lasts
Going to check through what software doesn't like xfire if it hasn't crashed; otherwise, going to consider software not to be a factor.
Going to make a clone backup of Windows, reinstall a fresh copy, install everything again
And I guess if there are no more suggestions from this forum, I'll message the seller about whether he's OK to wait for me to RMA a motherboard before I decide I can't keep it.
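
The plan to check which software doesn't like xfire could be organised like this. It is only a Python sketch: `crashes_with` is a hypothetical stand-in for manually running Heaven with one program open on top of an otherwise clean boot, and the simulated results are made up apart from GPU-Z, which is the one trigger already observed above:

```python
def find_suspects(programs, crashes_with, trials=3):
    """Test each program alone, several times, on top of a clean baseline.

    crashes_with(program) is a hypothetical callback that returns True
    if the benchmark crashed while only that program was running.
    """
    suspects = []
    for program in programs:
        crash_count = sum(1 for _ in range(trials) if crashes_with(program))
        if crash_count > 0:
            suspects.append((program, crash_count))
    return suspects

# Simulated outcome for illustration; real results come from manual testing
def simulated_result(program):
    return program == "GPU-Z"

programs = ["Speedfan", "flux", "google drive", "gyazo", "GPU-Z", "Core Temp"]
print(find_suspects(programs, simulated_result))
```

Repeating each test a few times matters here, because a system that also crashes at random will otherwise blame whichever program happened to be open.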

 
Final update, I give up.

With no software open it ran fine all night. When I started checking through what worked and what didn't, it was clear that anything that required the use of video or some form of video detection was not going to work.

On top of the software that seemed to cause it to crash, it would randomly crash anyway, implying that somewhere amongst the 10 or so things I checked and ticked off as being A-OK there is a program that will crash my entire PC.

There is no stability in this and how anyone else has managed to get crossfire setups working with 2x 290x's without using any additional software is beyond me (if my board isn't to blame).

Gyazo
GPU-Z
Flux
Remotr

All of these lock up my entire PC, in case this thread is read by someone else with this issue.

I'm also going to RMA my motherboard anyway because of the Video detection glitch that is plaguing me, I may try again in the future but 2 days of troubleshooting isn't worth the 2 or 3 games that use xfire sufficiently enough to say: "WOW! this was worth every penny".


 
Has anyone ever typed out a good whole page of reply, then realised they hadn't logged in? Let's write this again...

OK, so my brother allowed me to use his PC to test the 2 cards. I moved every compatible part from my PC to his to keep things consistent: RAM, GPUs, SSD, PSU, WiFi card (he uses Skylake, so the CPU was out of the equation).

It is the exact same story as before: no matter what I do, there is always a system crash. Even if I RMA'd the motherboard it would still crash the entire PC, just as it is doing to my brother's PC now.

Within 5 seconds of loading Heaven/any game, as soon as there is high GPU usage, it will immediately crash.

Whilst writing this I actually just realised something: the other day when I installed the GPUs into my system, I hadn't completely taken in that when I was playing the original Dawn of War, it was using crossfire. It's when there is high GPU usage that the issue occurs. Does this open up any more ideas from anyone?

So anyway, my slight internal revelation aside, I swapped out my RAM for my brother's to see if it was RAM related. Again, it crashed within 5 seconds of loading Heaven/any demanding game.

Then I thought perhaps it was the wireless card. Same story, crashes within 5 seconds....

So here is the real kicker to this entire ordeal. I didn't want to make the seller wait however long it takes to RMA a motherboard before getting back to him about whether the card worked or not, so I sent him a message on eBay explaining the situation and asking for his advice/help. I told him that I don't know whether the card is completely working because it crashes immediately in crossfire. He came up with a good selection of suggestions, but only stuff that had been mentioned here. Things escalated a bit and he is now saying it either works and he doesn't accept returns, OR it doesn't work and I've broken his graphics card. He said he didn't want to wait for me to RMA a motherboard so that I could mess him around. He is also saying he will fight my claim for a refund and will win, based on a previous customer who bought a 295x2 and didn't load the rails on his PSU properly. I fear I have similar issues in terms of hard evidence. No good deed goes unpunished, it seems.

At this point I've spent the past 3 days slowly figuring out what is and isn't the cause of the issue. From this test with my brother's PC and the previous things I did, I've pretty much gathered that one of the graphics cards is broken; nothing else really seems plausible at this point.

I don't have too much in terms of evidence to prove it is the reference 290x that was sold to me recently, I have a few things that kind of sway in the direction of a faulty card but nothing pressing like the thing just plain old not working;

GPU-Z reports that the AMD card has a different brand of VRAM each time I have to reset my PC after a crash.

GPU-Z occasionally reports that the card only has a 32-bit bus width and 20GB/s bandwidth (with ULPS disabled). Following on from this, I was able to get Furmark working in xfire occasionally; one time it worked, but I was only seeing just shy of 25fps.

When I did manage to get GTA V working with no other software, it was with old settings that predicted less than 4GB of VRAM usage on a single card. Yet when it was working in crossfire, textures and the map and everything were popping in and out, disappearing completely then returning with low to high textures, models disappearing and reappearing, you get the idea.

One time when I loaded GPU-Z with no crossfire enabled, GPU-Z reported that the AMD card was missing half of the usual information that is presented (I've been unable to get evidence of this since).

Most of these GPU-Z problems I put down to GPU-Z at the time or the motherboard screwing up with it, but there have been quite a few and I can't see it being just coincidence that it's reported random weird things from just the one card.
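
For what it's worth, the 32-bit bus width and 20GB/s bandwidth readings mentioned above are consistent with each other. A short Python sketch of the usual GDDR5 bandwidth arithmetic, using the 290x's 512-bit bus and the 1250MHz memory clock quoted earlier in the thread (the function name is made up for illustration):

```python
def gddr5_bandwidth_gb_s(bus_width_bits, memory_clock_mhz):
    """Theoretical memory bandwidth in GB/s for GDDR5.

    GDDR5 transfers 4 bits per pin per memory clock, so 1250MHz
    gives an effective rate of 5 Gbps per pin.
    """
    effective_rate_gbps = memory_clock_mhz * 4 / 1000
    return bus_width_bits / 8 * effective_rate_gbps

print(gddr5_bandwidth_gb_s(512, 1250))  # 320.0, the normal 290x figure
print(gddr5_bandwidth_gb_s(32, 1250))   # 20.0, matches the odd GPU-Z reading
```

So the 20GB/s value is what falls out of a 32-bit bus at the same memory clock; whether the card was really running that way or GPU-Z misread it can't be told from the numbers alone.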

When the card arrived, it came with a G10 and AIO cooler but absolutely no VRM heatsinks on bank 1 or bank 2, just 4 VRAM heatsinks on the 4 VRAM chips underneath the fan. This probably should have been a big no-no from the start. I have been testing this card with VRM heatsinks on. Is it worth testing it without the VRM heatsinks?

This whole experience has been nothing but stress (pun unintended)
 
Have you tried using the cards separately? If they work like that, fine. If not, and the one that fails is the eBay one, then it is faulty; just return it.

It can't be that hard; just try one card at a time.
 
Now that I've been able to rule out all of my hardware excluding the graphics cards, what is everyone's opinion on the issue?

Hardware or Software?

Software seems plausible from my post, but then why is no one else complaining about this problem? Surely someone uses software with video monitoring/detection? It can't just be that every bit of software I have is causing the problem. Even after singling out what worked and what didn't, it still crashed.