Black screen, fans on full, crash - why?

localstarlight

I have a completely new machine with the following specs:

Intel i7-6850K (not currently overclocked)
Asus X99 Deluxe II
64GB RAM (4 x 16GB Corsair Vengeance LPX DDR4-2400 C14)
2 x EVGA GTX 1080
1000W EVGA SuperNOVA Platinum PSU

I've just been trying to play Just Cause 3, and three times in a row the game has suddenly gone to a black screen (the monitor says no signal) and the fans have cranked up to a loud maximum, while the audio carries on playing for a while. Each time I waited to see if it would recover, but ended up having to do a hard shutdown. I am not currently overclocking anything.

I've been using GPU-Z to monitor the GPU temps, wondering if that might be the cause. Here is the data leading up to the most recent failure:

Card 1: https://plot.ly/~SteveWattsKennedy/6/
Card 2: https://plot.ly/~SteveWattsKennedy/7/

Does that look like the GPU is causing problems?

Does anyone know what could be causing this problem, and how I could fix it?

Or are there other tests I can do to try to figure out the cause of this?

Thanks so much for any help. Didn't think a brand new machine with top of the range hardware would be freaking out like this!

Update:
For some reason those plot.ly links don't seem to be showing the data. Here are the actual CSV files, if anyone wants to see them: https://we.tl/ZiCNrOE4fL
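In case it helps anyone looking at CSV logs like these, here is a rough Python sketch for pulling out the last few samples before a crash. It assumes the log is comma-separated with padded column names along the lines of "GPU Temperature [C]" and "GPU Load [%]"; those names, the helper functions, and the sample values are made up for illustration, so adjust them to match whatever your own log actually contains.

```python
import csv
import io


def load_gpuz_log(text):
    """Parse GPU-Z-style CSV text into a list of dicts with stripped keys and values."""
    reader = csv.reader(io.StringIO(text))
    header = [h.strip() for h in next(reader)]
    rows = []
    for raw in reader:
        if not any(cell.strip() for cell in raw):
            continue  # skip blank lines
        # A trailing comma produces an empty header name; drop that column.
        rows.append({h: c.strip() for h, c in zip(header, raw) if h})
    return rows


def find_column(rows, keyword):
    """Return the first column name containing keyword (case-insensitive)."""
    for name in rows[0]:
        if keyword.lower() in name.lower():
            return name
    raise KeyError(keyword)


def tail_summary(rows, keywords=("GPU Temperature", "GPU Load"), n=5):
    """For each matched column, return its last n samples and its maximum."""
    summary = {}
    for kw in keywords:
        col = find_column(rows, kw)
        values = [float(r[col]) for r in rows if r.get(col, "") not in ("", "-")]
        summary[col] = {"last": values[-n:], "max": max(values)}
    return summary


# Made-up sample in the assumed layout (not real data from the linked logs).
SAMPLE = """\
Date , GPU Temperature [C] , GPU Load [%] ,
2016-09-10 20:00:01 , 61.0 , 45 ,
2016-09-10 20:00:02 , 70.0 , 98 ,
2016-09-10 20:00:03 , 74.0 , 100 ,
"""

if __name__ == "__main__":
    for column, stats in tail_summary(load_gpuz_log(SAMPLE), n=2).items():
        print(column, stats)
```

To use it on a real file, read the file's contents into a string and pass that to load_gpuz_log in place of SAMPLE. Comparing the last few temperature and load samples from each card's log is a quick way to see whether anything unusual happens right before the signal drops.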
 
Solution
Right before the crash on the first graph, your GPU load skyrockets to 100 percent, something that even a single 1080 shouldn't do. I would recommend running with just one card installed and seeing if you get a crash, then swapping it for the other card and testing again. It is possible one of the cards is bad.
 
So I tried each card out, and it seems like the card I had in the first position does have a fault. This seems to be something EVGA are aware of, and are doing RMAs for the faulty cards:

Reddit thread about it: https://www.reddit.com/r/nvidia/comments/50rh24/gtx_10801000_evga_info_for_those_experiencing/
EVGA staff response on Reddit: https://www.reddit.com/r/nvidia/comments/50rh24/gtx_10801000_evga_info_for_those_experiencing/d77sf9f

So I will have to send the card in for a replacement - lucky I have two!

BUT, I'm confused/intrigued about what you say about the GPU load going to 100%, because even the card that doesn't crash is doing that.

Here is the plot from the card that crashed: https://plot.ly/~SteveWattsKennedy/12/

And here is the plot from the card that didn't crash (yet?!?!?): https://plot.ly/alpha/workspace/?fid=SteveWattsKennedy:10

Both cards shoot up to 98-100% GPU load while Just Cause 3 is running, and only dip when I enter a menu screen. Is this wrong behaviour?
 


While it does seem very odd to me that it would sit at 100 percent, it is not necessarily an issue if everything is running properly. Since it dips in menu screens, it looks "normal" to me, although I certainly would not have expected it. That said, I could very well be wrong, and it could also be GPU-Z giving false readings. Either way, if everything is working then it doesn't really matter.
 