[SOLVED] Brand new system of a friend having random black screen crashes with fans going 100%

Dec 25, 2018
2
0
10
Hello there;

One month ago I helped a friend pick out parts for his gaming system:


  • CPU: AMD Ryzen 5 2600X
  • GPU: Gigabyte Radeon RX Vega 56 Gaming OC 8GB
  • Motherboard: Gigabyte B450 Aorus Pro
  • SSD: Adata XPG Gammix S11 480GB
  • HDD: Toshiba DT01ACA300 3TB
  • PSU: Seasonic Focus Plus Gold 750W
  • CPU Cooler: Gelid Tranquillo Rev. 4 B
  • Case: Sharkoon DG7000-G RGB
  • OS: Windows 10 Home x64

However, at seemingly random moments his brand new system crashes to a black screen (with audio loss) and all the fans spin up to a very loud 100%, staying there until I cut the power with either the power button or the PSU switch. The strange thing is that the crash has occurred several times while idle, though it also happens while he is gaming.


Even though I had already stress tested every component and checked its temperatures individually, I checked them all again and also ran FurMark together with Prime95 to load the whole system. The GPU doesn't go above 76 degrees Celsius (about 169 degrees Fahrenheit) during stress testing, which is well below its maximum temperature. The CPU runs a bit hot in Prime95's most heat-intensive tests, about 81.5 degrees Celsius at most, but that should not be a problem for a second-gen Ryzen 5. During one of the stress tests a black screen crash with 100% fans occurred after 20 minutes, but the problem also occurs under no stress at all, for example when idle on the desktop.

Memtest86+ is making multiple passes without ever encountering any problems at all.

Unfortunately there are no BSOD errors or minidumps to be found. I have gone through Event Viewer thoroughly for the past crashes, and on several occasions there were no entries at all for up to half an hour before the crash. When there were events, they were not always there for the next crash, and they were often the fairly standard DCOM error events, so I don't think they are related. I expected to find more in Event Viewer around the times of the crashes; it is very strange that there are sometimes no entries at all beforehand.
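For anyone repeating the Event Viewer check, here is a minimal sketch of the "what was logged in the half hour before the crash" filter. It assumes the log has been exported to (timestamp, source, event ID) rows. The `events_before` helper, the sample rows, and the crash time are all made up for illustration and are not taken from this system.

```python
from datetime import datetime, timedelta

def events_before(events, crash_time, window_minutes=30):
    """Return events logged within `window_minutes` before `crash_time`."""
    start = crash_time - timedelta(minutes=window_minutes)
    return [e for e in events if start <= e[0] < crash_time]

# Hypothetical exported rows: (timestamp, source, event ID)
events = [
    (datetime(2018, 12, 24, 19, 5), "DistributedCOM", 10016),
    (datetime(2018, 12, 24, 20, 40), "DistributedCOM", 10016),
    (datetime(2018, 12, 24, 21, 15), "Kernel-Power", 41),  # logged after the reboot
]

crash_time = datetime(2018, 12, 24, 21, 0)  # hypothetical crash time
recent = events_before(events, crash_time)
for ts, source, event_id in recent:
    print(ts.isoformat(), source, event_id)
```

An empty result for a given crash matches what is described above: nothing was logged in the window before the system went down.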

I've already spent over 50 hours trying to fix this strange issue, and I'm at a loss as to what to try next.

So far I've tried:


  • Reinstalling Windows 10.
  • Changing the PSU connection to the GPU from two 8-pin connectors daisy-chained on one cable to two separate 8-pin cables.
  • Tweaking BIOS settings for increased stability, such as disabling AMD Cool'n'Quiet.
  • Many hours of googling and looking through forums for similar problems.
  • Setting a different minimum state for the GPU memory clock, because the previous minimum state was deemed unstable by some others with similar problems.
  • Increasing the power limit for the GPU, because this worked for some others with the RX Vega 56 and black screen crashes.
  • Going through Event Viewer and looking for related events.
  • Checking for BSOD minidumps.
  • Extensively stress testing temperatures and stability, both after building and prior to writing this post.
  • Several clean installations of the chipset and GPU drivers.
  • Flashing the BIOS of the motherboard and the GPU to the latest releases.
  • Tweaking Radeon Global Settings.

I'm almost starting to consider an RMA of some parts, but because the crashes are random it's quite hard to pinpoint the origin, especially without bluescreens, minidumps or other logs.


Any advice you might have on what to try next would be much appreciated :)
 
Dec 25, 2018
2
0
10


That's exactly my thought: this PSU came very highly recommended and is rated well above the system's wattage requirements. Unfortunately I do not have another PSU with such a high wattage to swap in, but thank you for your recommendation, I will keep it in mind. I do wonder why the LEDs and fans would stay on after the crash if the PSU were faulty. I also read about others with the RX Vega 56 who replaced their PSU without any change to the crashes, which is why I am hesitant to go and find another 750 watt PSU.
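A rough power budget supports the point about headroom. The sketch below uses nominal figures that are assumptions, not measurements from this build: 95 W is the listed TDP of the 2600X, 210 W is the reference Vega 56 board power (the Gigabyte OC card may draw more, and short transient spikes are not modelled), and 75 W is a guess for everything else.

```python
# Nominal draw in watts; every value here is an assumption for illustration.
nominal_draw_w = {
    "cpu_ryzen_5_2600x_tdp": 95,
    "gpu_vega_56_reference_board_power": 210,
    "rest_of_system_guess": 75,
}
psu_rating_w = 750  # Seasonic Focus Plus Gold 750W

total_w = sum(nominal_draw_w.values())
headroom_w = psu_rating_w - total_w
load_fraction = total_w / psu_rating_w

print(f"Estimated draw: {total_w} W")
print(f"Headroom: {headroom_w} W ({load_fraction:.0%} of rating used)")
```

On paper that leaves plenty of margin, which only shows the PSU is not undersized; it says nothing about whether this particular unit is faulty.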



Yes, I do. It has been flashed to F4, which is the most recent. What makes you think it's the motherboard? Is it this type of crash?

I'm going to try and reproduce the crash with RGB Fusion disabled...



I also applied a slight undervolt to the RX Vega 56 and noticed the target temperature was set to 75, so I changed it to 70. Stress tests are running now, but the temperature hasn't changed much (it is at 74).
 

Shock34

Reputable
Nov 1, 2015
127
0
4,710
Try tweaking the motherboard fan settings. If the fans are going to 100%, there's likely a reason for that. It could be the GPU or CPU drastically overheating, but only for a short time; since your motherboard really doesn't want that to happen, it ramps the fans up massively. You did say that temps are okay, but this can be the result of a faulty CPU or GPU (which is likely why you won't get an error code or high temp readings, as the CPU doesn't have time to show you). Consider trying both in another system if you can, or RMA both of them. I know RMAs are a pain and take a long time, but it's worth it in the long run for better understanding your rig. In the BIOS you can see at what temperature the fan speed increases.
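To illustrate what a fan curve does, here is a minimal sketch of the temperature-to-speed mapping. The curve points and the `fan_duty` helper are hypothetical, not Gigabyte defaults; the real values are whatever the BIOS fan settings show.

```python
def fan_duty(temp_c, curve):
    """Linearly interpolate fan duty (%) from sorted (temp_c, duty) points."""
    if temp_c <= curve[0][0]:
        return curve[0][1]  # below the first point: minimum duty
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
    return curve[-1][1]  # above the last point: maximum duty

# Hypothetical curve: (temperature in Celsius, fan duty in percent)
curve = [(40, 30), (60, 50), (75, 80), (85, 100)]

for temp in (35, 70, 81.5, 95):
    print(f"{temp} C -> {fan_duty(temp, curve):.0f}% duty")
```

With this made-up curve, only readings above the last point pin the fan at 100%, so comparing the BIOS curve against the temperatures you logged shows whether a normal reading could explain full-speed fans.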

In the case of temp problems, be sure the computer is ENTIRELY clear of dust and has a fresh, modest amount of thermal paste on both the GPU and CPU. If repasting the GPU violates its warranty, then don't do it.

This seems like a motherboard issue to me, as that's what's in charge of GPU connectivity and fan speed. If your screen is going black, it's because the CPU or GPU has failed, but like another user has said, it could be a lack of power because of a faulty PSU. In this case, I would RMA the PSU to be sure it's okay. If you have a friend that also games, you can try your PSU in his PC if he doesn't mind. Once again, RMAs are a pain in the ass, but it's far better than living in fear of losing your friend's precious parts, since when PSUs die they like to take the rest of the rig with them.

Make sure the card and CPU are running at standard clocks, and the same goes for the RAM. You'd be surprised how much unstable RAM can affect your PC. To check this, run a Prime95 test focused on RAM for around an hour. If you get just one error, the clock is unstable, and you'll have to do some tweaking to fix this. I am aware you have tried Prime95, but it's best to repeat the test multiple times, as the error can pop up whenever it wants.

The CPU fan may be faulty or plugged into the wrong header, for example a system fan header. Since system fan headers run at different speeds and don't follow the CPU temperature consistently, this could be the culprit. I highly doubt it though, just a thought.

One last suggestion from me: make sure all fan headers and front I/O headers are plugged into the correct positions. I had a similar problem where my fans would speed up and the system would reset without warning or a bluescreen. I resolved the issue when I found that the reset button wire was plugged into the power-on header! Stupid, but it saved my bacon and was the last thing I thought it could be.

Best of luck for your friend!
 
Solution
Jul 4, 2019
1
0
10
Idk, but after searching for the exact model of your GPU, it turns out that model has a lot of problems. I've seen the reviews on Amazon and Newegg, and it looks like roughly an 80:20 ratio of lemons to working units. One reviewer said he was on his 8th RMA with Gigabyte and the card still had problems. Maybe an RMA is the solution, or if you're lucky, Gigabyte might replace it with another brand or another model.