[SOLVED] Graphics card or MoBo issue?

Apr 11, 2020
For the past month I have been having some serious problems with my computer. I use it for Blender, Unreal Engine, and Adobe programs.

Initially UE4 was crashing with a DXD error, which points to my graphics card (GeForce GTX 1060 6GB), but then I started getting a lot of BSODs and other crashes. I took it to a computer firm, who ran stress tests on everything, and they said it was a faulty RAM module I had in. I took it out and the crashes stopped, but UE4 still won't load. On top of this, Blender would crash on startup. I had similar issues in the past and refreshing the OS sorted it, so I tried that, but it kept failing as well, so I figured I would do a format.

Before I did that, I ordered a new graphics card to try, an AMD RX 570 8GB. Blender stopped crashing on startup, but the screen kept freezing, going black, then returning to normal, with an error saying the display driver had stopped working.

I did the format and reinstalled Windows 10. The black-screen freezes returned once the AMD card was installed. I used DDU and put my Nvidia card back in. The black-screen crashes stopped straight away, but now Blender, UE4 and Photoshop crash on startup. If I go into Device Manager and disable the Nvidia driver, Photoshop opens and works fine (apart from the 3D settings). As soon as I enable the driver again, it crashes the program straight away.

I've had 2 people look at it who can't say what the issue is. I really need this sorted as I am losing money while it isn't working, and I would appreciate any advice you can give.

Specs:
Windows 10 64-bit
Gigabyte B450M DS3H motherboard
Ryzen 7 1800X CPU
32GB Corsair Vengeance DDR4 RAM
Nvidia GTX 1060 6GB graphics card
600W Thermaltake TR2 PSU
240GB Samsung M.2 SSD
1TB SATA HDD
Dual HDMI monitor setup
No overclocking on anything
 
I took it to a computer firm, who ran stress tests on everything, and they said it was a faulty RAM module I had in.
Is that really all the information you got?

I've had 2 people look at it who can't say what the issue is.
Because these kinds of problems are very elusive by nature.

Based on the information you provided, it could be any of:
  • Faulty RAM
  • Faulty motherboard
  • Faulty PSU
  • Or maybe both GPUs are faulty.
I advise you to run the following tests:
  • Memtest86+ (I advise using a Linux distro that has Memtest86+ as a boot option) for several hours, to see if it finds errors in the RAM.
  • OCCT: run a stress test and post a screenshot of the graphs. That should reveal heat or voltage problems.
 
I have run the tests and here are the results:

https://drive.google.com/open?id=1lj8xlmFMIMwCBviIoX2mW6lZusWGKUI7

Memtest86+ ran for just under 6 hours and didn't find any issues.

I ran the OCCT test for 30 minutes and for 1 hour, neither of which found any errors.

I ran the OCCT Memtest, and after 3 minutes it started running into millions of detected errors. My RAM usage went to its max, the screens went black and I had to reset. It put out a crash report, which is in the Google Drive folder.
 
Can't see anything obvious, I'm afraid. But there is something that puzzles me: approx. 10 minutes into "OCCT 1Hr" there is a spike in the GPU voltage readings. Any thoughts about that? Is the computer set to go to sleep, or into a power-saving mode or similar, after about 10 minutes?

Another thing: I was expecting to see 12V readings as well.

The only thing worth mentioning is that OCCT reports an error on GPU testing. Together with the odd voltage reading, that would put the GPU in a bad spot (too bad the +12V readings were missing).
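
Once the +12V readings are available, a quick way to sanity-check them is something like the Python sketch below. It assumes the readings have already been pulled out of the monitoring tool as plain numbers; the function name and the sample values are made up for illustration. The ±5% band is the tolerance the ATX spec allows on the +12V, +5V and +3.3V rails.

```python
# +/-5% is the tolerance the ATX spec allows on the +12V, +5V and +3.3V rails.
ATX_TOLERANCE = 0.05


def out_of_tolerance(samples, nominal, tolerance=ATX_TOLERANCE):
    """Return (index, value) pairs for samples outside nominal +/- tolerance."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return [(i, v) for i, v in enumerate(samples) if not (low <= v <= high)]


if __name__ == "__main__":
    # Hypothetical +12V readings, one per logging interval (not real data).
    readings_12v = [12.10, 12.05, 11.98, 11.20, 12.02, 12.70]
    for index, value in out_of_tolerance(readings_12v, 12.0):
        print(f"sample {index}: {value:.2f} V is outside 11.40-12.60 V")
```

This only covers the PSU rails. The GPU core voltage changes with load by design, so a fixed band like this does not apply to it.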
 
No power saving or sleep modes at all
 
Someone advised me to try RAM of a different speed. I ordered and tried 3000MHz (instead of the 2400MHz that's in there now) and it's working. Blender, Adobe and Unreal Editor are all working fine.

Thanks for your help and suggestions
 
Solution
If DDU + a vanilla driver install did not help (you mentioned you even reinstalled the whole OS),
I would try running another video memory test off this list. It might be a fried/bad graphics card memory IC. These can be replaced with enough skill and effort (by a technician only, though).
This is worth mentioning.

Edit1: Post a close-up picture of the back of the card and the full product ID/model; I have a 1060 card for parts.
Edit2:
The most common causes for these cards to fail are:
  1. VRM failure (sometimes sparky and final for the board): crashes under/after prolonged heavy load, caused by bad VRM cooling or overclocking
  2. Core failure: requires the core to be replaced (BGA), a pricey part and lots of labor. One crash, sometimes followed by VRM failure and sparks, or it can happen if the chip gets dented
  3. Graphics memory failure: requires the memory ICs to be replaced (BGA). Prominent artifacts and crashes, up to a complete failure. Usually caused by a memory overclock, the card running hot for a long time, static electricity, or poor IC quality
Most GPUs have protective features against a faulty PSU, btw.

Have you experienced any problems related to the errors found using OCCT Memtest? And is it still detecting errors? I'm asking because I also experience crashes, but only in one game, though there are weird texture glitches in some other games. I ran the OCCT Memtest and got thousands of errors too.