Is my PSU dying? Advice wanted over random shutdowns.

Jul 24, 2018
Parts:
CPU: Intel - Core i7-6700K 4GHz Quad-Core
CPU Cooler: Cooler Master - Hyper 212 EVO 82.9 CFM
MB: MSI - Z270 SLI PLUS ATX LGA1151
RAM: Corsair - Vengeance LED 16GB DDR4-3000
Storage: Samsung - 850 EVO-Series 500GB SSD
GPU: Asus - GeForce GTX 1080 8GB STRIX
Case: Phanteks Enthoo Pro M Tempered Glass
PSU: EVGA - SuperNOVA G3 650W 80+ Gold Full Mod
OS: Windows 10
Three Case Fans

PCPARTPICKER List - https://pcpartpicker.com/user/Mitcha47/saved/k8Kzyc


I've had this tower for about 9 months. Over the past 2 weeks I have experienced blackouts: it just shuts straight down, as if the power had been cut. I didn't pay much attention at first because I assumed it was caused by the storm I was playing in. Well, after that day it kept happening.

Two situations occur.

1. Computer shuts down and boots right back up, going through the whole startup process.

2. Computer shuts down, tries to boot for a second, then trips back off again. It repeats this on-and-off cycle into perpetuity, I assume (I stop it as quickly as possible). While doing this cycle of horror it makes one loud click per boot attempt.

This problem occurs randomly, but if I boot up Far Cry 3 it will trip in 15 minutes or less. Lighter tasks like browsing or Netflix don't seem to trip it as often, although it still happens pretty much once a day.

Event Viewer shows Event ID 41 (Kernel-Power) with a task category of 63, which means Windows restarted without shutting down cleanly first.
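For anyone who wants to see how often these events are logged, here is a minimal Python sketch that asks Windows' built-in `wevtutil` tool for the newest System-log entries with ID 41. The function names are made up for illustration, and the query filters on the event ID only, so it may also return ID 41 events from providers other than Kernel-Power.

```python
import platform
import subprocess


def build_event_query(event_id=41, max_events=5):
    """Build a wevtutil command listing the newest System-log events with this ID."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event ID
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # read newest events first
        "/f:text",            # human-readable text instead of XML
    ]


def recent_events(event_id=41, max_events=5):
    """Run the query and return wevtutil's text output (Windows only)."""
    if platform.system() != "Windows":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(
        build_event_query(event_id, max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__" and platform.system() == "Windows":
    print(recent_events())
```

Comparing the timestamps in the output against when a game or stress test was running can show whether the shutdowns really track GPU load.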

The largest change in the past 9 months has been an upgrade in display, from a 20-inch monitor to a 55-inch TCL TV. I don't see this as much of a problem with my current graphics card, though.

From the info I have gathered from the internet I believe it's a bad PSU, and I have ordered a replacement. I just really want a second opinion in case I am missing something.

I'll try posting HWMonitor stats.

HWMonitor stats

 
There are a few possibilities. One is that the power supply was damaged in the storm. A second is that the power supply is detecting a fault within the system and shutting down. The second seems more likely, as it would explain the intermittent nature. For example, a loose screw rolling around under the motherboard could occasionally short the motherboard to the case.

Here are two general troubleshooting checklists.

http://www.tomshardware.com/faq/id-1893016/post-system-boot-video-output-troubleshooting-checklist.html

http://www.tomshardware.com/faq/id-2041564/troubleshoot-boot-display-issue.html
 


I wasn't sure how to stress the CPU, so I just did a system scan with Bitdefender. You'd think an anti-virus wouldn't take up so much CPU, but it does, oh it really does. So I'll take that as my unofficial CPU test. Passed.

I did test my GPU with FurMark. It ran spectacularly until I put it to full screen, so 55-inch 4K size. This produced a shutdown after 10 minutes. I suspect that the PSU is overheating, or that some fault is keeping it from delivering the voltage my massive GPU needs.

From the rough power diagram I made:

Outlet -> PSU -> MB -> Components.

It's frustrating not to know which one is crap.

It seems the outlet is fine, because the router and TV stay on when the tower shuts down.

Even the linked photos of HWMonitor show the right voltages on the MB.

Le Sigh.
 
I would run AIDA64 as well, just to be safe. As others have suggested, try looking around for a loose screw in your case, as that may be causing a short. Alternatively, you could test the system outside of the case to see whether the issue persists. If it does, a short to the case can be ruled out.
 
The 650 watt EVGA G3 is a great power supply. It should be more than enough power for the GTX 1080 (the recommended system power is 500 watts).

An anti-virus scan can push CPU usage to 100%.

What were the temperatures when you were running FurMark?
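As a rough sanity check on that headroom, here is a small Python sketch of the arithmetic. The CPU figure is Intel's rated TDP for the i7-6700K and the GPU figure is NVIDIA's reference board power for the GTX 1080; a factory-overclocked STRIX card may draw more, and the allowance for everything else is purely an assumed number, not a measurement.

```python
# Rough power-budget sketch. All wattages are approximate; the
# "everything_else" allowance is an assumption, not a measurement.
PSU_WATTS = 650  # EVGA SuperNOVA G3 650W, from the parts list

estimated_draw_watts = {
    "cpu_i7_6700k": 91,     # Intel's rated TDP at stock clocks
    "gpu_gtx_1080": 180,    # NVIDIA reference board power; a STRIX card may draw more
    "everything_else": 75,  # assumed allowance for board, RAM, SSD and fans
}


def headroom(psu_watts, draws):
    """Return (total estimated draw, remaining headroom) in watts."""
    total = sum(draws.values())
    return total, psu_watts - total


total, spare = headroom(PSU_WATTS, estimated_draw_watts)
print(f"Estimated draw: {total} W, headroom: {spare} W")
```

Even with generous padding, the estimate stays well under 650 watts, which suggests that a healthy unit of this size is not simply undersized for the build.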

Try the same test on a regular computer monitor.
 


So I just ran FurMark at 1920 x 1800. GPU temp got to 64 degrees Celsius.
After about 4 minutes, crash.

It seems too dependable to be a short. Programs with high graphics utilization are very good at getting it to go into scenario #2, where it shuts down and then keeps trying to boot up but something slams it back down into the blackout zone.

I did update my original post with a few HWMonitor images that I thought would be relevant.

 
I've ordered a power supply; it should be here on Wednesday (tomorrow). I'll try swapping out the PSU and cables for the new ones and see what happens.

If it's not the PSU, would changing the MB be the next step?
 
If the problem continues after replacing the power supply, that opens up many possibilities. First, if the motherboard doesn't have an LED display for error codes, I would invest in either a case speaker or a PCI LED diagnostic board. That will give you a better idea of what is going on. For example, the fault may be in a memory controller on the motherboard, in which case the board would need replacing. Or it may report a CPU error.

LED Diagnostic card.
https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=100007621&IsNodeId=1&srchInDesc=diagnostic&bop=And&ActiveSearchResult=True&Order=BESTSELLING&PageSize=36

Case speaker.
http://www.newegg.com/Product/Product.aspx?Item=N82E16812201032&cm_re=case_speaker-_-12-201-032-_-Product
 


The error code LEDs on the MB cover diagnostics for:
CPU
DRAM
VGA
BOOT

None of them were lit. It was neat finding out about this capability on my MB though :)
The PC is not overclocked.

I should have the PSU installed after work though, T-Minus 24 Hours :)
 


That isn't what I was asking about. Most newer motherboards have those indicator LEDs. The LED display that I asked about is an actual alphanumeric LED display that shows specific error codes. Many mid-range to high-end motherboards have those displays.

But at least those LEDs didn't indicate an error state.
 
Alright, I ordered a set of tools. It comes with a diagnostic board and some other nifty tools that might help. It will be helpful in the future and on Thursday when it arrives, which unfortunately is a day later than the arrival of the PSU. Dear God let's hope it's the PSU.
 


Either way I will talk you through it.
 
I have all of the computer's internals laid out on cardboard, with a new Corsair RM750x as the PSU. At around 10 minutes of full-screen FurMark GPU stress testing on the 55-inch TV, the system shut down. It tried to POST 3 times, then succeeded and went through the boot sequence.

So it looks like it's not a PSU problem. It also doesn't seem to be a short against the case, as everything is out of the case.

I did notice however that the GPU smelled like warm metal. Not a burning smell though.

I am now going to try Memtest and see what happens.
 


Memtest ran 4 passes with 0 errors.

I took out my GPU and noticed that, of all the gold connector pins on the card, there is one at the veeeery end that is cut off a bit. Perhaps that metal flaked off and is in my first PCIe slot. So I moved the graphics card to another slot and am trying to get it to crash again by running FurMark. If it doesn't shut down, what should I do?
 
I'm not sure what pin you are talking about. Can you take a picture of it and post it?

If it is on a GPU video port, try another port. If it is on the PCIe power cable, use another PCIe power cable. If it is a contact in the motherboard's PCI Express x16 slot, use another slot. If it is in the graphics card's PCI board contacts, you can try to repair the contact (difficult to do).
 


Ah! "PCI board contacts" was the term I was looking for.

I took two photos, one for each side.
Side "A" shows damage on data lanes. Side "B" shows damage on power lanes.
Could this explain why it is fine under low loads but shuts down under high ones?
Doth you suggest RMA?

(I am also terrible at soldering. It's not for someone like me with these shaky 21-year-old hands.)
 


How old is the graphics card? There could certainly be a poor connection on those two contacts, although there may still be enough surface area to make contact.

You could say that it is a manufacturer defect, since there is no way to know when the contacts were damaged. But I would just cite the shutdown problems, and point out that you tried a new power supply if asked.