Question Failing PSU?

FishyTree

Honorable
Dec 28, 2016
17
0
10,510
So this just started happening tonight. (YAY!) Everything has been working for months, so I assume something is starting to die. It's possible I might just need a bigger PSU, but that's why I'm here. I do a lot of 3D modeling in Blender, and for whatever reason my PC is now crashing whenever I try to render anything or use material view. Sometimes it happens right away, and sometimes it works for a few minutes, but it inevitably crashes. It seems to be fine during normal operation like writing this. So I'm not sure if this is a power issue, or a GPU issue.

It's also possible I've simply gone past the PSU's limit and need an upgrade, because I've added quite a few things since I built it. The PSU is a Corsair RM750x.

Specs:
MSI X570 Gaming Plus
Ryzen 7 5800x
2x XFX Speedster RX-6600XT GPUs
Kingston Fury Beast 64GB RAM, 3600MHz
2x 8TB HDDs (WD and Seagate)
2x Samsung 870 SSDs (500GB, 1TB)
1x Samsung 870 (or 850?) M.2 drive.
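
For a rough sanity check on whether 750 W is enough for this parts list, the numbers can simply be added up. The sketch below is only an estimate: the CPU and GPU figures are the stock package power limit and reference board power as I understand them, and everything else is a labeled guess. Factory-overclocked cards and short transient spikes can pull more than this.

```python
# Rough peak-power estimate for the build listed above.
# Every wattage here is an assumed ballpark figure, not a measurement.
ASSUMED_PEAK_WATTS = {
    "Ryzen 7 5800X (stock package power limit)": 142,
    "RX 6600 XT #1 (reference board power)": 160,
    "RX 6600 XT #2 (reference board power)": 160,
    "Motherboard, RAM, fans (guess)": 60,
    "2x 3.5in HDD (guess, ~10 W each)": 20,
    "2x SATA SSD + 1x M.2 (guess, ~5 W each)": 15,
    "USB devices charging (guess)": 10,
}

PSU_RATED_WATTS = 750  # Corsair RM750x


def estimate_load(parts, psu_watts):
    """Return (total watts, headroom in watts, fraction of PSU rating used)."""
    total = sum(parts.values())
    headroom = psu_watts - total
    load_fraction = total / psu_watts
    return total, headroom, load_fraction


total, headroom, load = estimate_load(ASSUMED_PEAK_WATTS, PSU_RATED_WATTS)
print(f"Estimated peak draw: {total} W")
print(f"Headroom on a {PSU_RATED_WATTS} W unit: {headroom} W ({load:.0%} load)")
```

With these guesses the total is 567 W, roughly 76% of the PSU's rating. That suggests capacity alone is probably not the problem, although it says nothing about whether the unit itself is failing.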

Also... stupid question. Is it possible I'm right at the limit of the PSU and just have too much stuff plugged in? The only other X factor here is that I had my FPV goggles charging off the USB (I LITERALLY just noticed that as I was writing this). I'm scared to do anything at all now, though, because I don't want it to crash worse and corrupt all my files, and moving them all around is... well, there's a reason I have two 8TB HDDs. I have a LOT of files. So any advice on how to proceed with troubleshooting would be greatly appreciated. I might try again with the goggles unplugged and see if that changes anything, but I've heard a PSU failure can corrupt data, so I might just... not do that.

In the event log, I'm seeing Event ID 41, Kernel-Power: "The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."
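
To see how often this has happened, and when, the same entries can be pulled from the System log on the command line. Below is a minimal sketch that builds a `wevtutil` query for Event ID 41 and runs it only on Windows. The helper name `build_kernel_power_query` is made up for illustration, and the query filters on the event ID alone, so it assumes no other provider logs ID 41 to the System log. Keep in mind that Event 41 only records that the shutdown was unclean; it does not say what caused it.

```python
import platform
import subprocess


def build_kernel_power_query(max_events=5):
    """Build a wevtutil command listing recent Event ID 41 entries (hypothetical helper)."""
    xpath = "*[System[(EventID=41)]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event ID
        "/f:text",            # human-readable output instead of XML
        f"/c:{max_events}",   # cap the number of events returned
        "/rd:true",           # reverse direction: newest first
    ]


if __name__ == "__main__":
    cmd = build_kernel_power_query()
    if platform.system() == "Windows":
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)
    else:
        print("Not on Windows; the command would be:", " ".join(cmd))
```

Comparing the timestamps against when Blender was rendering would show whether every unclean shutdown lines up with GPU load.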

One thing I know it's not: dust in the fans or thermal throttling. I blew them clean not that long ago. They were pretty bad, and it never did this even then. It also happens so fast sometimes that there's no way it's a heat issue. OHM shows the GPUs at around 43°C and 38°C currently.

The computer also did a Windows update right before this started, one of those annoying updates that restarts itself. Could it be a driver/software issue, perhaps? I know my GPU drivers are up to date; I updated those about a week ago. When it crashes it doesn't restart or even fully shut off. It just goes to that illuminated but still black screen, like right before BIOS or when you first open a game, if that makes sense?

I've had it on, just sitting here for about an hour now and it's operating totally fine. No unnatural noises or anything. It also seems I can run Blender just fine, but only in the basic view. When I turn on render view/material view, or try to render the scene is when I have an issue.

So that's why I'm here. I'm not sure if this is a GPU issue, PSU issue, or even a software issue and don't know where to begin. I'd try more, but I really don't want to keep crashing the computer if I can help it.
 

Lutfij

Titan
Moderator
BIOS version for your motherboard at this moment of time?

Source (borrow, not buy) a higher-wattage PSU to power your entire rig with your dual-GPU setup. In fact, you could try working with just one discrete GPU and see if the issue persists. If it doesn't, then it's a power issue.
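
The single-GPU test is most informative if each card is tried on its own in the same slot. As a rough sketch of how the outcomes could be read (the function name and the wording of the conclusions are illustrative assumptions, and it presumes the dual-GPU setup does crash):

```python
def interpret_single_gpu_test(crash_with_a_only, crash_with_b_only):
    """Suggest the most likely suspect from two single-card test runs.

    Assumes the system crashes with both cards installed and that each
    card was tested alone in the same slot under the same workload.
    """
    if crash_with_a_only and crash_with_b_only:
        # The fault follows neither card, so look elsewhere.
        return "not card-specific: suspect PSU, drivers, RAM, or software"
    if crash_with_a_only:
        return "card A is the prime suspect"
    if crash_with_b_only:
        return "card B is the prime suspect"
    # Each card is stable alone; only the combination fails.
    return "neither card crashes alone: suspect combined power draw or the second slot"


print(interpret_single_gpu_test(crash_with_a_only=True, crash_with_b_only=False))
```

This is only a first-pass heuristic. A marginal PSU or an intermittent fault can blur the picture, so each configuration should be run long enough to be reasonably sure of the result.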

If you think it's a driver/software issue, use DDU to remove all GPU drivers from your platform, then manually reinstall the latest GPU drivers sourced from AMD's support site with elevated privileges, i.e., right-click the installer > Run as administrator.
 

FishyTree

"BIOS version for your motherboard at this moment of time?"

I have no idea what BIOS version, I assume the one it came with. I have to restart the computer to check that, right? I'm copying all my files over to an external drive first. Several TBs to move so it's taking a while..

"Source (borrow, not buy) a higher-wattage PSU to power your entire rig with your dual-GPU setup."

How do I do that? Is there a place that rents them for testing or something? I'm confused. If you mean from someone else, I don't really know anyone who'd have one. I'm the closest thing my circle has to a computer guy, lol.

"In fact, you could try working with just one discrete GPU and see if the issue persists. If it doesn't, then it's a power issue."
Wow, duh. Once the files are transferred I'll give this a try. So what if it does persist? Then it's a GPU issue, correct? I hope it's the PSU. That's a cheaper part to replace, lol.

"If you think it's a driver/software issue, use DDU to remove all GPU drivers from your platform, then manually reinstall the latest GPU drivers sourced from AMD's support site with elevated privileges, i.e., right-click the installer > Run as administrator."

IDK if it is or not. I was just spitballing. Even so, is there another way? The elevated prompt has never worked once for me. I follow the instructions to the letter and, plain and simple, nothing happens. No error, no nothing. I've never understood that.

Regardless, the machine has been on and running since last night. No shutdowns, but no attempts to run anything like Blender or games either.
 

FishyTree

OK, major update and likely very good news. It's almost certainly a software issue, not a hardware one. (I'm quite relieved, but not out of the woods yet.)

It seems as though there's nothing wrong at all with any of the hardware, as it only seems to do it on Blender, and not just when the GPU is in use.

I transferred all my most important files to an external drive, disconnected it, and fired up Blender with a scene I finished a while back (one I know works and has no issues), and the computer almost instantly restarted.

So I figured before I get the tools out I'd try one thing first, and that was to see if it was just Blender causing the crash. So I opened two other programs that use the GPUs.

I've had both Substance Painter and Character Creator 4 open for about 20 minutes now, both on screen with models loaded, and nothing. Everything is running fine. Temps are good, no smells, no noises. The fans aren't even spinning up like they do during a render. There's no lag with both open, either. So I'm going to go ahead and say the computer is fine, physically.

So from here we're looking into software solutions. This started after a forced Windows update. I assume driver updates are the best course of action from here?

I am going to try a different version of Blender first and see if the problem persists. If it doesn't, I may just have to reinstall my preferred build of Blender.
 

FishyTree

Welp. As soon as I posted that, I went to close Substance Painter and it crashed again. So it looks like it's back to any possibility.

Gonna try GPUs one at a time here. See if one of them is bad.
 

FishyTree

Alright, so I've tried taking the GPUs out one at a time and it appears that I may have a bad one. The card that was on the bottom is now in the top slot, with nothing in the bottom slot and seems to be working fine again.

One thing I noticed in my last attempt with just the now suspected bad card was when I was scrolling around the model, the screen started to twitch, flashed black, then was fine for a second. As soon as I started moving around it crashed again. My RAM is also a new addition, so I started to wonder if that was it.

I spent about five minutes in material view, throwing the controls around as drastically as I could to try to replicate it, but it won't. I've started rendering out the animation I was working on when this all started, and it's going again. Slower than before, but working nonetheless. It's been well over an hour now of constant Blendering and animation rendering.

It's not all that surprising to me that it's a dead GPU, assuming the problem doesn't happen again in the next few minutes like it did before. Given that it's nearly constantly rendering something, I'm frankly surprised it lasted this long.

In fact, I now suspect that card was going bad for a while and causing other issues that I'm only now connecting the dots on. I had strange but specific performance issues in Blender that I shouldn't have had, mostly texture distortions and failures, and they seem to have evaporated with the removal of that card.

Looks like it might just be time for an upgrade after all! I'll post back later tonight to either close this out or whatever. I can get an older 2080 Ti for about what I spent on these new, and they're still better cards, so that's the route I'm looking into right now.