Question Kernel 41 errors with 4090 when it hits ~65 degrees

Nov 7, 2022
10
1
15
0
Looking for some help/advice with a new build. Recently my system rebooted 3 times while playing MWII, each time logging a Kernel-Power 41 error. My card was running fine and I was getting 250+ FPS at 1440p ultrawide with uncapped frames. I noticed the card idles at 36 and would go to ~45 in game, but after 10 mins or so it started to creep up to 50, 52, 55, 57, 58, 60, 63, 65 etc. At around this temperature my system just rebooted with a Kernel-Power 41.

I thought maybe it was linked to Nvidia saying the latest game ready driver is unstable and to go back to an older one, so I rolled back my driver to the previous one. Went back into game and the exact same thing happened. As far as I can tell this error means something in the PC couldn't get enough power and so the system shut down.

I've since capped my fps at 140 and I haven't had a single crash yet, GPU temp hasn't gone above 57 degrees so I guess I have a few questions:

1.) Can capping FPS so the GPU underperforms save power draw? Less frames, less work, less heat, less electricity consumed? The card should not be encountering issues at 65 degrees, but since forcing it not to hit that the system has been stable.
2.) Would plugging in the PD_12V_PWR cable on my mobo help with the problem? Reading the description for what it actually does leaves me very confused, as far as I can tell it gives extra power to the PCI slots but is only relevant if you're using a second GPU/additional PCIe cards? Or would it assist the sole PCIe slot by giving the 4090 more power?

Specs
TUF OC 4090
13900k + NZXT Z73
ROG Z690 Maximus Extreme
Gigabyte UD1000GM PG5 1000W ATX3 PSU
10 fans total (QL120s), 3x AIO exhaust top mounted in push config through RAD, 1x exhaust back, 3x intake side and 3x intake bottom.

Nothing has been OC'd beyond how it arrived. I know unseated RAM modules can cause this, but mine are seated correctly and I've since gamed for about ~100 hours with capped frames having the GPU run at 55-57 degrees without any restart whatsoever.

I would really like to resolve this because I want to upgrade to a 4k monitor which I imagine is going to push my temps back into the mid 60s even with capped frames where I may see the issue start again.
 
Nov 7, 2022
Welcome to the forums, newcomer!

Can you see if a higher-wattage PSU from another brand changes what you're experiencing? Higher wattage as in 1.2 kW or higher. The reason I'm suggesting a bigger PSU is this:
View: https://youtu.be/nZcyhcPVxUM?t=196
Thank you for the reply, this video makes a lot of sense. At the time I built the PC I was only aware of 2 ATX3 PSUs being available, one from MSI and one from Gigabyte, both 1000W, and I thought Nvidia themselves had said 850W+ would work, which is why I purchased the one I have. So I'm baffled as to why they announced that if the system requires 1200W to avoid restarts.
 

Lutfij

Titan
Moderator
A lot of other brands actually changed their GPUs' power requirements to 1000W, can't remember which ones, but it's a mess out there with the connector, the GPU's power draw, the PSUs and worst of all the amount of resources spent on something that didn't need to exist (the 12-pin connector).
 
Nov 7, 2022
A lot of other brands actually changed their GPUs' power requirements to 1000W, can't remember which ones, but it's a mess out there with the connector, the GPU's power draw, the PSUs and worst of all the amount of resources spent on something that didn't need to exist (the 12-pin connector).
Sweet, I can't find anything higher than 1000W with a native 12VHPWR socket in Canada but the MSI MEG Ai1300P launches in 5 days so I'll wait for that and give it a try.
 

Lutfij

Titan
Moderator
You can't source anything from Seasonic or Corsair, huh? Asus seem to also bundle a 12VHPWR cable with their high end units(at the counter during checkout, last I was informed across the forums by one user).
 

DRagor

Illustrious
As far as I can tell this error means something in the PC couldn't get enough power and so the system shut down.
Nope. All it says is PC hit unexpected shutdown. The reason for this shutdown could be absolutely anything.
Can capping FPS so the GPU underperforms save power draw?
Yes
Less frames, less work, less heat, less electricity consumed?
Exactly
The card should not be encountering issues at 65 degrees
Agree, there is something fishy there.
Would plugging in the PD_12V_PWR cable on my mobo help with the problem?
Unlikely. 4090 draws all the power from PCIe cable, unless it is OCed so much that total power draw exceeds 600W.
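To expand on the first answer above: Event ID 41 (source Kernel-Power) only records that Windows did not shut down cleanly. One of the few hints it carries is its `BugcheckCode` field. A value of 0 means no bugcheck was recorded before the reset, which fits an abrupt power cut or hard reset but does not say which component caused it. Below is a rough sketch for pulling that field out of the newest event, assuming a Windows machine where `wevtutil` is on the path. The helper names are made up for illustration.

```python
import subprocess
import xml.etree.ElementTree as ET

# XPath filter for Kernel-Power Event ID 41 in the System log.
QUERY = ("*[System[Provider[@Name='Microsoft-Windows-Kernel-Power']"
         " and (EventID=41)]]")


def read_bugcheck_code(event_xml: str) -> int:
    """Return the BugcheckCode value from one Event ID 41 XML record."""
    root = ET.fromstring(event_xml)
    for element in root.iter():
        # Tags are namespaced, e.g. "{http://...}Data", so compare the suffix.
        is_data = element.tag.split("}")[-1] == "Data"
        if is_data and element.get("Name") == "BugcheckCode":
            return int(element.text or "0")
    raise ValueError("no BugcheckCode field in this event")


def describe(code: int) -> str:
    """Very coarse reading of the code; it never names the faulty part."""
    if code == 0:
        return "no bugcheck recorded (abrupt reset or power loss, cause unknown)"
    return f"Windows bugchecked first, stop code 0x{code:08X}"


def latest_event41_xml() -> str:
    """Windows only: fetch the newest matching event as XML text."""
    result = subprocess.run(
        ["wevtutil", "qe", "System", f"/q:{QUERY}", "/c:1", "/rd:true", "/f:xml"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the affected PC, `print(describe(read_bugcheck_code(latest_event41_xml())))` would print the reading for the most recent event. A non-zero code points more towards a driver or hardware fault that Windows caught, while 0 leaves power, thermal protection and hard hangs all on the table.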

Since we're talking about a 4090, the first question should be: did you already check whether your PCIe power cable is melting?
Next, it does not seem to me like a power issue, more like a temp issue. The fact that you can play the game for quite a bit of time without problems and it only shuts down consistently after reaching a certain temp is not what I would expect from a power issue. Now, it is very true that a 4090 should not shut itself down at a mere 65 C - unless there is something on your unit that actually hits a much higher temp (say, VRAM for example). I would be worried that your GPU might actually be defective. That said, I 100% agree with Lutfij that your setup should be running a 1200W PSU. But I would pass on that MSI unit - there was already one report of this very PSU melting a PCIe cable with a 4090. Too early to say it's a bad model, but better safe than sorry.
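To tell a temperature trigger from a power trigger it helps to have both readings on disk right up to the reset. Here is a minimal sketch that logs `nvidia-smi` output once per second, assuming the Nvidia driver's `nvidia-smi` tool is on the path. The file name and function names are placeholders. Each line is flushed and synced so the last samples survive a hard reboot.

```python
import os
import subprocess

# Query fields understood by nvidia-smi --query-gpu.
FIELDS = ["timestamp", "power.draw", "temperature.gpu",
          "utilization.gpu", "fan.speed", "clocks.sm"]


def parse_sample(line: str) -> dict:
    """Turn one CSV line into a dict; unreadable values become None."""
    values = [part.strip() for part in line.split(",")]
    sample = {"timestamp": values[0]}
    for name, raw in zip(FIELDS[1:], values[1:]):
        try:
            sample[name] = float(raw)
        except ValueError:
            sample[name] = None  # e.g. "[N/A]" when a sensor is not exposed
    return sample


def log_forever(path: str = "gpu_log.csv", interval_s: int = 1) -> None:
    """Append one sample per interval until interrupted or the PC resets."""
    cmd = ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}",
           "--format=csv,noheader,nounits", "-l", str(interval_s)]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc, \
            open(path, "a") as out:
        for line in proc.stdout:
            out.write(line)
            out.flush()
            os.fsync(out.fileno())  # push the line to disk before the next one
```

Note that `temperature.gpu` is the core sensor only. A hotter spot elsewhere on the card, such as the VRAM mentioned above, would still need a tool like GPU-Z or HWiNFO to show up.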
 
Nov 7, 2022
You can't source anything from Seasonic or Corsair, huh? Asus seem to also bundle a 12VHPWR cable with their high end units (at the counter during checkout, last I was informed across the forums by one user).
Not with native 12VHPWR at the PSU end, unless there's a model I'm unaware of. All the ones I've found higher than 1000W either have no release date yet (the ASUS ROG Loki for example) or are out of stock (Thermaltake GF3). I haven't seen an ATX3 unit from Corsair yet, and given the cable melting reports I don't want to go ATX2 and use an adapter, just to be safe.

Since we're talking about a 4090, the first question should be: did you already check whether your PCIe power cable is melting?
Next, it does not seem to me like a power issue, more like a temp issue. The fact that you can play the game for quite a bit of time without problems and it only shuts down consistently after reaching a certain temp is not what I would expect from a power issue. Now, it is very true that a 4090 should not shut itself down at a mere 65 C - unless there is something on your unit that actually hits a much higher temp (say, VRAM for example). I would be worried that your GPU might actually be defective. That said, I 100% agree with Lutfij that your setup should be running a 1200W PSU. But I would pass on that MSI unit - there was already one report of this very PSU melting a PCIe cable with a 4090. Too early to say it's a bad model, but better safe than sorry.
Yup checked the cable at both ends and clear - but since I am capping FPS and keeping the temps below 60 I'd imagine that there's not enough power being drawn by the GPU anyway that would start melting the cable.

With regards to the temp issue not power issue, is it not a fair statement that it could be power related because the issue doesn't occur until the temps rise? Given you agreed that limiting FPS and limiting heat means less power is drawn, it may stand to reason that when I had uncapped FPS the temps rising is the catalyst for the increased power draw and the PSU then failing when that wattage is demanded? I'm not an expert here so just thinking out loud.

What PSU would you recommend? I'm not seeing many options even available at 1200W ATX3 without going for an older PSU that requires the sketchy adapter. All I've seen so far is Thermaltake's GF3, the ASUS ROG Loki with no release date and the MEG Ai1300P.
 

DRagor

Illustrious
Given you agreed that limiting FPS and limiting heat means less power is drawn, it may stand to reason that when I had uncapped FPS the temps rising is the catalyst for the increased power draw and the PSU then failing when that wattage is demanded?
When you uncap FPS increased power draw kicks in immediately, not after temps rise. So basically if there is power issue it should occur as soon as you hit the part of the game that loads GPU heavily, not after some (extended) time.
What PSU would you recommend?
Well, that's the problem with ATX 3 PSUs: there are not too many of them, and even fewer have been properly reviewed by trusted sites, so it's hard to recommend any. The solution for now seems to be to use well-known quality ATX 2 units from Corsair or Seasonic with the new PCIe 5.0 cables they sell separately.
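The timing argument above can be checked against a sensor log: if power was already at its peak long before the reset while only the temperature kept climbing, sustained power alone is an unlikely trigger. This is a small sketch under that assumption. The `power_w` and `temp_c` keys are hypothetical names for whatever the logging tool exports.

```python
from statistics import mean


def compare_crash_window(samples, window=5):
    """Compare the last `window` samples before a reset with everything earlier.

    `samples` is a time-ordered list of dicts with the hypothetical keys
    "power_w" and "temp_c"; the final entry is the last reading before the reset.
    """
    if len(samples) <= window:
        raise ValueError("need more samples than the window size")
    earlier, last = samples[:-window], samples[-window:]
    return {
        "peak_power_earlier_w": max(s["power_w"] for s in earlier),
        "mean_power_before_reset_w": mean(s["power_w"] for s in last),
        "temp_rise_c": samples[-1]["temp_c"] - samples[0]["temp_c"],
    }


def power_looks_flat(report, tolerance_w=20.0):
    """True if power just before the reset was no higher than earlier peaks."""
    limit = report["peak_power_earlier_w"] + tolerance_w
    return report["mean_power_before_reset_w"] <= limit
```

One caveat: a log sampled once per second cannot see millisecond-scale spikes, so a flat result here argues against steady overload but does not clear the PSU of transient problems.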
 
Nov 7, 2022
When you uncap FPS increased power draw kicks in immediately, not after temps rise. So basically if there is power issue it should occur as soon as you hit the part of the game that loads GPU heavily, not after some (extended) time.

Well, that's the problem with ATX 3 PSUs: there are not too many of them, and even fewer have been properly reviewed by trusted sites, so it's hard to recommend any. The solution for now seems to be to use well-known quality ATX 2 units from Corsair or Seasonic with the new PCIe 5.0 cables they sell separately.
Ah, got it - that makes sense. I think for now I'll just hold off and stay at 1440p with limited frames, where my system is stable, until Nvidia comments on the cable fiasco before I drop $$ on another PSU. I thought about putting my 3080 back in and stress testing the PSU to see if it fails with a different card, to maybe eliminate that, but I'm not sure it will draw enough wattage. I could also take the card off the Lian Li PCIe 4 vertical riser and mount it directly to the motherboard, but that puts my cable at risk of bending and then getting the melting cable issue. I'll wait for Nvidia first as step 1 and go from there. Thanks for all the replies.
 
Reactions: DRagor
Nov 7, 2022
I was just about to say what DRagor said above. Corsair and Seasonic have a cable like this:
https://www.corsair.com/us/en/Categories/Products/Accessories-|-Parts/PC-Components/Power-Supplies/600W-PCIe-5-0-12VHPWR-Type-4-PSU-Power-Cable/p/CP-8920284
that pairs with a compatible Corsair PSU.
When you uncap FPS increased power draw kicks in immediately, not after temps rise. So basically if there is power issue it should occur as soon as you hit the part of the game that loads GPU heavily, not after some (extended) time.

Well, that's the problem with ATX 3 PSUs: there are not too many of them, and even fewer have been properly reviewed by trusted sites, so it's hard to recommend any. The solution for now seems to be to use well-known quality ATX 2 units from Corsair or Seasonic with the new PCIe 5.0 cables they sell separately.
Sorry to drag this up again but after a week of it not happening it just restarted. I managed to capture the activity via GPUZ which I've attached on the link below but I'm not really able to interpret it. To the untrained eye it looks like at this moment my system was using 100% of the 1000W my PSU can provide and at that point the GPU fans were only spinning at 56%?

View: https://i.imgur.com/YTU8dJz.png


Does this readout confirm to the trained eye that the issue is power related? I could set a custom fan curve to have the fans running at 100% so it cools the card down a bit, but if the readout is confirming power issues then that's a different matter. MWII was running at 4K with frames locked to 140. I hadn't had any issues for a week since installing Nvidia's latest driver and hotfix, and then tonight I got 2 restarts after about ~5 hours of gaming, with the 2nd restart 20 mins after the first.

Trying a custom fan curve and limiting the power to 85% in Afterburner.
 
Nov 7, 2022
Crashing occurred when setting power limit even as low as 80%. Not quite sure how to diagnose what is causing this now.
 

DRagor

Illustrious
I managed to capture the activity via GPUZ which I've attached on the link below but I'm not really able to interpret it. To the untrained eye it looks like at this moment my system was using 100% of the 1000W my PSU can provide and at that point the GPU fans were only spinning at 56%?
There is absolutely nothing looking suspicious in those numbers. Everything as I would expect it to be. Thus not pointing to any abnormal behavior.
It looks like temp wise your system is good, both on CPU (as expected with big AIO) and GPU (we know 4090 are much better in this regard compared to 3000 series). I don't see any reason to suspect anything wrong here.
This could be a software problem - the fact that it ran fine for a week could hint at that. Otherwise, it can be a power problem (transient spikes hitting the PSU limit perhaps) or even something not right with the motherboard. With cases like this it is really hard to tell for sure, especially considering how random the failures seem to be.
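For a rough feel of how transient spikes could reach the PSU limit, here is a back-of-the-envelope budget. The 450 W GPU figure is the reference RTX 4090 board power (the TUF OC limit may differ) and 253 W is the 13900K's maximum turbo power. The 100 W for the rest of the system and the GPU spike factors are pure assumptions for illustration, not measurements from this build.

```python
PSU_RATED_W = 1000       # Gigabyte UD1000GM PG5, from the spec list
GPU_SUSTAINED_W = 450    # reference RTX 4090 board power; TUF OC may differ
CPU_SUSTAINED_W = 253    # 13900K maximum turbo power
REST_OF_SYSTEM_W = 100   # assumption: fans, pump, drives, RAM, motherboard


def sustained_load_w():
    """Worst-case steady draw with CPU and GPU both at their limits."""
    return GPU_SUSTAINED_W + CPU_SUSTAINED_W + REST_OF_SYSTEM_W


def transient_load_w(gpu_spike_factor):
    """Total draw if only the GPU briefly spikes to factor x its sustained power."""
    return GPU_SUSTAINED_W * gpu_spike_factor + CPU_SUSTAINED_W + REST_OF_SYSTEM_W


# Illustrative spike factors only; real spikes depend on the card and workload.
for factor in (1.0, 1.5, 2.0):
    total = transient_load_w(factor)
    print(f"GPU x{factor}: {total:.0f} W, {total / PSU_RATED_W:.0%} of PSU rating")
```

With these numbers the steady load sits around 80% of the rating, but any assumed GPU spike of 1.5x or more pushes the total past 1000 W for that instant. An ATX 3.0 unit is supposed to tolerate brief excursions above its rating, so whether this particular PSU rides through them is exactly what swapping in another unit would test.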
 

DSzymborski

Titan
Moderator
Honestly, I'd kinda be suspicious about the PSU. Gigabyte is still using MEIC, a company with little experience making consumer PSUs at this level, and given some of the other problems with their PSUs (fires, cheap parts), it would not surprise me if this PSU is having an issue related to design and the extreme power needs, especially transient loads, of this GPU. I'd at least want to test another PSU.
 
Nov 7, 2022
There is absolutely nothing looking suspicious in those numbers. Everything as I would expect it to be. Thus not pointing to any abnormal behavior.
It looks like temp wise your system is good, both on CPU (as expected with big AIO) and GPU (we know 4090 are much better in this regard compared to 3000 series). I don't see any reason to suspect anything wrong here.
This could be a software problem - the fact that it ran fine for a week could hint at that. Otherwise, it can be a power problem (transient spikes hitting the PSU limit perhaps) or even something not right with the motherboard. With cases like this it is really hard to tell for sure, especially considering how random the failures seem to be.
On a hunch I removed the Lian Li PCIe 4 vertical riser I was using and mounted the card horizontally, and it then ran for ~5 hours without a single crash. That's not telling me much right now, though, because it previously ran for a week without an issue and then all of a sudden crashed 3 times in the space of an hour, so I'll have to keep it horizontal for a while to tell whether the riser was the problem. Annoyingly, I now can't close my case because the cable is in the way until CableMod's 90 degree adapter comes out. Others are using this riser without issue so I didn't think it would be that, but it felt like the simplest thing to test.
 
Nov 7, 2022
Honestly, I'd kinda be suspicious about the PSU. Gigabyte is still using MEIC, a company with little experience making consumer PSUs at this level, and given some of the other problems with their PSUs (fires, cheap parts), it would not surprise me if this PSU is having an issue related to design and the extreme power needs, especially transient loads, of this GPU. I'd at least want to test another PSU.
Yeah, I have a Rosewill 1200W that Newegg sent me with the 4090, but I was hesitant to a.) use their own-brand product and b.) use the shoddy Nvidia adapter that I'd be forced to use, given that the Corsair cable etc. is out of stock everywhere.
 
Nov 7, 2022
Yep, definitely test that. If you had told us about the riser, running without it would have been the first recommendation.
I mentioned it in an earlier post you liked ;) but yeah it seemed like the easiest thing to rule out first of all. Fingers crossed the issue doesn't occur again and I'll just put the side back on once the 90 degree adapter is released.
 
