Question Hard crashes - screen frozen, 0 response, high VBAT temperature

Larry.F

Prominent
Apr 1, 2023
Hello!

In the past 3 days, I have had several of these hard crashes. First it was just one, then another single one, then multiple back to back, all without warning.
They happen both under a bit of load and ~30 seconds after PC startup with nothing going on. I had 3 BSODs, all 3 of different types (KMODE_EXCEPTION_NOT_HANDLED, IRQL_NOT_LESS_OR_EQUAL and one other I forgot).
I have reinstalled my GPU driver 20 times by now: through NVIDIA GeForce Experience, with the newest version from the official website, and with the December release. I also used DDU twice for a clean reinstall. I ran the memory diagnostics tool and no errors were found. The PC can run perfectly fine under load for hours upon hours, and then crash while I just have Discord open.
I have no GPU acceleration enabled in any programs. Windows self-diagnostics say I am fine.

I took apart my entire PC, blew out all the dust, checked the cables, and reseated the RAM sticks, SSD, etc.
Now, using Open Hardware Monitor, I get a VBAT temperature peaking at 120°C (it jumps from 28 to 60 to 100+ in seconds, and the battery itself is not actually hot to the touch).
Is that normal? It's running at 3.168V, and everything else in the voltage section that has its temperature measured is below 40°C.
I have never used this tool before, so I have no clue how it was beforehand.
[Screenshot: NoRqVDQ.png]

In Reliability Monitor and Event Viewer I can't see anything that would catch my untrained eye as out of the ordinary. Reliability Monitor in fact just tells me there was an unexpected shutdown, as if I didn't know and hadn't initiated it myself.

Is there any way I can see if I am having issues with the hardware, software, drivers or anything else that is causing this?
For context I did not update anything recently, did not change any of the parts or anything similar. GPU is 3 years old, other stuff 2. Never had an issue like this before.
As of posting it has been (after reassembly) working perfectly fine for 45 minutes now.

Specs:
Asus PRIME Z690-P
MSI 3060 Ventus 2X
i5 13600K
G.Skill 32GB 5600J2834F16G
750W Corsair PSU, has worked perfectly fine since I got it
Win 10 22H2
 
Temperature 5 is 115 degrees. Perhaps someone can chime in here with the importance of this.
 
Can't speak to the temps; I know some board components can get *really* hot, but those temps sound excessive.

In the meantime, a simple thing you can do is run a memory test using memtest86 (or heck, even the built-in Windows one does a decent enough job, heresy, I know). Never hurts to do one when you start seeing BSODs, since it's fairly definitive (any errors = replace the RAM).
 
I did run it, no errors.

As a little update: I had more crashes, even with no GPU driver installed.
I tried reinstalling Windows, but my PC hard crashed at 1% of the installation, before it had even booted into Windows.

As for the temperature: as far as I can tell, Temperature 5 is supposed to be VBAT, which is the small CR2032 battery. I suspect it might be a bad sensor, because it keeps jumping from 25 to 120°C and back down in an instant. I would think it is physically impossible to have that sort of fluctuation inside a PC.
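
One rough way to back up the "bad sensor" suspicion is to log the readings and flag any step between consecutive samples that is too large to be physically plausible. The sketch below is only an illustration: the function name, the 10°C-per-sample threshold (assuming roughly one sample per second), and the sample values are all assumptions, not data exported from the monitoring tool.

```python
from typing import List, Tuple

# Assumed threshold: a component with real thermal mass is unlikely to change
# by more than this many °C between samples taken about one second apart.
MAX_PLAUSIBLE_STEP_C = 10.0


def implausible_jumps(
    samples: List[float], max_step: float = MAX_PLAUSIBLE_STEP_C
) -> List[Tuple[int, float, float]]:
    """Return (index, previous, current) for every consecutive pair of
    readings whose absolute difference exceeds max_step."""
    jumps = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if abs(cur - prev) > max_step:
            jumps.append((i, prev, cur))
    return jumps


# Hypothetical readings shaped like the ones described above (not real log data).
temp5 = [28.0, 60.0, 105.0, 120.0, 25.0, 27.0]
# Hypothetical readings from a sensor that behaves normally.
steady = [36.0, 37.0, 38.5, 38.0, 37.5, 37.0]

print(implausible_jumps(temp5))   # several flagged steps
print(implausible_jumps(steady))  # nothing flagged
```

If nearly every step of one sensor is flagged while the others stay steady, a misreported or unconnected sensor is a more likely explanation than real heat, though that alone does not rule out a motherboard fault.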

I suspect it might actually be the motherboard that is having issues.