Question Have Been Having Lots of BSODs and Hard Freezes on a New Computer, Finally Narrowed It Down, Need Help Solving It

Feb 12, 2021
60
0
30
  • 64 bit
  • New System started with Windows 10
  • Picked up the prebuilt from cyberpowerpc here

Bought November 30th 2020

It's been having some odd crash issues for a while now. I updated Windows 10 and the video drivers, and have checked for driver updates on other parts to no avail, though I may be missing something.

I've noticed a few blue screens saying WHEA Uncorrectable Error. I've also had the computer just freeze randomly, even while doing nothing (the mouse can't move, no audio comes through, ctrl+alt+del does nothing, and I have to manually power cycle the PC to get it working). The most common instance: I walk away from the computer while it's on, and when I get back the PC has power cycled on its own due to some problem (whether it's the blue screen, the freezing, or some unknown third issue, I don't know). These issues seem to happen indiscriminately, whether I'm actively using the PC (though that's uncommon) or barely anything is open and running. The PC isn't overclocked as far as I know, and the temps seem fine at around 40 °C, which I'm not complaining about considering I'm in a hot and humid environment.

They all share the same keywords, task category, event id, etc.


I've been having crashes pointing to ntoskrnl.exe for months now.

I have done memtest86, 8 passes, no failures. (Technically, two 4-passes back to back)

I've checked both my OS SSD and my HDD for disk errors - no problems.

I've run an sfc scan to check for windows corrupted files - none found.

The last step, running driver verifier, has crashed my pc before getting to windows 3 times in a row. Using BlueScreenView to look at the DMP file points all of them to RzDev_0226.sys.

Great, I have a culprit. But is there really no way to roll this driver back or update it? There are only two Razer devices plugged into my PC: the Huntsman Elite keyboard and a Firefly Cloth Edition mousepad. This is a brand new PC (purchased in December) with a 10850K and an RTX 3090, so I doubt it's anything due to having an old PC, and like I said, I ran those previous tests expecting a hardware issue, as this crash never occurred on my old PC.

So... what can I do? Beg Razer to update their drivers causing my PC to crash, or do I have to buy a different keyboard? Is there a quicker solution?

I'm legitimately desperate to get this fixed. I've been struggling with this problem for months. I've been searching for the file and have found no solutions.
 

Colif

Win 11 Master
Moderator
if you only have 2 on front and the AIO is 2 on back, wonder where the 3rd you got charged for is

Extra Case Fans: 3X Apevia Dual addressable digital RGB 120mm Fan 3Pins (Included)

I can tell there is a rear fan on the back of the AIO, I assume it's the CM one and not Apevia

If that's an actual ML pump and not a Lite, it should work with https://masterplus.coolermaster.com/ unless it's some weird CyberPower hybrid
 

Vic 40

Titan
Ambassador
Think the back fan on the cooler is the original Cooler Master one and the front fan we see is one of the Apevia fans. The two fans are certainly different. Need a picture? 😁

Might well mean that they do not work properly together, you know, fan speeds, pressure and such ...
 
Feb 12, 2021
60
0
30
Computer is now crashing after booting, in like 3 minutes. BSOD every time. I hardly have time to open this thread and try to read anything.

I tried turning XMP off and mem speed down to 2666 MHz, tried setting "Connectivity WiFi and Bluetooth" or something like that to disabled, and it was still crashing. Just reset to defaults - optimized (it's the only option).

CPU temps are at around 35C, so it isn't an overheating issue, so I think we're back to where we started.
 
Feb 12, 2021
60
0
30
Do a clear CMOS, after that reset to optimal default values in the BIOS and take that WiFi card out, see what that brings. For a clear CMOS, take it off the power, so pull the cable out of the wall socket.
Does this motherboard have a clear or reset CMOS button, or some pins with a jumper? I can't get the manual to load on my phone for some reason. I'd rather not have to reseat the battery if it can be avoided.

I already tried resetting to defaults completely, other than fan controls, and it is still crashing. While resetting the CMOS I'll at least try to find where the wifi card is plugged into and unplug it.

Also, I'll do this in like 30 minutes. It's storming at the moment and my phone is almost dead so I'm going to let it charge up.
 

Vic 40

Titan
Ambassador
And another picture,

oOXyUku.png


The number 12 pins (CLRTC), two of them, are the clear CMOS pins. You need to short them for about 5 sec with a flat screwdriver or something like that. Make sure to set optimal default settings after that and don't change anything besides that, maybe boot order but that is all. I want to see if you have issues with RAM at SPD speed, which should be 2133 MHz.
 
Feb 12, 2021
60
0
30
And another picture,

oOXyUku.png


The number 12 pins (CLRTC), two of them, are the clear CMOS pins. You need to short them for about 5 sec with a flat screwdriver or something like that. Make sure to set optimal default settings after that and don't change anything besides that, maybe boot order but that is all. I want to see if you have issues with RAM at SPD speed, which should be 2133 MHz.

Off memory, pretty sure the optimal default sets it to 3200 MHz

if he updated bios, it would have reset bios in the action?

It did indeed reset.
 

Colif

Win 11 Master
Moderator
it's hard to tell, the front fan on the AIO just has the CyberPower logo on it, the rear does look different.

i wonder if they are Apevia or CyberPower, from what I can tell there isn't a great difference.

an airflow fan on an AIO... well, it's better than nothing I guess?
sure explains the missing 3rd fan.

calls Gamers Nexus, need to look at cyberpower every week lol
 

Colif

Win 11 Master
Moderator
i know I asked already but can you run https://www.hwinfo.com/download/

when you open program, tick box next to sensors only and click run
in next window, there are icons along bottom row.
UoyAHwT.jpg


Click on the icon to right of the clock, that says "logging start" if you hover over it
this opens file explorer and lets you create a log file with a name that makes sense, like date or something
run this until you crash
it will create a csv file that can be read in Excel or Google Docs.
upload to a file sharing site and show link here.

@Phaaze88 noticed something else we hadn't noticed. The NVME lacks a heatsink
Wonder what temps it gets to under a 3090. It's possible the air from the GPU can cool the NVMe, as it might still be cooler than what the NVMe gets to, but a heatsink always helps.
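
For anyone who would rather not scroll through the CSV by hand, here is a minimal Python sketch that pulls the last few readings before a crash out of a log like this. It assumes the first row holds the column names, that temperature columns carry "°C" in their name, and that any non-numeric rows (blanks or a repeated header at the end of the file) can be skipped. The function name `last_readings` and its parameters are made up for illustration, not anything HWiNFO provides.

```python
import csv
import io


def last_readings(csv_text, keyword="\u00b0C", tail=3):
    """Return {column name: last `tail` numeric values} for every column
    whose header contains `keyword` (assumed here to mark temperatures)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    result = {}
    for idx, name in enumerate(header):
        if keyword not in name:
            continue
        values = []
        for row in data:
            if idx >= len(row):
                continue  # short or truncated row, e.g. cut off by the crash
            try:
                values.append(float(row[idx]))
            except ValueError:
                continue  # blank cell or repeated header row
        result[name] = values[-tail:]
    return result
```

To use it on a real log, read the file into a string first and pass it in. The file encoding is something you may need to experiment with, since the degree sign does not always survive as UTF-8.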
 
Feb 12, 2021
60
0
30
Just got back, turned the pc on, and it crashed within a few minutes, no heavy loads, all temps reporting at around 30C or less, maybe one at 40C (the "hotspot" on the GPU).

I haven't reset the CMOS yet, but otherwise the BIOS settings were reset. I reset them myself, and the memory says its frequency is 2400 MHz. I am setting it to 2133 to see how that fares.

As for the temp logging, I will try to do that, but it crashes very soon after boot, so it may be difficult.

Edit: It's logging now. Will try to upload after next crash. I've manually set the RAM speed to 2133, but otherwise BIOS is still in default (except I tuned the fan curve to be at 100% always, since it's actually the pump).

Edit 2: Crashed. Here's the log: https://www.dropbox.com/s/v2eho8m59zdppl7/logging.CSV?dl=0
 
Feb 12, 2021
60
0
30
i know I asked already but can you run https://www.hwinfo.com/download/

when you open program, tick box next to sensors only and click run
in next window, there are icons along bottom row.
UoyAHwT.jpg


Click on the icon to right of the clock, that says "logging start" if you hover over it
this opens file explorer and lets you create a log file with a name that makes sense, like date or something
run this until you crash
it will create a csv file that can be read in Excel or Google Docs.
upload to a file sharing site and show link here.

@Phaaze88 noticed something else we hadn't noticed. The NVME lacks a heatsink
Wonder what temps it gets to under a 3090. It's possible the air from the GPU can cool the NVMe, as it might still be cooler than what the NVMe gets to, but a heatsink always helps.


The log is in the post above. Figured I'd reply to show this intel:

Rebooting into safemode "worked", in that there were no crashes, even while idling from 10:14 to 11:04, so for about 50 minutes.

I've now rebooted into safe mode with networking. In this mode, the desktop doesn't work, and Windows Explorer seems to be crashing repeatedly, and every time it restarts, Edge opens up. Actually, as I type this, my desktop finally loaded and Windows Explorer seems to be working, somewhat. Firefox just stopped responding for about 20 seconds there while typing this, as well. I didn't really test safe mode without networking much, but the desktop worked immediately and there were no processes that stopped responding, though I did just leave it idle at the desktop from there. I'll let safe mode with networking run for a bit and see how that handles.

Edit: It's been about an hour in safe mode with networking, and no blue screens here, either. No WiFi or sound, either, of course. If safe mode has no crashes, that means that, even though a BIOS update brought all this back down upon me, it's still something safe mode fixes: some device it doesn't use, or some driver that isn't running. In the morning I'll reset the CMOS (if it's still necessary) and yoink out the WiFi card (or at least unplug it), and go from there with everyone's guidance.

Edit 2: Before I hop off, I'll go ahead and put this here, the dump files of most of the crashes that have happened since the BIOS update. I decided to just drop them all into a zipped folder for upload, since there were so many.

Dump files here: https://www.dropbox.com/s/88pjbku9rk1ejih/Dump Files 3.zip?dl=0
 
Feb 12, 2021
60
0
30
And another picture,

oOXyUku.png


The number 12 pins (CLRTC), two of them, are the clear CMOS pins. You need to short them for about 5 sec with a flat screwdriver or something like that. Make sure to set optimal default settings after that and don't change anything besides that, maybe boot order but that is all. I want to see if you have issues with RAM at SPD speed, which should be 2133 MHz.

Alright, I've wiped the CMOS, and taken out the WiFi card (removed the... extra power cable? and taken it out of the slot). Unfortunately, I don't have the extra back panel things that came with the case, so I just have an open slot there. Will edit this if (or most likely when) it crashes.

Edit: Tried doing Total War, and it crashed in about the same amount of time as usual. Here's the dump file: https://www.dropbox.com/s/91r713i7q5mdpoc/091921-6437-01.dmp?dl=0

Will also report back if it crashes outside of Total War.
 

Phaaze88

Titan
Ambassador
The drive's controller reading isn't there. There should be a Drive Temperature 2.
Columns JV and JW are the exact same, all the way down. That's physically not possible if they were the flash and the controller.
Column KB is the HDD, I'm guessing?

So there's a missing controller reading, or WD intentionally omitted the sensor?
 

Colif

Win 11 Master
Moderator
I can find evidence that WD sensors are historically inaccurate

The SN550 should be reporting Drive Temperature 2, from what I can tell from a short search.

Safe mode working can mean 2 things:
it's drivers, but after 6 pages I have my doubts
it's hardware, as safe mode doesn't put as much stress on the PC as normal mode drivers do.
 
Feb 12, 2021
60
0
30
Well that's just lovely from WD...

I think it's hardware too - question is, what?


@kingkj52
Have you tried to run the PC with the WiFi card, sound card, and modmic dongle removed?

I'd also like you to try setting up a fan to blow inside the PC, if you have one.

WiFi card is removed; external sound DAC and ModMic dongle are still plugged in. I'm in safe mode at the moment, because I've had a litany of crashes. Used the PC from the time I reported the crash in Warhammer 2 until a short time ago with no problems, then it just hit once, and then hit over and over. My "Quick Access" menu in Explorer seems to be corrupted, as whatever I had pinned is gone, and trying to add something to it says "A disk error occurred during a read operation."

Here are some more crash logs:

1: https://www.dropbox.com/s/vaz9ja1dyjhs476/091921-7296-01.dmp?dl=0

2: https://www.dropbox.com/s/s4p5wklp939radh/091921-6593-01.dmp?dl=0

3: https://www.dropbox.com/s/zrx5b8rvoyj2tbx/091921-6656-01.dmp?dl=0

4: https://www.dropbox.com/s/cnx39rtsikt9kao/091921-6281-01.dmp?dl=0

I don't currently own a fan, but I'll buy one and do that this coming weekend. I can try removing the usbs for the modmic and external sound card as well as keep the WiFi card out. I'll do that now and boot back into normal mode.


Edit: Did a chkdsk /r, quick access is still FUBAR. External sound card and mod mic also unplugged, still crashed, file here: https://www.dropbox.com/s/pz87btrdxnnpnx6/091921-7406-01.dmp?dl=0
 

gardenman

Splendid
Moderator
I'll look at the latest 5 dumps out of the 18 that I think I haven't been through yet. I ran the dump files through the debugger and got the following information: https://jsfiddle.net/q3tju2v4/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.
File information: 091921-7406-01.dmp (Sep 19 2021 - 19:18:32)
Bugcheck: UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
Probably caused by: ntkrnlmp.exe (Process: svchost.exe)
Uptime: 0 Day(s), 0 Hour(s), 02 Min(s), and 44 Sec(s)

File information: 091921-7296-01.dmp (Sep 19 2021 - 18:32:36)
Bugcheck: UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
Probably caused by: hardware (Process: System)
Uptime: 0 Day(s), 4 Hour(s), 39 Min(s), and 49 Sec(s)

File information: 091921-6656-01.dmp (Sep 19 2021 - 18:37:55)
Bugcheck: WHEA_UNCORRECTABLE_ERROR (124)
Probably caused by: GenuineIntel (Process: remoting_host.)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 57 Sec(s)

File information: 091921-6593-01.dmp (Sep 19 2021 - 18:36:31)
Bugcheck: DRIVER_IRQL_NOT_LESS_OR_EQUAL (D1)
Probably caused by: ntkrnlmp.exe (Process: System)
Uptime: 0 Day(s), 0 Hour(s), 02 Min(s), and 26 Sec(s)

File information: 091921-6437-01.dmp (Sep 19 2021 - 13:52:24)
Bugcheck: UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
Probably caused by: ntkrnlmp.exe (Process: ChromaVisualiz)
Uptime: 0 Day(s), 0 Hour(s), 28 Min(s), and 36 Sec(s)
This information can be used by others to help you. Someone else will post with more information. Please wait for additional answers. Good luck.
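
With this many dumps it can help to tally which bugchecks keep coming back. The Python sketch below counts bugcheck names from a summary laid out like the one above. The `SUMMARY` text is copied from the five `Bugcheck:` lines in this post, and `count_bugchecks` is just an illustrative helper, not part of any debugger tooling.

```python
import re
from collections import Counter

# The five "Bugcheck:" lines from the dump summary above.
SUMMARY = """\
Bugcheck:UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
Bugcheck:UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
Bugcheck:WHEA_UNCORRECTABLE_ERROR (124)
Bugcheck:DRIVER_IRQL_NOT_LESS_OR_EQUAL (D1)
Bugcheck:UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
"""


def count_bugchecks(text):
    """Count how often each bugcheck name appears in a summary."""
    # Matches "Bugcheck:NAME (CODE)", with or without a space after the colon.
    pattern = re.compile(r"^Bugcheck:\s*(\w+)\s*\((\w+)\)", re.MULTILINE)
    return Counter(name for name, _code in pattern.findall(text))


if __name__ == "__main__":
    for name, count in count_bugchecks(SUMMARY).most_common():
        print(f"{count} x {name}")
```

A tally like this only shows which stop codes dominate. It says nothing about the root cause, which still has to come from the dump contents themselves.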
 
Feb 12, 2021
60
0
30
Without having to go through this again, you did reinstall windows?

So many different faults which seem to be hardware. Can you test with other RAM? Maybe from a friend, neighbour or ...

Reinstall Windows has been done a few times, and they swapped the RAM during their RMA process -- I own no other DDR4 sticks and no one around me has any.
 

Colif

Win 11 Master
Moderator
I know I suggested this several pages ago and you told me how much you had already paid, but there is only so much diagnosis that can be done at a distance, and I feel you might actually get a result if you took the PC to a repair store and asked them to look at it and find the cause. They would have working spares they can swap in to isolate the cause.

WHEA errors are hardware.
Unexpected kernel mode traps are not common. None of the BSODs point at anything in particular, at least the ones that actually show me anything. The 1st error shows me nothing I can use.
Dump 2 shows 1 line, and it's the CPU accessing RAM.
The last 2 happened after it accessed RAM.
That doesn't mean it's RAM. I feel there is likely corruption somewhere, not sure if it's RAM or CPU really. That's why it needs to be looked at physically.
 
Feb 12, 2021
60
0
30
I know I suggested this several pages ago and you told me how much you had already paid, but there is only so much diagnosis that can be done at a distance, and I feel you might actually get a result if you took the PC to a repair store and asked them to look at it and find the cause. They would have working spares they can swap in to isolate the cause.

WHEA errors are hardware.
Unexpected kernel mode traps are not common. None of the BSODs point at anything in particular, at least the ones that actually show me anything. The 1st error shows me nothing I can use.
Dump 2 shows 1 line, and it's the CPU accessing RAM.
The last 2 happened after it accessed RAM.
That doesn't mean it's RAM. I feel there is likely corruption somewhere, not sure if it's RAM or CPU really. That's why it needs to be looked at physically.


I've had it at a repair shop twice, paid $200 and they just told me they fixed it and that it didn't crash for them during their tests, which, obviously, it's still broken, not to mention I've RMA'd it with CBPC twice as well...

I'm about at the point where I buy a new computer and sell this one piecemeal for scalp-level prices, which is crappy, but I've got to recoup some part of all the sunk cost here. The only thing that really sucks about that is I don't know which part, if any, is broken and causing issues, and so shouldn't be sold.
 

Colif

Win 11 Master
Moderator
I've had it at a repair shop twice, paid $200 and they just told me they fixed it and that it didn't crash for them during their tests, which, obviously, it's still broken, not to mention I've RMA'd it with CBPC twice as well...

I'm about at the point where I buy a new computer and sell this one piecemeal for scalp-level prices, which is crappy, but I've got to recoup some part of all the sunk cost here. The only thing that really sucks about that is I don't know which part, if any, is broken and causing issues, and so shouldn't be sold.
:(

Damn, I don't know what to say. Almost all of it has been replaced though. So it is a new PC in almost every aspect except the CPU

Both drives were tested
Replaced PSU
But they replaced the SSD and MOBO and this error is still continuing.
GPU was swapped to a EVGA - GeForce RTX 3090 XC3
SSD was swapped to a different one, but the same model. When restoring old files, I saw the serial number was different, but same exact model.
Memory was swapped to same model, different sticks. Can't verify for sure, but they say it happened.
I personally bought and swapped the power supply to a "Seasonic Flagship Prime Series SSR-1000PD 1000W Platinum Full Modular ATX12V & EPS12V 135mm FDB Fan Super Quiet Power Supply" upon RMA return.
The only part not replaced is the CPU and it passes all the tests we throw at it.

Has Vic questioned the power in your house yet? If it works fine for 2 different repair places and doesn't for you, it can only be something you are plugging into it or the power in the house.

I am just looking for alternatives.
 

Vic 40

Titan
Ambassador
Has Vic questioned the power in your house yet? If it works fine for 2 different repair places and doesn't for you, it can only be something you are plugging into it or the power in the house.

I am just looking for alternatives.
I did. He said the power was crappy and also mentioned his air conditioner not working properly. I think I hinted that it might have some influence on how the PC works if it's powered by the same wiring, but maybe I wasn't clear. He mentioned the amount of money he pays and that for that kind of money maintenance was below par.

Think either the RAM or the CPU otherwise has a fault. It could well be that RAM from a different maker will work properly. I already said in another thread that sometimes some RAM just won't work well with some CPU/motherboard combos. In that thread, switching RAM was the solution.