Question Out of ideas, please help. Computer randomly restarting and sometimes various BSOD. Stumped and frustrated.

Brand new computer, about 6 months old. For the last couple of weeks I've had issues with my computer randomly restarting and sometimes getting BSODs with various errors.

I've run memtest twice, no errors both times. Just finished a FurMark test with no crashes. Which is why I'm very, very stuck and don't understand what else it could be.

I'm assuming a hardware failure of some sort. I've done some research since the FurMark test and it seems it could be the PSU, or the PSU isn't delivering the right amount of power? But I don't understand why FurMark passed at high stress.

I am just stumped and I'm getting very frustrated with things. I got a new computer to prevent this fun stuff.

PC - https://pcpartpicker.com/list/KmctrV

I wiped my computer a couple of times, thinking it might be a software issue, but the problems persisted. I've also switched RAM slots, same issues. I ran an sfc scan (it was recommended after the BSOD); it came back with corrupted files and fixed them, but the problem persisted.

Edit - Forgot to mention I am getting random FPS drops in various games. I'm mainly playing PoE atm and I get FPS drops both during intense events and when seemingly nothing is happening. With this build it makes no sense that I should be getting any severe FPS drops.
 
It's no problem. So first thing I'd be inclined to do is remove the Corsair set, and leave it out, until you resolve the errors/problems. Probably, leave it out permanently. Memory used together that didn't come together is probably one of the top ten reasons we see weird errors from otherwise normal configurations, as you know.

I'd take the Corsair kit out, and make sure that the G.Skill kit is installed in the A2 and B2 slots, only. Install the latest stable BIOS. RESET the BIOS to eliminate problems with non-standard configuration, fix any necessary boot order and fan curve settings, enable the memory XMP profile, then save and exit the BIOS settings configuration program and see if there are still issues. If there ARE, do the same thing, again, but swap the Corsair memory in for the G.Skill memory. Do not at any point run them together again, preferably ever, but for certain at least until you resolve the problems you are experiencing.
 
I just wanna say, yes there are 2 different rams, i honestly didn't even think of that as an issue. I'll remove the less pretty one.. also the corsair one. lol

I suppose this BSOD doesn't exactly track down the previous problems I was having. I'll be back when i get another BSOD, after i remove the ram. Damn.

Thanks for the help guys, I honestly did not know that different ram sets could cause a problem. I thought i also bought the same timings n such so didnt think it an issue.

Appreciate it! Cya soon unfortunately, i think. Hope not!
 

The odd man out (or, mixed memory)


Memory modules that did not come together in a matched set, tested by the manufacturer to be compatible, certainly CAN still work together, but often they do not. Right up front I'll tell you that if you are trying to get sticks that were purchased separately to work in the same machine, even if they are otherwise identical according to the kit or model number, or seem to have identical timings and voltage requirements, there is a very good chance that you simply will not be able to. There is also a pretty fair chance that you will be able to, if you are willing to take your time, listen to and understand what you are being told, and follow the steps necessary to determine whether they will "play nice" or not.

The exception in most cases is when the memory from both sets has the same speed and timings and both kits are within the JEDEC specifications for the default speed on that platform (for example, 2666MHz on the latest Intel Z390 platform, or 2133MHz on first and second gen Ryzen platforms). Then they stand a much better chance of working together. With higher speed kits, the chances begin to diminish from what they might be at the low speed, loose timings end of the scale.

A word of advice. If you just purchased this memory, and for whatever reason you bought two separate sticks of the same memory instead of buying them together in a matched set, see if you can return them for a refund or credit towards buying a similar or same set of matched sticks that come together in a kit. It is ALWAYS better to have matched modules because from brand to brand, or even within the same brand, in fact, even when the part numbers are IDENTICAL, there can be anything from simply slightly different memory chips that were sourced from different bins at the end or beginning of a production run to entirely different configurations altogether even though the model numbers seem to be the same. Some manufacturers even reuse model numbers when they discontinue a product. Point being, memory is only the same for sure when all sticks came out of the same blister pack or packaging and were sold as a tested kit.

In order to determine whether differences in the memory, or a need for increased voltage when using more than one stick (especially if you are running three or more sticks), are responsible for the problems you are having, always begin your troubleshooting by attempting to boot the machine with only a single stick of memory installed. Also, on practically every consumer motherboard sold since at least as far back as about 2014, the A2 memory slot, which is the second slot over from the CPU socket, is THE slot most commonly designated for the installation of a single memory module. Slots A2 and B2 are almost always the slots specified in the motherboard memory population rules for use with two modules. If you need to install a third module I have no opinion on which of the remaining slots to use for that, but since the A1 slot is right next to the CPU socket and often interferes with the CPU cooler or fan, I'd say the B1 slot is probably just as good.

Honestly, I don't ever recommend that you HAVE three modules installed anyhow. Using memory in pairs, so that normal dual channel operation will occur, is almost always the better option, except on boards that support triple channel memory population. And that's another thing. When it comes to memory there are no "single channel" or "dual channel" memory modules. There are ONLY memory modules, and the motherboard and CPU architecture determine whether dual, triple or quad channel operation is possible, based on how many modules are in use. Occasionally, though, there are situations where it might make sense to run three modules, and some boards CAN use three modules in a FLEX type mode where two of the modules operate in dual channel while the third oddball module runs in single channel. I'd avoid oddball configurations if possible, though, because many motherboards will simply run ALL modules in single channel mode when an odd number of modules is installed.
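The population rule of thumb above can be written down as a small lookup. This is only a sketch of what this post describes, assuming a typical four-slot board with slots labelled A1, A2, B1 and B2; the name `recommended_slots` is made up for illustration, and the population table in your motherboard manual always takes precedence.

```python
# Rule-of-thumb slot population for a typical four-slot, dual channel board.
# Slot labels and the function name are illustrative assumptions; always
# check the memory population table in the motherboard manual.

SLOT_ORDER = ["A2", "B2", "B1", "A1"]  # order in which slots get filled


def recommended_slots(module_count: int) -> list[str]:
    """Return the slots to populate for the given number of modules."""
    if not 1 <= module_count <= len(SLOT_ORDER):
        raise ValueError("expected between 1 and 4 modules")
    return SLOT_ORDER[:module_count]


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, "module(s):", ", ".join(recommended_slots(n)))
```

For troubleshooting, start with `recommended_slots(1)`, which is just A2, and only add the second stick once the machine is stable with one.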



If you think you will ever need 16GB of memory, then buy 16GB of memory from the start so you can get it all in a matched set that has been tested, and eliminate a lot of problems right from the start.



Click here for full guide on troubleshooting problems with PC memory

 
Yes, I have literally seen some systems with quad memory controller architecture run with 8 entirely different sticks that all had fairly different configurations as far as speed and timings. But those situations are infrequent, and far less common than the number of times I have seen systems with just TWO different sticks of memory that would NOT work together, DESPITE the fact that on the surface they seemed to be very similar in terms of speed, timings and voltage (and despite the experiences of some people who simply assume things will work because, in their limited experience, they have).

The fact is that most of what determines how well (Or not) two disparate memory modules are going to work together often has very little to do with those specifications at all in reality. It is really down to the more subtle differences of:

What speed are the memory modules? With lower speed memory kits (depending on the platform, of course; there are always additional factors and considerations), which for MOST DDR4 platforms we will assume to mean within the JEDEC strictures of 2666MHz or lower, the majority of platforms seem to be a lot more forgiving of mixed memory than they are with higher speed kits. Once you go past 2666MHz, and I'm talking even with just one stick that is higher speed, outside the JEDEC specifications, the chances you might have a problem go up considerably. It can of course still work, but it can just as well not work at all, or have some combination of mystery issues that don't even seem like they would be memory related but are.

What is the actual makeup/configuration of the memory modules themselves? What ICs (memory chips) were used? Two sticks of RAM that use ICs from entirely different manufacturers tend to have less chance of working together than sticks using the same brand or type of ICs. And again, in other cases it might not matter at all and they will still work. Also, HOW is the module built? Is it a single or dual rank module? Is it a dual or single row module? What size of ICs are used? Some motherboards might not be able to compensate and find settings that work for both modules if there are broad differences in what is acceptable and can enable POST for one stick as compared to another, and if that's the case then probably no amount of manual configuration is going to help. These things are also very unlikely to be included on any list of specifications. You'd have to find a model specific review of that memory kit, which is pretty unlikely, since memory kits don't tend to get much love in terms of reviews. There is very little to differentiate them aside from the very boring details under the hood, which 90% of people care nothing about because they ARE only going to install matched sets and it will never be a concern for them.

It's sort of like putting 85 octane gas in your car when the owner's manual specifically states that you should use 93. Sure, it WILL run on it, and you MIGHT not have any problems at all. There are, however, many vehicles out there that, if you run 85 in them, will experience anything from minor symptoms such as poorer gas mileage to severe issues like pre-ignition or detonation, along with the knock it causes, which is bad. Obviously it's not an exact analogy, but it's similar in that you could get anything from no problems at all to severe problems, and there are additional factors that could amplify or reduce any issues, such as what elevation you are at. With mixed memory, the motherboard in question might likewise be a factor, since the higher the quality of the motherboard, the better the chance that small differences won't become a problem.
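To make the two questions above concrete, here is a rough sketch that compares two sticks and lists the warning signs this post talks about. The `Module` fields, the function name and the example values are all hypothetical, and the 2666MHz cutoff is simply the figure used in this post. An empty result does not mean the sticks will work together; it only means none of these particular flags were raised.

```python
from dataclasses import dataclass

# Cutoff used in this post for most DDR4 platforms (an assumption, not a spec).
JEDEC_COMFORT_MHZ = 2666


@dataclass
class Module:
    speed_mhz: int   # rated speed of the stick
    ic_vendor: str   # maker of the memory chips, if you can find out
    ranks: int       # 1 = single rank, 2 = dual rank


def mixing_risk_flags(a: Module, b: Module) -> list[str]:
    """List the warning signs from this post that apply to a pair of sticks."""
    flags = []
    if max(a.speed_mhz, b.speed_mhz) > JEDEC_COMFORT_MHZ:
        flags.append("at least one stick is rated above 2666MHz")
    if a.ic_vendor.strip().lower() != b.ic_vendor.strip().lower():
        flags.append("memory chips come from different manufacturers")
    if a.ranks != b.ranks:
        flags.append("single rank mixed with dual rank")
    return flags


if __name__ == "__main__":
    # Made-up example sticks, not taken from anyone's actual build.
    stick_one = Module(speed_mhz=3200, ic_vendor="VendorX", ranks=1)
    stick_two = Module(speed_mhz=3200, ic_vendor="VendorY", ranks=2)
    for flag in mixing_risk_flags(stick_one, stick_two):
        print("-", flag)
```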

TLDR; it depends on a lot of different factors, not all of which are probably going to be understood by the average person or even myself, considering I am no engineer. At all.
 
So I've been playing a lot of WoW lately. I got about 4-6 crashes yesterday and I've already started the day off with one. When I was playing League of Legends a lot, I rarely ever crashed.

Is this a sign of something? During PoE and WoW, I crash frequently. During LoL, I rarely crash. When I was trying to diagnose this originally, all temps seemed fine, but does this suggest it's an overheating problem?
 
No idea. What are the temps at the time this happens? I'd recommend that you open HWiNFO and start periodically monitoring GPU, CPU and VRM temperatures as you play to see if there is anything going on there. Crashes alone aren't a sign of high temperatures. High temperatures are a sign of high temperatures. LOL.
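If you let the monitoring tool log its sensor readings to a CSV file while you play, a few lines of script will pull out the peak values afterwards, so you aren't relying on catching them by eye right before a crash. This is a minimal sketch: the column names (`CPU_C`, `GPU_C`) and the sample rows are made up for illustration, and a real log will have its own headers that you would pass in instead.

```python
import csv
import io


def peak_readings(csv_text: str, columns: list[str]) -> dict[str, float]:
    """Return the highest value seen in each requested column."""
    peaks: dict[str, float] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name in columns:
            raw = (row.get(name) or "").strip()
            try:
                value = float(raw)
            except ValueError:
                continue  # skip blanks and non-numeric cells
            if name not in peaks or value > peaks[name]:
                peaks[name] = value
    return peaks


# Made-up sample log; real column names depend on the tool and its settings.
SAMPLE_LOG = """Time,CPU_C,GPU_C
20:01:00,61.5,70.0
20:01:02,78.0,71.5
20:01:04,64.0,83.0
"""

if __name__ == "__main__":
    for sensor, peak in peak_readings(SAMPLE_LOG, ["CPU_C", "GPU_C"]).items():
        print(f"{sensor}: peak {peak}")
```

Compare the timestamps of the last rows in the log against the time of the restart to see what the temps were doing right before it happened.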
 

Colif

Win 11 Master
Moderator
crash with no BSOD? if you BSOD, share the dumps as we might find something in there.

have you removed the 2nd set of ram yet? wonders if you removed right ram. You chose the Corsair cause it is the prettier. maybe it isn't cause.

WOW would likely use more ram than LOL I expect.
 
Yeah i get instant restart / crashes without BSOD. Any BSOD i get i post it here. Praying that something is found

I have left the Corsair ram in, yes.
 

gardenman

Splendid
Moderator
I ran the dump file through the debugger and got the following information: https://jsfiddle.net/crojm8u3/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.

File information: 123120-6015-01.dmp (Dec 31 2020 - 05:43:18)
Bugcheck: UNEXPECTED_KERNEL_MODE_TRAP_M (1000007F)
Probably caused by: ntkrnlmp.exe (Process: Discord.exe)
Uptime: 1 Day(s), 2 Hour(s), 49 Min(s), and 20 Sec(s)

Comment: Only 1 set of RAM installed this time.

This information can be used by others to help you. Someone else will post with more information. Please wait for additional answers. Good luck.
 
The stack pointer got corrupted and told the CPU to start executing at a bad address in memory that happened to be inside some nt function.
All the parameters to the function were zero, so the function called a bugcheck.

You might get this with an overheating problem, but it seems that there is just some problem with the CPU and BIOS version on this chipset right now.
You might just lock down your CPU and wait for the BIOS and chipset updates that AMD should release sometime this month.

You can lock down the CPU by going into the registry editor and changing this setting:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\amdppm
Set Start to a value of 4.

You should note the current value so you can change it back later.
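For noting the current value before touching it, a read-only sketch like the one below will do. It only reads the Start value and never writes anything. The function names are made up for illustration; `winreg` is part of Python's standard library on Windows, and the Start meanings in the table are the standard ones for Windows services and drivers, where 4 means Disabled (the driver is not loaded at the next boot).

```python
# Read-only sketch: report the current Start value of the amdppm service so
# it can be written down before it is changed. Nothing is modified here.
try:
    import winreg  # standard library, Windows only
except ImportError:  # not running on Windows
    winreg = None

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\amdppm"

# Standard meanings of a service's Start value in the Windows registry.
START_TYPES = {0: "Boot", 1: "System", 2: "Automatic", 3: "Manual", 4: "Disabled"}


def describe_start(value: int) -> str:
    """Translate a Start value into its usual name."""
    return START_TYPES.get(value, "Unknown")


def read_amdppm_start():
    """Return the current Start value, or None when not on Windows."""
    if winreg is None:
        return None
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, "Start")
    return value


if __name__ == "__main__":
    current = read_amdppm_start()
    if current is None:
        print("Not on Windows; nothing to read.")
    else:
        print(f"amdppm Start = {current} ({describe_start(current)})")
```

Write down whatever it prints. Changing the value back to that number and rebooting undoes the change.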




machine info:
~MHz = REG_DWORD 3600
Component Information = REG_BINARY 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Configuration Data = REG_FULL_RESOURCE_DESCRIPTOR ff,ff,ff,ff,ff,ff,ff,ff,0,0,0,0,0,0,0,0
Identifier = REG_SZ AMD64 Family 23 Model 113 Stepping 0
ProcessorNameString = REG_SZ AMD Ryzen 5 3600 6-Core Processor
Update Status = REG_DWORD 1
VendorIdentifier = REG_SZ AuthenticAMD
5: kd> !sysinfo machineid
Machine ID Information [From Smbios 2.8, DMIVersion 0, Size=2520]
BiosMajorRelease = 5
BiosMinorRelease = 14
BiosVendor = American Megatrends Inc.
BiosVersion = 3.70
BiosReleaseDate = 06/09/2020
SystemManufacturer = Micro-Star International Co., Ltd
SystemProductName = MS-7C02
SystemFamily = To be filled by O.E.M.
SystemVersion = 1.0
SystemSKU = To be filled by O.E.M.
BaseBoardManufacturer = Micro-Star International Co., Ltd
BaseBoardProduct = B450 TOMAHAWK MAX (MS-7C02)
BaseBoardVersion = 1.0
 
So i went to Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AmdPPM

Clicked on the "Start - REG_DWORD" and set the value from 3 to 4.

That's how i understood that anyways.


I am just doing this now, so i'll see how it goes. I also wanted to post ANOTHER BSOD i got a few hours ago.

https://we.tl/t-cX3qlLwT0t

Is this hoping to stop the crashing somehow? I find it crazy that a change from 3 to 4 can cause that. Crazy stuff.
 

gardenman

Splendid
Moderator
I was waiting to see if john had more ideas as he knows more about this than I do.

I ran the dump files through the debugger and got the following information: https://jsfiddle.net/kt1f62gz/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.
File information: 010421-6015-01.dmp (Jan 4 2021 - 18:34:52)
Bugcheck: DRIVER_OVERRAN_STACK_BUFFER (F7)
Probably caused by: ntkrnlmp.exe (Process: System)
Uptime: 0 Day(s), 14 Hour(s), 23 Min(s), and 03 Sec(s)

File information: 010421-5968-01.dmp (Jan 4 2021 - 04:11:10)
Bugcheck: IRQL_NOT_LESS_OR_EQUAL (A)
Probably caused by: memory_corruption (Process: iCUE.exe)
Uptime: 0 Day(s), 2 Hour(s), 58 Min(s), and 21 Sec(s)
 

gardenman

Splendid
Moderator
I ran the dump files through the debugger and got the following information: https://jsfiddle.net/zb6po85r/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.
File information: 011021-6578-01.dmp (Jan 11 2021 - 02:50:41)
Bugcheck: IRQL_NOT_LESS_OR_EQUAL (A)
Probably caused by: ntkrnlmp.exe (Process: System)
Uptime: 1 Day(s), 0 Hour(s), 21 Min(s), and 50 Sec(s)

File information: 010921-7234-01.dmp (Jan 10 2021 - 02:28:18)
Bugcheck: IRQL_NOT_LESS_OR_EQUAL (A)
Probably caused by: ntkrnlmp.exe (Process: System)
Uptime: 3 Day(s), 1 Hour(s), 23 Min(s), and 16 Sec(s)
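Since the dump summaries in this thread all follow the same four-line layout, they can be pasted into a short script and tallied, which makes it easier to see whether the same bugcheck keeps coming back. This is only a sketch that assumes that layout; the function name is made up for illustration, and the sample text is the two summaries directly above.

```python
import re
from collections import Counter

# Matches the "Label:value" lines used in the summaries posted in this thread.
FIELD = re.compile(r"^(File information|Bugcheck|Probably caused by|Uptime):\s*(.*)$")


def parse_dumps(text: str) -> list[dict[str, str]]:
    """Split pasted summaries into one dict per dump file."""
    dumps: list[dict[str, str]] = []
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue
        label, value = match.groups()
        if label == "File information":
            dumps.append({})  # a new dump record starts here
        if dumps:
            dumps[-1][label] = value
    return dumps


SUMMARIES = """\
File information:011021-6578-01.dmp (Jan 11 2021 - 02:50:41)
Bugcheck:IRQL_NOT_LESS_OR_EQUAL (A)
Probably caused by:ntkrnlmp.exe (Process: System)
Uptime:1 Day(s), 0 Hour(s), 21 Min(s), and 50 Sec(s)

File information:010921-7234-01.dmp (Jan 10 2021 - 02:28:18)
Bugcheck:IRQL_NOT_LESS_OR_EQUAL (A)
Probably caused by:ntkrnlmp.exe (Process: System)
Uptime:3 Day(s), 1 Hour(s), 23 Min(s), and 16 Sec(s)
"""

if __name__ == "__main__":
    records = parse_dumps(SUMMARIES)
    counts = Counter(r.get("Bugcheck", "unknown") for r in records)
    for bugcheck, n in counts.most_common():
        print(n, bugcheck)
```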
Someone really needs to go through this thread and write down everything that has been done or tried so far. There have been a lot of answers from multiple people. Maybe a checklist:

Memtest86 ran?
Hard drives tested with what software?
CPU tested with what software?
Tried a different GPU?
Tried a different PSU?
etc.

@Colif have any more ideas?

 

Colif

Win 11 Master
Moderator
I don't know if turning PPM off actually helped.

The 1st dump on Monday Jan 4 appears to be the system trying to use power settings and maybe failing because amdppm isn't there...
This happened just before an error was called: nt!PpmIdleSelectStates+0x7a5, and it's amdppm that sets those states.
The 2nd error is not so clear.
The Mon Jan 11 error looks the same: nt!PpmIdleSelectStates+0x6f1 led to a page fault as it couldn't find the driver in memory, and then it crashed.
The Sunday error is the same.

Re-enable amdppm. Disabling it isn't helping, it's just adding more BSODs.

This thread is getting too long... I am going to have to create a summary and see what's been missed.