Question: I'm at my wits' end troubleshooting, need help please?

Feb 11, 2025
So I have been having a bunch of separate issues that have been getting progressively worse over the past week, and it's starting to drive me up the wall: every time I think I'm getting somewhere, it all starts back up again. I need some help troubleshooting because, unfortunately, all my old hardware is incompatible with my new system, so I can't swap parts in to test things.

Symptoms:
1. Sometimes my GPU driver will not launch on boot, which causes artifacting and leaves only one usable monitor out of my 3.
2. Sometimes when it does boot properly it will randomly crash, sometimes with small artifacts, sometimes with just a black screen, sometimes just frozen. It doesn't seem to matter what I'm doing at the time.
3. It has corrupted my Windows install about 3 times in the past two weeks, getting stuck on Windows boot diagnostics and forcing me to reset Windows. Sometimes I can do it from this menu without a problem; sometimes it throws errors and I need to use a USB boot device. (Resetting used to stabilize it temporarily, but now that isn't working.)
4. Sometimes I just get a black screen after POST, but I still hear the Windows login sound.

Things I've tried:
1. Swapping my RAM for an older set (notably the same brand, if that matters).
2. Upping my RAM voltage slightly to see if it would stabilize (the Windows corruption pointed me to RAM initially).
3. Unplugging and reseating everything (except the CPU, as I don't have thermal paste lying around atm).
4. Fresh Windows installs (I also switched which drive I installed it on to rule out my SSD).
5. Running multiple memory tests, but unfortunately they either come back clean or the computer crashes before finishing the lengthy tests (lookin' at you, MemTest86!).

Specs:
Intel i5-11400F
Nvidia RTX 3060
32 GB OLOy DDR4 RAM
Gigabyte B560 DS3H AC
Corsair RM850e PSU

At this point I'm out of options for testing but haven't narrowed it down enough to justify a purchase atm. I don't have integrated graphics to check the GPU, but I'm leaning towards it not being the issue, seeing as it always shows the POST and BIOS screens fine in every situation except when the PC launches without the graphics driver. The Windows corruption leads me to RAM being at the heart of it, but I don't know if it's my CPU's memory controller, something with the mobo, or the RAM itself. Switching the sticks implies it's not directly the sticks themselves, but they are not on my mobo's QVL, so I'm not writing them off altogether, though it would seem odd for this to start recently after a couple of years of no issues. I will say the RAM has never been very stable with XMP on, and I've had it off for a while due to Baldur's Gate having some issues with PCs running XMP profiles.

Any direction or tests I could run would be a huge help; I'm at a loss. I have a kid on the way, which is making me hesitant to just throw money at hardware without knowing it will be the solution. Please, people of Tom's Hardware, help me out. I've been reading through posts this last week to no avail, and I'm hoping my specific case will get a more nuanced conversation going on my issues.
 
Could be the graphics card, really. Try underclocking the card a bit (core clock and memory frequency) using MSI Afterburner. There are tutorials on YouTube.

A video driver crash most likely means the card wasn't detected properly in that instance. You could try cleaning the contacts, but I doubt it would be that if the graphics card has always been in the slot.

Regarding your RAM, what are its XMP values? And which DIMM slots are the sticks in? For two sticks, they should be in the gray slots. If your RAM is made up of four DIMMs, well, that's another ball game trying to get them working at XMP speeds, and it also depends on how fast they are. 3200 MT/s is ideal for 11th gen.
 
3200 is the XMP speed for the RAM; it's currently running at the default 2400. Two sticks, both in the matching gray slots.

I'll try underclocking it this evening and see if that has any noticeable effect. The only reason I've been leaning away from the GPU is the Windows boot getting corrupted, which from most of what I've read points to a RAM issue. That, and it's the most expensive piece to replace, though honestly I haven't looked at older card prices in a little bit. Maybe I'll try busting out the old 970, if it'll fit the slot, just to try to isolate the issues. Could I try another PCIe slot? I read that they are rated for different speeds, so I haven't tried it yet.
 
The other slot is worth a shot, and definitely try your old GPU.

Anything that makes the system unstable enough to crash can potentially corrupt data, not just memory faults, though RAM can be an issue too. I'd isolate both sticks by trying one at a time to see which one may be bad and possibly preventing XMP. Graphical glitches usually point to the graphics card, though.
 
Do both sets of RAM, new and old, fail memtest without XMP? An overtightened CPU cooler can cause memory-related problems. Try backing the tension screws off half a turn from maximum.
I didn't run it on the old set, since the problems persisted after swapping them in, so I didn't really think the RAM was the issue, to be honest. The new set never actually failed: it passed one full run, then crashed sometime while I was away during the second. I will try loosening the cooler a little. It was originally a prebuilt, so I could see an overzealous builder really cranking it down, and that combined with the extra cold weather lately might be playing a part. (I moved to Montana from California over the summer, and it's been -20 a couple of nights lately. My place never gets that cold, obviously, but maybe the wood floor is too cold.) Appreciate your help and suggestions! It's running decently this evening so far, but as soon as I have an issue I'll swap the GPU just to test. I'm going to try to monitor temps etc. more closely as I use it and see if that reveals anything useful.
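For the "monitor temps more closely" part, one option is a small logger that polls nvidia-smi and appends each reading to a file, so the last sample before a crash survives on disk. This is only a rough sketch: it assumes nvidia-smi is on the PATH and that the driver loaded on that boot, and the function names, file name, and 5-second interval are made up for illustration.

```python
import csv
import os
import subprocess
import time
from datetime import datetime

# Fields requested from nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["temperature.gpu", "power.draw", "clocks.gr", "clocks.mem"]


def parse_sample(line):
    """Turn one 'csv,noheader,nounits' output line into a dict.

    Values that are not numeric (for example '[N/A]') become None.
    """
    sample = {}
    for field, raw in zip(QUERY_FIELDS, line.split(",")):
        try:
            sample[field] = float(raw.strip())
        except ValueError:
            sample[field] = None
    return sample


def read_gpu_sample():
    """Ask nvidia-smi for one reading from the first GPU it lists."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sample(result.stdout.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=5):
    """Append a timestamped reading every interval_s seconds until stopped."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        while True:
            sample = read_gpu_sample()
            stamp = datetime.now().isoformat(timespec="seconds")
            writer.writerow([stamp] + [sample[k] for k in QUERY_FIELDS])
            # Push the row to disk right away so it survives a hard freeze.
            f.flush()
            os.fsync(f.fileno())
            time.sleep(interval_s)
```

Calling log_forever() in a terminal before gaming and checking the tail of gpu_log.csv after a crash would show whether temperature, power draw, or clocks looked unusual just beforehand. It will not log anything on boots where the driver fails to load.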
You might run memtest on each stick of RAM individually for a bit to see if either of them gets flagged.

You might also run a diagnostic on the SSD to be sure it's OK.

After that, you could run IntelBurnTest on your CPU, then run FurMark on the GPU for about 20 minutes. Just thinking of things you can try to isolate parts.
I've tested the M.2 SSD and my general SSD using the Windows diagnostic tools and both came back clean, but it's hard to trust a lot of results given the on-and-off nature of the problems. I'm not sure it's an issue all the time, if that makes sense. Some nights I boot it up and it works fine for the night, sometimes it takes 4-5 restarts to get it to boot right and then it works fine, and sometimes, like last night, it slowly bricks itself completely and I have to reset Windows.
 
Well, it ended up crashing with some artifacts, so I swapped cards and have had no issues so far. I ran it through 4-5 restarts with no booting issues. I'll let you know if anything changes! Not what I was hoping for, but at least I might have something that works now!
 
Before you chuck the 3060 in the trash can, do you have another computer to test it in? If the replacement GPU is less powerful than the 170W 3060, you might consider temporarily changing the RM850e if the PSU is getting old. The 3060 might still be OK, but the PSU may be on its way out.

My computer was crashing in Topaz Video AI, but I fixed the problem by setting MSI Afterburner to limit my Gigabyte 3060 to 95% maximum power. A few months later Topaz fixed the software and I removed the 95% power limit. I'm not sure if it was the GPU crashing, but MSI Afterburner made things stable. I was using Nvidia's more stable "Studio" driver on the 3060, not the latest "Gaming" driver.
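To put a number on that 95% limit, here is a quick bit of arithmetic as a sketch. It assumes the 170 W figure mentioned above applies to the card in question; factory-overclocked models can ship with a different default board power, so treat the input as an assumption. The function name is just for illustration.

```python
def limited_power_watts(rated_watts, limit_percent):
    """Power cap in watts for a given percentage power limit."""
    return rated_watts * limit_percent / 100


# Assumed 170 W board power with a 95% limit.
print(limited_power_watts(170, 95))  # 161.5
```

So a 95% limit only trims about 8.5 W off the peak, which is small, but it can be enough to smooth out short power spikes.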

How cold is the room at night when you start the PC? Is it close to or below freezing? Are there any signs of condensation on cold metal surfaces, especially the inside of the computer case?

This website says:
https://ramtechno.com/how-your-power-supply-keeps-its-cool/

The ATX standard recommends an operating ambient temperature range of +10°C to +50°C (under specific test conditions). The EN 60601-1 standard calls for +10°C to +40°C.

Despite the spec stating the minimum recommended temperature of the PSU is +10°C, I wouldn't be too worried if the room temperature is a bit lower, e.g. +5°C or even 0°C, but I'd be cautious about powering on commercial equipment at -20°C.

I've tested aerospace systems down to -51°C/-60°F. Below freezing, hoar frost sometimes forms on printed circuit boards as the moisture in the air freezes. When you raise the air temperature back above freezing, the frost melts and the boards can become covered in a very thin film of water. It's customary to wait a few minutes for the frost to melt and the water to evaporate before applying power, to avoid possible short circuits.
 
The PSU is pretty new. I swapped out the original Thermaltake a couple of months ago when I first started seeing small gaming crashes, as they looked similar to issues I'd had in the past that boiled down to the PSU. When that didn't fix things, I thought maybe it was a software issue that would sort itself out over updates, but no luck. My wife has a PC I can try throwing the card in to make sure. I didn't end up underclocking it as someone suggested; when it crashed again I just said screw it and swapped them, to hopefully have some working entertainment for the rest of the evening.

I don't think it gets below 50F in the house, though the floors are noticeably colder. My wife works from home, so the heater is always at least passively running at a lower temp. I haven't noticed any condensation, though; I was more concerned with possible warping from the temp change, I suppose.
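Since the thread mixes Fahrenheit and Celsius, a quick conversion helps compare the 50F indoor figure against the +10°C to +50°C ATX range quoted earlier. This is just a sketch of the arithmetic; the constant and function names are made up for illustration, and the range is the one from the linked article rather than anything measured here.

```python
# ATX recommended ambient range as quoted earlier in the thread (degrees C).
ATX_MIN_C = 10.0
ATX_MAX_C = 50.0


def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def c_to_f(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


def within_atx_range(temp_c):
    """True if the temperature sits inside the quoted ATX ambient range."""
    return ATX_MIN_C <= temp_c <= ATX_MAX_C


indoor_c = f_to_c(50)
print(indoor_c)                    # 10.0
print(within_atx_range(indoor_c))  # True
print(round(c_to_f(-51), 1))       # -59.8
```

So 50F works out to exactly 10°C, right at the bottom edge of the recommended range; a floor that is noticeably colder than the room air could put the case slightly below it.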

I still have the 3060 for now; seems a shame to toss it out even if it's the issue. I'm surprised it would be, considering I've only had it since COVID, but unfortunately it's just outside of the RMA period. It's just kind of shocking, considering I've had the 960 for probably over a decade now and never had issues with it, even running it hard beyond its means (I ran Cyberpunk on release on that thing with no issue, even when everyone was ranting about it melting their systems).

But when I get home tonight we'll see if it really was the problem or if I just got lucky boots last night.