Question: Random nvlddmkm crashes

Mar 6, 2023
Hi,

I am posting on the forum because I am completely lost at this point.

At the end of 2020 I bought a new PC with AMD components:

CPU AMD 5600X
32 GB RAM DDR4 3200MHz
Radeon RX5500 XT
550W power supply Cooler Master MWE Gold V2
Motherboard X570-A PRO (MS-7C37)

From the start I had random crashes whenever a game was a bit resource-hungry: the PC would shut itself down even though temperatures were not that high (between 70 and 80°C on the GPU, 72°C max on the CPU), and AMD Radeon Software Adrenalin would report that it had reset its configuration after a driver crash.

I tried many settings to bring the temperatures down, but none of them had any effect on the crashes.

After a year I replaced the CPU cooler with a Noctua one and immediately saw the benefit: 61°C max on the CPU (the GPU temperatures obviously stayed the same), and the PC never crashed again.

So at first I thought the CPU had been making the GPU driver crash for some temperature-related safety reason, and my reaction was "cool, it works now".

But 6 months ago I decided to buy a better graphics card, an RTX 3070, together with a new power supply (the same model as before, but 750W this time) and a new 2K monitor.

I installed the card and the power supply, set the new monitor as primary, installed the latest GPU driver, and played.

But after a day or two I had a crash in GTA V. This time the computer did not reboot; the game simply closed, and the Event Viewer showed an error, on the security tab, reporting that nvlddmkm.sys had stopped working.

Since then I have easily spent more than 60 hours trying everything: downclocking, undervolting, and overclocking the GPU; trying every driver released from the card's launch until now; adjusting and eventually disabling the TDR delay in the registry; and updating the UEFI to the latest version. Nothing helped.
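For anyone who wants to check the current TDR (Timeout Detection and Recovery) settings before changing them, here is a rough Python sketch. It assumes the values are stored as `TdrDelay` and `TdrLevel` under `HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, and that Windows uses a 2 second delay and level 3 when they are absent, which is what Microsoft's documentation describes as far as I know. The function names are made up for illustration.

```python
import sys

# Assumed Windows defaults when the values are absent from the registry.
TDR_DEFAULTS = {"TdrDelay": 2, "TdrLevel": 3}
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def read_tdr_settings() -> dict:
    """Read TdrDelay/TdrLevel, falling back to the defaults if unset."""
    settings = dict(TDR_DEFAULTS)
    if sys.platform != "win32":
        return settings  # no registry to read on other systems
    import winreg  # Windows-only standard library module

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        for name in TDR_DEFAULTS:
            try:
                settings[name], _ = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass  # value not set, keep the default
    return settings


def describe(settings: dict) -> str:
    """Summarize the settings in one human-readable line."""
    if settings["TdrLevel"] == 0:
        return "TDR detection is disabled"
    return (f"TDR is active: {settings['TdrDelay']} s timeout, "
            f"level {settings['TdrLevel']}")


if __name__ == "__main__":
    print(describe(read_tdr_settings()))
```

This only reads the values. Raising the delay or disabling detection hides driver timeouts instead of fixing their cause, which fits the fact that it did not help here.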

I even completely formatted the boot drive (removing my dual boot) and installed a fresh copy of Windows 10, without dual boot and without any other drivers, to avoid any conflict.

And with just the NVIDIA GPU driver installed on the fresh Windows, the crashes still happen.

So I asked the vendor to RMA the card. Before sending it, I reinstalled the old card in the system (just so I would have something to play on while the new card is away for repair), keeping the new monitor and power supply, and guess what: the computer now restarts randomly in games exactly as it did when I bought the PC, except that now I don't have temperature issues (GPU 70°C, CPU 62°C).

I am sorry, because I am pretty sure I have left out a lot of information about the tests I have done over the past two years, but you now have a rough idea of my issues.

Because of this, I am now pretty sure that:

- The power supply is not faulty, since the problem persists with a new, more powerful one;

- Temperatures are not the culprit (I have never seen a PC restart from overheating at 80°C on the GPU, not even a laptop, and now that they are even lower I am pretty sure they are not causing the problem);

- The issue only appears in games; on both GPUs, benchmarks passed easily, even for two consecutive hours at max power draw;

- Crashes happen with every version of DirectX used by the games;

- Enabling or disabling virtualization in the UEFI does not change anything, and neither do XMP, ReBAR, or other settings known to cause issues on certain configurations.

So the question is: do I have a problem with my motherboard? I have not inspected it with a magnifying glass or my microscope, but at first glance it has always looked OK. No water or other liquid has been spilled on the PC, and I try to clean it thoroughly every month so that heat does not start to build up in the case.

Could the RAM be faulty even though each slot registers 1600MHz in CPU-Z and the sticks are installed in the right slots?
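A side note on the 1600MHz reading: DDR memory transfers data twice per clock cycle, so the clock CPU-Z reports is half the advertised rate. A minimal Python sketch of that arithmetic (the function name is only illustrative):

```python
def ddr_effective_rate(reported_clock_mhz):
    """Return the effective transfer rate (MT/s) for a DDR memory clock.

    DDR ("double data rate") memory moves data on both the rising and
    the falling clock edge, so the effective rate is twice the real clock.
    """
    return reported_clock_mhz * 2


# CPU-Z reading quoted in the post
print(ddr_effective_rate(1600))  # 3200
```

So 1600MHz is the expected reading for a 3200MHz kit. It shows the RAM is running at its rated speed, not that the modules are free of errors.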

Honestly, I don't think the CPU is the problem, but could it be in any way?

Do I just have extremely bad luck and got two faulty GPUs?

Just in case, these are the games I played that have crashed (some games, like Binding of Isaac, do not crash, but they are so light on graphics that I don't know if they really count as tests):

GTA V;

Escape from Tarkov;

World of Tanks;

Stalker Anomaly;

Minecraft;

Borderlands 2.

Thank you in advance for your understanding, and for your replies.
 
Mar 6, 2023
Hi,

Thanks for the replies.

I already tried using DDU and the issue persists through a fresh install of Windows, so this is not the solution to the problem.

For the RAM, I ran some memory tests with the Windows tool and MemTest86; I will try with only one stick at a time this evening.

I will also replace all the power cables in the case, since I have new spare ones, and keep you informed.

Thanks in advance for any new ideas on the subject.

EDIT: The RAM is Crucial Ballistix 3200 MHz DDR4 CL16 if I remember correctly; I will verify this evening.
 
Mar 6, 2023
Hi,

Sorry for the late response, I could not use my PC this week.

I replaced all of the power supply cables in the PC, and it still crashes.

I tried using only one stick of RAM at a time: it failed with the first stick, and it now seems stable with the second. I will run more games when I have the time.
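Since a single stable session does not prove much, the one-stick-at-a-time test can be tracked with a small tally. This is only a sketch: the stick labels, the sample log, and the threshold of three clean sessions are assumptions chosen for illustration, not a rule.

```python
from collections import Counter


def classify_sticks(sessions, min_clean_runs=3):
    """Classify each RAM stick from (stick, crashed) test sessions.

    A stick with any crash is "suspect"; a stick with at least
    min_clean_runs crash-free sessions is "likely good"; anything
    else "needs more testing".
    """
    runs, crashes = Counter(), Counter()
    for stick, crashed in sessions:
        runs[stick] += 1
        crashes[stick] += int(crashed)

    verdicts = {}
    for stick in runs:
        if crashes[stick] > 0:
            verdicts[stick] = "suspect"
        elif runs[stick] >= min_clean_runs:
            verdicts[stick] = "likely good"
        else:
            verdicts[stick] = "needs more testing"
    return verdicts


# Hypothetical log: stick A crashed once, stick B has one clean session so far.
log = [("A", True), ("B", False)]
print(classify_sticks(log))
# {'A': 'suspect', 'B': 'needs more testing'}
```

A crash with one stick installed could still come from the slot or from something else, so swapping the suspect stick into the slot that was stable is a useful cross-check.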

I will keep you updated soon.
 
Mar 20, 2023
I am a gaming netizen from China. The failure you describe can be resolved by replacing the CPU: after troubleshooting a blue screen error reported by nvlddmkm.sys, that is how I resolved it. My configuration: CPU 8700K, motherboard Z370F, graphics card 2080 Ti. I hope this helps!
 
Mar 6, 2023
Hi,

After further testing, it seems the problem has not appeared again, even in GTA V, which looks good to me.

So the issue was a faulty RAM module. Thanks to dave for the tip; I would not have tried it otherwise.

Case closed!