
[SOLVED] STOP 0xD1 when playing full-screen games

Feb 6, 2020
99
8
45
1
Is there a repair shop nearby that has a GPU they could throw in to see if you get the same errors? Or another PC they could put your GPU in, to test it there?

Without throwing money at new parts randomly, this is the best idea I can think of.
I'm in Toronto so probably. I'll get the inside dusted out and check nearby.

IRQ errors are rare for Nvidia driver-based errors. I have said that. It could well be the RAM on the GPU for all we know.
This... actually wouldn't shock me for a couple of reasons:

  1. Geometry corruption and artifacting in WoW when in "optimal power" mode. Consistent corruption of data resident in memory would suggest problems with that specific part of physical memory, no?
  2. The "Shader Header ##" errors. I actually Googled this specific error and a good number of results pointed towards faulty VRAM. Naturally, since VRAM operations are far more commonplace when gaming, this produces a problem, which leads into...
  3. ... When the driver tries to map this memory as a page, it runs into the corruption, which makes it dump the data it was going to write in a mapped page of system memory. Windows no likey, since it's trashing memory it shouldn't be. Crash.
Now, this could naturally be completely wrong. Computers are a bunch of disparate, heavily specialised parts held together by operating system duct tape. But I'm not willing to discount the possibility.
 

axe0

Respectable
Jun 5, 2017
886
27
2,390
101
The sequence so far is Open 3D app -> a few frames get rendered -> GPU vanishes -> driver tries to write incomplete data and shader code to a missing GPU, can't, and crashes -> either program crash or BSoD.

E: I got curious so I decided to read over the more in-depth dump data that GM's been posting, and noticed something interesting. The drivers are crashing during the same page fault operation at the same stack offset every time. Unsure of how meaningful this is, but it's certainly interesting.
I'm not sure it's incomplete data; minidumps tend to lack data.

The display driver crashing at the same offset over and over simply means that every time it crashes, it's doing the same thing.

Why don't you try a driver from this year?
 
Feb 6, 2020
99
8
45
1
Windows tried to trap me in update hell when I went to use DDU to install 442.19, so I'll wait until later tonight for that. I do have a full memory dump that I saved from one of the BSoDs that should have more information. I've taken the liberty of running 7zip on it with Ultra compression, reducing the download size from 1.95 GB to 261.6 MB, making it far more gardenman-friendly if he wants to take a look at it.

FullMemoryDump.7z

List of driver versions I've tried:

  • 384.76 (Gigabyte drivers)
  • 432.00
  • 441.66
All have produced the same result so far.
 

gardenman

Admirable
Moderator
Thanks for compressing it. Axe is the one you want looking at the memory.dmp. However I downloaded it and here are the basic results: https://overdresseddutch.htmlpasta.com/

A while back I was having game crashes (still do). I tried the following drivers: 388.13, 432.00, 390.65, 397.93, 398.18 and 436.02. I finally ended up on 391.05. Newer drivers screw up AVI files in Media Player (for me). Games still crash, but work 1 out of every 4 tries. It's probably my hardware, an older Alienware and GTX 770, but it's weird how some drivers work better than others. It seems that if it was hardware, then all drivers would be bad and games would crash every time (same game). I've given up, but my PC is 10 years old. I don't game that much and I want a new PC. I'm not saying to give up, I'm just telling you what's happened to me with multiple driver versions. After I find one that works, it will work OK for a month or two, then things get worse once again.

Wait for more replies.
 

Colif

Win 10 Master
Moderator
I have found Nvidia drivers to be less reliable on older cards in the last few months. I am not exactly sure why, but if I see a PC with an Nvidia card (which is most) and it's not an RTX card, and they have the latest drivers, I have to stop myself from suggesting DDU.

I would love to know why.

That last BSOD couldn't blame Nvidia any more if it tried.

Axe may see more than I can... he knows what to look for and where to find it. I've just learned all this as I went.
 
Feb 6, 2020
99
8
45
1
It seems that if it was hardware, then all drivers would be bad and games would crash every time (same game).
This has largely been my issue, which, together with the age of the card and observed geometry corruption in any (though I personally saw it in optimal mode) power state, is what pointed me towards the card possibly being on its last legs.

Initially I was getting checkerboard artifacts on the desktop back in December, but a combination of different drivers and Maximum Performance mode seems to have stopped that. Now all I get is ingame artifacting on optimal mode, hangs in gaming following less pronounced artifacting and BSoDs.

That last BSOD couldn't blame Nvidia any more if it tried.
This full dump was one I saved from a STOP on the 12th, but yes, they have all blamed the same driver for the same thing.

Interestingly, when I hit the TDR retry limit last week, I hadn't thought to check C:\Windows\LiveKernelDumps. I did so today and it turned up a "STOP" 0x141 (VIDEO_ENGINE_TIMEOUT_DETECTED). Quotes because, in Microsoft's own words, 141 isn't an actual STOP error.
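
If anyone else wants a quick look at what is sitting in that folder, something like this Python sketch lists any .dmp files under it, newest first. The folder path is the one mentioned above; `list_dumps` is a hypothetical helper name made up for this example, and reading the folder may need an elevated prompt.

```python
from pathlib import Path


def list_dumps(dump_root):
    """Return (relative path, size in MB) for each .dmp file under dump_root, newest first."""
    root = Path(dump_root)
    if not root.is_dir():
        # Folder missing or not readable as a directory: nothing to report.
        return []
    # rglob also walks subfolders, in case dumps are grouped by type.
    dumps = sorted(root.rglob("*.dmp"), key=lambda p: p.stat().st_mtime, reverse=True)
    return [(str(p.relative_to(root)), round(p.stat().st_size / 1e6, 2)) for p in dumps]


if __name__ == "__main__":
    for name, size_mb in list_dumps(r"C:\Windows\LiveKernelDumps"):
        print(f"{size_mb:10.2f} MB  {name}")
```

It only lists the files; opening them still needs WinDbg or a similar tool.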
 

Colif

Win 10 Master
Moderator
This full dump was one I saved from a STOP on the 12th, but yes, they have all blamed the same driver for the same thing.
What I meant was this: most stack texts will mention the driver once if you're lucky; this one is screaming Nvidia.
ffff930774e287e8 fffff800393d32e9 : 000000000000000a 0000000000000000 0000000000000002 0000000000000001 : nt!KeBugCheckEx
ffff930774e287f0 fffff800393cf62b : ffff930700000020 ffff930774e28950 0000000000000000 fffff80052267b80 : nt!KiBugCheckDispatch+0x69
ffff930774e28930 fffff80052263f62 : ffffe30dd74e9000 ffffe30dd74102f0 ffffe30dd74e9000 ffffe30dd7e0d010 : nt!KiPageFault+0x46b
ffff930774e28ac0 fffff8005275eb9f : 000000000000000c ffffe30dd75541c0 00000000ffffffff 0000000000000000 : nvlddmkm+0x1e3f62
ffff930774e28af0 fffff8005275ed82 : ffffe30dd7e0d010 ffff930774e28c10 ffffe30dd74e9000 ffffe30dd75541c0 : nvlddmkm+0x6deb9f
ffff930774e28b20 fffff800526ca272 : 0000000000000000 ffff930774e28c10 ffffe30dd74e9000 0100000000100000 : nvlddmkm+0x6ded82
ffff930774e28b90 fffff800526c9f42 : ffffe30dd74ac400 000000000000000c ffffe30dd74e9000 0000000000000000 : nvlddmkm+0x64a272
ffff930774e28bd0 fffff80052251ca1 : ffffe30dd28158a0 fffff8005226a73f ffffe30dd74e9000 0000000000000000 : nvlddmkm+0x649f42
ffff930774e28db0 fffff8005275f288 : ffffe30d00000000 ffffe30dd7e0d010 ffffe30dd74e9000 ffffe30dd74e9001 : nvlddmkm+0x1d1ca1
ffff930774e28e00 fffff80052408294 : ffffe30dd7fe7000 ffffe30dd7e0d010 0000000000000020 ffff930774e28e30 : nvlddmkm+0x6df288
ffff930774e28e70 fffff80052408443 : ffffe30dd74e9000 0000000000000010 0000000000000000 fffff8005267ead8 : nvlddmkm+0x388294
ffff930774e28eb0 fffff8005247a4cf : ffff930700000070 0000000000000000 ffffe30dd7e6f010 ffff930774e28fb0 : nvlddmkm+0x388443
ffff930774e28f10 fffff8005247ab29 : ffffe30dd74e9000 ffffe30dd74e9000 ffffe30dd25bcf00 ffffe30dd7e6f010 : nvlddmkm+0x3fa4cf
ffff930774e28f80 fffff8005263cd07 : ffffe30dd74e9000 ffffe30dd74e9000 ffffe30dd25bcf40 ffffe30dd74e9000 : nvlddmkm+0x3fab29
ffff930774e29050 fffff8005276007f : ffffe30dd25bcd30 ffffe30dd25bcd30 ffffe30dd7e6c0c0 ffffe30dd25bcf40 : nvlddmkm+0x5bcd07
ffff930774e290b0 fffff80052761153 : ffffe30dd74e9000 ffff930774e29720 ffffe30dd25bcd30 fffff800522674f0 : nvlddmkm+0x6e007f
ffff930774e29100 fffff80052423e92 : ffffe30dd74e9000 ffffe30dd7e6c0c0 ffffe30dd74ea298 fffff800522507a0 : nvlddmkm+0x6e1153
ffff930774e29130 fffff80052268628 : ffffe30dd74ea298 0000000000400100 ffff930774e29220 0000000000100000 : nvlddmkm+0x3a3e92
ffff930774e29180 fffff8005226a6ff : ffffe30dd74e9000 ffff930774e29230 0000000000000000 0000000000000000 : nvlddmkm+0x1e8628
ffff930774e291e0 fffff8005226a609 : 0000000000000000 0000000000400100 0000000000000000 ffffe30d00400100 : nvlddmkm+0x1ea6ff
ffff930774e29250 fffff8005226a840 : 0000000000000051 ffff930774e293d0 ffffe30dd7f08000 0000000000000000 : nvlddmkm+0x1ea609
ffff930774e29290 fffff800523e63d8 : 0000000000000051 000000000000000c 0000000000000000 0000000000000000 : nvlddmkm+0x1ea840
ffff930774e292d0 fffff8005247a898 : 0000000000000000 ffffe30dd7e6f010 0000000000000000 ffffe30dd74e9000 : nvlddmkm+0x3663d8
ffff930774e29500 fffff80052479f12 : ffffe30dffffffff ffffe30dd74e9000 ffffe30d00000000 fffff8005227f14f : nvlddmkm+0x3fa898
ffff930774e29570 fffff8005247a11f : ffffe30de9e9a370 fffff80000000017 ffffe30dd74e9000 ffffe30dd7df7950 : nvlddmkm+0x3f9f12
ffff930774e295c0 fffff8005247adc6 : 0000000000000040 ffff930774e297d0 0000000000000040 ffffe30dd74e9000 : nvlddmkm+0x3fa11f
ffff930774e29640 fffff8005261bacc : 000098020e1713f0 000098020e1713f0 ffffe30dd74e9000 ffffe30dd7e6f010 : nvlddmkm+0x3fadc6
ffff930774e296a0 fffff80052241fca : ffffe30dd74e9000 ffffe30dd7e6f010 ffffe30dd7559030 0000000000000000 : nvlddmkm+0x59bacc
ffff930774e296d0 fffff8003926ae95 : ffffd10020b82f80 0000000000000001 ffffe30dd7ddf820 ffffd10020b80180 : nvlddmkm+0x1c1fca
ffff930774e298f0 fffff8003926a4ef : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiExecuteAllDpcs+0x305
ffff930774e29a30 fffff800393c5024 : ffffffff00000000 ffffd10020b80180 ffffd10020b91340 ffffe30de2dd10c0 : nt!KiRetireDpcList+0x1ef
ffff930774e29c60 0000000000000000 : ffff930774e2a000 ffff930774e24000 0000000000000000 0000000000000000 : nt!KiIdleLoop+0x84
I can't tell who is a pro and who just picked up as they went along, like me... I am still learning too :)
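
For anyone following along who wants to put a number on "screaming Nvidia", here is a minimal Python sketch that tallies stack frames per module from stack text pasted out of WinDbg. It assumes the layout shown above, where the symbol comes after the last " : " on each line. `frames_by_module` and `SAMPLE` are names made up for this illustration, not part of any debugger tooling.

```python
import re
from collections import Counter

# Four frames copied from the stack text above, used here as sample input.
SAMPLE = """\
ffff930774e287e8 fffff800393d32e9 : 000000000000000a 0000000000000000 0000000000000002 0000000000000001 : nt!KeBugCheckEx
ffff930774e28930 fffff80052263f62 : ffffe30dd74e9000 ffffe30dd74102f0 ffffe30dd74e9000 ffffe30dd7e0d010 : nt!KiPageFault+0x46b
ffff930774e28ac0 fffff8005275eb9f : 000000000000000c ffffe30dd75541c0 00000000ffffffff 0000000000000000 : nvlddmkm+0x1e3f62
ffff930774e28af0 fffff8005275ed82 : ffffe30dd7e0d010 ffff930774e28c10 ffffe30dd74e9000 ffffe30dd75541c0 : nvlddmkm+0x6deb9f
"""


def frames_by_module(stack_text):
    """Count stack frames per module in pasted stack text."""
    counts = Counter()
    for line in stack_text.splitlines():
        if " : " not in line:
            continue  # not a stack frame line
        # The symbol is whatever follows the last " : " separator.
        symbol = line.rsplit(" : ", 1)[-1].strip()
        if not symbol:
            continue
        # "nt!KiPageFault+0x46b" -> "nt", "nvlddmkm+0x1e3f62" -> "nvlddmkm"
        module = re.split(r"[!+]", symbol, maxsplit=1)[0]
        counts[module] += 1
    return counts


if __name__ == "__main__":
    for module, count in frames_by_module(SAMPLE).most_common():
        print(f"{module}: {count} frame(s)")
```

Paste the full stack text in place of `SAMPLE` to get the tally for the whole trace. A high count for one driver only says where the crash happened, not whether the driver or the hardware underneath it is at fault.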
 
Feb 6, 2020
99
8
45
1
Jesus, yeah, Windows is pretty clear here about where the memory corruption is coming from. With the number of errors found in the dump's memory snapshot being over ten thousand it also lends credence to my theory that the driver is panicking and dumping unuploaded GPU data and shader code somewhere it's not supposed to be (read: most anywhere in main memory).

For the record, LZMA2 Ultra compression is hard on the CPU and always uses 6.6GB of main memory during the compression operation, so if those weren't ruled out before, they likely are now.
 

Colif

Win 10 Master
Moderator
Interestingly, when I hit the TDR retry limit last week, I hadn't thought to check C:\Windows\LiveKernelDumps. I did so today and it turned up a "STOP" 0x141 (VIDEO_ENGINE_TIMEOUT_DETECTED). Quotes because, in Microsoft's own words, 141 isn't an actual STOP error.

And yet it is: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0x141---video-engine-timeout-detected

If it was me, I would get a new GPU. I don't like suggesting it without solid proof and I could be wrong. So I apologise in advance, but you yourself are leaning towards it being hardware and a 770 isn't the newest card these days. It could be age.

Not accepting drivers is just one sign a card is dying; others are the images you were getting before you messed with settings. It might not be dead now, but there are signs. I have killed cards before by trying to push them too hard. Very few of my cards died from old age. My GTX 980 may, as I hardly use it.
 
Feb 6, 2020
99
8
45
1
This is a case where MSDN disagrees with WinDbg, at least for the livedumps.

If it was me, I would get a new GPU. I don't like suggesting it without solid proof and I could be wrong. So I apologise in advance, but you yourself are leaning towards it being hardware and a 770 isn't the newest card these days. It could be age.

Not accepting drivers is just one sign a card is dying; others are the images you were getting before you messed with settings. It might not be dead now, but there are signs. I have killed cards before by trying to push them too hard. Very few of my cards died from old age. My GTX 980 may, as I hardly use it.
The fact that I ran extreme-quality ENB presets as well as 2K+ texture mods on it (Skyrim is ass-ugly without visual mods lol) probably contributed, haha. At this point the machine is years out of date so I'll probably opt for a Ryzen build before I move out to Nova Scotia by year's end. The board, RAM and CPU are all fine according to my admittedly unscientific testing, and those 1TB WD Blues are essentially immortal, so I may salvage the drive from it.

If not, I'll buy a 2060 Super and new PSU and see if it works.
 
Feb 6, 2020
99
8
45
1
Alright, so, a bit of a related update.

First, something I forgot to mention that may confirm dying VRAM / GPU core: I was unable to get a picture at the time, but during one of my test runs where I didn't immediately get an IRQ error, I experienced full-screen image corruption. It started with snowflakes on the game's current rendered frame, then very quickly devolved into an absolute mess of pixels all squashed into the middle rows of my monitor, with all other pixels black. I use a USB interface / DAC, so this didn't affect my audio while on Discord. A moment later, the driver crashed and recovered. I think this was not long before YouTube triggered the IRQ error.

Second, I will be picking up some new hardware in a couple weeks (tax forms came pretty early!) including a 2060 Super so if a new GPU doesn't fix it then the problem lies either with RAM or Windows itself.
 
Feb 6, 2020
99
8
45
1
2060 Super and new PSU came in today and looks like at least one of my suspicions was correct, because UE4 no longer crashes AT ALL. Thanks for all the help, @Colif!
 