Alienware m17 R4, Laptop
MB: Custom Alienware design for laptops, Intel chipset.
CPU: Intel Core i7-10870H 8-core.
GPU: Nvidia GeForce RTX 3070 Laptop
RAM: 32GB DDR4-2933MHz
PSU: 240W power adapter
Battery: 68Whr lithium-ion battery
Storage: 1TB PCIe NVMe SSD (dual drives in RAID)
OS: Windows 11 24H2 (I was on the previous version during the initial round of BSOD loops)
BIOS: 1.26.0 (latest version according to Dell’s website)
I started getting BSODs after exiting a game (Dave the Diver) on December 31st. I had automatic driver updates on, but I can’t see that anything was installed in that time range. The PC entered a BSOD restart loop: it would crash three times, and I would get this error:
VIDEO_MEMORY_MANAGEMENT_INTERNAL
I’m not sure whether I also got the following error in that first round, but I did get it in subsequent BSOD loops:
VIDEO_DXGKRNL_FATAL_ERROR.
Usually, after enough loops, the Nvidia GPU stops being detected, and for some reason the charging icon on my laptop disappears (even though the battery keeps charging anyway). Restarting does not make the GPU get detected again, but starting up from a full shutdown does, especially after leaving it off for a short time. No warning or conflict symbols appear in Device Manager for any drivers.
Things I tried:
- System file scan and DISM (everything fine).
- Updated Windows and turned off automatic updates for drivers. I also started disabling a bunch of background services in case they were in the way (e.g. gaming input services, various Dell SupportAssist services, etc.).
- Updated various Intel drivers to the latest versions (e.g. chipset, GPU, Management Engine, etc.). I did this both automatically through Device Manager and manually using the drivers listed on Dell’s website.
- Reinstalled BIOS and set default settings.
- Full malware scan using Windows Defender and Kaspersky (nothing detected). This was a long shot though, since I haven’t been downloading anything fishy.
- Used DDU in safe mode to uninstall the Nvidia graphics drivers, then installed the latest, second-latest, and October versions of the Game Ready drivers (and also the latest Studio version) from Nvidia’s website. I also tried the latest and second-latest drivers from Dell’s website. I usually installed the Nvidia drivers in safe mode, but I also tried installing them after a normal startup. I don’t know if this matters, but the PC operates fine with the GPU detected in safe mode.
- Switched my power settings from performance to balanced. This seemed to make it worse and gave me a different BSOD error, DRIVER_POWER_STATE_FAILURE, so I switched back to high performance. I also had virtual memory disabled, so I enabled it (in case my Alienware Command Center was overclocking anything despite the default BIOS settings).
- Disconnected and reconnected the battery, holding the power button for 20-30 seconds in between. This would usually help the PC detect the Nvidia GPU again on startup when a simple shutdown and startup wouldn’t, so I could install the different drivers.
GPU-Z stats for the Intel and Nvidia GPUs: https://imgur.com/a/gpu-z-stats-on-intel-nvidia-gpu-s-AI6nmu1
Right now I can use my PC with the new Nvidia driver installed, but only while the Nvidia GPU is disabled in Device Manager. Enabling it causes a BSOD not long after and starts the loop again until the GPU is no longer detected.
When I had balanced power settings, the bug check code was 0x9F (DRIVER_POWER_STATE_FAILURE), attributed to ntoskrnl.exe. With the power setting on performance (which it is now and was originally), the bug check code is 0x10E (VIDEO_MEMORY_MANAGEMENT_INTERNAL) and/or 0x113 (VIDEO_DXGKRNL_FATAL_ERROR).
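For anyone cross-referencing the codes with the error names, here is a minimal Python sketch of the mapping. It only covers the three bug checks mentioned in this post; the table and the `bugcheck_name` helper are purely illustrative and not part of any Windows tool.

```python
# Bug check codes mentioned in this post, mapped to their symbolic names.
# Illustrative lookup table only; it is not a complete list.
BUGCHECK_NAMES = {
    0x9F: "DRIVER_POWER_STATE_FAILURE",
    0x10E: "VIDEO_MEMORY_MANAGEMENT_INTERNAL",
    0x113: "VIDEO_DXGKRNL_FATAL_ERROR",
}


def bugcheck_name(code: str) -> str:
    """Return the symbolic name for a hex bug check code such as '9f' or '0x10E'."""
    value = int(code, 16)  # base 16 accepts both '9f' and '0x9F'
    return BUGCHECK_NAMES.get(value, f"UNKNOWN (0x{value:X})")


if __name__ == "__main__":
    for code in ("9f", "10e", "113"):
        print(code, "->", bugcheck_name(code))
```

The lookup is case-insensitive and accepts the code with or without the `0x` prefix, which matches how the codes tend to show up in different crash-dump viewers.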
Here is a link with a zip of the minidump files from the original errors (not including the power-related ones, I also included some events from event viewer and a file from saved reliability history if that's of any use):
https://www.mediafire.com/folder/zcb8n6996g2al/PC_Error_Stuff
(I suggest only looking at 010225-9687-01.dmp and 010225-14796-01.dmp since those two are the different errors)
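As a side note on the file names: assuming the usual Windows minidump naming of `MMDDYY-<number>-<sequence>.dmp`, the crash date can be read straight from the name. The sketch below is a hypothetical helper (`parse_dump_name` is not a real tool) and does not interpret the middle number.

```python
import re
from datetime import date

# Assumed naming pattern: MMDDYY-<number>-<sequence>.dmp
DUMP_NAME = re.compile(r"^(\d{2})(\d{2})(\d{2})-(\d+)-(\d+)\.dmp$")


def parse_dump_name(name: str):
    """Split a minidump file name into (crash date, middle number, sequence)."""
    match = DUMP_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected minidump name: {name!r}")
    month, day, year, number, sequence = (int(part) for part in match.groups())
    # Two-digit years are assumed to be in the 2000s.
    return date(2000 + year, month, day), number, sequence


if __name__ == "__main__":
    for name in ("010225-9687-01.dmp", "010225-14796-01.dmp"):
        crash_date, number, sequence = parse_dump_name(name)
        print(name, "->", crash_date.isoformat(), number, sequence)
```

Under that assumption, both suggested dumps date from January 2, 2025, two days after the first BSOD.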
I don’t know what else there is to do at this point. I’m considering rolling back to Windows 11 23H2 and trying DDU and a fresh install of the Nvidia drivers there, but I don’t know if that will help. I was also considering a cloud reset in case other corrupted files are the cause. I’m worried my GPU is fried, but I don’t think that’s the case given the above. I’m not savvy with hardware, so I’m too scared to open the laptop up and check myself. If I can’t figure it out myself, I might have to take it to a shop and have them look at it. I heard there are some driver updates coming in around a week and a half, but I don’t know if I’m willing to wait that long, since I need the Nvidia GPU for ML work. Is resetting the PC a worthwhile option? Can I do a cloud reset, or is a local one preferable?
What do you guys think is going on? Corrupted system/driver files or is my hardware screwed? What do you think my next steps should be?