Question Need help figuring out what's wrong with my rtx 3070 ti (nvlddmkm error)

Status
Not open for further replies.

srdjanmax

Commendable
Jan 4, 2022
Hello everyone,

I have a PC with an EVGA RTX 3070 Ti graphics card. The PC has two identical SSDs. I do my backups by cloning my entire drive to the second SSD, so if the working SSD fails, the system boots from the second one.

Long story short, I'm on the Dev channel of the Windows Insider Program. New Dev channel builds don't support Windows Mixed Reality, and I do have a WMR headset, so I have to do a clean Windows install to leave the Insider Program and keep using the headset. So I did a clean Win 11 install on one of the SSDs. The problem is that my 3070 Ti doesn't function properly with the new Windows 11 install. Every few seconds I get an nvlddmkm error ("...nvlddmkm can not be found...") along with the display warning, "display driver nvlddmkm stopped responding and has successfully recovered." I have attached a picture of the Event Viewer showing the errors. This starts as soon as Win 11 boots and the login window appears. The PC freezes, works for a few seconds, then freezes again, and this repeats constantly, which renders the PC unusable. With some effort I can get into Device Manager and roll the driver back to Microsoft Basic Display Adapter. With the basic display adapter the errors do not occur and I can use the computer.

So, I have two Windows 11 bootable drives in my PC. In the old Windows 11 (Dev channel) the 3070 Ti works without any issues. In the new Windows 11 installation the system keeps freezing every few seconds due to this nvlddmkm display issue. I tried many fixes without success. The ones that I remember are listed below:
  • It can't be the BIOS, because the BIOS is identical for both Win 11 installations
  • Likewise, it can't be the RAM or PSU
  • I thought maybe the drivers are the issue. But I tried the same driver version on the new Win 11 and it did not make any difference (DDU, clean install, etc. make no difference).
  • I also tried plugging the 3070 Ti into a different PC (I have a second PC with an RTX 3070), and the nvlddmkm error started happening in that PC as well
  • Giving users full-control permissions on nvlddmkm.sys did not help
  • Changing TdrDelay did not help
  • Disabling fast startup: it was not enabled in the first place, so no help here
  • Disabling hardware acceleration makes no difference
  • Reducing core & memory clock did not help
I have no clue why my 3070 Ti works in one Windows 11 installation and doesn't work in the other two. I remember having this problem 2-3 years ago, and I was able to change something in Win 11 to make it work. But for the love of God I can't remember what I did. So I need help figuring out how to make this card work in the fresh Win 11 installation. If you have any ideas or suggestions, I can go into both Win 11 installations (new and old Dev channel), compare the settings, try to figure out what's different, and make the changes in the fresh Win 11 installation so the card works again with NVIDIA drivers.
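One way to do that comparison without clicking through every setting is to dump the same registry key on both installations and diff the values. Below is a minimal Python sketch of that idea. It assumes each dump is the saved text output of `reg query "HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" /s`; that key is only a starting guess (it is where TdrDelay is stored), and the function names are made up for illustration.

```python
import re

# One value line of `reg query` output looks like:
#     TdrDelay    REG_DWORD    0x8
VALUE_LINE = re.compile(r"^\s+(?P<name>.+?)\s+(?P<type>REG_\w+)\s*(?P<data>.*)$")


def parse_reg_query(text):
    """Return {(key_path, value_name): (type, data)} from `reg query` text."""
    values = {}
    current_key = None
    for line in text.splitlines():
        if line.startswith("HKEY_"):
            # A key header line; the values that follow belong to it.
            current_key = line.strip()
            continue
        match = VALUE_LINE.match(line)
        if match and current_key is not None:
            values[(current_key, match["name"])] = (
                match["type"],
                match["data"].strip(),
            )
    return values


def diff_reg_dumps(old_text, new_text):
    """List (key, name, old, new) for values that differ or exist on one side only."""
    old = parse_reg_query(old_text)
    new = parse_reg_query(new_text)
    changes = []
    for ident in sorted(set(old) | set(new)):
        if old.get(ident) != new.get(ident):
            changes.append((ident[0], ident[1], old.get(ident), new.get(ident)))
    return changes
```

To use it, read the two saved dumps as text (with whatever encoding your shell wrote them in) and pass them to `diff_reg_dumps`; an entry whose old or new side is `None` means the value exists on only one installation.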

Sorry for the long post. Thank you for any suggestions.

need-help-figuring-out-whats-wrong-with-my-rtx-3070-ti-v0-dvwj3mesa8pd1.jpg
 
If you have two identical drives, why not disconnect one of them and see if that removes the issue? You should also be able to narrow down which drive's OS has the corruption or bug.

It can't be the BIOS, because the BIOS is identical for both Win 11 installations
If you're using one motherboard (with two SSDs and an OS on each), then yes, it's one BIOS. Speaking of which, please pass on your specs like so:
CPU:
CPU cooler:
Motherboard:
Ram:
SSD/HDD:
GPU:
PSU:
Chassis:
OS:
Monitor:
Include the age of the PSU as well as its make and model, and the BIOS version your motherboard is currently on.

I thought maybe the drivers are the issue. But I tried the same driver version on the new Win 11 and it did not make any difference (DDU, clean install, etc. make no difference).
You're advised to remove all GPU drivers (Intel, AMD and Nvidia) in Safe Mode, then manually reinstall the latest driver sourced from Nvidia's support site from an elevated command prompt (without a reboot). Did you perform this step on both OS drives?

Found this upon some digging. Further reading:
https://steamcommunity.com/app/1888930/discussions/0/6197556602754617056/?ctp=4
 
If you have two identical drives, why not disconnect one of them and see if that removes the issue? You should also be able to narrow down which drive's OS has the corruption or bug.
Disconnecting one of the drives does not remove the issue. I'm not sure how to pinpoint the problem with the 3070 Ti.
CPU: AMD Ryzen 7 5800X3D
CPU Cooler: AMD Wraith Prism cooler
Motherboard: MSI MPG B550 Gaming Plus
RAM: Silicon Power Value Gaming DDR4 RAM 32GB (2x16GB) 3200MHz (PC4 25600) 288-pin CL16 1.35V UDIMM Desktop Memory Module with Heatsink
SSD: Acer FA100 2TB M.2 SSD 2280 NVMe Gen3 x4 Internal SSD
GPU: EVGA GeForce RTX 3070 Ti FTW3 ULTRA GAMING
PSU: CORSAIR - RM Series RM750 750W ATX 80 PLUS GOLD
Chassis: Cyberpower ATX midi tower (don't know the details)
OS: Windows 11 Home
Monitor: AOC CU34G2X

I thought maybe the drivers are the issue. But I tried the same driver version on the new Win 11 and it did not make any difference (DDU, clean install, etc. make no difference).
You're advised to remove all GPU drivers (Intel, AMD and Nvidia) in Safe Mode, then manually reinstall the latest driver sourced from Nvidia's support site from an elevated command prompt (without a reboot). Did you perform this step on both OS drives?
I did remove all drivers in Safe Mode using DDU. Only NVIDIA drivers were present, since it was a new clean install of Win 11. Then I reinstalled the NVIDIA drivers, but not from an elevated prompt. I tried multiple versions, but the problem is always present.
I do not recall reinstalling the NVIDIA drivers from an elevated prompt on the working Win 11 OS.
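To check which NVIDIA driver packages each installation really has in its driver store, the output of `pnputil /enum-drivers` can be saved on both installs and compared. A minimal Python sketch, assuming that output was saved as plain text with the usual "Field: value" blocks separated by blank lines; the function names are hypothetical and nothing here changes the system.

```python
def parse_pnputil(text):
    """Split `pnputil /enum-drivers` output into one dict per driver package."""
    packages = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current package block.
            if current:
                packages.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        packages.append(current)
    return packages


def nvidia_packages(text):
    """Return (published name, original name, driver version) for NVIDIA packages."""
    return [
        (p.get("Published Name"), p.get("Original Name"), p.get("Driver Version"))
        for p in parse_pnputil(text)
        if "nvidia" in p.get("Provider Name", "").lower()
    ]
```

Running `nvidia_packages` on the saved output from the working and the broken installation gives two short lists that can be compared by eye; a leftover or extra package on one side would be a place to start looking.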
Found this upon some digging. Further reading:
https://steamcommunity.com/app/1888930/discussions/0/6197556602754617056/?ctp=4
I'm not sure this is helpful. I don't need to reproduce the issue, because the issue on the newly installed Win 11 is always there.
 
So, I was trying to figure out how to fix my issue, and after I tortured my PC through hundreds of mini freezes it finally gave me a BSOD and a dump file. It shows a VIDEO_SCHEDULER_INTERNAL_ERROR (119) with dxgmms2.sys involved. That points to the DirectX graphics scheduler, and it does look familiar, so I might be onto something. Below is a summary of the dump readout:


VIDEO_SCHEDULER_INTERNAL_ERROR (119)
The video scheduler has detected that fatal violation has occurred. This resulted
in a condition that video scheduler can no longer progress. Any other values after
parameter 1 must be individually examined according to the subtype.
Arguments:
Arg1: 000000000000a000, The subtype of the BugCheck:
Arg2: ffffb60b97a17000
Arg3: 0000000000000003
Arg4: 0000000000000004

Debugging Details:
------------------

*** WARNING: Check Image - Checksum mismatch - Dump: 0x253d0, File: 0x25696 - C:\ProgramData\Dbg\sym\watchdog.sys\A954D94A22000\watchdog.sys

KEY_VALUES_STRING: 1

Key : Analysis.CPU.mSec
Value: 3031

Key : Analysis.Elapsed.mSec
Value: 3217

Key : Analysis.IO.Other.Mb
Value: 0

Key : Analysis.IO.Read.Mb
Value: 0

Key : Analysis.IO.Write.Mb
Value: 0

Key : Analysis.Init.CPU.mSec
Value: 281

Key : Analysis.Init.Elapsed.mSec
Value: 5580

Key : Analysis.Memory.CommitPeak.Mb
Value: 115

Key : Analysis.Version.DbgEng
Value: 10.0.27704.1001

Key : Analysis.Version.Description
Value: 10.2408.27.01 amd64fre

Key : Analysis.Version.Ext
Value: 1.2408.27.1

Key : Bugcheck.Code.KiBugCheckData
Value: 0x119

Key : Bugcheck.Code.LegacyAPI
Value: 0x119

Key : Bugcheck.Code.TargetModel
Value: 0x119

Key : Dump.Attributes.AsUlong
Value: 1000

Key : Dump.Attributes.DiagDataWrittenToHeader
Value: 1

Key : Dump.Attributes.ErrorCode
Value: 0

Key : Dump.Attributes.LastLine
Value: Dump completed successfully.

Key : Dump.Attributes.ProgressPercentage
Value: 100

Key : Failure.Bucket
Value: 0x119_a000_VIDSCH_RESET_HW_ENGINE_SUSPEND_dxgmms2!VidSchiResetHwEngine

Key : Failure.Hash
Value: {95884248-9389-91d4-05c4-d0b1411f9702}

Key : Hypervisor.Enlightenments.Value
Value: 0

Key : Hypervisor.Enlightenments.ValueHex
Value: 0

Key : Hypervisor.Flags.AnyHypervisorPresent
Value: 0

Key : Hypervisor.Flags.ApicEnlightened
Value: 0

Key : Hypervisor.Flags.ApicVirtualizationAvailable
Value: 1

Key : Hypervisor.Flags.AsyncMemoryHint
Value: 0

Key : Hypervisor.Flags.CoreSchedulerRequested
Value: 0

Key : Hypervisor.Flags.CpuManager
Value: 0

Key : Hypervisor.Flags.DeprecateAutoEoi
Value: 0

Key : Hypervisor.Flags.DynamicCpuDisabled
Value: 0

Key : Hypervisor.Flags.Epf
Value: 0

Key : Hypervisor.Flags.ExtendedProcessorMasks
Value: 0

Key : Hypervisor.Flags.HardwareMbecAvailable
Value: 1

Key : Hypervisor.Flags.MaxBankNumber
Value: 0

Key : Hypervisor.Flags.MemoryZeroingControl
Value: 0

Key : Hypervisor.Flags.NoExtendedRangeFlush
Value: 0

Key : Hypervisor.Flags.NoNonArchCoreSharing
Value: 0

Key : Hypervisor.Flags.Phase0InitDone
Value: 0

Key : Hypervisor.Flags.PowerSchedulerQos
Value: 0

Key : Hypervisor.Flags.RootScheduler
Value: 0

Key : Hypervisor.Flags.SynicAvailable
Value: 0

Key : Hypervisor.Flags.UseQpcBias
Value: 0

Key : Hypervisor.Flags.Value
Value: 16908288

Key : Hypervisor.Flags.ValueHex
Value: 1020000

Key : Hypervisor.Flags.VpAssistPage
Value: 0

Key : Hypervisor.Flags.VsmAvailable
Value: 0

Key : Hypervisor.RootFlags.AccessStats
Value: 0

Key : Hypervisor.RootFlags.CrashdumpEnlightened
Value: 0

Key : Hypervisor.RootFlags.CreateVirtualProcessor
Value: 0

Key : Hypervisor.RootFlags.DisableHyperthreading
Value: 0

Key : Hypervisor.RootFlags.HostTimelineSync
Value: 0

Key : Hypervisor.RootFlags.HypervisorDebuggingEnabled
Value: 0

Key : Hypervisor.RootFlags.IsHyperV
Value: 0

Key : Hypervisor.RootFlags.LivedumpEnlightened
Value: 0

Key : Hypervisor.RootFlags.MapDeviceInterrupt
Value: 0

Key : Hypervisor.RootFlags.MceEnlightened
Value: 0

Key : Hypervisor.RootFlags.Nested
Value: 0

Key : Hypervisor.RootFlags.StartLogicalProcessor
Value: 0

Key : Hypervisor.RootFlags.Value
Value: 0

Key : Hypervisor.RootFlags.ValueHex
Value: 0

Key : SecureKernel.HalpHvciEnabled
Value: 0

Key : WER.OS.Branch
Value: ni_release

Key : WER.OS.Version
Value: 10.0.22621.1


BUGCHECK_CODE: 119

BUGCHECK_P1: a000

BUGCHECK_P2: ffffb60b97a17000

BUGCHECK_P3: 3

BUGCHECK_P4: 4

FILE_IN_CAB: MEMORY.DMP

DUMP_FILE_ATTRIBUTES: 0x1000

FAULTING_THREAD: ffffb60b993d7040

BLACKBOXBSD: 1 (!blackboxbsd)


BLACKBOXNTFS: 1 (!blackboxntfs)


BLACKBOXPNP: 1 (!blackboxpnp)


BLACKBOXWINLOGON: 1

PROCESS_NAME: System

STACK_TEXT:
ffffb48d`9738f718 fffff805`6e465745 : 00000000`00000119 00000000`0000a000 ffffb60b`97a17000 00000000`00000003 : nt!KeBugCheckEx
ffffb48d`9738f720 fffff805`7f206498 : ffffb60b`97b92002 ffffb60b`95cf9950 ffffb60b`95cf9958 ffffb60b`95cf9960 : watchdog!WdLogSingleEntry5+0x3c05
ffffb48d`9738f7d0 fffff805`7f2c67d2 : ffffb60b`95bf9001 00000000`00000000 00000000`00000000 00000000`00989680 : dxgmms2!VidSchiResetHwEngine+0x3b8
ffffb48d`9738f980 fffff805`7f293ced : ffffb60b`97b92000 00000000`00000000 00000000`00000000 00000000`00000000 : dxgmms2!VidSchiResetEngines+0xaa
ffffb48d`9738f9d0 fffff805`7f1c68ac : 00000000`00000000 00000000`00000000 00000000`0000b428 00000000`00989680 : dxgmms2!VidSchiCheckHwProgress+0x2e3ad
ffffb48d`9738fa50 fffff805`7f27a405 : ffffb60b`9736f000 ffffb60b`97b92000 ffffb60b`9736f010 ffffb60b`95f10860 : dxgmms2!VidSchiScheduleCommandToRun+0x5c
ffffb48d`9738fb20 fffff805`7f27a37a : 00000000`00000000 fffff805`7f27a2b0 ffffb60b`97b92000 ffffb60b`8cd46040 : dxgmms2!VidSchiRun_PriorityTable+0x35
ffffb48d`9738fb70 fffff805`5fa79ca7 : ffffb60b`993d7040 fffff805`00000001 ffffb60b`97b92000 00000000`00000000 : dxgmms2!VidSchiWorkerThread+0xca
ffffb48d`9738fbb0 fffff805`5fc1af64 : fffff805`5b3f7180 ffffb60b`993d7040 fffff805`5fa79c50 00000000`00000246 : nt!PspSystemThreadStartup+0x57
ffffb48d`9738fc00 00000000`00000000 : ffffb48d`97390000 ffffb48d`97389000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x34


SYMBOL_NAME: dxgmms2!VidSchiResetHwEngine+3b8

MODULE_NAME: dxgmms2

IMAGE_NAME: dxgmms2.sys

IMAGE_VERSION: 10.0.22621.3810

STACK_COMMAND: .process /r /p 0xffffb60b8cd46040; .thread 0xffffb60b993d7040 ; kb

BUCKET_ID_FUNC_OFFSET: 3b8

FAILURE_BUCKET_ID: 0x119_a000_VIDSCH_RESET_HW_ENGINE_SUSPEND_dxgmms2!VidSchiResetHwEngine

OS_VERSION: 10.0.22621.1

BUILDLAB_STR: ni_release

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {95884248-9389-91d4-05c4-d0b1411f9702}

Followup: MachineOwner
---------

14: kd> lmvm dxgmms2
start end module name
fffff805`7f1c0000 fffff805`7f2d7000 dxgmms2 # (pdb symbols) C:\ProgramData\Dbg\sym\dxgmms2.pdb\159CEB1999ADC38003DA5B56C1E71AE61\dxgmms2.pdb
Loaded symbol image file: dxgmms2.sys
Mapped memory image file: C:\ProgramData\Dbg\sym\dxgmms2.sys\3F765386117000\dxgmms2.sys
Image path: \SystemRoot\System32\drivers\dxgmms2.sys
Image name: dxgmms2.sys
Image was built with /Brepro flag.
Timestamp: 3F765386 (This is a reproducible build file hash, not a timestamp)
CheckSum: 0011B040
ImageSize: 00117000
File version: 10.0.22621.3810
Product version: 10.0.22621.3810
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 3.7 Driver
File date: 00000000.00000000
Translations: 0409.04b0
Information from resource tables:
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: dxgmms2.sys
OriginalFilename: dxgmms2.sys
ProductVersion: 10.0.22621.3810
FileVersion: 10.0.22621.3810 (WinBuild.160101.0800)
FileDescription: DirectX Graphics MMS
LegalCopyright: © Microsoft Corporation. All rights reserved.
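For anyone comparing dumps from several crashes, most of the readout above is noise; the bugcheck code, subtype, faulting module and failure bucket are the parts worth keeping. Here is a minimal Python sketch that pulls those out of a saved `!analyze -v` log. It assumes the log is plain text in the same layout as pasted above ("Key :" / "Value:" pairs plus upper-case `NAME: value` lines); the function names are made up for illustration.

```python
import re

KEY_RE = re.compile(r"^Key\s*:\s*(.+)$")
VALUE_RE = re.compile(r"^Value\s*:\s*(.*)$")
FIELD_RE = re.compile(r"^([A-Z][A-Z0-9_]+):\s*(\S.*)$")


def parse_key_values(text):
    """Pair each 'Key : name' line with the 'Value: data' line that follows it."""
    pairs = {}
    pending_key = None
    for raw in text.splitlines():
        line = raw.strip()
        key_match = KEY_RE.match(line)
        if key_match:
            pending_key = key_match.group(1).strip()
            continue
        value_match = VALUE_RE.match(line)
        if value_match and pending_key is not None:
            pairs[pending_key] = value_match.group(1).strip()
            pending_key = None
    return pairs


def parse_fields(text):
    """Collect single-line upper-case fields such as 'BUGCHECK_CODE: 119'."""
    fields = {}
    for raw in text.splitlines():
        match = FIELD_RE.match(raw.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def summarize(text):
    """Reduce an !analyze -v log to the handful of fields worth comparing."""
    pairs = parse_key_values(text)
    fields = parse_fields(text)
    return {
        "bugcheck": fields.get("BUGCHECK_CODE"),
        "subtype": fields.get("BUGCHECK_P1"),
        "module": fields.get("MODULE_NAME"),
        "bucket": pairs.get("Failure.Bucket") or fields.get("FAILURE_BUCKET_ID"),
        "os_version": pairs.get("WER.OS.Version"),
    }
```

If two crashes produce the same bucket string, they are very likely the same failure, which saves reading the whole stack again.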
 
If anyone's interested, I think I've figured it out. I went to:

NVIDIA Control Panel > Manage 3D settings > Global Settings > Power Management Mode > set it to "Prefer maximum performance".

I'm not 100% sure this is the solution, because after I applied the change the PC still kept stuttering and then hit a BSOD. But after the next restart the system booted normally and the stuttering was gone. The latest NVIDIA drivers are in use. So I'm happy, since I'd been banging my head against this for weeks. Over and out!
 