Question [Hang/Freeze/Crash] - Event ID 14 nvlddmkm, AMD+NVIDIA

Status
Not open for further replies.
Apr 17, 2020
3
4
15
This thread is for cases where the issue occurs sporadically and randomly (anywhere from once every few weeks to once every few months). If you are getting the same event ID repeatedly and consistently, you do not have the same issue.

PLEASE DO NOT COMMENT IF YOU DO NOT HAVE QUESTIONS, A FIX OR RELEVANT ADDITIONAL INFO, as this will otherwise flood the thread as it did with the previous one. Use the poll instead to add your +1.


Introduction:
This is a long-running thread from Reddit, which I started.
The thread is almost 6 months old, meaning it will most likely be archived soon, and I am still getting notifications from other people having the issue almost every other day. I decided to compile the info from that thread (300+ comments) and post it here.

Issue Description:
As stated above, this issue occurs sporadically: you can be gaming, editing, or watching a YouTube video; it doesn't seem to correlate with any particular activity. On average it happens once every 3 weeks for me.
When the error occurs, the computer begins to stutter, inputs become extremely jerky/delayed, and sometimes the screen goes black with only the cursor displayed. This makes the computer impossible to use while it lasts.
The issue will sometimes resolve itself (this can take from 1 to 15 minutes) and sometimes lasts too long to be worth waiting out, in which case I manually crash my computer.

There seems to be a correlation with having an NVIDIA GPU together with an AMD Ryzen CPU, although (very) few users have described the issue happening on a different configuration.

Last time I had the issue: 2020-03-07 (YYYY-MM-DD).

Event Viewer Error:
Event ID 14 - nvlddmkm
The description for Event ID 14 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
0cec(3098) 00000000 00000000

The message resource is present but the message was not found in the message table

Contacts:
Others and I have contacted both AMD and NVIDIA. Here are their answers (broadly):
AMD: Ensure that you have updated the BIOS and chipset drivers for your motherboard and are using stock settings. They think it's mostly an issue related to the graphics card, and that it should be on NVIDIA's side to fix this. It could also be related to this link: https://support.microsoft.com/en-us/help/2665946/display-driver-stopped-responding-and-has-recovered-error-in-windows-7 (which applies to Windows 10 as well).
NVIDIA (info from Reddit user u/faildude7): “[...] I contacted with NVIDIA and they told me that is an incompatibility issue that they're aware of it and it's more common with RTX series but it's gonna be hard to fix since it's on AMD's part and should be resolved with a BIOS fix from their part, they told me that it's because the use of virtual threads on multi-threading in Ryzen that doesn't link well with NVIDIA processes.” + “he told me that and it's pretty bad since this error seems like it's 6 years old at least so I don't see them fixing it any time soon, this just was more popular with the RTX series since he told me that it's more common with these graphics but if you search info you can see people complaining for years.”

Fixes
This is a list of attempted and not-yet-attempted fixes mentioned by users in the thread mentioned above. The problem is that we never know when the issue will happen next. Someone can declare a fix, but because the issue occurs so sporadically, we can't truly know whether it works until months have passed. Running generic stress tests will not trigger the issue either.
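To put a rough number on "months of waiting", here is a minimal sketch. It assumes the events behave like a Poisson process (independent, with a constant average rate), which nobody in the thread has verified, and it uses my own average of about one event every 3 weeks. The function names are made up for this example.

```python
import math

def chance_of_quiet_period(mean_weeks_between_events: float, weeks: float) -> float:
    """Probability of seeing no event for `weeks` weeks purely by luck.

    Assumption: events follow a Poisson process, so P(no event) = exp(-t / mean).
    """
    return math.exp(-weeks / mean_weeks_between_events)

def quiet_weeks_needed(mean_weeks_between_events: float, confidence: float) -> float:
    """Event-free weeks needed before luck alone is less likely than 1 - confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -mean_weeks_between_events * math.log(1.0 - confidence)

# With one event every 3 weeks on average:
print(round(chance_of_quiet_period(3.0, 3.0), 2))  # 0.37
print(round(quiet_weeks_needed(3.0, 0.95), 1))     # 9.0
```

Under that assumption, three quiet weeks happen by chance about 37% of the time even with no fix applied, and it takes roughly nine quiet weeks before a "fixed" claim reaches 95% confidence.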

Attempted Fixes:
  1. RMA (Replace) with the same GPU model. This does not fix the issue. (Generic fix)
  2. Update drivers. This does not fix the issue. (Generic fix)
  3. Use stock settings. This does not fix the issue. (from AMD)
  4. Disabling XMP on the BIOS. This does not fix the issue. (from u/3feetHair)
  5. Enabling XMP on the BIOS. This does not fix the issue. (Generic fix)
  6. Setting your Power Plan to AMD’s “High Performance” (from u/Injuis)
  7. Setting Memory Frequency to the correct frequency through BIOS (from u/NovelAcanthocephala6)
  8. Change your GPU power mode to “Maximum Performance”. NVIDIA Control Panel -> Manage 3D Settings -> Global Settings -> Power Management mode. Many users have mentioned this working (from u/eatwritelaugh) [fix personally attempted on 2020/04/16]
  9. Uninstall the drivers with DDU (Display Driver Uninstaller) and reinstall them. This does not fix the issue. (from u/Aravind92)
Potential Fixes:
  1. Update BIOS for Chipset & Motherboard. (This has been attempted by some and didn’t fix the issue, however it could still be a potential fix in the future.)
  2. Reset Windows Power Options (from u/Aravind92)
  3. Disable HW Acceleration in Firefox (from u/Aravind92)
  4. RMA (replace) the PSU with a different one. This has apparently fixed the issue for one user, however that user had an Intel CPU. (from u/Speedfreakz)
  5. Replacing every component in your computer seems to work (which admittedly is not a fix).
  6. Set your RAM “Power Down Enabled/mode” to disabled and “Gear Down Mode” to enabled. (from u/Gigakv)
Other Info compiled from user comments on the previous thread:
  • u/sprousaTM has mentioned this only occurring when the PC is idle.
  • u/ponybeu5 has mentioned changing from a 1080 Strix to a 2080 Super EVGA and then starting getting the issue (with AMD 3900X).
  • u/RafaMarioFan has said the following: Started with this setup: GT1030 + used A320 + Ryzen 3600 + 500W PSU (that doesn't have 80 Plus), changed to a Galax 1660 and started having the error. Replaced the PSU with a Corsair CX 550 (80 Plus Bronze), which didn’t fix the error. Went back to the GT1030. Bought an RTX 2060S, which caused the issue again. Then changed from A320 to a B450, which didn’t fix the issue.
  • A few users have mentioned the issue only occurring a few months after having their PC built. (I personally don’t remember)
  • u/Civil_Specter has mentioned a fix, however it seems like their issue was occurring consistently when gaming, which is different from what a majority of people are getting and most likely a different issue:
    • Used AMD Ryzen Master to disable SMT (Simultaneous Multithreading).
    • Downloaded a program called Process Lasso to improve gaming performance.
    • Used EVGA Precision X1 software to make a custom GPU profile with “Boost Lock” enabled, which keeps the GPU from idling.
  • u/Gigakv has provided the following info: Can confirm it's still an issue with Ryzen 3000 + Turing
    • Ryzen 2600 + RTX 2060 - no problem
    • Ryzen 3600X + GTX 1070 TI - no problem
    • Ryzen 3600X + RTX 2060 that worked fine with the Ryzen 2600 - GPU crashing when using hardware acceleration \ idle. No issues during gaming or stress testing.
  • A few users mention that it probably has a link with hardware acceleration.
  • A few users mention that it probably has a link with the GPU switching from idle to performance mode.
  • u/MBDdk mentioned that the issue might’ve been fixed in the AMD driver update 20-2-2: “Performing a task switch with some Radeon Software features enabled or some third-party applications with hardware acceleration running in the background may cause a system hang or black screen.”
My Full Computer Specs:
GPU: NVIDIA GeForce RTX 2080 Ti
CPU: AMD Ryzen 9 3900X
CPU Cooler: NZXT Kraken x62
Motherboard: GIGABYTE X570 Aorus Master
RAM: G.Skill Ripjaws V Series 64GB DDR4-3200 [16GB x 4]
PSU: Corsair HX Platinum 750 W80
Case: NZXT H700i ATX Mid Tower
SSD [Windows 10]: Samsung 970 Pro 1TB
HDD: Seagate BarraCuda Pro 4TB 3.5” 7200 RPM

List of a portion of reported setups with the issue from the reddit thread:
  • u/faildude7: Ryzen 9 3950X + RTX 2080Ti + G.Skill Trident Z Neo 32GB (2x16GB) 3600MHz + Gigabyte X570 AORUS Pro
  • u/rimokonman: Ryzen 3700X + NVIDIA 2070S
  • u/eatwritelaugh: Ryzen 5 3600 + RTX 2070 Super + Adata 3200mhz 8x2 + MSI B450 Carbon
  • u/sprousaTM: Ryzen 3800X + RTX 2080 + 16GB Trident RAM + 1200watt corsair power supply (+custom watercooling)
  • u/Aravind92: Ryzen 3600x + MSI Tomahawk B450 Max + gtx 1660 ti + Corsair Vengeance LPX 8 x 2 3000 mhz + cm mwe v2 550w + Acer VG240YP.
  • u/QueBugCheckEx: Ryzen 9 3950x + RTX 2080ti
  • u/PentaChicken: Ryzen 5 3600X and RTX 2070 Super
  • u/Skastrike09: Ryzen 3600 + NVidia 2060
  • u/iksargodzilla: ryzen 3700x + rtx 2060 super + tuf gaming x570-plus (wifi) motherboard + 2x8gb ram
  • u/OOO639: 3700x and a 2080 super
  • u/clockwork000: 3900X + 2070S
  • u/Dyeneks: RTX 2070 Super + Ryzen 7 3700X
  • u/TheChozoKnight: RTX 2080 + Ryzen 7 3700X + Asus X570F Mobo
  • u/impmallet: 2080ti + 3700x + Gigabyte x570 Aorus Elite
  • u/fluidzreddit: Msi trio 2080ti @ stock + 3700x + Gigabyte x570 master (f11 bios). Ram - Crucial ballistix sport 3200 cas 16 @ xmp
  • u/kaimenlau: Ryzen 3950x + rtx 2080 ti
  • u/NovelAcanthocephala6: 3950x + 2080 super
  • u/MBDdk: Ryzen 5 3600 + NVIDIA GTX 1660S
  • u/Speedfreakz: Asus TUF Z270 MARK 1 + Intel® Core™ i7-7700K CPU @ 4.20GHz + NVIDIA GeForce RTX 2080 SUPER (TU104-450) @ 300 MHz + Samsung SSD 970 EVO Plus 1TB + Cougar STX 750 power supply + 2x 16GB HyperX Predator RGB memory
  • u/yanboz: 3950X + 2080Ti
  • u/Tounushi: Asus GeForce RTX 2070 Super Dual Evo OC + Asus Prime x370-Pro mobo (BIOS v.4801) + 750W power source + AMD R7 3700X + 16GB DDR4 2666MHz (running 2400) + M.2 SSD OS drive + SSD and HDD gaming and storage drives, respectively
  • u/Skastrike09: RTX 2060, Ryzen 5 3600, and X570 AORUS ELITE
  • u/Oren1: Ryzen 5 3600 + Rtx 2060 super
  • u/rickgolds: Ryzen 5 3600 + RTX 2080 + RAM Trident Z with Samsung B-Die
  • u/MrSheep_: 1660ti + Ryzen 5 3600.
  • u/KM1k92: R9 3900x + RTX 2060 SUPER
  • u/Sanastro: i9 9900k + Zotac 2080ti
  • u/Flush535: 3700X + 2070S
  • u/3feetHair: Ryzen 2700 + GTX 1660
  • u/Sp0KI: 3700x + x470 + 2070
  • u/ThatOneCrazyFriend: 3900X + 2070 Super
  • u/calscks: 3900X + RTX 2080
  • u/hop-limit: 3800x + rtx 2070s
  • And many others…
Reddit thread: https://www.reddit.com/r/techsupport/comments/dnm7pt/event_id_14_nvlddmkm_computer_stutters_2080_ti/
 

Diceman_2037

Distinguished
Dec 19, 2011
56
3
18,535
OK, did you watch the video I posted? Like I said, the system isn't hanging. I could still use the PC every time; everything was just moving slowly, and once, closing MSI Afterburner even solved the issue. Two things: if it's a hardware defect, why am I able to constantly stress both the GPU and CPU without an issue, and why did it get resolved by closing MSI AB?

That is common behavior for IO hangs caused by bus errors/retries, or by the device no longer responding to lspm commands properly.

Wouldn't a hardware defect with the controller be more apparent, like causing instability often rather than once every few weeks to months?

Not since PCIe uses AER (Advanced Error Reporting).

The bus will correct everything it can through FEC and inevitably fall back to packet resends.

Consider how slow downloading a file can be through a faulty router over TCP, where TCP requests that lost packets be resent.
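To make the resend comparison concrete, here is a small sketch of how retries eat into throughput. It is only a toy model of the analogy, assuming every transmission fails independently with the same probability. It is not a model of how PCIe replay or TCP actually behaves (no timeouts, windows or backoff), and the function names are invented for this example.

```python
def expected_sends_per_packet(failure_rate: float) -> float:
    """Mean number of transmissions until one succeeds (geometric distribution)."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return 1.0 / (1.0 - failure_rate)

def effective_throughput(nominal: float, failure_rate: float) -> float:
    """Useful throughput left once resends have taken their share of the link."""
    return nominal / expected_sends_per_packet(failure_rate)

for rate in (0.0, 0.5, 0.9):
    print(rate, expected_sends_per_packet(rate), effective_throughput(100.0, rate))
```

Even in this simplified picture, a link that has to resend nine packets out of ten still "works", just at about a tenth of its normal speed, which would look like a slowdown rather than a hard hang.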
 

AlHouine

Commendable
May 26, 2020
8
0
1,510
What do you advise, then, for those hitting this Event ID 14 nvlddmkm error with the AMD CPU + NVIDIA GPU combo?

Wait for updates, RMA the processor directly, or RMA something else?

I don't know what to do about this problem and I'm getting lost in all the possible solutions.
 

Aravind92

Distinguished
Apr 1, 2014
699
9
19,065
Gotcha, I will wait for it to happen again then. Since the AGESA 1.0.0.5 update I've not had the issue as such, but Event Viewer recorded an nvlddmkm error on the 3rd of June that was not accompanied by the stuttering behaviour. I was using the PC at the time and did not notice anything, so I was surprised to see the error there. I will hold on for a little while to see if BIOS updates help; if not, CPU RMA then. Can you tell me how we can explain the issue to AMD, though? Will they immediately accept an RMA?

And hey, thank you for taking the time to explain it here. I guess people whose boards have not been receiving updates should RMA the CPU without waiting; will that fix the issue? How likely is it to get a second CPU with the issue?

EDIT : Look at my comment below.
 

Aravind92

So, I was just googling the issue and found a thread on the NVIDIA forums; I'm pretty sure this is the guy I spoke with on Reddit. BTW, he said it happened on AGESA 1.0.0.6 as well, and he has now switched to an Intel PC and given his Ryzen board, RAM and CPU to someone else. It looks like he actually exchanged his CPU and motherboard with the retailer after the issue started. Diceman, even you seem to have commented on it. I am at a loss now. I had RMAing the CPU and board as my last-ditch option, and now that looks like a bust, since the issue came back after he switched both. It looks like it's a software/BIOS issue, or maybe he got a faulty PCIe controller on two of his CPUs in a row, but how likely is that?

Maybe we should be looking into this in another way. I don't know really after this.


And a quote from another Reddit user about his conversation with NVIDIA support:

I contacted with NVIDIA and they told me that is an incompatibility issue that they're aware of it and it's more common with RTX series but it's gonna be hard to fix since it's on AMD's part and should be resolved with a BIOS fix from their part, they told me that it's because the use of virtual threads on multithreading in ryzen that doesn't link well with NVIDIA processes

Now this makes some sense, as people have mentioned that disabling SMT does solve the issue.

What do you make of this, Diceman? When I contacted them about the matter, they said it had been forwarded to the research team or something along those lines, but I am sure nothing came of it.
 
Jul 3, 2020
3
0
10
After the AGESA updates the issue is still there. Before, when PCIe crashed it never recovered and made the system unusable; now it just goes into a quick black screen or app crash and then recovers back to normal. Ryzen 3000 has PCIe issues and AMD just keeps doing workarounds. The past WHEA errors were also related to PCIe, and AMD decided to hide those errors to stop users from complaining; now these AGESA updates just reset or recover PCIe but do not fix the problem.
 
Jul 3, 2020
4
0
10
I have the same issue, BUT only after updating the NVIDIA drivers from 446.14 to 451.48.
Stutters on 446.14. Black screen on 451.48.

Motherboard: MSI MPG X570 Gaming Edge WiFi
CPU: Ryzen 3600
GPU: KFA2 RTX2070Super Black Edition
Bios: Latest beta with agesa combov2 1.0.0.2

An AMD RX 570 GPU in my setup works with no GPU issues, but sometimes the Wi-Fi disappears. With the NVIDIA GPU I have no Wi-Fi issues (sic!)

P.S. Sorry for my English :)


Edit:
Before the driver update I had a different error code.
Before:
\Device\Video3
0cec(3098) 00000000 00000000

After:
\Device\Video3
0d02(31c8) 00000000 00000000
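
For anyone who wants to compare these payloads across driver versions, here is a minimal parsing sketch. It only splits the logged text into the device path and its hex tokens. What the individual tokens mean is not documented anywhere in this thread, so the code does not try to interpret them, and the names `Event14Payload` and `parse_event14_payload` are made up for this example.

```python
import re
from typing import NamedTuple, Tuple

class Event14Payload(NamedTuple):
    device: str
    words: Tuple[str, ...]  # hex tokens exactly as logged, lowercased

# One hex token, optionally followed by a second hex token in parentheses.
_TOKEN = re.compile(r"([0-9a-fA-F]+)(?:\(([0-9a-fA-F]+)\))?")

def parse_event14_payload(text: str) -> Event14Payload:
    """Split an Event ID 14 payload into the device path and its hex tokens."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    device, data = lines[0], " ".join(lines[1:])
    words = []
    for part in data.split():
        match = _TOKEN.fullmatch(part)
        if match is None:
            raise ValueError(f"unexpected token: {part!r}")
        words.extend(g.lower() for g in match.groups() if g is not None)
    return Event14Payload(device, tuple(words))

before = parse_event14_payload("\\Device\\Video3\n0cec(3098) 00000000 00000000")
after = parse_event14_payload("\\Device\\Video3\n0d02(31c8) 00000000 00000000")

# Positions of the tokens that differ between the two driver versions.
changed = [i for i, (a, b) in enumerate(zip(before.words, after.words)) if a != b]
print(before.words, after.words, changed)
```

For the two payloads above, only the first two tokens differ; the device path and the trailing zero words are the same.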
 

FruitSalad

Commendable
Jun 14, 2020
7
1
1,515
So far, after changing the power settings I have not had a single crash, but then again, that could just be a coincidence, as the crashes are very sporadic

I just got my replacement EVGA 2080 Super via Advanced RMA, so I will probably keep it on high power for a few weeks and then turn it back down and see what happens

Funnily enough the new card has solved the EVGA fan grinding issue

I feel like next time I might just go Intel again...
 

Aravind92

After the AGESA updates the issue is still there. Before, when PCIe crashed it never recovered and made the system unusable; now it just goes into a quick black screen or app crash and then recovers back to normal. Ryzen 3000 has PCIe issues and AMD just keeps doing workarounds. The past WHEA errors were also related to PCIe, and AMD decided to hide those errors to stop users from complaining; now these AGESA updates just reset or recover PCIe but do not fix the problem.

Sorry, I am not getting it. Nothing happened when the error was recorded in Event Viewer the last time: no app crashes, no black screen, nothing.

Either way, are you saying it will not be fixed by changing the CPU, and that all Ryzen 3000 CPUs come with a defective PCIe controller? One thing that baffles me is why it doesn't happen when the PC is under load. And why does it get fixed by setting NVIDIA's power management mode to prefer maximum performance?
 
Jul 3, 2020
3
0
10
The problem is related to PCIe power savings. In the normal scenario PCIe goes into a low-power mode when there is not enough load. There are ways to stop this PCIe switching: force maximum performance in the NVIDIA Control Panel, use more than one monitor (which also increases load), or turn off browser hardware acceleration, so that the GPU/PCIe link stops bouncing between power saving and full power.
 

Aravind92

It just happened again after 2 months. Based on Event Viewer it also seems to have happened on 3rd June 2020, but then it did not stutter; now it did. I was starting to watch a live stream on Discord and it started stuttering. This is becoming a real annoyance, and not being able to confirm what is causing it is the worst part. I'm going to update Windows, the NVIDIA driver and the BIOS now. Let's see.
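
For anyone who wants to list past occurrences without scrolling through Event Viewer, here is a rough sketch that builds a query for the Windows `wevtutil` command-line tool. It assumes the events land in the System log under the provider name `nvlddmkm`, as in the error shown in this thread, and the helper names are made up for this example. Actually running the query only works on Windows.

```python
import subprocess
from typing import List

def build_event14_query(max_events: int = 20) -> List[str]:
    """Build a wevtutil command listing the newest nvlddmkm Event ID 14 entries."""
    xpath = "*[System[Provider[@Name='nvlddmkm'] and (EventID=14)]]"
    return [
        "wevtutil", "qe", "System",  # query events from the System log
        f"/q:{xpath}",               # filter by provider name and event ID
        "/f:text",                   # human-readable text output
        f"/c:{max_events}",          # limit the number of events returned
        "/rd:true",                  # newest events first
    ]

def recent_event14_text(max_events: int = 20) -> str:
    """Run the query and return the raw text output (Windows only)."""
    result = subprocess.run(
        build_event14_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Show the command that would be run; call recent_event14_text() on Windows.
print(" ".join(build_event14_query(5)))
```

The timestamps in the output can then be compared with when the stuttering was actually noticed, since the error apparently can be logged without any visible symptoms.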
 

FruitSalad

Question for you all, what browsers are you using?

I don't know if it's an unrelated issue, but since turning the power settings back to optimal I am having performance issues in Firefox but not Chrome. Googling "Firefox nvidia" seems to show people having driver crashes only when running Firefox

EDIT: I think I have some actual reproducible problems now

I use Blue Iris NVR software, and in the WebUI the video will sometimes jerk down to 5 fps and then back up to 35 fps (well above double the FPS of the feed!) every second, causing horrible playback

When I look in X1 to see the clocks, the GPU is going from 300 MHz all the way to 1650 MHz every second, which is directly in line with the jerking

If I load up the Blue Iris WebUI in Chrome, it sits at around 400-500 MHz constantly and has no issues with playback

With the power settings turned to max in the NVIDIA Control Panel, the issue goes away entirely
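
If anyone wants to check for the same clock bouncing without watching X1, here is a rough sketch that samples the graphics clock through `nvidia-smi` (installed with the NVIDIA driver) and counts jumps between an idle band and a boost band. The 400 MHz and 1500 MHz thresholds are guesses based on the numbers above, not official values, and the function names are made up for this example.

```python
import subprocess
import time
from typing import Iterable, List

def read_graphics_clock_mhz() -> int:
    """Read the current graphics clock of the first GPU via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=clocks.gr", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.splitlines()[0].strip())

def sample_clocks(seconds: float = 30.0, interval: float = 0.5) -> List[int]:
    """Collect clock readings for `seconds`, one every `interval` seconds."""
    samples = []
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        samples.append(read_graphics_clock_mhz())
        time.sleep(interval)
    return samples

def count_swings(samples_mhz: Iterable[int], low: int = 400, high: int = 1500) -> int:
    """Count transitions between the idle band (<= low) and boost band (>= high).

    Readings between the two thresholds are ignored.
    """
    swings = 0
    last_band = None
    for mhz in samples_mhz:
        band = "low" if mhz <= low else "high" if mhz >= high else None
        if band is None:
            continue
        if last_band is not None and band != last_band:
            swings += 1
        last_band = band
    return swings

print(count_swings([300, 1650, 300, 1650, 300]))  # 4
print(count_swings([450, 480, 500, 420]))         # 0
```

On a machine with the driver installed, `count_swings(sample_clocks())` gives the number of idle/boost flips over 30 seconds, which can be compared between Firefox and Chrome or between power management modes.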
 
Jul 3, 2020
4
0
10
Using Chrome. I don't have these issues, only Event 14 sometimes
 

Aravind92

I am using Firefox as well.
 

RacAtat007

Distinguished
Aug 8, 2012
219
6
18,695
SOLVED WITH BIOS UPDATE

Specs -
CPU: R7 3700x
GPU: Gigabyte RTX 2060 6gb
Mobo: ASUS AM4 TUF Gaming X570
RAM: 16 GB Corsair Vengeance 3200 (2x8)


I was having the exact same issue as described. My PC would randomly stutter really badly and then usually freeze, forcing me to power down and restart. This happened once or twice over a few weeks but seemed to get more common over time, with no cause I could track down. I tried changing anything I could think of with no luck, but after a BIOS update it seems to have been fixed. It's been about 3 weeks without the issue happening at all. If you have the same board as me, I'm currently on BIOS ver 2203 with no issues. Hope this helps
 
Jul 16, 2020
3
0
10
I'm also having this problem, it's been very frustrating. I actually RMA'ed my GPUs thinking they were the culprit, not a week after the replacements arrived the problem began again. It's infuriating as I will be in the middle of work and the PC will suddenly lock up.

I've noticed the problem for me tends to be worse if I work a full day in UE4, Substance Painter or any other GPU intensive program and then leave my PC on overnight. That seems to lessen the occurrences a little bit. A little bit being key there. I usually experience these crashes every other day, sometimes daily, but after a lot of tweaking the crashes happen much less often now, but they still do happen.
Another thing I've found that helped a little bit was to up the voltage to the SoC a little, along with frequently restarting the GPU driver (Win+Ctrl+Shift+B). I'm not sure which made the most difference, but if I were a betting man I'd say it's the frequent restarting of the driver. If someone else wants to try both of these and see whether they experience fewer crashes, maybe we can find out.

I've run through all the basic troubleshooting steps, like testing the cards in another PC, testing this system with an older AMD GPU, and all the hardware seems to be OK. Just found this thread, so at least I now know I'm not alone in this.

Now, if I'm not mistaken, some motherboards with multiple PCIe slots will have some of those slots using chipset lanes instead of CPU lanes, has anyone tested using a slot that passes through the chipset vs directly into the CPU?
I've been thinking of trying a single card in every PCIe slot and see if that makes a difference.

Specs:
CPU: AMD Threadripper 3970X
Motherboard: Asus Zenith II Extreme
RAM: 256 GB Trident Z Neo
GPU(s): 2x Nvidia Titan RTX
Motherboard bios: Latest
Drivers: up to date

Device manager event:

Event ID 14/nvlddmkm

\Device\00000192
0d02(31c8) 00000000 00000000
 

Aravind92

Hi Guys,

I've contacted both NVIDIA's and AMD's support on this matter. AMD responded saying it has been sent to their engineering team to research.

With NVIDIA, I got the case escalated to 2nd level. The 1st-level agent hardly understood the issue and honestly wasn't very helpful, just giving me the generic troubleshooting steps, but the 2nd-level agent has been working on and researching the matter.

Could you guys please contact them too, so that both companies are aware that this is a widespread issue and that swapping hardware components doesn't help? With NVIDIA, please have it escalated.
 