[SOLVED] LiveKernelEvent 141... for the past year

Status
Not open for further replies.
Oct 12, 2019
Specs from Speccy v1.32.740
=======================================================================
Operating System: Windows 10 Home 64-bit
CPU: AMD Ryzen Threadripper 2920x
RAM: 16.0GB Single-Channel DDR4
Motherboard: ASRock X399M Taichi
Graphics: 4095MB NVIDIA GeForce RTX 2070 MSI /// 4095MB NVIDIA GeForce GTX 1080 (both GPUs I have tried in this system are listed)
Storage: 250GB Samsung 860 EVO SSD & 500GB Samsung 850 EVO SSD
Cooler: Kraken x52 Cooler
PSU: Corsair 750m
=======================================================================

Real fast: Speccy misreports both GPUs as 4GB; they are actually 8GB. I never used both at the same time, only one at a time. The same problems occur with either card.

TL;DR: my old computer broke, so I built an entirely new one, reusing only the PSU (which was 1.5 years old). Shortly after the build I started getting consistent crashes in any software with "high" video card utilization. This happens most often while playing games, using game development software (e.g. Unity), or running other rendering-heavy software like Adobe Photoshop or Affinity Publisher. It can still occur with something as simple as Twitch or YouTube, but less often. I tried many fixes over the past year, and none seemed to work. The crash presents differently depending on the application, but there is almost always a temporary black screen when the app crashes, followed by a report of LiveKernelEvent 141 (hardware error). Most modern games report it as: "Video Card Disconnected Unexpectedly".

I can usually use the software for ~3 hours before a crash occurs. I've monitored utilization, temperatures, and more, and nothing seems out of the ordinary (a few pieces of software misreport the CPU temperature, but the AMD application and motherboard readings show it is fine). After reading other forum posts I concluded that it was a hardware problem, so at the end of August I purchased a refurbished GTX 1080. The crashes became less frequent, but only marginally so. They still occurred in every application, which leads me to believe either that it was not in fact a hardware problem (RIP) or that a different piece of hardware is at fault. Someone recommended a new PSU, so I bought one (still 750m) and did not reuse any cables; this had no impact.

I'm looking for any advice on how to solve this issue, and I can provide any further information you may need. I can't list everything I've done in the past year, but hopefully it's clear that I tried the most obvious fixes.

What I can remember that I've tried:
  • Updating drivers / windows / BIOS
  • Reinstalling drivers / windows
  • Using older versions of drivers / windows
  • Safe Mode --> Uninstall drivers --> Reboot --> Reboot in safe mode --> Reinstall drivers --> Reboot
  • Uninstalling pretty much everything except for steam + a game and disabling all startup applications. Game still crashes.
  • Replaced PSU (no impact)
  • Tried a different card (listed above and no impact)
  • Monitored temperatures (CPU always stayed below 70 C and GPU stayed around 40-60C)
  • Turned off secondary display and disconnected and still crashed with a single monitor.
  • Tried different video cables (HDMI, DisplayPort, and VGA).
  • Ran software that theoretically "removed the driver entirely." While this is trusted software in the community I think in general it does little more than safe mode boot with uninstall/reinstall as I listed above.
  • Stress tested computer. Settings appeared to not affect crash time. The crashes appear to occur mostly based on time. If anything, the higher the settings the less frequently it crashes.
  • Ran a memory test using some recommended software on the RAM that found it had no issues.
  • Double and triple checking hardware compatibility.
  • Changing refresh rate to see if that affected crash times (unclear if meaningful crash time difference).
  • Every form of asking windows to check and automatically solve problems for you.
  • Validating system files with sfc /scannow.
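The impression that crashes depend on elapsed time rather than on load could be checked by noting when each session starts and when it crashes. Below is a minimal Python sketch of that bookkeeping. The helper name `hours_until_crash` and the timestamps are placeholders invented for illustration; they are not measurements from this system.

```python
from datetime import datetime
from statistics import mean, pstdev

TIME_FORMAT = "%Y-%m-%d %H:%M"


def hours_until_crash(sessions):
    """Return the hours elapsed between each (start, crash) timestamp pair."""
    return [
        (datetime.strptime(crash, TIME_FORMAT)
         - datetime.strptime(start, TIME_FORMAT)).total_seconds() / 3600
        for start, crash in sessions
    ]


# Placeholder timestamps for illustration only (not real crash data).
sessions = [
    ("2019-10-01 18:00", "2019-10-01 21:00"),
    ("2019-10-02 19:30", "2019-10-02 22:00"),
    ("2019-10-03 20:00", "2019-10-03 23:30"),
]

durations = hours_until_crash(sessions)
print("hours per session:", durations)
print("mean:", mean(durations), "spread:", pstdev(durations))
```

A small spread around the same duration across very different workloads would support the "time, not load" reading; a large spread would argue against it.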
I worry that it might be a motherboard issue, but honestly I just want the problem to be done with at this point. Please don't ask me to buy a new piece of hardware just for testing purposes though. Thank you for your help.
 
Which exact RAM stick is it (brand and model)?
In which DIMM slot is it installed?

Reset the BIOS with the clear-CMOS jumper (clrCMOS / CLRTC).

Check the motherboard's CPU socket for bent or discolored pins.

Is the CPU cooler possibly overtightened on the CPU?

In which PCIe slot is the GPU installed? Did you try a different PCIe slot?

Are both SSDs SATA drives or M.2?
Check the drives with Samsung Magician and update the firmware if an update is available.

But it looks like a motherboard issue.
If you are really unlucky, even the CPU could be the problem.
 
This is the RAM: https://www.amazon.com/Patriot-Memory-2400MHz-PC4-19200-Single/dp/B074Q1LM4K
There are four slots on the motherboard for the RAM. Two are left of the CPU and two to the right of it. I installed the RAM on the far left slot. I read through the manual while installing and there were rules for installing dual channel, but single channel was supposed to be simple and "work" in any slot and I think it said just place it on the left.

The CPU cooler does not appear in any way to be excessively tight on the CPU. I generally leave screws looser if anything. A stuck/stripped screw is my worst nightmare.

The GPU is installed on the topmost PCIe slot. I have not tried a different slot. I believe my motherboard manual recommended I use the topmost one (forget why), but I can try the one below it.

The SSDs are both SATA drives. I downloaded the software and used the diagnostic scan on both drives (no errors) and updated the firmware.

Before I fully open the PC up and examine the CPU socket, I want to make sure I correctly understand what you're asking me to do. Resetting the BIOS is simple. I should add that I updated the BIOS late in my year-long debugging process because I didn't want to screw anything up. My questions are mostly about checking the motherboard's CPU socket for bent or discolored pins. This would require me to remove the CPU cooler and CPU, no? Would I need to reapply thermal paste? I just want to make sure I don't screw up the computer further. Once I hear back I will make all of the hardware changes you recommended (including double-checking the RAM slot suggestions) and rigorously test the results.

If the pins are bent/discolored would you recommend I simply buy a new motherboard and reuse all of my other components or would that not work for some reason? If the pins seem fine am I <Mod Edit> out of luck?

Thank you for your response.
 
Solution
It's very uncommon to install only a single RAM stick in a quad-channel platform. Neither the manual nor the memory compatibility list states how a single stick should be installed 🤔, or that one can be installed alone at all.
Possibly the BIOS does not handle this case and has problems with only one stick, but that would be very uncommon too.
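For a sense of scale, here is a rough Python sketch of the theoretical peak memory bandwidth with one populated channel versus four. It assumes the DDR4-2400 speed given in the RAM's product link (PC4-19200) and the standard 64-bit data bus per channel; the function name is made up for this example. It covers bandwidth only and says nothing about stability.

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels, bus_width_bits=64):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    transfer_rate_mts: memory speed in megatransfers per second (2400 for DDR4-2400)
    channels: number of populated memory channels
    bus_width_bits: data bus width per channel (64 bits for standard DDR4)
    """
    bytes_per_transfer = bus_width_bits // 8
    return transfer_rate_mts * bytes_per_transfer * channels / 1000


single = peak_bandwidth_gbs(2400, 1)  # one stick, one channel
quad = peak_bandwidth_gbs(2400, 4)    # same speed with all four channels populated
print(f"single-channel: {single} GB/s, quad-channel: {quad} GB/s")
```

The single-channel figure matches the PC4-19200 label on the stick (19200 MB/s); four populated channels would quadruple the theoretical peak.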

Is the CPU Xtreme OC Switch (MOS_PROCHOT1) set to off on the board (above the DDR4_B1 Dimm slot)?

Is BIOS version 3.80 installed?

Yes, to check the pins you will have to remove the CPU from the board.
View: https://www.youtube.com/watch?v=O1H5_FVX9lU

Maybe this can help as well:
View: https://www.youtube.com/watch?v=oKmIbvil_0I
 
The BIOS version is 3.8 and the CPU Xtreme OC Switch is set to off. I can try turning it on if you think that would help. I haven't had time to open the case up yet and check the CPU pins, but I will have time tomorrow. If BIOS reset, PCIe slot swap for graphics, and CPU pin check all come back negative would you recommend I purchase some dual-channel memory or is there more that could be done?
 
The switch is fine as it is; it could have explained the issue had it been enabled.

Checking with different RAM can help. Get a kit from ASRock's QVL list:
https://www.asrock.com/mb/AMD/X399M Taichi/index.asp#Memory

Other than that, the motherboard might be faulty.
If you are very unlucky, the CPU might have an issue, but that's really not common.
 
Alright sorry it took me so long (life got in the way), but I finally opened the beast up and went fixing.

I reset the BIOS (cleared the CMOS). I moved the graphics card to another PCIe slot. I checked the CPU socket and found a single pin that was slightly bent. It took forever to get it "fixed," but I eventually realigned it. I cleaned the old paste off the CPU and cooler, put the CPU back, reapplied paste, and screwed everything back in, following the ASRock tightening instructions and double-checking that nothing was too tight.

I have no clue if the computer is working as intended yet, but I will report back if I continue to experience crashes. If I don't report back, thank you so much for your help!
 
It still crashes... BUT at least I got a new type of error. Event viewer is showing an Event 13, nvlddmkm.

"\Device\Video3
Graphics Exception: SKEDCHECK03_LOCAL_MEMORY_HIGH_SIZE failed"

I don't know what I'm supposed to do now. I've just been looking into this error, but it's not too promising...
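To see how often this nvlddmkm Event 13 recurs, the System log can also be queried from a script instead of scrolling through Event Viewer. The Python sketch below builds a query for Windows' built-in `wevtutil` tool; the helper name `build_wevtutil_query` is hypothetical, and it assumes the events are logged in the System log under the provider name shown above.

```python
import platform
import subprocess


def build_wevtutil_query(provider, event_id, count=10):
    """Build a wevtutil command that lists the newest matching System events."""
    xpath = f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # XPath filter on provider name and event ID
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # newest events first
        "/f:text",       # human-readable output
    ]


if __name__ == "__main__":
    cmd = build_wevtutil_query("nvlddmkm", 13)
    if platform.system() == "Windows":
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout)
    else:
        # wevtutil exists only on Windows; show the command instead.
        print("Would run:", " ".join(cmd))
```

Comparing the timestamps of these events with the times of the crashes would show whether every crash is accompanied by this error.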
 