Question: Consistent daily freezes on lock screen?

Jul 19, 2023
System:
Host: CubeBox Kernel: 5.15.0-76-generic x86_64 bits: 64 compiler: gcc
v: 11.3.0 Desktop: Cinnamon 5.6.8 Distro: Linux Mint 21.1 Vera
base: Ubuntu 22.04 jammy
Machine:
Type: Desktop Mobo: ASUSTeK model: TUF GAMING X570-PLUS (WI-FI) v: Rev X.0x
serial: <superuser required> UEFI: American Megatrends v: 4602
date: 02/23/2023
CPU:
Info: 16-core model: AMD Ryzen 9 3950X bits: 64 type: MT MCP arch: Zen 2
rev: 0 cache: L1: 1024 KiB L2: 8 MiB L3: 64 MiB
Speed (MHz): avg: 2298 high: 3595 min/max: 2200/3500 boost: enabled
cores: 1: 3595 2: 2054 3: 1982 4: 2074 5: 2196 6: 2194 7: 2198 8: 2203
9: 2053 10: 3592 11: 1871 12: 1866 13: 2802 14: 1868 15: 1872 16: 1869
17: 3594 18: 2053 19: 2056 20: 2192 21: 2195 22: 2196 23: 2197 24: 2204
25: 2052 26: 3582 27: 2065 28: 2053 29: 2202 30: 2205 31: 2205 32: 2201
bogomips: 223583
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] vendor: eVga.com.
driver: nvidia v: 470.199.02 bus-ID: 05:00.0
Display: x11 server: X.Org v: 1.21.1.4 driver: X: loaded: nvidia
unloaded: fbdev,modesetting,nouveau,vesa gpu: nvidia
resolution: 1920x1080~60Hz
OpenGL: renderer: NVIDIA GeForce GTX 1080 Ti/PCIe/SSE2
v: 4.6.0 NVIDIA 470.199.02 direct render: Yes
Audio:
Device-1: NVIDIA GP102 HDMI Audio vendor: eVga.com. driver: snd_hda_intel
v: kernel bus-ID: 05:00.1
Device-2: AMD Starship/Matisse HD Audio vendor: ASUSTeK
driver: snd_hda_intel v: kernel bus-ID: 0c:00.4
Sound Server-1: ALSA v: k5.15.0-76-generic running: yes
Sound Server-2: PulseAudio v: 15.99.1 running: yes
Sound Server-3: PipeWire v: 0.3.48 running: yes
Network:
Device-1: Intel Wireless-AC 9260 driver: iwlwifi v: kernel bus-ID: 06:00.0
IF: wlp6s0 state: up mac: a0:e7:0b:b5:24:71
Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: ASUSTeK driver: r8169 v: kernel port: e000 bus-ID: 07:00.0
IF: enp7s0 state: down mac: 58:11:22:4d:13:ae
Bluetooth:
Device-1: Intel Wireless-AC 9260 Bluetooth Adapter type: USB driver: btusb
v: 0.8 bus-ID: 3-5:2
Report: hciconfig ID: hci0 rfk-id: 0 state: up address: A0:E7:0B:B5:24:75
bt-v: 3.0 lmp-v: 5.1
Drives:
Local Storage: total: 2.3 TiB used: 27.42 GiB (1.2%)
ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 500GB size: 465.76 GiB
temp: 35.9 C
ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 970 EVO Plus 2TB
size: 1.82 TiB temp: 38.9 C
ID-3: /dev/sda type: USB model: USB DISK 2.0 size: 28.91 GiB
Partition:
ID-1: / size: 456.89 GiB used: 27.41 GiB (6.0%) fs: ext4
dev: /dev/nvme0n1p2
ID-2: /boot/efi size: 511 MiB used: 6.1 MiB (1.2%) fs: vfat
dev: /dev/nvme0n1p1
Swap:
ID-1: swap-1 type: file size: 2 GiB used: 0 KiB (0.0%) file: /swapfile
Sensors:
System Temperatures: cpu: N/A mobo: N/A gpu: nvidia temp: 40 C
Fan Speeds (RPM): N/A gpu: nvidia fan: 0%

I've been having issues for the past few months where my system will freeze in place on a screen, forcing me to reset. If I then unplug the monitors from the GPU and plug them back in, the screen does not come back up. I am using two monitors, one connected by HDMI and one by DP. I just recently (past week) installed a 2nd SSD to run Linux Mint. I have witnessed this occur on Windows, on Linux, and even in the BIOS screen.

I have witnessed this issue occurring on the lock screen with a clock display, so I am able to pinpoint to the second when the freeze occurred. However, checking the syslog shows no activity or information around the time of freeze.
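Since the freeze can be timed to the second, the log check can be repeated mechanically. Below is a minimal Python sketch, not a definitive tool: it assumes the traditional syslog timestamp format ("Jul 19 14:03:21", no year), and the function name `lines_near` and the two-minute default window are illustrative choices, not anything from the original post.

```python
from datetime import datetime, timedelta


def lines_near(lines, freeze_time, window_seconds=120):
    """Return syslog lines stamped within window_seconds of freeze_time."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for line in lines:
        try:
            # Traditional syslog stamps omit the year, so borrow it
            # from freeze_time (assumption: same calendar year).
            stamp = datetime.strptime(
                f"{freeze_time.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # continuation line or a different timestamp format
        if abs(stamp - freeze_time) <= window:
            matches.append(line.rstrip("\n"))
    return matches
```

To use it, open `/var/log/syslog` and pass the file object along with the observed freeze time, e.g. `lines_near(log, datetime(2023, 7, 19, 14, 3, 21))` (a hypothetical time). An empty result around a hard freeze is consistent with the system stopping before anything could be written to disk.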

Fixes I've already attempted to no avail:
-Updated BIOS
-Updated and reverted GPU drivers
-Upgraded DP firmware for GPU
-Reseated GPU
-Changed PSU cables to the CPU
-Ran memtest overnight to check for RAM issues: 7 passes, no errors

I'm starting to believe the GPU may be in need of replacement; it's around 5 years old.

Any help or advice to debug this issue would be greatly appreciated!
 
Please include the make and model of your PSU and its age. You could try swapping in a PSU borrowed from a neighbor or friend to see whether the issue persists.

As a side note, you could install a fresh copy of Windows 10 on a spare drive, kept separate from the other 2 drives, and install all the drivers manually to see whether that alleviates the issue.
 
Are you overclocking:
1). CPU (PBO, etc.)
2). RAM (XMP/DOCP/EXPO)
3). GPU (MSI Afterburner, etc.)
If so, disable any overclocks and see if stability improves.

I can't see any mention of your RAM, apart from the fact it passed Memtest 7 times.
How many DIMMs are installed and of what capacity?
Manufacturer and part number?

I recently upgraded an old 2600X with 3,000MT/s RAM and enabled XMP in the BIOS. The PC booted up fine and passed two runs of Memtest (3 hours 20min), which clearly showed the RAM was set to 3,000 during the test.

However, within 15 minutes of running any normal program, including web browsers, the PC would crash and reboot. I fixed the problem by manually tweaking the CL (CAS) setting of the RAM from 16 (the default in XMP) to 20. I haven't tried reducing CL to 19, 18 or 17 yet, because I rarely use this system. I know the 2600X isn't rated past DDR4-2933, but I took a chance with 3,000.
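To put that CL change in perspective, the CAS latency in nanoseconds can be worked out from the CL value and the data rate. This is a small Python sketch of the standard arithmetic; the function name `cas_latency_ns` is just an illustrative choice, and it says nothing about why the looser timing was stable on this particular system.

```python
def cas_latency_ns(cl, data_rate_mts):
    """CAS latency in nanoseconds for DDR memory."""
    # DDR transfers twice per clock, so one clock period is 2000 / (MT/s) ns.
    return cl * 2000 / data_rate_mts


for cl in (16, 20):
    print(f"DDR4-3000 CL{cl}: {cas_latency_ns(cl, 3000):.2f} ns")
# DDR4-3000 CL16: 10.67 ns
# DDR4-3000 CL20: 13.33 ns
```

Going from CL16 to CL20 at 3,000MT/s therefore gives the memory roughly 2.7 ns more time per column access.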

In the past, I'd always assumed that if memory passed several runs of Memtest it was probably OK, but this time I was wrong. Passing Memtest does not guarantee stability in all instances.

Although your RAM passes Memtest 7x, if it's overclocked try setting it back to JEDEC defaults (probably 2,133 or 2,400MT/s).

Or, as you say, it could be the GTX 1080 Ti that is at fault. If you have an old GT 710 or similar GPU lying around, give it a try before buying a new card. N.B. Does your system crash with only one monitor connected?
 
PSU is a Corsair RM850x that is about one year old.