Question Bizarre Crash

Oct 18, 2024
I am having a very odd crashing issue.
My computer will suddenly shut off without logging anything in the Event Log other than the 'unexpected shutdown' entry you would expect after a power failure. On the next boot it will sometimes show the message "hot component detected". So I turned up my fan speed and started logging temperatures with Open Hardware Monitor. The logs do not show any temperature above 70C since I raised the fan speed, but the problem continued. So I went nuclear: I opened the case and pointed a large room fan inside, to no avail. The log covers CPU, GPU, HDD, and SSD (there are no sensors for the power supply or RAM).
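For anyone who wants to check a temperature log the same way, here is a minimal sketch that pulls the highest value out of each column of a CSV log. The column names and sample rows are made up for illustration; a real Open Hardware Monitor log has its own headers, so treat the layout as an assumption.

```python
import csv
import io

def column_maxima(csv_text):
    """Return {column name: highest numeric value seen} for a CSV log."""
    maxima = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, raw in row.items():
            try:
                value = float(raw)
            except (TypeError, ValueError):
                continue  # skip timestamps, empty cells, and other non-numbers
            if name not in maxima or value > maxima[name]:
                maxima[name] = value
    return maxima

# Hypothetical sample data; real logs use different column names.
sample = """Time,CPU Package,GPU Core,SSD
12:00:01,55,62,40
12:00:02,61,68,40
12:00:03,58,66,40
"""
print(column_maxima(sample))
```

Keep in mind that a log like this only holds polled samples, so a very short spike between two samples would not appear in it.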

But it gets stranger. This issue only occurs when running two ANNs concurrently, YOLOv11 and SAM-2. The problem occurs regardless of whether I put the data on the SSD or the HDD.
I know ANNs are a heavy load, so I tried FurMark, and it ran with no problems at all! Unlike the ANNs, FurMark can get the GPU temperature up to 95C, quite the head scratcher. (No problems with CPU stress tests either.)

Specs:
CPU: i5-6600
GPU: NVidia 2060 Super
RAM: 32GB "My Technology Geeks" (This is the name of reman company, so I assume Chinese ram)
SSD: 512GB 'heoriady'. Definitely China, also the temp always reports '40C' even when everything else is closer to 30.
HDD: 3TB Hitachi
PSU: Dell (365W). The maximum power usage logged by Open Hardware Monitor during the ANN runs was 258W (CPU (package + cores + graphics + DRAM) + GPU).
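
To put those two numbers side by side, a quick back-of-the-envelope sketch. It uses only the 365W rating and the 258W logged peak from above; the constant and function names are just for illustration. The logged figure covers CPU and GPU sensors only, so drives, fans, and the motherboard are not counted, and software readings are polled samples that may miss short spikes.

```python
PSU_RATED_W = 365    # rating quoted for the Dell PSU
PEAK_LOGGED_W = 258  # highest CPU + GPU sum seen in the log

def headroom(rated_w, load_w):
    """Return (spare watts, spare fraction of the rating)."""
    spare = rated_w - load_w
    return spare, spare / rated_w

spare_w, spare_frac = headroom(PSU_RATED_W, PEAK_LOGGED_W)
print(f"{spare_w} W spare ({spare_frac:.0%} of rating)")
```

On paper that leaves 107W spare, but only for the components the sensors can see.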

From all I can tell this issue shouldn't happen.
 
RAM: 32GB "My Technology Geeks" (This is the name of reman company, so I assume Chinese ram)
is this a 2x 16GB kit,
2 separate 16GB kits,
4x 8GB kit, etc..?

go ahead and run a few rounds of Memtest from a USB drive and see if any hiccups occur.
computer will suddenly shut off without even putting any info in the event log other than the 'unexpected shut down' expected for a power failure
usually related to a power supply that just can't handle the load.

considering that you run stress tests with no similar response,
watch power usage during these processes and see if the multiple ANN instances seem to cause any heavier power load.
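
one way to watch it, as a rough sketch: poll nvidia-smi once a second while the ANN jobs run and keep the highest reading. this assumes nvidia-smi is on the PATH and that the card reports power.draw (some report N/A instead); the function names, sample count, and interval are arbitrary choices, and it only covers the GPU, not the whole system.

```python
import subprocess
import time

# Ask nvidia-smi for board power (W) and GPU temperature (C) as bare CSV.
QUERY = ["nvidia-smi",
         "--query-gpu=power.draw,temperature.gpu",
         "--format=csv,noheader,nounits"]

def parse_sample(line):
    """Turn a line such as '171.35, 64' into (watts, celsius)."""
    watts, celsius = (field.strip() for field in line.split(","))
    return float(watts), float(celsius)

def watch_gpu(samples=60, interval_s=1.0):
    """Poll the first GPU and return the highest power draw seen, in watts."""
    peak = 0.0
    for _ in range(samples):
        out = subprocess.run(QUERY, capture_output=True,
                             text=True, check=True).stdout
        watts, celsius = parse_sample(out.splitlines()[0])
        peak = max(peak, watts)
        print(f"{watts:.1f} W  {celsius:.0f} C")
        time.sleep(interval_s)
    return peak
```

call watch_gpu() in a second terminal while both networks are running, then compare the peak against a FurMark run. one-second polling will still miss very short spikes, so a clean log does not rule the power supply out.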