Question Troubleshooting periodic server restart

jozeftierney

Reputable
May 4, 2018
54
0
4,540
We built a small business server last summer, and this past winter it began to restart a couple of times a week. It sometimes goes a couple of days without restarting and sometimes restarts up to four times a day (twice within the last hour).

System specs:
Intel Xeon 1230 V5
Windows Server 2016 Essentials
16 GB Kingston KVR24E17D8/16 RAM
250W single power supply

Running through an APC UPS

I've combed through the event logs. When the server boots back up it logs Event ID 6008 (unexpected shutdown), and I can't find any common events appearing before the crashes. There are also no similarities between the times the server restarts. It runs at around 36 degrees Celsius (~96 degrees F). It's set to not automatically restart after a blue screen, but the server is always starting up by the time we realize it crashed. I also made sure there were no scheduled tasks causing this, and I am very confident no one else has been accessing the server remotely or even been near it physically.

The weird thing is that while it can happen at any point throughout the day, it almost always happens at somehour:22 or somehour:52 (like 9:22, 11:22, 5:52, 7:22). Almost all the Event ID 6008 occurrences in the Event Viewer have a timestamp with 22 for the minute.
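
To check how strong that :22/:52 pattern really is, the restart times could be tallied by minute of the hour. This is only a rough sketch: it uses the four example times quoted above, and in practice the list would be filled from the timestamps of the exported Event ID 6008 entries. The function names are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# The example times quoted in the post. Replace with the timestamps of the
# Event ID 6008 entries exported from Event Viewer.
restart_times = ["9:22", "11:22", "5:52", "7:22"]


def minute_histogram(times):
    """Count how many restarts fall on each minute of the hour."""
    return Counter(datetime.strptime(t, "%H:%M").minute for t in times)


def half_hour_offsets(times):
    """Fold each minute onto a 30-minute cycle, so :22 and :52 both map to 22."""
    return Counter(datetime.strptime(t, "%H:%M").minute % 30 for t in times)


hist = minute_histogram(restart_times)
folded = half_hour_offsets(restart_times)

for minute, count in sorted(hist.items()):
    print(f":{minute:02d} -> {count} restart(s)")
print("offset within a 30-minute cycle:", dict(folded))
```

If nearly every restart lands on the same offset within a 30-minute cycle, that points toward something running on a half-hourly or hourly schedule rather than a random hardware fault.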

The server runs our SolidWorks 2018 license manager and vault manager as well as QuickBooks and maybe some scheduled backups, nothing very demanding.

My boss thinks it may have started after a Windows update in February. The event log shows this error occurring before that, but it could have had a different cause then; I'm not sure.

Windows and all drivers are up to date. I checked the BIOS event viewer for anything, but it's completely empty even though it's always been enabled. Is that normal?

I've spent the last two days scouring the internet and trying other people's solutions but still no luck. Any help or leads are greatly appreciated. Let me know if there's any useful information I've missed.

Thanks!
 

jozeftierney

Is it possible to list your Server's make and model or at the very least the make and model of the motherboard in your server? It could be a power issue, whereby the PSU is failing after a particular duration of usage.

ASUS-P10S-M-DC Server Board

Sorry, I should've included that before. We got the ASUS RS100-E9-PI2, which includes the chassis, motherboard, and PSU all in one. I'm running memory tests tonight; the PSU is the next thing to test.
 

kanewolf

Titan
Moderator
I don't think so, can't find anything showing it does and I've never set it up.
You may have never set it up, but it might benefit you. If you look at the motherboard description you see a "dedicated management LAN" above the USB ports. BUT it says the IPMI module is optional. So you may not have full remote management. The optional board plugs in between the CPU socket and the back panel I/O. You would have to check to see if you have the module.
 

jozeftierney

I have realized I was only looking at the System log in the Event Viewer. Upon inspecting the Application log, I found that the Software Protection service logs a lot of events at matching times (1:20, 2:20, 3:20...). Could this service be tied to the problem?
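
One way to test whether those events are actually tied to the restarts is to check which of them fall in a short window before each restart. The sketch below assumes both sets of timestamps have already been exported from Event Viewer. The date and the specific times in it are hypothetical placeholders, not values from the real logs, and `events_before` is a made-up helper name.

```python
from datetime import datetime, timedelta


def events_before(restarts, events, window_minutes=5):
    """For each restart, list the events logged within the preceding window."""
    window = timedelta(minutes=window_minutes)
    return {
        r: [e for e in events if timedelta(0) <= r - e <= window]
        for r in restarts
    }


# Hypothetical placeholder data: two restarts at :22 and an hourly event at :20.
day = datetime(2018, 5, 4)
restarts = [day.replace(hour=9, minute=22), day.replace(hour=11, minute=22)]
sps_events = [day.replace(hour=h, minute=20) for h in range(8, 13)]

matches = events_before(restarts, sps_events)
for restart, hits in matches.items():
    print(restart.strftime("%H:%M"), "<-", [h.strftime("%H:%M") for h in hits])
```

Keep in mind that an event which fires every hour will precede any restart that happens at :22, so a match here is a lead to follow up on rather than proof. It is also worth counting how many of those hourly events are not followed by a restart.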
 

Zhire

Distinguished
Dec 23, 2003
13
0
18,510
So this does not appear to be directly related to a hardware issue. Since this is cyclic in nature, it is most likely some application or service trying to authenticate or being installed on the server. I would look at the audit log and check who or what is logging in around that time, to rule out an external actor. After that, I would look at taking a snapshot of the processes running at or around 22 minutes past the hour. This is most likely your culprit.
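
The process snapshot could be taken with a small script run from Task Scheduler a couple of minutes before the usual restart time. This is a minimal sketch that assumes an English-language Windows install, where `tasklist /FO CSV` prints "Image Name" and "PID" column headers. The output file name is a hypothetical choice. For the login check, the "audit log" here presumably means the Security log in Event Viewer.

```python
import csv
import io
import subprocess
import sys
from datetime import datetime


def parse_tasklist(csv_text):
    """Turn `tasklist /FO CSV` output into a list of (image name, PID) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Image Name"], int(row["PID"])) for row in reader]


def snapshot():
    """Run tasklist (Windows only) and return the parsed process list."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV"], capture_output=True, text=True, check=True
    ).stdout
    return parse_tasklist(out)


if __name__ == "__main__" and sys.platform == "win32":
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # Hypothetical file name; written to the current working directory.
    with open(f"processes-{stamp}.txt", "w") as fh:
        for name, pid in snapshot():
            fh.write(f"{pid}\t{name}\n")
```

Comparing the snapshots taken just before a crash against ones from quiet periods should show which processes only appear around :20 and :50.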
 

jozeftierney

The only thing I can find about an audit log is through Microsoft Office, which isn't installed on the server.

I disabled the Google Update task, which ran every hour (12:18, 1:18...), and it cut the number of Software Protection service events from 12 in a row (all happening within a second or two) down to 6.