Inconsistently unstable system, crashes, bootloops, you name it!


buck369

Prominent
Jan 6, 2018
15
0
510
Hey guys,

I built my first PC recently, it consists of:

MSI B350 Tomahawk

Ryzen 1600

Corsair TX750M

Corsair Vengeance LED DDR4 3200 MHz 2x8gb (CMU16GX4M2C3200C16R)

Samsung PM961 SSD 512GB M.2 NVMe

RX Vega 56

Seagate Barracuda 3TB

Windows 10 Home

In a Phanteks P400S tempered glass

___________________________
TLDR:

My system sometimes works fine and I can game for 5 hours. Sometimes it crashes shortly after entering Windows; this problem seemed to go away after turning off AMD Cool'n'Quiet. But I recently had it crash during a Windows update, which just made everything worse.

These crashes sometimes sent me into bootloops. Sometimes I have bootloops from cold boot. Crashes have happened at stock CPU and RAM settings.
___________________________

What follows is a chronological account of what my PC has been doing for about a month.

At the time this happened the CPU was running at stock settings.

I was playing GTA5 for about 4 - 5 hours, I had A-XMP activated, and was running with absolutely no issues.

Then I closed the game and put Windows into sleep mode. Upon returning I fired up Microsoft Edge, and about 30 seconds in the screen freezes, the CPU fan speeds up a little bit and the power shuts off.

The computer automatically restarts, I get into Windows, launch Edge again and crash soon thereafter. This time the computer automatically tries to boot unsuccessfully 5 - 6 times: the power turns on and off with nothing appearing on screen at any time, no BIOS or Windows screens.

I am eventually prompted with a Windows recovery screen, asking me to choose between repair options or restarting the computer.

I chose restart computer.

I figured it might be the A-XMP, which was the last thing I fiddled with, so I turn that off in BIOS.

I get into Windows, I launch Edge again, and after a while I press the Windows key twice, which again freezes the computer and causes the power to shut off.

I am now prompted with a black screen reading "Reboot and select proper boot device"

A concern was that I could not find my Samsung SSD in the boot menu upon restarting BIOS, it used to read ~ "UEFI HARD DRIVE:Windows Boot Manager (SAMSUNG MZVLW512HMJP-00000)", now there was only an option of "UEFI Hard drive".

I can boot using "UEFI Hard drive" and get into Windows. But I keep getting crashes after a minute or so.

To clarify, these crashes are happening at stock CPU clock and RAM (2133) settings. The only deviation in BIOS should be the CPU fan speed, which I had turned down at idle because it went up by 1000 RPM when I activated A-XMP. The CPU temp was less than 35 degrees at the time of the crashes according to HWinfo64. I will try to reset the fan speeds as well, though, and see if that's the problem.

Update: I reset the fan speed, which was still my custom curve from before, i.e. BIOS does not appear to have been automatically reset. Now the whole name of my Samsung SSD showed up in the boot options, but it failed to boot off it twice. I was returned to BIOS, where the name had disappeared and was now showing "UEFI Hard drive" again. I chose this as boot option 1, then I noticed a setting called "HDD BBS Priority", which had my 3TB storage disk as the only option (which was selected). I disabled this setting.

I booted into Windows. Tried my hand with GTA 5 instead of Edge. After about 3 minutes the screen froze for about 10 seconds, after which the computer crashed, I was quickly booted into Windows again.

It seems really strange to me how I can play for 4 hours straight and then suddenly not be able to run Edge or press the Windows key.

My CPU temperature was never over 51 degrees Celsius while playing. HWInfo did show that my Samsung M.2 SSD had reached around 75 - 80 degrees (I don't remember exactly) as a maximum while playing. I don't know if this is enough to damage the SSD, which could in turn cause my crashes?

I've run Memtest86 for ten passes with RAM at 2133 after all this happened, and I had no errors at all. I've also done 6 passes with A-XMP enabled at 3200, no errors reported.

I turned off Hibernation sleep in Windows and have not used the sleep setting since I started having these issues.

________________________________

I turned off AMD Cool'n'Quiet, and since then I've had 5 - 6 gaming sessions without any issues, ranging from 1 - 3 hours.

One issue that I noticed was that sometimes my memory speed is reduced to 2133 even though A-XMP is activated in BIOS/UEFI, along with the 1000 RPM CPU fan increase that follows with the activation of A-XMP.

Unfortunately, problems returned yesterday:

Upon starting the computer it boots off my USB stick with Memtest86 on it, even though USB is not prioritized ahead of my SSD in the boot sequence. Memtest86 starts very slowly reading my hardware.

I turn off the computer and start it again: I counted 4 or 5 failed attempts in a bootloop, then I am prompted with a "Reboot and select proper boot device".

I turn off the computer, start it again and the same thing happens, 5 failed attempts in a bootloop which is followed by the "Reboot and select proper boot device".

I reboot into BIOS/UEFI and look at the boot order: I believe it has changed again since I last saved my desired order. I change "UEFI Hard disk" to 1, and "Hard Disk" to 2.

I save the settings and try to boot: another bootloop of five unsuccessful attempts follows. Then a Windows screen appears saying "incorrect password"; this is the screen you get if you enter the wrong password at the Windows login. I hit OK, enter my correct password and I'm finally inside Windows. After this I could play games for about 90 - 120 minutes without any issues. Note that even though A-XMP was activated and memory was at 3200 MHz last time I checked BIOS/UEFI, the memory is running at 2133 according to HWInfo64 and CPU-Z. The fan curve is however still affected by the A-XMP setting, which has it running about 1000 RPM faster at idle.

I'm really not an expert but I'm thinking I might have (had?) several issues? I have not had any crashes in Windows/while playing since deactivating AMD Cool'n'Quiet. But I'm still experiencing issues with booting, specifically getting bootloops, having my boot order changed around and the names of the items in the list changing.

I'm thinking this is caused either by my memory settings, an issue with my Samsung SSD, or the Windows installation. As mentioned, Memtest86 works flawlessly, but I'm not sure if my RAM is actually compatible. It is not on this list, but some that are similarly named are: https://www.msi.com/Motherboard/support/B350-TOMAHAWK.html#support-mem-12

Note that I am not on the latest BIOS version; I still have the one from July.

I will try running the memory without A-XMP for a while, at stock 2133 setting to see if I get any further boot issues.

____________________________________

I've had a few sessions of usage/gaming run fine. A-XMP is still activated and indicating that memory is running at 3200 in the settings, but the speed stated in UEFI/BIOS says 2133, which is what I get according to HWInfo. I turned off A-XMP. Upon trying to reboot I get a black flashing screen, which indicates no signal from the computer. I turn off and boot into Windows without any issue.

However, my GPU started making a new noise, a high-frequency, high-pitched buzzing when I'm in a game. It comes and goes instantly when alt-tabbing. This was with a custom Radeon profile; it mostly goes away at the default settings. It seems unrelated to my other issues though.

I've noticed ticking noises from my computer that can be felt when touching the chassis. I suspected this was my HDD (not the OS SSD), so I disconnected it this morning. With the backplate off, I started the computer and was prompted with a Windows update, you know the one that says "Don't turn off the computer".

At about 30% the computer crashes straight into UEFI/BIOS.

Note that this is the first crash I've had in many sessions that was "when using the computer", i.e. not a bootloop. Otherwise I've had no crashes in Windows/gaming since turning off AMD Cool'n'Quiet.

I exit UEFI and return to a Windows update screen at 35%, which goes almost instantly to 90 and 100%.

At the Windows login screen I enter my password but nothing happens. I do it again and get logged in.

After this the computer feels a bit strange, doesn't register all my input, after a while it freezes, goes to black screen and reboots into UEFI/BIOS. Similar thing happens again shortly after.

I wanted to try to reset my Windows system to an earlier point, but as none were saved this was not possible according to Windows.

Any help is greatly appreciated cause I'm lost and frustrated. Let me know if you need any additional information, thanks!
 

buck369

Prominent
Jan 6, 2018
15
0
510
Without changing anything since my last post, I was able to run Ubuntu from a live USB stick: two sessions without any issues at all, watching YouTube and playing Sudoku.

Then I went over on Windows again and I could run the PC for about two hours without issues.

Today I could play Rising Storm 2 Vietnam for an hour.

Strange things are:
- The GPU seemed to run in its lowest state; utilization was very low even though I had noticeable performance issues. After maybe 45 minutes all the lights on the GPU came on and the fan sped up, which seemed strange.
- I can't open Radeon Settings anymore. I click the button but nothing happens; it has an icon running on the lower right of the task bar but there is no interaction or response from it.
- The time has changed on the PC; it's now one hour behind my time for some reason.
- It didn't crash.

So I'm back to inconsistently unstable it seems. Not sure which is worse.
 
Hmm, very often if the time is changing in BIOS, it's an indication of a dead or dying CMOS battery (CR2032) or inconsistent power from the PSU.

A CMOS battery is about $4, so try that. The only way to tell if it's the PSU is to swap it out with another of the same wattage or higher, unless you can take it to be tested under load.

The GPU lighting up after 45mins is strange and again could be a lack of consistent power and an indication that it's the PSU or poor cabling and connections.
 

buck369

Prominent
Jan 6, 2018
15
0
510
From the other evening:
Boot -> start Event Viewer and HWinfo, the former stops responding, freeze -> blue artifact screen, crashes and reboots to "incorrect PW or username" screen.

Log in, can play Rising Storm 2 for 2 hours; the video card is again running at very low speeds and fan RPM, low utilization with poor performance. Then I exit the game and watch some YouTube clips. I get the blue artifact screen and crash to BIOS. I believe this is the first crash I've had that wasn't within a few minutes of starting the computer. When it works for 5 minutes it usually goes on until I want to turn it off, so this was a first. I try to boot but the screen is just black for a few minutes, then the PC turns itself off.

New day, new adventure:
Turn on -> log in -> do nothing -> blue artifact screen without the artifacts -> crash.
Prompted with Windows Recovery "Windows didn't load correctly". I click restart -> just sits at BIOS screen (the one that says press F11 for this, Del for that) -> I think I pressed Ctrl+Alt+Del -> PC just sits at BIOS screen again.
I can't press Ctrl+Alt+Del this time, no response. I turn the power off.

Turn on -> BIOS screen which says "DISC CHECKING" -> Scanning and repairing C, it completes at 100%.
I get a Windows Automatic repair.
I just restart. I log in, notice that the time is now correct again, a week ago it was one hour behind. Computer goes to sleep.

Wake the PC -> download a game through Steam, but it soon crashes to Windows recovery.
I just choose restart.
I get to the pre-login Windows screen, the mouse moves but nothing else responds. I get black screen -> blue screen -> reboot.
Then I can finish the download and play the game for two hours.
Computer sleeps.

Since doing a recovery and rolling back GPU drivers, I have not been able to start "AMD Settings" or click the corresponding logo in the Windows tray on the bottom right.
I try to install the latest Radeon drivers. After pressing OK to give admin approval, the pop-up disappears but I can still only see my background image with a gray transparent overlay. Then I get back to the desktop -> black screen -> blue artifact screen. Crashes to Windows login screen, unresponsive -> black screen -> blue screen -> crash.
BIOS screen flashes, then black screen -> low resolution Windows screen " :( Windows needs to restart"
PC goes into BIOS -> Automatic repair, Windows asks me to choose keyboard layout... Go with Swedish. I choose startup repair -> Windows couldn't repair, and there is no log file where the screen implies there should be one. I turn off the PC.

Turn on -> log in -> try installing same Radeon drivers -> works -> now I can open Radeon Settings again. I play Rising Storm 2 for 4 hours without issues. Turn off.

This morning:

PC on -> log in -> open Edge and Event Viewer, it freezes. I press Caps Lock repeatedly and the indicator on the keyboard switches initially, but after a while the indicator stops responding to my button presses, and seconds later it crashes. -> Windows screen "Windows needs to restart, collecting...0%", it disappears without moving from 0%. -> Preparing automatic repair, choose keyboard layout again, startup repair, doesn't work this time either. I turn off.

Turn on PC -> goes directly to blue screen with "Incorrect user name or password" -> OK -> log in -> do nothing -> crash to BIOS within a minute. The BOOT debug LED is on. -> Black screen with "Reboot and select proper boot device". -> Ctrl+Alt+Del to reboot -> BIOS screen flashes -> Windows recovery -> just continue to Windows -> "Incorrect user name or password" again -> log in -> do nothing -> screen goes black, all I see is the taskbar with 3 logos -> crashes to "User name or PW incorrect" -> I crash again to the same screen -> OK -> login screen is unresponsive -> goes black -> crashes -> "Incorrect PW or user name" -> crashes again to same screen. I turn off.

TLDR: New stuff is a crash after a two hour gaming session; usually it only crashes within minutes of starting or it doesn't crash at all.
BIOS running disc scanning and repairing.
Time is correct again, last week it was one hour behind.
Radeon Settings works again after installing new drivers from Feb 1 (or 2); the program would previously just ignore me. The GPU also worked better once I installed them: rather than staying in its lowest states of activity and getting really hot, it would use higher ones and get better performance.
Could play for 1, 2 and 4 hours on separate occasions, but incessant crashes on all other attempts at using the computer.

New event IDs: 25, 8229, 16389.
Screen: https://imgur.com/a/KPRlg
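
For anyone wanting to track how often these IDs show up, here is a minimal Python sketch that filters an Event Viewer export (Event Viewer -> "Save All Events As..." -> CSV) for the three event IDs listed above. It assumes the export has a header row with "Event ID" and "Source" columns, which is what an English-language Windows typically produces; a Swedish install may use localized column names, so `id_column` and `source_column` are parameters. The function names are made up for illustration.

```python
import csv
import io
from collections import Counter

# Event IDs taken from the post; adjust to whatever shows up in your own log.
WATCHED_IDS = {25, 8229, 16389}


def filter_events(csv_text, watched_ids=WATCHED_IDS, id_column="Event ID"):
    """Return the rows of an exported event log whose ID is in watched_ids.

    csv_text is the full text of the CSV export. The column name is an
    assumption about the export format and may differ on localized systems.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        raw = (row.get(id_column) or "").strip()
        # Skip rows where the ID is missing or not a plain number.
        if raw.isdigit() and int(raw) in watched_ids:
            hits.append(row)
    return hits


def count_by_source(rows, source_column="Source"):
    """Count matching events per reporting source."""
    return Counter(row.get(source_column) or "unknown" for row in rows)
```

To use it on a real export, something like `filter_events(open("events.csv", encoding="utf-8-sig").read())` should work; the `utf-8-sig` encoding is there in case the file starts with a byte order mark, and the file name is just a placeholder.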

I will reinstall Windows again and not install any drivers.

I don't have any powerful enough PSUs lying around. Does it make sense that it works for many hours but 90% of the time crashes fairly quickly if it's the PSU?
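
To put a number on the "crashes within minutes or not at all" pattern, a small Python sketch like the one below could tally sessions from a hand-kept log. The `Session` record and the 5-minute cutoff are assumptions for illustration (the cutoff mirrors the "when it works for 5 minutes it usually goes on" observation); nothing here reads real system data.

```python
from dataclasses import dataclass


@dataclass
class Session:
    minutes_up: float  # how long the PC ran after power-on
    crashed: bool      # True if the session ended in a crash


def summarize(sessions, early_cutoff_min=5.0):
    """Summarize how many sessions crashed, and how many crashed early."""
    crashes = [s for s in sessions if s.crashed]
    early = [s for s in crashes if s.minutes_up <= early_cutoff_min]
    return {
        "sessions": len(sessions),
        "crashes": len(crashes),
        "early_crashes": len(early),
        # Share of all crashes that happened within the cutoff.
        "early_share_of_crashes": len(early) / len(crashes) if crashes else 0.0,
    }
```

If nearly all crashes land inside the cutoff, that at least backs up the impression that the first few minutes after boot are the risky window, whatever the underlying cause turns out to be.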

Thanks!

 

teslacoilftw

Honorable
Aug 4, 2012
45
0
10,560
This may seem like an odd request, but can you take some pictures of the inside of your case? Like some clear pictures of your SSD drives, your RAM, your video card etc...

As I've stated before, this seems like a memory management issue, either RAM or hard drive...
Format the hard drive and reinstall Windows, making sure you're installing the 64-bit version of Windows.
When reinstalling Windows, make sure your 3TB hard drive is not plugged in.

Don't install any Windows updates... Don't activate...
As soon as Windows 10 boots after the install, go into Microsoft updates and disable all updates.

Don't install any Radeon drivers...
Let's see if the computer will restart without any drivers...

let us know...
 

buck369

Prominent
Jan 6, 2018
15
0
510
Hey again, thanks for the responses, really appreciate it.

Here are some screens, in some random order:

https://imgur.com/a/Z1t5K

The pics of the processor and the bottom of the cooler are from back in December. There's a black dot under "Ryzen" on the processor; I tried to scrape it off with my nail back in December but it wouldn't come off, so I left it.

So I cracked the case open, used a lot of force to remove the power cables from the GPU, then I took the photos. I think I touched the exposed bit on the GPU (the stuff in the "X", see screenshot). I was wearing an anti-static bracelet though; should I worry about damage? (It seems to work fine, if you read on.)

I tried Ubuntu from a USB stick just to make sure everything was back together correctly.

I reinstalled Windows, booting off a UEFI USB. There are screens to show the process. I did not have my HDD plugged in, only the SSD. I formatted three of the existing partitions; the MSR reserved one I could not. I installed Windows 10 Home 64-bit.

After installation BIOS did some disc checking, getting devices ready.

I get into Windows and use Services.msc to disable and stop Windows Update. I also disable automatic driver updates. Windows did some checking for updates during the installation though. The only weird thing was the blue thinking circle on the mouse pointer when hovering over the task bar.

In Event Viewer I'm getting loads of 200-202 errors, which might be due to disabling Windows Update.

I've noticed another strange thing in Event Viewer. You can see that I've got the Kernel-Power and failed crash dump errors that I've had before, only I haven't had a crash yet. I turned the computer off normally, and the next time I started it I had that error.

I've had it on for many hours this weekend, mostly for gaming, no crashes yet. I had an error a few times when trying to run Rising Storm 2: "VNGame.exe application error, the instruction at 0Xlotsofnumbersandletters referenced memory at 0xagain. The memory could not be read.". I circumvented the problem by exiting Steam and trying again.

I can now open AMD settings and configure my GPU. I could not do that after doing the system restore a while back, which should have been the exact same drivers.

In one screen there's the Windows WHQL support setting, should I have this on?

What should I do next? In what steps should I try to break it? Why do I get Kernel Power errors like before but without crashes?

Seems unlikely that beating up my GPU solved anything, maybe it frightened the other parts.

There's also a new BIOS out since a week ago. I have not updated though.

Thanks!
 

teslacoilftw

Honorable
Aug 4, 2012
45
0
10,560
I'd leave the BIOS alone unless the update for the BIOS stipulates an issue that you are having...
I'd leave the computer alone and not perform any updates for a few days and see if it starts crashing. If it doesn't, the good news is it's not a hardware problem...
the bad: it's some driver issue.
As far as the WHQL Support goes, I personally leave it on. As far as I understand it's basically the "auto" IRQ assignment when installing operating systems...

 

buck369

Prominent
Jan 6, 2018
15
0
510
Hey guys, I'm sorry to be back!

I followed your instructions back in February and reinstalled Windows, deactivated the Windows Update service and did not install any drivers. A few Windows updates may have sneaked in directly after the installation, I don't exactly recall.

To my surprise, the PC worked fine, until this past weekend on the first of April...

After waking the PC from sleep I fire up Cities: Skylines and load in to a game. The screen goes completely red, the sound gets stuck and Windows crashes to UEFI. The boot/HDD debug LED on the MB is on.

I exit UEFI, log in to Windows, the screen goes blue with white artifacts and crashes. Event Viewer has the same log messages as in my previous posts (volmgr, Kernel-Power), along with entries saying that Windows Update isn't working.

I can restart and play Cities:Skylines just fine after this.

The next time I try, the game works for 10 min, then I get the red screen and stuck sound again. The fan speeds up as it usually does before these crashes, but sort of hesitates a bit; it took longer than usual to crash.

This evening I fire up Rising Storm to see if the crash was specific to Cities: Skylines, as I just started playing it this weekend. While loading in to a server the red screen appears and I crash to BIOS/UEFI; the boot/HDD debug LED is on. I exit and try to reboot, the same debug light is on and it fails.

I exit again and am prompted with the MB start screen, which then sends me to "Reboot and select proper boot device". I turn the PC off. The boot order is set to UEFI HDD, which is correct, but the MB doesn't detect the HDD.

I turn the PC off and on again, but I only get to BIOS/UEFI, and the boot LED is on.

I turn the power off and try again; this time I can log in to Windows. I try to back up some files, open YouTube, MS Edge crashes, I go to open Event Viewer, the PC acts very sluggish, I get a blue screen and crash to the Windows login screen. I log in, open Edge, the Edge window goes solid gray, I get a blue screen with artifacts, and crash to BIOS/UEFI with the boot LED on.

Then I came to share my misery here. So the PC worked fine from about mid February to April 1. During this time I had blocked Windows Update, performed no driver updates, and installed nothing but a few games. It was working without issue but is suddenly unusable again. The new feature of the crashes is a red screen when it crashes from a game.

I'm thinking it suddenly seems like a GPU problem? It seems to be triggered by games and YouTube. But then again, I had crashes before where the PC just idled from the moment I logged in. There is obviously something going on with the boot SSD, since the debug light comes on all the time.

Best regards,