Question Samsung 980 pro nvme ssd as boot drive causing BSOD

Status
Not open for further replies.
Nov 14, 2020
3
0
10
It has been driving me mad for 2 weeks: my newly built workstation was randomly getting BSODs with a WHEA uncorrectable error.
After trying pretty much every fix I could think of or find online, the only suspects left were the motherboard and the 980 Pro boot drive.
Here is a spec list:

Asus Pro WS X570-Ace motherboard
128 GB total, 4 x 32 GB Corsair Vengeance LPX 3200
AMD Ryzen 3900XT
Fractal s36+ dynamic
1500 W be quiet! Dark Power Pro PSU
1x M.2 NVMe Samsung 980 Pro 1 TB
1x M.2 NVMe Corsair MP400 2 TB
2x EVGA 3080 XC3 Ultra Gaming
1x EVGA 1080 Ti Black Gaming
5x Corsair mk140 case fans
Fractal Define 7 case with LINKUP riser cable

I tried the Samsung 980 Pro boot drive in both M.2 slots (PCIe 4.0 x4 and PCIe 4.0 x2 (x1 shared)), but got a BSOD in both.
I tried reinstalling Windows in both UEFI and Legacy mode, and got BSODs both times.
The disk passed all the tests in the Samsung Magician software, and the workstation passes all the tests for 1 hour of OCCT, CPU-Z and FurMark. But at random, unpredictable times I got a BSOD.
The weird thing is that it got worse the more GPUs were installed.
Single 1080 ti -> 1-2 days without BSOD
Single 3080 -> max 12 hours without BSOD (sometimes after 20 min or less)
Double 3080 -> mostly not longer than 1 hour

Since I had already formatted the SSD and reinstalled Windows, monitored temperatures (which stayed low), kept getting BSODs in both M.2 slots, and the drive passed all the diagnostics, I ruled it out as a potential cause early on in my troubleshooting.
After the problem persisted, as a last resort I installed Windows on the Corsair MP400 drive instead of the Samsung 980 Pro, and voila: it has been running stable without a BSOD for 2 days with all the RAM and all the GPUs installed (of course I am still biting my nails and hoping it won't crash again... but I think this solved it). I left the Samsung 980 Pro in the second M.2 slot, but it is not running Windows or being used in any other way.

But I am still puzzled as to why this happened. Is it motherboard related, i.e. does the NVMe drive stop getting sufficient power when more GPUs are installed? Or is it a firmware/integrity problem with the Samsung 980 Pro (in both M.2 slots, using different PCIe lanes)?
The weird thing is that the speed was bottlenecked in the second M.2 slot due to a lack of lanes, but the computer still crashed, so the speed of the 980 Pro can't be the cause.
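
To put rough numbers on the lane bottleneck mentioned above, here is a minimal sketch of the theoretical link bandwidth for the two slots. It assumes the standard PCIe 4.0 figures (16 GT/s per lane, 128b/130b encoding) and ignores all other protocol overhead, so real-world throughput will be lower. The function name is just for illustration.

```python
# Rough theoretical PCIe link bandwidth (one direction).
# Assumption: only line-encoding overhead is counted, nothing else.
GT_PER_S = {3: 8.0, 4: 16.0}   # raw transfer rate per lane, in GT/s
ENCODING = 128 / 130            # 128b/130b encoding used by PCIe 3.0 and 4.0


def link_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Return the approximate bandwidth of a PCIe link in GB/s."""
    bits_per_s = GT_PER_S[gen] * 1e9 * ENCODING * lanes
    return bits_per_s / 8 / 1e9


if __name__ == "__main__":
    for lanes in (4, 2):
        print(f"PCIe 4.0 x{lanes}: ~{link_bandwidth_gb_s(4, lanes):.2f} GB/s")
```

That works out to roughly 7.9 GB/s for the x4 slot and 3.9 GB/s for the x2 slot, so the x2 slot caps the drive at about half the x4 link's bandwidth. The crashes occurring in both slots is consistent with the point that raw speed is not the trigger.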

I also wanted to share this to see if any other people are experiencing BSOD with the 980 pro as boot drive.
 
Dec 15, 2020
1
0
10
I'm also experiencing a similar issue:

I have an Asus X99 Pro + Intel 5820k, and I recently upgraded to a 1TB 980 Pro. Ever since then, I've been getting random WHEA_UNCORRECTABLE_ERROR blue screens.

According to BlueScreenView, the Microsoft Storage Port Driver (storport.sys) appears in the stack trace, which does point to the SSD being the culprit. Can anyone else confirm if they're seeing the same thing?
 
Dec 7, 2020
3
1
10
I see that BlueScreenView analyzes the minidump. I've been using Debugging Tools for Windows for the same purpose. I couldn't find any minidump. Presumably that's because the SSD's controller stopped responding, so there was no way to write it to the drive. So I guess my symptoms were a little different.
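
For anyone who wants to check quickly whether a crash left a minidump behind, here is a small sketch. It assumes the default Windows location (%SystemRoot%\Minidump); the dump folder can be changed in the system's Startup and Recovery settings, so treat the path as an assumption. `list_minidumps` is a made-up helper name, not part of any tool mentioned here.

```python
import os
from pathlib import Path


def list_minidumps(dump_dir=None):
    """Return .dmp files in dump_dir, newest first (empty list if the folder is missing)."""
    if dump_dir is None:
        # Assumption: default Windows minidump folder under %SystemRoot%.
        dump_dir = Path(os.environ.get("SystemRoot", r"C:\Windows")) / "Minidump"
    dump_dir = Path(dump_dir)
    if not dump_dir.is_dir():
        return []
    dumps = [p for p in dump_dir.iterdir() if p.suffix.lower() == ".dmp"]
    return sorted(dumps, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    found = list_minidumps()
    if not found:
        print("No minidumps found (the dump may never have been written).")
    for dump in found:
        print(dump)
```

An empty result after a BSOD fits the "stuck at 0%" symptom: if the boot drive has dropped off the bus, there is nowhere to write the dump.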

Things have been "so far so good" for me ever since I changed my case fan setup, let's hope it stays that way. So I still think I was looking at a thermal issue caused by my GPU.
 
storport.sys is called by the NVMe driver - note that there's both Microsoft's default/stock NVMe driver and also Samsung's own. Obviously you should use Samsung's if at all possible, although some laptops using Intel RST will instead let the Intel driver manage storage which disables the ability to use Samsung's driver.
 
Dec 20, 2020
16
2
15
I'm having the same issue since day 1 of building my PC.

Specs are:
Gigabyte Z490 Vision D (newest BIOS as of today)
Intel Core i9-10900K
2x HyperX Fury RGB 32GB 2x16 DDR4 3600MHz CL17 XMP (HX436C17FB3AK2/32) [4 sticks total, 64GB total] - running at 3000MHz
Gigabyte GTX 1080 (from last build, awaiting on my RTX 3080 to arrive)
be quiet Straight Power 11 750W
CPU cooling is AiO Alphacool Eisbaer 360
Case is Fractal Design Define 7 TG Light Tint (8 fans total incl. 3 CPU block fans)
Samsung 980 PRO 1TB

I sometimes don't get any BSOD at all, sometimes 2-3 a day. It's always WHEA Uncorrectable error with harsh repeating sound in the headphones. Dump never progresses above 0% and no minidump is ever generated - like the SSD got physically unplugged, there is never any trace of error.
After the PC restarts, it goes into the BIOS and the 980 Pro is completely undetected. I can restart it however many times I like and it still won't be detected; I have to turn the computer off completely for a few seconds, and then it is detected and starts booting into Windows. After it boots Windows 10 (version 20H2), the first 1-2 minutes are extremely sluggish: I can't open anything and all the startup programs load extremely slowly one by one, then snap, everything goes back to normal.

I thought at first surely the 980 Pro is having a thermal issue, so I moved it to a slot under the GPU and bought the best heatsink solution I could find, which turned out to be Cryorig FrostBit M.2 Heatpipe. While it did considerably lower the temperature, it did not help the problem. The drive never goes over 50C anyway. Usually stays above 40C.

BSODs are almost exclusive to games, no particular one, just any game at all. And funny thing is, most of my games aren't even installed on that drive, but completely different one. It doesn't matter which drive runs the game. I'll get the BSOD either way. Sometimes I'll watch a movie streamed from a local server and I'll get the BSOD as well. Sometimes I will browse something in Firefox, and the page will start acting funny, browser will kinda hang, and then I'll get the same BSOD also.

I tried messing with all sorts of settings in BIOS, disabled MCE, tried to find different drivers for the NVMe etc, but at this point I'm at a loss. Everything points to the drive controller suddenly being missing in action.


This is extremely frustrating since I have waited nearly 10 years to upgrade to this PC. My last PC only had the GPU upgraded to a 1080, which I have in this rig temporarily. I did considerable maintenance on this GPU, including a thermal paste upgrade to liquid metal, which lowered its temperature.
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
Hello everyone,

I had the same WHEA UNCORRECTABLE ERROR as you, stuck at 0% with no boot drive on the first restart, on my new build: 5800X, 4x8 Crucial Ballistix and a 980 Pro. Over the last four weeks I tried everything except my SSD, and now I have just swapped it to check whether that solves my problem. This post gives me hope; maybe I have finally found the faulty component.
But I haven't had this BSOD for at least 2 weeks. I have a different crash than you, and I'm wondering if you had it too.
The crash: Windows seems to work, but I can't open apps in the taskbar or the hidden taskbar, I can't open the Start menu, and some apps don't work properly but don't crash or show an error message. I was able to change the fan speed in Argus Monitor, but only once. I tried to open YouTube in Firefox, but the site loads indefinitely as if I had no internet, without showing the error page. I can still move my apps around the screen. I'm forced to do a hard restart with the button on my case, and after that there is absolutely no error message in the Windows Event Viewer except for the unplanned shutdown that I did myself. This crash occurs approximately once per week. Does it ring a bell for anyone? Sometimes when the crash occurs I'm even still able to play for a few minutes, and when I try to close the game it freezes. This crash happens on average once every 4 days, but it varies. One time I got no crash for 7 days, and sometimes I have two crashes in the same day.
 
Dec 20, 2020
16
2
15
That's exactly what I had before. I could open file explorer, but could not open any file, everything would load for infinite time and seem really sluggish. Firefox tabs were white and could not interact with any page, nor could I load any new ones. Few minutes of this and I either had to manually reset it or I would get that same BSOD either way.

Did you fix this by replacing the 980 Pro? Looks to me like those drives have some serious issues.
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
I took out my 980 Pro two days ago and switched to my older 970 EVO. It's too early to say whether that solved it, but so far, no crash.
 
Dec 20, 2020
16
2
15
The worst and most annoying thing is that I'm totally positive that the drive is faulty, but it's impossible to request an RMA on a hunch. It will just be sent back to me as "working" and I might be charged some fees even... frustrating.
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
@FrozenHaxor One of my friends just got the same issue as us, the weird Windows crash, but he has a Samsung 970 EVO Plus rather than a Samsung 980 Pro. So I don't know what that means. In my case, no crash in almost 3 days, but that's still not enough to be sure.
Do you know how your BIOS is configured (AHCI or RAID mode, UEFI or Legacy)?
 
Dec 20, 2020
16
2
15
I have SATA disabled completely, as I don't have any drives that require it. Boot mode is UEFI, though my motherboard is a bit weird: when I set CSM to Disabled, it re-enables itself on the next boot. Windows boots through Windows Boot Manager purely in UEFI mode, though.
 
Dec 26, 2020
3
0
10
I just created my account because I stumbled onto this thread. I recently built an AMD 5000 PC and have been plagued by WHEA errors. There are a lot of BSOD reports from these new AMD CPUs, so I figured I was a part of that group. One thing I noticed with my BSODs is that on restart I would lose the 980 Pro in the BIOS. Also, I would never get any minidump from the BSOD; it would just hang at 0% forever. I removed my 980 Pro from the motherboard and installed Windows on my other M.2 drive (970 EVO). So far, no restarts or WHEA errors. I'm glad I found this thread because this seems like exactly the same issue that everyone else has been having. Looking forward to some resolution, even if it is an RMA.
 
Dec 20, 2020
16
2
15
To me it appears those 980 Pro SSDs have either faulty controllers or buggy firmware. I have i9 so it's not AMD exclusive error. It's acting like the drive was just ripped out of the motherboard and ceases to exist. Only a full power cycle makes it appear and usable again.
 
Dec 26, 2020
3
0
10
I reached out to Samsung C/S for advice. Hopefully they will rma. I'll let you guys know my findings.
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
Bad news, I still got the weird crash even with the 980 Pro removed. It can't be two defective M.2 SSDs from Samsung, the 980 Pro and the 970 EVO, so I don't know what it can be. We all have different systems. I've been trying to solve this for almost 4 weeks and I'm totally desperate; I don't know what to do anymore. Does anybody have more information?
 
Dec 20, 2020
16
2
15
Do you still get no minidump and get stuck at 0% on the BSOD? And then the drive is being undetected as usual?
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
No, the only crash that I have now is the one where Windows is still running, but the taskbar and apps stop responding and webpages load indefinitely. The BSOD never appears with this crash and there is no error in Event Viewer.
 
Dec 20, 2020
16
2
15
Which Windows version are you on? Did you try to reinstall? If you're on 2004, switch to 20H2. If you're on 20H2 and it still happens, try 1909. Otherwise this is hardware. Your case seems a lot different from ours with the infinite BSOD.
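
If it helps to tell which of those releases a machine is actually on, here is a tiny sketch that maps the build number shown by `winver` to the release names used above. It only covers the three releases mentioned in this thread, and the script name in the usage comment is hypothetical.

```python
import sys

# Windows 10 build numbers for the releases mentioned in this thread.
BUILD_TO_RELEASE = {
    18363: "1909",
    19041: "2004",
    19042: "20H2",
}


def release_name(build: int) -> str:
    """Map a Windows 10 build number (as shown by winver) to its release name."""
    return BUILD_TO_RELEASE.get(build, f"unknown (build {build})")


if __name__ == "__main__":
    # Usage (hypothetical file name): python release.py 19042
    build = int(sys.argv[1]) if len(sys.argv) > 1 else 19042
    print(release_name(build))
```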
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
I'm on 20H2. I'll try different things; someone said they solved it by disabling all sound drivers except the one in use, since apparently they can conflict. It seems so random at this point. I don't think it can be a defective CPU: I can play for hours, even days, without any crash. If it were the CPU, it would happen every time I'm under heavy load, and I even ran Prime95 for 30 minutes without any problem. It can't be my GPU; I've had it for almost 2 years without any issue. I tested my RAM with Memtest86+ and TestMem5 (anta777 config) without any errors at all. I changed my motherboard from a B550 Aorus Pro AC to a ROG Strix B550-A without any improvement. I changed my PSU and the crash occurred both before and after the swap, so it can't be that either. I'm on my third Windows installation and every one of them has had the crash.
 
Dec 20, 2020
16
2
15
Try Windows 1909. It's the last stable release.
 

julien-page

Reputable
Jan 13, 2018
8
0
4,510
It's probably the next thing on my list. Today I tried the sound driver thing; it didn't work. I tried running my RAM at stock; that didn't work either. Now I'm on the new Asus BIOS released on 25 December.
Interesting fact: on 22 and 23 December I got the crash four times before switching to my old 970 EVO. After that I didn't have any problem for three days. This morning, after my crash, I put my 980 Pro back in, and since then I've already had two crashes. It seems that the 980 Pro is way more unstable than the 970 EVO; I don't know if that helps to find the problem.
 
Dec 7, 2020
3
1
10
tl;dr: power management might be the problem.

As I stated in my previous posts, I am getting the same symptoms that are being reported by @FrozenHaxor and several other people.

I initially suspected a thermal issue, so I cranked up my case fan settings. That seemed to help for a while, but then this issue happened again just one more time while I was playing a game a few days ago.

I did some Googling and noticed that the Anandtech review of this part mentions crashes due to ASPM on AMD systems: https://www.anandtech.com/show/16087/the-samsung-980-pro-pcie-4-ssd-review/8

Quoting the article:
We haven't sorted out all the power management quirks (or, less politely: bugs) on our new Ryzen testbed, so the idle power results below are mostly from our Coffee Lake system. The PCIe Gen4 drives have been tested on both systems, but for now we are unable to use the lowest-power idle states on the Ryzen system.

Since AMD has not enabled PCIe 4 on their Renoir mobile platform and Intel's Tiger Lake isn't quite shipping yet, these scores are still fairly representative of how these Gen4-capable drives handle power management in a typical mobile setting. Once we're able to get PCIe power management fully working crash-free on our Ryzen testbed, we'll update these scores in our Bench database.

Because of this, I have updated to the latest beta BIOS for my ASUS motherboard and the latest 2.10 chipset driver from AMD, hoping that they've resolved these issues. I've still got Maximum power savings configured for PCI Link Power Management in the Windows Power control panel; if the issue recurs again, my next steps will be to move to less aggressive ASPM sleep states or disable ASPM altogether.

Unlike some other motherboards, there is no way to disable ASPM in this Ryzen BIOS; it has to be done in the Windows power control panel under PCI Link Power Management. I'd prefer to keep ASPM enabled if possible because it reduces drive idle power 10x and is a necessary prerequisite for package C6 state to function.

Edited to add: I'm a little unclear which power states the Anandtech article is referring to. Could be ASPM or could be device states. The latter are controlled by Windows settings documented at https://docs.microsoft.com/en-us/wi...-management-for-storage-hardware-devices-nvme so the quickest and easiest way to disable all that stuff seems to be to set the power scheme to Ryzen High Performance. I do see the drive idling a couple degrees hotter at that setting.
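
For checking which PCI Link Power Management level is currently active without clicking through the control panel, here is a hedged sketch that parses the output of `powercfg /query`. It assumes the standard powercfg aliases (SUB_PCIEXPRESS, ASPM; `powercfg /aliases` lists them) and English-language output, since the text is localized on other Windows languages. The helper names are made up for illustration.

```python
import re
import subprocess

# Index values of the "Link State Power Management" setting.
ASPM_LABELS = {0: "Off", 1: "Moderate power savings", 2: "Maximum power savings"}


def parse_power_indexes(query_output: str) -> dict:
    """Extract the AC/DC setting indexes from `powercfg /query` text output."""
    pattern = re.compile(r"Current (AC|DC) Power Setting Index:\s*0x([0-9a-fA-F]+)")
    return {source: int(value, 16) for source, value in pattern.findall(query_output)}


def query_aspm() -> dict:
    """Run powercfg (Windows only) and return the indexes, e.g. {'AC': 2, 'DC': 2}."""
    result = subprocess.run(
        ["powercfg", "/query", "SCHEME_CURRENT", "SUB_PCIEXPRESS", "ASPM"],
        capture_output=True, text=True, check=True,
    )
    return parse_power_indexes(result.stdout)


if __name__ == "__main__":
    for source, index in query_aspm().items():
        print(f"{source}: {ASPM_LABELS.get(index, index)}")
```

Only `parse_power_indexes` is pure text handling; `query_aspm` needs to run on the Windows machine in question.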
 
Nov 25, 2020
2
0
10

Thanks for this information, I was able to solve the problem by modifying the power management settings. I also noticed that the SSD was listed as an HDD in the BIOS information, so I also set "Turn off hard disk after" to 0 minutes to disable that feature, and I disabled PCI Express Link State Power Management.

Since then, I no longer have any fatal errors. Very happy to be using the 980 pro again.
 
Dec 26, 2020
3
0
10
Small update: I sent my SSD in to Samsung for "repair" and still haven't heard anything back from them in a while. Frustrating, since I would like to try this fix.
 
Jan 11, 2021
1
0
10
Have there been any new developments on this? I "sadly" bought a new system with a 980 Pro and I have the same issues as all of you (random BSODs, minidump stuck at 0%, SSD not showing in the BIOS except after a hard reboot).

Is the power setting proposed earlier working? Has anyone found a workaround?


Edit:
The power settings seem to FIX it!
I updated to the latest beta BIOS and tried installing the AMD chipset drivers again. That did not do much per se, BUT after setting the power plan to "High performance" (you don't have the Ryzen profiles anymore with the 5000 lineup), setting "Turn off hard disk after" to 0 and setting PCI Express Link State Power Management to Off, I have not experienced any more crashes (yet).
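
The same settings can also be applied from the command line instead of the control panel. Below is a minimal sketch that only builds the `powercfg` commands and prints them by default. It assumes the standard powercfg aliases (SUB_PCIEXPRESS/ASPM for link state power management, SUB_DISK/DISKIDLE for the disk timeout, SCHEME_MIN for the stock High performance plan); check `powercfg /aliases` on your own machine before running anything. The function names are hypothetical.

```python
import subprocess


def build_powercfg_commands(scheme="SCHEME_CURRENT"):
    """Build powercfg commands that turn off ASPM and the disk idle timeout."""
    commands = []
    for flag in ("/setacvalueindex", "/setdcvalueindex"):
        # 0 = Link State Power Management "Off"
        commands.append(["powercfg", flag, scheme, "SUB_PCIEXPRESS", "ASPM", "0"])
        # 0 = never turn off the disk
        commands.append(["powercfg", flag, scheme, "SUB_DISK", "DISKIDLE", "0"])
    # Re-activate the scheme so the changed values take effect.
    commands.append(["powercfg", "/setactive", scheme])
    return commands


def apply(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False (needs an admin prompt)."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(build_powercfg_commands())
```

Passing `scheme="SCHEME_MIN"` would target the High performance plan rather than whichever plan is currently active. Keeping the dry run as the default makes it easy to review the commands first.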
 