Question: Samsung 980 Pro NVMe SSD as boot drive causing BSOD

Status
Not open for further replies.
Nov 14, 2020
3
0
10
0
It has been driving me mad for 2 weeks: why was my newly built workstation randomly getting BSODs with a WHEA uncorrectable error?
After trying pretty much every fix I could think of and/or find online, the only suspects left were the motherboard and the 980 Pro boot drive.
Here is a spec list:

Asus Pro WS X570-Ace mobo
128 GB total, 4 x 32 GB Corsair Vengeance LPX 3200
AMD Ryzen 3900XT
Fractal S36+ Dynamic
1500 W Be Quiet Dark Power Pro PSU
1x M.2 NVMe Samsung 980 Pro 1 TB
1x M.2 NVMe Corsair MP400 2 TB
2x EVGA 3080 XC3 Ultra Gaming
1x EVGA 1080 Ti Black Gaming
5x mk140 Corsair case fans
Fractal Define 7 case with Linkup riser cable

I tried the Samsung 980 Pro boot drive in both M.2 slots (PCIe 4.0 x4 and PCIe 4.0 x2 (x1 shared)), but got a BSOD in both.
I tried reinstalling Windows in both UEFI and Legacy mode, but got BSODs both times.
The disk passed all the tests in the Samsung Magician software, and the workstation passed all the tests in 1 hour of OCCT, CPU-Z and FurMark. But at random, unpredictable times I got a BSOD.
The weird thing is that it got worse the more GPUs were installed:
Single 1080 Ti -> 1-2 days without BSOD
Single 3080 -> max 12 hours without BSOD (sometimes 20 min or less)
Double 3080 -> mostly no longer than 1 hour

Since I had already formatted the SSD and reinstalled Windows, temperatures stayed low, the BSODs happened in both M.2 slots, and the drive passed all the diagnostics, I ruled it out as a potential cause early on in my troubleshooting.
As the problem persisted, I took a last resort and installed Windows on the Corsair MP400 drive instead of the Samsung 980 Pro, and voilà. It has been running stable without a BSOD for 2 days with all the RAM and all the GPUs installed (of course I am still biting my nails and hoping it won't crash again... but I think this solved it). I left the Samsung 980 Pro in the second M.2 slot, but it is not running Windows or being used in any other way.

But I am still baffled as to why this happened. Is it motherboard related, in that it fails to deliver sufficient power to the NVMe drive when more GPUs are installed? Or is it a firmware/integrity problem with the Samsung 980 Pro (in both M.2 slots, using different PCIe lanes)?
The weird thing is that the speed was bottlenecked in the second M.2 slot, due to a lack of lanes, but the computer still crashed, so the speed of the 980 Pro can't be the cause.
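
For anyone following the lane argument, here is a rough back-of-the-envelope sketch of the theoretical link bandwidth of the two slots. It only accounts for the PCIe line encoding (128b/130b at 16 GT/s per lane for PCIe 4.0), so real-world throughput will be lower. The function and table names are made up for illustration.

```python
# Rough per-direction PCIe link bandwidth. Only the line encoding is
# accounted for; packet/protocol overhead is ignored, so real drives
# will see less than this.

# generation -> (transfer rate per lane in GT/s, line-encoding efficiency)
PCIE_GENERATIONS = {
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def link_bandwidth_gb_per_s(generation: int, lanes: int) -> float:
    """Return the theoretical link bandwidth in GB/s (decimal gigabytes)."""
    transfer_rate, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency gives usable Gbit/s per lane; divide by 8 for GB/s.
    return transfer_rate * efficiency / 8 * lanes


if __name__ == "__main__":
    for lanes in (4, 2):
        bandwidth = link_bandwidth_gb_per_s(4, lanes)
        print(f"PCIe 4.0 x{lanes}: {bandwidth:.2f} GB/s")
```

The x2 slot tops out at roughly half of the x4 slot, which is why the drive is bottlenecked there. Since the crashes happened at both link widths, raw link speed does not look like the trigger, which matches the reasoning above.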

I also wanted to share this to see if any other people are experiencing BSODs with the 980 Pro as boot drive.
 

mdd1963

Polypheme
So your 32 GB modules are actually tested/certified at 3200 MHz, and are listed as such on the Asus website for your X570? If not... try lower RAM clocks... or only 2 sticks...

I seriously doubt the 980 Pro is the issue when you have that sort of RAM config at that speed, but if you are saying that config is always rock solid with every other drive, then... go with your gut.
 

Lutfij

Titan
Moderator
Which BIOS version are you currently on for your motherboard? I'd advise using just one stick of RAM in the primary RAM slot to see if you can use your system without an issue. Speaking of issues, how did you source the installer for your OS? Once you install the NVMe drive, you're advised to install the NVMe drivers, found on Samsung's support page.
 
Nov 14, 2020
3
0
10
0
Thanks for your reply.
Yes, I tried with only 2 memory sticks, and then with the other 2, to exclude the possibility of a RAM problem (which is the most commonly listed solution for this kind of BSOD online). I also ran Windows Memory Diagnostic and the OCCT RAM test without errors. Now the computer is running smoothly with all RAM sticks, but with the MP400 instead of the 980 Pro.
 
Nov 14, 2020
3
0
10
0
At first I had the BIOS version that came with the board, number 2010. During troubleshooting I updated it to the latest version from the Asus website. That did not solve the problem.
The 980 Pro does not have a driver/firmware update listed on the Samsung website; I guess it's too new to have anything newer than the factory version. Samsung Magician's update tool also said it was up to date.

For the RAM, I tried all 4 x 32 GB modules (which is supported according to the Asus website), then 2 sticks, and then the 2 other sticks. I ran Windows Memory Diagnostic and the OCCT RAM check. None of this found errors or solved the BSOD.

Windows was installed with the official USB stick. I first installed it without UEFI and, after the BSOD, formatted the drive and installed it with UEFI. That did not solve the problem. Switching the SSD, however, did. So the problem seems solved by switching the boot drive from the 980 Pro to the MP400. I just can't figure out why, or how this can be related to the number of GPUs installed.
 

daryth84

Commendable
Oct 10, 2018
2
0
1,510
0
Very odd. Are you running MSI Afterburner, by any chance?

Is the BIOS set to PCIe 4.0? The Ace seems plenty capable of your 4.0 lane configuration. My best guess is that the board itself may be defective and struggling to get enough power through them all somewhere. What PSU are you using?

Are you noticing any other significant symptoms prior to failure?
 
Nov 21, 2020
8
1
15
0
@woodsadrian

YES YES YES

Now I know I am on the right track.

I am 5 hours into doing the exact same thing: I removed the 980 Pro as boot drive and moved the Windows image to the second NVMe (Crucial P5).

I have been having the exact same problem you have, but I'm sure it's not related to how many GPUs you have.

I also have the new AMD Ryzen 9 5900X, and from the day I set it up I have been having BSODs or random restarts, averaging 2 per day over a week.

Now, 5 hours is no proof, but it feels like I am finally going to work out what the culprit is: it's the Samsung 980 Pro!!!

I will hang around and keep you updated...
 
Reactions: Kimi4life
Nov 21, 2020
9
1
15
0
Man, I have exactly the same problem. New build, with two 980 Pros and a SATA 860 QVO. All the crashes were driving me completely nuts. Then I just installed Windows on the 860 and voilà, I immediately got rid of the problem.

Have you found a solution to get the 980s to work yet? Maybe RAID? Would that help, as the driver might then be AMD's instead of Samsung's unreleased driver?
 
Nov 21, 2020
8
1
15
0
I have some bad news. Not 20 minutes after posting above, I got a BSOD

Event Viewer

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 0

The details view of this entry contains further information.
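
If you want to compare these Event Viewer entries across several crashes (or across posts in this thread), a minimal sketch like the one below can pull the "Key: value" lines out of the event text. It assumes the description has been copied out of Event Viewer as plain text, as in the post above; `parse_whea_event` is a made-up helper name, not part of any Windows tooling.

```python
def parse_whea_event(text: str) -> dict[str, str]:
    """Extract the 'Key: value' lines from a pasted WHEA event description."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        # Skip free-text lines that have no colon or no value after it.
        if separator and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Event text as pasted in the post above.
EVENT_TEXT = """\
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Bus/Interconnect Error
Processor APIC ID: 0

The details view of this entry contains further information.
"""

if __name__ == "__main__":
    event = parse_whea_event(EVENT_TEXT)
    for name, value in event.items():
        print(f"{name} = {value}")
```

Collecting the "Error Type" and "Processor APIC ID" values from each crash makes it easier to see whether every BSOD reports the same error on the same core.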
 
Nov 21, 2020
9
1
15
0
:(

What do you think is the cause? Bios/cpu combo with m2 ssd?

This is so frustrating
 
Nov 21, 2020
8
1
15
0
CPU fault perhaps. I read one or two users on Reddit saying a replacement CPU fixed it.
How do you RMA a $500 USD CPU just on a hunch?
 
Nov 21, 2020
9
1
15
0
Damn, there goes the fun with my Ryzen 5950X. Same here, no idea how to RMA it, since it does seem to be working with a SATA SSD. I double-checked and all the CPU pins are fine too. I'll let you know if I find something else.
 
Reactions: shueardm
Nov 21, 2020
8
1
15
0
I guess I could replace my RAM and PSU, but that would just be hoping. There's no reason to believe they are to blame: both are brand new, and the memory has been tested.
 
Nov 21, 2020
9
1
15
0
Most people I find while browsing seem to indicate some kind of BIOS incompatibility for my board (Aorus Master X570). Maybe that is the problem?
 
Nov 21, 2020
9
1
15
0
same board but there are others who have no problem
With the same drives? I am quite confident now (installed on the SSD again) that either the M.2 connection to the CPU, the BIOS, or the 980 Pro itself is the problem. The only thing is that I don't know which one is most likely. Would another M.2 drive in the same slot likely work? And if not, I'm still unsure whether it's a BIOS or CPU problem. CPU stress testing is all OK in my current Windows install.
 
Nov 21, 2020
9
1
15
0
Ok F my life. It now started happening with SSD as well...
 
Nov 21, 2020
8
1
15
0
Sucks.
At least we can put our 7 GB/s 980 Pros back in now.

What BIOS version have you got? I didn't realize until I was told last night that with Gigabyte BIOSes, a letter after the number means it's a beta.
So I have had a beta BIOS all along, since I installed the 5900X on the day I got it. That's when I used Q-Flash Plus to put F31e on the board.

Maybe we just need to be patient and wait for the next BIOS, and it will be fixed?
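
The rule of thumb mentioned above (a letter after the version number means beta) can be sketched as a quick check. This only encodes what the poster was told; it is not taken from Gigabyte documentation, and `is_beta_bios` is a hypothetical helper name.

```python
import re

# Matches strings such as "F30" or "F31e": a letter prefix, the version
# number, and an optional single trailing letter.
VERSION_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)([A-Za-z]?)$")


def is_beta_bios(version: str) -> bool:
    """Apply the thread's rule of thumb: a trailing letter marks a beta."""
    match = VERSION_PATTERN.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # Group 3 is the optional suffix letter after the number.
    return match.group(3) != ""


if __name__ == "__main__":
    for version in ("F30", "F31e"):
        label = "beta" if is_beta_bios(version) else "release"
        print(f"{version}: {label}")
```

By this rule, F31e counts as a beta and F30 as a regular release, which are the two versions discussed in this thread.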
 
Nov 21, 2020
9
1
15
0
I reverted to F30 and booted from the 980 Pro. When I removed 2 RAM DIMMs from the system (a different pair than the first time), it worked! I enabled XMP, tested, and it seemed fine. I inserted the other DIMMs again and enabled XMP again. It went wrong at first, but after a voltage increase it worked OK for some hours. Now, at the end of the day, I still had 2 random shutdowns (maybe SoC voltage?), but overall it's much better on F30!
 
Nov 21, 2020
9
1
15
0
Had any luck?
 
Nov 25, 2020
2
0
10
0
I have the same problem with the M.2 Samsung 980 Pro.
After a BSOD with a WHEA uncorrectable error, I land in the BIOS setup and see that the M.2 Samsung 980 Pro is no longer present in the SSD list. To reboot, I have to cut the power; only after that is the 980 Pro present again, recognized, and able to boot Windows.

I finally cloned my 980 Pro to an old Samsung 850 SSD, and since then I have had no more WHEA errors. However, I regret not being able to take advantage of the speed of the 980 Pro for loading applications. I'm a little frustrated to have spent money on a product that doesn't work.

MSI X570 CARBON WIFI
RYZEN 9 3950X
32GB BALLISTIX 3600MHZ
BEQUIET STRAIGHT POWER 1200W
WINDOWS 10 64 OFFICIAL DVD 2019
SAMSUNG M2 980PRO 1TB
RX580 ARMOR
 

Lucas T

Honorable
Aug 13, 2014
13
0
10,510
0
I have the exact same issue/symptoms. Weird BIOS behaviour and all.

ASROCK B550 PHANTOM ITX
RYZEN 7 5800X
16GB GSKILL FLARE X 3200MHZ
CORSAIR SF750
WINDOWS 10 64-bit HOME
SAMSUNG M2 980PRO 1TB
RTX 3080 VGA XC3 ULTRA
 

Lucas T

Honorable
Aug 13, 2014
13
0
10,510
0
@SHAKKA82 @woodsadrian @shueardm @Kimi4life

I was having this issue on my Ryzen build (specs above). As a troubleshooting step I installed the 980 Pro in my XPS 13 9360 (Intel i5-7200U). I am getting the exact same behavior on the XPS: random WHEA BSODs, and the drive isn't detected in the BIOS until a hard power cycle. At least for me, the problem seems isolated to the SSD. Looks like we may have gotten some dud drives. Either that or Windows is borking something. At least I won't need to RMA my CPU lol.
 
Dec 7, 2020
3
1
10
0
Exactly the same here. It only seems to happen when gaming, so it seems like the drive automatically shuts down because it reaches its critical temperature. You may need to look at your case airflow.

If you have a mobo and video card from the same vendor you may find a better solution, but ASUS Q-Fan doesn't know how to read the GPU temp on my MSI video card, so I came up with a workaround that seems to work so far:
  • moved my chassis fans from the manual controller that came with my case to the motherboard's chassis fan connectors
  • went into Q-Fan (ASUS BIOS on a PRIME B550 PLUS) and set the chassis fans' temperature source to multiple sources, so it uses the highest of the CPU, motherboard, and chipset temperatures
  • set the upper temperature on the intake fan to 52C; fan speed will be at 100% above this temp. The rationale is that I want it to react to ambient case temperature, and I haven't seen the chipset go above 52C in my testing. I think the default fan curve on most motherboards doesn't hit 100% until 70C, and at least on the ASUS, it doesn't use the chipset temperature input by default.
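
The fan logic described in the list above can be sketched roughly as follows. This is only an illustration of "take the hottest of several sources and hit 100% at 52C", not how Q-Fan is actually implemented; the 30C idle point, the 30% minimum duty, and the linear ramp between them are assumptions made up for the example.

```python
def fan_duty_percent(
    temps_c: dict[str, float],
    full_speed_c: float = 52.0,  # upper temperature from the post
    idle_c: float = 30.0,        # assumed lower point of the curve
    min_duty: float = 30.0,      # assumed minimum fan duty in percent
) -> float:
    """Return the fan duty (percent) driven by the hottest temperature source."""
    # "Multiple sources" mode: use the highest of all readings.
    hottest = max(temps_c.values())
    if hottest >= full_speed_c:
        return 100.0
    if hottest <= idle_c:
        return min_duty
    # Linear ramp between the idle point and the full-speed point.
    fraction = (hottest - idle_c) / (full_speed_c - idle_c)
    return min_duty + fraction * (100.0 - min_duty)


if __name__ == "__main__":
    readings = {"cpu": 45.0, "motherboard": 38.0, "chipset": 53.0}
    print(f"Fan duty: {fan_duty_percent(readings):.0f}%")
```

Because the hottest source wins, a warm chipset can push the intake fan to full speed even when the CPU is still cool, which is the point of the workaround.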
 