[SOLVED] Can a faulty motherboard cause ssd failure?

Dec 24, 2021
6
0
10
All of a sudden I started getting random screen freezes and power-on problems on my laptop. I checked my SSD in Hard Disk Sentinel and it showed 1% health, 100% performance. Naturally I bought a brand new SSD, installed Win10, and used it for a while (everything ran perfectly). I turned the laptop off, only to see it turn on by itself the next day and keep showing the following message: port 0: SPCC ssd, S.m.a.r.t status bad,backup and replace. No matter what I try, and whichever laptop I try it in, it shows the same message, so I put the 1% health one back in and it's running smoothly without any problems whatsoever. Can a faulty hard drive connector on the motherboard cause something like this, or what is it? What is even weirder, total runtime is 18 days, and I downloaded a bunch of other programs to check SSD health and all of them show 99% health, except for 1% in Hard Disk Sentinel. Does anyone have any idea what to do?
 

RealBeast

Titan
Moderator
Please list your laptop and SSD model numbers. Thanks
 
Could we see the SMART reports?

In addition to HD Sentinel, you could use CrystalDiskInfo.
-- S.M.A.R.T. --------------------------------------------------------------
ID Cur Wor Thr RawValues(6) Attribute Name
05 100 100 _50 000000000000 Reallocated Sector Count
09 100 100 __0 0000000001B4 Power-On Hours
0C 100 100 __0 000000000271 Power Cycle Count
A7 100 100 __0 000000000000 SSD Protect Mode
A8 100 100 __0 00000000008C PHY Error Count
A9 _95 _95 _10 000000A00014 Bad Block Count
AB __0 __0 __0 000000000000 Program Fail Count
AC __0 __0 __0 000000000000 Erase Fail Count
AD 200 200 __0 001000210018 Erase Count
AF 100 100 _10 000000000000 Bad Cluster Table Count
B1 100 100 __0 000000000000 Read Retry Count
B4 100 100 __0 000000000CA8 Spare Block Count Left
BB 100 __0 __0 000000000000 Reported UNC Errors
C0 100 100 __0 000000000028 Unexpected Power Loss Count
C2 _48 _48 __0 003000300030 Temperature
C7 100 100 __0 000000000000 UDMA CRC Error Count
CE 200 200 __0 000000000010 Minimum Erase Count
CF 200 200 __0 000000000021 Maximum Erase Count
D0 200 200 __0 000000000018 Average Erase Count
D1 200 200 __0 00000000006D Minimum Erase Count of SLC block
D2 200 200 __0 0000000000F8 Maximum Erase Count of SLC block
D3 200 200 __0 0000000000B4 Average Erase Count of SLC block
E7 _99 _99 __5 000000000001 SSD Life Left
F1 100 100 __0 0000000007CF Write Sector Count
F2 100 100 __0 000000000981 Read Sector Count
F5 100 100 __0 000000000000 Bit Error Count

I could post screenshots as well, but I'm not sure how. I just registered on this forum and it's asking me for a link, so it seems I can't upload my photos.
 
I think that the SMART tools are looking at attribute 0xE7. Its raw value is 1, but the normalised value is 99. To me, that means that the SSD has used up only 1% of its rated life, but HD Sentinel appears to interpret this number as the percentage life left.

That said, the bad block count has a raw value of 20 (= 0x14 in hexadecimal). That would suggest that it has started to degrade, assuming that it didn't leave the factory in that state. The number of spare blocks remaining is 3240 (=0xCA8), so that looks good.

Why do SMART tools report attributes differently?

http://www.hddoracle.com/viewtopic.php?f=46&t=3093
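
The hex conversions above can be checked with a short script. This is only a sketch: the column layout (ID, Cur, Wor, Thr, raw, name) is assumed from the dump posted in this thread, `parse_line` is a hypothetical helper, and splitting the 12-digit raw field into three 16-bit words is an assumption about how this drive packs its values. The low word of Bad Block Count gives the 20 quoted above; the meaning of the other word (0xA0) is vendor-specific and is not interpreted here.

```python
# Sketch: decode a few lines of the SMART dump posted in this thread.
# Assumed column layout: ID Cur Wor Thr RawValues(6) Attribute Name
SAMPLE = """\
A9 _95 _95 _10 000000A00014 Bad Block Count
AD 200 200 __0 001000210018 Erase Count
B4 100 100 __0 000000000CA8 Spare Block Count Left
E7 _99 _99 __5 000000000001 SSD Life Left
"""


def parse_line(line):
    """Split one dump line into a dict (hypothetical helper).

    Underscores in the Cur/Wor/Thr columns are treated as padding.
    """
    attr_id, cur, wor, thr, raw, name = line.split(None, 5)
    return {
        "id": int(attr_id, 16),
        "current": int(cur.strip("_")),
        "worst": int(wor.strip("_")),
        "threshold": int(thr.strip("_")),
        "raw": int(raw, 16),
        # Assumption: the 6 raw bytes may hold three 16-bit words.
        "raw_words": [int(raw[i:i + 4], 16) for i in range(0, 12, 4)],
        "name": name.strip(),
    }


attrs = {a["id"]: a for a in map(parse_line, SAMPLE.splitlines())}

print(attrs[0xB4]["raw"])                           # 3240 spare blocks left
print(attrs[0xA9]["raw_words"][-1])                 # 20 (low word of Bad Block Count)
print(attrs[0xE7]["current"], attrs[0xE7]["raw"])   # 99 1
print(attrs[0xAD]["raw_words"])                     # [16, 33, 24]
```

The three words of attribute AD (16, 33, 24) match the minimum, maximum and average erase counts reported separately as CE, CF and D0 in the same dump, which is what suggests the packed-word reading.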
 
Thanks for answering! The only thing I still don't understand is why the laptop would boot by itself and show bad sectors on a brand new SSD.
 
Maybe it's DOA? We would need to see its SMART report.

You can upload screenshots to imgur.com.
I don't have it, unfortunately. I tried to access that drive in every way I knew how, but it was like a brick and wouldn't respond to anything. It was under warranty and I'm currently waiting for a replacement, so I'll have to wait and see whether the new one behaves the same.
 

chrysalis

Distinguished
Aug 15, 2003
145
4
18,715
Yes, in my opinion. I don't mean terminal hardware failure, but data corruption and SMART errors.

A while back I started getting corrupted Windows boot files, and had also noticed some backups were failing to read data. It was very difficult to diagnose, as the cause was one I considered highly unusual.

It initially started on my 850 PRO SATA SSD, which I booted from. One day the I/O locked at 100% during an index database rebuild and I had to hard reset. Not long after that it was unable to boot Windows, saying critical boot files were corrupted. I restored from backup, but the boot failure occurred again within a couple of weeks. I also did cable swaps.

I sent it to Samsung and used another SATA SSD temporarily. This one logged no errors, but it had superior error correction on the controller.

When Samsung returned it, they said there was no defect. I tested it in my spare PC with no problems whatsoever, but as soon as it was back in my main PC, bam, errors.

I then tested one of my old Samsung 830s, which also have weak error correction: lots of errors on the main PC, no errors elsewhere. My Samsung 860 EVO, which has better error correction than the 850 PRO, was fine.

I discovered that if SATA was slowed down to gen 2 speed the errors stopped, which was the first thing that alerted me to the actual cause, a signal integrity problem. I also discovered the Intel AHCI drivers were better than the msahci drivers for error handling: when an error is detected, the Intel drivers automatically slow down the SATA speed until the next reboot to mitigate problems.

I believed the problem to be the signal integrity from the SATA ports to the main chipset. I thought over my options, which were either a new motherboard or an M.2 drive, and I got the M.2 drive.

For a short time all seemed well, but then "media and data integrity" started increasing in the SMART data, and during a backup Macrium was unable to read the data.

This time, instead of an RMA, I checked it in the spare PC, and yep, no errors.

I then remembered that when I upgraded to my 9900K, I had changed a few BIOS settings.

I eventually discovered the problem to be my system agent voltage: it was too high. When I reduced it, the errors stopped on both M.2 and SATA drives. Now, over a year later, not a single new error has been logged on my M.2. No unable-to-read-data problems, no Windows instability, no booting issues. I expect the bad SA voltage was corrupting data on the DMI bus, which includes SATA communication.

Now I have imperfect SMART numbers, which is annoying, but I am very glad I found the cause and it didn't require a new motherboard.

Note the RAM was testing fine during the debacle (long HCI and Karhu runs).
 
Solution
I notice that the OP has a PHY Error Count of 0x8C (= 140 decimal). The UDMA CRC Error Count is 0, though.

The SATA PHY is comprised of the Tx and Rx differential pairs in the SATA data link. This seems to correlate with the observations reported by @chrysalis. Some SSDs have a "SATA downshift" SMART attribute. Perhaps these are related?
 
Your post gave me an idea and all I can say is: thank you! There is some sort of power problem, or there was, since everything works well now. Since the SSD failure I've been running the old 1% one. I was booting, rebooting and putting the laptop to sleep with no problem, and then two days ago the old booting problems appeared and my laptop was like a brick. I was sure the SSD and/or motherboard were toast, but by some weird miracle I remembered to check the battery, which on my laptop is internal, and noticed some lumps on top, as if something had happened inside. Since removing the battery completely there has not been a single problem. I believe the faulty battery was messing with the voltage somehow and my hardware got the worst of it. I'm mad at myself for not thinking of this sooner, since the laptop would go to sleep with 40% battery and show 0% when I plugged it in. I'm still waiting on the SSD replacement to check whether the new one shows any sign of faultiness, but I'm pretty sure the battery was the problem (thankfully).