Help with SMART readouts for 1tb Samsung HDD

Meryn

Honorable
Oct 12, 2012
3
0
10,510
Hi, I just had a scare where my system refused to boot from this HDD; in fact, the BIOS didn't recognize the disk at all. The problem passed, but I upgraded to an SSD all the same. I would like to keep using the HDD for storage, but I need to be fairly sure my data is safe, as the disk is used for business. Weekly backups are scheduled, but all the same, I would hate to lose a week's worth of work.

Can anyone make sense of these readings and maybe tell me what to keep an eye out for, in terms of SMART scores? Much appreciated!

ID   Description                          Raw        Normalized  Worst  Threshold
1    Raw Read Error Rate                  2          100         90     51
3    Spin Up Time                         7930       76          76     11
4    Start/Stop Count                     1614       98          98     0
5    Reallocated Sector Count             1          100         100    10
7    Seek Error Rate                      0          100         100    51
8    Seek Time Performance                10143      100         100    15
9    Power-on Hours                       20733      96          96     0
10   Spin Retry Count                     0          100         100    51
11   Drive Calibration Retry Count        0          100         100    0
12   Power-on Count                       1499       99          99     0
13   Soft Read Error Rate                 2          100         94     0
183  Runtime Bad Count (total)            0          100         100    0
187  Uncorrectable Error Count            1906       100         100    0
188  Command Timeout                      0          100         100    0
190  Airflow Temperature                  370278422  78          61     0
194  HDD Temperature                      403832856  76          60     0
195  ECC Error Rate                       4349275    100         100    0
196  Reallocation Event Count             0          100         100    0
197  Current Pending Sector Count         0          100         100    0
198  Off-Line Uncorrectable Sector Count  1          100         100    0
199  CRC Error Count                      0          100         100    0
200  Write Error Rate                     0          100         100    0
201  Soft Read Error Rate                 0          100         100    0
 

John_VanKirk

Distinguished
Hi Meryn, & Welcome to Tom's Hardware!

There's no guarantee on any HDD, brand new or older. From your SMART data, there have been 2 Raw Read Errors and 1 Reallocated Sector, dates unknown. How old and how big is this drive?

It would be reasonable to reformat it and watch its SMART data more closely.

One real-time applet I have liked over the years is ActiveSmart, by Ariolic. It runs in the background and warns you with a system tray alert of any significant change in these parameters, including temperature changes. You can set the timing and alert levels if you want. It costs $18. Here is the URL if you're interested:

http://www.ariolic.com/activesmart/index.html

Remember, HDDs are warrantied for 2-5 years, but that is no guarantee they won't fail. So it's worth monitoring it closely.

Someone else might know more about how significant the Raw Read Error Rate is, given that its normalized value has at some point dropped from 100 to 90 (the Worst column), against a threshold of 51.
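For anyone who wants to check the margins themselves, here is a minimal Python sketch of the usual comparison between the Worst and Threshold columns. It assumes the common SMART convention that an attribute counts as failed once its normalized value falls to or below a non-zero threshold, and that a threshold of 0 means the attribute is informational only. The names `Attr`, `margin` and `has_tripped` are made up for illustration, and only a few rows from the table in the first post are included.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    attr_id: int
    name: str
    raw: int
    normalized: int
    worst: int
    threshold: int


# A few rows copied from the table in the first post.
ATTRS = [
    Attr(1, "Raw Read Error Rate", 2, 100, 90, 51),
    Attr(3, "Spin Up Time", 7930, 76, 76, 11),
    Attr(5, "Reallocated Sector Count", 1, 100, 100, 10),
    Attr(187, "Uncorrectable Error Count", 1906, 100, 100, 0),
]


def margin(attr):
    """Distance between the worst recorded normalized value and the threshold."""
    return attr.worst - attr.threshold


def has_tripped(attr):
    """True if the worst value ever reached a non-zero threshold (assumed convention)."""
    return attr.threshold > 0 and attr.worst <= attr.threshold


for a in ATTRS:
    status = "TRIPPED" if has_tripped(a) else "ok"
    print(f"{a.attr_id:>3} {a.name:<26} worst={a.worst:>3} "
          f"threshold={a.threshold:>3} margin={margin(a):>3} {status}")
```

Note that a check like this only looks at normalized values, so it says nothing about a raw count such as the 1906 on attribute 187, whose threshold is 0.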

 
Personally, I'd be most concerned about the raw value (1906) of the Uncorrectable Error Count, although its normalised value is still at 100.

IMHO, the ECC Error Rate (4349275) would be something to investigate but I don't know how it is calculated. Its normalised value (100) is OK, but I'd still want to compare your SMART data against other Samsung drives.

The Power-on Hours value of 20733 looks more like Power-on Minutes, or maybe 10-minute increments. You could observe this SMART attribute at 10-minute intervals to see how often its raw value ticks over.

What is the actual model number of your drive?
 

Meryn

Thanks for your replies and your hearty welcome. The drive is around three years old and it's a Samsung HD103UJ 1TB. The Uncorrectable Error Count and the ECC Error Rate are the ones that Samsung Magician software says are *failing*. Yet the drive still works, so apparently there is some distance between individual SMART tests failing and the drive itself failing.

The Uncorrectable Error Count stays the same after the computer has been powered down all night, at a raw value of 1906. The ECC value has gone down significantly overnight, to 32848 (although Magician says it's still a fail).

I will try a reformat as soon as I get my other (older) MAXTOR drive hooked back up and will keep a close eye on the SMART values.

Is it possible that, left untreated, these two FAILs are the final warning I get before the drive gives up the ghost?
 
IMHO, I would keep an eye on those attributes that John has identified. I would also be concerned about the Uncorrectable Error Count. However, the ECC Error Rate is probably OK, despite Samsung Magician's "failing" diagnosis.

Some searching with Google turns up the following SMART results:

http://painterfactory.com/forums/storage/67/13875/BRENDEN-DESKTOP.txt
http://www.diskusjon.no/index.php?app=core&module=attach&section=attach&attach_id=425974
http://forums.overclockers.co.uk/showthread.php?p=21681807
http://www.elektroda.pl/rtvforum/download.php?id=391157
http://www.mmo-champion.com/threads/1152325-Anyone-Familiar-with-SMART-HDD-Info
http://www.avforums.com/forums/attachments/computer-classified-adverts/340520d1348678025-sale-harddisk-sale-2tb-1tb-500gb-320gb-160gb-sata-ide-3-5-samsung-hd103uj-1tb-serial-s13pj1kq701800-no-warranty.txt
http://forum.thecusforum.eu/archive/index.php/t-895.html
http://forums.whirlpool.net.au/archive/1455067

In every case the raw Uncorrectable Error Count is zero, so your drive does appear to have a problem with this attribute.

However, in every case the raw values of the ECC Error Rate are very high, which suggests that this is normal, as it is for Seagate HDDs and SandForce SSDs. In the latter cases the numbers reflect a sector count, not an error count. It appears that Samsung's own software doesn't understand Samsung's own SMART attributes. To be fair, the software seems to have been written for SSDs, not HDDs.

The Power-On Hours attribute is also much higher than expected in the above URLs. It is probably better referred to as Power-On Time. You could easily determine the unit of measurement by recording the elapsed time between each increment in the raw value.
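As a rough sketch of that measurement in Python: note the time and the raw value of attribute 9 a few times while the drive stays powered on, then divide the elapsed time by the number of increments. The function name `seconds_per_tick` and the sample readings below are invented for illustration; they are not measurements from this drive.

```python
from datetime import datetime


def seconds_per_tick(samples):
    """Estimate seconds per raw-value increment from (timestamp, raw) samples.

    Only the first and last samples are used, so the observation window
    should span several increments for the estimate to be meaningful.
    """
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    ticks = r1 - r0
    if ticks <= 0:
        raise ValueError("raw value did not increase; observe for longer")
    return (t1 - t0).total_seconds() / ticks


# Hypothetical readings, invented for illustration only.
samples = [
    (datetime(2012, 10, 13, 9, 0), 20733),
    (datetime(2012, 10, 13, 10, 0), 20739),
    (datetime(2012, 10, 13, 11, 0), 20745),
]

print(seconds_per_tick(samples) / 60, "minutes per tick")
```

The drive has to stay powered on for the whole observation window, otherwise the wall-clock time and the counter drift apart.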
 

Meryn

Thanks. I've evacuated the data, formatted the disk and copied the data back. It still shows a raw value of 1906 on attribute 187.

http://kb.acronis.com/content/9122 said:
Although this parameter is not considered critical by the most hardware vendors, degradation of this parameter may indicate electromechanical problems of the disk. Regular backup is recommended. If no other (critical) parameters report a problem, hardware replacement is recommended on mission critical systems only.

I think you're right about the Power-On attribute. The unit is probably 10 minutes, as the raw value is now at 20746. I'm backing up regularly and have put sensitive data on other drives, so this one can be used mainly for storage.
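For reference, here is a small Python sketch of what the two readings (20733 and 20746) would mean under each candidate unit. The three units are only the ones suggested in this thread, and the names `CANDIDATE_UNITS` and `interpret` are made up for illustration.

```python
from datetime import timedelta

# Candidate meanings of one raw tick, as discussed in the thread (assumptions).
CANDIDATE_UNITS = {
    "1 hour": timedelta(hours=1),
    "10 minutes": timedelta(minutes=10),
    "1 minute": timedelta(minutes=1),
}


def interpret(raw_ticks):
    """Return the duration that raw_ticks represents under each candidate unit."""
    return {label: raw_ticks * unit for label, unit in CANDIDATE_UNITS.items()}


# Total powered-on time implied by the latest reading.
for label, total in interpret(20746).items():
    print(f"total, unit = {label}: about {total.days} days")

# Time the drive was powered on between the two readings.
for label, elapsed in interpret(20746 - 20733).items():
    print(f"between readings, unit = {label}: {elapsed}")
```

Whichever unit matches how long the machine was actually switched on between the two readings is the likely one.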