[smartmontools-support] Disk failure notification options
Alex
mysqlstudent at gmail.com
Tue Jan 5 03:17:16 CET 2021
Hi,
I have a Fedora 33 system with smartmontools-7.1 and believe I have a
failing disk:
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD30EFRX-68EUZN0
...
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1946
  3 Spin_Up_Time            0x0027   178   178   021    Pre-fail  Always       -       6100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       116
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   044   044   000    Old_age   Always       -       41073
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       113
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       55
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       9461832
194 Temperature_Celsius     0x0022   118   110   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       26
If I recall correctly, the Raw_Read_Error_Rate raw value was below a
few hundred about a month ago. Are there any other indications here
that would lead you to believe this disk is failing? Do these disks
typically last more than 60,000 hours?
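
One rough way to check the table above for attributes at or below their
failure threshold is sketched here in Python. It assumes each attribute
sits on a single line in the usual `smartctl -A` column order; the names
`parse_attributes` and `at_or_below_threshold` are hypothetical helpers,
not part of smartmontools. The only rule applied is the standard one: an
attribute counts as failed when its normalized VALUE is less than or
equal to a non-zero THRESH. Raw values are vendor-specific and are not
interpreted.

```python
import re

# A few rows copied from the report above, one attribute per line.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1946
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       2
  9 Power_On_Hours          0x0032   044   044   000    Old_age   Always       -       41073
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       9461832
"""

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

def parse_attributes(text):
    """Return one dict per attribute row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        rows.append({
            "id": int(m.group(1)),
            "name": m.group(2),
            "value": int(m.group(4)),
            "worst": int(m.group(5)),
            "thresh": int(m.group(6)),
            "type": m.group(7),
            "raw": int(m.group(10)),
        })
    return rows

def at_or_below_threshold(rows):
    """Names of attributes whose normalized VALUE <= THRESH (THRESH 0 is skipped)."""
    return [r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]

attrs = parse_attributes(SAMPLE)
failing = at_or_below_threshold(attrs)

# Convert the Power_On_Hours raw value to years (365-day years, an approximation).
hours = next(r["raw"] for r in attrs if r["name"] == "Power_On_Hours")
years = hours / (24 * 365)

print(failing)
print(f"{years:.1f} years powered on")
```

For the sample rows no attribute is at or below its threshold, and 41073
hours works out to roughly 4.7 years. Load_Cycle_Count has a normalized
value of 001 but a threshold of 000, so this particular check does not
flag it.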
I have smartd running:
/usr/sbin/smartd -n -q never
I've also configured /etc/smartmontools/smartd.conf with the following
for this disk:
/dev/sdc -a -R 1 -W 4,45,55 -H -m admin@example.com -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
I'm hoping this directive will do the following:
- monitor all drive aspects
- send an alert whenever the Raw_Read_Error_Rate raw value changes
- send an alert whenever the temperature changes by >= 4 Celsius or
reaches >= 45C, and log a critical alert when the temperature is >= 55C
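
To make the temperature part concrete, here is a small Python model of
how I read `-W 4,45,55`, with the three numbers taken as DIFF, INFO and
CRIT as described in the smartd.conf man page. It is only an
illustration, not smartd's actual code: `parse_w` and `classify` are
hypothetical names, the new-minimum/new-maximum reports that smartd also
makes are left out, and the rule that a reading at or above CRIT yields
only the critical event is my assumption.

```python
def parse_w(arg):
    """Split a '-W DIFF[,INFO[,CRIT]]' argument into three ints.

    Omitted trailing values are treated as 0, which the man page
    describes as disabling that report.
    """
    parts = [int(p) for p in arg.split(",")]
    parts += [0] * (3 - len(parts))
    diff, info, crit = parts[:3]
    return diff, info, crit

def classify(temp, last_reported, diff, info, crit):
    """Return the events this simplified model would raise for one reading."""
    events = []
    # Change of at least DIFF degrees since the last reported temperature.
    if diff and abs(temp - last_reported) >= diff:
        events.append("changed")
    # Assumption: at or above CRIT only the critical event is raised.
    if crit and temp >= crit:
        events.append("critical")
    elif info and temp >= info:
        events.append("info")
    return events

diff, info, crit = parse_w("4,45,55")
print(classify(32, 32, diff, info, crit))  # steady at the current 32C
print(classify(46, 32, diff, info, crit))  # jump past the INFO limit
print(classify(55, 46, diff, info, crit))  # reaching the CRIT limit
```

As far as I can tell from the man page, only the CRIT limit results in a
warning email when `-m` is set; the DIFF and INFO reports are syslog
entries.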
I just want to be sure I'm not doing something wrong that will
overlook an early warning alert for this drive failing.
Thanks,
Alex