[smartmontools-support] Smartd is not running selective tests

Anthony Desmarais anthony at tunguydesmarais.com
Mon Mar 27 20:14:14 CEST 2023


Hi, I wonder if anyone can help me.

I have two drives in a Linux box running Fedora 37, with 
smartmontools version 7.3-3 installed.

The box has a Western Digital Purple 8TB drive as well as a Seagate 
Skyhawk 6TB drive.

I am trying to get smartd to run a selective test every Monday at 1am, 
testing about a quarter of the drive each week.

To start with I executed the first selective test manually with the 
following commands:

smartctl -t select,0-3907013292 /dev/disk/by-id/ata-WDC_WD80PURZ-85YNPY0_R6GE804Z
smartctl -t select,0-2930261292 /dev/disk/by-id/ata-ST6000VX001-2BD186_ZR13347M

Both of these ran just fine and I can see in the smart report that the 
tests completed successfully (see attached text file containing both 
reports).
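
For reference, this is roughly how the quarter boundaries can be worked 
out. It is only a sketch in Python: the function name quarter_spans is my 
own invention, and it assumes the "User Capacity" and the 512-byte logical 
sector size that smartctl -a reports for each drive. Note that it gives 
0-3907013291 for the first quarter of the WD drive, one LBA less than the 
end value I used in the commands above.

```python
def quarter_spans(capacity_bytes, logical_sector_bytes=512, parts=4):
    """Split a drive's LBA range into `parts` contiguous, inclusive spans."""
    total_lbas = capacity_bytes // logical_sector_bytes
    size = total_lbas // parts
    spans = []
    for i in range(parts):
        first = i * size
        # The last span absorbs any remainder so the whole disk is covered.
        last = total_lbas - 1 if i == parts - 1 else (i + 1) * size - 1
        spans.append((first, last))
    return spans


if __name__ == "__main__":
    # Capacities taken from the "User Capacity" lines in the attached reports.
    drives = (
        ("WDC WD80PURZ", 8_001_563_222_016),
        ("ST6000VX001", 6_001_175_126_016),
    )
    for name, capacity in drives:
        for first, last in quarter_spans(capacity):
            print(f"{name}: smartctl -t select,{first}-{last} <device>")
```

Each printed line is one week's manual test; the spans do not overlap and 
the fourth one ends at the last LBA of the drive.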

Then in smartd.conf I have added these two lines:

/dev/disk/by-id/ata-SQF-S25M8-256G-SAC_2FA6078110F500505907 -I 194 -d ata -f -l error -l selftest -l selfteststs  -m anthony at tunguydesmarais>
/dev/disk/by-id/ata-WDC_WD80PURZ-85YNPY0_R6GE804Z -I 194 -d ata -a -m anthony at tunguydesmarais.com -n standby,12,q -s (S/../../3/03|c/../../1>
/dev/disk/by-id/ata-ST6000VX001-2BD186_ZR13347M -I 194 -d ata -a -m anthony at tunguydesmarais.com -n standby,12,q -s (S/../../3/03|c/../../1/0>

So the last two disks should run a selective test every Monday at 1am. 
However, this test is not running. Looking in my syslog I see the 
following errors:

Mar 27 01:53:02 daemon.crit [26]: smartd - smartd[1076]: - Device: /dev/disk/by-id/ata-WDC_WD80PURZ-85YNPY0_R6GE804Z, prepare Selective Self-Test failed
Mar 27 01:53:02 daemon.crit [26]: smartd - smartd[1076]: - Device: /dev/disk/by-id/ata-ST6000VX001-2BD186_ZR13347M, prepare Selective Self-Test failed

Out of interest, I have another box running Debian with two identical 
drives in it (my backup NAS). I have the same settings on that one and 
the selective tests run just fine there.


-------------- next part --------------
smartctl -a /dev/disk/by-id/ata-WDC_WD80PURZ-85YNPY0_R6GE804Z
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.2.7-200.fc37.x86_64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Purple
Device Model:     WDC WD80PURZ-85YNPY0
Serial Number:    R6GE804Z
LU WWN Device Id: 5 000cca 263c606df
Firmware Version: 80.H0A80
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Mar 27 19:52:28 2023 SAST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  101) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (1225) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   132   132   054    Pre-fail  Offline      -       112
  3 Spin_Up_Time            0x0007   185   185   024    Pre-fail  Always       -       325 (Average 388)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       585
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   128   128   020    Pre-fail  Offline      -       18
  9 Power_On_Hours          0x0012   094   094   000    Old_age   Always       -       46761
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       575
 22 Unknown_Attribute       0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   098   098   000    Old_age   Always       -       2468
193 Load_Cycle_Count        0x0012   098   098   000    Old_age   Always       -       2468
194 Temperature_Celsius     0x0002   196   196   000    Old_age   Always       -       33 (Min/Max 21/43)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       7

SMART Error Log Version: 1
ATA Error Count: 7 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 7 occurred at disk power-on lifetime: 7780 hours (324 days + 4 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 08 e8 30 f4 85 40 08   6d+15:04:19.306  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 a0 08   6d+15:04:19.305  FLUSH CACHE EXT
  ea 00 00 00 00 00 a0 08   6d+15:04:19.294  FLUSH CACHE EXT
  61 10 d0 20 f4 85 40 08   6d+15:04:19.294  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 a0 08   6d+15:04:12.297  FLUSH CACHE EXT

Error 6 occurred at disk power-on lifetime: 7679 hours (319 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 80 00 08 ba 40 08   2d+10:11:16.115  WRITE FPDMA QUEUED
  61 00 d8 00 50 ba 40 08   2d+10:11:16.115  WRITE FPDMA QUEUED
  61 e0 d0 20 46 ba 40 08   2d+10:11:16.112  WRITE FPDMA QUEUED
  61 20 c8 00 40 ba 40 08   2d+10:11:16.110  WRITE FPDMA QUEUED
  61 00 c0 00 38 ba 40 08   2d+10:11:16.107  WRITE FPDMA QUEUED

Error 5 occurred at disk power-on lifetime: 7659 hours (319 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 28 18 d8 12 00 40 08   1d+14:10:59.413  WRITE FPDMA QUEUED
  61 20 08 00 11 c0 40 08   1d+14:10:59.413  WRITE FPDMA QUEUED
  61 20 00 00 11 40 40 08   1d+14:10:59.412  WRITE FPDMA QUEUED
  61 10 f0 00 11 80 40 08   1d+14:10:59.407  WRITE FPDMA QUEUED
  61 28 e8 00 11 c0 40 08   1d+14:10:59.404  WRITE FPDMA QUEUED

Error 4 occurred at disk power-on lifetime: 7656 hours (319 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 68 00 e0 ce 40 08   1d+11:11:20.644  WRITE FPDMA QUEUED
  61 00 18 00 80 cf 40 08   1d+11:11:20.644  WRITE FPDMA QUEUED
  61 00 10 00 78 cf 40 08   1d+11:11:20.639  WRITE FPDMA QUEUED
  61 00 08 00 70 cf 40 08   1d+11:11:20.633  WRITE FPDMA QUEUED
  61 00 00 00 68 cf 40 08   1d+11:11:20.627  WRITE FPDMA QUEUED

Error 3 occurred at disk power-on lifetime: 7656 hours (319 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 c0 38 40 9d 9b 40 08   1d+11:11:02.596  WRITE FPDMA QUEUED
  61 80 48 80 a6 9b 40 08   1d+11:11:02.595  WRITE FPDMA QUEUED
  61 80 40 00 a0 9b 40 08   1d+11:11:02.591  WRITE FPDMA QUEUED
  61 40 30 00 98 9b 40 08   1d+11:11:02.590  WRITE FPDMA QUEUED
  61 c0 28 40 95 9b 40 08   1d+11:11:02.586  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Selective offline   Completed without error       00%     46695         -
# 2  Short offline       Completed without error       00%     46625         -
# 3  Short offline       Completed without error       00%     46459         -
# 4  Short offline       Completed without error       00%     46301         -
# 5  Short offline       Completed without error       00%     46146         -
# 6  Short offline       Completed without error       00%     45995         -
# 7  Short offline       Completed without error       00%     45839         -
# 8  Short offline       Completed without error       00%     45770         -
# 9  Selective offline   Completed without error       00%     45737         -
#10  Selective offline   Completed without error       00%     45734         -
#11  Selective offline   Aborted by host               90%     45727         -
#12  Short offline       Completed without error       00%     45608         -
#13  Short offline       Completed without error       00%     45455         -
#14  Extended offline    Completed without error       00%     45439         -
#15  Extended offline    Aborted by host               30%     45341         -
#16  Short offline       Completed without error       00%     45297         -
#17  Extended offline    Aborted by host               10%     45161         -
#18  Short offline       Completed without error       00%     45110         -
#19  Extended offline    Aborted by host               50%     45073         -
#20  Short offline       Completed without error       00%      7621         -
#21  Extended offline    Aborted by host               80%      7620         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA     MAX_LBA  CURRENT_TEST_STATUS
    1        0  3907013292  Not_testing
    2        0           0  Not_testing
    3        0           0  Not_testing
    4        0           0  Not_testing
    5        0           0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



smartctl -a /dev/disk/by-id/ata-ST6000VX001-2BD186_ZR13347M
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.2.7-200.fc37.x86_64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Skyhawk
Device Model:     ST6000VX001-2BD186
Serial Number:    ZR13347M
LU WWN Device Id: 5 000c50 0e3da56c5
Firmware Version: CV12
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Mar 27 19:53:50 2023 SAST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 694) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x70bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   080   065   006    Pre-fail  Always       -       102841324
  3 Spin_Up_Time            0x0003   093   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       171
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   069   060   045    Pre-fail  Always       -       8637676
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1697h+00m+00.000s
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       169
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   064   057   040    Old_age   Always       -       36 (Min/Max 35/36)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       113
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       433
194 Temperature_Celsius     0x0022   036   043   000    Old_age   Always       -       36 (0 21 0 0 0)
195 Hardware_ECC_Recovered  0x001a   080   065   000    Old_age   Always       -       102841324
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       1615h+06m+31.297s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       91616742
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       11224582

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Selective offline   Completed without error       00%      1630         -
# 2  Short offline       Completed without error       00%      1561         -
# 3  Short offline       Completed without error       00%      1394         -
# 4  Short offline       Completed without error       00%      1236         -
# 5  Short offline       Completed without error       00%      1081         -
# 6  Short offline       Completed without error       00%       930         -
# 7  Short offline       Completed without error       00%       774         -
# 8  Short offline       Completed without error       00%       705         -
# 9  Selective offline   Completed without error       00%       667         -
#10  Selective offline   Aborted by host               90%       662         -
#11  Short offline       Completed without error       00%       543         -
#12  Short offline       Completed without error       00%       390         -
#13  Short offline       Completed without error       00%       232         -
#14  Extended offline    Completed without error       00%        89         -
#15  Short offline       Completed without error       00%        45         -
#16  Extended offline    Completed without error       00%        31         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA     MAX_LBA  CURRENT_TEST_STATUS
    1        0  2930261292  Not_testing
    2        0           0  Not_testing
    3        0           0  Not_testing
    4        0           0  Not_testing
    5        0           0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

