[smartmontools-support] Device: /dev/sda [SAT], 524288 Currently unreadable (pending) sectors

Todd Harrington toddh at axiomteksystems.com
Tue Dec 19 14:16:32 CET 2017


Hi,



Our customer is running extensive disk tests on the Transcend SSD420K
512GB MLC SSD, which is rated for 750TB written (TBW). They are running
CentOS Linux 7.3, and when I asked them what workload they were running,
they said:



A heavy Read/Write disk load was running at the time from a KVM Virtual
Machine. Before the load started the write cache was disabled on the drive
and Native Command Queueing was disabled.



The problem is that they are using the Linux smartmontools package, and they
first received this message on two of the systems on which they ran this
extensive disk test:



[134.111.122.26-node1 ~]$ zgrep "Currently unreadable" /var/log/message*

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages:Dec 15 22:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 04:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 06:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors

/var/log/messages.1.gz:Dec 15 21:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email
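
To make the pattern in the output above easier to see, here is a minimal Python sketch that groups the smartd warnings by timestamp and lists the resets. It assumes each log entry is on a single line (the lines above were wrapped by my mail client), and the names `PENDING_RE`, `RESET_RE` and `summarize` are only illustrative, not part of smartmontools.

```python
import re
from collections import Counter

# Both patterns assume the unwrapped syslog format seen in /var/log/messages.
STAMP = r"(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ smartd\[\d+\]: "
PENDING_RE = re.compile(
    STAMP + r"Device: (?P<dev>\S+) \[SAT\], (?P<count>\d+) Currently unreadable"
)
RESET_RE = re.compile(
    STAMP + r"Device: (?P<dev>\S+) \[SAT\], No more Currently unreadable"
)


def summarize(lines):
    """Count identical pending-sector warnings and collect reset messages."""
    bursts = Counter()
    resets = []
    for line in lines:
        match = PENDING_RE.search(line)
        if match:
            key = (match["stamp"], match["dev"], int(match["count"]))
            bursts[key] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            resets.append((match["stamp"], match["dev"]))
    return bursts, resets


sample = [
    "/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda "
    "[SAT], 524288 Currently unreadable (pending) sectors",
    "/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda "
    "[SAT], 524288 Currently unreadable (pending) sectors",
    "/var/log/messages:Dec 15 22:48:22 node1 smartd[1133]: Device: /dev/sda "
    "[SAT], No more Currently unreadable (pending) sectors, warning condition "
    "reset after 1 email",
]
bursts, resets = summarize(sample)
for (stamp, dev, count), repeats in bursts.items():
    print(f"{stamp} {dev}: {count} pending sectors, logged {repeats} times")
for stamp, dev in resets:
    print(f"{stamp} {dev}: warning reset")
```

Fed the full zgrep output, this should show each warning repeated many times with the same timestamp, followed by a reset roughly half an hour later.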



I shared the SMART data from that SSD with Transcend and asked Transcend
for help and they reported back to me:



With regards to the Hardware_ECC_Recovered high value of “354”, since the
Reallocated_Sector_Ct and Read_Error_Rate return “0”, our team does not
believe that the SSD is showing signs of depreciating health.



Below is the SMART data for this SSD. They have written about 78TB, which I
think is about 10% of the drive's rated life.



smartctl 6.2 2017-02-27 r4394 [x86_64-linux-3.10.0-693.2.2.el7.x86_64] (local build)

Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org



=== START OF INFORMATION SECTION ===

Model Family:     SiliconMotion based SSDs

Device Model:     TS512GSSD420K

Serial Number:    D898940005

Firmware Version: O1225G

User Capacity:    512,110,190,592 bytes [512 GB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    Solid State Device

Device is:        In smartctl database [for details use: -P show]

ATA Version is:   ACS-2 (minor revision not indicated)

SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is:    Mon Dec 18 15:59:17 2017 EST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled



=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED



General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   57) seconds.
Offline data collection
capabilities:                    (0x71) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   1) minutes.
Conveyance self-test routine
recommended polling time:        (   1) minutes.



SMART Attributes Data Structure revision number: 1

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   100   100   000    Old_age   Offline      -       0
  5 Reallocated_Sector_Ct   0x0000   100   100   000    Old_age   Offline      -       0
  9 Power_On_Hours          0x0000   100   100   000    Old_age   Offline      -       600
 12 Power_Cycle_Count       0x0000   100   100   000    Old_age   Offline      -       26
160 Uncorrectable_Error_Cnt 0x0000   100   100   000    Old_age   Offline      -       0
161 Valid_Spare_Block_Cnt   0x0000   100   100   000    Old_age   Offline      -       115
163 Initial_Bad_Block_Count 0x0000   100   100   000    Old_age   Offline      -       50
164 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       321118
165 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       177
166 Min_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       77
167 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       156
168 Max_Erase_Count_of_Spec 0x0000   100   100   000    Old_age   Offline      -       3000
169 Remaining_Lifetime_Perc 0x0000   100   100   000    Old_age   Offline      -       95
175 Program_Fail_Count_Chip 0x0000   100   100   000    Old_age   Offline      -       0
176 Erase_Fail_Count_Chip   0x0000   100   100   000    Old_age   Offline      -       0
177 Wear_Leveling_Count     0x0000   100   100   050    Old_age   Offline      -       291
178 Runtime_Invalid_Blk_Cnt 0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Cnt_Total  0x0000   100   100   000    Old_age   Offline      -       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      -       0
192 Power-Off_Retract_Count 0x0000   100   100   000    Old_age   Offline      -       8
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       37
195 Hardware_ECC_Recovered  0x0000   100   100   000    Old_age   Offline      -       7837
196 Reallocated_Event_Count 0x0000   100   100   016    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0000   100   100   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0000   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0000   100   100   050    Old_age   Offline      -       0
232 Available_Reservd_Space 0x0000   100   100   000    Old_age   Offline      -       100
241 Host_Writes_32MiB       0x0000   100   100   000    Old_age   Offline      -       2330590
242 Host_Reads_32MiB        0x0000   100   100   000    Old_age   Offline      -       2687687
245 TLC_Writes_32MiB        0x0000   100   100   000    Old_age   Offline      -       2568944



SMART Error Log Version: 1

No Errors Logged



SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        88         -
# 2  Extended offline    Completed without error       00%        86         -



SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

    7        0    65535  Read_scanning was completed without error

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
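
As a rough check on the 78TB figure, here is a short Python sketch of the arithmetic using the raw values from the attribute table. It assumes attribute 241 really counts units of 32 MiB (as its name suggests), that the 750TB rating is in decimal terabytes, and that average erase count divided by the spec maximum is a fair proxy for wear. None of these assumptions has been confirmed by Transcend.

```python
# Raw values copied from the smartctl attribute table in this message.
HOST_WRITES_32MIB = 2330590        # attribute 241
AVERAGE_ERASE_COUNT = 156          # attribute 167
MAX_ERASE_COUNT_OF_SPEC = 3000     # attribute 168
RATED_TBW_TB = 750                 # endurance rating quoted for the drive


def host_writes_tb(units_32mib):
    """Convert a count of 32 MiB units to decimal terabytes (assumed unit)."""
    return units_32mib * 32 * 2**20 / 1e12


def percent(part, whole):
    return 100.0 * part / whole


written_tb = host_writes_tb(HOST_WRITES_32MIB)
print(f"host writes:        {written_tb:.1f} TB")
print(f"share of rated TBW: {percent(written_tb, RATED_TBW_TB):.1f}%")
print(f"erase cycles used:  "
      f"{percent(AVERAGE_ERASE_COUNT, MAX_ERASE_COUNT_OF_SPEC):.1f}%")
```

Under those assumptions this gives about 78.2 TB written, about 10.4% of the rated TBW, and about 5.2% of the specified erase cycles, which is in line with the Remaining_Lifetime_Perc value of 95.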





Also, an Extended SMART self-test was run and passed.



My question is whether anyone can help me understand what this SMART
error/warning means. It could be one of three things:



1)       It is normal and no action is necessary but needs to be explained
further

2)       It is an issue with the Transcend drive/FW and Transcend needs to
look into this further

3)       There is an issue with the computer -  Maybe noise, power, or SATA
TX/RX signal strength?



We need to help them find out what caused these messages, and whether they
are a sign of a problem with the SSD or are normal after writing 78TB of
data. Is the number 524288 concerning in this warning message?
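
For what it is worth, here is a small Python sketch that only characterizes the number 524288 itself. It uses the 512-byte sector size reported by smartctl; the helper name `describe_pending` is just illustrative. It does not establish whether the value is a genuine sector count or something else.

```python
def describe_pending(count, sector_bytes=512):
    """Describe a pending-sector count: hex form, power-of-two check, size."""
    is_pow2 = count > 0 and (count & (count - 1)) == 0
    return {
        "hex": hex(count),
        "power_of_two": is_pow2,
        "exponent": count.bit_length() - 1 if is_pow2 else None,
        "mib": count * sector_bytes / 2**20,
    }


info = describe_pending(524288)
print(info)
```

The value is exactly 2^19 (0x80000), which would correspond to 256 MiB of 512-byte sectors if taken literally as a sector count.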



We don’t want to ignore this warning in the event it is going to lead to a
major issue down the road.



Regards,

Todd