[smartmontools-support] Device: /dev/sda [SAT], 524288 Currently unreadable (pending) sectors
Todd Harrington
toddh at axiomteksystems.com
Tue Dec 19 14:16:32 CET 2017
Hi,
Our customer is running extensive disk tests on the Transcend SSD420K
512GB MLC SSD, which is rated for 750TB written (TBW). They are running
CentOS Linux 7.3, and when I asked them what workload they were running they said:
A heavy Read/Write disk load was running at the time from a KVM Virtual
Machine. Before the load started the write cache was disabled on the drive
and Native Command Queueing was disabled.
The problem is that they are using the Linux *smartmon* tools, and they
received this message on two of the systems they ran this extensive
disk test on:
[134.111.122.26-node1 ~]$ zgrep "Currently unreadable" /var/log/message*
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages:Dec 15 22:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:18:24 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 04:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:18:22 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 06:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:18:23 node1 smartd[1133]: Device: /dev/sda
[SAT], 524288 Currently unreadable (pending) sectors
/var/log/messages.1.gz:Dec 15 21:48:22 node1 smartd[1133]: Device: /dev/sda
[SAT], No more Currently unreadable (pending) sectors, warning condition
reset after 1 email
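To make the pattern in the log above easier to see, here is a minimal sketch that tallies the pending-sector warnings per timestamp. It assumes each log entry has first been rejoined onto a single line (the mail archive wrapped them), and the regular expression and the `summarize` helper are my own illustration, not part of smartmontools.

```python
import re
from collections import Counter

# Matches a smartd pending-sector warning once the wrapped lines are rejoined.
# The "No more Currently unreadable" reset message does not match, because it
# has no numeric count before "Currently unreadable".
PENDING = re.compile(
    r"(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ smartd\[\d+\]: "
    r"Device: (?P<dev>\S+) \[SAT\], (?P<count>\d+) Currently unreadable"
)

def summarize(lines):
    """Count pending-sector warnings per (timestamp, device, sector count)."""
    tally = Counter()
    for line in lines:
        m = PENDING.search(line)
        if m:
            tally[(m["stamp"], m["dev"], int(m["count"]))] += 1
    return tally

# Three entries taken from the log above, rejoined onto one line each.
sample = [
    "/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda "
    "[SAT], 524288 Currently unreadable (pending) sectors",
    "/var/log/messages:Dec 15 22:18:34 node1 smartd[1133]: Device: /dev/sda "
    "[SAT], 524288 Currently unreadable (pending) sectors",
    "/var/log/messages:Dec 15 22:48:22 node1 smartd[1133]: Device: /dev/sda "
    "[SAT], No more Currently unreadable (pending) sectors, warning condition "
    "reset after 1 email",
]

for key, n in summarize(sample).items():
    print(key, n)
```

Run over the full log, this should show the same count reported at each of the four warning times, with a reset message about 30 minutes after each one.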
I shared the SMART data from that SSD with Transcend and asked Transcend
for help and they reported back to me:
With regards to the Hardware_ECC_Recovered high value of “354”, since the
Reallocated_Sector_Ct and Read_Error_Rate return “0”, our team does not
believe that the SSD is showing signs of depreciating health.
Below is the SMART data for this SSD. They have written about 78TB, which
I think is about 10% of the drive's rated life.
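As a rough check of the 78TB figure, here is a small sketch using the raw values from the attribute table in this message. It assumes attribute 241 counts units of 32 MiB (as its smartctl name suggests) and that the 750TB rating is in decimal terabytes; the function name is only illustrative.

```python
# Raw values copied from the smartctl attribute table in this message.
HOST_WRITES_32MIB = 2330590   # attribute 241, Host_Writes_32MiB
RATED_TBW_TB = 750            # endurance rating quoted for this drive
AVG_ERASE = 156               # attribute 167, Average_Erase_Count
SPEC_ERASE = 3000             # attribute 168, Max_Erase_Count_of_Spec

def host_writes_tb(units_32mib):
    """Convert a count of 32 MiB units to decimal terabytes (assumed unit)."""
    return units_32mib * 32 * 1024**2 / 1e12

written = host_writes_tb(HOST_WRITES_32MIB)
print(f"host writes: {written:.1f} TB")
print(f"share of rated TBW: {written / RATED_TBW_TB:.1%}")
print(f"average erase cycles used: {AVG_ERASE / SPEC_ERASE:.1%}")
```

Under those assumptions the host writes come to roughly 78.2 TB, a little over 10% of the rating, while the average erase count is about 5% of the specified maximum, which is in line with the Remaining_Lifetime_Perc value of 95.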
smartctl 6.2 2017-02-27 r4394 [x86_64-linux-3.10.0-693.2.2.el7.x86_64]
(local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: SiliconMotion based SSDs
Device Model: TS512GSSD420K
Serial Number: D898940005
Firmware Version: O1225G
User Capacity: 512,110,190,592 bytes [512 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Dec 18 15:59:17 2017 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (  57) seconds.
Offline data collection
capabilities:                    (0x71) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   1) minutes.
Conveyance self-test routine
recommended polling time:        (   1) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME            FLAG     VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate       0x0000   100   100   000    Old_age  Offline  -           0
  5 *Reallocated_Sector_Ct*   0x0000   100   100   000    Old_age  Offline  -           0
  9 Power_On_Hours            0x0000   100   100   000    Old_age  Offline  -           600
 12 Power_Cycle_Count         0x0000   100   100   000    Old_age  Offline  -           26
160 Uncorrectable_Error_Cnt   0x0000   100   100   000    Old_age  Offline  -           0
161 Valid_Spare_Block_Cnt     0x0000   100   100   000    Old_age  Offline  -           115
163 Initial_Bad_Block_Count   0x0000   100   100   000    Old_age  Offline  -           50
164 Total_Erase_Count         0x0000   100   100   000    Old_age  Offline  -           321118
165 Max_Erase_Count           0x0000   100   100   000    Old_age  Offline  -           177
166 Min_Erase_Count           0x0000   100   100   000    Old_age  Offline  -           77
167 Average_Erase_Count       0x0000   100   100   000    Old_age  Offline  -           156
168 Max_Erase_Count_of_Spec   0x0000   100   100   000    Old_age  Offline  -           3000
169 Remaining_Lifetime_Perc   0x0000   100   100   000    Old_age  Offline  -           95
175 Program_Fail_Count_Chip   0x0000   100   100   000    Old_age  Offline  -           0
176 Erase_Fail_Count_Chip     0x0000   100   100   000    Old_age  Offline  -           0
177 Wear_Leveling_Count       0x0000   100   100   050    Old_age  Offline  -           291
178 Runtime_Invalid_Blk_Cnt   0x0000   100   100   000    Old_age  Offline  -           0
181 Program_Fail_Cnt_Total    0x0000   100   100   000    Old_age  Offline  -           0
182 Erase_Fail_Count_Total    0x0000   100   100   000    Old_age  Offline  -           0
192 Power-Off_Retract_Count   0x0000   100   100   000    Old_age  Offline  -           8
194 Temperature_Celsius       0x0000   100   100   000    Old_age  Offline  -           37
195 Hardware_ECC_Recovered    0x0000   100   100   000    Old_age  Offline  -           7837
196 Reallocated_Event_Count   0x0000   100   100   016    Old_age  Offline  -           0
197 *Current_Pending_Sector*  0x0000   100   100   000    Old_age  Offline  -           0
198 Offline_Uncorrectable     0x0000   100   100   000    Old_age  Offline  -           0
199 UDMA_CRC_Error_Count      0x0000   100   100   050    Old_age  Offline  -           0
232 Available_Reservd_Space   0x0000   100   100   000    Old_age  Offline  -           100
241 Host_Writes_32MiB         0x0000   100   100   000    Old_age  Offline  -           2330590
242 Host_Reads_32MiB          0x0000   100   100   000    Old_age  Offline  -           2687687
245 TLC_Writes_32MiB          0x0000   100   100   000    Old_age  Offline  -           2568944
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                     Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error    00%        88               -
# 2  Extended offline    *Completed without error*  00%        86               -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
7 0 65535 Read_scanning was completed without error
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Also, an Extended SMART test was run and it passed.
The question I have is whether anyone can help us understand what this SMART
error/warning means. It could be one of 3 things:
1) It is normal and no action is necessary but needs to be explained
further
2) It is an issue with the Transcend drive/FW and Transcend needs to
look into this further
3) There is an issue with the computer - maybe noise, power, or SATA
TX/RX signal strength?
We need to help them find out what caused these messages, and whether they
are a sign of a problem with the SSD or are normal after writing 78TB of
data. Is the number 524288 in this warning message a concern?
We don’t want to ignore this warning in the event it is going to lead to a
major issue down the road.
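On the number itself, a quick arithmetic sketch may help frame the question. It uses only the count from the smartd message and the 512-byte logical sector size from the smartctl output; the helper name is illustrative, and nothing here says whether the value is a genuine sector count.

```python
def describe_count(count, sector_bytes=512):
    """Return (is_power_of_two, hex_string, size_in_MiB) for a sector count."""
    is_pow2 = count > 0 and (count & (count - 1)) == 0
    return is_pow2, hex(count), count * sector_bytes / 2**20

# Count reported by smartd, sector size reported by smartctl.
print(describe_count(524288))  # (True, '0x80000', 256.0)
```

So 524288 is exactly 2^19 (0x80000), which taken literally would be 256 MiB of pending sectors. The count is identical in every warning and is gone 30 minutes later each time, while attribute 197 reads 0 in the smartctl output above.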
Regards,
Todd