[smartmontools-support] OS hangs and COMRESET events: failing drive?
Gruff Hacker
gruffhacker-cyg at yahoo.com
Sat May 23 19:41:29 CEST 2020
Hi,
I have a LITEONIT LGT-256M6G M.2 SATA SSD on a Windows 10 system.
For the past few days the OS has been hanging periodically, and the only relevant entry I can find in the Windows event log is this:
Source: iaStorA
Event ID: 129
Reset to device, \Device\RaidPort0, was issued.
There is no RAID configured.
During each hang, the drive activity light stays on steadily the entire time.
Apparently a large number of issues can cause this, but the only other thing I noticed is this entry in the smartmontools output:
SATA Phy Event Counters (GP Log 0x11)
ID      Size  Value  Description
0x000a  2       218  Device-to-host register FISes sent due to a COMRESET
Every time one of these hangs happens, that COMRESET counter in the smartmontools output increments.
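To make the correlation between hangs and the counter easier to check, here is a minimal Python sketch that parses the Phy Event Counters table from two saved copies of the smartctl output and reports which counters changed. The function names are hypothetical, and the second snapshot (value 219) is made up for illustration; only the first row is taken from the output in this post.

```python
import re

# Matches one row of the "SATA Phy Event Counters" table, e.g.
# "0x000a 2 218 Device-to-host register FISes sent due to a COMRESET"
ROW = re.compile(r"^\s*(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_phy_counters(text):
    """Return {counter_id: (value, description)} parsed from smartctl text."""
    counters = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            counters[int(m.group(1), 16)] = (int(m.group(3)), m.group(4))
    return counters


def changed_counters(before, after):
    """Return {counter_id: delta} for counters whose value differs."""
    return {
        cid: after[cid][0] - before[cid][0]
        for cid in after
        if cid in before and after[cid][0] != before[cid][0]
    }


# First snapshot: rows copied from the output in this post.
snapshot_before = """\
0x0009 2 0 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 218 Device-to-host register FISes sent due to a COMRESET
"""

# Second snapshot: HYPOTHETICAL values, as they might look after one more hang.
snapshot_after = """\
0x0009 2 0 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 219 Device-to-host register FISes sent due to a COMRESET
"""

delta = changed_counters(
    parse_phy_counters(snapshot_before),
    parse_phy_counters(snapshot_after),
)
print(delta)  # {10: 1}, i.e. counter 0x000a rose by 1
```

The snapshots could be captured with `smartctl -l sataphy` on the drive before and after a hang; the sketch does not handle every variation smartctl may print in this table.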
Any ideas? Does this indicate a failing drive?
What is the meaning of this "Device-to-host register FISes sent due to a COMRESET" event?
Full details below.
Thanks
smartctl 6.3 2014-07-26 r3976 [x86_64-w64-mingw32-win8] (sf-6.3-1)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: LITEONIT LGT-256M6G
Firmware Version: DG83001
User Capacity: 256,060,514,304 bytes [256 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 4a
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat May 23 13:20:45 2020 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:        ( 0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                  ( 10) seconds.
Offline data collection
capabilities:                    (0x15) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:          ( 1) minutes.
Extended self-test routine
recommended polling time:         ( 10) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO----   100   100   070    -    0
  5 Reallocated_Sector_Ct   PO----   100   100   000    -    0
  9 Power_On_Hours          -O----   100   100   000    -    12462
 12 Power_Cycle_Count       -O----   100   100   000    -    403
177 Wear_Leveling_Count     PO----   100   100   000    -    1778823
178 Used_Rsvd_Blk_Cnt_Chip  PO----   100   100   000    -    8
181 Program_Fail_Cnt_Total  PO----   100   100   000    -    0
182 Erase_Fail_Count_Total  PO----   100   100   000    -    13
187 Reported_Uncorrect      -O----   100   100   000    -    0
192 Power-Off_Retract_Count PO----   100   100   000    -    205
196 Reallocated_Event_Count PO----   100   100   000    -    0
198 Offline_Uncorrectable   PO----   100   100   000    -    0
199 UDMA_CRC_Error_Count    PO----   100   100   000    -    9
232 Available_Reservd_Space PO----   100   100   010    -    0
241 Total_LBAs_Written      PO----   100   100   000    -    1897582
242 Total_LBAs_Read         PO----   100   100   000    -    1550151
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning
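As a reading aid for the FLAGS column, the legend above can be turned into a small lookup. This is only a sketch: the function name is hypothetical, and the letter order (P, O, S, R, C, K from left to right) is taken from the legend as printed here.

```python
# Flag letters in left-to-right column order, per the legend printed by smartctl.
FLAG_MEANINGS = [
    ("P", "prefailure warning"),
    ("O", "updated online"),
    ("S", "speed/performance"),
    ("R", "error rate"),
    ("C", "event count"),
    ("K", "auto-keep"),
]


def decode_flags(flags):
    """Return the meanings of the flags set in a string such as 'PO----'."""
    if len(flags) != len(FLAG_MEANINGS):
        raise ValueError("expected a 6-character flags string")
    return [
        meaning
        for ch, (letter, meaning) in zip(flags, FLAG_MEANINGS)
        if ch == letter
    ]


print(decode_flags("PO----"))  # ['prefailure warning', 'updated online']
print(decode_flags("-O----"))  # ['updated online']
```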
General Purpose Log Directory Version 0
SMART Log Directory Version 0
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01       GPL,SL  R/O      1  Summary SMART error log
0x06       GPL,SL  R/O      1  SMART self-test log
0x07       GPL,SL  R/O      1  Extended self-test log
0x10       GPL,SL  R/O      1  NCQ Command Error log
0x11       GPL,SL  R/O      1  SATA Phy Event Counters
0x80-0x9f  GPL,SL  R/W      1  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log Version: 0
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     35888         -
# 2  Short offline       Completed without error       00%     35632         -
# 3  Extended offline    Completed without error       00%     44053         -
# 4  Short offline       Completed without error       00%     43541         -
Selective Self-tests/Logging not supported
SCT Status Version: 3
SCT Version (vendor specific): 1 (0x0001)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: ? Celsius
Power Cycle Min/Max Temperature: ?/ ? Celsius
Lifetime Min/Max Temperature: ?/ ? Celsius
Under/Over Temperature Limit Count: 0/0
SCT Temperature History Version: 2
Temperature Sampling Period: 0 minutes
Temperature Logging Interval: 0 minutes
Min/Max recommended Temperature: ?/ ? Celsius
Min/Max Temperature Limit: ?/ ? Celsius
Temperature History Size (Index): 128 (0)
Index Estimated Time Temperature Celsius
1 2020-05-23 11:13 ? -
... ..(126 skipped). .. -
0 2020-05-23 13:20 ? -
SCT Error Recovery Control:
Read: 2 (0.2 seconds)
Write: 2 (0.2 seconds)
Device Statistics (GP Log 0x04) not supported
SATA Phy Event Counters (GP Log 0x11)
ID      Size  Value  Description
0x0001  2         0  Command failed due to ICRC error
0x0002  2         0  R_ERR response for data FIS
0x0003  2         0  R_ERR response for device-to-host data FIS
0x0004  2         0  R_ERR response for host-to-device data FIS
0x0005  2         0  R_ERR response for non-data FIS
0x0006  2         0  R_ERR response for device-to-host non-data FIS
0x0007  2         0  R_ERR response for host-to-device non-data FIS
0x0008  2         0  Device-to-host non-data FIS retries
0x0009  2         0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2       218  Device-to-host register FISes sent due to a COMRESET
0x000b  2         0  CRC errors within host-to-device FIS
0x000d  2         0  Non-CRC errors within host-to-device FIS
0x000f  2         0  R_ERR response for host-to-device data FIS, CRC
0x0010  2         0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2         0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2         0  R_ERR response for host-to-device non-data FIS, non-CRC