[smartmontools-support] OS hangs and COMRESET events: failing drive?

Gruff Hacker gruffhacker-cyg at yahoo.com
Sat May 23 19:41:29 CEST 2020


Hi,

I have a LITEONIT LGT-256M6G M.2 SATA SSD on a Windows 10 system.
For a few days the OS has been hanging periodically, and the only thing I can see in the Windows event logs is this:

Source:        iaStorA
Event ID:      129
Reset to device, \Device\RaidPort0, was issued.

There is no RAID configured.
During the hangs, the drive access light is on steady the entire time.
Apparently a large number of issues can cause this, but the only other thing I noticed is this entry in the smartmontools output:
SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2          218  Device-to-host register FISes sent due to a COMRESET

Every time one of these hangs happens, the COMRESET counter in the smartmontools output increments.
Any ideas?  Does this indicate a failing drive?
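
The counter check described above could be scripted rather than read by eye. Below is a minimal sketch, assuming the table layout shown in the smartctl output in this message. The function name parse_phy_counters, the SAMPLE text, and the "219" second reading are illustrative assumptions, not part of smartmontools and not a value actually observed.

```python
import re

# Matches one row of the "SATA Phy Event Counters" table, e.g.
# "0x000a  2          218  Device-to-host register FISes sent due to a COMRESET"
ROW = re.compile(r"^(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_phy_counters(text):
    """Return {counter_id: (value, description)} parsed from smartctl text output."""
    counters = {}
    in_table = False
    for line in text.splitlines():
        if line.startswith("SATA Phy Event Counters"):
            in_table = True  # rows follow this header
            continue
        if not in_table:
            continue
        m = ROW.match(line.strip())
        if m:
            counters[int(m.group(1), 16)] = (int(m.group(3)), m.group(4))
        elif counters and not line.strip():
            break  # a blank line after the rows ends the table
    return counters


# Hypothetical sample: two rows copied from the output in this message.
SAMPLE = """SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2          218  Device-to-host register FISes sent due to a COMRESET
"""

before = parse_phy_counters(SAMPLE)
# Assumed later reading after one more hang (218 -> 219), for illustration only.
after = parse_phy_counters(SAMPLE.replace("218", "219"))
delta = after[0x000A][0] - before[0x000A][0]
print(f"COMRESET counter (0x000a): {before[0x000A][0]} -> {after[0x000A][0]} (+{delta})")
```

On a live system the input text would presumably come from something like `smartctl -l sataphy <device>`, captured once before and once after a hang. The sketch only reports the difference and says nothing about what causes the resets.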

What is the meaning of this "Device-to-host register FISes sent due to a COMRESET" event?

Full details below.
Thanks

smartctl 6.3 2014-07-26 r3976 [x86_64-w64-mingw32-win8] (sf-6.3-1)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Device Model:     LITEONIT LGT-256M6G
Firmware Version: DG83001
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 4a
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat May 23 13:20:45 2020 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, frozen [SEC2]


=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


General SMART Values:
Offline data collection status:  (0x02)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:         (   10) seconds.
Offline data collection
capabilities:              (0x15) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Abort Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0002)    Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x00)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      (  10) minutes.
SCT capabilities:            (0x003d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.


SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO----   100   100   070    -    0
  5 Reallocated_Sector_Ct   PO----   100   100   000    -    0
  9 Power_On_Hours          -O----   100   100   000    -    12462
 12 Power_Cycle_Count       -O----   100   100   000    -    403
177 Wear_Leveling_Count     PO----   100   100   000    -    1778823
178 Used_Rsvd_Blk_Cnt_Chip  PO----   100   100   000    -    8
181 Program_Fail_Cnt_Total  PO----   100   100   000    -    0
182 Erase_Fail_Count_Total  PO----   100   100   000    -    13
187 Reported_Uncorrect      -O----   100   100   000    -    0
192 Power-Off_Retract_Count PO----   100   100   000    -    205
196 Reallocated_Event_Count PO----   100   100   000    -    0
198 Offline_Uncorrectable   PO----   100   100   000    -    0
199 UDMA_CRC_Error_Count    PO----   100   100   000    -    9
232 Available_Reservd_Space PO----   100   100   010    -    0
241 Total_LBAs_Written      PO----   100   100   000    -    1897582
242 Total_LBAs_Read         PO----   100   100   000    -    1550151
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning


General Purpose Log Directory Version 0
SMART           Log Directory Version 0
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01       GPL,SL  R/O      1  Summary SMART error log
0x06       GPL,SL  R/O      1  SMART self-test log
0x07       GPL,SL  R/O      1  Extended self-test log
0x10       GPL,SL  R/O      1  NCQ Command Error log
0x11       GPL,SL  R/O      1  SATA Phy Event Counters
0x80-0x9f  GPL,SL  R/W      1  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer


SMART Extended Comprehensive Error Log (GP Log 0x03) not supported


SMART Error Log Version: 0
No Errors Logged


SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     35888         -
# 2  Short offline       Completed without error       00%     35632         -
# 3  Extended offline    Completed without error       00%     44053         -
# 4  Short offline       Completed without error       00%     43541         -


Selective Self-tests/Logging not supported


SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                     ? Celsius
Power Cycle Min/Max Temperature:      ?/ ? Celsius
Lifetime    Min/Max Temperature:      ?/ ? Celsius
Under/Over Temperature Limit Count:   0/0


SCT Temperature History Version:     2
Temperature Sampling Period:         0 minutes
Temperature Logging Interval:        0 minutes
Min/Max recommended Temperature:      ?/ ? Celsius
Min/Max Temperature Limit:            ?/ ? Celsius
Temperature History Size (Index):    128 (0)


Index    Estimated Time   Temperature Celsius
   1    2020-05-23 11:13     ?  -
 ...    ..(126 skipped).    ..  -
   0    2020-05-23 13:20     ?  -


SCT Error Recovery Control:
           Read:      2 (0.2 seconds)
          Write:      2 (0.2 seconds)


Device Statistics (GP Log 0x04) not supported


SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2          218  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

