[smartmontools-support] OS hangs and COMRESET events: failing drive?
Christian Franke
Christian.Franke at t-online.de
Tue May 26 19:43:48 CEST 2020
Gruff Hacker via Smartmontools-support wrote:
> I have a LITEONIT LGT-256M6G M.2 SATA SSD on a Windows 10 system.
> For a few days the OS has been hanging periodically and the only thing I can see in the Windows event logs is this:
>
> Source: iaStorA
> Event ID: 129
> Reset to device, \Device\RaidPort0, was issued.
>
> There is no RAID configured.
> During the hangs, the drive access light is on steady the entire time.
> There are apparently a large number of issues that can cause this, but the only other thing I noticed is this in the smartmontools output:
> SATA Phy Event Counters (GP Log 0x11)
> ID      Size     Value  Description
> 0x000a  2          218  Device-to-host register FISes sent due to a COMRESET
>
> Every time one of these hangs happens, the counter of that COMRESET event in the smartmontools output will increment.
> Any ideas? Does this indicate a failing drive?
>
> What is the meaning of this "Device-to-host register FISes sent due to a COMRESET" event?
COMRESET is an out-of-band (OOB) signal on the SATA interface that the
host uses to reset the device. The "... due to a COMRESET" counter is
cleared at power-on and usually increases by 2 during each (warm) boot,
because both the BIOS and the device driver issue a COMRESET.
In the above case, the counter increased further because the device
driver decided to reset the device several times. Unfortunately, the
counter gives no hint as to why the resets were issued.
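To correlate the counter with the hangs, it can be read before and
after an event and the two values compared. Below is a minimal Python
sketch, assuming the "smartctl -l sataphy" output layout quoted above.
The helper name parse_phy_counters is made up for illustration and is
not part of smartmontools; the optional "+" after the value is an
assumption about how an overflowed counter may be printed.

```python
import re

# One counter row: hex ID, size in bytes, value, description.
# The optional '+' after the value is an assumption (overflow marker).
ROW = re.compile(r"^(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\+?\s+(.+)$")


def parse_phy_counters(text):
    """Return {counter_id: (value, description)} parsed from the text
    of a 'smartctl -l sataphy' report. Hypothetical helper."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            counter_id = int(match.group(1), 16)
            counters[counter_id] = (int(match.group(3)), match.group(4).strip())
    return counters


# Sample taken from the report quoted in this thread.
SAMPLE = """\
SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2          218  Device-to-host register FISes sent due to a COMRESET
"""

if __name__ == "__main__":
    value, description = parse_phy_counters(SAMPLE)[0x000A]
    print(value, description)
```

Since the counter is cleared at power-on, values are only comparable
within one power cycle, and about 2 counts per warm boot are expected
anyway.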
> Full details below.
> Thanks
>
> smartctl 6.3 2014-07-26 r3976 [x86_64-w64-mingw32-win8] (sf-6.3-1)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> === START OF INFORMATION SECTION ===
> Device Model: LITEONIT LGT-256M6G
> Firmware Version: DG83001
> User Capacity: 256,060,514,304 bytes [256 GB]
> ...
>
> SMART Attributes Data Structure revision number: 1
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
>   1 Raw_Read_Error_Rate     PO----   100   100   070    -    0
>   5 Reallocated_Sector_Ct   PO----   100   100   000    -    0
>   9 Power_On_Hours          -O----   100   100   000    -    12462
>  12 Power_Cycle_Count       -O----   100   100   000    -    403
> 177 Wear_Leveling_Count     PO----   100   100   000    -    1778823
> 178 Used_Rsvd_Blk_Cnt_Chip  PO----   100   100   000    -    8
> 181 Program_Fail_Cnt_Total  PO----   100   100   000    -    0
> 182 Erase_Fail_Count_Total  PO----   100   100   000    -    13
> 187 Reported_Uncorrect      -O----   100   100   000    -    0
> 192 Power-Off_Retract_Count PO----   100   100   000    -    205
> 196 Reallocated_Event_Count PO----   100   100   000    -    0
> 198 Offline_Uncorrectable   PO----   100   100   000    -    0
> 199 UDMA_CRC_Error_Count    PO----   100   100   000    -    9
> 232 Available_Reservd_Space PO----   100   100   010    -    0
> 241 Total_LBAs_Written      PO----   100   100   000    -    1897582
> 242 Total_LBAs_Read         PO----   100   100   000    -    1550151
>                             ||||||_ K auto-keep
>                             |||||__ C event count
>                             ||||___ R error rate
>                             |||____ S speed/performance
>                             ||_____ O updated online
>                             |______ P prefailure warning
> ...
The SMART values show no signs of trouble. If the raw value of ID 199
(UDMA_CRC_Error_Count, currently 9) increases further, check for M.2
connector problems.
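Whether ID 199 keeps increasing can be checked by saving two attribute
reports some time apart and comparing the raw values. A rough Python
sketch, assuming the brief attribute table layout quoted above (eight
columns, raw value last). The function names are hypothetical, and the
AFTER report with its value of 11 is invented purely for illustration.

```python
def attr_raw_value(report, attr_id):
    """Return the integer RAW_VALUE of one attribute from a brief
    attribute table, or None if the attribute is not listed.
    Hypothetical helper; tolerates '> ' mail quoting."""
    for line in report.splitlines():
        fields = line.lstrip("> ").split()
        # Columns: ID# NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
        if len(fields) >= 8 and fields[0] == str(attr_id):
            return int(fields[7])
    return None


def raw_increase(before, after, attr_id=199):
    """Return how much the raw value grew between two reports."""
    return attr_raw_value(after, attr_id) - attr_raw_value(before, attr_id)


# BEFORE is the row from this thread; AFTER is a made-up later reading.
BEFORE = "199 UDMA_CRC_Error_Count    PO----   100   100   000    -    9"
AFTER = "199 UDMA_CRC_Error_Count    PO----   100   100   000    -    11"

if __name__ == "__main__":
    print("CRC error increase:", raw_increase(BEFORE, AFTER))
```

A nonzero increase would point towards the link or connector rather
than the flash itself.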
Regards,
Christian