[smartmontools-support] Reading NVME logs
Thane K. Sherrington
thane at computerconnectionltd.com
Wed Dec 21 16:24:59 CET 2022
Hi Christian,
Thank you for the very informative response. Really appreciate
it. The old SMART tests were easier to read and, I think, more useful.
Thane K. Sherrington
On 21-Dec-2022 10:59 a.m., Christian Franke wrote:
> Thane K. Sherrington wrote:
>> Hi all,
>> Looking at NVME logs, I find them hard to understand.
>>
>> Take the following logs (questions inline).
>>
>> smartctl pre-7.4 2022-07-17 r5397 [i686-w64-mingw32-w11-21H2(64)]
>> (CircleCI)
>> ...
>> === START OF SMART DATA SECTION ===
>> >>>>>>>>>>>>> SMART overall-health self-assessment test result:
>> PASSED - I assume this is good, but not a perfect indicator, since I've
>> seen regular drives say they passed SMART but had read errors.
>
> This message only exists for consistency with ATA and SCSI output. It
> prints PASSED if and only if the "Critical Warning" byte from
> SMART/Health info is zero.
>
> Since the early days of ATA SMART, read errors have not implied that a
> SMART failure is reported.
> https://www.smartmontools.org/wiki/FAQ#ATAdriveisfailingself-testsbutSMARThealthstatusisPASSED.Whatsgoingon
>
>
>
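The PASSED rule described above can be sketched in a few lines of Python.
This is only an illustration of the stated rule, not smartctl's actual
code; the function name overall_health is made up for this example.

```python
def overall_health(critical_warning):
    """Return the health text for a Critical Warning byte (hypothetical helper)."""
    if not 0 <= critical_warning <= 0xFF:
        raise ValueError("Critical Warning is a single byte (0..255)")
    # PASSED if and only if no bit in the byte is set.
    return "PASSED" if critical_warning == 0 else "FAILED!"


print(overall_health(0x00))
```

With the value from the log in this thread (0x00) the rule yields PASSED;
any non-zero byte yields FAILED!, whichever bit is set.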
>> ...
>> SMART/Health Information (NVMe Log 0x02)
>
> For details about this log, see for example "Figure 207" from "NVM
> Express Base Specification, revision 2.0b":
> https://nvmexpress.org/developers/nvme-specification/
>
>
>> Critical Warning: 0x00
>
> Bit 0 of this byte would be set if "the available spare capacity has
> fallen below the threshold". Then "FAILED!" would be printed above.
> See spec for more bits.
>
>
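A minimal sketch of decoding the Critical Warning byte bit by bit. Only
bit 0 is described in this thread; the meanings given for bits 1 to 5 are
paraphrased from my reading of the NVMe Base Specification and should be
checked against the spec revision that applies to your drive. The names
CRITICAL_WARNING_BITS and decode_critical_warning are hypothetical.

```python
# Bit meanings: bit 0 is quoted in the thread, bits 1-5 are an assumption
# paraphrased from the NVMe Base Specification (verify against the spec).
CRITICAL_WARNING_BITS = {
    0: "available spare capacity has fallen below the threshold",
    1: "temperature is above an over-temperature or below an under-temperature threshold",
    2: "NVM subsystem reliability has been degraded",
    3: "media has been placed in read-only mode",
    4: "volatile memory backup device has failed",
    5: "Persistent Memory Region has become read-only or unreliable",
}


def decode_critical_warning(value):
    """Return the descriptions of all known bits set in the byte."""
    return [text for bit, text in CRITICAL_WARNING_BITS.items() if value & (1 << bit)]


for line in decode_critical_warning(0x01):
    print(line)
```

An empty list corresponds to the 0x00 value shown in the log above.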
>> >>>>>>>>>>>>> Temperature: 79 Celsius - This
>> seems high - is that a problem?
>
> Possibly. If this persists, I would suggest adding more cooling.
>
>
>> >>>>>>>>>>>>> Available Spare: 100% - This
>> looks like 100% of the spare is free. I assume that's a good sign?
>
> Yes.
>
>
>> Available Spare Threshold: 10%
>> >>>>>>>>>>>>> Percentage Used: 16% - Does this
>> mean that 16% of the drive is used, or 16% of the spare space is used?
>
> "Contains a vendor specific estimate of the percentage of NVM
> subsystem life used based on the actual usage and the manufacturer’s
> prediction of NVM life. ...".
> See spec for full text.
>
>
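As a rough illustration of how the three wear-related fields relate, here
is a small Python sketch using the values from the log above. It assumes
a simple reading in which 100 minus Percentage Used approximates the
remaining rated life; that is my simplification, not a statement from the
spec, and the value is vendor specific. The function name summarize_wear
and its result keys are hypothetical.

```python
def summarize_wear(available_spare, spare_threshold, percentage_used):
    """Summarize spare capacity and estimated wear (all inputs in percent)."""
    return {
        # Corresponds to the condition that would set Critical Warning bit 0.
        "spare_below_threshold": available_spare < spare_threshold,
        # How far the spare capacity is above the warning threshold.
        "spare_margin": available_spare - spare_threshold,
        # Assumption: remaining rated life is 100 minus Percentage Used,
        # clamped at 0 in case the reported value exceeds 100.
        "estimated_life_remaining": max(0, 100 - percentage_used),
    }


# Values taken from the log quoted in this thread.
print(summarize_wear(available_spare=100, spare_threshold=10, percentage_used=16))
```

So Percentage Used is neither filled capacity nor consumed spare space; it
is an estimate of consumed endurance, separate from Available Spare.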
>> >>>>>>>>>>>>> Unsafe Shutdowns: 703 - Do unsafe
>> shutdowns matter?
>
> This possibly indicates that the system was frequently not shut down
> properly.
>
>
>> >>>>>>>>>>>>> Error Information Log Entries: 3,470 - Is this
>> a bad thing?
>
> Unknown. Experience shows that drive firmware of some vendors
> increments this frequently.
>
>
>> Error Information (NVMe Log 0x01, 16 of 63 entries)
>> >>>>>>>>>>>>> No Errors Logged - If there are 3470 log entries,
>> why are no errors logged?
>>
>
> "The controller should clear this log page by removing all entries on
> power cycle and Controller Level Reset."
> This differs from ATA error logs which are persistent.
>
> Regards,
> Christian
>