[smartmontools-support] Reading NVME logs

Thane K. Sherrington thane at computerconnectionltd.com
Wed Dec 21 16:24:59 CET 2022


Hi Christian,
     Thank you for the very informative response.  Really appreciate 
it.  The old SMART tests were easier to read and, I think, more useful.

Thane K. Sherrington

Computer Connection, Ltd. ...taking the mystery out of computers since 1982.
Winner of the 2012 Ian Spencer - Excellence in Business Award
*Thanks for making us the Reader's Choice Best Computer Store in 2016, 
2017, 2018 and 2019!*
95 College St., Antigonish,
NS B2G 1X6
902-863-3361 (phone)
902-863-2580 (fax)
thane at computerconnectionltd.com

On 21-Dec-2022 10:59 a.m., Christian Franke wrote:
> Thane K. Sherrington wrote:
>> Hi all,
>>     Looking at NVME logs, I find them hard to understand.
>>
>> Take the following logs (questions inline).
>>
>> smartctl pre-7.4 2022-07-17 r5397 [i686-w64-mingw32-w11-21H2(64)] 
>> (CircleCI)
>> ...
>> === START OF SMART DATA SECTION ===
>> >>>>>>>>>>>>> SMART overall-health self-assessment test result: 
>> PASSED - I assume this is good, but not a perfect indicator, since I've 
>> seen regular drives say they passed SMART but had read errors.
>
> This message only exists for consistency with ATA and SCSI output. It 
> prints PASSED if and only if the "Critical Warning" byte from 
> SMART/Health info is zero.
>
> Since the early days of ATA SMART, read errors have not implied that a 
> SMART failure is reported.
> https://www.smartmontools.org/wiki/FAQ#ATAdriveisfailingself-testsbutSMARThealthstatusisPASSED.Whatsgoingon 
>
>
>
>> ...
>> SMART/Health Information (NVMe Log 0x02)
>
> For details about this log, see for example "Figure 207" from "NVM 
> Express Base Specification, revision 2.0b":
> https://nvmexpress.org/developers/nvme-specification/
>
>
>> Critical Warning:                   0x00
>
> Bit 0 of this byte would be set if "the available spare capacity has 
> fallen below the threshold". Then "FAILED!" would be printed above. 
> See the spec for the other bits.
>
>
>> >>>>>>>>>>>>> Temperature:                        79 Celsius - This 
>> seems high - is that a problem?
>
> Possibly. If this persists, I would suggest adding more cooling.
>
>
>> >>>>>>>>>>>>> Available Spare:                    100% - This 
>> looks like 100% of the spare is free.  I assume that's a good sign?
>
> Yes.
>
>
>> Available Spare Threshold:          10%
>> >>>>>>>>>>>>> Percentage Used:                    16% - Does this 
>> mean that 16% of the drive is used, or 16% of the spare space is used?
>
> "Contains a vendor specific estimate of the percentage of NVM 
> subsystem life used based on the actual usage and the manufacturer’s 
> prediction of NVM life. ...".
> See spec for full text.
>
>
>> >>>>>>>>>>>>> Unsafe Shutdowns:                   703 - Do unsafe 
>> shutdowns matter?
>
> This possibly indicates that the system was frequently not shut down 
> properly.
>
>
>> >>>>>>>>>>>>> Error Information Log Entries:      3,470 - Is this 
>> a bad thing?
>
> Unknown. Experience shows that drive firmware of some vendors 
> increments this frequently.
>
>
>> Error Information (NVMe Log 0x01, 16 of 63 entries)
>> >>>>>>>>>>>>> No Errors Logged - If there are 3470 log entries, 
>> why are no errors logged?
>>
>
> "The controller should clear this log page by removing all entries on 
> power cycle and Controller Level Reset."
> This differs from ATA error logs which are persistent.
>
> Regards,
> Christian
>
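
The rules explained above (PASSED if and only if the Critical Warning byte
is zero, bit 0 meaning the spare has fallen below its threshold) can be
sketched in Python. This is only an illustration under assumptions, not
smartctl's own logic: the function names are hypothetical, the sample text
is copied from the report quoted above, and the bit descriptions other than
bit 0 are paraphrased from the SMART/Health Information log page of the
NVMe Base Specification rather than taken from this thread.

```python
# Sample text copied from the smartctl report quoted in this thread.
SAMPLE = """\
Critical Warning:                   0x00
Temperature:                        79 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    16%
Unsafe Shutdowns:                   703
Error Information Log Entries:      3,470
"""

# Paraphrased meanings of the Critical Warning bits (assumption: check the
# NVMe Base Specification for the authoritative wording).
CRITICAL_WARNING_BITS = {
    0: "available spare capacity below threshold",
    1: "temperature above or below a threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
    5: "persistent memory region read-only or unreliable",
}


def parse_health_section(text):
    """Collect 'Name: value' lines of smartctl text output into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


def first_int(value):
    """Return the leading number of a value such as '3,470', '16%' or '0x00'."""
    token = value.split()[0].replace(",", "").rstrip("%")
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token)


def decode_critical_warning(byte):
    """List the meanings of all bits set in the Critical Warning byte."""
    return [meaning for bit, meaning in CRITICAL_WARNING_BITS.items()
            if byte & (1 << bit)]


def summarize(fields):
    """Apply the rules from this thread to the parsed fields."""
    warning = first_int(fields["Critical Warning"])
    spare = first_int(fields["Available Spare"])
    threshold = first_int(fields["Available Spare Threshold"])
    return {
        # PASSED if and only if the Critical Warning byte is zero.
        "overall": "PASSED" if warning == 0 else "FAILED!",
        "warnings": decode_critical_warning(warning),
        "spare_below_threshold": spare < threshold,
        # Vendor estimate of life used, not of capacity or spare used.
        "life_used_percent": first_int(fields["Percentage Used"]),
    }


if __name__ == "__main__":
    print(summarize(parse_health_section(SAMPLE)))
```

The text parsing only mirrors the report layout shown above. smartctl also
has a --json option, and its output would likely be a more robust input for
a real script.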