[smartmontools-support] smartctl -l error causes NVME to die on Debian unstable, Dell XPS 13 7390, Micron 2200S
B
b at mydomainnameisbiggerthanyours.com
Sat Dec 28 08:12:23 CET 2019
At this point I am convinced this is a kernel issue on my specific
hardware. It's the common factor in all of my problems and changing
other variables has no effect.
I suspect this problem will go away the next time Debian publishes a new
kernel.
I didn't find anything specific to my issue when googling around, but I
found some other issues on i915 systems, like the one I have, causing
problems for NVMe drives. This might be something like that.
One thing to note is that the firmware on my drive is 22001030, but the
latest version available on LVFS and Dell's website is 22001020. I may have
some freaky experimental firmware. I see some interesting dates on the
Dell website that make me think they may have released 22001030 and then
pulled it for some reason.
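
In case it helps anyone compare firmware revisions on their own drive,
here is a rough Python sketch that reads the controller's model and
firmware revision from sysfs and checks it against a published version.
The path /sys/class/nvme/nvme0 is an assumption (the controller may have
a different index), and the helper names are made up for illustration.
The two version strings are the ones mentioned above.

```python
from pathlib import Path


def read_nvme_identity(ctrl_dir):
    """Read identity attributes of one NVMe controller from a sysfs directory.

    ctrl_dir is something like /sys/class/nvme/nvme0 (assumed index).
    Attributes that are missing are returned as None.
    """
    ctrl = Path(ctrl_dir)
    info = {}
    for attr in ("model", "firmware_rev", "serial"):
        attr_file = ctrl / attr
        # sysfs pads these values with spaces and a newline, so strip them.
        info[attr] = attr_file.read_text().strip() if attr_file.is_file() else None
    return info


def firmware_differs(installed, published):
    """Return True when the installed revision is not the published one."""
    return installed.strip() != published.strip()


if __name__ == "__main__":
    ident = read_nvme_identity("/sys/class/nvme/nvme0")  # assumed controller
    print(ident)
    if ident["firmware_rev"] is not None:
        # 22001020 is the version I saw on LVFS and Dell's website.
        print("differs from published:",
              firmware_differs(ident["firmware_rev"], "22001020"))
```

This only does a string comparison. It says nothing about which revision
is newer or whether the firmware has anything to do with the failure.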
On 12/27/19 5:10 AM, Christian Franke wrote:
> B wrote:
>> I've discovered that running "smartctl -l error" against my new Dell
>> XPS 13 laptop with a Micron 2200S NVMe causes the drive to die. This
>> obviously causes the entire system to fail, because the filesystem is
>> no longer readable, until the power is pulled and then I can boot
>> normally again.
>>
>> The system is a Dell XPS 13 7390 with EFI version 1.3.1. The NVME is
>> a Micron 2200S NVMe 512GB.
>>
>> My OS is Debian unstable/sid, kernel package
>> linux-image-5.3.0-3-amd64 (5.3.15-1), and smartctl --version says
>> it's "7.0 2018-12-30 r4883 [x86_64-linux-5.3.0-3-amd64] (local build)".
>>
>> I first saw the problem when running smartctl -a against the NVMe
>> drive. Then I narrowed it down to being caused by "smartctl -l error".
>>
>> When the drive dies I get repeating errors in my syslog:
>>
>> kernel: DMAR: DRHD: handling fault status reg 3
>> kernel: DMAR: [DMA Read] Request device [71:00.0] fault addr
>> ffe48000 [fault reason 06] PTE Read access is not set
>>
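
For anyone searching their own logs for the same fault, a rough Python
sketch that pulls the fields out of a DMAR fault line like the one quoted
above. The regular expression is only modelled on that one message, so
treat the format as an assumption; other kernels may word it differently.
The log line is wrapped in this mail, while the function expects it on a
single line. The function name is made up for illustration.

```python
import re

# Pattern modelled on the single fault line quoted in this thread.
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\]\s+"
    r"Request device \[(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\]\s+"
    r"fault addr\s+(?P<addr>[0-9a-f]+)\s+"
    r"\[fault reason (?P<reason>\d+)\]\s*(?P<text>.*)"
)


def parse_dmar_fault(line):
    """Return the fields of a DMAR fault line as a dict, or None if no match."""
    m = DMAR_FAULT.search(line)
    if m is None:
        return None
    return {
        "op": m.group("op"),
        "device": m.group("device"),      # bus:device.function, as in lspci
        "addr": int(m.group("addr"), 16),
        "reason": int(m.group("reason")),
        "text": m.group("text").strip(),
    }


if __name__ == "__main__":
    sample = ("kernel: DMAR: [DMA Read] Request device [71:00.0] fault addr "
              "ffe48000 [fault reason 06] PTE Read access is not set")
    print(parse_dmar_fault(sample))
```

The device field can then be compared against the addresses that lspci
prints to see which device issued the faulting request.
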
>> Notably, the problem is only happening on a Debian unstable
>> installation and this is likely to be a Debian problem, but I figured
>> it was a good idea to report this here first since I usually don't
>> bother reporting bugs to Debian anymore, since they often take years
>> (this is not hyperbole) to even acknowledge bug reports for most
>> packages, including the kernel.
>
> The smartmontools Debian package has a new maintainer. If possible,
> please report this issue in the Debian bug tracker:
> https://bugs.debian.org/cgi-bin/pkgreport.cgi?package=smartmontools
>
>
>> I have not been able to reproduce the problem on any Live images
>> yet, though I'm still trying. I've tested with
>> ubuntu-18.04.3-desktop-amd64.iso and ubuntu-19.10-desktop-amd64.iso,
>> and I'm not seeing the issue there. Notably, Ubuntu 19.10 is on
>> kernel 5.3.0-18, which is pretty close to what is on Debian, but
>> there must be some difference somewhere.
>>
>> Many live images, including Debian's current stable Live Image, are
>> too old for the new hardware and are not booting at all, so my
>> options with testing are somewhat limited in that respect.
>>
>> Oh, also, I am not having this problem on two other systems with
>> different NVME drives (A Samsung and a WD), but both have the exact
>> same Debian unstable packages and versions, so there is clearly some
>> hardware factor in this.
>>
>> I have not tried swapping the NVME drive in the laptop yet because I
>> don't have a spare handy, but that is possible to arrange if desired.
>
> Yes, please. It would be very interesting to know whether this also
> depends on the NVMe drive firmware.
>
>
>>
>> Any ideas/comments?
>
> Sorry, no. Linux NVMe support in smartmontools is now 3+ years old and
> I don't remember any similar problem report.