[smartmontools-support] smartctl -l error causes NVME to die on Debian unstable, Dell XPS 13 7390, Micron 2200S
B
b at mydomainnameisbiggerthanyours.com
Wed Dec 25 03:55:23 CET 2019
I've discovered that running "smartctl -l error" against my new Dell XPS
13 laptop with a Micron 2200S NVMe causes the drive to die. This takes
the entire system down, because the filesystem is no longer readable.
The drive only comes back after the power is pulled, and then I can boot
normally again.
The system is a Dell XPS 13 7390 with EFI version 1.3.1. The NVMe drive
is a 512GB Micron 2200S.
My OS is Debian unstable/sid, kernel package linux-image-5.3.0-3-amd64
(5.3.15-1), and smartctl --version says it's "7.0 2018-12-30 r4883
[x86_64-linux-5.3.0-3-amd64] (local build)".
I first saw the problem when running "smartctl -a" against the NVMe
drive. Then I narrowed it down to being caused by "smartctl -l error".
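A minimal sketch of that narrowing-down step, for anyone who wants to
repeat it. It assumes the smartctl(8) man page's description of "-a" on
NVMe devices as equivalent to "-H -i -c -A -l error", and it uses
/dev/nvme0 as a placeholder device path. The helper name
bisect_commands is hypothetical. The script only prints the commands; it
does not run them, since one of them is expected to hang the drive.

```python
# Options that "smartctl -a" is documented to expand to on NVMe devices
# (assumption taken from the smartctl(8) man page, not from this report).
NVME_ALL_OPTIONS = [["-H"], ["-i"], ["-c"], ["-A"], ["-l", "error"]]


def bisect_commands(device):
    """Return one smartctl command line per option group, for manual bisecting."""
    return [["smartctl", *opts, device] for opts in NVME_ALL_OPTIONS]


# /dev/nvme0 is a placeholder; substitute the actual device node.
for cmd in bisect_commands("/dev/nvme0"):
    print(" ".join(cmd))
```

Running the printed commands one at a time, with a sync in between,
shows which option triggers the failure.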
When the drive dies I get repeating errors in my syslog:
kernel: DMAR: DRHD: handling fault status reg 3
kernel: DMAR: [DMA Read] Request device [71:00.0] fault addr ffe48000 [fault reason 06] PTE Read access is not set
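For anyone comparing logs from other machines, here is a small sketch
that pulls the fields out of the DMAR fault line above. The regular
expression is written only against the one line quoted here, so treat
the pattern and the helper name parse_dmar_fault as assumptions rather
than a general parser for kernel DMAR messages.

```python
import re

# Pattern modelled on the single DMAR fault line quoted in this report.
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] "
    r"Request device \[(?P<bdf>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr (?P<addr>[0-9a-f]+) "
    r"\[fault reason (?P<reason>\d+)\] (?P<text>.*)"
)


def parse_dmar_fault(line):
    """Return the fault fields as a dict, or None if the line does not match."""
    m = DMAR_FAULT.search(line)
    if m is None:
        return None
    return {
        "op": m.group("op"),
        "device": m.group("bdf"),          # PCI bus:device.function
        "addr": int(m.group("addr"), 16),  # faulting DMA address
        "reason": int(m.group("reason")),
        "text": m.group("text").strip(),
    }


sample = ("kernel: DMAR: [DMA Read] Request device [71:00.0] fault addr "
          "ffe48000 [fault reason 06] PTE Read access is not set")
fault = parse_dmar_fault(sample)
print(fault["device"], hex(fault["addr"]), fault["reason"])
```

The extracted device address (71:00.0 here) can then be checked with
"lspci -s 71:00.0" to confirm which PCI device issued the faulting DMA
request.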
So far the problem only happens on a Debian unstable installation, so it
is likely to be a Debian problem. I figured it was a good idea to report
it here first, though, because I usually don't bother reporting bugs to
Debian anymore: they often take years (this is not hyperbole) to even
acknowledge bug reports for most packages, including the kernel.
I have not been able to reproduce the problem on any live images yet,
though I'm still trying. I've tested with
ubuntu-18.04.3-desktop-amd64.iso and ubuntu-19.10-desktop-amd64.iso, and
I'm not seeing the issue there. Notably, Ubuntu 19.10 is on kernel
5.3.0-18, which is pretty close to what is on Debian, so there must be
some difference somewhere.
Many live images, including Debian's current stable Live Image, are too
old for the new hardware and are not booting at all, so my options with
testing are somewhat limited in that respect.
Oh, also, I am not having this problem on two other systems with
different NVMe drives (a Samsung and a WD), even though both have the
exact same Debian unstable packages and versions, so there is clearly
some hardware factor in this.
I have not tried swapping the NVMe drive in the laptop yet because I
don't have a spare handy, but that is possible to arrange if desired.
Any ideas/comments?