[smartmontools-support] using smartd to monitor a rotating group of USB drives?

Nathan Stratton Treadway nathanst at ontko.com
Wed Oct 31 06:19:25 CET 2018


We have a backup server which has four SATA drives inside the system,
plus a cycle of USB external drives.  Each external drive is plugged in
for a few days while waiting for a backup to be written to it; once
used, it is unplugged (and taken off site) and the next drive in the
cycle is plugged in.

We list the four internal drives explicitly in smartd.conf, and smartd
monitors them with no problem.


We'd also like to have smartd check the status of whatever external
drive(s) happen to be plugged in at any point in time, and send an alert
email if there are SMART attributes (or error logs, or whatever) on those
drives reflecting a failure condition... but it doesn't really work as
we'd like.

(We've been trying this out using smartmontools 6.5+svn4324-1 as found
in the Bionic release of Ubuntu.)

Of course we looked at the "-d removable" directive for lines in
smartd.conf... but that doesn't seem to apply to DEVICESCAN-detected
devices.

In any case, we're running into the following issues in our scenario:

A) we get "SMART error (FailedOpenDevice) detected" warning messages
   each time a drive is unplugged, repeated daily until some new drive
   is plugged in and takes over the old /dev/sdX device name.

B) when a new drive is plugged in, it appears that smartd doesn't check
   to see if the drive now found as /dev/sdX is actually the same drive
   as the one found at start-up time -- and thus over time multiple
   different drives are all treated as the same drive, with data saved
   to the same /var/lib/smartmontools/*<DEVICE_MODEL>_<SERIAL_NUMBER>*
   files (named after whatever happened to be plugged in at startup
   time) -- and, I believe, any warning emails sent include the original
   drive's info in the body of the message, rather than the info for the
   drive that's actually attached at that time.

C) smartd only checks for devices mapped to the /dev/sd* files that were
   found at startup time; if at some later point more drives are plugged
   in simultaneously than were present at startup, the "extras" won't be
   detected.
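
To illustrate what I mean in B), here is a minimal Python sketch of the
kind of identity check I'd expect: it looks up which /dev/disk/by-id
names currently resolve to a given /dev/sdX node and compares them with
the names remembered from an earlier check.  This is not how smartd
works internally; the function names stable_ids and drive_changed are
made up for illustration, and it assumes a Linux system where udev
populates /dev/disk/by-id.

```python
import os


def stable_ids(dev_node, by_id_dir="/dev/disk/by-id"):
    """Return the by-id link names that currently resolve to dev_node."""
    target = os.path.realpath(dev_node)
    names = []
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        # Each by-id entry is a symlink to a node such as /dev/sdb.
        if os.path.realpath(link) == target:
            names.append(name)
    return names


def drive_changed(dev_node, remembered_ids, by_id_dir="/dev/disk/by-id"):
    """True if none of the remembered ids point at dev_node any more.

    This is also True when the drive has simply been unplugged, since
    no by-id link resolves to the node in that case.
    """
    current = set(stable_ids(dev_node, by_id_dir))
    return current.isdisjoint(remembered_ids)
```

For example, remembering stable_ids("/dev/sdb") at startup and calling
drive_changed("/dev/sdb", remembered) before each later check would
reveal that a different physical drive has taken over the same device
name.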


I looked through the FAQ page on the Wiki and the Trac tickets, but the
only thing I found that appeared directly related to the above issues
was Trac ticket #60 "DEVICESCAN and hotplug", which covers issue C).

Is there some way I'm missing to suppress the FailedOpenDevice alert
message completely for particular devices?

It seems like issue B) would cause problems in many situations, even on
systems that don't purposefully go through a whole cycle of external
drives over time, but I haven't been able to find an existing Trac
ticket for it.  Does it make sense for me to open one?


Has anyone else had any success using smartd to watch out for
warning/error conditions on a rotating group of external USB drives
(under Linux in particular)?

One thought I had to try working around all three issues (at the expense
of continuous monitoring of the drives for the full period they are
plugged in) was to leave the standard Ubuntu-installed smartd running as
usual, monitoring the internal drives, and then periodically run a
separate "smartd -q onecheck -c /etc/smartd_for_usb_drives.conf" command
to do a status check on whatever external drives are available at that
moment.

Will it cause any problems/conflicts to run "smartd -q onecheck" at the
same time as the standard long-term smartd process is running (assuming
that I make sure the internal drives are excluded from monitoring by the
smartd_for_usb_drives.conf file)?
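
A rough Python sketch of that workaround, to make the idea concrete: it
builds a config with one explicit line per USB drive currently visible
under /dev/disk/by-id and then invokes the one-shot check.  The helper
names (usb_disks, build_conf, run_onecheck) are hypothetical, and the
"-a -m root" directives and the assumption that smartd accepts the
by-id paths directly are just my guesses at a reasonable starting
point; I have not verified that alerts behave as desired in onecheck
mode.

```python
import os
import re
import subprocess


def usb_disks(by_id_dir="/dev/disk/by-id"):
    """List by-id paths of whole USB disks (partition links are skipped)."""
    if not os.path.isdir(by_id_dir):
        return []
    return [
        os.path.join(by_id_dir, name)
        for name in sorted(os.listdir(by_id_dir))
        if name.startswith("usb-") and not re.search(r"-part\d+$", name)
    ]


def build_conf(device_paths, mail_to="root"):
    """Return smartd.conf text with one explicit line per device."""
    lines = ["# generated for a one-off check of external USB drives"]
    for path in device_paths:
        # -a: monitor all SMART properties; -m: address for warning mail
        lines.append(f"{path} -a -m {mail_to}")
    return "\n".join(lines) + "\n"


def run_onecheck(conf_path):
    """Run the one-shot check on conf_path; returns smartd's exit status."""
    result = subprocess.run(["smartd", "-q", "onecheck", "-c", conf_path])
    return result.returncode
```

A cron job could write build_conf(usb_disks()) to
/etc/smartd_for_usb_drives.conf and call run_onecheck() on it (as
root), skipping the run entirely when usb_disks() returns an empty
list.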

Thanks.

							Nathan

----------------------------------------------------------------------------
Nathan Stratton Treadway  -  nathanst at ontko.com  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


