[smartmontools-support] mdadm stuck at 0% reshape after grow

Fri Dec 8 00:40:25 CET 2017

Cross-posting to smartmontools: could any of you be kind enough to
explain what's happening? Please keep linux-raid and me in the
cross-post, as we are not subscribed to smartmontools ...

The command in question is

# smartctl -l scterc,70,70 /dev/sdb ; echo $?

On 07/12/17 17:40, Andreas Klauer wrote:
> On Thu, Dec 07, 2017 at 02:58:32PM +0100, Andreas Klauer wrote:
>> Perhaps it's an intermittent error specific to your smartctl version?
> 
> Looking at the source code, it seems to be a case of:
> 
>     sct supported, erc unsupported - exit 4 (fail)
>     sct unsupported as a whole     - exit 0 (success)
> 
> So some drives will give the wrong exit code for this command.
> 
> Should probably be reported as a bug to smartmontools.
> 
If true, this actually has quite a big impact on hobbyist raid
installations. By default, if a desktop drive hiccups, md-raid will kick
it from the array instead of sorting it out. There's a recommended
script that fixes the problem (by changing the linux timeouts), except
that if this is true the script will break.

If we've got an enterprise drive, then the above command works, sets the
drive's error-recovery timeout to 7 seconds, and returns 0 for success.

For drives like my Barracuda, sct is supported but erc isn't, and I get
4 returned, so the script correctly detects a fail and sets the linux
timeout to 180 seconds.
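
For anyone who hasn't seen the script, the decision it makes can be sketched roughly as below. This is only an illustration of the logic described above, not the actual recommended script: the function names needs_long_timeout and apply_timeout are made up, and it assumes the kernel timeout lives in /sys/block/<dev>/device/timeout.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not taken from the real script.

# needs_long_timeout STATUS
# Succeeds (returns 0) when smartctl's exit status says ERC could not be set,
# i.e. the linux timeout has to be raised instead.
needs_long_timeout() {
    [ "$1" -ne 0 ]
}

# apply_timeout DEV   (e.g. "apply_timeout sdb"; needs root)
# Try to set the drive's error recovery to 7 seconds; if smartctl reports
# failure, fall back to a 180 second linux timeout for that drive.
apply_timeout() {
    dev=$1
    smartctl -l scterc,70,70 "/dev/$dev" > /dev/null
    status=$?
    if needs_long_timeout "$status"; then
        # Assumed sysfs location of the per-device command timeout (seconds).
        echo 180 > "/sys/block/$dev/device/timeout"
    fi
}
```

Everything hinges on the exit status being trustworthy: with status 0 the script leaves the linux timeout alone.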

But if this is true, for a drive that doesn't support sct (I don't have
one to test), the command will fail but return 0!!! So the script
doesn't realise anything is wrong, doesn't fix the defaults, and leaves
the raid array in the dangerous situation where linux will time out long
before the drive does.

So does smartctl return 0 if the drive doesn't support sct? If so why?
And what's the easiest way to detect such a drive if so?
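
One workaround I can imagine, as a rough sketch only: don't trust the exit code alone, but also require smartctl's output to confirm a read timeout in seconds. The pattern matched below (a "Read:" line ending in "seconds") is my assumption about what smartctl prints on success and should be checked against the installed version; erc_confirmed and check_drive are made-up names.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the output pattern
# is an assumption about smartctl's success message.

# erc_confirmed STATUS OUTPUT
# Succeeds only when smartctl exited 0 AND its output reports a
# read timeout expressed in seconds.
erc_confirmed() {
    [ "$1" -eq 0 ] || return 1
    printf '%s\n' "$2" | grep -Eq 'Read:.*seconds'
}

# check_drive DEV   (e.g. "check_drive sdb"; needs root)
# Raise the linux timeout to 180 seconds unless ERC was confirmed as set.
check_drive() {
    dev=$1
    out=$(smartctl -l scterc,70,70 "/dev/$dev" 2>&1)
    status=$?
    if ! erc_confirmed "$status" "$out"; then
        # Assumed sysfs location of the per-device command timeout (seconds).
        echo 180 > "/sys/block/$dev/device/timeout"
    fi
}
```

With this, a drive that returns 0 without printing any confirmation would be treated like a failure and get the 180 second timeout.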

Cheers,
Wolo