[smartmontools-support] Calculating smartctl output to recover bad-sector on FreeBSD ZFS

Christian Franke Christian.Franke at t-online.de
Sun Mar 14 15:54:24 CET 2021


Budi Janto wrote:
>
>
> On 3/14/21 12:42 AM, Christian Franke wrote:
>> Did you try any of the suggested smartctl options (-l xerror -l 
>> defects) ?
>
> # smartctl -l defects /dev/ada2
> smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-STABLE amd64] (local build)
> Copyright (C) 2002-20, Bruce Allen, Christian Franke, 
> www.smartmontools.org
>
> Pending Defects log (GP Log 0x0c) not supported

Unfortunately, only a few recent drives support this useful log.


> # smartctl -l xerror /dev/ada2
> smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-STABLE amd64] (local build)
> Copyright (C) 2002-20, Bruce Allen, Christian Franke, 
> www.smartmontools.org
>
> ...
> Error 30 [5] occurred at disk power-on lifetime: 28340 hours (1180 
> days + 20 hours)
>   When the command that caused the error occurred, the device was 
> active or idle.
>
>   After command completion occurred, registers were:
>   ER -- ST COUNT  LBA_48  LH LM LL DV DC
>   -- -- -- == -- == == == -- -- -- -- --
>   40 -- 51 00 00 00 01 c7 35 17 a0 40 00  Error: UNC at LBA = 0x1c73517a0 = 7637112736
>
>   Commands leading to the command that caused the error were:
>   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time Command/Feature_Name
>   -- == -- == -- == == == -- -- -- -- --  --------------- --------------------
>   60 01 00 00 38 00 01 c7 35 16 b0 40 08 39d+05:11:02.405  READ FPDMA QUEUED

This means:
READ FPDMA QUEUED (NCQ Tag 0x07) of 0x0100 (256) logical sectors 
starting at LBA 0x0001c73516b0 (7637112496) failed at LBA 0x0001c73517a0 
(7637112736).

7637112736 is likely the LBA of one of the unreadable physical sectors. 
It needs 33 bits and is therefore not visible in the legacy logs.
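
The decoding above can be reproduced with a few lines of Python. This is only a sketch: the byte positions are read off the column headers of the quoted output, and the NCQ layout (sector count in FEATURE, tag in COUNT bits 7:3) is what READ FPDMA QUEUED uses. The 4096-byte block index at the end assumes 512-byte logical sectors and 4096-byte physical sectors, which is an assumption here and should be checked with 'smartctl -i'.

```python
# Register bytes copied from the quoted 'smartctl -l xerror' output.
cmd_regs = "60 01 00 00 38 00 01 c7 35 16 b0 40 08".split()
err_regs = "40 -- 51 00 00 00 01 c7 35 17 a0 40 00".split()


def lba48(tokens):
    """Join the six LBA bytes (columns LBA_48 + LH LM LL), high byte first."""
    return int("".join(tokens[5:11]), 16)


# For NCQ commands the sector count is in FEATURE and the tag in COUNT bits 7:3.
sectors = int(cmd_regs[1] + cmd_regs[2], 16)
ncq_tag = int(cmd_regs[3] + cmd_regs[4], 16) >> 3

start_lba = lba48(cmd_regs)
fail_lba = lba48(err_regs)

print(f"tag={ncq_tag} sectors={sectors}")
print(f"start LBA = {start_lba:#x} = {start_lba}")
print(f"fail  LBA = {fail_lba:#x} = {fail_lba}")
print(f"offset into request = {fail_lba - start_lba} sectors")
print(f"bits needed for fail LBA = {fail_lba.bit_length()}")

# Assumption: 512-byte logical and 4096-byte physical sectors (8 LBAs per block).
LBAS_PER_BLOCK = 4096 // 512
block, remainder = divmod(fail_lba, LBAS_PER_BLOCK)
print(f"4096-byte block index = {block}, LBA offset within block = {remainder}")
```

Under those assumptions the block index printed last is the value that would go into skip= or seek= of a dd command using bs=4096.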


> ...
>> I'm not sure whether 'conv=noerror,sync' has any effect in 
>> conjunction with /dev/zero.
>>
>> Caching should be suppressed with '*flag=direct'. Check first that 
>> the physical sector is actually unreadable, for example:
>>
>> # dd if=/dev/ada2 of=/dev/null bs=4096 count=1 skip=417763282 
>> iflag=direct
>>
>> If and only if this command reports a read error, try to overwrite 
>> the physical sector:
>>
>> # dd if=/dev/zero of=/dev/ada2 bs=4096 count=1 seek=417763282 
>> oflag=direct
>
> I read https://datto.engineering/post/causing-zfs-corruption.
> My goal is to cover the bad sector so that the system does not 
> use it. I am just curious before I get a new hard drive. On freebsd-ufs 
> I already succeeded by following this step:
>
> # smartctl -l selftest /dev/ada0 | awk 'NR==7'
> # 1  Extended offline    Completed: read failure       90% 36067      27292160
>                                                                       ^^^^^^^^ (L)

Using '-l selftest' may not work for disks > 2TiB, see my last mail. Use 
-l xselftest instead.
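
The 2TiB figure can be checked with simple arithmetic. The sketch below assumes 512-byte logical sectors and a 32-bit LBA field in the legacy self-test log entries; both are assumptions of this example. It also shows the low 32 bits of the failing LBA from the xerror output, only to illustrate why a 33-bit LBA cannot be represented there. It does not claim that a drive reports exactly this value.

```python
LOGICAL_SECTOR_BYTES = 512   # assumption: 512-byte logical sectors
LEGACY_LBA_BITS = 32         # assumption: 32-bit LBA field in the legacy log

# Largest capacity whose LBAs all fit into the legacy field.
limit_bytes = (1 << LEGACY_LBA_BITS) * LOGICAL_SECTOR_BYTES
print(f"limit = {limit_bytes} bytes = {limit_bytes // 2**40} TiB")

# Failing LBA from the 'smartctl -l xerror' output.
fail_lba = 7637112736
fits = fail_lba < (1 << LEGACY_LBA_BITS)
low32 = fail_lba & ((1 << LEGACY_LBA_BITS) - 1)
print(f"fits in {LEGACY_LBA_BITS} bits: {fits}")
print(f"low 32 bits only: {low32}")
```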

Do a full read scan on the device with some tool like badblocks (I 
prefer GNU ddrescue). This should find the actually unreadable blocks.


> ...
> Is it possible in freebsd-zfs to use the same method? Thanks.

I'm not familiar with freebsd-zfs, sorry.

Regards,
Christian


