[smartmontools-support] "Unexpected sense" errors logged on Dell PERC H700 controllers

Terry Kennedy TERRY at glaver.org
Sun Jun 23 15:48:19 CEST 2019


  All systems are FreeBSD 12 (amd64) and all have smartmontools 7.0. All of
the systems also have Dell PERC H700 controllers. That last part is
important, as those controllers are apparently more picky than the drives
about what sort of commands they will pass through.

  At smartd startup and during the nightly periodic job that checks the 
health of the drives, the controller logs a message like this for each
drive:

mfi0: 14428 (614612552s/0x0002/info) - Unexpected sense: PD 02(e0x20/s2) Path 5000cca02a444485, CDB: 4d 00 40 ff 00 00 00 3e fc 00, Sense: 5/24/00

  Running smartmontools with the "-r ioctl,4" option to generate a trace
shows a difference between the systems that log the unexpected sense errors
and ones that don't.

  System with the problem:

[SNIP]
=== START OF INFORMATION SECTION ===
Vendor:               HITACHI
Product:              HUS156030VLS600
Revision:             E774
Compliance:           SPC-4
User Capacity:        300,000,000,000 bytes [300 GB]
Logical block size:   512 bytes
Rotation Rate:        15000 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca02a444484
Serial number:        LVW6JWKM
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sun Jun 23 09:54:30 2019 EDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
  CAM status=0x1, SCSI status=0x0, resid=0x0
  Incoming data, len=4:
 00     00 00 00 0e
  status=0x0
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
  CAM status=0x1, SCSI status=0x0, resid=0x0
  Incoming data, len=18:
 00     00 00 00 0e 00 02 03 05  06 0d 0e 0f 10 15 18 2f
 10     30 37
  status=0x0
 [log sense: 4d 00 40 ff 00 00 00 3e fc 00 ]
  CAM status=0x8c, SCSI status=0x2, resid=0x0
  Incoming data, len=16124 [only first 256 bytes shown]:
 00     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 30     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 40     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 50     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 60     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 70     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 80     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 90     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 a0     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 b0     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 c0     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 d0     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 e0     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 f0     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
  sense_len=0x20, sense_resid=0x0
  >>> Sense buffer, len=32:
 00     70 00 05 00 00 00 00 18  00 00 00 00 24 00 00 cf
 10     00 03 00 00 f8 23 00 00  00 00 00 00 00 00 00 00
  status=0x2: sense_key=0x5 asc=0x24 ascq=0x0
Log Sense for supported pages and subpages failed [unsupported field in scsi command]
scsiGetSupportedLogPages: number of unreported (standard) log pages: 1 (sub-pages: 0)
[SNIP]
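
  For reference, the sense buffer in the trace above can be picked apart
with a short script. This is only a sketch: it assumes the buffer is
fixed-format sense data as laid out in SPC, and the function name
decode_fixed_sense is made up for this example (it is not part of
smartmontools).

```python
# Sketch: decode the fixed-format sense buffer from the failing trace.
# Field offsets assume the SPC fixed-format sense layout; the name
# decode_fixed_sense is hypothetical.

SENSE_HEX = (
    "70 00 05 00 00 00 00 18 00 00 00 00 24 00 00 cf "
    "00 03 00 00 f8 23 00 00 00 00 00 00 00 00 00 00"
)


def decode_fixed_sense(buf: bytes) -> dict:
    """Return the main fields of a fixed-format (0x70/0x71) sense buffer."""
    response_code = buf[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    info = {
        "sense_key": buf[2] & 0x0F,
        "asc": buf[12],
        "ascq": buf[13],
    }
    # Bytes 15-17 are the sense-key-specific field. For ILLEGAL REQUEST
    # (sense key 5) with the SKSV bit set, they carry a field pointer.
    if info["sense_key"] == 0x5 and buf[15] & 0x80:
        info["error_in_cdb"] = bool(buf[15] & 0x40)  # C/D bit
        if buf[15] & 0x08:                           # BPV bit
            info["bit_pointer"] = buf[15] & 0x07
        info["field_pointer"] = int.from_bytes(buf[16:18], "big")
    return info


sense = decode_fixed_sense(bytes.fromhex(SENSE_HEX))
for name, value in sense.items():
    print(f"{name}: {value}")
```

  If that reading is right, a field pointer of 3 with the C/D bit set
points at byte 3 of the CDB, which in a LOG SENSE command is the subpage
code (0xff here).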

  System without the problem:

[SNIP]
=== START OF INFORMATION SECTION ===
Vendor:               HITACHI
Product:              HUS154530VLS300
Revision:             B598
Compliance:           SPC-3
User Capacity:        300,000,000,000 bytes [300 GB]
Logical block size:   512 bytes
Rotation Rate:        15000 rpm
Logical Unit id:      0x5000cca00966629c
Serial number:        JLWU9K7C
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sun Jun 23 09:56:58 2019 EDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
  CAM status=0x1, SCSI status=0x0, resid=0x0
  Incoming data, len=4:
 00     00 00 00 0e
  status=0x0
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
  CAM status=0x1, SCSI status=0x0, resid=0x0
  Incoming data, len=18:
 00     00 00 00 0e 00 02 03 05  06 0d 0e 0f 10 15 18 2f
 10     30 37
  status=0x0
scsiGetSupportedLogPages: number of unreported (standard) log pages: 1 (sub-pages: 0)
[SNIP]

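  The difference is that on the failing systems smartctl issues a third
LOG SENSE command, requesting subpage 0xff of page 0 (the list of supported
pages and subpages), and that command fails with sense 5/24/00.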
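
  The three CDBs from the failing trace can be broken into fields with
something like the following. It is a sketch that assumes the 10-byte
LOG SENSE layout from SPC; decode_log_sense_cdb is a made-up name for
this example.

```python
# Sketch: split the LOG SENSE CDBs seen in the traces into their fields.
# Byte layout assumed from SPC; decode_log_sense_cdb is hypothetical.

CDBS = [
    "4d 00 40 00 00 00 00 00 04 00",  # first command, both systems
    "4d 00 40 00 00 00 00 00 12 00",  # second command, both systems
    "4d 00 40 ff 00 00 00 3e fc 00",  # third command, failing systems only
]


def decode_log_sense_cdb(cdb: bytes) -> dict:
    """Return the fields of a 10-byte LOG SENSE (opcode 0x4d) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x4D:
        raise ValueError("not a LOG SENSE CDB")
    return {
        "page_control": cdb[2] >> 6,       # 1 = cumulative values
        "page_code": cdb[2] & 0x3F,
        "subpage_code": cdb[3],
        "parameter_pointer": int.from_bytes(cdb[5:7], "big"),
        "allocation_length": int.from_bytes(cdb[7:9], "big"),
    }


decoded = [decode_log_sense_cdb(bytes.fromhex(text)) for text in CDBS]
for text, fields in zip(CDBS, decoded):
    print(text)
    print(
        f"  page=0x{fields['page_code']:02x}"
        f" subpage=0x{fields['subpage_code']:02x}"
        f" alloc_len={fields['allocation_length']}"
    )
```

  All three commands ask for page 0x00. Only the third sets a non-zero
subpage code, and its allocation length of 0x3efc matches the len=16124
shown in the trace.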

  Does anyone have any ideas as to what is going on here, in particular
why smartctl is doing a third LOG SENSE for some drives and not for others?
Note that while the drive models are different, they report the same pages
as available.
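
  The claim that both drives report the same pages can be checked from the
18-byte responses in the two traces. The sketch below assumes the usual
log page header (page code, subpage code, two-byte page length) followed
by one byte per supported page; supported_pages is a made-up helper name.

```python
# Sketch: compare the "supported log pages" responses from both traces.
# Assumes a 4-byte log page header followed by one page code per byte;
# supported_pages is a hypothetical helper.

FAILING_DRIVE = "00 00 00 0e 00 02 03 05 06 0d 0e 0f 10 15 18 2f 30 37"
WORKING_DRIVE = "00 00 00 0e 00 02 03 05 06 0d 0e 0f 10 15 18 2f 30 37"


def supported_pages(resp: bytes) -> list:
    """Return the page codes listed in a supported-log-pages response."""
    page_length = int.from_bytes(resp[2:4], "big")
    return list(resp[4:4 + page_length])


failing = supported_pages(bytes.fromhex(FAILING_DRIVE))
working = supported_pages(bytes.fromhex(WORKING_DRIVE))

print("pages:", " ".join(f"{p:02x}" for p in failing))
print("count:", len(failing))
print("identical:", failing == working)
```

  Both responses decode to the same 14 page codes, so the list of
supported pages alone does not seem to explain the extra command.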

  I can provide more data if requested, and can also provide remote access
to some of the systems in question if needed by a developer.

        Terry Kennedy     http://www.glaver.org      New York, NY USA


