[smartmontools-support] Long SMART test trashes RAID read rate.

David Mathog mathog at caltech.edu
Tue Mar 20 00:59:28 CET 2018

Hi all,

I spent today trying to figure out why one of our big PowerEdge systems 
suddenly could not read from a RAID at more than 30Mb/s.  Oddly, while 
that was the rate for one process, it would also run two at that speed.  
The test was just:

   dd if=bigfile bs=8192 of=/dev/null

Go up to four processes and it topped out at around 80Mb/s.  The normal 
read rate is on the order of 400Mb/s.
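
For anyone wanting to repeat the scaling test, here is a rough sketch of 
running several dd readers at once and timing them.  The function name 
parallel_read is made up for illustration, and it assumes all readers 
read the same file.  On a real run the page cache will skew the result 
unless the file is much larger than RAM or the caches are dropped first.

```shell
# Sketch only: start N concurrent dd readers on one file and time them.
# parallel_read FILE N  -> prints "N readers: BYTES bytes total in SECS s"
parallel_read() {
    file=$1
    n=$2
    size=$(wc -c < "$file")      # file size in bytes
    start=$(date +%s)            # whole seconds, so use a big file
    i=0
    while [ "$i" -lt "$n" ]; do
        # Same dd invocation as the test above, run in the background.
        dd if="$file" bs=8192 of=/dev/null 2>/dev/null &
        i=$((i + 1))
    done
    wait                         # block until every reader has finished
    end=$(date +%s)
    echo "$n readers: $((size * n)) bytes total in $((end - start)) s"
}

# Example use (file name as in the test above):
#   parallel_read bigfile 4
```

Dividing the byte total by the elapsed seconds gives the aggregate rate 
for that number of readers.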

After much futzing around it turned out that whenever a long SMART test 
is running on any disk in the system that has this problem (A), the IO 
rate drops to 30Mb/s.  It doesn't matter which disk is running the test, 
and it doesn't get worse if they all are.  The system we have that is 
most like the first one (C) does not have this issue.
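
To check whether a self-test is running on a disk, and to stop one, 
something like the following might help.  This is a sketch, not what was 
actually run here.  It uses the smartctl options -l selftest (show the 
self-test log) and -X (abort a non-captive self-test).  The device names 
and the optional device type, e.g. megaraid,0 for a disk behind a PERC, 
are assumptions that will differ per system.  The SMARTCTL variable is 
only there so the commands can be previewed with SMARTCTL=echo.

```shell
# Sketch only: wrappers around smartctl for inspecting/aborting self-tests.
# Override SMARTCTL (e.g. SMARTCTL=echo) for a dry run that prints the arguments.
SMARTCTL=${SMARTCTL:-smartctl}

# selftest_log DEV [DEVTYPE]: print the self-test log for one disk.
selftest_log() {
    if [ -n "${2:-}" ]; then
        "$SMARTCTL" -d "$2" -l selftest "$1"
    else
        "$SMARTCTL" -l selftest "$1"
    fi
}

# abort_selftest DEV [DEVTYPE]: abort a running non-captive self-test.
abort_selftest() {
    if [ -n "${2:-}" ]; then
        "$SMARTCTL" -d "$2" -X "$1"
    else
        "$SMARTCTL" -X "$1"
    fi
}

# Example use (hypothetical device and index):
#   selftest_log /dev/sda megaraid,0
#   abort_selftest /dev/sda megaraid,0
```

Aborting the test and rerunning the dd read is a quick way to confirm 
that the self-test is what is holding the read rate down.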

            A             C
CentOS      6.7           6.9
RAM         512 Gb        512 Gb
CPUs        56            40            (actually threads)
PowerEdge   T630          T630
Xeon        E5-2695       E5-2650       (both v3)
speed       2.30GHz       2.30GHz
cpufreq?    yes           no
PERC        H730          H730P
SAS disk    ST2000NM0023  ST4000NM0005

There are a bunch of small differences between the two systems so it is
hard to say for sure which is the actual culprit.  I have never seen 
this issue before on another system.

Anybody else seen this?  Is it the disks, the controller, the 
combination of the two???  Cpufreq thrown in for good measure?


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

