Revision as of 02:26, 21 February 2018 by Admin (talk | contribs)

Adaptec SSD Caching Configuration

To get the best performance from an all-SSD RAID array on an Adaptec RAID card, disable the card's write-back cache and read cache with the arcconf CLI. You can leave the per-drive SSD write caches enabled. In my experience, enabling the read and write caches on the RAID card lowers performance in most cases when the array is built from SSDs.

Some Adaptec RAID cards do not let you turn write caching off entirely and only offer a choice between "write through" and "write back" caching. In that case choose write through: it does not lower performance, and writes bypass the RAID card's cache and go straight to the SSDs. A write is only considered complete once the SSDs have written the data to their flash, or to their own write cache if it is enabled.

Most higher-end SSDs have power loss protection for their internal write caches, and with those drives there is no need to worry about data loss. However, if your SSDs do not have an on-board capacitor to flush the cache and you must be 100% sure that writes hit the disk, you will either have to disable the SSD write cache or get a RAID card with a BBU. I recommend running the two arcconf commands below, which disable write-back caching and read caching on the RAID card.

/usr/StorMan/arcconf SETCACHE 1 LOGICALDRIVE 0 wt
/usr/StorMan/arcconf SETCACHE 1 LOGICALDRIVE 0 roff

If you are using Ubuntu, I suggest running the latest Linux kernel to get the best I/O performance. In my experience, any kernel newer than 3.16 provides a significant I/O boost compared to older kernels.
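A quick way to check whether the running kernel clears that 3.16 threshold is sketched below. The function name version_newer_than is made up for this example, and the comparison relies on `sort -V` (version sort), which is assumed to be available on your system.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is strictly newer than version $2.
# Relies on `sort -V` (version sort) to order the two strings.
version_newer_than() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Reduce `uname -r` to major.minor (e.g. 4.4.0-210-generic -> 4.4).
running="$(uname -r | cut -d- -f1 | cut -d. -f1,2)"

if version_newer_than "$running" "3.16"; then
    echo "kernel $running is newer than 3.16"
else
    echo "kernel $running is 3.16 or older; consider upgrading"
fi
```

Only the major.minor part is compared, so a 3.16.x point release is treated as 3.16 itself rather than as something newer.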

Adaptec Arcconf CLI Cheatsheet

Display Adaptec RAID card config information and health

This is one of the most basic arcconf commands. It displays general RAID card configuration information such as the number of arrays, the drives in each array, their health status, and the basic configuration of the RAID. If you need to know whether read or write caching is enabled, you can use this command to check. The "1" is the controller ID; if you have more than one Adaptec RAID card installed, or want more specific information, you can use the options below.

/usr/StorMan/arcconf getconfig 1

AD—Adapter information only
LD—Logical drive information only
PD—Physical device information only
MC—maxCache information only
AL—All information (optional)

Install arcconf CLI

Installing the arcconf utility is pretty simple. This install method works with Ubuntu, CentOS, or really any modern Linux distro: all we are doing is extracting the binary and giving it executable permissions so we can talk to the Adaptec RAID card. I recommend downloading the latest version of arcconf.

chmod +x linux_x64/cmdline/arcconf
linux_x64/cmdline/arcconf $command

Once you have downloaded the arcconf zip file, unzip it somewhere that makes sense, make the binary executable, and start using it. I recommend placing it somewhere under /root/, such as /root/arcconf.

Update Firmware / BIOS for Adaptec 71605e

This updates the Adaptec (Microsemi) 71605e to the latest firmware and BIOS version (as of August 26, 2016). Please note that a reboot is required. Direct wget links to the firmware probably will not work because you need to accept an agreement first, but the command below should still work once you have the updated firmware file on the server.

/usr/StorMan/arcconf romupdate 1 /$path/$to/as716E01.ufi

I've found that, in general, updating the BIOS, firmware, and driver for RAID cards improves performance and stability. Updating the driver is especially important if you are using one of the newer OSes like RedHat 7 / CentOS 7.

How to Configure Per Device (drive) Write Caches

With Adaptec RAID cards you can enable or disable the write cache of each device individually. Disabling a drive's write cache will significantly reduce performance, but if you must ensure that every write goes to disk, this is how you do it. If you are experiencing poor write performance with an SSD RAID, check whether the drive write caches are enabled; if they are not, consider enabling them for better random write performance. To get a list of the devices and the channel they are on, run the command below.

/usr/StorMan/arcconf getconfig 1 | grep  Channel

   Channel description                      : SAS/SATA
         Reported Channel,Device(T:L)       : 0,12(12:0)
         Reported Channel,Device(T:L)       : 0,13(13:0)
         Reported Channel,Device(T:L)       : 0,14(14:0)
         Reported Channel,Device(T:L)       : 0,15(15:0)

Once you have the channel and device info, you can run the command below, replacing $Channel and $Device_Number with the actual numbers (like 0 12). Specify WT at the end; this stands for Write Through, which is the same thing as disabling the write cache altogether.

/usr/StorMan/arcconf setcache 1 DEVICE $Channel $Device_Number WT

To enable device write caches, run this command:

/usr/StorMan/arcconf setcache 1 DEVICE $Channel $Device_Number WB
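With more than a couple of drives, typing each channel and device pair by hand gets tedious. The sketch below pulls the pairs out of the "Reported Channel,Device(T:L)" lines so they can be fed to setcache in a loop. The helper name list_channel_devices is invented for this example, and the line format is assumed to match the sample output shown above; check the output of your own arcconf version before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read arcconf getconfig output on stdin and print one
# "channel device" pair per line, taken from lines such as
# "Reported Channel,Device(T:L)       : 0,12(12:0)".
list_channel_devices() {
    sed -n 's/^ *Reported Channel,Device(T:L) *: *\([0-9][0-9]*\),\([0-9][0-9]*\)(.*$/\1 \2/p'
}

# Usage sketch (commented out so the block runs without a controller present).
# Swap WB for WT to disable the drive write caches instead.
# /usr/StorMan/arcconf getconfig 1 | list_channel_devices |
# while read -r channel device; do
#     /usr/StorMan/arcconf setcache 1 DEVICE "$channel" "$device" WB
# done

# Demonstration on a few lines of the sample output shown above.
printf '%s\n' \
    '   Channel description                      : SAS/SATA' \
    '         Reported Channel,Device(T:L)       : 0,12(12:0)' \
    '         Reported Channel,Device(T:L)       : 0,13(13:0)' |
    list_channel_devices
```

The demonstration prints "0 12" and "0 13", which are the same values you would otherwise type after DEVICE.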

I ran some quick benchmarks to find out just how big an impact the drive write caches have on performance. It turns out that random write performance is roughly 86x better with the drive write caches enabled (95,390 IOPS versus 1,104 IOPS). To test this I used the following server:

Intel E3-1271 v3
Adaptec 71605e, using RAID 10 (raid card read and write caches disabled)
1 x Crucial MX200 250GB SSD (OS drive)
4 x Crucial MX200 250GB SSD (Data Drive used for testing)
Ubuntu 14.04.4 LTS

[Image: Adaptec 71605e, Crucial MX200 250GB, RAID 10 drive write cache benchmarks]

This is the fio command I used to test random write performance, with a file size of 8GB and a run time of 300 seconds (5 minutes).

fio --time_based --name=benchmark --size=8G --runtime=300 --filename=rand --ioengine=libaio --randrepeat=0 --iodepth=32 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting

Here's what 4K Random Write performance looks like with each SSD's write cache DISABLED: IOPS=1104

benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
Starting 4 processes
benchmark: Laying out IO file(s) (1 file(s) / 8192MB)
Jobs: 4 (f=4): [wwww] [100.0% done] [0KB/4043KB/0KB /s] [0/1010/0 iops] [eta 00m:00s]
benchmark: (groupid=0, jobs=4): err= 0: pid=1734: Thu Apr  7 19:06:35 2016
  write: io=1294.8MB, bw=4417.7KB/s, iops=1104, runt=300158msec
    slat (usec): min=3, max=507427, avg=287.04, stdev=7783.40
    clat (msec): min=1, max=1235, avg=115.62, stdev=126.09
     lat (msec): min=1, max=1235, avg=115.90, stdev=126.29
    clat percentiles (usec):
     |  1.00th=[ 1144],  5.00th=[ 1208], 10.00th=[ 1224], 20.00th=[ 1624],
     | 30.00th=[ 2928], 40.00th=[34048], 50.00th=[64256], 60.00th=[101888],
     | 70.00th=[246784], 80.00th=[250880], 90.00th=[252928], 95.00th=[284672],
     | 99.00th=[493568], 99.50th=[561152], 99.90th=[839680], 99.95th=[937984],
     | 99.99th=[1056768]
    bw (KB  /s): min=   28, max= 2160, per=25.25%, avg=1115.47, stdev=414.70
    lat (msec) : 2=24.56%, 4=7.50%, 10=1.68%, 20=2.15%, 50=8.82%
    lat (msec) : 100=14.89%, 250=16.55%, 500=22.91%, 750=0.76%, 1000=0.15%
    lat (msec) : 2000=0.02%
  cpu          : usr=0.63%, sys=2.11%, ctx=317036, majf=0, minf=35
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=331450/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=1294.8MB, aggrb=4417KB/s, minb=4417KB/s, maxb=4417KB/s, mint=300158msec, maxt=300158msec

Disk stats (read/write):
    dm-0: ios=0/434647, merge=0/0, ticks=0/62936824, in_queue=68641052, util=100.00%, aggrios=0/339046, aggrmerge=0/95607, aggrticks=0/37729104, aggrin_queue=37728696, aggrutil=100.00%
  sda: ios=0/339046, merge=0/95607, ticks=0/37729104, in_queue=37728696, util=100.00%

Here's what 4K Random Write performance looks like with each SSD's write cache ENABLED: IOPS=95390

benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
Starting 4 processes
benchmark: Laying out IO file(s) (1 file(s) / 8192MB)
Jobs: 2 (f=0): [wwEE] [100.0% done] [0KB/366.4MB/0KB /s] [0/93.8K/0 iops] [eta 00m:00s]
benchmark: (groupid=0, jobs=4): err= 0: pid=1235: Thu Apr  7 18:24:54 2016
  write: io=111786MB, bw=381561KB/s, iops=95390, runt=300002msec
    slat (usec): min=1, max=33479, avg= 4.78, stdev=42.75
    clat (usec): min=44, max=537616, avg=1336.39, stdev=2427.25
     lat (usec): min=47, max=537619, avg=1341.24, stdev=2427.64
    clat percentiles (usec):
     |  1.00th=[   84],  5.00th=[  159], 10.00th=[  266], 20.00th=[  494],
     | 30.00th=[  756], 40.00th=[ 1032], 50.00th=[ 1320], 60.00th=[ 1608],
     | 70.00th=[ 1880], 80.00th=[ 2128], 90.00th=[ 2352], 95.00th=[ 2448],
     | 99.00th=[ 2544], 99.50th=[ 2576], 99.90th=[ 5600], 99.95th=[11328],
     | 99.99th=[22912]
    bw (KB  /s): min=25510, max=99248, per=25.09%, avg=95722.49, stdev=4947.63
    lat (usec) : 50=0.01%, 100=1.84%, 250=7.64%, 500=10.72%, 750=9.70%
    lat (usec) : 1000=9.00%
    lat (msec) : 2=35.57%, 4=25.41%, 10=0.05%, 20=0.05%, 50=0.01%
    lat (msec) : 750=0.01%
  cpu          : usr=4.39%, sys=13.20%, ctx=21665880, majf=0, minf=42
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=28617240/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=111786MB, aggrb=381560KB/s, minb=381560KB/s, maxb=381560KB/s, mint=300002msec, maxt=300002msec

Disk stats (read/write):
    dm-0: ios=0/28704755, merge=0/0, ticks=0/39245604, in_queue=39416440, util=100.00%, aggrios=0/28622340, aggrmerge=0/104721, aggrticks=0/38156236, aggrin_queue=38159416, aggrutil=100.00%
  sda: ios=0/28622340, merge=0/104721, ticks=0/38156236, in_queue=38159416, util=100.00%
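To put the two runs side by side, the short calculation below works out the speedup from the reported IOPS and cross-checks the bandwidth figures against the 4 KB block size. The variable names are only for this example; the numbers are copied from the fio output above.

```shell
#!/bin/sh
# IOPS figures copied from the two fio runs above.
iops_cache_off=1104
iops_cache_on=95390

# Speedup factor from enabling the drive write caches.
speedup="$(LC_ALL=C awk -v on="$iops_cache_on" -v off="$iops_cache_off" \
    'BEGIN { printf "%.1f", on / off }')"
echo "speedup: ${speedup}x"

# Sanity check: with a 4 KB block size, bandwidth in KB/s should be about IOPS * 4.
echo "expected bandwidth, cache off: $((iops_cache_off * 4)) KB/s"
echo "expected bandwidth, cache on:  $((iops_cache_on * 4)) KB/s"
```

The expected bandwidths (4416 KB/s and 381560 KB/s) line up with the bw values fio reported, which is a useful check that the IOPS numbers were read correctly.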

Set Adaptec Write Cache Mode

If you are using SSD RAID with an Adaptec card, make sure that you disable write-back caching. Write-back caching on the card can reduce performance pretty significantly with SSDs, at least in my tests. I suggest running some benchmarks yourself to see which cache mode works best for your environment. Typically you can only choose between Write Through and Write Back caching; choose Write Through, which is basically the same as disabling the cache.

If you are using SSDs in your RAID, set Logical Drive 0 to Write Through (same as no write cache)

/usr/StorMan/arcconf SETCACHE 1 LOGICALDRIVE 0 wt

If you are using spinning drives and have a BBU installed then set Logical Drive 0 to WriteBack

/usr/StorMan/arcconf SETCACHE 1 LOGICALDRIVE 0 wb

Set Adaptec Read Cache Mode

Set Adaptec Logical Drive 0 to use Read Caching

/usr/StorMan/arcconf SETCACHE 1 LOGICALDRIVE 0 ron

Set Logical Drive to NO Read Caching

/usr/StorMan/arcconf SETCACHE 1 LOGICALDRIVE 0 roff

Verify all drives are recognized by arcconf

This command lists all the drives recognized by the controller, so you can tell whether any drives are missing.

/usr/StorMan/arcconf GETCONFIG 1 PD

Run manual drive scan using arcconf

If you are missing drives in your RAID, you can use arcconf to attempt a rescan of all arrays on an Adaptec RAID card in your server. This can take some time depending on the number of drives and the condition of the array.

/usr/StorMan/arcconf RESCAN 1

Enable or Disable Adaptec Arcconf Consistency Check

You can modify the frequency of RAID consistency checks using arcconf. After "CONSISTENCYCHECK" you enter the RAID controller ID; if you only have a single RAID card, just use a value of "1". After "PERIOD" you specify the number of days between checks. The lowest value you can use for "PERIOD" is 10 days, and the highest is 365 days. Adaptec runs the consistency check in the background, but I would still expect slightly slower response times while the check is running, so make sure you don't schedule it too frequently.

/usr/StorMan/arcconf CONSISTENCYCHECK 1 PERIOD 30
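If you set the period from a script, it can help to reject values outside the 10 to 365 day range before calling arcconf. This is a minimal sketch; valid_check_period is a made-up helper name, and the block only prints the command it would run rather than touching a controller.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when $1 is a whole number of days in the
# 10-365 range described above.
valid_check_period() {
    case "$1" in
        '' | *[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 10 ] && [ "$1" -le 365 ]
}

days=30
if valid_check_period "$days"; then
    echo "would run: /usr/StorMan/arcconf CONSISTENCYCHECK 1 PERIOD $days"
else
    echo "PERIOD must be between 10 and 365 days, got: $days" >&2
fi
```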

If you don't care about your data and want to disable any checks, simply use "CONSISTENCYCHECK 1 OFF"

/usr/StorMan/arcconf CONSISTENCYCHECK 1 OFF

Get Adaptec RAID SMART Information

To check the SMART stats for all SSDs or HDDs in a RAID array, use the arcconf command below. It should display per-drive health stats and remaining life.

/usr/StorMan/arcconf GETSMARTSTATS 1

Get RAID card bios, firmware and driver version info with arcconf cli

If you want to view detailed information about your RAID card's BIOS, firmware and driver version you can use "GETVERSION" which will by default display information for all Adaptec RAID cards on the server.

/usr/StorMan/arcconf GETVERSION 

How to scan for added or removed drives with arcconf

If you recently added or removed drives from an Adaptec RAID array and want to make sure arcconf picked up all the changes, run the rescan command. It makes the RAID card rescan all of the drives attached to a controller to check for changes and to make sure all SSDs or HDDs are correctly added to the array. This scan can take 10 minutes or more if you have a large number of drives in the RAID array, so plan for that before telling arcconf to rescan everything.

/usr/StorMan/arcconf RESCAN 1

How to use Arcconf to update ROM / UFI

To update your RAID card's ROM, or UFI, or whatever you want to call it, run the "ROMUPDATE" command. You should specify the controller ID when you do this; if you only have one RAID card in your server, use "1". Run arcconf ROMUPDATE from the directory that contains the UFI file, otherwise it may fail.

/usr/StorMan/arcconf ROMUPDATE 1 AC12356.UFI

Use arcconf to save RAID card configuration to xml

If you want to save the current configuration of your Adaptec RAID card to an xml file for re-use later on all you need to do is run "SAVECONFIG" then specify where the configuration xml file should be saved. You can also specify where to create a log of this event in case something fails during the save of your configuration.

/usr/StorMan/arcconf SAVECONFIG $/path/to/save/config.xml $/path/to/save/configfile.log

How to debug and troubleshoot with arcconf

From time to time you may encounter random lockups or strange issues caused by something odd going on in your RAID array. If you want assistance from Adaptec, or just want verbose RAID card logs, you can use the "SAVESUPPORTARCHIVE" command. It grabs a large set of verbose logs and statistics about your RAID card and dumps everything into a directory, which you can then compress and send to Adaptec so they can review the logs for a proper diagnosis.

/usr/StorMan/arcconf SAVESUPPORTARCHIVE 

You can view the output in the following directory (if you are using Linux)

cd /var/log/

How to enable or disable Adaptec RAID Card NCQ

If you want to enable NCQ (Native Command Queuing) for your RAID card's logical devices, enter the controller ID followed by "ENABLE". If you only have one controller, use "1".

/usr/StorMan/arcconf SETNCQ 1 ENABLE

If you want to disable NCQ for your RAID card's logical devices, enter the controller ID in the same way, but this time follow it with "DISABLE". Generally speaking you want to leave this setting at its default value, which should be enabled.

/usr/StorMan/arcconf SETNCQ 1 DISABLE

Set Performance Mode for Adaptec RAID using Arcconf

Newer Adaptec cards (7 series and up) let you change their "performance" mode. The default setting is Dynamic, which monitors RAID I/O usage over time and configures the RAID controller settings based on previous activity. For the most part, the dynamic mode does a good job.

/usr/StorMan/arcconf SETPERFORM 1 $performance_mode

There are 4 performance modes to choose from.

1 -- Dynamic/Default Performance mode. Dynamically updated based on historical usage
2 -- OLTP/Database Performance mode. If you run mainly MySQL or similar type Database applications then this mode might provide better random read and write performance
3 -- Big Block Bypass. This mode bypasses the RAID DRAM write cache and goes directly to storage. 
4 -- Custom option

If you wanted to select performance mode 2, for OLTP and Databases, run this arcconf command

/usr/StorMan/arcconf SETPERFORM 1 2

Adaptec 71605e Optimization

General Information

User Guide

This card seems to perform very well with SSDs.

The configuration below seems to offer the best overall performance for an SSD RAID. I usually use RAID 10 with SSDs and stay away from parity-based RAID levels like 5 or 6, which can really reduce write performance because of the constant parity writes; those extra writes also shorten the life of an SSD. Make sure that the 71605e is using the latest driver, which should be 1.2.1. If you are using an older driver and notice random crashes or hangs that mention aac_fib_timeout or something similar, installing the latest driver is the way to resolve the issue; otherwise you will continue to encounter random kernel panics. This is not an issue with Adaptec's cards specifically. It is simply a matter of always using the latest drivers with any type of add-on PCIe card.

Logical device information
Logical device number 0
   Logical device name                      : raid
   Block Size of member drives              : 512 Bytes
   RAID level                               : 10
   Status of logical device                 : Optimal
   Size                                     : 915190 MB
   Parity space                             : 915200 MB
   Stripe-unit size                         : 256 KB
   Read-cache setting                       : Disabled
   Read-cache status                        : Off
   Write-cache setting                      : Disabled
   Write-cache status                       : Off
   Partitioned                              : Yes
   Protected by Hot-Spare                   : No
   Bootable                                 : Yes
   Failed stripes                           : No
   Power settings                           : Disabled

How to upgrade Adaptec RAID card firmware and drivers CentOS 6

This may be outdated, but the overall process should still be the same: wget the latest drivers and RPMs, then modify the commands below accordingly. Updating to the latest driver and firmware is the best way to resolve any stability issues you might be seeing, and updated firmware typically helps performance as well.

tar zxvf aacraid-dkms-1.2.1-40300.tgz
tar zxvf aacraid_linux_rpms_v1.2.1-40300.tgz

yum install gcc kernel-headers kernel-devel
rpm -iv dkms-
rpm -iv aacraid-
rpm -iv aacraid-1.2.1-40300.rpm
cd /usr/src/aacraid-
dkms add -m aacraid -v
dkms build -m aacraid -v
dkms install -m aacraid -v

/usr/StorMan/arcconf getconfig 1

Update Adaptec RAID Driver RedHat 7 and CentOS 7

Once you have downloaded the driver, extract it using tar zxvf, then run the commands below. You will need the latest EPEL repo to get the correct DKMS package, but once that is installed the rest of the installation should be pretty easy. If you are using CentOS 7, make sure you are also running the latest kernel version, which should be 3.10.x. Also make sure that you have selected the best tuned-adm profile and that the I/O scheduler is set to noop, especially with large SSD RAIDs on the 71605e.

rpm -i aacraid-
rpm -Uvh
yum install dkms kernel-devel-$(uname -r) kernel-headers-$(uname -r)
dkms add -m aacraid -v
dkms build -m aacraid -v
dkms install -m aacraid -v
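To check which I/O scheduler is active, read the queue/scheduler file for the block device; the kernel marks the active scheduler with square brackets. The sketch below extracts that name. The helper name active_scheduler is made up, and "sda" in the commented lines is an assumption, so substitute the device your RAID volume appears as. On newer blk-mq kernels the equivalent of noop is called "none".

```shell
#!/bin/sh
# Hypothetical helper: read the contents of a queue/scheduler file on stdin and
# print the active scheduler, which the kernel marks with square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Demonstration on a typical line from an older (non blk-mq) kernel.
echo 'noop deadline [cfq]' | active_scheduler

# Against a real disk ("sda" is an assumption; substitute your RAID device):
# active_scheduler < /sys/block/sda/queue/scheduler
# echo noop > /sys/block/sda/queue/scheduler   # as root; does not persist across reboots
```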

Latest Adaptec arcconf cli user guide

If you are looking for more detailed arcconf CLI commands you can find a somewhat recent version by clicking on the link below.