Hardware

PCIe Flash

Virident FlashMax 2 1.1TB

Setup and Installation

Installation is pretty simple for this card. Just make sure you have proper cooling, as the card gets quite hot under heavy IO. It will start to throttle performance once it hits the mid-70C range, so if you notice poor performance, make sure the card is being cooled properly.
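One way to keep an eye on this while the card is under load is to grep the vgc-monitor output (shown further down) in a watch loop:

watch -n 5 'vgc-monitor -d vgca | grep -E "Temperature|Throttle"'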

For CentOS 6.5, all you should need to do is install the following packages:

rpm -ivh kmod-vgc-redhat6.1+-4.1-68415.C9B.x86_64.rpm
rpm -ivh vgc-utils-4.1.68415.C9B.x86_64.rpm

Make sure that the BIOS is configured correctly. If you allow power saving features, you might notice poor performance. Disable the settings below; you can double-check them from within Linux with the commands after this list.

Make Sure CPU / SYS Fans are set to high
Disable: CPU performance states, C States, EIST
Disable: C1E States / C1 enhanced states
Disable: Active State Power Management
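A quick sanity check from a booted system, assuming the intel_idle driver and a cpufreq governor are in use (exact paths vary by kernel):

##If this file is missing entirely, the kernel is not doing any speed scaling
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

##Reports the deepest C state the kernel will use
cat /sys/module/intel_idle/parameters/max_cstate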

Configuration and Commands

The Virident FlashMax 2, 1.1TB card has two operating modes. One is "High Capacity", which provides more usable space but slightly reduces performance. The other is "Max Performance", which gives you slightly less usable space but better performance. To set the card to Max Performance, run the following (NOTE: this will wipe the card, so be careful!):

vgc-config -d vgca -n 1 -m maxperformance

## Or, for Capacity

vgc-config -d vgca -n 1 -m maxcapacity

To view basic health statistics for the Virident FlashMax 2:

vgc-monitor -d vgca

The output provides a lot of useful information, such as the temperature, whether the card is being throttled, the total amount of READ and WRITE data, the life remaining, and the reserves remaining. Pretty basic, but very useful!

vgc-monitor: 4.1.68415.C9B

Driver Uptime: 0:39
Card Name      Num Partitions      Card Type         Status
vgca           1                   VIR-M2-LP-1100-2B Good

  Serial Number      : $Serial           
  Card Info          : Part: $number
                       Rev : FlashMAX II 60174, x8 Gen2
                       Boot ROM : not applicable
  Temperature        : 67 C (Safe)
  Temp Throttle      : Inactive
  Card State Details : Normal
  Action Required    : None

  Partition      Usable Capacity     RAID      
  vgca0          923 GB              enabled   

    Mode                          : maxperformance      
    Total Flash Bytes             : 145248278787072 (145.25TB) (reads)
                                    218961891983360 (218.96TB) (writes)
    Remaining Life                : 99.03%              
    Partition State               : READY               
    Flash Reserves Left           : 100.00%

FusionIO

Install

Usually it is best to just build a new RPM for your kernel version. First, make sure the needed packages are installed:

yum install kernel-headers-`uname -r` kernel-devel-`uname -r` gcc rsync rpm-build make

Rebuild the RPM using your Kernel, then install the driver:

rpmbuild --rebuild iomemory-vsl-3.2.4.1086-1.0.el6.src.rpm
rpm -Uvh /root/rpmbuild/RPMS/x86_64/iomemory-vsl-2.6.32-358.11.1.el6.x86_64-3.2.4.1086-1.0.el6.x86_64.rpm

Once all the packages are downloaded, run this to install the RPMs (names might differ depending on the OS):

rpm -Uvh fio-common-3.2.4.1086-1.0.el6.x86_64.rpm fio-firmware-fusion-3.2.4.20130604-1.noarch.rpm fio-sysvinit-3.2.4.1086-1.0.el6.x86_64.rpm fio-util-3.2.4.1086-1.0.el6.x86_64.rpm libvsl-3.2.4.1086-1.0.el6.x86_64.rpm

Once all the packages are installed, make sure that the card is updated with the latest firmware.

fio-update-iodrive fusion_3.2.4-20130604.fff

Make sure the module is loaded

modprobe iomemory-vsl

Check the status to make sure the drive is happy

fio-status
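fio-status also accepts -a to dump everything it knows about the card, which is handy for a deeper look:

fio-status -a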

Optimization

To improve Write performance, you can under-provision the device so that it only uses 80 - 90% of the available space:

fio-format -s 90% /dev/fct0

or

fio-format -s 80% /dev/fct0


Things to check for and disable in the BIOS (mileage may vary)

Processor and Clock Options ->

Simultaneous Multi-Threading -> Disabled
Intel Turbo Boost -> Disabled
C1E Support -> Disabled
Intel C-State Tech -> Disabled

Advanced Chipset Control -> CPU Bridge Configuration -> Throttling Closed Loop -> Disabled

Hardware Health Configuration -> FAN Speed Control Modes -> Full Speed / FS

Disable IRQ Balance and CPU speed scaling

apt-get -y remove irqbalance

cp /etc/init.d/ondemand /etc/init.d/ondemand.bak
sed -i 's|echo -n ondemand|echo -n performance|' /etc/init.d/ondemand
/etc/init.d/ondemand stop
/etc/init.d/ondemand start
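The above is Debian/Ubuntu flavored. On CentOS 6, the rough equivalent is to remove irqbalance with yum and disable the cpuspeed service (covered again in the CPU section below):

yum -y remove irqbalance
service cpuspeed stop
chkconfig cpuspeed off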

Splitting The Drive

You can split one FusionIO device into two virtual devices if you want to. This should not affect performance much, and in some cases it can actually double the overall throughput of the drive, since you can run two workloads in parallel against the same device. To split the device, do the following:

fio-update-iodrive --split -d /dev/fct0 <firmware-path>
reboot
modprobe iomemory-vsl
fio-format /dev/fct0 /dev/fct1
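After the reboot and fio-format, each half shows up as its own block device. Assuming the usual fioa / fiob naming (adjust if your system names them differently), filesystems go on each one as normal:

mkfs.ext4 /dev/fioa
mkfs.ext4 /dev/fiob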

Raid Controllers

Adaptec 7 Series Info

71605e

Hardware Information

Adaptec

Adaptec ASR 6405e

  • https://www.adaptec.com/en-us/products/series/6e/
  • 4 internal ports
  • 6 Gb/s throughput at each port
  • PMC-Sierra PM8013 Dual Core RAID on Chip (ROC)
  • SAS 2.0 interfaces and PCIe Gen 2 Host Connection
  • Hybrid RAID 1 & 10: SSD + HDD for Maximum Performance and Reliability
  • Enclosure management support via LED header and SES2/SGPIO

DataSheet https://www.adaptec.com/nr/pdfs/ds_series_6e.pdf

Adaptec ASR 5805 512MB

DataSheet http://www.adaptec.com/nr/pdfs/ds_series5.pdf

Adaptec ASR 2405

DataSheet https://www.adaptec.com/nr/pdfs/ds_series2.pdf

LSI

LSI MegaRAID SAS 9260-4i

DataSheet http://www.lsi.com/downloads/Public/MegaRAID%20SAS/MegaRAID%20SAS%209260-4i/MR_SAS9260-4i_PB_FIN_071212.pdf

LSI MegaRAID SAS 9260-8i

DataSheet http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/MR_SAS9260-8i_9260DE-8i_PB_FIN_090110.pdf

LSI MegaRAID SAS 9260-16i

DataSheet http://www.lsi.com/downloads/Public/RAID%20Controllers/RAID%20Controllers%20Common%20Files/MegaRAID_SAS9260_16i_web.pdf

LSI MegaRAID SAS 8704EM2

LSI MegaRAID SAS 8704ELP

Configuration Terminology

Read Policies

Always Read Ahead

  • This specifies that the controller uses read-ahead if the two most recent disk accesses occurred in sequential sectors. If all read requests are random, the algorithm does not read ahead; however, all requests are continually evaluated for possible sequential operation.

No Read Ahead

  • Only the requested data is read and the controller does not read ahead any data.

Write Cache Policies

Write-Through

  • Caching strategy where data is committed to disk before a completion status is returned to the host operating system
  • Considered more secure, since without a battery-backed cache a power failure is less likely to cause undetected write data loss
  • Data moves directly from the host to the disks, avoiding an intermediate copy into cache, which can improve overall performance for streaming workloads if Direct IO mode is set

Write-Back

  • A caching strategy where write operations result in a completion status being sent to the host operating system as soon as the data is written to the RAID cache. Data is written to the disk when it is forced out of controller cache memory.
  • Write-Back is more efficient if the temporal and/or spatial locality of the requests is smaller than the controller cache size.
  • Write-Back is more efficient in environments with “bursty” write activity.
  • Battery backed cache can be used to protect against data loss as a result of a power failure or system crash.

Data Placement Policies

Direct IO

  • All read data is transferred directly to host memory bypassing RAID controller cache. Any Read Ahead data is cached.
  • All write data is transferred directly from host memory bypassing RAID controller cache if Write-Through cache mode is set
  • Recommended for all configurations

Cached IO

  • All read and write data passes through controller cache memory on its way to or from the host (including write data in Write-Through mode)
  • Required ONLY for CacheCade v1.1 read-only caching; not recommended for CacheCade v2.x and higher or any other configurations

LSI suggested configuration

HDD / SAS

  • Stripe Size: 256KB
  • Read Policy: Always Read Ahead
  • Write Policy: Write Back for transactional / random workloads
  • IO Policy: Direct IO
  • Disk Cache Policy: Enabled
  • Adaptive Read Ahead suggested for HDD configs

SSD

  • Stripe Size: 256KB
  • Read Policy: No Read Ahead
  • Write Policy: Write Through
  • IO Policy: Direct IO
  • Disk Cache Policy: Enabled
  • No Read Ahead suggested for all SSDs

LSI Suggested Settings for SSDs

Transactional:

  • RAID Write Cache -- Disabled
  • RAID Read Cache -- No Read Ahead
  • Stripe Size -- 64KB

Streaming:

  • RAID Write Cache -- Enabled
  • RAID Read Cache -- Always Read Ahead
  • Stripe Size -- 64 KB

Write Through

  • More Secure, prevents loss of data during power outage
  • Data moves directly from host to the disks, avoiding the RAID card's Cache
  • Overall, improves streaming (sequential) reads and writes

Write Back

  • Less secure, data can be lost during a power outage
  • Data is written from the host to the RAID card's Cache, then to disk
  • Useful if the data being written is smaller than the controller's Cache
  • More efficient with "bursty" write activity
  • BBUs can be used to negate power loss concerns, but are expensive

Direct IO

  • All read data is transferred directly to the host's memory, bypassing RAID card's Cache
  • Any Read Ahead data is cached
  • All write data is transferred directly from host's memory, bypassing RAID card's Cache if Write Through is enabled
  • LSI suggests this for ALL configurations

Cached IO

  • ALL read and write data passes through the RAID card's Cache to or from the host, including Write Through data
  • Required only for CacheCade v1.1 read-only caching
  • LSI does not suggest this for CacheCade v2 or higher, or any other configurations.

SSD

Enabling and Testing Trim

  • Most modern SSDs have a built-in, automatic garbage collection system, so for the most part they take care of themselves. There is also file system support when using EXT4: when files are removed, TRIM is automatically run. This might have some performance impact on systems that delete files constantly, but TRIM is very important for the overall health of the drive.

To enable TRIM for the file system:

vim /etc/fstab

##Find out which partitions are on the SSD; if there is only one drive in the system, it should be /
##Add the following to the mount options:

discard
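An example fstab entry with discard added might look like this (the UUID is just illustrative):

UUID=b5c7bed7-58f1-4d03-88f4-15db4e367fa0  /  ext4  defaults,discard  0  1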
  • Reboot the system for this to take effect. Once it is back online, check that discard shows up in the mount options:

mount

To test out TRIM, do the following:

dd if=/dev/urandom of=tempfile count=100 bs=512k oflag=direct

##Find out what the STARTING LBA address is of the file:
hdparm --fibmap tempfile

##Read the first address using the following command (change /dev if needed)
hdparm --read-sector $LBA_address /dev/sda

##There should be a bunch of numbers there, now remove the file:
rm tempfile
sync

##Use the same command as above to re-read the LBA; it should now be all zeros.
hdparm --read-sector $LBA_address /dev/sda
  • If you get all zeros, then awesome, TRIM is enabled. If you do not get all zeros, then something is not working correctly.

General Performance Tuning

  • In /etc/fstab, add discard to the main partition's mount options to enable TRIM

Check IO scheduler using:

cat /sys/block/sda/queue/scheduler

There are two different options here. I will need to test both, but either one should perform better than the default (CFQ):

deadline:

echo deadline > /sys/block/sda/queue/scheduler
echo 1 > /sys/block/sda/queue/iosched/fifo_batch

noop:

echo noop > /sys/block/sda/queue/scheduler

You can also make the change persistent using one of the following methods:

Add block/sda/queue/scheduler = noop to /etc/sysfs.conf,
OR add elevator=noop to the kernel boot parameters in /etc/default/grub (assuming only 1 SSD is in use on the system)

Example Grub line:

linux     /boot/vmlinuz-2.6.31-15-generic root=UUID=b5c7bed7-58f1-4d03-88f4-15db4e367fa0 ro   quiet splash elevator=noop
  • Add noatime to the SSD mount in /etc/fstab
  • Also mount /tmp on RAM (tmpfs) to increase the lifespan of the drive. Do this in /etc/fstab (see the example entries after this list)
  • Check the motherboard for on-board disk caching. If it is set to Write-Through, change it to Write-Back. Once enabled in the BIOS, you can change this on the fly with the following:
##Enable Write-Back caching:
hdparm -W1 /dev/sda

##Disable Write-Back caching:
hdparm -W0 /dev/sda
  • SSDs usually perform better with Write-Back enabled.
  • To make these changes persist between reboots, add the commands to the /etc/rc.local file.
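Example additions for the last few items (the device name is illustrative, adjust as needed):

##/etc/fstab -- mount /tmp on RAM
tmpfs  /tmp  tmpfs  defaults,noatime  0  0

##/etc/rc.local -- re-apply Write-Back caching at boot
hdparm -W1 /dev/sda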

Intel DC S3700 Series

  • www.tomshardware.com/reviews/ssd-dc-s3700-enterprise-storage,3352.html

100 GB Specs

  • eight-channel controller
  • 25 nm HET-MLC NAND
  • Rated at 500MB/s Sequential Read
  • Rated at 200MB/s Sequential Writes
  • 4K Random Read 75,000 IOPS
  • 4K Random Write 19,000 IOPS
  • 1.83 PB rated write endurance

SSD Kernel Block Parameters

For SSD based servers that need the fastest IO possible, it's best to apply these settings. Replace "$dev" with the SSD device.

echo 0 > /sys/block/$dev/queue/add_random
echo 0 > /sys/block/$dev/queue/rotational
echo 2 > /sys/block/$dev/queue/rq_affinity
echo noop > /sys/block/$dev/queue/scheduler
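To apply all four settings across several SSDs in one pass, a simple loop works (the device list here is an assumption, adjust to your system):

for dev in sda sdb; do
    echo 0    > /sys/block/$dev/queue/add_random
    echo 0    > /sys/block/$dev/queue/rotational
    echo 2    > /sys/block/$dev/queue/rq_affinity
    echo noop > /sys/block/$dev/queue/scheduler
done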

SSD BIOS and Grub Optimizations

  • BIOS Tweaks -- Disable C-states in the BIOS
  • Grub Tweaks -- Disable C-states in the boot loader:
intel_idle.max_cstate=0
processor.max_cstate=0

HDD

mdadm (software RAID)

While not required, it's a good idea to create a conf file to make sure the RAID is noted and initialized correctly. The location is /etc/mdadm.conf.

To create a full config file, use the following commands:

sh -c 'echo DEVICE /dev/$disk1 /dev/$disk2 /dev/$disk3 > /etc/mdadm.conf'
sh -c 'mdadm --detail --scan >> /etc/mdadm.conf'
cat /etc/mdadm.conf

Example commands

Create RAID5 out of three disks

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1

Watch the status of a RAID

watch cat /proc/mdstat
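For a more detailed view of a single array, including its state and member disks:

mdadm --detail /dev/md0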

HDparm

##Dumps lots of identifying and status information
hdparm -I /dev/$disk

##Queries the drive’s current power management state
hdparm -C /dev/$disk

##Quick-tests interface bandwidth (no actual disk reads)
hdparm -T /dev/$disk

##Quick-tests overall platter-to-host sequential reads
hdparm -t /dev/$disk

Finding information on the drives:

SCSI

sdparm /dev/sda

Check for disk age:

smartctl -a /dev/sda | grep Power_On_Hours

Run a SMART test (quick):

smartctl -t short /dev/whateverdevice
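The short test usually takes a couple of minutes; once it finishes, view the results with:

smartctl -l selftest /dev/whateverdevice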

CPU

CPU SPEED

/etc/init.d/cpuspeed stop
chkconfig cpuspeed off
cpuspeed -C
grep MHz /proc/cpuinfo 

Set CPU Governor to Performance (Ubuntu)

Install cpufrequtils, then set each CPU to performance. Just adjust the {0..4} range to match however many CPUs you have (it covers CPU numbers 0 through 4).

apt-get install cpufrequtils
for each in {0..4} ; do cpufreq-set -c $each -g performance; done

You can use the powersave governor if you would rather save power or lower heat output for your CPU.

for each in {0..4} ; do cpufreq-set -c $each -g powersave; done
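To confirm the governor actually changed, cpufrequtils also ships with cpufreq-info:

##Prints min frequency, max frequency, and the active governor for CPU 0
cpufreq-info -c 0 -p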


Disable c-states in grub

vim /etc/default/grub

##Modify CMDLINE

GRUB_CMDLINE_LINUX="intel_idle.max_cstate=0 processor.max_cstate=0"

##Save File and update grub

update-grub
reboot

NIC

Update NIC Driver

cd /usr/local/src
wget -q http://downloadmirror.intel.com/9180/eng/e1000-8.0.35.tar.gz
tar zxvf e1000-8.0.35.tar.gz
cd /usr/local/src/e1000-8.0.35/src/
make install
ifdown eth0 && rmmod e1000 && modprobe e1000 && ifup eth0 && /etc/init.d/ipaliases restart

File system commands

Create a new file system

mkfs.ext3 /dev/partition
  • You can use ext4 or whatever you want really; the above is just an example (see the ext4 version below).
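For example, the ext4 equivalent, then mounting it (the mount point is arbitrary):

mkfs.ext4 /dev/partition
mkdir -p /mnt/newfs && mount /dev/partition /mnt/newfs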