Google PerfKitBenchmarker

From wiki.mikejung.biz

Google PerfKitBenchmarker Overview

Google recently announced PerfKitBenchmarker, a benchmarking framework meant to make it easier to benchmark cloud servers or VPS servers by bundling a large number of benchmarks into one easy-to-use utility. The utility tests "non-tuned" servers, meaning that it uses the same default settings for every VM tested so that performance comparisons between cloud providers stay consistent.

There may be some cases where tweaks are added for a test: "Only in the rare case where there is a common practice like setting the buffer pool size of a database do we change any settings."

Google PerfKitBenchmarker Installation

To install PerfKitBenchmarker on Ubuntu 14.10, use git to clone the current build and pip to install the required dependencies. Please note that PerfKitBenchmarker is designed to be run from a remote computer. The idea is to let PerfKitBenchmarker connect with an SSH key and test remote servers from a single location. This means you could set up a single VM somewhere and run the tests from that VM without having to log in to each of the remote servers you are testing. It is also possible to run PerfKitBenchmarker on a local machine without connecting to the cloud.

apt-get install git python-pip python-dev unzip openjdk-7-jre
git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
cd PerfKitBenchmarker/
pip install -r requirements.txt

PerfKitBenchmarker Test Stages

  • GetInfo - This stage returns the name of the benchmark and the number of machines required to run a single instance of it. It also provides a detailed description of the benchmark and states whether the benchmark requires a scratch disk.
  • Prepare - This stage takes a list of VMs as an input parameter. The benchmark fetches all the binaries it needs and creates the data files used by the tests, if any are required.
  • Run - This stage takes a list of VMs as an input parameter. The benchmark then runs on the machines specified. During this stage, a directory is created to hold the test results.
  • Cleanup - This stage takes a list of VMs as an input parameter. The benchmark returns each machine to the state it was in before Prepare was called. Any test files or data are removed during this stage.
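
The four stages above can be pictured as four functions that a driver calls in order. The following is only an illustrative sketch: FakeVM, the function names, and the command strings are hypothetical and are not PerfKitBenchmarker's real API.

```python
# Hypothetical sketch of the four stages; not PerfKitBenchmarker's real API.


class FakeVM:
    """Stand-in for a VM handle; it only records the commands it is given."""

    def __init__(self, name):
        self.name = name
        self.commands = []

    def run(self, command):
        self.commands.append(command)


def get_info():
    # Name, description, machine count, and whether a scratch disk is needed.
    return {
        "name": "example_benchmark",
        "description": "Illustrative benchmark that does nothing useful.",
        "num_machines": 1,
        "scratch_disk": False,
    }


def prepare(vms):
    # Fetch binaries and create any data files the test needs.
    for vm in vms:
        vm.run("install example-binary")


def run(vms):
    # Execute the benchmark on every VM and collect one result per VM.
    results = []
    for vm in vms:
        vm.run("example-binary --go")
        results.append((vm.name, "ok"))
    return results


def cleanup(vms):
    # Remove test files so each machine returns to its pre-Prepare state.
    for vm in vms:
        vm.run("remove example-binary")


def run_all_stages(vms):
    info = get_info()
    if len(vms) < info["num_machines"]:
        raise ValueError("not enough VMs for %s" % info["name"])
    prepare(vms)
    try:
        return run(vms)
    finally:
        # Cleanup runs even if the Run stage raises.
        cleanup(vms)
```

Calling run_all_stages([FakeVM("vm0")]) walks one fake machine through Prepare, Run and Cleanup in that order.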

PerfKitBenchmarker Benchmark Explanations

block_storage_workload

Runs FIO in sequential, random, read and write modes to simulate various scenarios.

This benchmark takes the following arguments.

perfkitbenchmarker.benchmarks.block_storage_workloads_benchmark

--iodepth_list: A list of iodepth parameter used by fio command in simulated database and streaming scenarios only. (a comma separated list)

--maxjobs: The maximum allowed number of jobs to support. (default: '0')(an integer)

--workload_mode: <logging|database|streaming>: Simulate a logging, database or streaming scenario.(default: 'logging')

coremark

A simple processor benchmark.

fio

Runs FIO in sequential, random, read and write modes to simulate various scenarios. I've had a lot of issues with the perfkitbenchmarker fio test. For some reason I'm not able to change the scratch directory location: even if I create a new .json file for the VM I want to test, the fio.job file still tries to use /scratch0, even though I'm not specifying that anywhere. For more information on this, see the links below. It sounds like the developers are working on fixing this and making it easier to tell fio where to place the test file on the remote server.

The fio test takes the following arguments.

perfkitbenchmarker.benchmarks.fio_benchmark

--fio_benchmark_filename: scratch file that fio will use (default: 'fio_benchmark_file')

--fio_jobfile: job file that fio will use(default: 'fio.job')

--memory_multiple: size of fio scratch file compared to main memory size.(default: '10')(an integer)
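
As a rough illustration of what --memory_multiple implies for disk usage, the sketch below assumes the flag is a plain multiplier on the server's total RAM. That reading is an assumption based on the flag description, and the function name is hypothetical.

```python
def fio_scratch_file_size_mb(total_memory_mb, memory_multiple=10):
    """Estimate the fio scratch file size in MB.

    Assumes the scratch file is memory_multiple times the size of main
    memory; this is an interpretation of the flag text, not verified
    against the PerfKitBenchmarker source.
    """
    if total_memory_mb <= 0 or memory_multiple <= 0:
        raise ValueError("memory size and multiple must be positive")
    return total_memory_mb * memory_multiple
```

Under that assumption, a server with 2048 MB of RAM and the default multiple of 10 would need about 20480 MB of free space on the scratch disk.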

ping

Benchmarks ping latency over internal IP addresses

sysbench_oltp

If you are using a custom VM .json file, make sure that "scratch_disk_mountpoints" is set to a directory that MySQL can use; otherwise the Sysbench OLTP test will fail. For instance, if I set the following in my_vm.json:

 
"scratch_disk_mountpoints": ["/dev/vda3"],

And try to run the sysbench_oltp test using this command

./pkb.py --benchmarks=sysbench_oltp --static_vm_file=my_vm.json --run_uri=test2

the test will fail, and I will get this error:

perfkitbenchmarker.errors.SshConnectionError: Got non-zero return code (1) executing sudo sed -i "s/datadir=\/var\/lib\/mysql/datadir=\/dev/vda3\/mysql/" /etc/mysql/my.cnf

If you get an error like this, the way around it is to point scratch_disk_mountpoints at a directory rather than a device, for example:

"scratch_disk_mountpoints": ["/home/"],

At this point the test should run just fine.
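
A small pre-flight check can catch this mistake before a run starts. The sketch below is a hypothetical helper, not part of PerfKitBenchmarker: it parses the contents of a static VM file and reports any "scratch_disk_mountpoints" entry that sits under /dev, on the assumption that such a path is a device node rather than a usable directory.

```python
import json


def find_bad_mountpoints(static_vm_json):
    """Return (ip_address, mountpoint) pairs whose mountpoint is under /dev.

    static_vm_json is the text of a static VM file: a JSON list of objects.
    """
    bad = []
    for vm in json.loads(static_vm_json):
        for mountpoint in vm.get("scratch_disk_mountpoints", []):
            if mountpoint == "/dev" or mountpoint.startswith("/dev/"):
                bad.append((vm.get("ip_address"), mountpoint))
    return bad


# Example input; the IP address is a documentation placeholder.
example = """
[
  {"ip_address": "192.0.2.10",
   "scratch_disk_mountpoints": ["/dev/vda3"]},
  {"ip_address": "192.0.2.11",
   "scratch_disk_mountpoints": ["/home/"]}
]
"""

for ip_address, mountpoint in find_bad_mountpoints(example):
    print("%s: %s looks like a device, not a directory" % (ip_address, mountpoint))
```

The check only looks at the path prefix; it does not log in to the server to confirm that the directory exists or is writable by MySQL.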

unixbench

Runs UnixBench.


Configuring Google PerfKitBenchmarker to test a non-cloud server

You can use PerfKitBenchmarker to run benchmarks on servers that are not hosted on AWS or Google. All you need to do is create a JSON file that includes the IP address of the server to test, the user name to log in as, the location of the SSH key used to log in to the remote server, and a zone label, which you can set to whatever you want.

cd PerfKitBenchmarker/
vim $server_info.json

[
 {"ip_address": "$IP_of_server_to_run_tests_on",
  "user_name": "root",
  "keyfile_path": "/$path/to/ssh_key",
  "zone": "$whatever_you_want"}
]

Once the file is saved in the main PerfKitBenchmarker directory you can run a command like this which will tell PerfKitBenchmarker to run tests on the server that is specified in the json file.

./pkb.py --benchmarks=fio --static_vm_file=$server_info.json
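
If you test many servers, the static VM file can be generated instead of typed by hand. This is a minimal sketch that uses only the four keys described in this section; the function names and default values are hypothetical.

```python
import json


def build_static_vm_entry(ip_address, keyfile_path, user_name="root", zone="local"):
    """Build one server entry using the four keys described in this section."""
    return {
        "ip_address": ip_address,
        "user_name": user_name,
        "keyfile_path": keyfile_path,
        "zone": zone,
    }


def write_static_vm_file(path, entries):
    """Write the entries as a JSON list, the layout pkb.py is given above."""
    with open(path, "w") as handle:
        json.dump(entries, handle, indent=2)
        handle.write("\n")
```

For example, write_static_vm_file("server_info.json", [build_static_vm_entry("192.0.2.10", "/root/.ssh/id_rsa")]) would produce a one-server file; the IP address and key path here are placeholders.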

PerfKitBenchmarker command line options

Once you have finished installing PerfKitBenchmarker, cd into the PerfKitBenchmarker directory and run the pkb.py script to start benchmarks. pkb.py accepts a large number of options; to view them all, run pkb.py --help.

cd PerfKitBenchmarker/
./pkb.py --help
perfkitbenchmarker.pkb:

--benchmark_config_pair: Benchmark and its config file pair, separated by :. (a comma separated list)

--benchmarks: Benchmarks and/or benchmark sets that will be run. The default is 'standard_set'; other tests can be specified as a comma separated list.
   
--duration_in_seconds: duration of benchmarks. (only valid for mesh_benchmark)
    
--image: Default image that will be linked to the VM

--log_level: <debug|info>: The log level to run at. The default value is 'info'

--machine_type: Machine types that will be created for benchmarks that don't require a particular type.

--num_vms: For benchmarks which can make use of a variable number of machines, the number of VMs to use. Default value is '1'

--owner: Owner name. Used to tag created resources and performance records. Default value is 'root'

--parallelism: The number of benchmarks to run in parallel. The default value is '1'

--project: GCP project ID under which to create the virtual machines

--run_stage: <all|prepare|run|cleanup>: The stage of perfkitbenchmarker to run. By default it runs all stages.

--run_uri: Name of the Run. If provided, this should be alphanumeric and less than or equal to 10 characters in length.

--scratch_disk_size: Size, in GB, for all scratch disks. The default is 500 (500GB)

--scratch_disk_type: <standard|ssd|iops>: Type for all scratch disks. The default is standard

--ssh_options: Additional options to pass to ssh. (default: '')

--static_vm_file: The file path for the Static Machine file. See static_virtual_machine.py for a description of this file.

--[no]use_local_disk: For benchmarks that use disks, this tells the test to use local disk(s) instead of remote storage if set to 'true'. The default is 'false'.

--[no]version: Display the version and exit. Default value is 'false'

--zones: A list of zones within which to run PerfKitBenchmarker. This is specific to the cloud provider you are running on. If multiple zones are given, PerfKitBenchmarker will create 1 VM in each zone, until enough VMs are created as specified in each benchmark.
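
When pkb.py is driven from a script, it can help to assemble the command line in one place and check values before starting a long run. The helper below is a hypothetical sketch, not part of PerfKitBenchmarker. It uses only flags listed above and enforces the documented --run_uri rule (alphanumeric, at most 10 characters); restricting that to ASCII letters and digits is an assumption.

```python
import re

# Stage names taken from the --run_stage description above.
VALID_RUN_STAGES = ("all", "prepare", "run", "cleanup")


def build_pkb_command(benchmarks, static_vm_file=None, run_uri=None, run_stage=None):
    """Return the pkb.py command as an argument list (suitable for subprocess)."""
    if not benchmarks:
        raise ValueError("at least one benchmark or benchmark set is required")
    command = ["./pkb.py", "--benchmarks=" + ",".join(benchmarks)]
    if static_vm_file is not None:
        command.append("--static_vm_file=" + static_vm_file)
    if run_uri is not None:
        # Assumption: "alphanumeric" means ASCII letters and digits only.
        if not re.fullmatch(r"[A-Za-z0-9]{1,10}", run_uri):
            raise ValueError("run_uri must be alphanumeric and at most 10 characters")
        command.append("--run_uri=" + run_uri)
    if run_stage is not None:
        if run_stage not in VALID_RUN_STAGES:
            raise ValueError("unknown run stage: %s" % run_stage)
        command.append("--run_stage=" + run_stage)
    return command
```

For instance, build_pkb_command(["sysbench_oltp"], static_vm_file="my_vm.json", run_uri="test2") yields the same arguments as the sysbench_oltp command shown earlier on this page, and the list can be passed to subprocess.run from the directory that contains pkb.py.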

PerfKitBenchmarker Benchmarks and Benchmark Sets

./pkb.py run 

Benchmarks:
        aerospike:  Runs Aerospike
        block_storage_workload:  Runs FIO in sequential, random, read and write modes to simulate various scenarios.
        bonnie++:  Runs Bonnie++.
        cassandra_stress:  Benchmark Cassandra using cassandra-stress
        cluster_boot:  Create a cluster, record all times to boot
        copy_throughput:  Get cp and scp performance between vms.
        coremark:  Run Coremark a simple processor benchmark
        fio:  Runs fio in sequential, random, read and write modes.
        hadoop_terasort:  Runs Terasort
        hpcc:  Runs HPCC.
        iperf:  Run iperf
        mesh_network:  Measures VM to VM cross section bandwidth in a mesh network.
        mongodb:  Run YCSB against MongoDB.
        netperf:  Run TCP_RR, TCP_CRR, UDP_RR and TCP_STREAM Netperf benchmarks
        object_storage_service:  Object/blob storage service benchmarks.
        ping:  Benchmarks ping latency over internal IP addresses
        redis:  Run memtier_benchmark against Redis.
        speccpu2006:  Run Spec CPU2006
        sysbench_oltp:  Runs Sysbench OLTP
        unixbench:  Runs UnixBench.

Benchmark Sets:
        arm_set:  ARM benchmark set.
        mit_set:  Massachusetts Institute of Technology benchmark set.
        canonical_set:  Canonical benchmark set.
        google_set:  This benchmark set is maintained by Google Cloud Platform Performance Team.
        rackspace_set:  Rackspace benchmark set.
        stanford_set:  Stanford University benchmark set.
        thesys_technologies_set:  Thesys Technologies LLC. benchmark set.
        cloudspectator_set:  CloudSpectator benchmark set.
        cisco_set:  Cisco benchmark set.
        tradeworx_set:  Tradeworx Inc. benchmark set.
        mellanox_set:  Mellanox benchmark set.
        intel_set:  Intel benchmark set.
        microsoft_set:  Microsoft benchmark set.
        qualcomm_technologies_set:  Qualcomm Technologies, Inc. benchmark set.
        cloudharmony_set:  CloudHarmony benchmark set.
        ecocloud_epfl_set:  EcoCloud/EPFL benchmark set.
        red_hat_set:  Red Hat benchmark set.
        broadcom_set:  Broadcom benchmark set.
        centurylinkcloud_set:  This benchmark set is supported on CenturyLink Cloud.
        standard_set:  The standard_set is a community agreed upon set of benchmarks to measure Cloud performance.

Google PerfKitBenchmarker Configuration