Load Balancing

From wiki.mikejung.biz

Basic Concepts and Terminology

  • In a Stingray traffic management system, you configure a Virtual Server object to manage connections from remote clients, and a Pool object to manage connections to your local servers. The local servers in this case are Apache servers, or even PHP / application servers.

In addition to normal web servers, you can also load balance caches like Varnish or Memcached. The options for what you can load balance are almost limitless, but in general you should start with the basic services and make sure they work before attempting a more elaborate load-balanced setup.


  • A pool is a collection of nodes. Each node corresponds to a back-end server and port, such as server1.mysite.com:80. You can set up several pools, and they may have nodes in common. This can be useful if you want to have some overlapping pools for failover or redundancy reasons. If you had 4 webservers that you wanted to load balance, you would want to have 4 nodes in the HTTP pool.
  • A virtual server assigns requests to a pool, which load-balances them across its nodes. Each node in the pool must be able to receive requests through the port specified, using the virtual server’s protocol.
  • As well as load balancing, each pool has its own settings for session persistence, SSL encryption of traffic, and auto-scaling. These can all be edited via the configuration page for that pool.
  • To access the configuration page for a pool, click the Services button in the top menu bar and then click the Pools tab. All your pools are listed. You can create a new pool on this page, or click the name of an existing one to access the Pools > Edit page for that pool.
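
The relationship between virtual servers, pools and nodes described above can be pictured with a small data model. This is only an illustrative sketch in Python, not the traffic manager's own configuration format; the class names, pool names and hostnames are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A back-end server and port, e.g. server1.mysite.com:80."""
    host: str
    port: int


@dataclass
class Pool:
    """A named collection of nodes; pools may share nodes."""
    name: str
    nodes: list[Node] = field(default_factory=list)


@dataclass
class VirtualServer:
    """Accepts client connections on one port and hands them to a default pool."""
    name: str
    protocol: str
    port: int
    default_pool: Pool


# Four web servers behind one HTTP pool (hostnames are placeholders).
web_nodes = [Node(f"server{i}.mysite.com", 80) for i in range(1, 5)]
http_pool = Pool("http", web_nodes)

# A second pool that overlaps with the first, e.g. for failover.
failover_pool = Pool("http-failover", web_nodes[:2])

vs = VirtualServer("main-site", "HTTP", 80, http_pool)
shared = set(http_pool.nodes) & set(failover_pool.nodes)
print(len(vs.default_pool.nodes), len(shared))
```

The two pools share two nodes, which is the overlapping-pool arrangement mentioned above.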

Virtual Server

  • A virtual server on a traffic manager machine processes incoming network traffic, and will typically handle all of the traffic for a certain protocol (HTTP, FTP, etc.). It has a default pool which it sends traffic to, but it first runs a list of rules, and these rules may select a different pool to use. The traffic is balanced across the nodes in the selected pool.
  • To modify settings for a virtual server, click the Services button and then the Virtual Servers tab. Your virtual servers are listed. You can create a new virtual server here, or click the name of an existing one to access the Virtual Servers > Edit page for that virtual server.
  • Generally, you should plan to run one Virtual Server for each distinct service you are running, i.e., each TCP or UDP port you are accepting traffic on.

Request Rule

  • A request rule can do much more than just select a pool. It can read an entire request, inspect and rewrite it, and control how the other traffic management features on the traffic manager are used to process that particular request. It can select the pool based on the contents of the request.
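
As a rough illustration of pool selection based on the contents of a request, the sketch below picks a pool name from the request path. It is plain Python for illustration, not the traffic manager's own rule syntax, and the pool names ("static", "php", "web") and path patterns are hypothetical.

```python
def choose_pool(path: str, default_pool: str = "web") -> str:
    """Hypothetical request rule: pick a pool name from the request path."""
    if path.startswith("/images/"):
        return "static"      # assumed pool of static-content servers
    if path.endswith(".php"):
        return "php"         # assumed pool of PHP application servers
    return default_pool      # no rule matched: fall back to the default pool


for p in ("/images/logo.png", "/cart.php", "/index.html"):
    print(p, "->", choose_pool(p))
```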

Response Rule

  • Response rules are run to process responses; they can inspect and rewrite responses, control how the response is processed, or even instruct the traffic manager to try the request again against a different pool or node.
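
The "try the request again against a different node" behaviour can be sketched as a simple retry loop. This is an assumption-laden Python illustration, not the traffic manager's rule language: the `send` callable stands in for forwarding the request, and treating any 5xx status as a reason to retry is a choice made for this example.

```python
from typing import Callable


def send_with_retry(nodes: list[str], send: Callable[[str], int]) -> tuple[str, int]:
    """Try each node in turn until one returns a non-5xx status.

    `send` is a stand-in for forwarding the request; it returns an HTTP status.
    """
    if not nodes:
        raise ValueError("pool has no nodes")
    for node in nodes:
        status = send(node)
        if status < 500:
            break  # acceptable response: stop retrying
    return node, status


# Simulated back ends: the first node is failing, the second is healthy.
statuses = {"server1:80": 503, "server2:80": 200}
result = send_with_retry(list(statuses), statuses.__getitem__)
print(result)
```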

Load Balancing Algorithms

The traffic manager offers a choice of load-balancing algorithms which distribute requests among the nodes in the pool. The algorithms are as follows:

Round Robin

  • Connections are routed to each of the back-end servers in turn.
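
A minimal Python sketch of round robin selection, using placeholder node names:

```python
from itertools import cycle

nodes = ["server1:80", "server2:80", "server3:80"]  # placeholder node names
next_node = cycle(nodes)

# Each new connection takes the next node in the list, wrapping around at the end.
picks = [next(next_node) for _ in range(5)]
print(picks)
```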

Weighted Round Robin

  • As with Round Robin, but with different proportions of traffic directed to each node. The weighting for each node must be specified using the entry boxes provided.
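
One simple way to picture weighted round robin is to repeat each node in the rotation according to its weight. The sketch below assumes small integer weights and placeholder node names; how the traffic manager actually interleaves weighted nodes is not described here.

```python
from itertools import cycle

weights = {"server1:80": 3, "server2:80": 1}  # placeholder nodes and weights

# Repeat each node according to its weight, then cycle through the schedule.
schedule = [node for node, w in weights.items() for _ in range(w)]
next_node = cycle(schedule)

picks = [next(next_node) for _ in range(8)]
print(picks.count("server1:80"), picks.count("server2:80"))
```

Over eight connections the node with weight 3 receives three times as many as the node with weight 1.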


Perceptive

  • Monitors the load and response times of each node, and predicts the best distribution of traffic. This optimizes response times and ensures that no one server is overloaded.

Least Connections

  • Chooses the back-end server which currently has the smallest number of connections.

Weighted Least Connections

  • Chooses the back-end server which currently has the smallest number of connections, scaled by the weight of each server. Weights can be specified in the entry boxes provided.
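
Least Connections and its weighted variant can be sketched with one function. The interpretation of "scaled by the weight" as connections divided by weight is an assumption for this example, as are the node names; weights are assumed to be positive.

```python
from typing import Optional


def least_connections(active: dict[str, int],
                      weights: Optional[dict[str, float]] = None) -> str:
    """Return the node with the fewest active connections, scaled by weight if given."""
    weights = weights or {}
    # Nodes without an explicit weight count as weight 1.
    return min(active, key=lambda n: active[n] / weights.get(n, 1.0))


active = {"server1:80": 10, "server2:80": 6, "server3:80": 8}
plain = least_connections(active)
weighted = least_connections(active, {"server1:80": 4.0})
print(plain, weighted)
```

With no weights the node with 6 connections wins; once server1 is given weight 4 its 10 connections count as 2.5, so it is chosen instead.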

Fastest Response Time

  • Sends traffic to the back-end server currently giving the fastest response time.

Random Node

  • Chooses a back-end server at random.

Which load balancing method is best?

  • Least Connections is generally the best load balancing algorithm for homogeneous traffic, where every request puts the same load on the back-end server and where every back-end server has the same performance. The majority of HTTP services fall into this category. Even if some requests generate more load than others (for example, a database lookup compared to an image retrieval), the ‘least connections’ method will evenly distribute requests across the machines, and if there are sufficient requests of each type, the load will be very effectively shared. Weighted Least Connections is a refinement which can be used when the servers have different capacities; servers with larger weights will receive more connections in proportion to their weights.

Least Connections is not appropriate when individual high-load requests cause significant slowdowns, and these requests are infrequent. Neither is it appropriate when the different servers have different capacities. The Fastest Response Time algorithm will send requests to the server that is performing best (responding most quickly), but it is a reactive algorithm (it only notices slowdowns after the event) so it can often overload a fast server and create a choppy performance profile.

  • Perceptive is designed to take the best features of both ‘Least Connections’ and ‘Fastest Response’. It adapts according to the nature of the traffic and the performance of the servers; it will lean towards 'least connections' when traffic is homogeneous, and 'fastest response time' when the loads are very variable. It uses a combination of the number of current connections and recent response times to trend and predict the performance of each server.

Under this algorithm, traffic is introduced to a new server (or a server that has returned from a failed state) gently, and is progressively ramped up to full operability. When a new server is added to a pool, the algorithm tries it with a single request, and if it receives a reply, gradually increases the number of requests it sends the new server until it is receiving the same proportion of the load as other equivalent nodes in the pool. This ramping is done in an adaptive way, dependent on the responsiveness of the server. So, for example, a new web server serving a small quantity of static content will very quickly be ramped up to full speed, whereas a Java application server that compiles JSPs the first time they are used (and so is slow to respond to begin with) will be ramped up more slowly.
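
The adaptive ramp-up can be illustrated with a toy model. The actual Perceptive algorithm is not specified here, so everything in this sketch is an assumption: the starting share, the 0.1 second threshold for a "fast" reply, and the growth factors are invented purely to show the idea that responsive servers ramp up quickly and slow ones ramp up gradually.

```python
def ramp_share(response_times: list[float], fast: float = 0.1,
               start: float = 0.05) -> float:
    """Fraction of a node's full traffic share after a series of replies.

    Fast replies (<= `fast` seconds) double the share; slow ones grow it by 10%.
    """
    share = start
    for t in response_times:
        share = share * 2 if t <= fast else share * 1.1
        share = min(share, 1.0)  # never exceed the full share
    return share


static_server = ramp_share([0.01] * 5)   # quick replies: static content
java_server = ramp_share([2.0] * 5)      # slow first replies: JSP compilation
print(static_server, round(java_server, 3))
```

After five replies the static server is at its full share, while the slow-starting application server is still below a tenth of it.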

  • ‘Least Connections’ is simpler and more deterministic than ‘Perceptive’, so should be used in preference when appropriate.

Managing Services

1) The Admin UI home page shows that you have not yet created any pools or virtual servers. Click the “Wizards:” drop-down box and choose Manage a New Service to step through the wizard.

2) Specify a name which you will use to identify the virtual server within the configuration interface. Choose a protocol and port for the virtual server (e.g. HTTP, default port 80).

3) Create a list of back-end nodes, which will form the default pool for the virtual server. The nodes are identified by hostname and port, and you can modify them later from the Pools > Edit page. You should ensure that you can serve content directly from the hostname/port combinations you specify.

4) Finally, review the settings you have chosen before clicking Finish.

5) You can now test your traffic manager setup by browsing to it, using the port you set up for your new service:


Creating a Cluster

  • If you are deploying two or more Stingray traffic managers in a cluster, you should first perform the initial configuration process for each one.
  • Before making any other changes, you should join the traffic managers together to form a cluster.
  • If you are creating a new traffic manager cluster from scratch, you should choose one traffic manager as the first cluster member. Then log in to the Admin UI of each subsequent traffic manager and use the Join a cluster wizard to join to the first. Each time you perform the operation, the expanding cluster will be shown in the wizard.

Traffic IP Addresses

  • The traffic manager’s fault tolerance capability allows you to configure Traffic IP addresses. These IP addresses are not tied to individual machines, and the traffic manager cluster ensures that each IP address is fully available, even if some of the clustered traffic manager machines have failed.

Traffic IP Address modes:

Single-hosted Traffic IPs are raised on a single traffic manager in your fault tolerant cluster. If that traffic manager fails, another traffic manager will raise that IP address and start accepting traffic.

Multi-hosted Traffic IPs are raised on all of the traffic managers in your cluster, using a multicast MAC address that ensures that all incoming traffic is sent to all machines. A custom Linux kernel module is used to evenly distribute the traffic between the working traffic managers.

  • Enabling Multi-Hosted Traffic IPs imposes a performance hit due to the additional packet processing required by the traffic managers in your cluster. Empirical tests indicate that CPU utilization will increase by 25-30% at moderate traffic levels (10,000 requests per second), with a corresponding limit on top-end capacity.

Example Cluster Configurations


In an active-passive configuration, you can set up a single-hosted traffic IP group that spans both traffic manager machines and contains a single IP address. The traffic managers negotiate, and one of them raises the IP address and handles all the incoming requests. The second traffic manager machine is available on standby. If the first machine should fail, the second machine takes over the IP address and starts to manage the traffic. The advantage of this configuration is that you can be confident that there is sufficient capacity in reserve to handle the traffic should one of the two traffic managers fail. Debugging and fault resolution are also easier when only one traffic manager is handling traffic.


  • In an active-active configuration, both traffic managers manage your traffic. The distribution mode (single-hosted IP or multi-hosted IP) controls how the traffic is shared between them.
  • With single-hosted mode, you can configure two Traffic IP Addresses in a Traffic IP Group, and configure your DNS name to map to both addresses. The traffic managers will negotiate to raise one traffic IP address each. A DNS server can allocate requests to each IP address in turn (round-robin DNS), and each traffic manager handles the requests it receives.
  • If one of the machines fails, the other machine will detect this. It will then raise the failed machine’s traffic IP address in addition to its own, and handle all the traffic sent to either address.

  • With multi-hosted mode, you can continue to operate with one Traffic IP Address, simplifying your DNS and reducing the number of externally-facing IP addresses you require. The Traffic IP Address is raised on all of the traffic managers in the Traffic IP Group, and incoming traffic to that IP address is shared evenly between the traffic managers.
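
The single-hosted failover behaviour described above (each traffic manager raises one address, and the survivor raises both after a failure) can be sketched as follows. The real negotiation between traffic managers is not described here; this sketch simply spreads the addresses across whichever managers are alive. The manager names and the 192.0.2.x documentation addresses are placeholders.

```python
def assign_traffic_ips(traffic_ips: list[str], managers: list[str],
                       alive: set[str]) -> dict[str, str]:
    """Map each traffic IP to a live traffic manager, spreading IPs round-robin."""
    live = [m for m in managers if m in alive]
    if not live:
        raise RuntimeError("no traffic manager is available")
    return {ip: live[i % len(live)] for i, ip in enumerate(traffic_ips)}


ips = ["192.0.2.10", "192.0.2.11"]   # placeholder traffic IP addresses
managers = ["tm1", "tm2"]            # placeholder traffic manager names

healthy = assign_traffic_ips(ips, managers, alive={"tm1", "tm2"})
after_failure = assign_traffic_ips(ips, managers, alive={"tm2"})
print(healthy)
print(after_failure)
```

With both managers alive each raises one address; when tm1 fails, tm2 raises both.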


SSL Decryption

A virtual server can decrypt SSL traffic. This can be useful for two reasons:

1) After decryption, a rule can analyze the request's headers and contents to make an informed routing decision. Without decrypting the packets very little information is available.

2) Decrypting requests requires processing power. It may be more efficient if the traffic manager decrypts requests before passing them on to the nodes, reducing the load on the back-end servers.

  • To set up a virtual server to decrypt SSL traffic, go to the Virtual Servers > Edit page for that virtual server and click on SSL Decryption. You can choose whether to decrypt traffic, and which certificate from the SSL Certificates Catalog to use.
  • If the protocol value for the virtual server is set to ‘SSL’, this indicates that the virtual server is just forwarding SSL traffic in SSL pass-through mode.
  • If you want to configure SSL decryption, you must first change the protocol value to the correct value for the internal protocol (e.g. HTTP). In this case, your pools are probably sending traffic to nodes which expect SSL encrypted traffic, so you will also need to configure SSL encryption in the pools.

Session Persistence

Session persistence is the process by which all requests from the same client session are sent to the same back-end server. It can be used for any TCP or UDP protocol.

  • A pool serving static web content usually has no requirement for session persistence; each page or image for a particular client can be served from a different machine with no ill effects. Another pool, serving an online shopping site, may use session persistence to ensure that a user's requests are always directed to the node holding details of their shopping basket.
  • The traffic manager offers several methods to identify requests which belong to the same session. A variety of different cookies can be used; persistence can be based on a rule; or the client’s IP address can be used to identify sessions. If incoming traffic is SSL-encrypted, the SSL session ID can be used.
  • You can choose what to do if a persistent session is lost. This might be due to invalid session data, or because the node handling it has failed. In this case you can choose to close the connection, have requests sent to a new node, or redirect the user to a specified URL such as an error page.
  • You can apply session persistence to a pool by clicking the Session Persistence link on the Pools > Edit page for that pool. Select a session persistence class and click the Update button.
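
IP-based persistence, together with the "send requests to a new node" option for lost sessions, can be sketched as a lookup table from client address to node. This is an illustration under stated assumptions, not the traffic manager's implementation: the class name is hypothetical, plain round robin stands in for the pool's load-balancing algorithm, and the 203.0.113.x client addresses are documentation placeholders.

```python
class IpPersistence:
    """Send each client IP to the node it used before, while that node is healthy."""

    def __init__(self, nodes: list[str]):
        self.nodes = nodes
        self.sessions: dict[str, str] = {}
        self._next = 0

    def _pick_new(self, healthy: set[str]) -> str:
        # Plain round robin over healthy nodes stands in for the pool's algorithm.
        candidates = [n for n in self.nodes if n in healthy]
        if not candidates:
            raise RuntimeError("no healthy node")
        node = candidates[self._next % len(candidates)]
        self._next += 1
        return node

    def route(self, client_ip: str, healthy: set[str]) -> str:
        node = self.sessions.get(client_ip)
        if node is None or node not in healthy:
            # New session, or the session's node failed: send to a new node.
            node = self._pick_new(healthy)
            self.sessions[client_ip] = node
        return node


lb = IpPersistence(["a", "b"])
first = lb.route("203.0.113.5", {"a", "b"})
again = lb.route("203.0.113.5", {"a", "b"})
other = lb.route("203.0.113.6", {"a", "b"})
moved = lb.route("203.0.113.5", {"b"})   # node "a" has failed
print(first, again, other, moved)
```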

Interesting Settings

System > Backups

  • Might be a good idea to use this?

System > Global Settings > Cache Settings


  • The maximum size of the HTTP web page cache. This is specified as either a percentage of system RAM, 20% for example, or an absolute size such as 200MB.
  • Default 20% (1.4GB on an 8GB server)


  • Maximum number of entries in the cache. Approximately 1.4 KB will be pre-allocated per entry for metadata; this is in addition to the memory reserved for the content cache.
  • Default 10000


  • Largest size of a cacheable object in the cache. This is specified as either a percentage of the total cache size, 2% for example, or an absolute size such as 20MB.
  • Default 2%


  • Whether or not to use a disk-backed (typically SSD) cache. If set to Yes, cached web pages will be stored in a file on disk. This enables the traffic manager to use a cache that is larger than available RAM. The webcache!size setting should also be adjusted to select a suitable maximum size based on your disk space.

Note that the disk caching is optimized for use with SSD storage.

  • Default no


  • The maximum number of entries in the IP session cache. This is used to provide session persistence based on the source IP address. Approximately 100 bytes will be pre-allocated per entry.
  • Default: 2048
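
As a quick check of what the default entry counts cost in pre-allocated memory, the arithmetic from the settings above works out as follows. The sketch assumes 1 KB = 1024 bytes and uses the approximate per-entry figures quoted above; the variable names are made up for the example.

```python
KB = 1024

web_cache_entries = 10_000      # default maximum number of cache entries
web_cache_meta = 1.4 * KB       # approx. bytes pre-allocated per entry

ip_cache_entries = 2048         # default IP session cache entries
ip_cache_meta = 100             # approx. bytes pre-allocated per entry

web_meta_mb = web_cache_entries * web_cache_meta / KB**2
ip_meta_kb = ip_cache_entries * ip_cache_meta / KB
print(f"{web_meta_mb:.1f} MB, {ip_meta_kb:.0f} KB")
```

At the defaults, the web cache metadata comes to roughly 14 MB and the IP session cache to about 200 KB, both small next to the content cache itself.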

Installing Mod_Zeus


You need to make sure that Nginx is compiled with:


This goes in the nginx.conf file:

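   # Set the real client IP for requests that arrive via Zeus.
   # $ip is a placeholder for a trusted traffic manager address.
   set_real_ip_from   $ip;
   set_real_ip_from   $ip;
   # Request header from which the client address is taken.
   real_ip_header     X-Cluster-Client-Ip;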

Overview and Links for Stingray and Riverbed Traffic Manager


License Guide

Spec Sheet

Moar Documents

Traffic Manager User Guide