You don’t have to be Stephen Hawking to attract huge numbers of website visitors. But just like the Cambridge University website crumbled as fans rushed to read Prof Hawking’s PhD thesis, your site could be knocked offline by a sudden spike in web traffic.

Fortunately, you also don’t need to be a theoretical physicist to manage large traffic volumes as efficiently as possible. One of the most effective solutions for maximising server performance is a load balancer.

Load balancer definition

What is a load balancer? It’s a piece of software or hardware that distributes web traffic across multiple servers.

By efficiently spreading the workload across multiple physical or virtual machines in a cluster, a load balancer works to maximise the availability of a website or application.

Load balancing can also increase performance by maximising network throughput – the rate at which data can be successfully transferred.

How do load balancers work?

A load balancer sits between a cluster of two or more servers and the public internet. It receives requests from clients (e.g. web browsers) and directs them to the appropriate server.

Even though the load balancer is communicating back and forth between clients and servers, the client is none the wiser. As far as the client is concerned, it’s always talking directly to a back-end server.
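To make that flow concrete, here’s a toy Python sketch of the relay. All the names are illustrative, and the “network” is stubbed out with a function, so this is a model of the idea rather than a real proxy: the client talks only to the balancer, which picks a back-end server and forwards the request, and the client never learns which server answered.

```python
def handle_request(request, backends, forward, choose=None):
    """Relay a client request to one back-end server.

    `forward(backend, request)` stands in for the real network hop;
    `choose` is the selection strategy (default: first server).
    """
    pick = choose or (lambda servers: servers[0])
    backend = pick(backends)
    return forward(backend, request)  # the client sees only the response

# A stubbed "network" shows the flow without real sockets:
reply = handle_request(
    "GET /",
    ["web1", "web2"],
    forward=lambda backend, req: f"{backend} served {req}",
)
# reply == "web1 served GET /"
```

A real load balancer (HAProxy or NGINX, say) would also handle connection pooling, health checks and failover, but the shape of the interaction is the same.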

Availability and flexibility

By regulating traffic across several servers, a load balancer prevents individual machines becoming swamped with requests. This ensures services are available for more concurrent users, with faster loading times and a more consistent experience, even if individual servers are put under pressure or taken offline altogether.

A load-balanced setup not only provides increased availability for end-users, but also offers increased flexibility on the back-end.

With a load balancer, you can perform maintenance and updates on servers in your cluster, or even take individual machines offline, safe in the knowledge that availability will always be maintained.

Load balancing: a persistent issue

When discussing load balancing, ‘persistence’ refers to the information carried over multiple requests during a user’s session.

Some session information is stored in the browser: items added to a user’s shopping basket on an ecommerce site, for example. But when session information is stored on the server – e.g. when the user proceeds to checkout and logs in to their account – persistence gets a bit trickier.

If a user’s requests initially connect to one server that stores session information locally, but subsequent requests go to other servers in the cluster, the session information may no longer be available.

So how do you ensure persistence with a load balancer?

For a consistent user experience, a load-balanced system can ensure that each session is handled by a single server, instead of redirecting the user to a new server for each request. Other persistence solutions include shared storage and databases that can be accessed by every server in the cluster.
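A minimal sketch of that “one session, one server” idea might look like the following Python class. It’s a toy model, not a production design – the server names are made up, and a real balancer would typically key sessions on a cookie or the client’s IP address – but it shows the mechanism: the first request from a client is assigned a server, and later requests within the timeout window go back to the same one.

```python
import time

class StickySessions:
    """Toy persistence table: first request is assigned round robin,
    later requests within the timeout return to the same server."""

    def __init__(self, backends, timeout=1200):
        self.backends = list(backends)
        self.timeout = timeout          # seconds a session stays "sticky"
        self.sessions = {}              # client_id -> (backend, last_seen)
        self.counter = 0

    def backend_for(self, client_id, now=None):
        now = time.time() if now is None else now
        entry = self.sessions.get(client_id)
        if entry and now - entry[1] < self.timeout:
            backend = entry[0]          # session still live: same server
        else:
            backend = self.backends[self.counter % len(self.backends)]
            self.counter += 1           # new or expired: next server in turn
        self.sessions[client_id] = (backend, now)
        return backend

lb = StickySessions(["web1", "web2"], timeout=1200)
lb.backend_for("alice", now=0)     # -> "web1"
lb.backend_for("bob", now=1)       # -> "web2"
lb.backend_for("alice", now=100)   # -> "web1" (still within the window)
```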

Of course, if persistence isn’t required, there’s no need to implement it. But for complex, interactive websites such as online shops, persistence is a business-critical feature.

On the Fasthosts CloudNX platform, you can enable persistence for up to 1200 seconds – ensuring each client will have all its requests directed to the same server within the set time period.

Round robin vs least connections

The end-goal of a load balancer is to distribute traffic efficiently among servers, but there are several different procedures it can follow to achieve this. Two of the most widely implemented load balancing algorithms are known as ‘round robin’ and ‘least connections’.

Round robin is the simplest and most common load balancing technique. In a round robin setup, the load balancer runs through a sequence of servers, handing off each new request to the next server on the list. This ensures that requests are distributed evenly across the cluster.
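Round robin is simple enough to sketch in a few lines of Python. The server names here are assumed for illustration – the point is just that each new request goes to the next server in the rotation:

```python
from itertools import cycle

servers = ["web1", "web2", "web3"]   # hypothetical back-end servers
rotation = cycle(servers)

# Six incoming requests are spread evenly across the cluster:
assignments = [next(rotation) for _ in range(6)]
# assignments == ["web1", "web2", "web3", "web1", "web2", "web3"]
```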

While the round robin method is easy to implement, it can have drawbacks. Because it only counts requests, it ignores how demanding each request actually is: even if every server in the cluster is assigned the same number of requests, clients may stay connected to a particularly unlucky server for longer, leaving that machine with a heavier workload.

An alternative to round robin is least connections. As the name implies, this procedure looks for the server with the lowest number of current connections before handing it a new request.

By taking the current load of each server into consideration, least connections can be a more effective load balancer algorithm in clusters where clients tend to stay connected to a particular server for longer.
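The least connections rule is also easy to picture in code. In this hedged sketch the connection counts are invented, and a real balancer would update them as connections open and close, but the selection step is just a minimum:

```python
# Active connections per server (counts are made up for illustration):
connections = {"web1": 12, "web2": 3, "web3": 7}

def assign(conn_counts):
    server = min(conn_counts, key=conn_counts.get)  # fewest active connections
    conn_counts[server] += 1    # the new request now counts against it
    return server

first = assign(connections)   # -> "web2", which had only 3 connections
```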

On CloudNX, it’s simple to set up either round robin or least connections on your load balancer. Just select your load balancing algorithm of choice when you create a new load balancer in the Cloud Panel, or edit existing load balancers to change the balancing procedure at any time.

Security benefits of load balancing

As well as boosting availability and performance, load balancing can also provide security advantages. A load balancer can act as a gatekeeper between your servers and the public internet, preventing clients from contacting back-end servers directly.

This can be achieved by configuring your load balancer to forward ports: for example, taking incoming HTTP requests on the standard port (80) and forwarding them on to a custom port on a back-end server.

Port forwarding can be useful if you have an application running on a custom port, and you don’t want to expose it to the public internet. It’s a simple case of setting the app to only accept connections from your load balancer.
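As a toy model of those two gatekeeper rules – with the port mapping and the balancer’s address assumed purely for illustration – the logic amounts to a lookup plus a source check:

```python
PORT_FORWARDING = {80: 8080}     # public HTTP port -> custom app port (assumed)
BALANCER_IP = "10.0.0.254"       # hypothetical private address of the balancer

def target_port(public_port):
    """Port the balancer forwards a public request to, if any."""
    return PORT_FORWARDING.get(public_port)

def app_accepts(source_ip):
    """The app only talks to the load balancer, never the open internet."""
    return source_ip == BALANCER_IP
```

In practice this would be a firewall rule or a bind-address setting rather than application code, but the effect is the same: the custom port is reachable only via the balancer.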

At Fasthosts, we provide a fully configurable load balancer for Cloud Servers on CloudNX. Build your own custom cloud infrastructure with the performance and resilience of load balancing, plus a full range of enterprise features. Our Web Hosting also has intelligent load balancing built into the platform, ensuring peak performance even under heavy load or during maintenance.