Techniques for Creating Scalable and Highly Available Sites

Now that you understand scalability and availability, the next step is to familiarize yourself with the techniques you can use to build scalable and highly available Web sites.

This section describes the following topics:

What is clustering?

Clustering is a technique in which two or more Web servers that support one or more domains (for example, www.yourcompany.com) are grouped together as a cluster to collectively accommodate increases in load and provide system redundancy.

The following figure shows an example of a server cluster for a sample Web site:

Clustering for scalability works by distributing load among the servers in the cluster (load balancing). The distribution follows either an unintelligent but regular sequence (round-robin DNS and routers) or a predefined threshold or algorithm that you specify and can adjust for each server in the cluster (specialized clustering software).
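The two distribution styles described above can be contrasted with a minimal sketch. The server names, load figures, and threshold values are hypothetical, and the threshold rule shown is only one plausible policy, not the algorithm of any particular product.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that repeats the servers in a fixed order, ignoring load."""
    return cycle(servers)


def pick_below_threshold(loads, thresholds):
    """Return the least-loaded server whose load is under its own threshold.

    loads and thresholds map a server name to a load figure (hypothetical units).
    Returns None if every server is at or above its threshold.
    """
    eligible = [name for name in loads if loads[name] < thresholds[name]]
    if not eligible:
        return None
    return min(eligible, key=lambda name: loads[name])


servers = ["web1", "web2", "web3"]
rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]  # the sequence wraps back to web1

# Per-server thresholds: each server can be tuned independently.
loads = {"web1": 80, "web2": 35, "web3": 60}
thresholds = {"web1": 70, "web2": 90, "web3": 50}
chosen = pick_below_threshold(loads, thresholds)
```

Round-robin needs no knowledge of the servers, which is why it can run in DNS or a simple device. The threshold approach needs current load data from each server, which is what specialized clustering software supplies.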

Clustering for failover relies on redundant servers to ensure that business-critical applications remain available if one of the servers in a cluster fails. Intelligent software-based failover solutions can detect when a server has failed and automatically redirect new incoming HTTP requests to the cluster members that are available. Some hardware-based failover devices that have less built-in intelligence require an administrator's intervention once the failure is detected.
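As a rough illustration of software-based failover, the sketch below keeps a record of failed members and routes each new request to the next member that is still available. The `Cluster` class and its method names are hypothetical. How a failure is actually detected (for example, by a periodic probe) is left out and assumed to call `mark_failed`.

```python
class Cluster:
    """Tracks which cluster members are available and routes new requests to them."""

    def __init__(self, members):
        self.members = list(members)
        self.failed = set()
        self._next = 0  # index of the next member to try

    def mark_failed(self, member):
        """Record that a member has stopped responding."""
        self.failed.add(member)

    def mark_restored(self, member):
        """Return a recovered member to the rotation."""
        self.failed.discard(member)

    def route(self):
        """Return the next available member in rotation, skipping failed ones."""
        for _ in range(len(self.members)):
            member = self.members[self._next]
            self._next = (self._next + 1) % len(self.members)
            if member not in self.failed:
                return member
        raise RuntimeError("no cluster members available")


cluster = Cluster(["web1", "web2", "web3"])
cluster.mark_failed("web2")
routed = [cluster.route() for _ in range(4)]  # web2 never appears
```

Only new requests are redirected here. Requests already in progress on the failed server are not recovered by this sketch.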

Clustering can be accomplished using a software-based solution, such as round-robin DNS by itself or together with a third-party package; a hardware-based solution, such as a packet router; or a combination of the two.

Hardware-based clustering solutions

The most common and reliable hardware-based clustering solution is a device known as a packet router. One of the most popular routers on the market is Cisco Systems' LocalDirector. A router sits in front of a cluster of Web servers and directs incoming HTTP requests to the available Web servers in the cluster. A router works by assessing the speed and volume of IP packet flow to and from the Web servers and then selecting the best server to accommodate the traffic. This process is fast and efficient. The router, together with the clustered Web servers, makes up what is known as a virtual server.
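The idea of selecting the best server from observed traffic can be sketched as follows. This is not how LocalDirector or any specific router is implemented. The `ServerStats` fields and the scoring formula are assumptions chosen only to show the principle of ranking servers by measured speed and volume.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    avg_response_ms: float   # assumed stand-in for "speed" of packet flow
    active_connections: int  # assumed stand-in for "volume" of packet flow


def best_server(stats):
    """Return the name of the server with the lowest score (faster and less busy)."""
    def score(s):
        # Hypothetical weighting: response time scaled by pending work.
        return s.avg_response_ms * (1 + s.active_connections)
    return min(stats, key=score).name


stats = [
    ServerStats("web1", avg_response_ms=40.0, active_connections=12),
    ServerStats("web2", avg_response_ms=55.0, active_connections=3),
    ServerStats("web3", avg_response_ms=30.0, active_connections=20),
]
selected = best_server(stats)
```

In this sample data the slowest server per request is still selected because it has the least work queued, which shows why both measurements matter.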

Routers are considered semi-intelligent devices because they can detect a server failure and redirect requests to other servers. If a Web server fails or stops responding, the router stops sending packets to the unresponsive server. Routers are not considered fully intelligent because while they can redirect requests upon discovering a failure, they do not allow you to configure redirection thresholds for individual servers. They also do not provide for application-aware load balancing.

The following figure shows a router distributing requests in round-robin fashion to the available servers in a Web server cluster:

Advantages

A hardware-based clustering solution, such as a router, is an attractive solution for the following reasons:


Note

Not all load-balancing devices have the same features or offer the same capabilities.


Considerations

Carefully evaluate the following issues against a router's attributes:

Software-based clustering solutions

There are several flavors of software-based clustering solutions on the market. Just like hardware-based clustering solutions, there are strengths and weaknesses associated with each. These software solutions include:

ClusterCATS, Allaire's software clustering solution for load balancing and high availability, allows you to create, optimize, and maintain "smart" clusters to support your Web applications. ClusterCATS runs on Windows NT, Solaris, and Linux platforms and works with leading Web servers, including Microsoft IIS, Netscape Enterprise Server, and Apache. It can be administered from remote locations and provides robust features, including:

Advantages

The following benefits make a software-based clustering solution attractive:

Considerations

Consider the following issues when evaluating software-based solutions for your environment:

Combining hardware and software clustering solutions

Instead of choosing either a hardware solution or a software solution, you can combine both types of clustering. Combining hardware and software solutions can provide the greatest scalability and availability for your site. A combined solution is also an attractive option if your organization has already invested in one type of solution but is looking for more comprehensive coverage. Having the flexibility to integrate hardware with software means that your organization won't necessarily have to absorb a capital loss on a previous technology investment if you decide to purchase additional clustering technology.

However, as already discussed, not all hardware or software solutions are equal. Many have different features and capabilities, and not all hardware and software integrate well together. Be sure to investigate thoroughly when purchasing additional technology to augment your current solution.

For a visual representation of hardware and software clustering solutions working together, see "Hardware-based clustering solutions".