Kopparapu, Chandra. Load Balancing Servers, Firewalls, and Caches. New York: John Wiley & Sons, 2002. ISBN 0-471-41550-2.
Don't even think about deploying a server farm or geographically dispersed mirror sites without reading this authoritative book. The Internet has become such a mountain of interconnected kludges that something as conceptually simple as spreading Web and other Internet traffic across a collection of independent servers or sites in the interest of increased performance and fault tolerance becomes a matter of enormous subtlety and hideous complexity. Most of the problems come from the need for “session persistence”: when a new user arrives at your site, you can direct them to any available server based on whatever load balancing algorithm you choose, but if the user's interaction with the server involves dynamically generated content produced by the server (for example, images generated by Earth and Moon Viewer, or items the user places in their shopping cart at a commerce site), subsequent requests by the user must be directed to the same server, as only it contains the state of the user's session.
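
To make the persistence requirement concrete, here is a minimal sketch in Python. It is an illustration, not anything taken from the book. It assumes the balancer can already recognise a returning user by some opaque `session_key`, which is precisely the hard part in practice. The class name `StickyBalancer`, the server names, and the round-robin choice for new users are all hypothetical.

```python
import itertools


class StickyBalancer:
    """Toy balancer: new sessions go round-robin, known sessions stay put."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)  # policy for new users
        self._assigned = {}                           # session_key -> server

    def route(self, session_key):
        # First request of a session: any server will do, so pick the next one.
        if session_key not in self._assigned:
            self._assigned[session_key] = next(self._next_server)
        # Later requests must return to the server holding the session state.
        return self._assigned[session_key]


balancer = StickyBalancer(["web0", "web1", "web2"])  # hypothetical servers
first = balancer.route("alice")   # new session, assigned a server
other = balancer.route("bob")     # another new session, next server in turn
again = balancer.route("alice")   # returning session, same server as before
```

The table lookup is trivial. Everything difficult is hidden in obtaining a `session_key` that reliably identifies the same user across requests.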

(Some load balancer vendors will try to persuade you that session persistence is a design flaw in your Web applications which you should eliminate by making them stateless or by using a common storage pool shared by all the servers. Don't believe this. I defy you to figure out how an application as simple as Earth and Moon Viewer, which does nothing more complicated than returning a custom Web page which contains a dynamically generated embedded image, can be made stateless. And shared backing store [for example, Network Attached Storage servers] has its own scalability and fault tolerance challenges.)

Almost any simple scheme you can come up with to get around the session persistence problem will be torpedoed by one or more of the kludges and hacks through which a user's packets pass between client and server: NAT, firewalls, proxy servers, content caches, etc. Consider what at first appears to be a foolproof scheme (albeit sub-optimal for load distribution): simply hash the client's IP address into a set of bins, one for each server, and direct the packets accordingly. Certainly, that would work, right? Wrong: huge ISPs such as AOL and EarthLink have farms of proxy servers between their customers and the sites they contact, and these proxy servers are themselves load balanced in a non-persistent manner. So even two TCP connections from the same browser retrieving, say, the text and an image from a single Web page, may arrive at your site apparently originating from different IP addresses!
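
The IP hashing scheme and its failure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "hash" is the simplest possible one (the address taken as an integer, modulo the number of servers), the server names are hypothetical, and the addresses come from the ranges reserved for documentation, not from any real ISP.

```python
import ipaddress

SERVERS = ["web0", "web1", "web2", "web3"]  # hypothetical back-end servers


def pick_server(client_ip, servers=SERVERS):
    """Hash a client IP address into one bin per server (trivial modulo hash)."""
    bin_index = int(ipaddress.ip_address(client_ip)) % len(servers)
    return servers[bin_index]


# A client that connects directly always lands in the same bin.
direct_client = "198.51.100.7"
direct_first = pick_server(direct_client)
direct_second = pick_server(direct_client)

# The same browser behind an ISP proxy farm: the connection fetching the page
# text and the connection fetching the image leave through different proxies,
# so the site sees two different source addresses for one user.
proxy_a = "192.0.2.10"
proxy_b = "192.0.2.11"
page_server = pick_server(proxy_a)
image_server = pick_server(proxy_b)
```

Here `direct_first` and `direct_second` agree, but `page_server` and `image_server` differ, so the image request reaches a server that knows nothing about the session that generated the page.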

This and dozens of other gotchas and ways to work around them are described in detail in this valuable book, which is entirely vendor-neutral, except for occasionally mentioning products to illustrate different kinds of architectures. It's a lot better to slap your forehead every few pages as you discover something else you didn't think of that will sabotage your best-laid plans than to pull your hair out later after putting a clever and costly scheme into production and discovering that it doesn't work. When I started reading this book, I had no idea how I was going to solve the load balancing problem for the Fourmilab site, and now I know precisely how I'm going to proceed. This isn't a book you read for entertainment, but if you need to know this stuff, it's a great place to learn it.

February 2005