A while ago, Monty Williams at GemStone took an informal poll of folks on the GLASS mailing list to see what front-end web servers people were using. When the votes were tallied (at least from those who responded publicly), Apache won, followed closely by lighttpd. Cherokee, nginx, and Swazoo tied for third.
I didn’t participate at the time because I hadn’t made up my mind. We had test deployments running on both Apache and nginx, all using FastCGI to communicate with the backend gems.
Well, we had issues as our usage ramped up. We’re now generating PDFs dynamically from one of our GLASS apps, and the PDF creation takes a little longer than your average web request. Not only that, but an inefficient piece of code in our PDF generation made the process even longer.
The bug was easy enough to profile and fix, but it worked out to be a good test of our server setup. What happens when our backend servers get busy during a long-running process? In our case, the answer was “the whole site goes comatose until it’s done”. Not good.
I thought the answer to this was obvious: we needed more FastCGI handlers on the backend, so that if one of them was tied up during a long-running process, the others could keep serving requests. Apache doesn’t supply an easy way to distribute requests across FastCgiExternalServers, and although I know there are add-ons that can do this, they looked like serious overkill for what I’m trying to accomplish. I just want the requests farmed out to multiple dispatchers.
So I put nginx out front instead of Apache, since nginx has a round-robin load balancer that’s simple as pie to set up. It almost worked. Its round-robin logic is blind, so it distributes requests without taking into account how busy any individual backend is.
If this were a grocery store with 4 checkout lanes, it’s as if the store manager were standing in front of them, sending the first shopper to lane 1, the second to lane 2, etc., without ever looking behind him to see which lane had the shortest line. So if lane 3 is clogged up, you might just be unlucky enough to get put there, even though lanes 1, 2, and 4 are wide open.
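To make the checkout-lane picture concrete, here is a minimal Python sketch of blind round-robin dispatch. It is only an illustration, not nginx's actual code: the function name `round_robin_assign` and the queue lengths are made up for the example, and backend index 2 stands in for the clogged lane 3.

```python
from itertools import cycle


def round_robin_assign(num_backends, num_requests):
    """Assign requests to backends in strict rotation, never looking at load."""
    rotation = cycle(range(num_backends))
    return [next(rotation) for _ in range(num_requests)]


# Hypothetical pending work per backend; index 2 is stuck on a long job.
queue_lengths = [0, 0, 5, 0]

# Eight new requests arrive and are handed out in rotation.
assignments = round_robin_assign(len(queue_lengths), 8)
for backend in assignments:
    queue_lengths[backend] += 1

print(assignments)    # [0, 1, 2, 3, 0, 1, 2, 3]
print(queue_lengths)  # [2, 2, 7, 2]
```

Two of the eight requests land behind the busy backend even though the other three are nearly idle, which is the behavior that bit us.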
I downloaded the source code for Cherokee, which is a really interesting-looking server. But its round-robin load balancer appears to work exactly the same way. Caveat: I haven’t actually tested it, just browsed the source for signs of hope and didn’t see any.
On the other hand, lighttpd maintains a running “load” for each FastCGI backend server. When an incoming request hits, lighttpd chooses from among the servers with the lightest load. This is exactly what I was looking for. Each shopper gets put in the shortest checkout line. Sure, if all lines are the same length when you arrive, you might still get put in the unlucky line, but… what’cha gonna do?
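The shortest-line idea can be sketched in a few lines of Python. This is an illustration of the general technique under simplifying assumptions, not lighttpd's implementation: the names `pick_least_loaded` and `dispatch` are hypothetical, ties go to the lowest index, and requests never complete, so each load only grows.

```python
def pick_least_loaded(loads):
    """Return the index of the backend with the smallest current load.

    Ties go to the lowest index, an arbitrary choice for this sketch.
    """
    return min(range(len(loads)), key=lambda i: loads[i])


def dispatch(initial_loads, num_requests):
    """Send each request to the least-loaded backend and track the loads."""
    loads = list(initial_loads)  # copy so the caller's list is untouched
    chosen = []
    for _ in range(num_requests):
        backend = pick_least_loaded(loads)
        loads[backend] += 1
        chosen.append(backend)
    return chosen, loads


# Hypothetical starting point: backend 2 is already busy with a long job.
chosen, final_loads = dispatch([0, 0, 5, 0], 8)

print(chosen)       # [0, 1, 3, 0, 1, 3, 0, 1]
print(final_loads)  # [3, 3, 5, 2]
```

In this run the busy backend receives none of the eight new requests, so one slow job no longer holds up everyone else.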
I understand that some benchmarks show Cherokee and nginx running slightly faster than lighttpd, but in a configuration like this, the frontend is such a small part of the overall load that I doubt a tiny speed advantage will make much difference. The queuing policy, however, makes a real difference. With lighttpd out front, our site is noticeably snappier, and I was able to send one of the FastCGI servers off into a tailspin without any of the site’s other visitors even noticing.