Scaling Up vs Scaling Out at Stack Overflow

Saturday, August 1st, 2009

MediaWiki Servers Image - Copyright Rob Halsey - CC 2.5 License

Stack Overflow, a popular programming website, has published a description of its architecture and its approach to scaling. The blog post has some excellent scaling tips, especially if you’re running a Windows/.NET stack.

The blog post also raises an interesting point regarding scaling up vs. out. Scaling up, a.k.a. vertical scalability, involves adding more memory and CPUs to a machine to gain performance improvements. Scaling out, or horizontal scalability, involves adding more machines to a system or part of the system in order to gain performance.

In recent years, scaling up has gotten bad press, because its finite limits prevent it from being used effectively by the biggest sites on the web, like Google and Facebook. There is a limit to the RAM and CPUs you can add to a single machine. Stack Overflow, however, has mainly used a scale-up strategy. The reality is that most sites don’t need to scale to Google or Facebook proportions, so the choice is not a binary scale-up vs. scale-out decision. Additionally, some tiers in a web architecture are easier to scale horizontally than others. For example, adding web servers is often a fairly straightforward proposition. Adding database servers is usually more challenging, because you will typically be looking at sharding, breaking off some piece of functionality like reporting, or moving to a master-slave setup.
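To give a feel for why the database tier is the harder one, here is a minimal sketch of the routing a master-slave setup implies: writes go to the master, and reads are spread across the slaves. The class name `ReplicatedRouter` and the server names are hypothetical, made up for illustration, and are not taken from the Stack Overflow post. A real setup also has to cope with replication lag, which this sketch ignores.

```python
import itertools


class ReplicatedRouter:
    """Picks a database server for a statement (illustrative only)."""

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, reads fall back to the master.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the server name that should run this statement."""
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            # Reads rotate round-robin across the slaves.
            return next(self._readers)
        # Everything else (INSERT, UPDATE, DELETE, DDL) goes to the master.
        return self.master


if __name__ == "__main__":
    router = ReplicatedRouter("db-master", ["db-slave-1", "db-slave-2"])
    for statement in ("SELECT * FROM posts", "UPDATE posts SET score = 1"):
        print(statement, "->", router.route(statement))
```

Note that the application now has to know which statements are safe to send to a slave. Adding a web server asks for nothing comparable.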

In general, scaling out is more practical if you run an open source stack, because each machine you add carries no extra licensing fee. Stack Overflow was built on a Windows/.NET foundation, where licensing costs provide a natural incentive to look at scaling up first, rather than scaling out. One take-away, to my mind, is that scaling up is a cost-effective solution at certain stages in the growth of a web application. Scaling a website is normally a gradual process of continually removing the next most pressing bottleneck. If you look at the growth patterns of larger websites, most started by plucking low-hanging fruit, which often means scaling up a bit, for example by adding more RAM or CPUs to the database machine. Eventually, if your site grows VERY large, the optimizations you make are likely to make your application more complicated and less flexible, so you pay a maintenance and design penalty for them. For example, database sharding, a scale-out strategy, tends to be done in the very late stages of growth, because it limits how effectively some queries can run, and even whether some queries are possible. Scaling up does not pose these problems.
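The sharding penalty is easier to see in code. Below is a minimal sketch, assuming a simple hash-based scheme in which each user's posts live on one shard. The class `ShardedPosts`, its methods, and the shard count are all hypothetical and chosen for illustration; plain dictionaries stand in for separate database servers.

```python
import zlib


class ShardedPosts:
    """Toy sharded store: each dict stands in for one database server."""

    def __init__(self, num_shards=4):
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, user_id):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(user_id.encode("utf-8")) % len(self.shards)

    def add_post(self, user_id, score):
        shard = self.shards[self.shard_for(user_id)]
        shard.setdefault(user_id, []).append(score)

    def posts_for_user(self, user_id):
        # Cheap: the shard key tells us exactly which server to ask.
        return list(self.shards[self.shard_for(user_id)].get(user_id, []))

    def top_scores(self, n):
        # Expensive: no single shard has the answer, so every shard is
        # queried and the results are merged in the application.
        merged = []
        for shard in self.shards:
            for scores in shard.values():
                merged.extend(scores)
        return sorted(merged, reverse=True)[:n]
```

Looking up one user's posts touches a single shard, but a site-wide "top posts" query has to visit all of them and merge the results itself. On a single scaled-up database, that second query would just be an ordinary `ORDER BY`.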

Scaling up is a strategy worth keeping in your toolbox. Even if your plan is mainly to scale out, adding more RAM and CPU to the boxes you are scaling can make sense. There is no shame in scaling “diagonally” when the situation warrants it. When choosing an architecture, know that architectures that only scale up will eventually hit a wall if enough growth occurs. That doesn’t mean your site will see enough traffic to hit that wall, and it doesn’t mean that scaling out can’t be done later. Realistic capacity planning is important at this stage. There is no need to design every app to reach Google scale. Sometimes vertical scaling is plenty, and when it is, it can be very cost-effective.
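As a back-of-the-envelope illustration of that kind of capacity planning, the sketch below estimates how long a scale-up-only design can last under steady compound growth. The function name, its parameters, and the sample figures are assumptions invented for this example, not measurements from any real site.

```python
import math


def months_until_limit(current_peak_rps, max_single_box_rps, monthly_growth):
    """Months of compound growth before peak load exceeds one big machine.

    current_peak_rps   -- today's peak load in requests per second
    max_single_box_rps -- what the largest machine you can buy could handle
    monthly_growth     -- growth per month as a fraction (0.10 means 10%)
    """
    if current_peak_rps >= max_single_box_rps:
        return 0.0  # already at the wall
    if monthly_growth <= 0:
        return math.inf  # flat or shrinking traffic never hits it
    # Solve current * (1 + g)^m = max for m.
    return math.log(max_single_box_rps / current_peak_rps) / math.log(1 + monthly_growth)


if __name__ == "__main__":
    # Hypothetical figures: 200 rps today, 3000 rps ceiling, 8% monthly growth.
    months = months_until_limit(200, 3000, 0.08)
    print(f"Roughly {months:.0f} months of headroom")
```

If the answer comes out in years rather than months, scaling up is probably plenty for now, and the scale-out work can wait until the numbers change.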

As an aside, and on the flip side, with the advent of cloud computing and virtualization, there is a certain extra appeal to tearing the cost of scaling free from the moorings of software license costs. An OSS approach, which avoids licensing fees, can make it very cost-effective to run a full-size test environment in the cloud, on demand. This can provide tremendous advantages when testing performance and experimenting with different configurations.
