Why Node.js is awesome: A short history of web applications


Warning: This is something of a rant, and I have no numbers to back up my claims.

In the beginning, there was a 1:1 mapping between URLs and the filesystem from which they returned files. These files were heterogeneous, being both binary image files and text HTML files. Dynamic code was kept in cgi-bin directories and not usually mixed with everything else. Each HTTP request spawned a new web server process, which retrieved a file and died. And it was good.

Then, the SEO people came. They wanted cleaner URLs. And, MVC frameworks came. And they happened to provide clean URLs. And before you knew it, a URL no longer pointed directly to a file in your filesystem. And it was okay.

Then, we started doing more of the application development on the client side. Suddenly the client was smart, and only needed to grab small bits of XML or JSON or HTML data from the server, and would process it into the DOM somehow. And it was okay.

You see, most web apps use a traditional web server designed for this nearly antiquated model of handling occasional HTTP requests, each of which grabs a single file from the filesystem and returns it. This is usually not a good thing.

In an attempt to make our new, pretty URLs and MVC frameworks work with the old servers (read: Apache), we started to use funny files which awkwardly make the old server work with our new way of doing things. Thus the .htaccess / mod_rewrite combination was born.

The way it works with Apache is that when a file is requested, the directory the site is anchored to is checked for a .htaccess file. And the parent. And the grandparent. And so on until you hit the root. This is done for every single request. Now, this file, if it contains mod_rewrite directives, will specify a regular expression for matching the URL to a particular file. This was the ultimate hack to make Apache work with modern web applications.
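To make that per-request cost concrete, here is a rough sketch of the two steps described above: walking up the directory tree looking for .htaccess files, then matching the URL against a rewrite pattern. This is a simplified model and not Apache's actual implementation. The function names, the example rule, and the index.php target are all made up for illustration, and a real Apache setup only consults the directories where overrides are allowed.

```javascript
const path = require('path');

// Hypothetical helper: list every .htaccess path that would be checked for
// one request, walking from the requested file's directory up to the root.
function htaccessCandidates(filePath) {
  const candidates = [];
  let dir = path.posix.dirname(filePath);
  while (true) {
    candidates.push(path.posix.join(dir, '.htaccess'));
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return candidates;
}

// Hypothetical rewrite rule in the spirit of a mod_rewrite directive:
// a regular expression plus the script a matching URL should map to.
const rules = [
  { pattern: /^\/blog\/(\d+)$/, target: '/index.php?post=$1' },
];

// Return the rewritten URL for the first matching rule, if any.
function rewrite(url) {
  for (const rule of rules) {
    if (rule.pattern.test(url)) {
      return url.replace(rule.pattern, rule.target);
    }
  }
  return url; // no rule matched, so the path is used as-is
}
```

Calling htaccessCandidates('/var/www/site/index.php') yields four paths to check, one per directory level, and that lookup is repeated on every request before any rewrite rule is even considered.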

Now, not all situations are this bad. With Apache, one can disable the .htaccess files and keep it all in httpd.conf. With lighttpd, you can define these rules in the master lighttpd.conf file (which uses a pseudo-JSON syntax and resides in memory).

These are all awkward solutions for modern web application development. Why do we even need a web server anyway? All it does is take a request from a browser, figure out what file to grab or code to run, and return the result.

This is where Node.js got it right from the start. Why would someone need to keep the web server separate from the application code and keep a heterogeneous filesystem / URL scheme? Did you know that your cookies (up to roughly 4 KB per cookie) are sent back to the server with every single request to the same domain, even for images which don't care about them?

With a Node.js app, the server is built right into your code. If you are using something like Express, the code to handle this is really simple, and probably isn't any more complex than what your MVC framework is doing now:

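var express = require('express');
// Older Express releases used express.createServer(); current ones call express()
var app = express();
app.get('/', function(req, res){
    res.send('Hello World');
});
app.listen(3000);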
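The same idea can be sketched without Express at all, using only Node's built-in http module. This is a minimal illustration and not how Express works internally. The routes table, the dispatch helper, and the /about page are names I made up for the example.

```javascript
const http = require('http');

// Route table: clean URLs map straight to functions, not to files on disk.
const routes = {
  'GET /': function () { return 'Hello World'; },
  'GET /about': function () { return 'About this site'; },
};

// Pure lookup, so the routing decision can be exercised without a socket.
function dispatch(method, url) {
  const pathname = url.split('?')[0]; // ignore any query string
  const handler = routes[method + ' ' + pathname];
  if (!handler) return { status: 404, body: 'Not Found' };
  return { status: 200, body: handler() };
}

// The "web server" is just a function living inside the application.
const server = http.createServer(function (req, res) {
  const result = dispatch(req.method, req.url);
  res.writeHead(result.status, { 'Content-Type': 'text/plain' });
  res.end(result.body);
});
```

Calling server.listen(3000) starts accepting requests. Nothing here touches the filesystem or consults a rewrite rule: the URL is looked up in an in-memory table and a function runs.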

You'll probably want to bind your app to a certain (sub)domain, serve your static files from a different (sub)domain (e.g. a Content Delivery Network), and use a super simple "web server" for the CDN.
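As a rough idea of how little that super simple static server has to do, here is a sketch of its core: the old 1:1 mapping from URL path to file, plus a guard against paths that climb out of the static directory. The /srv/static directory, the MIME table, and the resolveStatic name are assumptions for illustration. A real CDN or static host does far more (caching headers, ranges, compression).

```javascript
const path = require('path');

// Hypothetical values: the directory name and type table are illustrative.
const STATIC_ROOT = '/srv/static';
const MIME_TYPES = {
  '.png': 'image/png',
  '.css': 'text/css',
  '.js': 'text/javascript',
};

// Map a request path to a file under STATIC_ROOT, or null if it is
// malformed or would escape the root directory.
function resolveStatic(urlPath) {
  let decoded;
  try {
    decoded = decodeURIComponent(urlPath.split('?')[0]);
  } catch (err) {
    return null; // malformed percent-encoding
  }
  // join() also normalizes, so any ".." segments are collapsed here.
  const filePath = path.posix.join(STATIC_ROOT, decoded);
  if (!filePath.startsWith(STATIC_ROOT + '/')) return null;
  const type = MIME_TYPES[path.posix.extname(filePath)] || 'application/octet-stream';
  return { filePath: filePath, type: type };
}
```

A request handler would pass req.url to resolveStatic, answer 404 on null, and otherwise stream the file with the returned type. Because this lives on its own (sub)domain, the application's cookies never ride along with image requests.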

Once you do this, the huge overhead of keeping a bunch of Apache processes in memory goes away. Your site can handle a lot more traffic. Since inter-process communication and filesystem reads are decreased dramatically, your application is a lot more efficient and you'll need fewer servers to scale.

Thomas Hunter II

Thomas has contributed to dozens of enterprise Node.js services and has worked for a company dedicated to securing Node.js. He has spoken at several conferences on Node.js and JavaScript and is an O'Reilly published author.