PHP + Apache Stack vs Node.js
If you've been following my blog through the years, you'd know that I'm a big PHP fan. I earned my PHP5 ZCE (Zend Certified Engineer) certificate a few years back. I've built a couple hundred content-based websites using various Content Management Systems, as well as a dozen or so apps using different PHP Frameworks. I had a blast at the 2011 ZendCon, and I've even taught a PHP meetup for about nine months.
Honestly, I've been using both languages recently. At work our websites are built using PHP (although I'm finding more reasons for Node.js each day). For my side projects, I'm writing multiplayer web games and hardware interfacing software using Node.js.
Both environments have their pros and their cons, and neither language is the perfect solution for every project. In this post I'm going to compare and contrast the two environments, covering their strengths and weaknesses, and outline which is better for various situations.
Strengths of PHP
PHP is by far the most widely used server-side web programming language. It is old, and there is never a short supply of cheap, shared hosting providers. Some of the largest and most commonly used platforms and apps use PHP. WordPress, the most popular self-hosted blogging platform, is written in PHP. MediaWiki and Joomla are other common self-hosted apps that use PHP.
Some of the biggest websites use PHP, such as Facebook and Wikipedia. PHP uses traditional (read: familiar) object-oriented methodologies. There are loads of PHP web frameworks, and plenty of language documentation.
PHP is great for serving up content websites. PHP sits behind a web server, which can check to see if the file being requested exists in the filesystem. If so, the file can be served to the client without needing to run any PHP code. This isn't necessarily a pro of the language itself, but it is a beneficial side-effect for most situations.
PHP also has corporate backing from Zend (their tagline is “The PHP Company”). This backing is required by big corporations, which all share the same philosophy: “If something doesn't cost money, we don't want it.”
Weaknesses of PHP
PHP is not meant to be run for extended amounts of time. Many will argue with me here, but the language by default is set to terminate itself once it has been running for 30 seconds, or if it reaches a certain amount of memory usage. This can be disabled, and apps can be built to run for a long time successfully, but this is not where PHP shines.
The language isn't able to run code in parallel. You can, using tools like Gearman, pass off some work to be handled by other processes and kinda keep an eye on the progress, but the process is rather clunky, and it is not what PHP is intended for. _Gearman itself offers some other great features and can be used by many different environments, including Node.js._
Back in the day, when URLs and filesystems had a 1:1 mapping, it made perfect sense to have a web server separate from the language it is running. But, nowadays, any PHP app with attractive URLs running behind the Apache web server is going to need a .htaccess file, which tells the server a regular expression to check before serving up a file. Sound complex and awkward with unnecessary overhead? That's because it is.
Overall, the stack required to support PHP is overly complex when compared to something simpler like Node. One needs Apache, which has some global settings as well as site specific settings. One also needs PHP, which has global php.ini settings, some of which can be overridden at run time (but not all). There is also a bunch of old stuff left around which should be removed, e.g., y2k support (finally removed in 5.4).
The official website is quite ugly and outdated. The docs are okay and the user-contributed notes are very useful. However, updating the docs involves SVN and hacking away at XML files, which does not exactly encourage people to write docs.
Package management is virtually nonexistent. Sure, there is PEAR, but that tool is ridiculously painful to use. Some other package managers have appeared, e.g. Pyrus (PEAR2) and Packagist, but usage is so scattered that there is no de facto standard. There probably never will be, to be honest. There is also PHPClasses.org, but this site is painful to use and requires a signed-in user to browse.
Since a PHP process starts, does some boilerplate work, performs the task the user actually wants, and then dies, data is not persistent in memory. You can keep this data persistent using third-party tools like Memcache or a traditional database, but then there is the overhead of communicating with those external processes.
Strengths of Node.js
Node.js's biggest strength, IMO, is that it is event driven. Node apps run great over long periods of time. The event emitter code, while pretty simple at its core, provides a powerful and consistent interface for triggering code execution when needed.
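To give a feel for the event emitter interface, here is a minimal sketch using Node's built-in `events` module. The `network` emitter, the `in-range` event name, and the SSID strings are made up for illustration; they stand in for whatever actually fires events in your app.

```javascript
const { EventEmitter } = require('events');

// Hypothetical emitter standing in for something like a hardware watcher.
const network = new EventEmitter();
const seen = [];

// Register a listener; it runs every time the event is emitted.
network.on('in-range', (ssid) => {
  seen.push(ssid);
});

// Elsewhere in the app, something triggers the event.
network.emit('in-range', 'home-wifi');
network.emit('in-range', 'office-wifi');

console.log(seen); // [ 'home-wifi', 'office-wifi' ]
```

The listener and the code that emits the event don't need to know about each other, which is most of the appeal.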
Node has a web server built in. Some people call this a bad thing, I call those people crazy. Having the server built in means that you don't have the awkward .htaccess config thing going on. Every request is understood to go through the same process, without having to hunt through the filesystem and figure out which script to run.
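As a rough sketch of what "every request goes through the same process" can look like, here is a tiny server built on Node's `http` module. The `route` helper and the `/posts/:id` URL scheme are my own invention for this example, not part of Node.

```javascript
const http = require('http');

// Hypothetical router: every request is matched here, with no filesystem lookup.
function route(method, url) {
  if (method === 'GET' && url === '/') {
    return { status: 200, body: 'Home' };
  }
  const match = /^\/posts\/(\d+)$/.exec(url);
  if (method === 'GET' && match) {
    return { status: 200, body: `Post ${match[1]}` };
  }
  return { status: 404, body: 'Not found' };
}

const server = http.createServer((req, res) => {
  const { status, body } = route(req.method, req.url);
  res.writeHead(status, { 'Content-Type': 'text/plain' });
  res.end(body);
});

// Uncomment to start serving; the port number is arbitrary.
// server.listen(3000);
```

The pretty-URL handling that would otherwise live in a .htaccess regular expression sits right next to the code that answers the request.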
The number one bottleneck with web apps is not the time it takes to calculate CPU hungry operations, but rather network I/O. Node's I/O is non-blocking, so if you need to respond to a client request after making a database call and sending an email, you can start both actions at once and respond when both are complete.
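Here is one way the database-call-plus-email case might look. Note that `saveOrder` and `sendEmail` are hypothetical stand-ins that fake their I/O with timers; a real app would call its database driver and mail client instead.

```javascript
// Hypothetical stand-ins for a database driver and a mail client.
// Each one pretends to wait 50 ms on the network.
const saveOrder = (order) =>
  new Promise((resolve) => setTimeout(() => resolve({ id: 1, ...order }), 50));
const sendEmail = (to) =>
  new Promise((resolve) => setTimeout(() => resolve(`sent to ${to}`), 50));

async function handleCheckout(order) {
  // Both operations are started right away and wait on I/O at the same time.
  const [saved, mail] = await Promise.all([
    saveOrder(order),
    sendEmail(order.email),
  ]);
  return { saved, mail };
}

handleCheckout({ email: 'a@example.com', item: 'widget' }).then((result) => {
  console.log(result);
});
```

Because the two waits overlap, the response is ready after roughly the slower of the two operations rather than the sum of both.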
The package management system, npm, is great (although the website leaves much to be desired). Instead of having to follow strict guidelines and formatting requirements (e.g. PHP's PEAR), anyone can put anything into npm (even me!). It is similar to the Android market. Sure, you can get some crappy things out of it, but if one uses common sense they won't be downloading a virus.
Being so new, it doesn't have a lot of baggage left over from days of old. With the server built in, the stack is a lot simpler, there are fewer points of failure, and there is more control over what you can do with HTTP responses (ever try overwriting the name of the web server using PHP?).
Data can be persisted in memory very easily, so if you are sharing data between different clients (e.g. with a multiplayer game), this sharing of data is built in.
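A minimal sketch of what that sharing can look like: the `players` map below lives at module level, so it survives between requests for as long as the process runs. The function names and the scoreboard shape are assumptions made up for this example.

```javascript
// Module-level state: it lives as long as the Node process does.
const players = new Map();

// Hypothetical handlers, each called once per client request or message.
function join(name) {
  players.set(name, { score: 0 });
  return players.size;
}

function addPoints(name, points) {
  const player = players.get(name);
  if (!player) throw new Error(`unknown player: ${name}`);
  player.score += points;
  return player.score;
}

function scoreboard() {
  return [...players].map(([name, { score }]) => ({ name, score }));
}

join('alice');
join('bob');
addPoints('alice', 10);
console.log(scoreboard());
```

No Memcached, no database round trip; every client that asks for the scoreboard reads the same object in memory.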
Weaknesses of Node.js
Node.js is a very new, API unstable, untested platform. If you are going to be building a large corporate scale app with a long lifetime, Node.js is not a good solution. The Node APIs are changing almost daily, and large or long-term apps will need to be rewritten often.
If you are serving up a lot of static files, such as images, you don't want to use Node.js, otherwise you get back to the situation where you are checking the filesystem to see if things exist. This can be fixed by moving all static content to a subdomain or Content Delivery Network.
The persistent memory thing can be a little tricky. If you don't know what you're doing, you might accidentally share data between clients (which could be a disaster). Also, you are in bigger danger of memory leaks. If you keep appending data to an array in PHP, the script only really has a lifetime of 0.1 seconds, but your Node.js script needs to run forever, and you can easily blow it up over a period of time.
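To make the memory leak danger concrete, here is a small sketch comparing an array that grows forever with one that is capped. The cap of 1000 and the function names are arbitrary choices for illustration, and capping is just one possible fix among several.

```javascript
// A leak in waiting: this array grows for as long as the process runs.
const everyEvent = [];
function logUnbounded(event) {
  everyEvent.push(event);
}

// One possible fix: cap the history and drop the oldest entries.
const MAX_EVENTS = 1000; // arbitrary limit chosen for this sketch
const recentEvents = [];
function logBounded(event) {
  recentEvents.push(event);
  if (recentEvents.length > MAX_EVENTS) recentEvents.shift();
}

for (let i = 0; i < 5000; i++) {
  logUnbounded(i);
  logBounded(i);
}

console.log(everyEvent.length, recentEvents.length); // 5000 1000
```

In a short-lived PHP script the first version is harmless. In a Node process that stays up for weeks, it is the kind of thing that eventually takes the app down.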
Node is single threaded, even though it kind of appears to be multithreaded with the asynchronous execution it does. This doesn't really matter too much. When was the last time your app's biggest bottleneck was CPU instead of waiting on network I/O? So, while it would be cool to be multi-threaded, the lack of threads doesn't usually harm an app.
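If you want to see the single thread for yourself, this small sketch shows a timer that is due right away but can't fire until a CPU-bound loop lets go of the thread. The 100 ms busy loop is artificial and only there for the demonstration.

```javascript
const order = [];

// Scheduled to run as soon as possible, but it must wait for the thread to be free.
setTimeout(() => order.push('timer'), 0);

// CPU-bound work: nothing else can run while this loop is spinning.
const start = Date.now();
while (Date.now() - start < 100) {
  // busy-wait for roughly 100 ms
}
order.push('loop done');

// Print the order once both steps have had a chance to run.
setTimeout(() => console.log(order), 10); // [ 'loop done', 'timer' ]
```

Waiting on I/O never holds the thread like this, which is why the limitation rarely hurts in practice.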
Which should I learn?
This is a great question. If you don't have any experience developing server side software, you should probably start off with PHP. It is a great, easy language for performing the request -> response pattern which makes the stateless internet great. You will surely find more forum posts from people having the same problem as you (even though the posts might be 10 years old).
If you are looking to build other kinds of software, Node.js may be a safer bet. For example, I'm writing some software which requires code to be triggered by changes made to hardware, such as a wireless network coming into range. Writing this kind of software using PHP would be hacky and painful, but with Node.js it is a breeze.
Here are some quick examples of different programming situations and which language you should probably go with.
- Are you building some sort of daemon? Use Node.
- Are you making a content website? Use PHP.
- Do you want to share data between visitors? Use Node.
- Are you a beginner looking to make a quick website? Use PHP.
- Are you going to run a bunch of code in parallel? Use Node.
- Are you writing software for clients to run on shared hosts? Use PHP.
- Do you want to push events from the server to the client using websockets? Use Node.
- Does your team already know PHP? Use PHP.
- Are you building a command line script? Both work.
Here's a funny way of looking at things… To emulate the functionality of Node.js using PHP + Apache, you would need a couple other services running. To have Node act the same as PHP, you would simply run a bunch of synchronous code.
Node.js ≈ PHP + Apache + Memcached + Gearman - complexity