Realtime Node.js App: Building a Server

This post is the third in a three-part series describing our investigations into scalability for a second screen application we built for PBS. You can read the series introduction here.

Being familiar with the stress testing procedure is all well and good, but that knowledge won’t really help you unless you have a server to test. In this post, we’ll cover how to build a production-ready server suitable for your shiny new Node.js application.

I should note that the DIY approach is not the only option. Services like Nodejitsu can make it very easy to get a Node.js application off the ground, and they should be considered before committing to a more hands-on approach. This article on Node.js hosting options should jump-start your research along these lines.

In order to give the most useful and direct advice, we’re going to make a few assumptions. The first is that you already have a server (be it your own or one rented from a VPS provider like Rackspace or Media Temple). The second is that the Debian operating system is installed*. Finally, this guide assumes you have root access to your machine.

Environment Setup

The default Debian install is really nice and all, but it could use a number of tweaks to be a secure Node.js server.

First off, you’ll want to use SSH keys to authenticate your connection. Here’s how you can add your public key (extension .pub) to the server’s list of authorized keys:

# change this to refer to your public key
IDENTITY_FILE=/path/to/your/public/key.pub
# change this to your server's address
HOST=140.241.251.240

ssh-copy-id -i $IDENTITY_FILE root@$HOST

(This guide will use shell variables to mark the values that you can change.)

In order to discourage brute force password guessing attempts on your server, it’s a good idea to only allow the root user to log in with SSH keys. You can do this by setting the following value in /etc/ssh/sshd_config:

PermitRootLogin without-password
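
If you would rather script that change than edit the file by hand, a minimal sketch might look like the following. The helper name set_permit_root_login is made up for illustration, and the sketch assumes GNU sed (the Debian default) and a config file that does not use Match blocks.

```shell
#!/bin/bash
# Hypothetical helper (not part of OpenSSH): set PermitRootLogin in a config file.
# Usage: set_permit_root_login <config-file> <value>
set_permit_root_login() {
  local config_file=$1 value=$2
  if grep -qE '^[[:space:]]*PermitRootLogin[[:space:]]' "$config_file"; then
    # An active directive already exists: rewrite it in place.
    sed -i -E "s/^[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin $value/" "$config_file"
  else
    # No active directive: append one to the end of the file.
    printf 'PermitRootLogin %s\n' "$value" >> "$config_file"
  fi
}

# As root, on the server:
#   set_permit_root_login /etc/ssh/sshd_config without-password
#   sshd -t && service ssh reload   # check the syntax, then apply the change
```

Keep your current SSH session open until a second login with your key succeeds; that way a mistake in the config cannot lock you out.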

Since the root user has absolute power over the system, it’s best to create a less privileged user to run the show. You can explicitly grant abilities to this user as appropriate.

export NODE_USER=bob
adduser $NODE_USER

Application Setup

Node.js has received quite a lot of attention in recent years, but it’s still no Python. You won’t find it installed by default: that will take some special effort. The most straightforward approach to installing Node.js is through Debian’s APT package manager:

sudo apt-get install nodejs

…but the version of Node.js available through the Debian package repository lags behind the latest stable release (at the time of this writing, 0.8.15 is considered stable while Debian officially supports 0.6.19). Especially for a pre-1.0 project, so-called “minor” releases can have a significant impact on your application. All this means you will likely want to install the latest code from source.

For the most recent installation instructions, refer to the “Installation” page on the official Node.js wiki. We’ll offer a few suggestions on changes to that procedure here.

First of all, we recommend following the instructions for installing to a custom folder. This gives you more fine-grained control over permissions. As the root user, create a base directory namespaced under “joyent” (the main sponsor of Node.js):

mkdir -p /opt/joyent/node-0.8
# Making the installation directory owned by the Node user will allow it to
# globally install packages without additional permissions.
chown $NODE_USER:users /opt/joyent/node-0.8

# Making the default directory a symlink will make it easier to switch
# between multiple versions should that become necessary.
ln -s /opt/joyent/node-0.8 /opt/joyent/node
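
Should you later install a second version alongside this one, switching amounts to repointing that symlink. The sketch below assumes every version lives in a node-<version> directory under the same base directory; switch_node_version is a hypothetical name, not a Node.js tool, and the 0.10 in the usage comment is only an example.

```shell
#!/bin/bash
# Hypothetical helper: point <base>/node at <base>/node-<version>.
# Usage: switch_node_version <base> <version>
switch_node_version() {
  local base=$1 version=$2
  if [ ! -d "$base/node-$version" ]; then
    echo "node-$version is not installed under $base" >&2
    return 1
  fi
  # -n replaces the existing symlink itself instead of creating a new
  # link inside the directory it currently points to.
  ln -sfn "$base/node-$version" "$base/node"
}

# Example, as root:
#   switch_node_version /opt/joyent 0.10
```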

You can use apt-get to install a package’s build dependencies without having to look them up:

# Install packages necessary to build Node.js
apt-get build-dep nodejs

Since you are installing a version of Node.js that is newer than the one tracked by apt-get, it’s possible (though unlikely) that the build dependencies have changed. If so, you may have to explicitly install packages listed in the Node.js wiki.

Provided you are installing a tagged release (highly advisable for production), you can ensure that no tomfoolery took place in the download by checking the new file’s SHA against the one published in the release notes.

SOURCE_TAR=node-v0.8.15.tar.gz
shasum $SOURCE_TAR

From here, you can switch into the Node user and proceed with the installation. Just be sure to alter the invocation of the “configure” script according to our earlier change:

./configure --prefix=/opt/joyent/node-0.8
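
Put together, the build might look like the sketch below. The configure, make, make install sequence is the usual source build; wrapping it in a function named build_node and adding the symlinked bin directory to PATH are additions made here for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper around the source build. Run it as the Node user,
# from inside the unpacked source directory.
build_node() {
  local prefix=${1:-/opt/joyent/node-0.8}
  ./configure --prefix="$prefix" && make && make install
}

# Reference the version-independent symlink rather than node-0.8 itself, so
# that switching versions later does not require touching PATH again.
export PATH="/opt/joyent/node/bin:$PATH"

# Example:
#   build_node && node --version
```

Adding the export line to the Node user's shell profile makes the node and npm commands available in later sessions as well.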

Next up, clone your project source code into place. These steps are version control system-specific, so I’ll trust that you know what you’re doing. You’ll likely have to install your VCS first, though.

Automating Deployment

Depending on how often you intend to ship changes, requiring “manual” updates (where someone connects to the production machine and pulls the latest code) may not be feasible. At best, it is a tedious task. At worst, it could slow your team down and open the door to any number of deployment errors. This makes automating deployment a very attractive option.

Before you go down this route, you should create a branch of your code base that is guaranteed to be “stable”. This way, your server will always be up-to-date with code that has been properly vetted for production. Altogether this process is referred to as “continuous deployment”. Creating a guaranteed-stable branch is beyond the scope of this post, but it is an absolutely necessary first step.

Possibly the simplest method to automate deployment is to script your production server to fetch the latest code at startup, before it actually executes your application. We’ll be writing a startup script later in this post, so if you decide to take this route, simply add in the appropriate fetch commands to that script.

Really though, this method is too passive to be optimal. You will likely ship new versions more often than you need to reboot your server, and you definitely want to minimize server downtime. A better solution would involve responding to new versions immediately by simply re-starting the application (not the entire server).

To achieve this, you can use web “hook” services provided by many source code hosting providers (including Bitbucket and GitHub). Once enabled, the provider will respond to changes in the source code by sending an HTTP POST request to a destination of your choosing. A simple web server running on your production system can receive this request and pull the latest stable source as described above.
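
Whatever receives the hook ends up running something like the following. This is only a sketch: update_checkout is a hypothetical helper, the branch name stable and the path in the usage comment are placeholders, and it assumes a Git checkout whose remote is named origin.

```shell
#!/bin/bash
# Hypothetical helper: make the checkout in DIR match the remote's BRANCH.
# Usage: update_checkout <dir> <branch>
update_checkout() {
  local dir=$1 branch=$2
  (
    cd "$dir" || exit 1
    git fetch --quiet origin || exit 1
    # Discard any local drift so production runs exactly what was published.
    git reset --quiet --hard "origin/$branch"
  )
}

# A hook handler with permission to control the service might then run:
#   if update_checkout /home/bob/app stable; then
#     /etc/init.d/node-service stop
#     /etc/init.d/node-service start
#   fi
```

Resetting to the remote branch, rather than merging, keeps the server from accumulating local changes that nobody reviewed.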

Scripting Startup

First, you’ll need a script that runs your application. If you have any assets that need building (not strictly necessary for a Node.js app) or if you want to pull the latest code from your repository, this is the place to do it:

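#!/bin/bash
# Build assets or pull the latest code here, then hand off to Node.js.
# exec replaces this shell, so signals sent to the script reach Node directly.
exec node server.js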

Save this as startup.sh in the Node user’s home directory.

Now create a daemon to run this script at system startup. One way to accomplish this in Debian is with an init.d service. You can find the complete definition of the service here. There’s a lot going on there, so I’ll lay out the most important aspects for our purposes:

  • Required-Start and Required-Stop: If your application relies on other services (such as, for instance, redis-server), include them in these lists
  • ulimit -n 40960: Under the hood, Linux creates a file descriptor for each TCP connection. By default, a process is typically limited to 1024 open file descriptors at any one time. In order for your application to support more simultaneous connections, you’ll have to raise this limit prior to invoking your startup script.
  • start-stop-daemon: This utility is doing most of the heavy lifting. It is running your startup script in the background and running it as the Node.js user you created.
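
To make those three pieces concrete, here is a heavily trimmed sketch of such a service, written out with a heredoc so that it lands in ./node-service. It is not the full definition linked above: the user bob, the startup script location, the working directory, and the PID file path are all assumptions to adjust, and it tracks the process correctly only if startup.sh hands off to Node with exec rather than leaving a shell wrapper in between.

```shell
#!/bin/bash
# Write a minimal, illustrative init.d service definition to ./node-service.
cat > ./node-service <<'EOF'
#!/bin/sh
### BEGIN INIT INFO
# Provides:          node-service
# Required-Start:    $remote_fs $network
# Required-Stop:     $remote_fs $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Node.js application (illustrative sketch)
### END INIT INFO

# Assumed values: change these to match your setup.
NODE_USER=bob
STARTUP_SCRIPT=/home/$NODE_USER/startup.sh
PIDFILE=/var/run/node-service.pid

case "$1" in
  start)
    # Raise the open-file limit before launching; the daemon inherits it.
    ulimit -n 40960 || echo "could not raise the file descriptor limit" >&2
    # Run the startup script in the background as the unprivileged user.
    start-stop-daemon --start --quiet --background \
      --make-pidfile --pidfile "$PIDFILE" \
      --chuid "$NODE_USER" --chdir "/home/$NODE_USER" \
      --startas "$STARTUP_SCRIPT"
    ;;
  stop)
    # Signal the recorded process and wait up to 10 seconds for it to exit.
    start-stop-daemon --stop --quiet --retry 10 --pidfile "$PIDFILE"
    rm -f "$PIDFILE"
    ;;
  *)
    echo "Usage: $0 {start|stop}" >&2
    exit 1
    ;;
esac
EOF
```

Run it from the root user's home directory so the file ends up where the commands that follow expect it.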

Save this as node-service in the root user’s home directory. Finally, make the system aware of it:

chmod 755 ./node-service
ln -s $(pwd)/node-service /etc/init.d/node-service
insserv -d ./node-service

Now your application will run automatically on system startup! The root user can manually stop and restart the process like so:

~/node-service stop
~/node-service start

Odds and Ends

If you’re using Socket.io, you will want to review the advised production settings described on their wiki page here. These tweaks can have an appreciable impact on performance–both from the client’s and server’s perspective.

You’re likely using NPM to manage Node.js packages. In this case, you’ll want to take some extra steps to secure those dependencies. NPM can be surprisingly fluid with dependencies out-of-the-box. For instance, even if your package.json file specifies all dependencies in terms of “absolute” semantic versions, any of those packages may specify a version range. This means that your dependencies’ dependencies (and recursively on down) may change at any time.

To get around this, you could version control the source code of all your dependencies, but that gets messy fast. Instead, consider using npm-lockdown. (NPM itself ships with a similar tool called shrinkwrap, but it doesn’t completely address the problem. You can read about why in npm-lockdown’s readme.) Those looking for a more thorough discussion on using NPM in production (including why the default fluidity is intentional) should check out episode 37 of the excellent Nodeup podcast.

Now that the server is raring to go, you’re likely interested in collecting performance data; maybe you’d even like to run some stress tests. The sar utility is intended to monitor system activity, and it should serve you well. Install it with:

sudo apt-get install sysstat

This utility can log statistics on a wide array of parameters, so check out the documentation (man sar) to learn more.
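
For a stress test in particular, it helps to record a fixed window of samples to a file and read them back afterwards. A minimal sketch, in which record_activity and the output path are hypothetical names while -o, -f, -u, -r and -n are standard sar options:

```shell
#!/bin/bash
# Hypothetical helper: record COUNT samples, INTERVAL seconds apart, to OUTFILE.
# Usage: record_activity <outfile> <interval> <count>
record_activity() {
  local outfile=$1 interval=$2 count=$3
  # -o saves the readings in binary form; the on-screen report is discarded.
  sar -o "$outfile" "$interval" "$count" >/dev/null
}

# Record five minutes of activity at one-second resolution during a test run:
#   record_activity /tmp/stress-test.sar 1 300
# Then read back the parts you care about:
#   sar -f /tmp/stress-test.sar -u        # CPU utilization
#   sar -f /tmp/stress-test.sar -r        # memory utilization
#   sar -f /tmp/stress-test.sar -n SOCK   # sockets in use
```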

Infinity and Beyond

If you’ve made it through all this, you have transformed your vanilla Debian server (quite a feat of engineering in its own right) into a badass rig to run your Node.js application. Congratulations!

But not so fast–before you unleash your application on the world, you really ought to do some stress testing. By simulating heavy load in a controlled environment, you can learn more about bottlenecks in your system (and hopefully avoid catastrophe). Lucky for you, this post is the last in a series dedicated to stress testing a real-time Node.js application.

The first part focuses on my experience stress testing a Node.js application, and part two describes the procedure itself.


* Why Debian, you ask? I wanted to follow best practices for whatever distro I used, and it just so happens that Debian developer Paul Tagliamonte co-works here at the Bocoup Loft. His knowledge and patience were invaluable as I navigated these waters. Thanks, Paul!

This entry was posted by Mike Pennisi (@jugglinmike) on January 14, 2013 in JavaScript, Node.js and Feature.
