This post is the third in a three-part series describing our investigations into scalability for a second screen application we built for PBS. You can read the series introduction here.
Being familiar with the stress testing procedure is all well and good, but that knowledge won’t really help you unless you have a server to test. In this post, we’ll cover how to build a production-ready server suitable for your shiny new Node.js application.
I should note that the DIY approach is not the only option. Services like Nodejitsu can make it very easy to get a Node.js application off the ground, and they should be considered before committing to a more hands-on approach. This article on Node.js hosting options should jump-start your research along these lines.
In order to give the most useful and direct advice, we’re going to make a few assumptions. The first is that you already have a server (be it your own or one rented from a VPS provider like Rackspace or Media Temple). The second is that the Debian operating system is installed*. Finally, this guide assumes you have root access to your machine.
Environment Setup
The default Debian install is really nice and all, but it could use a number of tweaks to be a secure Node.js server.
First off, you’ll want to use SSH keys to authenticate your connection. Here’s how you can add your public key (extension .pub) to the server’s list of authorized keys:
# change this to refer to your public key
IDENTITY_FILE=/path/to/your/public/key.pub
# change this to your server's address
HOST=140.241.251.240
ssh-copy-id -i $IDENTITY_FILE root@$HOST
(This guide will use environment variables to help you see the values that you can change.)
In order to discourage brute force password guessing attempts on your server, it’s a good idea to only allow the root user to log in with SSH keys. You can do this by setting the following value in /etc/ssh/sshd_config:
PermitRootLogin without-password
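Before reloading the SSH daemon, it can help to read the setting back and confirm that an earlier line in the file isn't overriding it. Here is a minimal sketch: the helper name permit_root_login is made up for illustration, and it deliberately ignores Match blocks. It leans on the documented sshd behavior of using the first value found for each keyword.

```shell
# Print the first uncommented PermitRootLogin value in the given config file.
# sshd uses the first value it obtains for a keyword, and keywords are
# case-insensitive, so that is the one that matters.
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$1"
}
```

On the server, you might run permit_root_login /etc/ssh/sshd_config, check the file's syntax with sshd -t, and then reload the daemon (service ssh reload on Debian). Keeping your current root session open until a fresh key-based login succeeds is a sensible precaution.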
Since the root user has absolute power over the system, it’s best to create a less privileged user to run the show. You can explicitly grant abilities to this user as appropriate.
export NODE_USER=bob
adduser $NODE_USER
Application Setup
Node.js has received quite a lot of attention in recent years, but it’s still no Python. You won’t find it installed by default: that will take some special effort. The most straightforward approach to installing Node.js is through the apt-get package manager:
sudo apt-get install nodejs
…but the version of Node.js available through the Debian package repository lags behind the latest stable release (at the time of this writing, 0.8.15 is considered stable while Debian officially supports 0.6.19). Especially in pre-alpha stages, so-called “minor” releases can have a significant impact on your application. All this means you will likely want to install the latest code from source.
For the most recent installation instructions, refer to the “Installation” page on the official Node.js wiki. We’ll offer a few suggestions on changes to that procedure here.
First of all, we recommend following the instructions for installing to a
custom folder. This gives you more fine-grained control over permissions.
As the root user, create a base directory namespaced under “joyent” (the main sponsor of Node.js):
mkdir -p /opt/joyent/node-0.8
# Making the installation directory owned by the Node user will allow it to
# globally install packages without additional permissions.
chown $NODE_USER:users /opt/joyent/node-0.8
# Making the default directory a symlink will make it easier to switch
# between multiple versions should that become necessary.
ln -s /opt/joyent/node-0.8 /opt/joyent/node
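To illustrate why the symlink is handy, here is a rough sketch of switching between two installations. The function name switch_node_version and the second version number in the usage note are hypothetical; only the node-0.8 directory comes from the steps above.

```shell
# Point BASE/node at BASE/node-VERSION, replacing any existing symlink.
# Refuses to switch if the target installation directory is missing.
switch_node_version() {
    local base="$1"
    local version="$2"
    if [ ! -d "$base/node-$version" ]; then
        echo "no such install: $base/node-$version" >&2
        return 1
    fi
    # -n keeps ln from descending into the directory the old link points at
    ln -sfn "$base/node-$version" "$base/node"
}
```

With a second build installed under, say, /opt/joyent/node-0.10, running switch_node_version /opt/joyent 0.10 would move the default over, and the same call with 0.8 would move it back. Putting /opt/joyent/node/bin on the PATH means nothing else needs to change.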
You can use apt-get to install a package’s build dependencies without having to look them up:
# Install packages necessary to build Node.js
apt-get build-dep nodejs
Since you are installing a version of Node.js that is newer than the one tracked by apt-get, it’s possible (though unlikely) that the build dependencies have changed. If so, you may have to explicitly install packages listed in the Node.js wiki.
Provided you are installing a tagged release (highly advisable for production), you can ensure that no tomfoolery took place in the download by checking the new file’s SHA against the one published in the release notes.
SOURCE_TAR=node-v0.8.15.tar.gz
shasum $SOURCE_TAR
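Comparing the two hashes by eye is error-prone, so you may prefer to let the shell do it. The sketch below assumes the published value is a SHA-1 digest (shasum's default algorithm) and uses sha1sum from coreutils, which is available on Debian. The function name verify_checksum is hypothetical, and the expected digest is something you paste in from the release notes yourself.

```shell
# Compare a file's SHA-1 digest against an expected value.
# Prints OK and returns 0 on a match; prints MISMATCH and returns 1 otherwise.
verify_checksum() {
    local file="$1"
    local expected="$2"
    local actual
    actual=$(sha1sum "$file" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

Usage would look like verify_checksum $SOURCE_TAR followed by the digest copied from the release notes. A non-zero exit status means the download should not be trusted.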
From here, you can switch into the Node user and proceed with the installation. Just be sure to alter the invocation of the “configure” script according to our earlier change:
./configure --prefix=/opt/joyent/node-0.8
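For orientation, the whole build might look roughly like the sketch below. It assumes the usual unpack, configure, make, make install sequence and that the tarball unpacks into a directory named after the file. The function names are made up for illustration, and the wiki remains the authority on the exact steps.

```shell
SOURCE_TAR=node-v0.8.15.tar.gz

# Name of the directory the tarball unpacks to (the file name minus .tar.gz).
source_dir_for() {
    echo "${1%.tar.gz}"
}

# Unpack, configure with the custom prefix, build, and install.
# Run this as the Node user, from the directory holding the tarball.
build_node() {
    tar -xzf "$SOURCE_TAR" &&
    cd "$(source_dir_for "$SOURCE_TAR")" &&
    ./configure --prefix=/opt/joyent/node-0.8 &&
    make &&
    make install
}
```

Because the install prefix is owned by the Node user, no sudo is needed for the final step. Afterwards, adding /opt/joyent/node/bin to that user's PATH makes the node and npm commands available.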
Next up, clone your project source code into place. These steps are version control system-specific, so I’ll trust that you know what you’re doing. You’ll likely have to install your VCS first, though.
Automating Deployment
Depending on how often you intend to ship changes, requiring “manual” updates (where someone connects to the production machine and pulls the latest code) may not be feasible. At best, it is a tedious task. At worst, it could slow your team down and open the door to any number of deployment errors. This makes automating deployment a very attractive option.
Before you go down this route, you should create a branch of your code base that is guaranteed to be “stable”. This way, your server will always be up-to-date with code that has been properly vetted for production. Altogether this process is referred to as “continuous deployment”. Creating a guaranteed-stable branch is beyond the scope of this post, but it is an absolutely necessary first step.
Possibly the simplest method to automate deployment is to script your production server to fetch the latest code at startup, before it actually executes your application. We’ll be writing a startup script later in this post, so if you decide to take this route, simply add in the appropriate fetch commands to that script.
Really though, this method is too passive to be optimal. You will likely ship new versions more often than you need to reboot your server, and you definitely want to minimize server downtime. A better solution would involve responding to new versions immediately by simply re-starting the application (not the entire server).
To achieve this, you can use web “hook” services provided by many source code hosting providers (including BitBucket and GitHub). Once enabled, the provider will respond to changes in the source code by sending an HTTP POST request to a destination of your choosing. A simple web server running on your production system can receive this request and pull the latest stable source as described above.
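The HTTP listener itself is left out here, but the part it would trigger can be sketched in a few lines of shell. This assumes your project lives in Git, that the remote is named origin, and that the vetted branch is called stable. All three are assumptions, as is the function name update_checkout.

```shell
# Bring the working tree at DIR in line with the latest BRANCH from origin.
# Any local modifications to tracked files in DIR are discarded.
update_checkout() {
    local dir="$1"
    local branch="$2"
    (
        cd "$dir" &&
        git fetch --quiet origin "$branch" &&
        git reset --quiet --hard FETCH_HEAD
    )
}
```

The hook handler would call something like update_checkout /home/$NODE_USER/app stable (the path is a placeholder) and restart the application only if that succeeds. Resetting rather than merging is a deliberate choice: the production checkout should never carry local edits, so there is nothing worth preserving.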
Scripting Startup
First, you’ll need a script that runs your application. If you have any assets that need building (not strictly necessary for a Node.js app) or if you want to pull the latest code from your repository, this is the place to do it:
#! /bin/bash
node server.js
Save this as startup.sh in the Node user’s home directory.
Now create a daemon to run this script at system startup. One way to accomplish this in Debian is with an init.d service. You can find the complete definition of the service here.
There’s a lot going on there, so I’ll lay out the most important aspects for our purposes:
Required-Start and Required-Stop: If your application relies on other services (such as, for instance, redis-server), include them in these lists.
ulimit -n 40960: Under the hood, Linux creates a file descriptor for each TCP connection. By default, your machine will likely not allow more than 1024 file descriptors to be open at any one time. In order for the process to support more simultaneous connections, you’ll have to set this value prior to invoking your startup script.
start-stop-daemon: This utility is doing most of the heavy lifting. It is running your startup script in the background and running it as the Node.js user you created.
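The last two items in the list above can be sketched as follows. This is not the linked service definition, only an illustration of the same idea: the pidfile path and the function names are assumptions, and the startup script location follows from saving startup.sh in the Node user's home directory.

```shell
NODE_USER=bob   # the unprivileged user created earlier

# Print the soft and hard limits on open file descriptors for this shell.
fd_limits() {
    echo "soft=$(ulimit -S -n) hard=$(ulimit -H -n)"
}

# Sketch of a "start" action: raise the descriptor limit, then launch the
# startup script in the background as $NODE_USER. Must be run as root.
start_app() {
    ulimit -n 40960
    start-stop-daemon --start --background \
        --make-pidfile --pidfile /var/run/node-service.pid \
        --chuid "$NODE_USER" \
        --exec "/home/$NODE_USER/startup.sh"
}
```

Calling fd_limits before and after the ulimit line is a quick way to see whether the new limit actually applied. Since the limit is inherited by child processes, setting it in the service script is enough for the Node.js process to pick it up.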
Save this as node-service in the root user’s home directory. Finally, make the system aware of it:
chmod 755 ./node-service
ln -s $(pwd)/node-service /etc/init.d/node-service
insserv -d ./node-service
Now your application will run automatically on system startup! The root user can manually stop and restart the process like so:
~/node-service stop
~/node-service start
Odds and Ends
If you’re using Socket.io, you will want to review the advised production settings described on their wiki page here. These tweaks can have an appreciable impact on performance–both from the client’s and server’s perspective.
You’re likely using NPM to manage Node.js packages. In this case, you’ll want to take some extra steps to secure those dependencies. NPM can be surprisingly fluid with dependencies out-of-the-box. For instance, even if your package.json file specifies all dependencies in terms of “absolute” semantic versions, any of those packages may specify its own dependencies as version ranges. This means that your dependencies’ dependencies (and recursively on down) may change at any time.
To get around this, you could version control the source code of all your dependencies, but that gets messy fast. Instead, consider using npm-lockdown. (NPM itself ships with a similar tool called shrinkwrap, but it doesn’t completely address the problem. You can read about why in npm-lockdown’s readme.) Those looking for a more thorough discussion on using NPM in production (including why the default fluidity is intentional) should check out episode 37 of the excellent Nodeup podcast.
Now that the server is raring to go, you’re likely interested in collecting performance data; maybe you’d even like to run some stress tests. The sar utility is intended to monitor system activity, and it should serve you well. Install it with:
sudo apt-get install sysstat
This utility can log statistics on a wide array of parameters, so check out the documentation (man sar) to learn more.
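As a starting point, here is one way you might record activity during a stress test and review it afterwards. The function names and the file path in the usage note are placeholders. The flags used are -o to save readings to a file, -f to read them back, and -u, -r and -n DEV for CPU, memory and network interface reports.

```shell
# Record system activity once per second for SECONDS_TO_RUN seconds,
# saving the readings in binary form to OUTFILE.
record_activity() {
    local seconds_to_run="$1"
    local outfile="$2"
    sar -o "$outfile" 1 "$seconds_to_run" >/dev/null
}

# Replay CPU, memory, and network interface figures from a recording.
report_activity() {
    sar -f "$1" -u
    sar -f "$1" -r
    sar -f "$1" -n DEV
}
```

For example, record_activity 300 /tmp/stress.sar would capture five minutes of data while the test runs, and report_activity /tmp/stress.sar would print the three reports once it finishes.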
Infinity and Beyond
If you’ve made it through all this, you have transformed your vanilla Debian server (quite a feat of engineering in its own right) into a badass rig to run your Node.js application. Congratulations!
But not so fast–before you unleash your application on the world, you really ought to do some stress testing. By simulating heavy load in a controlled environment, you can learn more about bottlenecks in your system (and hopefully avoid catastrophe). Lucky for you, this post is the last in a series dedicated to stress testing a real-time Node.js application.
The first part focuses on my experience stress testing a Node.js application, and part two describes the procedure itself.
* Why Debian, you ask? I wanted to follow best practices for whatever distro I used, and it just so happens that Debian developer Paul Tagliamonte co-works here at the Bocoup Loft. His knowledge and patience were invaluable as I navigated these waters. Thanks, Paul!