Multi-Instance Node.js App in PaaS Using Redis Pub/Sub

If you chose PaaS as hosting for your application, you probably had or will have this problem: Your app is deployed to small "containers" (known as dynos in Heroku, or gears in OpenShift) and you want to scale it.

In order to do so, you increase the number of containers—and every instance of your app is pretty much running in another virtual machine. This is good for a number of reasons, but it also means that the instances don't share memory.

In this tutorial I will show you how to overcome this little inconvenience.

When you chose PaaS hosting, I assume that you had scaling in mind. Maybe your site already witnessed the Slashdot effect or you want to prepare yourself for it. Either way, making the instances communicate with each other is pretty simple.

This article assumes that you already have a Node.js app written and running.

Step 1: Redis Setup

First, you have to prepare your Redis database. I like to use Redis To Go, because the setup is really quick, and if you are using Heroku there is an add-on (although your account must have a credit card assigned to it). There is also Redis Cloud, which includes more storage and backups.

From there, the Heroku setup is pretty easy: pick Redis To Go or Redis Cloud on the Heroku Add-ons page, or use one of the following commands (the first one is for Redis To Go, and the second one is for Redis Cloud):

$ heroku addons:add redistogo
$ heroku addons:add rediscloud

Step 2: Setting Up node_redis

At this point, we have to add the required Node module to the package.json file. We will use the recommended node_redis module, which is published on npm under the name redis. Add this line to your package.json file, in the dependencies section:

"redis": "0.11.x"

If you want, you can also include hiredis, a high-performance library written in C, which node_redis will use if it's available:

"hiredis": "0.1.x"

Depending on how you created your Redis database and which PaaS provider you use, the connection setup will look a bit different. You need the host, port, and password for your connection.

Heroku

Heroku stores everything in the config variables as URLs. You have to extract the information you need from them using Node's url module (config var for Redis To Go is process.env.REDISTOGO_URL and for Redis Cloud process.env.REDISCLOUD_URL). This code goes on the top of your main application file:

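var redis = require('redis');
var url = require('url');

// Heroku exposes the connection string as a URL, e.g. process.env.REDISTOGO_URL
var redisURL = url.parse(YOUR_CONFIG_VAR_HERE);
// createClient() takes the port first, then the hostname
// (redisURL.host would also include the port, so use hostname)
var client = redis.createClient(redisURL.port, redisURL.hostname);

// The auth part has the form "username:password"; only the password is needed
client.auth(redisURL.auth.split(':')[1]);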
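As a slightly more defensive variant of the snippet above, the parsing could be wrapped in a small helper. This is only a sketch: the name parseRedisUrl and the example URL are made up for illustration, and it assumes the URL has the redis://username:password@host:port shape that the add-ons put into the config var.

```javascript
var url = require('url');

// Hypothetical helper (not part of node_redis): splits a Redis URL
// into the pieces needed for createClient() and auth().
function parseRedisUrl(redisUrlString) {
    var parsed = url.parse(redisUrlString);
    var password = null;

    if (parsed.auth) {
        // auth is "username:password"; keep everything after the first colon
        // so that a password containing ':' is not cut short
        var separator = parsed.auth.indexOf(':');
        password = separator === -1 ? null : parsed.auth.slice(separator + 1);
    }

    return {
        host: parsed.hostname,                    // hostname only, without the port
        port: parseInt(parsed.port, 10) || 6379,  // 6379 is the default Redis port
        password: password
    };
}

// Made-up URL, for illustration only
var options = parseRedisUrl('redis://redistogo:s3cret@example.redistogo.com:9402/');
console.log(options.host, options.port);
```

The result can then be handed to redis.createClient(options.port, options.host), followed by client.auth(options.password) when a password is present.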

Others

If you created the database by hand, or use a provider other than Heroku, you should have the connection options and credentials already, so just use them:

var redis = require('redis');
// createClient() takes the port first, then the host
var client = redis.createClient(YOUR_PORT, YOUR_HOST);
client.auth(YOUR_PASSWORD);

After that we can start working on communication between instances.

Step 3: Sending and Receiving Data

The simplest example will just send information to other instances that you've just started. For example, you can display this information in the admin panel.

Before we do anything, create another connection named client2. I will explain why we need it later.

Let's start by just sending the message that we started. It's done using the publish() method of the client. It takes two arguments: the channel we want to send the message to, and the message's text:

client.publish('instances', 'start');

That's all you need to send the message. We can listen for messages in the message event handler (notice that we call this on our second client):

client2.on('message', function (channel, message) {

The callback is passed the same arguments that we pass to the publish() method. Now let's display this information in the console:

    if ((channel == 'instances') && (message == 'start'))
        console.log('New instance started!');
});

The last thing to do is to actually subscribe to the channel we will use:

client2.subscribe('instances');

We used two clients for this because when you call subscribe() on a client, its connection is switched to subscriber mode. From that point, it is essentially limited to the subscription commands: SUBSCRIBE and UNSUBSCRIBE, plus their pattern variants PSUBSCRIBE and PUNSUBSCRIBE. So while a client is in subscriber mode, it cannot publish() messages, and we need a second connection for that.
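Put together, the pieces from this step could be wrapped roughly as follows. This is a sketch rather than the article's code verbatim: announceStart is a hypothetical helper name, and it takes the two connections as arguments so that their roles stay visible.

```javascript
// Hypothetical helper: wires up the start announcement from this step.
// publisher and subscriber must be two separate connections, because the
// subscriber connection can no longer publish once subscribe() is called.
function announceStart(publisher, subscriber, log) {
    // Attach the handler before subscribing so that no message is missed
    subscriber.on('message', function (channel, message) {
        if (channel === 'instances' && message === 'start')
            log('New instance started!');
    });

    subscriber.subscribe('instances');
    publisher.publish('instances', 'start');
}

// Usage, assuming client and client2 were created as described above:
// announceStart(client, client2, console.log);
```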

If you want you can also send a message when the instance is being shut down—you can listen to the SIGTERM event and send the message to the same channel:

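process.on('SIGTERM', function () {
    // Exit from the publish callback, so the process does not
    // quit before the message has actually been sent
    client.publish('instances', 'stop', function () {
        process.exit();
    });
});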

To handle that case in the message handler add this else if in there:

    else if ((channel == 'instances') && (message == 'stop'))
        console.log('Instance stopped!');

So it looks like this afterwards:

client2.on('message', function (channel, message) {

    if ((channel == 'instances') && (message == 'start'))
        console.log('New instance started!');
    else if ((channel == 'instances') && (message == 'stop'))
        console.log('Instance stopped!');

});
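If you would rather show this in an admin panel than in the console, the same handler could keep a running count. A minimal sketch, assuming a hypothetical createInstanceCounter helper; note that it only counts events seen since this instance subscribed, so it is not an authoritative list of running instances.

```javascript
// Hypothetical helper: returns a 'message' handler plus a getter for the count.
function createInstanceCounter() {
    var count = 0;

    function handler(channel, message) {
        // Ignore anything that is not an instance announcement
        if (channel !== 'instances') return;

        if (message === 'start') count++;
        else if (message === 'stop' && count > 0) count--;
    }

    return {
        handler: handler,
        getCount: function () { return count; }
    };
}

// Usage, assuming client2 is the subscriber connection from above:
// var counter = createInstanceCounter();
// client2.on('message', counter.handler);
// client2.subscribe('instances');
```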

To test it locally, start your app a few times and see what happens in the console. If you want to test the termination message, don't press Ctrl+C in the terminal, because that sends SIGINT rather than SIGTERM; use the kill command instead. Note that Windows does not support the SIGTERM signal, so you can't check this part there.

First, use the ps command to check what id your process has—pipe it to grep to make it easier:

$ ps aux | grep your_apps_name

The second column of the output is the process ID you are looking for. Keep in mind that there will also be a line for the grep command you just ran. Now execute the kill command with signal 15, which is SIGTERM:

$ kill -15 PID

PID is your process ID.

Real-World Examples

Now that you know how to use the Redis Pub/Sub protocol, you can go beyond the simple example presented earlier. Here are a few use-cases that may be helpful.

Express Sessions

This one is extremely helpful if you are using Express.js as your framework. If your application supports user logins, or pretty much anything that uses sessions, you will want the user sessions to be preserved when the instance restarts, when the user requests a location that is handled by another instance, or when the user is switched to another instance because the original one went down.

A few things to remember:

The free Redis instances will not suffice: you need more memory than the 5MB/25MB they provide.
You will need another connection for this.

We will need the connect-redis module. The version depends on the version of Express you are using. This one is for Express 3.x:

"connect-redis": "1.4.7"

And this for Express 4.x:

"connect-redis": "2.x"

Now create another Redis connection named client_sessions. The usage of the module again depends on the Express version. For 3.x you create the RedisStore like this:

var RedisStore = require('connect-redis')(express);

And in 4.x you have to pass the express-session as the parameter:

var session = require('express-session');
var RedisStore = require('connect-redis')(session);

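After that the setup is nearly the same in both versions. The only difference is where the session middleware comes from: in 4.x it is the session function you just required, while in 3.x it is bundled with Express, so write express.session in place of session:

app.use(session({ store: new RedisStore({ client: client_sessions }), secret: 'your secret string' }));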

As you can see, we are passing our Redis client as the client property of the object passed to RedisStore's constructor, and then we pass the store to the session constructor.

Now if you start your app, log in, or initiate a session and restart the instance, your session will be preserved. The same happens when the instance is switched for the user.

Exchanging Data With WebSockets

Let's say you have a completely separated instance (worker dyno on Heroku) for doing more resource-eating work like complicated calculations, processing data in the database, or exchanging a lot of data with an external service. You will want the "normal" instances (and therefore the users) to know the result of this work when it's done.

Depending on whether you want the web instances to send any data to the worker, you will need one or two connections (let's name them client_sub and client_pub on the worker too). You can also reuse any connection that is not subscribing to anything (like the one you use for Express sessions) instead of the client_pub.

Now when the user wants to perform the action, you publish the message on the channel that is reserved just for this user and for this specific job:

// this goes into your request handler
// subscribe before publishing, so the worker's replies are less likely to be missed
client_sub.subscribe('JOB:USERID:JOBNAME:PROGRESS');
client_pub.publish('JOB:USERID:JOBNAME:START', JSON.stringify(THEDATAYOUWANTTOSEND));

Of course you'll have to replace USERID and JOBNAME with appropriate values. You should also have the message handler prepared for the client_sub connection:

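client_sub.on('message', function (channel, message) {

    // channel looks like JOB:USERID:JOBNAME:PROGRESS
    var USERID = channel.split(':')[1];

    // the job is finished, so there is nothing more to listen for
    if (message == 'DONE')
        client_sub.unsubscribe(channel);

    // forward the message to the user's socket
    sockets[USERID].emit(channel, message);

});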

This extracts the USERID from the channel name (so make sure you don't subscribe to channels not related to user jobs on this connection), and sends the message to the appropriate client. Depending on which WebSocket library you use, there will be some way to access a socket by its ID.
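Because the channel name carries the user ID and the job name, it can help to build and parse it in one place. A sketch with hypothetical helper names (jobChannel, parseJobChannel), assuming that user IDs and job names never contain a colon:

```javascript
// Builds a channel name such as JOB:42:resize:START
function jobChannel(userId, jobName, stage) {
    return ['JOB', userId, jobName, stage].join(':');
}

// Splits a channel name back into its parts; returns null for
// channels that do not follow the JOB:USERID:JOBNAME:STAGE layout
function parseJobChannel(channel) {
    var parts = channel.split(':');
    if (parts.length !== 4 || parts[0] !== 'JOB') return null;
    return { userId: parts[1], jobName: parts[2], stage: parts[3] };
}

console.log(jobChannel(42, 'resize', 'START')); // prints "JOB:42:resize:START"
```

Returning null for unrelated channels also gives the message handler a cheap way to ignore anything that is not a user job.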

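You may wonder how the worker instance can subscribe to all of those channels. Of course, you don't want to loop over all possible USERIDs and JOBNAMEs. The psubscribe() method accepts a pattern as the argument, so it can subscribe to all JOB:* channels. Keep in mind that messages matched by a pattern are delivered through the pmessage event, whose handler receives the pattern as an extra first argument, rather than through message: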

// this code goes to the worker instance
// and you call it ONCE
client_sub.psubscribe('JOB:*');
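On the worker, messages that match a pattern arrive through node_redis's pmessage event, which passes the pattern, the concrete channel, and the message. The sketch below is one possible wiring, not the only one: it assumes a narrower JOB:*:START pattern so that the worker does not receive its own progress messages, and a hypothetical runJob(jobName, data, done) function that stands in for the actual work.

```javascript
// Hypothetical wiring for the worker instance; call it once at startup.
// runJob(jobName, data, done) is a placeholder for your own job code.
function startWorker(client_sub, client_pub, runJob) {
    client_sub.on('pmessage', function (pattern, channel, message) {
        // channel looks like JOB:USERID:JOBNAME:START
        var parts = channel.split(':');
        var progressChannel = ['JOB', parts[1], parts[2], 'PROGRESS'].join(':');

        // The web instance sent the payload through JSON.stringify()
        runJob(parts[2], JSON.parse(message), function () {
            // Tell the web instance that the job has finished
            client_pub.publish(progressChannel, 'DONE');
        });
    });

    client_sub.psubscribe('JOB:*:START');
}
```

The worker still needs two connections here, because client_sub is in subscriber mode and cannot publish the DONE message itself.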

Common Problems

There are a few problems you may encounter when using Pub/Sub:

Your connection to the Redis server is refused. If this happens, make sure you provide proper connection options and credentials, and that the maximum number of connections has not been reached.
Your messages are not delivered. If this happens, check that you subscribed to the same channel you are sending messages on (seems silly, but sometimes happens). Also make sure that you attach the message handler before calling subscribe(), and that you call subscribe() on one instance before you call publish() on the other.
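For the first problem it helps to see the actual error. node_redis clients are event emitters and report connection failures through the error event; as with any Node.js event emitter, an error event without a listener crashes the process. A small sketch, with a hypothetical watchConnection helper and label argument:

```javascript
// Hypothetical helper: logs connection problems for one Redis client.
// label is just a name you choose, to tell several connections apart.
function watchConnection(client, label, log) {
    client.on('error', function (err) {
        log('[' + label + '] Redis error: ' + err.message);
    });
    client.on('ready', function () {
        log('[' + label + '] connected and ready');
    });
}

// Usage, assuming the connections created earlier in the article:
// watchConnection(client, 'publisher', console.log);
// watchConnection(client2, 'subscriber', console.log);
```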