Node.js Clustering

Node.js is increasingly popular for building scalable network applications: its asynchronous, event-driven design lets it handle a large number of simultaneous connections with high throughput. Another reason people choose Node is its simplicity: the JavaScript in a Node process runs on a single thread, so you don't have to deal with multi-threading problems such as race conditions and deadlocks.

However, running on a single thread also means running on a single core, so a Node application doesn't take advantage of multi-core systems on its own, which can waste a large share of the available resources. Node's answer for scaling up is to run multiple child processes, called workers in Node's terminology, instead of a single process. This is done with the cluster module, which ships with Node.

The Node.js Cluster Module

In a nutshell, the cluster module allows you to spawn workers that all share server ports with the main Node process, also known as the master.

HOW DOES IT WORK?

The cluster module sets up a master process that is in charge of creating and controlling the workers (child processes). The master communicates with the workers over IPC (Inter-Process Communication) channels and distributes incoming connections among them in a round-robin fashion (except on Windows, where the OS handles the distribution of incoming connections).

Although it seems complex to implement the Node.js cluster, it's easier than you think. But first, let's start with an app with no cluster implementation.

Create a new file named server.js and write the following:

const http = require('http');

const server = http.createServer((req, res) => {  
  res.writeHead(200);
  res.end('Hello World');
});

server.listen(3000);  
console.log('server listening on port 3000');

Now, if you run node server.js and access localhost:3000, you should get the "Hello World" response back.

The idea behind the cluster module is quite simple: you decide which part of your code runs in the master and which part runs in the workers. Ideally, the master should not be a server itself; it should only create and manage workers. The cluster module lets you identify the master with the following check (Node 16 and later also expose it as cluster.isPrimary, with isMaster kept as a deprecated alias):

if(cluster.isMaster){  
}

Then, you can use the fork() method to create your workers. Let's modify our server and implement the cluster module:

'use strict';

const http = require('http');  
const os = require('os');  
const cluster = require('cluster');

if (cluster.isMaster) {  
  const cpus = os.cpus().length;

  for (let i = 0; i < cpus; i++) {
    cluster.fork();
  }
}
else {  
  const server = http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World');
  });

  server.listen(3000);
  console.log('server listening on port 3000');
}

Now, if you restart your application and take a look at the console, you will see "server listening on port 3000" printed once for each CPU core on your machine.

And of course, you can spawn as many workers as you want: you're not limited by the number of CPU cores on your machine, since a worker is nothing more than a child process.
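Since the worker count is just the upper bound of a loop, it can be made configurable. A small sketch, assuming a hypothetical WORKERS environment variable (not something Node or the cluster module defines) with the CPU count as the fallback; resolveWorkerCount is a name invented here:

```javascript
'use strict';

const os = require('os');

// WORKERS is a hypothetical environment variable invented for this sketch.
function resolveWorkerCount(requested, cpuCount) {
  const parsed = Number.parseInt(requested, 10);
  // Fall back to one worker per core when the value is missing or invalid.
  if (!Number.isInteger(parsed) || parsed < 1) {
    return cpuCount;
  }
  return parsed;
}

const workerCount = resolveWorkerCount(process.env.WORKERS, os.cpus().length);
console.log(`would fork ${workerCount} workers`);
```

In the clustered server, workerCount would replace cpus in the for loop. Going far beyond the number of cores typically brings little benefit for CPU-bound work, because the extra processes compete for the same cores.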

The above code will work just fine, but what if one or more workers fail? Fortunately, the cluster module allows you to detect failed processes and restart them as follows:

'use strict';

const http = require('http');  
const os = require('os');  
const cluster = require('cluster');

if (cluster.isMaster) {  
  const cpus = os.cpus().length;

  for (let i = 0; i < cpus; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
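    // Log the worker's process id; interpolating the worker object itself prints "[object Object]"
    console.log(`Worker ${worker.process.pid} died with code ${code} and signal ${signal}`);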
    console.log('Restarting worker');
    cluster.fork();
  });

}
else {  
  const server = http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World');
  });

  server.listen(3000);
  console.log('server listening on port 3000');
}
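The restart handler above forks a replacement unconditionally, so a worker that crashes on startup would be restarted forever, and a worker you disconnected on purpose would come back as well. Here is a hedged sketch of a more selective policy. worker.exitedAfterDisconnect is a real property of cluster workers; shouldRestart, superviseWorkers and MAX_RESTARTS are names and values invented for this example.

```javascript
'use strict';

const MAX_RESTARTS = 5; // hypothetical limit chosen for this sketch

// Decide whether a dead worker should be replaced.
function shouldRestart(worker, restartsSoFar) {
  // true when the worker was stopped on purpose via worker.disconnect()
  if (worker.exitedAfterDisconnect) {
    return false;
  }
  return restartsSoFar < MAX_RESTARTS;
}

// Attach the policy to the cluster module (or any object with on() and fork()).
function superviseWorkers(clusterModule) {
  let restarts = 0;

  clusterModule.on('exit', (worker, code, signal) => {
    if (!shouldRestart(worker, restarts)) {
      return;
    }
    restarts += 1;
    console.log(`Restarting worker (exit code ${code}, signal ${signal})`);
    clusterModule.fork();
  });
}
```

In the master branch of the server, the cluster.on('exit', ...) call would be replaced by superviseWorkers(cluster). A production setup would probably also reset the counter after a period of stable running; that is left out here to keep the sketch short.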

Clustering with PM2

An alternative to the built-in Node.js cluster module is PM2 by Keymetrics. With PM2, you don't have to write a single line of code to implement clustering in your application. Moreover, it will handle possible failures and restart your workers automatically.

To get started, install PM2 globally with npm:

npm install -g pm2  

Your server goes back to the plain version, without any cluster code:

const http = require('http');

const server = http.createServer((req, res) => {  
  res.writeHead(200);
  res.end('Hello World');
});

server.listen(3000);  
console.log('server listening on port 3000');

Then, start PM2 by typing:

pm2 start server.js  

This command starts your server in fork_mode as a single process, which PM2 monitors and restarts if it crashes.

To run several workers, start your server in cluster_mode by passing the number of instances with -i (use -i max for one worker per CPU core):

pm2 start server.js -i 4  


The benefit of using PM2 in cluster mode is that you can scale your application in real time: if you need more workers than are currently available, you can use PM2's scale command to adjust the size of your cluster:

pm2 scale [app-name] [no-of-workers]

#example:
pm2 scale server 8  

And of course, you can downscale with the same command.

Conclusion

Node's cluster module is a powerful tool that lets you scale your applications to improve performance. On the other hand, you can achieve the same thing with PM2, with added benefits such as live cluster scaling and process monitoring, among other things. You can read more about PM2 in its documentation.

Hope you've enjoyed this post.

Djamseed Khodabocus

Geek, blogger, programmer, coffee lover, music fan, tv nerd, photography enthusiast.

Mauritius
