
Queueing Data with Node.js and Redis using BLPOP

As our web apps begin to scale, we often want to start thinking about collecting and analyzing data about our users. With that in mind, we want to make sure that we're optimizing the performance of our website so that the user experience doesn't take a hit. One of the best ways to do this is by taking the data and sending it to a secondary server, sometimes called a slave/minion/worker server, to do all the analysis. This way the main or 'master' server can focus on handling all the direct client-server interaction. That still leaves us with one question - how do we get the data to the worker?

That's where Redis comes into play. Redis is an in-memory key-value store. This means a few things:

  • Redis keeps its data in RAM rather than on disk. Therefore, don't rely on Redis alone for persistence, and always write important data to a disk-backed database. That said, Redis does provide configuration options for persisting data to disk, which you can learn about here (Thanks @itamarhaber)

  • Redis holds data as key-value pairs (where keys map to values in a dictionary-like model), and the values can be anything from strings, to lists, to hashes.

  • Redis is FAST! Because reads and writes are served from memory rather than from disk, there are significantly fewer operations for it to perform, which makes it incredibly speedy (you can read up on this here).

Redis also has queueing commands built in, which makes it a natural choice for passing data, or pointers to data, between servers. We can use a Redis list as the queue (a queue is a first-in, first-out data structure): as the master server places data at the back of the queue, the worker server grabs it from the front.
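To make the first-in, first-out behavior concrete, here is a minimal sketch that uses a plain JavaScript array as a stand-in for a Redis list. The rpush and lpop functions below are local helpers that only mimic the semantics of the Redis commands of the same name; nothing in this snippet talks to a Redis server.

```javascript
// In-memory stand-in for a Redis list (an assumption for illustration only).
var list = [];

// Mimics RPUSH: append to the tail and return the new length.
function rpush(value) {
  list.push(value);
  return list.length;
}

// Mimics LPOP: remove and return the head, or null when the list is empty.
function lpop() {
  return list.length > 0 ? list.shift() : null;
}

rpush('first');
rpush('second');
rpush('third');

// Items come back in the order they went in; the fourth pop finds nothing.
var popped = [lpop(), lpop(), lpop(), lpop()];
console.log(popped);
```

Pushing on the right and popping on the left is what gives us queue ordering; popping from the same side we push to would turn the list into a stack instead.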

In order to start we'll need to:

  1. Install Redis on our local machine/host (check out the quick start guide here).

  2. If you're on your local machine, start the server through the command line tool by typing:

    $ redis-server

  3. Install the Node Redis client module from your project directory:
    projectDir $: npm install redis

Next we require the module and create a client connection:

var redis  = require('redis');

var port   = process.env.REDIS_PORT || null;
var host   = process.env.REDIS_HOST || null;
var client = redis.createClient(port, host,{detect_buffers: true});

When no port or host is supplied, they default to 6379 and '127.0.0.1' respectively. The third argument lets you set any additional options that you may want. In this case, setting the detect_buffers option to true ensures that any data we send in as Node Buffer objects is handed back to our callbacks as Buffer objects too.
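If you would rather spell those defaults out than rely on null, a small helper can resolve the settings up front. This is only a sketch: resolveRedisConfig is a hypothetical name, and the example host and port values passed to it are made up.

```javascript
// Hypothetical helper: resolves connection settings from an environment-like
// object, falling back to the usual local Redis defaults.
function resolveRedisConfig(env) {
  return {
    port: parseInt(env.REDIS_PORT, 10) || 6379,
    host: env.REDIS_HOST || '127.0.0.1'
  };
}

// Nothing set: we get the local defaults.
var localConfig = resolveRedisConfig({});

// Both variables set (example values only): they win over the defaults.
var hostedConfig = resolveRedisConfig({
  REDIS_PORT: '6380',
  REDIS_HOST: 'redis.example.com'
});

console.log(localConfig, hostedConfig);
```

In the app itself you would pass process.env to the helper and hand the resulting port and host to redis.createClient in place of the two variables above.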

We are now connected! In order to place data in the queue we simply call the rpush method from the master server:

client.rpush('dataQueue', 'A string of data')

The first argument is the key of the list and the second is the actual data you want to pass in. If the list key doesn't already exist, Redis will create it for you.
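The example above pushes a plain string. If the data you want to analyze is an object, one common approach (an assumption here, not something the Redis client requires) is to serialize it to JSON before pushing and parse it again on the worker. The encodeJob and decodeJob helpers and the event fields below are hypothetical.

```javascript
// Hypothetical helpers; the event shape is made up for illustration.
function encodeJob(job) {
  return JSON.stringify(job);
}

function decodeJob(raw) {
  try {
    return JSON.parse(raw);
  } catch (e) {
    return null; // malformed payload: let the caller decide what to do
  }
}

var event = { userId: 42, action: 'page_view', path: '/pricing' };

var payload = encodeJob(event);   // what the master would pass to rpush
var decoded = decodeJob(payload); // what the worker would do after popping

console.log(payload, decoded);
```

On the master this would look like client.rpush('dataQueue', encodeJob(event)), with the worker calling decodeJob on whatever it pops.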

Now in order for the worker to grab data from the front of the queue, the worker calls the lpop method:

client.lpop('dataQueue', function(error, data){
  if (error) {
    // Stop here so we don't log a missing value below
    return console.error('There has been an error:', error);
  }
  console.log('We have retrieved data from the front of the queue:', data);
});

The first argument is once again the name of the list, and the second argument is the callback, which takes two arguments: an error and the item that was removed from the front of the list (or null if the list was empty).

Awesome, we now have a working queue system between the main server and the worker server, but there's a problem. How is the worker going to know when to pop from the queue without having to constantly check it? There are a number of ways to handle this, but the easiest is to just use the blpop method.

The blpop method works like the lpop method except for when the queue/list is empty. If the queue is empty, blpop will actually wait for an lpush or an rpush to occur on the queue, and only then will it pop the item and fire its callback. It's like setting up a one-time 'trigger' that fires the next time data gets placed into the queue; because it only fires once, we have to make sure to set the trigger up again every time it fires. Here's how to set up a blpop:

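var blpopQueue = function() {
  client.blpop('dataQueue', 0, function(err, data){
    if (err) {
      return console.error('There has been an error:', err);
    }
    // blpop replies with [listName, value]
    console.log('We have retrieved the data from the front of the queue:', data[1]);
    blpopQueue(); // set the 'trigger' up again for the next item
  });
};

blpopQueue();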

We create a function called blpopQueue that calls client.blpop. Blpop takes three arguments: the name of the list, a timeout specifying the number of seconds to block (how long to keep the 'trigger' running, where zero means block indefinitely), and a callback function. Unlike lpop, blpop hands the callback a two-element array containing the name of the list and the item that was popped. By invoking blpopQueue again inside the callback, we set up the blpop or 'trigger' again after each time it fires.
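To see the re-arming pattern run without a Redis server, here is a sketch that swaps the real client for a tiny in-memory stand-in. createFakeClient is a hypothetical helper written for this example: it ignores the timeout and calls back synchronously, whereas the real client is asynchronous, so treat it as a model of the control flow rather than of Redis itself.

```javascript
// In-memory stand-in for the Redis client (an assumption for illustration).
// It ignores the timeout and calls back synchronously to keep things short.
function createFakeClient() {
  var lists = {};
  var waiters = {};
  return {
    rpush: function (key, value) {
      var waiting = waiters[key] || [];
      if (waiting.length > 0) {
        // A blocked blpop is waiting: hand the value straight to it.
        waiting.shift()(null, [key, value]);
      } else {
        (lists[key] = lists[key] || []).push(value);
      }
    },
    blpop: function (key, timeout, callback) {
      var items = lists[key] || [];
      if (items.length > 0) {
        callback(null, [key, items.shift()]);
      } else {
        // Empty list: park the callback until the next push.
        (waiters[key] = waiters[key] || []).push(callback);
      }
    }
  };
}

var client = createFakeClient();
var received = [];

function blpopQueue() {
  client.blpop('dataQueue', 0, function (err, reply) {
    if (err) { return console.error(err); }
    received.push(reply[1]); // reply is [listName, value]
    blpopQueue();            // re-arm for the next item
  });
}

blpopQueue();                   // the queue is empty, so this waits
client.rpush('dataQueue', 'a'); // wakes the waiter, which re-arms itself
client.rpush('dataQueue', 'b');
console.log(received);
```

Each push wakes exactly one waiting blpop, and the callback immediately registers the next one, which is why the worker keeps receiving items without ever polling.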

Setting up the communication between the master server and the worker server this way promotes a fast and reliable queueing system. If you have any questions or comments, be sure to tweet me at @JoshSGman.

Additional Resources:

redis documentation
redis npm documentation
