Redis and Node Part 3: Atomicity with MULTI

This is part three of a four-part series on using Redis with Node.js. The content of these posts is partially adapted from my book, Advanced Microservices. There is also a companion presentation, Node, Redis, and You!, which I’ve given at several Meetups and a conference.

Individual Redis commands are atomic: each one is self-contained, and no other command can run in the middle of it and cause unexpected side effects. When we need to run multiple commands as a single atomic unit, however, the MULTI and EXEC commands become useful.

In this article we're going to build ourselves a distributed job scheduler. This scheduler will function within a pool of Node instances. Each Node instance will check for work to do every second. Any of the Node instances can, in theory, schedule jobs to be performed in the future. Jobs will be added to a Sorted Set, with the score being the Unix millisecond timestamp at which to perform the work and the value being a string describing the work to be performed.
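As a rough sketch of the scheduling side, the helper below computes the due timestamp and adds the job to the Sorted Set. The names `scheduleJob` and `delayMs` are illustrative and not part of the original code; `client` is assumed to be any object exposing a `zadd(key, score, member)` method, which is how the callback-style `redis` client is used later in this post.

```javascript
const JOBS = 'jobs'; // Sorted Set key, same name used throughout this post

// Hypothetical helper: schedule `description` to run `delayMs` after `now`.
// `client` is assumed to expose zadd(key, score, member).
function scheduleJob(client, description, delayMs, now = Date.now()) {
  const runAt = now + delayMs; // Unix millisecond timestamp, used as the score
  client.zadd(JOBS, runAt, description);
  return runAt;
}
```

With a connected client this would be called as `scheduleJob(redis, 'email user 1', 5 * 1000)`.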

Job Scheduler using Callbacks

First let's take a look at the naïve approach to building this application. We need to run two Redis commands; the first one gets a list of work to be performed. The second one removes the work that needs to be performed so that we don't attempt to do this same work again. We can do this by nesting the two Redis callbacks within each other:

const redis = require('redis').createClient();
const JOBS = 'jobs'; // Sorted Set

redis.zadd(JOBS, Date.now() + 5 * 1000, 'email user 1');
redis.zadd(JOBS, Date.now() + 10 * 1000, 'email user 2');

setInterval(() => {
  let now = Date.now();
  redis.zrangebyscore(JOBS, 0, now, (err, jobList) => { // get jobs until now
    if (err) return console.error(err); // jobList is undefined on failure
    redis.zremrangebyscore(JOBS, 0, now, (err) => { // delete jobs until now
      if (err) return console.error(err);
      console.log('jobs', jobList.length ? jobList : 'N/A', process.pid); // perform work
    });
  });
}, 1 * 1000);

The two important commands we're using are as follows:

ZRANGEBYSCORE: This gets the list of jobs that are due by now. We're asking for jobs with a score between 0 (the beginning of time) and the current time.

ZREMRANGEBYSCORE: This deletes the jobs that are due by now. It takes the same arguments, meaning that it should, in theory, delete the same jobs we just retrieved.

And of course we issued a couple of ZADD commands to schedule some work.

Now let's take a look at the output from the MONITOR Redis command. We can run an instance of redis-cli in a terminal and run “MONITOR” to get into this mode. This command shows us all commands executed on the Redis server, as well as information about the client executing each command. The client address on each line (port 20000 or 20001) shows which process issued which command.

1474247436.153554 [0 127.0.0.1:20000] "zrangebyscore" "jobs" "0" "1474247436153"
1474247436.153999 [0 127.0.0.1:20000] "zremrangebyscore" "jobs" "0" "1474247436153"
1474247437.155540 [0 127.0.0.1:20001] "zrangebyscore" "jobs" "0" "1474247437155"
1474247437.156171 [0 127.0.0.1:20001] "zremrangebyscore" "jobs" "0" "1474247437155"
1474247437.157580 [0 127.0.0.1:20000] "zrangebyscore" "jobs" "0" "1474247437157"
1474247437.158185 [0 127.0.0.1:20000] "zremrangebyscore" "jobs" "0" "1474247437157"
1474247438.161422 [0 127.0.0.1:20000] "zrangebyscore" "jobs" "0" "1474247438160"
1474247438.161558 [0 127.0.0.1:20001] "zrangebyscore" "jobs" "0" "1474247438160"
1474247438.162285 [0 127.0.0.1:20000] "zremrangebyscore" "jobs" "0" "1474247438160"
1474247438.162373 [0 127.0.0.1:20001] "zremrangebyscore" "jobs" "0" "1474247438160"
1474247439.164502 [0 127.0.0.1:20001] "zrangebyscore" "jobs" "0" "1474247439164"
1474247439.165080 [0 127.0.0.1:20001] "zremrangebyscore" "jobs" "0" "1474247439164"

Unfortunately we have a race condition in our code! We can see it happen in the output below.

jobs N/A 20000
jobs N/A 20001
jobs N/A 20000
jobs ['email user 1'] 20000
jobs ['email user 1'] 20001
jobs N/A 20001

The problem is that two Node instances can send interleaved commands to the Redis server. Specifically, the first process sends a ZRANGEBYSCORE, then the second process sends the same command, and only then do the two processes execute their deletions. This means both processes receive the same list of jobs to perform.
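The interleaving can be reproduced without a Redis server. The sketch below is a toy in-memory stand-in for the two commands (it is not Redis, and the variable names are made up for illustration), replaying the order in which the commands arrived.

```javascript
// Toy stand-in for the Sorted Set: an array of { score, member }.
// This only models the two commands used above; it is NOT Redis.
function zrangebyscore(set, min, max) {
  return set
    .filter((e) => e.score >= min && e.score <= max)
    .sort((a, b) => a.score - b.score)
    .map((e) => e.member);
}

function zremrangebyscore(set, min, max) {
  const before = set.length;
  // Remove in place so every "client" sees the change.
  for (let i = set.length - 1; i >= 0; i--) {
    if (set[i].score >= min && set[i].score <= max) set.splice(i, 1);
  }
  return before - set.length; // number of members removed
}

const jobs = [
  { score: 5000, member: 'email user 1' },
  { score: 10000, member: 'email user 2' },
];
const now = 6000; // both workers tick at roughly the same moment

// Interleaved order: both reads happen before either delete.
const seenByA = zrangebyscore(jobs, 0, now);       // worker A reads
const seenByB = zrangebyscore(jobs, 0, now);       // worker B reads too
const removedByA = zremrangebyscore(jobs, 0, now); // worker A deletes the job
const removedByB = zremrangebyscore(jobs, 0, now); // worker B deletes nothing

console.log(seenByA, seenByB, removedByA, removedByB);
```

Both workers end up holding 'email user 1', even though only one of the deletes actually removed it.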

Job Scheduler using MULTI/EXEC

Each time a new application connects to Redis it gets its own private connection for issuing commands. What we're going to do now is wrap our commands in a MULTI and EXEC. Once a connection sends MULTI, Redis queues that connection's subsequent commands instead of running them, until it receives the corresponding EXEC. Other connections can keep issuing commands in the meantime, and once we send EXEC all of our queued commands are executed atomically, with no risk of another connection's commands being interleaved.
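To make the queueing behaviour concrete, here is a toy model of two connections sharing one store. The `ToyConnection` class is invented for this illustration and says nothing about how Redis is implemented internally; it only mimics the rule that commands sent after MULTI are held back until EXEC and then run back to back.

```javascript
// Toy model of per-connection MULTI/EXEC queueing. NOT the Redis implementation.
class ToyConnection {
  constructor(store) {
    this.store = store; // shared object: { entries: [{ score, member }] }
    this.queue = null;  // null means "not inside a MULTI"
  }

  multi() {
    this.queue = [];
    return this;
  }

  zrangebyscore(min, max) {
    return this.send(() =>
      this.store.entries
        .filter((e) => e.score >= min && e.score <= max)
        .map((e) => e.member));
  }

  zremrangebyscore(min, max) {
    return this.send(() => {
      const before = this.store.entries.length;
      this.store.entries = this.store.entries
        .filter((e) => e.score < min || e.score > max);
      return before - this.store.entries.length;
    });
  }

  send(command) {
    if (this.queue === null) return command(); // outside MULTI: run immediately
    this.queue.push(command);                  // inside MULTI: only queue it
    return this;
  }

  exec() {
    const queued = this.queue;
    this.queue = null;
    return queued.map((command) => command()); // run the whole block as a unit
  }
}

const store = { entries: [{ score: 5000, member: 'email user 1' }] };
const a = new ToyConnection(store);
const b = new ToyConnection(store);

// Both connections queue their commands in an interleaved order...
a.multi().zrangebyscore(0, 6000);
b.multi().zrangebyscore(0, 6000);
a.zremrangebyscore(0, 6000);
b.zremrangebyscore(0, 6000);

// ...but nothing runs until EXEC, and each block then runs without interruption.
const repliesA = a.exec(); // [['email user 1'], 1]
const repliesB = b.exec(); // [[], 0]
```

Even though the commands were sent in the same interleaved order that caused the race earlier, only connection `a` receives the job.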

Luckily for us the Redis library provides a convenient means to do this. It makes use of the method chaining pattern, common in languages like JavaScript. This gives us an elegant syntax for running a MULTI command.

const redis = require('redis').createClient();
const JOBS = 'jobs'; // Sorted Set

redis.zadd(JOBS, Date.now() + 5 * 1000, 'email user 1');
redis.zadd(JOBS, Date.now() + 10 * 1000, 'email user 2');

setInterval(() => {
  let now = Date.now();
  redis.multi() // Same concept as a DB transaction
    .zrangebyscore(JOBS, 0, now) // get jobs until now
    .zremrangebyscore(JOBS, 0, now) // delete jobs until now
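    .exec((error, data) => {
      if (error) return console.error(error); // data is not usable on failure
      const jobList = data[0]; // reply from ZRANGEBYSCORE
      console.log('jobs', jobList.length ? jobList : 'N/A', process.pid); // perform work
    });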
}, 1 * 1000);

With these changes in place we no longer have to worry about different clients mixing commands.
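The EXEC callback receives one reply per queued command, in the order the commands were queued. For our two commands that means the list of members from ZRANGEBYSCORE followed by the number of members removed by ZREMRANGEBYSCORE. A small helper can make that shape explicit; `unpackJobs` is a hypothetical name used only for this sketch.

```javascript
// Hypothetical helper: unpack the EXEC reply for the two commands queued above.
// `replies` is assumed to be [membersFromZrangebyscore, countFromZremrangebyscore].
function unpackJobs(error, replies) {
  if (error || !Array.isArray(replies)) {
    return { jobs: [], removed: 0 }; // treat a failed transaction as "no work"
  }
  const [jobs, removed] = replies;
  return { jobs, removed };
}
```

Inside the `.exec()` callback this could be used as `const { jobs } = unpackJobs(error, data);`.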

Now let's take a peek at the output from the MONITOR command we used before:

1474250213.374094 [0 127.0.0.1:20000] "multi"
1474250213.374140 [0 127.0.0.1:20000] "zrangebyscore" "jobs" "0" "1474250213373"
1474250213.374174 [0 127.0.0.1:20000] "zremrangebyscore" "jobs" "0" "1474250213373"
1474250213.374200 [0 127.0.0.1:20000] "exec"
1474250213.377766 [0 127.0.0.1:20001] "multi"
1474250213.377821 [0 127.0.0.1:20001] "zrangebyscore" "jobs" "0" "1474250213377"
1474250213.377872 [0 127.0.0.1:20001] "zremrangebyscore" "jobs" "0" "1474250213377"
1474250213.377899 [0 127.0.0.1:20001] "exec"
1474250214.380577 [0 127.0.0.1:20000] "multi"
1474250214.380623 [0 127.0.0.1:20000] "zrangebyscore" "jobs" "0" "1474250214380"
1474250214.380657 [0 127.0.0.1:20000] "zremrangebyscore" "jobs" "0" "1474250214380"
1474250214.380682 [0 127.0.0.1:20000] "exec"

As you can see, each block of corresponding commands is executed together. The race condition we encountered before can no longer occur.

jobs N/A 20000
jobs ['email user 1'] 20001
jobs N/A 20000

If you ever find yourself needing to issue multiple commands atomically, where the output of one command doesn't affect the input of another command, MULTI and EXEC are here to help.


That's the end of part three. The next and final part of this series will deal with even more atomicity concerns and introduce Lua scripting. And of course, if you found this content useful, please check out my book Advanced Microservices.

Thomas Hunter II

Thomas has contributed to dozens of enterprise Node.js services and has worked for a company dedicated to securing Node.js. He has spoken at several conferences on Node.js and JavaScript and is an O'Reilly published author.