Redis Lua scripting for library writers

Posted on June 15, 2020 by wjwh

The single-threaded nature of Redis makes a lot of concurrency problems much easier, since many issues just go away when it’s simply impossible for multiple operations to happen at the same time. Despite (or maybe because of) this, transactions in Redis are not as advanced as in many other databases. Using the MULTI/EXEC commands, you can queue up multiple commands to be executed atomically, but if the commands later in the transaction depend on the output of earlier commands this is not enough. There is a form of optimistic locking with the WATCH command, but it’s more a workaround than a rock-solid solution, since rollbacks are not supported and will have to be implemented client-side. Luckily, from version 2.6 of Redis onwards you can write your own custom commands as Lua scripts, and this is now the recommended way to implement more complicated transactions.
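To make that limitation concrete, here is a minimal sketch of the kind of read-then-write logic that MULTI/EXEC cannot express but a Lua script can. The "take from a counter only if enough is left" scenario, the constant name and the method name are all made up for illustration. The only assumption about the client is that it responds to eval with keys: and argv: keyword arguments, the way redis-rb clients do.

```ruby
# Lua source sent to Redis. KEYS[1] is the counter key, ARGV[1] the amount to take.
# The decision to decrement depends on the value read in the first line,
# and the whole script runs atomically on the server.
TAKE_IF_AVAILABLE = <<~LUA
  local current = tonumber(redis.call('GET', KEYS[1]) or '0')
  local wanted = tonumber(ARGV[1])
  if current >= wanted then
    redis.call('DECRBY', KEYS[1], wanted)
    return 1
  end
  return 0
LUA

# Hypothetical helper: returns true if the amount was taken, false otherwise.
# `redis` is assumed to behave like a redis-rb client.
def take_if_available(redis, key, amount)
  redis.eval(TAKE_IF_AVAILABLE, keys: [key], argv: [amount]) == 1
end
```

Something like take_if_available(redis, "stock:42", 3) would then either take three units or leave the counter untouched, with no window in between for another client to interfere.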

Invoking Lua scripts

Running a Lua script is fairly simple: you can use the EVAL command, which takes the script as one of its arguments. Once you have run a script, Redis will save it in the script cache and from then on you can use the EVALSHA command, which takes the SHA1 hash of a script instead of the script itself and then looks up the script in the script cache. This is obviously faster since fewer bytes have to be sent across the connection, but it will fail if the script is not present in the script cache for some reason. If you try to run a script with EVALSHA that is not present in the script cache, Redis will respond with an error of the form (error) NOSCRIPT No matching script. Please use EVAL.

This behavior is very nice if you have complete control over the Redis instance, since you can run the script(s) you want manually just once to get them into the script cache and from then on you can just use the EVALSHA command. However, there are many cases where you do not control the server. For example, you might be using a hosted Redis service that might reboot the server at an unpredictable moment for maintenance, or you might be working on library code that will be run by unknown users on unknown Redis instances. Relying on EVALSHA alone will not be possible, but using EVAL all the time is wasteful. Trying to maintain some internal state in the app server to run EVAL the first time and EVALSHA afterwards is still wasteful if there are multiple horizontally scaled app servers, since each of them will send the full script at least once. Even worse, if the Redis server gets rebooted but the app servers do not, they will remain under the impression that the script is in the script cache while it actually is not.

Exceptions to the rescue

Redis is brilliant as a simple way to share state between horizontally scaled servers. One of the libraries written at my previous job was Prorate, which turns a Redis server into a shared rate limiter. (There is also a Crystal version available.) For example, you can give it a ping with the client IP during every call to /auth and it will let you know whether that IP has made too many requests in the last few minutes. How many requests are “too many” is configurable. Most of the library is implemented as a Lua script that implements the Leaky Bucket algorithm. To make sure that the script is correctly loaded and that it’s the latest version, the Lua script is included in the library and its hash is computed at boot time by putting it into a constant. In both Ruby and Crystal, constants are initialised when the file is loaded for the first time, which in almost all cases will be during program startup:

require "digest"

LUA_SCRIPT_CODE = File.read(File.join(__dir__, "rate_limit.lua"))
LUA_SCRIPT_HASH = Digest::SHA1.hexdigest(LUA_SCRIPT_CODE)

By calculating the hash directly from the current version of the script, it’s impossible to forget to update the hash when updating the code. To run the script, we always try evalsha first. In the vast majority of cases, the script will already be present in the script cache and we can proceed as normal. If this is the first call for a new version of the script or if the script cache was somehow cleared, Redis will return an error which we can catch as a Redis::CommandError:

def run_lua_throttler(redis:, identifier:, bucket_capacity:,
                      leak_rate:, block_for:, n_tokens:)
  redis.evalsha(LUA_SCRIPT_HASH, [], [identifier, bucket_capacity,
                                      leak_rate, block_for, n_tokens])
rescue Redis::CommandError => e
  if e.message.include? "NOSCRIPT"
    # The Redis server has never seen this script before.
    # Needs to run only once in the entire lifetime of the Redis
    # server, until the script changes - in which case it will be
    # loaded under a different SHA.
    redis.script(:load, LUA_SCRIPT_CODE)
    retry
  else
    # Any other command error is not ours to handle: re-raise it as is.
    raise
  end
end

It is usually a code smell to use exceptions for control flow, but in this case encountering a NOSCRIPT error should be so rare that it is truly an “exceptional” case. We found this method of rescue Redis::CommandError and then checking for NOSCRIPT in the error message to be a powerful pattern that has worked flawlessly for many billions of calls by now across several of our applications.
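The fallback path is also easy to exercise without a server. The sketch below is not part of Prorate: FakeRedis and FakeCommandError are hypothetical stand-ins (for a Redis connection and for Redis::CommandError respectively) that only model the script cache, and run_script is a stripped-down version of the method above. It assumes nothing beyond Ruby's standard library.

```ruby
require "digest"

# Hypothetical stand-in for Redis::CommandError, so the sketch runs without the redis gem.
class FakeCommandError < StandardError; end

# Hypothetical in-memory stand-in for a Redis connection. It only models the script cache.
class FakeRedis
  attr_reader :loads

  def initialize
    @cache = {}
    @loads = 0
  end

  # Fails with a NOSCRIPT-style error unless the script was loaded before.
  def evalsha(sha, _keys, _argv)
    raise FakeCommandError, "NOSCRIPT No matching script. Please use EVAL." unless @cache.key?(sha)
    :ok
  end

  # Mimics SCRIPT LOAD: stores the script under its SHA1 and returns the digest.
  def script(subcommand, source)
    raise ArgumentError, "only :load is modelled" unless subcommand == :load
    @loads += 1
    sha = Digest::SHA1.hexdigest(source)
    @cache[sha] = source
    sha
  end

  # Models SCRIPT FLUSH or a server restart.
  def flush_scripts
    @cache.clear
  end
end

SCRIPT = "return 1"
SCRIPT_SHA = Digest::SHA1.hexdigest(SCRIPT)

# Try EVALSHA first; on NOSCRIPT, load the script and retry.
def run_script(redis, keys: [], argv: [])
  redis.evalsha(SCRIPT_SHA, keys, argv)
rescue FakeCommandError => e
  raise unless e.message.include?("NOSCRIPT")
  redis.script(:load, SCRIPT)
  retry
end

redis = FakeRedis.new
run_script(redis)   # cache miss: loads the script, then retries
run_script(redis)   # cache hit: no load needed
redis.flush_scripts # simulate a restart of the server
run_script(redis)   # cache miss again: the script is reloaded transparently
puts redis.loads    # prints 2
```

The caller never notices the flush: the script is loaded once per cache miss and every other call goes straight through evalsha.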

Conclusion

Redis is already amazing for single commands, but implementing transactions where later steps depend on the output of earlier ones can be tricky. Lua scripts present a very powerful solution, but are sometimes regarded as being “difficult to deal with”, especially in library code. Catching the NOSCRIPT error, loading the full script once with SCRIPT LOAD and then retrying is a great way to distribute custom Lua scripts for Redis with libraries. The code shown in this post is in Ruby, but the pattern is reusable across basically every language with a Redis library. Surprisingly, many people we’ve shown this to had not considered this pattern before. If you haven’t either, consider using Lua scripts in your next Redis library!