As I mentioned in the post on managing file uploads, the most common cause of an unresponsive Rails application is a handful of long-running requests consuming all your Rails processes. For managing uploads and downloads you can off-load the time-consuming work to Apache modules like mod-x-sendfile and modporter, but for areas where your application’s logic itself is the bottleneck you need to use message queues.
There are literally hundreds of different options available when choosing a message queue, so many that people often balk at the prospect of figuring out which product to use. There are several great presentations and articles comparing the pros and cons of the different options, so I’m not going to try to do that here. Most reviews of work-queue solutions focus on using hundreds of workers to handle millions of messages a day. We all love a good scaling story, but this tends to leave developers with the mistaken impression that workers and queues are only for the big guys. In reality most sites could benefit from offloading some work, and delayed_job makes it really easy to get started.
The delayed_job page on GitHub has a full set of installation instructions, including the migration you’ll have to run to create the jobs table. After that you’ll need to use your favourite process monitor to keep the workers running. Chris from GitHub was kind enough to publish their god config.
Picking Work to Off-Load
While you can use queues to solve a wide range of problems, by far the simplest to start with are ‘fire and forget’ tasks: work that needs to happen, but where the user doesn’t need to wait to see whether it succeeded before being able to proceed. For example:
- Resizing images and uploading them to S3
- Sending an Email
- Updating Solr
- Posting something to twitter
Each of these tasks can take a significant amount of time, either because of network timeouts or just the sheer amount of work involved. Some of them, like S3 uploads or sending an email, may also fail due to a transient glitch, and rather than show your user an error page you probably just want to retry a few times. Generally my advice is to look through all your after_save, after_create and after_destroy callbacks and evaluate which of them could be off-loaded.
Because the workers will be operating in a separate process inside a new database transaction you won’t be able to off-load any callbacks which rely on the transient state of an instance. Things like tracking attribute changes or relying on instance variables won’t work, but anything which just relies on the state in the database should be fine.
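To make that restriction concrete, here is a toy sketch in plain Ruby. It assumes the worker ends up with a freshly loaded copy of the record rather than the object from your request. FAKE_DB, Post, rename and title_changed are all hypothetical names made up for the illustration, and none of this is delayed_job's own code.

```ruby
# Toy stand-in for a database table: id => attributes.
# FAKE_DB and Post are hypothetical, for illustration only.
FAKE_DB = { 1 => { title: "Sunset" } }

class Post
  attr_reader :id, :title
  attr_accessor :title_changed # transient flag, never written to FAKE_DB

  def initialize(id, title)
    @id = id
    @title = title
    @title_changed = false
  end

  # Load a fresh copy from the "database", as a worker process would.
  def self.find(id)
    new(id, FAKE_DB.fetch(id)[:title])
  end

  def rename(new_title)
    @title = new_title
    @title_changed = true            # lives only in this object
    FAKE_DB[id][:title] = new_title  # persisted state
  end
end

# In the request: this instance knows its title just changed.
request_post = Post.find(1)
request_post.rename("Sunrise")

# In the worker: the record is reloaded, so only database state is visible.
worker_post = Post.find(request_post.id)

p request_post.title_changed # transient flag is set in the request
p worker_post.title_changed  # the reloaded copy never saw it
p worker_post.title          # the saved title did make it across
```

A callback that only needs the saved title would be safe to off-load here. One that branches on title_changed would quietly do the wrong thing.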
The killer feature that delayed_job has is send_later, which lets you transparently turn a method call on a class or object into a delayed job. For example:
    @photo.calculate_thumbnails # runs during your request
    @photo.send_later(:calculate_thumbnails) # runs in a worker at some later stage
It also supports declaring certain methods to be handled asynchronously in an environment file:
    # production.rb
    config.after_initialize do
      Photo.handle_asynchronously :calculate_thumbnails
    end
By making use of handle_asynchronously you can mark all the suitable callbacks to fire asynchronously without having to change any of your controllers. Just add an after_initialize block to your production environment and mark them for async handling.
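If it helps to see the idea behind these two calls, here is a toy model in plain Ruby. It is not delayed_job's implementation: the real thing stores jobs in a database table and runs them in separate worker processes, while this sketch uses an in-memory array. JOB_QUEUE, toy_send_later, toy_handle_asynchronously and work_off_queue are hypothetical names chosen so they can't be mistaken for the gem's API.

```ruby
# Toy in-memory queue; delayed_job itself keeps jobs in a database table.
JOB_QUEUE = []

module ToyDelay
  # Record "call this method on this object later" instead of calling it now.
  def toy_send_later(method_name, *args)
    JOB_QUEUE << [self, method_name, args]
    nil
  end
end

module ToyAsync
  # Rewrap an existing instance method so every call is queued instead of run.
  def toy_handle_asynchronously(method_name)
    original = instance_method(method_name)
    sync_name = :"#{method_name}_without_toy_delay"
    define_method(sync_name, original)     # keep the real work reachable
    define_method(method_name) do |*args|  # the public name now only enqueues
      toy_send_later(sync_name, *args)
    end
  end
end

class Photo
  include ToyDelay
  extend ToyAsync
  attr_reader :thumbnails

  def initialize
    @thumbnails = []
  end

  def calculate_thumbnails
    @thumbnails = [:small, :medium]
  end
end

# What a worker loop does, in miniature: pull jobs off the queue and run them.
def work_off_queue
  until JOB_QUEUE.empty?
    receiver, method_name, args = JOB_QUEUE.shift
    receiver.public_send(method_name, *args)
  end
end

photo = Photo.new
photo.toy_send_later(:calculate_thumbnails) # queued, nothing has run yet
work_off_queue                              # the "worker" does the work

Photo.toy_handle_asynchronously :calculate_thumbnails
other = Photo.new
other.calculate_thumbnails                  # same call as before, now queued
work_off_queue
```

The second half is the point of handle_asynchronously: the calling code is unchanged, and only the declaration decides whether the work happens inline or later.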
Things to Watch
Delayed job will automatically retry jobs on failure. This gives you free retries for transient errors like the ever-present Internal Server Errors from S3, but you should still check your jobs table to see whether any of the errors you’re receiving aren’t transient. Also, in its default configuration delayed_job deletes jobs which fail more than 25 times. It takes a long time to accrue that many failures, but you could still have jobs silently disappearing on you.
Delayed job is a fantastic, flexible and simple solution to async processing in a Rails application. While it may not be suitable for extremely high workloads, it will serve you well when you’re getting started.
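The difference between a transient failure and a job that silently disappears is easier to see in a toy retry loop. This is a sketch under simplifying assumptions: it retries immediately, where a real worker waits between attempts, and ToyJob, run_with_retries and the keep_failed flag are hypothetical names rather than delayed_job's API. The gem does have a destroy_failed_jobs setting for keeping failed jobs around; check the README for where it lives in your version.

```ruby
# Toy retry loop. MAX_ATTEMPTS mirrors the limit of 25 mentioned above;
# ToyJob, run_with_retries and keep_failed are hypothetical names.
MAX_ATTEMPTS = 25

ToyJob = Struct.new(:name, :work, :attempts, :last_error, :failed_at)

# Returns :done, :deleted or :kept so the caller can see what happened.
def run_with_retries(job, keep_failed: false)
  while job.attempts < MAX_ATTEMPTS
    begin
      job.work.call
      return :done
    rescue StandardError => e
      job.attempts += 1
      job.last_error = e.message # worth inspecting for non-transient errors
    end
  end
  return :deleted unless keep_failed # the silent disappearance
  job.failed_at = Time.now           # kept around so someone can look at it
  :kept
end

# A transient glitch: fails twice, then succeeds.
calls = 0
flaky = ToyJob.new("upload", -> { calls += 1; raise "S3 500" if calls < 3 }, 0)
p run_with_retries(flaky)

# A real bug: never succeeds, so by default it is eventually dropped.
broken = ToyJob.new("resize", -> { raise "undefined method" }, 0)
p run_with_retries(broken)
p run_with_retries(ToyJob.new("resize", -> { raise "undefined method" }, 0), keep_failed: true)
```

The flaky job is exactly what retries are for. The broken one burns through every attempt with the same error each time, which is why it pays to look at the recorded errors long before the limit is reached.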