Dev Post: Schedule Recurring Rails Tasks to Run at Any Interval with Simple Scheduler
January 30, 2017
Simple In/Out is a Rails app running on Heroku with several background jobs that need to be scheduled to run either hourly, daily, or weekly. Being on Heroku means we can easily use Heroku Scheduler to run a job every hour or every day. We could even use a daily job and check for the day of the week to accomplish weekly jobs.
We recently needed to schedule jobs to run more often than every hour, and Heroku Scheduler can no longer do everything we need it to do.
Introducing Simple Scheduler
Simple Scheduler is a scheduling add-on that is designed to be used with Sidekiq and Heroku Scheduler. It gives you the ability to schedule tasks at any interval without adding a clock process. The goal of Simple Scheduler is for it to be delightful to use and for it to ensure your scheduled jobs always run, even if there is server downtime.
Why We Created Simple Scheduler
We evaluated every possible option we could find, and every solution seemed to have the same flaw: What happens if your server is down when your job is supposed to run? TL;DR: Your job won’t run.
The existing solutions out there don’t keep track of whether a scheduled job was actually run. This means there is no way for your app to know that a job that was scheduled to run at 1:00 AM didn’t run because the server wasn’t running at 1:00 AM. What if it was a critical job that needed to run?
Another problem with certain solutions is that the job scheduling happens in the web process. This can be a huge problem if you’re on Heroku with multiple web dynos. Each web dyno would have its own job queue and every job would be duplicated.
One more thing we didn’t like about most other solutions is the cron format. We’re Rubyists, so we’re used to nice things. Why use such a hard-to-read format for scheduling your recurring jobs? Would you rather read cron: "0 * * * *" or every: "1.hour"? I’m nitpicking here, but I don’t want to remind myself how the cron format works every time I create or revisit a scheduled job.
How did we solve the problems with the existing solutions we evaluated?
Solution #1: Server Downtime
Simple Scheduler is set up to evaluate your configuration file every 10 minutes via Heroku Scheduler and queue jobs that need to run in the next 6 hours. A minimum of two jobs is always added to the Sidekiq::ScheduledSet. This ensures there is always one job in the queue that can be used to determine the next run time, even if one of the two was executed during the 10 minute scheduler wait time.
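The queue-ahead rule can be sketched in plain Ruby. This is only an illustration under stated assumptions, not Simple Scheduler's actual implementation: the method name run_times_to_queue and its parameters are hypothetical, and intervals are given as plain seconds.

```ruby
# Hypothetical helper (not the gem's code): list the run times to queue for one
# task, looking `queue_ahead` seconds into the future and never returning fewer
# than `minimum` entries.
def run_times_to_queue(first_run, every, now, queue_ahead: 6 * 60 * 60, minimum: 2)
  # Find the first run time that is not in the past.
  elapsed = now - first_run
  next_run = first_run
  next_run = first_run + (elapsed / every).ceil * every if elapsed > 0

  times = []
  run_at = next_run
  # Keep adding run times inside the window, then top up to the minimum.
  while run_at <= now + queue_ahead || times.size < minimum
    times << run_at
    run_at += every
  end
  times
end

now = Time.utc(2017, 1, 30, 0, 5, 0)

# An hourly task fills the whole 6 hour window.
hourly = run_times_to_queue(Time.utc(2017, 1, 30, 0, 0, 0), 60 * 60, now)

# A weekly task has no run inside the window, so the minimum of two applies.
weekly = run_times_to_queue(Time.utc(2017, 1, 28, 0, 0, 0), 7 * 24 * 60 * 60, now)
```

Under these assumptions the hourly task gets six queued run times and the weekly task gets two, which is why a next run time can always be derived from the queue.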
If your server is down when a job was scheduled to run, it obviously won’t run at the scheduled time, but the job will still be in the Sidekiq scheduled job queue. When your server comes back online, the job will execute immediately. This brings up another problem: What if my server is down for an hour and a job is scheduled to run every 10 minutes?
By default, every job that was scheduled will be run. This means if your server is down for an hour, your every-ten-minutes job will immediately run six times when the server comes back online. This may or may not be desirable, and there are two solutions for handling how these jobs behave.
Expected Run Time vs Actual Run Time
Simple Scheduler passes the expected run time to your job’s perform method if it accepts an argument. Let’s say you want to send a daily digest email to all users at a time they specify. Background jobs are never guaranteed to run at an exact time, so using Time.now to find all users who want to receive the email at 8:00 AM may not work. The job may not run until 8:01 AM, or worse, it may not run until 9:00 AM. Evaluating the expected run time passed to your background job will ensure you don’t send the email to the 9:00 AM users twice and skip the 8:00 AM users.
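Here is a minimal sketch of the daily digest idea in plain Ruby. The User struct, the DailyDigestJob class, and the digest_hour attribute are hypothetical, and the sketch assumes the expected run time arrives as seconds since the epoch; check the gem's README for the exact argument format. A real job would query the database and send email rather than return addresses.

```ruby
# Hypothetical stand-in for a user record; a real app would query its database.
User = Struct.new(:email, :digest_hour)

class DailyDigestJob
  def initialize(users)
    @users = users
  end

  # `scheduled_time` is the expected run time as seconds since the epoch
  # (an assumption for this sketch), not the moment the job actually starts.
  def perform(scheduled_time)
    expected = Time.at(scheduled_time).utc
    # Select by the expected hour, so a late start still picks the right users.
    @users.select { |user| user.digest_hour == expected.hour }.map(&:email)
  end
end

users = [User.new("a@example.com", 8), User.new("b@example.com", 9)]
job = DailyDigestJob.new(users)

# Even if this call happens at 9:00 AM, the 8:00 AM run selects the 8:00 AM users.
recipients = job.perform(Time.utc(2017, 1, 30, 8, 0, 0).to_i)
```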
If you have an hourly job that performs the same task every hour, you may not want it to run twice in a row if your server is down for over an hour. The Simple Scheduler config allows for an expires_after option on a scheduled task. If you set expires_after: "59.minutes" on an hourly job, only the most recent job will run because earlier jobs will be considered expired and won’t be run.
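The expiry rule can be illustrated with a small sketch. The expired? and runnable_jobs methods below are hypothetical and are not the gem's internals; they assume a job counts as expired once more than expires_after seconds have passed since its expected run time.

```ruby
# Hypothetical check (not the gem's internals): a job is expired when more than
# `expires_after` seconds have passed since its expected run time.
def expired?(scheduled_time, now, expires_after)
  return false if expires_after.nil? # no expiry configured, always run
  now - scheduled_time > expires_after
end

def runnable_jobs(scheduled_times, now, expires_after)
  scheduled_times.reject { |time| expired?(time, now, expires_after) }
end

# The server was down while three hourly jobs came due, and is back at 3:05 AM.
missed = [1, 2, 3].map { |hour| Time.utc(2017, 1, 30, hour, 0, 0) }
now = Time.utc(2017, 1, 30, 3, 5, 0)

with_expiry = runnable_jobs(missed, now, 59 * 60) # only the 3:00 AM job survives
without_expiry = runnable_jobs(missed, now, nil)  # default: all three run
```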
Solution #2: Multiple Heroku Dynos
Simple Scheduler is a rake task that is run every 10 minutes via Heroku’s Scheduler, so we’re not doing any job queuing in the web process. You can have as many web dynos as you need, and your queued jobs won’t be duplicated. Once queued as a scheduled job in Sidekiq, the rest is handled by your Sidekiq worker dyno(s).
Solution #3: Pretty Configuration File
I remember having the idea of a perfect YAML configuration file in my head before creating Simple Scheduler. I told Brandon, “Let me try some README-first development to solve this.” You can see my initial README here. I presented the README and said, “I don’t know if I can pull it off, but it’s awesome, right?” It is awesome. The Simple Scheduler configuration file is so pretty and easy to understand:
# Global configuration options and their defaults. These can also be set on each task.
queue_ahead: 360 # Number of minutes to queue jobs into the future
tz: nil # The application time zone will be used by default

# Runs once every 2 minutes
simple_task:
  class: "SomeActiveJob"
  every: "2.minutes"

# Runs once every day at 4:00 AM. The job will expire after 23 hours, which means the
# job will not run if 23 hours passes (server downtime) before the job is actually run
overnight_task:
  class: "SomeSidekiqWorker"
  every: "1.day"
  at: "4:00"
  expires_after: "23.hours"

# Runs once every half hour, starting on the 30 min mark
half_hour_task:
  class: "HalfHourTask"
  every: "30.minutes"
  at: "*:30"

# Runs once every week on Saturdays at 12:00 AM
weekly_task:
  class: "WeeklyJob"
  every: "1.week"
  at: "Sat 0:00"
  tz: "America/Chicago"
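To show how readable duration strings like these might be turned into numbers, here is a rough sketch using only Ruby's standard library. The duration_in_seconds helper and the UNIT_SECONDS table are hypothetical; how Simple Scheduler itself parses these values is not described here, so treat this as one possible approach.

```ruby
require "yaml"

# Hypothetical lookup table for the units used in the sample configuration.
UNIT_SECONDS = { "minute" => 60, "hour" => 3600, "day" => 86_400, "week" => 604_800 }.freeze

# Convert a string such as "2.minutes" or "1.day" into seconds.
def duration_in_seconds(text)
  match = /\A(\d+)\.(minute|hour|day|week)s?\z/.match(text)
  raise ArgumentError, "unrecognized duration: #{text.inspect}" unless match
  Integer(match[1]) * UNIT_SECONDS.fetch(match[2])
end

config = YAML.safe_load(<<~YAML)
  queue_ahead: 360
  simple_task:
    class: "SomeActiveJob"
    every: "2.minutes"
  overnight_task:
    class: "SomeSidekiqWorker"
    every: "1.day"
    at: "4:00"
    expires_after: "23.hours"
YAML

# Task entries are the keys whose values are hashes; the rest are global options.
tasks = config.select { |_name, options| options.is_a?(Hash) }
intervals = tasks.transform_values { |options| duration_in_seconds(options["every"]) }
```

Parsing a small, fixed set of units keeps the configuration readable without evaluating arbitrary Ruby from a YAML file.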
Start Using Simple Scheduler
I couldn’t be more proud of the work I put into making Simple Scheduler a pleasure to use and easy to understand, and I am so happy to be able to share it with the Rails community! This is some of the best code I’ve ever written, and it solves real problems.