Sidekiq is too fast

... and databases are too slow

FYI: This was presented at the "Tokyo Rubyist Meetup" in mid-2023.

_________

In an application I'm working on, we needed to run a job after a record is created.

The record is created in multiple places. Since we are using Ruby on Rails, an easy way not to miss any of them is an after_create hook.

class User < ApplicationRecord
  after_create :schedule_job

  def schedule_job
    UserJob.perform_async(self.id)
  end
end

Write the Sidekiq job:

class UserJob
  include Sidekiq::Job

  def perform(user_id)
    user = User.find(user_id)
    call_clever_method(user)
  end
end

OK, job done! Shipped, and the feature is finished... Yes, but User.find(user_id) sometimes raises ActiveRecord::RecordNotFound for id=42, roughly 20~30% of the time.

There are multiple possibilities:

  1. the user_id is empty

  2. the record has already been deleted from the database

  3. the connection to the database is at fault

  4. something else, yet unknown

Obviously, it's the last option.

Let's see why the others are not the problem:

  1. the error is Couldn't find User with 'id'=42 (ActiveRecord::RecordNotFound), so the id is present in the message

  2. I can see the record in the database when investigating

  3. ActiveRecord::RecordNotFound is only raised after the database has responded, so the connection works

The timeline of what I thought was happening:

The id is returned from the database, then pushed to Redis for Sidekiq to process. It must exist, right?

A clever coworker reminded me about database transactions. A transaction in a database is like a branch in git: until the branch is merged into the main branch, it is as if its changes do not exist.

The real timeline of the problem:

The code was creating a transaction (implicitly, there was no .transaction call to be found). Sidekiq is fast enough to pick up the job before the database commits the transaction, so the job was querying the main "branch" of the database, where the record did not exist yet.

Thankfully, this happened on a create, so ActiveRecord's .find raised an exception. On an update, the record would have existed and the job would have read the old data. Depending on the situation, this can have serious implications.
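
The race can be reproduced without Rails or Sidekiq. The following is only a toy model: FakeDatabase, RecordNotFound and their methods are hypothetical names made up for the illustration, and a plain Ruby Queue stands in for Redis. It shows a job being picked up while the transaction is still open.

```ruby
# Toy model of the race. FakeDatabase and RecordNotFound are hypothetical,
# they are not ActiveRecord or Sidekiq APIs.
class RecordNotFound < StandardError; end

class FakeDatabase
  def initialize
    @committed = {} # rows visible to every connection
    @pending = {}   # rows written inside the open transaction
  end

  # Rows written inside the block become visible only once the block returns.
  def transaction
    yield
    @committed.merge!(@pending)
    @pending.clear
  end

  def insert(id, row)
    @pending[id] = row
  end

  # What another connection (the Sidekiq worker) can see.
  def find(id)
    @committed.fetch(id) { raise RecordNotFound, "Couldn't find User with 'id'=#{id}" }
  end
end

db = FakeDatabase.new
queue = Queue.new # stands in for Redis
results = []

db.transaction do
  db.insert(42, { name: "Alice" })
  queue << 42 # like after_create: the job is pushed before the commit

  # A fast worker picks the job up while the transaction is still open.
  job_id = queue.pop
  results << begin
    db.find(job_id)
  rescue RecordNotFound => e
    e.message
  end
end

# After the commit, the same lookup succeeds.
results << db.find(42)
```

In this sketch the first lookup fails and the second one succeeds, which mirrors what the Sidekiq job saw: the row only becomes visible to other connections once the transaction is committed.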

The solution

We want to push to Sidekiq only once the transaction is committed. In Rails, this is trivial: append _commit to get after_create_commit. This tiny change ensures the job is scheduled only after the changes are visible to every database connection.

With that change, here's the new timeline:

The full code; the change is small but crucial:

class User < ApplicationRecord
  after_create_commit :schedule_job # Change here

  def schedule_job
    UserJob.perform_async(self.id)
  end
end
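
To make the idea behind after_create_commit concrete, here is a minimal sketch of deferring work until the commit. It is not how Rails implements its callbacks: FakeDatabase and its after_commit method are hypothetical names for the illustration, and a plain Ruby Queue again stands in for Redis.

```ruby
# Toy model of "run this only after the commit". FakeDatabase is hypothetical,
# it is not an ActiveRecord or Sidekiq API.
class FakeDatabase
  def initialize
    @committed = {}    # rows visible to every connection
    @pending = {}      # rows written inside the open transaction
    @after_commit = [] # blocks to run once the transaction is committed
  end

  def transaction
    yield
    @committed.merge!(@pending)
    @pending.clear
    # Run the deferred blocks only now that the rows are visible.
    callbacks = @after_commit.dup
    @after_commit.clear
    callbacks.each(&:call)
  end

  def insert(id, row)
    @pending[id] = row
  end

  def after_commit(&block)
    @after_commit << block
  end

  def find(id)
    @committed.fetch(id)
  end
end

db = FakeDatabase.new
queue = Queue.new # stands in for Redis
queued_during_transaction = nil

db.transaction do
  db.insert(42, { name: "Alice" })
  db.after_commit { queue << 42 }         # deferred, like after_create_commit
  queued_during_transaction = queue.size  # nothing has been pushed yet
end

# The job reaches the queue only after the commit, so the lookup succeeds.
found = db.find(queue.pop)
```

The worker cannot see the job before the row is visible, because the push itself waits for the commit.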

A different solution

I presented this Sidekiq/transaction issue at the Tokyo Rubyist Meetup earlier this year (2023). After publishing the presentation, I got some feedback:

Mike Perham, the creator of Sidekiq, provided support on Ruby.social (a Mastodon instance for Rubyists).

Sidekiq 6.5 introduced transactional_push, which automatically delays pushing the job to the queue until the database has committed the transaction. There is no more need to remember to use after_create_commit instead of after_create; once enabled, it applies to every job pushed. Please note that it works when you use Sidekiq directly; the ActiveJob abstraction might interfere.

To enable that feature, add this to the config:

# config/initializers/sidekiq.rb:

Sidekiq.transactional_push!
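
As far as I know, the transactional client relies on the after_commit_everywhere gem, so the Gemfile probably needs an entry like the one below. Treat this as an assumption and check the Sidekiq documentation for your version.

```ruby
# Gemfile (assumed setup, verify against the Sidekiq docs for your version)
gem "sidekiq", ">= 6.5"
gem "after_commit_everywhere" # assumed dependency of the transactional client
```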

That's all for today. I welcome constructive critiques and comments, and please share.