More atomic poller operations #843
… something goes wrong during a large pop operation.
|
@bdurand Ugh, sorry to hear about that. What do you think about using a small batch size of, say, 20 rather than popping one at a time? |
|
I couldn't figure out a way to do small batches, since the zremrangebyscore method doesn't take a limit. I think popping them one at a time would do the most to reduce race conditions. The best way to solve the issue would be an atomic pop-and-push Lua script, but that would tie Sidekiq to Redis 2.6 (maybe a feature for 3.0). |
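For illustration only, the atomic pop-and-push idea mentioned above could look roughly like the sketch below. It is not code from this PR. It assumes Redis 2.6+ (for EVAL), a connection object that responds to redis-rb's `eval(script, keys:, argv:)`, and a single destination list; the names `POP_AND_PUSH_LUA`, `pop_and_push`, and `queue_key` are hypothetical. In Sidekiq the target queue comes from the job payload, so a real script would need more than this.

```ruby
# Lua source that Redis would run atomically (EVAL requires Redis 2.6+).
# KEYS[1] = sorted set of scheduled jobs, KEYS[2] = destination list (assumed),
# ARGV[1] = current time as a score.
POP_AND_PUSH_LUA = <<~LUA
  local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1], 'LIMIT', 0, 1)
  if due[1] then
    redis.call('ZREM', KEYS[1], due[1])
    redis.call('LPUSH', KEYS[2], due[1])
    return due[1]
  end
  return false
LUA

# Moves at most one due job from the sorted set to the list in a single
# server-side step. Returns whatever the connection's eval returns: the moved
# payload, or nil when nothing was due.
def pop_and_push(conn, sorted_set, queue_key, now = Time.now.to_f)
  conn.eval(POP_AND_PUSH_LUA, keys: [sorted_set, queue_key], argv: [now.to_s])
end
```

Because the read, remove, and push happen inside one script, there is no window in which a job has left the sorted set but not yet reached a queue.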
lib/sidekiq/scheduled.rb
I think the if is redundant, yes? The while should stop if message is falsy.
Yes, the if predates the while. I'll clean it up.
|
Yeah, you're right about the Redis commands. Maybe it's by design, but this loop feels dirty and I guess we can't make it any cleaner. Would you add a comment with your five-item list so the scheduler loop is documented in the code? Tests look fine. |
|
Thank you for the hard work in finding and fixing this issue! |


I recently had an issue where my application lost a very large number of jobs. The problem was with future scheduled jobs being popped from the scheduled queue but never pushed onto one of the worker queues. Here are the steps that happened:
The end result was that most of the jobs were irretrievably lost. This had also happened two weeks earlier during some other Redis maintenance.
This pull request changes the logic in Poller to pop messages one at a time from the retry and schedule queues and immediately push them to the appropriate worker queue. The new logic is:
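
As a rough illustration of the pop-one-and-push-immediately approach described above (not the PR's actual diff), here is a minimal, self-contained sketch. `FakeSortedSetStore` is a hypothetical in-memory stand-in for the Redis sorted-set calls so the example runs without a server; its `zrangebyscore` mirrors the redis-rb call shape with a `limit: [offset, count]` option. `enqueue_due_jobs` is an illustrative name, and the block passed to it stands in for pushing the job onto a worker queue.

```ruby
require 'json'

# Hypothetical in-memory stand-in for a Redis connection (sorted sets only).
class FakeSortedSetStore
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = {} } # set name => { member => score }
  end

  def zadd(key, score, member)
    @sets[key][member] = score.to_f
  end

  # Members with min <= score <= max, lowest score first, optionally limited.
  def zrangebyscore(key, min, max, limit: nil)
    lo = min == '-inf' ? -Float::INFINITY : min.to_f
    hi = max == '+inf' ? Float::INFINITY : max.to_f
    members = @sets[key].select { |_, score| score >= lo && score <= hi }
                        .sort_by { |member, score| [score, member] }
                        .map(&:first)
    limit ? members[limit[0], limit[1]] || [] : members
  end

  # True only for the caller that actually removed the member.
  def zrem(key, member)
    !@sets[key].delete(member).nil?
  end

  def zcard(key)
    @sets[key].size
  end
end

# Pops due jobs one at a time and hands each to the block right away.
def enqueue_due_jobs(conn, sorted_set, now)
  while (message = conn.zrangebyscore(sorted_set, '-inf', now, limit: [0, 1]).first)
    # If zrem fails, another process already claimed this job, so skip it.
    yield JSON.parse(message) if conn.zrem(sorted_set, message)
  end
end

store = FakeSortedSetStore.new
store.zadd('schedule', 1.0, JSON.generate({ 'class' => 'HardWorker', 'queue' => 'default' }))
store.zadd('schedule', 500.0, JSON.generate({ 'class' => 'LaterWorker', 'queue' => 'default' }))
enqueue_due_jobs(store, 'schedule', 10.0) { |job| puts "pushing #{job['class']} to #{job['queue']}" }
```

With this shape, a failure partway through strands at most the one job that was removed but not yet pushed, rather than an entire batch.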