Add Queue.workerLifetimeJitter to stagger worker shutdowns #476

Merged: dereuromark merged 3 commits into master from feat-worker-lifetime-jitter on Apr 27, 2026
@dereuromark (Owner)

Summary

Adds an optional Queue.workerLifetimeJitter config (seconds). Each worker picks a random offset in [0, workerLifetimeJitter] at startup and adds it to its effective workerLifetime / --max-runtime. When a fleet of workers is spawned at the same instant (ECS tasks, Kubernetes deployments, systemd unit with many instances), this prevents every worker from terminating on the same tick and producing a thundering-herd of simultaneous restarts.

Defaults to 0, so behavior is unchanged unless the option is set.

// config/app.php
$config['Queue']['workerLifetime'] = 300;
$config['Queue']['workerLifetimeJitter'] = 30; // workers now exit between 300s and 330s
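
To make the resulting exit window concrete, here is a minimal simulation of a small fleet using the values from the config above. The fleet size and variable names are assumptions chosen for illustration, and `random_int()` stands in for whatever random source the processor actually uses.

```php
<?php
declare(strict_types=1);

// Values taken from the config example above.
$workerLifetime = 300;
$workerLifetimeJitter = 30;

// Hypothetical fleet size, chosen only for illustration.
$fleetSize = 8;

$exitAfter = [];
for ($i = 0; $i < $fleetSize; $i++) {
    // Each worker draws its own offset once at startup (inclusive bounds).
    $exitAfter[] = $workerLifetime + random_int(0, $workerLifetimeJitter);
}

sort($exitAfter);

// Prints eight runtimes in seconds, each between 300 and 330.
echo implode(', ', $exitAfter), PHP_EOL;
```

With jitter set to 0, every entry would be exactly 300, which is the synchronized-exit case this option is meant to avoid.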

Credit

Idea and original implementation by Rommel Penaflor (@xrompdev) in #475, where it was proposed against a legacy CakePHP 2.x Symphosize fork and therefore could not be merged directly. This PR is a fresh port to the modern Queue\Queue\Processor, keeping the operational intent intact.

Implementation notes

  • Jitter is computed once per worker (right after $startTime = time() in Processor::run()), not re-rolled each loop iteration, so the exit time is stable per worker.
  • Extracted into Processor::computeLifetimeJitterOffset() so the bounds/default behavior is unit-testable without spinning up the full run loop.
  • Only applied when $maxRuntime > 0 — unlimited workers stay unlimited.
  • Jitter values <= 0 are ignored (the method returns 0), so a misconfigured negative value is a no-op rather than an error.
  • If jitter was applied, the worker logs "Applying worker lifetime jitter: +Ns seconds" so operators can see the stagger in action.
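
The notes above can be sketched roughly as follows. This is not the code from this PR: the function names are hypothetical stand-ins for Processor::computeLifetimeJitterOffset() and the run-loop check, and the use of `random_int()` is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for Processor::computeLifetimeJitterOffset().
 * Returns a random offset in [0, $jitter]; values <= 0 yield 0.
 */
function sketchJitterOffset(int $jitter): int
{
    if ($jitter <= 0) {
        return 0; // disabled or misconfigured: treated as a no-op
    }

    return random_int(0, $jitter); // inclusive on both ends
}

/**
 * Hypothetical helper: the runtime limit a worker would enforce.
 * Unlimited workers ($maxRuntime <= 0) are left untouched.
 */
function sketchEffectiveMaxRuntime(int $maxRuntime, int $jitter): int
{
    if ($maxRuntime <= 0) {
        return $maxRuntime;
    }

    return $maxRuntime + sketchJitterOffset($jitter);
}

// Rolled once at startup, then reused on every loop iteration,
// so the exit time stays stable for the life of the worker.
$startTime = time();
$limit = sketchEffectiveMaxRuntime(300, 30);

// The per-iteration check a run loop might perform.
$shouldExit = $limit > 0 && (time() - $startTime) >= $limit;
```

Keeping the offset computation in its own function is what lets the bounds and the negative-value case be tested without running a worker loop.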

Docs

Added a dedicated bullet in docs/sections/configuration.md directly after the workerLifetime section, explaining the ECS/K8s use case.

Each worker picks a random offset in [0, workerLifetimeJitter] at startup
and adds it to its effective lifetime. Prevents thundering-herd restarts
when a fleet of workers (e.g. ECS/Kubernetes tasks) is spawned at the
same instant and would otherwise all exit on the same tick.

Defaults to 0 (no jitter), preserving existing behavior.

Idea and original implementation (against a legacy CakePHP 2.x fork) by
Rommel Penaflor (@xrompdev) in #475; ported to the modern processor.

@codecov-commenter commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 94.11765% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 77.39%. Comparing base (5410512) to head (783614b).
⚠️ Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/Queue/Processor.php | 94.11% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #476      +/-   ##
============================================
+ Coverage     77.22%   77.39%   +0.17%     
- Complexity      949      966      +17     
============================================
  Files            45       45              
  Lines          3196     3247      +51     
============================================
+ Hits           2468     2513      +45     
- Misses          728      734       +6     

@dereuromark dereuromark marked this pull request as ready for review April 22, 2026 15:50
@dereuromark dereuromark requested a review from Copilot April 27, 2026 10:12

Copilot AI (Contributor) left a comment

Pull request overview

Adds an optional Queue.workerLifetimeJitter configuration to stagger worker shutdown times by adding a per-worker random offset to the effective worker lifetime / --max-runtime, reducing synchronized restarts in large fleets.

Changes:

  • Add per-worker lifetime jitter computation in Queue\Queue\Processor and apply it to the effective max runtime.
  • Add unit tests for jitter offset bounds/default behavior.
  • Document the new configuration option and add it to the example config.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| src/Queue/Processor.php | Computes a per-worker jitter offset and adds it to the effective max runtime; logs when jitter is chosen. |
| tests/TestCase/Queue/ProcessorTest.php | Adds unit tests for jitter offset default/bounds/negative handling. |
| docs/sections/configuration.md | Documents workerLifetimeJitter usage and intent (staggered shutdowns). |
| config/app.example.php | Adds workerLifetimeJitter to the example configuration. |

Comment thread src/Queue/Processor.php Outdated
Comment thread src/Queue/Processor.php Outdated
Comment thread docs/sections/configuration.md Outdated
@dereuromark dereuromark merged commit 39cf2e7 into master Apr 27, 2026
16 checks passed
@dereuromark dereuromark deleted the feat-worker-lifetime-jitter branch April 27, 2026 13:13