Skip to content

Make workerkey the canonical worker identity#494

Merged
dereuromark merged 2 commits into
masterfrom
refactor/workerkey-canonical-identity
May 8, 2026
Merged

Make workerkey the canonical worker identity#494
dereuromark merged 2 commits into
masterfrom
refactor/workerkey-canonical-identity

Conversation

@dereuromark
Copy link
Copy Markdown
Owner

Status: DRAFT — for discussion

Summary

  • Drops the unique index on (pid, server) (re-added as non-unique for query performance).
  • Switches update() / remove() to lookup by workerkey (already uniquely indexed).
  • Processor now stores its workerkey alongside pid and uses it for heartbeat and shutdown.
  • endProcess() picks the most recently heartbeated row when multiple share a PID, so operator commands target the live worker rather than a stale leftover.

Why

PID is not a stable worker identity. The OS recycles low PIDs across container restarts, and the unique (pid, server) index collides whenever a containerized worker is replaced before its row is cleaned. workerkey is already a UUID-style hash generated per process — the correct identity.

Relationship to other PRs

Open questions

  • Should worker end / worker kill accept --workerkey as a more precise alternative to PID?
  • Should update() / remove() keep PID-based overloads for BC (deprecation path) or be a hard cut?
  • Should we remove the eviction logic from PR Auto-evict stale queue_processes rows on worker startup #493 in a follow-up, or keep it for safety?

Notes

terminateProcess() still wipes by PID; that's correct because the OS only ever has one live process per PID at any instant, so all rows sharing that PID belong to the same OS process (some live, some stale leftovers — all OK to remove once the kill succeeds).

PID is not a stable worker identity. The OS recycles low PIDs across
container restarts, and the unique index on (pid, server) collides
whenever a containerized worker is replaced before its row is cleaned.
workerkey is already a UUID-style hash generated per process - the
correct identity, and already uniquely indexed on its own.

Schema:
- New migration drops the unique index on (pid, server) and re-adds
  it as a non-unique index for query performance.
- Test fixture mirrors the change.

Behavior:
- update() and remove() now look up by workerkey.
- Processor stores the workerkey alongside the pid and passes it on
  heartbeat / shutdown.
- endProcess() picks the most recently heartbeated row when multiple
  rows share a PID. terminateProcess() already wipes by pid which
  remains correct - the OS guarantees only one live process per pid
  at any moment.

Open questions for review:
- Should worker end / kill accept --workerkey as an alternative to
  PID for precise targeting?
- Can the eviction logic added in PR #493 be removed once this lands,
  or kept as belt-and-braces?
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 8, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 86.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.48%. Comparing base (1f88ba5) to head (c7896b4).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
src/Model/Table/QueueProcessesTable.php 88.88% 1 Missing ⚠️
src/Queue/Processor.php 83.33% 1 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #494      +/-   ##
============================================
- Coverage     77.51%   77.48%   -0.03%     
  Complexity      971      971              
============================================
  Files            45       45              
  Lines          3255     3251       -4     
============================================
- Hits           2523     2519       -4     
  Misses          732      732              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dereuromark dereuromark marked this pull request as ready for review May 8, 2026 15:03
@dereuromark dereuromark merged commit 0e93a9b into master May 8, 2026
16 checks passed
@dereuromark dereuromark deleted the refactor/workerkey-canonical-identity branch May 8, 2026 15:04
dereuromark added a commit that referenced this pull request May 8, 2026
…ue index (#495)

testAddDoesNotEvictRecentRowOnSamePidServer was asserting QueryException,
which fired before #494 because the unique (pid, server) index rejected
the second insert. With the unique constraint dropped, the second insert
succeeds, so the test premise no longer holds and master CI failed.

Reframe the test to verify what it actually cares about: the eviction
guard (`modified <` threshold) inside add() must NOT remove a fresh
row. After the second add() on the same (pid, server), assert the
pre-existing fresh row still exists. Coverage of the threshold check
is preserved; the duplicate-allowed behavior is already covered by
testAddAllowsDuplicatePidServer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants