Handle missing live high-watermark route#325
Conversation
|
Caution Review failedPull request was closed or merged during review No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthrough
ChangesLive subscription high-watermark route-unavailable fallback
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint install failed due to a network error. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements a fallback mechanism in the factory orchestrator to run a startup full pull when the high-watermark route is unavailable. It introduces a helper function isHighWatermarkRouteUnavailable to detect route-not-found errors and adds corresponding unit tests. Feedback suggests making the error detection helper more robust by checking nested response data and safely inspecting error messages.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| const isHighWatermarkRouteUnavailable = (error: unknown): boolean => { | ||
| const details = asRecord(error) ?? {} | ||
| const response = asRecord(details.response) ?? {} | ||
| const status = details.status ?? details.statusCode ?? response.status ?? response.statusCode | ||
| if (status === 404 || status === '404') { | ||
| return true | ||
| } | ||
|
|
||
| const code = stringValue(details.code)?.toLowerCase() | ||
| if (code === 'route_not_found') { | ||
| return true | ||
| } | ||
|
|
||
| return error instanceof Error && /route not found/i.test(error.message) | ||
| } |
There was a problem hiding this comment.
The isHighWatermarkRouteUnavailable helper checks for details.code and error.message to identify route-not-found errors. However, in many HTTP clients (like Axios), server-returned error details (such as custom error codes or messages) are nested inside the response body (i.e., response.data). Additionally, if an error is serialized/deserialized across process boundaries, it may lose its Error prototype, causing error instanceof Error to evaluate to false.
To make this check more robust, we should also inspect response.data for code and message, and safely check for message on plain objects.
const isHighWatermarkRouteUnavailable = (error: unknown): boolean => {
const details = asRecord(error) ?? {}
const response = asRecord(details.response) ?? {}
const data = asRecord(response.data) ?? {}
const status = details.status ?? details.statusCode ?? response.status ?? response.statusCode
if (status === 404 || status === '404') {
return true
}
const code = (stringValue(details.code) ?? stringValue(data.code))?.toLowerCase()
if (code === 'route_not_found') {
return true
}
const message = stringValue(details.message) ?? stringValue(data.message) ?? (error instanceof Error ? error.message : '')
return /route not found/i.test(message)
}|
pr-reviewer could not complete review for #325 in AgentWorkforce/pear. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 33f1820d4b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (highWatermark.routeUnavailable) { | ||
| this.#increment('liveHighWatermarkFullPullFallbacks') | ||
| this.#logger.info?.('[factory] live subscription high-watermark route unavailable; running startup full pull before event stream') | ||
| await this.runOnce() |
There was a problem hiding this comment.
Close the startup blind spot before the fallback pull
When the high-watermark route is unavailable, this runs the full pull before any live subscription is registered. If an issue becomes Ready during runOnce() after its paths have already been listed, its change event is emitted while no listener exists, and the later { from: 'now' } subscription starts after that event, so neither the pull nor the stream dispatches it. Register the stream or otherwise capture a cursor before doing the fallback pull so live startup has no event-loss window.
Useful? React with 👍 / 👎.
|
pr-reviewer could not complete review for #325 in AgentWorkforce/pear. |
| if (highWatermark.routeUnavailable) { | ||
| this.#increment('liveHighWatermarkFullPullFallbacks') | ||
| this.#logger.info?.('[factory] live subscription high-watermark route unavailable; running startup full pull before event stream') | ||
| await this.runOnce() | ||
| await this.#refreshLiveHeartbeatIfDue() | ||
| } |
There was a problem hiding this comment.
🚩 runOnce() exceptions during startup full pull propagate and abort factory start
When the high-watermark route is unavailable (404), runOnce() is called at factory.ts:336 without a try-catch. Unlike #handleChange() (which wraps dispatch errors per-issue at factory.ts:1062), runOnce() lets dispatch() exceptions propagate at factory.ts:801. If any single issue fails to dispatch (e.g., spawn failure), the entire runOnce() throws, which propagates through #startLiveSubscription() up to #start() at factory.ts:243, setting this.#started = false and aborting the factory start. In the runLoop context, this is handled by retry logic (factory.ts:829-844), but in the live startup context there is no retry — the caller must re-invoke start(). This is consistent with how #start() handles other failures (e.g. ensureSubRoot), but it does mean a single transient dispatch failure during the 404-fallback full pull prevents the factory from starting, whereas the same failure in normal live mode (without the 404 fallback) would never occur because runOnce is never called.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
pr-reviewer could not complete review for #325 in AgentWorkforce/pear. |
1 similar comment
|
pr-reviewer could not complete review for #325 in AgentWorkforce/pear. |
|
pr-reviewer could not complete review for #325 in AgentWorkforce/pear. |
|
pr-reviewer could not complete review for #325 in AgentWorkforce/pear. |
Two P1s flagged on #325's startup full-pull fallback (now in main): 1. Unguarded startup runOnce() aborted daemon start. runOnce() calls #readyIssuePaths() unguarded, so a transient pull/listTree failure propagated out of #startLiveSubscription and killed `factory start` — a startup-resilience regression vs. the prior "watermark undefined -> continue". Now wrapped: on failure, increment liveHighWatermarkFullPullErrors, log via #error, and degrade to the live stream instead of going down. 2. Pull-before-subscribe blind spot. The full pull ran before the subscription registered, so an issue going Ready *during* the pull emitted an event with no listener and was lost. Now the subscription registers BEFORE the pull; events buffer via a new #deferLiveEventDrain gate and drain once the pull completes. Batch dedupe suppresses overlap with issues the pull dispatched. Tests: startup-pull-throws keeps the daemon up (start resolves + counter); an issue arriving via a live event mid-pull is captured and dispatched. Full factory-sdk suite green (339), typecheck clean. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Tests