Harden live-start full-pull fallback (fix-forward of #325 P1s) #326

Merged
khaliqgant merged 1 commit into main from fix/live-startup-pull-resilience
Jun 14, 2026

@khaliqgant khaliqgant commented Jun 14, 2026

Fix-forward of two P1s flagged on #325 (now merged to main).

1. Unguarded startup runOnce() aborted daemon start

runOnce() calls #readyIssuePaths() unguarded, so a transient pull/listTree failure propagated out of #startLiveSubscription and killed factory start — a startup-resilience regression vs. the prior "watermark undefined → continue" behavior. Now wrapped in try/catch: on failure it increments liveHighWatermarkFullPullErrors, logs via #error, and degrades to the live event stream instead of taking the daemon down.
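
A minimal sketch of the guard described above. The helper name `guardedStartupPull`, the `Metrics` shape, and the `logError` callback are assumptions for illustration; only `runOnce` and the `liveHighWatermarkFullPullErrors` counter are names taken from this PR, and the real logic lives in private methods on the factory class.

```typescript
// Hypothetical shapes for illustration; only the counter name comes from the PR.
interface Metrics {
  liveHighWatermarkFullPullErrors: number
}

type LogError = (message: string, error: unknown) => void

// Runs the startup full pull but never lets its failure escape.
// Returns true if the pull succeeded, false if startup degraded to live-only.
async function guardedStartupPull(
  runOnce: () => Promise<void>,
  metrics: Metrics,
  logError: LogError,
): Promise<boolean> {
  try {
    await runOnce()
    return true
  } catch (error) {
    // A transient pull/listTree failure is counted and logged, not rethrown,
    // so daemon startup continues on the live event stream alone.
    metrics.liveHighWatermarkFullPullErrors += 1
    logError('startup full pull failed; continuing with live events only', error)
    return false
  }
}
```

The caller can ignore the boolean; it is returned here only so a test can tell the two paths apart.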

2. Pull-before-subscribe blind spot

The full pull ran before the subscription was registered, so an issue going Ready during the pull emitted an event with no listener and was lost — ironic, since avoiding missed events is the point. Now the subscription registers before the pull; events buffer behind a new #deferLiveEventDrain gate and drain once the pull completes. The existing batch dedupe suppresses any overlap with issues the pull already dispatched (so no double-dispatch).
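
The buffer-then-drain ordering can be sketched as below. This is an illustration under stated assumptions, not the PR's code: the `LiveEventBuffer` class, its method names, and the `Set`-based dedupe are hypothetical stand-ins for the `#deferLiveEventDrain` gate and the existing batch dedupe, and it uses the `private` keyword rather than `#` fields.

```typescript
// Hypothetical sketch; class and method names are illustrative only.
type Dispatch = (issuePath: string) => void

class LiveEventBuffer {
  private deferDrain = true                // gate: hold events while the startup pull runs
  private pending: string[] = []           // events received while the gate is closed
  private dispatched = new Set<string>()   // stand-in for the existing batch dedupe

  constructor(private readonly dispatch: Dispatch) {}

  // Called by the subscription callback, which is registered BEFORE the pull.
  enqueue(issuePath: string): void {
    this.pending.push(issuePath)
    if (!this.deferDrain) this.drain()
  }

  // Called for each Ready issue the startup pull finds.
  dispatchFromPull(issuePath: string): void {
    this.dispatchOnce(issuePath)
  }

  // Called once the pull completes or fails: open the gate and flush the buffer.
  releaseGate(): void {
    this.deferDrain = false
    this.drain()
  }

  private drain(): void {
    // splice(0) empties the buffer and returns its former contents in order.
    for (const path of this.pending.splice(0)) this.dispatchOnce(path)
  }

  private dispatchOnce(issuePath: string): void {
    if (this.dispatched.has(issuePath)) return // overlap with the pull: skip
    this.dispatched.add(issuePath)
    this.dispatch(issuePath)
  }
}
```

In this sketch, an issue seen by both the pull and a buffered event is dispatched once, while an issue seen only by a mid-pull event is dispatched when the gate is released.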

Tests

  • Startup pull throws → factory.start still resolves (daemon stays up) and the liveHighWatermarkFullPullErrors counter is incremented.
  • An issue that arrives via a live event during the startup pull is captured and dispatched (buffer-then-drain).
  • Full factory-sdk suite green (339), typecheck clean.

Refs #325.

🤖 Generated with Claude Code

Two P1s flagged on #325's startup full-pull fallback (now in main):

1. Unguarded startup runOnce() aborted daemon start. runOnce() calls
   #readyIssuePaths() unguarded, so a transient pull/listTree failure
   propagated out of #startLiveSubscription and killed `factory start` — a
   startup-resilience regression vs. the prior "watermark undefined -> continue".
   Now wrapped: on failure, increment liveHighWatermarkFullPullErrors, log via
   #error, and degrade to the live stream instead of going down.

2. Pull-before-subscribe blind spot. The full pull ran before the subscription
   registered, so an issue going Ready *during* the pull emitted an event with
   no listener and was lost. Now the subscription registers BEFORE the pull;
   events buffer via a new #deferLiveEventDrain gate and drain once the pull
   completes. Batch dedupe suppresses overlap with issues the pull dispatched.

Tests: startup-pull-throws keeps the daemon up (start resolves + counter); an
issue arriving via a live event mid-pull is captured and dispatched. Full
factory-sdk suite green (339), typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@khaliqgant khaliqgant added the no-agent-relay-review label (Disable agent-relay automated PR review/fixes) Jun 14, 2026

coderabbitai Bot commented Jun 14, 2026

Warning

Review limit reached

@khaliqgant, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 34 minutes and 6 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4b19910a-c204-4af2-9cfa-3b9086f1816b

📥 Commits

Reviewing files that changed from the base of the PR and between b8802be and 04d1a85.

📒 Files selected for processing (2)
  • packages/factory-sdk/src/orchestrator/factory.test.ts
  • packages/factory-sdk/src/orchestrator/factory.ts

@agent-relay-code

pr-reviewer could not complete review for #326 in AgentWorkforce/pear.
The review harness exited with code 1.
No review was posted; this needs operator attention.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request modifies the FactoryLoop orchestrator to register the live subscription before running the startup full pull, ensuring that events arriving during the pull are buffered and not lost. It also wraps the startup full pull in a try-catch block so that a startup pull failure does not abort the daemon, allowing it to fall back to the live event stream. Corresponding unit tests have been added to verify these behaviors. There are no review comments to address.

@khaliqgant khaliqgant merged commit fb23fbf into main Jun 14, 2026
5 of 6 checks passed
@khaliqgant khaliqgant deleted the fix/live-startup-pull-resilience branch June 14, 2026 09:58

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 04d1a85dee

Comment on lines 343 to 345
this.#subscription = this.#mount.subscribe([LIVE_ISSUE_GLOB], (event) => {
  this.#enqueueLiveEvent(event)
}, { from: 'now', coalesce: 'none' })

P1: Wait for the stream to open before pulling

In the cloud mount, subscribe() returns before the stream is actually registered: createWorkspaceScopedEventClient.subscribe starts tokenProvider() and only calls sync.start() later (packages/factory-sdk/src/subscriptions/event-client.ts around lines 350-417). With the high-watermark route unavailable, this still leaves a blind spot: if an issue becomes Ready after runOnce() has listed ready paths but before that async stream startup reaches sync.start(), the pull will not see it and the { from: 'now' } stream will not deliver it either. Please gate the startup full pull on the subscription being truly open, or use a cursor/backfill mechanism that covers the async subscription startup window.
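
One way to picture the first suggestion (gating the pull on the stream being open) is the sketch below. It assumes a hypothetical `ready` promise on the subscription handle that resolves once the underlying stream is registered; nothing in this PR confirms that `subscribe()` exposes such a signal, and `pullAfterStreamOpens` is an illustrative name.

```typescript
// Hypothetical: assumes the subscription exposes a `ready` promise that
// resolves once the underlying stream is registered. The real API may not.
interface OpenableSubscription {
  ready: Promise<void>
}

async function pullAfterStreamOpens(
  subscribe: () => OpenableSubscription,
  runOnce: () => Promise<void>,
): Promise<void> {
  const subscription = subscribe() // returns immediately; the stream may still be opening
  await subscription.ready         // wait until events can actually be delivered
  await runOnce()                  // the pull lists ready paths only after the stream is open
}
```

Under that assumption, an issue that becomes Ready before the stream opens is picked up by the pull, and one that becomes Ready afterwards is delivered by the stream, so the async startup window is covered.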

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.
