Force factory daemon exit after graceful stop #273

Merged
kjgbot merged 1 commit into main from factory-sdk-daemon-force-exit-sb-impl3
Jun 13, 2026
@kjgbot kjgbot commented Jun 13, 2026

Contributor

Summary

Fixes the remaining fleet factory start --mode live SIGTERM latency by forcing daemon process exit after graceful factory shutdown has fully completed.

Details

  • Keeps the force-exit scoped to the factory start daemon signal path only.
  • Preserves shutdown order: factory.stop() resolves first, then daemon output is flushed, then the daemon exit hook calls process.exit(code).
  • Leaves run-once, loop, and reap-orphans on their existing natural return/drain path.
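The ordering described in the list above can be sketched in isolation. This is a minimal illustration, not the code in fleet.ts: `DaemonHooks`, `shutdownDaemon`, and `recordShutdownOrder` are hypothetical names, and the fake hooks only record calls instead of touching the real process.

```typescript
// Hypothetical shape for illustration; the real hooks live on FleetCliDeps.
interface DaemonHooks {
  stop: () => Promise<void>
  flush: () => Promise<void>
  exit: (code: number) => void
}

// Runs stop, then flush, then exit, strictly in that order.
async function shutdownDaemon(hooks: DaemonHooks, code: number): Promise<void> {
  await hooks.stop()
  try {
    await hooks.flush()
  } catch {
    // Assumption for this sketch: a failed flush must not block the exit step.
  }
  hooks.exit(code)
}

// Records the call order with fake hooks so the sequence can be inspected.
async function recordShutdownOrder(): Promise<string[]> {
  const calls: string[] = []
  await shutdownDaemon(
    {
      stop: async () => {
        calls.push('stop')
      },
      flush: async () => {
        calls.push('flush')
      },
      exit: (code) => {
        calls.push(`exit:${code}`)
      },
    },
    0,
  )
  return calls
}
```

Injecting the exit step as a function, rather than calling process.exit directly, is what lets a test observe the sequence without terminating the test runner.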

Proof

  • Focused CLI regression: npx vitest run packages/factory-sdk/src/cli/fleet.test.ts
    • daemon SIGTERM path calls stop -> flush -> exit
    • one-shot run-once and reap-orphans do not invoke daemon flush/exit hooks
  • Authoritative typecheck: npm run typecheck:node
  • Full factory-sdk suite: npx vitest run packages/factory-sdk

Live cert handoff

fv2 live cert target: idle daemon SIGTERM should return fast rc0, and the in-flight daemon SIGTERM variant should show pair-tree empty plus fast rc0.

@kjgbot kjgbot added the no-agent-relay-review Disable agent-relay automated PR review/fixes label Jun 13, 2026
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d30d0ff1-031b-4165-94ba-a0e5235b4e03

📥 Commits

Reviewing files that changed from the base of the PR and between 44a982c and ba1673e.

📒 Files selected for processing (2)
  • packages/factory-sdk/src/cli/fleet.test.ts
  • packages/factory-sdk/src/cli/fleet.ts

📝 Walkthrough

The pull request adds optional daemonExit and flushDaemonOutput lifecycle hooks to FleetCliDeps. When the factory start command receives SIGTERM, it now flushes stdout/stderr before calling the exit handler, ensuring buffered output is delivered. Tests validate the call sequence and confirm one-shot commands do not invoke these hooks.
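A rough sketch of how optional hooks with defaults might be resolved follows. Only the hook names daemonExit and flushDaemonOutput come from the PR; the parameter and return types, the `DaemonLifecycleDeps` interface, and the `resolveDaemonHooks` helper are assumptions made for illustration, and the real FleetCliDeps has other members that are omitted here.

```typescript
// Assumed signatures: the PR names the hooks but does not show their full types.
interface DaemonLifecycleDeps {
  daemonExit?: (code: number) => void
  flushDaemonOutput?: () => Promise<void>
}

// Hypothetical helper: use the injected hook when present, else the default.
function resolveDaemonHooks(
  deps: DaemonLifecycleDeps,
  defaults: Required<DaemonLifecycleDeps>,
): Required<DaemonLifecycleDeps> {
  return {
    daemonExit: deps.daemonExit ?? defaults.daemonExit,
    flushDaemonOutput: deps.flushDaemonOutput ?? defaults.flushDaemonOutput,
  }
}
```

In production the defaults would be the real process exit and stream flush; a test passes recording fakes so nothing terminates the runner.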

Changes

Daemon output flushing on SIGTERM shutdown

| Layer / File(s) | Summary |
|---|---|
| Daemon exit and flush hooks contract (packages/factory-sdk/src/cli/fleet.ts) | FleetCliDeps interface gains optional daemonExit(code) and flushDaemonOutput() hooks for customizable daemon lifecycle behavior during shutdown. |
| Output flushing implementation (packages/factory-sdk/src/cli/fleet.ts) | flushProcessOutput() and flushWritable(stream) utilities are added to flush stdout/stderr by writing an empty string and awaiting completion when streams are writable. |
| Signal handler integration with flush and exit (packages/factory-sdk/src/cli/fleet.ts) | runFactoryCommand for factory start integrates a new flushAndExit() helper in the signal/exit handler that flushes daemon output via the injected or default flusher before invoking the daemon exit handler and resolving the stop-signal waiter. |
| Test validation of SIGTERM handling and one-shot commands (packages/factory-sdk/src/cli/fleet.test.ts) | SIGTERM test scenario now captures and asserts the shutdown call sequence (stop -> flush -> exit), daemon exit code, and signal listener unregistration. New test verifies that one-shot commands (run-once and reap-orphans) do not invoke daemonExit or flushDaemonOutput callbacks. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • AgentWorkforce/pear#248: Both PRs update SIGTERM/SIGINT stop handling in fleet.ts/fleet.test.ts; #248 introduces installFactoryStopSignalHandlers with listener unregistration, while the main PR refines the stop-signal exit path to flush daemon output and asserts the call sequence.
  • AgentWorkforce/pear#253: The main PR's test validation for the factory reap-orphans one-shot path directly connects to #253, which adds the reap-orphans command implementation.
  • AgentWorkforce/pear#267: Both the main PR and #267 update fleet.ts/fleet.test.ts around one-shot command execution (run-once and reap-orphans), with the main PR asserting these commands do not invoke forced exit hooks.

Poem

🐰 A daemon tidies up with grace,
Flushing streams from place to place,
When SIGTERM rings the final bell,
Output flows—then exits well! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Force factory daemon exit after graceful stop' directly and clearly summarizes the main change: adding a forced exit mechanism for the factory daemon after graceful shutdown completes. |
| Description check | ✅ Passed | The description is well-related to the changeset, providing context about the SIGTERM latency fix, implementation details, shutdown order preservation, scope limitations, and verification steps. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install failed due to a network error.


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds support for flushing output streams and gracefully exiting the daemon process when stopping the factory command, along with corresponding unit tests. The review feedback highlights two key issues: first, the stream flushing mechanism could hang indefinitely if the output streams are blocked or piped to a terminated process, which can be resolved by introducing a timeout; second, a failure during the flush operation could trigger an unhandled promise rejection, which should be handled with a catch block.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +481 to +496
async function flushProcessOutput(): Promise<void> {
  await Promise.all([
    flushWritable(process.stdout),
    flushWritable(process.stderr),
  ])
}

function flushWritable(stream: NodeJS.WriteStream): Promise<void> {
  if (stream.destroyed || stream.writableEnded || stream.writable === false) {
    return Promise.resolve()
  }

  return new Promise((resolve) => {
    stream.write('', () => resolve())
  })
}

high

If the output streams (process.stdout or process.stderr) are piped to another process that has terminated or stopped reading, writing to them can block indefinitely or the callback might never be invoked. This would cause the daemon to hang indefinitely during shutdown, defeating the purpose of a fast and graceful exit.

To prevent this, we should:

  1. Introduce a timeout (e.g., 200ms) to the flush operation so that we always proceed to exit even if the streams are clogged.
  2. Safely catch any synchronous errors during stream.write to avoid unhandled promise rejections.
async function flushProcessOutput(): Promise<void> {
  let timeoutId: NodeJS.Timeout | undefined
  const flushPromise = Promise.all([
    flushWritable(process.stdout),
    flushWritable(process.stderr),
  ])
  const timeoutPromise = new Promise<void>((resolve) => {
    timeoutId = setTimeout(resolve, 200)
  })
  await Promise.race([flushPromise, timeoutPromise])
  if (timeoutId) {
    clearTimeout(timeoutId)
  }
}

function flushWritable(stream: NodeJS.WriteStream): Promise<void> {
  if (stream.destroyed || stream.writableEnded || stream.writable === false) {
    return Promise.resolve()
  }

  return new Promise<void>((resolve) => {
    try {
      stream.write('', () => resolve())
    } catch {
      resolve()
    }
  })
}
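The timeout idea in the suggestion above can be exercised without real streams. The sketch below is a generic variant under stated assumptions: `settleWithin` and `neverFlushes` are hypothetical names, and a promise that never settles stands in for a write callback that is never invoked.

```typescript
// Resolves with 'settled' when `work` finishes (fulfilled or rejected),
// or with 'timeout' after `ms` milliseconds, whichever happens first.
function settleWithin(
  work: Promise<unknown>,
  ms: number,
): Promise<'settled' | 'timeout'> {
  return new Promise<'settled' | 'timeout'>((resolve) => {
    const timer = setTimeout(() => resolve('timeout'), ms)
    const finish = (): void => {
      // Clear the timer so a fast flush does not keep the event loop alive.
      clearTimeout(timer)
      resolve('settled')
    }
    work.then(finish, finish)
  })
}

// Simulates a clogged stream: this promise never settles.
const neverFlushes = new Promise<void>(() => {})
```

With a deadline in place, `settleWithin(neverFlushes, 200)` lets shutdown continue after roughly 200ms, while a flush that completes immediately returns without waiting for the timer.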

Comment on lines +220 to +228
const flushAndExit = async (code: number): Promise<void> => {
  try {
    await (deps.flushDaemonOutput ?? flushProcessOutput)()
  } finally {
    const daemonExit = deps.daemonExit ?? ((exitCode: number) => process.exit(exitCode))
    daemonExit(code)
    waiter.resolve(code)
  }
}

medium

If deps.flushDaemonOutput or flushProcessOutput rejects, the error will propagate out of flushAndExit because there is no catch block. Since flushAndExit is called as a floating promise (void flushAndExit(code)), this will trigger an unhandled promise rejection. This is especially problematic in tests where daemonExit does not terminate the process.

Adding a catch block ensures that any flush errors are safely ignored and do not cause unhandled rejections.

      const flushAndExit = async (code: number): Promise<void> => {
        try {
          await (deps.flushDaemonOutput ?? flushProcessOutput)()
        } catch {
          // Ignore flush errors to ensure we still exit cleanly
        } finally {
          const daemonExit = deps.daemonExit ?? ((exitCode: number) => process.exit(exitCode))
          daemonExit(code)
          waiter.resolve(code)
        }
      }
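The difference between the two versions can be seen in a small standalone sketch. All names here (`failingFlush`, `exitWithoutCatch`, `exitWithCatch`, `outcome`) are hypothetical, and the exit step is an injected function that does not terminate the process, as in the test setup the comment describes.

```typescript
// Stand-in for a flush hook that fails.
const failingFlush = async (): Promise<void> => {
  throw new Error('flush failed')
}

// try/finally only: exit still runs, but the rejection propagates to the caller.
async function exitWithoutCatch(exit: (code: number) => void, code: number): Promise<void> {
  try {
    await failingFlush()
  } finally {
    exit(code)
  }
}

// try/catch/finally: exit runs and the returned promise resolves.
async function exitWithCatch(exit: (code: number) => void, code: number): Promise<void> {
  try {
    await failingFlush()
  } catch {
    // Flush errors are ignored so that shutdown still completes.
  } finally {
    exit(code)
  }
}

// Reports how a promise settled without letting a rejection escape.
function outcome(run: Promise<void>): Promise<'resolved' | 'rejected'> {
  return run.then(
    () => 'resolved' as const,
    () => 'rejected' as const,
  )
}
```

Both variants call the exit step, so the observable difference is only in how the returned promise settles. That matters when the call is made as `void flushAndExit(code)` and nothing handles a rejection.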

@kjgbot kjgbot merged commit f69d22a into main Jun 13, 2026
5 checks passed
@kjgbot kjgbot deleted the factory-sdk-daemon-force-exit-sb-impl3 branch June 13, 2026 10:34
@agent-relay-code

Contributor

Implemented the scoped fixes in packages/factory-sdk/src/cli/fleet.ts and added regression coverage in packages/factory-sdk/src/cli/fleet.test.ts.

Addressed comments

  • CodeRabbit: processing/status comment only, no actionable finding; no code change needed.
  • gemini-code-assist[bot]: output flush could hang indefinitely or throw synchronously; fixed with a 200ms timeout and safe stream.write handling in packages/factory-sdk/src/cli/fleet.ts:485.
  • gemini-code-assist[bot]: rejected flush could escape the floating shutdown promise; fixed by catching flush errors before daemon exit in packages/factory-sdk/src/cli/fleet.ts:222, with regression coverage in packages/factory-sdk/src/cli/fleet.test.ts:511.
  • gemini-code-assist[bot] review summary: covered by the two inline fixes above.

Advisory Notes

None.

Local validation

  • npm ci
  • npm exec vitest run packages/factory-sdk/src/cli/fleet.test.ts
  • npm run verify:mcp-resources-drift
  • npm run lint passed with existing warnings only
  • npm run typecheck:web
  • npm run typecheck:node
  • npm test
  • npx vitest run
  • npm run build
  • npm run build:web
  • npx playwright test --config playwright.fidelity.config.ts
  • npx playwright test --config playwright.redraw.config.ts

I did not run the macOS-only dist:mac packaged smoke on this Linux runner, and I cannot verify the post-harness pushed commit’s remote CI/mergeability from here, so I’m not marking this READY.
