injection retries#349
Conversation
| const delivered = await this.performTaskInjection(task, from); | ||
| if (!delivered) { | ||
| this.logError(` Task delivery failed on attempt ${attempt + 1}`); | ||
| continue; // Retry | ||
| } | ||
|
|
||
| // Verify CLI shows activity (task received, thinking, or tool execution) | ||
| const activityVerified = await this.verifyActivityAfterInjection(outputBefore); | ||
| if (activityVerified) { | ||
| this.log(` Task delivered and activity verified successfully`); | ||
| return true; | ||
| } | ||
|
|
||
| // Delivery succeeded but no activity - CLI might not have been ready | ||
| this.logError(` Task delivered but no activity detected (attempt ${attempt + 1})`); |
There was a problem hiding this comment.
🔴 Task may be injected multiple times when CLI is slow to show activity
In injectTask, if performTaskInjection succeeds (socket write confirmed by Rust) but verifyActivityAfterInjection returns false (no activity patterns detected within 5 seconds), the code retries and sends the task again. This can cause duplicate task delivery when the CLI is slow to respond.
Click to expand
Mechanism
The retry loop at lines 2741-2766 works as follows:
performTaskInjectionreturnstrue- the task was successfully written to the PTYverifyActivityAfterInjectionwaits up to 5 seconds for activity patterns- If no activity is detected (CLI is slow), the loop continues and calls
performTaskInjectionagain - The same task is now sent twice to the CLI
// Line 2751: Task successfully delivered
const delivered = await this.performTaskInjection(task, from);
if (!delivered) {
continue; // This is fine - delivery failed
}
// Line 2758: But if verification fails, we retry delivery!
const activityVerified = await this.verifyActivityAfterInjection(outputBefore);
if (activityVerified) {
return true;
}
// Loop continues - task will be sent AGAINImpact
With MAX_RETRIES = 3 (4 total attempts) in the orchestrator, and the spawner having its own 3-retry loop (packages/bridge/src/spawner.ts:1373-1416), this could result in up to 12 duplicate task injections in the worst case. The agent would receive and potentially execute the same task multiple times.
Expected Behavior
Once performTaskInjection returns true, the task has been delivered. Activity verification failures should NOT trigger re-delivery of the same task - they should either fail immediately or use a different recovery strategy (like checking if the CLI is stuck).
Was this helpful? React with 👍 or 👎 to provide feedback.
Addresses Devin's review comment about inconsistent retry behavior between OpenCodeWrapper and RelayPtyOrchestrator. OpenCodeWrapper now retries failed task injections up to 3 times with 500ms delays, matching RelayPtyOrchestrator's behavior for consistent reliability. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| }); | ||
| log.error(`CRITICAL: ${tracedError.logMessage}`); | ||
| await openCodeWrapper.stop(); | ||
| this.activeWorkers.delete(name); |
There was a problem hiding this comment.
🟡 Unnecessary activeWorkers.delete() called before worker is added to map
When task injection fails for OpenCodeWrapper, the code calls this.activeWorkers.delete(name) at line 1221, but the worker is only added to activeWorkers later at line 1247 (after successful task injection).
Click to expand
Code Flow
In the OpenCodeWrapper path in spawner.ts:
- OpenCodeWrapper is created and started (lines 1144-1170)
- Task injection is attempted (lines 1196-1208)
- If injection fails, cleanup happens at lines 1219-1230 including
this.activeWorkers.delete(name)at line 1221 - But the worker is only added to
activeWorkers.set(name, workerInfo)at line 1247 (after successful injection)
Actual vs Expected
- Actual:
activeWorkers.delete(name)is called on a key that was never added - Expected: This call should be removed since
Map.delete()on a non-existent key is a no-op
Impact
This is harmless dead code since Map.delete() returns false for non-existent keys without throwing. The same issue exists in the RelayPtyOrchestrator path at line 1429.
Recommendation: Remove the this.activeWorkers.delete(name) call since the worker was never added to the map at this point in the code flow.
Was this helpful? React with 👍 or 👎 to provide feedback.
The delete call at line 1221 was unreachable because the worker is only added to activeWorkers at line 1247 (success path). Removing this harmless but confusing dead code. Addresses Devin review comment #4. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Noticed that sometimes when spawning with a task the agent wasn't ready so the task injection fails. This logic tries to detect that and attempts retries to inject the message again