feat(provider): interleaved reasoning replay by xiami762 · Pull Request #294 · AgentFlocks/flocks

xiami762 · 2026-05-20T07:20:31Z

Summary

This PR improves multi-turn reliability for reasoning / interleaved-thinking models and hardens the session runner against runaway tool loops.

Provider: interleaved reasoning replay

Adds reasoning_replay.py and interleaved.py to normalize how assistant reasoning is echoed back to providers on follow-up turns.
Extends Anthropic and OpenAI SDK paths to preserve reasoning_content / reasoning_details across tool-call turns, with provider-specific echo policies (when_present, tool_calls, all_assistant) and cross-provider promotion/placeholder handling.
Updates catalog.json with interleaved-capability metadata for known reasoning model families.
Integrates replay preparation into the session runner so multi-step tool flows keep valid reasoning context.

Session: tool-loop guards and step budget

Introduces cross-step loop detection in SessionRunner:
- Halt after 3 consecutive tool-only turns with the same tool + identical arguments
- Halt after 8 consecutive tool-only turns using the same tool (even with different args)
- Warn one step before each threshold
Sets DEFAULT_MAX_TOOL_STEPS = 1000 when an agent does not declare an explicit steps limit (replaces unbounded inf).
Message replay now resolves the model from the session’s current pin/default/agent override, not the historical model stored on the original user message.

Agent: safer plugin loading

Hardens agent.yaml loading: validates mapping shape and skips malformed entries without breaking agent discovery.

Support provider-specific reasoning field replay (interleaved/thinking) in Anthropic and OpenAI SDK paths, with catalog metadata and runner integration for multi-turn reasoning preservation. Co-authored-by: Cursor <cursoragent@cursor.com>

Validate YAML mapping shape and wrap AgentInfo construction in try/except so malformed model configs are skipped without breaking agent scans. Co-authored-by: Cursor <cursoragent@cursor.com>

Add runner-level guards for repeated exact tool calls and long same-tool streaks, apply a default max tool step limit when agents omit steps, and resolve message replay against the session's current model pin. Co-authored-by: Cursor <cursoragent@cursor.com>

Agents without an explicit steps limit now get a 1000-step budget instead of 100 for longer coding and research tasks. Co-authored-by: Cursor <cursoragent@cursor.com>

xiami762 and others added 4 commits May 20, 2026 12:53

fix(agent): harden agent.yaml loading against invalid configs

d91f92b

Validate YAML mapping shape and wrap AgentInfo construction in try/except so malformed model configs are skipped without breaking agent scans. Co-authored-by: Cursor <cursoragent@cursor.com>

chore(session): raise default max tool steps to 1000

1e0cab7

Agents without an explicit steps limit now get a 1000-step budget instead of 100 for longer coding and research tasks. Co-authored-by: Cursor <cursoragent@cursor.com>

xiami762 requested a review from duguwanglong May 20, 2026 07:20

duguwanglong approved these changes May 20, 2026

View reviewed changes

duguwanglong merged commit ea74628 into dev May 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(provider): interleaved reasoning replay#294

feat(provider): interleaved reasoning replay#294
duguwanglong merged 4 commits into
devfrom
feat/provider-reasoning-interleaved-replay

xiami762 commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

xiami762 commented May 20, 2026

Summary

Provider: interleaved reasoning replay

Session: tool-loop guards and step budget

Agent: safer plugin loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants