This repository was archived by the owner on Apr 26, 2026. It is now read-only.

Upgrade to AI SDK v3 spec, fix hanging thinking indicator, add --effort support #12

Open
Aptul9 wants to merge 3 commits into unixfox:master from Aptul9:feat/sdk-v3-upgrade

@Aptul9 Aptul9 commented Apr 17, 2026


Summary

Core fixes required to make the plugin work correctly under AI SDK v6 / OpenCode v1.4.3, plus --effort flag passthrough. This PR is the V3 migration split out from the previous larger PR per maintainer request.

AI SDK v3 spec migration

  • Upgrade specificationVersion from v2 to v3, eliminating the V2-to-V3 compatibility layer in AI SDK v6 that caused the thinking indicator to hang indefinitely in OpenCode
  • Return usage in V3 format (nested objects with total/cacheRead/cacheWrite/reasoning) — fixes the usage2.inputTokens.total crash under AI SDK v6 / OpenCode v1.4.3
  • Filter empty text blocks in message builder to prevent cache_control cannot be set for empty text blocks API errors
  • Return empty stream when no user content is available (follow-up calls after tool execution), preventing API errors on requests with empty messages
  • Always use finishReason: "stop" since the CLI executes all tools internally and returns complete results — prevents the AI SDK agentic loop from making unnecessary follow-up calls
  • Upgrade @ai-sdk/provider to 3.0.8 and @ai-sdk/provider-utils to 4.0.23
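
The empty-text-block filter and the empty-stream check described above could look roughly like the following. This is a minimal sketch, not the plugin's actual code: `ContentPart`, `dropEmptyTextParts`, and `hasUserContent` are hypothetical names, and treating whitespace-only text as empty is an assumption.

```typescript
// Hypothetical, simplified content-part shape; the plugin's real types may differ.
interface ContentPart {
  type: string;
  text?: string;
}

// Drop text parts with no visible content so no empty text block is sent.
// Assumption: whitespace-only text is treated the same as an empty string.
function dropEmptyTextParts(parts: ContentPart[]): ContentPart[] {
  return parts.filter(
    (part) => part.type !== "text" || (part.text ?? "").trim().length > 0,
  );
}

// True when something is left to send. A caller could return an empty
// stream instead of issuing a request when this is false.
function hasUserContent(parts: ContentPart[]): boolean {
  return dropEmptyTextParts(parts).length > 0;
}
```

Non-text parts pass through untouched, so only text blocks that would trigger the `cache_control` error are removed.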

Effort flag passthrough

  • Add --effort flag passthrough: reads effort level from OpenCode thinking variant selection and passes it to the Claude CLI via --effort <level>
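
As an illustration of the passthrough, the sketch below turns a selected effort level into CLI arguments. The `--effort <level>` flag and the four level names come from this PR; the `buildEffortArgs` helper and the assumption that the level arrives as an `effort` key on a provider-options object are hypothetical.

```typescript
// Effort levels taken from the variants in the config example in this PR.
const EFFORT_LEVELS = ["low", "medium", "high", "max"] as const;
type EffortLevel = (typeof EFFORT_LEVELS)[number];

function isEffortLevel(value: unknown): value is EffortLevel {
  return (
    typeof value === "string" &&
    (EFFORT_LEVELS as readonly string[]).includes(value)
  );
}

// Hypothetical helper: assumes the selected variant surfaces as an
// `effort` key on the options object passed to the provider.
function buildEffortArgs(
  providerOptions: Record<string, unknown> | undefined,
): string[] {
  const effort = providerOptions?.effort;
  // Unknown or missing values add no flag, leaving the CLI default in place.
  return isEffortLevel(effort) ? ["--effort", effort] : [];
}
```

Validating against a fixed list keeps arbitrary strings from reaching the CLI command line.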

Stream event fixes

  • Per-block text emission: emit text-start/text-delta/text-end for each CLI text block instead of a single pair for the entire turn. Each text block is saved immediately by OpenCode as a separate part, preventing loss of the final response text when the stream is aborted before the result event arrives from the CLI.
  • providerExecuted on tool-input-start: add the providerExecuted flag to tool-input-start events so OpenCode marks CLI-executed tools correctly and does not re-loop looking for unresolved tool calls.
  • Last-iteration token counts: use the last entry of usage.iterations from the CLI result instead of cumulative totals across all internal tool-use iterations. This prevents inflated context counters (e.g. 780k instead of 80k) and premature compaction.
  • Negative input token fix: the CLI reports input_tokens as the non-cached count only, while OpenCode expects inputTokens.total to include cached tokens and subtracts the cache counts itself. Report total = input + cacheRead + cacheWrite (and the raw non-cached count in noCache) so the OpenCode UI does not display negative values.
  • Result fallback timer (5s): if the CLI emits a final text block but never sends a result event, the stream closes gracefully after 5 seconds with accumulated content.
  • Abort grace period: on abort signal, start a 5s grace period instead of closing immediately, allowing in-flight CLI messages to be processed.
  • Copilot review fixes: re-enable DTS generation, remove unused LanguageModelV3StreamPart import, remove ai from runtime dependencies.
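
The two token-count fixes above can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the plugin's implementation: the `CliUsage` field names (`cache_read_input_tokens`, `cache_creation_input_tokens`) and the `toInputTokens` helper are assumed, and the output shape only mirrors the nested `total`/`noCache`/`cacheRead`/`cacheWrite` fields named in this PR.

```typescript
// Assumed shape of the usage object in the CLI result; field names other
// than input_tokens and iterations are assumptions.
interface CliUsage {
  input_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
  iterations?: CliUsage[];
}

interface InputTokenUsage {
  total: number;
  noCache: number;
  cacheRead: number;
  cacheWrite: number;
}

function toInputTokens(usage: CliUsage): InputTokenUsage {
  // Prefer the last iteration; fall back to the top-level counts if none exist.
  const iterations = usage.iterations ?? [];
  const last = iterations[iterations.length - 1] ?? usage;

  const noCache = last.input_tokens ?? 0;
  const cacheRead = last.cache_read_input_tokens ?? 0;
  const cacheWrite = last.cache_creation_input_tokens ?? 0;

  // total includes cached tokens, so a consumer that subtracts the cache
  // counts ends up at noCache instead of a negative number.
  return { total: noCache + cacheRead + cacheWrite, noCache, cacheRead, cacheWrite };
}
```

With a cumulative `input_tokens` of 780000 and a last iteration of 1000 non-cached, 70000 cache-read, and 9000 cache-write tokens, this reports a total of 80000 rather than the cumulative figure.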

OpenCode config example with thinking variants

{
  "provider": {
    "claude-code": {
      "npm": "opencode-claude-code-plugin",
      "models": {
        "sonnet": {
          "name": "Claude Code Sonnet",
          "limit": { "context": 1000000, "output": 64000 },
          "capabilities": { "reasoning": true, "toolcall": true },
          "variants": {
            "low": { "thinking": { "type": "adaptive" }, "effort": "low" },
            "medium": { "thinking": { "type": "adaptive" }, "effort": "medium" },
            "high": { "thinking": { "type": "adaptive" }, "effort": "high" },
            "max": { "thinking": { "type": "adaptive" }, "effort": "max" }
          }
        }
      }
    }
  }
}

Test plan

  • Verified stream events with AI SDK v6 streamText + fullStream (correct event ordering)
  • Verified stream completes with maxSteps: 5 (no unnecessary follow-up steps)
  • E2E test with 28 tool calls: final text (1128 chars) arrives correctly
  • E2E test with exact user prompt (21 tools): final text (2556 chars) arrives correctly
  • Tested in OpenCode v1.4.3: no API errors, no hanging thinking indicator, no truncated responses
  • Tested effort level selection in OpenCode UI
  • Token counts no longer inflated (last iteration only)

Follow-ups

Two independent follow-up PRs split from the previous combined submission:

  • Image/vision support in user messages
  • Helper script to import existing Claude Code .jsonl sessions into the OpenCode DB

Aptul9 added 3 commits April 17, 2026 19:46
- Upgrade specificationVersion from v2 to v3, eliminating the
  compatibility layer that caused the thinking indicator to hang
  in OpenCode.
- Return usage in V3 format (nested objects with total/cacheRead/etc)
  fixing the 'usage2.inputTokens.total' crash in AI SDK v6.
- Filter empty text blocks in message builder to prevent
  'cache_control cannot be set for empty text blocks' API errors.
- Return empty stream when no user content is available, preventing
  API errors on follow-up calls with empty messages.
- Always use finishReason 'stop' since the CLI executes all tools
  internally and returns complete results.
- Add --effort flag passthrough: reads effort level from OpenCode
  variant selection and passes it to the Claude CLI.
- Upgrade @ai-sdk/provider to 3.0.8 and @ai-sdk/provider-utils
  to 4.0.23 for V3 type support.

CLI returns input_tokens as non-cached count only, while cacheRead and
cacheWrite are separate. OpenCode expects inputTokens.total to include
all input tokens (cached + non-cached), then subtracts cache internally.

Pass total = input + cacheRead + cacheWrite so OpenCode's subtraction
produces the correct non-cached count instead of going negative.
Also populate noCache field with the raw non-cached input count.

- Emit text-start/delta/end per CLI text block instead of one pair for
  the entire turn. Each text block is saved immediately by OpenCode as a
  separate part, preventing loss when the stream is aborted before the
  result event arrives.
- Add providerExecuted flag to tool-input-start events so OpenCode does
  not re-loop looking for unresolved tool calls.
- Use last iteration token counts from CLI usage instead of cumulative
  totals across all internal tool-use iterations — prevents inflated
  context counters and premature compaction.
- Add 5s result fallback timer: if the CLI emits a final text block but
  never sends a result event, the stream closes gracefully after 5
  seconds with accumulated content.
- On abort signal, start grace period instead of closing immediately,
  allowing in-flight CLI messages to be processed.
- Fix Copilot review comments: re-enable DTS generation, remove unused
  LanguageModelV3StreamPart import, remove ai from runtime dependencies.
- Remove file-based trace logging.