fix(provider): restore native OpenRouter endpoint when switching back from a direct profile by prateekjain-afk · Pull Request #280 · 1jehuang/jcode

prateekjain-afk · 2026-05-29T06:46:45Z

Problem

Switching models OpenRouter → NVIDIA NIM → back to a native OpenRouter model (e.g. openrouter/owl-alpha) fails with a 404. The request is sent to the wrong endpoint:

endpoint: https://integrate.api.nvidia.com/v1/chat/completions
model:    openrouter/owl-alpha
auth:     NVIDIA_API_KEY
→ status: 404 Not Found  (404 page not found)

Root cause

Selecting a built-in OpenAI-compatible profile (e.g. NVIDIA NIM via nvidia-nim:...) calls force_apply_openai_compatible_profile_env(Some(profile)), which stamps that profile's endpoint + API key into the process-global JCODE_OPENROUTER_* env vars. The ActiveProvider::OpenRouter arm of set_model never cleared those overrides when switching back, so a native OpenRouter model was POSTed to the stale profile endpoint with the wrong key.

Because the leak lives in process-global env, even brand-new sessions kept failing until the server was fully restarted. Other providers (Claude/OpenAI/Gemini/Copilot) are unaffected because they use self-contained providers and never touch this shared env.
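The leak can be illustrated with a small model. This is a sketch, not jcode's code: a `HashMap` stands in for the process-global environment, `apply_profile` and `resolve_endpoint` are hypothetical helpers, and the native base URL constant is an assumption. Only the `JCODE_OPENROUTER_API_BASE` name and the NVIDIA URL come from the description above.

```rust
use std::collections::HashMap;

/// Stand-in for the process-global environment (a model, not the real env).
type Env = HashMap<String, String>;

/// Assumed native OpenRouter base URL, used only for illustration.
const NATIVE_BASE: &str = "https://openrouter.ai/api/v1";
const API_BASE_VAR: &str = "JCODE_OPENROUTER_API_BASE";

/// Hypothetical helper: models stamping (Some) or clearing (None) the
/// endpoint override that a direct profile writes into shared state.
fn apply_profile(env: &mut Env, endpoint: Option<&str>) {
    match endpoint {
        Some(url) => {
            env.insert(API_BASE_VAR.to_string(), url.to_string());
        }
        None => {
            env.remove(API_BASE_VAR);
        }
    }
}

/// Hypothetical helper: resolves the endpoint the way any reader of the
/// shared state would, falling back to the native base when unset.
fn resolve_endpoint(env: &Env) -> String {
    env.get(API_BASE_VAR)
        .cloned()
        .unwrap_or_else(|| NATIVE_BASE.to_string())
}
```

In this model, selecting NVIDIA NIM and then a native model without calling `apply_profile(&mut env, None)` leaves `resolve_endpoint` returning the NVIDIA URL. Because the state is shared, a freshly created session sees the same stale value.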

Fix

In the OpenRouter set_model arm: when the previous selection was a built-in direct profile (profile_id.is_some()) and the target is a native openrouter.ai catalog model (id starts with openrouter/), reset the profile env to None and rebuild the provider so it talks to the native endpoint again.

Deliberately left untouched:

- raw/custom endpoints configured directly via JCODE_OPENROUTER_API_BASE (profile_id == None)
- @provider-pinned or opaque model ids on forced-OpenRouter providers
- locked named profiles (JCODE_PROVIDER_PROFILE_ACTIVE)
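The guard described in this section can be sketched as a pure predicate. The `Selection` struct, its field names, and the function name are hypothetical and chosen for illustration. The two positive conditions and the locked-profile exclusion follow the description above. The sketch does not attempt to model @provider-pinned or opaque ids.

```rust
/// Hypothetical summary of the state the OpenRouter `set_model` arm inspects.
struct Selection<'a> {
    /// Some(..) when the previous selection was a built-in direct profile.
    previous_profile_id: Option<&'a str>,
    /// True when a named profile is locked (JCODE_PROVIDER_PROFILE_ACTIVE).
    named_profile_locked: bool,
    /// The model id being switched to.
    target_model: &'a str,
}

/// Returns true when the profile env should be reset to None and the
/// provider rebuilt so requests go to the native endpoint again.
fn should_restore_native_endpoint(sel: &Selection<'_>) -> bool {
    let was_direct_profile = sel.previous_profile_id.is_some();
    let is_native_catalog_model = sel.target_model.starts_with("openrouter/");
    was_direct_profile && is_native_catalog_model && !sel.named_profile_locked
}
```

A raw endpoint set directly through `JCODE_OPENROUTER_API_BASE` has no profile id, so the first condition is false and the override is left alone.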

Tests

Adds test_switch_back_to_native_openrouter_restores_endpoint_after_nvidia, which reproduces the OpenRouter → NVIDIA → OpenRouter switch-back and asserts the endpoint override is cleared. It fails without the fix with the exact integrate.api.nvidia.com URL, and passes with it. Verified no regressions in the provider test suite (the only remaining failures are pre-existing parallel-env-contamination flakes present on master that pass in isolation).

… from a direct profile

Selecting a built-in OpenAI-compatible profile (e.g. NVIDIA NIM via "nvidia-nim:...") calls force_apply_openai_compatible_profile_env(Some(profile)), which stamps that profile's endpoint and API key into the global JCODE_OPENROUTER_* env. Switching back to a native OpenRouter catalog model ("openrouter/owl-alpha") never cleared those overrides, so the native model was POSTed to the stale profile endpoint (https://integrate.api.nvidia.com/v1) with the wrong key and returned 404. Because the leak lives in process-global env, even brand-new sessions kept failing until the server was restarted.

Fix: in the OpenRouter set_model arm, when the previous selection was a built-in direct profile (profile_id is Some) and the target is a native openrouter.ai catalog model, reset the profile env to None and rebuild the provider so it talks to the native endpoint again.

Raw/custom JCODE_OPENROUTER_API_BASE endpoints (profile_id == None), @-pinned ids, and locked named profiles are deliberately left untouched.

Adds a regression test that reproduces the OpenRouter -> NVIDIA -> OpenRouter switch-back and asserts the endpoint override is cleared (fails without the fix with the exact integrate.api.nvidia.com URL).

Spawned swarm agents got stuck forever at 'startup queued' because the default spawn mode was Visible: the server forks a terminal launcher (e.g. 'open -a Terminal'), the fork succeeds, but on a server/headless host (jcode serve shared server, no GUI) no interactive client ever attaches to drive the agent loop. The member sits 'running / startup queued', DMs land in an unread mailbox, and wake/resume fail because no task ever ran.

Fixes:

- Auto mode now verifies a visible launch actually produced a live client attachment (SwarmMember.event_txs becomes non-empty) within a short timeout; if not, it tears down the orphaned visible session and falls back to the in-process headless runner, which always executes.
- register_visible_spawned_member no longer clobbers a member that a real client already attached to (avoids a race when a client connects during the Auto attach-wait window).
- Default swarm_spawn_mode changed Visible -> Auto so swarm works out of the box on both desktop and headless hosts.

Adds unit tests for attach detection, timeout fallback, and the non-clobber guard.
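The attach-wait with headless fallback can be sketched roughly as follows. This is a synchronous, std-only illustration under stated assumptions: the real code is likely asynchronous, and `Member`, `SpawnOutcome`, `spawn_auto`, and the polling interval are hypothetical. Only the idea of watching `event_txs` become non-empty within a timeout comes from the commit message.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a swarm member; one entry per attached client.
#[derive(Default)]
struct Member {
    event_txs: Vec<u32>,
}

#[derive(Debug, PartialEq)]
enum SpawnOutcome {
    Visible,
    HeadlessFallback,
}

/// Polls until a client has attached or the timeout elapses.
fn wait_for_live_attachment(member: &Arc<Mutex<Member>>, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // The lock guard is a temporary, released before sleeping.
        if !member.lock().unwrap().event_txs.is_empty() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(Duration::from_millis(5));
    }
}

/// Keeps the visible session if a client attached in time; otherwise
/// reports that the caller should tear it down and run headless.
fn spawn_auto(member: &Arc<Mutex<Member>>, timeout: Duration) -> SpawnOutcome {
    if wait_for_live_attachment(member, timeout) {
        SpawnOutcome::Visible
    } else {
        // Teardown of the orphaned visible session would happen here.
        SpawnOutcome::HeadlessFallback
    }
}
```

Polling keeps the sketch short. A notification primitive such as a condition variable or channel would avoid the wake-up latency at the cost of more plumbing.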

…ng/master

Provides a safe, one-command way to pull upstream (origin/master) updates promptly while keeping local fix commits on top. Tags a backup before rewriting history and aborts cleanly on conflict.

…ing "action missing" label

Two swarm UX bugs surfaced when running a research swarm on a headless `jcode serve` shared server:

1. Auto spawn opened a useless bare-jcode Terminal window per child and then waited out the 8s attach timeout before falling back to headless. Now Auto checks up-front whether the requesting coordinator itself has a live interactive client (event_txs). If not (headless server), it skips the visible attempt entirely and spawns the child headless immediately, eliminating the orphan window and the per-spawn delay. The post-launch wait_for_live_attachment safety net is retained for the case where an attached coordinator opens a child window that fails to attach.
2. The TUI rendered swarm/memory/initiative/side_panel tool calls as "action missing" (with a warning logged) whenever the streamed tool input had not yet populated its arguments (empty object). This flashed "swarm action missing" for every spawned agent. Added tool_input_is_unpopulated + resolve_tool_action_for_display so an unpopulated/streaming call shows a neutral "…" without logging, while a genuinely malformed (populated-but-action-less) call still surfaces the diagnostic.

Adds 6 unit tests (3 in jcode-app-core, 3 in jcode-tui).
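The display logic in item 2 might look roughly like the sketch below. The two function names are taken from the commit message, but their signatures, the `ActionLabel` enum, and the use of a `BTreeMap` in place of a JSON object are assumptions made to keep the example free of external crates.

```rust
use std::collections::BTreeMap;

/// Stand-in for a streamed JSON object of tool arguments (assumption).
type ToolInput = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
enum ActionLabel {
    /// The call names an action.
    Known(String),
    /// Arguments have not streamed in yet; shown as a neutral placeholder.
    Pending,
    /// Arguments are present but no action is given; a real diagnostic.
    Missing,
}

impl ActionLabel {
    /// Text the UI would render for this label.
    fn text(&self) -> String {
        match self {
            ActionLabel::Known(action) => action.clone(),
            ActionLabel::Pending => "…".to_string(),
            ActionLabel::Missing => "action missing".to_string(),
        }
    }
}

/// True while the streamed input is still an empty object.
fn tool_input_is_unpopulated(input: &ToolInput) -> bool {
    input.is_empty()
}

/// Distinguishes "not streamed yet" from "populated but malformed".
fn resolve_tool_action_for_display(input: &ToolInput) -> ActionLabel {
    if tool_input_is_unpopulated(input) {
        return ActionLabel::Pending;
    }
    match input.get("action") {
        Some(action) if !action.is_empty() => ActionLabel::Known(action.clone()),
        _ => ActionLabel::Missing,
    }
}
```

Only the `Missing` case would warrant a logged warning, which matches the stated goal of silencing the noise for calls that are merely mid-stream.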

…ed serve server

The previous Auto gate keyed off the coordinator session having a live event channel, but an interactive coordinator attached to a *detached* `jcode serve` shared server still reports as attached, so visible spawns were still attempted (orphan window + 8s attach-timeout per child).

Switch the signal to whether THIS process has a controlling TTY: a detached `jcode serve` server has none (ps TTY `??`), while an interactive jcode/desktop run by a user does. When detached, Auto spawns children headless directly.

Add JCODE_SWARM_FORCE_VISIBLE=1 escape hatch. session_has_live_attachment is retained under #[cfg(test)].

Adds running_as_detached_server_respects_force_visible_override test.
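One way to express the controlling-TTY gate is sketched below. The commit does not say how the TTY is detected, so the `/dev/tty` probe is an assumption (a common Unix technique), as are the function signatures and the non-Unix fallback. The decision function takes its inputs as parameters so it can be tested without touching the real environment.

```rust
use std::io::IsTerminal;

/// Pure decision: the override value and the TTY probe result are injected.
/// `force_visible` models the value of JCODE_SWARM_FORCE_VISIBLE, if set.
fn running_as_detached_server(force_visible: Option<&str>, has_controlling_tty: bool) -> bool {
    if force_visible == Some("1") {
        // Escape hatch: never treat the process as detached.
        return false;
    }
    !has_controlling_tty
}

/// Assumed probe: on Unix, opening /dev/tty fails when the process has no
/// controlling terminal. Elsewhere, fall back to checking stdin.
fn has_controlling_tty() -> bool {
    if cfg!(unix) {
        std::fs::File::open("/dev/tty").is_ok()
    } else {
        std::io::stdin().is_terminal()
    }
}

/// Wires the real environment and probe into the pure decision.
fn detect_detached_server() -> bool {
    let force = std::env::var("JCODE_SWARM_FORCE_VISIBLE").ok();
    running_as_detached_server(force.as_deref(), has_controlling_tty())
}
```

Keeping the environment read and the probe out of the decision function avoids mutating process-global state in tests, the same class of cross-test contamination noted in the Tests section of this PR.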

prateek310524 added 2 commits May 29, 2026 19:27

prateekjain-afk force-pushed the fix/openrouter-nvidia-endpoint-leak branch from 00f546a to 8e35c8d Compare May 29, 2026 14:02

prateek310524 added 3 commits May 29, 2026 19:32

chore(scripts): add sync_upstream.sh to rebase fork fixes onto 1jehua…

1a9bb87


prateekjain-afk wants to merge 5 commits into 1jehuang:master from prateekjain-afk:fix/openrouter-nvidia-endpoint-leak
