
Desktop: parallel WSL + Windows backends with mode picker #2751

Open
Jgratton24 wants to merge 50 commits into pingdotgg:main from Jgratton24:josh/desktop-wsl-parallel-backends

Conversation


@Jgratton24 Jgratton24 commented May 18, 2026

Stacked on top of #2353. That PR let users pick one backend at a time (Windows or WSL, swap to switch). This PR makes them run side by side so projects on both sides are always reachable.

What Changed

  • Run the Windows backend and a WSL backend in parallel. Each project is routed to the backend it lives on; opening a Windows project doesn't tear down the WSL one, and vice versa.
  • Added a Run WSL only mode for users who develop entirely inside WSL and don't want a second backend process. T3 Code restarts when the mode changes so the new primary takes effect cleanly.
  • New mode-choice modal when enabling WSL: pick Run both backends (the new default) or Use only WSL upfront.
  • Consolidated the previous separate "WSL backend" switch and "WSL distro" picker into a single dropdown. "Off" stops the backend, picking a distro starts it on that distro.
  • Confirmation dialogs before destructive transitions (disable, distro switch, mode change) when there's saved-env state on the host that could be affected.
  • Sidebar indicator while the secondary WSL backend is cold-booting.

Why

#2353 treated the backends as mutually exclusive: switching from Windows to WSL stopped one and started the other. That works if you only work on one side, but in a mixed workflow you constantly have projects on both. Swapping interrupted whatever was running on the other backend and forced you to wait for the new one to come up before you could open anything.

Running them in parallel removes the swap entirely. The Windows backend stays primary for Windows projects, a WSL backend runs alongside it for projects that live on the Linux side, and the renderer routes per-project. The "Run WSL only" mode is the escape hatch for users who don't want two processes at all.

UI Changes

Enable-mode picker (shown when the user picks a distro from the Off state):

electron_xZfqTanAmi

Connections settings panel with the consolidated WSL backend picker and "Run WSL only" row:

electron_6fhHUp0m5h

Sidebar indicator while the WSL backend is cold-booting:

electron_dGOvMHhVog

Checklist

  • This PR is small and focused
  • I explained what changed and why
  • I included before/after screenshots for any UI changes
  • I included a video for animation/interaction changes (n/a)

Note

High Risk
High risk because it refactors core backend process management/bootstrapping and adds new WSL process orchestration (env forwarding, port scanning, bootstrap delivery), which can affect startup, shutdown, and local connectivity on Windows hosts.

Overview
Desktop backend lifecycle is refactored from a singleton to a multi-instance pool. DesktopBackendManager becomes a per-instance factory with new config fields (args, extendEnv, bootstrapDelivery, preflightFailure) and DesktopBackendPool is introduced to register/list/stop instances; app shutdown and update install now stop all instances concurrently.

Adds WSL backend support alongside the primary backend. DesktopBackendConfiguration splits into resolvePrimary/resolveWsl (shared bootstrap token), supports WSL preflight + distro IP handling, forwards secrets via WSLENV, and switches bootstrap delivery to stdin for WSL. A new DesktopWslBackend orchestrator reconciles a WSL instance based on persisted settings, scanning loopback ports and registering/unregistering instances in the pool.

IPC/settings/observability are updated for multiple local environments. The preload + IPC API changes from getLocalEnvironmentBootstrap to getLocalEnvironmentBootstraps, adds WSL IPC (getWslState, setWslBackendEnabled, setWslDistro, setWslOnly), routes pickFolder by optional target environment id (including WSL defaults), adds per-instance backend child logs via DesktopBackendOutputLogFactory, and extends DesktopAppSettings with wslBackendEnabled/wslDistro/wslOnly plus migration from legacy wslMode.

Reviewed by Cursor Bugbot for commit b62348b.

Note

Add parallel WSL + Windows backend support with a mode picker in Connection Settings

  • Introduces a DesktopBackendPool that manages multiple named backend instances (primary Windows + optional WSL), replacing the single DesktopBackendManager singleton
  • Adds a new DesktopWslEnvironment service for WSL interrogation (distro listing, path translation, node-pty verification) and a DesktopWslBackend service that reconciles a running WSL backend instance against persisted settings
  • Extends DesktopBackendConfiguration with resolvePrimary (fd3 bootstrap, extends env) and resolveWsl (spawns via wsl.exe, delivers bootstrap via stdin, forwards API keys through WSLENV)
  • Adds WSL settings (wslBackendEnabled, wslDistro, wslOnly) to DesktopAppSettings with migration from the legacy wslMode field; wslOnly mode triggers an app relaunch
  • Exposes WSL IPC methods (getWslState, setWslBackendEnabled, setWslDistro, setWslOnly) and replaces the single getLocalEnvironmentBootstrap bridge method with getLocalEnvironmentBootstraps returning an array
  • Adds a WSL backend picker with confirmation dialogs to the Connections Settings panel, with suppressed reconnect toasts during intentional backend swaps
  • Desktop-local (WSL) saved environment records are excluded from persistence and shown with a container icon and 'Local sandbox' label in the sidebar
  • Desktop bootstrap credentials are now valid for 24 hours with unlimited reuse, replacing the previous 5-minute single-use token
  • Risk: wslOnly mode reroutes the Electron window to a WSL distro IP; if WSL is misconfigured the window will fail to load

Macroscope summarized b62348b.

Jgratton24 added 30 commits May 16, 2026 11:38
Stdin pipes inherited across the wsl.exe boundary fail to re-open
via /proc/self/fd/0, so add EACCES to the codes that drop back to
reading the fd directly. Without this the WSL desktop backend
fails to load its bootstrap envelope with "Failed to duplicate
bootstrap fd" and ends up in a scheduled-restart loop.
Lets the desktop app launch the local backend inside a WSL distro
instead of natively on Windows. Adds:

- Backend plumbing (apps/desktop/src/wsl): pure path parsing
  utilities, a DesktopWslEnvironment Effect service wrapping
  wsl.exe operations (listDistros, preWarm, windowsToWslPath,
  ensureNodePty, isAvailable), and an explicit preflight that
  checks for missing node / build tools before spawning so the
  failure message names the actual problem.
- Spawn path: DesktopBackendConfiguration branches on the new
  wslMode setting and assembles "wsl.exe -d <distro> -- node
  <linux-entry> --bootstrap-fd 0" with the bootstrap envelope on
  stdin (wsl.exe drops additional file descriptors). Sensitive env
  vars forward via WSLENV; --dev-url is passed as a CLI flag so the
  WSL dev backend deterministically lands in dev/ instead of userdata/.
  The Windows-side T3CODE_HOME is scrubbed and extendEnv is
  disabled for WSL so the WSL backend cannot accidentally share a
  baseDir with the local backend via /mnt/c/...
- Settings: wslMode + wslDistro on DesktopAppSettings, with
  validation that drops distro names containing control or shell
  meta characters. Contracts get DesktopWslMode / DesktopWslDistro
  / DesktopWslState schemas.
- IPC: getWslState and setWslBackend on the desktop bridge. The
  setter pre-warms the WSL VM, persists settings, then drives an
  in-process backend stop + start with a 2-minute readiness wait
  and a rollback path that reverts to the previous mode if the new
  backend never reports ready. pickFolder defaults to the WSL home
  UNC path when wslMode is "wsl".
- Web UI: backend-runtime selector in Connection Settings with a
  three-stage swap modal (restarting / re-establishing session /
  syncing) that suppresses the WS reconnect toast for the duration
  of the swap, waits for the new backend's welcome event before
  closing, and clears the previous env's store state so the sidebar
  does not render stale threads. New suppressReconnect helper
  on the connection-status atom plus exports for the descriptor
  refresh and reauth used by the swap flow.
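
A minimal sketch of the spawn path described above, assuming a plain-object result instead of the real DesktopBackendStartConfig. The names WslSpawnPlan, appendWslEnvNames and buildWslSpawnPlan are hypothetical, and the position of --dev-url after --bootstrap-fd is an assumption. Only the argv shape and the WSLENV forwarding come from the commit message.

```typescript
// Hypothetical result shape; the PR's real start-config type is not shown here.
interface WslSpawnPlan {
  readonly command: string;
  readonly args: readonly string[];
  readonly env: Readonly<Record<string, string>>;
}

// WSLENV is a colon-separated list of variable names, each optionally
// followed by /flags. Append names without duplicating existing entries.
function appendWslEnvNames(existing: string | undefined, names: readonly string[]): string {
  const entries = (existing ?? "").split(":").filter((entry) => entry.length > 0);
  const seen = new Set(entries.map((entry) => entry.split("/")[0]));
  for (const name of names) {
    if (!seen.has(name)) {
      entries.push(name);
      seen.add(name);
    }
  }
  return entries.join(":");
}

function buildWslSpawnPlan(input: {
  readonly distro: string | null;
  readonly linuxEntry: string;
  readonly devUrl?: string;
  readonly hostEnv: Readonly<Record<string, string | undefined>>;
  readonly forwardedNames: readonly string[];
}): WslSpawnPlan {
  const args: string[] = [];
  // A null distro means "use the WSL default", so -d is omitted.
  if (input.distro !== null) args.push("-d", input.distro);
  // The bootstrap envelope arrives on stdin, hence fd 0.
  args.push("--", "node", input.linuxEntry, "--bootstrap-fd", "0");
  if (input.devUrl !== undefined) args.push("--dev-url", input.devUrl);

  // Start from an empty env so Windows-side values such as T3CODE_HOME
  // are not carried over; copy only the allow-listed names that are set.
  const env: Record<string, string> = {};
  const present: string[] = [];
  for (const name of input.forwardedNames) {
    const value = input.hostEnv[name];
    if (value !== undefined) {
      env[name] = value;
      present.push(name);
    }
  }
  env["WSLENV"] = appendWslEnvNames(input.hostEnv["WSLENV"], present);
  return { command: "wsl.exe", args, env };
}
```
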
- drain stdout/stderr concurrently in runWslShell so node-gyp output
  on both pipes can't deadlock the child
- short-circuit waitForReady when desiredRunning flips off, so an
  external stop() during swap doesn't waste the full timeout
- guard runSwap continuations after the 180s flow timeout so an
  orphaned IPC resolution can't overwrite rolled-back UI state
- drop unused refreshPrimaryEnvironmentDescriptor — descriptor URL is
  stable across the swap and consumers re-fetch lazily
…tate removal

- move clearTimeout(flowTimeoutHandle) into the finally block so it fires
  on success, error, and timeout — previously the error path left a live
  180s timer that would reject an unreferenced promise (unhandled rejection)
- remove the second removeEnvironmentState call after welcome; the first
  call right after the IPC swap already wipes the old environment state
  and nothing recreates state under the old env id during reauth/welcome
… mapping, surface failed rollback

- map null distro to the actual default distro name in the backend select
  so the dropdown highlights a real option instead of an orphan "__default__"
  with no matching item when distros are listed
- add getUserHome to DesktopWslEnvironment (cached per distro) and pass the
  resolved /home/<user> into the picker helper so ~/path expands correctly
  instead of producing /home/<rest>
- surface a clearer error when the rollback backend also fails to start, so
  the user knows the app is degraded rather than seeing the misleading
  "Rolled back to the previous mode" message
…undant WSLENV entry

- swap "__local__" / "__default__" select values for "backend:local" /
  "backend:default-wsl" — the colon is rejected by DISTRO_NAME_PATTERN so
  the sentinels can never collide with an actual WSL distro name
- remove VITE_DEV_SERVER_URL from WSL_FORWARDED_ENV_NAMES; the value is
  delivered exclusively via the --dev-url CLI flag because WSLENV translation
  of URL-shaped values is unreliable, and keeping it in both paths
  contradicted the comment at the CLI-flag site
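
A small sketch of why the colon-bearing sentinels cannot collide with a distro name. The character class below is an assumption; the real DISTRO_NAME_PATTERN is not shown in this PR text, only the fact that it rejects ":". The parse helper and its result type are hypothetical.

```typescript
// Assumed pattern: letters, digits, dot, underscore, hyphen.
// The real DISTRO_NAME_PATTERN may differ; it is only known to reject ":".
const ASSUMED_DISTRO_NAME_PATTERN = /^[A-Za-z0-9._-]+$/;

const BACKEND_VALUE_LOCAL = "backend:local";
const BACKEND_VALUE_DEFAULT_WSL = "backend:default-wsl";

type BackendSelection =
  | { readonly kind: "local" }
  | { readonly kind: "wsl"; readonly distro: string | null }
  | { readonly kind: "invalid" };

// Sentinels are checked first; anything else must pass distro validation.
function parseBackendSelectValue(value: string): BackendSelection {
  if (value === BACKEND_VALUE_LOCAL) return { kind: "local" };
  if (value === BACKEND_VALUE_DEFAULT_WSL) return { kind: "wsl", distro: null };
  if (ASSUMED_DISTRO_NAME_PATTERN.test(value)) return { kind: "wsl", distro: value };
  return { kind: "invalid" };
}
```
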
The dropdown maps state.distro: null to the actual default distro's name
so the Select highlights a real option, but the no-op check still compared
target.distro (e.g. "Ubuntu") against state.distro (null). Re-picking the
visually-active row opened the confirmation dialog and triggered a full
backend restart for what was clearly a no-op. Resolve both sides through
the same null->default mapping before comparing.
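
The fix can be sketched as follows, assuming a simplified selection shape. WslSelectionSketch, resolveDistro and isNoOpSelection are hypothetical names for illustration, not the identifiers used in ConnectionsSettings.

```typescript
interface WslSelectionSketch {
  readonly mode: "local" | "wsl";
  readonly distro: string | null;
}

// null means "track the WSL default", so resolve it to the default's name.
function resolveDistro(distro: string | null, defaultDistro: string | null): string | null {
  return distro ?? defaultDistro;
}

// Compare both sides after the same null->default mapping, so re-picking
// the visually active row is recognized as a no-op.
function isNoOpSelection(
  current: WslSelectionSketch,
  target: WslSelectionSketch,
  defaultDistro: string | null,
): boolean {
  if (current.mode !== target.mode) return false;
  if (current.mode === "local") return true;
  return resolveDistro(current.distro, defaultDistro) === resolveDistro(target.distro, defaultDistro);
}
```
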
The renderer's 180s ceiling was shorter than the IPC's worst-case duration:
setWslBackend can take up to ~2min for the initial readiness wait plus
another ~2min for the rollback readiness wait before throwing
WslBackendSwapError, so the client was firing "Backend swap took too long"
while the main process was still actively rolling back. Bump the ceiling
to 6 minutes (4min IPC worst case + ~60s reauth retry budget + 45s welcome
race) so a real hang still surfaces but a legitimate rollback completes.
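
The budget arithmetic above, written out. The constant names are hypothetical and the approximate figures from the message ("~2min", "~60s") are taken as exact for the sake of the sum.

```typescript
const READINESS_WAIT_MS = 2 * 60_000;              // one readiness wait
const IPC_WORST_CASE_MS = 2 * READINESS_WAIT_MS;   // initial wait + rollback wait
const REAUTH_RETRY_BUDGET_MS = 60_000;
const WELCOME_RACE_MS = 45_000;

const PREVIOUS_FLOW_TIMEOUT_MS = 180_000;          // the old renderer ceiling
const FLOW_TIMEOUT_MS = 6 * 60_000;                // the new renderer ceiling

// Worst legitimate duration: 240s + 60s + 45s = 345s.
const worstCaseMs = IPC_WORST_CASE_MS + REAUTH_RETRY_BUDGET_MS + WELCOME_RACE_MS;

// Remaining slack under the 6-minute ceiling: 360s - 345s = 15s.
const headroomMs = FLOW_TIMEOUT_MS - worstCaseMs;

// The old ceiling was shorter than the IPC worst case alone.
const oldCeilingTooShort = PREVIOUS_FLOW_TIMEOUT_MS < IPC_WORST_CASE_MS;
```
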
…n through error recovery

- remove the unused `enabled` field from WslConfig and the unreferenced
  DEFAULT_WSL_CONFIG export; the toggle moved to DesktopAppSettings.wslMode
  during the migration and the field was carried along by every caller as
  noise that didn't influence behavior
- wrap the entire backend-swap flow (success + catch) in suppressReconnect
  so the catch-block reauth doesn't fire reconnect/offline toasts on top of
  the error toast the user is reading. The previous structure only
  suppressed during the happy path; recovery work landed outside the window
…e-fire false resolve

onWelcome subscribes with `immediate: true`, so the listener fires
synchronously with whatever welcome payload is already in the atom. The
previous code compared against `previousPrimaryEnvId` (descriptor-derived);
if the descriptor hadn't loaded yet, that was null and any non-null current
welcome would resolve the promise instantly, completing the "syncing" stage
before the new backend's welcome actually arrived. Capture the current
welcome's env-id from the atom as the baseline instead so the immediate
fire never matches the "new welcome arrived" predicate.
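
A sketch of the baseline capture, using a tiny stand-in for the atom. WelcomeAtomSketch and watchForNewWelcome are hypothetical; the real onWelcome API is not shown here. Only the `immediate: true` behavior and the baseline idea come from the message.

```typescript
type WelcomeListener = (envId: string | null) => void;

// Minimal stand-in for an atom whose subscriptions fire immediately.
class WelcomeAtomSketch {
  private current: string | null;
  private readonly listeners = new Set<WelcomeListener>();

  constructor(initial: string | null) {
    this.current = initial;
  }

  get(): string | null {
    return this.current;
  }

  set(envId: string): void {
    this.current = envId;
    this.listeners.forEach((listener) => listener(envId));
  }

  // Fires synchronously with the current value, like `immediate: true`.
  subscribeImmediate(listener: WelcomeListener): () => void {
    this.listeners.add(listener);
    listener(this.current);
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Capture the baseline from the atom itself before subscribing, so the
// immediate fire can never look like a new welcome.
function watchForNewWelcome(atom: WelcomeAtomSketch, onNew: (envId: string) => void): () => void {
  const baseline = atom.get();
  return atom.subscribeImmediate((envId) => {
    if (envId !== null && envId !== baseline) onNew(envId);
  });
}
```
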
Phase-1 foundation for running Windows and WSL backends side by side.
Introduces:

* BackendInstanceId brand + PRIMARY_INSTANCE_ID constant
* DesktopBackendInstance interface mirroring the legacy backend manager
  surface so consumers can migrate one call site at a time
* DesktopBackendPool service with get/list/primary operations
* Phase-1 layer that wraps the existing singleton DesktopBackendManager
  and exposes it under PRIMARY_INSTANCE_ID, no behavior change
* layerTest helper for unit tests

The pool is wired into the desktop application layer alongside the
existing manager; current consumers (window/wsl IPC, lifecycle hooks,
telemetry) still depend on DesktopBackendManager directly. The header
docblock on DesktopBackendPool.ts captures the full migration sequence
for follow-up commits: reshape the manager into an instance factory,
move per-instance state off DesktopState/DesktopBackendOutputLog, wire
WSL as a second pooled instance, widen the bootstrap IPC, and retire
the swap-mode dialog.
Replace the singleton DesktopBackendManager Context.Service with a
factory function makeBackendInstance(spec) that returns one
DesktopBackendInstance per call. Each instance owns its own state Ref,
mutex, restart fiber, and active child process so no state is shared
across pool members.

The pool layer calls the factory once for the Windows primary at
startup, wiring the spec's configResolve to DesktopBackendConfiguration
and the onReady/onShutdown callbacks to the legacy global side effects
(DesktopState.backendReady, DesktopWindow.handleBackendReady). Those
last couplings move per-instance in steps 2 and 3.

All five consumers (DesktopApp bootstrap + shutdown, wsl.ts swap IPC,
window.ts bootstrap IPC, DesktopUpdates installer) now read the primary
instance via pool.primary instead of the deleted manager service. Log
session boundaries are prefixed with the instance id so per-backend
output stays distinguishable until step 3 splits the output log.

DesktopBackendManager.test.ts rewritten to use the factory directly
under Effect.scoped. DesktopUpdates.test.ts swaps its backend stub from
a Layer.succeed(DesktopBackendManager, ...) to
DesktopBackendPool.layerTest([stub]).
…ndow

DesktopState.backendReady was the last global coupling tying backend
lifecycle to app-wide state. With the pool owning per-instance
readiness (instance.snapshot.ready), the only remaining consumer of the
global latch is the window's auto-create-on-ready path. Move ownership
of the latch into DesktopWindow's own internals so DesktopState only
carries truly app-wide state (the quitting flag).

DesktopWindow gains handleBackendNotReady, called by the primary
instance's onShutdown callback so the latch clears on clean stop,
restart, or crash. Without it the macOS dock-click activation path could
produce a window pointing at a backend that is no longer up. The pool
spec wires both callbacks against the window service instead of the
state Ref.

Test stubs for DesktopWindowShape pick up the new handleBackendNotReady
field; DesktopWindow.test.ts drops its DesktopState dependency.
…factory

Backend child output and session boundaries used to land in one shared
server-child.log via the DesktopBackendOutputLog singleton. With a
second backend instance on the way, intermixing two processes' stdout
streams into one file makes triage harder than it needs to be.

Replace the singleton with DesktopBackendOutputLogFactory.forInstance(id)
that vends a rotating writer per backend id. The primary keeps the
historical server-child.log path so existing ops tooling, packaged-build
log inspection, and habit don't break; non-primary instances land in
server-child-<sanitized-id>.log. A SynchronizedRef-backed cache keyed by
id ensures repeated forInstance lookups on the same id reuse the writer
(important under restart loops that re-resolve the factory).
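
A rough sketch of the naming and caching scheme, assuming a plain Map in place of the SynchronizedRef-backed cache. The sanitization rule is an assumption chosen to be consistent with the server-child-wsl_Ubuntu.log file name reported later in this PR; backendLogFileName and WriterCacheSketch are hypothetical names.

```typescript
const PRIMARY_INSTANCE_ID = "primary";

// The primary keeps the historical path; other ids get a sanitized suffix.
function backendLogFileName(instanceId: string): string {
  if (instanceId === PRIMARY_INSTANCE_ID) return "server-child.log";
  // Assumed rule: replace anything outside [A-Za-z0-9._-] with "_".
  const sanitized = instanceId.replace(/[^A-Za-z0-9._-]/g, "_");
  return `server-child-${sanitized}.log`;
}

// Repeated lookups for the same id reuse the same writer.
class WriterCacheSketch<W> {
  private readonly writers = new Map<string, W>();
  private readonly create: (fileName: string) => W;

  constructor(create: (fileName: string) => W) {
    this.create = create;
  }

  forInstance(instanceId: string): W {
    const cached = this.writers.get(instanceId);
    if (cached !== undefined) return cached;
    const writer = this.create(backendLogFileName(instanceId));
    this.writers.set(instanceId, writer);
    return writer;
  }
}
```
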

Each emitted record now carries an instanceId annotation so cross-file
greps can still associate records belonging to the same backend even if
log paths drift later. The redundant "instance=<id>" prefix on session
boundary details is dropped — that info now lives in the structured
annotation and in the file path.

DesktopBackendManager pulls its writer from the factory at instance
construction time so each instance carries a fixed writer for its
lifetime. DesktopBackendManager.test.ts stubs the factory directly;
DesktopObservability.test.ts asserts the new instanceId annotation.
Lays the groundwork for the WSL second-instance orchestrator. Pool now
exposes:
- register(spec): builds a DesktopBackendInstance via the factory under
  a fresh child Scope owned by the pool, adds it to the registry, and
  returns the instance unstarted so the caller decides when to start.
- unregister(id): atomically removes the entry from the registry and
  closes the child scope, which runs the instance's auto-stop finalizer.

Each registered instance lives under its own Scope so a single unregister
stops just that instance without disturbing the rest of the pool. The
primary instance keeps its place in the pool's own layer scope and is
guarded by a DesktopBackendPoolCannotUnregisterPrimaryError; that case
is treated as a wiring bug rather than something callers handle.

The instances Ref upgrades to a SynchronizedRef so register and
unregister can run as serialized modify-effects without racing each
other on the underlying Map. Duplicate registration on the same id
fails with a typed DesktopBackendPoolInstanceAlreadyRegisteredError.
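
The register/unregister contract can be sketched without Effect, assuming a synchronous Map and a closeScope callback standing in for the child Scope. BackendPoolSketch is hypothetical, and plain Error replaces the typed errors named above.

```typescript
interface PoolEntrySketch {
  readonly id: string;
  // Stands in for closing the instance's child Scope (runs auto-stop).
  readonly closeScope: () => void;
}

class BackendPoolSketch {
  private readonly entries = new Map<string, PoolEntrySketch>();
  private readonly primaryId: string;

  constructor(primaryId: string) {
    this.primaryId = primaryId;
    // The primary lives in the pool's own scope, so closing is a no-op here.
    this.entries.set(primaryId, { id: primaryId, closeScope: () => {} });
  }

  // Adds the entry and returns it unstarted; duplicates are rejected.
  register(id: string, closeScope: () => void): PoolEntrySketch {
    if (this.entries.has(id)) throw new Error(`instance already registered: ${id}`);
    const entry: PoolEntrySketch = { id, closeScope };
    this.entries.set(id, entry);
    return entry;
  }

  // Removes the entry first, then closes its scope. The primary is guarded.
  unregister(id: string): boolean {
    if (id === this.primaryId) throw new Error("cannot unregister the primary instance");
    const entry = this.entries.get(id);
    if (entry === undefined) return false;
    this.entries.delete(id);
    entry.closeScope();
    return true;
  }

  list(): string[] {
    return Array.from(this.entries.keys());
  }
}
```
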

No caller registers a second instance yet — that arrives in the next
commit when the WSL orchestrator goes in.
The "local" vs "wsl" swap mode is going away. Windows and WSL backends
will run in parallel as two pool instances, so the setting that drives
WSL only needs to answer "should there be a WSL backend at all". Rename
the persisted field to wslBackendEnabled and replace setWslMode with
two narrower setters (setWslBackendEnabled, setWslDistro) so the
upcoming orchestrator IPC can toggle each independently.

Existing on-disk settings that still carry the legacy wslMode key get
migrated on load: wslMode=="wsl" becomes wslBackendEnabled=true. The
schema still accepts wslMode for one release so users coming off the
swap-mode build keep their selection. The new wslBackendEnabled wins
when both keys are present, and the next persist drops wslMode.
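
A minimal sketch of the on-load migration, assuming simplified setting shapes. PersistedWslSettingsSketch and migrateWslSettings are hypothetical names; the precedence rule is the one stated above.

```typescript
// Shape accepted on load: the legacy key is still allowed for one release.
interface PersistedWslSettingsSketch {
  readonly wslMode?: "local" | "wsl";
  readonly wslBackendEnabled?: boolean;
  readonly wslDistro?: string | null;
}

// Shape written on the next persist: wslMode is no longer present.
interface WslSettingsSketch {
  readonly wslBackendEnabled: boolean;
  readonly wslDistro: string | null;
}

function migrateWslSettings(raw: PersistedWslSettingsSketch): WslSettingsSketch {
  // The new key wins when both are present; otherwise derive it from wslMode.
  const wslBackendEnabled = raw.wslBackendEnabled ?? (raw.wslMode === "wsl");
  return { wslBackendEnabled, wslDistro: raw.wslDistro ?? null };
}
```
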

Consumers that read settings.wslMode get pointed at wslBackendEnabled:
DesktopBackendConfiguration (the resolver still produces a single WSL
config in this commit; the split lands next), the pickFolder IPC, and
the wsl.ts IPC's readWslState/setWslBackend handlers. The wire shape
for the renderer stays the same in this commit so the web app keeps
compiling; the renderer-facing IPC gets reworked alongside the
orchestrator.
…esolvers

DesktopBackendConfiguration.resolve was a single effect that picked
between local and WSL config based on the persisted wslMode. Now that
the two backends run in parallel, each pool instance needs its own
resolver. Split into:

  - resolvePrimary: Effect<DesktopBackendStartConfig>
      Always Windows-native. Reads port/host/exposure from
      DesktopServerExposure.backendConfig like before.

  - resolveWsl({ port, distro }): Effect<DesktopBackendStartConfig>
      Builds a WSL-via-wsl.exe config for the given distro on the
      given port. Doesn't touch DesktopServerExposure since the WSL
      backend is loopback-only by design; the primary owns LAN
      exposure when the user enables network-accessible mode.

Shared bits (bootstrap token via tokenRef, persisted observability
endpoints, env patching, mergeWslEnv) stay private to this module.
Both resolvers reuse the same bootstrapToken so the renderer can
authenticate against either backend with one token.

The WSL config now hardcodes 127.0.0.1 + tailscaleServeEnabled=false
in the bootstrap envelope. The old code copied the primary's host
(could be 0.0.0.0) and tailscale flags into the WSL bootstrap, which
made sense when WSL was a replacement but is wrong when both run
side by side: a tailscale-serve forwarder bound on Windows can't also
bind from inside WSL on the same port. Loopback-only WSL plus the
primary handling LAN exposure is the cleaner v1 contract.

DesktopBackendPool's primary spec now wires configuration.resolvePrimary;
the WSL spec call site lands in the orchestrator commit. Tests updated
to drive the two resolvers explicitly and to assert the shared-token
guarantee.
This is the cut-over to parallel backends. The old "swap the primary
into WSL and bounce it" flow goes away; the WSL backend is now a
second instance registered with the pool, running alongside the
Windows primary. Toggling the WSL backend on/off doesn't touch the
primary at all.

New service DesktopWslBackend (apps/desktop/src/wsl/DesktopWslBackend.ts)
owns the orchestration. Its one entry point, reconcile, reads the
persisted wslBackendEnabled + wslDistro settings, looks at what's
currently registered with the pool, and brings the two in line:

  - If WSL should be running and isn't, allocate a loopback port
    starting one above the primary's, register a fresh instance via
    pool.register({ id, label, configResolve: resolveWsl(...) }),
    and kick off instance.start.
  - If WSL is running with a stale distro selection, unregister the
    old instance (which closes its scope and stops the child process)
    before registering the new one.
  - If WSL should not be running, unregister whatever wsl: instance
    is registered.

reconcile never fails. Port-allocation failures, "WSL not available",
and pool-already-registered errors are logged and the call returns
having left the pool in a consistent state. The primary backend is
never affected.

Instance ids encode the user's distro selection: wsl:default when
wslDistro is null (track the WSL default) and wsl:<distro> otherwise.
These ids are what the env-id work in step 6/7 will key off, so they
stay stable across underlying-default-distro changes — picking
"track default" doesn't reshuffle env ids if the user later sets a
different WSL distro as the default.
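
The reconcile decision and the id encoding can be sketched as a pure planning step, assuming the pool exposes the list of registered ids. planWslReconcile, ReconcileStep and firstFreePortAbove are hypothetical; the real reconcile performs the pool calls itself and probes ports asynchronously.

```typescript
interface WslSettingsSketch {
  readonly wslBackendEnabled: boolean;
  readonly wslDistro: string | null;
}

type ReconcileStep =
  | { readonly kind: "none" }
  | { readonly kind: "register"; readonly id: string }
  | { readonly kind: "unregister"; readonly id: string }
  | { readonly kind: "replace"; readonly remove: string; readonly add: string };

// null tracks the WSL default; the id does not name the underlying distro.
function wslInstanceId(distro: string | null): string {
  return distro === null ? "wsl:default" : `wsl:${distro}`;
}

function planWslReconcile(settings: WslSettingsSketch, registeredIds: readonly string[]): ReconcileStep {
  const current = registeredIds.find((id) => id.startsWith("wsl:"));
  if (!settings.wslBackendEnabled) {
    return current === undefined ? { kind: "none" } : { kind: "unregister", id: current };
  }
  const desired = wslInstanceId(settings.wslDistro);
  if (current === undefined) return { kind: "register", id: desired };
  if (current !== desired) return { kind: "replace", remove: current, add: desired };
  return { kind: "none" };
}

// Scan upward starting one above the primary's port. isFree is a stand-in
// for a real loopback probe; the attempt limit is an assumption.
function firstFreePortAbove(
  primaryPort: number,
  isFree: (port: number) => boolean,
  attempts: number = 50,
): number | null {
  for (let offset = 1; offset <= attempts; offset += 1) {
    const candidate = primaryPort + offset;
    if (candidate > 65535) break;
    if (isFree(candidate)) return candidate;
  }
  return null;
}
```
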

Bootstrap call site (DesktopApp.ts) forks reconcile after the primary
start request. The WSL backend can take a moment to come up the first
time (wsl.exe cold spawn, node-pty build); the fork keeps that off
the primary's critical path.

IPC surface change:

  - Drop setWslBackend({mode, distro}) — the swap call with rollback
    semantics — and the SWAP_READINESS_TIMEOUT / waitForReady /
    in-process primary stop+start dance in apps/desktop/src/ipc/methods/wsl.ts.
  - Add setWslBackendEnabled(boolean) + setWslDistro(string | null).
    Each persists the setting via DesktopAppSettings and then calls
    wslBackend.reconcile to bring the pool in line. No rollback path:
    with both backends running, "WSL didn't come up" is transient
    state on one instance, not a degraded app.
  - Drop DesktopWslMode / DesktopWslModeSchema from contracts. The
    DesktopWslState wire shape changes mode: "local" | "wsl" to a
    plain enabled: boolean.
  - New IPC channels SET_WSL_BACKEND_ENABLED_CHANNEL +
    SET_WSL_DISTRO_CHANNEL.

DesktopBackendConfiguration's resolveWsl bootstrap now hardcodes
tailscaleServePort: 443 when tailscaleServeEnabled is false, because
PortSchema rejects 0. The backend only reads the port when serve is
on, so the value is inert.

Web UI (ConnectionsSettings.tsx) is wired against the new IPC. The
swap ceremony (reauth, welcome-race, suppressReconnect, 6-minute
flow timeout) goes away — toggling is fast and non-destructive now.
The dialog is still the same select-with-confirm shape; step 8 will
rework it into a proper "WSL backend: enabled + distro picker"
control. SettingsPanels.browser.tsx and localApi.test.ts pick up
the new mock shape.
The pool's design-notes block was still describing the step-4 cut-off
("WSL instance not yet registered, IPC still uses swap mode"). Step 5
shipped, so update the "current state" section and convert the
forward-looking migration list into a history block + a short "what's
left" callout for steps 6+. No code changes here, just the header
docblock.
…stances

getLocalEnvironmentBootstrap used to hand back a single bootstrap for
the primary backend. With the WSL backend running as a second pool
instance, the renderer needs to learn about both so step 7 can
register them as separate local environments.

Rename the IPC to getLocalEnvironmentBootstraps (plural) and walk
pool.list, emitting one entry per instance that already has a config.
Instances that are registered but haven't produced a config yet (WSL
backend mid-registration before its first start cycle) are skipped
and will appear on the next call.

The bootstrap entry gains an id field that mirrors the backend
instance id (e.g. "primary" or "wsl:ubuntu"). The renderer uses that
to find the primary entry today (auth.ts, target.ts); step 7 keys
local environments off the same id. PRIMARY_LOCAL_ENVIRONMENT_ID is
exported from contracts so web code can reference the primary by name
without importing brand machinery from the desktop package. The
desktop side wraps the same constant in BackendInstanceId so the two
stay locked together.
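
A sketch of the plural bootstrap listing, assuming simplified instance and entry shapes. The field names httpBaseUrl and bootstrapToken are hypothetical, and "primary" is assumed as the constant's value based on the example id above.

```typescript
// Assumed value, matching the "primary" example id in the message.
const PRIMARY_LOCAL_ENVIRONMENT_ID = "primary";

interface PooledInstanceSketch {
  readonly id: string;
  // null until the instance has produced a config on its first start cycle.
  readonly config: { readonly httpBaseUrl: string } | null;
}

interface LocalEnvironmentBootstrapSketch {
  readonly id: string;
  readonly httpBaseUrl: string;
  readonly bootstrapToken: string;
}

// One entry per instance that already has a config; the rest are skipped
// and show up on a later call.
function listLocalEnvironmentBootstraps(
  instances: readonly PooledInstanceSketch[],
  sharedBootstrapToken: string,
): LocalEnvironmentBootstrapSketch[] {
  const result: LocalEnvironmentBootstrapSketch[] = [];
  for (const instance of instances) {
    if (instance.config === null) continue;
    result.push({
      id: instance.id,
      httpBaseUrl: instance.config.httpBaseUrl,
      bootstrapToken: sharedBootstrapToken,
    });
  }
  return result;
}

// Renderer-side lookup of the primary entry by id.
function findPrimaryBootstrap(
  bootstraps: readonly LocalEnvironmentBootstrapSketch[],
): LocalEnvironmentBootstrapSketch | undefined {
  return bootstraps.find((entry) => entry.id === PRIMARY_LOCAL_ENVIRONMENT_ID);
}
```
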

Test bridge mocks updated. The DesktopBridge casts in
authBootstrap.test.ts went through `as DesktopBridge` previously; the
array-returning plural is structurally different enough that TS
flagged it, so they go through `as unknown as DesktopBridge` now.
PickFolderOptions gains an optional targetEnvironmentId so callers
that know which local backend they're targeting (a project opened in
WSL, for example) can ask for that backend's filesystem picker.

The default behavior is unchanged: when targetEnvironmentId is
undefined, the dialog opens against the Windows-native primary, which
is what every existing caller gets. This is deliberate — most users
never enable the WSL backend and shouldn't see a different picker
showing up. Only callers that explicitly opt in route to WSL.

When targetEnvironmentId starts with "wsl:", the handler uses the
WSL helpers in wslPathParsing.ts. The id encodes the distro
selection (e.g. "wsl:ubuntu") and falls back to the persisted
wslDistro setting when the id is the "wsl:default" sentinel, matching
how DesktopWslBackend.reconcile resolves the same input. The legacy
"if wslBackendEnabled then always use WSL picker" branch is gone —
that was the swap-mode mental model.
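
The routing rule can be sketched as a small resolver, assuming the handler receives the persisted wslDistro alongside the option. PickerTargetSketch and resolvePickerTarget are hypothetical names; the real handler goes on to use the helpers in wslPathParsing.ts.

```typescript
type PickerTargetSketch =
  | { readonly kind: "windows" }
  | { readonly kind: "wsl"; readonly distro: string | null };

const WSL_ID_PREFIX = "wsl:";

function resolvePickerTarget(
  targetEnvironmentId: string | undefined,
  persistedWslDistro: string | null,
): PickerTargetSketch {
  // Default: every caller that does not opt in gets the Windows picker.
  if (targetEnvironmentId === undefined || !targetEnvironmentId.startsWith(WSL_ID_PREFIX)) {
    return { kind: "windows" };
  }
  const selection = targetEnvironmentId.slice(WSL_ID_PREFIX.length);
  // "wsl:default" falls back to the persisted wslDistro setting.
  const distro = selection === "default" ? persistedWslDistro : selection;
  return { kind: "wsl", distro };
}
```
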
The WSL section in ConnectionsSettings was still shaped as a "switch
backend" decision: pick local-or-wsl from one dropdown, confirm in a
modal, watch a "restarting backend" spinner. That mental model is wrong
for parallel backends. Toggling the WSL backend on/off doesn't bounce
the Windows one, and switching distros only restarts the WSL instance.

Replace with two plain rows:

  - "WSL backend" — a switch that enables/disables the second
    backend. Off by default for users who never opted in, so the
    normal flow looks the same as before.
  - "WSL distro" — a select that lists the installed distros, shown
    only when the toggle is on. Changing the selection writes the
    new wslDistro setting and lets the orchestrator restart just the
    WSL instance.

Both controls fire the relevant new IPCs (setWslBackendEnabled,
setWslDistro) without an AlertDialog confirmation. The reconcile is
non-destructive: the orchestrator unregisters and re-registers the
WSL pool instance, the primary stays up.

Drop the confirm-then-apply state machine
(pendingDesktopWslSelection, the dialog markup, the per-stage spinner
copy). The error toast and disabled-while-updating spinner stay so
the user gets feedback if the orchestrator's reconcile fails.

BACKEND_VALUE_LOCAL is gone — there's no longer a "switch to local"
option to express. BACKEND_VALUE_DEFAULT_WSL stays as the sentinel
for the "track the WSL default" choice in the distro picker.
Rewrite the "current state" section so it matches what's actually in
the tree (plural bootstraps IPC, pickFolder routing, toggle-style
settings UX). Note the renderer-side gap: the web env runtime still
treats the primary as the only local environment, and lifting that
requires a per-environment auth bootstrap pass that we deliberately
left for a follow-up. The desktop side is ready when the renderer
takes it up.
Bring up the second local backend in the renderer so its env id
appears in the saved-environment registry alongside any remote saved
envs the user paired. The sidebar, env switcher, CommandPalette, and
project-routing UI all consume that registry, so they pick up the
WSL backend without per-surface plumbing.

How the data flows:
  - On boot, runtime/service.ts calls
    reconcileLocalSecondaryEnvironments() (and again after a 5s delay
    to catch a slow WSL cold boot).
  - The reconciler reads getLocalEnvironmentBootstraps() from the
    desktop bridge. Primary stays owned by the primary/ runtime;
    everything else with a desktopLocal instance id is routed here.
  - For each new instance, the reconciler POSTs the shared bootstrap
    token to /api/auth/bootstrap/bearer on the WSL backend's URL,
    fetches the descriptor, builds a SavedEnvironmentRecord carrying
    a desktopLocal marker, upserts it into the registry, writes the
    bearer to the secret store, and triggers
    ensureSavedEnvironmentConnection.
  - Records carrying desktopLocal are filtered out of the saved-env
    persistence path, so toggling WSL off or switching distros
    doesn't leave stale entries on disk.
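
The last step of the flow, keeping desktopLocal records out of persistence, can be sketched as a filter. The record shape below is a simplified assumption; only the desktopLocal marker and its effect come from the message.

```typescript
interface SavedEnvironmentRecordSketch {
  readonly environmentId: string;
  readonly label: string;
  // Present only on records created for desktop-managed local backends.
  readonly desktopLocal?: { readonly instanceId: string };
}

// Everything is kept in the in-memory registry, but only records without
// the marker are written to disk.
function recordsToPersist(
  records: readonly SavedEnvironmentRecordSketch[],
): SavedEnvironmentRecordSketch[] {
  return records.filter((record) => record.desktopLocal === undefined);
}
```
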

After-toggle wiring: ConnectionsSettings.applyWslSettingChange fires
reconcile after each setWslBackendEnabled/setWslDistro call, then
again after 1.5s for the same slow-boot reason.

Why bearer-token auth instead of cookies:
  - The WSL backend runs on its own loopback port. Cookies are
    per-origin, and the renderer's origin (the primary's URL in
    packaged builds, the vite dev server URL in dev) doesn't match
    that port, so a cookie set on the WSL origin wouldn't ride along.
  - Bearer auth uses the Authorization header which the backend's
    CORS layer already permits, and WS connections use the
    ?wsToken=... pattern that saved environments rely on. No CORS
    surgery on the backend side.
  - Auth state is per-env (a separate bearer per backend); the global
    primary auth gate in primary/auth.ts stays untouched, so the
    normal single-backend flow for non-WSL users is unaffected.

The reconciler is idempotent, dedupes concurrent calls per instance
id, and never throws — errors get logged and the caller can retry by
calling again. If the WSL backend restarts and its old bearer goes
stale, the user toggling settings re-bootstraps a fresh one.

Plumbing changes:
  - ensureSavedEnvironmentConnection exported from runtime/service
    so the reconciler can reuse the saved-env connection lifecycle
    without duplicating it.
  - New removeSavedEnvironmentByInstance variant: same teardown as
    removeSavedEnvironment but skips the secret-store delete, since
    desktopLocal entries may not own a persisted secret to remove.
The command-palette's add-project flow gated the "Open project from
File Manager" affordance on browseEnvironmentId === primary. That
held for the swap-mode world where the desktop only managed one
local backend. With the WSL backend now registered as a saved-env
with a desktopLocal marker, the file-manager picker should be
available there too — and the desktop side already knows how to
route a pickFolder call into the right WSL distro's filesystem when
the renderer passes targetEnvironmentId.

Open the gate to "primary OR desktopLocal" and forward
browseEnvironmentId as targetEnvironmentId on the pickFolder
call. Remote saved environments stay browse-only because the desktop
side has no way to spawn an OS file dialog over there.
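
As a rough illustration of the widened gate, assuming a simplified environment record (the `EnvironmentInfo` type and both function names are hypothetical; `browseEnvironmentId`, `targetEnvironmentId`, and `desktopLocal` are the names used above):

```typescript
interface EnvironmentInfo {
  id: string;
  desktopLocal?: boolean; // set for the locally managed WSL backend
}

// The native folder picker is offered for the primary environment and for
// desktop-local saved environments, but not for true remote ones.
function canUseNativeFolderPicker(
  browseEnvironmentId: string,
  primaryEnvironmentId: string,
  savedEnvironments: readonly EnvironmentInfo[],
): boolean {
  if (browseEnvironmentId === primaryEnvironmentId) return true;
  const env = savedEnvironments.find((e) => e.id === browseEnvironmentId);
  return env?.desktopLocal === true;
}

// The browse target is forwarded so the desktop side can route the dialog
// to the matching filesystem.
function pickFolderRequest(browseEnvironmentId: string): { targetEnvironmentId: string } {
  return { targetEnvironmentId: browseEnvironmentId };
}
```
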
The saved-env registry subscriber kicks off a sync as soon as upsert
lands, and that path reads the bearer back via readSavedEnvironmentBearerToken.
Writing the bearer first means whichever path connects first finds
the credential.
After step 7a landed, the renderer registers each non-primary
bootstrap as a desktop-local SavedEnvironmentRecord via the
reconcileLocalSecondaryEnvironments path. The desktopLocal marker
keeps these entries out of saved-env persistence so they don't end
up in the user's settings file, and the saved-env runtime takes care
of the connection lifecycle, sidebar listing, env-switcher, and
project-id routing for free.

Browser validation done with a real dev:desktop run with
wslBackendEnabled=true and wslDistro="Ubuntu":
  - Distinct ports (13773 primary, 13774 wsl) listening side by
    side, both serving distinct env descriptors (windows vs linux
    platform).
  - Per-instance log files in dev/logs/ (server-child.log +
    server-child-wsl_Ubuntu.log).
  - Renderer completes the bearer-token bootstrap against the WSL
    backend (200), obtains a ws-token (200), holds an ESTABLISHED
    WebSocket connection to each port (netstat).

The header docblock now lists this state explicitly, plus the
per-commit migration history.
The WSL backend was bound to 127.0.0.1 inside the distro and the
renderer reached it through wslhost (Windows' built-in localhost
forwarder). That forwarding is flaky on at least my Win11 install:
the readiness probe and saved-env descriptor fetch both saw "Failed
to fetch" against a backend that was otherwise healthy.

Bind to 0.0.0.0 inside WSL and advertise the distro's eth0 IP as the
renderer-visible httpBaseUrl. DesktopWslEnvironment.getDistroIp uses
`hostname -I` inside the distro (cached per distro) and falls back
to 127.0.0.1 + wslhost when the probe fails, so a busted setup
degrades to the prior behavior instead of regressing.

The backend is exposed only on the WSL vEthernet network, not the
LAN. The primary still owns LAN exposure when the user opts in, so
this doesn't widen the attack surface for non-WSL users.
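
A small sketch of the probe-then-fall-back logic, assuming the raw `hostname -I` output has already been captured as a string. The function names are illustrative, and the per-distro caching mentioned above is left out.

```typescript
// `hostname -I` prints space-separated addresses; take the first IPv4 one.
function firstIpv4(hostnameOutput: string): string | undefined {
  return hostnameOutput
    .trim()
    .split(/\s+/)
    .find((token) => /^\d{1,3}(\.\d{1,3}){3}$/.test(token));
}

// Falls back to loopback (and therefore the wslhost forwarder) when the
// probe failed or returned nothing usable.
function resolveAdvertisedHost(hostnameOutput: string | undefined): string {
  const ip = hostnameOutput === undefined ? undefined : firstIpv4(hostnameOutput);
  return ip ?? "127.0.0.1";
}
```
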
The desktop-bootstrap credential used the same single-use + 5-minute
TTL semantics as a user-facing pairing link. That fit the original
mental model (one renderer, one bootstrap exchange) but breaks
parallel backends where a slow WSL cold boot lands outside the
renderer's first reconcile pass, and breaks page reloads where the
renderer no longer has the bearer it traded the bootstrap for and
needs to re-exchange.

Switch the seed for `desktopBootstrapToken` to `remainingUses:
"unbounded"` with a 24h TTL. The seed is delivered over trusted IPC
(fd3 / stdin) at backend launch and lives in the renderer/desktop
processes the user already trusts, so single-use adds no real
protection and rules out the recovery paths described above.

Tests:
- BootstrapCredentialService.test.ts now asserts repeat consumption
  succeeds and that expiry kicks in past 24h, not 5 minutes.
- server.test.ts: the "rejects reusing" test is rewritten as
  "allows reusing" against the same credential.
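
To make the semantics concrete, here is a sketch of how a credential with `remainingUses: "unbounded"` might be consumed. The `BootstrapCredential` type and `consume` function are assumptions for illustration and are not the BootstrapCredentialService API.

```typescript
type RemainingUses = number | "unbounded";

interface BootstrapCredential {
  token: string;
  remainingUses: RemainingUses;
  expiresAtMs: number;
}

type ConsumeResult = "ok" | "expired" | "exhausted";

const DAY_MS = 24 * 60 * 60 * 1000;

// Expiry is always enforced; the use counter only applies to bounded
// credentials such as a single-use pairing link.
function consume(credential: BootstrapCredential, nowMs: number): ConsumeResult {
  if (nowMs >= credential.expiresAtMs) return "expired";
  const uses = credential.remainingUses;
  if (uses === "unbounded") return "ok";
  if (uses <= 0) return "exhausted";
  credential.remainingUses = uses - 1;
  return "ok";
}
```

With this shape, a page reload can re-exchange the same desktop seed until the TTL passes, while a pairing link with `remainingUses: 1` is still rejected on its second use.
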
Jgratton24 added 12 commits May 18, 2026 09:40
The thread-row icon was fixed in ad7e3d7, but the project-header
indicator still rendered a cloud for any group whose members were
all non-primary. That was visually wrong for WSL backends: the
project lives on the user's machine, just in a sandbox.

Pull the desktopLocal marker through into the sidebar's grouping
output as allRemoteMembersAreDesktopLocal, and let the project
header render ContainerIcon (plus a "Local sandbox" tooltip) when
every non-primary member is desktopLocal. Mixed groups and true
remote-only groups still get the cloud icon.

buildSidebarProjectSnapshots gains an optional
isDesktopLocalEnvironment resolver so callers without saved-env
context still get the legacy behavior.
Two issues from a fresh cold-start launch:

1. The first register attempt hit ERR_CONNECTION_REFUSED (WSL not
   listening yet), surfaced as Failed to fetch, and then no retries
   fired. Most likely a registration step hung past the descriptor
   fetch (IPC bearer read, WS open, etc) so the per-instance pending
   promise never resolved, which kept reconcileOnce awaiting forever
   and left pendingReconcileRun wedged for every subsequent retry
   tick. Wrap each attempt in a 25s hard ceiling via Promise.race so
   a stuck step is force-rejected, the catch wrapper logs it, and the
   scheduleAutoRetry chain keeps progressing.

2. Production renderers don't have CDP, so when the silent failure
   above happened there was no way to inspect what state the reconciler
   was actually in. Expose window.__t3LocalSecondaryDebug with two
   methods:
     - getState() returns the most recent bootstraps list, per-instance
       registration errors, last reconcile timestamp, and attempt count.
     - retryNow() re-fires the reconcile loop on demand.
   Wrapped behind exposeDebugGlobal() so non-browser environments
   (vitest under node) don't poison globalThis.

Also wrapped the IPC call to getLocalEnvironmentBootstraps in a
try/catch with logging, in case the bridge throws during a startup
race.
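
The hard ceiling from item 1 can be sketched as below. This version uses an explicit Promise constructor rather than `Promise.race` so the timer can be cleared once the work settles; the effect for the caller is the same. `withTimeout` is a hypothetical helper name.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds. The original
// promise is not cancelled; it is simply no longer awaited.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error: unknown) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A registration attempt would then be wrapped as `withTimeout(attempt, 25_000, "register wsl:Ubuntu")`, with the surrounding catch logging the rejection so the retry chain can continue.
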
A cold-start launch where the WSL backend is still booting (or where
the renderer's first registration attempt fails before retries cover
the gap) used to leave the sidebar silent: no projects, no
indication anything was happening. The user had no way to tell
whether the backend was starting up, broken, or simply not
configured.

Promote the reconciler's in-memory trace to a real zustand store
(useLocalSecondaryReconcileStore) tracking pendingInstanceIds,
per-instance registrationErrors, and budgetExhausted. The reconciler
writes:
  - addPending/removePending around each register attempt
  - setRegistrationError on Promise.race failure or timeout
  - clearRegistrationError when an attempt actually returns a record
  - budgetExhausted = true once auto-retry runs out of attempts
  - budgetExhausted = false on user-driven reconcile (so Retry resets it)

The window.__t3LocalSecondaryDebug global keeps working - it now
just reads from the store.

Add a LocalSecondaryStatus component to the sidebar. It renders:
  - A "Connecting <label>" alert with a spinner while pendingInstanceIds
    is non-empty.
  - A "Couldn't connect <label>" warning alert with the error message
    and a Retry button when the retry budget runs out with errors
    still recorded.

Retry calls reconcileLocalSecondaryEnvironments() (resetBudget: true)
so the loop gets a fresh shot.
pendingInstanceIds tracks "an attempt is in flight RIGHT NOW", which
empties out during the backoff delay between retries. That made the
sidebar status alert flash on and off once per setTimeout fire while
the WSL backend was cold-booting (the user reported "rendered in and
out a few times before the threads loaded").

Derive the connecting set from "secondary in bootstraps + not yet in
the saved-env registry + auto-retry budget still active" instead.
That stays true through all 8 attempts of the backoff schedule, so
the user sees one steady "Connecting WSL (Ubuntu)" line until the
backend lands.
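
The derived predicate could look roughly like this. The input shape and function name are assumptions; the three conditions are the ones listed above.

```typescript
interface ConnectingInput {
  primaryInstanceId: string;
  bootstrapInstanceIds: readonly string[]; // everything the desktop reports
  registeredInstanceIds: ReadonlySet<string>; // already in the saved-env registry
  budgetExhausted: boolean; // auto-retry has run out of attempts
}

// An instance counts as "connecting" for the whole retry window, not just
// while a single attempt is in flight.
function deriveConnectingIds(input: ConnectingInput): string[] {
  if (input.budgetExhausted) return [];
  return input.bootstrapInstanceIds.filter(
    (id) => id !== input.primaryInstanceId && !input.registeredInstanceIds.has(id),
  );
}
```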

The Retry button still works because the public reconcile entry
resets budgetExhausted, which makes the connecting predicate true
again until either the new attempt succeeds or the budget runs out
a second time.
The project header now carries the container icon for WSL-only
project groups (943e1ed), so repeating the icon on every thread
inside the group is just noise: a WSL project's threads are
implicitly all WSL.

Skip the per-thread indicator when the thread's env is desktopLocal.
The CloudIcon path for true remote envs stays in place because the
project header doesn't differentiate "remote project A" from "remote
project B" - the cloud icon on the thread row carries the env label
tooltip for those.
When the user toggles WSL off or switches to a different distro, we
now stop and ask for confirmation if the WSL backend has ever
landed in the saved-environment registry on this machine. Skipping
the dialog when there's no registration to lose keeps the common
case (first-time enable, immediate undo, fresh install) frictionless;
prompting when WSL has actual state protects users from accidentally
disconnecting threads/projects they care about.

Enabling WSL never prompts: it's non-destructive (just starts a
second backend) and getting a confirmation dialog every time would
be noise.

Cancelling the dialog drops pendingWslChange without touching the
underlying setting, so the Switch / Select stay on their current
value.
Three things in one branch:

* fix(web): drop the per-thread container icon when every thread in
  the project is already on WSL (a7a2e1b, already pushed).

* feat(web): confirm before disable / distro-switch when WSL has
  state (449e18f, already pushed).

* This commit: add a third backend mode where the WSL backend runs
  as the primary and the Windows backend isn't started.

The mode lands as a new `wslOnly` setting on DesktopAppSettings,
gated behind wslBackendEnabled. When both are true:

  - DesktopBackendConfiguration.resolvePrimary now dispatches at
    resolve time. The Windows path builds the same start config as
    before; the WSL path reuses resolveWslStartConfig with the
    primary port and the user's chosen distro, so the WSL backend
    binds where the Windows backend would have, the renderer loads
    from the same exposure endpoint, and cookie-auth keeps working.

  - DesktopBackendPool reads settings once at layer init to pick the
    primary's label ("Windows" vs "WSL (Ubuntu)"). The label is
    captured per process; flipping the mode requires a restart.

  - DesktopWslBackend.reconcile skips the parallel "wsl:<distro>"
    registration when wsl-only is on, otherwise we'd run two WSL
    processes on the same distro.

  - New IPC channel SET_WSL_ONLY_CHANNEL plus a setWslOnly handler.
    getWslState now includes the wslOnly flag, and the DesktopBridge
    contract gains setWslOnly(enabled).

  - Settings UI gets a third row ("Run WSL only") under the distro
    picker, visible when WSL is enabled. The confirmation dialog
    grows a third branch with restart-aware copy ("Save and restart
    later") since the primary spec is captured at layer init.

Migration / defaults: wslOnly defaults to false. Existing setups
stay on parallel mode unless the user explicitly turns this on.

Tests: test bridges, settings mock, and IPC layerTest extended for
the new field. Existing settings + backend + WSL test suites pass.
Persisting the wsl-only setting and waiting for the next launch was
clunky: the user had to confirm a dialog and then close + reopen the
app for the toggle to take effect. The pool captures the primary
choice once at layer init, so the relaunch is unavoidable, but we can
do it for them. Same pattern the network-access toggle already uses.
…Select

Two separate controls for "is the WSL backend on?" and "which distro
does it run on?" were confusing. They're the same decision. Merge them
into a single dropdown: "Off" stops the backend, any distro entry
starts it on that distro. The confirmation paths (disable, switch
distro) keep their existing copy and gating against
hasWslRegistrationToLose.
In wsl-only mode, serverExposure.backendConfig.httpBaseUrl
(127.0.0.1:3773) doesn't match the URL the WSL backend actually
listens on (the distro IP, e.g. 172.27.152.141:3773). wslhost
localhost forwarding is flaky on some Windows hosts, so loadURL at
127.0.0.1 leaves the renderer black-screened with did-fail-load.

Plumb the primary instance's resolved httpBaseUrl through onReady to
the window service, store it, and let createMain read it. Falls back
to serverExposure when nothing's been reported yet so existing
behavior (dev mode, activate path before ready) is unchanged.
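
The store-then-fall-back behaviour amounts to something like the following sketch. The factory and method names are hypothetical; only `onReady`, `httpBaseUrl`, and the serverExposure fallback come from the description above.

```typescript
// Holds the primary's reported URL, if any, and resolves what the main
// window should load.
function createMainWindowUrlSource(serverExposureUrl: string) {
  let reportedPrimaryUrl: string | undefined;
  return {
    // Called from the primary instance's onReady callback.
    onPrimaryReady(httpBaseUrl: string): void {
      reportedPrimaryUrl = httpBaseUrl;
    },
    // Before anything is reported, this is the pre-existing behaviour.
    resolveLoadUrl(): string {
      return reportedPrimaryUrl ?? serverExposureUrl;
    },
  };
}
```
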
Two related state bugs in the WSL picker:

1. Turning the WSL backend off while wsl-only was on persisted
   wslBackendEnabled=false but left wslOnly=true. The "Run WSL only"
   row vanished, but the running app stayed on the WSL primary
   (wsl-only is only honoured when WSL is enabled, so on next launch
   the app would silently fall back to Windows). Choosing Off in
   wsl-only mode now asks for confirmation, clears both flags, and
   relaunches onto Windows.

2. Enabling the WSL backend dropped the user into "both backends" and
   left them to discover the wsl-only switch separately. Enabling now
   prompts with a modal that offers "Run both backends" or "Use only
   WSL" so the mode is chosen upfront.

Both flows share the existing dialog. The enable kind uses two action
buttons instead of the single Confirm path.
The renderer's local-secondary reconciler was polling
getLocalEnvironmentBootstraps on a fixed backoff (2s, 4s, 8s, 16s,
30s, 45s, 60s, 60s) regardless of whether the user had any WSL
backend configured. On Windows hosts without WSL enabled that's eight
IPC round-trips plus a setTimeout chain doing zero useful work.

Probe desktopBridge.getWslState() once at module init. When the user
hasn't enabled the WSL backend, latch the auto-retry chain so it
parks. The settings page calls markSecondariesConfigured(state.enabled)
after every WSL toggle so flipping WSL on later wakes the loop back
up.

For users who *do* have WSL enabled, behavior is unchanged: the loop
still rides out the full 4-minute desktop-bootstrap TTL window
covering cold boot.
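
A sketch of the schedule and the parking latch, under the assumption that the loop asks for its next delay by attempt index. The constant and function names are illustrative; the delays are the ones listed above.

```typescript
// Backoff schedule from the description above, in milliseconds.
const RETRY_DELAYS_MS: readonly number[] = [2, 4, 8, 16, 30, 45, 60, 60].map(
  (seconds) => seconds * 1000,
);

// Total time covered by the eight scheduled retries.
const TOTAL_RETRY_WINDOW_MS = RETRY_DELAYS_MS.reduce((sum, ms) => sum + ms, 0);

// Returns the delay before the given attempt, or undefined when the loop
// should stop: either the budget is spent or no secondary is configured.
function nextRetryDelayMs(
  attemptIndex: number,
  secondariesConfigured: boolean,
): number | undefined {
  if (!secondariesConfigured) return undefined; // parked until WSL is enabled
  if (attemptIndex < 0 || attemptIndex >= RETRY_DELAYS_MS.length) return undefined;
  return RETRY_DELAYS_MS[attemptIndex];
}
```

The listed delays add up to 225 seconds, a little under four minutes.
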
@coderabbitai

coderabbitai Bot commented May 18, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@github-actions github-actions Bot added the vouch:unvouched (PR author is not yet trusted in the VOUCHED list) and size:XXL (1,000+ changed lines, additions + deletions) labels May 18, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aebb4fa3e5


Comment thread apps/web/src/components/settings/ConnectionsSettings.tsx Outdated
Comment thread apps/desktop/src/app/DesktopObservability.ts
Comment thread apps/web/src/components/Sidebar.tsx
Comment thread apps/desktop/src/backend/DesktopBackendPool.ts
@macroscopeapp
Contributor

macroscopeapp Bot commented May 18, 2026

Approvability

Verdict: Needs human review

Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.


@Jgratton24 Jgratton24 mentioned this pull request May 18, 2026
4 tasks
Four review comments addressed in one pass:

- Park the local-secondary auto-retry loop in wsl-only mode too. WSL
  is the primary there and the desktop never registers a wsl:<distro>
  secondary, so the loop would otherwise poll forever for something
  that can't appear. Initial probe and the post-toggle latch both
  treat enabled+wslOnly as "no secondaries".

- Key the backend output log cache by the resolved file path, not the
  raw instance id. Two ids that sanitize to the same filename (e.g.
  `wsl:default` and `wsl_default`) would otherwise produce two
  RotatingLogFileWriters racing on the same file.

- Fork registered instance scopes off the pool's layer scope instead
  of leaving them as orphan `Scope.make` handles. On app shutdown the
  pool's scope closes, which now also closes the WSL instance scope
  and runs the BackendInstance stop finalizer (graceful SIGTERM +
  grace period) rather than letting the OS hard-kill the child.

- Refresh the stale Sidebar comment that claimed we render a
  container icon on desktop-local threads. The project-level header
  already carries that icon (sidebarProjectGrouping); the thread row
  only suppresses the cloud icon now.
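
To illustrate the log-cache point above: if two instance ids sanitize to the same filename, keying the cache by the resolved path makes them share one writer. The sanitization rule, the `LogWriter` stand-in, and the function names below are assumptions; the real RotatingLogFileWriter is not reproduced here.

```typescript
// Stand-in for a file writer. Only identity matters for this sketch.
interface LogWriter {
  readonly filePath: string;
  readonly lines: string[];
}

// Assumed rule: anything outside a conservative filename alphabet becomes "_".
function sanitizeInstanceId(instanceId: string): string {
  return instanceId.replace(/[^A-Za-z0-9_.-]/g, "_");
}

function resolveLogPath(logDir: string, instanceId: string): string {
  return `${logDir}/server-child-${sanitizeInstanceId(instanceId)}.log`;
}

// Keyed by resolved path, so ids that collide on disk share one writer.
const writersByPath = new Map<string, LogWriter>();

function getLogWriter(logDir: string, instanceId: string): LogWriter {
  const filePath = resolveLogPath(logDir, instanceId);
  let writer = writersByPath.get(filePath);
  if (!writer) {
    writer = { filePath, lines: [] };
    writersByPath.set(filePath, writer);
  }
  return writer;
}
```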
Comment thread apps/desktop/src/updates/DesktopUpdates.ts Outdated
…t primary

quitAndInstall fires app.quit() + a hard relaunch, which may not wait
for Effect's scope finalizer cascade to drain. With parallel backends
that meant the WSL instance got hard-killed by the OS instead of
receiving the SIGTERM + grace period the BackendInstance stop
finalizer provides. Iterate pool.list and stop every instance
concurrently with the same 5s budget the primary had on its own.
Comment thread apps/desktop/src/app/DesktopApp.ts Outdated
…rimary

Same pattern as the update-install path (ba31706) but on the normal
quit path. The scoped program finalizer was only stopping the primary
backend before marking shutdown complete, so any registered WSL
instance was left for the layer-scope cascade to clean up. The
electronApp.quit() in listenForQuit can race ahead of that cascade,
hard-killing the WSL child instead of letting it receive SIGTERM +
grace. Iterate pool.list and stop every instance concurrently.
WSL2's `networkingMode=mirrored` makes the distro share the Windows
network stack, so `hostname -I` returns the host's own IP (e.g.
192.168.0.64). Our renderer URL resolution was passing that IP through
verbatim, and Windows can't route a packet to its own NIC address and
have it loop back to a WSL listener — the request just times out.
Loopback DOES forward correctly in mirrored mode, so detect the
collision (distro IP matches one of our own interfaces) and fall back
to 127.0.0.1.

NAT mode is unchanged: the distro IP there is a private vEthernet
address that won't match any Windows interface, so we keep advertising
it as before (which is the path that avoids the flaky wslhost proxy).
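
The collision check reduces to a small pure function once the host's own addresses are known. In a Node or Electron main process those could plausibly be collected from `os.networkInterfaces()`, but here they are passed in as a plain list; the function name is hypothetical.

```typescript
// Picks the host the renderer should use to reach the WSL backend.
//  - No distro IP (probe failed): fall back to loopback.
//  - Distro IP equals one of the host's own addresses: mirrored networking,
//    where loopback forwards correctly but the NIC address does not.
//  - Otherwise (NAT mode): advertise the distro's vEthernet address.
function resolveRendererHost(
  distroIp: string | undefined,
  hostInterfaceIps: readonly string[],
): string {
  if (distroIp === undefined) return "127.0.0.1";
  if (hostInterfaceIps.includes(distroIp)) return "127.0.0.1";
  return distroIp;
}
```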
Comment thread apps/desktop/src/ipc/methods/window.ts
Comment thread apps/desktop/src/wsl/DesktopWslBackend.ts
Two review-feedback fixes:

- getLocalEnvironmentBootstraps was exposing the bootstrap info (URL,
  token) for backends whose configResolve produced a preflightFailure.
  Those backends never actually listen — the manager calls
  scheduleRestart instead of spawning — so the renderer would pick up
  a phantom URL, POST to /api/auth/bootstrap/bearer, fail, and register
  a broken saved-env. Skip them in the IPC handler.

- wslBackend.reconcile read pool state and settings non-atomically.
  The bootstrap fork + an in-flight setWslDistro IPC could both
  observe "no WSL instance registered", both proceed to startNew with
  different distros, and leave a stranded instance behind. Wrap the
  reconcile body in a single-permit semaphore so concurrent callers
  queue.
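
The PR wraps the reconcile body in a single-permit semaphore. The same queueing behaviour can be sketched in plain TypeScript with a promise chain; this is an illustration of the idea, not the semaphore primitive the desktop code uses, and `createSerializer` is a hypothetical name.

```typescript
// Returns a function that runs tasks one at a time, in call order.
function createSerializer() {
  let tail: Promise<void> = Promise.resolve();
  return function runExclusive<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after every previously queued task has settled.
    const run = tail.then(task);
    // Keep the chain alive even if this task rejects.
    tail = run.then(
      () => undefined,
      () => undefined,
    );
    return run;
  };
}
```

With reconcile wrapped this way, the bootstrap fork and a concurrent setWslDistro call would observe pool state one after the other instead of both seeing "no WSL instance registered".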
Contributor

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit b62348b.

Comment thread apps/desktop/src/backend/DesktopBackendManager.ts
@UtkarshUsername
Contributor

UtkarshUsername commented May 19, 2026

Hey @Jgratton24, I was working on #2402. Coincidentally, we both started working on WSL support at the same time.

I saw your PR later, but I was still going to continue mine because it supported parallel WSL and Windows work.
I was going to resolve its conflicts today, and continue it. But you have solved that point in this PR.

So now only one reason remains for me to prefer mine: it supports WSL projects in server mode too, not only in the desktop app. Do you think that is something you can make possible in yours? If not, I can continue mine.
