Rewrite terminal session lifecycle, fix PTY fd handling, add startup config#21
Conversation
a15406a to
b39e345
Compare
There was a problem hiding this comment.
Pull request overview
This PR reworks the terminal session lifecycle to improve behavior under concurrent terminal startups, while also tightening PTY fd handling and adding a parent-process readiness signal.
Changes:
- Introduces atomic “scrollback replay + enable live forwarding” to prevent duplicated early output during WebSocket handshake.
- Switches PTY fallback pre-exec fd leak prevention from force-closing fds to
close_range(..., CLOSE_RANGE_CLOEXEC). - Adds server readiness signaling via an
AXS_READY_PIPEFIFO.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/terminal/handlers.rs |
Updates terminal session/WebSocket lifecycle and scrollback replay ordering. |
src/terminal/scrollback.rs |
Adds read_tail_and_then to keep replay + activation under the same scrollback lock. |
src/terminal/pty_fallback.rs |
Replaces close-loop fd handling with close_range CLOEXEC approach in pre_exec. |
src/terminal/mod.rs |
Writes "READY\n" to a FIFO env-configured pipe once the listener is bound. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
9b3c488 to
dea7378
Compare
…config handlers.rs: - Rewrite terminal session model: persistent TerminalSession struct with background PTY reader thread (always running for session lifetime) and background child waiter thread with exit_notify. - WebSocket output coalescing: batch PTY output at 8ms intervals to reduce WS frame overhead. - Process exit detection: send JSON exit message to connected WebSocket clients when child process terminates. - Separate openpty failure (fallback to TIOCGPTPEER) from spawn failure (report immediately, no fallback). scrollback.rs: - Replace read_tail() with read_tail_and_then(): execute a callback under the scrollback file lock after reading, so WebSocket replay can atomically activate live forwarding before the PTY reader appends more bytes — prevents duplicate output during WS handshake. pty_fallback.rs: - Replace close_fds_above_stderr() with cloexec_fds_above_stderr(): use close_range(3, UINT_MAX, CLOSE_RANGE_CLOEXEC) instead of force-closing fds. This preserves Rust's internal exec-error reporting pipe until execve, avoiding a stdlib abort when the pipe is closed too early. - Remove fallback close loop (3..RLIMIT_NOFILE) since CLOEXEC makes it unnecessary. mod.rs: - Server readiness signaling: write "READY\n" to AXS_READY_PIPE FIFO when the server is bound and listening, so the parent shell can synchronize without HTTP polling.
dea7378 to
f977e1e
Compare
Cargo.lock records the version bump of the root crate axs from 0.2.13 to 0.2.14, auto-updated when running cargo build after pulling the latest changes on the session-lifecycle-pty-cloexec-startup-config branch.
| let output_tx_for_replay = output_tx_arc.clone(); | ||
| let ws_output_tx_for_replay = ws_output_tx.clone(); | ||
| match spawn_blocking(move || { | ||
| scrollback_for_replay.read_tail_and_then(MAX_SCROLLBACK_BYTES, || { |
There was a problem hiding this comment.
Live output ownership now races between concurrent websocket connects
Moving output_tx assignment into read_tail_and_then makes the winning live-forwarding socket depend on which replay finishes last rather than which connection is newest. If a reconnect happens while an older socket is still replaying scrollback, the older socket can overwrite output_tx after the newer one has already taken over, so the new client stops receiving live PTY output. When that older socket later disconnects it will also clear output_tx, cutting live output off entirely. The previous code installed the sender before replay, so the latest connection deterministically won.
| /// avoiding the stdlib abort seen when the pipe is closed too early, while | ||
| /// still preventing descriptor leaks into the spawned program. | ||
| /// | ||
| /// No fallback when `close_range` fails: `CLOSE_RANGE_CLOEXEC` requires |
There was a problem hiding this comment.
Removing the fd-closing fallback silently leaks descriptors when close_range CLOEXEC is unavailable
The new startup protocol assumes the parent is blocked on read < pipe, but this code ignores both open() and write_all() failures. If AXS_READY_PIPE is set and the notification cannot be delivered, the server keeps running with no log while the parent waits forever for a readiness signal that will never arrive. Since the env var is an explicit opt-in to this protocol, failure should at least be logged and probably fail fast.
| // - A caller that sets the variable yet deliberately violates the | ||
| // protocol deserves no fallback — blocking is a visible symptom | ||
| // that exposes the misconfiguration rather than hiding it. | ||
| if let Ok(pipe_path) = env::var("AXS_READY_PIPE") { |
There was a problem hiding this comment.
Ready-pipe notification failures are swallowed, which can hang callers indefinitely
This PR optimizes the high-concurrency scenario model and solves some occasional race conditions when multiple terminals start in parallel.
handlers.rs:
scrollback.rs:
pty_fallback.rs:
mod.rs: