Daemon + thin client architecture #22
Merged
Update PRDs (001, 004, 009), SPEC-011, and IMPLEMENTATION_PLAN to reflect the OpenClaw-pattern architecture: persistent daemon (Netclaw.Daemon) with thin CLI/TUI clients (Netclaw.Cli) connecting via SignalR.

Key changes:
- Two binaries: daemon owns Akka/persistence/tools, CLI is lightweight
- TUI chat becomes pure thin client rendering SessionOutput over SignalR
- CLI commands categorized as offline vs daemon-required
- Daemon management commands: start/stop/status/install/uninstall
- systemd user service registration (no sudo)
- New implementation tasks 1.26-1.31 for the split
- Tasks 1.10-1.12 marked as done (PRs #14, #16, #17)
Structural split following the daemon + thin client pattern:
- Netclaw.Daemon (netclawd): Web SDK, actor system, SignalR hub, config watcher, health endpoint
- Netclaw.Cli (netclaw): CLI routing (chat, -p, init, doctor, daemon stub, config stub), TUI, headless channel

Shared types moved to library projects:
- Config POCOs (ProviderEntry, ModelSelection, ModelReference) → Netclaw.Configuration
- SessionOutputDto → Netclaw.Actors.Protocol

CLI temporarily keeps full Akka stack in-process with transitional copies of ChatClientFactory/NetclawChatClientProvider. Task 1.28 refactors to SignalR client and removes Akka dependencies.
Replace the stub SessionHub with a functional implementation that creates sessions, accepts messages, and streams output back to SignalR callers. SessionRegistry (singleton) owns session state and uses IHubContext to push output from Akka.Streams callbacks without requiring a hub instance.
DaemonManager handles the full lifecycle: binary discovery, detached process spawning with PID file tracking, graceful SIGTERM shutdown on Linux/macOS, and systemd user service registration. Windows service support tracked in #21.
Review fixes:
- CLI port conflict: use port 0 (random) for transitional in-process mode
- SessionHub: call base.OnDisconnectedAsync for framework cleanup
- SessionRegistry: log output delivery errors instead of swallowing, guard against duplicate CreateSession per connection, check OfferAsync result for queue failures
- DaemonManager: don't redirect stdio (prevents SIGPIPE / pipe deadlock), validate process name before SIGTERM to prevent PID recycling kills, check kill() return value, proper DateTimeOffset for uptime calc
- Eliminate duplicate NetclawPaths instances in both Program.cs files

README updated to reflect daemon + thin client architecture with CLI reference documentation.
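To illustrate the "check the offer result, log instead of swallowing" fix for SessionRegistry, here is a minimal TypeScript sketch. It is a simplified model and not the Akka.Streams API: `BoundedQueue`, `OfferResult`, and `sendToSession` are hypothetical names, and the real code would inspect the result returned by `OfferAsync`.

```typescript
// Hypothetical model of a bounded session queue; not the Akka.Streams API.
type OfferResult = "enqueued" | "dropped" | "closed";

class BoundedQueue<T> {
  private readonly items: T[] = [];
  private readonly capacity: number;
  private closed = false;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Report the outcome instead of silently discarding the item.
  offer(item: T): OfferResult {
    if (this.closed) return "closed";
    if (this.items.length >= this.capacity) return "dropped";
    this.items.push(item);
    return "enqueued";
  }

  close(): void {
    this.closed = true;
  }
}

// The caller checks the result and logs every non-delivery.
function sendToSession(
  queue: BoundedQueue<string>,
  message: string,
  log: (line: string) => void
): boolean {
  const result = queue.offer(message);
  if (result === "enqueued") return true;
  log("message not delivered: " + result);
  return false;
}
```

The point of the pattern is that a full or closed queue becomes a visible, logged failure that the hub can report back to the caller, rather than a message that quietly disappears.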
Aaronontheweb commented Feb 24, 2026
Aaronontheweb left a comment
Corrected review notes (prior comment had shell-escaping artifacts):
- `netclaw daemon stop` can target unrelated `dotnet` processes because PID validation currently accepts any process named `dotnet` without verifying command-line identity.
- `SessionHub.SendMessage` does not enforce connection ownership of `sessionId`, so a client that learns another session ID can inject messages.
- Session disconnect currently disposes sessions immediately, which conflicts with the reconnect model in SPEC-011 (daemon keeps session alive for reconnection).
- PID ownership is launcher-side only; with systemd `Restart=always`, PID can drift after daemon restarts and `status`/`stop` become unreliable.
- Daemon subcommands return error text but still exit code 0, which makes automation/scripts unreliable.
Proposed fixes:
- Harden daemon PID validation (including command-line identity checks for dotnet-hosted daemon) and validate process identity in `status`.
- Enforce per-connection session ownership in hub/registry and add explicit session re-attach flow for reconnects.
- Move PID file authority into the daemon runtime via hosted service so restarts keep PID accurate.
- Return non-zero exit codes for daemon command failures.
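The session-ownership and re-attach fixes proposed above can be sketched roughly as follows. This is an illustrative TypeScript model under stated assumptions, not the PR's C# code: the class shape, method names, and result strings are all hypothetical. It also assumes that any connection may re-attach a detached session; a real flow would likely need to authenticate the re-attaching client.

```typescript
// Hypothetical sketch of connection-scoped session ownership.
type SessionId = string;
type ConnectionId = string;
type SendResult = "ok" | "unknown-session" | "not-owner";

class SessionRegistry {
  // sessionId -> owning connection, or null while the session is detached.
  private readonly owners = new Map<SessionId, ConnectionId | null>();
  private readonly inbox = new Map<SessionId, string[]>();

  createSession(sessionId: SessionId, connectionId: ConnectionId): boolean {
    if (this.owners.has(sessionId)) return false; // duplicate guard
    this.owners.set(sessionId, connectionId);
    this.inbox.set(sessionId, []);
    return true;
  }

  // Only the owning connection may send into a session.
  sendMessage(sessionId: SessionId, connectionId: ConnectionId, text: string): SendResult {
    if (!this.owners.has(sessionId)) return "unknown-session";
    if (this.owners.get(sessionId) !== connectionId) return "not-owner";
    const queue = this.inbox.get(sessionId);
    if (queue !== undefined) queue.push(text);
    return "ok";
  }

  // On disconnect, detach instead of disposing so the session can be resumed.
  onDisconnected(connectionId: ConnectionId): void {
    this.owners.forEach((owner, sessionId) => {
      if (owner === connectionId) this.owners.set(sessionId, null);
    });
  }

  // Explicit re-attach: succeeds only for an existing, detached session.
  reattach(sessionId: SessionId, connectionId: ConnectionId): boolean {
    if (!this.owners.has(sessionId)) return false;
    if (this.owners.get(sessionId) !== null) return false;
    this.owners.set(sessionId, connectionId);
    return true;
  }

  messageCount(sessionId: SessionId): number {
    const queue = this.inbox.get(sessionId);
    return queue === undefined ? 0 : queue.length;
  }
}
```

With this shape, a client that merely learns another session ID gets `"not-owner"` from `sendMessage`, and a dropped connection leaves the session alive in a detached state, in line with the reconnect model described in SPEC-011.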
Enforce connection-scoped SignalR sessions with typed IDs, preserve sessions for reconnect, and tighten daemon PID/process validation for safer lifecycle commands. Add targeted tests and daemon PID file ownership to keep status/stop behavior reliable.
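A rough sketch of the tightened PID/process validation and the non-zero exit codes, written in TypeScript for illustration. Everything here is an assumption: the `ProcessInfo` shape, the assembly file name `Netclaw.Daemon.dll`, and the specific exit code values are hypothetical. The sketch only shows the idea of checking command-line identity before signalling a `dotnet`-hosted process.

```typescript
// Hypothetical process snapshot; a real implementation would read this from the OS.
interface ProcessInfo {
  name: string;          // e.g. "dotnet" or "netclawd"
  commandLine: string[]; // argv, e.g. ["dotnet", "/opt/netclaw/Netclaw.Daemon.dll"]
}

const DAEMON_BINARY = "netclawd";             // binary name from the PR description
const DAEMON_ASSEMBLY = "Netclaw.Daemon.dll"; // assumed assembly file name

function baseName(path: string): string {
  const parts = path.split(/[\\/]/);
  return parts.pop() ?? "";
}

// A bare "dotnet" process name is not enough: require the daemon assembly in argv.
function isNetclawDaemon(proc: ProcessInfo): boolean {
  if (proc.name === DAEMON_BINARY) return true;
  if (proc.name !== "dotnet") return false;
  return proc.commandLine.slice(1).some((arg) => baseName(arg) === DAEMON_ASSEMBLY);
}

// Returns a process exit code: 0 on success, non-zero on every failure path.
function stopDaemon(proc: ProcessInfo | undefined, sendSigterm: () => boolean): number {
  if (proc === undefined) return 1;      // no live process for the recorded PID
  if (!isNetclawDaemon(proc)) return 2;  // PID was recycled by an unrelated process
  return sendSigterm() ? 0 : 3;          // surface a failed kill() to the caller
}
```

Passing the signal sender in as a callback keeps the identity check testable without touching real processes, and scripts can branch on the exit code instead of parsing error text.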
Aaronontheweb added a commit that referenced this pull request May 22, 2026
* fix(security): address open CodeQL alerts

- workflows: add `permissions: contents: read` to website-rebuild.yml so the rebuild job no longer inherits the repo's default GITHUB_TOKEN scope (cs-actions/missing-workflow-permissions, alert #21).
- webhooks: sanitize the raw route value via SanitizeWebhookId before logging the route_not_found case. ASP.NET URL-decodes path segments, so a caller could otherwise inject CR/LF into log lines (cs/log-forging, alert #22).
- lifecycle: add a SanitizeReason helper to DaemonLifecycleNotifier that strips control chars and caps length at 200, and apply it in NotifyShutdown (HTTP-tainted via ShutdownDaemonRequest.Reason) and NotifyCrashing (cs/log-forging, alert #18).
- reminders: harden ReminderDefinitionStore.GetPath with an explicit base-directory containment check on top of the existing Uri.EscapeDataString encoding, and add a regression test covering traversal, backslash, and absolute-path ids (cs/path-injection, alert #17).

* fix(security): tighten log/telemetry sanitization from review

Follow-ups from the post-CodeQL code review:
- webhooks: sanitize `route` once and pass the safe value to both WebhookTelemetry.RecordRouteNotFound and the log line, not just the log. The metric tag was the same log-forging surface and additionally an unbounded high-cardinality vector against the `route` dimension.
- lifecycle: widen the SanitizeReason predicate from `char.IsControl` to also strip U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR); these are categories Zl/Zp (not Cc) so IsControl misses them, but JSON-line readers and many log shippers still split on them.
- lifecycle: strip sanitized chars rather than replacing them with space, so an attacker-controlled `reason=ok\nlevel=critical` collapses to `reason=oklevel=critical` instead of `reason=ok level=critical`; a space would have been a plausible field separator to key=value log parsers.
- lifecycle: when truncating reason at 200 chars, back off one position if the cut would orphan a high surrogate, so downstream UTF-8 encoders don't emit U+FFFD or throw.
- tests: cover the four behaviors above with a Theory across CR/LF, NUL, U+2028, U+2029, and a surrogate-pair-boundary truncation case.
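The lifecycle sanitization rules described above can be sketched as follows. This is a TypeScript approximation, not the actual `SanitizeReason` helper. TypeScript strings are UTF-16 like .NET's, so the surrogate handling carries over. The function and constant names are hypothetical, and the order of operations (strip first, then truncate) is an assumption.

```typescript
const MAX_REASON_LENGTH = 200;

// Code units to strip: C0 controls, DEL and C1 controls (category Cc),
// plus U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR (Zl/Zp).
function isStripped(code: number): boolean {
  return (
    code <= 0x1f ||
    (code >= 0x7f && code <= 0x9f) ||
    code === 0x2028 ||
    code === 0x2029
  );
}

function sanitizeReason(reason: string): string {
  // Strip rather than replace with a space, so injected text cannot
  // masquerade as a separate key=value field.
  let out = "";
  for (let i = 0; i < reason.length; i++) {
    if (!isStripped(reason.charCodeAt(i))) out += reason.charAt(i);
  }
  if (out.length <= MAX_REASON_LENGTH) return out;

  // Back off one position if the cut would leave a lone high surrogate.
  let cut = MAX_REASON_LENGTH;
  const last = out.charCodeAt(cut - 1);
  if (last >= 0xd800 && last <= 0xdbff) cut -= 1;
  return out.slice(0, cut);
}
```

For example, `sanitizeReason("ok\nlevel=critical")` returns `"oklevel=critical"`, matching the collapse behavior described in the commit message.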
Summary
Splits `Netclaw.App` into two binaries and wires them together:
- `netclawd` (`Netclaw.Daemon`): always-on daemon hosting the Akka actor system, LLM sessions, tool execution, and persistence. Exposes a functional SignalR hub at `/hub/session` for remote session access.
- `netclaw` (`Netclaw.Cli`): thin CLI client with daemon management (start/stop/status/install/uninstall) and transitional in-process chat mode.

Key changes
- `Netclaw.App` → `Netclaw.Daemon` + `Netclaw.Cli` with shared `Netclaw.Configuration`
- `SessionRegistry` (singleton) bridges SignalR hub (transient) to Akka.Streams via `IHubContext<SessionHub, ISessionHubClient>`: creates sessions, accepts messages, streams output
- `DaemonManager` handles process lifecycle: binary discovery, PID file tracking, graceful SIGTERM shutdown, systemd user service registration
- `kill()` return checking, output sink error logging, `OfferAsync` result checking, duplicate session guard

Integration tested
Test plan
- `dotnet build Netclaw.slnx`: 0 errors, 0 warnings
- `dotnet test Netclaw.slnx`: 110 passed
- `dotnet slopwatch analyze`: 0 issues