Skip to content

feat: add ChildHandle and ActorId for type-erased actor management#164

Open
ElFantasma wants to merge 4 commits intomainfrom
feat/links
Open

feat: add ChildHandle and ActorId for type-erased actor management#164
ElFantasma wants to merge 4 commits intomainfrom
feat/links

Conversation

@ElFantasma
Copy link
Copy Markdown
Collaborator

@ElFantasma ElFantasma commented Apr 7, 2026

Summary

  • Adds ActorId — unique identity for actor instances, used by monitors and links to identify actors without knowing the concrete type
  • Adds ChildHandle — type-erased handle to a running actor providing supervision operations (stop(), is_alive(), exit_reason(), wait_exit_blocking(), wait_exit_async()) without needing ActorRef<A>
  • From<ActorRef<A>> for ChildHandle works in both tasks and threads modes
  • ActorRef<A> and Context<A> now carry an ActorId assigned at spawn time, with id() accessors on both
  • Extends exit_reason example with ChildHandle scenarios (type-erased stop, panic observation, heterogeneous Vec<ChildHandle>)

Building block for monitors (#131), links, and supervisor (#133). A supervisor manages children of different actor types — ChildHandle is how it holds and operates on them uniformly.

Key design decisions:

  • ActorId fills the role of Erlang's Pid as a lightweight identity key, but is not the primary user-facing API (users interact through ActorRef<A> and ChildHandle)
  • CancelFn = Arc<dyn Fn() + Send + Sync> wraps both tokio and threads-mode cancel tokens (they're different types)
  • Completion enum with Watch/Condvar variants handles both execution modes
  • ChildHandle equality is by ActorId, not pointer identity
  • wait_exit_blocking uses block_in_place inside a tokio runtime to avoid panics

Test plan

  • 9 new tests: ActorId uniqueness/display, ChildHandle basics/stop/equality, From for both modes, wait_exit_blocking inside runtime
  • All 79 existing tests pass (no regressions)
  • Runnable example: cargo run -p exit_reason (scenarios 5-7 demonstrate ChildHandle)

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 7, 2026

🤖 Kimi Code Review

Overall Assessment: Good abstraction that cleanly unifies tasks and threads modes. Code is mostly correct with clean separation of concerns.

Issues Found

1. Double clone inefficiency in child_handle() (tasks/actor.rs:403-405, threads/actor.rs:386-388)

The child_handle() method clones the entire ActorRef (including sender) only to clone the specific fields again in From::from.

Fix: Implement From<&ActorRef<A>> or call constructor directly:

pub fn child_handle(&self) -> ChildHandle {
    ChildHandle::from_tasks(
        self.id,
        self.cancellation_token.clone(),
        self.completion_rx.clone(),
    )
}

Same fix needed for threads mode.

2. Documentation mismatch (child_handle.rs:70-72)

Docstring mentions "monitor" and "link" operations that don't exist yet:

/// Unlike `ActorRef<A>`, a `ChildHandle` cannot send messages — it only
/// provides supervision-related operations (stop, wait, monitor, link).

Fix: Remove "monitor, link" or add TODO comment indicating future implementation.

3. Error handling hides panics (child_handle.rs:191)

In wait_exit_async() for threads mode, spawn_blocking errors (including panics in the waiting thread) are silently converted to ExitReason::Kill:

.unwrap_or(ExitReason::Kill)

Suggestion: Log the error or consider propagating panic information. At minimum, document this behavior.

4. Resource exhaustion risk (child_handle.rs:188-191)

When calling wait_exit_async() on a threads-mode handle, it spawns a blocking task that occupies a thread pool thread until the actor exits. If many async tasks wait on thread actors simultaneously, this could exhaust the blocking pool.

Suggestion: Document this limitation in the method docs:

/// Note: When used with threads-mode actors, this consumes a thread from the 
/// blocking pool for the duration of the wait.

5. Test coverage gap (child_handle.rs:198-256)

Tests only cover threads mode (from_threads). Add tests for tasks mode to ensure Completion::Watch logic is correct.

Minor Observations

  • child_handle.rs:31: Ordering::Relaxed is correct for ID generation (no ordering constraints needed).
  • child_handle.rs:123-127: Poisoned mutex handling with into_inner() is pragmatic for completion signaling, though consider if poison should be treated as ExitReason::Panic.
  • child_handle.rs:145: block_on usage in tasks mode is correctly documented as blocking the thread.

Conclusion

Address the double-clone issue (item 1) before merging. The rest are minor polish items or documentation improvements. The concurrency logic is sound.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 7, 2026

🤖 Codex Code Review

  1. ChildHandle::wait_exit_blocking() can panic if it is called from code already running on Tokio, even though the docs only warn that it “blocks the calling thread.” The implementation creates a fresh runtime through spawned_rt::threads::block_on, and nested Runtime::block_on on a Tokio worker thread is a panic path. That makes this a runtime footgun for supervision code that might call the blocking API from async context. Consider guarding against an active Tokio runtime and failing explicitly, or keeping the blocking wait API backend-specific instead of exposing it uniformly. concurrency/src/child_handle.rs:130 rt/src/threads/mod.rs:27

I did not see other obvious correctness, race, or lifecycle regressions in the ActorId propagation or completion signaling changes. Residual risk: the new ChildHandle tests only exercise the thread/Condvar path, not the task/watch path, which is the branch with the sharper runtime behavior. concurrency/src/child_handle.rs:198

I could not run cargo test in this sandbox because Cargo attempted to reach crates.io and network access is disabled here.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 7, 2026

🤖 Claude Code Review

PR #164 Review: ChildHandle and ActorId

The design is sound and well-structured. The Completion enum cleanly abstracts tasks/threads differences, and Ordering::Relaxed for NEXT_ACTOR_ID is correct (uniqueness is the only requirement). The watch-channel loop pattern in wait_exit_* is idiomatic tokio.


Issues

1. Unnecessary clones in From implementations (tasks/actor.rs, threads/actor.rs)

Both From<ActorRef<A>> for ChildHandle impls take ownership of actor_ref but then .clone() fields they already own:

// tasks/actor.rs ~L415
impl<A: Actor> From<ActorRef<A>> for ChildHandle {
    fn from(actor_ref: ActorRef<A>) -> Self {
        ChildHandle::from_tasks(
            actor_ref.id,
            actor_ref.cancellation_token.clone(),  // owns it — clone is redundant
            actor_ref.completion_rx.clone(),        // owns it — clone is redundant
        )
    }
}

Since actor_ref is consumed, use the fields directly. Same issue in threads/actor.rs. Fix: drop the .clone() calls and move the fields.


2. wait_exit_blocking will panic/deadlock when called from an async context

child_handle.rs ~L133: spawned_rt::threads::block_on(...) is called for the Watch variant. If this wraps tokio::runtime::Handle::block_on or creates a nested runtime inside an existing tokio executor, it will panic at runtime. The doc warning is necessary but insufficient — actors in tasks mode can receive a ChildHandle and accidentally call this.

Consider adding a debug_assert:

// At the top of the Watch branch in wait_exit_blocking
debug_assert!(
    tokio::runtime::Handle::try_current().is_err(),
    "wait_exit_blocking called from async context; use wait_exit_async instead"
);

3. ExitReason::Kill returned on watch channel disconnect (child_handle.rs ~L143, ~L168)

if rx.changed().await.is_err() {
    return ExitReason::Kill;
}

If the actor task is dropped without writing a final exit reason (e.g., due to a panic in the runtime teardown path), this returns Kill, which is misleading — the actor wasn't killed. A comment explaining why Kill is used as the sentinel here would help readers. Alternatively, ExitReason could gain an Unknown / Dropped variant in a follow-up.


4. Context does not expose id()

ActorRef has pub fn id() -> ActorId but Context does not. Actors in their own handler (fn handle(&mut self, msg, ctx: &Context<Self>)) cannot query their own ID without calling ctx.actor_ref().id(), which needlessly allocates a sender clone. Since Context already stores the id field, adding the method is trivial:

// tasks/actor.rs and threads/actor.rs
pub fn id(&self) -> ActorId {
    self.id
}

This will be needed when implementing monitors/links (#131) where an actor needs to report its own identity.


5. Test coverage gap for wait_exit_blocking / wait_exit_async

The 5 new tests cover ActorId, basics, stop(), and equality — but none exercise the blocking/async wait paths. At minimum, the threads-mode wait_exit_blocking should be covered given it's the most complex path (the Condvar loop). A test similar to child_handle_from_threads_basics that calls wait_exit_blocking() in a background thread would catch regressions.


Minor Nits

  • child_handle.rs is pub mod from lib.rs, exposing the Completion and internal constructors to downstream crates that use spawned_concurrency as a dep (even though they're pub(crate), the module boundary is visible). This is fine but worth being aware of.
  • The ActorId counter starting at 1 implicitly reserves 0 as a sentinel — worth a comment if that's intentional.

Summary

Solid foundational work. Issues #1 and #4 are straightforward fixes; #2 and #5 are higher priority before this becomes the basis for monitors/supervisors. #3 is cosmetic.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 7, 2026

Greptile Summary

This PR introduces ActorId (a global atomic u64 identity assigned at spawn time) and ChildHandle (a type-erased supervision handle providing stop, is_alive, exit_reason, and wait_exit_* operations) as foundational building blocks for the upcoming monitor and supervisor features. Both tasks and threads modes are updated symmetrically, with ActorRef<A> and Context<A> now carrying an ActorId.

Confidence Score: 5/5

Safe to merge; the only finding is a minor API inconsistency that does not affect correctness.

No P0 or P1 issues found. The single P2 finding — a missing convenience method on Context — does not break any existing behavior or contract. The core design (ActorId uniqueness via Relaxed atomic fetch_add, ChildHandle equality by ActorId consistent with Hash, Completion enum unifying both backends, documented warning on wait_exit_blocking) is all correct.

No files require special attention; the P2 gap in tasks/actor.rs and threads/actor.rs is a minor ergonomics improvement for future callers.

Important Files Changed

Filename Overview
concurrency/src/child_handle.rs New file: defines ActorId (global atomic u64 starting at 1), Completion enum (Watch/Condvar variants), and ChildHandle type-erased supervision handle with stop/is_alive/exit_reason/wait_exit_blocking/wait_exit_async; includes 5 unit tests
concurrency/src/lib.rs Adds public child_handle module and re-exports ActorId and ChildHandle at the crate root
concurrency/src/tasks/actor.rs Adds ActorId field to ActorRef and Context, exposes id() on ActorRef, adds child_handle() convenience method and From<ActorRef> for ChildHandle; Context is missing a public id() method (P2)
concurrency/src/threads/actor.rs Same additions as tasks/actor.rs for threads mode: ActorId on ActorRef/Context, id() on ActorRef, child_handle() and From impl; Context also missing public id() method (P2)

Sequence Diagram

sequenceDiagram
    participant Supervisor
    participant ActorRef
    participant ChildHandle
    participant Actor

    Supervisor->>ActorRef: spawn(actor) → ActorRef<A> (ActorId assigned)
    Supervisor->>ActorRef: .child_handle()
    ActorRef-->>Supervisor: ChildHandle (type-erased)
    note over Supervisor,ChildHandle: Supervisor stores Vec<ChildHandle>
    Supervisor->>ChildHandle: .id() → ActorId
    Supervisor->>ChildHandle: .is_alive() → bool
    Supervisor->>ChildHandle: .stop()
    ChildHandle->>Actor: cancellation_token.cancel()
    Actor-->>ChildHandle: sets completion (ExitReason)
    Supervisor->>ChildHandle: .wait_exit_async().await
    ChildHandle-->>Supervisor: ExitReason
Loading

Comments Outside Diff (1)

  1. concurrency/src/tasks/actor.rs, line 109-114 (link)

    P2 Context<A> missing id() accessor

    Context now carries id: ActorId but doesn't expose a public id() method, unlike ActorRef<A> (which gains id() at line 403 in this PR). Actors receive ctx: &Context<Self> in started(), stopped(), and all Handler::handle() calls. For the monitor/supervisor work this PR targets, actors need to know their own ID to register themselves — currently they must call ctx.actor_ref().id(), which clones the full ref. The same gap exists in concurrency/src/threads/actor.rs.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: concurrency/src/tasks/actor.rs
    Line: 109-114
    
    Comment:
    **`Context<A>` missing `id()` accessor**
    
    `Context` now carries `id: ActorId` but doesn't expose a public `id()` method, unlike `ActorRef<A>` (which gains `id()` at line 403 in this PR). Actors receive `ctx: &Context<Self>` in `started()`, `stopped()`, and all `Handler::handle()` calls. For the monitor/supervisor work this PR targets, actors need to know their own ID to register themselves — currently they must call `ctx.actor_ref().id()`, which clones the full ref. The same gap exists in `concurrency/src/threads/actor.rs`.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: concurrency/src/tasks/actor.rs
Line: 109-114

Comment:
**`Context<A>` missing `id()` accessor**

`Context` now carries `id: ActorId` but doesn't expose a public `id()` method, unlike `ActorRef<A>` (which gains `id()` at line 403 in this PR). Actors receive `ctx: &Context<Self>` in `started()`, `stopped()`, and all `Handler::handle()` calls. For the monitor/supervisor work this PR targets, actors need to know their own ID to register themselves — currently they must call `ctx.actor_ref().id()`, which clones the full ref. The same gap exists in `concurrency/src/threads/actor.rs`.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat: add ChildHandle and ActorId for ty..." | Re-trigger Greptile

@ElFantasma
Copy link
Copy Markdown
Collaborator Author

Bot Review Responses

Findings addressed in f9f5279 and 5cf0218.


Kimi

#1: Double clone in child_handle() / From impls

Fixed in 5cf0218. From<ActorRef<A>> now moves fields instead of cloning. child_handle() still clones via self.clone() because it borrows &self — but From no longer double-clones.

#2: Doc mentions "monitor, link" that don't exist yet

Fixed in 5cf0218. Changed to "stop, wait, check liveness."

#3: spawn_blocking errors silently converted to Kill

Acknowledged — this is a defensive fallback for an edge case (the blocking thread itself panics). Logging would add a dependency on tracing in a low-level path. The behavior is documented.

#4: spawn_blocking pool exhaustion risk

Fixed in 5cf0218. Added doc note: "When used with threads-mode handles, this consumes a thread from tokio's blocking pool for the duration of the wait."

#5: Test coverage gap for tasks mode

Already fixed in f9f5279. Added 4 integration tests: child_handle_from_tasks_actor_ref, child_handle_stop_from_tasks_actor, child_handle_wait_blocking_inside_runtime, plus child_handle_from_threads_actor_ref.


Codex

wait_exit_blocking can panic inside tokio runtime

Already fixed in f9f5279. Uses Handle::try_current() + block_in_place when inside a runtime, falls back to a temporary runtime otherwise. Test child_handle_wait_blocking_inside_runtime verifies this.

Tests only exercise Condvar path

Already fixed in f9f5279. See test list above.


Claude

#1: Unnecessary clones in From impls

Fixed in 5cf0218. Fields are moved, not cloned.

#2: wait_exit_blocking panics in async context

Already fixed in f9f5279. See Codex response above.

#3: Kill returned on watch channel disconnect

Fixed in 5cf0218. Added doc comment explaining the rationale: the watch sender was dropped without setting a reason, meaning the actor task was aborted externally (e.g., runtime shutdown).

#4: Context missing id() method

Fixed in 5cf0218. Added pub fn id(&self) -> ActorId to Context<A> in both modes.

#5: No tests for wait paths

Already fixed in f9f5279. child_handle_wait_blocking_inside_runtime and child_handle_from_tasks_actor_ref (which calls wait_exit_async) cover both paths.


Greptile

Context missing id() accessor

Fixed in 5cf0218. Same as Claude #4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant