Skip to content

Add IORuntime#liveFiberSnapshot#4444

Merged
djspiewak merged 3 commits intotypelevel:series/3.xfrom
iRevive:feature/live-fiber-snapshot
Jul 23, 2025
Merged

Add IORuntime#liveFiberSnapshot#4444
djspiewak merged 3 commits intotypelevel:series/3.xfrom
iRevive:feature/live-fiber-snapshot

Conversation

@iRevive
Copy link
Copy Markdown
Contributor

@iRevive iRevive commented Jul 12, 2025

Fixes #2580, #3025. The implementation is based on #3038.

@iRevive iRevive requested review from armanbilge and djspiewak July 12, 2025 08:17
@iRevive iRevive force-pushed the feature/live-fiber-snapshot branch from 5ff02a2 to 9b4c259 Compare July 12, 2025 08:29
/**
* Mapping of worker threads to their currently active fibers.
*/
def workers: Map[WorkerInfo, List[FiberInfo]]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workers will always be empty in Scala.js. Should we add FiberSnapshotPlatform and provide a platform-specific API?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah feels like a platform specific API would be better here unless it creates extensive contortions

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/**
* The list of all global (non-worker-local) fibers.
*/
def global: List[FiberInfo]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global name feels odd. Should we use external or foreign instead?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

External feels best since we use that term a lot for such things

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@iRevive iRevive force-pushed the feature/live-fiber-snapshot branch from 9b4c259 to 757a237 Compare July 12, 2025 08:49
Copy link
Copy Markdown
Member

@djspiewak djspiewak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is very good. A couple high level comments

/**
* The underlying fiber.
*/
def fiber: Fiber[IO, ?, ?]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know, this opens up some really weird possible patterns, like taking a live snapshot and then canceling random fibers. Technically that's valid but it might break some assumptions (e.g. in combinators which use local fibers as an implementation detail). Do we need to expose the actual Fiber instance in this way?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need fiber in exactly one place:

def pretty: String = {
val id = System.identityHashCode(fiber).toHexString
val prefixedTrace = if (trace.toList.isEmpty) "" else "\n" + Tracing.prettyPrint(trace)
s"cats.effect.IOFiber@$id $state$prefixedTrace"
}

It might be helpful to include an identifier to indicate which specific fiber this FiberInfo refers to.
Perhaps we can replace fiber with a fiberId: String instead? That should be sufficient.

/**
* The list of all global (non-worker-local) fibers.
*/
def global: List[FiberInfo]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

External feels best since we use that term a lot for such things

/**
* Mapping of worker threads to their currently active fibers.
*/
def workers: Map[WorkerInfo, List[FiberInfo]]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah feels like a platform specific API would be better here unless it creates extensive contortions

/**
* The underlying thread representing the worker.
*/
def thread: Thread
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this API (which is very fair) makes a decent symmetrical case for producing the Fiber in FiberInfo (as you've done). People are just going to have to be aware that this whole thing is a pretty serious footgun.

Copy link
Copy Markdown
Contributor Author

@iRevive iRevive Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same story here.

It is used in exactly one place:

val workersStatuses = snapshot.workers.map {
case (worker, fibers) =>
val yielding = fibers.collect {
case fiber: FiberInfo if fiber.state == FiberInfo.State.Yielding => fiber
}.size
printFibers(fibers)(print)
s"${worker.thread} (#${worker.index}): $yielding enqueued"
}

Maybe we should use threadInfo: String instead, which can use Thread#toString under the hood? Sufficient enough for debugging and not enough to break something.

* } yield ()
* }}}
*/
def liveFiberSnapshot(): FiberSnapshot =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to add some documentation to justify why we aren't wrapping this in IO. In particular, the fact that we expect it to be needed from outside an IO context (for debugging and monitoring) makes it necessary to keep it unwrapped. We could hypothetically do SyncIO here instead if we want to be purists, but given that this is in the unsafe package I don't think it's strictly needed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few more bits:

   * This method is intended primarily for external tools, such as debuggers, monitoring agents,
   * or test frameworks. As such, it is '''not''' wrapped in [[cats.effect.IO]] or
   * [[cats.effect.SyncIO]] since this API is expected to be used outside the effect context.
   * Requiring an `IO` here would make integration with external tools more cumbersome or even
   * impossible.
   *
   * ....
   *
   * @note
   *   this method is part of the `unsafe` package and should be used with care. Its primary
   *   purpose is for runtime introspection, not for application logic.

@djspiewak
Copy link
Copy Markdown
Member

Thought about it more, and I kinda like having the threads and fibers. It's analogous to the APIs the JVM exposes on Thread itself.

@djspiewak djspiewak merged commit d2e396a into typelevel:series/3.x Jul 23, 2025
36 of 38 checks passed
@iRevive iRevive deleted the feature/live-fiber-snapshot branch July 23, 2025 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Provide within-process "public" API for fiber dumps

2 participants