feat: OnLLMUsage event, ctx.SetState/GetState, and enriched AgentEndEvent #53

Description

@ezynda3

Feature Description

Three additive extensions to the ext.API / ext.Context surface that would significantly reduce boilerplate for long-lived, stateful, budget-aware extensions:

  1. OnLLMUsage event — per-LLM-call usage callback with token and cost deltas
  2. ctx.SetState / ctx.GetState — last-write-wins, session-scoped key-value state that lives outside the conversation tree
  3. Enriched AgentEndEvent — include per-turn aggregates (tool counts, token deltas, duration) so observers don't have to maintain parallel bookkeeping

These are independent but related, share an implementation theme ("give extensions accurate per-turn signals and a lightweight place to put state"), and would each unlock a class of extension rather than just one use case.


1. OnLLMUsage event

Today, MessageEndEvent carries only {Content}. To observe token usage per LLM call, extensions must snapshot ctx.GetSessionUsage() at turn boundaries and subtract. This is coarse (a single user turn can include many LLM calls inside a tool loop) and lossy (no way to attribute usage to specific calls or models).

api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
    // e.InputTokens       int
    // e.OutputTokens      int
    // e.CacheReadTokens   int
    // e.CacheWriteTokens  int
    // e.Cost              float64
    // e.Model             string
    // e.Provider          string
    // e.RequestID         string  // correlation id for the underlying provider call
})

2. ctx.SetState / ctx.GetState

AppendEntry is the only persistence primitive today. It's append-only, fork-aware, and lives in the conversation tree. That's the right tool for audit logs, but most stateful extensions want a key-value snapshot: "what's the current value of X?" — last write wins.

Using AppendEntry for snapshots has costs:

  • O(branch_length) reads (must walk the branch and take the last matching entry)
  • Every write fsyncs into the conversation JSONL and advances leafID
  • Snapshots accumulate forever; forks duplicate every prior entry
  • Slows future branch reads in proportion to write frequency

Proposed:

ctx.SetState("key", jsonBlob)        // last-write-wins, session-scoped
val, ok := ctx.GetState("key")
ctx.DeleteState("key")
keys := ctx.ListState()

Implementation can be a sidecar file (e.g. <session-dir>/ext-state.json) or an in-memory map flushed on shutdown. Either way: not in the conversation tree, not in LLM context, not walked on every branch read.
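
To make the read-cost difference concrete, here is a small sketch contrasting the two access patterns. The entry type and lastInBranch emulate an append-only branch and are stand-ins, not Kit's real tree types; StateStore is a hypothetical in-memory backing for the four proposed functions.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// entry is a stand-in for a conversation-tree entry (assumption).
type entry struct{ Key, Value string }

// lastInBranch emulates reading snapshot state from an append-only branch:
// walk backwards until the most recent matching entry is found, O(len(branch)).
func lastInBranch(branch []entry, key string) (string, bool) {
	for i := len(branch) - 1; i >= 0; i-- {
		if branch[i].Key == key {
			return branch[i].Value, true
		}
	}
	return "", false
}

// StateStore is a hypothetical last-write-wins store kept outside the tree.
type StateStore struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewStateStore() *StateStore { return &StateStore{data: make(map[string]string)} }

func (s *StateStore) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value // overwrites; nothing accumulates
}

func (s *StateStore) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func (s *StateStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key)
}

// List returns the keys in sorted order so callers get stable output.
func (s *StateStore) List() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	var branch []entry
	store := NewStateStore()
	for i := 0; i < 1000; i++ {
		v := fmt.Sprintf(`{"n":%d}`, i)
		branch = append(branch, entry{"counter", v}) // log grows with every write
		store.Set("counter", v)                       // store holds one value
	}
	fromLog, _ := lastInBranch(branch, "counter")
	fromMap, _ := store.Get("counter")
	fmt.Println(fromLog == fromMap, len(branch), len(store.List()))
}
```

Both paths return the same latest value, but the log holds 1000 entries for one key while the store holds a single entry.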

3. Enriched AgentEndEvent

Today:

type AgentEndEvent struct {
    Response   string
    StopReason string
}

Almost every observer-style extension reconstructs the same data by maintaining counters in OnToolResult and snapshotting GetSessionUsage at OnAgentStart / OnAgentEnd. Proposed:

type AgentEndEvent struct {
    Response          string
    StopReason        string
    ToolCallCount     int       // total tool invocations in this turn
    ToolNames         []string  // ordered list for inspection
    LLMCallCount      int       // tool-loop iterations / LLM round trips
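    InputTokensDelta  int       // input tokens summed across this turn's LLM calls
    OutputTokensDelta int       // output tokens summed across this turn's LLM calls
    CostDelta         float64   // cost summed across this turn's LLM calls
    DurationMs        int64     // duration of the turn in milliseconds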
}

All fields are computed by Kit's existing runtime; exposing them in the event removes the ~50 lines of duplicate bookkeeping that lives in nearly every continuation-loop, telemetry, or summarization extension.

Motivation / Use Case

These three changes converge on a category of extension Kit doesn't currently support ergonomically: long-running, stateful, observation- or budget-driven extensions. Examples that all need at least two of the three primitives:

  • Cost dashboards / budget caps — need per-call usage (gap 1) and durable state (gap 2)
  • Token-budget enforcement ("warn at $5, cut off at $10") — needs accurate per-LLM-call usage (gap 1) to act between calls instead of after a whole turn
  • Auto-commit-on-idle extensions — need to detect "real work happened this turn" via ToolCallCount (gap 3) without maintaining parallel counters
  • Summarize-on-idle — same as above
  • Continuation / Ralph-loop patterns — need turn-level signals (gap 3) and durable state (gap 2) to decide whether to dispatch
  • Cross-session preferences — currently every extension reinvents file path resolution under ~/.config/kit/...; SetState removes the boilerplate (gap 2)

Today each of these extensions reimplements the same bookkeeping or pays the wrong storage cost. The boilerplate is small in isolation, but it's identical across extensions, error-prone (TOCTOU races, missed cache-token fields, forgotten reset-on-fork semantics), and silently expensive when extensions default to AppendEntry for snapshot state.

Each gap is additive — no breaking changes — and each can ship independently:

  • Gap 1 has the highest impact. Without it, extensions have no accurate per-call usage signal at all; every current solution is a snapshot-and-subtract approximation.
  • Gap 2 is the broadest. Every stateful extension benefits, and it removes the recurring "should I use AppendEntry for this?" decision that today usually gets the wrong answer.
  • Gap 3 is pure ergonomics, but it's the most-duplicated boilerplate in the example extensions directory.

Proposed Implementation

1. OnLLMUsage

Hook into the existing usage accounting that already feeds GetSessionUsage. The runtime already knows per-call usage (it has to, to aggregate). Expose it via a new event type registered the same way as other lifecycle events:

// internal/extensions/api.go
type LLMUsageEvent struct {
    InputTokens      int
    OutputTokens     int
    CacheReadTokens  int
    CacheWriteTokens int
    Cost             float64
    Model            string
    Provider         string
    RequestID        string
}

func (e LLMUsageEvent) Type() EventType { return LLMUsage }

func (a *API) OnLLMUsage(handler func(LLMUsageEvent, Context)) {
    a.onLLMUsage(handler)
}

Fire after each provider call, before the corresponding ToolResult / MessageEnd dispatch.

2. SetState / GetState

Add four function fields to ext.Context:

type Context struct {
    // ...
    SetState    func(key, value string)
    GetState    func(key string) (string, bool)
    DeleteState func(key string)
    ListState   func() []string
}

Backing store options:

  • Sidecar JSON file under the session directory (<session-dir>/ext-state.json), flushed on shutdown and, with a debounce, after each SetState
  • In-memory map, persisted via a single tree-level entry on shutdown and reloaded on session start

Either works; the sidecar file keeps it out of the conversation tree entirely, which I'd prefer.

Namespacing convention should match AppendEntry (extname:key).

3. Enriched AgentEndEvent

The runtime already tracks all of these. Add fields to the event struct and populate them at the dispatch site:

type AgentEndEvent struct {
    Response          string
    StopReason        string
    ToolCallCount     int
    ToolNames         []string
    LLMCallCount      int
    InputTokensDelta  int
    OutputTokensDelta int
    CostDelta         float64
    DurationMs        int64
}

Existing handlers that read Response / StopReason are unaffected.

Component

extensions (internal/extensions/api.go, internal/extensions/runner.go, plus wiring in cmd/extension_context.go and the pkg/kit/extension_api.go adapter)
