Feature Description
Three additive extensions to the ext.API / ext.Context surface that would significantly reduce boilerplate for long-lived, stateful, budget-aware extensions:
- OnLLMUsage event: per-LLM-call usage callback with token and cost deltas
- ctx.SetState / ctx.GetState: last-write-wins, session-scoped key-value state that lives outside the conversation tree
- Enriched AgentEndEvent: include per-turn aggregates (tool counts, token deltas, duration) so observers don't have to maintain parallel bookkeeping
These are independent but related, share an implementation theme ("give extensions accurate per-turn signals and a lightweight place to put state"), and would each unlock a class of extension rather than just one use case.
1. OnLLMUsage event
Today, MessageEndEvent carries only {Content}. To observe token usage per LLM call, extensions must snapshot ctx.GetSessionUsage() at turn boundaries and subtract. This is coarse (a single user turn can include many LLM calls inside a tool loop) and lossy (no way to attribute usage to specific calls or models).
api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
// e.InputTokens int
// e.OutputTokens int
// e.CacheReadTokens int
// e.CacheWriteTokens int
// e.Cost float64
// e.Model string
// e.Provider string
// e.RequestID string // correlation id for the underlying provider call
})
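As a rough illustration of what per-call usage would enable, here is a minimal sketch of a budget check that reacts between LLM calls rather than at the end of a turn. The LLMUsageEvent type below is a trimmed local copy of the proposed event, and Budget with its thresholds is hypothetical; none of this is existing Kit API.

```go
package main

import "fmt"

// LLMUsageEvent is a local stand-in for the proposed event (subset of fields).
// It is not a type from the real ext package.
type LLMUsageEvent struct {
	InputTokens  int
	OutputTokens int
	Cost         float64
	Model        string
}

// Budget is a hypothetical helper that accumulates per-call cost.
type Budget struct {
	WarnAt float64
	StopAt float64
	spent  float64
}

// Observe adds one call's cost and returns "ok", "warn", or "stop".
func (b *Budget) Observe(e LLMUsageEvent) string {
	b.spent += e.Cost
	switch {
	case b.spent >= b.StopAt:
		return "stop"
	case b.spent >= b.WarnAt:
		return "warn"
	default:
		return "ok"
	}
}

func main() {
	b := &Budget{WarnAt: 5, StopAt: 10}
	// Three LLM calls inside one tool loop; the cap trips on the third call,
	// before the turn has finished.
	for _, e := range []LLMUsageEvent{{Cost: 3}, {Cost: 3}, {Cost: 5}} {
		fmt.Println(b.Observe(e))
	}
}
```

With snapshot-and-subtract at turn boundaries, the same extension would only see the combined cost after all three calls had already been made.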
2. ctx.SetState / ctx.GetState
AppendEntry is the only persistence primitive today. It's append-only, fork-aware, and lives in the conversation tree. That's the right tool for audit logs, but most stateful extensions want a key-value snapshot: "what's the current value of X?" — last write wins.
Using AppendEntry for snapshots has costs:
- O(branch_length) reads (must walk the branch and take the last matching entry)
- Every write fsyncs into the conversation JSONL and advances leafID
- Snapshots accumulate forever; forks duplicate every prior entry
- Slows future branch reads in proportion to write frequency
ctx.SetState("key", jsonBlob) // last-write-wins, session-scoped
val, ok := ctx.GetState("key")
ctx.DeleteState("key")
keys := ctx.ListState()
Implementation can be a sidecar file (e.g. <session-dir>/ext-state.json) or an in-memory map flushed on shutdown. Either way: not in the conversation tree, not in LLM context, not walked on every branch read.
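To make the intended semantics concrete, the following is a minimal sketch of a last-write-wins store offering the four operations above. StateStore is a hypothetical name, and the mutex-guarded map is only one possible in-memory shape; the proposal does not fix the backing store.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// StateStore is a hypothetical last-write-wins key-value store.
type StateStore struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewStateStore() *StateStore {
	return &StateStore{data: make(map[string]string)}
}

// Set overwrites any previous value for key.
func (s *StateStore) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// Get returns the current value and whether the key exists.
func (s *StateStore) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// Delete removes key; deleting a missing key is a no-op.
func (s *StateStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key)
}

// List returns all keys in sorted order so callers see a stable result.
func (s *StateStore) List() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := NewStateStore()
	s.Set("budget:spent", `{"usd":1.5}`)
	s.Set("budget:spent", `{"usd":2.0}`) // the second write replaces the first
	v, ok := s.Get("budget:spent")
	fmt.Println(v, ok)
	s.Delete("budget:spent")
	fmt.Println(len(s.List()))
}
```

Reads are a single map lookup regardless of how many writes came before, which is the property the AppendEntry approach lacks.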
3. Enriched AgentEndEvent
Today:
type AgentEndEvent struct {
Response string
StopReason string
}
Almost every observer-style extension reconstructs the same data by maintaining counters in OnToolResult and snapshotting GetSessionUsage at OnAgentStart / OnAgentEnd. Proposed:
type AgentEndEvent struct {
Response string
StopReason string
ToolCallCount int // total tool invocations in this turn
ToolNames []string // ordered list for inspection
LLMCallCount int // tool-loop iterations / LLM round trips
InputTokensDelta int // sum across this turn
OutputTokensDelta int
CostDelta float64
DurationMs int64
}
All fields are computed by Kit's existing runtime; exposing them in the event removes the ~50 lines of duplicate bookkeeping that lives in nearly every continuation-loop, telemetry, or summarization extension.
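As a sketch of the consumer side, an auto-commit-style extension could decide whether a turn did real work from the event alone. The AgentEndEvent below is a trimmed local copy of the proposed struct, and both shouldAutoCommit and the tool names ("read", "grep", "edit", "write") are assumptions made for illustration.

```go
package main

import "fmt"

// AgentEndEvent is a local copy of a subset of the proposed fields.
type AgentEndEvent struct {
	StopReason    string
	ToolCallCount int
	ToolNames     []string
}

// shouldAutoCommit reports whether the turn invoked at least one tool from
// the given set of mutating tools. The set itself is an assumption.
func shouldAutoCommit(e AgentEndEvent, mutating map[string]bool) bool {
	if e.ToolCallCount == 0 {
		return false
	}
	for _, name := range e.ToolNames {
		if mutating[name] {
			return true
		}
	}
	return false
}

func main() {
	mutating := map[string]bool{"edit": true, "write": true}
	readOnly := AgentEndEvent{ToolCallCount: 2, ToolNames: []string{"read", "grep"}}
	edited := AgentEndEvent{ToolCallCount: 2, ToolNames: []string{"read", "edit"}}
	fmt.Println(shouldAutoCommit(readOnly, mutating), shouldAutoCommit(edited, mutating))
}
```

No counters in OnToolResult and no usage snapshots are needed; the handler is a pure function of the event.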
Motivation / Use Case
These three changes converge on a category of extension Kit doesn't currently support ergonomically: long-running, stateful, observation- or budget-driven extensions. Examples that all need at least two of the three primitives:
- Cost dashboards / budget caps: need per-call usage (gap 1) and durable state (gap 2)
- Token-budget enforcement ("warn at $5, cut off at $10"): needs accurate per-LLM-call usage (gap 1) to act between calls instead of after a whole turn
- Auto-commit-on-idle extensions: need to detect "real work happened this turn" via ToolCallCount (gap 3) without maintaining parallel counters
- Summarize-on-idle: same as above
- Continuation / Ralph-loop patterns: need turn-level signals (gap 3) and durable state (gap 2) to decide whether to dispatch
- Cross-session preferences: currently every extension reinvents file path resolution under ~/.config/kit/...; SetState removes the boilerplate (gap 2)
Today each of these extensions reimplements the same bookkeeping or pays the wrong storage cost. The boilerplate is small in isolation, but it's identical across extensions, error-prone (TOCTOU races, missed cache-token fields, forgotten reset-on-fork semantics), and silently expensive when extensions default to AppendEntry for snapshot state.
Each gap is additive — no breaking changes — and each can ship independently:
- Gap 1 has the highest impact. Without it, extensions have no accurate per-call usage signal at all; all current solutions are snapshot-and-subtract approximations.
- Gap 2 is the broadest. Every stateful extension benefits, and it removes the recurring "should I use AppendEntry for this?" decision that today usually gets the wrong answer.
- Gap 3 is pure ergonomics, but it's the most-duplicated boilerplate in the example extensions directory.
Proposed Implementation
1. OnLLMUsage
Hook into the existing usage accounting that already feeds GetSessionUsage. The runtime already knows per-call usage (it has to, to aggregate). Expose it via a new event type registered the same way as other lifecycle events:
// internal/extensions/api.go
type LLMUsageEvent struct {
InputTokens int
OutputTokens int
CacheReadTokens int
CacheWriteTokens int
Cost float64
Model string
Provider string
RequestID string
}
func (e LLMUsageEvent) Type() EventType { return LLMUsage }
func (a *API) OnLLMUsage(handler func(LLMUsageEvent, Context)) {
a.onLLMUsage(handler)
}
Fire after each provider call, before the corresponding ToolResult / MessageEnd dispatch.
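The ordering described above could look roughly like the sketch below. The runner type, completeCall, and the log slice are all hypothetical stand-ins; the real wiring in internal/extensions/runner.go may be structured quite differently.

```go
package main

import "fmt"

// LLMUsageEvent is a trimmed local copy of the proposed event.
type LLMUsageEvent struct {
	InputTokens  int
	OutputTokens int
	RequestID    string
}

// runner is a hypothetical stand-in for the extension runner.
type runner struct {
	usageHandlers []func(LLMUsageEvent)
	sessionInput  int
	sessionOutput int
	log           []string // records dispatch order for the demo
}

func (r *runner) OnLLMUsage(h func(LLMUsageEvent)) {
	r.usageHandlers = append(r.usageHandlers, h)
}

// completeCall runs after a provider call returns. It folds the call into
// the session totals, fires usage handlers, then dispatches MessageEnd.
func (r *runner) completeCall(e LLMUsageEvent) {
	r.sessionInput += e.InputTokens
	r.sessionOutput += e.OutputTokens

	r.log = append(r.log, "llm_usage:"+e.RequestID)
	for _, h := range r.usageHandlers {
		h(e)
	}

	// Stand-in for the existing MessageEnd / ToolResult dispatch.
	r.log = append(r.log, "message_end:"+e.RequestID)
}

func main() {
	r := &runner{}
	perCall := 0
	r.OnLLMUsage(func(e LLMUsageEvent) { perCall += e.InputTokens + e.OutputTokens })

	r.completeCall(LLMUsageEvent{InputTokens: 100, OutputTokens: 20, RequestID: "req-1"})
	r.completeCall(LLMUsageEvent{InputTokens: 150, OutputTokens: 30, RequestID: "req-2"})

	// The per-call events sum to the same totals the session accounting holds.
	fmt.Println(perCall, r.sessionInput+r.sessionOutput)
	fmt.Println(r.log)
}
```

Because the event is emitted from the same place that updates the session totals, the per-call deltas and the aggregate cannot drift apart.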
2. SetState / GetState
Add four function fields to ext.Context:
type Context struct {
// ...
SetState func(key, value string)
GetState func(key string) (string, bool)
DeleteState func(key string)
ListState func() []string
}
Backing store options:
- Sidecar JSON file under the session directory (<session-dir>/ext-state.json), flushed on shutdown and on each SetState with a debounce
- In-memory map, persisted via a single tree-level entry on shutdown and reloaded on session start
Either works; the sidecar file keeps it out of the conversation tree entirely, which I'd prefer.
Namespacing convention should match AppendEntry (extname:key).
3. Enriched AgentEndEvent
The runtime already tracks all of these. Add fields to the event struct and populate them at the dispatch site:
type AgentEndEvent struct {
Response string
StopReason string
ToolCallCount int
ToolNames []string
LLMCallCount int
InputTokensDelta int
OutputTokensDelta int
CostDelta float64
DurationMs int64
}
Existing handlers that read Response / StopReason are unaffected.
Component
extensions (internal/extensions/api.go, internal/extensions/runner.go, plus wiring in cmd/extension_context.go and the pkg/kit/extension_api.go adapter)