feat(tui): add tokens per second to response footer #12721

Open
JohnC0de wants to merge 3 commits into anomalyco:dev from JohnC0de:feat/tokens-per-second-display

Conversation

@JohnC0de

@JohnC0de JohnC0de commented Feb 8, 2026

Fixes #5374
Closes #6096

Adds a tok/s (TPS) counter to assistant message footers. Shows up right after duration, like: 18.3s · 131 tok/s

Why

I've been switching between providers a lot lately and wanted a quick way to see which models are actually fast vs which just feel fast. Kimi K2.5 clocks ~130 tok/s. Having the number right there makes the difference obvious without needing external tooling.

Screenshot

TPS in action with Kimi K2.5

Kimi K2.5 Free hitting 198 tok/s on a real response

Prior art

#5497 by @edlsh tackled this back in December. It's been sitting for 2+ months now with merge conflicts and CI failures, and a few people in the comments are asking for it to land. Rather than try to rebase that PR, I reimplemented it cleanly on current dev with a different structure: TPS logic lives in core/tokens/ instead of tui/util/ so the SDK and other consumers can use it later without pulling in TUI code.

How it works

processor.ts records a firstToken timestamp when the first output-delta arrives during streaming. TPS is then calculated as generatedTokens / ((completed - firstToken) / 1000), where generatedTokens includes both output and reasoning tokens. Responses shorter than 250ms, tool calls, and errored responses are filtered out.
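
The formula above can be sketched roughly as follows. This is a minimal illustration, not the contents of tps.ts: the `TokenCounts` and `MessageTiming` shapes and the `calculateTPS` name are assumptions made for the example, and the tool-call and error filtering is left out.

```typescript
// Hypothetical shapes for illustration; the PR's actual types may differ.
interface TokenCounts {
  output: number
  reasoning: number
}

interface MessageTiming {
  firstToken?: number // ms timestamp of the first output delta
  completed?: number // ms timestamp when the response finished
}

// The 250ms threshold comes from the PR description.
const MIN_DURATION_MS = 250

function calculateTPS(tokens: TokenCounts, time: MessageTiming): number | undefined {
  if (time.firstToken === undefined || time.completed === undefined) return undefined
  const elapsedMs = time.completed - time.firstToken
  // Very short responses give noisy rates, so they are skipped.
  if (elapsedMs < MIN_DURATION_MS) return undefined
  const generatedTokens = tokens.output + tokens.reasoning
  if (generatedTokens <= 0) return undefined
  return generatedTokens / (elapsedMs / 1000)
}
```

For example, 2,400 generated tokens over 18.3 seconds of streaming works out to about 131 tok/s.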

What I left out

Average/aggregate TPS across a session. Both issues mention it but it felt like scope creep for a first pass. The per-message timestamps are all persisted, so adding a session-level summary later is straightforward.
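
One way such a session-level summary could be derived from the persisted timestamps is sketched below. This is an assumption about a possible follow-up, not part of this PR: the `TimedMessage` shape and `sessionTPS` name are hypothetical, and the aggregate is defined here as total tokens over total generation time rather than a mean of per-message rates.

```typescript
// Hypothetical per-message record built from persisted timestamps.
interface TimedMessage {
  tokens: number // output + reasoning tokens
  firstToken: number // ms timestamp
  completed: number // ms timestamp
}

function sessionTPS(messages: TimedMessage[]): number | undefined {
  let totalTokens = 0
  let totalMs = 0
  for (const message of messages) {
    const elapsedMs = message.completed - message.firstToken
    if (elapsedMs <= 0) continue // skip messages with unusable timing
    totalTokens += message.tokens
    totalMs += elapsedMs
  }
  return totalMs > 0 ? totalTokens / (totalMs / 1000) : undefined
}
```

Weighting by time keeps a single short burst from dominating the session figure.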

Testing

34 unit tests cover calculation, edge cases, and filtering. All CI checks pass: typecheck, unit, e2e (linux), pr-standards.

@github-actions
Contributor

github-actions Bot commented Feb 8, 2026

The following comment was made by an LLM; it may be inaccurate:

Potential Duplicate Found:

PR #5497 - "feat: display tokens per second for assistant messages"
#5497

Why it's related: This PR appears to be addressing the exact same feature - displaying tokens per second for assistant messages. It likely covers similar functionality for tracking and displaying TPS metrics in the UI.

@JohnC0de JohnC0de force-pushed the feat/tokens-per-second-display branch from 787aee0 to c54f23a Compare February 8, 2026 17:23
Adds TPS calculation and display to message footers. Tracks firstToken
timestamp during streaming and calculates throughput for completed text
responses. Filters out tool calls and fast responses to avoid noise.

Key features:
- Shows TPS next to duration: "3.4s · 45 tok/s"
- Includes both output and reasoning tokens
- 250ms minimum threshold to filter noise
- Comprehensive test coverage (34 tests)

Tested with Kimi K2.5 showing ~131 tok/s.

Fixes anomalyco#5374, Closes anomalyco#6096
@JohnC0de JohnC0de force-pushed the feat/tokens-per-second-display branch from c54f23a to 571c49b Compare February 8, 2026 17:30
@JohnC0de JohnC0de changed the title from "feat(tui): display tokens/second metric for assistant responses" to "feat: show tokens per second" Feb 8, 2026
@JohnC0de JohnC0de changed the title from "feat: show tokens per second" to "feat(tui): add tokens per second to response footer" Feb 8, 2026
@JohnC0de
Author

JohnC0de commented Feb 8, 2026

@adamdotdevin @rekram1-node — the bot flagged this as a duplicate of #5497, so wanted to give some context.

I reviewed #5497 before starting. It has merge conflicts against dev and failing CI, and @edlsh hasn't been active on it since December. A few people in the comments there are asking for it to land. Rather than try to rebase someone else's branch, I reimplemented it cleanly on current dev with a different structure: TPS calculation lives in core/tokens/ instead of tui/util/ so the SDK and non-TUI consumers can reuse it.

Quick review guide if it helps:

  • tps.ts (83 lines) — the whole calculation. Pure functions, no side effects
  • processor.ts — only change is recording time.firstToken when the first output-delta arrives
  • session/index.tsx — swaps the old inline calculation for getMessageTPS() + formatTPS()
  • The rest (message-v2.ts, types.gen.ts, openapi.json) is schema + SDK regen for the new firstToken field
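
For reviewers who want a feel for the footer output, a rough sketch of the formatting step is below. The `formatFooter` name and signature are hypothetical and are not the PR's getMessageTPS() or formatTPS(); only the "duration · tok/s" layout is taken from the description.

```typescript
// Hypothetical helper: builds a footer string such as "18.3s · 131 tok/s".
function formatFooter(durationMs: number, tps?: number): string {
  const duration = `${(durationMs / 1000).toFixed(1)}s`
  // When no TPS is available (filtered response), show the duration only.
  if (tps === undefined) return duration
  return `${duration} · ${Math.round(tps)} tok/s`
}
```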

Happy to adjust anything.

@KohliNaman

KohliNaman commented Feb 8, 2026

I was just looking for this. Checked it out on macOS and it works flawlessly. Thanks a ton!
image

@Daltonganger

Any update on this?

@Daltonganger

Daltonganger commented Feb 16, 2026

@rekram1-node I investigated the 3 failing checks on this PR.

Root cause:

  • e2e (windows) fails because Bun.which("rg") can resolve to an invalid POSIX-style path on Windows.
  • e2e (linux) fails because Bun.which("rg") can return a path that exists but is not spawnable (ENOENT at runtime).
  • test (linux) is a gate job and fails because upstream e2e jobs fail.

Proposed minimal fix (single-file change): packages/opencode/src/file/ripgrep.ts

  • Ignore POSIX rg paths on Windows.
  • Probe rg --version before trusting a resolved binary path; fallback to bundled/downloaded rg when unusable.
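
A rough sketch of the two checks, assuming a Node-compatible runtime. This is not the actual patch to ripgrep.ts: `canSpawn` and `resolveRipgrep` are hypothetical names, and the candidate path stands in for whatever Bun.which("rg") returned.

```typescript
import { spawnSync } from "node:child_process"

// Returns true only if the binary can actually be spawned and exits cleanly.
function canSpawn(binary: string): boolean {
  const result = spawnSync(binary, ["--version"], { stdio: "ignore", timeout: 2000 })
  // result.error is set for failures such as ENOENT at spawn time.
  return result.error === undefined && result.status === 0
}

// Hypothetical resolver: trust the candidate only when it looks valid and runs.
function resolveRipgrep(
  candidate: string | null,
  fallback: string,
  platform: string = process.platform,
): string {
  if (!candidate) return fallback
  // A path starting with "/" is a POSIX-style path and is ignored on Windows.
  if (platform === "win32" && candidate.startsWith("/")) return fallback
  return canSpawn(candidate) ? candidate : fallback
}
```

Here `fallback` would be the path of the bundled or downloaded rg.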

I can paste the exact patch here if useful.

@Daltonganger

@rekram1-node I opened a follow-up PR that includes all changes from this PR plus a minimal ripgrep path fix for the failing checks:
#13892

Cross-reference:

@JosXa
Contributor

JosXa commented Feb 20, 2026

Ship it! 🚀

jwiegley added a commit to jwiegley/nix-config that referenced this pull request Feb 27, 2026
Patch opencode with tokens-per-second display from
anomalyco/opencode#12721. Shows streaming throughput metrics
(e.g., "18.3s · 131 tok/s") in assistant message footers.

Built from JohnC0de/opencode feat/tokens-per-second-display branch
(commit 4687e48e9) as a standalone Bun binary, fetched at build time
via fixed-output derivation to work with pure flake evaluation.

Remove this overlay once the feature lands in an upstream release.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FurryWolfX

I need it

@andrewdunndev

Nice implementation. The tps.ts utility with the sliding window approach is solid, and the screenshot shows exactly the UX people are asking for.

One thing I noticed while building something similar: the TPS calculation can produce jittery values during the first few tokens of a response (small denominator, large variance). A minimum window before displaying (e.g., wait until at least 10 output tokens before showing tok/s) smooths this out without adding latency to the display.
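
A minimal sketch of that gate, assuming a live rate is computed while the response streams. The `liveTPS` name is hypothetical and the 10-token threshold is just the example figure above.

```typescript
// Example threshold from the suggestion above; tune as needed.
const MIN_TOKENS_BEFORE_DISPLAY = 10

// Returns undefined (show nothing) until enough tokens have arrived.
function liveTPS(tokensSoFar: number, elapsedMs: number): number | undefined {
  if (tokensSoFar < MIN_TOKENS_BEFORE_DISPLAY || elapsedMs <= 0) return undefined
  return tokensSoFar / (elapsedMs / 1000)
}
```

With only 3 tokens after 20ms the naive rate would read 150 tok/s, which is exactly the kind of early spike the gate hides.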

Also worth noting that @thdxr has #14493 open for the same feature. Might be worth coordinating to avoid duplicate effort.

@Daltonganger

When will this be merged? @rekram1-node

@com30n

com30n commented Apr 8, 2026

any updates on this?


Development

Successfully merging this pull request may close these issues.

[FEATURE]: Adding Experimental Calculation and Display of Tokens per second
[FEATURE]: show tokens / second

7 participants