feat: Token velocity alert — detect runaway agent loops (closes #313) by vivekchand · Pull Request #358 · vivekchand/clawmetry

vivekchand · 2026-03-26T01:33:45Z

Summary

Real-time runaway-loop detection via three independent sliding-window signals.

Detection Signals

Signal	Default	Config key
Token velocity (2-min window)	10,000 tokens	`token_velocity_threshold`
Consecutive tool-call chain	20 calls	`tool_chain_threshold`
Cost velocity	$0.10/min	`cost_velocity_threshold`

Each signal is independently togglable and threshold-configurable.

New Endpoints

GET  /api/alerts/velocity              # real-time metrics + breach flags
GET  /api/alerts/velocity/config       # read config
POST /api/alerts/velocity/config       # update thresholds

Sample response:

{
  "tokens_in_window": 3200,
  "cost_in_window": 0.000096,
  "cost_per_min": 0.000048,
  "max_consecutive_tool_calls": 4,
  "active_sessions": [...],
  "alert_active": false,
  "breaches": {"token_velocity": false, "tool_chain": false, "cost_velocity": false},
  "thresholds": {"token_velocity": 10000, "tool_chain": 20, "cost_velocity": 0.1},
  "window_sec": 120
}

Implementation

_compute_token_velocity() — scans JSONL session files for recent events; skips files not modified in window+60s (fast); counts tokens, cost, and consecutive tool calls per session
_check_velocity_alerts() — checks each signal against config thresholds, calls _fire_alert() (banner + telegram) and _dispatch_configured_webhooks() for token/cost signals
Wired into _budget_monitor_loop() which runs every 60s

Tests

7 tests in TestTokenVelocity — all passing in ~1s (no gateway dependency)

Sliding-window token velocity monitoring with three detection signals: 1. **Token velocity** (sliding 2-min window, default 10,000 tokens): - Scans JSONL session files for events in the last N seconds - Sums token counts across all active sessions - Default window: 120s, configurable 30-600s via API 2. **Consecutive tool-call chain** (default: 20): - Tracks unbroken sequences of tool_use/tool_result events - Resets count when a human-role message (non-tool) appears - Flags the specific session causing the chain 3. **Cost velocity** (default: $0.10/min): - Derived from token cost / window_sec × 60 - Fires independently of token count New endpoints: GET /api/alerts/velocity — real-time velocity metrics + breach flags GET /api/alerts/velocity/config — read velocity config POST /api/alerts/velocity/config — update thresholds Alert routing: - All breaches call _fire_alert() → banner + telegram channels - Token and cost velocity also dispatch to configured webhooks - 30-min cooldown per rule_id via existing _budget_alert_cooldowns Wired into _budget_monitor_loop() (runs every 60s) — lightweight because it skips JSONL files not modified in the last window + 60s. Tests: 7 tests in TestTokenVelocity, all passing in ~1s

…naway loops (closes #313) Wire _compute_token_velocity + _check_velocity_alerts into the active _budget_monitor_loop (second app block) so velocity thresholds are evaluated every 60s alongside existing anomaly checks. Add a dedicated red velocity banner in the dashboard HTML with: - Live token/cost/tool-chain breach message updated every 30s - 'Kill Loop' button that pauses the gateway via /api/budget/pause - Dismiss with 5-minute snooze before re-checking The backend velocity computation (_compute_token_velocity) and config API (/api/alerts/velocity, /api/alerts/velocity/config) were already present in the codebase but were not integrated into the monitor loop or surfaced in the UI.

vivekchand added 2 commits March 26, 2026 02:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Token velocity alert — detect runaway agent loops (closes #313)#358

feat: Token velocity alert — detect runaway agent loops (closes #313)#358
vivekchand wants to merge 2 commits intomainfrom
feat/gh-313-token-velocity-alert

vivekchand commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vivekchand commented Mar 26, 2026

Summary

Detection Signals

New Endpoints

Implementation

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant