feat: Token velocity alert — detect runaway agent loops (closes #313) #358

Open
vivekchand wants to merge 2 commits into main from feat/gh-313-token-velocity-alert


@vivekchand
Owner

Summary

Real-time runaway-loop detection via three independent sliding-window signals.

Detection Signals

| Signal | Default | Config key |
| --- | --- | --- |
| Token velocity (2-min window) | 10,000 tokens | token_velocity_threshold |
| Consecutive tool-call chain | 20 calls | tool_chain_threshold |
| Cost velocity | $0.10/min | cost_velocity_threshold |

Each signal is independently togglable and threshold-configurable.

New Endpoints

GET  /api/alerts/velocity              # real-time metrics + breach flags
GET  /api/alerts/velocity/config       # read config
POST /api/alerts/velocity/config       # update thresholds

Sample response:

{
  "tokens_in_window": 3200,
  "cost_in_window": 0.000096,
  "cost_per_min": 0.000048,
  "max_consecutive_tool_calls": 4,
  "active_sessions": [...],
  "alert_active": false,
  "breaches": {"token_velocity": false, "tool_chain": false, "cost_velocity": false},
  "thresholds": {"token_velocity": 10000, "tool_chain": 20, "cost_velocity": 0.1},
  "window_sec": 120
}
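To make the relationship between the raw metrics and the breach flags concrete, here is a minimal Python sketch that recomputes `cost_per_min` and the three flags from the sample payload. The helper name `derive_breaches` is hypothetical, and the `>=` comparison is an assumption; the PR does not say whether a value exactly at the threshold counts as a breach.

```python
import math

# Values copied from the sample response. "active_sessions" is left out
# because the sample elides its contents.
sample = {
    "tokens_in_window": 3200,
    "cost_in_window": 0.000096,
    "cost_per_min": 0.000048,
    "max_consecutive_tool_calls": 4,
    "thresholds": {"token_velocity": 10000, "tool_chain": 20, "cost_velocity": 0.1},
    "window_sec": 120,
}


def derive_breaches(payload):
    """Recompute cost_per_min and breach flags from the raw metrics.

    Assumption: a signal breaches when its value is >= its threshold.
    """
    thresholds = payload["thresholds"]
    # Cost velocity: cost in the window, normalised to a per-minute rate.
    cost_per_min = payload["cost_in_window"] / payload["window_sec"] * 60
    breaches = {
        "token_velocity": payload["tokens_in_window"] >= thresholds["token_velocity"],
        "tool_chain": payload["max_consecutive_tool_calls"] >= thresholds["tool_chain"],
        "cost_velocity": cost_per_min >= thresholds["cost_velocity"],
    }
    # Assumption: alert_active is true when any single signal breaches.
    return cost_per_min, breaches, any(breaches.values())


cost_per_min, breaches, alert_active = derive_breaches(sample)
```

For the sample values this reproduces the reported `cost_per_min` of 0.000048 and leaves every flag false, since 3,200 tokens and 4 chained calls sit well below the defaults.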

Implementation

  • _compute_token_velocity(): scans JSONL session files for recent events, skipping any file not modified within the last window + 60s so the scan stays cheap; counts tokens, cost, and consecutive tool calls per session
  • _check_velocity_alerts(): checks each signal against its configured threshold; every breach calls _fire_alert() (banner + telegram), and token and cost breaches also call _dispatch_configured_webhooks()
  • Wired into _budget_monitor_loop() which runs every 60s
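The mtime pre-filter described above can be sketched roughly as follows. This is not the PR's code: the function name `scan_recent_tokens` and the event fields `ts` (epoch seconds) and `tokens` are assumptions made for illustration, since the actual JSONL schema is not shown here.

```python
import json
import os
import tempfile
import time


def scan_recent_tokens(session_dir, window_sec=120, now=None):
    """Sum tokens from events newer than the window (hypothetical schema)."""
    now = time.time() if now is None else now
    cutoff = now - window_sec
    total = 0
    for name in os.listdir(session_dir):
        if not name.endswith(".jsonl"):
            continue
        path = os.path.join(session_dir, name)
        # Cheap pre-filter: a file untouched for window + 60s is skipped unread.
        if os.path.getmtime(path) < cutoff - 60:
            continue
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    event = json.loads(line)
                except json.JSONDecodeError:
                    continue  # tolerate a partially written last line
                if isinstance(event, dict) and event.get("ts", 0) >= cutoff:
                    total += int(event.get("tokens", 0))
    return total


# Small demo with a fixed clock so the result does not depend on wall time.
with tempfile.TemporaryDirectory() as d:
    now = 1_000_000.0
    fresh = os.path.join(d, "fresh.jsonl")
    stale = os.path.join(d, "stale.jsonl")
    with open(fresh, "w", encoding="utf-8") as fh:
        fh.write(json.dumps({"ts": now - 10, "tokens": 500}) + "\n")
        fh.write(json.dumps({"ts": now - 500, "tokens": 900}) + "\n")
    with open(stale, "w", encoding="utf-8") as fh:
        fh.write(json.dumps({"ts": now - 1000, "tokens": 999}) + "\n")
    os.utime(fresh, (now - 5, now - 5))
    os.utime(stale, (now - 1000, now - 1000))
    demo_total = scan_recent_tokens(d, window_sec=120, now=now)
```

In the demo only the 500-token event is counted: the stale file is skipped on mtime alone, and the older event in the fresh file falls outside the 120s window.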

Tests

7 tests in TestTokenVelocity — all passing in ~1s (no gateway dependency)

Sliding-window token velocity monitoring with three detection signals:

1. **Token velocity** (sliding 2-min window, default 10,000 tokens):
   - Scans JSONL session files for events in the last N seconds
   - Sums token counts across all active sessions
   - Default window: 120s, configurable 30-600s via API

2. **Consecutive tool-call chain** (default: 20):
   - Tracks unbroken sequences of tool_use/tool_result events
   - Resets count when a human-role message (non-tool) appears
   - Flags the specific session causing the chain

3. **Cost velocity** (default: $0.10/min):
   - Derived from token cost / window_sec × 60
   - Fires independently of token count
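As a rough illustration of signals 2 and 3, the sketch below counts the longest unbroken tool-call chain and converts windowed cost to a per-minute rate. The event shape is assumed: each event is a dict with a `type` of `tool_use`, `tool_result`, or `user_message` (a hypothetical label for a human-role, non-tool message). Counting only `tool_use` events as calls, and leaving the counter untouched for other event types, are also assumptions.

```python
def max_tool_chain(events):
    """Return the longest run of tool calls not interrupted by a human message."""
    longest = current = 0
    for event in events:
        kind = event.get("type")
        if kind == "tool_use":
            # Each tool_use is one call; its tool_result does not add to the count.
            current += 1
            longest = max(longest, current)
        elif kind == "user_message":
            # A human-role, non-tool message breaks the chain.
            current = 0
    return longest


def cost_per_minute(cost_in_window, window_sec):
    """Cost velocity: cost accumulated in the window, scaled to one minute."""
    return cost_in_window / window_sec * 60


events = [
    {"type": "tool_use"},
    {"type": "tool_result"},
    {"type": "tool_use"},
    {"type": "tool_result"},
    {"type": "user_message"},
    {"type": "tool_use"},
]
chain = max_tool_chain(events)
rate = cost_per_minute(0.20, 120)
```

Here the chain length is 2 (the human message resets the counter before the third call), and $0.20 spent in a 120s window works out to the default $0.10/min threshold.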

New endpoints:
  GET  /api/alerts/velocity          — real-time velocity metrics + breach flags
  GET  /api/alerts/velocity/config   — read velocity config
  POST /api/alerts/velocity/config   — update thresholds

Alert routing:
  - All breaches call _fire_alert() → banner + telegram channels
  - Token and cost velocity also dispatch to configured webhooks
  - 30-min cooldown per rule_id via existing _budget_alert_cooldowns
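A per-rule cooldown of this kind can be sketched as below. The internal structure of `_budget_alert_cooldowns` is not shown in this PR, so the sketch assumes a plain dict mapping `rule_id` to the last fire time; `should_fire` and `COOLDOWN_SEC` are hypothetical names.

```python
import time

COOLDOWN_SEC = 30 * 60  # 30-minute cooldown, as stated in the commit message


def should_fire(rule_id, cooldowns, now=None):
    """Return True and record the fire time unless rule_id is still cooling down."""
    now = time.time() if now is None else now
    last = cooldowns.get(rule_id)
    if last is not None and now - last < COOLDOWN_SEC:
        return False
    cooldowns[rule_id] = now
    return True


cooldowns = {}
first = should_fire("token_velocity", cooldowns, now=0.0)
repeat = should_fire("token_velocity", cooldowns, now=1799.0)
other_rule = should_fire("cost_velocity", cooldowns, now=10.0)
after_cooldown = should_fire("token_velocity", cooldowns, now=1800.0)
```

Keying the cooldown by `rule_id` means a suppressed token-velocity alert does not hold back a cost-velocity alert raised in the same window.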

Wired into _budget_monitor_loop() (runs every 60s) — lightweight because
it skips JSONL files not modified in the last window + 60s.

Tests: 7 tests in TestTokenVelocity, all passing in ~1s
…naway loops (closes #313)

Wire _compute_token_velocity + _check_velocity_alerts into the active _budget_monitor_loop
(second app block) so velocity thresholds are evaluated every 60s alongside existing
anomaly checks.

Add a dedicated red velocity banner in the dashboard HTML with:
- Live token/cost/tool-chain breach message updated every 30s
- 'Kill Loop' button that pauses the gateway via /api/budget/pause
- Dismiss with 5-minute snooze before re-checking

The backend velocity computation (_compute_token_velocity) and config API
(/api/alerts/velocity, /api/alerts/velocity/config) were already present in the
codebase but were not integrated into the monitor loop or surfaced in the UI.