feat: Token velocity alert - detect runaway agent loops (closes #313) #358
Open
vivekchand wants to merge 2 commits into main
Sliding-window token velocity monitoring with three detection signals:

1. **Token velocity** (sliding 2-min window, default 10,000 tokens):
   - Scans JSONL session files for events in the last N seconds
   - Sums token counts across all active sessions
   - Default window: 120s, configurable 30-600s via API
2. **Consecutive tool-call chain** (default: 20):
   - Tracks unbroken sequences of tool_use/tool_result events
   - Resets count when a human-role message (non-tool) appears
   - Flags the specific session causing the chain
3. **Cost velocity** (default: $0.10/min):
   - Derived from token cost / window_sec × 60
   - Fires independently of token count

New endpoints:
- GET /api/alerts/velocity: real-time velocity metrics + breach flags
- GET /api/alerts/velocity/config: read velocity config
- POST /api/alerts/velocity/config: update thresholds

Alert routing:
- All breaches call _fire_alert() → banner + telegram channels
- Token and cost velocity also dispatch to configured webhooks
- 30-min cooldown per rule_id via existing _budget_alert_cooldowns

Wired into _budget_monitor_loop() (runs every 60s); lightweight because it skips JSONL files not modified in the last window + 60s.

Tests: 7 tests in TestTokenVelocity, all passing in ~1s
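The three signals described above can be illustrated with a minimal sketch over in-memory events. This is not the PR's `_compute_token_velocity()`; the `Event` fields, the `"user"` role name, and the choice to let non-human, non-tool events neither extend nor reset a chain are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Event:
    """Hypothetical, simplified shape of one parsed JSONL session event."""
    ts: float                   # event time, epoch seconds
    kind: str                   # "tool_use", "tool_result", or "message"
    role: Optional[str] = None  # "user" marks a human-role message (assumed name)
    tokens: int = 0
    cost: float = 0.0


def window_totals(events: Iterable[Event], now: float,
                  window_sec: int = 120) -> Tuple[int, float, float]:
    """Sum tokens and cost for events inside the sliding window.

    Returns (tokens_in_window, cost_in_window, cost_per_min), where
    cost_per_min = cost_in_window / window_sec * 60.
    """
    cutoff = now - window_sec
    tokens, cost = 0, 0.0
    for ev in events:
        if ev.ts >= cutoff:
            tokens += ev.tokens
            cost += ev.cost
    return tokens, cost, cost / window_sec * 60


def max_tool_chain(events: Iterable[Event]) -> int:
    """Longest unbroken run of tool events within one session.

    A human-role message resets the run; other events leave it unchanged
    (an assumption, since the PR text only specifies the human-role reset).
    """
    longest = current = 0
    for ev in events:
        if ev.kind in ("tool_use", "tool_result"):
            current += 1
            longest = max(longest, current)
        elif ev.role == "user":
            current = 0
    return longest
```

With the default 120s window, a session that spent $0.03 inside the window has a cost velocity of 0.03 / 120 × 60 = $0.015/min, well under the $0.10/min default.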
…naway loops (closes #313)

Wire _compute_token_velocity + _check_velocity_alerts into the active _budget_monitor_loop (second app block) so velocity thresholds are evaluated every 60s alongside existing anomaly checks.

Add a dedicated red velocity banner in the dashboard HTML with:
- Live token/cost/tool-chain breach message updated every 30s
- 'Kill Loop' button that pauses the gateway via /api/budget/pause
- Dismiss with 5-minute snooze before re-checking

The backend velocity computation (_compute_token_velocity) and config API (/api/alerts/velocity, /api/alerts/velocity/config) were already present in the codebase but were not integrated into the monitor loop or surfaced in the UI.
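Because the monitor loop re-evaluates thresholds every 60s, a sustained breach would re-alert on every pass without a cooldown. The sketch below shows one way a 30-minute per-rule cooldown could gate alerts. It is an illustration only: `should_fire` and its dictionary argument are hypothetical stand-ins, and the actual structure of `_budget_alert_cooldowns` is not shown in this PR text.

```python
import time
from typing import Dict, Optional

COOLDOWN_SEC = 30 * 60  # 30-minute cooldown per rule_id, as stated in the PR


def should_fire(rule_id: str, cooldowns: Dict[str, float],
                now: Optional[float] = None,
                cooldown_sec: int = COOLDOWN_SEC) -> bool:
    """Return True if an alert for rule_id may fire, recording the fire time.

    `cooldowns` maps rule_id -> epoch seconds of the last fired alert
    (a hypothetical stand-in for the existing cooldown store).
    """
    now = time.time() if now is None else now
    last = cooldowns.get(rule_id)
    if last is not None and now - last < cooldown_sec:
        return False  # still cooling down; suppress the repeat alert
    cooldowns[rule_id] = now
    return True
```

Keying the cooldown by rule means a token-velocity alert does not suppress a cost-velocity alert raised a minute later.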
Summary
Real-time runaway-loop detection via three independent sliding-window signals.
Detection Signals
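- Token velocity: `token_velocity_threshold` (default 10,000 tokens per 120s window)
- Consecutive tool-call chain: `tool_chain_threshold` (default 20)
- Cost velocity: `cost_velocity_threshold` (default $0.10/min)

Each signal is independently togglable and threshold-configurable.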
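As a rough illustration of per-signal toggles and thresholds, the sketch below turns a metrics dictionary into breach flags. The threshold keys and metric field names come from this PR; the `*_enabled` toggle keys, the `evaluate_breaches` helper, and the use of `>=` for the comparison are assumptions.

```python
from typing import Dict

# Thresholds use the PR's config keys and defaults.
# The *_enabled toggle keys are hypothetical names for illustration.
DEFAULT_CONFIG = {
    "token_velocity_threshold": 10000,
    "tool_chain_threshold": 20,
    "cost_velocity_threshold": 0.10,
    "token_velocity_enabled": True,
    "tool_chain_enabled": True,
    "cost_velocity_enabled": True,
}

# Maps each signal name to the metric field it is checked against.
SIGNAL_METRICS = {
    "token_velocity": "tokens_in_window",
    "tool_chain": "max_consecutive_tool_calls",
    "cost_velocity": "cost_per_min",
}


def evaluate_breaches(metrics: Dict[str, float],
                      config: Dict[str, object]) -> Dict[str, bool]:
    """Compare each enabled signal's metric against its threshold."""
    breaches = {}
    for signal, field in SIGNAL_METRICS.items():
        enabled = bool(config[f"{signal}_enabled"])
        breaches[signal] = enabled and metrics[field] >= config[f"{signal}_threshold"]
    return breaches


metrics = {
    "tokens_in_window": 3200,
    "cost_per_min": 0.000048,
    "max_consecutive_tool_calls": 4,
}
breaches = evaluate_breaches(metrics, DEFAULT_CONFIG)
alert_active = any(breaches.values())
```

With these low-traffic metrics every flag is false, so `alert_active` is false as well.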
New Endpoints
- GET /api/alerts/velocity: real-time velocity metrics + breach flags
- GET /api/alerts/velocity/config: read velocity config
- POST /api/alerts/velocity/config: update thresholds

Sample response:

    {
      "tokens_in_window": 3200,
      "cost_in_window": 0.000096,
      "cost_per_min": 0.000048,
      "max_consecutive_tool_calls": 4,
      "active_sessions": [...],
      "alert_active": false,
      "breaches": {"token_velocity": false, "tool_chain": false, "cost_velocity": false},
      "thresholds": {"token_velocity": 10000, "tool_chain": 20, "cost_velocity": 0.1},
      "window_sec": 120
    }

Implementation
- `_compute_token_velocity()`: scans JSONL session files for recent events; skips files not modified in window+60s (fast); counts tokens, cost, and consecutive tool calls per session
- `_check_velocity_alerts()`: checks each signal against config thresholds, calls `_fire_alert()` (banner + telegram) and `_dispatch_configured_webhooks()` for token/cost signals
- Both are wired into `_budget_monitor_loop()`, which runs every 60s

Tests
7 tests in `TestTokenVelocity`, all passing in ~1s (no gateway dependency)
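The file-skipping optimization noted under Implementation (ignore session files not modified within window + 60s) can be sketched with a simple mtime filter. This is an assumption-laden illustration rather than the PR's code: `recently_modified`, the flat `*.jsonl` directory layout, and the parameter names are hypothetical.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def recently_modified(session_dir: str, window_sec: int = 120,
                      slack_sec: int = 60,
                      now: Optional[float] = None) -> List[Path]:
    """Return JSONL files modified within window_sec + slack_sec of now.

    Older files cannot contain events inside the sliding window, so the
    scan skips them without opening or parsing them.
    """
    now = time.time() if now is None else now
    cutoff = now - (window_sec + slack_sec)
    return sorted(
        p for p in Path(session_dir).glob("*.jsonl")
        if os.stat(p).st_mtime >= cutoff
    )
```

Checking the mtime costs one `stat` call per file, which keeps each 60s monitor pass cheap even when many old session files accumulate.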