Use lower thinking/effort for cron and hook sessions#53
Closed
constkolesnyak wants to merge 7 commits into
Closed
Conversation
Gmail preprocessing was stripping all standalone URLs indiscriminately. Now only removes boilerplate (unsubscribe, social media, tracking pixels) and keeps actionable links (booking, messaging, payment, reply threads). Condense prompts updated to explicitly preserve actionable URLs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Notifications sent via NotificationService._deliver_telegram() bypassed TelegramChannel.send() and went directly through bot.send_message(), so they were never stored in _message_cache. When a user reacted to a notification, the reaction handler couldn't find the original text and only showed "[Reaction: 👎]" without context. Now _deliver_telegram caches sent messages via channel._cache_message() so reactions on notifications include the message text. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
anyio 4.13.0's CancelScope._deliver_cancellation sets should_retry=True unconditionally for every task in self._tasks, then reschedules itself via call_soon(). When every task in the scope is the *current* task, nothing gets cancelled but the callback re-queues on every event-loop tick — pinning one CPU core at 100% with ~45k epoll_pwait syscalls/sec. Observed on April 22 and again on April 23 (24h+ of 97% CPU, no work done). The existing _safe_disconnect() workaround in agent.engine only clears the stuck scope during client.disconnect(), so spins triggered by telegram polling / cron / live SDK requests weren't covered. The patch sets should_retry=True only when we actually delivered a cancel or the task is still waiting pickup (_must_cancel). Semantics otherwise match upstream byte-for-byte. Applied via nerve/__init__.py so any import path picks it up. Includes a regression test that exercises the exact pathological shape (scope whose only task is the current task) and asserts the scope stops rescheduling itself.
The April 23 patch (ec0f8f7) fixed the 96%-CPU hot loop where should_retry was set unconditionally, but left a narrower version of the same bug: the `_must_cancel` branch still set should_retry=True. When the current task itself sits with _must_cancel=True while running the cancel callback (observed in production today, nerve/SDK path), this re-queues _deliver_cancellation on every event-loop tick: before fix: 20% CPU, ~61k epoll_pwait/sec after fix: should be idle (~5% CPU range) Changes: - Skip the current task entirely — it cannot cancel itself from inside the callback it is running. - Drop should_retry=True in the _must_cancel branch — asyncio's Task.__step raises CancelledError when the task resumes, no retry needed from us. - should_retry is now True only when we actually called task.cancel() in this pass. - Add regression test that poisons the _must_cancel branch with a fake task. The existing "current-task-only" test passed without reproducing this variant because it didn't set _must_cancel.
Every cron run was failing with `API Error: 400 level "max" not supported, valid levels: low, medium, high` because the global `agent.effort=max` and `agent.thinking=max` were applied to cron sessions too. Cron sessions run on `cron_model` (Sonnet by default), which under Claude OAuth (subscription) caps non-flagship models at `high` and rejects `max`. With a persistent cron session this also poisons every subsequent run on the same SDK session id. Add dedicated `agent.cron_thinking` / `agent.cron_effort` settings (default `high`) and a `_select_thinking_effort` helper that picks the right pair based on session source. `cron` and `hook` get the overrides; everything else keeps the main settings, so interactive sessions still get the full thinking budget on the flagship model. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Contributor
Author
|
Wrong target repo, recreating against constkolesnyak/nerve |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Every cron run was failing with:
Root cause: nerve passed the global
agent.effort=maxandagent.thinking=maxto every session, including cron. Cron sessions run oncron_model(Sonnet by default), and under Claude OAuth (subscription)cli-proxy-apirejectslevel=maxfor non-flagship models — onlylow/medium/highare accepted. With a persistent cron session this poisons the SDK session id, so every subsequent run on the same cron job fails identically until the session is rotated.This was the second time this hit (first incident: April 21, 2026, same error in
inbox-processor).Fix
agent.cron_thinkingandagent.cron_effortfields (defaulthigh) — separate knobs for cron/hook sessions._select_thinking_effort(agent_config, source)helper that returns(thinking, effort)based on session source —cronandhookget the cron overrides, everything else keeps the main settings._build_options(...)so interactive sessions still get the full thinking budget on the flagship model, while cron jobs stay within what their proxy/provider accepts.Test plan
tests/test_engine_options.py— 13 new tests covering source routing, default values, custom overrides, and back-compat for configs that don't define the new fields344 passed, 2 skipped