feat: add context auto-compaction flow#187
Conversation
|
Thanks for the pull request. A maintainer will review it when available. Please keep the PR focused, explain the why in the description, and make sure local checks pass before requesting review. Contribution guide: https://github.com/AI-Shell-Team/aish/blob/main/CONTRIBUTING.md |
|
This pull request description looks incomplete. Please update the missing sections below before review. Missing items:
|
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR implements an automated context window management system that compresses conversation history when token usage approaches configured thresholds. It adds token budgeting logic, configurable compaction strategies (microcompact for low-value shell output, full-compact using LLM summarization), budget-aware message preparation in the LLM session, and shell integration with pre-send compaction and UI feedback across terminal and SSH modes. ChangesContext Auto-Compaction Feature
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/aish-shell/src/app.rs (1)
2280-2330:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftThread the resolved budget policy into SSH sub-sessions too.
These new SSH branches assume compaction events can fire, but the ad-hoc
LlmSessions used in the SSH callbacks still run withoutset_context_budget_policy(...). That leaves SSH turns on a different compaction policy than the main shell session, so the new UI can stay cold exactly where long remote transcripts are most likely.Also applies to: 2962-3011
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aish-shell/src/app.rs` around lines 2280 - 2330, The SSH callback branches that create ad-hoc LlmSession instances (the closures passed to session.set_event_callback and the SSH-related handlers that mirror this logic) do not apply the resolved context budget policy, so their compaction behavior can diverge; update each place that constructs or configures an LlmSession used for SSH/remote turns to call set_context_budget_policy(...) with the same budget policy object/value used by the main shell session (reuse the same policy variable or fetch it from the session config) so compaction events and UI behavior match; specifically modify the code paths that create those ad-hoc LlmSession(s) referenced alongside context_compaction_notice and the event callback so they invoke set_context_budget_policy before the session is used.
🧹 Nitpick comments (1)
crates/aish-shell/src/ai_handler.rs (1)
348-389: 💤 Low valueStep numbering drift after inserting the compaction step.
Inserting "Step 4: Compact persistent context" at line 348 left the surrounding comments inconsistent: lines 362 and 366 both say "Step 5", and Step 6 is skipped (jumps from Step 5 to Step 7 at line 378). Trivial to renumber while the change is fresh.
✏️ Suggested renumber
- // Step 5: Build context and system messages + // Step 5: Build context and system messages let context_messages = self.build_context_messages(); let system_message = self.system_message(); - // Step 5: Send to LLM + // Step 6: Send to LLM let process_result = self .llm_session .process_input( &question_processed, &context_messages, system_message.as_deref(), true, ) .await?; let response = process_result.text; - // Step 7: Store the exchange in context + // Step 7: Store the exchange in context🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aish-shell/src/ai_handler.rs` around lines 348 - 389, Comments around the new compaction step are misnumbered; update the inline step comments to be sequential (e.g., compact_context_before_send is Step 4, build_context_messages/system_message Step 5, llm_session.process_input Step 6, context_manager.add_message calls Step 7, auto_retain_memory Step 8, and persist_token_usage Step 9) so the comments match the actual flow; locate the block around compact_context_before_send, build_context_messages, llm_session.process_input, context_manager.add_message, auto_retain_memory, and persist_token_usage and rename the "Step X" comments accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/aish-context/src/manager.rs`:
- Around line 619-636: In compact_shell_content the check
trimmed.contains("path") is too broad and preserves unrelated lines; update the
retention logic in compact_shell_content to match the real metadata format
instead (e.g., look for path=" or the path attribute inside an <offload ...>
tag) rather than any occurrence of "path" — either change the predicate to
trimmed.contains("path=\"") or perform a simple parse for an <offload ...
path="..."> attribute so only true metadata lines are retained.
In `@crates/aish-llm/src/session.rs`:
- Around line 1479-1491: The function estimate_one_chat_message currently
ignores policy.enable_token_estimation because both branches compute the same
rough heuristic; update it to mirror ContextManager::estimate_tokens by using
tiktoken_rs when policy.enable_token_estimation is true and only falling back to
the (content_len + reasoning_len)/4 heuristic when false, or alternatively
remove the dead if and document it's a heuristic; specifically modify
estimate_one_chat_message to call the same tokenizer (e.g.,
tiktoken_rs::cl100k_base_singleton() usage as in
ContextManager::estimate_tokens) when enable_token_estimation is true and
otherwise return the existing heuristic, ensuring you reference
policy.enable_token_estimation, estimate_one_chat_message, and
ContextManager::estimate_tokens so the behavior is consistent.
- Around line 1183-1263: The end event can be emitted without a matching start;
update both prepare_messages_for_send and compact_context_before_send so that
whenever microcompact_chat_messages changes messages you call
emit_context_compaction_start(..., "microcompact") before running the
microcompact step (set emitted_compaction_start = true and compaction_mode =
"microcompact"), and keep the existing emit_context_compaction_end(...) path
unchanged so the start/end pair is always emitted; ensure you reference
emit_context_compaction_start, emit_context_compaction_end,
microcompact_chat_messages, emitted_compaction_start, compaction_mode,
prepare_messages_for_send and compact_context_before_send when applying the
change.
In `@crates/aish-shell/src/ai_handler.rs`:
- Around line 425-530: In compact_context_before_send: fix two issues — (1)
ensure lifecycle events are balanced by only emitting
emit_context_compaction_end when a matching emit_context_compaction_start was
emitted (use the emitted_compaction_start flag to gate the end-event) or emit a
symmetric start for microcompact runs (update the branch where
report.microcompact is set to call llm_session.emit_context_compaction_start
with "microcompact"); and (2) avoid incrementing the compact-failure fuse for
structural "no candidates" cases by changing the candidates.is_empty() branch in
the full-compact path to treat emptiness as a skip (do not call
context_manager.record_compact_failure(); optionally set
report.skipped_full_compact_due_to_fuse or another flag) so repeated
empty-candidate cycles do not trip compact_consecutive_failures(); update the
referenced symbols: compact_context_before_send, emitted_compaction_start,
emit_context_compaction_start, emit_context_compaction_end,
full_compact_candidate_messages, record_compact_failure, reset_compact_failures,
and skipped_full_compact_due_to_fuse.
In `@crates/aish-shell/src/app.rs`:
- Around line 292-293: The context budget policy computed by
context_budget_policy_from_config in AishShell::new is not re-applied after
config/model changes, so update flows (e.g., the /model handler,
run_setup_wizard, and any place calling self.ai_handler.update_model) must
recompute and reinstall the policy: call
context_budget_policy_from_config(&self.config) and then
llm_session.set_context_budget_policy(...) (or the equivalent on the
session/handler) immediately after update_model completes to replace stale
window/threshold values; ensure this same fix is applied at the other locations
referenced (around the existing lines where
llm_session.set_context_budget_policy is set and at the other noted spot).
- Around line 606-613: The compaction notice flag compaction_notice_shown_ref is
being set (swap(true,...)) before we verify and actually print the notice in the
LlmEventType::ContextCompactionEnd handler, which can suppress future valid
notices; change the flow so you only set compaction_notice_shown_ref to true
after you successfully obtain and print a message from
context_compaction_notice(&event). Concretely, keep
compaction_active_ref.store(false, Ordering::SeqCst) and animation_ref.stop()
as-is, then call context_compaction_notice(&event) and if Some(message) {
println!(...); compaction_notice_shown_ref.store(true, Ordering::SeqCst); }
(i.e., remove the early swap and only mark shown when printing). Apply the same
change to the other duplicate blocks referenced.
---
Outside diff comments:
In `@crates/aish-shell/src/app.rs`:
- Around line 2280-2330: The SSH callback branches that create ad-hoc LlmSession
instances (the closures passed to session.set_event_callback and the SSH-related
handlers that mirror this logic) do not apply the resolved context budget
policy, so their compaction behavior can diverge; update each place that
constructs or configures an LlmSession used for SSH/remote turns to call
set_context_budget_policy(...) with the same budget policy object/value used by
the main shell session (reuse the same policy variable or fetch it from the
session config) so compaction events and UI behavior match; specifically modify
the code paths that create those ad-hoc LlmSession(s) referenced alongside
context_compaction_notice and the event callback so they invoke
set_context_budget_policy before the session is used.
---
Nitpick comments:
In `@crates/aish-shell/src/ai_handler.rs`:
- Around line 348-389: Comments around the new compaction step are misnumbered;
update the inline step comments to be sequential (e.g.,
compact_context_before_send is Step 4, build_context_messages/system_message
Step 5, llm_session.process_input Step 6, context_manager.add_message calls Step
7, auto_retain_memory Step 8, and persist_token_usage Step 9) so the comments
match the actual flow; locate the block around compact_context_before_send,
build_context_messages, llm_session.process_input, context_manager.add_message,
auto_retain_memory, and persist_token_usage and rename the "Step X" comments
accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 549efe3b-a620-4bbd-b6db-8778a59843eb
📒 Files selected for processing (16)
CONFIGURATION.mdcrates/aish-config/src/model.rscrates/aish-context/src/budget.rscrates/aish-context/src/lib.rscrates/aish-context/src/manager.rscrates/aish-core/src/types.rscrates/aish-i18n/locales/de-DE.yamlcrates/aish-i18n/locales/en-US.yamlcrates/aish-i18n/locales/es-ES.yamlcrates/aish-i18n/locales/fr-FR.yamlcrates/aish-i18n/locales/ja-JP.yamlcrates/aish-i18n/locales/zh-CN.yamlcrates/aish-llm/src/session.rscrates/aish-llm/src/streaming.rscrates/aish-shell/src/ai_handler.rscrates/aish-shell/src/app.rs
| fn compact_shell_content(content: &str) -> Option<String> { | ||
| if content.len() <= 512 && !is_low_value_output(content) { | ||
| return None; | ||
| } | ||
| let mut retained = Vec::new(); | ||
| for line in content.lines() { | ||
| let trimmed = line.trim(); | ||
| if trimmed.starts_with("Command:") | ||
| || trimmed.starts_with("Exit code:") | ||
| || trimmed.contains("<return_code>") | ||
| || trimmed.contains("</return_code>") | ||
| || trimmed.contains("<offload>") | ||
| || trimmed.contains("</offload>") | ||
| || trimmed.contains("path") | ||
| { | ||
| retained.push(trimmed.to_string()); | ||
| } | ||
| } |
There was a problem hiding this comment.
Over-broad path substring match retains spurious lines during shell compaction.
trimmed.contains("path") matches any line that happens to contain the substring path anywhere — e.g., "pathway", "I dispatch a job", "Use the path to enlightenment", or random user prose. Other retention conditions in this loop use structured markers (Command:, Exit code:, <return_code>, <offload>), but path is a bare lowercase word that will preserve many non-metadata lines, weakening the microcompact effectiveness on noisy shell output.
Consider anchoring this match to the actual metadata format you intend to retain (likely the path="..." attribute inside <offload …>), e.g. trimmed.contains("path=\"") or a stricter <offload …path="…"> parse.
♻️ Suggested tightening
- || trimmed.contains("<offload>")
- || trimmed.contains("</offload>")
- || trimmed.contains("path")
+ || trimmed.contains("<offload>")
+ || trimmed.contains("</offload>")
+ || trimmed.contains("path=\"")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn compact_shell_content(content: &str) -> Option<String> { | |
| if content.len() <= 512 && !is_low_value_output(content) { | |
| return None; | |
| } | |
| let mut retained = Vec::new(); | |
| for line in content.lines() { | |
| let trimmed = line.trim(); | |
| if trimmed.starts_with("Command:") | |
| || trimmed.starts_with("Exit code:") | |
| || trimmed.contains("<return_code>") | |
| || trimmed.contains("</return_code>") | |
| || trimmed.contains("<offload>") | |
| || trimmed.contains("</offload>") | |
| || trimmed.contains("path") | |
| { | |
| retained.push(trimmed.to_string()); | |
| } | |
| } | |
| fn compact_shell_content(content: &str) -> Option<String> { | |
| if content.len() <= 512 && !is_low_value_output(content) { | |
| return None; | |
| } | |
| let mut retained = Vec::new(); | |
| for line in content.lines() { | |
| let trimmed = line.trim(); | |
| if trimmed.starts_with("Command:") | |
| || trimmed.starts_with("Exit code:") | |
| || trimmed.contains("<return_code>") | |
| || trimmed.contains("</return_code>") | |
| || trimmed.contains("<offload>") | |
| || trimmed.contains("</offload>") | |
| || trimmed.contains("path=\"") | |
| { | |
| retained.push(trimmed.to_string()); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aish-context/src/manager.rs` around lines 619 - 636, In
compact_shell_content the check trimmed.contains("path") is too broad and
preserves unrelated lines; update the retention logic in compact_shell_content
to match the real metadata format instead (e.g., look for path=" or the path
attribute inside an <offload ...> tag) rather than any occurrence of "path" —
either change the predicate to trimmed.contains("path=\"") or perform a simple
parse for an <offload ... path="..."> attribute so only true metadata lines are
retained.
| async fn prepare_messages_for_send(&self, mut messages: Vec<ChatMessage>) -> Vec<ChatMessage> { | ||
| let policy = &self.context_budget_policy; | ||
| if !policy.enabled { | ||
| return trim_messages(messages, self.max_context_tokens, 5); | ||
| } | ||
|
|
||
| let before_tokens = estimate_chat_tokens(&messages, policy); | ||
| let before_state = policy.state_for_tokens(before_tokens); | ||
| let mut compaction_changed = false; | ||
| let mut emitted_compaction_start = false; | ||
| let mut compaction_mode = "microcompact"; | ||
|
|
||
| if matches!( | ||
| before_state.pressure, | ||
| ContextPressureLevel::Warning | ||
| | ContextPressureLevel::AutoCompact | ||
| | ContextPressureLevel::Blocking | ||
| ) { | ||
| let report = microcompact_chat_messages(&mut messages, policy); | ||
| if report.changed_messages > 0 { | ||
| compaction_changed = true; | ||
| tracing::info!( | ||
| changed_messages = report.changed_messages, | ||
| reclaimed_tokens = report.reclaimed_tokens, | ||
| "send-path context microcompact completed" | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| let after_micro_tokens = estimate_chat_tokens(&messages, policy); | ||
| let after_micro_state = policy.state_for_tokens(after_micro_tokens); | ||
| if after_micro_state.is_above_auto_compact_threshold && policy.full_compact_enabled { | ||
| self.emit_context_compaction_start("send_path", "full_compact"); | ||
| emitted_compaction_start = true; | ||
| compaction_mode = "full_compact"; | ||
| let failures = *self.compact_consecutive_failures.lock().unwrap(); | ||
| if failures >= policy.max_consecutive_failures { | ||
| tracing::warn!( | ||
| failures, | ||
| "send-path full compact skipped because failure fuse is open" | ||
| ); | ||
| } else { | ||
| match self | ||
| .full_compact_chat_messages_with_model(messages.clone(), policy) | ||
| .await | ||
| { | ||
| Ok(compacted) => { | ||
| *self.compact_consecutive_failures.lock().unwrap() = 0; | ||
| compaction_changed = true; | ||
| messages = compacted; | ||
| } | ||
| Err(err) => { | ||
| let mut guard = self.compact_consecutive_failures.lock().unwrap(); | ||
| *guard = (*guard).saturating_add(1); | ||
| tracing::warn!( | ||
| error = %err, | ||
| failures = *guard, | ||
| "send-path full compact failed; falling back to final trim" | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if emitted_compaction_start || compaction_changed { | ||
| let after_tokens = estimate_chat_tokens(&messages, policy); | ||
| self.emit_context_compaction_end( | ||
| "send_path", | ||
| compaction_mode, | ||
| before_tokens, | ||
| after_tokens, | ||
| compaction_changed, | ||
| ); | ||
| } | ||
|
|
||
| trim_messages( | ||
| messages, | ||
| self.max_context_tokens, | ||
| policy.micro_keep_recent_messages.max(5), | ||
| ) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Locate UI handlers for ContextCompactionStart/End to confirm whether
# they rely on strict start/end pairing.
rg -nP -C3 'ContextCompactionStart|ContextCompactionEnd'Repository: AI-Shell-Team/aish
Length of output: 5118
🏁 Script executed:
# Verify the code at the exact lines mentioned in session.rs
sed -n '1183,1263p' crates/aish-llm/src/session.rs | cat -nRepository: AI-Shell-Team/aish
Length of output: 3956
🏁 Script executed:
# Search for the parallel issue in ai_handler.rs
rg -n 'compact_context_before_send|ContextCompactionStart|ContextCompactionEnd' crates/aish-llm/src/ai_handler.rs -A5 -B5Repository: AI-Shell-Team/aish
Length of output: 136
🏁 Script executed:
# Check if ai_handler.rs exists and has the mentioned function
fd 'ai_handler' crates/aish-llm/src/ -type fRepository: AI-Shell-Team/aish
Length of output: 233
🏁 Script executed:
# Search for the function mentioned in the review
rg 'compact_context_before_send' crates/ --type rustRepository: AI-Shell-Team/aish
Length of output: 363
🏁 Script executed:
# List all files in aish-llm/src to see what files exist
ls -la crates/aish-llm/src/Repository: AI-Shell-Team/aish
Length of output: 1030
🏁 Script executed:
# Search for any other emit_context_compaction patterns
rg 'emit_context_compaction' crates/ --type rust -B5 -A10Repository: AI-Shell-Team/aish
Length of output: 6278
🏁 Script executed:
# Get the full compact_context_before_send function
rg -n 'async fn compact_context_before_send' crates/aish-shell/src/ai_handler.rs -A 100Repository: AI-Shell-Team/aish
Length of output: 5014
ContextCompactionEnd may be emitted without a matching Start in both prepare_messages_for_send and compact_context_before_send.
Both session.rs::prepare_messages_for_send and ai_handler.rs::compact_context_before_send exhibit the same asymmetry:
emit_context_compaction_start()is only called when full_compact conditions are metemit_context_compaction_end()fires whenemitted_compaction_start || compaction_changed
When microcompact modifies messages but full_compact is skipped (pressure is Warning but below auto_compact threshold), an end event is emitted with mode="microcompact" without a preceding start event. The UI handlers in app.rs expect paired start/end events to manage animation and state—calling stop without start causes asymmetric state updates.
Two reasonable fixes:
- Emit
ContextCompactionStartwithmode="microcompact"before microcompact runs, or - Only emit the end event when
emitted_compaction_startis true
Address both locations consistently to avoid future drift.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aish-llm/src/session.rs` around lines 1183 - 1263, The end event can
be emitted without a matching start; update both prepare_messages_for_send and
compact_context_before_send so that whenever microcompact_chat_messages changes
messages you call emit_context_compaction_start(..., "microcompact") before
running the microcompact step (set emitted_compaction_start = true and
compaction_mode = "microcompact"), and keep the existing
emit_context_compaction_end(...) path unchanged so the start/end pair is always
emitted; ensure you reference emit_context_compaction_start,
emit_context_compaction_end, microcompact_chat_messages,
emitted_compaction_start, compaction_mode, prepare_messages_for_send and
compact_context_before_send when applying the change.
| async fn compact_context_before_send( | ||
| &mut self, | ||
| plan_state: &PlanModeState, | ||
| ) -> ContextCompactReport { | ||
| let before_state = self.context_manager.budget_state(); | ||
| let before_tokens = before_state.estimated_tokens; | ||
| let mut emitted_compaction_start = false; | ||
| let mut report = ContextCompactReport { | ||
| before_tokens, | ||
| pressure_before: Some(before_state.pressure), | ||
| ..ContextCompactReport::default() | ||
| }; | ||
|
|
||
| let policy = self.context_manager.budget_policy().clone(); | ||
| if !policy.enabled { | ||
| report.after_tokens = before_tokens; | ||
| report.pressure_after = Some(before_state.pressure); | ||
| return report; | ||
| } | ||
|
|
||
| if matches!( | ||
| before_state.pressure, | ||
| ContextPressureLevel::Warning | ||
| | ContextPressureLevel::AutoCompact | ||
| | ContextPressureLevel::Blocking | ||
| ) { | ||
| report.microcompact = self.context_manager.microcompact(); | ||
| } | ||
|
|
||
| let after_micro_state = self.context_manager.budget_state(); | ||
| if after_micro_state.is_above_auto_compact_threshold && policy.full_compact_enabled { | ||
| self.llm_session | ||
| .emit_context_compaction_start("persistent_context", "full_compact"); | ||
| emitted_compaction_start = true; | ||
| if self.context_manager.compact_consecutive_failures() | ||
| >= policy.max_consecutive_failures | ||
| { | ||
| report.skipped_full_compact_due_to_fuse = true; | ||
| tracing::warn!( | ||
| failures = self.context_manager.compact_consecutive_failures(), | ||
| "persistent context model compact skipped because failure fuse is open" | ||
| ); | ||
| } else { | ||
| let candidates = self.context_manager.full_compact_candidate_messages(); | ||
| if candidates.is_empty() { | ||
| self.context_manager.record_compact_failure(); | ||
| } else { | ||
| match self | ||
| .llm_session | ||
| .summarize_context_messages( | ||
| &candidates, | ||
| Some(plan_state), | ||
| policy.summary_max_tokens, | ||
| ) | ||
| .await | ||
| { | ||
| Ok(summary) => { | ||
| match self.context_manager.apply_full_compact_summary(summary) { | ||
| Ok(full_report) => { | ||
| self.context_manager.reset_compact_failures(); | ||
| report.full_compact = Some(full_report); | ||
| } | ||
| Err(err) => { | ||
| self.context_manager.record_compact_failure(); | ||
| tracing::warn!( | ||
| error = %err, | ||
| failures = self.context_manager.compact_consecutive_failures(), | ||
| "persistent context model compact summary could not be applied" | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| Err(err) => { | ||
| self.context_manager.record_compact_failure(); | ||
| tracing::warn!( | ||
| error = %err, | ||
| failures = self.context_manager.compact_consecutive_failures(), | ||
| "persistent context model compact failed; falling back to final trim" | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| let after_state = self.context_manager.budget_state(); | ||
| report.after_tokens = after_state.estimated_tokens; | ||
| report.pressure_after = Some(after_state.pressure); | ||
| let compaction_changed = | ||
| report.microcompact.changed_messages > 0 || report.full_compact.is_some(); | ||
| if emitted_compaction_start || compaction_changed { | ||
| let mode = if report.full_compact.is_some() { | ||
| "full_compact" | ||
| } else { | ||
| "microcompact" | ||
| }; | ||
| self.llm_session.emit_context_compaction_end( | ||
| "persistent_context", | ||
| mode, | ||
| before_tokens, | ||
| report.after_tokens, | ||
| compaction_changed, | ||
| ); | ||
| } | ||
| report | ||
| } |
There was a problem hiding this comment.
Lifecycle event mismatch and over-eager fuse trip on empty candidate set.
Two related concerns in compact_context_before_send:
-
Orphan
ContextCompactionEndevent for microcompact-only path.emit_context_compaction_start("persistent_context", "full_compact")is only called inside the full-compact branch (line 456-457), but the end-event guard at line 515 fires whenevercompaction_changedis true — including microcompact-only runs. The UI (app.rs in this stack) will then see an end event with no preceding start. Same pattern exists insession.rs::prepare_messages_for_send; fix both consistently (either emit a symmetric start for microcompact, or only emit end whenemitted_compaction_start). -
Empty candidates should not tick the failure fuse. At line 469-470,
candidates.is_empty()callsrecord_compact_failure(). "No old messages to compact" is a structural condition (e.g., recent budget pressure on a small number of large messages wheremicro_keep_recent_messages >= len) — not a transient failure. Repeated turns in this state will trip the fuse and permanently disable full compaction (reset_compact_failures()only runs on success, which is unreachable when candidates stay empty). Consider treating empty-candidates as a skip without incrementing the fuse counter.
♻️ Proposed fix for the empty-candidates handling
let candidates = self.context_manager.full_compact_candidate_messages();
if candidates.is_empty() {
- self.context_manager.record_compact_failure();
+ tracing::debug!(
+ "persistent context full compact skipped: no candidate messages"
+ );
} else {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aish-shell/src/ai_handler.rs` around lines 425 - 530, In
compact_context_before_send: fix two issues — (1) ensure lifecycle events are
balanced by only emitting emit_context_compaction_end when a matching
emit_context_compaction_start was emitted (use the emitted_compaction_start flag
to gate the end-event) or emit a symmetric start for microcompact runs (update
the branch where report.microcompact is set to call
llm_session.emit_context_compaction_start with "microcompact"); and (2) avoid
incrementing the compact-failure fuse for structural "no candidates" cases by
changing the candidates.is_empty() branch in the full-compact path to treat
emptiness as a skip (do not call context_manager.record_compact_failure();
optionally set report.skipped_full_compact_due_to_fuse or another flag) so
repeated empty-candidate cycles do not trip compact_consecutive_failures();
update the referenced symbols: compact_context_before_send,
emitted_compaction_start, emit_context_compaction_start,
emit_context_compaction_end, full_compact_candidate_messages,
record_compact_failure, reset_compact_failures, and
skipped_full_compact_due_to_fuse.
| let context_budget_policy = context_budget_policy_from_config(&config); | ||
| llm_session.set_context_budget_policy(context_budget_policy.clone()); |
There was a problem hiding this comment.
Reapply the resolved budget policy after model/config changes.
This policy is only computed during AishShell::new(). Later flows like /model and run_setup_wizard() mutate self.config and call self.ai_handler.update_model(...), but never recompute/reinstall the budget policy, so both the session and handler keep stale window/threshold values after a model switch.
Also applies to: 955-956
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aish-shell/src/app.rs` around lines 292 - 293, The context budget
policy computed by context_budget_policy_from_config in AishShell::new is not
re-applied after config/model changes, so update flows (e.g., the /model
handler, run_setup_wizard, and any place calling self.ai_handler.update_model)
must recompute and reinstall the policy: call
context_budget_policy_from_config(&self.config) and then
llm_session.set_context_budget_policy(...) (or the equivalent on the
session/handler) immediately after update_model completes to replace stale
window/threshold values; ensure this same fix is applied at the other locations
referenced (around the existing lines where
llm_session.set_context_budget_policy is set and at the other noted spot).
| LlmEventType::ContextCompactionEnd => { | ||
| compaction_active_ref.store(false, Ordering::SeqCst); | ||
| animation_ref.stop(); | ||
| if !compaction_notice_shown_ref.swap(true, Ordering::SeqCst) { | ||
| if let Some(message) = context_compaction_notice(&event) { | ||
| println!("\x1b[2m{}\x1b[0m", message); | ||
| } | ||
| } |
There was a problem hiding this comment.
Only mark the compaction notice as shown when you actually print it.
Right now swap(true, ...) happens before checking changed, so a no-op ContextCompactionEnd (changed = false) suppresses the first later successful completion notice in the same turn.
Suggested fix
- if !compaction_notice_shown_ref.swap(true, Ordering::SeqCst) {
- if let Some(message) = context_compaction_notice(&event) {
- println!("\x1b[2m{}\x1b[0m", message);
- }
- }
+ if let Some(message) = context_compaction_notice(&event) {
+ if !compaction_notice_shown_ref.swap(true, Ordering::SeqCst) {
+ println!("\x1b[2m{}\x1b[0m", message);
+ }
+ }Also applies to: 2319-2329, 3000-3010
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aish-shell/src/app.rs` around lines 606 - 613, The compaction notice
flag compaction_notice_shown_ref is being set (swap(true,...)) before we verify
and actually print the notice in the LlmEventType::ContextCompactionEnd handler,
which can suppress future valid notices; change the flow so you only set
compaction_notice_shown_ref to true after you successfully obtain and print a
message from context_compaction_notice(&event). Concretely, keep
compaction_active_ref.store(false, Ordering::SeqCst) and animation_ref.stop()
as-is, then call context_compaction_notice(&event) and if Some(message) {
println!(...); compaction_notice_shown_ref.store(true, Ordering::SeqCst); }
(i.e., remove the early swap and only mark shown when printing). Apply the same
change to the other duplicate blocks referenced.
…ontent parsing The auto-compaction flow (AI-Shell-Team#187) introduced compact_context_before_send into the error correction path, which could clear shell output containing error details before the LLM analyzes them. This made error correction unreliable. Also fixes extract_message_text to preserve original string content instead of trimming, and unifies the tool-call path in session.rs to use the same parser.
Background
This PR adds the first end-to-end context auto-compaction flow on the Rust branch and includes the parser normalization needed to avoid empty compact summaries when providers return array-form content blocks.
Changes
message.content/delta.contentin the LLM response parserValidation
cargo test -p aish-llm streaming::tests:: -- --nocapturecargo test -p aish-context --lib -- --nocapturecargo test -p aish-llm test_prepare_messages_for_send_triggers_microcompact_without_model -- --nocapturecargo build -p aish-cliRisk
Main risk is behavior tuning around compaction thresholds and how aggressively older shell/tool output is collapsed under pressure.
Summary by CodeRabbit
Release Notes
New Features
Documentation
Localization