Skip to content

feat: add context auto-compaction flow#187

Merged
F16shen merged 4 commits into
AI-Shell-Team:rustfrom
F16shen:feat/context-auto-compaction
May 15, 2026
Merged

feat: add context auto-compaction flow#187
F16shen merged 4 commits into
AI-Shell-Team:rustfrom
F16shen:feat/context-auto-compaction

Conversation

@F16shen
Copy link
Copy Markdown
Collaborator

@F16shen F16shen commented May 15, 2026

Background

This PR adds the first end-to-end context auto-compaction flow on the Rust branch and includes the parser normalization needed to avoid empty compact summaries when providers return array-form content blocks.

Changes

  • normalize string and array-form message.content / delta.content in the LLM response parser
  • add context budget policy calculation, pressure levels, and full/micro compaction behavior
  • compact old shell/tool output while preserving recent context needed for follow-up turns
  • emit compaction lifecycle events and wire the shell/app UI, config, docs, and i18n strings

Validation

  • cargo test -p aish-llm streaming::tests:: -- --nocapture
  • cargo test -p aish-context --lib -- --nocapture
  • cargo test -p aish-llm test_prepare_messages_for_send_triggers_microcompact_without_model -- --nocapture
  • cargo build -p aish-cli

Risk

Main risk is behavior tuning around compaction thresholds and how aggressively older shell/tool output is collapsed under pressure.

Summary by CodeRabbit

Release Notes

  • New Features

    • Automatic context compaction with configurable compression strategies
    • Context window budget management and pressure level monitoring
    • Real-time status notifications during context optimization
  • Documentation

    • Added comprehensive configuration guide for context auto-compact settings
  • Localization

    • Added context compaction UI messages in German, Spanish, French, Japanese, and Chinese

Review Change Stack

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the pull request. A maintainer will review it when available.

Please keep the PR focused, explain the why in the description, and make sure local checks pass before requesting review.

Contribution guide: https://github.com/AI-Shell-Team/aish/blob/main/CONTRIBUTING.md

@github-actions
Copy link
Copy Markdown
Contributor

This pull request description looks incomplete. Please update the missing sections below before review.

Missing items:

  • Summary
  • User-visible Changes
  • Compatibility
  • Testing
  • Change Type
  • Scope

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@F16shen has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 42 minutes and 26 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: f64b8afb-9ff8-4910-9604-7aec86aea411

📥 Commits

Reviewing files that changed from the base of the PR and between f7c2706 and 305914f.

📒 Files selected for processing (3)
  • crates/aish-context/src/budget.rs
  • crates/aish-llm/src/session.rs
  • crates/aish-shell/src/app.rs
📝 Walkthrough

Walkthrough

This PR implements an automated context window management system that compresses conversation history when token usage approaches configured thresholds. It adds token budgeting logic, configurable compaction strategies (microcompact for low-value shell output, full-compact using LLM summarization), budget-aware message preparation in the LLM session, and shell integration with pre-send compaction and UI feedback across terminal and SSH modes.

Changes

Context Auto-Compaction Feature

Layer / File(s) Summary
Configuration schema and defaults
CONFIGURATION.md, crates/aish-config/src/model.rs
Introduces ContextAutoCompactConfig with fields for token windows, buffer thresholds, microcompact keep-recent counts, full-compact fuse, and model-specific context windows; wires serde defaults; documents in configuration guide.
Budget policy and window resolution
crates/aish-context/src/budget.rs, crates/aish-context/src/lib.rs
Implements token-budget model with ContextBudgetPolicy, pressure-level enums, threshold calculation from effective window and buffers, and source-precedence window resolution (auto-compact override → model config → legacy budget → default).
Context manager compaction pipeline
crates/aish-context/src/manager.rs
Extends ContextManager with compact_for_send orchestration, microcompact to clear/compress low-value shell outputs, full-compact helpers (rewrite_with_summary), failure-fuse tracking, and budget-aware token estimation.
LLM session summary and message preparation
crates/aish-llm/src/session.rs
Adds prepare_messages_for_send routing between trim/microcompact/full-compact based on budget pressure; implements summarize_context_messages for LLM-driven conversation summaries; manages compaction state and events across request lifecycle.
Shell handler pre-send compaction
crates/aish-shell/src/ai_handler.rs
Integrates context budgeting into AiHandler; adds compact_context_before_send to trigger full compaction with plan-state context before handle_question/handle_error_correction send messages.
App budget resolution and UI event handling
crates/aish-shell/src/app.rs
Resolves effective ContextBudgetPolicy from config with model context windows and constraints; adds compaction state tracking and event-driven UI animations/notice messages in main terminal and SSH rendering threads.
Event types, streaming parser, and locale strings
crates/aish-core/src/types.rs, crates/aish-llm/src/streaming.rs, crates/aish-i18n/locales/*
Adds ContextCompactionStart/ContextCompactionEnd event variants; improves streaming parser to handle content as string or array; provides multi-locale compaction UI strings (en-US, es-ES, fr-FR, de-DE, ja-JP, zh-CN).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • AI-Shell-Team/aish#131: Overlaps at error-correction code path in crates/aish-shell/src/ai_handler.rs that both PRs modify.

Suggested labels

size: XL, tests, docs, experienced-contributor

Poem

🐰 A bunny hops through tokens bright,
Compacting context left and right,
When pressure mounts and windows shrink,
The summarizer's here—quick as a blink!
Now conversations fit just right. 🌙

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.77% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add context auto-compaction flow' clearly and specifically summarizes the main change—implementing an end-to-end context auto-compaction feature. It is concise, uses standard commit convention, and directly relates to the primary objective of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/aish-shell/src/app.rs (1)

2280-2330: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Thread the resolved budget policy into SSH sub-sessions too.

These new SSH branches assume compaction events can fire, but the ad-hoc LlmSessions used in the SSH callbacks still run without set_context_budget_policy(...). That leaves SSH turns on a different compaction policy than the main shell session, so the new UI can stay cold exactly where long remote transcripts are most likely.

Also applies to: 2962-3011

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-shell/src/app.rs` around lines 2280 - 2330, The SSH callback
branches that create ad-hoc LlmSession instances (the closures passed to
session.set_event_callback and the SSH-related handlers that mirror this logic)
do not apply the resolved context budget policy, so their compaction behavior
can diverge; update each place that constructs or configures an LlmSession used
for SSH/remote turns to call set_context_budget_policy(...) with the same budget
policy object/value used by the main shell session (reuse the same policy
variable or fetch it from the session config) so compaction events and UI
behavior match; specifically modify the code paths that create those ad-hoc
LlmSession(s) referenced alongside context_compaction_notice and the event
callback so they invoke set_context_budget_policy before the session is used.
🧹 Nitpick comments (1)
crates/aish-shell/src/ai_handler.rs (1)

348-389: 💤 Low value

Step numbering drift after inserting the compaction step.

Inserting "Step 4: Compact persistent context" at line 348 left the surrounding comments inconsistent: lines 362 and 366 both say "Step 5", and Step 6 is skipped (jumps from Step 5 to Step 7 at line 378). Trivial to renumber while the change is fresh.

✏️ Suggested renumber
-        // Step 5: Build context and system messages
+        // Step 5: Build context and system messages
         let context_messages = self.build_context_messages();
         let system_message = self.system_message();
 
-        // Step 5: Send to LLM
+        // Step 6: Send to LLM
         let process_result = self
             .llm_session
             .process_input(
                 &question_processed,
                 &context_messages,
                 system_message.as_deref(),
                 true,
             )
             .await?;
         let response = process_result.text;
 
-        // Step 7: Store the exchange in context
+        // Step 7: Store the exchange in context
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-shell/src/ai_handler.rs` around lines 348 - 389, Comments around
the new compaction step are misnumbered; update the inline step comments to be
sequential (e.g., compact_context_before_send is Step 4,
build_context_messages/system_message Step 5, llm_session.process_input Step 6,
context_manager.add_message calls Step 7, auto_retain_memory Step 8, and
persist_token_usage Step 9) so the comments match the actual flow; locate the
block around compact_context_before_send, build_context_messages,
llm_session.process_input, context_manager.add_message, auto_retain_memory, and
persist_token_usage and rename the "Step X" comments accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/aish-context/src/manager.rs`:
- Around line 619-636: In compact_shell_content the check
trimmed.contains("path") is too broad and preserves unrelated lines; update the
retention logic in compact_shell_content to match the real metadata format
instead (e.g., look for path=" or the path attribute inside an <offload ...>
tag) rather than any occurrence of "path" — either change the predicate to
trimmed.contains("path=\"") or perform a simple parse for an <offload ...
path="..."> attribute so only true metadata lines are retained.

In `@crates/aish-llm/src/session.rs`:
- Around line 1479-1491: The function estimate_one_chat_message currently
ignores policy.enable_token_estimation because both branches compute the same
rough heuristic; update it to mirror ContextManager::estimate_tokens by using
tiktoken_rs when policy.enable_token_estimation is true and only falling back to
the (content_len + reasoning_len)/4 heuristic when false, or alternatively
remove the dead if and document it's a heuristic; specifically modify
estimate_one_chat_message to call the same tokenizer (e.g.,
tiktoken_rs::cl100k_base_singleton() usage as in
ContextManager::estimate_tokens) when enable_token_estimation is true and
otherwise return the existing heuristic, ensuring you reference
policy.enable_token_estimation, estimate_one_chat_message, and
ContextManager::estimate_tokens so the behavior is consistent.
- Around line 1183-1263: The end event can be emitted without a matching start;
update both prepare_messages_for_send and compact_context_before_send so that
whenever microcompact_chat_messages changes messages you call
emit_context_compaction_start(..., "microcompact") before running the
microcompact step (set emitted_compaction_start = true and compaction_mode =
"microcompact"), and keep the existing emit_context_compaction_end(...) path
unchanged so the start/end pair is always emitted; ensure you reference
emit_context_compaction_start, emit_context_compaction_end,
microcompact_chat_messages, emitted_compaction_start, compaction_mode,
prepare_messages_for_send and compact_context_before_send when applying the
change.

In `@crates/aish-shell/src/ai_handler.rs`:
- Around line 425-530: In compact_context_before_send: fix two issues — (1)
ensure lifecycle events are balanced by only emitting
emit_context_compaction_end when a matching emit_context_compaction_start was
emitted (use the emitted_compaction_start flag to gate the end-event) or emit a
symmetric start for microcompact runs (update the branch where
report.microcompact is set to call llm_session.emit_context_compaction_start
with "microcompact"); and (2) avoid incrementing the compact-failure fuse for
structural "no candidates" cases by changing the candidates.is_empty() branch in
the full-compact path to treat emptiness as a skip (do not call
context_manager.record_compact_failure(); optionally set
report.skipped_full_compact_due_to_fuse or another flag) so repeated
empty-candidate cycles do not trip compact_consecutive_failures(); update the
referenced symbols: compact_context_before_send, emitted_compaction_start,
emit_context_compaction_start, emit_context_compaction_end,
full_compact_candidate_messages, record_compact_failure, reset_compact_failures,
and skipped_full_compact_due_to_fuse.

In `@crates/aish-shell/src/app.rs`:
- Around line 292-293: The context budget policy computed by
context_budget_policy_from_config in AishShell::new is not re-applied after
config/model changes, so update flows (e.g., the /model handler,
run_setup_wizard, and any place calling self.ai_handler.update_model) must
recompute and reinstall the policy: call
context_budget_policy_from_config(&self.config) and then
llm_session.set_context_budget_policy(...) (or the equivalent on the
session/handler) immediately after update_model completes to replace stale
window/threshold values; ensure this same fix is applied at the other locations
referenced (around the existing lines where
llm_session.set_context_budget_policy is set and at the other noted spot).
- Around line 606-613: The compaction notice flag compaction_notice_shown_ref is
being set (swap(true,...)) before we verify and actually print the notice in the
LlmEventType::ContextCompactionEnd handler, which can suppress future valid
notices; change the flow so you only set compaction_notice_shown_ref to true
after you successfully obtain and print a message from
context_compaction_notice(&event). Concretely, keep
compaction_active_ref.store(false, Ordering::SeqCst) and animation_ref.stop()
as-is, then call context_compaction_notice(&event) and if Some(message) {
println!(...); compaction_notice_shown_ref.store(true, Ordering::SeqCst); }
(i.e., remove the early swap and only mark shown when printing). Apply the same
change to the other duplicate blocks referenced.

---

Outside diff comments:
In `@crates/aish-shell/src/app.rs`:
- Around line 2280-2330: The SSH callback branches that create ad-hoc LlmSession
instances (the closures passed to session.set_event_callback and the SSH-related
handlers that mirror this logic) do not apply the resolved context budget
policy, so their compaction behavior can diverge; update each place that
constructs or configures an LlmSession used for SSH/remote turns to call
set_context_budget_policy(...) with the same budget policy object/value used by
the main shell session (reuse the same policy variable or fetch it from the
session config) so compaction events and UI behavior match; specifically modify
the code paths that create those ad-hoc LlmSession(s) referenced alongside
context_compaction_notice and the event callback so they invoke
set_context_budget_policy before the session is used.

---

Nitpick comments:
In `@crates/aish-shell/src/ai_handler.rs`:
- Around line 348-389: Comments around the new compaction step are misnumbered;
update the inline step comments to be sequential (e.g.,
compact_context_before_send is Step 4, build_context_messages/system_message
Step 5, llm_session.process_input Step 6, context_manager.add_message calls Step
7, auto_retain_memory Step 8, and persist_token_usage Step 9) so the comments
match the actual flow; locate the block around compact_context_before_send,
build_context_messages, llm_session.process_input, context_manager.add_message,
auto_retain_memory, and persist_token_usage and rename the "Step X" comments
accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 549efe3b-a620-4bbd-b6db-8778a59843eb

📥 Commits

Reviewing files that changed from the base of the PR and between 352b631 and f7c2706.

📒 Files selected for processing (16)
  • CONFIGURATION.md
  • crates/aish-config/src/model.rs
  • crates/aish-context/src/budget.rs
  • crates/aish-context/src/lib.rs
  • crates/aish-context/src/manager.rs
  • crates/aish-core/src/types.rs
  • crates/aish-i18n/locales/de-DE.yaml
  • crates/aish-i18n/locales/en-US.yaml
  • crates/aish-i18n/locales/es-ES.yaml
  • crates/aish-i18n/locales/fr-FR.yaml
  • crates/aish-i18n/locales/ja-JP.yaml
  • crates/aish-i18n/locales/zh-CN.yaml
  • crates/aish-llm/src/session.rs
  • crates/aish-llm/src/streaming.rs
  • crates/aish-shell/src/ai_handler.rs
  • crates/aish-shell/src/app.rs

Comment on lines +619 to +636
fn compact_shell_content(content: &str) -> Option<String> {
if content.len() <= 512 && !is_low_value_output(content) {
return None;
}
let mut retained = Vec::new();
for line in content.lines() {
let trimmed = line.trim();
if trimmed.starts_with("Command:")
|| trimmed.starts_with("Exit code:")
|| trimmed.contains("<return_code>")
|| trimmed.contains("</return_code>")
|| trimmed.contains("<offload>")
|| trimmed.contains("</offload>")
|| trimmed.contains("path")
{
retained.push(trimmed.to_string());
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Over-broad path substring match retains spurious lines during shell compaction.

trimmed.contains("path") matches any line that happens to contain the substring path anywhere — e.g., "pathway", "I dispatch a job", "Use the path to enlightenment", or random user prose. Other retention conditions in this loop use structured markers (Command:, Exit code:, <return_code>, <offload>), but path is a bare lowercase word that will preserve many non-metadata lines, weakening the microcompact effectiveness on noisy shell output.

Consider anchoring this match to the actual metadata format you intend to retain (likely the path="..." attribute inside <offload …>), e.g. trimmed.contains("path=\"") or a stricter <offload …path="…"> parse.

♻️ Suggested tightening
-            || trimmed.contains("<offload>")
-            || trimmed.contains("</offload>")
-            || trimmed.contains("path")
+            || trimmed.contains("<offload>")
+            || trimmed.contains("</offload>")
+            || trimmed.contains("path=\"")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn compact_shell_content(content: &str) -> Option<String> {
if content.len() <= 512 && !is_low_value_output(content) {
return None;
}
let mut retained = Vec::new();
for line in content.lines() {
let trimmed = line.trim();
if trimmed.starts_with("Command:")
|| trimmed.starts_with("Exit code:")
|| trimmed.contains("<return_code>")
|| trimmed.contains("</return_code>")
|| trimmed.contains("<offload>")
|| trimmed.contains("</offload>")
|| trimmed.contains("path")
{
retained.push(trimmed.to_string());
}
}
fn compact_shell_content(content: &str) -> Option<String> {
if content.len() <= 512 && !is_low_value_output(content) {
return None;
}
let mut retained = Vec::new();
for line in content.lines() {
let trimmed = line.trim();
if trimmed.starts_with("Command:")
|| trimmed.starts_with("Exit code:")
|| trimmed.contains("<return_code>")
|| trimmed.contains("</return_code>")
|| trimmed.contains("<offload>")
|| trimmed.contains("</offload>")
|| trimmed.contains("path=\"")
{
retained.push(trimmed.to_string());
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-context/src/manager.rs` around lines 619 - 636, In
compact_shell_content the check trimmed.contains("path") is too broad and
preserves unrelated lines; update the retention logic in compact_shell_content
to match the real metadata format instead (e.g., look for path=" or the path
attribute inside an <offload ...> tag) rather than any occurrence of "path" —
either change the predicate to trimmed.contains("path=\"") or perform a simple
parse for an <offload ... path="..."> attribute so only true metadata lines are
retained.

Comment on lines +1183 to +1263
async fn prepare_messages_for_send(&self, mut messages: Vec<ChatMessage>) -> Vec<ChatMessage> {
let policy = &self.context_budget_policy;
if !policy.enabled {
return trim_messages(messages, self.max_context_tokens, 5);
}

let before_tokens = estimate_chat_tokens(&messages, policy);
let before_state = policy.state_for_tokens(before_tokens);
let mut compaction_changed = false;
let mut emitted_compaction_start = false;
let mut compaction_mode = "microcompact";

if matches!(
before_state.pressure,
ContextPressureLevel::Warning
| ContextPressureLevel::AutoCompact
| ContextPressureLevel::Blocking
) {
let report = microcompact_chat_messages(&mut messages, policy);
if report.changed_messages > 0 {
compaction_changed = true;
tracing::info!(
changed_messages = report.changed_messages,
reclaimed_tokens = report.reclaimed_tokens,
"send-path context microcompact completed"
);
}
}

let after_micro_tokens = estimate_chat_tokens(&messages, policy);
let after_micro_state = policy.state_for_tokens(after_micro_tokens);
if after_micro_state.is_above_auto_compact_threshold && policy.full_compact_enabled {
self.emit_context_compaction_start("send_path", "full_compact");
emitted_compaction_start = true;
compaction_mode = "full_compact";
let failures = *self.compact_consecutive_failures.lock().unwrap();
if failures >= policy.max_consecutive_failures {
tracing::warn!(
failures,
"send-path full compact skipped because failure fuse is open"
);
} else {
match self
.full_compact_chat_messages_with_model(messages.clone(), policy)
.await
{
Ok(compacted) => {
*self.compact_consecutive_failures.lock().unwrap() = 0;
compaction_changed = true;
messages = compacted;
}
Err(err) => {
let mut guard = self.compact_consecutive_failures.lock().unwrap();
*guard = (*guard).saturating_add(1);
tracing::warn!(
error = %err,
failures = *guard,
"send-path full compact failed; falling back to final trim"
);
}
}
}
}

if emitted_compaction_start || compaction_changed {
let after_tokens = estimate_chat_tokens(&messages, policy);
self.emit_context_compaction_end(
"send_path",
compaction_mode,
before_tokens,
after_tokens,
compaction_changed,
);
}

trim_messages(
messages,
self.max_context_tokens,
policy.micro_keep_recent_messages.max(5),
)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate UI handlers for ContextCompactionStart/End to confirm whether
# they rely on strict start/end pairing.
rg -nP -C3 'ContextCompactionStart|ContextCompactionEnd'

Repository: AI-Shell-Team/aish

Length of output: 5118


🏁 Script executed:

# Verify the code at the exact lines mentioned in session.rs
sed -n '1183,1263p' crates/aish-llm/src/session.rs | cat -n

Repository: AI-Shell-Team/aish

Length of output: 3956


🏁 Script executed:

# Search for the parallel issue in ai_handler.rs
rg -n 'compact_context_before_send|ContextCompactionStart|ContextCompactionEnd' crates/aish-llm/src/ai_handler.rs -A5 -B5

Repository: AI-Shell-Team/aish

Length of output: 136


🏁 Script executed:

# Check if ai_handler.rs exists and has the mentioned function
fd 'ai_handler' crates/aish-llm/src/ -type f

Repository: AI-Shell-Team/aish

Length of output: 233


🏁 Script executed:

# Search for the function mentioned in the review
rg 'compact_context_before_send' crates/ --type rust

Repository: AI-Shell-Team/aish

Length of output: 363


🏁 Script executed:

# List all files in aish-llm/src to see what files exist
ls -la crates/aish-llm/src/

Repository: AI-Shell-Team/aish

Length of output: 1030


🏁 Script executed:

# Search for any other emit_context_compaction patterns
rg 'emit_context_compaction' crates/ --type rust -B5 -A10

Repository: AI-Shell-Team/aish

Length of output: 6278


🏁 Script executed:

# Get the full compact_context_before_send function
rg -n 'async fn compact_context_before_send' crates/aish-shell/src/ai_handler.rs -A 100

Repository: AI-Shell-Team/aish

Length of output: 5014


ContextCompactionEnd may be emitted without a matching Start in both prepare_messages_for_send and compact_context_before_send.

Both session.rs::prepare_messages_for_send and ai_handler.rs::compact_context_before_send exhibit the same asymmetry:

  • emit_context_compaction_start() is only called when full_compact conditions are met
  • emit_context_compaction_end() fires when emitted_compaction_start || compaction_changed

When microcompact modifies messages but full_compact is skipped (pressure is Warning but below auto_compact threshold), an end event is emitted with mode="microcompact" without a preceding start event. The UI handlers in app.rs expect paired start/end events to manage animation and state—calling stop without start causes asymmetric state updates.

Two reasonable fixes:

  • Emit ContextCompactionStart with mode="microcompact" before microcompact runs, or
  • Only emit the end event when emitted_compaction_start is true

Address both locations consistently to avoid future drift.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-llm/src/session.rs` around lines 1183 - 1263, The end event can
be emitted without a matching start; update both prepare_messages_for_send and
compact_context_before_send so that whenever microcompact_chat_messages changes
messages you call emit_context_compaction_start(..., "microcompact") before
running the microcompact step (set emitted_compaction_start = true and
compaction_mode = "microcompact"), and keep the existing
emit_context_compaction_end(...) path unchanged so the start/end pair is always
emitted; ensure you reference emit_context_compaction_start,
emit_context_compaction_end, microcompact_chat_messages,
emitted_compaction_start, compaction_mode, prepare_messages_for_send and
compact_context_before_send when applying the change.

Comment thread crates/aish-llm/src/session.rs Outdated
Comment on lines +425 to +530
async fn compact_context_before_send(
&mut self,
plan_state: &PlanModeState,
) -> ContextCompactReport {
let before_state = self.context_manager.budget_state();
let before_tokens = before_state.estimated_tokens;
let mut emitted_compaction_start = false;
let mut report = ContextCompactReport {
before_tokens,
pressure_before: Some(before_state.pressure),
..ContextCompactReport::default()
};

let policy = self.context_manager.budget_policy().clone();
if !policy.enabled {
report.after_tokens = before_tokens;
report.pressure_after = Some(before_state.pressure);
return report;
}

if matches!(
before_state.pressure,
ContextPressureLevel::Warning
| ContextPressureLevel::AutoCompact
| ContextPressureLevel::Blocking
) {
report.microcompact = self.context_manager.microcompact();
}

let after_micro_state = self.context_manager.budget_state();
if after_micro_state.is_above_auto_compact_threshold && policy.full_compact_enabled {
self.llm_session
.emit_context_compaction_start("persistent_context", "full_compact");
emitted_compaction_start = true;
if self.context_manager.compact_consecutive_failures()
>= policy.max_consecutive_failures
{
report.skipped_full_compact_due_to_fuse = true;
tracing::warn!(
failures = self.context_manager.compact_consecutive_failures(),
"persistent context model compact skipped because failure fuse is open"
);
} else {
let candidates = self.context_manager.full_compact_candidate_messages();
if candidates.is_empty() {
self.context_manager.record_compact_failure();
} else {
match self
.llm_session
.summarize_context_messages(
&candidates,
Some(plan_state),
policy.summary_max_tokens,
)
.await
{
Ok(summary) => {
match self.context_manager.apply_full_compact_summary(summary) {
Ok(full_report) => {
self.context_manager.reset_compact_failures();
report.full_compact = Some(full_report);
}
Err(err) => {
self.context_manager.record_compact_failure();
tracing::warn!(
error = %err,
failures = self.context_manager.compact_consecutive_failures(),
"persistent context model compact summary could not be applied"
);
}
}
}
Err(err) => {
self.context_manager.record_compact_failure();
tracing::warn!(
error = %err,
failures = self.context_manager.compact_consecutive_failures(),
"persistent context model compact failed; falling back to final trim"
);
}
}
}
}
}

let after_state = self.context_manager.budget_state();
report.after_tokens = after_state.estimated_tokens;
report.pressure_after = Some(after_state.pressure);
let compaction_changed =
report.microcompact.changed_messages > 0 || report.full_compact.is_some();
if emitted_compaction_start || compaction_changed {
let mode = if report.full_compact.is_some() {
"full_compact"
} else {
"microcompact"
};
self.llm_session.emit_context_compaction_end(
"persistent_context",
mode,
before_tokens,
report.after_tokens,
compaction_changed,
);
}
report
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Lifecycle event mismatch and over-eager fuse trip on empty candidate set.

Two related concerns in compact_context_before_send:

  1. Orphan ContextCompactionEnd event for microcompact-only path. emit_context_compaction_start("persistent_context", "full_compact") is only called inside the full-compact branch (line 456-457), but the end-event guard at line 515 fires whenever compaction_changed is true — including microcompact-only runs. The UI (app.rs in this stack) will then see an end event with no preceding start. Same pattern exists in session.rs::prepare_messages_for_send; fix both consistently (either emit a symmetric start for microcompact, or only emit end when emitted_compaction_start).

  2. Empty candidates should not tick the failure fuse. At line 469-470, candidates.is_empty() calls record_compact_failure(). "No old messages to compact" is a structural condition (e.g., recent budget pressure on a small number of large messages where micro_keep_recent_messages >= len) — not a transient failure. Repeated turns in this state will trip the fuse and permanently disable full compaction (reset_compact_failures() only runs on success, which is unreachable when candidates stay empty). Consider treating empty-candidates as a skip without incrementing the fuse counter.

♻️ Proposed fix for the empty-candidates handling
             let candidates = self.context_manager.full_compact_candidate_messages();
             if candidates.is_empty() {
-                self.context_manager.record_compact_failure();
+                tracing::debug!(
+                    "persistent context full compact skipped: no candidate messages"
+                );
             } else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-shell/src/ai_handler.rs` around lines 425 - 530, In
compact_context_before_send: fix two issues — (1) ensure lifecycle events are
balanced by only emitting emit_context_compaction_end when a matching
emit_context_compaction_start was emitted (use the emitted_compaction_start flag
to gate the end-event) or emit a symmetric start for microcompact runs (update
the branch where report.microcompact is set to call
llm_session.emit_context_compaction_start with "microcompact"); and (2) avoid
incrementing the compact-failure fuse for structural "no candidates" cases by
changing the candidates.is_empty() branch in the full-compact path to treat
emptiness as a skip (do not call context_manager.record_compact_failure();
optionally set report.skipped_full_compact_due_to_fuse or another flag) so
repeated empty-candidate cycles do not trip compact_consecutive_failures();
update the referenced symbols: compact_context_before_send,
emitted_compaction_start, emit_context_compaction_start,
emit_context_compaction_end, full_compact_candidate_messages,
record_compact_failure, reset_compact_failures, and
skipped_full_compact_due_to_fuse.

Comment on lines +292 to +293
let context_budget_policy = context_budget_policy_from_config(&config);
llm_session.set_context_budget_policy(context_budget_policy.clone());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reapply the resolved budget policy after model/config changes.

This policy is only computed during AishShell::new(). Later flows like /model and run_setup_wizard() mutate self.config and call self.ai_handler.update_model(...), but never recompute/reinstall the budget policy, so both the session and handler keep stale window/threshold values after a model switch.

Also applies to: 955-956

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-shell/src/app.rs` around lines 292 - 293, The context budget
policy computed by context_budget_policy_from_config in AishShell::new is not
re-applied after config/model changes, so update flows (e.g., the /model
handler, run_setup_wizard, and any place calling self.ai_handler.update_model)
must recompute and reinstall the policy: call
context_budget_policy_from_config(&self.config) and then
llm_session.set_context_budget_policy(...) (or the equivalent on the
session/handler) immediately after update_model completes to replace stale
window/threshold values; ensure this same fix is applied at the other locations
referenced (around the existing lines where
llm_session.set_context_budget_policy is set and at the other noted spot).

Comment on lines +606 to +613
LlmEventType::ContextCompactionEnd => {
compaction_active_ref.store(false, Ordering::SeqCst);
animation_ref.stop();
if !compaction_notice_shown_ref.swap(true, Ordering::SeqCst) {
if let Some(message) = context_compaction_notice(&event) {
println!("\x1b[2m{}\x1b[0m", message);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Only mark the compaction notice as shown when you actually print it.

Right now swap(true, ...) happens before checking changed, so a no-op ContextCompactionEnd (changed = false) suppresses the first later successful completion notice in the same turn.

Suggested fix
- if !compaction_notice_shown_ref.swap(true, Ordering::SeqCst) {
-     if let Some(message) = context_compaction_notice(&event) {
-         println!("\x1b[2m{}\x1b[0m", message);
-     }
- }
+ if let Some(message) = context_compaction_notice(&event) {
+     if !compaction_notice_shown_ref.swap(true, Ordering::SeqCst) {
+         println!("\x1b[2m{}\x1b[0m", message);
+     }
+ }

Also applies to: 2319-2329, 3000-3010

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-shell/src/app.rs` around lines 606 - 613, The compaction notice
flag compaction_notice_shown_ref is being set (swap(true,...)) before we verify
and actually print the notice in the LlmEventType::ContextCompactionEnd handler,
which can suppress future valid notices; change the flow so you only set
compaction_notice_shown_ref to true after you successfully obtain and print a
message from context_compaction_notice(&event). Concretely, keep
compaction_active_ref.store(false, Ordering::SeqCst) and animation_ref.stop()
as-is, then call context_compaction_notice(&event) and if Some(message) {
println!(...); compaction_notice_shown_ref.store(true, Ordering::SeqCst); }
(i.e., remove the early swap and only mark shown when printing). Apply the same
change to the other duplicate blocks referenced.

@F16shen F16shen merged commit 7567317 into AI-Shell-Team:rust May 15, 2026
7 checks passed
@F16shen F16shen deleted the feat/context-auto-compaction branch May 15, 2026 07:38
jexShain added a commit to jexShain/aish that referenced this pull request May 19, 2026
…ontent parsing

The auto-compaction flow (AI-Shell-Team#187) introduced compact_context_before_send into the
error correction path, which could clear shell output containing error details
before the LLM analyzes them. This made error correction unreliable.

Also fixes extract_message_text to preserve original string content instead of
trimming, and unifies the tool-call path in session.rs to use the same parser.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant