feat: rewrite prompt templates, add prompt caching and tool-loop trimming by jexShain · Pull Request #188 · AI-Shell-Team/aish

jexShain · 2026-05-15T09:27:06Z

Summary

Overhaul oracle/cmd_error/error_detect prompts for better AI interaction
Add render_static_core/render_env_block for cache-friendly prompt splitting
Add inject_knowledge_stable for idempotent knowledge injection in context manager
Add Anthropic prompt caching (CacheControl) on system messages
Improve Langfuse observability: session-level trace, per-iteration spans
Add trim_tool_loop_messages to prevent unbounded context growth during agent loops
New guess_command prompt template

Test plan

Verify existing unit tests pass (cargo test in affected crates)
Verify prompt templates render correctly
Verify Langfuse traces group correctly per session
Verify tool-loop trimming works for long agent sessions

Summary by CodeRabbit

New Features
- Added prompt caching support for improved LLM response performance
- Enhanced long-term memory recall with source attribution formatting
Improvements
- Improved working directory tracking during command execution
- Better handling of whitespace-only AI responses
- Enhanced command result context injection and formatting
- Refined skill presentation in AI context
- More stable AI session management across multiple turns

…ming - Overhaul oracle/cmd_error/error_detect prompts for better AI interaction - Add render_static_core/render_env_block for cache-friendly prompt splitting - Add inject_knowledge_stable for idempotent knowledge injection - Add Anthropic prompt caching (CacheControl) on system messages - Improve Langfuse observability: session-level trace, per-iteration spans - Add trim_tool_loop_messages to prevent unbounded context growth - New guess_command prompt template

github-actions · 2026-05-15T09:27:19Z

Thanks for the pull request. A maintainer will review it when available.

Please keep the PR focused, explain the why in the description, and make sure local checks pass before requesting review.

Contribution guide: https://github.com/AI-Shell-Team/aish/blob/main/CONTRIBUTING.md

github-actions · 2026-05-15T09:27:19Z

This pull request description looks incomplete. Please update the missing sections below before review.

Missing items:

User-visible Changes
Compatibility
Testing
Change Type
Scope

coderabbitai · 2026-05-15T09:27:20Z

Warning

Rate limit exceeded

@jexShain has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 45 minutes and 32 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4ac8fcfa-6ef7-45de-b8f4-2b907f6def63

📥 Commits

Reviewing files that changed from the base of the PR and between 890ba24 and f8fa76e.

📒 Files selected for processing (4)

crates/aish-llm/src/session.rs
crates/aish-prompts/src/manager.rs
crates/aish-shell/src/ai_handler.rs
crates/aish-shell/src/app.rs

📝 Walkthrough

Walkthrough

This PR enhances the LLM interaction loop with stable prompt caching and improved session-level observability. It introduces idempotent knowledge injection, Anthropic cache-control headers, session-scoped Langfuse traces, context-growth limits for tool calls, split prompt rendering to preserve cache boundaries, and system-wide integration of CWD tracking and deterministic system info.

Changes

Stable Prompt Caching & Langfuse Session Tracing

Layer / File(s)	Summary
Stable Knowledge Injection Foundation `crates/aish-context/src/manager.rs`	`ContextManager::inject_knowledge_stable` detects content changes via `knowledge_cache` and idempotently updates or removes knowledge messages only when needed; new unit tests cover insertion, no-ops, replacement, and clearing scenarios.
Prompt Caching Infrastructure `crates/aish-llm/src/types.rs`, `crates/aish-llm/tests/llm_integration_test.rs`	New `CacheControl` type and optional `cache_control` field on `ChatMessage` enable Anthropic prompt caching; field is omitted from JSON when unset and includes ephemeral markers when set.
Langfuse Session-Level Tracing and Span Naming `crates/aish-llm/src/langfuse.rs`, `crates/aish-llm/src/session.rs`	Session-level trace ID created once per session and reused across turns; monotonic per-turn counter used for span naming; `span_generation` accepts `name: &str` parameter; system messages receive ephemeral cache-control on Anthropic models.
Context Management and Tool-Call Trimming `crates/aish-llm/src/session.rs`	New `trim_tool_loop_messages` helper removes old tool-call rounds while preserving stable prefix and leading system messages; trims both non-streaming and streaming response paths to prevent unbounded message growth.
Prompt Template Refactoring and Split Rendering `crates/aish-prompts/src/manager.rs`	New `render_static_core` and `render_env_block` methods split prompt rendering into session-static and per-call components; embedded prompts rewritten with new variable placeholders (`{{uname_info}}`, `{{basic_env_info}}`, etc.) for cache stability.
AI Handler: Knowledge Injection and System Info Helpers `crates/aish-shell/src/ai_handler.rs`	Splits system prompt into static core and env block for stable caching; integrates stable knowledge injection for memory recall and skills; adds cached system info helpers (`uname_info`, `basic_env_info`, `output_language`) using `OnceLock`.
App-Level Integration: CWD Tracking and Response Handling `crates/aish-shell/src/app.rs`	Whitespace-only responses treated as empty; builtin command results immediately injected into context; external command CWD changes tracked and synchronized; SSH prompt variables updated to use new system info and role helpers.

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly Related PRs

AI-Shell-Team/aish#131: Both PRs refactor Langfuse integration in aish-llm; #131 introduces langfuse-ergonomic while this PR adds session-level trace IDs and span naming on top of that foundation.

Suggested Labels

size: XL, experienced-contributor

Poem

🐰 Hop through caches with stable core,
Sessions traced from shore to shore,
Prompts split neat—static, then live,
Context trimmed, tools forgive,
Memory recalled, skills in view,
One cache to bind them all—how true!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the three main changes: prompt template rewrite, prompt caching support, and tool-loop message trimming—all of which are substantial components of this changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/aish-llm/src/session.rs (1)

1747-1756: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Potential slice panic when preserved-recent overlaps system prefix

At Line 1756, messages[system_count..recent_start] can panic when recent_start < system_count (possible with many leading system messages and large preserve_recent).

Proposed fix

-    let recent_start = messages.len().saturating_sub(preserve_recent);
+    let recent_start = messages
+        .len()
+        .saturating_sub(preserve_recent)
+        .max(system_count);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aish-llm/src/session.rs` around lines 1747 - 1756, The slice
messages[system_count..recent_start] can panic when recent_start < system_count;
fix by clamping the middle slice bounds before taking it — compute a safe
middle_start = system_count.min(recent_start) (or check if recent_start <=
system_count and set middle to empty) and then build middle from
messages[middle_start..recent_start]. Update any logic that relies on middle
(the variable named middle and the surrounding token calculations using
system_count, recent_start, preserve_recent, estimate_tokens, and max_tokens) so
you never create an out-of-bounds slice.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/aish-context/src/manager.rs`:
- Around line 875-933: The new tests for ContextManager (functions
inject_knowledge_stable_first_call_adds,
inject_knowledge_stable_same_content_is_noop,
inject_knowledge_stable_different_content_updates,
inject_knowledge_stable_empty_content_clears,
inject_knowledge_stable_empty_twice_is_noop) are not formatted to project
standard; run rustfmt (cargo fmt --all) to reformat the test block and commit
the resulting changes so the CI cargo fmt --check passes, ensuring any minor
whitespace/indentation differences around the inject_knowledge_stable tests are
fixed.
- Around line 440-444: The retain call currently removes any
MemoryType::Knowledge entry whose content equals old (cached.clone()) regardless
of tag; change the predicate in self.messages.retain to also match the tag
(e.g., compare m.tag to cached.tag) so only the exact tagged knowledge entry is
deleted—locate the retain usage around self.messages.retain, the variables
cached/old, and the MemoryType::Knowledge/m.content checks and add an additional
check for m.tag == cached.tag (or equivalent) to avoid deleting unrelated
knowledge entries.

In `@crates/aish-llm/src/session.rs`:
- Around line 260-277: The failing formatting is due to unformatted Rust code in
the modified block (and other ranges); run rustfmt by executing `cargo fmt
--all` (or `rustfmt`/`cargo +stable fmt`) and commit the changes so the sections
referencing langfuse, langfuse_session_id, langfuse_turn_counter, and the
trace_session call are properly formatted; ensure locks/unwrapping lines around
self.langfuse_session_id.lock().unwrap(), the fetch_add on
self.langfuse_turn_counter, and the async trace_session invocation keep
idiomatic spacing and line breaks after running the formatter.
- Around line 264-272: You're holding the std::sync::MutexGuard from
langfuse_session_id across an .await inside the async path (involving
langfuse.trace_session), which blocks other tasks; fix by not awaiting while the
mutex is held: acquire the lock and check if session id is None, and if so
take/drop the guard (e.g., let needs_init = session_id_guard.is_none();
drop(session_id_guard); if needs_init { let id =
langfuse.trace_session(...).await; let mut session_id_guard =
self.langfuse_session_id.lock().unwrap(); if session_id_guard.is_none() {
*session_id_guard = Some(id); } } — this ensures trace_session is awaited
without holding the mutex and still sets langfuse_session_id (use the symbols
langfuse, trace_session, langfuse_session_id, session_id_guard).

In `@crates/aish-shell/src/ai_handler.rs`:
- Around line 623-629: The code is double-wrapping the recall text with the
long-term-memory tags causing a duplicate closing tag when the prior branch
already appended </long-term-memory>; update the construction of content passed
to self.context_manager.inject_knowledge_stable so it does not add the wrapper
if text already contains the closing tag (i.e. check the string or use the
already-wrapped variable), ensuring inject_knowledge_stable("memory_recall",
&content) receives a single well-formed <long-term-memory>...</long-term-memory>
block; locate the content variable and the call to
self.context_manager.inject_knowledge_stable in ai_handler.rs to implement the
conditional or avoid re-wrapping.

In `@crates/aish-shell/src/app.rs`:
- Around line 1472-1483: The code injects raw shell output and cwd into XML-like
tags (see variables builtin_output, self.state.cwd and the constructed entry
passed to self.ai_handler.add_shell_context), which allows insertion of fake
tags; fix by creating a shared escaping helper (e.g., escape_for_xml_like or
similar) that replaces &, <, and > with safe entities and use it when formatting
both <output> and <cwd> fields (also apply the same helper at the other site
noted around lines 1855-1862) so all serialized shell data is escaped before
calling add_shell_context.

---

Outside diff comments:
In `@crates/aish-llm/src/session.rs`:
- Around line 1747-1756: The slice messages[system_count..recent_start] can
panic when recent_start < system_count; fix by clamping the middle slice bounds
before taking it — compute a safe middle_start = system_count.min(recent_start)
(or check if recent_start <= system_count and set middle to empty) and then
build middle from messages[middle_start..recent_start]. Update any logic that
relies on middle (the variable named middle and the surrounding token
calculations using system_count, recent_start, preserve_recent, estimate_tokens,
and max_tokens) so you never create an out-of-bounds slice.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 08783f97-976c-4ffc-83af-365f5b335a37

📥 Commits

Reviewing files that changed from the base of the PR and between 7567317 and 890ba24.

📒 Files selected for processing (8)

crates/aish-context/src/manager.rs
crates/aish-llm/src/langfuse.rs
crates/aish-llm/src/session.rs
crates/aish-llm/src/types.rs
crates/aish-llm/tests/llm_integration_test.rs
crates/aish-prompts/src/manager.rs
crates/aish-shell/src/ai_handler.rs
crates/aish-shell/src/app.rs

- Fix potential slice panic in trim_messages when recent_start < system_count - Fix double-closing </long-term-memory> tag in recall text truncation - Fix tool name mismatch: bash_exec/python_exec → bash in prompt templates - Run cargo fmt to fix CI formatting failures

github-actions Bot added size: XL experienced-contributor labels May 15, 2026

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

jexShain merged commit df4808b into AI-Shell-Team:rust May 15, 2026
6 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: rewrite prompt templates, add prompt caching and tool-loop trimming#188

feat: rewrite prompt templates, add prompt caching and tool-loop trimming#188
jexShain merged 2 commits into
AI-Shell-Team:rustfrom
jexShain:rust-prompt-caching

jexShain commented May 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

coderabbitai Bot commented May 15, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Possibly Related PRs

Suggested Labels

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jexShain commented May 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

github-actions Bot commented May 15, 2026

Uh oh!

coderabbitai Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Possibly Related PRs

Suggested Labels

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jexShain commented May 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 15, 2026 •

edited

Loading