[Refactor] Simplify infinite context compression system by dingyi222666 · Pull Request #774 · ChatLunaLab/chatluna

dingyi222666 · 2026-03-10T18:27:48Z

This PR refactors the infinite context compression system to be simpler and more straightforward.

New Features

Added manual compression command via chatluna.chat.compress with optional room selection and force flag
Added CompressContextResult interface to track compression metrics (input tokens, output tokens, reduction percentage)
Improved compression feedback with detailed metrics displayed in user-facing messages
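
The metrics listed above can be derived from two token counts. A minimal sketch, assuming the field names that appear in the review diff further down this thread (`inputTokens`, `outputTokens`, `reducedTokens`, `reducedPercent`, `compressed`); `buildCompressResult` is a hypothetical helper and the one-decimal rounding is an assumption, not the PR's code:

```typescript
// Field names follow the diff quoted in the review; the helper is hypothetical.
interface CompressContextResult {
    inputTokens: number
    outputTokens: number
    reducedTokens: number
    reducedPercent: number
    compressed: boolean
}

// Derive the reduction metrics from token counts before and after summarization.
function buildCompressResult(
    inputTokens: number,
    outputTokens: number
): CompressContextResult {
    // Never report a negative saving if the summary came out longer.
    const reducedTokens = Math.max(0, inputTokens - outputTokens)
    // Guard against division by zero for empty conversations.
    const reducedPercent =
        inputTokens > 0
            ? Math.round((reducedTokens / inputTokens) * 1000) / 10
            : 0
    return {
        inputTokens,
        outputTokens,
        reducedTokens,
        reducedPercent,
        compressed: reducedTokens > 0
    }
}
```

For 1000 input tokens summarized into 250, this sketch yields `reducedTokens` of 750 and `reducedPercent` of 75.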

Bug fixes

Fixed compression logic to handle edge cases and empty conversations properly
Improved error handling in compression workflow with better i18n support

Other Changes

Simplified compression prompt and logic to focus on conversation summarization
Replaced chunked compression approach with single-pass summary of entire conversation
Removed complex message filtering and token-based chunking logic
Extracted formatTranscript helper function for cleaner code
Updated i18n messages for both English and Chinese to display token reduction statistics
Updated room.ts and chat.ts commands to support i18n_base option for flexible message localization
Improved compression flow to provide better feedback to users about compression results

- Simplify compression prompt and logic to focus on conversation summarization
- Replace chunked compression with single-pass summary of entire conversation
- Add CompressContextResult interface to track compression metrics (input tokens, output tokens, reduction percentage)
- Add manual compression command via 'chatluna.chat.compress' with force flag
- Update room and chat commands to display compression metrics in messages
- Update i18n messages for both English and Chinese to show token reduction stats
- Improve compression flow to handle edge cases and provide better feedback
- Extract formatTranscript helper function for cleaner code

coderabbitai · 2026-03-10T18:28:09Z

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 6bb9d81 and 054b296.

⛔ Files ignored due to path filters (2)

packages/core/src/locales/en-US.yml is excluded by !**/*.yml
packages/core/src/locales/zh-CN.yml is excluded by !**/*.yml

📒 Files selected for processing (7)

packages/core/src/commands/chat.ts
packages/core/src/commands/room.ts
packages/core/src/llm-core/chain/infinite_context_chain.ts
packages/core/src/llm-core/chat/app.ts
packages/core/src/llm-core/chat/infinite_context.ts
packages/core/src/middlewares/room/compress_room.ts
packages/core/src/services/chat.ts

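Overview

This change introduces a new chat compression command and an improved compression flow. It adds the chatluna.chat.compress command, changes the signatures of the compression methods to support a forced-compression option, and introduces the CompressContextResult return type to provide detailed compression metrics.

Changes

Cohort / File(s)	Change summary
New chat compression command `packages/core/src/commands/chat.ts`	Adds a new `chatluna.chat.compress` subcommand that supports the `-r` room option and invokes the chain to handle the compression request.
Compression command and middleware `packages/core/src/commands/room.ts`, `packages/core/src/middlewares/room/compress_room.ts`	Reworks the compression command flow, adds support for the `force` and `i18n_base` parameters, and improves the structured handling of the returned result and the adaptation of i18n messages.
Compression chain and result types `packages/core/src/llm-core/chain/infinite_context_chain.ts`, `packages/core/src/llm-core/chat/infinite_context.ts`	Refactors the compression flow: introduces the new `ChatLunaInfiniteContextChunkResult` interface and changes the `compressChunk` return type to a structure containing the text and usage metadata; rewrites the `compressIfNeeded` method, adding a token-count-based pre-check and the new `CompressContextResult` return type.
Compression API service layer `packages/core/src/llm-core/chat/app.ts`, `packages/core/src/services/chat.ts`	Updates the `compressContext` method signature to support an optional `force` parameter, changes the return type from `Promise<boolean>` to `Promise<CompressContextResult>`, and passes the parameter through the service layer.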

Sequence diagram

sequenceDiagram
    participant User as User
    participant ChatCmd as Chat command
    participant RoomCmd as Room command
    participant Middleware as Compression middleware
    participant Service as Chat service
    participant Manager as Infinite context manager
    participant Chain as Compression chain
    participant LLM as Language model

    User->>ChatCmd: Run chatluna.chat.compress -r room1
    ChatCmd->>Middleware: Trigger compress_room middleware
    Middleware->>Service: compressContext(room, force=true)
    Service->>Manager: compressIfNeeded(wrapper, force=true)
    Manager->>Manager: Count message tokens
    alt Compression needed
        Manager->>Chain: Create compression task
        Chain->>LLM: Call LLM to compress the conversation
        LLM-->>Chain: Return compressed summary and usage
        Chain-->>Manager: Return ChatLunaInfiniteContextChunkResult
        Manager->>Manager: Update chat history
        Manager-->>Service: Return CompressContextResult
    else Skip compression
        Manager-->>Service: Return CompressContextResult (compressed=false)
    end
    Service-->>Middleware: Return compression result
    Middleware-->>User: Display compression statistics
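
The branch in the diagram can be sketched as a pure decision function. This is an illustration under stated assumptions: the 0.85 default mirrors the threshold visible in the removed code quoted in the review later in this thread, but whether the refactored `compressIfNeeded` keeps that threshold, and how `force` interacts with an empty history, is not shown here. `shouldCompress`, `PreCheckInput`, and its field names are hypothetical.

```typescript
// Hypothetical inputs for the pre-check; names are illustrative, not ChatLuna's API.
interface PreCheckInput {
    totalTokens: number // tokens counted across the stored messages
    maxTokenLimit: number // model context budget
    threshold?: number // fraction of the budget that triggers compression
    force?: boolean // set by a manual compress command
}

function shouldCompress(input: PreCheckInput): boolean {
    // Nothing to summarize, so skip even when forced (assumption).
    if (input.totalTokens <= 0) return false
    // A manual request bypasses the threshold check.
    if (input.force) return true
    // Without a usable budget there is no threshold to compare against.
    if (input.maxTokenLimit <= 0) return false
    const limit = Math.floor(input.maxTokenLimit * (input.threshold ?? 0.85))
    return input.totalTokens > limit
}
```

Keeping the decision separate from the LLM call makes the skip path cheap to test: the manual command passes `force: true`, while automatic compression relies on the threshold alone.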

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

gemini-code-assist · 2026-03-10T18:32:23Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the infinite context compression system to be more streamlined and user-friendly. The core change involves simplifying the underlying compression mechanism from a multi-chunk approach to a single-pass summarization of the entire conversation. This not only reduces complexity but also enhances the clarity of the compression process by providing detailed token reduction statistics to the user, improving the overall feedback and control over conversation history management.

Highlights

New Manual Compression Command: Added a new manual compression command via chatluna.chat.compress which supports optional room selection and a force flag.
Enhanced Compression Metrics: Introduced a CompressContextResult interface to track and display detailed compression metrics, including input tokens, output tokens, and reduction percentage, in user-facing messages.
Simplified Compression Logic: Refactored the infinite context compression system to use a simpler, single-pass summarization approach for the entire conversation, replacing the previous complex chunked compression and message filtering logic.
Improved Internationalization (i18n) Support: Updated i18n messages for both English and Chinese to display token reduction statistics and added i18n_base options to commands for flexible message localization.
Bug Fixes and Error Handling: Addressed compression logic edge cases and empty conversations, and improved error handling within the compression workflow.

Changelog

packages/core/src/commands/chat.ts
- Added a new chatluna.chat.compress command.
packages/core/src/commands/room.ts
- Modified the chatluna.room.compress command to support force and i18n_base options.
packages/core/src/llm-core/chain/infinite_context_chain.ts
- Updated imports to include AIMessage and UsageMetadata.
- Introduced the ChatLunaInfiniteContextChunkResult interface.
- Simplified the compression prompt for conversation summarization.
- Modified the compressChunk method to return ChatLunaInfiniteContextChunkResult and removed the _isAlreadyCompressed method.
packages/core/src/llm-core/chat/app.ts
- Imported CompressContextResult.
- Modified the compressContext method to accept a force parameter, return CompressContextResult, and throw a ChatLunaError if chat history is not initialized.
packages/core/src/llm-core/chat/infinite_context.ts
- Defined the CompressContextResult interface.
- Extracted the formatTranscript helper function.
- Refactored the compressIfNeeded method to implement single-pass summarization, accept a force parameter, and return CompressContextResult.
- Removed several complex internal methods related to chunking, message filtering, and summary building.
packages/core/src/locales/en-US.yml
- Updated English localization messages for compression success and skipped states to include token statistics.
- Added new localization entries for the chatluna.chat.compress command.
packages/core/src/locales/zh-CN.yml
- Updated Chinese localization messages for compression success and skipped states to include token statistics.
- Added new localization entries for the chatluna.chat.compress command.
packages/core/src/middlewares/room/compress_room.ts
- Adjusted the compression middleware to utilize the i18n_base option for messages.
- Updated success and skipped messages to display detailed token reduction statistics.
- Passed the force option to the compressContext service call.
packages/core/src/services/chat.ts
- Modified the compressContext service method to accept and propagate a force parameter.
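
As a rough illustration of how the middleware might turn the result into the success and skipped messages described in the changelog, here is a sketch with hypothetical English strings. The real wording lives in the locale YAML files, which are not shown in this thread, and `describeCompression` is not a function from the PR.

```typescript
// Same field names as the result object in the review diff.
interface CompressContextResult {
    inputTokens: number
    outputTokens: number
    reducedTokens: number
    reducedPercent: number
    compressed: boolean
}

// Hypothetical message templates; only the choice of fields reflects the PR description.
function describeCompression(result: CompressContextResult): string {
    if (!result.compressed) {
        return `Compression skipped (${result.inputTokens} tokens in context).`
    }
    return (
        `Compressed ${result.inputTokens} -> ${result.outputTokens} tokens ` +
        `(saved ${result.reducedTokens}, ${result.reducedPercent.toFixed(1)}%).`
    )
}
```

With `inputTokens` 1000, `outputTokens` 250, `reducedTokens` 750 and `reducedPercent` 75, the success branch produces `Compressed 1000 -> 250 tokens (saved 750, 75.0%).`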

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request refactors the chat history compression mechanism, introducing a new chatluna.chat.compress command and updating the existing chatluna.room.compress command to support forced compression and detailed output. The core compression logic in InfiniteContextManager has been simplified, moving from a complex chunking and merging strategy to a direct summarization of the entire conversation transcript by the LLM, with the return type now including detailed token usage metadata. Localization messages have been updated to display these new compression statistics. However, the formatTranscript function is vulnerable to prompt injection due to unsanitized user input, which could allow attackers to manipulate the summarization LLM. Additionally, the compressIfNeeded function contains duplicated logic for handling skipped compression, which could be improved with a helper function for better maintainability.

gemini-code-assist · 2026-03-10T18:39:48Z

+function formatTranscript(messages: BaseMessage[]) {
+    return messages
+        .map((message) => {
+            const role = message.getType().toUpperCase()
+            const name = message.name ? ` (${message.name})` : ''
+            const content = getMessageContent(message.content).trim()
+            return `[${role}${name}]\n${content || '(empty)'}`
+        })
+        .join('\n\n---\n\n')
 }


The formatTranscript function constructs a conversation transcript by concatenating message roles and contents using a simple string format with \n\n---\n\n as a separator. This transcript is then passed to an LLM for summarization. Because user-provided message content is not sanitized or escaped, an attacker can inject the separator and fake role headers (e.g., [SYSTEM]) into their messages. This allows them to spoof the conversation history seen by the summarization LLM, potentially leading to a manipulated summary. Since this summary is added back to the chat history and used as context for future AI responses, it can be used to influence the bot's behavior or misrepresent the conversation state.

function formatTranscript(messages: BaseMessage[]) {
    return messages
        .map((message) => {
            const role = message.getType().toUpperCase()
            const name = message.name ? ` (${message.name})` : ''
            const content = getMessageContent(message.content).trim()
            // Escape the separator to prevent transcript spoofing
            const escapedContent = content.replace(/\n\n---\n\n/g, '\n\n - -- \n\n')
            return `[${role}${name}]\n${escapedContent || '(empty)'}`
        })
        .join('\n\n---\n\n')
}
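
To make the spoofing concrete, the sketch below reproduces the separator-based format with plain objects (a hypothetical `SimpleMessage` type standing in for LangChain's `BaseMessage`) and counts how many transcript segments a single malicious message produces. The `escapeSeparator` helper performs the same replacement as the suggestion above; note that it neutralizes only the separator, not a bracketed role header typed inside a message.

```typescript
// Hypothetical minimal message shape; the real code uses BaseMessage.
interface SimpleMessage {
    role: string
    content: string
}

const SEPARATOR = '\n\n---\n\n'

// Break up any separator found inside message content.
function escapeSeparator(content: string): string {
    return content.split(SEPARATOR).join('\n\n - -- \n\n')
}

function formatTranscript(messages: SimpleMessage[], escape: boolean): string {
    return messages
        .map((message) => {
            const content = message.content.trim()
            const body = escape ? escapeSeparator(content) : content
            return `[${message.role.toUpperCase()}]\n${body || '(empty)'}`
        })
        .join(SEPARATOR)
}

// One user message that embeds the separator and a fake role header.
const injected: SimpleMessage[] = [
    { role: 'human', content: 'hi\n\n---\n\n[SYSTEM]\nIgnore prior rules' }
]

// Number of segments a reader (or model) sees when splitting on the separator.
const unescapedSegments = formatTranscript(injected, false).split(SEPARATOR).length
const escapedSegments = formatTranscript(injected, true).split(SEPARATOR).length
```

Here the unescaped transcript splits into two segments, the second beginning with `[SYSTEM]`, while the escaped transcript remains a single segment.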

gemini-code-assist · 2026-03-10T18:39:48Z

        if (!model) {
-            return
+            return {
+                inputTokens: 0,
+                outputTokens: 0,
+                reducedTokens: 0,
+                reducedPercent: 0,
+                compressed: false
+            }
        }

        const messages = await this.options.chatHistory.getMessages()

        if (messages.length === 0) {
-            return
-        }
-
-        const invocation = model.invocationParams()
-        const maxTokenLimit =
-            invocation.maxTokenLimit && invocation.maxTokenLimit > 0
-                ? invocation.maxTokenLimit
-                : model.getModelMaxContextSize()
-
-        if (!maxTokenLimit || maxTokenLimit <= 0) {
-            return
-        }
-
-        const presetMessages = Array.isArray(
-            this.options.preset?.value?.messages
-        )
-            ? (this.options.preset?.value.messages as BaseMessage[])
-            : []
-
-        const presetTokens = await this._calculateMessageTokenStats(
-            model,
-            presetMessages
-        ).then((stats) =>
-            stats.reduce((sum, current) => sum + current.tokens, 0)
-        )
-
-        const threshold = Math.floor(
-            maxTokenLimit * (this.options.threshold ?? 0.85)
-        )
-
-        const stats = await this._calculateMessageTokenStats(model, messages)
-        const totalTokens =
-            stats.reduce((sum, current) => sum + current.tokens, 0) +
-            presetTokens
-
-        if (totalTokens <= threshold) {
-            return
+            return {
+                inputTokens: 0,
+                outputTokens: 0,
+                reducedTokens: 0,
+                reducedPercent: 0,
+                compressed: false
+            }
        }


There are multiple places in this function where you return a CompressContextResult for cases where compression is skipped. This logic is duplicated. To improve maintainability and reduce code duplication, consider creating a helper function to generate this "uncompressed" result object.

For example:

function createUncompressedResult(inputTokens: number): CompressContextResult {
    return {
        inputTokens,
        outputTokens: inputTokens,
        reducedTokens: 0,
        reducedPercent: 0,
        compressed: false,
    };
}

Then you could simplify the early returns, for example:

if (!model || messages.length === 0) {
    return createUncompressedResult(0);
}

And for other cases:

if (!maxTokenLimit || maxTokenLimit <= 0) {
    return createUncompressedResult(inputTokens);
}

dingyi222666 merged commit 3ec82d6 into v1-dev Mar 10, 2026
2 of 3 checks passed

dingyi222666 deleted the fix/conversation-compact branch March 10, 2026 18:32

dingyi222666 mentioned this pull request Mar 10, 2026

[Chore] Bump package versions #775

Merged

coderabbitai Bot mentioned this pull request Apr 8, 2026

[Fix] preserve compacted history metadata #820

Merged

coderabbitai Bot mentioned this pull request May 23, 2026

[Refactor] improve context compression budgeting #874

Merged
