fix(aiocqhttp): normalize quoted media sources in replies#8316
fix(aiocqhttp): normalize quoted media sources in replies#8316Foolllll-J wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- In
_resolve_onebot_file_referenceyou currentlyexcept BaseException, which will also swallowasyncio.CancelledErrorand system-level exceptions; consider narrowing this toException(or a more specific subset) so cancellations and fatal errors are not silently ignored. - The
_get_sourcehelpers inRecordandVideoare duplicated and follow the same precedence rules; consider extracting a shared utility or mixin to avoid divergence if the source-selection logic needs to change in the future.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_resolve_onebot_file_reference` you currently `except BaseException`, which will also swallow `asyncio.CancelledError` and system-level exceptions; consider narrowing this to `Exception` (or a more specific subset) so cancellations and fatal errors are not silently ignored.
- The `_get_source` helpers in `Record` and `Video` are duplicated and follow the same precedence rules; consider extracting a shared utility or mixin to avoid divergence if the source-selection logic needs to change in the future.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/aiocqhttp/aiocqhttp_platform_adapter.py" line_range="136-40" />
<code_context>
+
+ file_ref = normalized.get("file")
+ file_id = normalized.get("file_id")
+ candidate = file_ref or file_id
+ if not isinstance(candidate, str) or not candidate.strip():
+ usable_source = _pick_usable_media_source(normalized)
+ if usable_source:
+ normalized["file"] = usable_source
+ if seg_type == "file":
+ normalized.setdefault("url", usable_source)
+ return normalized
+ return normalized
+ if _looks_like_resolved_media_ref(candidate) or os.path.exists(candidate):
+ normalized["file"] = candidate
+ return normalized
</code_context>
<issue_to_address>
**issue (bug_risk):** Unstripped `candidate` may contain leading/trailing spaces and fail resolution checks.
`candidate` is checked using `candidate.strip()` but the subsequent `_looks_like_resolved_media_ref` and `os.path.exists` calls use the unstripped value. If `file`/`file_id` includes surrounding whitespace, these checks may incorrectly fail. Consider normalizing once (e.g. `candidate = candidate.strip()`) before the checks and later uses.
</issue_to_address>
### Comment 2
<location path="astrbot/core/platform/sources/aiocqhttp/aiocqhttp_platform_adapter.py" line_range="98-102" />
<code_context>
+ )
+
+ for action, params in actions:
+ try:
+ ret = await bot.call_action(action=action, **params)
+ except BaseException:
+ continue
+ data = _unwrap_onebot_action_data(ret)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Catching `BaseException` here is overly broad and may hide critical errors.
This will also suppress `KeyboardInterrupt`, `SystemExit`, and other critical exceptions, which can break shutdown and debugging. Here it’s safer to catch `Exception` (or specific library exceptions) so truly critical errors still propagate.
```suggestion
for action, params in actions:
try:
ret = await bot.call_action(action=action, **params)
except Exception:
continue
```
</issue_to_address>
### Comment 3
<location path="astrbot/core/platform/sources/aiocqhttp/aiocqhttp_platform_adapter.py" line_range="31" />
<code_context>
from .aiocqhttp_message_event import AiocqhttpMessageEvent
+def _looks_like_resolved_media_ref(value: str) -> bool:
+ return value.startswith(("http://", "https://", "file://")) or os.path.isabs(value)
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new media resolution helpers into a small `MediaResolver` class with shared helpers and action templates to reduce duplication and make the control flow easier to follow.
You can reduce the new complexity without changing behavior by:
1. **Encapsulating shared context and flow in a small resolver object**
2. **Centralizing the “is resolved” check**
3. **De‑duplicating action construction with static templates**
### 1. Encapsulate context into a `MediaResolver`
Instead of threading `bot`, `message_type`, `group_id`, `seg_type` through multiple helpers and call sites, wrap them in a small resolver. This keeps the public surface small while keeping functionality identical:
```python
class MediaResolver:
def __init__(self, bot: CQHttp, message_type: str, group_id: str | int | None):
self.bot = bot
self.message_type = message_type
self.group_id = group_id
async def normalize(self, seg_type: str, seg_data: dict[str, Any]) -> dict[str, Any]:
# move current _normalize_onebot_media_data body here
# but replace direct calls with self._resolve_file_reference(...)
...
async def _resolve_file_reference(self, file_ref: str, seg_type: str | None = None) -> str | None:
# move current _resolve_onebot_file_reference body here
# use self.bot, self.message_type, self.group_id
...
```
Call site becomes simpler and avoids repeated parameters:
```python
resolver = MediaResolver(self.bot, event["message_type"], event.get("group_id"))
# file segment special case
normalized_data = await resolver.normalize("file", m["data"])
...
# generic handler
normalized_data = await resolver.normalize(t, m["data"])
a = ComponentTypes[t](**normalized_data)
```
This keeps all behavior but removes a lot of cross‑cutting arguments and makes the flow easier to follow.
### 2. Centralize the “resolved path or URL” check
You currently repeat:
- `_looks_like_resolved_media_ref(value) or os.path.exists(value)`
across `_pick_usable_media_source`, `_normalize_onebot_media_data`, `_resolve_onebot_file_reference`.
Extract this into a single helper that returns either a normalized string or `None`:
```python
def _resolve_local_or_url_candidate(value: str | None) -> str | None:
if not isinstance(value, str):
return None
candidate = value.strip()
if not candidate:
return None
if _looks_like_resolved_media_ref(candidate) or os.path.exists(candidate):
return candidate
return None
```
Then simplify callers, e.g.:
```python
def _pick_usable_media_source(seg_data: dict[str, Any]) -> str:
for key in ("url", "file", "path", "file_path"):
resolved = _resolve_local_or_url_candidate(seg_data.get(key))
if resolved:
return resolved
return ""
```
and in the resolver:
```python
candidate = file_ref or file_id
resolved = _resolve_local_or_url_candidate(candidate)
if resolved:
normalized["file"] = resolved
return normalized
```
This reduces branching and duplication without changing semantics.
### 3. De‑duplicate action list construction with templates
The `actions.extend([...])` pattern is verbose and repeated per candidate. You can describe the patterns once and instantiate them:
```python
# module-level, static
_BASE_ACTION_TEMPLATES: list[tuple[str, dict[str, str]]] = [
("get_file", {"file_id": "candidate"}),
("get_file", {"file": "candidate"}),
("get_image", {"file": "candidate"}),
("get_image", {"file_id": "candidate"}),
("get_private_file_url", {"file_id": "candidate"}),
]
def _build_actions_for_candidate(
candidate: str,
message_type: str,
group_id: str | int | None,
seg_type: str | None,
) -> list[tuple[str, dict[str, Any]]]:
actions: list[tuple[str, dict[str, Any]]] = []
if seg_type == "record":
actions.append(("get_record", {"file": candidate}))
for action, params in _BASE_ACTION_TEMPLATES:
concrete = {k: (candidate if v == "candidate" else v) for k, v in params.items()}
actions.append((action, concrete))
if str(message_type).lower() == "group" and group_id not in (None, ""):
actions.append(("get_group_file_url", {"group_id": group_id, "file_id": candidate}))
return actions
```
Then `_resolve_onebot_file_reference` (or `MediaResolver._resolve_file_reference`) becomes clearer:
```python
actions: list[tuple[str, dict[str, Any]]] = []
for candidate in candidates:
actions.extend(_build_actions_for_candidate(candidate, message_type, group_id, seg_type))
```
This removes duplicated action construction logic and makes it obvious what variations exist.
---
These changes keep your new media resolution behavior intact, but:
- Narrow the public API to `MediaResolver.normalize(...)`
- Centralize repeated checks
- Make candidate/action handling more declarative and easier to scan.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| from .aiocqhttp_message_event import AiocqhttpMessageEvent | ||
|
|
||
|
|
||
| def _looks_like_resolved_media_ref(value: str) -> bool: |
There was a problem hiding this comment.
issue (complexity): Consider refactoring the new media resolution helpers into a small MediaResolver class with shared helpers and action templates to reduce duplication and make the control flow easier to follow.
You can reduce the new complexity without changing behavior by:
- Encapsulating shared context and flow in a small resolver object
- Centralizing the “is resolved” check
- De‑duplicating action construction with static templates
1. Encapsulate context into a MediaResolver
Instead of threading bot, message_type, group_id, seg_type through multiple helpers and call sites, wrap them in a small resolver. This keeps the public surface small while keeping functionality identical:
class MediaResolver:
def __init__(self, bot: CQHttp, message_type: str, group_id: str | int | None):
self.bot = bot
self.message_type = message_type
self.group_id = group_id
async def normalize(self, seg_type: str, seg_data: dict[str, Any]) -> dict[str, Any]:
# move current _normalize_onebot_media_data body here
# but replace direct calls with self._resolve_file_reference(...)
...
async def _resolve_file_reference(self, file_ref: str, seg_type: str | None = None) -> str | None:
# move current _resolve_onebot_file_reference body here
# use self.bot, self.message_type, self.group_id
...Call site becomes simpler and avoids repeated parameters:
resolver = MediaResolver(self.bot, event["message_type"], event.get("group_id"))
# file segment special case
normalized_data = await resolver.normalize("file", m["data"])
...
# generic handler
normalized_data = await resolver.normalize(t, m["data"])
a = ComponentTypes[t](**normalized_data)This keeps all behavior but removes a lot of cross‑cutting arguments and makes the flow easier to follow.
2. Centralize the “resolved path or URL” check
You currently repeat:
_looks_like_resolved_media_ref(value) or os.path.exists(value)
across _pick_usable_media_source, _normalize_onebot_media_data, _resolve_onebot_file_reference.
Extract this into a single helper that returns either a normalized string or None:
def _resolve_local_or_url_candidate(value: str | None) -> str | None:
if not isinstance(value, str):
return None
candidate = value.strip()
if not candidate:
return None
if _looks_like_resolved_media_ref(candidate) or os.path.exists(candidate):
return candidate
return NoneThen simplify callers, e.g.:
def _pick_usable_media_source(seg_data: dict[str, Any]) -> str:
for key in ("url", "file", "path", "file_path"):
resolved = _resolve_local_or_url_candidate(seg_data.get(key))
if resolved:
return resolved
return ""and in the resolver:
candidate = file_ref or file_id
resolved = _resolve_local_or_url_candidate(candidate)
if resolved:
normalized["file"] = resolved
return normalizedThis reduces branching and duplication without changing semantics.
3. De‑duplicate action list construction with templates
The actions.extend([...]) pattern is verbose and repeated per candidate. You can describe the patterns once and instantiate them:
# module-level, static
_BASE_ACTION_TEMPLATES: list[tuple[str, dict[str, str]]] = [
("get_file", {"file_id": "candidate"}),
("get_file", {"file": "candidate"}),
("get_image", {"file": "candidate"}),
("get_image", {"file_id": "candidate"}),
("get_private_file_url", {"file_id": "candidate"}),
]
def _build_actions_for_candidate(
candidate: str,
message_type: str,
group_id: str | int | None,
seg_type: str | None,
) -> list[tuple[str, dict[str, Any]]]:
actions: list[tuple[str, dict[str, Any]]] = []
if seg_type == "record":
actions.append(("get_record", {"file": candidate}))
for action, params in _BASE_ACTION_TEMPLATES:
concrete = {k: (candidate if v == "candidate" else v) for k, v in params.items()}
actions.append((action, concrete))
if str(message_type).lower() == "group" and group_id not in (None, ""):
actions.append(("get_group_file_url", {"group_id": group_id, "file_id": candidate}))
return actionsThen _resolve_onebot_file_reference (or MediaResolver._resolve_file_reference) becomes clearer:
actions: list[tuple[str, dict[str, Any]]] = []
for candidate in candidates:
actions.extend(_build_actions_for_candidate(candidate, message_type, group_id, seg_type))This removes duplicated action construction logic and makes it obvious what variations exist.
These changes keep your new media resolution behavior intact, but:
- Narrow the public API to
MediaResolver.normalize(...) - Centralize repeated checks
- Make candidate/action handling more declarative and easier to scan.
There was a problem hiding this comment.
Code Review
This pull request refactors media source resolution for Record and Video components and introduces a normalization layer in the aiocqhttp adapter to resolve OneBot V11 media references. Review feedback identifies a copy-paste error in the Record component where audio files were being saved with an image extension and suggests using Exception instead of BaseException for safer error handling.
| if source.startswith("base64://"): | ||
| bs64_data = source.removeprefix("base64://") | ||
| image_bytes = base64.b64decode(bs64_data) | ||
| file_path = os.path.join( | ||
| get_astrbot_temp_path(), f"recordseg_{uuid.uuid4()}.jpg" | ||
| ) | ||
| with open(file_path, "wb") as f: | ||
| f.write(image_bytes) | ||
| return os.path.abspath(file_path) |
There was a problem hiding this comment.
In the Record component, the temporary file is being saved with a .jpg extension and uses image_bytes, which is a copy-paste error from the Image component. Please fix the extension and variable names. Additionally, since this logic is similar to other attachment components, refactor it into a shared helper function to avoid code duplication. Ensure this functionality is accompanied by unit tests.
| if source.startswith("base64://"): | |
| bs64_data = source.removeprefix("base64://") | |
| image_bytes = base64.b64decode(bs64_data) | |
| file_path = os.path.join( | |
| get_astrbot_temp_path(), f"recordseg_{uuid.uuid4()}.jpg" | |
| ) | |
| with open(file_path, "wb") as f: | |
| f.write(image_bytes) | |
| return os.path.abspath(file_path) | |
| if source.startswith("base64://"): | |
| bs64_data = source.removeprefix("base64://") | |
| audio_bytes = base64.b64decode(bs64_data) | |
| file_path = os.path.join( | |
| get_astrbot_temp_path(), f"recordseg_{uuid.uuid4().hex}.amr" | |
| ) | |
| with open(file_path, "wb") as f: | |
| f.write(audio_bytes) | |
| return os.path.abspath(file_path) |
References
- When implementing similar functionality for different cases, refactor the logic into a shared helper function to avoid code duplication.
- New functionality, such as handling attachments, should be accompanied by corresponding unit tests.
| for action, params in actions: | ||
| try: | ||
| ret = await bot.call_action(action=action, **params) | ||
| except BaseException: |
There was a problem hiding this comment.
51a818e to
35aeeb3
Compare
Fixes #8049 .
修复 aiocqhttp / OneBot v11 场景下,引用消息中的语音、视频、文件媒体段在进入 AstrBot 后无法被正确消费的问题。
此前在 QQ/NapCat 的引用消息场景中,媒体段里的
file字段经常只是平台内部文件名或引用值,例如0f47835d687410ab50cfed981e80c15c.amr、BV1y9Gj6****.mp4,并不是 AstrBot 当前进程可直接访问的本地路径,也不是可直接下载的 URL。旧逻辑会将这类
file原样交给上层组件,导致:not a valid file: *.amrnot a valid file: *.mp4url,部分组件仍可能优先消费无效的file值本次修改将 quoted media 的修复前移到正确层级:
从而确保引用消息中的语音、视频、文件媒体段在进入后续 Agent / Provider / Plugin 链路前,就已经具备可直接消费的来源信息。
Modifications / 改动点
本次改动主要补齐了 aiocqhttp 引用媒体段的来源解析逻辑,并统一了
Record/Video组件对多来源字段的消费方式。核心改动包括:
在
astrbot/core/platform/sources/aiocqhttp/aiocqhttp_platform_adapter.py中新增 OneBot 媒体来源规范化逻辑:file是否已经是可直接使用的引用url/path/file_pathfile/file_id只是平台内部引用值时,尝试通过 OneBot action 进一步解析为真实 URL 或文件引用video/record/file组件前,将媒体段统一规范成上层可直接消费的结构在
astrbot/core/message/components.py中增强Record组件的 source 选择逻辑:convert_to_file_path()与convert_to_base64()不再只依赖fileurl -> file -> path的顺序选择真实可用来源在
astrbot/core/message/components.py中修正Record的 base64 临时文件落地逻辑:.jpg临时文件的 copy-paste 问题在
astrbot/core/message/components.py中增强Video组件的 source 选择逻辑:Video增加显式url字段convert_to_file_path()与to_dict()的来源选择逻辑Video与Image/Record的多来源语义一致在测试侧保留并补充了稳定的组件层单元测试,覆盖:
Record在url存在而file无效时仍可正常转本地路径Record在同样场景下仍可正常转为 base64Record从base64://来源落地临时文件时,会使用正确的音频扩展名和内容写入This is NOT a breaking change. / 这不是一个破坏性变更。
Screenshots or Test Results / 运行截图或测试结果
修复前引用语音消息时:
修复后不再报错。
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Normalize aiocqhttp/OneBot v11 media sources in quoted messages so that audio, video, and file segments are resolved to directly consumable URLs or paths before reaching higher-level components.
Bug Fixes:
filevalues are only internal identifiers rather than usable paths or URLs.Enhancements:
url,file, andpathconsistently when converting to file paths, base64, or dictionaries.Tests: