fix: sanitize filenames of remote File message segments #8318
Code Review
This pull request introduces a filename sanitization function for file components and refactors the file download logic to utilize pathlib for better path management. It also includes a new unit test to verify that remote filenames are correctly sanitized. The reviewer suggested extending the sanitization logic to handle characters that are invalid on Windows filesystems (such as ':', '*', and '?') to ensure cross-platform compatibility.
def _sanitize_file_component_name(name: str | None) -> str:
    if not name:
        return "file"

    normalized = str(name).replace("\\", "/")
    basename = PurePosixPath(normalized).name.replace("\x00", "").strip()
    if basename in {"", ".", ".."}:
        return "file"
    return basename
The _sanitize_file_component_name function correctly handles path traversal by extracting the basename and removing null bytes. However, it does not account for other characters that are invalid in filenames on certain operating systems, particularly Windows (e.g., :, *, ?, ", <, >, |). If a remote filename contains these characters, the subsequent download_file call will fail with an OSError on Windows. Consider replacing these characters with an underscore or removing them to ensure cross-platform compatibility. Additionally, as this is new functionality for handling file components, please ensure it is accompanied by unit tests.
Suggested change:
def _sanitize_file_component_name(name: str | None) -> str:
    if not name:
        return "file"
    normalized = str(name).replace("\\", "/")
    basename = PurePosixPath(normalized).name.replace("\x00", "").strip()
    # Replace characters that are invalid on Windows filesystems
    for char in ':*?"<>|':
        basename = basename.replace(char, "_")
    if basename in {"", ".", ".."}:
        return "file"
    return basename
References
- New functionality, such as handling attachments, should be accompanied by corresponding unit tests.
Added handling based on Gemini Code Assist's suggestion:
Local verification:
Changes
Add filename sanitization to remote downloads of File message segments, keeping only the basename and removing null bytes.
Related Issue
Closes #8317
Verification
UV_CACHE_DIR=/tmp/uv-cache UV_PYTHON_INSTALL_DIR=/tmp/uv-python uv run pytest tests/unit/test_file_message_component.py
UV_CACHE_DIR=/tmp/uv-cache UV_PYTHON_INSTALL_DIR=/tmp/uv-python uv run ruff format .
UV_CACHE_DIR=/tmp/uv-cache UV_PYTHON_INSTALL_DIR=/tmp/uv-python uv run ruff check .
Summary by Sourcery
Sanitize remote file names in File message downloads and ensure temporary directories exist before saving, with tests verifying the new behavior.
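The two behaviors in this summary could be combined roughly as follows. This is a sketch under stated assumptions, not the PR's actual code: `build_download_path` is a hypothetical helper, and the project's real `download_file` call and temporary-directory location are not shown on this page.

```python
from __future__ import annotations

import tempfile
from pathlib import Path, PurePosixPath


def _sanitize_file_component_name(name: str | None) -> str:
    # Local copy of the basename + null-byte sanitizer, for self-containment.
    if not name:
        return "file"
    normalized = str(name).replace("\\", "/")
    basename = PurePosixPath(normalized).name.replace("\x00", "").strip()
    if basename in {"", ".", ".."}:
        return "file"
    return basename


def build_download_path(temp_dir: Path, remote_name: str | None) -> Path:
    # Hypothetical helper: make sure the temp directory exists, then
    # join it with the sanitized name so the result stays inside temp_dir.
    temp_dir.mkdir(parents=True, exist_ok=True)
    return temp_dir / _sanitize_file_component_name(remote_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        base = Path(root) / "nested" / "temp"
        target = build_download_path(base, "..\\..\\evil.txt")
        # The traversal segments are dropped; the file lands directly in base.
        assert target.parent == base
        print(target.name)
```

Creating the directory with `parents=True, exist_ok=True` makes the helper safe to call repeatedly, and joining only a sanitized basename keeps the destination from escaping the temporary directory.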
Bug Fixes:
Tests: