
Adapt to dlsime v0.0.2 #4242

Merged
lvhan028 merged 9 commits into InternLM:main from JimyMa:bump_to_dlslime_002
Jan 7, 2026

Conversation

Contributor

@JimyMa JimyMa commented Dec 30, 2025

Canonicalize the DLSlime interface for the bump to v0.0.2.

@CUHKSZzxy
Collaborator

You may also want to update the dlslime version here:

https://github.com/InternLM/lmdeploy/blob/main/docker/install.sh#L70

  hooks:
    - id: docformatter
-     language_version: python3.10
+     language_version: python3.12
Collaborator

Could you let us know the motivation for this change?

Contributor Author

Reverted back to 3.10.

@lvhan028 lvhan028 requested review from CUHKSZzxy and grimoire January 4, 2026 12:52
@lvhan028 lvhan028 changed the title from "init" to "Adapt to dlsime v0.0.2" Jan 4, 2026
Collaborator

lvhan028 commented Jan 4, 2026

Please resolve the linting issue.

Collaborator

@grimoire grimoire left a comment

LGTM

Contributor

Copilot AI left a comment

Pull request overview

This PR adapts the lmdeploy codebase to work with DLSlime v0.0.2, which introduces breaking API changes for the KV cache migration backend. The changes canonicalize the interface between lmdeploy and the DLSlime library for high-performance distributed serving.

Key changes include:

  • Converting async methods (p2p_initialize, p2p_drop_connect) to synchronous methods to align with DLSlime v0.0.2 API
  • Changing mr_key type from str to int in data models and implementations
  • Updating DLSlime API usage: replacing DLSlimeAssignment class with tuple-based batch format and changing endpoint_info from property to method call
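The mr_key type change in the list above can be illustrated with a small sketch. This is not the lmdeploy data model: `MemoryRegionSketch` and `register_regions` are hypothetical names, a stdlib dataclass stands in for the real message class, and `id(buf)` is only a placeholder for a real buffer address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryRegionSketch:
    """Hypothetical stand-in for a registered memory region record."""
    mr_key: int  # previously a str, per the summary above
    addr: int
    length: int


def register_regions(buffers: List[bytearray]) -> List[MemoryRegionSketch]:
    """Assign integer keys by position, i.e. `i` rather than `str(i)`."""
    regions = []
    for i, buf in enumerate(buffers):
        # id(buf) is a placeholder, not a real registered address.
        regions.append(MemoryRegionSketch(mr_key=i, addr=id(buf), length=len(buf)))
    return regions


regions = register_regions([bytearray(16), bytearray(32)])
```

Keeping the key an integer end to end avoids converting between `str` and `int` at the boundary with the backend.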

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| lmdeploy/pytorch/engine/mp_engine/base_worker.py | Removed async keyword from p2p_initialize method to match new synchronous API |
| lmdeploy/pytorch/engine/engine.py | Changed p2p_initialize and p2p_drop_connect to synchronous methods |
| lmdeploy/pytorch/engine/cache_engine.py | Updated mr_key usage from str(i) to i to match new integer type |
| lmdeploy/pytorch/disagg/messages.py | Changed mr_key field type from str to int in data models |
| lmdeploy/pytorch/disagg/conn/engine_conn.py | Converted p2p_initialize from async to sync method |
| lmdeploy/pytorch/disagg/backend/dlslime.py | Major refactoring: removed DLSlimeAssignment import, simplified async migration logic, changed to tuple-based batch format, updated endpoint_info to method call, and fixed type hints |
| lmdeploy/pytorch/disagg/README.md | Updated DLSlime version requirement from >=0.0.1.post7 to >=0.0.2 and changed example model |
| .pre-commit-config.yaml | Updated docformatter Python version from 3.10 to 3.12 |
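The async-to-sync conversion listed for p2p_initialize and p2p_drop_connect changes how callers invoke these methods. A minimal sketch, assuming a hypothetical `EngineSketch` class whose method signatures are simplified and do not match lmdeploy's real ones:

```python
import asyncio


class EngineSketch:
    """Hypothetical engine; the real lmdeploy signatures differ."""

    def __init__(self):
        self.connected = set()

    def p2p_initialize(self, remote_engine_id: str) -> bool:
        # Synchronous: returns a plain value, not a coroutine.
        self.connected.add(remote_engine_id)
        return True

    def p2p_drop_connect(self, remote_engine_id: str) -> bool:
        self.connected.discard(remote_engine_id)
        return True


async def handle_init(engine: EngineSketch, remote_engine_id: str) -> bool:
    # No `await` here: awaiting the bool returned by a sync method
    # would raise TypeError.
    return engine.p2p_initialize(remote_engine_id)


engine = EngineSketch()
result = asyncio.run(handle_init(engine, "engine-1"))
```

Call sites that previously awaited these methods need the `await` removed, while any method that remains a coroutine still needs it.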

  hooks:
    - id: docformatter
-     language_version: python3.10
+     language_version: python3.12

Copilot AI Jan 5, 2026

The Python version for docformatter has been changed from 3.10 to 3.12. Ensure that the development environment and CI/CD pipelines support Python 3.12, and verify that all dependencies are compatible with this version. This change may affect other developers who have Python 3.10 installed.

Comment on lines +54 to +55
assign.mr_key,
assign.mr_key,

Copilot AI Jan 5, 2026

The tuple contains duplicate mr_key values at positions 0 and 1. Based on the DLSlime v0.0.2 interface documentation and typical RDMA operation patterns, the tuple should likely contain (local_mr_key, remote_mr_key, target_offset, source_offset, length). If both local and remote memory regions use the same key, this duplication may be intentional, but please verify this matches the expected DLSlime v0.0.2 API signature.

Contributor Author

The local and remote mr_key are the same (0 for the K cache and 1 for the V cache).
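The exchange above can be sketched as follows. The tuple layout (local_mr_key, remote_mr_key, target_offset, source_offset, length) is the reviewer's guess and is not confirmed here against the DLSlime v0.0.2 API. `AssignmentSketch`, its field names, and `to_batch` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AssignmentSketch:
    """Hypothetical migration assignment; field names are assumptions."""
    mr_key: int
    target_offset: int
    source_offset: int
    length: int


# Assumed layout: (local_mr_key, remote_mr_key, target_offset, source_offset, length)
BatchEntry = Tuple[int, int, int, int, int]

K_CACHE_KEY, V_CACHE_KEY = 0, 1  # per the author's reply: 0 for K, 1 for V


def to_batch(assignments: List[AssignmentSketch]) -> List[BatchEntry]:
    # The key appears twice on purpose: both sides register the
    # K and V caches under the same integer keys.
    return [(a.mr_key, a.mr_key, a.target_offset, a.source_offset, a.length)
            for a in assignments]


batch = to_batch([
    AssignmentSketch(K_CACHE_KEY, target_offset=0, source_offset=4096, length=1024),
    AssignmentSketch(V_CACHE_KEY, target_offset=0, source_offset=4096, length=1024),
])
```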

Comment on lines +26 to +27
self.local_engine_config: DistServeEngineConfig = (init_request.local_engine_config)
self.remote_engine_config: DistServeEngineConfig = (init_request.remote_engine_config)

Copilot AI Jan 5, 2026

The parentheses around the assignment values are unnecessary and don't serve any purpose. They can be removed for cleaner code.

Suggested change
- self.local_engine_config: DistServeEngineConfig = (init_request.local_engine_config)
- self.remote_engine_config: DistServeEngineConfig = (init_request.remote_engine_config)
+ self.local_engine_config: DistServeEngineConfig = init_request.local_engine_config
+ self.remote_engine_config: DistServeEngineConfig = init_request.remote_engine_config

- def endpoint_info(self, remote_engine_id: int, protocol: MigrationProtocol):
-     return self.links[remote_engine_id].endpoint[protocol].endpoint_info
+ def endpoint_info(self, remote_engine_id: str, protocol: MigrationProtocol):

Copilot AI Jan 5, 2026

The parameter type for remote_engine_id has been changed from int to str in the implementation, but the abstract base class in base.py (line 20) still defines it as int. This creates a type mismatch. The abstract method signature should be updated to match the implementation, or all implementations should use consistent types. Note that other parts of the codebase (e.g., DistServeInitRequest) use str for engine IDs.

Contributor Author

str is right.

Contributor Author

The type hint of remote_engine_id in base is fixed.
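The kind of mismatch discussed in this thread can be caught with a small check. The sketch below uses hypothetical class names, and plain `str` stands in for MigrationProtocol. It compares the parameter annotations of an abstract method and its implementation using the standard `inspect` module.

```python
import inspect
from abc import ABC, abstractmethod


class MigrationBackendSketch(ABC):
    """Hypothetical abstract base; engine IDs are typed as str."""

    @abstractmethod
    def endpoint_info(self, remote_engine_id: str, protocol: str):
        ...


class DLSlimeBackendSketch(MigrationBackendSketch):
    def __init__(self):
        # Maps engine ID -> protocol -> endpoint info (illustrative only).
        self.links = {}

    def endpoint_info(self, remote_engine_id: str, protocol: str):
        return self.links[remote_engine_id][protocol]


def annotations_match(base: type, impl: type, name: str) -> bool:
    """Return True if both methods declare the same parameter annotations."""
    base_params = inspect.signature(getattr(base, name)).parameters.values()
    impl_params = inspect.signature(getattr(impl, name)).parameters.values()
    return [p.annotation for p in base_params] == [p.annotation for p in impl_params]


backend = DLSlimeBackendSketch()
backend.links["engine-0"] = {"RDMA": {"info": 1}}
```

Here `annotations_match(MigrationBackendSketch, DLSlimeBackendSketch, "endpoint_info")` returns True. If the base declared `remote_engine_id: int`, it would return False.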

@lvhan028 lvhan028 merged commit 12350af into InternLM:main Jan 7, 2026
15 checks passed
43758726 pushed a commit to 43758726/lmdeploy that referenced this pull request Jan 7, 2026
* init

* revert pre-commit

* add await

* lint

* update docker install.sh (dlslime==0.0.1.post10)=>(dlslime==0.0.2)

* fix type hint of endpoint_info in base

* update docker install.sh (dlslime==0.0.2)=>(dlslime==0.0.2.post1)
lvhan028 pushed a commit that referenced this pull request Jan 8, 2026
* [Fix] fix quant calibration dataset

* [Fix] fix bug and grammar, and merge main

* [Fix] change processor to tokenizer

* [Fix] fix changes in .github/md-link-config.json

* fix: get rid of buggy timm-1.0.23 (#4260)

* [ascend] fix paged prefill (#4254)

* [ascend] fix paged prefill

* update

* update

* Adapt to dlsime v0.0.2 (#4242)

* init

* revert pre-commit

* add await

* lint

* update docker install.sh (dlslime==0.0.1.post10)=>(dlslime==0.0.2)

* fix type hint of endpoint_info in base

* update docker install.sh (dlslime==0.0.2)=>(dlslime==0.0.2.post1)

* Fix ascend/maca/camb runtime_requirements (#4262)

* fix reqs

* fix reqs

* [Fix] fix load local dataset cache bug

* Revert "[Fix] fix load local dataset cache bug"

This reverts commit be3f2cc8338ff6e8d4f00ff2239d3360254fdba8.

* [Fix] delete load local dataset

* [Fix] fix get_dataset function args explain

---------

Co-authored-by: windreamer <windreamer@gmail.com>
Co-authored-by: tangzhiyi11 <tangzhiyi11@users.noreply.github.com>
Co-authored-by: JimyMa <33408125+JimyMa@users.noreply.github.com>
Co-authored-by: jinminxi104 <jinminxi104@hotmail.com>
5 participants