feat: add pytorch_engine_qwen2_5vl_sm120 #3750

Merged
lvhan028 merged 6 commits into InternLM:main from kolmogorov-quyet:feature/pytorch_engine_qwen2_5vl_sm120
Jul 24, 2025

Conversation

@kolmogorov-quyet
Contributor

Motivation

  • Silence spurious CancelledError logs from asyncio.run_coroutine_threadsafe when the future is cancelled.
  • Improve FlashAttention kernel meta-tuning for newer GPUs (adds an SM 120 / Blackwell path).
  • Keep the repository clean with extra .gitignore entries.
  • Add a minimal smoke test for the new engine to ensure CI coverage.

Modifications

  • .gitignore — Ignore builder/, lmdeploy/lib/, IDE caches, etc.
  • lmdeploy/pytorch/kernels/cuda/flashattention.py — Added an SM 120 kernel-meta helper and refactored the meta-selection logic (~40 LoC).
  • lmdeploy/serve/async_engine.py — Replaced lambda f: f.result() with a safe callback: lambda f: None if f.cancelled() else f.result()
  • tests/test.py — New smoke test: init the engine, run one caption generation, assert non-empty output.

Backward Compatibility

No API breaks; existing engines and interfaces continue to work.

Use Cases

  • Cleaner logs under heavy load: no more CancelledError spam.
  • Better kernel parameters for Blackwell GPUs (RTX 50xx, etc.).
  • Quick regression guard via the new unit test.

Checklist

  • Code formatted / linted
  • Unit tests added & pass (pytest -q)
  • Documentation updated if necessary
  • Ready for review


num_warps = 4
if _nv_cap[0] < 8:
if _nv_cap[0] >= 12: # Blackwell (sm_120 etc.)
Collaborator

Would you please follow the original style to enable Blackwell support when _nv_cap[0] < 13 and put this branch to the end of the if-elif-else block? I think this will help the community better understand the code.

@windreamer
Collaborator

And you can try to install pre-commit to help you fix lint errors.

Thank you for your efforts!
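For reference, the standard pre-commit setup being suggested (run from the repository root; the hooks themselves come from the repo's .pre-commit-config.yaml):

```shell
pip install pre-commit
pre-commit install          # run the hooks automatically on each git commit
pre-commit run --all-files  # lint/format the whole tree once
```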

@kolmogorov-quyet
Contributor Author

Thank you for your helpful review and suggestions 🙏
I've updated the code to follow the original style for better clarity, as you mentioned.
Also, I’ve set up pre-commit and made sure it passes all hooks, to avoid common PEP8 issues.

Please take a look when you have time — really appreciate your support!

@windreamer windreamer requested review from grimoire and lvhan028 July 23, 2025 17:14
test.py Outdated
from lmdeploy import PytorchEngineConfig, pipeline
from lmdeploy.vl import load_image

backend_config = PytorchEngineConfig(session_len=16384)
Collaborator

@lvhan028 any advice about the tests?

Collaborator

Hi, @kolmogorov-quyet
We appreciate your contribution. Just a quick note—the lmdeploy/tests directory is intended for unit test cases rather than functional testing.
We've already integrated Qwen2.5-VL model testing into lmdeploy's functional test suite, so you can safely remove this test file.


@grimoire grimoire left a comment

LGTM

@lvhan028 lvhan028 added the enhancement (New feature or request) label Jul 24, 2025
@lvhan028 lvhan028 merged commit 5bea8f5 into InternLM:main Jul 24, 2025
5 checks passed

4 participants