
Support group router for moe models#4120

Merged
lvhan028 merged 2 commits into InternLM:main from RunningLeon:group-router
Nov 13, 2025

Conversation

@RunningLeon
Collaborator

Motivation

Support group router for moe models

Use cases (Optional)

pipeline

from lmdeploy import pipeline, GenerationConfig, PytorchEngineConfig

if __name__ == '__main__':
    backend_config = PytorchEngineConfig(hf_overrides=dict(router_n_groups=4))
    model_path = 'Qwen/Qwen3-30B-A3B'
    pipe = pipeline(model_path, backend_config=backend_config)

    resps = pipe(['Hi.'])
    for res in resps:
        print(res)

api server

lmdeploy serve api_server \
    Qwen/Qwen3-30B-A3B \
    --backend pytorch \
    --hf-overrides '{"router_n_groups": 4}'
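For intuition, group routing of the kind enabled by `router_n_groups` typically partitions the experts into groups and restricts top-k selection to the best-scoring groups. The sketch below is a hedged, illustrative NumPy version of that idea; the function name, parameters, and exact scoring rule are assumptions, and the actual lmdeploy implementation may differ:

```python
import numpy as np


def group_limited_topk(scores, n_groups=4, topk_groups=2, topk_experts=4):
    """Illustrative group-limited top-k routing for one token.

    scores: (num_experts,) router logits. Experts are split into
    `n_groups` contiguous groups; only the `topk_groups` groups with the
    highest per-group maximum score are eligible for top-k selection.
    """
    num_experts = scores.shape[0]
    assert num_experts % n_groups == 0
    group_size = num_experts // n_groups
    grouped = scores.reshape(n_groups, group_size)

    # Score each group by its best expert and keep the top groups.
    group_scores = grouped.max(axis=-1)
    keep_groups = np.argsort(group_scores)[-topk_groups:]

    # Mask out experts belonging to the discarded groups.
    masked = np.full_like(grouped, -np.inf)
    masked[keep_groups] = grouped[keep_groups]
    masked = masked.reshape(-1)

    # Standard top-k over the remaining experts.
    topk_ids = np.argsort(masked)[-topk_experts:]
    return np.sort(topk_ids)


if __name__ == '__main__':
    # With monotonically increasing scores, the last two groups win,
    # so the top-4 experts all come from groups 2 and 3.
    print(group_limited_topk(np.arange(16.0)))  # -> [12 13 14 15]
```

Restricting routing to a few groups bounds how many expert groups (and hence devices, in expert-parallel setups) each token can touch, which is the usual motivation for this routing scheme.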

Checklist

  1. Pre-commit or other linting tools are used to fix potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification depends on a newer version of downstream projects, this PR should be tested with all supported versions of those projects.
  4. The documentation has been modified accordingly, e.g. docstrings or example tutorials.

@lvhan028 lvhan028 added the enhancement New feature or request label Nov 12, 2025
@lvhan028 lvhan028 requested a review from grimoire November 13, 2025 03:40
@lvhan028 lvhan028 merged commit bf52c04 into InternLM:main Nov 13, 2025
5 checks passed
