fix qwen3-vl-moe long context by grimoire · Pull Request #4342 · InternLM/lmdeploy

grimoire · 2026-02-09T13:09:02Z

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily receiving feedbacks. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

Pre-commit or other linting tools are used to fix the potential lint issues.
The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
The documentation has been modified accordingly, like docstring or example tutorials.

Copilot

Pull request overview

This PR aims to fix long-context generation behavior by ensuring updated per-step model_metas (e.g., sequence length tracking) are correctly propagated through the PyTorch engine after a forward pass.

Changes:

Add a context.is_model_meta_updated flag in InternVL paths that (re)initialize context.model_metas.
Change the engine’s model_forward logic to only take model_metas from context when is_model_meta_updated is set.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
`lmdeploy/pytorch/models/internvl.py`	Marks `context.model_metas` as “updated” in several InternVL Flash/long-context paths.
`lmdeploy/pytorch/engine/model_agent/agent.py`	Alters how `model_forward` decides whether to return `model_metas` from `update_model_metas()` vs `context.model_metas`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-09T13:13:16Z

lmdeploy/pytorch/models/internvl.py

            prev_lens = [meta.get('new_seqlen', 0) for meta in context.model_metas]

            for idx, meta in enumerate(context.model_metas):
                meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})


context.is_model_meta_updated is only set when initializing context.model_metas, but in the has_model_metas branch you mutate context.model_metas in-place (meta.update(...)) without setting the flag. With the updated model_forward logic (agent now relies on is_model_meta_updated), these in-place updates won’t be propagated to output['model_metas'], which can break long-context segmented prefill where new_seqlen must carry across segments. Set context.is_model_meta_updated = True whenever new_seqlen is changed, including the in-place update path.

Suggested change

meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})

meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})

context.is_model_meta_updated = True

Copilot · 2026-02-09T13:13:17Z

lmdeploy/pytorch/models/internvl.py

                else:
                    # init model metas
                    context.model_metas = [{'new_seqlen': seqlen} for seqlen in seq_lengths.tolist()]
+                    context.is_model_meta_updated = True


In the long-context prefill path, context.model_metas is updated in the has_model_metas branch above via in-place meta.update(...), but context.is_model_meta_updated is only set when initializing model metas here. With model_forward now keying off is_model_meta_updated, the updated new_seqlen may not be returned to the scheduler on subsequent segments. Set the flag when new_seqlen is updated in the has_model_metas branch as well (or otherwise ensure updated metas are returned).

Copilot · 2026-02-09T13:13:17Z

lmdeploy/pytorch/engine/model_agent/agent.py

-            if context.model_metas is not None and context.model_metas[0] is not None:
+            # if context.model_metas is not None and context.model_metas[0] is not None:
+            if getattr(context, 'is_model_meta_updated', False):
                model_metas = context.model_metas


Switching the post-forward override to only use context.model_metas when context.is_model_meta_updated is true can drop model_metas for models that don’t implement update_model_metas (it returns None) and rely on context.model_metas being carried through the engine. Consider adding a safe fallback such as using context.model_metas when model_metas is None and context.model_metas is present/non-empty, in addition to (or instead of) the flag, to avoid regressions.

Suggested change

model_metas = context.model_metas

model_metas = context.model_metas

elif model_metas is None:

# Fallback: if update_model_metas returned None but context has model_metas,

# reuse context.model_metas to avoid dropping metadata for models

# that rely on it being carried through the engine.

ctx_model_metas = getattr(context, 'model_metas', None)

if ctx_model_metas:

model_metas = ctx_model_metas

Copilot · 2026-02-09T13:13:17Z

lmdeploy/pytorch/engine/model_agent/agent.py

            # InternVL-3.5-Flash will change the seqlen, model_metas during forward
-            if context.model_metas is not None and context.model_metas[0] is not None:
+            # if context.model_metas is not None and context.model_metas[0] is not None:
+            if getattr(context, 'is_model_meta_updated', False):


The commented-out if context.model_metas ... line is dead code now and makes the control flow harder to read. Consider removing it and updating the surrounding comment to document the new intended behavior (when context.model_metas should override update_model_metas).

Copilot · 2026-02-09T13:13:18Z

lmdeploy/pytorch/engine/model_agent/agent.py

            # InternVL-3.5-Flash will change the seqlen, model_metas during forward
-            if context.model_metas is not None and context.model_metas[0] is not None:
+            # if context.model_metas is not None and context.model_metas[0] is not None:
+            if getattr(context, 'is_model_meta_updated', False):
                model_metas = context.model_metas
            output['model_metas'] = model_metas


The PR description is still the default template and doesn’t explain the motivation/repro for the “qwen3-vl-moe long context” issue or how these model_metas changes address it. Please add a brief reproduction scenario and expected/actual behavior so reviewers can validate the fix and assess risk.

CUHKSZzxy · 2026-02-11T04:22:55Z

lmdeploy/pytorch/engine/model_agent/agent.py

                output = dict(hidden_states=output)
            # InternVL-3.5-Flash will change the seqlen, model_metas during forward
-            if context.model_metas is not None and context.model_metas[0] is not None:
+            # if context.model_metas is not None and context.model_metas[0] is not None:


May delete this line. Tested on qwen3vl and internvl-3.5-flash with a large image context, functionality looks good.

fix qwen3-vl-moe long context

3448d60

Copilot AI review requested due to automatic review settings February 9, 2026 13:09

Copilot started reviewing on behalf of grimoire February 9, 2026 13:09 View session

Copilot AI reviewed Feb 9, 2026

View reviewed changes

update comment

aae75ac

lvhan028 requested a review from CUHKSZzxy February 10, 2026 12:13

lvhan028 added the Bug:P1 label Feb 10, 2026

CUHKSZzxy approved these changes Feb 11, 2026

View reviewed changes

cache_config

3be3788

lvhan028 merged commit cb0af3f into InternLM:main Feb 11, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix qwen3-vl-moe long context#4342

fix qwen3-vl-moe long context#4342
lvhan028 merged 3 commits intoInternLM:mainfrom
grimoire:fix-vlm-longcontext

grimoire commented Feb 9, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

Copilot AI Feb 9, 2026

Uh oh!

CUHKSZzxy Feb 11, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})
	meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})
	context.is_model_meta_updated = True

-                model_metas = context.model_metas
+                model_metas = context.model_metas
+            elif model_metas is None:
+                # Fallback: if update_model_metas returned None but context has model_metas,
+                # reuse context.model_metas to avoid dropping metadata for models
+                # that rely on it being carried through the engine.
+                ctx_model_metas = getattr(context, 'model_metas', None)
+                if ctx_model_metas:
+                    model_metas = ctx_model_metas

Conversation

grimoire commented Feb 9, 2026

Motivation

Modification

BC-breaking (Optional)

Use cases (Optional)

Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

Uh oh!

CUHKSZzxy Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

CUHKSZzxy Feb 11, 2026 •

edited

Loading