
fix qwen3-vl-moe long context #4342

Merged
lvhan028 merged 3 commits into InternLM:main from grimoire:fix-vlm-longcontext
Feb 11, 2026
Conversation

@grimoire (Collaborator) commented Feb 9, 2026

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and easier to review. If you do not understand some items, don't worry: just open the pull request and seek help from the maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

Copilot AI review requested due to automatic review settings February 9, 2026 13:09
Contributor

Copilot AI left a comment


Pull request overview

This PR aims to fix long-context generation behavior by ensuring updated per-step model_metas (e.g., sequence length tracking) are correctly propagated through the PyTorch engine after a forward pass.

Changes:

  • Add a context.is_model_meta_updated flag in InternVL paths that (re)initialize context.model_metas.
  • Change the engine’s model_forward logic to only take model_metas from context when is_model_meta_updated is set.
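The flag-gated propagation described above can be sketched as a toy model. This is not the actual lmdeploy code: `StepContext` and `collect_model_metas` below are invented stand-ins that approximate the post-forward logic this PR changes in agent.py.

```python
from dataclasses import dataclass


@dataclass
class StepContext:
    # Minimal invented stand-in for lmdeploy's step context.
    model_metas: list = None
    is_model_meta_updated: bool = False


def collect_model_metas(context, updated_metas):
    """Approximate the new post-forward selection: prefer whatever the
    model's update_model_metas() returned, and override it with
    context.model_metas only when a forward pass explicitly flagged the
    context metas as updated."""
    model_metas = updated_metas
    if getattr(context, 'is_model_meta_updated', False):
        model_metas = context.model_metas
    return model_metas


# A VLM forward that (re)initializes metas sets the flag ...
ctx = StepContext(model_metas=[{'new_seqlen': 128}], is_model_meta_updated=True)
print(collect_model_metas(ctx, None))   # [{'new_seqlen': 128}]

# ... while a step that never set the flag keeps whatever
# update_model_metas() produced (here: None).
ctx2 = StepContext(model_metas=[{'stale': True}])
print(collect_model_metas(ctx2, None))  # None
```

The point of the flag is that stale `context.model_metas` from a previous step no longer silently override the model's own result.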

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Reviewed files:

  • lmdeploy/pytorch/models/internvl.py: marks context.model_metas as “updated” in several InternVL Flash/long-context paths.
  • lmdeploy/pytorch/engine/model_agent/agent.py: alters how model_forward decides whether to return model_metas from update_model_metas() or from context.model_metas.


    prev_lens = [meta.get('new_seqlen', 0) for meta in context.model_metas]
    for idx, meta in enumerate(context.model_metas):
        meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})

Copilot AI Feb 9, 2026


context.is_model_meta_updated is only set when initializing context.model_metas, but in the has_model_metas branch you mutate context.model_metas in-place (meta.update(...)) without setting the flag. With the updated model_forward logic (agent now relies on is_model_meta_updated), these in-place updates won’t be propagated to output['model_metas'], which can break long-context segmented prefill where new_seqlen must carry across segments. Set context.is_model_meta_updated = True whenever new_seqlen is changed, including the in-place update path.

Suggested change:

    -     meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})
    +     meta.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})
    +     context.is_model_meta_updated = True
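The failure mode this comment describes can be reproduced with a toy version of the two code paths. All names below are invented stand-ins for the code in internvl.py and agent.py, not the actual implementation; `fixed=True` models the suggested fix.

```python
class Ctx:
    """Minimal invented stand-in for the engine's step context."""

    def __init__(self, model_metas=None):
        self.model_metas = model_metas
        self.is_model_meta_updated = False


def forward_segment(ctx, new_seqlens, fixed=False):
    # Toy version of the InternVL branch pair quoted above.
    if ctx.model_metas is not None and ctx.model_metas[0] is not None:
        # has_model_metas branch: mutate the dicts in place
        prev_lens = [m.get('new_seqlen', 0) for m in ctx.model_metas]
        for idx, m in enumerate(ctx.model_metas):
            m.update({'new_seqlen': prev_lens[idx] + new_seqlens[idx]})
        if fixed:
            ctx.is_model_meta_updated = True  # the suggested fix
    else:
        # init branch: the only place the original diff sets the flag
        ctx.model_metas = [{'new_seqlen': s} for s in new_seqlens]
        ctx.is_model_meta_updated = True


def returned_metas(ctx):
    # Toy version of the flag-gated engine logic.
    return ctx.model_metas if ctx.is_model_meta_updated else None


# Segment 2 of a segmented prefill: metas already exist from segment 1.
ctx = Ctx(model_metas=[{'new_seqlen': 512}])
forward_segment(ctx, [256])
print(returned_metas(ctx))              # None: the accumulated seqlen is dropped

ctx = Ctx(model_metas=[{'new_seqlen': 512}])
forward_segment(ctx, [256], fixed=True)
print(returned_metas(ctx))              # [{'new_seqlen': 768}]
```

Without the flag set on the in-place path, the mutated metas never reach the scheduler, so the next segment starts from a stale new_seqlen.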

Comment on lines +895 to +898
    else:
        # init model metas
        context.model_metas = [{'new_seqlen': seqlen} for seqlen in seq_lengths.tolist()]
        context.is_model_meta_updated = True

Copilot AI Feb 9, 2026


In the long-context prefill path, context.model_metas is updated in the has_model_metas branch above via in-place meta.update(...), but context.is_model_meta_updated is only set when initializing model metas here. With model_forward now keying off is_model_meta_updated, the updated new_seqlen may not be returned to the scheduler on subsequent segments. Set the flag when new_seqlen is updated in the has_model_metas branch as well (or otherwise ensure updated metas are returned).

    - if context.model_metas is not None and context.model_metas[0] is not None:
    + # if context.model_metas is not None and context.model_metas[0] is not None:
    + if getattr(context, 'is_model_meta_updated', False):
          model_metas = context.model_metas

Copilot AI Feb 9, 2026


Switching the post-forward override to only use context.model_metas when context.is_model_meta_updated is true can drop model_metas for models that don’t implement update_model_metas (it returns None) and rely on context.model_metas being carried through the engine. Consider adding a safe fallback such as using context.model_metas when model_metas is None and context.model_metas is present/non-empty, in addition to (or instead of) the flag, to avoid regressions.

Suggested change:

    -     model_metas = context.model_metas
    +     model_metas = context.model_metas
    + elif model_metas is None:
    +     # Fallback: if update_model_metas returned None but context has model_metas,
    +     # reuse context.model_metas to avoid dropping metadata for models
    +     # that rely on it being carried through the engine.
    +     ctx_model_metas = getattr(context, 'model_metas', None)
    +     if ctx_model_metas:
    +         model_metas = ctx_model_metas
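With this suggested fallback, the post-forward selection would behave roughly as follows. This is a sketch: `Ctx` is an invented stand-in, and `select_model_metas` mirrors the logic of the suggestion above rather than the actual lmdeploy code.

```python
class Ctx:
    """Minimal invented stand-in for the engine's step context."""

    def __init__(self, model_metas=None, updated=False):
        self.model_metas = model_metas
        self.is_model_meta_updated = updated


def select_model_metas(context, model_metas):
    # `model_metas` holds whatever the model's update_model_metas() returned.
    if getattr(context, 'is_model_meta_updated', False):
        model_metas = context.model_metas
    elif model_metas is None:
        # Fallback: reuse context.model_metas for models that never set the
        # flag but rely on metas being carried through the engine.
        ctx_model_metas = getattr(context, 'model_metas', None)
        if ctx_model_metas:
            model_metas = ctx_model_metas
    return model_metas


# Model without update_model_metas(): falls back to the context's metas.
print(select_model_metas(Ctx([{'new_seqlen': 64}]), None))      # [{'new_seqlen': 64}]

# Model that did return fresh metas: the context's stale copy is ignored.
print(select_model_metas(Ctx([{'stale': 1}]), [{'fresh': 1}]))  # [{'fresh': 1}]
```

The fallback only fires when the model returned nothing, so it cannot reintroduce the stale-override problem the flag was added to fix.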

Comment on lines 181 to +183
      # InternVL-3.5-Flash will change the seqlen, model_metas during forward
    - if context.model_metas is not None and context.model_metas[0] is not None:
    + # if context.model_metas is not None and context.model_metas[0] is not None:
    + if getattr(context, 'is_model_meta_updated', False):

Copilot AI Feb 9, 2026


The commented-out if context.model_metas ... line is dead code now and makes the control flow harder to read. Consider removing it and updating the surrounding comment to document the new intended behavior (when context.model_metas should override update_model_metas).

Comment on lines 181 to 185
      # InternVL-3.5-Flash will change the seqlen, model_metas during forward
    - if context.model_metas is not None and context.model_metas[0] is not None:
    + # if context.model_metas is not None and context.model_metas[0] is not None:
    + if getattr(context, 'is_model_meta_updated', False):
          model_metas = context.model_metas
      output['model_metas'] = model_metas

Copilot AI Feb 9, 2026


The PR description is still the default template and doesn’t explain the motivation/repro for the “qwen3-vl-moe long context” issue or how these model_metas changes address it. Please add a brief reproduction scenario and expected/actual behavior so reviewers can validate the fix and assess risk.

    output = dict(hidden_states=output)
    # InternVL-3.5-Flash will change the seqlen, model_metas during forward
    if context.model_metas is not None and context.model_metas[0] is not None:
    # if context.model_metas is not None and context.model_metas[0] is not None:
@CUHKSZzxy (Collaborator) commented Feb 11, 2026


This commented-out line can be deleted. Tested on qwen3vl and internvl-3.5-flash with a large image context; functionality looks good.

@lvhan028 lvhan028 merged commit cb0af3f into InternLM:main Feb 11, 2026
5 checks passed