Qwen3 next #4039

Merged
lvhan028 merged 20 commits into InternLM:main from grimoire:qwen3-next
Nov 7, 2025
@grimoire
Collaborator

Qwen3-Next requires kernels from:

https://github.com/Dao-AILab/causal-conv1d
https://github.com/fla-org/flash-linear-attention

We need an environment check for different model-device combinations.
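
A minimal sketch of what such an environment check could look like, assuming the two kernel packages are importable as `causal_conv1d` and `fla` (the import names are an assumption here, as are the helper names `find_missing_modules` and `check_env`; none of this is taken from the PR itself):

```python
import importlib.util

# Assumed import names for the two kernel packages, mapped to their repos.
REQUIRED_KERNELS = {
    "causal_conv1d": "https://github.com/Dao-AILab/causal-conv1d",
    "fla": "https://github.com/fla-org/flash-linear-attention",
}


def find_missing_modules(required):
    """Return the top-level module names that cannot be found."""
    return [name for name in required if importlib.util.find_spec(name) is None]


def check_env(required=REQUIRED_KERNELS):
    """Raise ImportError with install hints if any required kernel is missing."""
    missing = find_missing_modules(required)
    if missing:
        hints = ", ".join(f"{name} ({required[name]})" for name in missing)
        raise ImportError(f"Missing kernel packages: {hints}")
```

A per-device variant could keep one such mapping per model-device combination and call `check_env` with the matching one before loading the model.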

@lvhan028 lvhan028 requested a review from windreamer October 15, 2025 07:47
@lvhan028 lvhan028 added the enhancement New feature or request label Oct 15, 2025
Collaborator

@windreamer windreamer left a comment

Do we need to consider integrating the SSM cache pool into PD migration requests?

@lvhan028
Collaborator

lvhan028 commented Nov 1, 2025

OpenCompass evaluation failed:

2025-11-01 22:35:12,565 - lmdeploy - ERROR - engine.py:1234 - exception happened: <class 'IndexError'> index 1032 is out of bounds for axis 0 with size 1032
Traceback (most recent call last):
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 1229, in async_loop
    await self._async_loop_main(resp_que=resp_que,
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 1128, in _async_loop_main
    forward_inputs, next_running = await inputs_maker.prefetch_next_inputs()
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 320, in prefetch_next_inputs
    return await self._send_next_inputs_impl(prefill, True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 285, in _send_next_inputs_impl
    forward_inputs = self._make_forward_inputs(prefill, enable_empty)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 227, in _make_forward_inputs
    return self.engine._make_forward_inputs(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/engine/engine.py", line 897, in _make_forward_inputs
    scheduler_output = scheduler.schedule(is_prefill=prefill, prealloc_size=prealloc_size)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/scheduler.py", line 299, in schedule
    output = self._schedule_prefill(0)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/utils.py", line 271, in __func_warpper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/scheduler.py", line 232, in _schedule_prefill
    if not __evict_for_seq(seq, waiting):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/scheduler.py", line 213, in __evict_for_seq
    return eviction_helper.evict_for_seq(seq, evictable, prealloc_size)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/eviction_helper/recompute_eviction_helper.py", line 74, in _evict_for_ssm
    state_manager.free(evict_seq)
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/state_manager.py", line 55, in free
    self.allocator.free(seq.logical_state)
  File "/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/paging/state_manager.py", line 29, in free
    self._free_states[num_used] = state_id
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^
IndexError: index 1032 is out of bounds for axis 0 with size 1032

Serving command:

 lmdeploy serve api_server Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4
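
The traceback ends in a write one slot past the end of a free list. The thread does not show the actual fix, so the following is only a hypothetical sketch of how a stack-style free list can guard against that class of error (for example, a slot being freed twice). The class name `SlotAllocator` and its fields are illustrative and are not lmdeploy's implementation:

```python
class SlotAllocator:
    """Hypothetical stack-style free list over a fixed number of state slots."""

    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        # Entries [0, _num_free) of _free_slots hold the currently free ids.
        self._free_slots = list(range(num_slots))
        self._num_free = num_slots
        self._allocated = set()

    def allocate(self) -> int:
        if self._num_free == 0:
            raise RuntimeError("no free state slots")
        self._num_free -= 1
        slot = self._free_slots[self._num_free]
        self._allocated.add(slot)
        return slot

    def free(self, slot: int) -> None:
        # Without this guard, a double free would eventually write at
        # index == num_slots, which is out of bounds for the free list.
        if slot not in self._allocated:
            raise RuntimeError(f"slot {slot} is not allocated (double free?)")
        self._allocated.remove(slot)
        self._free_slots[self._num_free] = slot
        self._num_free += 1
```

Failing loudly at the first invalid `free` makes the offending call site visible, instead of surfacing later as an out-of-bounds index.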

@lvhan028 lvhan028 self-requested a review November 1, 2025 14:36
@grimoire
Collaborator Author

grimoire commented Nov 2, 2025

Fixed

@lvhan028
Collaborator

lvhan028 commented Nov 3, 2025

The OpenCompass (OC) evaluation failed because quite a few prompts from the aime2025 dataset got stuck in repetitive generation.

@grimoire
Collaborator Author

grimoire commented Nov 3, 2025

Fixed

@lvhan028
Collaborator

lvhan028 commented Nov 5, 2025

mmlu_pro scored lower than the accuracy reported on the OpenCompass academic leaderboard:

dataset                       version    metric                        mode    qwe3-next-instruct
----------------------------  ---------  ----------------------------  ------  --------------------
core_average                  -          -                             -       -
                              -          -                             -       -
Instruction Following         -          -                             -       -
IFEval                        353ae7     Prompt-level-strict-accuracy  gen     80.22
                              -          -                             -       -
General Reasoning             -          -                             -       -
hle_llmjudge                  6ff468     accuracy                      gen     8.90
GPQA_diamond_repeat_4         772ea0     accuracy (4 runs average)     gen     74.49
                              -          -                             -       -
Math Calculation              -          -                             -       -
aime2025_repeat_32            5e9f4f     accuracy (32 runs average)    gen     70.00
                              -          -                             -       -
Knowledge                     -          -                             -       -
mmlu_pro                      -          naive_average                 gen     79.51
                              -          -                             -       -
Code                          -          -                             -       -
lcb_code_generation_repeat_6  -          -                             -       -

Leaderboard scores from https://rank.opencompass.org.cn/leaderboard-llm-academic/?m=REALTIME
Qwen3-Next-80B-A3B-Instruct: IFEval(87.6), mmlu_pro(81.3), GPQA_diamond(74.1), HLE(8), aime2025(69.2)
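
The gaps between this run and the leaderboard can be tabulated with a short script. This sketch assumes the local dataset names map onto the leaderboard names as shown (for example `hle_llmjudge` to HLE, and the `_repeat_N` averages to the single leaderboard figure); the helper `score_gaps` is illustrative:

```python
# Scores from the run reported above (Nov 5).
measured = {
    "IFEval": 80.22,
    "mmlu_pro": 79.51,
    "GPQA_diamond": 74.49,
    "HLE": 8.90,
    "aime2025": 70.00,
}

# Scores quoted from the OpenCompass academic leaderboard.
leaderboard = {
    "IFEval": 87.6,
    "mmlu_pro": 81.3,
    "GPQA_diamond": 74.1,
    "HLE": 8.0,
    "aime2025": 69.2,
}


def score_gaps(measured, reference):
    """Return measured minus reference for every dataset present in both."""
    return {
        name: round(measured[name] - reference[name], 2)
        for name in reference
        if name in measured
    }


gaps = score_gaps(measured, leaderboard)
# Print the largest shortfall first.
for name, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{name:14s} {gap:+.2f}")
```

Sorting by gap puts the datasets that fall furthest below the reference at the top, which helps decide where to look first.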

@lvhan028
Collaborator

lvhan028 commented Nov 7, 2025

dataset                       version    metric                        mode    qwe3-next-instruct
----------------------------  ---------  ----------------------------  ------  --------------------
core_average                  -          naive_average                 gen     62.64
                              -          -                             -       -
Instruction Following         -          -                             -       -
IFEval                        353ae7     Prompt-level-strict-accuracy  gen     87.43
                              -          -                             -       -
General Reasoning             -          -                             -       -
hle_llmjudge                  6ff468     accuracy                      gen     9.04
GPQA_diamond_repeat_4         772ea0     accuracy (4 runs average)     gen     73.36
                              -          -                             -       -
Math Calculation              -          -                             -       -
aime2025_repeat_32            5e9f4f     accuracy (32 runs average)    gen     69.38
                              -          -                             -       -
Knowledge                     -          -                             -       -
mmlu_pro                      -          naive_average                 gen     81.19
                              -          -                             -       -
Code                          -          -                             -       -
lcb_code_generation_repeat_6  b5b6c5     pass@1 (6 runs average)       gen     55.43
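
As a sanity check on the table above, the `core_average` row is consistent with an unweighted mean of the six dataset scores. This sketch assumes that is what `naive_average` means here; the helper name is illustrative:

```python
# Per-dataset scores from the Nov 7 table.
scores = {
    "IFEval": 87.43,
    "hle_llmjudge": 9.04,
    "GPQA_diamond_repeat_4": 73.36,
    "aime2025_repeat_32": 69.38,
    "mmlu_pro": 81.19,
    "lcb_code_generation_repeat_6": 55.43,
}


def naive_average(values):
    """Unweighted arithmetic mean of the given scores."""
    values = list(values)
    return sum(values) / len(values)


core_average = round(naive_average(scores.values()), 2)
print(core_average)  # matches the reported core_average of 62.64
```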

@lvhan028 lvhan028 merged commit 281e101 into InternLM:main Nov 7, 2025
5 checks passed
@lvhan028
Collaborator

lvhan028 commented Nov 7, 2025

cc @zhulinJulia24
