2026.03.01

@iforgetmyname released this 01 Mar 15:01
· 5 commits to main since this release
70806de

What's Changed

  • Build the deepep package with the chip model included. by @oagniqgnat in #274
  • fix:buffer control by @Yael-X in #361
  • Revert " Build the deepep package with the chip model included." by @kaniel-outis in #363
  • reset ci -- run test mixed running for experts on a2. by @zhuyutong332 in #365
  • adapt ant moving to A2 single machine by @luanyundu in #362
  • Fix the bug that total expert num greater than 256 or local expert num is less than 8 by @luanyundu in #364
  • CI execution requirements for separating a2 and a3 by @zhuyutong332 in #367
  • support qwen3.5 by @chenxu214 in #377
  • Update layernorm_gated.py by @chenxu214 in #378
  • GLM5 optimize by @cen121212 in #382
  • [fix] Handle transposed w13_weight by @gjsheu in #357
  • Fix the bug that the layout kernel crashed when the num of experts is no less than 384 by @luanyundu in #383
  • revise causal_conv1d: bugfix and enhance accuracy for model kimilinear by @McZyWu in #370
  • feat:[fused_sigmoid_gating_delta_rule_update_npu_kernel] support kda feature--to be aligned with sgl-kernel, for model kimi-linear by @McZyWu in #371
  • Bump version to 2026.03.01 by @iforgetmyname in #388

Full Changelog: 2026.02.01...2026.03.01