What's Changed
- Build the deepep package with the chip model included by @oagniqgnat in #274
- fix: buffer control by @Yael-X in #361
- Revert "Build the deepep package with the chip model included" by @kaniel-outis in #363
- reset ci -- run test mixed running for experts on a2. by @zhuyutong332 in #365
- adapt ant moving to A2 single machine by @luanyundu in #362
- Fix the bug that occurs when the total expert num is greater than 256 or the local expert num is less than 8 by @luanyundu in #364
- CI execution requirements for separating a2 and a3 by @zhuyutong332 in #367
- support qwen3.5 by @chenxu214 in #377
- Update layernorm_gated.py by @chenxu214 in #378
- GLM5 optimize by @cen121212 in #382
- [fix] Handle transposed w13_weight by @gjsheu in #357
- Fix the bug that the layout kernel crashed when the num of experts is no less than 384 by @luanyundu in #383
- revise causal_conv1d: bugfix and enhance accuracy for model kimilinear by @McZyWu in #370
- feat: [fused_sigmoid_gating_delta_rule_update_npu_kernel] support kda feature, to be aligned with sgl-kernel, for model kimi-linear by @McZyWu in #371
- Bump version to 2026.03.01 by @iforgetmyname in #388
New Contributors
- @chenxu214 made their first contribution in #377
- @gjsheu made their first contribution in #357
- @McZyWu made their first contribution in #370
Full Changelog: 2026.02.01...2026.03.01