Releases: sgl-project/sgl-kernel-npu

2026.03.01

01 Mar 15:01
70806de

What's Changed

  • Build the deepep package with the chip model included. by @oagniqgnat in #274
  • fix: buffer control by @Yael-X in #361
  • Revert "Build the deepep package with the chip model included." by @kaniel-outis in #363
  • reset ci -- run test mixed running for experts on a2. by @zhuyutong332 in #365
  • adapt ant moving to A2 single machine by @luanyundu in #362
  • Fix the bug that occurs when the total expert num is greater than 256 or the local expert num is less than 8 by @luanyundu in #364
  • CI execution requirements for separating a2 and a3 by @zhuyutong332 in #367
  • support qwen3.5 by @chenxu214 in #377
  • Update layernorm_gated.py by @chenxu214 in #378
  • GLM5 optimize by @cen121212 in #382
  • [fix] Handle transposed w13_weight by @gjsheu in #357
  • Fix the layout kernel crash when the number of experts is at least 384 by @luanyundu in #383
  • Revise causal_conv1d: bugfix and accuracy improvement for the kimi-linear model by @McZyWu in #370
  • feat: [fused_sigmoid_gating_delta_rule_update_npu_kernel] support the kda feature, aligned with sgl-kernel, for the kimi-linear model by @McZyWu in #371
  • Bump version to 2026.03.01 by @iforgetmyname in #388

Full Changelog: 2026.02.01...2026.03.01

2026.02.01.post2

15 Feb 17:00
b7e88d6

2026.02.01.post2 Pre-release

Full Changelog: 2026.02.01...2026.02.01.post2

2026.02.01.post1

11 Feb 15:01
c726cd8

2026.02.01.post1 Pre-release

Full Changelog: 2026.02.01...2026.02.01.post1

2026.02.01

02 Feb 19:17
ba46a30

Full Changelog: 2025120...2026.02.01

2026.01.28

28 Jan 03:29
2c77463

2026.01.28 Pre-release

What's Changed

  • Added the low_latency operator API documentation. by @oagniqgnat in #337
  • The environment variable DEEPEP_HCCL_BUFFSIZE is added by @zzx-study in #329
  • chunk_gated_delta_rule_npu output final state by @RuixuanZhang06 in #341
  • Support the case where topk may be -1 on A3 machines by @luanyundu in #313
  • Add AscendC triangular inverse by @zouzias in #332
  • (test) add solve_tril from upstream by @zouzias in #339
  • [Doc] Improved README.md content and English grammar and integrated the DeepWiki badge for Ask AI by @Mitchell-xiyunfeng in #345
  • add function for deep-ep tests by @zhuyutong332 in #301
  • Support x86_64 and aarch64 binary release by @iforgetmyname in #325

New Contributors

  • @zzx-study made their first contribution in #329
  • @zouzias made their first contribution in #332
  • @Mitchell-xiyunfeng made their first contribution in #345

Full Changelog: 2026.01.21...2026.01.28

2026.01.21

21 Jan 08:06
46b73de

2026.01.21 Pre-release
Added the verification of num_max_dispatch_tokens_per_rank to the dec…

2026.01.19

19 Jan 04:00
38ad69d

2026.01.19 Pre-release

Full Changelog: 2026.01.12...2026.01.19

2026.01.12

12 Jan 11:22
25542f2

2026.01.12 Pre-release

Full Changelog: 2026.01.09...2026.01.12

2026.01.09

09 Jan 02:59
ea4949d

2026.01.09 Pre-release

Full Changelog: 2026.01.07...2026.01.09

2026.01.07

07 Jan 08:09
bacee3f

2026.01.07 Pre-release

What's Changed

  • LoRA: Optimize LoRA kernels and refactor by @vlserov in #284
  • Support build with cann 8.5 by @BourneSun0527 in #283
  • Added an environment variable to control whether to enable the Combine Ant Migration feature. by @oagniqgnat in #304
  • Supplement A2 doc, software and hardware compatibility info by @zuje123 in #294
  • Fix the partial initialization bug of numTokensPerExpertTensor in layout by @zuje123 in #303

Full Changelog: 2025.12.31...2026.01.07