[RoadMap] SGL-Kernel-NPU binary release #258

@iforgetmyname

Background

We are happy to announce that SGL-Kernel-NPU has reached 100k+ lines of code 🎉

However, this growth means compilation now takes a long time, which the prepare stage of pr-test-npu in the SGLang repository cannot afford. To fix this, SGL-Kernel-NPU will gradually switch to binary releases from now on. Here is the roadmap for this project.

Naming Conventions

SGL-Kernel-NPU will be released as zipped packages, each containing three Python packages:

  • DeepEP on NPU
  • Torch-Memory-Saver on NPU
  • SGL-Kernel-NPU

Based on our current package dependencies, each zipped package will be named as follows:

sgl-kernel-npu-torch{TORCH_VERSION}-py{PYTHON_VERSION}-cann{CANN_VERSION}-{NPU_PLATFORM}-{OS_ARCHITECTURE}.zip

e.g.
sgl-kernel-npu-torch2.8.0-py311-cann8.5.0-910b-aarch64.zip
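
For CI scripts that need to pick the right archive, the naming scheme above could be built and parsed roughly as follows. This is a minimal sketch, not project tooling: `PackageTag`, `build_package_name`, and `parse_package_name` are hypothetical names, and the regex assumes that no individual field contains a hyphen.

```python
import re
from typing import NamedTuple


class PackageTag(NamedTuple):
    """Fields of the zipped package name (hypothetical helper type)."""
    torch_version: str    # e.g. "2.8.0"
    python_version: str   # e.g. "311"
    cann_version: str     # e.g. "8.5.0"
    npu_platform: str     # e.g. "910b"
    os_architecture: str  # e.g. "aarch64"


# Assumption: no field contains "-", so "-" can be used as the separator.
_NAME_RE = re.compile(
    r"^sgl-kernel-npu"
    r"-torch(?P<torch_version>[^-]+)"
    r"-py(?P<python_version>[^-]+)"
    r"-cann(?P<cann_version>[^-]+)"
    r"-(?P<npu_platform>[^-]+)"
    r"-(?P<os_architecture>[^-]+)\.zip$"
)


def build_package_name(tag: PackageTag) -> str:
    """Format a tag into the zip file name described in this issue."""
    return (
        f"sgl-kernel-npu-torch{tag.torch_version}"
        f"-py{tag.python_version}"
        f"-cann{tag.cann_version}"
        f"-{tag.npu_platform}"
        f"-{tag.os_architecture}.zip"
    )


def parse_package_name(name: str) -> PackageTag:
    """Split a zip file name back into its fields, or raise ValueError."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid SGL-Kernel-NPU package name: {name!r}")
    return PackageTag(**match.groupdict())


example = "sgl-kernel-npu-torch2.8.0-py311-cann8.5.0-910b-aarch64.zip"
tag = parse_package_name(example)
print(tag)
print(build_package_name(tag) == example)
```

Parsing and rebuilding the example name round-trips to the same string, which is a cheap sanity check to run before downloading an archive.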

Key Milestones

  • Unify open-source licenses
  • Unify release versioning on the YYYY.MM.DD format
  • Switch to pyproject.toml for the package description and deprecate the outdated setup.py
