Stars
MoE training for Me and You and maybe other people
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Accelerating MoE with IO and Tile-aware Optimizations
Utilities for writing performant, readable Triton and Gluon kernels
Unofficial description of the CUDA assembly (SASS) instruction sets.
LauzHack Deep Learning Bootcamp

