Pull requests: OpenSparseLLMs/Linear-MoE

Pull requests list

Param Counts in Qwen2
#17 by ysngki was merged Mar 25, 2025
Support megatron model evaluation using opencompass
#2 by LanDisen was merged Oct 12, 2024
[Eval] eval linear_attention linear moe
#3 by LanDisen was merged Oct 17, 2024
add inference_params in linear_rnn
#4 by LanDisen was merged Oct 18, 2024
add megablocks support for linear_moe_qwen2
#5 by Spico197 was merged Oct 29, 2024
Support expert parallel for Qwen2-MoE with MegaBlocks
#6 by Spico197 was merged Nov 12, 2024
implement four types of mixattn
#7 by JusenD was merged Dec 9, 2024
[Llama3] Able to run Llama3, support Mixattn in Llama3
#8 by JusenD was merged Dec 11, 2024
Support lm-evaluation-harness for linear-moe
#9 by LanDisen was merged Dec 12, 2024
pick and merge lasp2 related code into linear-moe
#10 by weigao266 was merged Dec 12, 2024
Pick and merge Llama3 branch
#11 by weigao266 was merged Dec 26, 2024
support mistral
#13 by JusenD was merged Feb 6, 2025
[MoM][Gated Deltanet] Support MoM and gated deltanet
#15 by JusenD was merged Mar 5, 2025
Adding RWKV-7 implementation to the Linear RNN model.
#19 by MerCury-Orbit was merged May 8, 2025
docs: update README.md
#14 by eltociear was closed Aug 20, 2025