mamba2

Here are 9 public repositories matching this topic...

pathcosmos / EVAFRILL-Mo

Hybrid Mamba-2 + Transformer 2.94B LLM (Nemotron-H style) — Korean 3B model pretrained from scratch on 7× NVIDIA B200 GPUs with SFT + DPO alignment

transformer sft dpo pretraining fp8 korean-llm nemotron hybrid-architecture mamba2 nvidia-b200

Updated Mar 26, 2026
Python

gxcsoccer / alloy

Hybrid SSM-Attention language model on Apple Silicon with MLX — interleaving Mamba-2 and Transformer for efficient inference

python machine-learning deep-learning transformer attention language-model ssm mamba hybrid-model mlx state-space-model apple-silicon llm mamba2

Updated Mar 29, 2026
Python

hujiyo / Pawlette

A complete testing and construction project for a small-parameter recurrent language model based on the Mamba2 architecture. It is intended to be built with Mano, a new optimizer.

mano mamba2

Updated Feb 7, 2026
Python

Harp404 / lume-hybrid

A 390M-parameter Mamba2 + Differential Attention hybrid language model

nlp deep-learning language-model hybrid-architecture mamba2 differential-attention pytorch-state-space-model

Updated May 8, 2026
Python

1634594707 / VerDee

VerDee: Vertical Deep Network on Mamba-2 — staged LoRA (shallow/mid/deep), early-exit routing, and domain experts. Pilot on 370M; 16GB GPU friendly. Research/experiment repo.

lora state-space-model peft early-exit huggingface qlora llm-finetuning mamba2 vertical-deep-network

Updated May 25, 2026
Python

Dopove / Glacier

GLACIER: Mamba with infinite memory. This project integrates the Mamba SSM with ICE-Lite, a virtual memory engine, to solve context rot. By adding persistent, time-aware memory, GLACIER gives Mamba the long-term recall of a Transformer while retaining its $O(N)$ speed. Apache 2.0 licensed, by Dopove.

Updated May 22, 2026
Python

varad-more / mamba-from-scratch

Build Your Own Mamba — From Math to Metal

pytorch triton from-scratch ssm mamba state-space-models selective-scan mamba2

Updated Apr 22, 2026
Python

DONGRYEOLLEE1 / lm-architecture-lab

A research notebook on competing sequence modeling paradigms: implement, benchmark, and understand the architectural bets.

benchmark transformer mps mamba xlstm lfm2 mamba2 gemma4

Updated Apr 27, 2026
Python

wisnunugroho21 / nugie-jax-mamba

A simple, minimalistic, and explainable JAX implementation of Mamba 2 & Mamba 3

deep-learning neural-network transformer mamba jax llm mamba-state-space-models mamba-2 mamba2 mamba-ssm mamba-3

Updated May 10, 2026
Python
