VerDee: Vertical Deep Network on Mamba-2 — staged LoRA (shallow/mid/deep), early-exit routing, and domain experts. Pilot on 370M; 16GB GPU friendly. Research/experiment repo.
lora state-space-model peft early-exit huggingface qlora llm-finetuning mamba2 vertical-deep-network
-
Updated
May 25, 2026 - Python