Stars
This repository provides the official implementation of CodeQuant (ICLR, 2026), a unified clustering and quantization framework for Mixture-of-Experts (MoE) Large Language Models (LLMs), addressing…
raullenchai / awesome-mlx
Forked from antranapp/awesome-mlx
A curated list of awesome projects, tools, and resources for Apple MLX, the ML framework for Apple Silicon
[CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
REAP expert pruning for MoE LLMs on Apple Silicon via MLX
REAP: Router-weighted Expert Activation Pruning for SMoE compression
An MLX port of Meta's Coconut reasoning model
Robust recipes to align language models with human and AI preferences
The official implementation of Self-Play Fine-Tuning (SPIN)
The AILuminate v1.1 benchmark suite is an AI risk assessment benchmark developed with broad involvement from leading AI companies, academia, and civil society.
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
MoBA: Mixture of Block Attention for Long-Context LLMs
Optimizing inference proxy for LLMs
A Zoom Team Chat bot that combines Cerebras' Llama 3.1-8b model with Exa search capabilities to provide intelligent responses. The bot can search for current information when needed and maintain co…
Effective LLM Alignment Toolkit
Official repository of Sparse ISO-FLOP Transformations for Maximizing Training Efficiency
2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
torchtrail: trace the graph of torch functions and modules for visualization, reports, etc
Minimalistic large language model 3D-parallelism training
OCR, layout analysis, reading order, table recognition in 90+ languages
Benchmark of Apple MLX operations on all Apple Silicon chips (GPU, CPU) + MPS and CUDA.
Port of Andrej Karpathy's nanoGPT to Apple MLX framework.



