- OpenLLM: An open platform for operating large language models (LLMs) in production.
- Triton
- Text Generation Inference: https://github.com/huggingface/text-generation-inference
- FasterTransformer: https://github.com/NVIDIA/FasterTransformer
- LLM Accelerator: https://github.com/microsoft/LMOps
- ZhiLight (Zhihu & ModelBest): https://github.com/zhihu/ZhiLight
- LMDeploy, TurboMind
- AWQ, AutoAWQ
- llama.cpp, vLLM, LightLLM, fastllm