Community-maintained hardware plugin for vLLM on AWS Neuron
Updated May 22, 2026 - Python
AWS TorchNeuron Deep Learning Projects using Trainium1 Instances
A hybrid testbed for evaluating top open-source LLMs (such as gpt-oss-20b and Llama 3.3) on local GPUs, cloud GPUs, and AWS Inferentia2/Trainium instances. It focuses on vLLM optimization, capacity management, kernel bypass, and hardware-software co-design, along with supporting infrastructure such as NCCL, RDMA, and NVMe-oF.