llama-cpp-python GPU Wheel for Python 3.14 (CUDA 13.1)

Fully working GPU-accelerated wheel for llama-cpp-python==0.3.16 on Python 3.14 (Windows amd64).

Built December 17, 2025 with:

  • CUDA Toolkit 13.1 (the latest release at build time)
  • Full CUDA graph support
  • Tested: ~85 tokens/second on Llama 3 8B Q4_K_M (RTX 3090)

Download the wheel from the release page:

https://github.com/rookiemann/llama-cpp-python-py314-cuda131-wheel-or-python314-llama-cpp-gpu-wheel/releases/tag/v0.3.16-cuda13.1-py3.14

Install

pip install llama_cpp_python-0.3.16-cp314-cp314-win_amd64.whl
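
The wheel's filename encodes what it was built for: `cp314` is the CPython 3.14 tag and `win_amd64` is the 64-bit Windows platform tag, so pip will refuse it on any other interpreter or platform. The sketch below is a rough pre-install check, not part of this repository. The helper names `parse_wheel_tags`, `wheel_matches` and `gpu_offload_available` are hypothetical, the parser assumes a five-part filename with no build tag, and the GPU check assumes the installed package exposes `llama_supports_gpu_offload()` from the llama.cpp bindings.

```python
import platform
import sys

WHEEL = "llama_cpp_python-0.3.16-cp314-cp314-win_amd64.whl"


def parse_wheel_tags(filename):
    """Split a wheel filename into (name, version, python, abi, platform).

    Simplification: assumes no optional build tag, so exactly five parts.
    """
    parts = filename.removesuffix(".whl").split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return tuple(parts)


def wheel_matches(filename, py_version, os_name, machine):
    """Return True if the wheel's tags fit the given interpreter and platform."""
    _, _, python_tag, _, platform_tag = parse_wheel_tags(filename)
    expected_python = f"cp{py_version[0]}{py_version[1]}"
    is_win64 = os_name == "Windows" and machine.lower() in ("amd64", "x86_64")
    expected_platform = "win_amd64" if is_win64 else None
    return python_tag == expected_python and platform_tag == expected_platform


def gpu_offload_available():
    """Return True/False once llama_cpp is importable, or None if it is not."""
    try:
        import llama_cpp
    except (ImportError, OSError, RuntimeError):
        # Not installed, or the native library could not be loaded.
        return None
    return bool(llama_cpp.llama_supports_gpu_offload())


if __name__ == "__main__":
    ok = sys.implementation.name == "cpython" and wheel_matches(
        WHEEL, sys.version_info[:2], platform.system(), platform.machine()
    )
    print("wheel fits this interpreter:", ok)
    print("GPU offload available:", gpu_offload_available())
```

If the first line prints `False`, the install command above will fail with a "not a supported wheel on this platform" error. After a successful install, a `False` on the second line would suggest the CUDA runtime libraries are not being found.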