- Clone the BiLLM repository and navigate to the BiLLM folder

```shell
git clone https://github.com/Aaronhuang-778/BiLLM.git
cd BiLLM
```

- Dependencies
```
torch==2.0.1+cu117
transformers==4.51.3
datasets==2.14.6
huggingface-hub==0.16.4
```

Before starting quantization, you need to replace some files to support the Qwen3 model:
- Add the `eval_my` directory: place the `eval_my` directory at the same level as the BiLLM folder.
- Replace `bigptq.py`: replace the file at `BiLLM/bigptq.py` with our modified version.
- Replace `run.py`: replace the file at `BiLLM/run.py` with our modified version.
- Run BiLLM quantization

```shell
CUDA_VISIBLE_DEVICES=0 python3 run.py /Path/to/Qwen3/Qwen3-32B c4 braq --blocksize 128 --save --salient_metric hessian --device "cuda:0" | tee billm_qwen3_32B.log
```
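If you quantize more than one Qwen3 checkpoint, a loop like the following keeps the model path and the log name in sync. The `/Path/to/Qwen3` prefix and the size list are placeholders; adjust them to your setup:

```shell
# Quantize several Qwen3 sizes in turn (paths are assumptions).
# Each run writes its own log, e.g. billm_qwen3_32B.log.
for size in 14B 32B; do
  CUDA_VISIBLE_DEVICES=0 python3 run.py "/Path/to/Qwen3/Qwen3-${size}" c4 braq \
    --blocksize 128 --save --salient_metric hessian --device "cuda:0" \
    | tee "billm_qwen3_${size}.log"
done
```

Deriving the log name from the same `size` variable as the model path avoids mismatched names such as a 32B run logged to a 14B file.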