BiLLM: Pushing the Limit of Post-Training Quantization for LLMs (Qwen3)

Installation Setup

  1. Clone the BiLLM repository and navigate to the BiLLM folder:
git clone https://github.com/Aaronhuang-778/BiLLM.git
cd BiLLM
  2. Install the dependencies:
torch==2.0.1+cu117
transformers==4.51.3
datasets==2.14.6
huggingface-hub==0.16.4
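
One possible way to install these pinned versions is with pip, assuming a CUDA 11.7 environment for the +cu117 torch build:

pip install torch==2.0.1+cu117 --index-url https://download.pytorch.org/whl/cu117
pip install transformers==4.51.3 datasets==2.14.6 huggingface-hub==0.16.4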

File Replacement

Before starting quantization, you need to replace some files to support the Qwen3 model:

  1. Add eval_my directory: Place the eval_my directory at the same level as the BiLLM folder.

  2. Replace bigptq.py file: Replace the file at BiLLM/bigptq.py with our modified version.

  3. Replace run.py file: Replace the file at BiLLM/run.py with our version. The resulting layout is sketched after this list.
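
As a rough illustration of the replacement step (the folder name qwen3_patch holding the modified files is a hypothetical placeholder, not part of this repository):

# run from the directory that contains the BiLLM/ folder
cp -r qwen3_patch/eval_my ./eval_my        # eval_my ends up at the same level as BiLLM/
cp qwen3_patch/bigptq.py BiLLM/bigptq.py   # overwrite the original bigptq.py
cp qwen3_patch/run.py BiLLM/run.py         # overwrite the original run.py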

Quantization and Evaluation

Quantize and evaluate the model:

CUDA_VISIBLE_DEVICES=0 python3 run.py /Path/to/Qwen3/Qwen3-32B c4 braq --blocksize 128 --save --salient_metric hessian --device "cuda:0" | tee billm_qwen3_32B.log
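
In this command, c4 selects the calibration dataset, braq selects BiLLM's binary residual approximation quantization, --blocksize 128 sets the block size used during quantization, --salient_metric hessian uses Hessian information to identify salient weights, and --save keeps the quantized result (flag semantics follow the modified run.py). To quantize a different Qwen3 size, only the model path and log name should need to change; for example, for Qwen3-14B (the path is an assumption and should match your local checkpoint location):

CUDA_VISIBLE_DEVICES=0 python3 run.py /Path/to/Qwen3/Qwen3-14B c4 braq --blocksize 128 --save --salient_metric hessian --device "cuda:0" | tee billm_qwen3_14B.log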