DBF (Double Binary Factorization)

DBF is an extreme compression method that approximates weight matrices using binary factors, achieving approximately 1.5-bit quantization.

Algorithm

DBF decomposes each weight matrix (W) as:

[ W \approx A \cdot \text{diag}(d) \cdot B ]

where:

(A \in {-1, +1}^{m \times k}) is a binary matrix
(d \in \mathbb{R}^k) is a scaling vector
(B \in {-1, +1}^{k \times n}) is a binary matrix

This factorization drastically reduces the storage requirement since the binary matrices only need 1 bit per element. The scaling vector (d) provides the necessary dynamic range.

The optimization is performed using ADMM (Alternating Direction Method of Multipliers) with optional weight balancing.

Parameters

Parameter	Type	Description	Default
`target_bits`	`float`	Target bit-width (e.g., 1.5)	—
`iters`	`int`	Number of ADMM optimization iterations	`100`
`reg`	`float`	Regularization coefficient	`1.0`
`use_balancing`	`bool`	Apply weight balancing before factorization	`True`
`balance_iters`	`int`	Number of balancing iterations	`20`
`balance_alpha`	`float`	Balancing alpha parameter	`0.5`

Usage

from onecomp import ModelConfig, Runner
from onecomp.quantizer.dbf import DBF

model_config = ModelConfig(
    model_id="meta-llama/Llama-2-7b-hf",
    device="cuda:0",
)

dbf = DBF(target_bits=1.5, iters=100, use_balancing=True)

runner = Runner(model_config=model_config, quantizer=dbf)
runner.run()

Save and Load

DBF models can be saved in a format compatible with the OneComp loader:

runner.save_quantized_model("./output/dbf_model")

# Load later
from onecomp import load_quantized_model
model, tokenizer = load_quantized_model("./output/dbf_model")

The DoubleBinaryLinear inference layer implements a 5-stage pipeline: scaling0 -> binary_B -> scaling2 -> binary_A -> scaling4.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DBF (Double Binary Factorization)

Algorithm

Parameters

Usage

Save and Load

FilesExpand file tree

dbf.md

Latest commit

History

dbf.md

File metadata and controls

DBF (Double Binary Factorization)

Algorithm

Parameters

Usage

Save and Load