[quantization] Fix a bug in AffineObserverBase.compute_qparams #634

Merged
mhs4670go merged 1 commit into Samsung:main from dvsav:fix_affine_base
Apr 16, 2026

Conversation

@dvsav (Contributor) commented Apr 15, 2026

What

This change fixes incorrect computation of scale in AffineObserverBase.compute_qparams for asymmetric qscheme.

Why

Symptoms

The issue was detected while testing the QuantLayerNorm wrapper for torch.nn.LayerNorm.
I observed a surprisingly large divergence between the original model (torch.nn.LayerNorm) and the fake-quantized model (QuantLayerNorm):

┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.695877
│ PEIR       : 45.260801 %
└──────────────────────────────────────────────────────
    ┌────────────────────────────────────────────┐
 4.1┤                                            │
 2.7┤                                            │
    │                                            │
 1.3┤                                            │
-0.1┤                 ••••••••••••••••••••••• •  │
    │  • •••••••••••••••                         │
-1.5┤                                            │
-2.9┤                                            │
    │                                            │
-4.3┤                                            │
    └┬──────────┬──────────┬─────────┬──────────┬┘
   -4.3       -2.2       -0.1       2.0       4.1 

Debugging revealed that QuantLayerNorm was incorrectly quantizing the variance of the input tensor it was trying to normalize:

# tico/quantization/wrapq/wrappers/nn/quant_layernorm.py

# 4) variance (via squared mean)
v = s_q.mean(dim=dims, keepdim=True)
v_q = self._fq(v, self.obs_var) # <-- almost all values of v_q were clamped to the same value

Note that the variance is usually strictly positive, which satisfies one of the bug's trigger conditions (see below).

The Bug

  • File: tico/quantization/wrapq/observers/affine_base.py
  • Function: AffineObserverBase.compute_qparams

The bug manifests itself when two conditions are met simultaneously:

  1. observer.qscheme.is_symmetric() returns False — for example, when you specify default_qscheme=QScheme.PER_TENSOR_ASYMM or default_qscheme=QScheme.PER_CHANNEL_ASYMM in PTQConfig, or when you specify an unsigned default_dtype in PTQConfig (or leave it at the default uint8, which is also unsigned).
  2. The range of the observed variable doesn't include 0 (all values > 0 or all values < 0).

The calculated zero point then falls outside the range representable by the quantized type and gets clamped, which leads to an inconsistent combination of scale and zero point. Here's an example:

fp_min = 1.0
fp_max = 3.55
qmin = 0
qmax = 255
scale = (fp_max - fp_min) / (qmax - qmin) = (3.55 - 1.0) / (255 - 0) = 0.01
zero_point = round(qmin - fp_min / scale) = round(0 - 1.0 / 0.01) = -100 # doesn't fit into qmin...qmax range!
zero_point = zero_point.clamp(qmin, qmax) = 0

# Let's quantize fp_max=3.55
quantized_fp_max = round(fp_max / scale + zero_point) = round(3.55 / 0.01 + 0) = 355 # doesn't fit into qmin...qmax range!
quantized_fp_max = quantized_fp_max.clamp(qmin, qmax) = 355.clamp(0, 255) = 255

# Let's dequantize it back
dequantized_fp_max = scale * (quantized_fp_max - zero_point) = 0.01 * (255 - 0) = 2.55 # not 3.55!

In fact, with the clamped zero point, any fp value greater than scale * qmax gets quantized to qmax and therefore dequantizes back to scale * qmax.
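This failure mode can be reproduced with a short standalone sketch (plain Python mirroring the arithmetic above; the function names are illustrative, not the TICO API):

```python
# Standalone sketch of the buggy affine qparam computation described above.
# Names are illustrative; this is not the TICO implementation itself.

def compute_qparams_buggy(fp_min, fp_max, qmin, qmax):
    scale = (fp_max - fp_min) / (qmax - qmin)
    zero_point = round(qmin - fp_min / scale)
    # Clamping the out-of-range zero point makes it inconsistent with scale.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def fake_quant(x, scale, zero_point, qmin, qmax):
    # Quantize with clamping to [qmin, qmax], then dequantize.
    q = max(qmin, min(qmax, round(x / scale + zero_point)))
    return scale * (q - zero_point)

scale, zp = compute_qparams_buggy(1.0, 3.55, 0, 255)
print(scale, zp)                            # ~0.01, 0 (zero point was -100 before clamping)
print(fake_quant(3.55, scale, zp, 0, 255))  # ~2.55 -- fp_max no longer round-trips
```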

The Solution

The simplest solution is to expand the range of the observed variable so that it includes 0. This means:

  • if fp_min > 0, replace fp_min with 0;
  • if fp_max < 0, replace fp_max with 0.

Let's revisit the example above with this fix:

fp_min = 1.0
fp_max = 3.55
qmin = 0
qmax = 255

fp_min > 0.0 ==> fp_min := 0.0 # <-- THIS IS THE FIX

scale = (fp_max - fp_min) / (qmax - qmin) = (3.55 - 0.0) / (255 - 0) ≈ 0.0139
zero_point = round(qmin - fp_min / scale) = round(0 - 0.0 / 0.0139) = 0

# Let's quantize fp_max=3.55
quantized_fp_max = round(fp_max / scale + zero_point) = round(3.55 / (3.55 / 255) + 0) = 255

# Let's dequantize it back
dequantized_fp_max = scale * (quantized_fp_max - zero_point) = (3.55 / 255) * (255 - 0) = 3.55

Now the original and the dequantized values are the same.
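Under the same illustrative sketch as above (hypothetical names, not the TICO API), the fix amounts to clamping the observed range to include zero before computing the qparams:

```python
# Same sketch with the fix: expand the observed range to include 0 so the
# zero point always lands inside [qmin, qmax] and no clamping is needed.

def compute_qparams_fixed(fp_min, fp_max, qmin, qmax):
    fp_min = min(fp_min, 0.0)  # <-- THE FIX: force the range to include 0
    fp_max = max(fp_max, 0.0)  # <-- THE FIX
    scale = (fp_max - fp_min) / (qmax - qmin)
    zero_point = round(qmin - fp_min / scale)
    return scale, zero_point

def fake_quant(x, scale, zero_point, qmin, qmax):
    # Quantize with clamping to [qmin, qmax], then dequantize.
    q = max(qmin, min(qmax, round(x / scale + zero_point)))
    return scale * (q - zero_point)

scale, zp = compute_qparams_fixed(1.0, 3.55, 0, 255)
print(zp)                                   # 0 -- already inside [qmin, qmax]
print(fake_quant(3.55, scale, zp, 0, 255))  # ~3.55 -- fp_max round-trips
```

The trade-off: when the observed range sits far from zero, expanding it to include zero increases the scale, so in-range values are represented more coarsely.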

In AffineObserverBase this solution is expressed as a correction of the observed range (rng here is the observed range, self.max_val - self.min_val):

# Force the range to include 0:
# if min_val > 0, the effective range is [0, max_val], so rng = max_val;
# if max_val < 0, the effective range is [min_val, 0], so rng = -min_val.
rng = torch.where(0 < self.min_val, self.max_val, rng)
rng = torch.where(0 > self.max_val, -self.min_val, rng)

Unit Tests

The bug is covered by four new regression tests (see below).

BEFORE FIX

$ python -m pytest test/quantization/wrapq/observers/test_affine_base.py -v
======================================================================= test session starts ========================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 12 items                                                                                                                                                 

test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_degenerate_constant_cases PASSED                                         [  8%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_fake_quant_requires_qparams PASSED                                       [ 16%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_load_qparams_and_fake_quant PASSED                                       [ 25%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams PASSED                               [ 33%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams_negative_range FAILED                [ 41%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams_positive_range FAILED                [ 50%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_fake_quant_path PASSED                                       [ 58%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams PASSED                                          [ 66%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams_negative_range FAILED                           [ 75%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams_positive_range FAILED                           [ 83%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_symmetric_qparams PASSED                                      [ 91%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_reset_clears_minmax_and_qparams PASSED                                   [100%]

============================================================================= FAILURES =============================================================================
__________________________________________ TestAffineObserverBase.test_per_channel_asymm_stats_and_qparams_negative_range __________________________________________

self = <test.quantization.wrapq.observers.test_affine_base.TestAffineObserverBase testMethod=test_per_channel_asymm_stats_and_qparams_negative_range>

    def test_per_channel_asymm_stats_and_qparams_negative_range(self):
        # Test per-channel asymmetric quantization with negative-only ranges
        # shape (C=2, N=3)
        x = torch.tensor([[-1.0, -3.0, -2.0], [-4.0, -5.0, -0.5]])
    
        obs = _MinMaxLikeObserver(
            name="pc_asymm_neg",
            dtype=DType.int(5),  # 5-bit signed
            qscheme=QScheme.PER_CHANNEL_ASYMM,
            channel_axis=0,
        )
        obs.collect(x)
    
        self.assertTrue(torch.equal(obs.min_val, torch.tensor([-3.0, -5.0])))
        self.assertTrue(torch.equal(obs.max_val, torch.tensor([-1.0, -0.5])))
    
        scale, zp = obs.compute_qparams()
        qmin, qmax = obs.dtype.qmin, obs.dtype.qmax
        expected_scale = (-obs.min_val) / (qmax - qmin)
        expected_zp = torch.full(size=(x.shape[0],), fill_value=qmax)
    
>       self.assertTrue(torch.allclose(scale, expected_scale, atol=1e-6))
E       AssertionError: False is not true

test/quantization/wrapq/observers/test_affine_base.py:263: AssertionError
__________________________________________ TestAffineObserverBase.test_per_channel_asymm_stats_and_qparams_positive_range __________________________________________

self = <test.quantization.wrapq.observers.test_affine_base.TestAffineObserverBase testMethod=test_per_channel_asymm_stats_and_qparams_positive_range>

    def test_per_channel_asymm_stats_and_qparams_positive_range(self):
        # Test per-channel asymmetric quantization with positive-only ranges
        # shape (C=2, N=3)
        x = torch.tensor([[1.0, 3.0, 2.0], [4.0, 5.0, 0.5]])
    
        obs = _MinMaxLikeObserver(
            name="pc_asymm_pos",
            dtype=DType.int(5),  # 5-bit signed
            qscheme=QScheme.PER_CHANNEL_ASYMM,
            channel_axis=0,
        )
        obs.collect(x)
    
        self.assertTrue(torch.equal(obs.min_val, torch.tensor([1.0, 0.5])))
        self.assertTrue(torch.equal(obs.max_val, torch.tensor([3.0, 5.0])))
    
        scale, zp = obs.compute_qparams()
        qmin, qmax = obs.dtype.qmin, obs.dtype.qmax
        expected_scale = obs.max_val / (qmax - qmin)
        expected_zp = torch.full(size=(x.shape[0],), fill_value=qmin)
    
>       self.assertTrue(torch.allclose(scale, expected_scale, atol=1e-6))
E       AssertionError: False is not true

test/quantization/wrapq/observers/test_affine_base.py:239: AssertionError
_______________________________________________ TestAffineObserverBase.test_per_tensor_asymm_qparams_negative_range ________________________________________________

self = <test.quantization.wrapq.observers.test_affine_base.TestAffineObserverBase testMethod=test_per_tensor_asymm_qparams_negative_range>

    def test_per_tensor_asymm_qparams_negative_range(self):
        # Test per-tensor asymmetric quantization with negative-only range
        obs = _MinMaxLikeObserver(name="pt_asymm_neg", dtype=DType.uint(4))
        obs.collect(torch.tensor([-4.0, -3.0, -2.0]))
        obs.collect(torch.tensor([-1.0]))
    
        self.assertEqual(obs.min_val.item(), -4.0)
        self.assertEqual(obs.max_val.item(), -1.0)
    
        scale, zp = obs.compute_qparams()
        qmin, qmax = obs.dtype.qmin, obs.dtype.qmax
        expected_scale = 4.0 / (qmax - qmin)
        expected_zp = qmax
    
>       self.assertAlmostEqual(scale.item(), expected_scale, places=6)
E       AssertionError: 0.20000000298023224 != 0.26666666666666666 within 6 places (0.06666666368643442 difference)

test/quantization/wrapq/observers/test_affine_base.py:215: AssertionError
_______________________________________________ TestAffineObserverBase.test_per_tensor_asymm_qparams_positive_range ________________________________________________

self = <test.quantization.wrapq.observers.test_affine_base.TestAffineObserverBase testMethod=test_per_tensor_asymm_qparams_positive_range>

    def test_per_tensor_asymm_qparams_positive_range(self):
        # Test per-tensor asymmetric quantization with positive-only range
        obs = _MinMaxLikeObserver(name="pt_asymm_pos", dtype=DType.uint(4))
        obs.collect(torch.tensor([1.0, 2.0, 3.0]))
        obs.collect(torch.tensor([4.0]))
    
        self.assertEqual(obs.min_val.item(), 1.0)
        self.assertEqual(obs.max_val.item(), 4.0)
    
        scale, zp = obs.compute_qparams()
        qmin, qmax = obs.dtype.qmin, obs.dtype.qmax
        expected_scale = 4.0 / (qmax - qmin)
        expected_zp = 0
    
>       self.assertAlmostEqual(scale.item(), expected_scale, places=6)
E       AssertionError: 0.20000000298023224 != 0.26666666666666666 within 6 places (0.06666666368643442 difference)

test/quantization/wrapq/observers/test_affine_base.py:198: AssertionError
===================================================================== short test summary info ======================================================================
FAILED test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams_negative_range - AssertionError: False is not true
FAILED test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams_positive_range - AssertionError: False is not true
FAILED test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams_negative_range - AssertionError: 0.20000000298023224 != 0.26666666666666666 within 6 places (0.06666666368643442 difference)
FAILED test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams_positive_range - AssertionError: 0.20000000298023224 != 0.26666666666666666 within 6 places (0.06666666368643442 difference)
=================================================================== 4 failed, 8 passed in 1.58s ====================================================================

AFTER FIX

$ python -m pytest test/quantization/wrapq/observers/test_affine_base.py -v
======================================================================= test session starts ========================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 12 items                                                                                                                                                 

test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_degenerate_constant_cases PASSED                                         [  8%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_fake_quant_requires_qparams PASSED                                       [ 16%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_load_qparams_and_fake_quant PASSED                                       [ 25%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams PASSED                               [ 33%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams_negative_range PASSED                [ 41%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_asymm_stats_and_qparams_positive_range PASSED                [ 50%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_channel_fake_quant_path PASSED                                       [ 58%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams PASSED                                          [ 66%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams_negative_range PASSED                           [ 75%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_asymm_qparams_positive_range PASSED                           [ 83%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_per_tensor_symmetric_qparams PASSED                                      [ 91%]
test/quantization/wrapq/observers/test_affine_base.py::TestAffineObserverBase::test_reset_clears_minmax_and_qparams PASSED                                   [100%]

======================================================================== 12 passed in 1.56s ========================================================================

This change fixes incorrect computation of scale in AffineObserverBase.compute_qparams for asymmetric qscheme.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
@mhs4670go (Contributor) commented Apr 16, 2026


@dvsav

Great. I missed that consideration when I implemented this function. This looks like a good fix to address the asymmetric qparam issue when the observed range does not include zero.

One potential concern is that expanding the range to include zero may significantly increase the scale when the original range is far from zero, which could reduce precision. It might be worth considering making this behavior configurable. I think it would be good to track that separately as a follow-up issue.

@mhs4670go (Contributor) left a review

LGTM

@mhs4670go mhs4670go merged commit e1f63c7 into Samsung:main Apr 16, 2026
7 checks passed