debug vllm per-block#456

Open
Charles2530 wants to merge 1 commit into ModelTC:main from Charles2530:feat/block

Conversation

@Charles2530 (Contributor)

debug: the original check falls into the per-block branch even when quantization is per-channel or per-token during vLLM export


@JiwaniZakir left a comment

The fix correctly addresses a logic bug in export_vllm.py. The original condition config.quant.weight.get('granularity', 'per_block') was effectively always truthy: any non-empty granularity string such as 'per_channel' also evaluated to True, so the per-block branch was entered for non-per_block configurations.

One subtle behavioral change worth noting: when the 'granularity' key is absent entirely, the old code defaulted to 'per_block' (truthy, entering the branch), while the new code evaluates None == 'per_block', which is False (skipping the branch). It's worth confirming whether omitting granularity should imply per_block behavior or fall through; if the former, the fix should use config.quant.weight.get('granularity', 'per_block') == 'per_block' to preserve that default.
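To make the difference concrete, here is a minimal sketch of the three variants discussed above. The dict shape and 'granularity' key mirror the review's description of export_vllm.py, but the function names and flattened config are illustrative, not the actual source.

```python
# Illustrative reconstruction of the per-block branch check; the config here
# is a plain dict standing in for config.quant.weight.

def enters_per_block_old(config: dict) -> bool:
    # Old check: any non-empty granularity string is truthy, so
    # 'per_channel' and 'per_token' also enter the per-block branch.
    return bool(config.get('granularity', 'per_block'))

def enters_per_block_new(config: dict) -> bool:
    # Fixed check: compares against 'per_block' explicitly. Note that a
    # missing key now yields None == 'per_block', i.e. False.
    return config.get('granularity') == 'per_block'

def enters_per_block_with_default(config: dict) -> bool:
    # Variant preserving the old default: a missing key still implies
    # per-block behavior, while other granularities are excluded.
    return config.get('granularity', 'per_block') == 'per_block'
```

Under the old check, {'granularity': 'per_channel'} and an empty config both enter the branch; under the new check neither does, and only the third variant keeps the missing-key default while excluding per-channel and per-token.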

The commented-out elif on line 33 should be removed rather than left in; dead code in a condition chain adds noise and could confuse future readers about the intended logic.

