[quantization] Support PTQ quantization for VLM Qwen3-vl by Torrero · Pull Request #635 · Samsung/TICO

Torrero · 2026-04-16T06:43:53Z

This PR introduces PTQ quantization for VLM Qwen3-vl.

For benchmark evaluation, Qwen3VLForConditionalGeneration now inherits from GenerationMixin.
In addition, quant_vision_model.py now either computes position_embeddings and cu_seqlens dynamically or, when grid_thw is fixed, uses precomputed values so that a static graph can be exported.
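
The two modes for cu_seqlens can be illustrated with a minimal, stdlib-only sketch. It is not the code from quant_vision_model.py: the class name `CuSeqlensProvider` and the helper `compute_cu_seqlens` are hypothetical, plain Python lists stand in for tensors, and position_embeddings are left out. It assumes cu_seqlens is the cumulative per-frame patch count (h * w, repeated t times) with a leading zero.

```python
from itertools import accumulate
from typing import List, Optional, Sequence, Tuple

# One (t, h, w) entry per image/video: frames, patch rows, patch columns.
GridTHW = Tuple[int, int, int]


def compute_cu_seqlens(grid_thw: Sequence[GridTHW]) -> List[int]:
    """Cumulative patch counts per frame, with a leading 0 (assumed convention)."""
    per_frame = [h * w for t, h, w in grid_thw for _ in range(t)]
    return [0] + list(accumulate(per_frame))


class CuSeqlensProvider:
    """Hypothetical helper: dynamic computation, or values frozen for a fixed grid."""

    def __init__(self, fixed_grid_thw: Optional[Sequence[GridTHW]] = None):
        if fixed_grid_thw is None:
            # Dynamic mode: nothing is precomputed.
            self.fixed_grid_thw = None
            self._precomputed = None
        else:
            # Static mode: compute once so an exported graph sees constants only.
            self.fixed_grid_thw = tuple(tuple(g) for g in fixed_grid_thw)
            self._precomputed = compute_cu_seqlens(self.fixed_grid_thw)

    def get(self, grid_thw: Sequence[GridTHW]) -> List[int]:
        if self._precomputed is None:
            return compute_cu_seqlens(grid_thw)
        if tuple(tuple(g) for g in grid_thw) != self.fixed_grid_thw:
            raise ValueError("grid_thw differs from the grid fixed at construction")
        return self._precomputed


dynamic = CuSeqlensProvider()
static = CuSeqlensProvider(fixed_grid_thw=[(1, 4, 6)])
dynamic_result = dynamic.get([(1, 4, 6), (2, 2, 2)])
static_result = static.get([(1, 4, 6)])
```

In the static mode the data-dependent computation happens once at construction time, which is the property a static graph export relies on; the dynamic mode keeps the flexibility needed when input resolutions vary.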

TICO-DCO-1.0-Signed-off-by: Evgenii Maltsev e.maltsev@samsung.com

This commit introduces PTQ quantization for VLM Qwen3-vl. TICO-DCO-1.0-Signed-off-by: Evgenii Maltsev <e.maltsev@samsung.com>

[quantization] Support PTQ quantization for VLM Qwen3-vl

330dd72

Torrero wants to merge 1 commit into Samsung:main from Torrero:qwen_ptq_quantization