Pull requests: vllm-project/vllm
Pull requests list
Fix ROCm build to respect PYTORCH_ROCM_ARCH for GPU_TARGETS (issue #22590)
ci/build
documentation
Improvements or additions to documentation
nvidia
rocm
Related to AMD ROCm
v1
#31079
opened Dec 20, 2025 by
westers
5 tasks done
[Doc] Add warning regarding GPU profiling limitations on WSL2
documentation
Improvements or additions to documentation
#31078
opened Dec 20, 2025 by
kjuuii
Fix ROCm CUDA graph replay synchronization bug (issue #29521)
ci/build
documentation
Improvements or additions to documentation
nvidia
rocm
Related to AMD ROCm
v1
#31077
opened Dec 20, 2025 by
westers
5 tasks done
[ROCm][CI/Build] Fix Dockerfile.rocm to set VLLM_TARGET_DEVICE=rocm
ci/build
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
v1
#31075
opened Dec 20, 2025 by
westers
Fix formatting of softmax equation in documentation
documentation
Improvements or additions to documentation
#31074
opened Dec 20, 2025 by
ssaketh-ch
5 tasks
[CI/Build] Add CMake warning for ignored +PTX suffix in TORCH_CUDA_ARCH_LIST
ci/build
nvidia
#31073
opened Dec 20, 2025 by
mhetrerajat
[ROCm][Test] Skip RTN quantization tests on ROCm
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
v1
#31072
opened Dec 20, 2025 by
westers
[Doc] Clarify FP8 KV cache computation workflow
documentation
Improvements or additions to documentation
v1
#31071
opened Dec 20, 2025 by
westers
[Doc] Fix image rendering in paged_attention.md
documentation
Improvements or additions to documentation
v1
#31070
opened Dec 20, 2025 by
westers
[Bugfix] Fix truncate_prompt_tokens ignored in PoolingParams.encode()
v1
#31068
opened Dec 20, 2025 by
westers
3 tasks
[Bugfix] Fix incorrect tensor parallel size in Ray executor warning
v1
#31067
opened Dec 20, 2025 by
westers
[misc] allow overriding the TAG variable in auto_tune.sh
performance
Performance-related issues
#31065
opened Dec 20, 2025 by
kkr16
3 of 4 tasks
[Frontend] add logprob, compression_rate to 'verbose_json' features
documentation
Improvements or additions to documentation
frontend
#31059
opened Dec 20, 2025 by
sangbumlikeagod
5 tasks
[Scheduler] Fix CrossAttn blocks per-request for Variable length encoder inputs
v1
#31058
opened Dec 20, 2025 by
ekagra-ranjan
[KVConnector] Auto-downgrade to PIECEWISE cudagraph mode for layerwise async ops
kv-connector
nvidia
#31057
opened Dec 20, 2025 by
yashwantbezawada
[Bugfix] Fix GLM-4 MoE router logits dtype for data parallel chunking
#31055
opened Dec 20, 2025 by
ReinforcedKnowledge
[CI] Fix H200 Distributed test
ci/build
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#31054
opened Dec 20, 2025 by
LucasWilkinson
[MoE Refactor] Use modular kernel for unquantized Triton MoE
ready
ONLY add when PR is ready to merge/full CI is needed
#31052
opened Dec 20, 2025 by
zyongye
[DO NOT MERGE] Rename Flashinfer MLA Backend to TRTLLM MLA Backend
ci/build
documentation
Improvements or additions to documentation
nvidia
v1
#31051
opened Dec 20, 2025 by
pavanimajety
Draft
5 tasks
[MoE Refactor] Split invoke_fused_moe_kernel
ready
ONLY add when PR is ready to merge/full CI is needed
#31050
opened Dec 20, 2025 by
zyongye
[CI] Add Qwen3-Next-FP8 to Blackwell model tests
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
#31049
opened Dec 19, 2025 by
vadiklyutiy