Pull requests: vllm-project/vllm
[FlashInfer] Avoid FlashInfer block_size 16 + head_size 256 on blackwell
ready
ONLY add when PR is ready to merge/full CI is needed
v1
[Kernel] Remove Redundant Prefill Support From 3D Triton Attention Kernel
#27993
opened Nov 3, 2025 by
jvlunteren
Update delta_text and model_output format to include newline
#27989
opened Nov 3, 2025 by
Bsist
5 tasks
Remove unused swap_space parameter
documentation
Improvements or additions to documentation
frontend
kv-connector
tpu
Related to Google TPUs
v1
#27988
opened Nov 3, 2025 by
mcelrath
2 of 5 tasks
[Frontend] Add a random prefix to client-provided request IDs
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#27987
opened Nov 3, 2025 by
markmc
Enabling cooperative multi-gpu tests on multi-gpu nodes
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#27986
opened Nov 3, 2025 by
Alexei-V-Ivanov-AMD
[Frontend] Make RequestIdMiddleware return the internal request_id
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#27983
opened Nov 3, 2025 by
markmc
[Quantization] support gpt-oss for quantized kv cache weight loading
gpt-oss
Related to GPT-OSS models
#27980
opened Nov 3, 2025 by
xuebwang-amd
5 tasks
[KVConnector] Enable get_block_ids_with_load_errors() in LMCache connector
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#27978
opened Nov 3, 2025 by
ziruiliu
5 tasks
fix(benchmarks): Remove hardcoded dtype in hf backend
performance
Performance-related issues
#27976
opened Nov 3, 2025 by
git-jxj
3 of 5 tasks
[Refactor] Lazy import tool_parser
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
llama
Related to Llama models
ready
ONLY add when PR is ready to merge/full CI is needed
tool-calling
#27974
opened Nov 3, 2025 by
chaunceyjiang
5 tasks
[Model] fix ernie45 reasoning_parser
ready
ONLY add when PR is ready to merge/full CI is needed
#27973
opened Nov 3, 2025 by
CSWYF3634076
[Bugfix] Handle escaped characters in GLM tool parser to prevent double serialization
ci/build
frontend
gpt-oss
Related to GPT-OSS models
tool-calling
v1
#27970
opened Nov 3, 2025 by
soaringk
3 of 5 tasks
[Model][Bugfix] fix pipeline parallelism support for NemotronH
#27968
opened Nov 3, 2025 by
tomeras91
[Model] add optimal triton fused moe configs for NemotronH MoE
performance
Performance-related issues
#27967
opened Nov 3, 2025 by
tomeras91
[Bugfix][ROCm] Fix AITER attention backend for deepseek-ocr model
deepseek
Related to DeepSeek models
rocm
Related to AMD ROCm
v1
#27965
opened Nov 3, 2025 by
vllmellm
5 tasks
[Doc][Last/N] Improve all pooling task | Refactor pooling-related documentation
documentation
Improvements or additions to documentation
[Refactor] to simplify and extract the shared logic between chat completion and responses
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
tool-calling
#27961
opened Nov 3, 2025 by
chaunceyjiang
5 tasks
[LoRA][FusedMoE] Introduce FusedMoEPermuteExpertsUnpermuteWithLoRA
needs-rebase
#27959
opened Nov 3, 2025 by
varun-sundar-rabindranath
[V0 deprecation] Remove VLLM_USE_V1 usage in most modules
documentation
Improvements or additions to documentation
frontend
kv-connector
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
structured-output
v1
#27955
opened Nov 3, 2025 by
wangxiyuan
5 tasks
[CPU] Refactor CPU attention backend
ci/build
v1
#27954
opened Nov 3, 2025 by
bigPYJ1151
2 of 5 tasks
[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores
documentation
Improvements or additions to documentation
v1
#27953
opened Nov 3, 2025 by
StanHatko
4 of 5 tasks