Pull requests: vllm-project/vllm

Pull requests list

[FlashInfer] Avoid FlashInfer block_size 16 + head_size 256 on blackwell
Labels: ready, v1
#27994 opened Nov 3, 2025 by heheda12345
5 tasks
v0.11.1
[flashinfer][fix] do not check nvcc availability
#27990 opened Nov 3, 2025 by mxz297
Update delta_text and model_output format to include newline
#27989 opened Nov 3, 2025 by Bsist
5 tasks
Remove unused swap_space parameter
Labels: documentation, frontend, kv-connector, tpu, v1
#27988 opened Nov 3, 2025 by mcelrath
2 of 5 tasks
[Frontend] Add a random prefix to client-provided request IDs
Labels: frontend, ready
#27987 opened Nov 3, 2025 by markmc
Enabling cooperative multi-gpu tests on multi-gpu nodes
Labels: ci/build, ready, rocm
#27986 opened Nov 3, 2025 by Alexei-V-Ivanov-AMD
[Frontend] Make RequestIdMiddleware return the internal request_id
Labels: frontend, ready
#27983 opened Nov 3, 2025 by markmc
[Quantization] support gpt-oss for quantized kv cache weight loading
Labels: gpt-oss
#27980 opened Nov 3, 2025 by xuebwang-amd
5 tasks
[KVConnector] Enable get_block_ids_with_load_errors() in LMCache connector
Labels: kv-connector, ready
#27978 opened Nov 3, 2025 by ziruiliu
5 tasks
fix(benchmarks): Remove hardcoded dtype in hf backend
Labels: performance
#27976 opened Nov 3, 2025 by git-jxj
3 of 5 tasks
[Refactor] Lazy import tool_parser
Labels: deepseek, documentation, frontend, llama, ready, tool-calling
#27974 opened Nov 3, 2025 by chaunceyjiang
5 tasks
[Model] fix ernie45 reasoning_parser
Labels: ready
#27973 opened Nov 3, 2025 by CSWYF3634076
[Model] app optimal triton fused moe configs for NemotronH MoE
Labels: performance
#27967 opened Nov 3, 2025 by tomeras91
[Bugfix][ROCm] Fix AITER attention backend for deepseek-ocr model
Labels: deepseek, rocm, v1
#27965 opened Nov 3, 2025 by vllmellm
5 tasks
[Doc][Last/N] Improve all pooling task | Refactor pooling-related documentation
Labels: documentation
#27963 opened Nov 3, 2025 by noooop Draft
5 tasks
[Refactor] to simplify and extract the shared logic between chat completion and responses
Labels: frontend, ready, tool-calling
#27961 opened Nov 3, 2025 by chaunceyjiang
5 tasks
Make pre-commit work on fedora
#27958 opened Nov 3, 2025 by rabi
[V0 deprecation] Remove VLLM_USE_V1 usage in most modules
Labels: documentation, frontend, kv-connector, multi-modality, ready, structured-output, v1
#27955 opened Nov 3, 2025 by wangxiyuan
5 tasks
[CPU] Refactor CPU attention backend
Labels: ci/build, v1
#27954 opened Nov 3, 2025 by bigPYJ1151
2 of 5 tasks
[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores
Labels: documentation, v1
#27953 opened Nov 3, 2025 by StanHatko
4 of 5 tasks
Update Flashinfer from v0.4.1 to v0.5.0
Labels: ci/build, ready
#27952 opened Nov 3, 2025 by hmellor
v0.11.1