Pull requests: vllm-project/vllm

Pull requests list

[Kernel][Bugfix] Fixup some warnings in nvfp4_blockwise_moe when CUDA < 12.8
Labels: bug, ready
#20324 opened Jul 1, 2025 by tlrmchlsmth
[Bugfix] Fix the max_seq_len limit of 16384 for DeepSeek models
#20322 opened Jul 1, 2025 by huaqiangwang
3 of 4 tasks
[USAGE] Improve error handling for weight initialization in Unquantized…
Labels: documentation, v1
#20321 opened Jul 1, 2025 by koiker
3 of 4 tasks
HF Hub LoRA Resolver
Labels: ci/build, documentation
#20320 opened Jul 1, 2025 by alex-jw-brooks
[Feature] Add HTTP support for KV cache events
#20318 opened Jul 1, 2025 by Zyann1 Draft
1 of 4 tasks
fix[Docs]: link anchor is incorrect #20309
Labels: documentation, structured-output
#20315 opened Jul 1, 2025 by yyzxw
4 tasks
Add support for Prithvi geospatial model in serving mode
Labels: documentation, frontend, multi-modality, needs-rebase, structured-output, v1
#20307 opened Jul 1, 2025 by mgazz Draft
1 of 4 tasks
[doc] quark_mxfp4_introduction
Labels: documentation
#20306 opened Jul 1, 2025 by lihaoyang-amd Draft
[Feature] Support Minimax-M1 function calls features
Labels: documentation, frontend, tool-calling
#20297 opened Jul 1, 2025 by qscqesze
Enable fp8 kv cache on rocm aiter backend.
Labels: rocm, v1
#20295 opened Jul 1, 2025 by fsx950223 Draft
4 tasks
Enable group size 64 for Machete
#20290 opened Jul 1, 2025 by czhu-cohere
3 of 4 tasks
[Model] Adds support for SlimMoE models Phi-tiny-MoE-instruct
#20286 opened Jun 30, 2025 by zichongli5
3 of 4 tasks
[Misc][Doc] Add missing comment for LLM frontend
#20285 opened Jun 30, 2025 by draftbk
1 of 4 tasks
[TPU] Temporary fix vmem oom for long model len by reducing page size
Labels: ready, tpu, v1
#20278 opened Jun 30, 2025 by Chenyaaang
[Docs] use uv in GPU installation docs
Labels: documentation
#20277 opened Jun 30, 2025 by davidxia
Dummy commit
#20273 opened Jun 30, 2025 by dhonnappa-amd
[Benchmark] Add benchmark tool for multi turn conversations
Labels: performance
#20267 opened Jun 30, 2025 by pliops-daniels