Issues: vllm-project/vllm
[Usage]: I want to speed up a function with vLLM, but I don't know how to do it
usage (How to use vllm)
#11483 opened Dec 25, 2024 by CallmeZhangChenchen
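One plausible starting point for this kind of question, sketched against vLLM's offline batch API and assuming the function being sped up wraps text generation (the model name below is a placeholder):

    from vllm import LLM, SamplingParams

    # Load the model once; vLLM keeps it resident and batches requests for throughput.
    llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model

    params = SamplingParams(temperature=0.7, max_tokens=128)

    # Passing a list of prompts lets vLLM schedule them as one batched job
    # instead of invoking the model once per prompt.
    outputs = llm.generate(["Hello, my name is", "The capital of France is"], params)
    for out in outputs:
        print(out.outputs[0].text)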
[Usage]: The performance improvements in vLLM 0.6.4 are not reflected on Qwen2.5
usage (How to use vllm)
#11482 opened Dec 25, 2024 by umie0128
[New Model]: QVQ-72B-Preview
new model (Requests for new models)
#11479 opened Dec 25, 2024 by ZB052-A
[Feature]: Prefix cache aware load balancing
feature request
#11477 opened Dec 25, 2024 by gaocegege
[Usage]: Missing OpenAI templates
usage (How to use vllm)
#11474 opened Dec 25, 2024 by JohnConnor123
[Usage]: How to figure out why vLLM returns nothing while TRT-LLM returns a meaningful result
usage (How to use vllm)
#11473 opened Dec 25, 2024 by GGBond8488
[Usage]: About the '--chat-template' parameter for model google/paligemma2-3b-ft-docci-448
usage (How to use vllm)
#11471 opened Dec 24, 2024 by llv22
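For reference, the flag is passed when launching vLLM's OpenAI-compatible server; a minimal sketch, with the template path as a placeholder:

    python -m vllm.entrypoints.openai.api_server \
        --model google/paligemma2-3b-ft-docci-448 \
        --chat-template ./my_template.jinja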
[Bug]: Error when starting on multiple GPUs
bug (Something isn't working)
#11467 opened Dec 24, 2024 by weiminw
[Installation]: Is there a good solution for deploying gemma-2-27b on a V100? Deployment has been consistently unsuccessful
installation (Installation problems)
#11462 opened Dec 24, 2024 by 3252152
[Misc]: Some minor issues in the disaggregation test and benchmark tools
misc
#11455 opened Dec 24, 2024 by Jeffwan
[Bug]: InternVL2-40B inference precision problem
bug (Something isn't working)
#11454 opened Dec 24, 2024 by renhedev
[Usage]: Trying to add the codeshell 7b model, but got an error
usage (How to use vllm)
#11451 opened Dec 24, 2024 by G1017
[Bug]: The value of --max-model-len may influence results even when the input length is less than max-model-len
bug (Something isn't working)
#11447 opened Dec 24, 2024 by Raphaelzrf
[Bug]: Prefill/decode separation leads to blocking and crashes under concurrent load
bug (Something isn't working)
#11445 opened Dec 24, 2024 by skyCreateXian
[Performance]: Prefill does not use CUDA graphs and becomes very slow when LoRA is enabled
performance (Performance-related issues)
#11436 opened Dec 23, 2024 by niuzheng168
[Feature]: AssertionError: MolmoForCausalLM does not support LoRA yet.
feature request
#11431 opened Dec 23, 2024 by ayylemao
[Bug]: 0.6.5 randomly closes connections/drops requests
bug (Something isn't working)
#11421 opened Dec 23, 2024 by 0xymoro
[Bug]: v0.6.5 breaks AI SDK's generateObject with nullable strings in schema ("type mismatch! call is<type>() before get<type>()" && is<std::string>())
bug (Something isn't working)
#11415 opened Dec 22, 2024 by kldzj
[Misc]: How to profile both EngineCoreClient and EngineCoreProc activities in V1 using the profiler
misc
#11413 opened Dec 22, 2024 by baifanxxx
Error when running 'python -m vllm.entrypoints.openai.api_server'
usage (How to use vllm)
#11411 opened Dec 22, 2024 by SetonLiang
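For comparison, a minimal invocation of the server that a standard install is expected to accept (the model name and port are placeholders):

    python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2.5-7B-Instruct --port 8000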
[RFC]: Two features I wish vLLM had
RFC
#11410 opened Dec 22, 2024 by MohamedAliRashad
[Feature]: (Willing to PR) Avoid KV cache occupying GPU memory when not used
feature request
#11408 opened Dec 22, 2024 by fzyzcjy