Issues: vllm-project/vllm
[Usage]: I want to speed up a function with vLLM, but I don't know how to do it
usage (How to use vllm)
#11483 opened Dec 25, 2024 by CallmeZhangChenchen
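One plausible starting point for this kind of question, sketched against vLLM's offline batch API and assuming the function being sped up wraps text generation (the model name below is a placeholder):

    from vllm import LLM, SamplingParams

    # Load the model once; vLLM keeps it resident and batches requests for throughput.
    llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model

    params = SamplingParams(temperature=0.7, max_tokens=128)

    # Passing a list of prompts lets vLLM schedule them as one batched job
    # instead of invoking the model once per prompt.
    outputs = llm.generate(["Hello, my name is", "The capital of France is"], params)
    for out in outputs:
        print(out.outputs[0].text)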
[Usage]: The performance improvements in vLLM 0.6.4 are not reflected on Qwen2.5
usage (How to use vllm)
#11482 opened Dec 25, 2024 by umie0128
[New Model]: QVQ-72B-Preview
new model (Requests for new models)
#11479 opened Dec 25, 2024 by ZB052-A
[Feature]: Prefix cache aware load balancing
feature request
#11477 opened Dec 25, 2024 by gaocegege
[Usage]: Missing OpenAI templates
usage (How to use vllm)
#11474 opened Dec 25, 2024 by JohnConnor123
[Usage]: How to figure out why vLLM returns nothing while TRT-LLM returns a meaningful result
usage (How to use vllm)
#11473 opened Dec 25, 2024 by GGBond8488
[Usage]: About the '--chat-template' parameter for model google/paligemma2-3b-ft-docci-448
usage (How to use vllm)
#11471 opened Dec 24, 2024 by llv22
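For reference, the flag is passed when launching vLLM's OpenAI-compatible server; a minimal sketch, with the template path as a placeholder:

    python -m vllm.entrypoints.openai.api_server \
        --model google/paligemma2-3b-ft-docci-448 \
        --chat-template ./my_template.jinja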
[Bug]: Error when starting on multiple GPUs
bug (Something isn't working)
#11467 opened Dec 24, 2024 by weiminw
[Installation]: Is there a good solution for deploying gemma-2-27b on a V100? Deployment has been consistently unsuccessful
installation (Installation problems)
#11462 opened Dec 24, 2024 by 3252152
[Misc]: Some minor issues in the disaggregation test and benchmark tools
misc
#11455 opened Dec 24, 2024 by Jeffwan
[Bug]: InternVL2-40B inference precision problem
bug (Something isn't working)
#11454 opened Dec 24, 2024 by renhedev
[Usage]: Trying to add the codeshell 7b model, but got an error
usage (How to use vllm)
#11451 opened Dec 24, 2024 by G1017
[Bug]: The value of --max-model-len may influence results even when the input length is less than max-model-len
bug (Something isn't working)
#11447 opened Dec 24, 2024 by Raphaelzrf
[Bug]: Prefill/decode separation leads to blocking and crashes under concurrent load
bug (Something isn't working)
#11445 opened Dec 24, 2024 by skyCreateXian
[Performance]: Prefill does not use CUDA graphs and becomes very slow when LoRA is enabled
performance (Performance-related issues)
#11436 opened Dec 23, 2024 by niuzheng168
[Feature]: AssertionError: MolmoForCausalLM does not support LoRA yet.
feature request
#11431 opened Dec 23, 2024 by ayylemao
[Bug]: 0.6.5 randomly closes connections/drops requests
bug (Something isn't working)
#11421 opened Dec 23, 2024 by 0xymoro
[Bug]: v0.6.5 breaks AI SDK's generateObject with nullable strings in schema ("type mismatch! call is<type>() before get<type>()" && is<std::string>())
bug (Something isn't working)
#11415 opened Dec 22, 2024 by kldzj
[Misc]: How to profile both EngineCoreClient and EngineCoreProc activities in V1 using the profiler
misc
#11413 opened Dec 22, 2024 by baifanxxx
Error when running 'python -m vllm.entrypoints.openai.api_server'
usage (How to use vllm)
#11411 opened Dec 22, 2024 by SetonLiang
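For comparison, a minimal invocation of the server that a standard install is expected to accept (the model name and port are placeholders):

    python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2.5-7B-Instruct --port 8000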
[RFC]: Two features I wish vLLM had
RFC
#11410 opened Dec 22, 2024 by MohamedAliRashad
[Feature]: (Willing to PR) Avoid KV cache occupying GPU memory when not used
feature request
#11408 opened Dec 22, 2024 by fzyzcjy