[Bugfix] Add fully sharded layer for QKVParallelLinearWithLora #5665

Merged · 7 commits · Jun 21, 2024

Changes from 1 commit
format
jeejeelee committed Jun 19, 2024
commit 5766959b55de65fb031ad4d3a692d7dc622108aa
20 changes: 11 additions & 9 deletions tests/lora/test_baichuan.py
@@ -81,14 +81,16 @@ def test_baichuan_tensor_parallel_equality(baichuan_lora_files):
     del llm_tp1
     cleanup()
 
-    llm_tp2 = vllm.LLM(MODEL_PATH,
-                       enable_lora=True,
-                       max_num_seqs=16,
-                       max_loras=4,
-                       max_lora_rank=64,
-                       tensor_parallel_size=2,
-                       trust_remote_code=True,
-                       fully_sharded_loras=True,) ## Verify fully sharded lora
+    llm_tp2 = vllm.LLM(
+        MODEL_PATH,
+        enable_lora=True,
+        max_num_seqs=16,
+        max_loras=4,
+        max_lora_rank=64,
+        tensor_parallel_size=2,
+        trust_remote_code=True,
+        fully_sharded_loras=True,
+    ) ## Verify fully sharded lora
     output_tp2 = do_sample(llm_tp2, baichuan_lora_files, lora_id=2)
 
     del llm_tp2
@@ -108,4 +110,4 @@ def test_baichuan_tensor_parallel_equality(baichuan_lora_files):
     del llm_tp4
     cleanup()
 
-    assert output_tp1 == output_tp4
\ No newline at end of file
+    assert output_tp1 == output_tp4
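For context, the pattern the updated test exercises is roughly the sketch below. This is a minimal reconstruction, not the verbatim test: it assumes llm_tp1 was built earlier in the same test with tensor_parallel_size=1 and the same LoRA settings, and that MODEL_PATH, baichuan_lora_files, and do_sample (which returns the generated completions) come from the surrounding test module.

import vllm

# Build a tensor_parallel_size=2 engine with the fully sharded LoRA path
# enabled, mirroring the diff above.
llm_tp2 = vllm.LLM(
    MODEL_PATH,
    enable_lora=True,
    max_num_seqs=16,
    max_loras=4,
    max_lora_rank=64,
    tensor_parallel_size=2,
    trust_remote_code=True,
    fully_sharded_loras=True,  # exercise the fully sharded LoRA layers
)
output_tp2 = do_sample(llm_tp2, baichuan_lora_files, lora_id=2)

# The fully sharded LoRA layers must reproduce the single-GPU LoRA outputs.
assert output_tp1 == output_tp2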