Commit

Merge branch 'main' of https://github.com/wyf3/llm_related
wyf3 committed Dec 7, 2024
2 parents f127a81 + fd93b61 commit 9069f8a
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions train_multimodal_from_scratch/README.md
@@ -6,7 +6,7 @@ qwen2.5-0.5b:\
https://hf-mirror.com/Qwen/Qwen2.5-0.5B-Instruct\
siglip:\
The following SigLIP version is used here (the model is small, so quality may be somewhat lower, but it trains faster and needs less GPU memory):\
-https://hf-mirror.com/google/siglip-base-patch16-384
+https://hf-mirror.com/google/siglip-base-patch16-224

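A better-performing version can also be used, but the model is larger (note: using this version may require changing the image_pad_num parameter; this model outputs image features of shape (b,729,dim), which are reshaped to (b,729/9,dim*9) during image compression):\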
https://hf-mirror.com/google/siglip-so400m-patch14-384
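
The reshape described above, from (b,729,dim) to (b,729/9,dim*9), can be sketched in plain Python. This is only an illustration of the shape arithmetic, not the repository's code: `compress_tokens`, `shape`, and the toy sizes `b=2`, `dim=4` are hypothetical names and values chosen for the example. It assumes a row-major reshape, which concatenates the features of every 9 consecutive image tokens into one longer vector.

```python
def compress_tokens(features, group=9):
    """Merge every `group` consecutive token vectors into one vector.

    `features` is a nested list of shape (b, n, dim). The result has shape
    (b, n // group, dim * group), matching a row-major reshape.
    """
    compressed = []
    for tokens in features:
        n = len(tokens)
        if n % group != 0:
            raise ValueError(f"token count {n} is not divisible by {group}")
        merged = []
        for start in range(0, n, group):
            vec = []
            for tok in tokens[start:start + group]:
                vec.extend(tok)  # concatenate features of neighbouring tokens
            merged.append(vec)
        compressed.append(merged)
    return compressed


def shape(x):
    """Return (b, n, dim) for a nested list of token features."""
    return (len(x), len(x[0]), len(x[0][0]))


# Toy input: b=2 images, 729 tokens each, dim=4 (hypothetical small dim).
b, n, dim = 2, 729, 4
features = [[[float(t)] * dim for t in range(n)] for _ in range(b)]

compressed = compress_tokens(features, group=9)
print(shape(features))    # (2, 729, 4)
print(shape(compressed))  # (2, 81, 36)
```

With 729 tokens and a group size of 9, each image ends up with 81 compressed tokens. Presumably this is the count that image_pad_num has to agree with, but that link is an assumption here and should be checked against the model code.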
@@ -41,4 +41,4 @@ SFT:\
deepspeed --include 'localhost:0,1' sft_train.py

## Testing
-python test.py
+python test.py
