https://github.com/jingyaogong/minimind
Pretraining (single process):
python train.py
SFT:
python sft_train.py
Pretraining (torchrun, 2 processes on one node):
torchrun --nproc_per_node=2 train.py
SFT:
torchrun --nproc_per_node=2 sft_train.py
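torchrun starts one process per GPU and passes each process its position through the RANK, LOCAL_RANK and WORLD_SIZE environment variables. The sketch below shows one way a script could read them, falling back to single-process values so the same file also runs under plain `python`. The helper name read_dist_env is hypothetical, and whether train.py or sft_train.py in this repository does it this way is an assumption, not something these notes confirm.

```python
import os


def read_dist_env(env=None):
    """Return (rank, local_rank, world_size) from torchrun-style variables.

    Hypothetical helper for illustration. Missing variables fall back to
    the values a single, non-distributed process would have.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))               # global index of this process
    local_rank = int(env.get("LOCAL_RANK", 0))   # index on this node, usually the GPU id
    world_size = int(env.get("WORLD_SIZE", 1))   # total number of processes
    return rank, local_rank, world_size


if __name__ == "__main__":
    rank, local_rank, world_size = read_dist_env()
    print(f"rank={rank} local_rank={local_rank} world_size={world_size}")
```

With --nproc_per_node=2, each of the two processes would see WORLD_SIZE=2 and a different LOCAL_RANK (0 or 1), which a script would typically use to pick its GPU.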
Pretraining (DeepSpeed, GPUs 0 and 1 on localhost):
deepspeed --include 'localhost:0,1' train.py
SFT:
deepspeed --include 'localhost:0,1' sft_train.py
test_llm.ipynb