sentient-io/llm-chatglm-training

Fine-tuning a local ChatGLM-6B model with ChatGLM Efficient Tuning

Requirements

  • Python 3.8+ and PyTorch 1.13.1+
  • 🤗Transformers, Datasets, Accelerate, PEFT and TRL
  • fire, protobuf, cpm-kernels and sentencepiece
  • jieba, rouge-chinese and nltk (used at evaluation)
  • gradio and matplotlib (used in train_web.py)
  • uvicorn, fastapi and sse-starlette (used in api_demo.py)

And powerful GPUs!

Getting Started

  1. Clone the repository: git clone https://github.com/sentient-io/llm-chatglm-training.git
  2. Run pip install -r requirements.txt
  3. Update the model path (CHATGLM_REPO_NAME) in config.py
  4. Prepare a dataset (for example with the self-instruct method, as Alpaca does)
  5. Fine-tune on a single GPU:
CUDA_VISIBLE_DEVICES=0 python3 src/train_bash.py \
    --stage sft \
    --model_name_or_path <path_to_model> \
    --do_train \
    --dataset <dataset_key_in_dataset_info.json> \
    --finetuning_type lora \
    --output_dir <path_to_output_checkpoints> \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 2 \
    --save_steps 2 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --fp16

Example

CUDA_VISIBLE_DEVICES=0 python3 src/train_bash.py \
    --stage sft \
    --model_name_or_path /root/glm/chatglm2-6b \
    --do_train \
    --dataset wine_en \
    --finetuning_type lora \
    --output_dir /root/glm/_wine_checkpoints \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 2 \
    --save_steps 2 \
    --learning_rate 5e-5 \
    --num_train_epochs 12.0 \
    --fp16
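
The --dataset value (wine_en in the example) is a key in dataset_info.json rather than a path to a data file. The following is a minimal sketch of writing an Alpaca-style dataset and registering it under such a key. The helper name, the file name wine_en.json, the sample record, and the exact shape of the dataset_info.json entry (file_name plus a columns mapping) are assumptions modelled on the upstream ChatGLM Efficient Tuning layout, so compare them with the data directory of your checkout before relying on them.

```python
import json
import tempfile
from pathlib import Path


def write_alpaca_dataset(data_dir, name, records):
    """Write records to <data_dir>/<name>.json and register them in dataset_info.json."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # Alpaca-style records: one dict per example with instruction/input/output.
    data_file = data_dir / f"{name}.json"
    data_file.write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )

    # Assumed registry format: dataset key -> file name and column mapping.
    info_file = data_dir / "dataset_info.json"
    info = json.loads(info_file.read_text(encoding="utf-8")) if info_file.exists() else {}
    info[name] = {
        "file_name": data_file.name,
        "columns": {"prompt": "instruction", "query": "input", "response": "output"},
    }
    info_file.write_text(
        json.dumps(info, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return data_file, info_file


# Hypothetical sample record, for illustration only.
sample_records = [
    {
        "instruction": "Suggest a wine to serve with grilled salmon.",
        "input": "",
        "output": "A light Pinot Noir or a dry Riesling would pair well.",
    }
]

if __name__ == "__main__":
    # Write into a temporary directory so the sketch has no side effects.
    with tempfile.TemporaryDirectory() as tmp:
        data_file, info_file = write_alpaca_dataset(tmp, "wine_en", sample_records)
        print(info_file.read_text(encoding="utf-8"))
```

With the data file and the registry entry in the project's data directory, the key (wine_en here) is what gets passed to --dataset.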

Potential Bugs

If you hit a CUDA out-of-memory error, try 1) reducing the batch size, or 2) setting quantization_bit to 8 or 4.
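
Reducing the per-device batch size also shrinks the effective batch size unless gradient accumulation is raised to compensate. The arithmetic can be sketched as follows; the helper name is hypothetical, and the formula is the usual product of per-device batch size, accumulation steps, and number of GPUs.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus=1):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_gpus


# Settings from the commands above: batch size 2, accumulation 4, one GPU.
print(effective_batch_size(2, 4))  # 8

# Halving the per-device batch and doubling accumulation keeps it at 8,
# which typically lowers peak activation memory at the cost of more steps.
print(effective_batch_size(1, 8))  # 8
```

In terms of the command line, that corresponds to --per_device_train_batch_size 1 together with --gradient_accumulation_steps 8.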

Please refer to the ChatGLM Efficient Tuning Wiki for details of the arguments.

Checkpoints are saved in output_dir. To export the fine-tuned ChatGLM-6B model and get the weights, see step 6.

  6. Export model
python src/export_model.py \
    --model_name_or_path path_to_your_chatglm_model \
    --checkpoint_dir path_to_checkpoint \
    --output_dir path_to_export
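
Once exported, the directory can presumably be loaded the same way as the upstream ChatGLM checkpoints. The sketch below assumes a CUDA GPU, the transformers package from the requirements, and the chat method that the upstream ChatGLM model code provides; load_exported_model is a hypothetical helper name. If loading fails because Python files are missing from the export directory, the note that follows applies.

```python
def load_exported_model(export_dir):
    """Load an exported ChatGLM model directory in half precision on the GPU."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # trust_remote_code is needed because ChatGLM ships its own model and tokenizer code.
    tokenizer = AutoTokenizer.from_pretrained(export_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(export_dir, trust_remote_code=True).half().cuda()
    return tokenizer, model.eval()


if __name__ == "__main__":
    import sys

    # Usage: python load_model.py path_to_export
    if len(sys.argv) > 1:
        tokenizer, model = load_exported_model(sys.argv[1])
        # chat() is defined by the ChatGLM remote code, not by transformers itself.
        response, history = model.chat(tokenizer, "Hello", history=[])
        print(response)
```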

You may need to add the following files from https://huggingface.co/THUDM/chatglm-6b or https://huggingface.co/THUDM/chatglm2-6b (whichever matches your base model) to output_dir:

  • tokenization_chatglm.py
  • modeling_chatglm.py
  • configuration_chatglm.py
  • quantization.py
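
A quick way to see which of these files still need to be copied is to check the export directory directly. This is a small sketch using only the standard library; missing_model_files is a hypothetical helper, and the file list is the one given above.

```python
import tempfile
from pathlib import Path

# Files listed above that the exported model directory may be missing.
REQUIRED_FILES = [
    "tokenization_chatglm.py",
    "modeling_chatglm.py",
    "configuration_chatglm.py",
    "quantization.py",
]


def missing_model_files(output_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in output_dir."""
    out = Path(output_dir)
    return [name for name in required if not (out / name).is_file()]


if __name__ == "__main__":
    # Demonstration on a temporary directory that contains only one of the files.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "modeling_chatglm.py").touch()
        print(missing_model_files(tmp))
```

Pointing missing_model_files at the real export directory lists what remains to be downloaded from the model page.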

You may also need to edit tokenizer_config.json and remove the "clean_up_tokenization_spaces": false entry.
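
The tokenizer_config.json edit can be done by hand or scripted. The sketch below uses only the standard library; the helper name is hypothetical, and it assumes tokenizer_config.json sits at the top level of the export directory.

```python
import json
import tempfile
from pathlib import Path

KEY = "clean_up_tokenization_spaces"


def drop_clean_up_tokenization_spaces(output_dir):
    """Remove the clean_up_tokenization_spaces key; return True if it was present."""
    config_path = Path(output_dir) / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    removed = KEY in config
    if removed:
        del config[KEY]
        # Rewrite the file only when something actually changed.
        config_path.write_text(
            json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    return removed


if __name__ == "__main__":
    # Demonstration on a throwaway config file with made-up contents.
    with tempfile.TemporaryDirectory() as tmp:
        sample = {"do_lower_case": False, KEY: False}
        (Path(tmp) / "tokenizer_config.json").write_text(json.dumps(sample), encoding="utf-8")
        print(drop_clean_up_tokenization_spaces(tmp))
        print((Path(tmp) / "tokenizer_config.json").read_text(encoding="utf-8"))
```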
