Fine-tuning a local ChatGLM-6B model with ChatGLM Efficient Tuning
Requirements:
- Python 3.8+ and PyTorch 1.13.1+
- 🤗Transformers, Datasets, Accelerate, PEFT and TRL
- fire, protobuf, cpm-kernels and sentencepiece
- jieba, rouge-chinese and nltk (used at evaluation)
- gradio and matplotlib (used in train_web.py)
- uvicorn, fastapi and sse-starlette (used in api_demo.py)
And powerful GPUs!
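Optionally, sanity-check the environment before training. A minimal script (the expected versions come from the list above):

import torch
import transformers

print("PyTorch:", torch.__version__)          # expect 1.13.1+
print("Transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))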
1) Clone the repository
git clone https://github.com/sentient-io/llm-chatglm-training.git
2) Run
pip install -r requirements.txt
3) Update the model path (CHATGLM_REPO_NAME) in config.py
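The exact layout of config.py depends on the repository, but conceptually this is a one-line change; a sketch, where the path below is only an example (point it at wherever your local ChatGLM weights live):

# config.py -- point CHATGLM_REPO_NAME at your local copy of the model weights
CHATGLM_REPO_NAME = "/root/glm/chatglm2-6b"  # example path; adjust to your setup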
4) Prepare the dataset, e.g. with a self-instruct method like Alpaca (see the sketch below)
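A dataset is typically a JSON file of Alpaca-style instruction/input/output records, registered under a key in data/dataset_info.json; that key is what --dataset expects. A minimal sketch (the field names follow the Alpaca convention and the dataset_info.json schema may differ across versions of the repo). For example, data/wine_en.json:

[
  {
    "instruction": "Describe this wine.",
    "input": "A 2015 Margaux red",
    "output": "A full-bodied Bordeaux blend with firm tannins and dark fruit."
  }
]

and the corresponding entry in data/dataset_info.json:

{
  "wine_en": {
    "file_name": "wine_en.json"
  }
}

Training then refers to the dataset by its key, i.e. --dataset wine_en.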
5) Fine-tune with a single GPU
CUDA_VISIBLE_DEVICES=0 python3 src/train_bash.py \
--stage sft \
--model_name_or_path <path_to_model> \
--do_train \
--dataset <dataset_key_in_dataset_info.json> \
--finetuning_type lora \
--output_dir <path_to_output_checkpoints> \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 2 \
--save_steps 2 \
--learning_rate 5e-5 \
--num_train_epochs 3.0 \
--fp16
Example
CUDA_VISIBLE_DEVICES=0 python3 src/train_bash.py \
--stage sft \
--model_name_or_path /root/glm/chatglm2-6b \
--do_train \
--dataset wine_en \
--finetuning_type lora \
--output_dir /root/glm/_wine_checkpoints \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 4 \
--lr_scheduler_type cosine \
--logging_steps 2 \
--save_steps 2 \
--learning_rate 5e-5 \
--num_train_epochs 12.0 \
--fp16
Potential Bugs
- issubclass() arg 1 must be a class
Fix: pip3 install --force-reinstall typing-extensions==4.5.0
(see https://stackoverflow.com/questions/2464568/can-someone-explain-what-exactly-this-error-means-typeerror-issubclass-arg-1)
- cannot import name 'soft_unicode' from 'markupsafe'
Fix: pip3 install markupsafe==2.0.1
(see https://stackoverflow.com/questions/72191560/importerror-cannot-import-name-soft-unicode-from-markupsafe)
If you run into CUDA out-of-memory errors, try 1) reducing the batch size or 2) setting --quantization_bit to 8 or 4, as in the sketch below.
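For example, the example command above can be adapted to a smaller per-device batch (compensated with more gradient accumulation) and a 4-bit quantized base model; --quantization_bit is documented in the ChatGLM Efficient Tuning repo, and the exact memory savings depend on your GPU:

CUDA_VISIBLE_DEVICES=0 python3 src/train_bash.py \
--stage sft \
--model_name_or_path /root/glm/chatglm2-6b \
--do_train \
--dataset wine_en \
--finetuning_type lora \
--output_dir /root/glm/_wine_checkpoints \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 8 \
--quantization_bit 4 \
--lr_scheduler_type cosine \
--logging_steps 2 \
--save_steps 2 \
--learning_rate 5e-5 \
--num_train_epochs 12.0 \
--fp16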
Please refer to the ChatGLM Efficient Tuning wiki for details about the arguments.
Checkpoints are saved at output_dir; to export the fine-tuned ChatGLM-6B model and get the weights, see step 6) below.
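Before exporting, you can also try the fine-tuned checkpoint interactively. Upstream ChatGLM Efficient Tuning ships a CLI demo that loads the base model plus a LoRA checkpoint; assuming this repo keeps that script:

python3 src/cli_demo.py \
--model_name_or_path /root/glm/chatglm2-6b \
--checkpoint_dir /root/glm/_wine_checkpoints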
6) Export the model
python src/export_model.py \
--model_name_or_path path_to_your_chatglm_model \
--checkpoint_dir path_to_checkpoint \
--output_dir path_to_export
You may need to copy the following files from https://huggingface.co/THUDM/chatglm-6b or https://huggingface.co/THUDM/chatglm2-6b into the output_dir:
- tokenization_chatglm.py
- modeling_chatglm.py
- configuration_chatglm.py
- quantization.py
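For example, assuming your local base-model copy already contains these files:

for f in tokenization_chatglm.py modeling_chatglm.py configuration_chatglm.py quantization.py; do
  cp /root/glm/chatglm2-6b/$f path_to_export/
done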
You may also need to update tokenizer_config.json to remove the entry "clean_up_tokenization_spaces": false.
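Once the files are in place, you can sanity-check the exported model with 🤗 Transformers; trust_remote_code=True is required because ChatGLM ships custom model code (the files copied above):

from transformers import AutoModel, AutoTokenizer

export_dir = "path_to_export"  # the --output_dir used in the export step

tokenizer = AutoTokenizer.from_pretrained(export_dir, trust_remote_code=True)
model = AutoModel.from_pretrained(export_dir, trust_remote_code=True).half().cuda()
model = model.eval()

# ChatGLM's custom model code exposes a chat() helper
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)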