LLM Training Framework

A flexible and efficient framework for training and fine-tuning Large Language Models (LLMs) using PyTorch and Hugging Face Transformers.

Features

  • Support for various pre-trained models from Hugging Face
  • Efficient data processing and tokenization
  • Checkpoint saving and model versioning
  • GPU acceleration support
  • Customizable training parameters
  • Progress tracking with tqdm

Project Structure

xl/
├── model.py        # Model architecture definition
├── dataset.py      # Data loading and processing
├── train.py        # Training loop implementation
├── requirements.txt # Project dependencies

Requirements

  • Python 3.8+
  • PyTorch 2.1.0+
  • Transformers 4.35.0+
  • CUDA-capable GPU (recommended)

Installation

  1. Clone the repository:

git clone [your-repository-url]
cd xl

  2. Install dependencies:

pip install -r requirements.txt

Usage

Basic Training

  1. Prepare your training data as a list of text strings (a file-loading example follows the snippet below).

  2. Run the training script:

from train import train

texts = [
    "Your training text 1",
    "Your training text 2",
    # ... more training texts
]

model = train(
    texts,
    model_name="gpt2",  # or any other model from Hugging Face
    num_epochs=3,
    batch_size=8
)
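
If your training corpus lives on disk, you can build the texts list from a plain-text file with one example per line. The file path below is only a placeholder for your own data.

texts = []
with open("data/train.txt", encoding="utf-8") as f:  # placeholder path
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            texts.append(line)

model = train(texts, model_name="gpt2")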

Configuration Options

The training function accepts the following parameters (see the example call after this list):

  • model_name: Name of the pre-trained model (default: "gpt2")
  • output_dir: Directory for saving checkpoints (default: "checkpoints")
  • num_epochs: Number of training epochs (default: 3)
  • batch_size: Batch size for training (default: 8)
  • learning_rate: Learning rate (default: 5e-5)
  • max_length: Maximum sequence length (default: 512)
  • device: Training device ("cuda" or "cpu")
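
Putting these options together, a fully specified call might look like the one below. It simply spells out the defaults listed above (keyword arguments are assumed, as in the Basic Training example); adjust the values for your hardware and data.

import torch
from train import train

model = train(
    texts,
    model_name="gpt2",
    output_dir="checkpoints",
    num_epochs=3,
    batch_size=8,
    learning_rate=5e-5,
    max_length=512,
    device="cuda" if torch.cuda.is_available() else "cpu",
)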

Customization

You can customize the training process by:

  1. Modifying the model architecture in model.py
  2. Adjusting data processing in dataset.py (a sketch of a typical tokenized dataset follows this list)
  3. Changing training parameters in train.py
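
As an illustration of the kind of data processing that typically lives in dataset.py, here is a minimal sketch of a tokenized text dataset for causal language modeling. The class name and details are hypothetical, not the repository's actual code.

from torch.utils.data import Dataset
from transformers import AutoTokenizer

class TextDataset(Dataset):
    """Hypothetical sketch: tokenizes a list of strings for causal LM training."""

    def __init__(self, texts, model_name="gpt2", max_length=512):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        if self.tokenizer.pad_token is None:
            # GPT-2 has no pad token; reuse the end-of-sequence token.
            self.tokenizer.pad_token = self.tokenizer.eos_token
        self.encodings = self.tokenizer(
            texts,
            truncation=True,
            max_length=max_length,
            padding="max_length",
            return_tensors="pt",
        )

    def __len__(self):
        return self.encodings["input_ids"].size(0)

    def __getitem__(self, idx):
        input_ids = self.encodings["input_ids"][idx]
        attention_mask = self.encodings["attention_mask"][idx]
        # For causal language modeling the labels are the input ids themselves.
        return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": input_ids}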

Model Checkpoints

During training, the script saves checkpoints to the following paths (see the loading example after this list):

  • Each epoch: checkpoints/checkpoint-epoch-{epoch_number}
  • Final model: checkpoints/final-model
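
Assuming train.py writes checkpoints in the standard Hugging Face save_pretrained format (an assumption about the script, not something this README guarantees), a saved model can be reloaded for inference or further fine-tuning like this:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: both the model and tokenizer were written with save_pretrained().
model = AutoModelForCausalLM.from_pretrained("checkpoints/final-model")
tokenizer = AutoTokenizer.from_pretrained("checkpoints/final-model")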

Performance Tips

  1. Use GPU acceleration when possible
  2. Adjust batch size based on available memory
  3. Monitor training loss for optimal learning rate
  4. Use gradient accumulation to achieve larger effective batch sizes (see the sketch after this list)
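
Gradient accumulation is not shown in the Basic Training example; the core pattern is sketched below in generic PyTorch. This is not the repository's train.py; it assumes model, optimizer, and dataloader already exist.

# Sketch only: assumes `model`, `optimizer`, and `dataloader` are already set up
# (for example, the objects created inside the training loop).
accumulation_steps = 4  # effective batch size = batch_size * accumulation_steps

optimizer.zero_grad()
for step, batch in enumerate(dataloader):
    outputs = model(**batch)                  # forward pass
    loss = outputs.loss / accumulation_steps  # scale so accumulated gradients average out
    loss.backward()                           # accumulate gradients
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                      # update weights every accumulation_steps batches
        optimizer.zero_grad()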

Contributing

Contributions are welcome! Please feel free to submit pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Hugging Face Transformers library
  • PyTorch team
  • Open source AI community

Contact

[Your Contact Information]
