An image classification model based on the YOLOv8 architecture, implemented in PyTorch. I use a custom dataset of 500 bird species, containing about 80,000 images, for training, validation, and testing.
- Python3
- PyTorch
pip install torch # PyTorch library (the pip package is named torch)
pip install torchsummary # model summary
pip install torchvision # PyTorch utilities for vision
NB: Update the libraries to their latest versions before training.
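To confirm what is installed before updating, a small check like the one below may help. It is only a sketch: it uses the standard-library `importlib.metadata`, and the helper name `installed_versions` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "torchvision", "torchsummary")):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Any package reported as not installed can be added with the pip commands above.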
⬇️⬇️Download and extract the training dataset from Kaggle: 500 bird species dataset
⬇️⬇️Download pretrained model: Model
Run the following scripts for training and/or testing
python train.py # For training the model
🤗🤗Hugging Face version: Hugging Face
Run the following scripts for visual result of model:
1. Install Docker and open a terminal (CMD on Windows)
2. Pull my image
docker pull vvduc1803/500bird_cls:latest # Pull image
3. Start the container
docker run -it -d --name 500_bird_cls -p 1234:1234 vvduc1803/500bird_cls # Run container
4. Follow the container logs to see the visual result
docker logs -f 500_bird_cls # Run visual result
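If the container starts but nothing seems reachable, a quick check of the published port can help. The sketch below assumes the container serves over TCP on port 1234, as the `-p 1234:1234` mapping suggests; the helper name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host="127.0.0.1", port=1234, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    state = "open" if port_is_open() else "closed"
    print(f"Port 1234 on localhost is {state}")
```

A closed port usually means the container is not running or the port mapping was not applied; `docker ps` shows both.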
| | Accuracy (%) | Size | Training Epochs | Training Mode |
|---|---|---|---|---|
| Model | 74.37 | 415.2 MB | 40 | scratch |
Batch size: 64, GPU: RTX 3050 4G
Model:
Sample classification results
Accuracy of the network on the 2500 test images: 76.38%
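For reference, a top-1 accuracy figure like the one above is usually computed as the share of test images whose highest-scoring class matches the label. The following is a minimal pure-Python sketch of that calculation, not the repository's evaluation code; `top1_accuracy` and the toy scores are made up for illustration.

```python
def top1_accuracy(scores, labels):
    """Percentage of samples whose highest-scoring class equals the label.

    scores: one list of per-class scores per sample.
    labels: one integer class index per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 4 samples and 3 classes (illustrative values only).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
example_labels = [1, 0, 1, 0]
print(f"{top1_accuracy(example_scores, example_labels):.2f}%")  # 3 of 4 correct
```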
- With 38M parameters, the model is very large (about 0.4 GB) compared to other models such as ResNet18 (about 40 MB).
- Adjusting parameters such as batch size, number of workers, and pin_memory may help you reduce training time, especially if you have a big dataset and high-end hardware.
- Adjusting parameters such as learning rate and weight decay may help you improve the model.
- Experiment with different learning rates and optimizers.
- Converting and optimizing pytorch models for mobile deployment.
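
The tuning notes above can be sketched as plain configuration helpers. This is an illustration, not the settings used in `train.py`: the helper names, the worker cap of 8, and the candidate learning rates and weight decays are all assumptions. Only the batch size of 64 comes from this README. The keys match real keyword arguments of `torch.utils.data.DataLoader` and of `torch.optim.SGD` / `torch.optim.Adam`.

```python
import itertools
import os


def loader_kwargs(batch_size=64, use_gpu=True, cpu_count=None):
    """Keyword arguments for torch.utils.data.DataLoader (assumed defaults)."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Leave one core for the main process; the cap of 8 is an arbitrary choice.
    workers = max(0, min(8, cpus - 1))
    return {
        "batch_size": batch_size,
        "shuffle": True,
        "num_workers": workers,
        # Pinned host memory can speed up CPU-to-GPU copies.
        "pin_memory": use_gpu,
    }


def experiment_grid(
    optimizers=("SGD", "Adam"),
    learning_rates=(1e-2, 1e-3),
    weight_decays=(0.0, 1e-4),
):
    """All combinations of optimizer name, learning rate, and weight decay."""
    return [
        {"optimizer": opt, "lr": lr, "weight_decay": wd}
        for opt, lr, wd in itertools.product(
            optimizers, learning_rates, weight_decays
        )
    ]


if __name__ == "__main__":
    print(loader_kwargs())
    for cfg in experiment_grid():
        print(cfg)
```

With PyTorch installed, these would be used roughly as `DataLoader(train_dataset, **loader_kwargs())` and `getattr(torch.optim, cfg["optimizer"])(model.parameters(), lr=cfg["lr"], weight_decay=cfg["weight_decay"])`, where `train_dataset` and `model` stand for your own objects.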
Van Duc