title | emoji | colorFrom | colorTo | sdk | app_file | pinned |
---|---|---|---|---|---|---|
t5s | 💯 | yellow | red | streamlit | app.py | false |
T5 Summarisation Using PyTorch Lightning, DVC, DagsHub and HuggingFace Spaces
This repository contains the project's code along with its data, models, pipelines and experiments. The project is therefore easy to reproduce on any machine, and you can contribute data, models, and code to it.
Have a great idea for how to improve the model? Want to add data and metrics to make it more explainable/fair? We'd love to get your help.
Blog: https://dagshub.com/blog/machine-summarization-an-open-data-science-project/
To use and run the DVC pipeline, install the t5s package:
pip install t5s
First, clone the repo containing the code:
t5s clone
Then create the directories required to run the pipeline:
t5s dirs
Next, define the parameters for the run:
t5s start [-h] [-d DATASET] [-s SPLIT] [-n NAME] [-mt MODEL_TYPE]
[-m MODEL_NAME] [-e EPOCHS] [-lr LEARNING_RATE]
[-b BATCH_SIZE]
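As a rough illustration of how the flags in the usage line fit together, the sketch below assembles one possible invocation and prints it for review. Every value (dataset, split, run name, model type, model name, epochs, learning rate, batch size) is an assumption chosen for the example, not a documented default; `t5s start -h` should list what each option actually accepts.

```shell
# Illustrative values only; none of these are documented defaults.
DATASET="cnn_dailymail"   # assumed dataset name
SPLIT="0.1"               # assumed split value
NAME="demo-run"           # assumed run name
MODEL_TYPE="t5"           # assumed model type
MODEL_NAME="t5-small"     # assumed model name
EPOCHS="3"
LEARNING_RATE="0.0001"
BATCH_SIZE="4"

# Collect the flags from the usage line as positional parameters.
set -- \
  -d "$DATASET" -s "$SPLIT" -n "$NAME" \
  -mt "$MODEL_TYPE" -m "$MODEL_NAME" \
  -e "$EPOCHS" -lr "$LEARNING_RATE" -b "$BATCH_SIZE"

# Print the full command for review before running it.
echo "t5s start $*"

# Once the values look right, launch it with:
#   t5s start "$@"
```

Keeping the values in variables makes it easy to change one parameter between runs without retyping the whole command.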
Then pull the models from DVC:
t5s pull
To run the training pipeline:
t5s run
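The steps from cloning through training can be chained so that a failure stops the sequence early. The following is a minimal sketch, assuming all of the commands are run from the same working directory; `t5s_train` is a hypothetical helper name, and any arguments given to it are passed through to `t5s start`.

```shell
# t5s_train [START_ARGS...]   (hypothetical helper, not part of the t5s package)
# Runs the documented steps in order; `&&` stops the chain at the first
# command that exits with a non-zero status.
t5s_train() {
  t5s clone &&
  t5s dirs &&
  t5s start "$@" &&
  t5s pull &&
  t5s run
}
```

For example, `t5s_train -e 3 -b 4` would run the five commands in sequence with those two options applied to `t5s start`.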
Before pushing, make sure that the DVC remote is set up correctly:
dvc remote modify origin url https://dagshub.com/{user_name}/summarization.dvc
dvc remote modify origin --local auth basic
dvc remote modify origin --local user {user_name}
dvc remote modify origin --local password {your_token}
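The four commands above differ only in the user name and token, so they can be wrapped in a small function. This is a sketch rather than part of the project: `configure_dagshub_remote` is a hypothetical helper name, and it assumes a DVC remote called `origin` already exists, as the commands above imply.

```shell
# configure_dagshub_remote USER TOKEN   (hypothetical helper name)
# Runs the same four `dvc remote modify` commands with the given
# DagsHub user name and access token filled in.
configure_dagshub_remote() {
  if [ "$#" -ne 2 ]; then
    echo "usage: configure_dagshub_remote USER TOKEN" >&2
    return 1
  fi
  dvc remote modify origin url "https://dagshub.com/$1/summarization.dvc" &&
  dvc remote modify origin --local auth basic &&
  dvc remote modify origin --local user "$1" &&
  dvc remote modify origin --local password "$2"
}
```

A call might look like `configure_dagshub_remote "$DAGSHUB_USER" "$DAGSHUB_TOKEN"`, where both variable names are assumptions; reading the token from the environment keeps it out of scripts and shell history.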
Finally, push the model to DVC:
t5s push
To push this model to the HuggingFace Hub for inference, run:
t5s upload
To test the model and visualise the results, run:
t5s visualize
This creates a Streamlit app for testing.