This repository provides the official PyTorch implementation of ContentVec.
This is a short video that explains the main concepts of our work. If you find this work useful and use it in your research, please consider citing our paper.
Pre-trained models (The download links currently have issues, which we will fix as soon as possible. For now, please request the pre-trained models by email.)
Model | Classes | Link |
---|---|---|
ContentVec_legacy | 100 | download |
ContentVec | 100 | download |
ContentVec_legacy | 500 | download |
ContentVec | 500 | download |
import fairseq
# Load the checkpoint; the returned ensemble holds a single model.
ckpt_path = "/path/to/the/checkpoint_best_legacy.pt"
models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0]
Download the zip file consisting of the following files:
- {train,valid}.tsv: waveform list files in metadata
- {train,valid}.km: frame-aligned pseudo label files in labels
- dict.km.txt: a dummy dictionary in labels
- spk2info.dict: a dictionary mapping from speaker id to speaker embedding in metadata
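
After unzipping, it can help to confirm that the files are where training expects them. The following is a minimal sketch, not part of this repo: it assumes the .tsv and spk2info.dict files sit in a metadata directory and the .km and dict.km.txt files in a labels directory, and that the files follow the usual fairseq HuBERT conventions (first .tsv line is the audio root, then one waveform per line; one line of labels per waveform in the .km file). The function names are hypothetical.

```python
from pathlib import Path

# Expected layout, assumed from the file list above.
EXPECTED_FILES = [
    "metadata/train.tsv",
    "metadata/valid.tsv",
    "metadata/spk2info.dict",
    "labels/train.km",
    "labels/valid.km",
    "labels/dict.km.txt",
]


def missing_files(data_root):
    """Return the expected files that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


def count_entries(data_root, split):
    """Return (number of waveforms, number of label lines) for one split.

    Assumption: the first .tsv line is the audio root directory, so it is
    not counted as a waveform. Blank lines are ignored in both files.
    """
    root = Path(data_root)
    with open(root / "metadata" / f"{split}.tsv") as f:
        n_waveforms = sum(1 for line in f if line.strip()) - 1
    with open(root / "labels" / f"{split}.km") as f:
        n_label_lines = sum(1 for line in f if line.strip())
    return n_waveforms, n_label_lines
```

If `missing_files` returns an empty list and the two counts from `count_entries` agree for both train and valid, the data is at least structurally consistent. This does not check that the labels are correctly frame-aligned.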
Modify the root directory in the {train,valid}.tsv waveform list files.
Follow the steps in setup.sh to set up the code repo.
Use run_pretrain_single.sh to run on a single node.
Use run_pretrain_multi.sh and the corresponding slurm template to run on multiple GPUs and nodes.
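
The root-directory step above can be scripted instead of edited by hand. The sketch below assumes the fairseq-style manifest format, in which the first line of each .tsv file is the audio root directory and the remaining lines are waveform entries. `set_manifest_root` is a hypothetical helper, not something provided by this repo.

```python
from pathlib import Path


def set_manifest_root(tsv_path, new_root):
    """Replace the first line of a .tsv manifest with new_root.

    Assumption: the first line holds the audio root directory and all
    other lines are waveform entries, which are left unchanged.
    Returns the previous root so the change can be logged or reverted.
    """
    path = Path(tsv_path)
    lines = path.read_text().splitlines()
    if not lines:
        raise ValueError(f"{tsv_path} is empty")
    old_root = lines[0]
    lines[0] = str(new_root)
    path.write_text("\n".join(lines) + "\n")
    return old_root
```

Call it once per split, for example on the train.tsv and valid.tsv files in the metadata directory, passing the directory where the audio lives on your machine.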