ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

This repository provides the official PyTorch implementation of ContentVec.

A short video explains the main concepts of our work. If you find this work useful in your research, please consider citing our paper.

Pre-trained models (The download links are currently broken and will be fixed as soon as possible. In the meantime, please request the pretrained models by email.)

Model	Classes	Link
ContentVec_legacy	100	download
ContentVec	100	download
ContentVec_legacy	500	download
ContentVec	500	download

Load a model without setting up code repo

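import fairseq

# Path to a downloaded legacy checkpoint
ckpt_path = "/path/to/the/checkpoint_best_legacy.pt"
models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0]  # the returned ensemble holds a single model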
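
Once loaded, the model can be used to extract frame-level representations. The sketch below assumes that the legacy checkpoint loads as a fairseq HuBERT-style model exposing `extract_features(source, padding_mask, mask, output_layer)`; this README does not confirm that interface. The helper name `extract_content_features` is hypothetical and used only for illustration.

```python
def extract_content_features(model, waveform, output_layer=None):
    """Return frame-level features for a batch of waveforms.

    Assumptions (not confirmed by this README): `model` follows the fairseq
    HuBERT interface, and `waveform` is a float tensor of shape
    (batch, num_samples) with no padding.
    """
    features, _padding_mask = model.extract_features(
        source=waveform,
        padding_mask=None,          # all waveforms in the batch have equal length
        mask=False,                 # disable the masking used during pre-training
        output_layer=output_layer,  # None returns the output of the last layer
    )
    return features
```

With a real checkpoint, one would typically call `model.eval()` first and run the helper inside `torch.no_grad()`, for example `feats = extract_content_features(model, wav)` where `wav` is a `(1, num_samples)` tensor.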

Train a new model

Data preparation

Download the zip file consisting of the following files:

{train,valid}.tsv waveform list files in metadata
{train,valid}.km frame-aligned pseudo label files in labels
dict.km.txt a dummy dictionary in labels
spk2info.dict a dictionary mapping from speaker id to speaker embedding in metadata
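
After unpacking, a quick sanity check can confirm that each waveform list lines up with its label file. This is a minimal sketch under stated assumptions: `metadata` and `labels` are taken to be subdirectories of one data directory, the `.tsv` files are assumed to follow the fairseq manifest convention (first line is the root directory, one utterance per following line), and the `.km` files are assumed to hold one line of labels per utterance. The function names are hypothetical.

```python
from pathlib import Path


def count_utterances(tsv_path):
    """Count manifest entries, skipping the first (root directory) line."""
    with open(tsv_path, encoding="utf-8") as f:
        lines = [ln for ln in f.read().splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)


def count_label_rows(km_path):
    """Count non-empty lines in a pseudo-label file."""
    with open(km_path, encoding="utf-8") as f:
        return sum(1 for ln in f if ln.strip())


def check_split(data_dir, split):
    """Return (num_utterances, num_label_rows) for one split."""
    data_dir = Path(data_dir)
    n_utt = count_utterances(data_dir / "metadata" / f"{split}.tsv")
    n_lab = count_label_rows(data_dir / "labels" / f"{split}.km")
    return n_utt, n_lab
```

If the two counts returned by `check_split(data_dir, "train")` or `check_split(data_dir, "valid")` differ, the manifest and the labels are likely out of sync.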

Modify the root directory in the {train,valid}.tsv waveform list files
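
The root directory can be changed by hand or with a short script. The sketch below assumes the fairseq manifest convention, in which the first line of each `.tsv` file holds the root directory and every later line holds a path relative to it; the function name `set_manifest_root` is hypothetical.

```python
def set_manifest_root(tsv_path, new_root):
    """Replace the first line (root directory) of a manifest file in place."""
    with open(tsv_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    if not lines:
        raise ValueError(f"{tsv_path} is empty")
    lines[0] = str(new_root)  # remaining lines (relative paths) stay unchanged
    with open(tsv_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

For example, calling `set_manifest_root("metadata/train.tsv", "/path/to/audio")` and then the same for `metadata/valid.tsv` would point both splits at a local copy of the audio, where both paths are placeholders.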

Setup code repo

Follow the steps in setup.sh to set up the code repo.

Pretrain ContentVec

Use run_pretrain_single.sh to run on a single node

Use run_pretrain_multi.sh with the corresponding Slurm template (contentvec_pretrain.slurm) to run on multiple GPUs and nodes
