GitHub - ga642381/FastSpeech2: Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech :fist:

Multi-speaker FastSpeech 2 - PyTorch Implementation ⚡

This is a PyTorch implementation of Microsoft's FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.
Now supporting about 900 speakers in 🔥 LibriTTS for multi-speaker text-to-speech.

Datasets 🐘

This project supports one single-speaker dataset and two multi-speaker datasets:

🔥 Single-Speaker

LJSpeech

🔥 Multi-Speaker

LibriTTS
VCTK

Config

Configurations are in:

config/dataset.yaml
config/hparams.py

Please modify dataset and mfa_path in hparams.

In this repo, we're using MFA v1. Migrating to MFA v2 is a TODO item.

Steps

preprocess.py
train.py
synthesize.py

1. Preprocess

File Structures:

[DATASET] / wavs / speaker / wav_files
[DATASET] / txts / speaker / txt_files
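Checking this layout before preprocessing can catch missing transcripts early. The sketch below is not part of the repository: the function name `find_unpaired` is hypothetical, and it assumes each `.wav` pairs with a same-named `.txt` under the matching speaker folder, which may not hold for every dataset (LibriTTS, for example, uses different transcript suffixes).

```python
from pathlib import Path


def find_unpaired(wav_dir, txt_dir):
    """Return (speaker, stem) pairs that have a .wav but no same-named .txt.

    Assumes wav_dir/<speaker>/<name>.wav pairs with txt_dir/<speaker>/<name>.txt.
    This pairing rule is an assumption, not taken from preprocess.py.
    """
    wav_dir, txt_dir = Path(wav_dir), Path(txt_dir)
    missing = []
    for speaker in sorted(p for p in wav_dir.iterdir() if p.is_dir()):
        for wav in sorted(speaker.glob("*.wav")):
            txt = txt_dir / speaker.name / (wav.stem + ".txt")
            if not txt.is_file():
                missing.append((speaker.name, wav.stem))
    return missing
```

An empty list means every recording found under `wav_dir` has a transcript under `txt_dir`.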

wav_dir : the folder containing speaker dirs ( [DATASET] / wavs )
txt_dir : the folder containing speaker dirs ( [DATASET] / txts )
save_dir : the output directory (e.g. "./processed" )
--prepare_mfa : create mfa_data
--mfa : create textgrid files
--create_dataset : generate mel, phone, f0, ..., metadata.json
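When scripting preprocessing for several datasets, it can help to assemble the command from these arguments in one place. The helper below is a minimal sketch: `build_preprocess_cmd` is a hypothetical name, and it assumes each flag independently enables its stage, which the option list suggests but does not state.

```python
import shlex


def build_preprocess_cmd(wav_dir, txt_dir, save_dir,
                         prepare_mfa=True, mfa=True, create_dataset=True):
    """Assemble the preprocess.py argument list from the options above."""
    cmd = ["python", "preprocess.py", wav_dir, txt_dir, save_dir]
    if prepare_mfa:
        cmd.append("--prepare_mfa")
    if mfa:
        cmd.append("--mfa")
    if create_dataset:
        cmd.append("--create_dataset")
    return cmd


# Print a shell-quoted version of the command for inspection.
cmd = build_preprocess_cmd("[DATASET]/wavs", "[DATASET]/txts", "./processed")
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`; building a list rather than a string avoids quoting problems with paths that contain spaces.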

Example commands:

LJSpeech:

#run the script for organizing LJSpeech first
python ./scripts/organizeLJ.py

python preprocess.py /storage/tts2021/LJSpeech-organized/wavs /storage/tts2021/LJSpeech-organized/txts ./processed/LJSpeech --prepare_mfa --mfa --create_dataset
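For readers curious what organizing LJSpeech into the expected layout involves, here is a rough sketch. It is not the repository's script: `organize_ljspeech` is a hypothetical function, and the speaker folder name and the choice of the normalized transcript are assumptions. It relies only on the published LJSpeech format, a pipe-delimited `metadata.csv` next to a `wavs` folder.

```python
import shutil
from pathlib import Path


def organize_ljspeech(src_root, dst_root, speaker="LJSpeech"):
    """Copy LJSpeech into dst_root/wavs/<speaker>/ and dst_root/txts/<speaker>/.

    Hypothetical stand-in for the repo's organizing script: the speaker folder
    name and the use of the normalized transcript are assumptions.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    wav_out = dst_root / "wavs" / speaker
    txt_out = dst_root / "txts" / speaker
    wav_out.mkdir(parents=True, exist_ok=True)
    txt_out.mkdir(parents=True, exist_ok=True)

    count = 0
    with open(src_root / "metadata.csv", encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("|")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            utt_id, text = parts[0], parts[-1]  # last field: normalized text
            shutil.copy(src_root / "wavs" / f"{utt_id}.wav",
                        wav_out / f"{utt_id}.wav")
            (txt_out / f"{utt_id}.txt").write_text(text, encoding="utf-8")
            count += 1
    return count
```

The function returns the number of utterances copied, which can be compared against the line count of `metadata.csv` as a quick check.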

LibriTTS:

python preprocess.py /storage/tts2021/LibriTTS/train-clean-360 /storage/tts2021/LibriTTS/train-clean-360 ./processed/LibriTTS --prepare_mfa --mfa --create_dataset

VCTK:

python preprocess.py /storage/tts2021/VCTK-Corpus/wav48/ /storage/tts2021/VCTK-Corpus/txt ./processed/VCTK --prepare_mfa --mfa --create_dataset

metadata.json includes:

speaker table
training data
validation data
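To inspect a generated metadata.json without relying on its exact key names, a small summary helper can be enough. This is a sketch under assumptions: `summarize_metadata` is a hypothetical name, and the file is assumed to hold a JSON object at the top level.

```python
import json


def summarize_metadata(path):
    """Map each top-level key of a JSON object to the size of its value."""
    with open(path, encoding="utf-8") as f:
        meta = json.load(f)
    # Lists and dicts report their length; scalar values count as 1.
    return {
        key: len(value) if isinstance(value, (list, dict)) else 1
        for key, value in meta.items()
    }
```

Calling it on the metadata.json inside the preprocessing output directory should show one entry per section listed above, with the number of speakers or utterances each contains.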

2. Train

data_dir : the preprocessed data directory
--comment : a free-form comment for the run

Example commands:

LJSpeech:

python train.py ./processed/LJSpeech --comment "Hello LJSpeech"

LibriTTS:

python train.py ./processed/LibriTTS --comment "Hello LibriTTS"

VCTK:

python train.py ./processed/VCTK --comment "Hello VCTK"

3. Synthesize

--ckpt_path: the checkpoint path
--output_dir: the directory in which to save the synthesized audio

Example commands:

python synthesize.py --ckpt_path ./records/LJSpeech_2021-11-22-22:42/ckpt/checkpoint_125000.pth.tar --output_dir ./output
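Picking the most recent checkpoint by hand gets tedious over a long run. The sketch below selects it automatically, assuming checkpoints follow the `checkpoint_<step>.pth.tar` naming seen in the example command; `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path

# Matches names such as checkpoint_125000.pth.tar and captures the step.
_STEP = re.compile(r"checkpoint_(\d+)\.pth\.tar$")


def latest_checkpoint(ckpt_dir):
    """Return the checkpoint path with the largest step number, or None."""
    best_step, best_path = -1, None
    for path in Path(ckpt_dir).iterdir():
        match = _STEP.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Steps are compared as integers because a plain string sort would rank `checkpoint_90000` above `checkpoint_125000`. The result can be passed to `--ckpt_path`.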
