TTS project

Vocoder

To download the vocoder, add

"vocoder": {
        "path": "<path_to_be_placed>"
},

to config.json. The downloaded vocoder is WaveGlow v2 trained on LJSpeech.
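As a minimal sketch of that edit, the snippet below adds the `vocoder` entry to a config dictionary. The helper name `add_vocoder_entry`, the `name` key, and the path `vocoder/waveglow.pt` are placeholders invented for illustration; they are not taken from this repository.

```python
import json


def add_vocoder_entry(config: dict, path: str) -> dict:
    """Return a copy of config with a "vocoder" section pointing at path."""
    updated = dict(config)  # shallow copy so the caller's dict is not modified
    updated["vocoder"] = {"path": path}
    return updated


# Hypothetical starting config and target path, for illustration only
config = {"name": "example"}
config = add_vocoder_entry(config, "vocoder/waveglow.pt")
print(json.dumps(config, indent=4))
```

In practice you would load config.json with `json.load`, apply the change, and write it back with `json.dump`.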

Graphemes and alignments

I used a pretrained MFA model. You can download it here or just add

"preprocessing": {
   ...
   "g2p": true,
   ...
}

to the config. It will be placed in the same directory as the dataset you provide.
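Because the `preprocessing` section usually holds other settings too (the elided `...` entries), the flag should be merged in rather than replacing the whole section. A small sketch under that assumption follows; `enable_g2p` and the `other_option` key are hypothetical names used only for illustration.

```python
def enable_g2p(config: dict) -> dict:
    """Set preprocessing.g2p to True, keeping any other preprocessing keys."""
    updated = dict(config)
    # Copy the nested section so the original config stays untouched
    preprocessing = dict(updated.get("preprocessing", {}))
    preprocessing["g2p"] = True
    updated["preprocessing"] = preprocessing
    return updated


# "other_option" is a made-up key standing in for the elided "..." entries
config = {"preprocessing": {"other_option": "kept"}}
print(enable_g2p(config))
```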

Audio validation

A defibrillator is a device that gives a high energy electric shock to the heart of someone who is in cardiac arrest
Massachusetts Institute of Technology may be best known for its math, science and engineering education
Wasserstein distance or Kantorovich Rubinstein metric is a distance function defined between probability distributions on a given metric space

Synthesized audio for these examples is provided in the test_data directory.

Report

The report is available at this link.

Installation guide

First, install the requirements needed to run the model:

pip install -r ./requirements.txt

Download model

Use the bash script to download the trained model:

cd ./default_test_model
./download.sh

The checkpoint will be saved to ./default_test_model/checkpoint.pth

If you have issues with the bash utilities, you can download the model directly from Google Drive.

Run test model with prepared configuration

python test.py \
   -c default_test_config.json \
   -r default_test_model/checkpoint.pth \
   -t test_data \
   -o test_result.json
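The same invocation can be assembled from Python, which is convenient for checking that the checkpoint exists before launching. This is only a sketch: the helper `build_test_command` is hypothetical, and the only project-specific details used are the flags (-c, -r, -t, -o) and paths shown above.

```python
import sys
from pathlib import Path


def build_test_command(config, checkpoint, test_dir, output):
    """Build the argv list for test.py using the flags shown above."""
    return [
        sys.executable, "test.py",
        "-c", str(config),
        "-r", str(checkpoint),
        "-t", str(test_dir),
        "-o", str(output),
    ]


checkpoint = Path("default_test_model/checkpoint.pth")
cmd = build_test_command(
    "default_test_config.json", checkpoint, "test_data", "test_result.json"
)

if checkpoint.is_file():
    print("Ready to run:", " ".join(cmd))
    # To actually launch it: import subprocess; subprocess.run(cmd, check=True)
else:
    print(f"Checkpoint not found at {checkpoint}; run the download step first.")
```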

Credits

This repository is based on a heavily modified fork of the pytorch-template repository.

Docker

You can use this project with Docker. Quick start:

docker build -t my_src_image .
docker run \
   --gpus '"device=0"' \
   -it --rm \
   -v /path/to/local/storage/dir:/repos/asr_project_template/data/datasets \
   -e WANDB_API_KEY=<your_wandb_api_key> \
   my_src_image python -m unittest

Notes:

-v /out/of/container/path:/inside/container/path -- bind-mounts a path, so you don't have to download datasets at the start of every docker run.
-e WANDB_API_KEY=<your_wandb_api_key> -- sets the environment variable for wandb (if you want to use it). You can find your API key here: https://wandb.ai/authorize
