A non-autoregressive end-to-end text-to-speech (text-to-wav) system supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.
text-to-speech deep-learning unsupervised end-to-end pytorch tts speech-synthesis jets multi-speaker sota single-speaker neural-tts non-autoregressive fastspeech2 hifi-gan non-ar ultimate-tts text-to-wav
Updated Jun 6, 2022 - Python