Hi, thanks for the great work. I used your code on my own datasets, and at 160K steps the synthesized voices still do not sound normal. Although we can still make out what is being said, the spectrum is abnormal (especially the high-frequency part, as you can see in the following figures), and there is a severe metallic sound. I have double-checked the feature extraction and training processes, and both look normal. Do you know what might be causing this? BTW, how many steps are required to train the LJSpeech model?
Thanks again.