For model/prior.py _initial_sample, why is the prob calculated as from N(0,1)? #2
Hello, thanks for sharing the pytorch-based code!
However, I have a question about the `_initial_sample` func in `model/prior.py`. `epsilon` is sampled from N(0, t) (t is the temperature), so how is its log-prob calculated? For a normal distribution,

p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right)

and after taking the log (the mean is 0),

\log p(x) = -\log\sigma - \frac{1}{2}\log(2\pi) - \frac{x^2}{2\sigma^2}

Can you explain why \sigma is used as 1 instead of t here?
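For concreteness, here is a minimal PyTorch sketch of the mismatch being asked about; the function name and shapes are illustrative, not the repository's actual `_initial_sample`:

```python
import math

import torch

def initial_sample_sketch(shape, t=1.0):
    """Illustrative only: sample epsilon ~ N(0, t) and compare two log-densities."""
    # epsilon ~ N(0, t): a standard normal draw scaled by the temperature t,
    # so its true standard deviation is t.
    epsilon = torch.randn(shape) * t

    # Element-wise log-density under N(0, 1), i.e. with sigma fixed to 1
    # (what the question says the code computes):
    logp_std = -0.5 * (math.log(2 * math.pi) + epsilon ** 2)

    # Element-wise log-density under N(0, t), i.e. with sigma = t
    # (what the formula above suggests):
    logp_t = -math.log(t) - 0.5 * math.log(2 * math.pi) - epsilon ** 2 / (2 * t ** 2)

    return epsilon, logp_std, logp_t
```

With t = 1 the two log-densities coincide, which is consistent with the reply below.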
Comments

Hi @seekerzz, t is always 1 in our setting.
Thanks for your reply!
Can you share the synthesized samples? And where did you apply the speaker information, e.g., speaker embedding?
Thanks for sharing. So if I understood correctly, you add the speaker embedding to the text embedding right after the text encoder, so that both the posterior and prior encoders can take the speaker-dependent hidden representations X, am I right?
I quoted it from section 4 of the Glow-TTS paper.
Yes! I am going to try their conditioning method. If it succeeds, I will share the result. 😊
Ah, I see. I think it should work if you adopt the same approach. Looking forward to seeing it!
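The conditioning discussed in this exchange can be sketched roughly as follows; the module names and shapes are hypothetical, not taken from this repository or the Glow-TTS codebase:

```python
import torch
import torch.nn as nn

class SpeakerConditionedEncoder(nn.Module):
    """Hypothetical wrapper: add a speaker embedding to the text-encoder
    output so that everything downstream (prior and posterior encoders
    alike) sees speaker-dependent hidden representations X."""

    def __init__(self, text_encoder: nn.Module, n_speakers: int, hidden_dim: int):
        super().__init__()
        self.text_encoder = text_encoder  # assumed: (B, T_text) -> (B, T_text, H)
        self.spk_emb = nn.Embedding(n_speakers, hidden_dim)

    def forward(self, text_tokens: torch.Tensor, speaker_ids: torch.Tensor) -> torch.Tensor:
        x = self.text_encoder(text_tokens)           # (B, T_text, H)
        s = self.spk_emb(speaker_ids).unsqueeze(1)   # (B, 1, H), broadcast over time
        return x + s                                 # speaker-dependent X
```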
@seekerzz hey, have you made any progress? |
Great! Hope to get a clear sample soon.
Hello, I mean the positions of Mu and Logvar are misplaced.
Ah, sorry for the misunderstanding. Yes, you're right. It should be switched. But the reason why it still works is that they are the same, just wrongly named (reversed). In other words, the names are swapped consistently everywhere, so the computation is unchanged.
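A toy illustration of that point (hypothetical shapes, not the repo's code): as long as the code that produces the two tensors and the code that consumes them agree on which half plays which role, reversed names change nothing.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
proj = nn.Linear(16, 2 * 8)  # hypothetical projection producing two stacked stats
h = torch.randn(4, 16)
stats = proj(h)

# "Correct" naming: first half is the mean, second half the log-variance.
mu, logvar = stats.chunk(2, dim=-1)

# Reversed naming of the very same halves, consumed just as consistently
# (logvar_r really holds the mean, mu_r really holds the log-variance):
logvar_r, mu_r = stats.chunk(2, dim=-1)

eps = torch.randn(4, 8)
z = mu + eps * torch.exp(0.5 * logvar)
z_r = logvar_r + eps * torch.exp(0.5 * mu_r)  # same tensors in the same roles

print(torch.allclose(z, z_r))  # True: only the labels differ, not the computation
```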
@seekerzz Could you share any synthesized samples? |